






Second-year Mathematics & Statistics exam paper from Lancaster University, 2008. The exam covers statistics, with questions on likelihood functions, maximum likelihood estimates, statistical distributions, and linear regression models.







PART II (Second year)
MATHEMATICS & STATISTICS 2 hours
Math 235: Statistics
You should answer ALL Section A questions and THREE Section B questions. In Section A there are questions worth a total of 50 marks, but the maximum mark that you can gain there is capped at 40. There are statistical tables at the end of this examination paper.
SECTION A
A1. In a coin tossing experiment successive outcomes are independent, with θ the probability of a head resulting from a single toss. The coin is tossed until the first tail is observed. The total number of tosses required was six.
(a) Write down the likelihood function L(θ) and make a rough sketch of it; [4]
(b) Determine the maximum likelihood estimate of θ; [3]
(c) Calculate the relative likelihood of the value θ = 0.75; [2]
(d) Explain the importance of the set of θ-values whose relative likelihood is at least 0.1466. [3]
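The quantities in A1 can be checked numerically. The sketch below is not part of the paper; it assumes the observed sequence is five heads followed by one tail, so that the likelihood is proportional to θ^5 (1 − θ). The function names and the grid search are illustrative choices, and 3.84 is the usual upper 5% point of the chi-squared distribution on 1 degree of freedom.

```python
import math

def likelihood(theta: float) -> float:
    """Likelihood for five heads then one tail (assumed reading of the data)."""
    return theta**5 * (1.0 - theta)

# Grid search over (0, 1); the step 1/6000 is chosen so that 5/6 lies on the grid.
grid = [i / 6000 for i in range(1, 6000)]
theta_hat = max(grid, key=likelihood)

def relative_likelihood(theta: float) -> float:
    """Likelihood scaled by its maximum value."""
    return likelihood(theta) / likelihood(theta_hat)

rl_075 = relative_likelihood(0.75)

# A relative likelihood cut-off of exp(-3.84 / 2) corresponds to a deviance of 3.84.
threshold = math.exp(-3.84 / 2)

print(theta_hat, rl_075, threshold)
```

One possible reading of part (d) is that 0.1466 is this deviance-based cut-off, so the set of θ-values above it would act as an approximate 95% confidence set.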
A2. X_1, X_2, ..., X_n are independent and identically distributed gamma random variables, each having probability density function

$$f(x \mid \theta) = \frac{\theta^3 x^2 \exp(-\theta x)}{2}, \quad x > 0.$$

(a) Determine the log-likelihood function l(θ); [3]
(b) Find the score function, the maximum likelihood estimator of θ and the observed information; [6]
(c) Determine the maximum likelihood estimator of θ^4. [4]
please turn over
A3. An ornithologist wishes to estimate the weights A, B and C of three young birds (puffins) using a delicate balance with two sacks. When both sacks are empty, the expected measurement is zero. His first measurement, y_1, is of all three birds placed in one sack. His second measurement, y_2, is of A − B, obtained by placing the first bird in one sack and the second in the other. Similarly he obtains a measure y_3 of B − C and y_4 of C − A.
(a) Write down the response vector and design matrix of the linear model to estimate A, B and C from these measurements. [2]
(b) Hence show that the least squares estimate of C is (y_1 − y_3 + y_4)/3. [3]
(c) In the case where y_1 = 15.37, y_2 = 1.43, y_3 = 3.67 and y_4 = −0.89:
(i) Obtain the estimates of each of the weights. [2]
(ii) Evaluate the residuals, giving answers to two decimal places. [2]
(iii) Estimate the residual variance. [2]
(iv) Calculate a 95% confidence interval for A. [2]
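The arithmetic in A3(c) can be sketched in plain Python as follows. This is an illustrative check rather than a model answer: the design matrix is written from the description of the four measurements, and the critical value 12.706 is taken from the 1 degree of freedom row of the table at the end of the paper, assuming that table gives two-sided t values. All variable names are hypothetical.

```python
import math

# Rows correspond to y1 = A + B + C, y2 = A - B, y3 = B - C, y4 = C - A.
X = [[1, 1, 1],
     [1, -1, 0],
     [0, 1, -1],
     [-1, 0, 1]]
y = [15.37, 1.43, 3.67, -0.89]

# X'X: the columns of this design are orthogonal, so X'X should equal 3I.
XtX = [[sum(X[r][i] * X[r][j] for r in range(4)) for j in range(3)]
       for i in range(3)]
assert XtX == [[3, 0, 0], [0, 3, 0], [0, 0, 3]]

# With X'X = 3I the least squares estimates are X'y / 3.
Xty = [sum(X[r][i] * y[r] for r in range(4)) for i in range(3)]
estimates = [v / 3 for v in Xty]  # (A_hat, B_hat, C_hat)

fitted = [sum(X[r][i] * estimates[i] for i in range(3)) for r in range(4)]
residuals = [y[r] - fitted[r] for r in range(4)]

# Residual variance on n - p = 4 - 3 = 1 degree of freedom.
sigma2_hat = sum(e * e for e in residuals) / (4 - 3)

# var(A_hat) = sigma^2 / 3 because (X'X)^{-1} = I / 3.
se_A = math.sqrt(sigma2_hat / 3)
t_crit = 12.706  # assumed two-sided 5% point of t on 1 df, from the table
ci_A = (estimates[0] - t_crit * se_A, estimates[0] + t_crit * se_A)

print(estimates, [round(e, 2) for e in residuals], sigma2_hat, ci_A)
```

With a single residual degree of freedom the interval for A comes out very wide, which is worth bearing in mind when interpreting part (iv).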
B1. For planning purposes in a particular region, the following model is used for car ownership by household:
the probability that a household has access to a single car is θ,
the probability that a household has access to two or more cars is 2θ^2,
the probability that a household has no access to a car is 1 − θ − 2θ^2.
Determine the permitted range of θ for this model. [3]
This model applies in both parts (a) and (b) of this question.
(a) The results of a survey of 100 households are as follows:

Number of cars          0     1     2 or more
Number of households    28    40    32

(i) Write down the likelihood L(θ) and show that the maximum likelihood estimate for θ is 0.4. [4]
(ii) Determine the observed information. [4]
(b) Suppose now that the survey had simply sought information on whether households had access to any car, irrespective of how many. The results are now

Number of households with access to a car       72
Number of households with no access to a car    28

(i) Using these data to make inferences about θ, explain in detail why the maximum likelihood estimate θ̂ must satisfy the equation

$$2\hat{\theta}^2 + \hat{\theta} = 0.72$$

and is hence unchanged at 0.4. [4]
(ii) Determine the observed information. Comment on the results in (a)(ii) and (b)(ii). [5]
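A short Python sketch can confirm the numbers quoted in B1. It is a check under stated assumptions, not the intended written solution: the full-data log-likelihood is taken to be 104 log θ + 28 log(1 − θ − 2θ^2) up to a constant, the reduced-data log-likelihood is 72 log p + 28 log(1 − p) with p = θ + 2θ^2, and the derivative formulas were worked out by hand for this illustration. Function names are hypothetical.

```python
import math

def score_full(theta: float) -> float:
    """Derivative of 104*log(theta) + 28*log(1 - theta - 2*theta**2)."""
    q = 1 - theta - 2 * theta**2
    return 104 / theta - 28 * (1 + 4 * theta) / q

def info_full(theta: float) -> float:
    """Negative second derivative of the full-data log-likelihood."""
    q = 1 - theta - 2 * theta**2
    return 104 / theta**2 + 28 * (4 * q + (1 + 4 * theta) ** 2) / q**2

def info_any_car(theta: float) -> float:
    """Negative second derivative of 72*log(p) + 28*log(1 - p), p = theta + 2*theta**2."""
    p = theta + 2 * theta**2
    dp = 1 + 4 * theta  # first derivative of p; the second derivative is 4
    return -4 * (72 / p - 28 / (1 - p)) + dp**2 * (72 / p**2 + 28 / (1 - p) ** 2)

# Positive root of 2*theta**2 + theta - 0.72 = 0.
theta_b = (-1 + math.sqrt(1 + 8 * 0.72)) / 4

print(score_full(0.4), info_full(0.4), theta_b, info_any_car(0.4))
```

Comparing the two information values at θ = 0.4 gives a sense of how little is lost by recording only whether a household has any car.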
SECTION B continued
B2. (a) The random variables X_1, X_2, ..., X_n are independent and identically distributed with the geometric distribution

$$f(x \mid \theta) = \theta^x (1 - \theta), \quad x = 0, 1, 2, \ldots$$

where θ is a parameter in the range 0 ≤ θ ≤ 1 to be estimated. The mean of the above geometric distribution is θ/(1 − θ).
(i) Write down formulae for the maximum likelihood estimator for θ and for Fisher's information; [5]
(ii) Write down what you know about the distribution of the maximum likelihood estimator for this example when n is large. [3]
(b) In a particular experiment, n = 10 and Σ_{i=1}^{n} x_i = 10.
(i) Compute an approximate 95% confidence interval for θ based on the asymptotic distribution of the maximum likelihood estimator; [4]
(ii) Compute the deviance D(θ) and sketch it over the range 0.1 ≤ θ ≤ 0.9. Use your sketch to describe how to use the deviance to obtain an approximate 95% confidence interval for θ; [4]
(iii) If you were asked to produce an approximate 95% confidence interval for the mean of the distribution, θ/(1 − θ), what would be your recommended approach? Justify your answer. [4]
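The two interval constructions in B2(b) can be compared numerically. The sketch below is an illustration only: it assumes the mass function θ^x (1 − θ), for which the log-likelihood is Σx log θ + n log(1 − θ), the estimator is Σx / (n + Σx), and the expected information is n / (θ (1 − θ)^2) using the stated mean θ/(1 − θ). The constants 1.96 and 3.84 are the usual normal and chi-squared (1 df) 95% points, and all names are hypothetical.

```python
import math

n, total = 10, 10  # sample size and sum of the observations

theta_hat = total / (n + total)

def log_lik(theta: float) -> float:
    """Log-likelihood for the geometric sample, up to no additive constant."""
    return total * math.log(theta) + n * math.log(1 - theta)

def fisher_info(theta: float) -> float:
    """Expected information, derived using E(X) = theta / (1 - theta)."""
    return n / (theta * (1 - theta) ** 2)

# Interval from the asymptotic normal distribution of the estimator.
se = 1 / math.sqrt(fisher_info(theta_hat))
wald_ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)

def deviance(theta: float) -> float:
    """D(theta) = 2 * (l(theta_hat) - l(theta))."""
    return 2 * (log_lik(theta_hat) - log_lik(theta))

# Deviance-based interval: grid points in [0.1, 0.9] with D(theta) <= 3.84.
grid = [0.1 + 0.001 * i for i in range(801)]
inside = [t for t in grid if deviance(t) <= 3.84]
dev_ci = (min(inside), max(inside))

print(theta_hat, wald_ci, dev_ci)
```

The grid resolution of 0.001 limits the accuracy of the deviance interval end points; a root finder could be substituted if more precision were wanted.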
B3. (d) (iii) Given that the estimated residual standard variance for the model is σ̂^2 = 0.8955, obtain the standard error of this estimated coefficient. (You may quote any standard result without proof.) [2]
(iv) Is the effect of tyre pressure significant at the 5% level? [2]
B4. (a) Let y = Xθ + z and y = Aφ + z be two linear models for the same response. Define what is meant by saying that the two models are equivalent. [2]
The linear regression model relating measured responses, y_i, to explanatory variables x_i for i = 1, 2, ..., n, may be expressed either as

$$y_i = \theta_1 + \theta_2 x_i + z_i$$

or as

$$y_i = \phi_1 + \phi_2 (x_i - \bar{x}) + z_i$$

where z_i is the value of the measurement error.
(b) Write down the respective design matrices, X and A, of these linear models as they are expressed in the matrix forms y = Xθ + z and y = Aφ + z. [3]
(c) Show that the two models are equivalent by finding the transformation matrix, T, such that X = AT and φ = Tθ. [4]
(d) What advantage is there in the use of the second form of the model given above, in terms of the least squares equations for the parameter estimates, and their distributional properties? [4]
(e) Given the values of x_i and y_i in the table below, calculate the estimates of φ_1 and φ_2.

x_i    0      2      4      6      8      10
y_i    3.0    3.1    3.2    3.4    3.5    3.

[3]
(f) By considering the relationship θ = T^{-1}φ, explain the relationship between the estimated coefficients θ̂_2 and φ̂_2 and explain why their "t" values are the same. [4]
end of exam
Standard Distributions
Here we list the basic properties of three standard distributions. Throughout X denotes either a continuous random variable with probability density function fX (x) or a discrete random variable with probability mass function pX (x).
Poisson distribution: if X ∼ Pois(θ), θ > 0, the probability mass function is
$$p(x \mid \theta) = \frac{\exp(-\theta)\, \theta^x}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$
Summary measures are: E(X) = θ and var (X) = θ.
Exponential Distribution: if X ∼ Exponential(β), with β > 0,
$$f_X(x \mid \beta) = \begin{cases} \beta \exp(-\beta x), & x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$

Summary measures are: E(X) = 1/β and var(X) = 1/β^2.
Normal Distribution: if X ∼ N (μ, σ^2 ), with σ > 0, then
$$f_X(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \quad \text{for } -\infty < x < \infty.$$
Summary measures are: E(X) = μ and var (X) = σ^2.
Table 4: The F distribution. Values of f for which P(F > f) = 0.05 (upper values) and P(F > f) = 0.01 (lower values), where F has an F-distribution with r and s degrees of freedom.
  p:   0.20     0.10     0.05      0.01      0.
  1    3.078    6.314    12.706    63.657    636.
  2    1.886    2.920    4.303     9.925     31.
  3    1.638    2.353    3.182     5.841     12.
  4    1.533    2.132    2.776     4.604     8.
  5    1.476    2.015    2.571     4.032     6.
  6    1.440    1.943    2.447     3.707     5.
  7    1.415    1.895    2.365     3.499     5.
  8    1.397    1.860    2.306     3.355     5.
  9    1.383    1.833    2.262     3.250     4.
  10   1.372    1.812    2.228     3.169     4.
  11   1.363    1.796    2.201     3.106     4.
  12   1.356    1.782    2.179     3.055     4.
  13   1.350    1.771    2.160     3.012     4.
  14   1.345    1.761    2.145     2.977     4.
  15   1.341    1.753    2.131     2.947     4.
  16   1.337    1.746    2.120     2.921     4.