




Instructions and questions for a 2-hour statistics exam for the course Math331: Statistical Inference at Lancaster University, from the academic year 2012. The exam covers topics such as likelihood functions, log-likelihoods, maximum likelihood estimators, deviance functions, Bayes' theorem, and distributions such as the Beta, Gamma, and Poisson.





PART II (Third year)
MATHEMATICS & STATISTICS 2 hours
Math331: Statistical inference
You should answer ALL Section A questions and TWO Section B questions. Section A is capped at 40 marks. There is a formula sheet at the end of the exam.

You may also find the following results helpful when answering certain questions. If the random variable $W_i$ has a $\chi^2_i$ distribution, i.e. chi-squared on $i$ degrees of freedom, then $P(W_i > w_i) = 0.05$ for $w_i \approx 3.84$, $5.99$ and $7.81$ for $i = 1, 2, 3$ respectively, and $P(W_1 > 6.63) \approx 0.01$. If $X \sim N(0, 1)$, then $P[X \le 1.96] \approx 0.975$.

Throughout the paper, the following abbreviations will be adopted. “IID” will stand for “independent and identically distributed” and “MLE” will abbreviate “maximum likelihood estimator”.
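The tabulated tail probabilities above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `chi2_sf` and `std_normal_cdf` are illustrative and not part of the exam paper, and the closed forms used apply only to 1, 2 and 3 degrees of freedom.

```python
import math


def chi2_sf(w: float, df: int) -> float:
    """Upper-tail probability P(W > w) for chi-squared with df in {1, 2, 3}."""
    if df == 1:
        # W = Z^2 with Z ~ N(0, 1), so P(W > w) = P(|Z| > sqrt(w)).
        return math.erfc(math.sqrt(w / 2.0))
    if df == 2:
        # Chi-squared on 2 degrees of freedom is exponential with mean 2.
        return math.exp(-w / 2.0)
    if df == 3:
        # Closed form of the regularised upper incomplete gamma Q(3/2, w/2).
        return (math.erfc(math.sqrt(w / 2.0))
                + math.sqrt(2.0 * w / math.pi) * math.exp(-w / 2.0))
    raise ValueError("this sketch only covers df = 1, 2, 3")


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


for df, w in [(1, 3.84), (2, 5.99), (3, 7.81), (1, 6.63)]:
    print(f"df={df}, w={w}: P(W > w) = {chi2_sf(w, df):.4f}")
print(f"P(X <= 1.96) = {std_normal_cdf(1.96):.4f}")
```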
SECTION A
A1. Let $X_1, \ldots, X_n$ be an IID sample from the Beta$(\alpha, 1)$ distribution.
(a) Write down the likelihood $L(\alpha)$ and the log-likelihood $\ell(\alpha)$ for the problem of estimating $\alpha$. [2]
(b) Calculate the asymptotic distribution of $\hat{\alpha}$. [6]
(c) What is the asymptotic distribution of $\log \hat{\alpha}$? [3]
please turn over
SECTION A continued
A2. (a) Define the deviance function $D(\theta)$ for a statistical model with $d$-dimensional parameter $\theta$, in terms of the log-likelihood function $\ell(\theta)$ and the MLE $\hat{\theta}$. [3]
(b) State carefully the asymptotic distribution of $D(\theta)$ for different values of $\theta$. [You need not state precisely the regularity conditions under which your result holds.] [4]
(c) Consider the linear regression model $Y_i \sim N(\alpha + \beta z_i, 1)$, where the $Y_i$'s are assumed independent and the $\{z_i\}$ are known covariates.
(i) Write down the asymptotic distribution of the MLE. (Do not find the MLE.) [8]
You may use the identity below:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
when $ad \neq bc$.
(ii) Write down an expression for the deviance $D(\alpha, \beta)$ in terms of $\hat{\alpha}$, $\hat{\beta}$ and the data. [2]
(d) Give a reason why deviance-based confidence intervals might be preferred to those based on the asymptotic distribution of the MLE. [3]
A3. (a) State Bayes' theorem for the exact conditional distribution $f(\theta \mid x)$ of a parameter $\theta$ in terms of the likelihood $f(x \mid \theta)$ and prior distribution $f(\theta)$ of $\theta$. [3]
(b) An engineer assesses the precision $\theta$ of a gauge by measuring the error in millimeters (mm) of two measurements $x_1$ and $x_2$. She assumes that these are independent observations from the normal distribution with probability density function
$$f(x \mid \theta) = \sqrt{\frac{\theta}{2\pi}} \exp\left(-\frac{1}{2} x^2 \theta\right), \quad -\infty < x < \infty,$$
and her prior density for $\theta$ is $f(\theta) = 5 \exp(-5\theta)$, $\theta > 0$.
(i) Given that $x_1 = 1$ and $x_2 = 3$, show that her posterior density for $\theta$ is given by $f(\theta \mid x_1, x_2) \propto \theta \exp(-10\theta)$, $\theta > 0$. [4]
(ii) To which family of densities does this belong and what are the parameters? [2]
(iii) What is the most probable value of $\theta$ from this density? [4]
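The proportionality claimed in A3(b)(i) can be sanity-checked numerically: if the product of the prior and the two likelihood terms is proportional to θ exp(−10θ), their ratio should be the same constant at every θ. The sketch below assumes the density is the zero-mean normal with precision θ, as stated in the question; the helper names are illustrative only.

```python
import math


def likelihood(x: float, theta: float) -> float:
    """Zero-mean normal density with precision theta, as given in A3(b)."""
    return math.sqrt(theta / (2.0 * math.pi)) * math.exp(-0.5 * x * x * theta)


def prior(theta: float) -> float:
    """Prior density f(theta) = 5 exp(-5 theta) for theta > 0."""
    return 5.0 * math.exp(-5.0 * theta)


def unnormalised_posterior(theta: float, xs=(1.0, 3.0)) -> float:
    """Prior times likelihood for the observations x1 = 1 and x2 = 3."""
    value = prior(theta)
    for x in xs:
        value *= likelihood(x, theta)
    return value


def target(theta: float) -> float:
    """The form the question asks us to establish, up to a constant."""
    return theta * math.exp(-10.0 * theta)


# If the two functions are proportional, this ratio does not depend on theta.
ratios = [unnormalised_posterior(t) / target(t) for t in (0.05, 0.1, 0.5, 1.0, 2.0)]
print(ratios)
```

A constant ratio is consistent with the stated result but does not replace the algebraic argument the question asks for.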
SECTION B
B1. (a) Consider a sequence of $n$ independent trials where each trial has 3 possible outcomes with respective probabilities $\theta_1$, $\theta_2$ and $1 - \theta_1 - \theta_2$, where $\theta_1 \ge 0$, $\theta_2 \ge 0$ and $\theta_1 + \theta_2 \le 1$. Let $(Y_1, Y_2, Y_3)$ denote the number of occurrences of each of the outcomes.
(i) Write down the log-likelihood function for $\theta = (\theta_1, \theta_2)^T$. [4]
(ii) Calculate the MLE for $\theta$, and show that its asymptotic distribution is given by
$$\hat{\theta} \sim N\left(\theta, \frac{1}{n} \begin{pmatrix} \theta_1(1 - \theta_1) & -\theta_1\theta_2 \\ -\theta_1\theta_2 & \theta_2(1 - \theta_2) \end{pmatrix}\right).$$
(b) Interest now lies in $\phi = \theta_1 \theta_2$. Using your answer to (a) or otherwise, calculate the asymptotic distribution of $\hat{\phi}$. [8]
(c) Suppose that $(Y_1, Y_2, Y_3) = (9, 9, 18)$. Find a 95% confidence interval for $\phi$. [6]
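The covariance matrix quoted in B1(a)(ii) can be checked by simulation. The sketch below assumes the MLE is the vector of sample proportions, $\hat{\theta}_j = Y_j / n$ (the standard result for this model, which the question asks you to derive), and compares $n$ times the empirical covariance of $(\hat{\theta}_1, \hat{\theta}_2)$ with the stated matrix. The values θ = (0.25, 0.25), the sample size, the replication count and the seed are illustrative choices, not taken from the paper.

```python
import random


def simulate_scaled_covariance(theta1, theta2, n=200, reps=2000, seed=1):
    """Return n * (var(est1), cov(est1, est2), var(est2)) from simulated trials."""
    rng = random.Random(seed)
    weights = [theta1, theta2, 1.0 - theta1 - theta2]
    est1, est2 = [], []
    for _ in range(reps):
        # One experiment: n independent trials with three possible outcomes.
        draws = rng.choices((0, 1, 2), weights=weights, k=n)
        est1.append(draws.count(0) / n)  # assumed MLE of theta1
        est2.append(draws.count(1) / n)  # assumed MLE of theta2
    m1 = sum(est1) / reps
    m2 = sum(est2) / reps
    var1 = sum((a - m1) ** 2 for a in est1) / (reps - 1)
    var2 = sum((b - m2) ** 2 for b in est2) / (reps - 1)
    cov12 = sum((a - m1) * (b - m2) for a, b in zip(est1, est2)) / (reps - 1)
    return n * var1, n * cov12, n * var2


t1, t2 = 0.25, 0.25
scaled = simulate_scaled_covariance(t1, t2)
stated = (t1 * (1 - t1), -t1 * t2, t2 * (1 - t2))
print("simulated:", scaled)
print("stated:   ", stated)
```

The simulated entries should agree with the stated ones up to Monte Carlo error, which shrinks as `reps` grows.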
SECTION B continued
B2. The distribution of flaws along the length of an artificial fiber follows a Poisson process, so that the number of flaws ($Y$) in a length of $x$ metres of the fiber is distributed as
$$Y \mid \theta, x \sim \text{Poisson}(x\theta),$$
where $\theta$ is the average number of flaws per metre of fiber. You choose the uninformative prior $\pi(\theta) \propto \frac{1}{\theta}$, for $\theta > 0$.
(a) Suppose 2 fibers of length $x_1 = 10$ and $x_2 = 15$ metres have $y_1 = 3$ and $y_2 = 2$ flaws respectively. (We can denote these observations by $x = \{x_1, x_2\}$ and $y = \{y_1, y_2\}$ respectively.) Show that the resulting posterior distribution has the form
$$\theta \mid x, y \sim \text{Gamma}(G, H)$$
and find $G$ and $H$ in terms of $x$ and $y$. [9]
(b) Find the MLE, $\hat{\theta}$, based on the data given in Part (a). [3]
(c) Find the posterior mean of $\theta$. How does this compare with the MLE? [3]

The next set of questions, (d) to (f), refer to the distribution of further $y^*$ flaws in a new piece of fiber of length $x^*$ metres. The MLE can be used to predict future observations using what is known as the estimative distribution. This is defined as $f(y^* \mid \hat{\theta}, x^*)$.
(d) Find the estimative distribution for the above data and specify its parameters. [3]
Question B2 continued over the page
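For parts (b) and (c) of B2, the posterior mean can be approximated by numerical integration and compared with the MLE. The sketch below is a rough check under stated assumptions: the MLE is taken to be total flaws divided by total length (the usual Poisson-rate result, which the question asks you to find), the integral is truncated at θ = 3 where the posterior mass is negligible for these data, and the midpoint rule is used so that the prior 1/θ is never evaluated at θ = 0. All function names are illustrative.

```python
import math

xs = (10.0, 15.0)  # fiber lengths in metres
ys = (3, 2)        # observed flaw counts


def unnormalised_posterior(theta: float) -> float:
    """Prior 1/theta times the Poisson(x * theta) likelihood of each count."""
    value = 1.0 / theta
    for x, y in zip(xs, ys):
        rate = x * theta
        value *= math.exp(-rate) * rate ** y / math.factorial(y)
    return value


def midpoint_integral(f, lower: float, upper: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))


normaliser = midpoint_integral(unnormalised_posterior, 0.0, 3.0)
posterior_mean = midpoint_integral(
    lambda t: t * unnormalised_posterior(t), 0.0, 3.0
) / normaliser

# Assumed form of the MLE for a Poisson rate with exposures xs.
mle = sum(ys) / sum(xs)

print("posterior mean (numerical):", posterior_mean)
print("MLE (assumed closed form): ", mle)
```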
B3. (a) A coin is tossed 10 times, resulting in 9 heads and one tail. The question of interest is whether the coin is biased or not. The null hypothesis, $H_0$, is that the coin is fair ($\pi = 1/2$); the alternative hypothesis, $H_a$, is that the coin is biased ($\pi \neq 1/2$).
(i) Define the p-value used in classical hypothesis testing. What is the implication of a low p-value? [4]
(ii) Use a classical hypothesis test to test whether the coin is fair with a 5% significance level. [4]
(iii) Under $H_a$ you assume a Beta(1, 1) prior for the probability of a head. Calculate the Bayes factor for $H_a$ relative to $H_0$. [6]
(iv) If I assume (before the experiment) that each hypothesis is equally likely, what is the probability that the coin is biased? [4]

(b) Discuss the differences between the Bayesian and the classical approach under three of the headings below.
(i) The comparative treatment of data and parameters in a model. [4]
(ii) Providing an interval that expresses uncertainty in a parameter value. [4]
(iii) Dealing with nuisance parameters. [4]
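The quantities in B3(a) can be computed exactly with rational arithmetic. The sketch below makes two assumptions that the paper does not fix: the two-sided p-value sums the null probabilities of all outcomes no more probable than the observed one (one common convention for an exact binomial test), and the marginal likelihood under $H_a$ is the binomial likelihood integrated against the Beta(1, 1) prior, which equals $\binom{n}{y} B(y + 1, n - y + 1)$. Variable names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

n, heads = 10, 9


def pmf_fair(k: int) -> Fraction:
    """P(K = k) for K ~ Binomial(n, 1/2), i.e. under the null hypothesis."""
    return Fraction(comb(n, k), 2 ** n)


# Exact two-sided p-value: outcomes at most as probable as the observed count.
observed = pmf_fair(heads)
p_value = sum(pmf_fair(k) for k in range(n + 1) if pmf_fair(k) <= observed)

# Marginal likelihood under Ha with a Beta(1, 1) (uniform) prior:
# C(n, y) * B(y + 1, n - y + 1), using B(a, b) = (a-1)! (b-1)! / (a+b-1)!.
beta_term = Fraction(factorial(heads) * factorial(n - heads), factorial(n + 1))
marginal_ha = comb(n, heads) * beta_term
marginal_h0 = observed

bayes_factor = marginal_ha / marginal_h0

# With equal prior probabilities, posterior odds equal the Bayes factor.
posterior_ha = bayes_factor / (1 + bayes_factor)

print("p-value:", p_value, float(p_value))
print("Bayes factor (Ha vs H0):", bayes_factor, float(bayes_factor))
print("P(Ha | data):", posterior_ha, float(posterior_ha))
```

Using `Fraction` keeps every step exact, so the results can be compared directly with a hand calculation.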
Formula Sheet
You may use the following:
∗ A univariate Normal distribution on $Y \in \mathbb{R}$ with mean $\mu \in \mathbb{R}$ and variance $\sigma^2 > 0$ is denoted by Normal$(\mu, \sigma^2)$, and the corresponding density function is:
$$p(Y = y \mid \mu, \sigma^2) = \frac{1}{\sigma (2\pi)^{1/2}} \exp\left(-\frac{1}{2\sigma^2}(y - \mu)^2\right)$$

∗ A Gamma distribution on $Y > 0$ with shape parameter $\alpha > 0$ and rate parameter $\beta > 0$ is denoted by Gamma$(\alpha, \beta)$, and the corresponding density is:
$$p(Y = y \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} y^{\alpha - 1} \exp(-\beta y) \quad \text{for } y > 0$$
The Gamma distribution, Gamma$(\alpha, \beta)$, has mean $\frac{\alpha}{\beta}$ and variance $\frac{\alpha}{\beta^2}$.

∗ A Beta distribution on $0 < Y < 1$ with parameters $\alpha > 0$ and $\beta > 0$ is denoted by Beta$(\alpha, \beta)$, and the corresponding density is:
$$p(Y = y \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} y^{\alpha - 1} (1 - y)^{\beta - 1} \quad \text{for } 0 < y < 1$$
where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ and $\Gamma(n + 1) = n!$ for integer $n$. The Beta distribution, Beta$(\alpha, \beta)$, has mean $\frac{\alpha}{\alpha + \beta}$.

∗ A Binomial distribution on $Y \in \{0, 1, \ldots, n\}$ with parameter $0 \le p \le 1$ is denoted by Binomial$(n, p)$, and the corresponding probability mass function is:
$$p(Y = y \mid n, p) = \binom{n}{y} p^{y} (1 - p)^{n - y} \quad \text{for } y = 0, 1, \ldots, n$$

∗ A Poisson distribution on $Y \in \{0, 1, \ldots\}$ with parameter $\lambda > 0$ is denoted by Poisson$(\lambda)$, and the corresponding probability mass function is:
$$p(Y = y \mid \lambda) = \frac{e^{-\lambda} \lambda^{y}}{y!} \quad \text{for } y = 0, 1, \ldots$$

∗ A Negative binomial distribution on $Y \in \{0, 1, \ldots\}$ with parameters $0 \le \theta \le 1$ and $k \in \{1, \ldots\}$ is denoted by Negative-Binomial$(k, \theta)$, and the corresponding probability mass function is:
$$p(Y = y \mid \theta, k) = \binom{y + k - 1}{k - 1} \theta^{k} (1 - \theta)^{y}, \quad y = 0, 1, 2, \ldots$$
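The formula-sheet expressions can be transcribed directly into code and checked for internal consistency. The sketch below is not part of the exam: it implements the Poisson, Binomial and Negative-Binomial mass functions and the Gamma density as written above, then checks that the mass functions sum to about 1 and that the Gamma mean is about α/β. The parameter values and truncation points are arbitrary illustrative choices.

```python
import math


def poisson_pmf(y: int, lam: float) -> float:
    return math.exp(-lam) * lam ** y / math.factorial(y)


def binomial_pmf(y: int, n: int, p: float) -> float:
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def negative_binomial_pmf(y: int, k: int, theta: float) -> float:
    return math.comb(y + k - 1, k - 1) * theta ** k * (1.0 - theta) ** y


def gamma_pdf(y: float, alpha: float, beta: float) -> float:
    """Gamma density with shape alpha and rate beta."""
    return beta ** alpha / math.gamma(alpha) * y ** (alpha - 1.0) * math.exp(-beta * y)


def midpoint_integral(f, lower: float, upper: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))


# Mass functions should sum to 1 (infinite supports truncated where mass is negligible).
poisson_total = sum(poisson_pmf(y, 3.0) for y in range(100))
binomial_total = sum(binomial_pmf(y, 10, 0.3) for y in range(11))
negbin_total = sum(negative_binomial_pmf(y, 3, 0.4) for y in range(200))

# The Gamma(2, 4) mean should be alpha / beta = 0.5.
gamma_mean = midpoint_integral(lambda y: y * gamma_pdf(y, 2.0, 4.0), 0.0, 10.0)

print(poisson_total, binomial_total, negbin_total, gamma_mean)
```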