University of Wales, Aberystwyth - Linear Statistical Models Exam, January/February 2009, Exams of Statistics

This is the January/February 2009 exam for the Linear Statistical Models course (MA36510) at the University of Wales, Aberystwyth. It covers the mean and dispersion of random vectors, bivariate distributions, chi-squared distributions, the Gauss-Markov theorem, and linear regression.

Typology: Exams

2012/2013

Uploaded on 02/14/2013

anushri
PRIFYSGOL CYMRU / UNIVERSITY OF WALES
ABERYSTWYTH
INSTITUTE OF MATHEMATICAL AND PHYSICAL SCIENCES
SEMESTER 1 EXAMINATIONS, JANUARY/FEBRUARY 2009
MA36510 – Linear Statistical Models
Time allowed – 2 hours
All questions may be attempted
Marks gained from questions in Section B will be given greater consideration in
assessing a first class performance.
Calculators are permitted, provided they are silent, self-powered, without
communications facilities, and incapable of holding text or other material that could be
used to give a candidate an unfair advantage. They must be made available on request
for inspection by invigilators, who are authorised to remove any suspect calculators.
Statistical Tables will be provided
Information
Unless otherwise stated you may quote without proof:
The n-dimensional multivariate Normal MVN(µ, Σ) distribution with probability density function

$f(y) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp(-Q/2)$

where $Q = (y-\mu)^T \Sigma^{-1} (y-\mu) = y^T \Sigma^{-1} y - 2 y^T \Sigma^{-1} \mu + \mu^T \Sigma^{-1} \mu$.
In turn, Q has a chi-squared distribution on n degrees of freedom.
The Normal equations for ordinary least squares estimation in the linear model E[Y] = Xβ are given by $X^T X \beta = X^T y$.

Section A

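1 The random vector $(Y_1, Y_2, Y_3)^T$ has mean $(1, 2, 3)^T$ and dispersion matrix [matrix not legible in the source]. Find the mean vector and the dispersion matrix of $(W_1, W_2)^T$, where $W_1$ is defined in terms of $Y_1$ and $Y_2$ and $W_2$ in terms of $Y_1$ and $Y_3$ [operators not legible in the source], and the expected value of $Q = 3Y_1^2 + Y_2^2 - 2Y_3^2 - 4Y_1Y_3$. [10]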
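The expected value of a quadratic form can be checked numerically with the standard identity $E[Y^TAY] = \mathrm{tr}(A\Sigma) + \mu^TA\mu$. The sketch below is only an illustration: the matrix `A` and the mean `mu` come from the question, but `Sigma_demo` is a hypothetical identity matrix standing in for the dispersion matrix, which is not legible here, so the number it produces is not the answer to the question.

```python
def quad_form_expectation(A, mu, Sigma):
    """Return E[Y^T A Y] = trace(A Sigma) + mu^T A mu."""
    n = len(mu)
    trace_term = sum(A[i][j] * Sigma[j][i] for i in range(n) for j in range(n))
    mean_term = sum(mu[i] * A[i][j] * mu[j] for i in range(n) for j in range(n))
    return trace_term + mean_term

# Symmetric matrix of Q = 3*Y1^2 + Y2^2 - 2*Y3^2 - 4*Y1*Y3.
# The cross term -4*Y1*Y3 is split as -2 in positions (1,3) and (3,1).
A = [[3, 0, -2],
     [0, 1, 0],
     [-2, 0, -2]]
mu = [1, 2, 3]

# HYPOTHETICAL dispersion matrix (identity), used only for illustration.
Sigma_demo = [[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]]

expected_Q_demo = quad_form_expectation(A, mu, Sigma_demo)
print(expected_Q_demo)
```

Replacing `Sigma_demo` with the dispersion matrix from the exam paper would give the required expectation; only the trace term changes.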

2 Identify the bivariate distributions with probability density functions proportional to
(a) $\exp\{-\tfrac{1}{2}(8y_1^2 + 2y_2^2 - 4y_1y_2)\}$,
(b) $\exp\{-4y_1^2 - y_2^2 + 2y_1y_2 - 6y_1 + 8y_2\}$.
In (a), calculate
(i) $P(Y_1 + Y_2 > 1)$,
(ii) $P(8Y_1^2 + 2Y_2^2 - 4Y_1Y_2 > 0.575)$. [12]
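One way to approach this question is to match each exponent against $-Q/2$ from the Information section, read off $\Sigma^{-1}$ and $\Sigma^{-1}\mu$, and invert. The following is a minimal sketch of that matching, not a model answer; the helper names are made up for illustration.

```python
from fractions import Fraction
from math import erf, exp, sqrt

def inverse_2x2(M):
    """Invert a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

# Inverse dispersion matrix shared by (a) and (b):
# -1/2 * y^T P y = -4*y1^2 - y2^2 + 2*y1*y2.
P = [[Fraction(8), Fraction(-2)], [Fraction(-2), Fraction(2)]]
Sigma = inverse_2x2(P)

# (b): the linear part of the exponent is y^T P mu = -6*y1 + 8*y2.
mu_b = matvec(Sigma, [Fraction(-6), Fraction(8)])

# (a)(i): Y1 + Y2 is Normal with mean 0 and variance var_sum.
var_sum = Sigma[0][0] + Sigma[1][1] + 2 * Sigma[0][1]
z = 1 / sqrt(var_sum)
p_sum_gt_1 = 0.5 * (1 - erf(z / sqrt(2)))

# (a)(ii): the quadratic form is chi-squared on 2 degrees of freedom,
# whose upper tail probability at q is exp(-q/2).
p_q_gt = exp(-0.575 / 2)

print(Sigma, mu_b, round(p_sum_gt_1, 3), round(p_q_gt, 3))
```

In an exam the two probabilities would be read from the statistical tables; the closed forms are used here only so that the sketch needs nothing beyond the standard library.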

3 X, Y and Z are uncorrelated standard Normal random variables and Q is defined by $Q = 5X^2 + Y^2 + 5Z^2 + 3XY + XZ - 3YZ$. Show that Q is a multiple of a chi-squared variate and find its median value. Find also the values of a and b such that $T = X + aY + bZ$ is distributed independently of Q. Deduce the value of k if $P(Q < kT^2) = 0.05$. [12]
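A quadratic form $Y^TAY$ in standard Normal variables is c times a chi-squared variate when $A^2 = cA$, with degrees of freedom $\mathrm{tr}(A)/c$, and a linear form $t^TY$ is independent of it when $At = 0$. The sketch below checks these conditions for the matrix of Q using exact arithmetic. The candidate vector `t` was obtained by solving $At = 0$ by hand and should be treated as something to verify rather than a quoted result.

```python
from fractions import Fraction
from math import log

half = Fraction(1, 2)
# Symmetric matrix of Q = 5X^2 + Y^2 + 5Z^2 + 3XY + XZ - 3YZ
# (each cross-product coefficient is halved off the diagonal).
A = [[Fraction(5), 3 * half, half],
     [3 * half, Fraction(1), -3 * half],
     [half, -3 * half, Fraction(5)]]

def matmul(M, N):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

A2 = matmul(A, A)
c = A2[0][0] / A[0][0]                      # candidate multiplier
is_scaled_idempotent = all(A2[i][j] == c * A[i][j]
                           for i in range(3) for j in range(3))
df = sum(A[i][i] for i in range(3)) / c     # trace(A) / c

# The median of chi-squared on 2 df is 2*ln(2); this line relies on df == 2.
median_Q = float(c) * 2 * log(2)

# Candidate coefficients (1, a, b) of T; independence requires A t = 0.
t = [1, -3, -1]
A_t = [sum(A[i][j] * t[j] for j in range(3)) for i in range(3)]

print(c, df, round(median_Q, 3), A_t)
```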

4 In an economic study of the level of a scaled indicator for the years between 1999 and 2007 (both inclusive) the model used relates the expected value to x = (year – 2003) as follows:

$E[Y_x]$ = { α + γ… for x ≥ … ; α + β… + γ… for x = −1, 0, +1 } [piecewise definition only partly legible in the source]

Errors were assumed to be uncorrelated and Normally distributed of constant variance $\sigma^2 = 1$.
(a) Briefly describe this model in terms of what it would represent on a plot of Y against x. Formulate the model in matrix terms and deduce the matrix $X^TX$. Suggest a reason why x was taken to be (year – 2003) rather than any other quantity. [7]

Section B

6 Two symmetric matrices A and B satisfy the inter-relationships $A^2 = 5A$, $B^2 = 2B$, $AB = 5B$. The trace of A is 15 and that of B is 4.

(a) Show that $C = \tfrac{1}{5}A - \tfrac{1}{2}B$ is an idempotent matrix and write down its trace. [3]

(b) The random vector Y has the standard multivariate Normal distribution and quadratic forms $Q_1$, $Q_2$ and $Q_3$ are defined by $Q_1 = Y^TAY$, $Q_2 = Y^TBY$, $Q_3 = Y^TCY$. Show that all three are related to chi-squared distributions. [4]

(c) Verify that $Q_3$ and $Q_2$ are independently distributed, but that this does not apply to the other two pairs. [3]

(d) Find the values of c and d such that $P(Q_3 > cQ_2) = P(Q_1 > dQ_2) = 0.05$. [8]
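The relationships in this question can be sanity-checked on a concrete example. The sketch below uses one hypothetical pair of diagonal matrices that happens to satisfy all the stated conditions; it is not taken from the exam and does not replace the general algebraic argument the question asks for.

```python
from fractions import Fraction

def diag(entries):
    """Build a diagonal matrix (list of rows) with exact rational entries."""
    n = len(entries)
    return [[Fraction(entries[i]) if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]

def matmul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(k, M):
    return [[k * v for v in row] for row in M]

def subtract(M, N):
    return [[a - b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# HYPOTHETICAL instance: A^2 = 5A, B^2 = 2B, AB = 5B, tr(A) = 15, tr(B) = 4.
A = diag([5, 5, 5, 0])
B = diag([2, 2, 0, 0])
zero = diag([0, 0, 0, 0])

conditions_hold = (matmul(A, A) == scale(5, A)
                   and matmul(B, B) == scale(2, B)
                   and matmul(A, B) == scale(5, B)
                   and trace(A) == 15 and trace(B) == 4)

C = subtract(scale(Fraction(1, 5), A), scale(Fraction(1, 2), B))
c_is_idempotent = matmul(C, C) == C

# Quadratic forms in standard Normal Y are independent when the matrix
# product is zero: true for (C, B), false for (A, B) and (A, C).
cb_is_zero = matmul(C, B) == zero
ab_is_zero = matmul(A, B) == zero
ac_is_zero = matmul(A, C) == zero

print(conditions_hold, c_is_idempotent, trace(C), cb_is_zero, ab_is_zero, ac_is_zero)
```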

7 State carefully the Gauss-Markov Theorem. [2]

Six uncorrelated observations each have the same variance. Their respective expectations are α+6β, 2α+5β, 3α+4β, 4α+3β, 5α+2β and 6α+β.

(a) Show that the best linear unbiased estimator of α+β is a multiple of the simple average of the six observations. [5]

(b) Find the correlation between the ordinary least squares estimators of the two parameters. [4]

(c) What is the efficiency of $(Y_6 - Y_1)/5$ as a linear unbiased estimator of α–β? [4]
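The quantities in this question all follow from the design matrix and $(X^TX)^{-1}$. The sketch below builds X from the six stated expectations and computes the weights of the best linear unbiased estimator, the correlation, and the efficiency in exact arithmetic. Variances are expressed in units of the common variance σ², and the helper name `blue_variance` is made up for illustration.

```python
from fractions import Fraction
from math import sqrt

# Row i is (i, 7 - i), so that E[Y_i] = i*alpha + (7 - i)*beta.
X = [[i, 7 - i] for i in range(1, 7)]

XtX = [[sum(r[a] * r[b] for r in X) for b in range(2)] for a in range(2)]
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
XtX_inv = [[Fraction(XtX[1][1], det), Fraction(-XtX[0][1], det)],
           [Fraction(-XtX[1][0], det), Fraction(XtX[0][0], det)]]

def blue_variance(c):
    """Variance of the BLUE of c^T (alpha, beta), in units of sigma^2."""
    return sum(c[a] * XtX_inv[a][b] * c[b] for a in range(2) for b in range(2))

# (a) Weights applied to Y_1..Y_6 by the BLUE of alpha + beta: X (X^T X)^{-1} c.
c_sum = [1, 1]
v = [sum(XtX_inv[a][b] * c_sum[b] for b in range(2)) for a in range(2)]
weights = [r[0] * v[0] + r[1] * v[1] for r in X]

# (b) Correlation between the two least squares estimators.
corr = float(XtX_inv[0][1]) / sqrt(float(XtX_inv[0][0] * XtX_inv[1][1]))

# (c) Efficiency = variance of the BLUE / variance of the competing estimator.
var_competitor = Fraction(2, 25)          # Var[(Y6 - Y1)/5] in units of sigma^2
efficiency = blue_variance([1, -1]) / var_competitor

print(XtX, weights, round(corr, 4), efficiency)
```

Equal weights in part (a) mean the estimator is a constant times the sample mean, which is the structure the question asks to be shown.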

8 In the context of Question 4 in Section A, obtain a joint 90% confidence region for α and β. Would you believe that α = 65 and β = 7? [8]

9 In a study investigating a new method of measuring body composition, percentage body fat, age and sex were recorded for 18 normal adults aged between 23 and 61 years. A linear relationship between body fat and age was fitted, with different slope and intercept values for males and females. The accompanying data sheet contains some results along with a plot and a table that gives the values of:

age, fat: as recorded in the investigation
m, f: m = 1 for male, f = 1 for female
mage, fage: products m⋅age, f⋅age respectively
$\hat{\varepsilon}_i$: ordinary residuals
$\hat{e}_i$: standardised residuals
$\varepsilon_i^*$: Studentized residuals
$h_{ii}$: leverages
$\hat{y}_i$: fitted values

(a) Describe the model fitted. Specify the usual assumptions underlying an analysis such as this. [3]

(b) Give the best linear unbiased estimate of the expected percentage body fat reading for a 23-year old male. [1]

(c) Two quantities have been highlighted towards the bottom left of the printout. How are these two quantities related to each other? [2]

(d) Estimate the difference between the two slope coefficients and its standard error. Would you say that the slopes are different? [4]

(e) Describe briefly how the results in the $\hat{\varepsilon}_i$, $\hat{e}_i$ and $\varepsilon_i^*$ columns have been calculated. [4]

(f) Explain how the value $h_{22} = 0.629831$ was calculated. [2]

(g) The MINITAB analysis highlights two ‘unusual’ observations. Comment briefly on these, referring to the tables and/or plot as appropriate. [3]

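Formulae:

Standardized: $\hat{e}_i = \hat{\varepsilon}_i / (\hat{\sigma}\sqrt{1 - h_{ii}})$

Studentized: $\varepsilon_i^* = \hat{\varepsilon}_i / (\hat{\sigma}_{[i]}\sqrt{1 - h_{ii}})$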
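For readers unfamiliar with these diagnostics, the sketch below computes leverages, standardized residuals and Studentized (deletion) residuals under the usual textbook definitions, which are assumed here rather than taken from the exam's data sheet. It fits a simple straight line to a small hypothetical data set, whereas the model in Question 9 has separate intercepts and slopes for males and females; the function name and the data are invented for illustration.

```python
from math import sqrt

def residual_diagnostics(x, y):
    """Leverages and residuals for a straight-line fit y = a + b*x."""
    n, p = len(x), 2                      # p = number of fitted parameters
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar

    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]   # diagonal of hat matrix
    rss = sum(r * r for r in resid)
    s2 = rss / (n - p)                    # residual mean square

    standardized = [r / sqrt(s2 * (1 - h)) for r, h in zip(resid, lev)]
    # Variance estimate with observation i deleted:
    # s2_[i] = (rss - r_i^2 / (1 - h_ii)) / (n - p - 1)
    studentized = [r / sqrt((rss - r * r / (1 - h)) / (n - p - 1) * (1 - h))
                   for r, h in zip(resid, lev)]
    return resid, lev, standardized, studentized

# HYPOTHETICAL ages and body-fat percentages, not the exam's data.
ages = [23, 27, 39, 41, 50, 58]
fat = [9.5, 17.8, 31.4, 25.9, 31.1, 33.0]

resid, lev, standardized, studentized = residual_diagnostics(ages, fat)
for row in zip(ages, resid, lev, standardized, studentized):
    print(" ".join(f"{v:8.3f}" for v in row))
```

The leverages sum to the number of fitted parameters, which gives a quick check on any printed leverage column.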