Qualifying Exam with Solutions - Statistical Methods | STAT 51100, Exams of Data Analysis & Statistical Methods

Material Type: Exam; Professor: Zhang; Class: Statistical Methods; Subject: STAT-Statistics; University: Purdue University - Main Campus; Term: Spring 2008;

QUALIFYING EXAM SOLUTIONS
Statistical Methods
Saturday, Jan 2, 2008, 8:00 am - 12:00 pm

1. (a) Plot is omitted. The plot shows strong evidence of an interaction effect.

(b) The ANOVA model is

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}; \quad i = 1, 2, 3, \; j = 1, 2, \; k = 1, \dots, 11.$$

Note that

$$
\begin{aligned}
SST &= \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{11} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 \\
&= 22 \sum_{i=1}^{3} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2
 + 33 \sum_{j=1}^{2} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2
 + 11 \sum_{i=1}^{3} \sum_{j=1}^{2} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 \\
&\quad + \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{11} (y_{ijk} - \bar{y}_{ij\cdot})^2 \\
&= SSA + SSG + SSAG + SSE.
\end{aligned}
$$

Thus, we can compute SSA, SSG and SSAG from the table, but we cannot compute SSE. In this table, we only need

$$SSG = 33\left[(8.94 - 11.70)^2 + (14.45 - 11.70)^2\right] \approx 66\,(8.94 - 11.70)^2 = 502.76,$$

where the two squared deviations agree up to rounding of the reported means. Then, we have the table

Source    df    SS         MS        F
G          1    502.76     502.76    14.69
A          2    3286.39    1643.2    48.02
G*A        2    491.30     245.65     7.18
Error     60    2053.49    34.22
Total     65    6333.94

(c) Since Age is a quantitative variable, we may need to check whether its effect is linear, for either the interaction or the main effect.

(d) The plot shows that the equal-variance assumption is violated. We need a Box-Cox transformation. The plot suggests that the square-root transformation is the most likely choice.
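
The arithmetic behind the ANOVA table in part (b) can be re-checked with a short script. This is only a sketch: the sums of squares and degrees of freedom are copied from the table, and the helper name `anova_rows` is made up for illustration. Because the table rounds the mean squares before dividing, the last digit of an F value may differ slightly from the script's unrounded result.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table in 1(b).
ss = {"G": 502.76, "A": 3286.39, "G*A": 491.30, "Error": 2053.49}
df = {"G": 1, "A": 2, "G*A": 2, "Error": 60}


def anova_rows(ss, df):
    """Return {source: (df, SS, MS, F)}; F is None for the error row."""
    mse = ss["Error"] / df["Error"]  # mean squared error
    rows = {}
    for source in ss:
        ms = ss[source] / df[source]
        f_stat = None if source == "Error" else ms / mse
        rows[source] = (df[source], ss[source], ms, f_stat)
    return rows


rows = anova_rows(ss, df)
for source, (d, s, ms, f_stat) in rows.items():
    f_text = "" if f_stat is None else f"{f_stat:.2f}"
    print(f"{source:<6}{d:>4}{s:>10.2f}{ms:>10.2f}{f_text:>8}")

# The total row is the sum of the component rows.
print(f"{'Total':<6}{sum(df.values()):>4}{sum(ss.values()):>10.2f}")
```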
2. Suppose we consider the simple linear regression model

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i,$$

where $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, which gives the estimators

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$. In order to obtain these, we need the following quantities:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) \qquad \text{and} \qquad \sum_{i=1}^{n} (Y_i - \bar{Y}).$$

However, $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$ cannot be recovered from the condensed data. Therefore, we have to use some other model. One option is the joint modeling method, which fits a weighted regression $y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ with $\epsilon_i \sim N(0, \sigma_i^2)$, where $i$ indexes the distinct values of $X_i$. We can model $\sigma_i^2$ with a Gamma GLM, derive the estimates of $\sigma_i^2$, and then fit the weighted model. The limitation of the condensed data is that this approach gives larger variance estimates.
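
The weighted-regression step can be sketched as follows. The exam's condensed data are not reproduced in this text, so every number below is hypothetical, and the function name `weighted_least_squares` is invented for illustration. As a simplification, the raw group variances are plugged in directly as the estimates of $\sigma_i^2$; in the joint modeling approach they would first be smoothed by the Gamma GLM.

```python
def weighted_least_squares(x, y, w):
    """Closed-form weighted simple linear regression of y on x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical condensed data (NOT from the exam): one entry per distinct X.
x_values = [1.0, 2.0, 3.0, 4.0]
group_means = [2.1, 3.9, 6.2, 7.8]
group_sizes = [5, 5, 5, 5]
group_vars = [0.5, 1.0, 2.0, 4.0]  # assumed estimates of sigma_i^2

# Weight each group mean by the inverse of its variance, n_i / sigma_i^2.
weights = [n / v for n, v in zip(group_sizes, group_vars)]
b0, b1 = weighted_least_squares(x_values, group_means, weights)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

With equal weights the function reduces to ordinary least squares on the group means, which makes it easy to see how much the variance model changes the fit.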

3. The study design is acceptable: it provides two counties with observations before and after the ban, respectively, and the data have been summarized in a 2 × 2 table. The best way to analyze a 2 × 2 table is through the odds ratio. Let $\theta$ be the odds ratio. For these data, we have

$$\hat{\theta} = \frac{17 \times 16}{5 \times 18} \approx 3.02$$

and

$$\sigma_{\log \hat{\theta}} = \left[\frac{1}{17} + \frac{1}{5} + \frac{1}{16} + \frac{1}{18}\right]^{1/2} = 0.6139.$$

The 95% confidence interval is

$$3.02\, e^{\pm 1.96 \times 0.6139} = [0.9067, 10.06],$$

which contains 1, so the odds ratio is not significantly different from 1. In addition, we can use two-sample binomial methods, in which we assume $X \sim \mathrm{bin}(22, p_1)$ for Monroe County and $Y \sim \mathrm{bin}(34, p_2)$ for Delaware County. Then, we have $\hat{p}_1 = 17/22 = 0.7727$, $\hat{V}(\hat{p}_1) = 0.007983$, $\hat{p}_2 = 18/34 = 0.5294$ and $\hat{V}(\hat{p}_2) = 0.007328$. When $p_1 = p_2 = p$, we have $\hat{p} = 35/56 = 0.625$. Then, the z-score is

$$z = \frac{0.7727 - 0.5294}{\sqrt{0.625\,(1 - 0.625)\,(1/22 + 1/34)}} \approx 1.84,$$

which is below 1.96 and therefore not significant at level 0.05. Thus, we conclude that the test is not significant. The method used in the problem tries to combine the conclusions of two tests. Since each of them may be in error with some probability, the total error rate could be higher than 0.05. This is the multiple testing problem. Thus, we doubt the conclusion stated in the problem. In addition, our standard method gives a different answer.
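
The two analyses in this solution can be reproduced numerically. The sketch below uses only the counts quoted in the text (17 of 22 for one county, 18 of 34 for the other); the function names `odds_ratio_ci` and `pooled_z` are made up for illustration. Because it carries the unrounded odds ratio through, the interval endpoints can differ from the hand calculation in the third significant digit.

```python
import math

# Counts as used in the solution: 17 of 22 and 18 of 34.
x1, n1 = 17, 22
x2, n2 = 18, 34


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Wald interval for the odds ratio of the 2x2 table [[a, b], [c, d]]."""
    theta = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(theta)
    lower = theta * math.exp(-z * se_log)
    upper = theta * math.exp(z * se_log)
    return theta, se_log, (lower, upper)


def pooled_z(x1, n1, x2, n2):
    """Two-sample z statistic for p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))


theta, se_log, (lower, upper) = odds_ratio_ci(x1, n1 - x1, x2, n2 - x2)
z_score = pooled_z(x1, n1, x2, n2)

print(f"odds ratio = {theta:.2f}, SE(log OR) = {se_log:.4f}")
print(f"95% CI = [{lower:.4f}, {upper:.2f}]")
print(f"pooled z = {z_score:.2f}")
```

Both checks point the same way: the interval covers 1 and the z statistic stays below 1.96.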

4. This means that, given SES, Boy Scout membership and Delinquent Behavior are independent. Marginally, however, they need not be independent. More precisely, this corresponds to the following Poisson log-linear model:

$$\log(\lambda_{ijk}) = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk},$$

where $i$, $j$ and $k$ represent Boy Scout, Delinquent Behavior and SES, respectively. This is the conditional independence model, not the independence model. It can produce marginal dependence, and this phenomenon is called Simpson's Paradox. For example, this could happen in the following data:

                       SES: Yes           SES: No
Delinquent             Boy Scout          Boy Scout
Behavior               Yes     No         Yes     No
Yes                    10      20         90      30
No                     20      40         30      10
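
A quick way to see what the example data demonstrate is to compare the odds ratio within each SES level to the odds ratio after collapsing over SES. The sketch below uses the counts from the example table; the dictionary layout and the helper name `odds_ratio` are illustrative choices, not part of the original solution.

```python
# Counts from the example table, keyed by (delinquent, boy_scout) per SES level.
table = {
    "SES yes": {("yes", "yes"): 10, ("yes", "no"): 20,
                ("no", "yes"): 20, ("no", "no"): 40},
    "SES no": {("yes", "yes"): 90, ("yes", "no"): 30,
               ("no", "yes"): 30, ("no", "no"): 10},
}


def odds_ratio(cells):
    """Odds ratio between Delinquent Behavior and Boy Scout for one 2x2 table."""
    return (cells[("yes", "yes")] * cells[("no", "no")]) / (
        cells[("yes", "no")] * cells[("no", "yes")]
    )


# Conditional association: one odds ratio per SES level.
conditional = {ses: odds_ratio(cells) for ses, cells in table.items()}

# Marginal association: add the two SES layers cell by cell, then recompute.
keys = table["SES yes"].keys()
marginal_cells = {key: sum(layer[key] for layer in table.values()) for key in keys}
marginal = odds_ratio(marginal_cells)

print("conditional odds ratios:", conditional)
print("marginal odds ratio:", marginal)
```

An odds ratio of 1 inside every SES level together with a marginal odds ratio different from 1 is exactly the pattern of conditional independence with marginal dependence described above.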