Qualifying Exam with Solutions - Statistical Methods | STAT 51100, Exams of Data Analysis & Statistical Methods

Purdue University Data Analysis & Statistical Methods

Prof. Tonglin Zhang

Material Type: Exam; Professor: Zhang; Class: Statistical Methods; Subject: STAT-Statistics; University: Purdue University - Main Campus; Term: Spring 2008;

Typology: Exams

Pre 2010

Uploaded on 07/30/2009

koofers-user-1mf-1 🇺🇸

QUALIFYING EXAM SOLUTIONS

Statistical Methods

Saturday, Jan 2, 2008, 8:00 am -12:00 pm

1. (a) The plot is omitted. It shows strong evidence of an interaction effect.

(b) The ANOVA model is

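$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}; \quad i = 1, 2, 3, \; j = 1, 2, \; k = 1, \dots, 11.$$

Note that

$$
\begin{aligned}
SST &= \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{11} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 \\
&= 22 \sum_{i=1}^{3} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 + 33 \sum_{j=1}^{2} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 + 11 \sum_{i=1}^{3} \sum_{j=1}^{2} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 \\
&\quad + \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{11} (y_{ijk} - \bar{y}_{ij\cdot})^2 \\
&= SSA + SSG + SSAG + SSE.
\end{aligned}
$$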

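Thus, we can compute SSA, SSG and SSAG from the table of means, but we cannot compute SSE from it. For this table, the only quantity we still need is

$$SSG = 33[(8.94 - 11.70)^2 + (14.45 - 11.70)^2] \approx 66(8.94 - 11.70)^2 = 502.76,$$

where the two squared deviations agree up to rounding of the reported means.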

Then, we have the table

Source   df   SS        MS        F
G         1    502.76    502.76   14.69
A         2   3286.39   1643.2    48.02
G*A       2    491.30    245.65    7.18
Error    60   2053.49     34.22
Total    65   6333.94
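The arithmetic behind the ANOVA table can be checked with a short script. This is only a sketch: the sums of squares and degrees of freedom are copied from the table in the solution, and the variable names are illustrative choices, not part of the exam.

```python
# Group means for the two levels of G and the grand mean, as quoted in the solution.
mean_g1, mean_g2, grand_mean = 8.94, 14.45, 11.70
n_per_g_level = 33  # 3 levels of A times 11 replicates

# SSG as the solution computes it: 66 * (8.94 - 11.70)^2.
ssg = 2 * n_per_g_level * (mean_g1 - grand_mean) ** 2

# (df, SS) for each effect, copied from the solution's table.
effects = {"G": (1, 502.76), "A": (2, 3286.39), "G*A": (2, 491.30)}
df_error, ss_error = 60, 2053.49

mse = ss_error / df_error  # mean squared error

anova = {}
for source, (df, ss) in effects.items():
    ms = ss / df                            # mean square for the effect
    anova[source] = (df, ss, ms, ms / mse)  # last entry is the F statistic

ss_total = sum(ss for _, ss in effects.values()) + ss_error

for source, (df, ss, ms, f_stat) in anova.items():
    print(f"{source:5s} df={df} SS={ss:.2f} MS={ms:.2f} F={f_stat:.2f}")
print(f"Error df={df_error} SS={ss_error:.2f} MS={mse:.2f}")
print(f"Total SS={ss_total:.2f}")
```

Because the script divides unrounded mean squares, an F statistic may differ from the tabulated value in the last printed digit.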

(c) Since Age is a quantitative variable, we may need to check whether its effect is linear, in either the interaction or the main effect.

(d) The plot shows that the equal-variance assumption is violated. We need a Box-Cox transformation. The plot suggests that the square-root transformation is the most likely choice.

2. Suppose we consider the simple linear regression model

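$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i,$$

where $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, which gives estimators as

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$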


and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$. In order to obtain those, we need the following quantities:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2, \quad \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}), \quad \text{and} \quad \sum_{i=1}^{n} (Y_i - \bar{Y}).$$

However, $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$ cannot be recovered from the condensed data. Therefore, we have to use some other model. One option is the joint modeling method, which fits a weighted regression $y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ with $\epsilon_i \sim N(0, \sigma_i^2)$, where $i$ indexes the distinct values of $X_i$. We can model $\sigma_i^2$ with a Gamma GLM, derive the estimates of $\sigma_i^2$, and then fit the weighted model. The limitation of the condensed data is that it gives larger variance estimates.
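The weighted regression step can be sketched as follows. Everything in this example is an assumption made for illustration: the condensed data are hypothetical toy values (the exam data are not reproduced in this text), the variance estimates are plugged in directly rather than obtained from a Gamma GLM, and the weights are taken to be group size divided by estimated variance.

```python
def weighted_least_squares(x, y, w):
    """Closed-form weighted least squares fit of y = b0 + b1 * x."""
    total_weight = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_weight
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical condensed data: one row per distinct value of X.
x_values = [1.0, 2.0, 3.0, 4.0]
group_means = [3.0, 5.0, 7.0, 9.0]      # toy values lying exactly on y = 1 + 2x
group_sizes = [5, 5, 5, 5]
group_variances = [0.5, 1.0, 2.0, 4.0]  # stand-ins for the estimated sigma_i^2

# Weight each group mean by the inverse of its variance, sigma_i^2 / n_i.
weights = [n / v for n, v in zip(group_sizes, group_variances)]

b0, b1 = weighted_least_squares(x_values, group_means, weights)
print(b0, b1)
```

With unequal variances, the groups measured more precisely pull the fitted line harder, which is the point of weighting.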

3. The study design is OK, as it provides two counties with observations before and after the ban respectively, and the data have been summarized into a 2 × 2 table. The best way to analyze the 2 × 2 table is to use the odds ratio. Let $\theta$ be the odds ratio. For this particular data, we have

$$\hat{\theta} = \frac{17 \times 16}{5 \times 18} = 3.02$$

and

$$\sigma_{\log \hat{\theta}} = \left[ \frac{1}{17} + \frac{1}{16} + \frac{1}{5} + \frac{1}{18} \right]^{1/2} = 0.6139.$$

The 95% confidence interval is

$$3.02 \, e^{\pm 1.96 \times 0.6139} = [0.9067, 10.06],$$

which contains 1, so the odds ratio is not significantly different from 1. In addition, we can also use two-sample binomial methods, in which we assume $X \sim \mathrm{bin}(22, p_1)$ for Monroe County and $Y \sim \mathrm{bin}(34, p_2)$ for Delaware County. Then, we have $\hat{p}_1 = 17/22 = 0.7727$, $\hat{V}(\hat{p}_1) = 0.007983$, $\hat{p}_2 = 18/34 = 0.5294$ and $\hat{V}(\hat{p}_2) = 0.007328$. When $p_1 = p_2 = p$, we have $\hat{p} = 0.625$. Then, the z-score is

$$\frac{0.7727 - 0.5294}{\sqrt{0.625(1 - 0.625)(1/22 + 1/34)}} \approx 1.84 < 1.96,$$

which implies insignificance at level 0.05. Thus, we conclude that the difference is not significant. The method used in this problem tries to combine the conclusions of the two tests. Since each of them may be mistaken with some probability, the total error rate could be higher than 0.05. This is the multiple testing problem. Thus, we doubt the conclusion reached in this problem. In addition, our standard method gives a different answer.
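Both analyses above can be reproduced in a few lines. This is a minimal sketch using only the counts quoted in the solution (17 of 22 and 18 of 34); the variable names are illustrative, and the script carries the unrounded odds ratio through, so the interval endpoints differ slightly from those obtained with the rounded value 3.02.

```python
import math

# 2 x 2 counts implied by the solution.
a, b = 17, 5    # Monroe County: 17 of 22
c, d = 18, 16   # Delaware County: 18 of 34

# Odds ratio and the standard error of its logarithm.
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# 95% Wald confidence interval on the odds-ratio scale.
z_crit = 1.96
ci_low = odds_ratio * math.exp(-z_crit * se_log_or)
ci_high = odds_ratio * math.exp(z_crit * se_log_or)

# Two-sample binomial z-score with the pooled proportion.
n1, n2 = a + b, c + d
p1, p2 = a / n1, c / n2
p_pooled = (a + c) / (n1 + n2)
z_score = (p1 - p2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

print(f"odds ratio = {odds_ratio:.4f}, SE(log OR) = {se_log_or:.4f}")
print(f"95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
print(f"pooled p = {p_pooled:.3f}, z = {z_score:.3f}")
```

The interval covers 1 and the z-score falls below 1.96, so both approaches agree that the difference is not significant at level 0.05.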

4. This means that, given SES, Boy Scout membership and delinquent behavior are independent. However, marginally they are not. Precisely, this is described by the following Poisson model:

$$\log(\lambda_{ijk}) = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk},$$

where $i$, $j$ and $k$ represent Boy Scout, Delinquent Behavior and SES respectively. This is the conditional independence model, not the independence model. It may produce marginal dependence, and this phenomenon is called Simpson’s Paradox. For example, this could happen in the following data:

                 SES: Yes           SES: No
Delinquent       Boy Scout          Boy Scout
Behavior         Yes     No         Yes     No
Yes              10      20         90      30
No               20      40         30      10
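The example table can be checked numerically. The sketch below uses the counts from the table above and compares the odds ratio between Boy Scout membership and delinquent behavior within each SES level with the odds ratio after collapsing over SES. The dictionary layout and function name are illustrative choices.

```python
# counts[ses][delinquent] = (Boy Scout yes, Boy Scout no), from the example table.
counts = {
    "SES yes": {"delinquent yes": (10, 20), "delinquent no": (20, 40)},
    "SES no": {"delinquent yes": (90, 30), "delinquent no": (30, 10)},
}


def odds_ratio(table):
    """Odds ratio of a 2 x 2 table with rows 'delinquent yes' and 'delinquent no'."""
    a, b = table["delinquent yes"]
    c, d = table["delinquent no"]
    return (a * d) / (b * c)


# Conditional odds ratios: one per SES level.
conditional = {ses: odds_ratio(table) for ses, table in counts.items()}

# Marginal table: add the two SES levels cell by cell.
marginal_table = {
    row: tuple(sum(counts[ses][row][col] for ses in counts) for col in range(2))
    for row in ("delinquent yes", "delinquent no")
}
marginal = odds_ratio(marginal_table)

print(conditional)     # odds ratio within each SES level
print(marginal_table)  # collapsed 2 x 2 table
print(marginal)        # odds ratio ignoring SES
```

Both conditional odds ratios equal 1, while the collapsed table gives an odds ratio of 2, so the two variables are conditionally independent given SES but associated marginally.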
