Advanced Statistical Methodology (STAT 526) Midterm Exam Solutions, Exams of Applied Statistics

Purdue University Applied Statistics

Solutions to the midterm exam for the advanced statistical methodology course (stat 526). It covers topics such as regression analysis, goodness-of-fit tests, and odds ratios. The solutions demonstrate the application of statistical concepts and techniques to real-world problems. Valuable for students studying statistics and data analysis.

Typology: Exams

2018/2019

Uploaded on 02/13/2025

jasmine-liu 🇺🇸


ADVANCED STATISTICAL METHODOLOGY (STAT 526)

Spring 2019

MIDTERM EXAM (BRNG 2290)

8:00-10:00PM, Wednesday, February 27, 2019

The exam has 32 points in total. Students who score 30 points or more will receive 30 points. Please write down your name and student ID number below.

NAME:

ID:

> summary(Midterm)

Min. 1st Qu. Median Mean 3rd Qu. Max.

17.00 23.75 29.25 26.88 30.00 30.00

> sort(Midterm)

[1] 17.0 20.0 23.0 23.0 24.0 26.5 28.0 29.0 29.5 30.0

[11] 30.0 30.0 30.0 30.0 30.0 30.0


(10 points). The data set reports exam results for primary school students. It contains counts of pass/fail by the students' weekly studying hours (hours) and three studying methods (method, coded 1, 2, and 3). The studying hours are partitioned into intervals, and the center value of each interval is used in fitting the models. The R output is given below.

> summary(mod.main)
glm(formula=cbind(pass,fail)~hours+factor(method),family=binomial,data=exam)
Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)      -3.13325    0.27940 -11.214 < 2e-
hours             0.22174    0.01522  14.566 < 2e-
factor(method)2   0.81913    0.22643   3.618 0.
factor(method)3   1.21552    0.23671   5.135 2.82e-
Null deviance: 397.538 on 17 degrees of freedom
Residual deviance: 11.465 on 14 degrees of freedom

> summary(mod.int)
glm(formula=cbind(pass,fail)~hours*factor(method),family=binomial,data=exam)
Coefficients:
                       Estimate Std. Error z value Pr(>|z|)
(Intercept)            -3.06060    0.40367  -7.582 3.4e-
hours                   0.21693    0.02458   8.827 < 2e-
factor(method)2         0.34967    0.56810   0.616 0.
factor(method)3         1.40095    0.52562   2.665 0.
hours:factor(method)2   0.03725    0.03896   0.956 0.
hours:factor(method)3  -0.01765    0.03531  -0.500 0.
Null deviance: 397.5383 on 17 degrees of freedom
Residual deviance: 9.4339 on 12 degrees of freedom

> round(summary(mod.main)$cov.unscaled,6)
                (Intercept)     hours factor(method)2 factor(method)
(Intercept)        0.078063 -0.003499       -0.034658 -0.
hours             -0.003499  0.000232        0.000624 0.
factor(method)2   -0.034658  0.000624        0.051269 0.
factor(method)3   -0.038451  0.000875        0.027593 0.

> round(qchisq(0.95,1:20),2)
 [1]  3.84  5.99  7.81  9.49 11.07 12.59 14.07 15.51 16.92 18.
[11] 19.68 21.03 22.36 23.68 25.00 26.30 27.59 28.87 30.14 31.

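(e) (2 points). Provide the 95% confidence interval for hours under the first method if one wants a 90% passing probability.

Solution: For method 1, setting $\log 9 = \hat\beta_0 + \hat\beta_1 x$ gives

$$\hat{x} = \frac{\log 9 - \hat\beta_0}{\hat\beta_1} = \frac{\log 9 + 3.13325}{0.22174} = 24.0393.$$

By the delta method with

$$\left(\frac{\partial \hat{x}}{\partial \hat\beta_0},\ \frac{\partial \hat{x}}{\partial \hat\beta_1}\right) = \left(-\frac{1}{0.22174},\ -\frac{\log 9 + 3.13325}{0.22174^2}\right) = (-4.5098,\ -108.41),$$

we have

$$\hat\sigma^2_{\hat{x}} = (-4.5098,\ -108.41) \begin{pmatrix} 0.078063 & -0.003499 \\ -0.003499 & 0.000232 \end{pmatrix} \begin{pmatrix} -4.5098 \\ -108.41 \end{pmatrix} \approx 0.893.$$

Thus, the 95% confidence interval for hours is

$$24.0393 \pm 1.96\sqrt{0.893} \approx [22.19,\ 25.89].$$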
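As a cross-check of part (e), the following is a minimal Python sketch of the delta-method interval. It assumes the coefficient estimates and the (Intercept)/hours covariance entries printed in the R output, and it differentiates $\hat{x} = (\log 9 - \hat\beta_0)/\hat\beta_1$ analytically, so the gradient is $(-1/\hat\beta_1,\ -\hat{x}/\hat\beta_1)$. The function and variable names are illustrative and are not part of the exam.

```python
import math

# Estimates and covariance entries copied from summary(mod.main) above.
b0, b1 = -3.13325, 0.22174
cov = [[0.078063, -0.003499],
       [-0.003499, 0.000232]]


def hours_for_probability(p, b0, b1):
    """Solve logit(p) = b0 + b1 * x for x (method 1 is the baseline level)."""
    return (math.log(p / (1 - p)) - b0) / b1


def delta_method_ci(p, b0, b1, cov, z=1.96):
    """Return (x_hat, variance, (lower, upper)) using the delta method."""
    x_hat = hours_for_probability(p, b0, b1)
    # Gradient of x_hat with respect to (b0, b1).
    grad = [-1.0 / b1, -x_hat / b1]
    # Quadratic form grad' * cov * grad.
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(2) for j in range(2))
    half_width = z * math.sqrt(var)
    return x_hat, var, (x_hat - half_width, x_hat + half_width)


x_hat, var, ci = delta_method_ci(0.9, b0, b1, cov)
print(round(x_hat, 4), round(var, 3), [round(v, 2) for v in ci])
```

Because the covariance entries are rounded to six decimals, the variance from this sketch is only accurate to about two significant digits.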

(8 points). The following table reports the relationship between education (educ) and religious beliefs (belief).

                         Religious Beliefs
Education Degree   Fundamentalist  Moderate  Liberal  Total
High School                   178       138      101    417
College                       570       648      442   1660
Bachelor Degree               145       252      252    649
Total                         893      1038      795   2726

> summary(mod.main)
Call: glm(formula=yy~factor(educ)+factor(belief),family=poisson,data=Religion)
Deviance Residuals:
      1       2       3       4       5       6       7       8       9
 3.3824  1.1150 -4.9219 -1.6875  0.6302  0.3091 -1.9260 -1.9429 4.

> summary(mod.ll)
Call: glm(formula=yy~factor(educ)+factor(belief)+educ:belief, family=poisson,data=Religion)
Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)       4.89267    0.05490  89.123  < 2e-16 ***
factor(educ)2     0.81846    0.08970   9.124  < 2e-16 ***
factor(educ)3    -0.73975    0.16889  -4.380 1.19e-05 ***
factor(belief)2  -0.46604    0.09237  -5.045 4.53e-07 ***
factor(belief)3  -1.38392    0.17656  -7.838 4.56e-15 ***
educ:belief       0.30336    0.04049   7.493 6.76e-14 ***
Null deviance: 1013.4427 on 8 degrees of freedom
Residual deviance: 8.7621 on 3 degrees of freedom

> summary(mod.row)
Call: glm(formula=yy~factor(educ)+factor(belief)+factor(educ):belief, family=poisson,data=Religion)
Coefficients: (1 not defined because of singularities)
                      Estimate Std. Error z value Pr(>|z|)
(Intercept)            5.70452    0.13550  42.100  < 2e-16 ***
factor(educ)2          1.04760    0.14159   7.399 1.37e-13 ***
factor(educ)3         -0.70291    0.17244  -4.076 4.58e-05 ***
factor(belief)2        0.49840    0.06718   7.419 1.18e-13 ***
factor(belief)3        0.54221    0.10205   5.313 1.08e-07 ***
factor(educ)1:belief  -0.57554    0.08198  -7.020 2.21e-12 ***
factor(educ)2:belief  -0.39688    0.06008  -6.605 3.96e-11 ***
factor(educ)3:belief        NA         NA      NA       NA
Null deviance: 1013.4427 on 8 degrees of freedom
Residual deviance: 4.2737 on 2 degrees of freedom

(a) (2 points). Provide a goodness-of-fit test to assess whether the main effects model fits the data.

Solution: Summing the squared deviance residuals in the output, we obtain

$$G^2 = 3.3824^2 + 1.1150^2 + (-4.9219)^2 + (-1.6875)^2 + 0.6302^2 + 0.3091^2 + (-1.9260)^2 + (-1.9429)^2 + 4.3372^2 = 66.54.$$

Since $G^2 > \chi^2_{0.05,4} = 9.49$, we conclude that the main effects model does not fit the data.

(b) (2 points). State the linear-by-linear association model. Provide two tests to assess the significance of the linear-by-linear association term.

Solution: The linear-by-linear association model is

$$\log \lambda_{ij} = \mu + \alpha_i + \beta_j + \gamma (u_i v_j),$$

where $\lambda_{ij} = E(y_{ij})$, $\alpha_i$ with $\alpha_1 = 0$ represents the main effects of educ, $\beta_j$ with $\beta_1 = 0$ represents the main effects of belief, $\gamma$ is the coefficient of the linear-by-linear association term, and $u_i$ and $v_j$ are the score values of educ and belief. In this output, we have $u_i = i$ and $v_j = j$. We can use the Wald test and the likelihood ratio test. The Wald statistic is 7.493 and its p-value is $6.76 \times 10^{-14}$. Thus, we conclude that the linear-by-linear association term is significant. The likelihood ratio statistic is $66.54 - 8.76 = 57.78 > \chi^2_{0.05,1} = 3.84$. It also leads to the conclusion that the linear-by-linear association term is significant.

(c) (2 points). State the null hypothesis in the test between the linear-by-linear association model and the row effects model. Provide a test statistic to assess whether the row effects model can be reduced to the linear-by-linear association model.

Solution: The row effects model is

$$\log \lambda_{ij} = \mu + \alpha_i + \beta_j + \gamma_i v_j,$$

where $\lambda_{ij} = E(y_{ij})$, $\alpha_i$ with $\alpha_1 = 0$ represents row main effects, $\beta_j$ with $\beta_1 = 0$ represents column main effects, $\gamma_i$ with $\gamma_3 = 0$ represents row effects in the interaction, and $v_j = j$ are scores of
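The goodness-of-fit statistic in part (a) and the likelihood ratio comparison in part (b) are arithmetic on the R output, so they can be checked with a short Python sketch. It assumes the nine deviance residuals of mod.main (the ninth, 4.3372, is taken from the worked solution because the printed output is cut off) and the residual deviances of mod.ll and mod.row. The last comparison, between mod.ll and mod.row, is this sketch's own reading of the output; the preview ends before the solution to part (c). All variable names are illustrative.

```python
# Deviance residuals of the main effects model (mod.main).
residuals = [3.3824, 1.1150, -4.9219, -1.6875, 0.6302,
             0.3091, -1.9260, -1.9429, 4.3372]

# Upper 5% chi-square critical values, from round(qchisq(0.95, 1:20), 2).
CHI2_95 = {1: 3.84, 4: 9.49}

# Residual deviances reported by summary(mod.ll) and summary(mod.row).
dev_ll, df_ll = 8.7621, 3
dev_row, df_row = 4.2737, 2

# (a) The residual deviance is the sum of squared deviance residuals.
g2_main = sum(r * r for r in residuals)
df_main = (3 - 1) * (3 - 1)  # independence model in a 3 x 3 table
main_fits = g2_main <= CHI2_95[df_main]

# (b) Likelihood ratio test for the linear-by-linear term (1 df).
lr_ll = g2_main - dev_ll
ll_term_significant = lr_ll > CHI2_95[df_main - df_ll]

# Row effects model versus linear-by-linear model (1 df).
lr_row = dev_ll - dev_row
row_reduces_to_ll = lr_row <= CHI2_95[df_ll - df_row]

print(round(g2_main, 2), main_fits)
print(round(lr_ll, 2), ll_term_significant)
print(round(lr_row, 4), row_reduces_to_ll)
```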

(d) (2 points). Provide the 95% confidence interval for the probability of being killed when conc = 0.6 in the model with overdispersion.

Solution: The predicted value of the linear term is $\hat\eta = -1.5655 + 3.2791 \times 0.6 = 0.40196$. The variance is

$$\hat\phi\, (1,\ 0.6) \begin{pmatrix} 0.015926 & -0.030085 \\ \cdots & \cdots \end{pmatrix} \begin{pmatrix} 1 \\ 0.6 \end{pmatrix} = 0.02553.$$

The 95% confidence interval for $\eta$ is $0.40196 \pm 1.96\sqrt{0.02553} = [0.08879,\ 0.71513]$. The 95% confidence interval for the probability is

$$\left[\frac{e^{0.08879}}{1+e^{0.08879}},\ \frac{e^{0.71513}}{1+e^{0.71513}}\right] = [0.5222,\ 0.6715].$$
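The interval in part (d) is built on the logit scale and then mapped through the inverse logit. A minimal Python sketch of those two steps follows. It assumes the stated variance 0.02553 already includes the dispersion estimate, since the dispersion value itself is not visible in the preview. The helper name `expit` is illustrative.

```python
import math


def expit(eta):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Linear predictor at conc = 0.6, using the coefficients quoted in the solution.
eta_hat = -1.5655 + 3.2791 * 0.6

# Variance of eta_hat as stated in the solution (dispersion included).
var_eta = 0.02553

half_width = 1.96 * math.sqrt(var_eta)
eta_ci = (eta_hat - half_width, eta_hat + half_width)

# The inverse logit is increasing, so the endpoints map to the endpoints.
prob_ci = tuple(expit(e) for e in eta_ci)

print([round(e, 5) for e in eta_ci])
print([round(p, 4) for p in prob_ci])
```

Transforming the endpoints keeps the interval inside (0, 1), which a symmetric interval built directly on the probability scale would not guarantee.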

(6 points). The data report feelings about life (low, medium, and high) by income level (xx) (1 = low, 5 = high). The R output is given below.

> g <- multinom(yy~factor(xx),weight=freq)
> g$dev
[1] 441.
> g$edf
[1] 10
> g1 <- multinom(yy~xx,weight=freq)
> g1$dev
[1] 444.
> g1$edf
[1] 4
> summary(g1)$coefficient
       (Intercept) xx
Median  -0.1973812 0.
High    -0.3598186 0.
> g2 <- polr(yy~xx,weight=freq)
> g2$dev
[1] 445.
> g2$edf
[1] 3
> summary(g2)$coefficient
Re-fitting to get Hessian
                 Value Std. Error t value
xx           0.2171542  0.1032490 2.
Low|Median  -0.5685649  0.3895745 -1.
Median|High  0.9913447  0.3932452 2.

(a) (2 points). Write down the model assumptions of the second and the third models in the output.

Solution: Let $\pi_1(x)$, $\pi_2(x)$, and $\pi_3(x)$ be the probabilities of the low, medium, and high feeling levels at income level $x$. The assumption of the second model is

$$\log \frac{\pi_j(x)}{\pi_1(x)} = \beta_{0j} + \beta_{1j} x, \quad j = 2, 3.$$

The assumption of the third model is

$$\log \frac{\sum_{k=1}^{j} \pi_k(x)}{1 - \sum_{k=1}^{j} \pi_k(x)} = \beta_{0j} - \beta_1 x, \quad j = 1, 2.$$

(b) (2 points). Provide a goodness-of-fit test about whether the multinomial and the proportional odds models fit the data if income levels are treated as their score values.

Solution: The residual deviance of the second model is $G^2 = 444.8235 - 441.7743 = 3.0492 < \chi^2_{0.05,6} = 12.59$, with $10 - 4 = 6$ degrees of freedom. Therefore, the second model fits the data. The residual deviance of the third model is $G^2 = 445.2111 - 441.7743 = 3.4368 < \chi^2_{0.05,7} = 14.07$, with $10 - 3 = 7$ degrees of freedom. Therefore, the proportional odds model also fits the data.

(c) (2 points). Predict the probability in the multinomial model and the proportional odds model, respectively, if the income level is 5.

Solution: For the second model,

$$\hat\eta_2 = -0.19738 + 5(0.20261) = 0.81567, \quad \hat\eta_3 = -0.35982 + 5(0.32062) = 1.24328.$$

Then $\hat\pi_2 = e^{0.81567}\hat\pi_1$, $\hat\pi_3 = e^{1.24328}\hat\pi_1$, and $\hat\pi_1 + \hat\pi_2 + \hat\pi_3 = 1$, implying that $\hat\pi_1 = 0.1486$, $\hat\pi_2 = 0.3360$, and $\hat\pi_3 = 0.5153$. For the third model,

$$\hat\pi_1 = \frac{e^{-0.56856 - 5(0.21715)}}{1 + e^{-0.56856 - 5(0.21715)}} = 0.1605,$$

$$\hat\pi_3 = \frac{1}{1 + e^{0.99134 - 5(0.21715)}} = 0.5236,$$

$$\hat\pi_2 = 1 - \hat\pi_1 - \hat\pi_3 = 0.3159.$$
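The predictions in part (c) can be reproduced with a short Python sketch. It assumes the baseline-category logit form for the multinomial model and the cumulative logit form $\text{logit}\,P(Y \le j) = \beta_{0j} - \beta_1 x$ for the proportional odds model. The coefficients are the rounded values used in the solution (the slopes of the multinomial model are cut off in the printed output). The function names are illustrative.

```python
import math


def baseline_logit_probs(etas):
    """Category probabilities from log(pi_j / pi_1) = eta_j, j = 2, 3, ..."""
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [1.0 / denom] + [math.exp(e) / denom for e in etas]


def proportional_odds_probs(cutpoints, beta, x):
    """Category probabilities from logit P(Y <= j) = cutpoint_j - beta * x."""
    cumulative = [1.0 / (1.0 + math.exp(-(c - beta * x))) for c in cutpoints]
    cumulative.append(1.0)  # P(Y <= last category) is 1
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


x = 5  # income level

# Multinomial model: linear predictors for the medium and high categories.
etas = [-0.19738 + x * 0.20261, -0.35982 + x * 0.32062]
multinom_probs = baseline_logit_probs(etas)

# Proportional odds model: cutpoints Low|Median and Median|High, slope for xx.
polr_probs = proportional_odds_probs([-0.56856, 0.99134], 0.21715, x)

print([round(p, 4) for p in multinom_probs])
print([round(p, 4) for p in polr_probs])
```

Both models give probabilities that sum to one by construction, so the middle category in the proportional odds model can be obtained either as a difference of cumulative probabilities, as here, or as one minus the other two.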
