
Stat/For/Hort 572 — Midterm I, Spring 2006 — Solutions

1. (a) Consider the four model assumptions: correct model, independence, equal variance, and normal distribution. The straight-line relationship appears inadequate, and the equal variance assumption may not be satisfied.

(b) $H_0$: no lack of fit (LOF) versus $H_A$: LOF of the simple linear regression (SLR) model. From the R output, the observed $f = 9.0348$. Compared to the $F$ distribution with df = (12, 14), the p-value is less than 0.001. Thus reject $H_0$ at the 5% level; there is very strong evidence of a lack of fit of the SLR model.

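(c) From the R output, for $x^* = 18$, $\hat{y}_{est} = 6.76$ and s.e.$(\hat{y}_{est}) = 0.7084$. Since $t_{0.025,26} = 2.056$, the 95% confidence interval is $\hat{y}_{est} \pm t_{0.025,26} \times$ s.e.$(\hat{y}_{est})$, which is $6.76 \pm 1.46$ or $[5.30, 8.22]$.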

(d) $H_0$: no LOF versus $H_A$: LOF of the quadratic regression model. Since SS Pure Error is 18.601 on 14 df and SS Error is 126.64 on 25 df, SS LOF is $126.64 - 18.601 = 108.039$ on $25 - 14 = 11$ df. By the additional sum of squares principle, the observed $f = \frac{108.039/11}{18.601/14} = 7.39$. Compared to the $F$ distribution with df = (11, 14), the p-value is less than 0.001. Thus reject $H_0$ at the 5% level; there is very strong evidence of a lack of fit of the quadratic regression model.
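The lack-of-fit arithmetic in part (d) can be checked with a few lines of Python. This is only a sketch: the function name `lack_of_fit_f` is an illustrative choice, not something from the R output, and the p-value lookup is left out because it needs an F distribution table or a statistics library.

```python
def lack_of_fit_f(sse, df_error, ss_pure, df_pure):
    """Split SSE into pure error and lack of fit, then form the F ratio."""
    ss_lof = sse - ss_pure          # SS LOF = SS Error - SS Pure Error
    df_lof = df_error - df_pure     # df LOF = df Error - df Pure Error
    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
    return ss_lof, df_lof, f_stat


# Values quoted in part (d) for the quadratic regression model.
ss_lof, df_lof, f_stat = lack_of_fit_f(sse=126.64, df_error=25,
                                       ss_pure=18.601, df_pure=14)
print(round(ss_lof, 3), df_lof, round(f_stat, 2))
```

The observed ratio is then compared to the $F$ distribution on (df LOF, df Pure Error) degrees of freedom, as in the solution.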

2. (a) The parameter $b_3$ is the difference in slope between the two regression lines for farms A and B. From the R output, use either the observed $t = -0.608$ on 6 df or $f = 0.3694$ on (1, 6) df. The p-value is 0.5656. Do not reject $H_0: b_3 = 0$ at the 5% level; there is no evidence of a nonzero $b_3$.

(b) The parameter $b_0$ is the intercept of the regression line for farm A, which represents the expected weight gain for a zero level of diet supplement. Since $\hat{b}_0 = 1.888$, s.e.$(\hat{b}_0) = 0.3299$, and $t_{0.05,6} = 1.943$, the 90% confidence interval is $\hat{b}_0 \pm t_{0.05,6} \times$ s.e.$(\hat{b}_0)$, which is $1.89 \pm 0.64$ or $[1.25, 2.53]$.
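Both confidence intervals so far (the 95% interval in 1(c) and the 90% interval in 2(b)) have the form estimate ± critical value × standard error. A minimal Python sketch of that arithmetic follows. The helper name `t_interval` is hypothetical, and the critical values are the ones quoted in the solutions rather than computed.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


# 1(c): 95% interval, t_{0.025,26} = 2.056 as quoted in the solution.
lo_1c, hi_1c = t_interval(6.76, 0.7084, 2.056)

# 2(b): 90% interval, t_{0.05,6} = 1.943 as quoted in the solution.
lo_2b, hi_2b = t_interval(1.888, 0.3299, 1.943)

print(round(lo_1c, 2), round(hi_1c, 2))
print(round(lo_2b, 2), round(hi_2b, 2))
```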

(c) The test of interest is $H_0: [b_2 = b_3 = 0 \mid b_0, b_1]$ versus $H_A$: not $H_0$. The additional sum of squares is $10.04 + 0.0106 = 10.0506$ on 2 df, and the SSE of the full model is 0.1718 on 6 df. By the additional sum of squares principle, the observed $f = \frac{10.0506/2}{0.1718/6} = 175.51$. Compared to the $F$ distribution with df = (2, 6), the p-value is less than 0.001. Thus reject $H_0$ at the 5% level; there is very strong evidence that the two regression lines are not equal.

(d) The full model is $y = b_0 + b_1 w_1 + b_2 w_2 + e$, which has SSE $= 0.0106 + 0.1718 = 0.1824$ on 7 df. The reduced model is $y = b_0 + b_1 w_1 + e$; adding $w_2$ to it gives an additional sum of squares of 10.04 on 1 df. By the additional sum of squares principle, the observed $f = \frac{10.04/1}{0.1824/7} = 385.31$. Compared to the $F$ distribution with df = (1, 7), the p-value is less than 0.001. Thus reject $H_0$ at the 5% level; there is very strong evidence of a nonzero $b_2$. That is, although there is no evidence of a slope difference, there is strong evidence of an intercept difference between the two regression lines for farms A and B.
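Parts (c) and (d) apply the same additional sum of squares ratio with different full models. The sketch below reproduces both observed values from the sums of squares quoted above. The function name `extra_ss_f` is an assumption made for illustration, and no p-value is computed.

```python
def extra_ss_f(extra_ss, extra_df, sse_full, df_full):
    """F ratio: (additional SS / its df) over (full-model SSE / its df)."""
    return (extra_ss / extra_df) / (sse_full / df_full)


# 2(c): drop both farm terms; the full model keeps the interaction term.
f_2c = extra_ss_f(extra_ss=10.04 + 0.0106, extra_df=2,
                  sse_full=0.1718, df_full=6)

# 2(d): the interaction SS (0.0106) is pooled into the error term.
f_2d = extra_ss_f(extra_ss=10.04, extra_df=1,
                  sse_full=0.0106 + 0.1718, df_full=7)

print(round(f_2c, 2), round(f_2d, 2))
```

Pooling the non-significant interaction into the error in 2(d) raises the error df from 6 to 7, which is why the two tests use different denominators.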

3. (a) From the R output, the correlation between $y$ and each individual $x$ is the highest for $x_3$. Because $R^2$ in a simple linear regression equals the squared correlation, the third model $y = b_0 + b_1 x_3 + e$ has the largest $R^2$ and thus the smallest SSE. The df of SSE is the same for all three models. Thus the third model has the smallest MSE.
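The chain from correlation to MSE can be written out explicitly. This is a sketch under the assumption that each candidate model is a simple linear regression of $y$ on a single $x_j$ fitted to the same $n$ observations; SST (the total sum of squares of $y$) is notation introduced here, not taken from the R output.

```latex
R_j^2 = r_{y x_j}^2, \qquad
\mathrm{SSE}_j = \left(1 - R_j^2\right)\mathrm{SST}, \qquad
\mathrm{MSE}_j = \frac{\mathrm{SSE}_j}{n - 2}.
```

Since SST and $n - 2$ are common to all three models, the model with the largest squared correlation has the smallest MSE.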

(b) $H_0$: the third observation is not an outlier versus $H_A$: not $H_0$. Because of the way $x_4$ is coded, use the observed t-value 2.983 on 7 df. The p-value is 0.02043 for one comparison, and thus the experiment-wise p-value is $12 \times 0.02043 = 0.2452$. Do not reject $H_0$ at the 5% level; there is no evidence that the third observation is an outlier.
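Multiplying the single-comparison p-value by the number of comparisons is what is commonly called a Bonferroni adjustment. The sketch below repeats that step. The function name and the cap at 1 are illustrative additions, not part of the original solution.

```python
def bonferroni(p_value, n_comparisons):
    """Experiment-wise p-value: n * p, capped at 1."""
    return min(1.0, n_comparisons * p_value)


# Values from part (b): p = 0.02043 for one comparison, 12 comparisons.
p_adj = bonferroni(0.02043, 12)
reject = p_adj < 0.05  # compare with the 5% level

print(round(p_adj, 4), reject)
```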

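(c) According to the full model fit, the t-value for $b_1$ (1.885) is the smallest and is less than 2. Thus eliminating $x_1$ is the first step. Now fit the model with $x_2$ and $x_3$. Since the smallest t-value is 16.905 and is more than 2, stop. The model selected by backward elimination is

$y = b_0 + b_2 x_2 + b_3 x_3 + e$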
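The elimination rule used in part (c) can be sketched as a loop. This is an illustration only: `backward_eliminate` and `REPORTED` are hypothetical names, and the lookup table merely replays the two smallest t-values quoted in the solution instead of refitting any model. The solution does not say which term attains 16.905, so that entry is left as `None`.

```python
# Smallest t-value reported for each fitted model (from the solution text).
REPORTED = {
    ("x1", "x2", "x3"): ("x1", 1.885),
    ("x2", "x3"): (None, 16.905),  # term attaining 16.905 is not stated
}


def backward_eliminate(terms, weakest_term, threshold=2.0):
    """Drop the weakest term until the smallest t-value reaches the threshold."""
    terms = tuple(terms)
    while terms:
        name, smallest_t = weakest_term(terms)
        if smallest_t >= threshold:
            break  # every remaining term passes the cutoff, so stop
        terms = tuple(t for t in terms if t != name)
    return terms


selected = backward_eliminate(("x1", "x2", "x3"), REPORTED.__getitem__)
print(selected)
```

In a real analysis, `weakest_term` would refit the regression on the current terms and return the term with the smallest absolute t-value.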

Grade Distribution

100: 2
90-99: 15
80-89: 18
70-79: 17
60-69: 3
<60: 10

mean = 78, median = 80


