Regression Analysis: Comparing Simple and Multiple Linear Regression Models, Exercises of Mathematical Statistics

Anna University Mathematical Statistics

This document presents the results of regression analyses for several models that use different independent variables (x1, x2, x3, x4, x5) to predict y. The analysis includes stem-and-leaf plots, regression equations, analysis of variance tables, and F-tests to determine the significance of each variable and to compare different models. The document also discusses the importance of looking at graphs of the data before analysis and the use of confidence intervals.

Typology: Exercises

2012/2013

Uploaded on 01/11/2013

m-alam 🇮🇳


Question #1:

(a) The least-squares estimate of the regression line when $Y$ is regressed on $X_1$ is

$$\hat{Y} = -70.42020 + 227.09370\,X_1.$$

Based on the computer output on pages 1 and 2, we have $R^2 = 0.9194$ and $r_{YX_1} = 0.95884$. Therefore, $r^2_{YX_1} = (0.95884)^2 = 0.9194 = R^2$. Yes, they are equal, as they should be, since in the simple linear regression model $R^2 = r^2_{YX_1}$.
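As a quick arithmetic check of the identity in part (a), the reported correlation can be squared and compared with the reported $R^2$. This is only a sketch using the two numbers quoted from the computer output; the variable names are illustrative.

```python
# Values quoted from the computer output for the regression of Y on X1.
r_yx1 = 0.95884            # sample correlation between Y and X1
r_squared_model = 0.9194   # R^2 reported for the simple regression

# In simple linear regression, R^2 equals the squared correlation.
r_yx1_squared = round(r_yx1 ** 2, 4)
print(r_yx1_squared, r_yx1_squared == r_squared_model)
```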

(b) The least-squares estimate of the regression line when $Y$ is regressed on $X_1$ and $X_2$ is

$$\hat{Y} = -8.08481 + 68.25068\,X_1 + 2.29387\,X_2.$$

Based on the computer output on pages 3 and 4, we have $R^2 = 0.9664$, $r_{YX_1} = 0.95884$, and $r_{YX_2} = 0.97907$. Therefore, $r^2_{YX_1} = (0.95884)^2 = 0.9194 \neq R^2$ and $r^2_{YX_2} = (0.97907)^2 = 0.9586 \neq R^2$. No, they are not equal, and they should not be, since in the multiple linear regression model $R^2 \neq r^2_{YX_1}$ and $R^2 \neq r^2_{YX_2}$.

We can use an F-test to compare the following two models:

Full Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + E$

Reduced Model: $Y = \beta_0 + \beta_1 X_1 + E$

The F-test (or the test statistic) is

$$TS = \frac{[SSR(\text{Full}) - SSR(\text{Reduced})]/1}{MS_{Res}(\text{Full})} = \frac{[673.20680 - 640.42489]/1}{23.37865/8} = 11.22.$$

Since $TS = 11.22 > F_{0.05,1,8} = 5.32$, we reject $H_0: \beta_2 = 0$. Therefore, adding $X_2$ to the model is useful for predicting $Y$. We can also use a t-test for testing $H_0: \beta_2 = 0$. Based on the computer output on page 3, the t-test statistic for testing $H_0: \beta_2 = 0$ is $TS = 3.35$, with p-value $= 0.0101 < \alpha = 0.05$; therefore, we reject $H_0$. (Note that $(3.35)^2 = 11.22$, the F-value above.)
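The partial F computation above can be sketched in a few lines. The helper name `partial_f` and its argument names are illustrative, not taken from the original output; the sums of squares, degrees of freedom, and critical value are the ones quoted in the text.

```python
def partial_f(ssr_full, ssr_reduced, num_tested, ssres_full, df_res_full):
    """Partial F statistic for comparing two nested regression models."""
    ms_res_full = ssres_full / df_res_full
    return ((ssr_full - ssr_reduced) / num_tested) / ms_res_full

# Full model: X1, X2. Reduced model: X1 only.
ts = partial_f(673.20680, 640.42489, 1, 23.37865, 8)
f_crit = 5.32  # F_{0.05,1,8}, as quoted in the text

print(round(ts, 2), ts > f_crit)
# With one tested coefficient, the squared t statistic matches the F statistic.
print(round(3.35 ** 2, 2))
```

Because only one coefficient is tested, the F-test and the two-sided t-test lead to the same decision.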

(c) The least-squares estimate of the regression line when $Y$ is regressed on $X_1$, $X_2$, and $X_3$, together with its $R^2$, is

$$\hat{Y} = -1.87932 + 77.32578\,X_1 + 1.55910\,X_2 - 23.90378\,X_3, \qquad R^2 = 0.9769.$$

Based on parts (a) to (c), we have


Number  Model         R^2     Adjusted R^2  MSRes
1       X1            0.9194  0.9104        6.
2       X1, X2        0.9664  0.9580        2.
3       X1, X2, X3    0.9769  0.9670        2.

If we use only $R^2$, we choose model 3, which has the largest $R^2$. (If we look at $MS_{Res}$, we also choose model 3, which has the smallest $MS_{Res}$. Note that

$$R^2_{Adj} = 1 - \frac{n-1}{SST}\,MS_{Res}.$$

Therefore, the criteria minimum $MS_{Res}$ and maximum adjusted $R^2$ are equivalent.)
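The relationship between $MS_{Res}$ and adjusted $R^2$ can be checked numerically from the sums of squares quoted in this question. The sketch below assumes $n = 11$ (the total degrees of freedom in the ANOVA table of part (e) are 10) and takes $SST$ as $SSR + SS_{Res}$ of the two-variable model; the SSR for the three-variable model is the model sum of squares from part (e). The dictionary layout and names are illustrative.

```python
n = 11                       # assumption: total df = 10 in the ANOVA table
sst = 673.20680 + 23.37865   # SSR + SSRes of the model with X1 and X2

models = {  # model label: (SSR, number of predictors)
    "X1": (640.42489, 1),
    "X1, X2": (673.20680, 2),
    "X1, X2, X3": (680.49122, 3),
}

summary = {}
for name, (ssr, p) in models.items():
    ms_res = (sst - ssr) / (n - p - 1)      # residual mean square
    r2 = ssr / sst
    r2_adj = 1 - (n - 1) / sst * ms_res     # formula from the text
    summary[name] = (round(r2, 4), round(r2_adj, 4), ms_res)
    print(f"{name:12s} R2={r2:.4f}  R2adj={r2_adj:.4f}  MSRes={ms_res:.5f}")

# Minimum MSRes and maximum adjusted R^2 select the same model.
best_by_ms_res = min(summary, key=lambda k: summary[k][2])
best_by_r2_adj = max(summary, key=lambda k: summary[k][1])
print(best_by_ms_res, best_by_r2_adj)
```

Since $n - 1$ and $SST$ are the same for every model fitted to the same data, adjusted $R^2$ is a decreasing linear function of $MS_{Res}$, which is why the two criteria agree.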

(d) Based on the computer output on page 6, we have

Plot  Dependent Variable (y)  Predicted Value (ŷ)  Residual (e = y − ŷ)
1     27.1000                 26.2837              0.
5     40.2000                 40.3619              -0.
7     55.5000                 53.3486              2.

Note that the predicted response values ŷ can also provide a measure of model validity. Unrealistic predicted values, such as negative predictions of a positive quantity or predictions that fall outside the actual range of the response, indicate poorly estimated coefficients or an incorrect model form. In this case, there is no unusual predicted value.

(e)

Source  df  Sum of squares  Mean square  F-value  p-value
Model   3   680.49122       226.         98.66    < 0.0001
Error   7   16.09423        2.
Total   10  696.

Since p-value $< 0.0001$, we reject $H_0: \beta_1 = \beta_2 = \beta_3 = 0$. This means that at least one of the variables $X_1$, $X_2$, and $X_3$ is useful for predicting $Y$. (We might need other variables too.)
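The F-value in the ANOVA table is the ratio of the model mean square to the error mean square. A minimal sketch using the sums of squares and degrees of freedom quoted in the table (variable names are illustrative):

```python
ss_model, df_model = 680.49122, 3
ss_error, df_error = 16.09423, 7

ms_model = ss_model / df_model   # mean square for the model
ms_error = ss_error / df_error   # residual mean square
f_value = ms_model / ms_error

print(round(f_value, 2))
```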

Since the p-values for Temperature and Particle Size are less than $\alpha = 0.05$, these two variables are useful for predicting $Y$. The other variables may not be useful for predicting $Y$, since their p-values are greater than $\alpha = 0.05$. We might want to delete them from the model.

(e) Let

Full Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + E$

Reduced Model: $Y = \beta_0 + \beta_2 X_2 + \beta_5 X_5 + E$

Then

Model    R^2     R^2 Adj  MSRes
Full     0.9372  0.9058   65.
Reduced  0.9149  0.9018   67.

By comparing $R^2$ and $R^2_{Adj}$, we conclude that adding the variables $X_1$, $X_3$, and $X_4$ does not increase $R^2$ and $R^2_{Adj}$ significantly. Therefore, we should not include those variables in the model. To compare the two models, the test statistic is:

$$TS = \frac{[SSR(\text{Full}) - SSR(\text{Reduced})]/3}{MS_{Res}(\text{Full})} = \frac{[9712.50 - 9481.25]/3}{MS_{Res}(\text{Full})} = 1.136.$$

Since $TS = 1.136 < F_{0.05,3,13} = 3.41$, we fail to reject $H_0: \beta_1 = \beta_3 = \beta_4 = 0$. Therefore, adding $X_1$, $X_3$, and $X_4$ to the model is not useful for predicting $Y$. Note that we cannot use a single t-test for this null hypothesis, because it involves more than one coefficient.

(f) Using the computer output on pages 22 and 23, a 95% confidence interval for $\beta_2$ is (0.15378, 0.41051) in the full model, and a 95% confidence interval for $\beta_2$ is (0.15506, 0.40923) in the reduced model. These two confidence intervals are very similar; therefore, deleting the variables $X_1$, $X_3$, and $X_4$ does not change our result for Pressure.

(g) A 95% confidence interval for $\mu_Y$ given $x_0 = (1, 500, 95, 15, 40, 4)'$ is

$$\hat{y}_0 \pm t_{\alpha/2}\sqrt{\hat{\sigma}^2\, x_0'(X'X)^{-1}x_0},$$

where

$$\hat{y}_0 = 52.07905 + 0.05556\,(500) + 0.28214\,(95) + 0.12500\,(15) + 4.56594\times 10^{-15}\,(40) - 16.06498\,(4) = 44.27,$$

$t_{0.025,\,df=10} = 2.228$, and $x_0'(X'X)^{-1}x_0 = 0.3122698$. Therefore, a 95% confidence interval for $\mu_Y$ given $x_0$ is

$$44.27 \pm 2.228\sqrt{\hat{\sigma}^2\,(0.3122698)}.$$

By using R, a 95% confidence interval when $X_1 = 500$, $X_2 = 95$, $X_3 = 15$, $X_4 = 40$, and $X_5 = 4$ is (34.23322, 54.31772).

(h) A 95% confidence interval for $\mu_Y$ given $x_0 = (1, 95, 4)'$ is

$$\hat{y}_0 \pm t_{\alpha/2}\sqrt{\hat{\sigma}^2\, x_0'(X'X)^{-1}x_0},$$

where

$$\hat{y}_0 = 80.13461 + 0.28214\,(95) - 16.06498\,(4) = 42.678,$$

$t_{0.025,\,df=13} = 2.160$, and $x_0'(X'X)^{-1}x_0 = 0.1830688$. Therefore, a 95% confidence interval for $\mu_Y$ given $x_0$ is

$$42.678 \pm 2.160\sqrt{\hat{\sigma}^2\,(0.1830688)}.$$

By using R, a 95% confidence interval when $X_2 = 95$ and $X_5 = 4$ is (35.06559, 50.2909). This confidence interval is shorter than the confidence interval in part (g). Therefore, the model that includes only $X_2$ and $X_5$ gives a shorter confidence interval, and it is better.
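Since the half-width of each interval is $t_{\alpha/2}\sqrt{\hat{\sigma}^2\, x_0'(X'X)^{-1}x_0}$, the value of $\hat{\sigma}^2$ implied by each reported interval can be backed out as a consistency check. This is only a sketch: the helper `implied_sigma2` is a hypothetical name, and the inputs are the interval endpoints, t-values, and $x_0'(X'X)^{-1}x_0$ values quoted in parts (g) and (h).

```python
def implied_sigma2(lower, upper, t_crit, leverage):
    """Back out sigma-hat^2 from a reported CI for the mean response.

    Uses half-width = t_crit * sqrt(sigma2 * leverage), where
    leverage stands for x0' (X'X)^{-1} x0.
    """
    half_width = (upper - lower) / 2
    return (half_width / t_crit) ** 2 / leverage

# Part (g): full model, df = 10. Part (h): reduced model, df = 13.
sigma2_full = implied_sigma2(34.23322, 54.31772, 2.228, 0.3122698)
sigma2_reduced = implied_sigma2(35.06559, 50.2909, 2.160, 0.1830688)

width_full = 54.31772 - 34.23322
width_reduced = 50.2909 - 35.06559

print(round(sigma2_full, 2), round(sigma2_reduced, 2))
print(width_reduced < width_full)  # the reduced-model interval is shorter
```

The implied values fall between 65 and 66 for the full model and between 67 and 68 for the reduced model, which agrees with the leading digits of the $MS_{Res}$ column in the model comparison table of part (e).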

(i) The least-squares estimate of the regression line when $Y$ is regressed on $X_5$ is

$$\hat{Y} = 97.06318 - 16.06498\,X_5.$$

Let

Full Model: $Y = \beta_0 + \beta_2 X_2 + \beta_5 X_5 + E$

Reduced Model: $Y = \beta_0 + \beta_5 X_5 + E$

Therefore, the F-test (or the test statistic) for testing the contribution of temperature to the model is

$$TS = \frac{[SSR(\text{Full}) - SSR(\text{Reduced})]/1}{MS_{Res}(\text{Full})} = \frac{[9481.25 - 7921.00]/1}{MS_{Res}(\text{Full})} = 23.04.$$

Since $TS = 23.04 > F_{0.05,1,13} = 4.67$, we reject $H_0: \beta_2 = 0$. Therefore, adding $X_2$ to the model is useful for predicting $Y$. We can also use a t-test for testing $H_0: \beta_2 = 0$. Based on the computer output on page 23, the t-test statistic for testing $H_0: \beta_2 = 0$ is $TS = 0.28214/0.05883 = 4.79585$ (or 4.80), with p-value $= 0.0003 < \alpha = 0.05$; therefore, we reject $H_0$. (Note that $(4.79585)^2 = 23.00$, which is approximately the F-value above.)
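The t statistic in part (i) is the coefficient estimate divided by its standard error, and for a single coefficient its square corresponds to the partial F statistic. A short sketch with the quoted values (variable names are illustrative; the critical value is the $t_{0.025,13}$ used in part (h)):

```python
estimate = 0.28214    # coefficient of X2 in the model with X2 and X5
std_error = 0.05883   # its standard error from the computer output
t_crit = 2.160        # t_{0.025,13}, as used in part (h)

t_stat = estimate / std_error
print(round(t_stat, 5), round(t_stat ** 2, 2), abs(t_stat) > t_crit)
```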

Let

Full Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1^2 + \beta_5 X_2^2 + \beta_6 X_1 X_2 + E$

Reduced Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_6 X_1 X_2 + E$

Therefore, the F-test (or the test statistic) for testing $H_0: \beta_3 = \beta_4 = \beta_5 = 0$ is

$$TS = \frac{[SSR(\text{Full}) - SSR(\text{Reduced})]/3}{MS_{Res}(\text{Full})} = \frac{[633.4901190 - 576.0019463]/3}{3.28058} = 5.84.$$

Since $TS = 5.84 > F_{0.05,3,17} = 3.20$, we reject $H_0: \beta_3 = \beta_4 = \beta_5 = 0$. Therefore, adding $X_3$, $X_1^2$, and $X_2^2$ to the model does lead to a large improvement over the model with $X_1$, $X_2$, and $X_1 X_2$ as independent variables, and the full model is superior. On the other hand,

Model    R^2     R^2 Adj  MSRes
Full     0.9191  0.8905   3.28058
Reduced  0.8357  0.8110   5.66290

$R^2$ and $R^2_{Adj}$ did not change dramatically, and based on these two measures alone we might say that the reduced model is superior. Note, however, that $R^2$ is not always a good criterion for selecting a model. Since $3.28058/5.66290 = 0.58$, there is a reduction of about 42% in $MS_{Res}$. So we should choose the full model.
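The two numerical comparisons in this question, the partial F statistic for the three added terms and the ratio of residual mean squares, can be reproduced from the quoted values. A minimal sketch with illustrative variable names:

```python
ssr_full, ssr_reduced = 633.4901190, 576.0019463
ms_res_full, ms_res_reduced = 3.28058, 5.66290
num_tested = 3   # beta3, beta4, beta5 are tested jointly
f_crit = 3.20    # F_{0.05,3,17}, as quoted in the text

ts = ((ssr_full - ssr_reduced) / num_tested) / ms_res_full
ratio = ms_res_full / ms_res_reduced   # MSRes(Full) relative to MSRes(Reduced)

print(round(ts, 2), ts > f_crit)
print(round(ratio, 2), round(1 - ratio, 2))
```

The second line of output gives the ratio of the two residual mean squares and the corresponding proportional reduction achieved by the full model.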
