STAT 4230/6230 (Applied Regression Analysis)
Homework #7 Solutions
April 28, 2008

1. Exercise 8.29 (20 Points)

a. The plot of the residuals versus time is given below, and we see a tendency for the residuals to have long positive and negative runs. This is a result of correlation present in the residuals. In other words, the residuals are not independent, and the value at time $t$ is influenced by what happened at times $t-1$, $t-2$, and so on.

[Figure: Buypower Data Set, Residuals versus Time Plot. Fitted model: y = 2.1584 - 0.0619 t.]

b. The Durbin-Watson $d$ statistic is 0.084. We then test for the presence of positive correlation among the residuals. The hypotheses are

    $H_0$: The residuals are not correlated.
    $H_a$: The residuals are positively correlated.

Using $\alpha = 0.05$, the critical values from Table C.9 with $n = 28$ and $k = 1$ are $d_{L,\alpha} = d_{L,0.05} = 1.33$ and $d_{U,\alpha} = d_{U,0.05} = 1.48$. Thus, the rejection region is $d < d_{L,\alpha} = 1.33$. Because $d = 0.084 < 1.33$, we reject the null hypothesis and conclude that there is significant positive correlation among the residuals at $\alpha = 0.05$.

2. Exercise 8.30 (20 Points)

a. The estimated straight-line model for the data is $\hat{y} = 48.37141 + 0.16719\,t$. The SAS output appears below:

    The REG Procedure
    Model: MODEL1
    Dependent Variable: Y

    Number of Observations Read    24
    Number of Observations Used    24

    Analysis of Variance

                               Sum of        Mean
    Source             DF     Squares      Square    F Value    Pr > F
    Model               1    32.14420    32.14420      46.57    <.0001
    Error              22    15.18486     0.69022
    Corrected Total    23    47.32906

    Root MSE           0.83080    R-Square    0.6792
    Dependent Mean    50.46125    Adj R-Sq    0.6646
    Coeff Var          1.64640

    Parameter Estimates

                       Parameter    Standard
    Variable     DF     Estimate       Error    t Value    Pr > |t|
    Intercept     1     48.37141     0.35006     138.18      <.0001
    t             1      0.16719     0.02450       6.82      <.0001

b.
The plot of the residuals versus $t$ appears on the next page, and we see that there are short runs of positive and negative residuals. However, the question is whether these runs are pronounced enough to conclude that correlation is indeed present in the residuals.

c. We now conduct a statistical test at level $\alpha = 0.05$ to determine if the residuals are correlated. Thus, we test the hypotheses

    $H_0$: The residuals are not correlated.
    $H_a$: The residuals are correlated.

Using the attached Durbin-Watson table for $\alpha/2 = 0.05/2 = 0.025$, we see that the critical values with $k = 2$ and $n = 24$ are $d_{L,\alpha/2} = d_{L,0.025} = 1.161$ and $d_{U,\alpha/2} = d_{U,0.025} = 1.329$. Thus, the rejection regions take the form $d < 1.161$ or $(4 - d) < 1.161$. For this data, the Durbin-Watson test statistic is $d = 1.334$, and we see that $d$ is not less than $d_{L,0.025} = 1.161$. Additionally, $4 - d = 2.666$ is not less than $d_{L,0.025} = 1.161$. Moreover, $d = 1.334 > d_{U,0.025} = 1.329$, so $d$ also lies above the inconclusive band between $d_L$ and $d_U$. Thus, the test statistic falls into the non-rejection region, and we do not have evidence of correlation at $\alpha = 0.05$. Therefore, we fail to reject the null hypothesis.

    Number of Observations Used    6

    Analysis of Variance

                               Sum of        Mean
    Source             DF     Squares      Square    F Value    Pr > F
    Model               1    81.50571    81.50571     130.71    0.0003
    Error               4     2.49429     0.62357
    Corrected Total     5    84.00000

    Root MSE           0.78967    R-Square    0.9703
    Dependent Mean    26.00000    Adj R-Sq    0.9629
    Coeff Var          3.03718

    Parameter Estimates

                       Parameter    Standard
    Variable     DF     Estimate       Error    t Value    Pr > |t|
    Intercept     1      5.06837     1.85901       2.73      0.0526
    x             1      0.06227     0.00545      11.43      0.0003

We see that $F = 130.71$, which corresponds to a p-value of 0.0003. This leads us to reject the null hypothesis and conclude that the model is now statistically useful in predicting life span based on gestation period. Likewise, $R^2$ is now 0.9703, which means the model as constructed accounts for 97.03% of the variation in life span about its mean.
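
The F value, R-Square, and Adj R-Sq reported in this output follow directly from the sums of squares. As a quick arithmetic check, here is a minimal Python sketch (Python is not part of the original SAS analysis). The degrees of freedom 1 and 4 are assumed from the ratio of each sum of squares to its mean square, and the variable names are illustrative only.

```python
# Sums of squares copied from the ANOVA table for the fit without horse #3.
# Degrees of freedom (1 and 4) are inferred from SS / MS in that table.
ss_model, df_model = 81.50571, 1
ss_error, df_error = 2.49429, 4

ms_model = ss_model / df_model          # mean square for the model
ms_error = ss_error / df_error          # mean square error
f_value = ms_model / ms_error           # overall F statistic

ss_total = ss_model + ss_error          # corrected total sum of squares
df_total = df_model + df_error
r_square = ss_model / ss_total
adj_r_square = 1 - (ss_error / df_error) / (ss_total / df_total)

print(f"F = {f_value:.2f}, R-Square = {r_square:.4f}, Adj R-Sq = {adj_r_square:.4f}")
```

Rounded to the precision SAS prints, these agree with the listing (130.71, 0.9703, and 0.9629).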
The removal of horse #3 from the data resulted in a much better and statistically useful model. The least squares line for the data without horse #3 is $\hat{y} = 5.06837 + 0.06227\,x$.

4. Exercise 8.42 (20 Points)

a. The plot of regression residuals versus the fitted values appears below.

[Figure: Plot of Residuals versus Predicted Values for TVSMRE Data. Fitted model: y = -1.5615 + 0.6872 x.]

The computed regression residuals appear in the following SAS output:

    The REG Procedure
    Model: MODEL1
    Dependent Variable: y

    Output Statistics

            Dependent    Predicted
    Obs      Variable        Value     Residual
      1       15.0000      14.2433       0.7567
      2       17.0000      16.9920     0.008021
      3       17.0000      15.6176       1.3824
      4       13.0000      12.8690       0.1310
      5       12.0000      12.1818      -0.1818
      6       14.0000      14.9305      -0.9305
      7       16.0000      16.3048      -0.3048
      8       14.0000      14.2433      -0.2433
      9       15.0000      15.6176      -0.6176

b. The response variable $y$ is recorded as a percentage. Therefore, there may be a violation of the least squares assumption of homoscedasticity. The residuals versus predicted values plot confirms this, as we see a funnel trend.

[Figure: Plot of Residuals versus Predicted Values for TVSMRE Data. Fitted model: y = -1.5615 + 0.6872 x.]

c. Based on the plot and the nature of the data, we would suggest the transformation $y^* = \sin^{-1}(\sqrt{y})$. Here, we use the decimal form of $y$ for the transformation (i.e., 0.15 instead of 15). We refit the first-order model using the transformed response:

    $\hat{y}^* = \sin^{-1}(\sqrt{y}) = 0.16172 + 0.00977\,x$.

The SAS output:

    The REG Procedure
    Model: MODEL1
    Dependent Variable: asiny

    Number of Observations Read    9
    Number of Observations Used    9

    Analysis of
    Variance

                               Sum of          Mean
    Source             DF     Squares        Square    F Value    Pr > F
    Model               1     0.00397       0.00397      36.52    0.0005
    Error               7  0.00076049    0.00010864
    Corrected Total     8     0.00473

    Root MSE           0.01042    R-Square    0.8392
    Dependent Mean     0.39405    Adj R-Sq    0.8162
    Coeff Var          2.64510

    Parameter Estimates

                       Parameter    Standard
    Variable     DF     Estimate       Error    t Value    Pr > |t|
    Intercept     1      0.16172     0.03860       4.19      0.0041
    x             1      0.00977     0.00162       6.04      0.0005

The calculated residuals appear below:

    The REG Procedure
    Model: MODEL1
    Dependent Variable: asiny

    Output Statistics

            Dependent    Predicted
    Obs      Variable        Value      Residual
      1        0.3977       0.3865        0.0112
      2        0.4250       0.4255     -0.000551
      3        0.4250       0.4060        0.0190
      4        0.3689       0.3669      0.001950
      5        0.3537       0.3571     -0.003400
      6        0.3835       0.3962       -0.0127
      7        0.4115       0.4158     -0.004251
      8        0.3835       0.3865     -0.002958
      9        0.3977       0.4060     -0.008298

Lastly, we visualize the residuals versus fitted values plot for the transformed model:

[Figure: Plot of Residuals versus Predicted Values for Transformed Data. Fitted model: y* = 0.1617 + 0.0098 x.]

We see that the trend is still apparent but on a smaller scale.


BOUNDS FOR CRITICAL VALUES OF THE DURBIN-WATSON STATISTIC
2.5% SIGNIFICANCE POINTS OF dL AND dU

           k = 2          k = 3          k = 4          k = 5          k = 6
   n     dL     dU      dL     dU      dL     dU      dL     dU      dL     dU
  15  0.949  1.222   0.827  1.405   0.706  1.615   0.588  1.848   0.478  2.099
  16  0.980  1.235   0.864  1.403   0.748  1.594   0.636  1.806   0.527  2.035
  17  1.009  1.248   0.899  1.403   0.788  1.578   0.680  1.773   0.574  1.983
  18  1.035  1.261   0.930  1.405   0.825  1.567   0.720  1.746   0.619  1.939
  19  1.060  1.274   0.959  1.407   0.859  1.558   0.758  1.724   0.660  1.902
  20  1.082  1.286   0.988  1.410   0.890  1.551   0.794  1.705   0.699  1.871
  21  1.104  1.297   1.012  1.415   0.920  1.546   0.826  1.691   0.734  1.845
  22  1.124  1.308   1.036  1.419   0.947  1.543   0.858  1.678   0.769  1.823
  23  1.144  1.319   1.059  1.424   0.973  1.541   0.887  1.668   0.801  1.804
  24  1.161  1.329   1.080  1.429   0.997  1.539   0.914  1.659   0.830  1.787
  25  1.178  1.339   1.099  1.435   1.019  1.539   0.939  1.652   0.859  1.773
  26  1.194  1.348   1.118  1.439   1.041  1.538   0.964  1.646   0.886  1.761
  27  1.208  1.358   1.135  1.445   1.061  1.539   0.986  1.641   0.911  1.751
  28  1.222  1.367   1.153  1.450   1.080  1.540   1.007  1.637   0.934  1.742
  29  1.236  1.375   1.168  1.455   1.098  1.541   1.028  1.634   0.958  1.734
  30  1.249  1.383   1.183  1.460   1.115  1.542   1.047  1.632   0.978  1.727
  31  1.261  1.391   1.197  1.465   1.132  1.544   1.066  1.630   0.999  1.721
  32  1.273  1.399   1.211  1.469   1.147  1.546   1.083  1.628   1.018  1.715
  33  1.284  1.406   1.224  1.474   1.163  1.548   1.099  1.627   1.037  1.711
  34  1.294  1.413   1.236  1.479   1.176  1.550   1.115  1.626   1.054  1.707
  35  1.305  1.420   1.248  1.484   1.190  1.553   1.131  1.626   1.071  1.704
  36  1.315  1.426   1.259  1.488   1.203  1.555   1.145  1.625   1.087  1.701
  37  1.324  1.433   1.270  1.493   1.215  1.557   1.159  1.625   1.102  1.698
  38  1.333  1.439   1.281  1.497   1.227  1.560   1.173  1.625   1.117  1.695
  39  1.342  1.445   1.291  1.501   1.238  1.562   1.185  1.626   1.131  1.693
  40  1.350  1.451   1.300  1.506   1.249  1.564   1.197  1.626   1.144  1.692
  45  1.388  1.477   1.343  1.525   1.298  1.576   1.252  1.630   1.204  1.687
  50  1.420  1.500   1.380  1.543   1.338  1.588   1.297  1.636   1.255  1.685
  55  1.447  1.524   1.411  1.559   1.373  1.600   1.335  1.642   1.297  1.686
  60  1.471  1.538   1.438  1.573   1.404  1.610   1.369  1.649   1.333  1.688
  65  1.492  1.554   1.461  1.587   1.430  1.620   1.398  1.655   1.365  1.691
  70  1.511  1.568   1.482  1.599   1.453  1.630   1.424  1.662   1.393  1.695
  75  1.528  1.582   1.501  1.610   1.474  1.638   1.446  1.668   1.418  1.699
  80  1.543  1.594   1.518  1.619   1.493  1.647   1.467  1.674   1.441  1.703
  85  1.557  1.605   1.534  1.629   1.510  1.654   1.485  1.680   1.461  1.707
  90  1.570  1.614   1.548  1.638   1.525  1.662   1.502  1.686   1.479  1.711
  95  1.582  1.624   1.560  1.646   1.539  1.668   1.517  1.691   1.495  1.715
 100  1.593  1.633   1.573  1.654   1.552  1.675   1.532  1.696   1.511  1.718


BOUNDS FOR CRITICAL VALUES OF THE DURBIN-WATSON STATISTIC
5% SIGNIFICANCE POINTS OF dL AND dU

           k = 2          k = 3          k = 4          k = 5          k = 6
   n     dL     dU      dL     dU      dL     dU      dL     dU      dL     dU
  15  1.077  1.361   0.945  1.543   0.814  1.750   0.685  1.977   0.562  2.220
  16  1.106  1.371   0.982  1.539   0.857  1.728   0.734  1.935   0.615  2.157
  17  1.133  1.381   1.015  1.536   0.897  1.710   0.779  1.900   0.664  2.104
  18  1.158  1.392   1.046  1.535   0.933  1.696   0.820  1.872   0.710  2.060
  19  1.180  1.401   1.075  1.535   0.967  1.685   0.859  1.848   0.752  2.022
  20  1.201  1.411   1.100  1.537   0.998  1.676   0.894  1.828   0.792  1.991
  21  1.221  1.420   1.125  1.538   1.026  1.669   0.927  1.812   0.828  1.964
  22  1.240  1.429   1.147  1.541   1.053  1.664   0.958  1.797   0.863  1.940
  23  1.257  1.437   1.168  1.543   1.078  1.660   0.986  1.786   0.895  1.919
  24  1.273  1.446   1.188  1.546   1.101  1.657   1.013  1.775   0.925  1.902
  25  1.288  1.454   1.206  1.550   1.123  1.654   1.038  1.767   0.953  1.886
  26  1.302  1.461   1.224  1.553   1.143  1.652   1.062  1.759   0.979  1.873
  27  1.316  1.468   1.240  1.556   1.162  1.651   1.083  1.753   1.004  1.861
  28  1.328  1.476   1.255  1.560   1.181  1.650   1.104  1.747   1.028  1.850
  29  1.341  1.483   1.270  1.563   1.198  1.650   1.124  1.743   1.050  1.841
  30  1.352  1.489   1.284  1.567   1.214  1.650   1.143  1.739   1.070  1.833
  31  1.363  1.496   1.297  1.570   1.229  1.650   1.160  1.735   1.090  1.825
  32  1.373  1.502   1.309  1.573   1.244  1.650   1.177  1.732   1.109  1.819
  33  1.383  1.508   1.321  1.577   1.258  1.651   1.193  1.730   1.127  1.813
  34  1.393  1.514   1.332  1.580   1.271  1.652   1.208  1.728   1.144  1.807
  35  1.402  1.519   1.343  1.584   1.283  1.653   1.222  1.726   1.160  1.803
  36  1.411  1.524   1.354  1.587   1.295  1.654   1.236  1.725   1.175  1.799
  37  1.419  1.530   1.364  1.590   1.307  1.655   1.249  1.723   1.190  1.795
  38  1.427  1.535   1.373  1.594   1.317  1.656   1.261  1.723   1.204  1.792
  39  1.435  1.540   1.382  1.597   1.328  1.658   1.273  1.722   1.218  1.789
  40  1.442  1.544   1.391  1.600   1.338  1.659   1.285  1.721   1.231  1.786
  45  1.475  1.566   1.430  1.615   1.383  1.666   1.336  1.720   1.287  1.776
  50  1.503  1.585   1.462  1.628   1.421  1.674   1.378  1.721   1.334  1.771
  55  1.527  1.601   1.490  1.640   1.452  1.681   1.414  1.724   1.374  1.768
  60  1.549  1.616   1.514  1.652   1.480  1.689   1.444  1.727   1.408  1.767
  65  1.567  1.629   1.536  1.662   1.503  1.696   1.471  1.731   1.438  1.767
  70  1.583  1.641   1.554  1.671   1.524  1.703   1.494  1.735   1.464  1.768
  75  1.598  1.652   1.571  1.680   1.543  1.709   1.515  1.739   1.486  1.770
  80  1.611  1.662   1.586  1.688   1.560  1.715   1.534  1.743   1.507  1.772
  85  1.624  1.671   1.600  1.696   1.575  1.721   1.551  1.747   1.525  1.774
  90  1.635  1.679   1.612  1.703   1.589  1.726   1.566  1.751   1.542  1.776
  95  1.645  1.687   1.623  1.709   1.602  1.732   1.579  1.755   1.557  1.778
 100  1.654  1.694   1.634  1.715   1.613  1.736   1.592  1.758   1.571  1.780
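
The decision rule applied to these bounds in Exercises 8.29 and 8.30 can be sketched in Python. This is a minimal illustration only and not part of the original SAS analysis. The function name `durbin_watson_decision` and its `alternative` argument are hypothetical, and the bounds still have to be looked up by hand for the relevant n, k, and significance level (alpha for a one-tailed test, alpha/2 for a two-tailed test).

```python
def durbin_watson_decision(d, d_lower, d_upper, alternative="positive"):
    """Classify a Durbin-Watson statistic against tabulated bounds.

    alternative="positive": one-tailed test for positive correlation.
    alternative="two-sided": test for correlation of either sign, using
    bounds looked up at alpha/2.
    Returns "reject", "inconclusive", or "fail to reject".
    """
    if alternative == "positive":
        stat = d
    elif alternative == "two-sided":
        # Small d points to positive correlation, small 4 - d to negative,
        # so compare whichever of the two is smaller against the bounds.
        stat = min(d, 4.0 - d)
    else:
        raise ValueError("alternative must be 'positive' or 'two-sided'")

    if stat < d_lower:
        return "reject"
    if stat > d_upper:
        return "fail to reject"
    return "inconclusive"


# Exercise 8.29: one-tailed test, d = 0.084, bounds 1.33 and 1.48.
print(durbin_watson_decision(0.084, 1.33, 1.48))

# Exercise 8.30: two-tailed test, d = 1.334, bounds 1.161 and 1.329
# taken from the 2.5% table (n = 24, k = 2).
print(durbin_watson_decision(1.334, 1.161, 1.329, alternative="two-sided"))
```

The first call lands in the rejection region and the second in the non-rejection region, matching the conclusions reached in the two exercises. A statistic that falls between the two bounds is reported as inconclusive rather than forced into either decision.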