HOMEWORK 4 SOLUTION (Applied Regression Analysis, STAT 645)

3.7
a. Stem-and-Leaf Display: x

Stem-and-leaf of x   N = 60
Leaf Unit = 1.0

    8   4   11122334
   15   4   5677788
   21   5   123344
   29   5   56777999
   (8)  6   00013334
   23   6   5556889
   16   7   001223

I agree that the plot is consistent with the random selection of women from each 10-year age group for 40-60. Women in their mid to late 70's are not represented.

b. The residuals are scattered around zero, with what appears to be a symmetric distribution. Checking for normality would be difficult from this plot.

[Figure: Dotplot of RESI1]

c. The two graphs provide the same information. No clear departure from the regression line is evident.

[Figure: Residuals Versus the Fitted Values (response is y); horizontal axis: Fitted Value, vertical axis: Residual]
[Figure: Residuals Versus x (response is y); horizontal axis: x, vertical axis: Residual]

d.

Source            DF       SS       MS       F       P
Regression         1   12.597   12.597   55.99   0.000
Residual Error    13    2.925    0.225
  Lack of Fit      3    2.767    0.922   58.60   0.000
  Pure Error      10    0.157    0.016
Total             14   15.522

c. The lack-of-fit test indicates only that a linear regression function is inappropriate; it does not indicate what regression function to try next. A study of the residuals might help.

3.16
a. By examining the scatter plot, I might try transforming Y to logY to attain constant variance and linearity. Both are problems here: the log transformation should improve the constant variance, and if linearity is still a problem you might transform x as well.

[Figure: Scatterplot of y15 vs x15]

c. The regression equation is
logY15 = 0.655 - 0.195 x15

d. The regression line appears to be a good fit for the transformed data, and the constant variance assumption seems satisfied on the new scale.

[Figure: Scatterplot of logY15 vs x15]

e.
The following graph shows a good linear regression fit for the transformed data.

[Figure: Residuals Versus the Fitted Values (response is logY15); horizontal axis: Fitted Value, vertical axis: Residual]
[Figure: Normal Probability Plot of the Residuals (response is logY15); horizontal axis: Residual, vertical axis: Percent]

f. Back-transforming logY15 = 0.655 - 0.195 x15 to the original units, the regression equation is
Y15 = 10^(0.655 - 0.195 x15) = 4.518559 / 10^(0.195 x15)

3.18
a. I'd try to transform x first, since the variances seem to be constant.

[Figure: Normal Probability Plot of the Residuals (response is y18); horizontal axis: Residual, vertical axis: Percent]

e. The question is ambiguous: when x is transformed, the y's are still in the original units, although x is not.
(y - 1.25)^2 = 13.1044 x

3.20
The error terms are still independent and normally distributed after the transformation of X. But if the same transformation is applied to Y, the error terms are no longer normally distributed.

3.23
H0: E{Y} = β1X, Ha: E{Y} ≠ β1X
The degrees of freedom are 10 for the full model and 19 for the reduced model.

3.24
a. There seems to be an outlier at x = 12.

Regression Analysis: y24 versus x24
The regression equation is
y24 = 48.7 + 2.33 x24

[Figure: Residuals Versus x24 (response is y24); horizontal axis: x24, vertical axis: Residual]

b. Case 7 rotates the fitted regression line counterclockwise. With case 7 omitted, the regression equation is
y24 = 53.1 + 1.62 x24

c. Case 7 falls outside the 99% prediction interval. That means case 7 does not seem to follow the pattern of the rest of the data.

New Obs     Fit   SE Fit   99% CI           99% PI
1         72.52     1.47   (66.58, 78.47)   (60.31, 84.74)

3.32
A linear regression model is fitted to the original data. The residuals appear to grow in magnitude as X increases. Although the normal probability plot suggests that the normality assumption for the error terms is not satisfied, this plot is not reliable when the error variances are unequal.
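A residual spread that grows with X, as described above, is what one would expect if the errors act multiplicatively on the response. The sketch below illustrates that idea with synthetic numbers only; it does not use the PSA data, and the function name spread_by_level, the two response levels, and the error size sigma are all made-up assumptions chosen for illustration.

```python
import math
import random
import statistics


def spread_by_level(level_low=2.0, level_high=20.0, sigma=0.3, n=5000, seed=0):
    """Simulate Y = level * exp(eps), eps ~ N(0, sigma^2), at two response levels.

    Returns a pair of standard-deviation ratios (high level / low level):
    one on the raw Y scale and one on the log10(Y) scale.
    All default values are hypothetical, not taken from the homework data.
    """
    rng = random.Random(seed)
    low = [level_low * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]
    high = [level_high * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]

    # Spread on the original scale grows with the level of the response.
    raw_ratio = statistics.stdev(high) / statistics.stdev(low)

    # On the log scale the multiplicative error becomes additive,
    # so the spread no longer depends on the level.
    log_low = [math.log10(y) for y in low]
    log_high = [math.log10(y) for y in high]
    log_ratio = statistics.stdev(log_high) / statistics.stdev(log_low)

    return raw_ratio, log_ratio


raw_ratio, log_ratio = spread_by_level()
print(f"raw-scale sd ratio (high/low): {raw_ratio:.2f}")
print(f"log-scale sd ratio (high/low): {log_ratio:.2f}")
```

Under these assumed settings the raw-scale ratio should come out roughly equal to the ratio of the two levels, while the log-scale ratio should stay near 1, which is the behavior a log transformation of the response is meant to exploit. Whether the real data follow a multiplicative error structure has to be judged from the residual plots on the transformed scale.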
Regression Analysis: PSA level versus cancer volume
The regression equation is
PSA level = 1.12 + 3.23 cancer volume

[Figure: Normal Probability Plot of the Residuals (response is PSA level); horizontal axis: Residual, vertical axis: Percent]
[Figure: Residuals Versus cancer volume (response is PSA level); horizontal axis: cancer volume, vertical axis: Residual]

Since the residuals increase as the cancer volume increases, and the rate at which PSA level increases with cancer volume is decreasing, I transform PSA level to log(psa