Stat 371-003, Solutions to Homework #11

1. 12.10 (pg 540)

(a) The estimated slope is

$$b_1 = \hat\beta_1 = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2} = \frac{5288}{1172} = 4.512$$

The estimated y-intercept is

$$b_0 = \hat\beta_0 = \bar y - \hat\beta_1 \bar x = 660 - 4.512 \cdot 180.4 = -154.0$$

So the estimated regression line for y on x is $\hat y = -154.0 + 4.512x$.

(b) For the predicted (aka fitted) values, we calculate $\hat y = -154.0 + 4.512x$; see the following table. (I asked you to calculate just the first three.)

Subject    x     y      ŷ      y − ŷ
   1      174   733   631.1    101.9
   2      183   572   671.7    -99.7
   3      176   500   640.1   -140.1
   4      169   738   608.5    129.5
   5      183   616   671.7    -55.7
   6      186   787   685.2    101.8
   7      178   866   649.1    216.9
   8      175   670   635.6     34.4
   9      172   550   622.1    -72.1
  10      179   660   653.6      6.4
  11      171   575   617.5    -42.5
  12      184   577   676.2    -99.2
  13      200   783   748.4     34.6
  14      195   625   725.8   -100.8
  15      176   470   640.1   -170.1
  16      176   642   640.1      1.9
  17      190   856   703.3    152.7

(c) For the residuals, we calculate $y - \hat y = y - (-154.0 + 4.512x)$; see the previous table. (I asked you to calculate just the first three.)

(d) $s_{Y|X} = \sqrt{SS(\text{resid})/(n-2)} = \sqrt{198{,}909/(17-2)} = 115.2$. The units are the same as for y (L/min).

(e) 12/17 = 71% of the data points are within $\pm s_{Y|X}$ of the regression line, that is, 12 of the 17 residuals in the table are no larger than 115.2 in absolute value.

2. 12.18 (pg 548)

The estimated mean peak flow for men 180 cm tall is $-154.0 + 4.512 \cdot 180 = 658.2$.

The estimated SD of peak flow for men 180 cm tall is $s_{Y|X} = 115.2$.

3. 12.26 (pg 553)

We first calculate the estimated standard error of the estimated slope.

$$\widehat{SE}(\hat\beta_1) = \frac{s_{Y|X}}{\sqrt{\sum (x_i - \bar x)^2}} = \frac{115.2}{\sqrt{1172}} = 3.364$$

(a) To test $\beta_1 = 0$, we look at $\hat\beta_1/\widehat{SE}(\hat\beta_1) = 4.512/3.364 = 1.341$. For a non-directional test at α = 0.1, each tail has area 0.05, so we compare this to the 95th percentile of a t distribution with 15 degrees of freedom, 1.753. Since 1.341 < 1.753, we fail to reject the null hypothesis: there is insufficient evidence to conclude that there is a relationship between peak flow and height.
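The hand calculations for problems 1 through 3(a) can be checked in R. The following is a minimal sketch, assuming the x (height, cm) and y (peak flow) values are transcribed correctly from the table in 12.10(b); the variable names are my own choices and not part of the assignment.

```r
# Data transcribed from the table in 12.10(b): height x (cm), peak flow y
x <- c(174, 183, 176, 169, 183, 186, 178, 175, 172,
       179, 171, 184, 200, 195, 176, 176, 190)
y <- c(733, 572, 500, 738, 616, 787, 866, 670, 550,
       660, 575, 577, 783, 625, 470, 642, 856)

# Least-squares fit of y on x
fit <- lm(y ~ x)
b <- coef(fit)                       # b[["(Intercept)"]] and b[["x"]]

# Residual SD, sqrt(SS(resid) / (n - 2))
s_yx <- summary(fit)$sigma

# Estimated SE of the slope and the t statistic for testing beta1 = 0
se_b1  <- coef(summary(fit))["x", "Std. Error"]
t_stat <- coef(summary(fit))["x", "t value"]

# Critical value for a non-directional test at alpha = 0.1 (15 df),
# and the corresponding two-sided P-value
crit_two_sided <- qt(0.95, df = df.residual(fit))
p_two_sided    <- 2 * pt(-abs(t_stat), df = df.residual(fit))

# Number of points within +/- s_yx of the fitted line (part 1(e))
n_within <- sum(abs(resid(fit)) <= s_yx)
```

Because lm() works with unrounded intermediate values, its results agree with the hand calculations only up to rounding; the intercept in particular can differ in the first decimal place.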
Note that the P-value for the non-directional test in (a) is very close to 0.2.

(b) For a directional test at α = 0.1, we compare the t statistic, 1.341, to the 90th percentile of a t distribution with 15 degrees of freedom, 1.341. The statistic matches the critical value to three decimal places, so we just reject the null hypothesis of no relationship and conclude that peak flow increases with height.

4. 12.32 (pg 564–565)

(a) The correlation coefficient (which I'd prefer to call the estimated correlation) is

$$\frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum (x_i - \bar x)^2 \sum (y_i - \bar y)^2}} = \frac{893.689}{\sqrt{1419.82 \cdot 853.396}} = 0.8119$$

(b) You could create a scatterplot by hand, or you could read the data into R and plot them as follows:

    dat <- read.csv("http://www.biostat.wisc.edu/~kbroman/teaching/stat371/data_12-32.csv")
    plot(dat[,2], dat[,3])

Here are a couple of alternatives for creating the scatterplot:

    plot(dat$site1, dat$site2)
    plot(site2 ~ site1, data=dat)

Here's the actual plot.

[Figure: scatterplot of site2 (vertical axis) against site1 (horizontal axis), nine points.]
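The arithmetic in 12.32(a) can be reproduced in R directly from the three sums quoted in the solution. This is only a sketch: the names sxy, sxx, syy, and the helper corr_from_data() are hypothetical and introduced here for illustration. The helper applies the same formula to raw vectors, so it could be compared against R's built-in cor() once the data are loaded.

```r
# Sums quoted in the solution to 12.32(a)
sxy <- 893.689    # sum of (x_i - xbar) * (y_i - ybar)
sxx <- 1419.82    # sum of (x_i - xbar)^2
syy <- 853.396    # sum of (y_i - ybar)^2

# Estimated correlation from the sums
r <- sxy / sqrt(sxx * syy)

# Hypothetical helper: the same formula applied to raw data vectors
corr_from_data <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}
```

With the data frame from part (b) loaded, corr_from_data(dat$site1, dat$site2) should agree with cor(dat$site1, dat$site2), assuming the columns are named site1 and site2 as in the plotting commands.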