Material Type: Assignment; Class: Applied Regression Analysis; Subject: Statistics; University: University of Georgia; Term: Spring 2008;


STAT 4230/6230 Homework #2 Solutions

February 17, 2008

1. Exercise 4.4 (10 Points)

a. The first-order model for mean annual earnings, \(E(y)\), as a function of age (\(x_1\)) and hours worked (\(x_2\)) is

\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2. \]

b. The least squares prediction equation is

\[ \hat{y} = -20.35201 + 13.35045 x_1 + 243.71446 x_2. \]

c. The term \(\hat{\beta}_0 = -20.35201\) has no practical meaning because \(x_1 = 0\) and \(x_2 = 0\) are not in the observed range. For \(\hat{\beta}_1 = 13.35045\), the mean annual earnings is estimated to increase by $13.35045 for each additional year of age, holding the hours worked per day constant. For \(\hat{\beta}_2 = 243.71446\), we estimate that the mean annual earnings will increase by $243.71446 for each additional hour worked per day, holding age constant.

d. To test if age (\(x_1\)) is a statistically useful predictor of annual earnings, we test the following hypotheses with \(\alpha = 0.01\):

\[ H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0. \]

From the output, the t statistic is 1.74, and the two-sided p-value is 0.1074. Because we are carrying out a two-sided hypothesis test, this is the p-value for the test. Because the p-value is greater than \(\alpha = 0.01\), we fail to reject the null hypothesis. There is insufficient evidence to indicate that age is a useful predictor of annual income, adjusting for hours worked per day, at level \(\alpha = 0.01\).

e. From the output, the 95% confidence interval for \(\beta_2\) is (105.33428, 382.09465). We are 95% confident that for each additional hour worked per day, the mean annual salary will increase anywhere from $105.33 to $382.09, holding age constant.

2. Exercise 4.10 (10 Points)

a. The model relating the mean value of the college GPA as a linear function of the three explanatory variables is

\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3. \]
The interpretations of the parameters are as follows:

• \(\beta_0\) does not have any practical meaning because we would not observe a 0 high school GPA or a 0 SAT score.
• \(\beta_1\) is the change in the mean college GPA for each unit change in high school GPA, with all other variables held constant;
• \(\beta_2\) is the change in the mean college GPA for each unit change in study hours, with all other variables held constant; and
• \(\beta_3\) is the change in the mean college GPA for each unit change in the SAT score, with all other variables held constant.

b. For both models, the estimate of the \(\beta\) parameter for the SAT score is \(\hat{\beta}_3 = 0.001\). We estimate that the mean college GPA will increase by 0.001 point for each additional point scored on the SAT, with the other two variables (the high school GPA and hours of studying in a season) held constant.

c. Here, we are not given a specific value of \(\alpha\) to use, so we will use a common one, \(\alpha = 0.05\). If \(\alpha\) is not specified in the problem, you should state your choice before even looking at the p-values. The reported p-value for testing \(H_0: \beta_1 = 0\) is 0.734 for the black athletes' model and 0.000 for the white athletes' model. We interpret the p-values as follows:

• The p-value for the black athletes' model for testing \(H_0\) is 0.734 > 0.05. Thus, in light of this p-value, we fail to reject the null hypothesis. We do not have sufficient evidence of a linear relationship between high school GPA and college GPA, adjusting for study hours and SAT scores, for black athletes at level \(\alpha = 0.05\).
• Since the p-value for the white athletes' model is so small (0.000 < 0.05), there is strong evidence to reject \(H_0\). There is sufficient evidence of a linear relationship between high school GPA and college GPA, adjusting for study hours and SAT scores, for white athletes at level \(\alpha = 0.05\).

3. Exercise 4.14 (10 Points)

a.
From the SAS output, we see that \(R^2\) = R-Square = 0.5823. This means that 58.23% of the sample variability of annual earnings about the mean is explained by the linear relationship between annual earnings and the independent variables age and hours worked per day.

b. The SAS output shows that \(R_a^2\) = Adj R-Sq = 0.5126. Thus, 51.26% of the sample variability of annual earnings about the mean is explained by the linear relationship between annual earnings and the independent variables age and hours worked per day, provided that we have adjusted for the sample size and the number of \(\beta\) parameters included in the model.

c. We set \(\alpha = 0.01\) and conduct a test of global utility of the model. The hypotheses we test are

\[ H_0: \beta_1 = \beta_2 = 0 \qquad H_a: \text{At least one } \beta_i \neq 0, \; i = 1, 2. \]

The output shows that the F statistic is 8.36, which corresponds to a p-value of 0.0053. Because the p-value falls below \(\alpha = 0.01\), we reject the null hypothesis and find sufficient evidence that at least one of the independent variables (age or hours worked per day) is useful for predicting annual income. Another way of doing this problem is to compare the F statistic with a critical F value, which in this case is \(F_{0.01,2,12} = 6.926608\). Since \(F > F_{\alpha}\), we reject the null hypothesis and reach the same conclusion.

4. Exercise 4.21 (10 Points)

a. For the independent variable, rental price, we see that \(\hat{\beta}_1 = 2.87\). Thus, we estimate that the mean number of homeless per 100,000 population will increase by 2.87 for each dollar increase in the rental price (10th percentile), with all other independent variables held constant.

b. We set the significance level at \(\alpha = 0.05\), as given by the problem. To test the hypothesis that the incidence of homelessness decreases as employment growth increases, we test

\[ H_0: \beta_4 = 0 \qquad H_a: \beta_4 < 0. \]
To test for a negative relationship, we test whether the appropriate parameter is negative. Here, we see that the t statistic is -2.71. With \(n - (k + 1) = 50 - (16 + 1) = 33\) degrees of freedom [...]

Lastly, we test if there is sufficient evidence to indicate that the relationship between support for a military resolution and race depends on political knowledge. At level \(\alpha = 0.05\), we test the hypotheses

\[ H_0: \beta_9 = 0 \qquad H_a: \beta_9 \neq 0. \]

The two-tailed p-value is 0.08 > 0.05, which means that we fail to reject the null hypothesis. Therefore, there is insufficient evidence to indicate that the relationship between support for a military resolution and race depends on political knowledge, with all other variables held constant.

Because \(R^2 = 0.194\), 19.4% of the variation in the support for military resolution is explained by the model containing the seven independent variables and the two interaction terms.

f. The hypotheses are

\[ H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = \beta_8 = \beta_9 = 0 \]
\[ H_a: \text{At least one } \beta_i \neq 0, \; i = 1, 2, 3, \ldots, 9. \]

\[ F = \frac{0.194/9}{(1 - 0.194)/[1763 - (9 + 1)]} = 46.88. \]

The rejection region requires \(\alpha = 0.05\) in the upper tail of the F distribution with \(\nu_1 = k = 9\) and \(\nu_2 = n - (k + 1) = 1753\) degrees of freedom. The rejection region is \(F > F_{0.05,9,1753} \approx 1.88\). Because the observed F statistic falls into the rejection region (F = 46.88 > 1.88), we reject the null hypothesis. There is sufficient evidence to indicate that the model is useful at level \(\alpha = 0.05\).

7. Exercise 4.28 (10 Points)

a. If meaningful, the term \(\hat{\beta}_0 = 325{,}790\) would have meant that the mean percentage of motor vehicles without catalytic converters would be 325,790% if the year is 0. However, it has no practical interpretation because \(x = 0\) is not in the observed range. We have data for the years 1984 to 1999.

b. The term \(\hat{\beta}_1 = -321.67\) should not be interpreted as a slope because of the presence of the quadratic term, \(x^2\).
It is just a shift parameter and has no practical interpretation.

c. The value of \(\hat{\beta}_2 = 0.794\) is positive, indicating an upward curvature in the sample data.

d. Since we have no idea of the relationship between y and x taking place outside of the observed range, we should not use the least squares prediction equation to predict the value of y for x outside that range. In this case, 2027 is well outside the range of (1984, 1999).

a. The 95% prediction interval for y when \(x_1 = 45\) and \(x_2 = 10\) is (1760, 4275). With 95% confidence, we conclude that the actual annual earnings for a street vendor of age 45 who works 10 hours a day is between $1760.00 and $4275.00.

b. The 95% confidence interval for \(E(y)\) when \(x_1 = 45\) and \(x_2 = 10\) is (2620, 3415). With 95% confidence, we conclude that the mean annual earnings for a 45-year-old street vendor working 10 hours a day is between $2620.00 and $3415.00.

c. Yes. Individuals are more variable than means, so we will have a larger margin of error for intervals containing a range of predicted values for an individual y versus the mean of y, \(E(y)\).

d. The test statistic is

\[ F = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]}. \]

9. Exercise 4.78 (10 Points)

a. To determine whether the model is adequate for predicting subordinate performance y at level \(\alpha = 0.10\), we test the following hypotheses:

\[ H_0: \beta_1 = \beta_2 = \beta_3 = 0 \qquad H_a: \text{At least one of the } \beta_i \neq 0 \text{ for } i = 1, 2, 3. \]

The test statistic (formula on p. 185 of the textbook) is

\[ F = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{0.22/3}{(1 - 0.22)/(89 - (3 + 1))} = 7.99145. \]

The p-value is

\[ p = P(F_{3,85} > 7.99145) = 0.000093 < 0.10 = \alpha. \]

Thus, we reject the null hypothesis. We conclude that there is sufficient evidence to indicate the model is adequate for predicting subordinate performance at level \(\alpha = 0.10\). Alternatively, we can compare the F statistic with the critical value for the F distribution (where \(\alpha = 0.10\) is the area in the upper tail). This critical value is \(F_{0.10,3,85} \approx 2.15\), so the rejection region is \(\{F : F > 2.15\}\).
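The arithmetic behind the global F test in part a can be checked with a short sketch. The function name `global_f_statistic` and its argument names are illustrative choices, not something from the textbook or the SAS output; the critical value 2.15 is simply the one quoted in the solution rather than computed here.

```python
def global_f_statistic(r_squared: float, k: int, n: int) -> float:
    """F statistic for H0: beta_1 = ... = beta_k = 0, computed from R^2.

    Implements F = (R^2 / k) / ((1 - R^2) / (n - (k + 1))).
    """
    numerator = r_squared / k
    denominator = (1.0 - r_squared) / (n - (k + 1))
    return numerator / denominator


# Exercise 4.78, part a: R^2 = 0.22, k = 3 predictors, n = 89 observations.
f_478 = global_f_statistic(0.22, 3, 89)

# Critical value F_{0.10,3,85} as quoted in the solution (not computed here).
critical_value = 2.15
reject_null = f_478 > critical_value

print(f"F = {f_478:.5f}, reject H0: {reject_null}")  # F is about 7.99

# The same formula applied to the nine-term model (R^2 = 0.194, n = 1763).
f_nine_terms = global_f_statistic(0.194, 9, 1763)
print(f"F = {f_nine_terms:.2f}")  # about 46.88
```

Working from \(R^2\) alone is convenient when only summary output is available; an exact p-value would additionally need an F distribution routine from a statistics library.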
b. For \(x_2 = 1\) (low conflict legitimization),

\[ \hat{y} = 7.09 - 0.44 x_1 - (0.01 \cdot 1) + 0.06 (x_1 \cdot 1) = 7.08 - 0.38 x_1. \]

For \(x_2 = 7\) (high conflict legitimization),

\[ \hat{y} = 7.09 - 0.44 x_1 - (0.01 \cdot 7) + 0.06 (x_1 \cdot 7) = 7.02 - 0.02 x_1. \]

Below is the graph with the lines for \(\hat{y}\) as a function of \(x_1\) when \(x_2 = 1\) and \(x_2 = 7\), respectively. The dotted line corresponds to the high conflict legitimization, and the solid line corresponds to the low conflict legitimization. We see that for \(x_2 = 1\) (low conflict legitimization), y declines much faster with increasing \(x_1\).

[Figure: Plot of Least Squares Prediction Equation for Subordinate Performance as a Function of Group Decision Method for Low and High Conflict Legitimization. Lines shown: \(\hat{y} = 7.02 - 0.02 x_1\) and \(\hat{y} = 7.08 - 0.38 x_1\).]

c. To determine if the relationship between subordinate performance (y) and manager's use of a group decision method (\(x_1\)) depends on the manager's legitimization of conflict (\(x_2\)), we test the following hypotheses:

\[ H_0: \beta_3 = 0 \qquad H_a: \beta_3 \neq 0. \]

Use the significance level of \(\alpha = 0.10\), as described in the problem. The t statistic is 1.85, where we have \(89 - (3 + 1) = 85\) degrees of freedom. Thus, the p-value is

\[ p = 2 \cdot P(t_{85} > 1.85) = 0.067788 < 0.10. \]

Because the p-value falls below the significance level, we reject the null hypothesis. Thus, at level \(\alpha = 0.10\), the relationship between subordinate performance and manager's use of group decision method depends on a manager's legitimization of conflict. Alternatively, the critical t value is \(t_{0.05,85} = 1.663\); since 1.85 > 1.663, we reach the same conclusion.

d. Based on the results of part c, the researchers should not conduct t tests on \(\beta_1\) and \(\beta_2\). If an interaction term is significant, the main effects of each of the variables may be covered up. According to the textbook (p. 195), if an interaction term is significant, the main effect terms should be included in the model, regardless of the magnitude of the p-values.
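The two conditional lines in part b follow from collecting terms in the interaction model \(\hat{y} = 7.09 - 0.44 x_1 - 0.01 x_2 + 0.06 x_1 x_2\): for a fixed \(x_2\), the intercept is \(7.09 - 0.01 x_2\) and the slope in \(x_1\) is \(-0.44 + 0.06 x_2\). The sketch below, with the hypothetical helper name `conditional_line`, reproduces that step using the coefficients read off the worked equations above.

```python
def conditional_line(x2: float,
                     b0: float = 7.09,
                     b1: float = -0.44,
                     b2: float = -0.01,
                     b3: float = 0.06) -> tuple[float, float]:
    """Return (intercept, slope) of y-hat as a function of x1 for a fixed x2.

    Model: y-hat = b0 + b1*x1 + b2*x2 + b3*x1*x2, so at fixed x2
    the intercept is b0 + b2*x2 and the slope is b1 + b3*x2.
    """
    intercept = b0 + b2 * x2
    slope = b1 + b3 * x2
    return intercept, slope


low_intercept, low_slope = conditional_line(1)    # low conflict legitimization
high_intercept, high_slope = conditional_line(7)  # high conflict legitimization

print(f"x2 = 1: y-hat = {low_intercept:.2f} + ({low_slope:.2f}) x1")
print(f"x2 = 7: y-hat = {high_intercept:.2f} + ({high_slope:.2f}) x1")
```

The slope changing from about -0.38 to about -0.02 as \(x_2\) goes from 1 to 7 is exactly what the interaction term expresses, which is why the t tests on the main effects alone are not informative here.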
Exercise 4.80 (10 Points)

a. We interpret the model coefficients as follows:

• \(\hat{\beta}_0 = -105\): If meaningful, we would have estimated the mean daily admissions for overcast weekdays with a predicted daily high of 0°F to be -105. However, this is not a very practical interpretation due to extrapolation.
• \(\hat{\beta}_1 = 25\): We estimate the difference in weekend and weekday mean daily admissions to be 25, holding the weather conditions and temperature constant.