Midterm Exam 1 Questions with Solutions - Applied Regression Analysis | STAT 4230

SPRING 2008, STAT 4/6230
FIRST MIDTERM

NAME:
STUDENT ID NUMBER: (This is your 810-number)

INSTRUCTIONS

For all students: There are four problems, each with 3 parts. Write your answers in the space provided. If you need more space, you can use the back of a page, but clearly indicate where the continuation of your answer can be found. For all questions, show your work. For example, don't just answer a question with 'yes' or 'no' or by writing a number without any explanation.

For STAT 4230 students: Answer any 10 parts. Each has a maximum score of 10 points. (You may attempt more than 10 parts, in which case you will receive credit for your 10 highest scores only.)

For STAT 6230 students: Answer all 12 parts. Each has a maximum score of 8.33 points.

Good luck!

PROBLEM 1. A car dealer specializing in Corvettes offered a number of used models for sale at a reopening celebration. The sales price (rounded to the nearest thousand dollars) and age (in years) are shown on page 3 for 15 models. Also shown are a scatter plot with the regression line and some SAS output from the regression of price on age.

a) Based on the plot and the SAS output, comment on the model. For example, is it a useful model? Is it a good model? Explain your answers.

Answer: [Handwritten, mostly illegible in the scan. The legible fragments cite the p-value of the global F-test (< .0001) and the pattern of the points in the plot.]
b) Based on the fitted model, provide an estimate for the change in price for every increase of 5 years in age.

Answer: [Handwritten, partly illegible. It appears to conclude that the price decreases by about $17,000 for every 5-year increase in age; the exact figure cannot be read in the scan.]

c) Predict the price of an 18-year-old model. Also, explain why this prediction may not be very reliable.

Answer: [Handwritten, mostly illegible in the scan; neither the computation nor the explanation can be recovered.]

PROBLEM 3. [Introductory text and handwritten notes, mostly illegible; the legible fragments mention interaction and high correlation.]

a) Consider the interaction model for two quantitative variables $x_1$ and $x_2$. Describe in words, possibly supported by graphs, what flexibility this model provides over the first-order model.

Answer: In the interaction model, for a fixed value of $x_2$, $E(y) = (\beta_0 + \beta_2 x_2) + (\beta_1 + \beta_3 x_2) x_1$. Thus the slope of the line relating $y$ and $x_1$ can change depending on the value of $x_2$, or, equivalently, the effect of a one-unit change in $x_1$ on $y$ depends on the value of $x_2$.

b) If a data set contains 18 observations and we fit a complete second-order model in two independent quantitative variables, how many degrees of freedom will be left for error?

Answer: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$, so df for error $= n - (k+1) = 18 - 6 = 12$.

c) A researcher wants to study whether, for young two-parent families, total years of education for the parents ($x_1$) and family income ($x_2$) are useful predictors for whether the family has small children ($y = 1$) or not ($y = 0$). He proposes to use the model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, with assumptions about the error terms as discussed in class. Do you think that this is a good model for this problem? Why or why not?

Answer: No, this is not a good model. [Remainder handwritten and mostly illegible; the legible fragments refer to the values that $y$ can take and to the normal distribution.]

PROBLEM 4. To predict sales prices for houses in a certain neighborhood, a multiple linear regression model was built with sales price as the response variable ($y$) and property tax for the current year ($x_1$), the number of baths ($x_2$), lot size ($x_3$), living space ($x_4$) and the number of garage stalls ($x_5$) as the independent variables. The model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$ was fitted to data based on 24 recent sales. Edited output from a SAS program for these data and this model can be found on page 7.

a) A student who looked at the output on page 7 concluded that if forward selection were used with an alpha of .10, then the first variable to be included in the model would be $x_1$. Does the output that is shown on page 7 contain enough information to reach this conclusion? Explain.

Answer: No. In order to decide which variable should be included first, we need to fit the five one-variable models $E(y) = \beta_0 + \beta_1 x_i$, $i = 1, \dots, 5$, and select the variable in the model with the largest |t|-value (equivalently, the smallest p-value).

b) After some further thought, the student who was introduced to you in the previous part of this problem also concluded that if backward elimination were used with an alpha of .10, then the first variable to be removed would be $x_4$. Does the output shown on page 7 contain enough information to reach this conclusion? Explain.
Answer: Yes. Backward elimination starts by fitting the model with all the x's and drops the variable with the largest p-value, provided it is > 0.10, which is variable $x_4$.

c) When using all-possible-regressions with the adjusted R-square value for these data, the model that included only $x_1$ (property tax) and $x_2$ (number of baths) yielded the largest adjusted R-square. The real estate agent who had requested this study responded that this seemed absurd to her. "My 10-year-old knows," she exclaimed in exasperation, "that living space ($x_4$) is important to predict housing prices, and you tell me now that I should not use this variable at all!" What would you tell her?

Answer: [Handwritten, partly illegible.] The absence of $x_4$ from the model does not mean that $x_4$ is not highly correlated with $y$. Most likely $x_4$ is correlated with $x_1$ and/or $x_2$, so most of the variation in price that can be explained by $x_4$ can also be explained from knowledge of $x_1$ and $x_2$. [Remainder illegible; it appears to note that houses with more baths tend to have more living space.]

PROC REG OUTPUT

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5        678.84652     135.76930     16.27   <.0001
Error             18        150.19973       8.34443
Corrected Total   23        829.04625

Root MSE          2.88867    R-Square   0.8188
Dependent Mean   34.61250    Adj R-Sq   0.7685
Coeff Var         8.34575

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              9.49785          3.26988      2.90     0.0094
x1           1              2.15009          0.72575      2.96     0.0083
x2           1              7.14931          3.87804      1.84     0.081?
x3           1              0.41668          0.44132      0.94     0.3576
x4           1             -0.77429          3.78160     -0.20     0.8401
x5           1              1.14477          1.1?918      1.03     0.3187

(? marks a digit that is illegible in the scan.)
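The arithmetic behind the degrees-of-freedom answer in Problem 3 b) and the PROC REG summary for Problem 4 can be checked with a short script. This is only a sketch: the function name `error_df` and all variable names are illustrative choices, the model sum of squares is derived as total minus error rather than read from the scan, and the p-value for x2 is truncated to 0.081 because its last digit is illegible.

```python
import math


def error_df(n, k):
    """Error degrees of freedom for n observations and k predictors plus an intercept."""
    return n - (k + 1)


# Problem 3 b): a complete second-order model in two variables has k = 5 terms.
df_second_order = error_df(18, 5)

# Problem 4: sums of squares as printed in the output (24 sales, 5 predictors).
n, k = 24, 5
ss_error, ss_total = 150.19973, 829.04625
df_err = error_df(n, k)

ss_model = ss_total - ss_error          # model SS derived from the other two rows
ms_model = ss_model / k
ms_error = ss_error / df_err

f_value = ms_model / ms_error
r_square = ss_model / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_err
root_mse = math.sqrt(ms_error)

# First step of backward elimination with alpha = 0.10:
# drop the predictor with the largest p-value if that p-value exceeds alpha.
# The x2 entry is truncated (assumption: last digit illegible in the scan).
p_values = {"x1": 0.0083, "x2": 0.081, "x3": 0.3576, "x4": 0.8401, "x5": 0.3187}
alpha = 0.10
candidate = max(p_values, key=p_values.get)
first_removed = candidate if p_values[candidate] > alpha else None

print("df for error, Problem 3 b):", df_second_order)
print("df for error, Problem 4:", df_err)
print("F value:", round(f_value, 2))
print("R-square:", round(r_square, 4), "adjusted:", round(adj_r_square, 4))
print("Root MSE:", round(root_mse, 5))
print("first variable removed:", first_removed)
```

The computed values agree with the printed output (F = 16.27, R-Square = 0.8188, Adj R-Sq = 0.7685, Root MSE = 2.88867), and the backward-elimination step selects x4, matching the answer to part b). The same check cannot be done for forward selection, since the one-variable fits it needs are not part of this output.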