









































Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Adjusted R2 can even be negative, whereas R2 itself must be between 0 and 1. A value of that is substantially smaller than R2 itself is a warning that the model ...
Typology: Study notes
1 / 49
This page cannot be seen from the preview
Don't miss anything!










































Chapter 12
Definition The multiple regression model equation is
2 . Again, it is assumed that ε is normally distributed. This is not a regression line any longer, but a regression surface and we relate y to more than one predictor variable x 1 , x 2 , … , xp. (ex. Blood sugar level vs. weight and age)
The multiple regression model can be written in matrix form.
To estimate the parameters b 0 , b 1 ,..., b p using the principle of least squares, form the sum of squared deviations of the observed yj ’s from the regression line: The least squares estimates are those values of the b i s that minimize the equation. You could do this by taking the partial derivative w.r.t. to each parameter, and then solving the k+ unknowns using the k+1 equations (akin to the simple regression method). But we don’t do it that way. Q = " #$ % & $'( = " (*$ − ,- − ,(.($ − ⋯ − , 0. 1 $ ) % & $'(
Suppose, for example, that y is the lifetime of a certain tool, and that there are 3 brands of tool being investigated. Let: x 1 = 1 if tool A is used, and 0 otherwise, x 2 = 1 if tool B is used, and 0 otherwise, x 3 = 1 if tool C is used, and 0 otherwise. Then, if an observation is on a: brand A tool: we have x 1 = 1 and x 2 = 0 and x 3 = 0, brand B tool: we have x 1 = 0 and x 2 = 1 and x 3 = 0, brand C tool: we have x 1 = 0 and x 2 = 0 and x 3 = 1. What would our X matrix look like?
2
Just as before, the total sum of squares is SST = S( yi – y ) 2 , And the regression sum of squares is: Then the coefficient of multiple determination R 2 is R 2 = 1 – SSE/SST = SSR/SST It is interpreted in the same way as before.
2 !!" = $ (&' − &)
= !!+ − !!,.
2 Unfortunately, there is a problem with R 2 : Its value can be inflated by adding lots of predictors into the model even if most of these predictors are frivolous.
2 The objective in multiple regression is not simply to explain most of the observed y variation, but to do so using a model with relatively few predictors that are easily interpreted. It is thus desirable to adjust R 2 to take account of the size of the model: !"
= 1 − ''( ) − * + 1 '', ) − 1 = 1 − ) − 1 ) − (* + 1 ) × ''( '', 0
2 Because the ratio in front of SSE/SST exceeds 1, is smaller than R 2
. Furthermore, the larger the number of predictors p relative to the sample size n , the smaller will be relative to R 2 . Adjusted R 2 can even be negative, whereas R 2 itself must be between 0 and 1. A value of that is substantially smaller than R 2 itself is a warning that the model may contain too many predictors.
Investigators carried out a study to see how various characteristics of concrete are influenced by x 1 = % limestone powder x 2 = water cement ratio, resulting in data published in “Durability of Concrete with Addition of Limestone Powder,” Magazine of Concrete Research , 1996: 131 – 137.
Consider predicting compressive strength (strength) with percent limestone powder (perclime) and water cement ratio (watercement). > fit = lm(strength ~ perclime + watercement, data = dataset) > summary(fit) ... Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 86.2471 21.7242 3.970 0.00737 ** perclime 0.1643 0.1993 0.824 0. watercement - 80.5588 35.1557 - 2.291 0..
Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 4.832 on 6 degrees of freedom Multiple R-squared: 0.4971, Adjusted R-squared: 0. F-statistic: 2.965 on 2 and 6 DF, p-value: 0. cont’d
Important Questions:
The model utility test in simple linear regression involves
no useful linear relation between y and the predictor x. In MLR we test the hypothesis
which says that there is no useful linear relationship between y and any of the p predictors. If at least one of
and be very conservative (if Bonferroni correction is used). A better test is a joint test, and is based on a statistic that has an F distribution when H 0 is true.