Hypothesis Testing in Multiple Linear Regression | BIOST 515, Study notes of Biostatistics

Material Type: Notes; Class: BIOSTATISTICS II; Subject: Biostatistics; University: University of Washington - Seattle; Term: Unknown 1989;

Lecture 5
Hypothesis Testing in Multiple Linear
Regression
BIOST 515
January 20, 2004

Types of tests

  • Overall test
  • Test for addition of a single variable
  • Test for addition of a group of variables

Test for addition of a single variable

Does the addition of one particular variable of interest add significantly to the prediction of y achieved by the other independent variables already in the model?

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{ip}\beta_p + \epsilon_i$$

Test for addition of a group of variables

Does the addition of some group of independent variables of interest add significantly to the prediction of y obtained through other independent variables already in the model?

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{i,p-1}\beta_{p-1} + x_{ip}\beta_p + \epsilon_i$$

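The ANOVA table for

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p + \epsilon_i$$

is often provided in the output from statistical software as

| Source of variation | Sums of squares | Degrees of freedom | F |
|---|---|---|---|
| Regression: $x_1$ | $SSR(x_1)$ | 1 | |
| $x_2 \mid x_1$ | $SSR(x_2 \mid x_1)$ | 1 | |
| $\vdots$ | $\vdots$ | $\vdots$ | |
| $x_p \mid x_{p-1}, x_{p-2}, \ldots, x_1$ | $SSR(x_p \mid x_{p-1}, x_{p-2}, \ldots, x_1)$ | 1 | |
| Error | $SSE$ | $n - (p + 1)$ | |
| Total | $SSTO$ | $n - 1$ | |

where $SSR = SSR(x_1) + SSR(x_2 \mid x_1) + \cdots + SSR(x_p \mid x_{p-1}, x_{p-2}, \ldots, x_1)$ and has p degrees of freedom.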
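The decomposition of SSR into sequential sums of squares can be checked numerically. The sketch below uses simulated data; the variable names (`x1`, `x2`, `x3`, `y`) and the coefficients are hypothetical and are not part of the CHS example.

```r
# Simulated data: names and coefficients are illustrative assumptions only.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3)
tab <- anova(fit)  # rows hold the sequential sums of squares

# SSR(x1), SSR(x2 | x1), SSR(x3 | x2, x1), in the order the terms were entered
seq_ss <- tab[c("x1", "x2", "x3"), "Sum Sq"]

# Overall regression sum of squares (corrected for the mean)
ssr <- sum((fitted(fit) - mean(y))^2)

# The two quantities should agree up to floating-point rounding
all.equal(sum(seq_ss), ssr)
```

Because the sums of squares are sequential, changing the order of the terms in the formula changes the individual rows but not their total.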

Overall test

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$$

$$H_1: \beta_j \neq 0 \text{ for at least one } j,\ j = 1, \ldots, p$$

Rejection of $H_0$ implies that at least one of the regressors, $x_1, x_2, \ldots, x_p$, contributes significantly to the model.

We will use a generalization of the F-test in simple linear regression to test this hypothesis.

CHS example, cont.

$$y_i = \beta_0 + \text{weight}_i\beta_1 + \text{height}_i\beta_2 + \epsilon_i$$

    anova(lmwtht)
    Analysis of Variance Table

    Response: DIABP
               Df Sum Sq Mean Sq F value   Pr(>F)
    WEIGHT      1   1289    1289 10.2240 0.001475 **
    HEIGHT      1    120     120  0.9498 0.
    Residuals 495  62426     126
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$$F_0 = \frac{SSR/p}{MSE} = \frac{(1289 + 120)/2}{126} = 5.59 > F_{2,495,.95} = 3.01$$

We reject the null hypothesis at $\alpha = .05$ and conclude that at least one of $\beta_1$ or $\beta_2$ is not equal to 0.
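As a check on the arithmetic, the overall F statistic can be recomputed from the rounded ANOVA entries. This is a minimal sketch; the variable names are illustrative, and the inputs are the rounded values printed in the table, so the result matches the software output only to about two decimals.

```r
# Rounded entries from the ANOVA table for DIABP ~ WEIGHT + HEIGHT
ssr_weight <- 1289   # SSR(WEIGHT)
ssr_height <- 120    # SSR(HEIGHT | WEIGHT)
mse        <- 126    # residual mean square
p          <- 2      # number of predictors
df_resid   <- 495    # n - p - 1

# Overall F statistic: (SSR / p) / MSE
f0 <- ((ssr_weight + ssr_height) / p) / mse

# Critical value and p-value from the F distribution with (p, n - p - 1) df
f_crit <- qf(0.95, df1 = p, df2 = df_resid)
p_val  <- pf(f0, df1 = p, df2 = df_resid, lower.tail = FALSE)

round(f0, 2)   # 5.59
f0 > f_crit    # TRUE means H0 is rejected at alpha = 0.05
```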

The overall F statistic is also available from the output of summary().

    summary(lmwtht)

    Call:
    lm(formula = DIABP ~ WEIGHT + HEIGHT, data = chs)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 55.65777    8.91267   6.245 9.14e-10 ***
    WEIGHT       0.04140    0.01723   2.403   0.0166 *
    HEIGHT       0.05820    0.05972   0.975   0.
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 11.23 on 495 degrees of freedom
    Multiple R-Squared: 0.02208,  Adjusted R-squared: 0.
    F-statistic: 5.587 on 2 and 495 DF,  p-value: 0.

To test for the addition of a single variable $x_j$, consider

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{ij}\beta_j + \cdots + x_{ip}\beta_p + \epsilon_i$$

$$H_0: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$

As in simple linear regression, under the null hypothesis

$$t_0 = \frac{\hat{\beta}_j}{\widehat{se}(\hat{\beta}_j)} \sim t_{n-p-1}.$$

We reject $H_0$ if $|t_0| > t_{n-p-1,\,1-\alpha/2}$.

This is a partial test because $\hat{\beta}_j$ depends on all of the other predictors $x_i$, $i \neq j$, that are in the model. Thus, this is a test of the contribution of $x_j$ given the other predictors in the model.

CHS example, cont.

$$y_i = \beta_0 + \text{weight}_i\beta_1 + \text{height}_i\beta_2 + \epsilon_i$$

$H_0: \beta_2 = 0$ vs $H_1: \beta_2 \neq 0$, given that weight is in the model.

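From the ANOVA table, $\hat{\sigma}^2 = 126.11$.

$$C = (X'X)^{-1} = \begin{pmatrix} 0.6299 & 2.329 \times 10^{-4} & -4.050 \times 10^{-3} \\ 2.329 \times 10^{-4} & 2.353 \times 10^{-6} & -3.714 \times 10^{-6} \\ -4.050 \times 10^{-3} & -3.714 \times 10^{-6} & 2.828 \times 10^{-5} \end{pmatrix}$$

The estimated standard error of $\hat{\beta}_2$ is the square root of $\hat{\sigma}^2$ times the diagonal element of $C$ corresponding to height, so

$$t_0 = \frac{0.05820}{\sqrt{126.11 \times 2.828 \times 10^{-5}}} = 0.975 < t_{495,.975} = 1.96$$

Therefore, we fail to reject the null hypothesis.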
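The t statistic for height can be reproduced from the quantities quoted in this example. The sketch below is a plain arithmetic check; the variable names are illustrative, and the inputs are the rounded values given in the notes.

```r
# Values quoted in the CHS example
sigma2_hat <- 126.11     # estimate of sigma^2 (residual mean square)
c_height   <- 2.828e-5   # diagonal element of (X'X)^{-1} for HEIGHT
beta_hat   <- 0.05820    # estimated HEIGHT coefficient

# Standard error and t statistic for H0: beta_2 = 0
se_height <- sqrt(sigma2_hat * c_height)
t0        <- beta_hat / se_height

# Two-sided critical value at alpha = 0.05 with n - p - 1 = 495 df
t_crit <- qt(0.975, df = 495)

round(se_height, 5)   # 0.05972, the Std. Error reported by summary()
round(t0, 3)          # 0.975
abs(t0) > t_crit      # FALSE means we fail to reject H0
```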

Using sums of squares to test for groups of predictors

Determine the contribution of a predictor or group of predictors to SSR given that the other regressors are in the model using the extra-sums-of-squares method.

Consider the regression model with p predictors

$$y = X\beta + \epsilon.$$

We would like to determine if some subset of r < p predictors contributes significantly to the regression model.

Partition the vector of regression coefficients as

$$\beta = \begin{bmatrix} \beta^1 \\ \beta^2 \end{bmatrix}$$

where $\beta^1$ is $(p + 1 - r) \times 1$ and $\beta^2$ is $r \times 1$. We want to test the hypothesis

$$H_0: \beta^2 = 0 \qquad H_1: \beta^2 \neq 0$$

Rewrite the model as

$$y = X\beta + \epsilon = X^1\beta^1 + X^2\beta^2 + \epsilon, \qquad (1)$$

where $X = [X^1 \mid X^2]$.

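For the full model, the regression sum of squares is

$$SSR(X) = \hat{\beta}' X' y \quad (p + 1 \text{ degrees of freedom}),$$

and for the reduced model containing only $X^1$, with $\hat{\beta}^1 = (X^{1\prime} X^1)^{-1} X^{1\prime} y$,

$$SSR(X^1) = \hat{\beta}^{1\prime} X^{1\prime} y \quad (p + 1 - r \text{ degrees of freedom}).$$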

The regression sum of squares due to $X^2$ when $X^1$ is already in the model is

$$SSR(X^2 \mid X^1) = SSR(X) - SSR(X^1)$$

with r degrees of freedom. This is also known as the extra sum of squares due to $X^2$.

$SSR(X^2 \mid X^1)$ is independent of $MSE$. We can test $H_0: \beta^2 = 0$ with the statistic

$$F_0 = \frac{SSR(X^2 \mid X^1)/r}{MSE},$$

which follows an $F_{r,\,n-p-1}$ distribution under $H_0$.
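The extra-sums-of-squares test can be sketched in R by fitting the reduced and full models and comparing them. The data below are simulated, and the names (`x1`, `x2`, `x3`, `reduced`, `full`) are hypothetical; here $X^1$ holds the intercept and `x1`, and $X^2$ holds `x2` and `x3`, so r = 2 and p = 3.

```r
# Simulated data: names and coefficients are illustrative assumptions only.
set.seed(2)
n  <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 2 + 0.8 * x1 + rnorm(n)

reduced <- lm(y ~ x1)             # model with X^1 only
full    <- lm(y ~ x1 + x2 + x3)   # model with X^1 and X^2
r <- 2                            # number of coefficients tested
p <- 3                            # predictors in the full model

# Regression sums of squares, here corrected for the mean; the correction
# cancels in the difference because both models contain an intercept.
ssr_full    <- sum((fitted(full) - mean(y))^2)
ssr_reduced <- sum((fitted(reduced) - mean(y))^2)
extra_ss    <- ssr_full - ssr_reduced      # SSR(X^2 | X^1)

mse   <- sum(resid(full)^2) / (n - p - 1)  # MSE from the full model
f0    <- (extra_ss / r) / mse
p_val <- pf(f0, df1 = r, df2 = n - p - 1, lower.tail = FALSE)

# anova() on two nested fits performs the same comparison
cmp <- anova(reduced, full)
all.equal(f0, cmp$F[2])
```

Calling `anova()` with two nested `lm` fits is usually the more convenient route; the manual computation is shown only to mirror the formula.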

CHS example, cont.

  • Full model: $y_i = \beta_0 + \text{weight}_i\beta_1 + \text{height}_i\beta_2 + \epsilon_i$
  • $H_0: \beta_2 = 0$

                   Df   Sum Sq  Mean Sq F value Pr(>F)
        WEIGHT      1  1289.38  1289.38   10.22 0.
        HEIGHT      1   119.78   119.78    0.95 0.
        Residuals 495 62425.91   126.

  • $F_0 = 119.78 / 126.11 = 0.95 < F_{1,495,0.95} = 3.$
  • This should look very similar to the t-test for H
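The resemblance to the t-test can be checked numerically: with one numerator degree of freedom, the partial F statistic is the square of the t statistic, and the F critical value is the square of the two-sided t critical value. A minimal sketch using the rounded values quoted in the CHS example (variable names are illustrative):

```r
# Statistics quoted in the CHS example
t0 <- 0.975              # t statistic for HEIGHT from summary()
f0 <- 119.78 / 126.11    # partial F statistic: SSR(HEIGHT | WEIGHT) / MSE

round(f0, 2)     # 0.95
round(t0^2, 2)   # 0.95

# Critical values with 495 residual degrees of freedom
t_crit <- qt(0.975, df = 495)
f_crit <- qf(0.95, df1 = 1, df2 = 495)

# The squared t critical value should match the F critical value
all.equal(t_crit^2, f_crit, tolerance = 1e-6)
```

Both tests therefore lead to the same decision for a single coefficient.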