Lecture 10 Multiple Linear Regression, Study notes of Computer Science


Typology: Study notes

2021/2022

Uploaded on 08/05/2022

char_s67
Lecture 10
Multiple Linear Regression
STAT 512
Spring 2011
Background Reading
KNNL: 6.1-6.5


Topic Overview

Multiple Linear Regression Model

Multicollinearity

• Predictor variables are often correlated with each other.

• If predictor variables are highly correlated, they will be “fighting” to explain the same part of the variation in the response variable.

• Caution: Using highly correlated predictor variables in the same model will not lead to useful parameter estimates. Be careful of this.
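A small simulation can make the caution concrete. The sketch below is not from the lecture: it assumes NumPy, a made-up data-generating model, and a hypothetical helper `slope_sd`. It compares how much the least-squares estimate of one slope varies across repeated samples when the two predictors are uncorrelated versus highly correlated.

```python
import numpy as np

def slope_sd(rho, n=50, reps=500, seed=0):
    """Empirical SD of the estimate of beta_1 when corr(X1, X2) is about rho."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(reps):
        x1 = rng.normal(size=n)
        # x2 shares a fraction of its variation with x1, controlled by rho
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        # Made-up true model: Y = 1 + 2*X1 + 3*X2 + error
        y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2])
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(b[1])
    return float(np.std(estimates))

sd_low = slope_sd(rho=0.0)    # uncorrelated predictors
sd_high = slope_sd(rho=0.99)  # highly correlated predictors
print(sd_low, sd_high)
```

With highly correlated predictors the estimate of the same slope should bounce around much more from sample to sample, which is one way to read "not useful parameter estimates".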

Multiple Regression Model

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i$$

for $i = 1, 2, \ldots, n$ observations

• Assumptions exactly as before: $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$

• $Y_i$ is the value of the response variable for the $i$th case.

• $X_{ik}$ is the value of the $k$th explanatory variable for the $i$th case.

Regression Plane/Surface

Model in Matrix Form

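$$\mathbf{Y}_{n \times 1} = \mathbf{X}_{n \times p}\, \boldsymbol{\beta}_{p \times 1} + \boldsymbol{\varepsilon}_{n \times 1}$$

$$\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_{n \times n})$$

$$\mathbf{Y} \sim N(\mathbf{X}\boldsymbol{\beta}, \sigma^2 \mathbf{I})$$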

Coefficient matrix

$$\boldsymbol{\beta}_{p \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}$$
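To see the dimensions in the matrix form, here is a minimal sketch assuming NumPy. The seed, sample size, coefficient values, and error standard deviation are all made up for illustration; the design matrix gets a leading column of ones for the intercept, so it has $p$ columns for $p - 1$ predictors.

```python
import numpy as np

rng = np.random.default_rng(512)   # arbitrary seed
n, p = 20, 3                       # p counts the intercept plus two predictors

predictors = rng.uniform(0, 10, size=(n, p - 1))   # X_{i1}, X_{i2}
X = np.column_stack([np.ones(n), predictors])      # n x p design matrix
beta = np.array([5.0, 2.0, -1.0])                  # p coefficients (made-up values)
sigma = 1.5                                        # made-up error SD
eps = rng.normal(0.0, sigma, size=n)               # errors ~ N(0, sigma^2 I)

Y = X @ beta + eps                                 # response vector of length n
print(X.shape, beta.shape, Y.shape)
```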

Least Squares Solution

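• Minimize distances between the points and the response surface.

• Find $\mathbf{b}$ to minimize $SSE = (\mathbf{Y} - \mathbf{X}\mathbf{b})'(\mathbf{Y} - \mathbf{X}\mathbf{b})$

• Obtain normal equations as before: $\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{Y}$

• Least squares solution as before: $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$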
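The normal equations can be solved directly in a few lines. This is a sketch assuming NumPy and a small made-up data set (six cases, two predictors); it solves $\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{Y}$ as a linear system rather than forming the inverse explicitly, which gives the same $\mathbf{b}$ and is usually the numerically safer choice.

```python
import numpy as np

# Made-up data: 6 cases, intercept column plus two predictors (p = 3).
X = np.array([
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 4.0],
    [1.0, 4.0, 3.0],
    [1.0, 5.0, 6.0],
    [1.0, 6.0, 5.0],
])
Y = np.array([6.1, 6.9, 13.2, 13.8, 20.1, 20.9])

# Solve the normal equations X'X b = X'Y for b.
b = np.linalg.solve(X.T @ X, X.T @ Y)

e = Y - X @ b          # residuals
sse = float(e @ e)     # SSE = (Y - Xb)'(Y - Xb)
print(b, sse)
```

A quick check that `b` really minimizes SSE: the residuals should be orthogonal to every column of `X`, that is, `X.T @ e` should be numerically zero.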

“Linear” Regression Models

• The term linear here refers to the parameters, not the predictor variables.

• We can use linear regression models to deal with almost any “function” of a predictor variable (e.g. $X^2$, $\log(X)$, etc.)

• We cannot use linear regression models to deal with nonlinear functions of the parameters (unless we can find a transformation that makes them linear).
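Both points can be illustrated with a short sketch, assuming NumPy and made-up, noise-free data so the fits can be checked exactly. The first fit uses $X^2$ and $\log(X)$ as columns of the design matrix; the second takes a model that is nonlinear in its parameters, $y = a\,e^{cx}$ (chosen here only as an example), and makes it linear by taking logs.

```python
import numpy as np

x = np.linspace(1.0, 10.0, 30)

# 1) Functions of the predictor: still linear in the betas.
beta_true = np.array([2.0, 0.5, -3.0])                    # made-up values
X = np.column_stack([np.ones_like(x), x**2, np.log(x)])   # columns: 1, x^2, log(x)
y = X @ beta_true                                         # noise-free response
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# 2) Nonlinear in the parameters: y = a * exp(c * x).
#    Taking logs gives log(y) = log(a) + c * x, which is linear.
a_true, c_true = 4.0, 0.3                                 # made-up values
y2 = a_true * np.exp(c_true * x)
X2 = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X2, np.log(y2), rcond=None)
a_hat, c_hat = float(np.exp(coef[0])), float(coef[1])

print(b, a_hat, c_hat)
```

With real, noisy data the log transformation also changes the error structure, so the transformed fit is not equivalent to fitting the original nonlinear model.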

Types of Predictors

• Continuous Predictors – we are used to these.

• Qualitative Predictors
  - Two possible outcomes (e.g. male/female) represented by 0 or 1

• Polynomial Regression
  - Use squared or higher-order terms in the regression model.
  - Typically include the lower-order terms as well.

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_{p-1} X_i^{p-1} + \varepsilon_i$$
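All three kinds of predictor end up as columns of the same design matrix. The sketch below assumes NumPy and made-up, noise-free data; it combines a continuous predictor, its square (with the lower-order term kept), and a 0/1 indicator for a two-level qualitative variable.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])   # continuous predictor
group = np.array([0, 1, 0, 1, 0, 1, 0, 1])                # qualitative predictor coded 0/1

# Columns: intercept, x, x^2 (lower-order term x is kept), indicator
X = np.column_stack([np.ones_like(x), x, x**2, group])

beta_true = np.array([1.0, 0.5, 0.25, 2.0])   # made-up coefficients
y = X @ beta_true                             # noise-free, so the fit can be checked exactly

b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b, 6))
```

Here the coefficient on the indicator column is the shift in the response between the two groups at any fixed value of `x`.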

Analysis of Variance

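• Formulas for sums of squares (in matrix terms) are the same as before:

$$SSR = \mathbf{b}'\mathbf{X}'\mathbf{Y} - \frac{1}{n}\mathbf{Y}'\mathbf{J}\mathbf{Y} = \sum (\hat{Y}_i - \bar{Y})^2$$

$$SSE = \mathbf{e}'\mathbf{e} = \mathbf{Y}'\mathbf{Y} - \mathbf{b}'\mathbf{X}'\mathbf{Y} = \sum (Y_i - \hat{Y}_i)^2$$

$$SSTO = \mathbf{Y}'\mathbf{Y} - \frac{1}{n}\mathbf{Y}'\mathbf{J}\mathbf{Y} = \sum (Y_i - \bar{Y})^2$$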
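The matrix and summation forms can be checked against each other numerically. This sketch assumes NumPy and simulated data with made-up coefficients; `J` is the $n \times n$ matrix of ones, so $\frac{1}{n}\mathbf{Y}'\mathbf{J}\mathbf{Y} = n\bar{Y}^2$.

```python
import numpy as np

rng = np.random.default_rng(10)   # arbitrary seed
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([3.0, 1.0, -2.0]) + rng.normal(size=n)   # made-up model

b = np.linalg.solve(X.T @ X, X.T @ Y)   # least squares solution
J = np.ones((n, n))                     # n x n matrix of ones

# Matrix forms
ssr = b @ X.T @ Y - (Y @ J @ Y) / n
sse = Y @ Y - b @ X.T @ Y
ssto = Y @ Y - (Y @ J @ Y) / n

# Summation forms
fitted = X @ b
ssr_sum = np.sum((fitted - Y.mean()) ** 2)
sse_sum = np.sum((Y - fitted) ** 2)
ssto_sum = np.sum((Y - Y.mean()) ** 2)

print(ssr, sse, ssto)
```

The two versions should agree up to rounding, and SSR + SSE should reproduce SSTO.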

Analysis of Variance (2)

• Degrees of freedom depend on the model.

• Always $n - 1$ total degrees of freedom.

• Model degrees of freedom is equal to the number of terms in the model, not counting the intercept: $p - 1$
  - Each variable has at least one term
  - May be additional terms for squares, interactions, etc.

• Error degrees of freedom is the difference between total and model degrees of freedom: $(n - 1) - (p - 1) = n - p$

ANOVA Table

Source               df       SS                                      MS              F

Regression (Model)   p - 1    $SSR = \sum (\hat{Y}_i - \bar{Y})^2$    $SSR / df_R$    $MSR / MSE$

Error                n - p    $SSE = \sum (Y_i - \hat{Y}_i)^2$        $SSE / df_E$

Total                n - 1    $SSTO = \sum (Y_i - \bar{Y})^2$         $SSTO / df_T$


F-test for model significance

• The ratio $F = MSR / MSE$ is again used to test for a regression relationship.

• Difference from SLR:
  - Null Hyp: $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$
  - Alt Hyp: $H_a$: at least one $\beta_k \neq 0$

• Tests model significance, not individual variables; gives no indication of which variable(s) in particular are important.
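The overall F statistic follows directly from the ANOVA quantities. The sketch below assumes NumPy, a hypothetical helper `overall_f`, and simulated data in which only the first predictor has a nonzero slope (a made-up setup, not an example from the lecture).

```python
import numpy as np

def overall_f(X, Y):
    """Return (F, df_model, df_error) for H0: all slopes are zero.

    X must already contain the intercept column, so it has p columns.
    """
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ b
    ssr = np.sum((fitted - Y.mean()) ** 2)
    sse = np.sum((Y - fitted) ** 2)
    df_model, df_error = p - 1, n - p
    msr, mse = ssr / df_model, sse / df_error
    return float(msr / mse), df_model, df_error

rng = np.random.default_rng(2011)   # arbitrary seed
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
# Made-up truth: only X1 matters; the slope on X2 is zero.
Y = 1.0 + 2.0 * X[:, 1] + 0.0 * X[:, 2] + rng.normal(size=n)

F, df_model, df_error = overall_f(X, Y)
print(F, df_model, df_error)
```

A large F here says only that at least one slope is nonzero; it does not reveal that the second predictor is irrelevant. The p-value would come from comparing F to an $F(p - 1,\, n - p)$ distribution, for instance with `scipy.stats.f.sf` if SciPy is available.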