Regression Problem - Linear Regression Analysis | M 374G, Study notes of Mathematics

The University of Texas at Austin Mathematics

Material Type: Notes; Class: LINEAR REGRESSION ANALYSIS; Subject: Mathematics; University: University of Texas - Austin; Term: Fall 2004;

Typology: Study notes

Pre 2010

Uploaded on 08/30/2009

koofers-user-kdq-1 🇺🇸

M 374G/384G, Fall 2004

SELECTING TERMS (Supplement to Section 11.5)

Consider a regression problem where $E(Y \mid x) = \eta^T u$ is the correct model for the mean function. Often such a model has too many terms to be usable. Can some terms be deleted without important loss of information?

One problem that might result from dropping terms is that the resulting mean estimator might be biased. For example, if the correct model is

$$E(Y \mid x) = \eta_0 + \eta_1 u_1 + \eta_2 u_2 + \dots + \eta_{k-1} u_{k-1},$$

where $\eta_{k-1} \neq 0$, and we fit the model

$$E(Y \mid x) = \gamma_0 + \gamma_1 u_1 + \gamma_2 u_2 + \dots + \gamma_{k-2} u_{k-2}$$

by least squares to get fitted values $\hat{y}_i$, then (since the least squares estimates are unbiased for the model used),

$$E(\hat{y}_i) = \gamma_0 + \gamma_1 u_{i1} + \gamma_2 u_{i2} + \dots + \gamma_{k-2} u_{i,k-2},$$

which might not be the same as

$$\eta_0 + \eta_1 u_{i1} + \eta_2 u_{i2} + \dots + \eta_{k-1} u_{i,k-1} = E(Y \mid x_i).$$

The difference between the expected value of the estimate and the parameter being estimated is called the bias of the estimator:

$$\mathrm{bias}(\hat{y}_i) = E(\hat{y}_i) - E(Y \mid x_i).$$

However, dropping terms might also reduce the variance. Sometimes having biased estimates is the lesser of two evils. One way to address this problem is to evaluate the model by a measure that includes both bias and variance. This is the mean squared error. The mean squared error of a fitted value is the expected value of the square of the error between the fitted value (for the submodel) and the true conditional mean at $x_i$:

$$\mathrm{MSE}(\hat{y}_i) = E\big([\hat{y}_i - E(Y \mid x_i)]^2\big).$$

Please note: Do not confuse this with another use of MSE, namely to denote RSS/df, the Mean Square for Residuals (on the regression ANOVA table).

We would like $\mathrm{MSE}(\hat{y}_i)$ to be small. To understand MSE better, we will examine, for fixed $i$, the variance of $\hat{y}_i - E(Y \mid x_i)$:

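$$\begin{aligned}
\mathrm{Var}\big(\hat{y}_i - E(Y \mid x_i)\big) &= E\big([\hat{y}_i - E(Y \mid x_i)]^2\big) - \big[E\big(\hat{y}_i - E(Y \mid x_i)\big)\big]^2 \\
&= \mathrm{MSE}(\hat{y}_i) - \big[E(\hat{y}_i) - E(Y \mid x_i)\big]^2 \\
&= \mathrm{MSE}(\hat{y}_i) - [\mathrm{bias}(\hat{y}_i)]^2.
\end{aligned}$$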

Also, since $E(Y \mid x_i)$ is constant,

$$\mathrm{Var}\big(\hat{y}_i - E(Y \mid x_i)\big) = \mathrm{Var}(\hat{y}_i).$$

Thus,

$$\mathrm{MSE}(\hat{y}_i) = \mathrm{Var}(\hat{y}_i) + [\mathrm{bias}(\hat{y}_i)]^2.$$
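The decomposition can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: the terms `u1` and `u2`, the coefficient vector `eta`, the error standard deviation `sigma`, and the number of replications are all assumed values chosen for illustration. It fits a submodel that omits `u2` to many simulated response vectors and compares the simulated MSE of each fitted value with the simulated variance plus squared bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 50, 1.0

# Hypothetical terms: intercept, u1, and u2 (u2 is correlated with u1).
u1 = rng.normal(size=n)
u2 = 0.6 * u1 + rng.normal(size=n)
U_full = np.column_stack([np.ones(n), u1, u2])
eta = np.array([1.0, 2.0, 0.5])      # assumed true coefficients
mean_true = U_full @ eta             # E(Y | x_i) for each i

# Submodel that drops u2; H_sub maps responses to submodel fitted values.
U_sub = U_full[:, :2]
H_sub = U_sub @ np.linalg.pinv(U_sub)

# Simulate many response vectors and refit the submodel each time.
reps = 2000
Y = mean_true + sigma * rng.normal(size=(reps, n))
fits = Y @ H_sub.T                   # row r holds the fitted values for sample r

mse = np.mean((fits - mean_true) ** 2, axis=0)   # simulated MSE(y_hat_i)
var = np.var(fits, axis=0)                       # simulated Var(y_hat_i)
bias = np.mean(fits, axis=0) - mean_true         # simulated bias(y_hat_i)

# The identity also holds exactly for the simulated moments.
print(np.max(np.abs(mse - (var + bias ** 2))))
```

The printed discrepancy is at the level of floating-point rounding, because the same algebra that gives the identity for expectations also applies to sample averages.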

So MSE really is a combined measure of variance and bias. Now (see Section 10.1.5)

$$\mathrm{Var}(\hat{\eta}_j) = \frac{\sigma^2}{SU_jU_j} \cdot \frac{1}{1 - R_j^2},$$

where $SU_jU_j$ is defined like $SXX$, and $R_j^2$ is the coefficient of multiple determination for the regression of $u_j$ on the other terms in the model. Notice that the first factor is independent of the other terms. Adding a term usually increases $R_j^2$; deleting one usually decreases $R_j^2$. Thus adding a term usually increases $\mathrm{Var}(\hat{\eta}_j)$; deleting a term usually decreases $\mathrm{Var}(\hat{\eta}_j)$ (i.e., gives a more precise estimate $\hat{\eta}_j$). Since $\hat{y}_i$ is a linear combination of the $\hat{\eta}_j$'s, the effect will be the same for $\mathrm{Var}(\hat{y}_i)$.
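The variance formula can be compared with the usual matrix expression $\sigma^2 [(U^T U)^{-1}]_{jj}$. The sketch below is illustrative only: the design matrix, the value of `sigma2`, and the helper names `var_direct` and `var_factored` are assumptions, not taken from the notes. It assumes the model includes an intercept in column 0 and that $j$ indexes a non-intercept term.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
u1 = rng.normal(size=n)
u2 = 0.8 * u1 + 0.5 * rng.normal(size=n)   # strongly related to u1
u3 = rng.normal(size=n)
U = np.column_stack([np.ones(n), u1, u2, u3])
sigma2 = 2.0                               # assumed error variance


def var_direct(U, j, sigma2):
    """Var of the j-th coefficient estimate from sigma^2 (U'U)^{-1}."""
    return sigma2 * np.linalg.inv(U.T @ U)[j, j]


def var_factored(U, j, sigma2):
    """Same variance as (sigma^2 / SU_jU_j) * 1 / (1 - R_j^2)."""
    uj = U[:, j]
    others = np.delete(U, j, axis=1)       # remaining terms, intercept included
    coef = np.linalg.lstsq(others, uj, rcond=None)[0]
    rss_j = np.sum((uj - others @ coef) ** 2)
    suu = np.sum((uj - uj.mean()) ** 2)    # SU_jU_j, defined like SXX
    r2_j = 1.0 - rss_j / suu               # R_j^2 for u_j on the other terms
    return (sigma2 / suu) * (1.0 / (1.0 - r2_j))


for j in (1, 2, 3):
    print(j, var_direct(U, j, sigma2), var_factored(U, j, sigma2))

# Deleting u2 lowers R_1^2, so the variance for the u1 coefficient drops.
print(var_direct(U[:, [0, 1, 3]], 1, sigma2), var_direct(U, 1, sigma2))
```

The two computations agree for each non-intercept term, and the last line shows the variance of the `u1` coefficient with and without the correlated term `u2` in the model.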

Summarizing: Deleting a term typically decreases $\mathrm{Var}(\hat{y}_i)$ but increases bias. So we want to play these effects off against each other by minimizing $\mathrm{MSE}(\hat{y}_i)$. But we need to do this minimization for all $i$'s, so we consider the total mean squared error

$$J = \sum_{i=1}^{n} \mathrm{MSE}(\hat{y}_i) = \sum_{i=1}^{n} \big\{ \mathrm{Var}(\hat{y}_i) + [\mathrm{bias}(\hat{y}_i)]^2 \big\}.$$
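When the true mean vector is known (as in a simulation), $J$ can be computed exactly rather than estimated. The sketch below relies on two standard least squares facts that are not stated in these notes: the fitted values of a submodel with hat matrix $H$ have covariance $\sigma^2 H$, so their variances sum to $\sigma^2\,\mathrm{tr}(H)$ (the number of terms when the columns are linearly independent), and $E(\hat{y}) = H\,E(Y)$. The function name `total_mse` and all data values are hypothetical.

```python
import numpy as np


def total_mse(U_sub, mean_true, sigma2):
    """Return (J, sum of variances, sum of squared biases) for a submodel."""
    H = U_sub @ np.linalg.pinv(U_sub)        # hat matrix of the submodel
    var_sum = sigma2 * np.trace(H)           # sum over i of Var(y_hat_i)
    bias = H @ mean_true - mean_true         # E(y_hat_i) - E(Y | x_i)
    bias_sq_sum = float(bias @ bias)
    return var_sum + bias_sq_sum, var_sum, bias_sq_sum


rng = np.random.default_rng(2)
n = 30
u1 = rng.normal(size=n)
u2 = 0.7 * u1 + rng.normal(size=n)
U_full = np.column_stack([np.ones(n), u1, u2])
mean_true = U_full @ np.array([1.0, 2.0, 0.3])   # assumed true mean function
sigma2 = 1.0

J_full, var_full, bias_full = total_mse(U_full, mean_true, sigma2)
J_sub, var_sub, bias_sub = total_mse(U_full[:, :2], mean_true, sigma2)
print("full model:", J_full, var_full, bias_full)
print("submodel  :", J_sub, var_sub, bias_sub)
```

Dropping `u2` lowers the variance part from $3\sigma^2$ to $2\sigma^2$ but introduces a positive bias part; whether $J$ goes up or down depends on which effect is larger for the data at hand.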

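We want this to be small. Since it's a parameter, we need to estimate it. It works better to estimate the total normed mean squared error

$$\gamma \ (\text{or } \Gamma) = J/\sigma^2.$$

The estimate used is Mallows' statistic

$$C_I = k_I + \frac{(n - k_I)(\hat{\sigma}_I^2 - \hat{\sigma}^2)}{\hat{\sigma}^2} = \frac{RSS_I}{\hat{\sigma}^2} + 2k_I - n.$$

Here $k_I$ is the number of terms in submodel $I$, $RSS_I$ is its residual sum of squares, $\hat{\sigma}_I^2 = RSS_I/(n - k_I)$, and $\hat{\sigma}^2$ is the estimate of $\sigma^2$ from the full model.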

Thus we can use Mallows' statistic to help identify good candidates for submodels by looking for submodels where $C_I$ is both

(i) small (suggesting small total error)

and

(ii) $\leq k_I$ (suggesting small bias).
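The two criteria above can be applied by computing the statistic for each candidate submodel. The following is a minimal sketch assuming the standard form of Mallows' statistic, $C_I = RSS_I/\hat{\sigma}^2 + 2k_I - n$, with $\hat{\sigma}^2$ estimated from the full model; the helper names `rss` and `mallows_c` and the simulated data are hypothetical and are not the routine of any particular package.

```python
import numpy as np
from itertools import combinations


def rss(U, y):
    """Residual sum of squares from the least squares fit of y on U."""
    coef = np.linalg.lstsq(U, y, rcond=None)[0]
    resid = y - U @ coef
    return float(resid @ resid)


def mallows_c(U_full, y, cols):
    """Mallows' statistic for the submodel using the given columns of U_full."""
    n, k = U_full.shape
    sigma2_hat = rss(U_full, y) / (n - k)    # estimate from the full model
    k_I = len(cols)
    return rss(U_full[:, list(cols)], y) / sigma2_hat + 2 * k_I - n


# Simulated data in which only the first non-intercept term matters.
rng = np.random.default_rng(3)
n = 60
U_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = U_full @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=n)

# Every submodel that keeps the intercept (column 0).
for size in range(1, 4):
    for rest in combinations(range(1, 4), size):
        cols = (0,) + rest
        print(cols, "k_I =", len(cols), "C_I =", round(mallows_c(U_full, y, cols), 2))
```

For the full model the statistic equals the number of terms exactly, since $RSS/\hat{\sigma}^2 = n - k$, so the comparison of $C_I$ with $k_I$ is only informative for proper submodels.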

Comments:

Mallows' statistic is provided by many software packages in some model-selection routine. Arc gives it in both Forward selection and Backward elimination. Other software (e.g., Minitab) may use different procedures for Forward and Backward selection/elimination, but give Mallows' statistic in another routine.

Since $C_I$ is a statistic, it will have sampling variability. It might happen, for example, that $C_I$ is negative, which would suggest small bias. It also might happen that $C_I$ is larger than $k_I$ even when the model is unbiased, but there is no way to distinguish this situation from a case where there is bias but $C_I$ happens to be less than $\gamma_I$.
