Lecture Notes on Multiple Regression Model I | STAT 231B, Study notes of Statistics

University of California-Riverside Statistics

Prof. Xinping Cui

Material Type: Notes; Professor: Cui; Class: STATISTCS FOR BIOLOGICL SCIENCES; Subject: Statistics; University: University of California-Riverside; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-b8l 🇺🇸


MULTIPLE REGRESSION-I

Regression analysis examines the relation between a single dependent variable Y and one or more independent variables X1, …, Xp.

SIMPLE LINEAR REGRESSION MODELS

First Order Model with One Predictor Variable

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, 2, \ldots, n$$

• $\beta_0$ is the intercept of the line, and $\beta_1$ is the slope of the line. A one-unit increase in X gives a $\beta_1$-unit increase in the mean of Y.

• $\varepsilon_i$ is called a statistical random error for the ith observation $Y_i$. It accounts for the fact that the statistical model does not give an exact fit to the data.

• $\varepsilon_i$ cannot be observed. We assume:

o $E(\varepsilon_i) = 0$

o $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for all i = 1, …, n

o $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \ne j$

• Response/Regression function and an example:

$$E\{Y_i\} = \beta_0 + \beta_1 X_i$$

$$E\{Y_i\} = 9.5 + 2.1 X_i$$

The response $Y_i$, given the level of X in the ith trial $X_i$, comes from a probability distribution whose mean is $9.5 + 2.1 X_i$. Therefore, the response/regression function relates the means of the probability distributions of Y (for given X) to the level of X.

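o We want to find the pair $(b_0, b_1)$ that minimizes

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$$

o We set the partial derivatives of SSE with respect to $b_0$, $b_1$ equal to zero:

$$\frac{\partial SSE}{\partial b_0} = \sum_{i=1}^{n} (-1)(2)(Y_i - b_0 - b_1 X_i) = 0$$

Normal equation (1): $\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0$

$$\frac{\partial SSE}{\partial b_1} = \sum_{i=1}^{n} (-X_i)(2)(Y_i - b_0 - b_1 X_i) = 0$$

Normal equation (2): $\sum_{i=1}^{n} X_i (Y_i - b_0 - b_1 X_i) = 0$

Then the solution is (derivation on board):

$$b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2}, \qquad b_0 = \bar Y - b_1 \bar X$$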
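The closed-form solution can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set (the values of X and Y are illustrative and not taken from the lecture, and the helper name `fit_line` is hypothetical). It computes $b_0$ and $b_1$ and then evaluates the left-hand sides of normal equations (1) and (2), which should vanish up to rounding error.

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimising sum((y - b0 - b1*x)**2)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of X; zero means all X values coincide.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data (an assumption for illustration, not from the lecture).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_line(X, Y)

# Left-hand sides of normal equations (1) and (2).
ne1 = sum(y - b0 - b1 * x for x, y in zip(X, Y))
ne2 = sum(x * (y - b0 - b1 * x) for x, y in zip(X, Y))

print("b0 =", b0, "b1 =", b1)
print("normal equation (1):", ne1)
print("normal equation (2):", ne2)
```

The explicit check on the sum of squared deviations of X mirrors the condition that the estimates are only defined when the X values are not all identical.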

• Properties of the residuals

o $\sum e_i = 0$, since the regression line goes through the point $(\bar X, \bar Y)$.

o $\sum X_i e_i = 0$ and $\sum \hat Y_i e_i = 0$. The residuals are uncorrelated with the independent variable $X_i$ and with the fitted values $\hat Y_i$ (prove it on board).

Why are $\sum e_i = 0$, $\sum X_i e_i = 0$ and $\sum \hat Y_i e_i = 0$?

With $e_i = Y_i - b_0 - b_1 X_i$, normal equations (1) and (2) say exactly that $\sum e_i = 0$ and $\sum X_i e_i = 0$. Since

$$\sum \hat Y_i e_i = \sum (b_0 + b_1 X_i) e_i = b_0 \sum e_i + b_1 \sum X_i e_i,$$

it follows that $\sum \hat Y_i e_i = 0$.
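The three residual identities can also be confirmed numerically. This is a small sketch in plain Python under the assumption of a made-up data set (the numbers are illustrative only, not from the lecture). It fits the line by least squares and then prints the three sums, each of which should be zero up to floating-point rounding.

```python
# Hypothetical toy data (an assumption for illustration, not from the lecture).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least squares slope and intercept.
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values and residuals e_i = Y_i - Yhat_i.
fitted = [b0 + b1 * x for x in X]
resid = [y - f for y, f in zip(Y, fitted)]

sum_e = sum(resid)                                  # sum of e_i
sum_xe = sum(x * e for x, e in zip(X, resid))       # sum of X_i * e_i
sum_fe = sum(f * e for f, e in zip(fitted, resid))  # sum of Yhat_i * e_i

print("sum e_i        =", sum_e)
print("sum X_i e_i    =", sum_xe)
print("sum Yhat_i e_i =", sum_fe)
```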

o Least squares estimates are uniquely defined as long as the values of the independent variable are not all identical. If they are all identical, the denominator of $b_1$ vanishes: $\sum (X_i - \bar X)^2 = 0$ (draw figure).

• Point Estimation of Error Terms Variance $\sigma^2$

o Single Population: Unbiased sample variance estimator of the population variance

$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar Y)^2}{n - 1}, \qquad E\{s^2\} = \sigma^2$$

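• Estimating the Mean Value at $X_h$:

$$E\{Y_h\} = \beta_0 + \beta_1 X_h$$

Estimator: $\hat Y_h = b_0 + b_1 X_h$

Mean: $E\{\hat Y_h\} = E\{Y_h\}$

Variance: $\sigma^2\{\hat Y_h\} = \sigma^2 \left[ \dfrac{1}{n} + \dfrac{(X_h - \bar X)^2}{\sum (X_i - \bar X)^2} \right]$

Estimated variance: $s^2\{\hat Y_h\} = MSE \left[ \dfrac{1}{n} + \dfrac{(X_h - \bar X)^2}{\sum (X_i - \bar X)^2} \right]$

• Analysis of Variance Approach to Regression Analysis

$$Y_i - \bar Y = (\hat Y_i - \bar Y) + (Y_i - \hat Y_i)$$

$$\sum (Y_i - \bar Y)^2 = \sum (\hat Y_i - \bar Y)^2 + \sum (Y_i - \hat Y_i)^2$$

$$SS_{TOT} = SSR + SSE$$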

Basic Table

Source of Variation    SS                                     df       MS                   E{MS}
Regression             $SSR = \sum (\hat Y_i - \bar Y)^2$     1        $MSR = SSR/1$        $\sigma^2 + \beta_1^2 \sum (X_i - \bar X)^2$
Error                  $SSE = \sum (Y_i - \hat Y_i)^2$        n - 2    $MSE = SSE/(n-2)$    $\sigma^2$
Total                  $SS_{TOT} = \sum (Y_i - \bar Y)^2$     n - 1
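The entries of the basic table can be reproduced numerically. The sketch below is plain Python, assuming a made-up data set (the numbers are illustrative, not from the lecture) and a hypothetical helper name `anova_simple`. It returns the sums of squares, their degrees of freedom, the mean squares, and the ratio MSR/MSE, which is the usual F statistic for testing whether the slope is zero.

```python
def anova_simple(xs, ys):
    """ANOVA quantities for a simple linear regression of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with n >= 3")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]

    ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression SS
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # error SS
    sstot = sum((y - y_bar) ** 2 for y in ys)             # total SS

    msr = ssr / 1          # regression df = 1
    mse = sse / (n - 2)    # error df = n - 2
    f_ratio = msr / mse if mse > 0 else float("inf")
    return {
        "SSR": ssr, "SSE": sse, "SSTOT": sstot,
        "df": (1, n - 2, n - 1),
        "MSR": msr, "MSE": mse, "F": f_ratio,
    }


# Hypothetical toy data (an assumption for illustration, not from the lecture).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

table = anova_simple(X, Y)
for name, value in table.items():
    print(name, "=", value)

# The decomposition SSTOT = SSR + SSE should hold up to rounding.
print("SSR + SSE - SSTOT =", table["SSR"] + table["SSE"] - table["SSTOT"])
```

To complete the test, the F ratio would be compared with a critical value from an F table with 1 and n - 2 degrees of freedom; that lookup is left out here to keep the sketch free of external libraries.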

o Test Statistic: $F^* = \dfrac{MSR}{MSE}$

o F Distribution: under $H_0$, $F^* \sim F(1, n-2)$

o Numerator Degrees of Freedom: $df_R = 1$

o Denominator Degrees of Freedom: $df_E = n - 2$

o Hypothesis: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$

o Decision Rule: For level of significance $\alpha$, if $F^* > F(1-\alpha;\, 1,\, n-2)$, conclude $H_1$; otherwise conclude $H_0$.

o Tail Probability: $P\{F(1, n-2) > F^*\}$

Fitting a regression in SAS

data Toluca;
  infile 'C:\stat231B06\ch01ta01.txt';
  input lotsize workhrs;
run;

proc reg data=Toluca; /* least squares estimation of the regression coefficients */
  model workhrs = lotsize;
  output out=results p=yhat r=residual; /* yhat holds the fitted values, residual holds the residuals */
run;

Extension to multiple regression model

Degrees of freedom: the number of independent components that are needed to calculate the respective sum of squares.

(1) SSTOT is the sum of n squared components. However, since $\sum_{i=1}^{n} (y_i - \bar y) = 0$, only n-1 components are needed for its calculation. The remaining one can always be calculated from

$$y_n - \bar y = -\sum_{i=1}^{n-1} (y_i - \bar y).$$

Hence, SSTOT has n-1 degrees of freedom.

(2) $SSE = \sum e_i^2$ is the sum of n squared components. However, there are p restrictions among the residuals, coming from the p normal equations

$$X'Y = X'Xb \;\Longleftrightarrow\; X'(Y - Xb) = X'e = 0,$$

where $X'Y$ is $p \times 1$, $X'X$ is $p \times p$ and $b$ is $p \times 1$. Hence SSE has n-p degrees of freedom.

(3) $SSR = \sum (\hat y_i - \bar y)^2$
