# Restricted residuals, Lecture notes for Agroforestry. University of Northern Colorado (CO)

Restricted residuals in multiple linear regression

Part 5: Finite Sample Properties5-1/57

Econometrics I Professor William Greene Stern School of Business

Department of Economics

Part 5 – Finite Sample Properties

Terms of Art

- Estimates and estimators
- Properties of an estimator: the sampling distribution
- “Finite sample” properties as opposed to “asymptotic” or “large sample” properties
- Scientific principles behind sampling distributions and “repeated sampling”

Application: Health Care Panel Data

German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods. Data downloaded from Journal of Applied Econometrics Archive. There are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.)

Variables in the file are:

- DOCVIS = number of doctor visits in last three months
- HOSPVIS = number of hospital visits in last calendar year
- DOCTOR = 1(Number of doctor visits > 0)
- HOSPITAL = 1(Number of hospital visits > 0)
- HSAT = health satisfaction, coded 0 (low) - 10 (high)
- PUBLIC = insured in public health insurance = 1; otherwise = 0
- ADDON = insured by add-on insurance = 1; otherwise = 0
- HHNINC = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
- HHKIDS = children under age 16 in the household = 1; otherwise = 0
- EDUC = years of schooling
- AGE = age in years
- MARRIED = marital status

For now, treat this sample as if it were a cross section, and as if it were the full population.

Population Regression

This is the true value of $\boldsymbol{\beta}$.

Sampling Distribution Repeated Sampling Creates Variation

A sampling experiment: Draw 25 observations at random from the population of 27,326. Compute the regression. Repeat 100 times. Display estimates.

    matrix    ; beduc = init(100,1,0) $
    proc $
    draw      ; n = 25 $
    regress   ; quietly ; lhs = hhninc ; rhs = one,educ $
    matrix    ; beduc(i) = b(2) $
    sample    ; all $
    endproc $
    execute   ; i = 1,100 $
    histogram ; rhs = beduc $
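
The same experiment can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the health care file is not reproduced here, so the "population" below is simulated, the intercept, slope and noise level are hypothetical, and drawing without replacement is an assumption about how the 25 observations are selected.

```python
import numpy as np

rng = np.random.default_rng(12345)

# Hypothetical stand-in for the population of 27,326 observations.
# The variable names mirror the text; the values are simulated, not the real data.
N_POP = 27_326
educ = rng.uniform(7.0, 18.0, size=N_POP)
hhninc = 0.10 + 0.02 * educ + rng.normal(0.0, 0.15, size=N_POP)


def ols(X, y):
    """Least squares coefficients b = (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def slope_on_sample(idx):
    """Regress hhninc on a constant and educ for the rows in idx; return the slope."""
    X = np.column_stack([np.ones(idx.size), educ[idx]])
    return ols(X, hhninc[idx])[1]


# "Population" slope: the regression computed on all 27,326 rows.
pop_slope = slope_on_sample(np.arange(N_POP))

# Draw 25 observations at random, compute the regression, repeat 100 times.
# Sampling without replacement is an assumption of this sketch.
beduc = np.array([
    slope_on_sample(rng.choice(N_POP, size=25, replace=False))
    for _ in range(100)
])

print(f"population slope      : {pop_slope:.4f}")
print(f"mean of 100 estimates : {beduc.mean():.4f}")
print(f"std. dev. of estimates: {beduc.std(ddof=1):.4f}")
```

The 100 slopes vary from sample to sample, and their mean should sit close to the population slope, which is the centering that a histogram of `beduc` would display.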

How should we interpret this variation in the regression slope? The centering suggests the estimator is unbiased. We will have only one sample. We could have drawn any one of the possible samples.

The Statistical Context of Least Squares Estimation

The sample of data from the population: Data generating process is $y = \mathbf{x}'\boldsymbol{\beta} + \varepsilon$

The stochastic specification of the regression model: Assumptions about the random $\varepsilon$.

Endowment of the stochastic properties of the model upon the least squares estimator. The estimator is a function of the observed (realized) data.

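Least Squares

$$
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon})
= \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}
$$

= The true parameter plus sampling error.

Also

$$
\begin{aligned}
\mathbf{b} &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= (\mathbf{X}'\mathbf{X})^{-1}\sum_{i=1}^{N}\mathbf{x}_i y_i \\
&= (\mathbf{X}'\mathbf{X})^{-1}\sum_{i=1}^{N}\mathbf{x}_i(\mathbf{x}_i'\boldsymbol{\beta} + \varepsilon_i) \\
&= \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\sum_{i=1}^{N}\mathbf{x}_i\varepsilon_i \\
&= \boldsymbol{\beta} + \sum_{i=1}^{N}\mathbf{v}_i\varepsilon_i,
\qquad \mathbf{v}_i = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i
\end{aligned}
$$

= The true parameter plus a linear function of the disturbances.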
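
A small numerical check of this decomposition, written as a NumPy sketch. The sample size, the parameter vector and the disturbance scale are hypothetical values chosen only for illustration; the point is that b computed from the data coincides with the true parameter plus the weighted sum of disturbances.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 50, 3
beta = np.array([1.0, -0.5, 2.0])            # hypothetical true parameter vector
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
eps = rng.normal(scale=0.7, size=N)          # disturbances
y = X @ beta + eps                           # data generating process

XtX_inv = np.linalg.inv(X.T @ X)

# Least squares: b = (X'X)^{-1} X'y
b = XtX_inv @ X.T @ y

# Sampling error written as a matrix product: (X'X)^{-1} X' eps
sampling_error = XtX_inv @ X.T @ eps

# The same thing as a sum over observations: sum_i v_i * eps_i, v_i = (X'X)^{-1} x_i
V = X @ XtX_inv                              # row i holds v_i' (X'X is symmetric)
linear_combination = (V * eps[:, None]).sum(axis=0)

print(np.allclose(b, beta + sampling_error))
print(np.allclose(b, beta + linear_combination))
```

In practice the disturbances are unobserved, so this check is only possible in a simulation where `eps` is generated by hand.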

Deriving the Properties

b = a parameter vector + a linear combination of the disturbances, each times a vector.

Therefore, b is a vector of random variables. We analyze it as such.

The assumption of nonstochastic regressors. How it is used at this point.

We do the analysis conditional on an X, then show that results do not depend on the particular X in hand, so the result must be general – i.e., independent of X.

Properties of the LS Estimator: b is unbiased

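Expected value and the property of unbiasedness.

$$
E[\mathbf{b}|\mathbf{X}]
= E[\boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}\,|\,\mathbf{X}]
= \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\boldsymbol{\varepsilon}|\mathbf{X}]
= \boldsymbol{\beta} + \mathbf{0}
$$

$$
E[\mathbf{b}] = E_{\mathbf{X}}\{E[\mathbf{b}|\mathbf{X}]\} = E_{\mathbf{X}}\{\boldsymbol{\beta}\} = \boldsymbol{\beta}
$$

(The law of iterated expectations.)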

Another Sampling Experiment

Means of Repetitions b|x

Partitioned Regression

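$\mathbf{y} = \mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}$

Two sets of variables. What if the regression is computed without the second set of variables?

What is the expectation of the "short" regression estimator? $E[\mathbf{b}_1 \,|\, (\mathbf{y} = \mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon})]$

$\mathbf{b}_1 = (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\mathbf{y}$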

The Left Out Variable Formula

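“Short” regression means we regress $\mathbf{y}$ on $\mathbf{X}_1$ when $\mathbf{y} = \mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}$ and $\boldsymbol{\beta}_2$ is not $\mathbf{0}$.

(This is a VVIR!)

$$
\begin{aligned}
\mathbf{b}_1 &= (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\mathbf{y} \\
&= (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'(\mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}) \\
&= (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\mathbf{X}_1\boldsymbol{\beta}_1
 + (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\mathbf{X}_2\boldsymbol{\beta}_2
 + (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\boldsymbol{\varepsilon}
\end{aligned}
$$

$$
E[\mathbf{b}_1] = \boldsymbol{\beta}_1 + (\mathbf{X}_1'\mathbf{X}_1)^{-1}\mathbf{X}_1'\mathbf{X}_2\boldsymbol{\beta}_2
$$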
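
The left out variable formula can be illustrated with a short NumPy sketch. Everything numeric here is a hypothetical choice (sample size, coefficients, the 0.6 link between the included and omitted regressor); the sketch only shows that the short regression estimate splits exactly into the three terms derived above.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 200
# X1: constant and one regressor; X2: one left-out regressor correlated with X1.
x1 = rng.normal(size=N)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=N)
X1 = np.column_stack([np.ones(N), x1])
X2 = x2[:, None]

beta1 = np.array([1.0, 2.0])   # hypothetical values
beta2 = np.array([1.5])        # not zero, so X2 matters
eps = rng.normal(size=N)
y = X1 @ beta1 + X2 @ beta2 + eps


def ols(X, y):
    """Least squares coefficients (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


b_short = ols(X1, y)                       # short regression: y on X1 only
P12 = ols(X1, X2)                          # (X1'X1)^{-1} X1'X2
expected_short = beta1 + P12 @ beta2       # beta1 plus the left-out-variable term
noise_term = ols(X1, eps)                  # (X1'X1)^{-1} X1'eps, zero in expectation

print("short regression b1      :", b_short)
print("beta1 + bias term        :", expected_short)
print("true beta1               :", beta1)
print("decomposition holds      :", np.allclose(b_short, expected_short + noise_term))
```

The short regression is centered on `expected_short`, not on `beta1`; the gap between the two is the bias term.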

Historical Application: Left Out Dummy Variable in a Keynesian Consumption Function

$C = \beta_0 + \beta_1 Y + \delta W + \varepsilon$

Application

The (truly) short regression estimator is biased. Application: Quantity = $\beta_1$Price + $\beta_2$Income + $\varepsilon$. If you regress Quantity on Price and leave out Income, what do you get?

Application: Left out Variable

Leave out Income. What do you get?

$$
E[b_1] = \beta_1 + \beta_2\,\frac{\operatorname{Cov}[\text{Price},\text{Income}]}{\operatorname{Var}[\text{Price}]}
$$

In time series data, $\beta_1 < 0$, $\beta_2 > 0$ (usually). Cov[Price,Income] > 0 in time series data.

So, the short regression will overestimate the price coefficient. It will be pulled toward and even past zero.

Simple Regression of G on a constant and PG

Price Coefficient should be negative.
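
A simulated illustration of the sign of the bias, as a NumPy sketch. The series below are not the gasoline data; the trend, the coefficients (a price effect of -1 and an income effect of +1) and the noise levels are hypothetical. Price and income both trend upward, so their covariance is positive.

```python
import numpy as np

rng = np.random.default_rng(2)

N = 36
trend = np.linspace(0.0, 1.0, N)
# Hypothetical series: both trend upward, so Cov[price, income] > 0.
price = 1.0 + 2.0 * trend + rng.normal(scale=0.3, size=N)
income = 5.0 + 6.0 * trend + rng.normal(scale=0.6, size=N)

beta1, beta2 = -1.0, 1.0                   # price effect < 0, income effect > 0
quantity = 10.0 + beta1 * price + beta2 * income + rng.normal(scale=0.1, size=N)

ones = np.ones(N)
# Long regression: constant, price, income.
long_b = np.linalg.lstsq(np.column_stack([ones, price, income]), quantity, rcond=None)[0]
# Short regression: constant and price only (income left out).
short_b = np.linalg.lstsq(np.column_stack([ones, price]), quantity, rcond=None)[0]

cov_pi = np.cov(price, income)[0, 1]
var_p = np.var(price, ddof=1)
# Sample analogue of beta1 + beta2 * Cov[Price, Income] / Var[Price].
implied_short = long_b[1] + long_b[2] * cov_pi / var_p

print(f"long regression price coefficient : {long_b[1]: .4f}")
print(f"short regression price coefficient: {short_b[1]: .4f}")
print(f"long b1 + b2 * Cov/Var            : {implied_short: .4f}")
```

The last two printed numbers agree, because in the sample the short slope equals the long price coefficient plus the long income coefficient times Cov/Var. With a positive income effect and a positive covariance, the short slope lies above the long one.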

Estimated ‘Demand’ Equation: Shouldn’t the Price Coefficient be Negative?

Multiple Regression of G on Y and PG. The Theory Works!

    ----------------------------------------------------------------------
    Ordinary     least squares regression ............
    LHS=G        Mean                 =      226.09444
                 Standard deviation   =       50.59182
                 Number of observs.   =             36
    Model size   Parameters           =              3
                 Degrees of freedom   =             33
    Residuals    Sum of squares       =     1472.79834
                 Standard error of e  =        6.68059
    Fit          R-squared            =         .98356
                 Adjusted R-squared   =         .98256
    Model test   F[  2,    33] (prob) =   987.1(.0000)
    --------+-------------------------------------------------------------
    Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
    --------+-------------------------------------------------------------
    Constant|   -79.7535***       8.67255     -9.196    .0000
           Y|     .03692***        .00132     28.022    .0000     9232.86
          PG|   -15.1224***       1.88034     -8.042    .0000     2.31661
    --------+-------------------------------------------------------------

The Extra Variable Formula

A Second Crucial Result About Specification: $\mathbf{y} = \mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}$ but $\boldsymbol{\beta}_2$ really is $\mathbf{0}$. Two sets of variables. One is superfluous. What if the regression is computed with it anyway?

The Extra Variable Formula: (This is a VIR!)

$$
E[\mathbf{b}_{1.2} \,|\, \boldsymbol{\beta}_2 = \mathbf{0}] = \boldsymbol{\beta}_1
$$

(The long regression estimator in a short regression is unbiased.)

Extra variables in a model do not induce biases. Why not just include them?
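
The unbiasedness claim in the extra variable formula can be checked by simulation. The NumPy sketch below holds a hypothetical X fixed, sets the coefficient on the superfluous regressor to zero, and averages the long regression estimates over repeated draws of the disturbances. The sample size, number of repetitions and coefficient values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

N, R = 40, 5000
x1 = rng.normal(size=N)
x2 = 0.5 * x1 + rng.normal(size=N)          # superfluous regressor, correlated with x1
X_long = np.column_stack([np.ones(N), x1, x2])

beta = np.array([1.0, 2.0, 0.0])            # hypothetical; coefficient on x2 really is 0
sigma = 1.0

# Hold X fixed and redraw the disturbances R times.
eps = rng.normal(scale=sigma, size=(N, R))
Y = (X_long @ beta)[:, None] + eps           # each column is one sample of y
B_long = np.linalg.solve(X_long.T @ X_long, X_long.T @ Y)   # 3 x R matrix of estimates

mean_b = B_long.mean(axis=1)
print("mean of long-regression estimates:", np.round(mean_b, 3))
print("true coefficients                :", beta)
```

The averages should land close to the true coefficients, including a value near zero for the extra variable. The simulation addresses only bias; it does not answer the question of why one would not simply include extra variables.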

Variance of b

Assumption about disturbances:

- $\varepsilon_i$ has zero mean and is uncorrelated with every other $\varepsilon_j$
- $\operatorname{Var}[\varepsilon_i|\mathbf{X}] = \sigma^2$. The variance of $\varepsilon_i$ does not depend on any data in the sample.

$$
\operatorname{Var}\left[\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_N\end{pmatrix}\Bigg|\;\mathbf{X}\right]
= \begin{pmatrix}
\sigma^2 & 0 & \cdots & 0\\
0 & \sigma^2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
= \sigma^2\mathbf{I}
$$

$$
\operatorname{Var}\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_N\end{pmatrix}
= E\left\{\operatorname{Var}\left[\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_N\end{pmatrix}\Bigg|\;\mathbf{X}\right]\right\}
+ \operatorname{Var}\left\{E\left[\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_N\end{pmatrix}\Bigg|\;\mathbf{X}\right]\right\}
= E[\sigma^2\mathbf{I}] + \operatorname{Var}\begin{pmatrix}0\\ 0\\ \vdots\\ 0\end{pmatrix}
= \sigma^2\mathbf{I}
$$

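Variance of the Least Squares Estimator

$$
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon})
= \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}
$$

$$
E[\mathbf{b}|\mathbf{X}] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E[\boldsymbol{\varepsilon}|\mathbf{X}] = \boldsymbol{\beta}
\quad\text{as } E[\boldsymbol{\varepsilon}|\mathbf{X}] = \mathbf{0}
$$

$$
\begin{aligned}
\operatorname{Var}[\mathbf{b}|\mathbf{X}]
&= E[(\mathbf{b}-\boldsymbol{\beta})(\mathbf{b}-\boldsymbol{\beta})'\,|\,\mathbf{X}] \\
&= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,E[\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'|\mathbf{X}]\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} \\
&= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\sigma^2\mathbf{I}\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} \\
&= \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{I}\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} \\
&= \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} \\
&= \sigma^2(\mathbf{X}'\mathbf{X})^{-1}
\end{aligned}
$$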
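
The variance result can be checked two ways in a NumPy sketch: algebraically, by forming the sandwich product directly, and by simulation, by holding X fixed and redrawing the disturbances. The design matrix, the value of sigma and the number of repetitions are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(4)

N, R = 30, 20000
sigma = 0.5                                   # hypothetical disturbance std. dev.
beta = np.array([1.0, -2.0])                  # hypothetical true parameters
X = np.column_stack([np.ones(N), rng.uniform(0.0, 10.0, size=N)])

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                             # b = beta + A eps

# Algebra: A (sigma^2 I) A' collapses to sigma^2 (X'X)^{-1}.
sandwich = A @ (sigma**2 * np.eye(N)) @ A.T
theory = sigma**2 * XtX_inv

# Simulation: hold X fixed, redraw eps, and look at the covariance of b.
eps = rng.normal(scale=sigma, size=(N, R))
B = beta[:, None] + A @ eps                   # 2 x R matrix of estimates
empirical = np.cov(B)                         # rows are variables, columns are draws

print("sandwich equals sigma^2 (X'X)^-1:", np.allclose(sandwich, theory))
print("theoretical Var[b|X]:\n", np.round(theory, 5))
print("simulated covariance:\n", np.round(empirical, 5))
```

The simulated covariance matrix should be close to the theoretical one, with the gap shrinking as the number of repetitions grows.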

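Variance of the Least Squares Estimator

$$
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
$$

$$
E[\mathbf{b}|\mathbf{X}] = \boldsymbol{\beta}
$$

$$
\operatorname{Var}[\mathbf{b}|\mathbf{X}] = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}
$$

$$
\begin{aligned}
\operatorname{Var}[\mathbf{b}]
&= E\{\operatorname{Var}[\mathbf{b}|\mathbf{X}]\} + \operatorname{Var}\{E[\mathbf{b}|\mathbf{X}]\} \\
&= \sigma^2E[(\mathbf{X}'\mathbf{X})^{-1}] + \operatorname{Var}\{\boldsymbol{\beta}\} \\
&= \sigma^2E[(\mathbf{X}'\mathbf{X})^{-1}] + \mathbf{0}
\end{aligned}
$$

We will ultimately need to estimate $E[(\mathbf{X}'\mathbf{X})^{-1}]$. We will use the only information we have, $\mathbf{X}$, itself.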