Comparison of Different Estimation Methods for a Population Parameter, Study notes of Econometrics and Mathematical Economics

These notes cover three methods for estimating a population parameter: method of moments, least squares, and maximum likelihood estimation. They include a Monte Carlo simulation that compares the bias, variance, and mean squared error of the estimators. The methods are then applied to an exponential distribution example with a mean that depends on a variable x.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

Notes For Intermediate Econometrics - 4

Paul L. Fackler - North Carolina State University

March 5, 2001

Constructing Estimators

Consider a random variable with the property that its mean is half the negative of its variance:

$$E[Y] = -\mathrm{Var}[Y]/2.$$

Let $\theta$ represent the variance. How can we estimate $\theta$? An obvious choice is to choose $\hat\theta$ to be the sample variance, i.e., to estimate the population parameter by its sample analog. To make this concrete, for a given sample of size $n$ define

$$s_1 = \sum_{i=1}^{n} y_i$$

and

$$s_2 = \sum_{i=1}^{n} y_i^2.$$

An unbiased estimator of the population variance is the sample variance,

$$\hat\theta_1 = \frac{s_2}{n-1} - \frac{s_1^2}{(n-1)\,n}.$$

In this case, however, the sample analog could also be taken to be twice the negative of the sample mean:

$$\hat\theta_2 = -\frac{2 s_1}{n}.$$

Both of these estimators are method of moments estimators. In fact, other method of moments estimators are possible. For example, consider that

$$\theta = \mathrm{Var}[Y] = E[Y^2] - E[Y]^2 = E[Y^2] - \frac{\theta^2}{4},$$

which can be rewritten as the quadratic equation

$$\frac{\theta^2}{4} + \theta - E[Y^2] = 0,$$

the positive real root of which is

$$\theta = 2\left(\sqrt{E[Y^2] + 1} - 1\right).$$

Suppose we replace the expectation in this expression with its sample analog; we get another estimator for $\theta$:

$$\hat\theta_3 = 2\left(\sqrt{\frac{s_2}{n} + 1} - 1\right).$$
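As a quick check of the algebra behind this root (a worked verification added here, with $m = E[Y^2]$ as shorthand that does not appear in the notes), multiply the quadratic by 4 and apply the quadratic formula:

$$\theta^2 + 4\theta - 4m = 0 \;\Longrightarrow\; \theta = \frac{-4 \pm \sqrt{16 + 16m}}{2} = -2 \pm 2\sqrt{1 + m}.$$

Only the root with the plus sign is positive. For example, if $\theta = 25$ then $m = \theta + \theta^2/4 = 181.25$, and $2(\sqrt{182.25} - 1) = 2(13.5 - 1) = 25$, which recovers the variance.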

Which estimator is best? I conducted a Monte Carlo experiment by setting the true $\theta = 25$ and sampling from a Normal (Gaussian) distribution with mean $-12.5$ and variance 25. I generated 50,000 samples with 100 observations ($n = 100$) each and computed values of the 3 estimators. I then computed means, variances and mean squared errors over the 50,000 replications. The following results were obtained:

Estimator          Bias       Variance    MSE
$\hat\theta_1$     0.0002     12.6388     12.6386
$\hat\theta_2$     0.0050      1.0032      1.0032
$\hat\theta_3$     0.0217      0.9273      0.9278

Perhaps surprisingly, the third estimator appears to do the best job overall, even though it may exhibit a slight degree of bias (although from the Monte Carlo results, we would not be able to reject the hypothesis that the bias is 0). In mean squared error terms, the third estimator is somewhat better than the second, and both of these are far better than the first estimator, which is very inefficient. The third estimator can, in fact, be shown to be the maximum likelihood estimator under the assumption that the random variable is normally distributed (as was assumed in the Monte Carlo exercise). It is therefore not terribly surprising that it did well.

I repeated the exercise but changed the sample size in each replicate to 1000 and obtained the following results:

Estimator          Bias       Variance    MSE
$\hat\theta_1$     -0.0028    1.2654      1.
$\hat\theta_2$     -0.0011    0.1000      0.
$\hat\theta_3$      0.0005    0.0925      0.

Notice how the mean squared error has dropped for each estimator by approximately an order of magnitude. In fact, an estimate of the variance of $\hat\theta_3$ based on its likelihood interpretation is

$$\mathrm{Var}(\hat\theta_3) = \frac{4\theta^2}{(\theta + 2)\,n},$$

which is very close to the estimates obtained in the Monte Carlo exercises. Matlab code to perform this demonstration is given below.

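    theta = 25; n = 100; rep = 5000;
    % each column of y is one sample of n draws from N(-theta/2, theta)
    y = randn(n,rep)*sqrt(theta) - theta/2;
    s1 = sum(y)';      % column sums of y
    s2 = sum(y.*y)';   % column sums of y squared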
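The same experiment can be sketched in Python using only the standard library. This is an illustrative translation rather than the author's code: the function names `estimators` and `monte_carlo`, the seed, and the smaller default of 2000 replications (chosen to keep the run short) are assumptions made here.

```python
import math
import random


def estimators(y):
    """Return the three method of moments estimates of theta for one sample."""
    n = len(y)
    s1 = sum(y)
    s2 = sum(v * v for v in y)
    th1 = s2 / (n - 1) - s1 ** 2 / ((n - 1) * n)   # sample variance
    th2 = -2.0 * s1 / n                             # minus twice the sample mean
    th3 = 2.0 * (math.sqrt(s2 / n + 1.0) - 1.0)     # positive root of the quadratic
    return th1, th2, th3


def monte_carlo(theta=25.0, n=100, rep=2000, seed=0):
    """Bias, variance and MSE of each estimator over `rep` simulated samples."""
    rng = random.Random(seed)
    sd = math.sqrt(theta)
    draws = [estimators([rng.gauss(-theta / 2.0, sd) for _ in range(n)])
             for _ in range(rep)]
    results = []
    for k in range(3):
        vals = [d[k] for d in draws]
        mean = sum(vals) / rep
        var = sum((v - mean) ** 2 for v in vals) / (rep - 1)
        mse = sum((v - theta) ** 2 for v in vals) / rep
        results.append((mean - theta, var, mse))
    return results


if __name__ == "__main__":
    for k, (bias, var, mse) in enumerate(monte_carlo(), start=1):
        print(f"theta_hat_{k}: bias={bias:+.4f} var={var:.4f} mse={mse:.4f}")
```

With far fewer replications than in the notes, the bias estimates are noisy, but the ranking of the mean squared errors of the first estimator against the other two should still be clearly visible.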

Maximum Likelihood

Suppose you have a completely specified probability model with a log-likelihood function $l(Y, \theta)$. Then $\theta$ can be estimated by maximizing the log-likelihood for the sample:

$$\max_{\theta} \sum_{i=1}^{n} l(y_i, \theta).$$

This can be accomplished by finding the value $\hat\theta$ that causes the derivative to equal 0:

$$\sum_{i=1}^{n} \frac{\partial l(y_i, \hat\theta)}{\partial\theta} = 0.$$

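The asymptotic distribution of this estimator is normal with mean $\theta$ and covariance that can be computed in either of two ways:

$$\mathrm{Cov}(\hat\theta) = \frac{1}{n}\left(E\left[\frac{\partial l(Y, \theta)}{\partial\theta}\left(\frac{\partial l(Y, \theta)}{\partial\theta}\right)^{\top}\right]\right)^{-1} = -\frac{1}{n}\left(E\left[\frac{\partial^2 l(Y, \theta)}{\partial\theta\,\partial\theta^{\top}}\right]\right)^{-1}.$$

In the scalar case this can be written as

$$\mathrm{Var}(\hat\theta) = \frac{1}{n\,E\left[\left(\dfrac{dl(Y, \theta)}{d\theta}\right)^{2}\right]} = -\frac{1}{n\,E\left[\dfrac{d^2 l(Y, \theta)}{d\theta^2}\right]}.$$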
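The equality of the two variance expressions can be checked numerically. The sketch below assumes the normal model from the earlier Monte Carlo experiment, $Y \sim N(-\theta/2, \theta)$. The derivative formulas in `score` and `hessian` were worked out by hand for this illustration, and all function names are hypothetical.

```python
import math
import random


def score(y, theta):
    """First derivative in theta of the N(-theta/2, theta) log-likelihood."""
    z = y + theta / 2.0
    return -1.0 / (2 * theta) - z / (2 * theta) + z * z / (2 * theta ** 2)


def hessian(y, theta):
    """Second derivative in theta of the same log-likelihood."""
    z = y + theta / 2.0
    return (1.0 / (2 * theta ** 2) - 1.0 / (4 * theta)
            + z / theta ** 2 - z * z / theta ** 3)


def information_two_ways(theta=25.0, draws=200_000, seed=0):
    """Monte Carlo estimates of E[score^2] and -E[hessian]."""
    rng = random.Random(seed)
    ys = [rng.gauss(-theta / 2.0, math.sqrt(theta)) for _ in range(draws)]
    outer = sum(score(y, theta) ** 2 for y in ys) / draws
    curvature = -sum(hessian(y, theta) for y in ys) / draws
    return outer, curvature


if __name__ == "__main__":
    outer, curvature = information_two_ways()
    n = 100
    print("E[score^2]  :", outer, "-> Var approx", 1.0 / (n * outer))
    print("-E[hessian] :", curvature, "-> Var approx", 1.0 / (n * curvature))
```

With $\theta = 25$ both averages should be close to $(\theta + 2)/(4\theta^2) = 0.0108$, which implies a variance of about 0.926 for $n = 100$ and 0.0926 for $n = 1000$, in line with the Monte Carlo variances reported for the third estimator.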

Least Squares

Suppose you have an incomplete probability model for which

$$Y = f(\theta) + e,$$

where $\mathrm{Var}(e) = \sigma^2$. $Y$ is treated as equal to a function of a parameter $\theta$ and (possibly) observations on a variable $X$, plus noise ($e$). Given a sample with independent observations, one estimation strategy is least squares, which chooses $\theta$ to solve

$$\min_{\theta} \sum_{i=1}^{n} \left(y_i - f(\theta)\right)^2.$$

As usual, this is accomplished by taking the derivative of the objective function, setting it equal to 0 and solving for $\theta$. Hence we find the $\hat\theta$ that solves

$$\sum_{i=1}^{n} \left(y_i - f(\hat\theta)\right)\frac{\partial f(\hat\theta)}{\partial\theta} = 0.$$

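The distribution theory associated with this estimator is asymptotic. Central limit theorem arguments can be used to show that $\hat\theta$ is asymptotically normal with mean $\theta$ and covariance given by

$$\mathrm{Cov}(\hat\theta) = \frac{\sigma^2}{n}\left(E\left[\frac{\partial f(\theta)}{\partial\theta}\left(\frac{\partial f(\theta)}{\partial\theta}\right)^{\top}\right]\right)^{-1}.$$

When $\theta$ is a scalar this simplifies to

$$\mathrm{Var}(\hat\theta) = \frac{\sigma^2}{n\,E\left[\left(\dfrac{df(\theta)}{d\theta}\right)^{2}\right]}.$$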
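As a minimal worked case (an illustration added here, not taken from the notes), let $f(\theta) = \theta$, so that $Y = \theta + e$. The first order condition and the scalar variance formula then give

$$\sum_{i=1}^{n} (y_i - \hat\theta) = 0 \;\Longrightarrow\; \hat\theta = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \frac{df(\theta)}{d\theta} = 1 \;\Longrightarrow\; \mathrm{Var}(\hat\theta) = \frac{\sigma^2}{n},$$

which is the familiar variance of a sample mean.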

An Example

Suppose that $Y$ is a random variable described by an exponential distribution. The density of $Y$ depends on a single parameter, $\theta$:

$$\theta e^{-\theta Y}.$$

The exponential distribution has been widely used in reliability modeling, where $Y$ measures the time a component (e.g., a light bulb or a computer chip) takes to fail. One useful fact about the exponential distribution that can be demonstrated using integration by parts is

$$E[Y^i] = \frac{i}{\theta}\,E[Y^{i-1}],$$

with $E[Y] = 1/\theta$. An immediate implication is that $\mathrm{Var}[Y] = 1/\theta^2$.
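Spelling out that implication (a worked step added for clarity), apply the recursion once with $i = 2$:

$$E[Y^2] = \frac{2}{\theta}\,E[Y] = \frac{2}{\theta^2}, \qquad \mathrm{Var}[Y] = E[Y^2] - E[Y]^2 = \frac{2}{\theta^2} - \frac{1}{\theta^2} = \frac{1}{\theta^2}.$$

Continuing the recursion gives $E[Y^3] = 6/\theta^3$ and $E[Y^4] = 24/\theta^4$, that is, $E[Y^i] = i!/\theta^i$.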

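Method of Moments

The natural moment condition for this problem is

$$E\left[Y - \frac{1}{\theta}\right] = 0.$$

The method of moments estimator therefore solves

$$\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \frac{1}{\theta}\right) = \frac{1}{n}\sum_{i=1}^{n} y_i - \frac{1}{\theta} = 0,$$

leading to the estimator

$$\hat\theta = \frac{n}{\sum_{i=1}^{n} y_i}.$$

The covariance formula involves

$$E\left[\frac{\partial m(Y, \theta)}{\partial\theta}\right] = E\left[\frac{1}{\theta^2}\right] = \frac{1}{\theta^2}.$$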

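Least Squares

To find the least squares estimator of $\theta$, solve

$$\min_{\theta} \frac{1}{2}\sum_{i=1}^{n}\left(y_i - \frac{1}{\theta}\right)^{2}$$

(the 1/2 is for convenience only). The first order condition is

$$\sum_{i=1}^{n}\left(y_i - \frac{1}{\theta}\right)\frac{1}{\theta^2} = 0,$$

which is solved by

$$\hat\theta = \frac{n}{\sum_{i=1}^{n} y_i}.$$

The asymptotic variance of $\hat\theta$ is

$$\mathrm{Var}(\hat\theta) = \frac{\mathrm{Var}(Y)}{n\,(1/\theta^{2})^{2}} = \frac{\theta^2}{n}.$$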

In this example all three approaches led to the same estimator (with the same asymptotic variance). A final note about these examples: the variance of the estimator depends on the unknown parameter $\theta$. The asymptotic distribution theory does not change if we substitute a consistent estimator for this unknown value. Hence, we can use $\hat\theta$ to derive an estimate of the variance of $\hat\theta$.
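A small simulation can illustrate both points: that the estimator $n / \sum_i y_i$ has variance close to $\theta^2/n$, and that plugging the estimate into that formula gives a usable variance estimate. This is a sketch under assumed settings ($\theta = 2$, $n = 200$, hypothetical function names), using only the Python standard library.

```python
import random


def rate_estimate(y):
    """Estimator n / sum(y) of the exponential parameter theta."""
    return len(y) / sum(y)


def simulate(theta=2.0, n=200, rep=5000, seed=0):
    """Mean and variance of the estimator across `rep` simulated samples."""
    rng = random.Random(seed)
    est = [rate_estimate([rng.expovariate(theta) for _ in range(n)])
           for _ in range(rep)]
    mean = sum(est) / rep
    var = sum((e - mean) ** 2 for e in est) / (rep - 1)
    return mean, var


if __name__ == "__main__":
    theta, n = 2.0, 200
    mean, var = simulate(theta, n)
    print("simulated variance :", var)
    print("theta^2 / n        :", theta ** 2 / n)

    # Plug-in variance estimate from a single sample.
    rng = random.Random(1)
    sample = [rng.expovariate(theta) for _ in range(n)]
    th = rate_estimate(sample)
    print("plug-in estimate   :", th ** 2 / n)
```

The simulated variance is expected to sit slightly above $\theta^2/n$ in finite samples, since the formula is asymptotic.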

Conditioning Information

The basic framework requires little modification when the parameters of the distribution of $Y$ depend on some conditioning information. With method of moments estimators, the moment conditions may now depend on $X$, but otherwise nothing has changed. The moment condition can be written as

$$E[m(Y, X, \theta)] = 0,$$

and estimated by solving

$$\frac{1}{n}\sum_{i=1}^{n} m(y_i, x_i, \theta) = 0.$$

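The covariance of $\hat\theta$ becomes

$$\mathrm{Cov}(\hat\theta) = \left[\left(\sum_{i=1}^{n} E\left[\frac{\partial m(y_i, x_i, \theta)}{\partial\theta}\right]\right)^{\top}\left(\sum_{i=1}^{n} E\left[m(y_i, x_i, \theta)\,m(y_i, x_i, \theta)^{\top}\right]\right)^{-1}\left(\sum_{i=1}^{n} E\left[\frac{\partial m(y_i, x_i, \theta)}{\partial\theta}\right]\right)\right]^{-1}.$$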

Maximum likelihood estimation is also essentially unchanged:

$$\max_{\theta} \sum_{i=1}^{n} l(y_i, x_i, \theta)$$

by solving

$$\sum_{i=1}^{n} \frac{\partial l(y_i, x_i, \theta)}{\partial\theta} = 0.$$

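The covariance can be computed as

$$\mathrm{Cov}(\hat\theta) = \left(\sum_{i=1}^{n} E\left[\frac{\partial l(y_i, x_i, \theta)}{\partial\theta}\left(\frac{\partial l(y_i, x_i, \theta)}{\partial\theta}\right)^{\top}\right]\right)^{-1} = -\left(\sum_{i=1}^{n} E\left[\frac{\partial^2 l(y_i, x_i, \theta)}{\partial\theta\,\partial\theta^{\top}}\right]\right)^{-1}.$$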

In the least squares context, the mean of $Y$ may depend on one or more variables represented by $X$:

$$Y = f(X, \theta) + e.$$

The estimator does not change in substance:

$$\min_{\theta} \sum_{i=1}^{n} \left(y_i - f(x_i, \theta)\right)^2$$

by solving

$$\sum_{i=1}^{n} \left(y_i - f(x_i, \theta)\right)\frac{\partial f(x_i, \theta)}{\partial\theta} = 0.$$

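The covariance of the estimator becomes

$$\mathrm{Cov}(\hat\theta) = \sigma^2\left(\sum_{i=1}^{n} E\left[\frac{\partial f(x_i, \theta)}{\partial\theta}\left(\frac{\partial f(x_i, \theta)}{\partial\theta}\right)^{\top}\right]\right)^{-1}.$$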

In each of these cases, the covariance typically depends on $\theta$ and may be estimated using $\hat\theta$. It is also often true that the expectations in these expressions are difficult to evaluate; in many situations the expressions themselves, especially derivatives, are not known in closed form. In such situations, one can ignore the expectation operator because the sum over the observations (divided by $n$) provides a consistent estimate of the relevant expectation.
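The idea of dropping the expectation and differentiating numerically can be sketched as follows. For brevity the example uses the exponential model without a conditioning variable, approximates the second derivative of the sample log-likelihood by a central finite difference, and inverts it to get a variance estimate. The function names and the step size `h` are assumptions of this sketch.

```python
import math
import random


def loglik(theta, y):
    """Sample log-likelihood for data with density theta * exp(-theta * y)."""
    return sum(math.log(theta) - theta * v for v in y)


def observed_variance(theta_hat, y, h=1e-3):
    """Variance estimate -1 / (d^2/dtheta^2 of the sample log-likelihood).

    The second derivative is approximated by a central finite difference,
    so no closed-form derivative or expectation is required.
    """
    d2 = (loglik(theta_hat + h, y) - 2.0 * loglik(theta_hat, y)
          + loglik(theta_hat - h, y)) / h ** 2
    return -1.0 / d2


if __name__ == "__main__":
    rng = random.Random(0)
    y = [rng.expovariate(2.0) for _ in range(500)]
    theta_hat = len(y) / sum(y)
    print("numerical estimate :", observed_variance(theta_hat, y))
    print("theta_hat^2 / n    :", theta_hat ** 2 / len(y))
```

For this model the two printed numbers should agree closely, because the second derivative of the log-likelihood is $-n/\theta^2$ and does not depend on the data.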

Exponential Example

Returning to the case in which $Y$ is exponentially distributed, suppose now that the single parameter of the exponential distribution depends on the variable $X$. Specifically, suppose we make the mean of $Y$ equal to $\theta_0 + \theta_1 X$, so

$$Y \sim \mathrm{Exponential}\left(\frac{1}{\theta_0 + \theta_1 X}\right).$$

In this case the assumptions behind the least squares model are not quite satisfied because the error term $e$ does not have a constant variance. Instead its variance is $(\theta_0 + \theta_1 X)^2$. This situation, termed heteroskedasticity, will be taken up at a later date.

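This leads to

$$\mathrm{Cov}(\hat\theta) = \begin{bmatrix} n & \sum_i x_i \\ 2\sum_i \mu_i & 2\sum_i \mu_i x_i \end{bmatrix}^{-1} \begin{bmatrix} \sum_i \mu_i^2 & 2\sum_i \mu_i^3 \\ 2\sum_i \mu_i^3 & 8\sum_i \mu_i^4 \end{bmatrix} \begin{bmatrix} n & 2\sum_i \mu_i \\ \sum_i x_i & 2\sum_i \mu_i x_i \end{bmatrix}^{-1},$$

where $\mu_i = \theta_0 + \theta_1 x_i$ is the mean of $y_i$.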

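The maximum likelihood problem is

$$\max_{\theta_0, \theta_1} \; -\sum_i \ln(\theta_0 + \theta_1 x_i) - \sum_i \frac{y_i}{\theta_0 + \theta_1 x_i}.$$

The associated first-order conditions are

$$\sum_i \frac{y_i}{(\theta_0 + \theta_1 x_i)^2} - \sum_i \frac{1}{\theta_0 + \theta_1 x_i} = 0$$

and

$$\sum_i \frac{x_i y_i}{(\theta_0 + \theta_1 x_i)^2} - \sum_i \frac{x_i}{\theta_0 + \theta_1 x_i} = 0.$$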

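In general these would need to be solved numerically. The asymptotic covariance of this estimator is

$$\mathrm{Cov}(\hat\theta) = \begin{bmatrix} \sum_i \dfrac{1}{(\theta_0 + \theta_1 x_i)^2} & \sum_i \dfrac{x_i}{(\theta_0 + \theta_1 x_i)^2} \\ \sum_i \dfrac{x_i}{(\theta_0 + \theta_1 x_i)^2} & \sum_i \dfrac{x_i^2}{(\theta_0 + \theta_1 x_i)^2} \end{bmatrix}^{-1}.$$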
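One way to solve the first-order conditions numerically is Fisher scoring, which repeatedly updates the parameters by the inverse of the matrix inside the covariance formula times the score. This is a technique chosen here for illustration, not necessarily the one the notes have in mind. The sketch below assumes starting values of the sample mean for $\theta_0$ and zero for $\theta_1$, includes no safeguard against nonpositive fitted means, and uses hypothetical names throughout.

```python
import random


def fisher_scoring(x, y, iters=100, tol=1e-10):
    """Solve the two first-order conditions for (theta0, theta1).

    Returns the estimates and the estimated asymptotic covariance matrix.
    """
    t0, t1 = sum(y) / len(y), 0.0          # assumed starting values
    for _ in range(iters):
        g0 = g1 = a = b = c = 0.0
        for xi, yi in zip(x, y):
            mu = t0 + t1 * xi
            r = (yi - mu) / mu ** 2        # y/mu^2 - 1/mu, the score term
            w = 1.0 / mu ** 2              # information weight
            g0 += r
            g1 += r * xi
            a += w
            b += w * xi
            c += w * xi * xi
        det = a * c - b * b
        d0 = (c * g0 - b * g1) / det       # step = information^{-1} * score
        d1 = (a * g1 - b * g0) / det
        t0, t1 = t0 + d0, t1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    cov = [[c / det, -b / det], [-b / det, a / det]]
    return t0, t1, cov


if __name__ == "__main__":
    rng = random.Random(0)
    x = [float(i % 10 + 1) for i in range(1000)]
    y = [rng.expovariate(1.0 / (1.0 + 0.5 * xi)) for xi in x]  # mean 1 + 0.5 x
    t0, t1, cov = fisher_scoring(x, y)
    print("theta0, theta1:", t0, t1)
    print("std. errors   :", cov[0][0] ** 0.5, cov[1][1] ** 0.5)
```

The returned covariance is the inverse information matrix evaluated at the last iterate, which is the sample counterpart of the asymptotic covariance given above.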

Let's return now and examine the least squares estimator for this model; the objective function is

$$\min_{\theta_0, \theta_1} \sum_i (y_i - \theta_0 - \theta_1 x_i)^2.$$

The two first order conditions are

$$\sum_i (y_i - \theta_0 - \theta_1 x_i) = 0$$

and

$$\sum_i (y_i - \theta_0 - \theta_1 x_i)\,x_i = 0.$$

When solved for $\theta_0$ and $\theta_1$ these result in the usual OLS estimator, which will be denoted $\tilde\theta$. Notice that the first order conditions can be associated with the moment conditions

$$E[e_i] = 0$$

and

$$E[e_i x_i] = 0.$$

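This provides us with a way of determining the distribution of the estimator using the method of moments interpretation. The two components needed for computing the covariance are

$$E\left[m(y_i, x_i, \theta)\,m(y_i, x_i, \theta)^{\top}\right] = E\begin{bmatrix} e_i^2 & x_i e_i^2 \\ x_i e_i^2 & x_i^2 e_i^2 \end{bmatrix} = \mathrm{Var}(y_i)\begin{bmatrix} 1 & x_i \\ x_i & x_i^2 \end{bmatrix}.$$

The variance of $y_i$ in this model is $(\theta_0 + \theta_1 x_i)^2$. The second component is the matrix of partial derivatives of the moment conditions with respect to $\theta_0$ and $\theta_1$:

$$\frac{\partial m(y_i, x_i, \theta)}{\partial\theta} = -\begin{bmatrix} 1 & x_i \\ x_i & x_i^2 \end{bmatrix}.$$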

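Hence the covariance of the least squares estimator is

$$\mathrm{Cov}(\tilde\theta) = \begin{bmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum_i (\theta_0 + \theta_1 x_i)^2 & \sum_i (\theta_0 + \theta_1 x_i)^2 x_i \\ \sum_i (\theta_0 + \theta_1 x_i)^2 x_i & \sum_i (\theta_0 + \theta_1 x_i)^2 x_i^2 \end{bmatrix} \begin{bmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{bmatrix}^{-1}.$$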

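Recall that the ML estimator $\hat\theta$ has a covariance given by

$$\mathrm{Cov}(\hat\theta) = \begin{bmatrix} \sum_i \dfrac{1}{(\theta_0 + \theta_1 x_i)^2} & \sum_i \dfrac{x_i}{(\theta_0 + \theta_1 x_i)^2} \\ \sum_i \dfrac{x_i}{(\theta_0 + \theta_1 x_i)^2} & \sum_i \dfrac{x_i^2}{(\theta_0 + \theta_1 x_i)^2} \end{bmatrix}^{-1}.$$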

Given the model assumptions this is an efficient estimator, and hence $\mathrm{Cov}(\hat\theta)$ can be no larger than $\mathrm{Cov}(\tilde\theta)$. As an illustration, consider a situation with $n = 20$ in which 10 of the values of the $x_i$ equal 0 and the other 10 equal 1. Then

Cov ( ^ ) =

whereas

Cov ( ~ ) = 10

Thus, in this case, the OLS estimator is as efficient as the ML estimator (it is good that we obtained this result because, in fact, the two estimators are identical in this case). Consider another case, however, with $x_i = i$, $\theta_0 = 1$ and $\theta_1 = 0.1$. In this case the ML and OLS covariance matrices are

Cov ( ^ ) =
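The two designs discussed above can be evaluated numerically from the ML and OLS covariance formulas. The sketch below is an illustration under stated assumptions: the parameter values $\theta_0 = 1$ and $\theta_1 = 0.1$ are reused for the 0/1 design because no values for that case are given, $n = 20$ is assumed for the design with $x_i = i$, and the helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def moment_matrix(x, w):
    """Sum over i of w_i * [[1, x_i], [x_i, x_i^2]]."""
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return [[s0, s1], [s1, s2]]


def ml_cov(x, t0, t1):
    """Inverse of the information matrix, with weights 1 / mean^2."""
    return inv2(moment_matrix(x, [1.0 / (t0 + t1 * xi) ** 2 for xi in x]))


def ols_cov(x, t0, t1):
    """Sandwich form: (X'X)^{-1} [sum of Var(y_i) x_i x_i'] (X'X)^{-1}."""
    bread = inv2(moment_matrix(x, [1.0] * len(x)))
    meat = moment_matrix(x, [(t0 + t1 * xi) ** 2 for xi in x])
    return mul2(mul2(bread, meat), bread)


if __name__ == "__main__":
    binary = [0.0] * 10 + [1.0] * 10
    trend = [float(i) for i in range(1, 21)]   # assumed n = 20
    for name, x in (("0/1 design", binary), ("x_i = i", trend)):
        print(name)
        print("  ML :", ml_cov(x, 1.0, 0.1))
        print("  OLS:", ols_cov(x, 1.0, 0.1))
```

Working the 0/1 design by hand, both formulas reduce to the same matrix, with entries $\theta_0^2/10$, $-\theta_0^2/10$ and $(\theta_0^2 + (\theta_0 + \theta_1)^2)/10$, which agrees with the statement that the two estimators coincide in that case. For the design with $x_i = i$, the diagonal entries of the OLS covariance should be at least as large as those of the ML covariance.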