Multiple Regression Model: Guides, Projects and Research in Econometrics

The multiple regression model is a statistical technique used to analyse the relationship between a dependent variable and several independent variables. The key concepts discussed include the projection matrix, the coefficient of determination, the partitioned regression model, regression in deviation form, the statistical properties of the ordinary least-squares estimator, orthogonality and the problem of omitted-variable bias, regression with linear restrictions, and the use of trigonometric functions in regression. The document gives a thorough treatment of the theoretical and mathematical foundations of the multiple regression model, with numerous examples and formal proofs. It may be particularly useful for university students taking courses in econometrics, applied statistics or quantitative analysis who need a solid understanding of the fundamentals of the multiple regression model.

Type: Guides, Projects and Research

2021/2022

Uploaded on 21/08/2024

marco.s30
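EC3062 ECONOMETRICS
THE MULTIPLE REGRESSION MODEL

Consider $T$ realisations of the regression equation

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon, \tag{1}$$

which can be written in the following form:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{T1} & \cdots & x_{Tk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{bmatrix}. \tag{2}$$

This can be represented in summary notation by

$$y = X\beta + \varepsilon. \tag{3}$$

The object is to derive an expression for the ordinary least-squares estimates of the elements of the parameter vector $\beta = [\beta_0, \beta_1, \ldots, \beta_k]'$.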

The ordinary least-squares (OLS) estimate of $\beta$ is the value that minimises

$$\begin{aligned} S(\beta) = \varepsilon'\varepsilon &= (y - X\beta)'(y - X\beta) \\ &= y'y - y'X\beta - \beta'X'y + \beta'X'X\beta \\ &= y'y - 2y'X\beta + \beta'X'X\beta. \end{aligned} \tag{4}$$

According to the rules of matrix differentiation, the derivative is

$$\frac{\partial S}{\partial \beta} = -2y'X + 2\beta'X'X. \tag{5}$$

Setting this to zero gives $0 = \beta'X'X - y'X$, which is transposed to provide the so-called normal equations:

$$X'X\beta = X'y. \tag{6}$$

On the assumption that the inverse matrix exists, the equations have a unique solution, which is the vector of ordinary least-squares estimates:

$$\hat{\beta} = (X'X)^{-1}X'y. \tag{7}$$
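The derivation can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the data, the seed and the values in `beta_true` are hypothetical and serve only as an illustration. It solves the normal equations and confirms that the derivative of the sum of squares vanishes at the solution.

```python
import numpy as np

# Hypothetical data: T = 6 observations, an intercept and k = 2 regressors.
rng = np.random.default_rng(0)
T, k = 6, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, k))])  # first column of ones
beta_true = np.array([1.0, 2.0, -0.5])                       # assumed parameter values
y = X @ beta_true + 0.1 * rng.normal(size=T)

# Solve the normal equations X'X beta = X'y without forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The derivative -2y'X + 2beta'X'X should be zero at the minimiser.
gradient = -2 * y @ X + 2 * beta_hat @ X.T @ X
print(beta_hat, gradient)
```

Solving the linear system is generally preferred to inverting X'X, although both give the same estimate when the inverse exists.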

This is an instance of Pythagoras' theorem; and the equation indicates that the total sum of squares $y'y$ is equal to the regression sum of squares $\hat{\beta}'X'X\hat{\beta}$ plus the residual or error sum of squares $e'e$.

By projecting $y$ perpendicularly onto the manifold of $X$, the distance between $y$ and $Py = X\hat{\beta}$ is minimised.

Proof. Let $\gamma = Pg$ be an arbitrary vector in the manifold of $X$. Then

$$\begin{aligned} (y - \gamma)'(y - \gamma) &= \{(y - X\hat{\beta}) + (X\hat{\beta} - \gamma)\}'\{(y - X\hat{\beta}) + (X\hat{\beta} - \gamma)\} \\ &= \{(I - P)y + P(y - g)\}'\{(I - P)y + P(y - g)\}. \end{aligned}$$

The properties of $P$ (it is symmetric and idempotent, so that $(I - P)P = 0$ and the cross-product terms vanish) indicate that

$$\begin{aligned} (y - \gamma)'(y - \gamma) &= y'(I - P)y + (y - g)'P(y - g) \\ &= e'e + (X\hat{\beta} - \gamma)'(X\hat{\beta} - \gamma). \end{aligned}$$

Since the squared distance $(X\hat{\beta} - \gamma)'(X\hat{\beta} - \gamma)$ is nonnegative, it follows that $(y - \gamma)'(y - \gamma) \geq e'e$, where $e = y - X\hat{\beta}$; which proves the assertion.
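The argument can be illustrated with a short numerical sketch, assuming NumPy and hypothetical random data. It forms the projection matrix P = X(X'X)^{-1}X', checks the Pythagorean decomposition of the total sum of squares, and compares the distance from y to an arbitrary point in the manifold of X with the residual sum of squares.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 8
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # hypothetical regressors
y = rng.normal(size=T)

P = X @ np.linalg.solve(X.T @ X, X.T)   # P = X (X'X)^{-1} X'
e = y - P @ y                           # residual vector e = (I - P) y

# Pythagorean decomposition: y'y = (Py)'(Py) + e'e.
total_ss = y @ y
regression_ss = (P @ y) @ (P @ y)
residual_ss = e @ e

# Any other point gamma = P g in the manifold of X lies at least as far from y.
g = rng.normal(size=T)                  # arbitrary vector
gamma = P @ g
distance_gap = (y - gamma) @ (y - gamma) - residual_ss
print(total_ss, regression_ss + residual_ss, distance_gap)
```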

The Coefficient of Determination

A summary measure of the extent to which the ordinary least-squares regression accounts for the observed vector $y$ is provided by the coefficient of determination. This is defined by

$$R^2 = \frac{\hat{\beta}'X'X\hat{\beta}}{y'y} = \frac{y'Py}{y'y}. \tag{11}$$

The measure is just the square of the cosine of the angle between the vectors $y$ and $Py = X\hat{\beta}$; and the inequality $0 \leq R^2 \leq 1$ follows from the fact that the cosine of any angle must lie between $-1$ and $+1$.

If $X$ is a square matrix of full rank, with as many regressors as observations, then $X^{-1}$ exists and

$$P = X(X'X)^{-1}X' = X\{X^{-1}X'^{-1}\}X' = I,$$

and so $R^2 = 1$. If $X'y = 0$, then $Py = 0$ and $R^2 = 0$. But, if $y$ is distributed continuously, then this event has a zero probability.
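A small sketch, assuming NumPy and hypothetical data, may help to connect the two readings of the coefficient of determination. It computes the ratio y'Py / y'y as defined in the text (without taking deviations from the mean), compares it with the squared cosine of the angle between y and Py, and checks that a square matrix of full rank gives P = I. The matrix `X_square` is an illustrative construction chosen to be well conditioned.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 10
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=T)

P = X @ np.linalg.solve(X.T @ X, X.T)
fitted = P @ y

r2 = (y @ P @ y) / (y @ y)     # ratio of regression to total sum of squares
cos_angle = (y @ fitted) / (np.linalg.norm(y) * np.linalg.norm(fitted))

# A square X of full rank (as many regressors as observations) gives P = I.
X_square = np.eye(T) + 0.1 * rng.normal(size=(T, T))
P_square = X_square @ np.linalg.solve(X_square.T @ X_square, X_square.T)
r2_square = (y @ P_square @ y) / (y @ y)
print(r2, cos_angle ** 2, r2_square)
```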

The Partitioned Regression Model

Consider partitioning the regression equation of (3) to give

$$y = [\,X_1 \;\; X_2\,] \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon, \tag{12}$$

where $[X_1, X_2] = X$ and $[\beta_1', \beta_2']' = \beta$. The normal equations of (6) can be partitioned likewise:

$$X_1'X_1\beta_1 + X_1'X_2\beta_2 = X_1'y, \tag{13}$$

$$X_2'X_1\beta_1 + X_2'X_2\beta_2 = X_2'y. \tag{14}$$

From (13), we get $X_1'X_1\beta_1 = X_1'(y - X_2\beta_2)$, which gives

$$\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'(y - X_2\hat{\beta}_2). \tag{15}$$

To obtain an expression for $\hat{\beta}_2$, we must eliminate $\beta_1$ from equation (14). For this, we multiply equation (13) by $X_2'X_1(X_1'X_1)^{-1}$ to give

$$X_2'X_1\beta_1 + X_2'X_1(X_1'X_1)^{-1}X_1'X_2\beta_2 = X_2'X_1(X_1'X_1)^{-1}X_1'y. \tag{16}$$

From

$$X_2'X_1\beta_1 + X_2'X_2\beta_2 = X_2'y, \tag{14}$$

we subtract the resulting equation

$$X_2'X_1\beta_1 + X_2'X_1(X_1'X_1)^{-1}X_1'X_2\beta_2 = X_2'X_1(X_1'X_1)^{-1}X_1'y \tag{16}$$

to give

$$\{X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\}\beta_2 = X_2'y - X_2'X_1(X_1'X_1)^{-1}X_1'y. \tag{17}$$

On defining $P_1 = X_1(X_1'X_1)^{-1}X_1'$, equation (17) can be written as

$$\{X_2'(I - P_1)X_2\}\beta_2 = X_2'(I - P_1)y,$$

whence

$$\hat{\beta}_2 = \{X_2'(I - P_1)X_2\}^{-1}X_2'(I - P_1)y. \tag{20}$$
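The partitioned formulae can be compared with a regression on the full matrix. The sketch below assumes NumPy and hypothetical random data; the split into `X1` and `X2` is arbitrary. It computes the second sub-vector from equation (20), recovers the first from the expression in terms of y − X2β̂2, and checks both against the estimate from the unpartitioned normal equations.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 12
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])   # hypothetical first block
X2 = rng.normal(size=(T, 2))                             # hypothetical second block
y = rng.normal(size=T)

# Full regression on X = [X1, X2].
X = np.hstack([X1, X2])
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

# beta2_hat = {X2'(I - P1)X2}^{-1} X2'(I - P1)y, with P1 = X1 (X1'X1)^{-1} X1'.
P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
M1 = np.eye(T) - P1
beta2_hat = np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1 @ y)

# beta1_hat = (X1'X1)^{-1} X1'(y - X2 beta2_hat).
beta1_hat = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ beta2_hat))
print(beta_full, beta1_hat, beta2_hat)
```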

To understand the effect of the operator $P_\iota$, with $\iota = [1, 1, \ldots, 1]'$ denoting the vector of $T$ units, consider

$$\iota'y = \sum_{t=1}^{T} y_t, \qquad (\iota'\iota)^{-1}\iota'y = \frac{1}{T}\sum_{t=1}^{T} y_t = \bar{y},$$

and

$$P_\iota y = \iota\bar{y} = \iota(\iota'\iota)^{-1}\iota'y = [\bar{y}, \bar{y}, \ldots, \bar{y}]'.$$

Here, $P_\iota y = [\bar{y}, \bar{y}, \ldots, \bar{y}]'$ is a column vector containing $T$ repetitions of the sample mean.

From the above, it can be understood that, if $x = [x_1, x_2, \ldots, x_T]'$ is a vector of $T$ elements, then

$$x'(I - P_\iota)x = \sum_{t=1}^{T} x_t(x_t - \bar{x}) = \sum_{t=1}^{T} (x_t - \bar{x})x_t = \sum_{t=1}^{T} (x_t - \bar{x})^2.$$

The final equality depends on the fact that $\sum_t (x_t - \bar{x})\bar{x} = \bar{x}\sum_t (x_t - \bar{x}) = 0$.
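A worked example with concrete numbers may make the operator easier to follow. This is a minimal sketch assuming NumPy; the data vector `x` is hypothetical. It builds the projector on the vector of units, shows that it replaces every element by the sample mean, and evaluates the quadratic form that yields the sum of squared deviations.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 6.0, 9.0])    # hypothetical data vector, mean 5
T = x.size
iota = np.ones(T)                          # summation vector of T units

P_iota = np.outer(iota, iota) / (iota @ iota)   # iota (iota'iota)^{-1} iota'
mean_vector = P_iota @ x                        # T repetitions of the sample mean

quadratic_form = x @ (np.eye(T) - P_iota) @ x   # x'(I - P_iota)x
sum_sq_dev = np.sum((x - x.mean()) ** 2)        # sum of squared deviations
print(mean_vector, quadratic_form, sum_sq_dev)
```

With these numbers the deviations are −3, −1, −1, 1 and 4, so both expressions equal 28.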

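The Regression Model in Deviation Form

Consider the matrix of cross-products in equation (24). This is

$$Z'(I - P_\iota)Z = \{(I - P_\iota)Z\}'\{(I - P_\iota)Z\} = (Z - \bar{Z})'(Z - \bar{Z}).$$

Here, $\bar{Z}$ contains the sample means of the $k$ explanatory variables repeated $T$ times. The matrix $(I - P_\iota)Z = (Z - \bar{Z})$ contains the deviations of the data points about the sample means. The vector $(I - P_\iota)y = (y - \iota\bar{y})$ may be described likewise.

It follows that the estimate $\hat{\beta}_z = \{Z'(I - P_\iota)Z\}^{-1}Z'(I - P_\iota)y$ is obtained by applying the least-squares regression to the equation

$$\begin{bmatrix} y_1 - \bar{y} \\ y_2 - \bar{y} \\ \vdots \\ y_T - \bar{y} \end{bmatrix} = \begin{bmatrix} x_{11} - \bar{x}_1 & \cdots & x_{1k} - \bar{x}_k \\ x_{21} - \bar{x}_1 & \cdots & x_{2k} - \bar{x}_k \\ \vdots & & \vdots \\ x_{T1} - \bar{x}_1 & \cdots & x_{Tk} - \bar{x}_k \end{bmatrix} \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 - \bar{\varepsilon} \\ \varepsilon_2 - \bar{\varepsilon} \\ \vdots \\ \varepsilon_T - \bar{\varepsilon} \end{bmatrix}, \tag{28}$$

which lacks an intercept term.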
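The equivalence can be checked numerically. The sketch below assumes NumPy, hypothetical data, and a layout X = [ι, Z] in which the first column is the intercept column; that layout is an assumption made for the illustration. It compares the slope estimates from a regression with an intercept against those from a regression of deviations on deviations.

```python
import numpy as np

rng = np.random.default_rng(4)
T, k = 15, 2
Z = rng.normal(size=(T, k))                       # hypothetical explanatory variables
y = 3.0 + Z @ np.array([1.5, -2.0]) + rng.normal(size=T)

# Regression with an intercept: X = [iota, Z] (assumed layout).
X = np.column_stack([np.ones(T), Z])
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

# Regression of deviations on deviations, with no intercept term.
Z_dev = Z - Z.mean(axis=0)
y_dev = y - y.mean()
beta_z = np.linalg.solve(Z_dev.T @ Z_dev, Z_dev.T @ y_dev)

# The intercept can be recovered from the sample means.
intercept = y.mean() - Z.mean(axis=0) @ beta_z
print(beta_full, intercept, beta_z)
```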

The Assumptions of the Classical Linear Model

Consider the regression equation

$$y = X\beta + \varepsilon,$$

where $y = [y_1, y_2, \ldots, y_T]'$, $\varepsilon = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T]'$, $\beta = [\beta_0, \beta_1, \ldots, \beta_k]'$ and $X = [x_{tj}]$, with $x_{t0} = 1$ for all $t$.

It is assumed that the disturbances have expected values of zero. Thus

$$E(\varepsilon) = 0 \quad \text{or, equivalently,} \quad E(\varepsilon_t) = 0, \quad t = 1, \ldots, T.$$

Next, it is assumed that they are mutually uncorrelated and that they have a common variance. Thus

$$D(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I, \quad \text{or} \quad E(\varepsilon_t\varepsilon_s) = \begin{cases} \sigma^2, & \text{if } t = s; \\ 0, & \text{if } t \neq s. \end{cases} \tag{34}$$

If $t$ is a temporal index, then these assumptions imply that there is no inter-temporal correlation in the sequence of disturbances.

A conventional assumption, borrowed from the experimental sciences, is that $X$ is a nonstochastic matrix with linearly independent columns. Linear independence is necessary in order to distinguish the separate effects of the $k$ explanatory variables.

In econometrics, it is more appropriate to regard the elements of $X$ as random variables distributed independently of the disturbances:

$$E(X'\varepsilon \mid X) = X'E(\varepsilon) = 0. \tag{37}$$

Then,

$$\hat{\beta} = (X'X)^{-1}X'y \tag{38}$$

is unbiased such that $E(\hat{\beta}) = \beta$. To demonstrate this, we may write

$$\hat{\beta} = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon. \tag{39}$$

Taking expectations gives

$$E(\hat{\beta}) = \beta + (X'X)^{-1}X'E(\varepsilon) = \beta. \tag{40}$$
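Unbiasedness is a statement about the average of the estimates over repeated samples, which a small simulation can illustrate. The following is a sketch only, assuming NumPy, normally distributed disturbances and hypothetical values for `beta`, `sigma` and the number of draws; normality is a convenience of the simulation and is not required by the argument. The matrix X is held fixed while the disturbances are redrawn.

```python
import numpy as np

rng = np.random.default_rng(5)
T, sigma = 30, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # held fixed across draws
beta = np.array([1.0, 0.5, -0.25])                          # assumed true parameters
A = np.linalg.solve(X.T @ X, X.T)                           # (X'X)^{-1} X'

n_draws = 20000
eps = sigma * rng.normal(size=(n_draws, T))   # disturbances with E(eps) = 0
Y = X @ beta + eps                            # each row is one sample vector y
beta_hats = Y @ A.T                           # each row is one estimate beta_hat
mean_beta_hat = beta_hats.mean(axis=0)        # should lie close to beta
print(mean_beta_hat)
```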

Matrix Traces

If $A = [a_{ij}]$ is a square matrix, then $\text{Trace}(A) = \sum_{i=1}^{n} a_{ii}$. If $A = [a_{ij}]$ is of order $n \times m$ and $B = [b_{k\ell}]$ is of order $m \times n$, then

$$AB = C = [c_{i\ell}] \quad \text{with} \quad c_{i\ell} = \sum_{j=1}^{m} a_{ij}b_{j\ell}$$

and

$$BA = D = [d_{kj}] \quad \text{with} \quad d_{kj} = \sum_{\ell=1}^{n} b_{k\ell}a_{\ell j}.$$

Now,

$$\text{Trace}(AB) = \sum_{i=1}^{n}\sum_{j=1}^{m} a_{ij}b_{ji} \tag{46}$$

and

$$\text{Trace}(BA) = \sum_{j=1}^{m}\sum_{\ell=1}^{n} b_{j\ell}a_{\ell j} = \sum_{\ell=1}^{n}\sum_{j=1}^{m} a_{\ell j}b_{j\ell}.$$

Apart from a change of notation, where $\ell$ replaces $i$, the expressions on the RHS are the same. It follows that $\text{Trace}(AB) = \text{Trace}(BA)$. For three factors $A, B, C$, we have $\text{Trace}(ABC) = \text{Trace}(CAB) = \text{Trace}(BCA)$.
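The identity is easy to confirm on small matrices. This sketch assumes NumPy and hypothetical random matrices of orders 3 × 5 and 5 × 3; the third factor `C` is chosen only so that the three products are conformable.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 3, 5
A = rng.normal(size=(n, m))
B = rng.normal(size=(m, n))

trace_AB = np.trace(A @ B)          # trace of an n x n product
trace_BA = np.trace(B @ A)          # trace of an m x m product
# Explicit double sum over i and j of a_ij * b_ji.
double_sum = sum(A[i, j] * B[j, i] for i in range(n) for j in range(m))

C = rng.normal(size=(n, n))         # conformable third factor for A B C
cyclic = [np.trace(A @ B @ C), np.trace(C @ A @ B), np.trace(B @ C @ A)]
print(trace_AB, trace_BA, double_sum, cyclic)
```

Note that AB and BA have different orders here, yet their traces coincide.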

Estimating the Variance of the Disturbance

It is natural to estimate $\sigma^2 = V(\varepsilon_t)$ via its empirical counterpart. With $e_t = y_t - x_{t.}\hat{\beta}$ in place of $\varepsilon_t$, it follows that $T^{-1}\sum_t e_t^2$ may be used to estimate $\sigma^2$. However, it transpires that this is biased. An unbiased estimate is provided by

$$\hat{\sigma}^2 = \frac{1}{T - k}\sum_{t=1}^{T} e_t^2 = \frac{1}{T - k}(y - X\hat{\beta})'(y - X\hat{\beta}). \tag{48}$$

The unbiasedness of this estimate may be demonstrated by finding the expected value of $(y - X\hat{\beta})'(y - X\hat{\beta}) = y'(I - P)y$. Given that $(I - P)y = (I - P)(X\beta + \varepsilon) = (I - P)\varepsilon$ in consequence of the condition $(I - P)X = 0$, it follows that

$$E\{(y - X\hat{\beta})'(y - X\hat{\beta})\} = E(\varepsilon'\varepsilon) - E(\varepsilon'P\varepsilon).$$
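The two expectations on the right-hand side can be explored by simulation. The sketch below assumes NumPy, normally distributed disturbances and hypothetical values of `T` and `sigma2`. It relies on the standard result that E(ε'Pε) = σ² Trace(P), and here `k` counts every column of X, including the intercept column; both points are assumptions of the illustration rather than statements taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(7)
T, sigma2 = 20, 2.0
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
k = X.shape[1]                     # here k counts every column of X, intercept included

P = X @ np.linalg.solve(X.T @ X, X.T)
trace_P = np.trace(P)              # equals k for a projection matrix of rank k

# Simulate residual sums of squares e'e = eps'(I - P)eps over many samples.
eps = np.sqrt(sigma2) * rng.normal(size=(50000, T))
resid = eps - eps @ P              # rows are ((I - P) eps)' since P is symmetric
rss = np.sum(resid ** 2, axis=1)

biased = rss.mean() / T            # averages near sigma2 * (T - k) / T
unbiased = rss.mean() / (T - k)    # averages near sigma2
print(trace_P, biased, unbiased)
```

Dividing by T understates the variance on average, because the residuals have only T − k degrees of freedom.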

Statistical Properties of the OLS Estimator

The expectation or mean vector of $\hat{\beta}$, and its dispersion matrix as well, may be found from the expression

$$\hat{\beta} = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon. \tag{53}$$

The expectation is

$$E(\hat{\beta}) = \beta + (X'X)^{-1}X'E(\varepsilon) = \beta. \tag{54}$$

Thus, $\hat{\beta}$ is an unbiased estimator. The deviation of $\hat{\beta}$ from its expected value is $\hat{\beta} - E(\hat{\beta}) = (X'X)^{-1}X'\varepsilon$. Therefore, the dispersion matrix, which contains the variances and covariances of the elements of $\hat{\beta}$, is

$$\begin{aligned} D(\hat{\beta}) &= E\big[\{\hat{\beta} - E(\hat{\beta})\}\{\hat{\beta} - E(\hat{\beta})\}'\big] \\ &= (X'X)^{-1}X'E(\varepsilon\varepsilon')X(X'X)^{-1} \\ &= \sigma^2(X'X)^{-1}. \end{aligned} \tag{55}$$
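The simplification of the dispersion matrix can be verified directly, without simulation. This is a minimal sketch assuming NumPy, a hypothetical design matrix and an assumed value of `sigma2`. It evaluates the sandwich expression with E(εε') = σ²I and compares it with the closed form.

```python
import numpy as np

rng = np.random.default_rng(8)
T, sigma2 = 25, 1.5
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                           # beta_hat - E(beta_hat) = A eps
D_eps = sigma2 * np.eye(T)                  # D(eps) = sigma^2 I

D_beta_hat = A @ D_eps @ A.T                # (X'X)^{-1} X' E(eps eps') X (X'X)^{-1}
closed_form = sigma2 * XtX_inv              # sigma^2 (X'X)^{-1}
std_errors = np.sqrt(np.diag(closed_form))  # standard deviations of the elements
print(std_errors)
```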

The Gauss–Markov theorem asserts that $\hat{\beta}$ is the unbiased linear estimator of least dispersion. Thus,

(56) If $\hat{\beta}$ is the OLS estimator of $\beta$, and if $\beta^*$ is any other linear unbiased estimator of $\beta$, then $V(q'\beta^*) \geq V(q'\hat{\beta})$, where $q$ is a constant vector.

Proof. Since $\beta^* = Ay$ is an unbiased estimator, it follows that $E(\beta^*) = AE(y) = AX\beta = \beta$, which implies that $AX = I$. Now write $A = (X'X)^{-1}X' + G$. Then, $AX = I$ implies that $GX = 0$. It follows that

$$\begin{aligned} D(\beta^*) &= AD(y)A' \\ &= \sigma^2\{(X'X)^{-1}X' + G\}\{X(X'X)^{-1} + G'\} \\ &= \sigma^2(X'X)^{-1} + \sigma^2GG' \\ &= D(\hat{\beta}) + \sigma^2GG'. \end{aligned}$$

Therefore, for any constant vector $q$ of order $k$, there is

$$\begin{aligned} V(q'\beta^*) &= q'D(\hat{\beta})q + \sigma^2q'GG'q \\ &\geq q'D(\hat{\beta})q = V(q'\hat{\beta}); \end{aligned}$$

and thus the inequality $V(q'\beta^*) \geq V(q'\hat{\beta})$ is established.
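The construction used in the proof can be reproduced numerically. The sketch below assumes NumPy and hypothetical data; the choice G = C(I − P), with an arbitrary matrix `C`, is one convenient way of obtaining a matrix that satisfies GX = 0 and is an assumption of the example, not the only possibility. It checks that the alternative estimator is unbiased and that its dispersion exceeds that of OLS by σ²GG'.

```python
import numpy as np

rng = np.random.default_rng(9)
T, sigma2 = 12, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
k = X.shape[1]

XtX_inv = np.linalg.inv(X.T @ X)
P = X @ XtX_inv @ X.T

# Any G = C (I - P) satisfies G X = 0; C is an arbitrary (hypothetical) matrix.
C = rng.normal(size=(k, T))
G = C @ (np.eye(T) - P)
A = XtX_inv @ X.T + G                      # alternative linear unbiased estimator A y

D_ols = sigma2 * XtX_inv                   # D(beta_hat)
D_alt = sigma2 * A @ A.T                   # A D(y) A' with D(y) = sigma^2 I
excess = D_alt - D_ols                     # should equal sigma^2 G G'

q = rng.normal(size=k)                     # arbitrary constant vector
var_gap = q @ excess @ q                   # V(q'beta*) - V(q'beta_hat)
print(var_gap)
```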