































CLRM stands for the Classical Linear Regression Model. The CLRM is also known as the standard linear regression model.
Three sets of assumptions define the multiple CLRM -- essentially the same three sets of assumptions that defined the simple CLRM, with one modification to assumption A8.
1. Assumptions respecting the formulation of the population regression equation, or PRE.
Assumption A1
2. Assumptions respecting the statistical properties of the random error term and the dependent variable.
Assumptions A2-A4
3. Assumptions respecting the properties of the sample data.
Assumptions A5-A8
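Assumption A1: The population regression equation, or PRE, takes the form
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + u = \beta_0 + \sum_{j=1}^{k} \beta_j X_j + u \qquad \text{(A1)}$$
or
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} + u_i \qquad \text{(A1)}$$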
The second form of (A1) writes the PRE for a particular observation i.
As in the simple CLRM, the PRE (A1) incorporates three distinct assumptions.
A1.1: Assumption of an Additive Random Error Term.
⇒ The random error term $u_i$ enters the PRE additively:
$$\frac{\partial Y_i}{\partial u_i} = 1 \quad \text{for all } i \ (\forall\, i).$$
A1.2: Assumption of Linearity-in-Parameters or Linearity-in-Coefficients.
⇒ The PRE is linear in the population regression coefficients $\beta_j$ (j = 0, ..., k).
Let $x_i = [\,1 \;\; X_{1i} \;\; X_{2i} \;\; \cdots \;\; X_{ki}\,]$ be the (K×1) vector of regressor values for observation i, where K = k + 1.
$$\frac{\partial Y_i}{\partial \beta_j} = f_j(x_i)$$
where $f_j(x_i)$ contains no unknown parameters, j = 0, ..., k.
A1.3: Assumption of Parameter or Coefficient Constancy.
⇒ The population regression coefficients $\beta_j$ (j = 0, 1, ..., k) are (unknown) constants that do not vary across observations.
$$\beta_{ji} = \beta_j = \text{a constant} \quad \forall\, i \quad (j = 0, 1, \ldots, k).$$
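As a minimal sketch of what A1 asserts, the following Python snippet generates observations from a PRE with k = 2 regressors. The coefficient values, the regressor distributions and the normal error are illustrative assumptions, not part of the original notes.

```python
import random

random.seed(0)

# Hypothetical population coefficients (beta_0, beta_1, beta_2).
# They are the same for every observation, as A1.3 requires.
BETA = [1.0, 2.0, -0.5]

def pre(x1, x2, u):
    """PRE (A1): linear in the coefficients (A1.2), error enters additively (A1.1)."""
    return BETA[0] + BETA[1] * x1 + BETA[2] * x2 + u

# Draw N observations; the distributions below are assumptions made for illustration.
N = 5
sample = []
for _ in range(N):
    x1, x2 = random.uniform(0, 10), random.uniform(0, 10)
    u = random.gauss(0, 1)
    sample.append((pre(x1, x2, u), x1, x2, u))

# Additivity: changing u by 0.25 with the regressors fixed changes Y by 0.25.
y0 = pre(3.0, 4.0, 0.0)
y1 = pre(3.0, 4.0, 0.25)
```

The last two lines mirror A1.1: with the regressors held fixed, a change in the error passes one-for-one into Y.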
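$$E(u \mid x) = 0 \;\Rightarrow\; \mathrm{Cov}(X_j, u) = E(X_j u) = 0, \quad j = 1, 2, \ldots, k \qquad \text{(A2-2)}$$
or
$$E(u_i \mid x_i) = 0 \;\Rightarrow\; \mathrm{Cov}(X_{ji}, u_i) = E(X_{ji} u_i) = 0 \quad \forall\, i, \; j = 1, 2, \ldots, k \qquad \text{(A2-2)}$$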
1. The equality $\mathrm{Cov}(X_{ji}, u_i) = E(X_{ji} u_i)$ in (A2-2) follows from the definition of the covariance between $X_{ji}$ and $u_i$, and from assumption (A2):
$$\begin{aligned}
\mathrm{Cov}(X_{ji}, u_i) &\equiv E\{[X_{ji} - E(X_{ji})][u_i - E(u_i \mid x_i)]\} && \text{by definition} \\
&= E\{[X_{ji} - E(X_{ji})]\,u_i\} && \text{since } E(u_i \mid x_i) = 0 \text{ by A2} \\
&= E[X_{ji} u_i - E(X_{ji})\,u_i] \\
&= E(X_{ji} u_i) - E(X_{ji})\,E(u_i) && \text{since } E(X_{ji}) \text{ is a constant} \\
&= E(X_{ji} u_i) && \text{since } E(u_i) = E(u_i \mid x_i) = 0 \text{ by A2.}
\end{aligned}$$
2. Implication (A2-2) states that the random error term u has zero covariance with, or is uncorrelated with, each of the regressors $X_j$ (j = 1, …, k) in the population. This assumption means that there exists no linear association between u and any of the k regressors $X_j$ (j = 1, …, k).
Note that zero covariance between $X_{ji}$ and $u_i$ implies zero correlation between $X_{ji}$ and $u_i$, since the simple correlation coefficient between $X_{ji}$ and $u_i$, denoted as $\rho(X_{ji}, u_i)$, is defined as
$$\rho(X_{ji}, u_i) \equiv \frac{\mathrm{Cov}(X_{ji}, u_i)}{\sqrt{\mathrm{Var}(X_{ji})\,\mathrm{Var}(u_i)}} = \frac{\mathrm{Cov}(X_{ji}, u_i)}{\mathrm{sd}(X_{ji})\,\mathrm{sd}(u_i)}.$$
From this definition of $\rho(X_{ji}, u_i)$, it is obvious that if $\mathrm{Cov}(X_{ji}, u_i) = 0$, then $\rho(X_{ji}, u_i) = 0$, i.e.,
$$\mathrm{Cov}(X_{ji}, u_i) = 0 \;\Rightarrow\; \rho(X_{ji}, u_i) = 0.$$
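Implication (A2-2) can be checked numerically. The sketch below assumes a single regressor drawn from a uniform distribution and a standard normal error drawn independently of it, so that E(u | X) = 0 holds by construction; these distributions and the helper names are illustrative choices, not part of the notes.

```python
import random

random.seed(1)
N = 20000

# Assumed data-generating process: X ~ Uniform(0, 10) and u ~ Normal(0, 1),
# with u drawn independently of X, so E(u | X) = 0 by construction.
x = [random.uniform(0, 10) for _ in range(N)]
u = [random.gauss(0, 1) for _ in range(N)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance with divisor N - 1."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Sample analogues of Cov(X, u) and rho(X, u) = Cov(X, u) / (sd(X) * sd(u)).
cov_xu = cov(x, u)
corr_xu = cov_xu / (cov(x, x) ** 0.5 * cov(u, u) ** 0.5)
```

In a large sample both `cov_xu` and `corr_xu` should be close to zero, although sampling variation keeps them from being exactly zero.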
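$$E(u \mid x) = 0 \;\Rightarrow\; E(Y \mid x) = f(x) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k = \beta_0 + \sum_{j=1}^{k} \beta_j X_j \qquad \text{(A2-3)}$$
or
$$E(u_i \mid x_i) = 0 \;\Rightarrow\; E(Y_i \mid x_i) = f(x_i) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}$$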
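Proof: Take the conditional expectation of the PRE (A1) for some given set of regressor values $x_i = [\,1 \;\; X_{1i} \;\; X_{2i} \;\; \cdots \;\; X_{ki}\,]$:
$$\begin{aligned}
E(Y_i \mid x_i) &= E(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} \mid x_i) + E(u_i \mid x_i) \\
&= E(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} \mid x_i) && \text{by A2, } E(u_i \mid x_i) = 0 \\
&= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} \\
&= \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} && \text{since } E\Big(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} \,\Big|\, x_i\Big) = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}.
\end{aligned}$$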
Each set of regressor values $x_i = [\,1 \;\; X_{1i} \;\; X_{2i} \;\; \cdots \;\; X_{ki}\,]$ identifies a segment or subset of the relevant population, specifically the segment that has those particular values of the regressors. For each of these population segments or subsets, assumption A2 says that the mean of the random error u is zero.
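The population-segment reading of A2 can be illustrated with a small simulation. The sketch below assumes one regressor that takes only three values, hypothetical coefficients, and a standard normal error drawn independently of the regressor; it then compares the segment means of u and Y with zero and with the population regression function.

```python
import random

random.seed(2)
B0, B1 = 1.0, 2.0      # hypothetical coefficients
levels = [0, 1, 2]     # X takes three values, defining three population segments

u_by_x = {x: [] for x in levels}
y_by_x = {x: [] for x in levels}
for _ in range(30000):
    x = random.choice(levels)
    u = random.gauss(0, 1)            # drawn independently of x, so E(u | x) = 0
    u_by_x[x].append(u)
    y_by_x[x].append(B0 + B1 * x + u)

def mean(v):
    return sum(v) / len(v)

mean_u = {x: mean(u_by_x[x]) for x in levels}   # sample analogue of E(u | x)
mean_y = {x: mean(y_by_x[x]) for x in levels}   # sample analogue of E(Y | x)
prf = {x: B0 + B1 * x for x in levels}          # f(x) = beta_0 + beta_1 * x
```

Within each segment, `mean_u[x]` should be close to zero and `mean_y[x]` close to `prf[x]`, which is the content of (A2-3).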
Assumption A3: The Assumption of Constant Error Variances
The Assumption of Homoskedastic Errors
The Assumption of Homoskedasticity
The conditional variances of the random error terms $u_i$ are identical for all observations (i.e., for all sets of regressor values $x = [\,1 \;\; X_1 \;\; X_2 \;\; \cdots \;\; X_k\,]$) and equal the same finite positive constant $\sigma^2$ for all i:
$$\mathrm{Var}(u \mid x) = E(u^2 \mid x) = \sigma^2 > 0 \qquad \text{(A3)}$$
or
$$\mathrm{Var}(u_i \mid x_i) = E(u_i^2 \mid x_i) = \sigma^2 > 0 \quad \forall\, i \qquad \text{(A3)}$$
where $\sigma^2$ is a finite positive (unknown) constant and $x_i = [\,1 \;\; X_{1i} \;\; X_{2i} \;\; \cdots \;\; X_{ki}\,]$ is the (K×1) vector of regressor values for observation i.
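$$\begin{aligned}
\mathrm{Var}(u_i \mid x_i) &\equiv E\{[u_i - E(u_i \mid x_i)]^2 \mid x_i\} && \text{by definition} \\
&= E\{[u_i - 0]^2 \mid x_i\} && \text{because } E(u_i \mid x_i) = 0 \text{ by assumption A2} \\
&= E(u_i^2 \mid x_i).
\end{aligned}$$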
In other words, the variances of the conditional random error distributions corresponding to each set of regressor values in the relevant population are all equal to the same finite positive constant $\sigma^2$.
$$\mathrm{Var}(u_i \mid x_i) = \mathrm{Var}(u_s \mid x_s) = \sigma^2 > 0 \quad \text{for all } x_s \neq x_i,$$
where $x = x_i = [\,1 \;\; X_{1i} \;\; X_{2i} \;\; \cdots \;\; X_{ki}\,]$ and $x = x_s = [\,1 \;\; X_{1s} \;\; X_{2s} \;\; \cdots \;\; X_{ks}\,]$ are any two distinct sets of regressor values.
The conditional distributions of the population Y values around the PRF have the same constant variance $\sigma^2$ for all sets of regressor values.
$$\mathrm{Var}(Y_i \mid x_i) = \mathrm{Var}(Y_s \mid x_s) = \sigma^2 > 0 \quad \text{for all } x_s \neq x_i.$$
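To make the homoskedasticity assumption concrete, the sketch below compares error variances across three regressor segments under two assumed data-generating processes: one with a constant error standard deviation (consistent with A3) and one whose standard deviation grows with the regressor value (a violation of A3). The value of sigma, the segment values and the function names are hypothetical.

```python
import random

random.seed(3)
SIGMA = 2.0          # hypothetical error standard deviation, so sigma^2 = 4
levels = [1, 2, 3]   # three regressor values, i.e. three population segments

def variance(v):
    """Sample variance with divisor N - 1."""
    m = sum(v) / len(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

def segment_variances(sd_of_x, n_per_segment=20000):
    """Sample variance of u within each regressor segment."""
    return {
        x: variance([random.gauss(0, sd_of_x(x)) for _ in range(n_per_segment)])
        for x in levels
    }

homo = segment_variances(lambda x: SIGMA)        # satisfies A3
hetero = segment_variances(lambda x: SIGMA * x)  # violates A3: variance grows with x
```

Under the first process every entry of `homo` should be near 4; under the second the entries of `hetero` rise sharply across segments.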
Assumption A4: The Assumption of Zero Error Covariances
The Assumption of Nonautoregressive Errors
The Assumption of Nonautocorrelated Errors
Consider any pair of distinct random error terms $u_i$ and $u_s$ (i ≠ s) corresponding to two different sets (or vectors) of regressor values $x_i \neq x_s$. This assumption states that $u_i$ and $u_s$ have zero covariance:
$$\mathrm{Cov}(u_i, u_s \mid x_i, x_s) = E(u_i u_s \mid x_i, x_s) = 0 \quad \forall\, i \neq s. \qquad \text{(A4)}$$
$$\begin{aligned}
\mathrm{Cov}(u_i, u_s \mid x_i, x_s) &\equiv E\{[u_i - E(u_i \mid x_i)][u_s - E(u_s \mid x_s)] \mid x_i, x_s\} && \text{by definition} \\
&= E(u_i u_s \mid x_i, x_s) && \text{since } E(u_i \mid x_i) = E(u_s \mid x_s) = 0 \text{ by A2.}
\end{aligned}$$
$$\mathrm{Cov}(u_i, u_s \mid x_i, x_s) = 0 \;\; \forall\, i \neq s \;\Rightarrow\; \mathrm{Cov}(Y_i, Y_s \mid x_i, x_s) = 0 \;\; \forall\, i \neq s.$$
Assumption A5: Random Sampling or Independent Random Sampling
The sample data consist of N randomly selected observations on the regressand Y and the regressors $X_j$ (j = 1, ..., k), the observable variables in the PRE described by A1. These N randomly selected observations can be written as N row vectors:
$$\begin{aligned}
\text{Sample data} &\equiv [(Y_1, x_1), (Y_2, x_2), \ldots, (Y_N, x_N)] \\
&\equiv (Y_i, 1, X_{1i}, X_{2i}, \ldots, X_{ki}), \quad i = 1, \ldots, N \\
&\equiv (Y_i, x_i), \quad i = 1, \ldots, N.
\end{aligned}$$
The assumption of random sampling implies that the sample observations are statistically independent.
1. It thus means that the error terms $u_i$ and $u_s$ are statistically independent, and hence have zero covariance, for any two observations i and s.
$$\text{Random sampling} \;\Rightarrow\; \mathrm{Cov}(u_i, u_s \mid x_i, x_s) = \mathrm{Cov}(u_i, u_s) = 0 \quad \forall\, i \neq s.$$
2. It also means that the dependent variable values $Y_i$ and $Y_s$ are statistically independent, and hence have zero covariance, for any two observations i and s.
$$\text{Random sampling} \;\Rightarrow\; \mathrm{Cov}(Y_i, Y_s \mid x_i, x_s) = \mathrm{Cov}(Y_i, Y_s) = 0 \quad \forall\, i \neq s.$$
The assumption of random sampling is therefore sufficient for assumption A4 of zero covariance between observations, but is stronger than necessary.
The random sampling assumption is often appropriate for cross-sectional regression models, but is hardly ever appropriate for time-series regression models.
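The contrast between cross-sectional and time-series data can be sketched numerically. The snippet below compares independently drawn errors with errors that follow an AR(1) process; the AR(1) form and the value RHO = 0.8 are assumptions chosen for illustration, since the notes do not specify any particular time-series pattern.

```python
import random

random.seed(4)
N = 20000

def cov(a, b):
    """Sample covariance with divisor N - 1."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Independent draws, as implied by random sampling (A5).
u_iid = [random.gauss(0, 1) for _ in range(N)]

# Assumed AR(1) errors: u_t = RHO * u_{t-1} + e_t, with e_t ~ Normal(0, 1).
RHO = 0.8
u_ar = [random.gauss(0, 1)]
for _ in range(N - 1):
    u_ar.append(RHO * u_ar[-1] + random.gauss(0, 1))

# Sample covariance between adjacent errors u_i and u_{i+1}.
cov_iid = cov(u_iid[:-1], u_iid[1:])
cov_ar = cov(u_ar[:-1], u_ar[1:])
```

The independent errors give a covariance near zero, consistent with A4, while the autocorrelated errors give a clearly positive covariance between neighbouring observations.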
Assumption A6: The number of sample observations N is greater than the number of unknown parameters K:
number of sample observations > number of unknown parameters
Assumption A7: Nonconstant Regressors
The sample values $X_{ji}$ of each regressor $X_j$ (j = 1, …, k) in a given sample (and hence in the population) are not all equal to a constant:
$$X_{ji} \neq c_j \quad \forall\, i = 1, \ldots, N, \quad \text{where the } c_j \text{ are constants } (j = 1, \ldots, k). \qquad \text{(A7)}$$
The sample variances of all k non-constant regressors $X_j$ (j = 1, ..., k) must be finite positive numbers for any sample size N; i.e.,
$$\text{sample variance of } X_{ji} \equiv \mathrm{Var}(X_{ji}) = \frac{\sum_{i} (X_{ji} - \bar{X}_j)^2}{N - 1} = s_{X_j}^2 > 0,$$
where $s_{X_j}^2 > 0$ are finite positive numbers for all j = 1, ..., k.
Unless this assumption is satisfied, it is not possible to compute from the sample data an estimate of the effect on the regressand Y of changes in the value of the regressor $X_j$. In other words, to calculate the effect of changes in $X_j$ on Y, the sample values $X_{ji}$ of the regressor $X_j$ must vary across observations in any given sample.
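The need for variation in a regressor can be seen in the simplest case. The sketch below uses the one-regressor OLS slope formula (an illustrative simplification of the multiple regression setting of these notes) and made-up data; the function names are hypothetical. When the regressor is constant, its sum of squared deviations is zero and the slope cannot be computed.

```python
def sample_variance(x):
    """Sample variance with divisor N - 1."""
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

def simple_ols_slope(x, y):
    """Slope of a one-regressor OLS fit; returns None when x does not vary."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return None  # A7 fails: the slope formula would divide by zero
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Made-up sample values for illustration.
y = [2.0, 4.1, 5.9, 8.2]
x_varying = [1.0, 2.0, 3.0, 4.0]
x_constant = [5.0, 5.0, 5.0, 5.0]

slope_ok = simple_ols_slope(x_varying, y)    # defined: x varies across observations
slope_bad = simple_ols_slope(x_constant, y)  # None: x is constant in the sample
```

Returning `None` rather than raising an error is a design choice made here so that both cases can be inspected side by side.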
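Consider the following multiple linear regression model:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \quad (i = 1, \ldots, N). \qquad (2)$$
Suppose that the sample values of the regressors $X_{1i}$ and $X_{2i}$ satisfy the following linear equality for all sample observations:
$$X_{1i} = 3X_{2i} \quad \text{or} \quad X_{1i} - 3X_{2i} = 0 \quad \forall\, i = 1, \ldots, N. \qquad (3)$$
The exact linear relationship (3) can be written in the general form (1). For model (2), the general form is
$$\lambda_0 + \lambda_1 X_{1i} + \lambda_2 X_{2i} = 0 \quad \forall\, i = 1, 2, \ldots, N.$$
Setting $\lambda_0 = 0$, $\lambda_1 = 1$ and $\lambda_2 = -3$ gives
$$X_{1i} - 3X_{2i} = 0 \quad \forall\, i = 1, 2, \ldots, N,$$
which is identical to equation (3) above.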
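Substituting $X_{1i} = 3X_{2i}$ from (3) into model (2) gives
$$\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \\
&= \beta_0 + \beta_1 (3X_{2i}) + \beta_2 X_{2i} + u_i \\
&= \beta_0 + 3\beta_1 X_{2i} + \beta_2 X_{2i} + u_i \\
&= \beta_0 + (3\beta_1 + \beta_2) X_{2i} + u_i \\
&= \beta_0 + \alpha_2 X_{2i} + u_i \quad \text{where } \alpha_2 = 3\beta_1 + \beta_2. && \text{(4a)}
\end{aligned}$$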
♦ It is possible to estimate from the sample data the regression coefficients $\beta_0$ and $\alpha_2$.
♦ But from the estimate of $\alpha_2$ it is not possible to compute estimates of the coefficients $\beta_1$ and $\beta_2$. Reason: The equation
$$\alpha_2 = 3\beta_1 + \beta_2$$
is one equation containing two unknowns, namely $\beta_1$ and $\beta_2$.
Result: It is not possible to compute from the sample data estimates of both $\beta_1$ and $\beta_2$, the separate linear effects of $X_{1i}$ and $X_{2i}$ on the regressand $Y_i$.
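The identification problem can be verified directly. The sketch below uses a made-up sample in which X1i = 3 X2i holds exactly, shows that two different pairs (β1, β2) with the same value of 3β1 + β2 produce identical regression-function values at every sample point, and confirms that the cross-product matrix of the regressors is singular. The sample values and coefficient pairs are hypothetical.

```python
# Hypothetical sample in which X1i = 3 * X2i for every observation, as in equation (3).
x2 = [1.0, 2.0, 4.0, 7.0]
x1 = [3.0 * v for v in x2]

def fitted(b0, b1, b2):
    """Regression function of model (2), without the error term, at the sample points."""
    return [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

# Two different coefficient pairs with the same alpha_2 = 3*beta_1 + beta_2 = 7.
fit_a = fitted(1.0, 2.0, 1.0)   # beta_1 = 2, beta_2 = 1
fit_b = fitted(1.0, 1.0, 4.0)   # beta_1 = 1, beta_2 = 4

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Cross-product matrix X'X for the regressor columns [1, X1, X2].
cols = [[1.0] * len(x2), x1, x2]
xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
det_xtx = det3(xtx)
```

Because `fit_a` and `fit_b` coincide, no sample can distinguish the two coefficient pairs, and the zero determinant means the usual least-squares normal equations have no unique solution.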
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \qquad (5)$$
$X_1$, $X_2$ and $X_3$ are three distinct independent or explanatory variables that determine the population values of Y.
Because regression equation (5) contains more than one regressor, it is called a multiple linear regression model.
$$E(Y_i \mid x_i) = E(Y_i \mid X_{1i}, X_{2i}, X_{3i}) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} \qquad (6)$$
where $x_i$ is the 1×4 row vector of regressors: $x_i = (\,1 \;\; X_{1i} \;\; X_{2i} \;\; X_{3i}\,)$.
Interpreting the Slope Coefficients in Multiple Regression Model (5)
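$$\frac{\partial E(Y_i \mid X_{1i}, X_{2i}, X_{3i})}{\partial X_{ji}} = \frac{\partial E(Y_i \mid x_i)}{\partial X_{ji}} = \beta_j, \quad j = 1, 2, 3 \qquad (7)$$
For example, for j = 1 in multiple regression model (5):
$$\frac{\partial E(Y_i \mid X_{1i}, X_{2i}, X_{3i})}{\partial X_{1i}} = \frac{\partial (\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i})}{\partial X_{1i}} = \beta_1.$$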
Example: In multiple regression model (5)
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \qquad (5)$$
with population regression function
$$E(Y_i \mid X_{1i}, X_{2i}, X_{3i}) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} \qquad (6)$$
the slope coefficients $\beta_1$, $\beta_2$ and $\beta_3$ are interpreted as follows:
$\beta_1$ = the partial marginal effect of $X_1$ on the conditional mean of Y holding constant the values of the other regressors $X_2$ and $X_3$.
$\beta_2$ = the partial marginal effect of $X_2$ on the conditional mean of Y holding constant the values of the other regressors $X_1$ and $X_3$.
$\beta_3$ = the partial marginal effect of $X_3$ on the conditional mean of Y holding constant the values of the other regressors $X_1$ and $X_2$.
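The "holding constant" interpretation can be checked with a short calculation. The sketch below assumes hypothetical coefficient values for model (5) and an arbitrary evaluation point; it raises one regressor at a time by one unit and records the change in the conditional mean given by the population regression function (6).

```python
# Hypothetical coefficient values for model (5): beta_0, beta_1, beta_2, beta_3.
BETA = (10.0, 2.0, -3.0, 0.5)

def prf(x1, x2, x3):
    """Population regression function (6): E(Y | X1, X2, X3)."""
    b0, b1, b2, b3 = BETA
    return b0 + b1 * x1 + b2 * x2 + b3 * x3

def partial_effect(j, point, step=1.0):
    """Change in the conditional mean per unit rise in X_j, other regressors held fixed."""
    moved = list(point)
    moved[j - 1] += step   # only regressor j changes
    return (prf(*moved) - prf(*point)) / step

point = (4.0, 1.0, 6.0)    # arbitrary values of (X1, X2, X3)
effects = [partial_effect(j, point) for j in (1, 2, 3)]
```

Because the PRF is linear in the regressors, each entry of `effects` equals the corresponding slope coefficient regardless of the evaluation point or step size chosen.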