Section 8 Models for Pooled and Panel Data
Data definitions
- Pooled data occur when we have a “time series of cross sections,” but the observations in each cross section do not necessarily refer to the same unit.
- Panel data refer to samples of the same cross-sectional units observed at multiple points in time. A panel-data observation has two dimensions: $X_{it}$, where i runs from 1 to n and denotes the cross-sectional unit and t runs from 1 to T and denotes the time of the observation.
  o A balanced panel has every unit from 1 to n observed in every period from 1 to T.
  o An unbalanced panel has missing data.
  o Panel data commands in Stata start with xt, as in xtreg. Be careful about models and default assumptions in these commands.
Regression with pooled cross sections
- The crucial question with pooled cross sections from different time periods is “Does the same model apply in each time period?”
  o Has inflation changed the real values of some variables, requiring adjustment?
  o Was the business cycle at different phases in different periods?
  o Were there changes in technology or regulation that would cause behavior to be different?
  o Are there other factors that might cause coefficients in one period to differ from those in others?
- This is a special case of the Assumption #0 question: Do all observations come from the same model?
- Time dummy variables
  o A very general way of modeling (and testing for) differences in intercept terms or slope coefficients between periods is the use of time dummies.
  o Including time dummies alone (for all but one period, which is omitted to avoid the dummy-variable trap) allows the intercept to have a different value in each period.
    - The estimated intercept term in the model with time dummies is the estimated intercept in the period with the omitted dummy.
    - The estimated coefficient on an included time dummy corresponding to a particular period is an estimate of the difference between the intercept in that period and the intercept in the omitted period.
    - A joint test of whether all the dummies’ coefficients are zero tests the hypothesis that the intercept does not vary at all over periods.
    - The simple test of whether a particular dummy’s coefficient is zero tests the hypothesis that the intercept in that dummy’s period does not differ from that of the omitted period.
  o Including interactions between time dummies and another variable Z allows the coefficient on (effect of) Z to vary across periods.
    - As before, the estimated coefficient on non-interacted Z is the estimated effect in the period for which the dummy is omitted.
    - The estimated coefficient on the interaction between Z and the dummy for period t is the estimated difference between the effect of Z in period t and the effect in the omitted period.
    - The joint test of the interaction terms tests the hypothesis that the coefficients (effects) of Z are the same in all periods.
    - The simple test of the interaction term for the period t dummy tests whether the effect of Z in period t differs from the effect in the omitted period.
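The interpretation of the intercept and the dummy coefficient can be checked on a tiny example. The sketch below uses made-up numbers (not from any dataset in these notes) and a model with only a constant and one period-2 dummy, so the simple-regression OLS formulas apply.

```python
# Hypothetical pooled sample: outcomes observed in two periods.
y_period1 = [10.0, 12.0, 11.0]   # period 1 is the omitted (base) period
y_period2 = [13.0, 15.0, 14.0]

y = y_period1 + y_period2
d2 = [0.0] * len(y_period1) + [1.0] * len(y_period2)  # period-2 dummy

def mean(values):
    return sum(values) / len(values)

# Simple-regression OLS: slope = sum of cross-deviations / sum of squared deviations.
d_bar, y_bar = mean(d2), mean(y)
slope = (sum((d - d_bar) * (v - y_bar) for d, v in zip(d2, y))
         / sum((d - d_bar) ** 2 for d in d2))
intercept = y_bar - slope * d_bar

print(intercept)  # estimated intercept in the omitted period
print(slope)      # estimated difference between period-2 and period-1 intercepts
```

With only dummies on the right-hand side, the intercept equals the period-1 sample mean and the dummy coefficient equals the difference between the two period means.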
- Using aggregate variables that vary only over time (not across units)
  o Suppose that we think that the variation in either the intercept or a slope coefficient across periods is due to changes in one particular variable (the aggregate unemployment rate, for example).
  o In this case, we can include that variable (for intercept effects) and perhaps interactions of that variable with some regressor Z (to capture effects of unemployment on the marginal effect of Z).
  o Interpretation of these coefficients is standard for continuous interaction variables.
- Limitations on variables that vary only over time
  o If we include time dummies, we cannot include any other variables that vary only over time. Any variable that varies only over time can be expressed as a linear function of the dummies.
    - If there are two periods with unemployment = 4 in the first period and 6 in the second, then $U = 4 + 2 D_2$, where $D_2$ is a dummy equal to one in the second period. Thus, including U, $D_2$, and a constant will result in perfect multicollinearity.
    - The same thing happens with more periods and/or more variables like U that vary only over time (and not across units).
  o If there are T time periods represented in the data, there can be at most T – 1 only-time-varying variables in the regression (assuming no dummies).
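The two-period unemployment example can be verified directly. This is a minimal sketch using the values from the text (U = 4 and 6) and an assumed three observations per period; it checks that U is an exact linear function of the constant and the dummy, and that the resulting X'X matrix is singular.

```python
# Two periods, three (hypothetical) observations in each; U varies only over time.
const = [1, 1, 1, 1, 1, 1]
d2 = [0, 0, 0, 1, 1, 1]        # dummy for the second period
unemp = [4, 4, 4, 6, 6, 6]     # U = 4 in period 1, 6 in period 2

# U is an exact linear combination of the constant and the dummy: U = 4 + 2*D2.
is_exact = all(u == 4 * c + 2 * d for u, c, d in zip(unemp, const, d2))

# Gram matrix X'X for X = [const, D2, U]; a zero determinant means it cannot be inverted.
cols = [const, d2, unemp]
g = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
det = (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
       - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
       + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))

print(is_exact, det)
```

Because the data are integers, the determinant is exactly zero, which is the algebraic form of perfect multicollinearity.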
- Note that it is critically important that X vary over time within units (and in a different way across units), otherwise the regressor in the differenced regression is zero (or constant).
- This estimator is nice when T = 2, but what if T > 2?
  o We can generalize it as the “fixed-effects” estimator.
Unit (entity) fixed effects
With T > 2, we could do T – 1 differences across pairs of time periods, allowing n ( T – 1) observations in the differenced sample (and n ( T – 1) – k degrees of freedom because there is no constant term). Alternatively, we can get a similar (identical if T = 2) regression in two other ways.
- Regression with unit dummy variables.
  o Let $D_i = 1$ for all observations on unit i and 0 otherwise, for i = 2, 3, …, n. There are n – 1 such dummies.
  o We can run the unit fixed-effects regression
      $Y_{it} = \beta_0 + \alpha_2 D_2 + \dots + \alpha_n D_n + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + u_{it}$
    (generalizing S&W to more than one X).
  o Note that although we have nT observations in this regression, we end up with nT – (n – 1) – k – 1 = n(T – 1) – k degrees of freedom, just as in the differenced case.
  o For T = 2, one can show that the differenced and dummy-variable regressions are formally equivalent: the estimators for β are the same and have the same distributions, standard errors, etc.
  o Any X that has no variation across time within unit will be a linear function of the dummies, so we have perfect multicollinearity. We can’t identify the effects of such non-time-varying variables in a fixed-effects model.
- De-meaned regression
  o Another equivalent way of estimating this model is to subtract the unit mean from each observation. Let
      $\bar{X}_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}$ and $\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}$,
    and let $\tilde{X}_{it} = X_{it} - \bar{X}_i$ and $\tilde{Y}_{it} = Y_{it} - \bar{Y}_i$. Regressing $\tilde{Y}$ on $\tilde{X}$ removes the unit effects, which are constant within each unit.
  o However, we really don’t have nT independent observations because $\sum_{t=1}^{T} \tilde{X}_{it} = 0$ for every unit, so $\tilde{X}_{iT} = -\sum_{t=1}^{T-1} \tilde{X}_{it}$ (and the same for Y and u). In order to correct for this problem, we can either drop one time period to eliminate the redundant observations or adjust the degrees of freedom.
  o Most statistical packages actually do fixed-effects regression using the de-meaning procedure because it takes less time to calculate the means and the tilde variables and invert a matrix of order k + 1 than to invert a matrix of order k + n, which would be necessary to estimate the model with n – 1 unit dummy variables. (Option fe in Stata xtreg, which is not the default.)
  o This estimator is sometimes called the within-unit estimator because it estimates the coefficients strictly based on variation (over time) within cross-sectional units. Corresponding to this is a between-unit estimator that is the regression of $\bar{Y}_i$ on $\bar{X}_i$ for the n observations of the sample. (Option be in Stata xtreg.)
  o Again, any variable that doesn’t vary over time within each unit will have zero values for the deviation from the mean, so the regression breaks down and we can’t identify the effect.
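The contrast between the within-unit and between-unit estimators can be seen in a small made-up panel. The sketch below assumes a single regressor, a true slope of 2, no idiosyncratic error, and unit effects that are deliberately negatively related to the level of X; none of these numbers come from the notes.

```python
def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a simple regression of y on x with a constant."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    den = sum((a - x_bar) ** 2 for a in x)
    return num / den

# Hypothetical balanced panel: 3 units, T = 3, Y_it = alpha_i + 2 * X_it.
x_by_unit = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
alpha = [10.0, 5.0, 0.0]
y_by_unit = [[a + 2.0 * x for x in xs] for a, xs in zip(alpha, x_by_unit)]

# Pooled OLS ignores the unit effects.
pooled_x = [x for xs in x_by_unit for x in xs]
pooled_y = [y for ys in y_by_unit for y in ys]
pooled = ols_slope(pooled_x, pooled_y)

# Within (fixed-effects) estimator: subtract each unit's own mean first.
within_x = [x - mean(xs) for xs in x_by_unit for x in xs]
within_y = [y - mean(ys) for ys in y_by_unit for y in ys]
within = ols_slope(within_x, within_y)

# Between estimator: regress the unit means of Y on the unit means of X.
between = ols_slope([mean(xs) for xs in x_by_unit],
                    [mean(ys) for ys in y_by_unit])

print(pooled, within, between)
```

In this constructed example the within estimator recovers the slope of 2, while the pooled and between slopes are negative because they pick up the correlation between the unit effects and X.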
- It is worth thinking about where the most meaningful variation in your sample occurs.
  o Fixed-effects regression uses only changes over time within units in calculating the relationship among the variables.
  o If the meaningful variation in your sample is mostly between units (such as panel regressions on a sample of colleges, where the differences between colleges are much more important than the differences within colleges over time), then fixed-effects regression is unlikely to be effective.
    - If the variation is all between units, then we have perfect multicollinearity and we can’t estimate the effects at all.
    - If most of the variation is between units, then we will have high (but not perfect) multicollinearity and our estimates will be very imprecise.
  o Because of this, fixed-effects regression sets a very high bar: if your effects are significant and meaningful in fixed effects you can probably attach considerable confidence to them.
- Is the fixed-effects model identical to the first-difference model?
  o Not if T > 2.
  o Although the data series span the same space, the assumptions made about the error terms are different.
  o If there is high correlation between $u_{i,t}$ and $u_{i,t-1}$, then the first-difference estimator is often better because the differencing eliminates this high correlation in a way that subtracting the mean does not.
  o If there is no strong correlation between adjacent (in time) observations, then the fixed-effects estimator is often better.
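The claim that the two estimators coincide for T = 2 but not for T > 2 can be checked numerically. This sketch uses a made-up two-unit panel with one regressor; the helper names fd_slope and fe_slope are illustrative, and the first-difference regression is run without a constant.

```python
# Hypothetical panel: each entry is (x over time, y over time) for one unit.
panel = [
    ([1.0, 2.0, 4.0], [2.0, 3.0, 9.0]),
    ([0.0, 3.0, 4.0], [1.0, 5.0, 6.0]),
]

def fd_slope(data, periods):
    """First-difference estimator (no constant) using the first `periods` periods."""
    num = den = 0.0
    for x, y in data:
        for t in range(1, periods):
            dx, dy = x[t] - x[t - 1], y[t] - y[t - 1]
            num += dx * dy
            den += dx * dx
    return num / den

def fe_slope(data, periods):
    """Fixed-effects (de-meaned) estimator using the first `periods` periods."""
    num = den = 0.0
    for x, y in data:
        xs, ys = x[:periods], y[:periods]
        x_bar, y_bar = sum(xs) / periods, sum(ys) / periods
        for xv, yv in zip(xs, ys):
            num += (xv - x_bar) * (yv - y_bar)
            den += (xv - x_bar) ** 2
    return num / den

print(fd_slope(panel, 2), fe_slope(panel, 2))  # same estimate when T = 2
print(fd_slope(panel, 3), fe_slope(panel, 3))  # different estimates when T = 3
```

With T = 2 each de-meaned observation is just plus or minus half the first difference, which is why the two slopes agree exactly there.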
- The fixed-effects estimator is a straightforward application of OLS, and has the usual properties of the OLS estimator.
  o Stock and Watson list the assumptions of fixed-effects estimation in the box on page 365:
    - The expectation of u conditional on X and α is zero.
    - IID draws.
Estimate with Y and X expressed as deviations from time means.
- Any variable that varies only across time, and not across units, will be collinear with the dummy variables (or zero when de-meaned) and its effect cannot be estimated.
- We can also combine both unit and time fixed effects.
  o Either LSDV (least-squares dummy variables) with both unit and time dummies, or
  o Demeaning the data both with respect to time and with respect to units.
    - To do this, we calculate $\tilde{Y}_{it} = (Y_{it} - \bar{Y}_i) - (\bar{Y}_t - \bar{Y})$, where $\bar{Y}$ is the overall mean across both units and time, and regress it on a similarly transformed X.
    - This is sometimes called the “differences-in-differences” estimator because it excludes the effects of changes that are strictly over time (taken out with time dummies or demeaning) and the effects of changes that are strictly across units (taken out with unit dummies or demeaning). This leaves only differences across units in how the variables change over time to estimate β.
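The two-way transformation can be sketched as follows. The panel is made up and balanced (3 units, 3 periods), with assumed unit effects alpha, time effects gamma, and a true slope of 2; the function name two_way_demean is illustrative.

```python
# Hypothetical balanced panel generated as Y_it = alpha_i + gamma_t + 2 * X_it.
n, T = 3, 3
x = [[1.0, 2.0, 4.0], [2.0, 2.0, 3.0], [0.0, 3.0, 3.0]]
alpha = [5.0, 0.0, -3.0]     # unit effects (assumed values)
gamma = [0.0, 10.0, 20.0]    # time effects (assumed values)
y = [[alpha[i] + gamma[t] + 2.0 * x[i][t] for t in range(T)] for i in range(n)]

def two_way_demean(z):
    """Return (z_it - unit mean) - (time mean - overall mean) for a balanced panel."""
    unit_mean = [sum(row) / T for row in z]
    time_mean = [sum(z[i][t] for i in range(n)) / n for t in range(T)]
    grand_mean = sum(unit_mean) / n
    return [[(z[i][t] - unit_mean[i]) - (time_mean[t] - grand_mean)
             for t in range(T)] for i in range(n)]

x_tilde = two_way_demean(x)
y_tilde = two_way_demean(y)

# OLS through the origin on the transformed data (their means are zero by construction).
num = sum(x_tilde[i][t] * y_tilde[i][t] for i in range(n) for t in range(T))
den = sum(x_tilde[i][t] ** 2 for i in range(n) for t in range(T))
beta_hat = num / den

print(beta_hat)
```

Both sets of effects drop out of the transformed data, so the slope is estimated only from how X changes over time differently across units.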
Random-effects models
Sometimes rather than having a different fixed constant term for each unit, we want to think of each unit having a common error term drawn randomly from some distribution. In other words, the α i terms are thought of as random variables drawn, for each i , from some common distribution rather than constants.
- The model is $Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + (\alpha_i + u_{it})$, where we have grouped α and u together as a composite error term.
  o This is sometimes called an error-components model because the error term has two components:
    - One that is the same across time within units, and
    - An “idiosyncratic” error term for each unit/period.
- In order for OLS to be consistent for the random-effects model, we must be able to assume that α is uncorrelated with X : The unobserved, time-invariant characteristics of units that influence the dependent variable must be uncorrelated with the measured variables whose effects we want to estimate.
- The composite error terms $v_{it} = \alpha_i + u_{it}$ of observations within the same unit (group) are correlated: if α and u are independent of one another and u is uncorrelated over time, then for t ≠ s
    $\operatorname{cov}(v_{it}, v_{is}) = \sigma_\alpha^2$, so that $\operatorname{corr}(v_{it}, v_{is}) = \dfrac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_u^2}$.
  o The random-effects estimator is a feasible GLS estimator that estimates this covariance based on correlation between same-unit residuals, then calculates an estimator that is BLUE conditional on this calculated covariance matrix.
  o Computationally, this involves “quasi-de-meaning” the data by calculating $\tilde{Y}_{it} = Y_{it} - \lambda \bar{Y}_i$, where
      $\lambda = 1 - \left( \dfrac{\sigma_u^2}{\sigma_u^2 + T \sigma_\alpha^2} \right)^{1/2}$.
    As T → ∞, λ → 1 and the random-effects estimator becomes equivalent to the fixed-effects model.
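The behavior of the quasi-de-meaning weight can be tabulated directly from its formula. In this sketch the variance values are arbitrary illustrations and the function name quasi_demeaning_weight is hypothetical.

```python
import math

def quasi_demeaning_weight(sigma_u2, sigma_alpha2, T):
    """lambda = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_alpha^2))."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_alpha2))

# With equal variance components, lambda rises toward 1 as T grows.
for T in (2, 10, 100, 10000):
    print(T, round(quasi_demeaning_weight(1.0, 1.0, T), 4))

# With no unit-level variance, lambda is 0 and no de-meaning is applied.
print(quasi_demeaning_weight(1.0, 0.0, 5))
```

At one extreme (λ = 0) the transformation leaves the data unchanged, which is pooled OLS; at the other (λ = 1) it is full de-meaning, which is the fixed-effects transformation.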
- Random effects or fixed effects?
  o Fixed-effects models have the advantage of not requiring cov(X, α) = 0, which is often difficult to justify.
  o However, fixed-effects models cannot identify the effects of any variables that vary only across units (and have difficulty identifying effects if most of the meaningful variation is across units).
  o Can do a Hausman test to examine whether the random-effects model is appropriate. (It is a nested sub-model of the fixed-effects model.)
    - The Hausman test rejects the random-effects model if the estimates are sufficiently different and the fixed-effects estimators are sufficiently precise.
    - Use random effects unless the Hausman test rejects it.
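For a single coefficient, the Hausman statistic is commonly written as the squared difference between the two estimates divided by the difference in their estimated variances, and compared with a chi-squared critical value with one degree of freedom. The sketch below uses hypothetical estimates and variances purely for illustration; hausman_single is not a library function.

```python
def hausman_single(b_fe, b_re, var_fe, var_re):
    """Single-coefficient Hausman statistic: (b_FE - b_RE)^2 / (Var_FE - Var_RE)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("variance difference must be positive for this simple form")
    return (b_fe - b_re) ** 2 / diff_var

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi-squared with 1 degree of freedom

# Hypothetical estimates: same coefficient gap, precise vs. imprecise fixed effects.
h_precise = hausman_single(-0.5, -0.2, var_fe=0.02, var_re=0.01)
h_imprecise = hausman_single(-0.5, -0.2, var_fe=0.20, var_re=0.01)

print(h_precise, h_precise > CHI2_1_CRIT_5PCT)      # large statistic: reject random effects
print(h_imprecise, h_imprecise > CHI2_1_CRIT_5PCT)  # small statistic: do not reject
```

The same gap between the estimates leads to rejection only when the fixed-effects estimate is precise enough, which matches the two conditions listed above.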
Class demonstration
- Dataset: S&W’s Seatbelts.dta
  o Show dataset
  o Define as panel: xtset fips year
  o Discuss missing values problem
  o Note two state identifiers, one alpha and one numeric
- Following S&W’s Empirical Exercise E10.
  o Generate lnincome variable
  o Discuss expected results of regressing fatalityrate on sb_usage speed65 speed drinkage21 ba08 lnincome age
- OLS regression
  o sb_usage has the “wrong” sign
    - Authors argue that this is endogeneity and might be corrected partially by including state fixed effects.
  o Other effects are plausible
  o Send to outreg2 using fatal, word ctitle(OLS)
- FE regression
  o Now sb_usage has the expected sign
  o Other variables decline in coefficient magnitude
  o With and without the “robust” option, which gives clustered standard errors in this case.