Panel data methods for microeconometrics using Stata
A. Colin Cameron
Univ. of California - Davis
Prepared for West Coast Stata Users' Group Meeting
Based on A. Colin Cameron and Pravin K. Trivedi,
Microeconometrics using Stata, Stata Press, forthcoming.
October 25, 2007

1. Introduction

Panel data are repeated measures on individuals ($i$) over time ($t$).

Regress $y_{it}$ on $x_{it}$ for $i = 1, ..., N$ and $t = 1, ..., T$.

Complications compared to cross-section data:
1. Inference: correct (inflate) standard errors. This is because each additional year of data is not independent of previous years.
2. Modelling: richer models and estimation methods are possible with repeated measures. Fixed effects and dynamic models are examples.
3. Methodology: different areas of applied statistics may apply different methods to the same panel data set.

Outline

1. Introduction
2. Linear models overview
3. Example: wages
4. Standard linear panel estimators
5. Linear panel IV estimators
6. Linear dynamic models
7. Long panels
8. Random coefficient models
9. Clustered data
10. Nonlinear panel models overview
11. Nonlinear panel models estimators
12. Conclusions

2.1 Some basic considerations

1. Regular time intervals assumed.
2. Unbalanced panel okay (xt commands handle unbalanced data). [Should then rule out selection/attrition bias.]
3. Short panel assumed, with $T$ small and $N \to \infty$. [Versus long panels, with $T \to \infty$ and $N$ small or $N \to \infty$.]
4. Errors are correlated. [For short panel: correlated over $t$ for given $i$, but not over $i$.]
5. Parameters may vary over individuals or time.
   Intercept: individual-specific effects model (fixed or random effects).
   Slopes: pooling and random coefficients models.
6. Regressors: time-invariant, individual-invariant, or vary over both.
7. Prediction: ignored. [Not always possible even if marginal effects computed.]
8. Dynamic models: possible. [Usually static models are estimated.]

2.2 Fixed effects versus random effects

Individual-specific effects model: $y_{it} = x_{it}'\beta + (\alpha_i + \varepsilon_{it})$.

Fixed effects (FE):
- $\alpha_i$ is possibly correlated with $x_{it}$
- regressor $x_{it}$ can be endogenous (though only with respect to a time-invariant component of the error)
- can consistently estimate $\beta$ for time-varying $x_{it}$ (mean-differencing or first-differencing eliminates $\alpha_i$)
- cannot consistently estimate $\alpha_i$ if short panel
- prediction is not possible
- $\beta = \partial E[y_{it} \mid \alpha_i, x_{it}] / \partial x_{it}$
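The claim that mean-differencing or first-differencing eliminates $\alpha_i$ can be checked on a toy panel. The sketch below is illustrative only: the data are made up, there is no idiosyncratic error, and all function names are hypothetical helpers rather than anything from Stata. It builds $y_{it} = 2 x_{it} + \alpha_i$ with $\alpha_i$ rising with $x_{it}$, then compares pooled OLS with the two transformed regressions.

```python
# Toy balanced panel (made-up numbers): N = 3 individuals, T = 3 periods.
# y_it = BETA * x_it + alpha_i, no idiosyncratic error, alpha_i rising with x_it.
BETA = 2.0
x = {1: [1.0, 2.0, 3.0], 2: [4.0, 5.0, 6.0], 3: [8.0, 9.0, 13.0]}
alpha = {1: 0.0, 2: 10.0, 3: 20.0}
y = {i: [BETA * v + alpha[i] for v in xs] for i, xs in x.items()}


def mean(values):
    return sum(values) / len(values)


def stack(panel):
    """Stack all individuals' series into one list (pooled data)."""
    return [v for i in sorted(panel) for v in panel[i]]


def pooled_ols_slope(xs, ys):
    """Slope from OLS of ys on xs with an intercept, ignoring the panel structure."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
    sxx = sum((a - xbar) ** 2 for a in xs)
    return sxy / sxx


def slope_no_intercept(xs, ys):
    """Slope from OLS through the origin, used on the transformed data."""
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)


def mean_difference(panel):
    """Within transformation: subtract each individual's time average."""
    return [v - mean(panel[i]) for i in sorted(panel) for v in panel[i]]


def first_difference(panel):
    """Difference adjacent periods within each individual."""
    return [b - a for i in sorted(panel) for a, b in zip(panel[i], panel[i][1:])]


beta_pooled = pooled_ols_slope(stack(x), stack(y))
beta_within = slope_no_intercept(mean_difference(x), mean_difference(y))
beta_fd = slope_no_intercept(first_difference(x), first_difference(y))

print("pooled OLS:", beta_pooled)
print("within (mean-differenced):", beta_within)
print("first-differenced:", beta_fd)
```

In this constructed example the pooled slope is roughly 4.07, well above the true value of 2, because $\alpha_i$ is correlated with $x_{it}$; both transformed regressions recover 2 because $\alpha_i$ drops out of each transformed equation.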

Random effects (RE) or population-averaged (PA):
- $\alpha_i$ is purely random (usually iid $(0, \sigma_\alpha^2)$)
- regressor $x_{it}$ must be exogenous
- corrects standard errors for equicorrelated clustered errors
- prediction is possible
- $\beta = \partial E[y_{it} \mid x_{it}] / \partial x_{it}$

Fundamental divide

Microeconometricians: fixed effects.
Many others: random effects.

2.4 Stata linear panel commands

Panel summary: xtset; xtdescribe; xtsum; xtdata; xtline; xttab; xttrans
Pooled OLS: regress
Feasible GLS: xtgee, family(gaussian); xtgls; xtpcse
Random effects: xtreg, re; xtregar, re
Fixed effects: xtreg, fe; xtregar, fe
Random slopes: xtmixed; quadchk; xtrc
First differences: regress (with differenced data)
Static IV: xtivreg; xthtaylor
Dynamic IV: xtabond; xtdpdsys; xtdpd

3.1 Example: wages

PSID wage data 1976-82 on 595 individuals. Balanced.
Source: Baltagi and Khanti-Akom (1990). [Corrected version of Cornwell and Rupert (1988).]
Goal: estimate causal effect of education on wages.
Complication: education is time-invariant in these data. This rules out fixed effects, so IV methods (Hausman-Taylor) are needed.

3.3 Summarizing panel data

describe, summarize and tabulate confound cross-section and time series variation.

Instead use specialized panel commands:

- xtdescribe: extent to which panel is unbalanced
- xtsum: separate within (over time) and between (over individuals) variation
- xttab: tabulations within and between for discrete data, e.g. binary
- xttrans: transition frequencies for discrete data
- xtline: time series plot for each individual on one chart
- xtdata: scatterplots for within and between variation.
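The within/between split that xtsum reports can be illustrated with a small sketch. This is not a replica of xtsum: the toy data, the function name `decompose`, and the use of sample standard deviations (divisor n - 1) throughout are assumptions made for illustration. Within deviations are re-centred on the grand mean so that they stay on the same scale as the original variable.

```python
from statistics import mean, stdev

# Hypothetical toy panel: {individual id: values of one variable over time}.
panel = {1: [10.0, 12.0, 14.0], 2: [20.0, 20.0, 20.0], 3: [30.0, 33.0, 36.0]}


def decompose(panel):
    """Return overall, between and within standard deviations of one variable."""
    all_values = [v for i in sorted(panel) for v in panel[i]]
    grand_mean = mean(all_values)
    ind_means = {i: mean(series) for i, series in panel.items()}
    # Within variation: deviation from the individual's own mean,
    # shifted back by the grand mean.
    within_values = [
        v - ind_means[i] + grand_mean for i in sorted(panel) for v in panel[i]
    ]
    return {
        "overall": stdev(all_values),
        "between": stdev(list(ind_means.values())),  # variation of individual means
        "within": stdev(within_values),
    }


result = decompose(panel)
for name, value in result.items():
    print(name, round(value, 3))
```

A time-invariant variable (such as education in the wage example) has zero within variation, and a variable that is the same for every individual in a given period has zero between variation.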

4.2 Example

- Coefficients vary considerably across OLS, FE and RE estimators.
- Cluster-robust standard errors (suffix rob) larger even for FE and RE.
- Coefficient of ed not identified for FE as time-invariant regressor.

4.3 Fixed effects versus random effects

Use Hausman test to discriminate between FE and RE.
- If fixed effects: FE consistent and RE inconsistent.
- If not fixed effects: FE consistent and RE consistent.
- So see whether difference between FE and RE is zero.

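$$
H = \left( \tilde{\beta}_{1,RE} - \hat{\beta}_{1,FE} \right)' \left[ \widehat{\mathrm{Cov}}\left[ \tilde{\beta}_{1,RE} - \hat{\beta}_{1,FE} \right] \right]^{-1} \left( \tilde{\beta}_{1,RE} - \hat{\beta}_{1,FE} \right),
$$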

where $\beta_1$ corresponds to the time-varying regressors (or a subset of these).

Problem: the hausman command assumes RE is fully efficient. That is not the case here, as the robust standard errors for RE differ from the default standard errors, so hausman is incorrect.

Instead implement the Hausman test using suest, a panel bootstrap, or the Wooldridge (2002) robust version of the Hausman test.
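For reference, the efficiency assumption matters because it is what allows the covariance term in $H$ to be computed as a simple difference of two variance matrices. The standard result, which holds under the null hypothesis only if the RE estimator is fully efficient, is:

$$
\mathrm{Cov}\left[ \tilde{\beta}_{1,RE} - \hat{\beta}_{1,FE} \right] = V\left[ \hat{\beta}_{1,FE} \right] - V\left[ \tilde{\beta}_{1,RE} \right].
$$

When RE is not fully efficient, this difference need not equal the true covariance matrix (and need not be positive definite), which is why a robust estimate of the covariance is required. With a valid covariance estimate, $H$ is asymptotically $\chi^2$ under the null, with degrees of freedom equal to the dimension of $\beta_1$.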

5.2 Hausman-Taylor IV estimator

Problem in the fixed effects model:
- If an endogenous regressor is time-invariant,
- then the FE estimator cannot identify its coefficient $\beta$ (as time-invariant).

Solution:
- Assume the endogenous regressor is correlated only with $\alpha_i$ (and not with $\varepsilon_{it}$).
- Use exogenous time-varying regressors $x_{it}$ from other periods as instruments.

Command xthtaylor does this (and has option amacurdy).

6.1 Linear dynamic panel models

Simple dynamic model regresses $y_{it}$ on a polynomial in time.
- e.g. growth curve of child height or IQ as the child grows older
- use previous models with $x_{it}$ a polynomial in time or age.

Richer dynamic model regresses $y_{it}$ on lags of $y_{it}$.
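In panel data a lag has to be taken within each individual, so that the first observed period of one individual never borrows the last value of another. The sketch below shows one way to build $y_{i,t-1}$ from long-format data. The rows and the helper name `add_lag` are hypothetical, and regular time intervals (integer periods) are assumed.

```python
# Hypothetical long-format panel rows: (individual id, period, y).
rows = [
    (1, 1, 5.0), (1, 2, 6.0), (1, 3, 8.0),
    (2, 1, 1.0), (2, 2, 1.5), (2, 3, 2.5),
]


def add_lag(rows):
    """Return (id, t, y, lagged y) tuples sorted by id and period.

    The lag is looked up by (id, t - 1), so it is None whenever the
    previous period is not observed for that same individual.
    """
    y_by_key = {(i, t): y for i, t, y in rows}
    return [(i, t, y, y_by_key.get((i, t - 1))) for i, t, y in sorted(rows)]


lagged = add_lag(rows)
# Only rows with an observed lag can enter a regression of y on its lag.
usable = [row for row in lagged if row[3] is not None]

for row in lagged:
    print(row)
```

Each individual loses its first period, so a panel with $T$ periods contributes at most $T - 1$ observations per individual to a model with one lag.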