Sociology Regression: Schooling & Experience's Impact on Hourly Wage, Exercises of Sociology

Instructions for a sociology exercise involving regression analysis using the Wooldridge data file cps78_85.raw. Students perform simple and multiple regressions to investigate the relationship between hourly wage, years of schooling, and labor force experience. The document also covers least-squares estimation, inference in multiple regression, and hypothesis testing.

Typology: Exercises

2011/2012

Uploaded on 11/20/2012

shubnam


sociology 362 data exercise 2

For this exercise you will use the Wooldridge data file Cps78_85.raw. You will be doing regressions of the form:

regress lwage educ

regress lwage exper

regress lwage educ exper

where the variables are log wage, years of schooling and years of experience. You will also need to run some auxiliary regressions.

In the notation that follows, I sometimes will use y, s and x to stand for lwage, schooling and experience.

least-squares estimation: simple vs multiple regression

  1. A researcher is interested in the effect of years of schooling (s) and labor force experience (x) on hourly wage (lwage). Do the simple regressions lwage = f(s) and lwage = f(x). Judging from these results, what do you conclude about the direction, magnitude, and significance of the “effects” of schooling and experience on hourly wage?
  2. The researcher wants to run a multiple regression of the form lwage = f(s, x). Is this a good idea? Is it necessary? Explain. What statistic(s) might you look at to determine this before you actually run the regression?
  3. Run the multiple regression. What does this show about the effects of schooling and experience? What light does it cast on the results of the simple regressions? In particular:

a. Explain the change in the coefficient of s between the simple regression and the multiple regression. Use Stata to generate the numbers you need to exactly and numerically account for the difference between the simple regression coefficient of s and the partial regression coefficient, i.e., $\hat{\beta}_{ys} - \hat{\beta}_{ys.x}$.

b. Account for the change in the coefficient of x.

c. The partial regression coefficient of, say, schooling, $\hat{\beta}_{ys.x}$, gives the “effect” of a year of schooling on lwage after removing from schooling (i.e., s) its linear relationship with experience. Use Stata to demonstrate this.

d. Given the results of the multiple regression, what can be said about the bias in the estimate of the effect of schooling on hourly wage yielded by the simple regression you started with, namely, lwage = f(s)?
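
One way to approach items 2 and 3 in Stata is sketched below. It assumes the data are already in memory with the variables named lwage, educ and exper, and that no observation has missing values, so every regression uses the same sample. The scalar names (b_ys, b_ys_x, and so on) and the residual variable s_res are invented for illustration. The sketch relies on the standard least-squares identity b_ys = b_ys.x + b_yx.s × b_xs, where b_xs is the slope from the auxiliary regression of exper on educ.

```stata
* Assumption: lwage, educ and exper are already loaded, with no missing values.

* Item 2: how strongly are schooling and experience related?
correlate lwage educ exper

* Simple regression of y on s.
regress lwage educ
scalar b_ys = _b[educ]

* Multiple regression of y on s and x.
regress lwage educ exper
scalar b_ys_x = _b[educ]
scalar b_yx_s = _b[exper]

* Auxiliary regression of x on s.
regress exper educ
scalar b_xs = _b[educ]

* Item 3a: the two numbers displayed here should agree.
display b_ys - b_ys_x
display b_yx_s * b_xs

* Item 3c: remove from s its linear relationship with x, then regress y on what is left.
regress educ exper
predict double s_res, residuals
regress lwage s_res
display _b[s_res] "   compared with   " b_ys_x
```

Item 3b follows the same pattern with the roles of educ and exper swapped. Only the coefficient from the residual regression reproduces the multiple-regression result; its reported standard error uses different degrees of freedom.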

  4. Let’s think now about the changes in the standard errors of the coefficients as we move from the simple regressions to the multiple regression. Describe these changes and explain in general terms what accounts for them. In particular:

a. Account exactly and numerically for the change in the estimated variance and thus standard error of the educ coefficient. Use Stata to generate the numbers you need to identify the effects on the standard error of changes in the mean square residual and changes in the relevant variation in educ as one moves from the simple to multiple regression.

b. Do the same for exper.
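
For item 4, a possible sketch is shown below. It uses the textbook result that the estimated variance of a coefficient equals the mean square residual divided by the "relevant" variation in that regressor: the total sum of squares of educ about its mean in the simple regression, and the residual sum of squares from regressing educ on exper in the multiple regression. It again assumes lwage, educ and exper are in memory with no missing values; the scalar names are invented for illustration.

```stata
* Assumption: lwage, educ and exper are already loaded, with no missing values.

* Simple regression: mean square residual and reported standard error.
regress lwage educ
scalar mse_simple = e(rmse)^2
scalar se_simple  = _se[educ]

* Multiple regression: mean square residual and reported standard error.
regress lwage educ exper
scalar mse_mult = e(rmse)^2
scalar se_mult  = _se[educ]

* Total variation in educ about its mean.
quietly summarize educ
scalar sst_s = r(Var) * (r(N) - 1)

* Variation in educ left over after removing its linear relationship with exper.
quietly regress educ exper
scalar rss_s_x = e(rss)

* Each line shows the rebuilt standard error next to the one Stata reported.
display sqrt(mse_simple / sst_s)   "   " se_simple
display sqrt(mse_mult   / rss_s_x) "   " se_mult

* The two separate sources of change in the variance of the educ coefficient.
display "ratio of mean square residuals: " mse_mult / mse_simple
display "ratio of relevant variation:    " rss_s_x / sst_s
```

Swapping educ and exper throughout gives the corresponding numbers for part b.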


inference in multiple regression

  5. In the simple regressions the coefficients of x and s are both highly significant judging by the t-ratios. For each coefficient in turn, discuss the magnitude and source of change (i.e., either the coefficient or standard error, or both) in the t-ratio, and hence the confidence intervals, as one moves from the simple to multiple regression.
  6. In the model y = f(s, x), use the t-ratio to test the following hypotheses:

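a. $\beta_{ys.x} = 0$ vs. $\beta_{ys.x} \neq 0$

b. $\beta_{ys.x} = .86$ vs. $\beta_{ys.x} > .86$

c. $\beta_{yx.s} = 0$ vs. $\beta_{yx.s} \neq 0$

d. $\beta_{yx.s} = .09$ vs. $\beta_{yx.s} > .09$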
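
The t-ratios for these tests can be built by hand from the stored coefficient and standard error, as in the hedged sketch below for parts a and b. It assumes lwage, educ and exper are in memory; the scalar names t_a and t_b are invented for illustration. The functions ttail() and invttail() give upper-tail probabilities and critical values of Student's t.

```stata
* Assumption: lwage, educ and exper are already loaded.
regress lwage educ exper

* Part a: H0 coefficient on educ = 0, two-sided alternative.
scalar t_a = _b[educ] / _se[educ]
display "t = " t_a "   two-sided p = " 2*ttail(e(df_r), abs(t_a))

* Part b: H0 coefficient on educ = .86, alternative greater than .86.
scalar t_b = (_b[educ] - .86) / _se[educ]
display "t = " t_b "   upper-tail p = " ttail(e(df_r), t_b)

* 5% critical values: two-sided, then one-sided.
display invttail(e(df_r), .025) "   " invttail(e(df_r), .05)
```

Parts c and d use the same lines with exper in place of educ and .09 in place of .86.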

  7. Test the hypothesis in 6c above using an F-test. Are the two tests equivalent? Explain and show.
  8. Test the hypothesis $\beta_{ys.x} = \beta_{yx.s} = 0$ against the alternative “not both zero.” Is it possible to reject this hypothesis and conclude that there does exist a regression relation, and yet fail to reject both the hypotheses in 6a and 6c above? Explain.
  9. If you haven’t already, try using Stata’s postestimation command test to do problem 8 above.
  10. a. Fit a model which imposes the constraint $\beta_{ys.x} = \beta_{yx.s}$. How does the fit compare to the unconstrained model?

b. Test the hypothesis $\beta_{ys.x} = \beta_{yx.s}$ against the alternative “not equal.” Do this test by comparing the relevant fitted models using the F-statistic, and also do it using Stata’s test command.

c. Impose and test the constraint $\beta_{ys.x} = 8 \times \beta_{yx.s}$ against the unconstrained model.
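
The F-tests and constrained fits in the last four problems could be set up along the following lines. This is a sketch, not the only route: it assumes lwage, educ and exper are in memory, and the scalar names and the variables s_plus_x and s8_plus_x are invented for illustration. An equality constraint is imposed by regressing on the sum of the two regressors (or on 8 × educ + exper for the last part); cnsreg is an alternative way to fit the same constrained model.

```stata
* Assumption: lwage, educ and exper are already loaded.
regress lwage educ exper
scalar rss_u = e(rss)
scalar df_u  = e(df_r)

* F-test of a single coefficient; compare F with the square of its t-ratio.
test exper
display (_b[exper] / _se[exper])^2

* Joint test that both slopes are zero.
test educ exper

* Tests of the two linear constraints.
test educ = exper
test educ = 8*exper

* Constraint "educ = exper" imposed by regressing on the sum of the regressors.
generate double s_plus_x = educ + exper
regress lwage s_plus_x
scalar rss_r = e(rss)
scalar F_eq = (rss_r - rss_u) / (rss_u / df_u)
display "F = " F_eq "   p = " Ftail(1, df_u, F_eq)

* Constraint "educ = 8*exper" imposed the same way.
generate double s8_plus_x = 8*educ + exper
regress lwage s8_plus_x
scalar F_8 = (e(rss) - rss_u) / (rss_u / df_u)
display "F = " F_8 "   p = " Ftail(1, df_u, F_8)

* Alternative: let Stata impose the equality constraint directly.
constraint 1 educ = exper
cnsreg lwage educ exper, constraints(1)
```

Each hand-computed F has one numerator degree of freedom because each constraint removes one free parameter, so it should match the value reported by the corresponding test command.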
