Assignment 5 - Answer Problems - Econometrics | ECON 30331, Assignments of Introduction to Econometrics

Material Type: Assignment; Class: Econometrics; Subject: Economics; University: Notre Dame; Term: Unknown 1989;

Typology: Assignments

Pre 2010

Uploaded on 02/24/2010

Suggested Answers, Problem Set 5
ECON 30331
Dan Hungerman
1. In class, we discussed how we can estimate the standard error of a predicted value, $\hat{y}$, evaluated at a
set of values $x_1 = c_1, x_2 = c_2, \ldots, x_k = c_k$, by appropriately transforming the x-variables.
Departing from my lecture notes, I provided an example where one could estimate
$y = \beta_0 + \beta_1 age + \beta_2 age^2 + e$
and then calculate $\hat{y}$ when age = 21 by subtracting 21 from the
variables age and age2. It was then asked whether I would want to subtract 21, or 21^2 = 441,
from the variable age2. I said I didn’t know.
Your job for this question is to figure it out.
A. Download the data house_price.dta from the website. Run a regression of house price on
age and age2. Report the associated coefficients and standard errors.
      Source |       SS       df       MS              Number of obs =     114
-------------+------------------------------           F(  2,   111) =    0.38
       Model |  27201.4182     2  13600.7091           Prob > F      =  0.6866
    Residual |  4001093.16   111  36045.8843           R-squared     =  0.0068
-------------+------------------------------           Adj R-squared = -0.0111
       Total |  4028294.57   113  35648.6246           Root MSE      =  189.86

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .3576939   2.244945     0.16   0.874    -4.090815    4.806202
        age2 |   .0005313   .0120905     0.04   0.965    -.0234268    .0244894
       _cons |   310.9385   92.10297     3.38   0.001     128.4303    493.4467
------------------------------------------------------------------------------
B. Immediately after the regression, type “predict phat”. This creates a new variable, phat, that
is the predicted prices from the regression. What is the predicted price for a house that is 88
years old?
If you browse your data, or calculate phat using the coefficients from A, you will see that an
88-year-old house should have a price of 0.358*88 + 0.0005*(88^2) + 311 = 346.

C. Now, by creating transformed variables, run a regression that appropriately estimates the standard error of your estimate in part B. What is the correct way to transform the variable age2 here?

. gen newage = age - 88
. gen newage2 = age2 - (88^2)
. reg price newage newage2

      Source |       SS       df       MS              Number of obs =     114
-------------+------------------------------           F(  2,   111) =    0.38
       Model |  27201.4182     2  13600.7091           Prob > F      =  0.6866
    Residual |  4001093.16   111  36045.8843           R-squared     =  0.0068
-------------+------------------------------           Adj R-squared = -0.0111
       Total |  4028294.57   113  35648.6246           Root MSE      =  189.86

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      newage |   .3576939   2.244945     0.16   0.874    -4.090815    4.806202
     newage2 |   .0005313   .0120905     0.04   0.965    -.0234268    .0244894
       _cons |     346.53   25.55931    13.56   0.000     295.8825    397.
------------------------------------------------------------------------------


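We can see that this is the correct transformation (subtracting 88^2 = 7744, not 88, from age2) because the coefficient for the intercept, 346.53, matches our answer in part B. Its standard error, 25.56, is the standard error of the predicted price.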
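
The reason the intercept reproduces the prediction can be seen by adding and subtracting the evaluation point inside the original equation. The following is a short worked derivation using the notation of the age example from the start of the question, with price as the dependent variable:

```latex
\begin{aligned}
price &= \beta_0 + \beta_1\, age + \beta_2\, age^2 + e \\
      &= \underbrace{\left(\beta_0 + 88\,\beta_1 + 88^2\,\beta_2\right)}_{\text{expected price at } age = 88}
         + \beta_1\,(age - 88) + \beta_2\,\left(age^2 - 88^2\right) + e
\end{aligned}
```

The new intercept is exactly the expected price at age = 88, so its reported standard error is the standard error of that prediction. If 88 rather than 88^2 were subtracted from age2, the intercept would instead equal $\beta_0 + 88\beta_1 + 88\beta_2$, which is not the prediction from part B.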

D. Now estimate the equation

$price = \delta_0 + \delta_1 age + \delta_2 age^2 + \delta_3 sq\_feet + \delta_4 bedrooms + e$

What is the expected price of an 88-year-old house with 1,500 square feet and 3 bedrooms? What is the 95% confidence interval for this expected value?

The regression is:

. reg price age age2 sq_feet bedrooms

      Source |       SS       df       MS              Number of obs =     114
-------------+------------------------------           F(  4,   109) =   17.
       Model |  1601238.52     4   400309.63           Prob > F      =    0.
    Residual |  2427056.05   109  22266.5693           R-squared     =    0.
-------------+------------------------------           Adj R-squared =    0.
       Total |  4028294.57   113  35648.6246           Root MSE      =  149.

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.4356044   1.770898    -0.25   0.806    -3.945467    3.
        age2 |   .0015517   .0095055     0.16   0.871    -.0172878     .
     sq_feet |   .1982142   .0271161     7.31   0.000     .1444709     .
    bedrooms |  -19.47592   17.30307    -1.13   0.263    -53.77004   14.
       _cons |   78.35878   82.16293     0.95   0.342    -84.48549  241.
------------------------------------------------------------------------------
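
The point prediction asked for in part D can be reproduced directly from the coefficients in this table. The following is a minimal Python sketch using only the printed numbers; the dictionary names are illustrative and not part of the assignment.

```python
# Coefficients copied from the part D regression output.
coefs = {
    "_cons": 78.35878,
    "age": -0.4356044,
    "age2": 0.0015517,
    "sq_feet": 0.1982142,
    "bedrooms": -19.47592,
}

# The house described in the question; age2 is age squared.
house = {"_cons": 1.0, "age": 88, "age2": 88**2, "sq_feet": 1500, "bedrooms": 3}

# Expected price: sum of each coefficient times the house's value for it.
expected_price = sum(coefs[name] * house[name] for name in coefs)

print(round(expected_price, 2))  # 290.94
```

The 95% confidence interval cannot be read off this table. Following the approach in part C, it would come from the intercept row of a regression on the shifted variables age - 88, age2 - 88^2, sq_feet - 1500, and bedrooms - 3.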


F. Is it worrisome that Mr. Silly was able to do that silly stuff in part E? If Stata will give us predicted prices for houses that cannot exist, does this suggest that the entire notion of using a regression to look at predicted prices is a waste of time?

It is not worrisome, but it is a reminder that Stata is not good at making judgment calls about what makes sense and what doesn’t. Stata is basically a super powerful calculator; it is up to us to have it calculate things that make sense. If we want to estimate the price of a house whose characteristics resemble those of the houses in our dataset, and if we have run an appropriate regression (e.g., one that includes appropriate x variables), then the regression will probably give us a good idea of what the expected price of the house may be.

2. Below is a regression I have run on some data. (It is a regression on US states, excluding Hawaii and Alaska. The dependent variable is per-capita income, and the X variables are the fraction of a state’s population that is over age 65, the fraction black, the fraction foreign-born, and the state’s unemployment rate.) The R-squared and adjusted R-squared have been removed.

A. What is the R-squared?

B. What is the adjusted R-squared?

C. Suppose I removed the variable for the fraction of the population over 65, p65. What would happen to the R-squared? What about the adjusted R-squared?

. reg disposable_income p65 pblack pforeign riu_unemployment

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  4,    43) =    5.
       Model |  169.901986     4  42.4754965           Prob > F      =    0.
    Residual |  336.559814    43  7.82697241           R-squared     =    0.
-------------+------------------------------           Adj R-squared =    0.
       Total |    506.4618    47   10.775783           Root MSE      =    2.

------------------------------------------------------------------------------
disposable~e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         p65 |   .1535071   .2594459     0.59   0.557    -.3697155     .
      pblack |  -.0082877   .0429338    -0.19   0.848     -.094872     .
    pforeign |   .3530978   .0765579     4.61   0.000     .1987041     .
riu_unempl~t |  -.5848798   .7734795    -0.76   0.454     -2.14475     .
       _cons |   20.69081   3.471959     5.96   0.000     13.68894   27.
------------------------------------------------------------------------------

A. The answer is above and can be calculated using the SS-df-MS table at the top-left of the regression output.

B. The adjusted R-squared can be calculated using the formula relating R-squared and adjusted R-squared that we learned in class.

C. If you removed p65, the R-squared would fall while the adjusted R-squared would rise (the latter is true since the t-statistic on p65, 0.59, is less than 1 in absolute value).
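
The calculations described in A and B can be sketched in a few lines of Python. The inputs are the sums of squares and degrees of freedom printed in the output; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the regression output.
ss_model, df_model = 169.901986, 4
ss_resid, df_resid = 336.559814, 43
ss_total, df_total = 506.4618, 47

# R-squared: share of the total variation explained by the model.
r2 = ss_model / ss_total

# Adjusted R-squared: 1 - (residual mean square) / (total mean square),
# which is the same as 1 - (1 - R^2)(n - 1)/(n - k - 1).
adj_r2 = 1 - (ss_resid / df_resid) / (ss_total / df_total)

print(round(r2, 4), round(adj_r2, 4))  # 0.3355 0.2737
```

The two mean squares used for the adjusted R-squared are the values in the MS column for the Residual and Total rows, so the calculation needs nothing beyond the table itself.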

3. Can an adjusted R-squared ever be larger than the traditional R-squared? (You can assume that

we are discussing a setting where k > 0 and that the traditional R-squared is greater than zero and

less than 1.)

Using algebra and the expression relating R-squared with adjusted R-squared, we can write:

$\bar{R}^2 = 1 - (1 - R^2)x$

where $x = (n - 1)/(n - k - 1) > 1$. Subtract R-squared from both sides:

$\bar{R}^2 - R^2 = 1 - x + xR^2 - R^2$

The expression can be rewritten as:

$\bar{R}^2 - R^2 = (1 - x)(1 - R^2)$

The first term on the right must be negative (because x > 1) while the second term must be positive (because R-squared is less than 1), so the whole thing is negative. The adjusted R-squared must be less than the traditional R-squared.
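
The inequality can also be spot-checked numerically. The sketch below is illustrative only: the helper name `adjusted_r2` and the grid of values are assumptions chosen for the check, not part of the assignment.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# For any 0 < R^2 < 1, k > 0, and n > k + 1, the adjusted R-squared
# should fall strictly below the traditional R-squared.
gaps = []
for n in (10, 48, 114):
    for k in (1, 2, 4):
        for r2 in (0.01, 0.3355, 0.99):
            gaps.append(adjusted_r2(r2, n, k) - r2)

print(max(gaps) < 0)  # True when every gap is negative
```

A grid check like this is not a proof, but it is a cheap way to catch a sign error in the algebra.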

4. Here is a set of three data points:

                   y     x
Observation 1      1     3
Observation 2     -4     -
Observation 3      3     7

A. Consider a regression of $y = b_0 + b_1 x + e$ here. What would the OLS estimates of $b_0$ and $b_1$ be?

b 0 = 0; b 1 = 0.

B. Calculate the standard errors for your estimates in part A. (Ignore heteroskedasticity here.)

se( b 0 ) = 0.16, se( b 1 ) = 0.

C. Now calculate the standard errors so that they are robust to heteroskedasticity.

se( b 0 ) = 0.16, se( b 1 ) = 0.
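
The hand calculations for parts A through C can be sketched in Python. One input is an assumption: the x value for Observation 2 is cut off in the table, and the sketch uses -10 because that value reproduces the reported b0 = 0 and se(b0) = 0.16. The robust standard errors assume the n/(n - k) small-sample scaling (often called HC1); other scalings would give different numbers.

```python
import math

# Data from the table. ASSUMPTION: the x value for Observation 2 is cut
# off in the source; -10 is used because it reproduces the reported
# b0 = 0 and se(b0) = 0.16. It is not confirmed by the source.
x = [3.0, -10.0, 7.0]
y = [1.0, -4.0, 3.0]
n, k = len(x), 2  # k counts estimated parameters (intercept and slope)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Part A: OLS estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Part B: classical standard errors (homoskedasticity assumed).
sigma2 = sum(e ** 2 for e in resid) / (n - k)
se_b0 = math.sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx))
se_b1 = math.sqrt(sigma2 / sxx)

# Part C: robust standard errors. Each OLS estimate is a weighted sum of
# the y values; the robust variance sums (weight * residual)^2 and applies
# the assumed n/(n - k) scaling.
scale = n / (n - k)
w0 = [1 / n - x_bar * (xi - x_bar) / sxx for xi in x]
w1 = [(xi - x_bar) / sxx for xi in x]
rse_b0 = math.sqrt(scale * sum((w * e) ** 2 for w, e in zip(w0, resid)))
rse_b1 = math.sqrt(scale * sum((w * e) ** 2 for w, e in zip(w1, resid)))

print(round(b0, 2), round(b1, 3))
print(round(se_b0, 2), round(se_b1, 3))
print(round(rse_b0, 2), round(rse_b1, 3))
```

Under the assumed data, the robust and classical standard errors for the intercept coincide because the mean of x is zero, which is consistent with the two reported values of 0.16.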