
**PART ONE**

**Solutions to Exercises**

**Chapter 2
Review of Probability**

**Solutions to Exercises**

1. (a) Probability distribution function for *Y*

| Outcome (number of heads) | $Y = 0$ | $Y = 1$ | $Y = 2$ |
|---|---|---|---|
| Probability | 0.25 | 0.50 | 0.25 |

(b) Cumulative probability distribution function for *Y*

| Outcome (number of heads) | $Y < 0$ | $0 \le Y < 1$ | $1 \le Y < 2$ | $Y \ge 2$ |
|---|---|---|---|---|
| Probability | 0 | 0.25 | 0.75 | 1.0 |

(c) $\mu_Y = E(Y) = (0 \times 0.25) + (1 \times 0.50) + (2 \times 0.25) = 1.00$

Using Key Concept 2.3: $\mathrm{var}(Y) = E(Y^2) - [E(Y)]^2,$ and

$E(Y^2) = (0^2 \times 0.25) + (1^2 \times 0.50) + (2^2 \times 0.25) = 1.50$

so that $\mathrm{var}(Y) = E(Y^2) - [E(Y)]^2 = 1.50 - (1.00)^2 = 0.50.$
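These moment calculations are easy to verify numerically. A minimal Python check (not part of the original solution; the probabilities are those tabulated in part (a)):

```python
# Two fair coin flips: distribution of Y = number of heads, from part (a).
probs = {0: 0.25, 1: 0.50, 2: 0.25}

mean = sum(y * p for y, p in probs.items())     # E(Y) = 1.00
e_y2 = sum(y**2 * p for y, p in probs.items())  # E(Y^2) = 1.50
var = e_y2 - mean**2                            # var(Y) = 0.50
```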

2. We know from Table 2.2 that $\Pr(Y = 0) = 0.22,$ $\Pr(Y = 1) = 0.78,$ $\Pr(X = 0) = 0.30,$ $\Pr(X = 1) = 0.70.$ So

(a)

$\mu_Y = E(Y) = 0 \times \Pr(Y = 0) + 1 \times \Pr(Y = 1) = 0 \times 0.22 + 1 \times 0.78 = 0.78,$

$\mu_X = E(X) = 0 \times \Pr(X = 0) + 1 \times \Pr(X = 1) = 0 \times 0.30 + 1 \times 0.70 = 0.70.$

(b)

$\sigma_X^2 = E[(X - \mu_X)^2] = (0 - 0.70)^2 \times \Pr(X = 0) + (1 - 0.70)^2 \times \Pr(X = 1) = (-0.70)^2 \times 0.30 + 0.30^2 \times 0.70 = 0.21,$

$\sigma_Y^2 = E[(Y - \mu_Y)^2] = (0 - 0.78)^2 \times \Pr(Y = 0) + (1 - 0.78)^2 \times \Pr(Y = 1) = (-0.78)^2 \times 0.22 + 0.22^2 \times 0.78 = 0.1716.$

4 Stock/Watson - Introduction to Econometrics - Second Edition

(c) Table 2.2 shows $\Pr(X = 0, Y = 0) = 0.15,$ $\Pr(X = 0, Y = 1) = 0.15,$ $\Pr(X = 1, Y = 0) = 0.07,$ $\Pr(X = 1, Y = 1) = 0.63.$ So

$\sigma_{XY} = \mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
$= (0 - 0.70)(0 - 0.78)\Pr(X = 0, Y = 0) + (0 - 0.70)(1 - 0.78)\Pr(X = 0, Y = 1)$
$\quad + (1 - 0.70)(0 - 0.78)\Pr(X = 1, Y = 0) + (1 - 0.70)(1 - 0.78)\Pr(X = 1, Y = 1)$
$= (-0.70) \times (-0.78) \times 0.15 + (-0.70) \times 0.22 \times 0.15 + 0.30 \times (-0.78) \times 0.07 + 0.30 \times 0.22 \times 0.63$
$= 0.084,$

$\mathrm{cor}(X, Y) = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y} = \dfrac{0.084}{\sqrt{0.21 \times 0.1716}} = 0.4425.$
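The same approach verifies the covariance and correlation computed from Table 2.2. A short Python sketch (the joint probabilities are the four values quoted above; the variable names are ours):

```python
# Joint distribution from Table 2.2: keys are (x, y), values are Pr(X = x, Y = y).
joint = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.07, (1, 1): 0.63}

mu_x = sum(x * p for (x, y), p in joint.items())                 # E(X) = 0.70
mu_y = sum(y * p for (x, y), p in joint.items())                 # E(Y) = 0.78
var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())  # 0.21
var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())  # 0.1716
cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())  # 0.084
corr_xy = cov_xy / (var_x ** 0.5 * var_y ** 0.5)                 # ~ 0.4425
```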

3. For the two new random variables $W = 3 + 6X$ and $V = 20 - 7Y,$ we have:

(a)

$E(V) = E(20 - 7Y) = 20 - 7E(Y) = 20 - 7 \times 0.78 = 14.54,$

$E(W) = E(3 + 6X) = 3 + 6E(X) = 3 + 6 \times 0.70 = 7.2.$

(b)

$\sigma_W^2 = \mathrm{var}(3 + 6X) = 6^2 \cdot \sigma_X^2 = 36 \times 0.21 = 7.56,$

$\sigma_V^2 = \mathrm{var}(20 - 7Y) = (-7)^2 \cdot \sigma_Y^2 = 49 \times 0.1716 = 8.4084.$

(c)

$\sigma_{WV} = \mathrm{cov}(3 + 6X,\ 20 - 7Y) = 6 \times (-7)\,\mathrm{cov}(X, Y) = -42 \times 0.084 = -3.528,$

$\mathrm{cor}(W, V) = \dfrac{\sigma_{WV}}{\sigma_W \sigma_V} = \dfrac{-3.528}{\sqrt{7.56 \times 8.4084}} = -0.4425.$

4. (a) $E(X^3) = 0^3 \times (1 - p) + 1^3 \times p = p$

(b) $E(X^k) = 0^k \times (1 - p) + 1^k \times p = p$

(c) $E(X) = 0.3$

$\mathrm{var}(X) = E(X^2) - [E(X)]^2 = 0.3 - 0.09 = 0.21$

Thus, $\sigma = \sqrt{0.21} = 0.46.$

To compute the skewness, use the formula from Exercise 2.21:

$E(X - \mu)^3 = E(X^3) - 3[E(X)][E(X^2)] + 2[E(X)]^3 = 0.3 - 3 \times 0.3^2 + 2 \times 0.3^3 = 0.084$

Alternatively, $E(X - \mu)^3 = [(1 - 0.3)^3 \times 0.3] + [(0 - 0.3)^3 \times 0.7] = 0.084$

Thus, skewness $= E(X - \mu)^3/\sigma^3 = 0.084/0.46^3 = 0.87.$


To compute the kurtosis, use the formula from Exercise 2.21:

$E(X - \mu)^4 = E(X^4) - 4[E(X)][E(X^3)] + 6[E(X)]^2[E(X^2)] - 3[E(X)]^4 = 0.3 - 4 \times 0.3^2 + 6 \times 0.3^3 - 3 \times 0.3^4 = 0.0777$

Alternatively, $E(X - \mu)^4 = [(1 - 0.3)^4 \times 0.3] + [(0 - 0.3)^4 \times 0.7] = 0.0777$

Thus, kurtosis $= E(X - \mu)^4/\sigma^4 = 0.0777/0.46^4 = 1.76.$

5. Let *X* denote temperature in °F and *Y* denote temperature in °C. Recall that *Y* = 0 when *X* = 32 and *Y* = 100 when *X* = 212; this implies $Y = (100/180) \times (X - 32)$ or $Y = -17.78 + (5/9) \times X.$ Using Key Concept 2.3, $\mu_X = 70^{\circ}\mathrm{F}$ implies that $\mu_Y = -17.78 + (5/9) \times 70 = 21.11^{\circ}\mathrm{C},$ and $\sigma_X = 7^{\circ}\mathrm{F}$ implies $\sigma_Y = (5/9) \times 7 = 3.89^{\circ}\mathrm{C}.$
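The linear-transformation rules used here (a location shift changes the mean but not the standard deviation) can be checked directly; `f_to_c` is our name for the conversion:

```python
# Fahrenheit-to-Celsius conversion from Exercise 5 (illustrative check).
def f_to_c(x):
    return (5 / 9) * (x - 32)

mu_c = f_to_c(70)       # mean: ~ 21.11 degrees C
sigma_c = (5 / 9) * 7   # SD: only the slope 5/9 matters, the shift drops out
```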

6. The table shows that $\Pr(X = 0, Y = 0) = 0.045,$ $\Pr(X = 0, Y = 1) = 0.709,$ $\Pr(X = 1, Y = 0) = 0.005,$ $\Pr(X = 1, Y = 1) = 0.241,$ $\Pr(X = 0) = 0.754,$ $\Pr(X = 1) = 0.246,$ $\Pr(Y = 0) = 0.050,$ $\Pr(Y = 1) = 0.950.$

(a)

$\mu_Y = E(Y) = 0 \times \Pr(Y = 0) + 1 \times \Pr(Y = 1) = 0 \times 0.050 + 1 \times 0.950 = 0.950.$

(b)

$\text{Unemployment Rate} = \dfrac{\#(\text{unemployed})}{\#(\text{labor force})} = \Pr(Y = 0) = 0.050 = 1 - 0.950 = 1 - E(Y).$

(c) Calculate the conditional probabilities first:

$\Pr(Y = 0 \mid X = 0) = \dfrac{\Pr(X = 0, Y = 0)}{\Pr(X = 0)} = \dfrac{0.045}{0.754} = 0.0597,$

$\Pr(Y = 1 \mid X = 0) = \dfrac{\Pr(X = 0, Y = 1)}{\Pr(X = 0)} = \dfrac{0.709}{0.754} = 0.9403,$

$\Pr(Y = 0 \mid X = 1) = \dfrac{\Pr(X = 1, Y = 0)}{\Pr(X = 1)} = \dfrac{0.005}{0.246} = 0.0203,$

$\Pr(Y = 1 \mid X = 1) = \dfrac{\Pr(X = 1, Y = 1)}{\Pr(X = 1)} = \dfrac{0.241}{0.246} = 0.9797.$

The conditional expectations are

$E(Y \mid X = 1) = 0 \times \Pr(Y = 0 \mid X = 1) + 1 \times \Pr(Y = 1 \mid X = 1) = 0 \times 0.0203 + 1 \times 0.9797 = 0.9797,$

$E(Y \mid X = 0) = 0 \times \Pr(Y = 0 \mid X = 0) + 1 \times \Pr(Y = 1 \mid X = 0) = 0 \times 0.0597 + 1 \times 0.9403 = 0.9403.$


(d) Use the solution to part (b),

Unemployment rate for college grads $= 1 - E(Y \mid X = 1) = 1 - 0.9797 = 0.0203.$

Unemployment rate for non-college grads $= 1 - E(Y \mid X = 0) = 1 - 0.9403 = 0.0597.$

(e) The probability that a randomly selected worker who is reported being unemployed is a college graduate is

$\Pr(X = 1 \mid Y = 0) = \dfrac{\Pr(X = 1, Y = 0)}{\Pr(Y = 0)} = \dfrac{0.005}{0.050} = 0.1.$

The probability that this worker is a non-college graduate is

$\Pr(X = 0 \mid Y = 0) = 1 - \Pr(X = 1 \mid Y = 0) = 1 - 0.1 = 0.9.$

(f) Educational achievement and employment status are not independent because they do not satisfy, for all values of *x* and *y*,

$\Pr(Y = y \mid X = x) = \Pr(Y = y).$

For example,

$\Pr(Y = 0 \mid X = 0) = 0.0597 \ne \Pr(Y = 0) = 0.050.$

7. Using obvious notation, $C = M + F;$ thus $\mu_C = \mu_M + \mu_F$ and $\sigma_C^2 = \sigma_M^2 + \sigma_F^2 + 2\,\mathrm{cov}(M, F).$ This implies

(a) $\mu_C = 40 + 45 = \$85{,}000$ per year.

(b) $\mathrm{cor}(M, F) = \dfrac{\mathrm{Cov}(M, F)}{\sigma_M \sigma_F},$ so that $\mathrm{Cov}(M, F) = \sigma_M \sigma_F\,\mathrm{cor}(M, F).$ Thus $\mathrm{Cov}(M, F) = 12 \times 18 \times 0.80 = 172.80,$ where the units are squared thousands of dollars per year.

(c) $\sigma_C^2 = \sigma_M^2 + \sigma_F^2 + 2\,\mathrm{cov}(M, F),$ so that $\sigma_C^2 = 12^2 + 18^2 + 2 \times 172.80 = 813.60,$ and $\sigma_C = \sqrt{813.60} = 28.524$ thousand dollars per year.

(d) First you need to look up the current Euro/dollar exchange rate in the Wall Street Journal, the Federal Reserve web page, or other financial data outlet. Suppose that this exchange rate is *e* (say *e* = 0.80 euros per dollar); each \$1 is therefore worth *e* euros. The mean is therefore $e\mu_C$ (in units of thousands of euros per year), and the standard deviation is $e\sigma_C$ (in units of thousands of euros per year). The correlation is unit-free, and is unchanged.
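Parts (a)-(c) are an application of the variance-of-a-sum formula, which a few lines of Python reproduce (units in thousands of dollars per year, as above):

```python
# Combined earnings C = M + F from Exercise 7.
sd_m, sd_f, rho = 12.0, 18.0, 0.80
mu_c = 40 + 45                          # 85 (thousand dollars per year)
cov_mf = rho * sd_m * sd_f              # 172.80
var_c = sd_m**2 + sd_f**2 + 2 * cov_mf  # 813.60
sd_c = var_c ** 0.5                     # ~ 28.524
```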

8. $\mu_Y = E(Y) = 1,$ $\sigma_Y^2 = \mathrm{var}(Y) = 4.$ With $Z = \tfrac{1}{2}(Y - 1),$

$\mu_Z = E\left[\tfrac{1}{2}(Y - 1)\right] = \tfrac{1}{2}(\mu_Y - 1) = \tfrac{1}{2}(1 - 1) = 0,$

$\sigma_Z^2 = \mathrm{var}\left[\tfrac{1}{2}(Y - 1)\right] = \tfrac{1}{4}\sigma_Y^2 = \tfrac{1}{4} \times 4 = 1.$


9.

| | $Y = 14$ | $Y = 22$ | $Y = 30$ | $Y = 40$ | $Y = 65$ | Probability distribution of *X* |
|---|---|---|---|---|---|---|
| $X = 1$ | 0.02 | 0.05 | 0.10 | 0.03 | 0.01 | 0.21 |
| $X = 5$ | 0.17 | 0.15 | 0.05 | 0.02 | 0.01 | 0.40 |
| $X = 8$ | 0.02 | 0.03 | 0.15 | 0.10 | 0.09 | 0.39 |
| Probability distribution of *Y* | 0.21 | 0.23 | 0.30 | 0.15 | 0.11 | 1.00 |

(a) The probability distribution is given in the table above.

$E(Y) = 14 \times 0.21 + 22 \times 0.23 + 30 \times 0.30 + 40 \times 0.15 + 65 \times 0.11 = 30.15$

$E(Y^2) = 14^2 \times 0.21 + 22^2 \times 0.23 + 30^2 \times 0.30 + 40^2 \times 0.15 + 65^2 \times 0.11 = 1127.23$

$\mathrm{var}(Y) = E(Y^2) - [E(Y)]^2 = 218.21$

$\sigma_Y = 14.77$

(b) The conditional probability distribution of *Y* given $X = 8$ is given in the table below

| $Y = 14$ | $Y = 22$ | $Y = 30$ | $Y = 40$ | $Y = 65$ |
|---|---|---|---|---|
| 0.02/0.39 | 0.03/0.39 | 0.15/0.39 | 0.10/0.39 | 0.09/0.39 |

$E(Y \mid X = 8) = 14 \times (0.02/0.39) + 22 \times (0.03/0.39) + 30 \times (0.15/0.39) + 40 \times (0.10/0.39) + 65 \times (0.09/0.39) = 39.21$

$E(Y^2 \mid X = 8) = 14^2 \times (0.02/0.39) + 22^2 \times (0.03/0.39) + 30^2 \times (0.15/0.39) + 40^2 \times (0.10/0.39) + 65^2 \times (0.09/0.39) = 1778.7$

$\mathrm{var}(Y \mid X = 8) = 1778.7 - 39.21^2 = 241.65$

$\sigma_{Y \mid X=8} = 15.54$

(c) $E(XY) = (1 \times 14 \times 0.02) + (1 \times 22 \times 0.05) + \cdots + (8 \times 65 \times 0.09) = 171.7$

$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 171.7 - 5.33 \times 30.15 = 11.0$

$\mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y)/(\sigma_X \sigma_Y) = 11.0/(2.60 \times 14.77) = 0.286$

(Here $\sigma_X = \sqrt{E(X^2) - [E(X)]^2} = \sqrt{35.17 - 5.33^2} = 2.60.$)

10. Using the fact that if $Y \sim N(\mu_Y, \sigma_Y^2)$ then $\dfrac{Y - \mu_Y}{\sigma_Y} \sim N(0, 1),$ and Appendix Table 1, we have

(a)

$\Pr(Y \le 3) = \Pr\left(\dfrac{Y - 1}{2} \le \dfrac{3 - 1}{2}\right) = \Phi(1) = 0.8413.$

(b)

$\Pr(Y > 0) = 1 - \Pr(Y \le 0) = 1 - \Pr\left(\dfrac{Y - 3}{3} \le \dfrac{0 - 3}{3}\right) = 1 - \Phi(-1) = \Phi(1) = 0.8413.$


(c)

$\Pr(40 \le Y \le 52) = \Pr\left(\dfrac{40 - 50}{5} \le \dfrac{Y - 50}{5} \le \dfrac{52 - 50}{5}\right)$
$= \Phi(0.4) - \Phi(-2) = \Phi(0.4) - [1 - \Phi(2)] = 0.6554 - 1 + 0.9772 = 0.6326.$

(d)

$\Pr(6 \le Y \le 8) = \Pr\left(\dfrac{6 - 5}{\sqrt{2}} \le \dfrac{Y - 5}{\sqrt{2}} \le \dfrac{8 - 5}{\sqrt{2}}\right)$
$= \Phi(2.1213) - \Phi(0.7071) = 0.9831 - 0.7602 = 0.2229.$
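Instead of Appendix Table 1, the standard normal CDF can be evaluated through the error function, $\Phi(z) = \tfrac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$; the snippet below reproduces parts (a), (c), and (d) this way (a substitute for the table lookup, not part of the original solution):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p_a = phi((3 - 1) / 2)                                 # ~ 0.8413
p_c = phi(0.4) - phi(-2.0)                             # ~ 0.6326
p_d = phi((8 - 5) / sqrt(2)) - phi((6 - 5) / sqrt(2))  # ~ 0.2229
```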

11. (a) 0.90

(b) 0.05

(c) 0.05

(d) When $Y \sim \chi_{10}^2,$ then $Y/10 \sim F_{10,\infty}.$
(e) $Y = Z^2,$ where $Z \sim N(0, 1);$ thus $\Pr(Y \le 1) = \Pr(-1 \le Z \le 1) = 0.68.$

12. (a) 0.05

(b) 0.950

(c) 0.953

(d) The $t_{df}$ distribution and $N(0, 1)$ are approximately the same when *df* is large.

(e) 0.10

(f) 0.01

13. (a) $E(Y^2) = \mathrm{var}(Y) + \mu_Y^2 = 1 + 0 = 1;$ $E(W^2) = \mathrm{var}(W) + \mu_W^2 = 100 + 0 = 100.$

(b) *Y* and *W* are symmetric around 0, thus skewness is equal to 0; because their mean is zero, this means that the third moment is zero.

(c) The kurtosis of the normal is 3, so $\dfrac{E[(Y - \mu_Y)^4]}{\sigma_Y^4} = 3;$ solving yields $E(Y^4) = 3;$ a similar calculation yields the results for *W*.

(d) First, condition on $X = 0,$ so that $S = W:$

$E(S \mid X = 0) = 0;\ E(S^2 \mid X = 0) = 100,\ E(S^3 \mid X = 0) = 0,\ E(S^4 \mid X = 0) = 3 \times 100^2.$

Similarly,

$E(S \mid X = 1) = 0;\ E(S^2 \mid X = 1) = 1,\ E(S^3 \mid X = 1) = 0,\ E(S^4 \mid X = 1) = 3.$

From the law of iterated expectations

$E(S) = E(S \mid X = 0)\Pr(X = 0) + E(S \mid X = 1)\Pr(X = 1) = 0$

$E(S^2) = E(S^2 \mid X = 0)\Pr(X = 0) + E(S^2 \mid X = 1)\Pr(X = 1) = 100 \times 0.01 + 1 \times 0.99 = 1.99$

$E(S^3) = E(S^3 \mid X = 0)\Pr(X = 0) + E(S^3 \mid X = 1)\Pr(X = 1) = 0$

$E(S^4) = E(S^4 \mid X = 0)\Pr(X = 0) + E(S^4 \mid X = 1)\Pr(X = 1) = 3 \times 100^2 \times 0.01 + 3 \times 1 \times 0.99 = 302.97$


(e) $\mu_S = E(S) = 0,$ thus $E[(S - \mu_S)^3] = E(S^3) = 0$ from part (d). Thus skewness $= 0.$

Similarly, $\sigma_S^2 = E[(S - \mu_S)^2] = E(S^2) = 1.99,$ and $E[(S - \mu_S)^4] = E(S^4) = 302.97.$

Thus, kurtosis $= 302.97/(1.99^2) = 76.5.$
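The iterated-expectations bookkeeping in parts (d) and (e) reduces to two weighted averages, which can be verified directly:

```python
# Mixture S: S = W ~ N(0, 100) with probability 0.01, S = Y ~ N(0, 1) otherwise.
p_w, p_y = 0.01, 0.99

e_s2 = 100 * p_w + 1 * p_y             # E(S^2) = 1.99
e_s4 = 3 * 100**2 * p_w + 3 * 1 * p_y  # E(S^4) = 302.97
kurtosis = e_s4 / e_s2**2              # ~ 76.5
```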

14. The central limit theorem suggests that when the sample size (*n*) is large, the distribution of the sample average ($\bar{Y}$) is approximately $N(\mu_Y, \sigma_{\bar{Y}}^2)$ with $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n}.$ Given $\mu_Y = 100,$ $\sigma_Y^2 = 43.0,$

(a) $n = 100,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{100} = 0.43,$ and

$\Pr(\bar{Y} \le 101) = \Pr\left(\dfrac{\bar{Y} - 100}{\sqrt{0.43}} \le \dfrac{101 - 100}{\sqrt{0.43}}\right) \approx \Phi(1.525) = 0.9364.$

(b) $n = 165,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{165} = 0.2606,$ and

$\Pr(\bar{Y} > 98) = 1 - \Pr(\bar{Y} \le 98) = 1 - \Pr\left(\dfrac{\bar{Y} - 100}{\sqrt{0.2606}} \le \dfrac{98 - 100}{\sqrt{0.2606}}\right)$
$\approx 1 - \Phi(-3.9178) = \Phi(3.9178) = 1.000 \text{ (rounded to four decimal places).}$

(c) $n = 64,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{64} = 0.6719,$ and

$\Pr(101 \le \bar{Y} \le 103) = \Pr\left(\dfrac{101 - 100}{\sqrt{0.6719}} \le \dfrac{\bar{Y} - 100}{\sqrt{0.6719}} \le \dfrac{103 - 100}{\sqrt{0.6719}}\right)$
$\approx \Phi(3.6599) - \Phi(1.2200) = 0.9999 - 0.8888 = 0.1111.$
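These CLT approximations follow one pattern: standardize by $\sqrt{\sigma_Y^2/n}$ and evaluate $\Phi$. A compact check for parts (a) and (c):

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, var = 100.0, 43.0
p_a = phi((101 - mu) / sqrt(var / 100))       # n = 100: ~ 0.9364
p_c = (phi((103 - mu) / sqrt(var / 64))
       - phi((101 - mu) / sqrt(var / 64)))    # n = 64: ~ 0.1111
```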

15. (a)

$\Pr(9.6 \le \bar{Y} \le 10.4) = \Pr\left(\dfrac{9.6 - 10}{\sqrt{4/n}} \le \dfrac{\bar{Y} - 10}{\sqrt{4/n}} \le \dfrac{10.4 - 10}{\sqrt{4/n}}\right) = \Pr\left(\dfrac{9.6 - 10}{\sqrt{4/n}} \le Z \le \dfrac{10.4 - 10}{\sqrt{4/n}}\right)$

where $Z \sim N(0, 1).$ Thus,

(i) $n = 20;$ $\Pr\left(\dfrac{9.6 - 10}{\sqrt{4/n}} \le Z \le \dfrac{10.4 - 10}{\sqrt{4/n}}\right) = \Pr(-0.89 \le Z \le 0.89) = 0.63$

(ii) $n = 100;$ $\Pr\left(\dfrac{9.6 - 10}{\sqrt{4/n}} \le Z \le \dfrac{10.4 - 10}{\sqrt{4/n}}\right) = \Pr(-2.00 \le Z \le 2.00) = 0.954$

(iii) $n = 1000;$ $\Pr\left(\dfrac{9.6 - 10}{\sqrt{4/n}} \le Z \le \dfrac{10.4 - 10}{\sqrt{4/n}}\right) = \Pr(-6.32 \le Z \le 6.32) = 1.000$


(b)

$\Pr(10 - c \le \bar{Y} \le 10 + c) = \Pr\left(\dfrac{-c}{\sqrt{4/n}} \le \dfrac{\bar{Y} - 10}{\sqrt{4/n}} \le \dfrac{c}{\sqrt{4/n}}\right) = \Pr\left(\dfrac{-c}{\sqrt{4/n}} \le Z \le \dfrac{c}{\sqrt{4/n}}\right).$

As *n* gets large, $\dfrac{c}{\sqrt{4/n}}$ gets large, and the probability converges to 1.

(c) This follows from (b) and the definition of convergence in probability given in Key Concept 2.6.

16. There are several ways to do this. Here is one way. Generate *n* draws of *Y*: $Y_1, Y_2, \ldots, Y_n.$ Let $X_i = 1$ if $Y_i < 3.6,$ otherwise set $X_i = 0.$ Notice that $X_i$ is a Bernoulli random variable with $\mu_X = \Pr(X = 1) = \Pr(Y < 3.6).$ Compute $\bar{X}.$ Because $\bar{X}$ converges in probability to $\mu_X = \Pr(X = 1) = \Pr(Y < 3.6),$ $\bar{X}$ will be an accurate approximation if *n* is large.

17. $\mu_Y = 0.4$ and $\sigma_Y^2 = 0.4 \times 0.6 = 0.24.$

(a) (i) $\Pr(\bar{Y} \ge 0.43) = \Pr\left(\dfrac{\bar{Y} - 0.4}{\sqrt{0.24/n}} \ge \dfrac{0.43 - 0.4}{\sqrt{0.24/n}}\right) = \Pr\left(\dfrac{\bar{Y} - 0.4}{\sqrt{0.24/n}} \ge 0.6124\right) = 0.27$

(ii) $\Pr(\bar{Y} \le 0.37) = \Pr\left(\dfrac{\bar{Y} - 0.4}{\sqrt{0.24/n}} \le \dfrac{0.37 - 0.4}{\sqrt{0.24/n}}\right) = \Pr\left(\dfrac{\bar{Y} - 0.4}{\sqrt{0.24/n}} \le -1.22\right) = 0.11$

(b) We know $\Pr(-1.96 \le Z \le 1.96) = 0.95;$ thus we want *n* to satisfy $\dfrac{0.41 - 0.4}{\sqrt{0.24/n}} > 1.96$ and $\dfrac{0.39 - 0.4}{\sqrt{0.24/n}} < -1.96.$ Solving these inequalities yields $n \ge 9220.$

18. $\Pr(Y = \$0) = 0.95,$ $\Pr(Y = \$20{,}000) = 0.05.$

(a) The mean of *Y* is

$\mu_Y = 0 \times \Pr(Y = \$0) + 20{,}000 \times \Pr(Y = \$20{,}000) = \$1000.$

The variance of *Y* is

$\sigma_Y^2 = E\left[(Y - \mu_Y)^2\right] = (0 - 1000)^2 \times \Pr(Y = 0) + (20{,}000 - 1000)^2 \times \Pr(Y = 20{,}000)$
$= (-1000)^2 \times 0.95 + 19{,}000^2 \times 0.05 = 1.9 \times 10^7,$

so the standard deviation of *Y* is $\sigma_Y = (1.9 \times 10^7)^{1/2} = \$4359.$

(b) (i) $E(\bar{Y}) = \mu_Y = \$1000,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{1.9 \times 10^7}{100} = 1.9 \times 10^5.$

(ii) Using the central limit theorem,

$\Pr(\bar{Y} > 2000) = 1 - \Pr(\bar{Y} \le 2000) = 1 - \Pr\left(\dfrac{\bar{Y} - 1000}{\sqrt{1.9 \times 10^5}} \le \dfrac{2000 - 1000}{\sqrt{1.9 \times 10^5}}\right)$
$\approx 1 - \Phi(2.2942) = 1 - 0.9891 = 0.0109.$
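Part (a) is a two-point distribution, so its moments can be verified in a couple of lines:

```python
# Insurance payout Y: $0 with probability 0.95, $20,000 with probability 0.05.
probs = {0.0: 0.95, 20_000.0: 0.05}

mu = sum(y * p for y, p in probs.items())               # $1000
var = sum((y - mu) ** 2 * p for y, p in probs.items())  # 1.9e7
sd = var ** 0.5                                         # ~ $4359
```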


19. (a)

$\Pr(Y = y_j) = \sum_{i=1}^{l} \Pr(X = x_i, Y = y_j) = \sum_{i=1}^{l} \Pr(Y = y_j \mid X = x_i)\Pr(X = x_i)$

(b)

$E(Y) = \sum_{j=1}^{k} y_j \Pr(Y = y_j) = \sum_{j=1}^{k} y_j \sum_{i=1}^{l} \Pr(Y = y_j \mid X = x_i)\Pr(X = x_i)$
$= \sum_{i=1}^{l} \left[\sum_{j=1}^{k} y_j \Pr(Y = y_j \mid X = x_i)\right] \Pr(X = x_i)$
$= \sum_{i=1}^{l} E(Y \mid X = x_i)\Pr(X = x_i).$

(c) When *X* and *Y* are independent,

$\Pr(X = x_i, Y = y_j) = \Pr(X = x_i)\Pr(Y = y_j),$

so

$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]$
$= \sum_{i=1}^{l} \sum_{j=1}^{k} (x_i - \mu_X)(y_j - \mu_Y)\Pr(X = x_i, Y = y_j)$
$= \sum_{i=1}^{l} \sum_{j=1}^{k} (x_i - \mu_X)(y_j - \mu_Y)\Pr(X = x_i)\Pr(Y = y_j)$
$= \left[\sum_{i=1}^{l} (x_i - \mu_X)\Pr(X = x_i)\right]\left[\sum_{j=1}^{k} (y_j - \mu_Y)\Pr(Y = y_j)\right]$
$= [E(X) - \mu_X][E(Y) - \mu_Y] = 0 \times 0 = 0,$

$\mathrm{cor}(X, Y) = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y} = \dfrac{0}{\sigma_X \sigma_Y} = 0.$

20. (a)

$\Pr(Y = y_i) = \sum_{j=1}^{l} \sum_{h=1}^{m} \Pr(Y = y_i \mid X = x_j, Z = z_h)\Pr(X = x_j, Z = z_h)$

(b)

$E(Y) = \sum_{i=1}^{k} y_i \Pr(Y = y_i)$
$= \sum_{i=1}^{k} y_i \sum_{j=1}^{l} \sum_{h=1}^{m} \Pr(Y = y_i \mid X = x_j, Z = z_h)\Pr(X = x_j, Z = z_h)$
$= \sum_{j=1}^{l} \sum_{h=1}^{m} \left[\sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x_j, Z = z_h)\right] \Pr(X = x_j, Z = z_h)$
$= \sum_{j=1}^{l} \sum_{h=1}^{m} E(Y \mid X = x_j, Z = z_h)\Pr(X = x_j, Z = z_h),$

where the first line uses the definition of the mean, the second uses (a), the third is a rearrangement, and the final line uses the definition of the conditional expectation.

21. (a)

$E[(X - \mu)^3] = E[(X - \mu)^2(X - \mu)] = E[X^3 - 2X^2\mu + X\mu^2 - X^2\mu + 2X\mu^2 - \mu^3]$
$= E(X^3) - 3\mu E(X^2) + 3\mu^2 E(X) - \mu^3$
$= E(X^3) - 3E(X)E(X^2) + 3[E(X)]^2 E(X) - [E(X)]^3$
$= E(X^3) - 3E(X)E(X^2) + 2[E(X)]^3$

(b)

$E[(X - \mu)^4] = E[(X^3 - 3X^2\mu + 3X\mu^2 - \mu^3)(X - \mu)]$
$= E[X^4 - 3X^3\mu + 3X^2\mu^2 - X\mu^3 - X^3\mu + 3X^2\mu^2 - 3X\mu^3 + \mu^4]$
$= E(X^4) - 4E(X^3)\mu + 6E(X^2)\mu^2 - 4E(X)\mu^3 + \mu^4$
$= E(X^4) - 4[E(X)][E(X^3)] + 6[E(X)]^2[E(X^2)] - 3[E(X)]^4$

22. The mean and variance of *R* are given by

$\mu = w \times 0.08 + (1 - w) \times 0.05$

$\sigma^2 = w^2 \times 0.07^2 + (1 - w)^2 \times 0.04^2 + 2 \times w \times (1 - w) \times [0.07 \times 0.04 \times 0.25]$

where $0.07 \times 0.04 \times 0.25 = \mathrm{Cov}(R_s, R_b)$ follows from the definition of the correlation between $R_s$ and $R_b$.

(a) $\mu = 0.065;\ \sigma = 0.044$

(b) $\mu = 0.0725;\ \sigma = 0.056$

(c) $w = 1$ maximizes $\mu;$ $\sigma = 0.07$ for this value of *w*.
(d) The derivative of $\sigma^2$ with respect to *w* is

$\dfrac{d\sigma^2}{dw} = 2w \times 0.07^2 - 2(1 - w) \times 0.04^2 + (2 - 4w) \times [0.07 \times 0.04 \times 0.25]$
$= 0.0102w - 0.0018;$

solving for *w* yields $w = 18/102 = 0.18.$ (Notice that the second derivative is positive, so that this is the global minimum.) With $w = 0.18,$ $\sigma_R = 0.038.$
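The portfolio-variance algebra in part (d) can be double-checked numerically; `port_var` below is our implementation of the $\sigma^2$ formula above, and the minimizer matches $w = 18/102$:

```python
# Portfolio R = w*Rs + (1 - w)*Rb, with sd_s = 0.07, sd_b = 0.04, corr = 0.25.
sd_s, sd_b, rho = 0.07, 0.04, 0.25
cov_sb = rho * sd_s * sd_b  # 0.0007

def port_var(w):
    return w**2 * sd_s**2 + (1 - w)**2 * sd_b**2 + 2 * w * (1 - w) * cov_sb

w_min = 0.0018 / 0.0102     # ~ 0.18, from setting the derivative to zero
```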

23. *X* and *Z* are two independently distributed standard normal random variables, so

$\mu_X = \mu_Z = 0,\ \sigma_X^2 = \sigma_Z^2 = 1,\ \sigma_{XZ} = 0.$

(a) Because of the independence between *X* and *Z*, $\Pr(Z = z \mid X = x) = \Pr(Z = z),$ and $E(Z \mid X) = E(Z) = 0.$ Thus $E(Y \mid X) = E(X^2 + Z \mid X) = E(X^2 \mid X) + E(Z \mid X) = X^2 + 0 = X^2.$

(b) $E(X^2) = \sigma_X^2 + \mu_X^2 = 1,$ and $\mu_Y = E(X^2 + Z) = E(X^2) + \mu_Z = 1 + 0 = 1.$

(c) $E(XY) = E(X^3 + ZX) = E(X^3) + E(ZX).$ Using the fact that the odd moments of a standard normal random variable are all zero, we have $E(X^3) = 0.$ Using the independence between *X* and *Z*, we have $E(ZX) = \mu_Z \mu_X = 0.$ Thus $E(XY) = E(X^3) + E(ZX) = 0.$


(d)

$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[(X - 0)(Y - 1)] = E(XY - X) = E(XY) - E(X) = 0 - 0 = 0.$

$\mathrm{cor}(X, Y) = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y} = \dfrac{0}{\sigma_X \sigma_Y} = 0.$

24. (a) $E(Y_i^2) = \sigma^2 + \mu^2 = \sigma^2$ (since $\mu = 0$) and the result follows directly.

(b) $(Y_i/\sigma)$ is distributed i.i.d. $N(0, 1),$ $W/\sigma^2 = \sum_{i=1}^{n} (Y_i/\sigma)^2,$ and the result follows from the definition of a $\chi_n^2$ random variable.

(c) $E(W) = E\left[\sum_{i=1}^{n} Y_i^2\right] = \sigma^2 \sum_{i=1}^{n} E\left[\dfrac{Y_i^2}{\sigma^2}\right] = n\sigma^2.$

(d) Write

$V = \dfrac{Y_1}{\sqrt{\sum_{i=2}^{n} Y_i^2 / (n - 1)}} = \dfrac{Y_1/\sigma}{\sqrt{\sum_{i=2}^{n} (Y_i/\sigma)^2 / (n - 1)}}$

which follows from dividing the numerator and denominator by $\sigma.$ $Y_1/\sigma \sim N(0, 1),$ $\sum_{i=2}^{n} (Y_i/\sigma)^2 \sim \chi_{n-1}^2,$ and $Y_1/\sigma$ and $\sum_{i=2}^{n} (Y_i/\sigma)^2$ are independent. The result then follows from the definition of the *t* distribution.

**Chapter 3
Review of Statistics**

**Solutions to Exercises**

1. The central limit theorem suggests that when the sample size (*n*) is large, the distribution of the sample average ($\bar{Y}$) is approximately $N(\mu_Y, \sigma_{\bar{Y}}^2)$ with $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n}.$ Given a population $\mu_Y = 100,$ $\sigma_Y^2 = 43.0,$ we have

(a) $n = 100,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{100} = 0.43,$ and

$\Pr(\bar{Y} < 101) = \Pr\left(\dfrac{\bar{Y} - 100}{\sqrt{0.43}} < \dfrac{101 - 100}{\sqrt{0.43}}\right) \approx \Phi(1.525) = 0.9364.$

(b) $n = 64,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{64} = 0.6719,$ and

$\Pr(101 < \bar{Y} < 103) = \Pr\left(\dfrac{101 - 100}{\sqrt{0.6719}} < \dfrac{\bar{Y} - 100}{\sqrt{0.6719}} < \dfrac{103 - 100}{\sqrt{0.6719}}\right)$
$\approx \Phi(3.6599) - \Phi(1.2200) = 0.9999 - 0.8888 = 0.1111.$

(c) $n = 165,$ $\sigma_{\bar{Y}}^2 = \dfrac{\sigma_Y^2}{n} = \dfrac{43}{165} = 0.2606,$ and

$\Pr(\bar{Y} > 98) = 1 - \Pr(\bar{Y} \le 98) = 1 - \Pr\left(\dfrac{\bar{Y} - 100}{\sqrt{0.2606}} \le \dfrac{98 - 100}{\sqrt{0.2606}}\right)$
$\approx 1 - \Phi(-3.9178) = \Phi(3.9178) = 1.0000 \text{ (rounded to four decimal places).}$

2. Each random draw $Y_i$ from the Bernoulli distribution takes a value of either zero or one with probability $\Pr(Y_i = 1) = p$ and $\Pr(Y_i = 0) = 1 - p.$ The random variable $Y_i$ has mean

$E(Y_i) = 0 \times \Pr(Y_i = 0) + 1 \times \Pr(Y_i = 1) = p,$

and variance

$\mathrm{var}(Y_i) = E[(Y_i - \mu_Y)^2]$
$= (0 - p)^2 \times \Pr(Y_i = 0) + (1 - p)^2 \times \Pr(Y_i = 1)$
$= p^2(1 - p) + (1 - p)^2 p = p(1 - p).$


(a) The fraction of successes is

$\hat{p} = \dfrac{\#(\text{successes})}{n} = \dfrac{\#(Y_i = 1)}{n} = \dfrac{\sum_{i=1}^{n} Y_i}{n} = \bar{Y}.$

(b)

$E(\hat{p}) = E\left(\dfrac{\sum_{i=1}^{n} Y_i}{n}\right) = \dfrac{1}{n}\sum_{i=1}^{n} E(Y_i) = \dfrac{1}{n}\sum_{i=1}^{n} p = p.$

(c)

$\mathrm{var}(\hat{p}) = \mathrm{var}\left(\dfrac{\sum_{i=1}^{n} Y_i}{n}\right) = \dfrac{1}{n^2}\sum_{i=1}^{n} \mathrm{var}(Y_i) = \dfrac{1}{n^2}\sum_{i=1}^{n} p(1 - p) = \dfrac{p(1 - p)}{n}.$

The second equality uses the fact that $Y_1, \ldots, Y_n$ are i.i.d. draws and $\mathrm{cov}(Y_i, Y_j) = 0$ for $i \ne j.$

3. Denote each voter's preference by *Y*: $Y = 1$ if the voter prefers the incumbent and $Y = 0$ if the voter prefers the challenger. *Y* is a Bernoulli random variable with probability $\Pr(Y = 1) = p$ and $\Pr(Y = 0) = 1 - p.$ From the solution to Exercise 3.2, *Y* has mean *p* and variance $p(1 - p).$

(a) $\hat{p} = \dfrac{215}{400} = 0.5375.$

(b) $\widehat{\mathrm{var}}(\hat{p}) = \dfrac{\hat{p}(1 - \hat{p})}{n} = \dfrac{0.5375 \times (1 - 0.5375)}{400} = 6.2148 \times 10^{-4}.$ The standard error is $\mathrm{SE}(\hat{p}) = (\widehat{\mathrm{var}}(\hat{p}))^{1/2} = 0.0249.$

(c) The computed *t*-statistic is

$t^{act} = \dfrac{\hat{p} - \mu_{p,0}}{\mathrm{SE}(\hat{p})} = \dfrac{0.5375 - 0.5}{0.0249} = 1.506.$

Because of the large sample size ($n = 400$), we can use Equation (3.14) in the text to get the *p*-value for the test $H_0: p = 0.5$ vs. $H_1: p \ne 0.5:$

$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-1.506) = 2 \times 0.066 = 0.132.$

(d) Using Equation (3.17) in the text, the *p*-value for the test $H_0: p = 0.5$ vs. $H_1: p > 0.5$ is

$p\text{-value} = 1 - \Phi(t^{act}) = 1 - \Phi(1.506) = 1 - 0.934 = 0.066.$

(e) Part (c) is a two-sided test and the *p*-value is the area in the tails of the standard normal
distribution outside ± (calculated *t*-statistic). Part (d) is a one-sided test and the *p*-value is the area
under the standard normal distribution to the right of the calculated *t*-statistic.

(f) For the test $H_0: p = 0.5$ vs. $H_1: p > 0.5,$ we cannot reject the null hypothesis at the 5% significance level. The *p*-value 0.066 is larger than 0.05. Equivalently, the calculated *t*-statistic 1.506 is less than the critical value 1.645 for a one-sided test with a 5% significance level. The test suggests that the survey did not contain statistically significant evidence that the incumbent was ahead of the challenger at the time of the survey.
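The whole test in parts (a)-(d) can be replayed in a few lines of Python ($\Phi$ evaluated via the error function rather than the normal table; small differences from the printed values come from rounding SE to 0.0249):

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

n = 400
p_hat = 215 / n                     # 0.5375
se = sqrt(p_hat * (1 - p_hat) / n)  # ~ 0.0249
t = (p_hat - 0.5) / se              # ~ 1.50
p_two = 2 * phi(-abs(t))            # two-sided p-value, ~ 0.13
p_one = 1 - phi(t)                  # one-sided p-value, ~ 0.066
```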


4. Using Key Concept 3.7 in the text

(a) The 95% confidence interval for *p* is

$\hat{p} \pm 1.96\,\mathrm{SE}(\hat{p}) = 0.5375 \pm 1.96 \times 0.0249 = (0.4887, 0.5863).$

(b) The 99% confidence interval for *p* is

$\hat{p} \pm 2.57\,\mathrm{SE}(\hat{p}) = 0.5375 \pm 2.57 \times 0.0249 = (0.4735, 0.6015).$

(c) The interval in (b) is wider because of the larger critical value that comes with the lower significance level.

(d) Since 0.50 lies inside the 95% confidence interval for *p*, we cannot reject the null hypothesis at a 5% significance level.

5. (a) (i) The size is given by $\Pr(|\hat{p} - 0.5| > 0.02),$ where the probability is computed assuming that $p = 0.5.$

$\Pr(|\hat{p} - 0.5| > 0.02) = 1 - \Pr(-0.02 \le \hat{p} - 0.5 \le 0.02)$
$= 1 - \Pr\left(\dfrac{-0.02}{\sqrt{0.5 \times 0.5/1055}} \le \dfrac{\hat{p} - 0.5}{\sqrt{0.5 \times 0.5/1055}} \le \dfrac{0.02}{\sqrt{0.5 \times 0.5/1055}}\right)$
$= 1 - \Pr\left(-1.30 \le \dfrac{\hat{p} - 0.5}{\sqrt{0.5 \times 0.5/1055}} \le 1.30\right)$
$= 0.19,$

where the final equality uses the central limit theorem approximation.

(ii) The power is given by $\Pr(|\hat{p} - 0.5| > 0.02),$ where the probability is computed assuming that $p = 0.53.$

$\Pr(|\hat{p} - 0.5| > 0.02) = 1 - \Pr(-0.02 \le \hat{p} - 0.5 \le 0.02)$
$= 1 - \Pr\left(\dfrac{-0.02}{\sqrt{0.53 \times 0.47/1055}} \le \dfrac{\hat{p} - 0.5}{\sqrt{0.53 \times 0.47/1055}} \le \dfrac{0.02}{\sqrt{0.53 \times 0.47/1055}}\right)$
$= 1 - \Pr\left(\dfrac{-0.05}{\sqrt{0.53 \times 0.47/1055}} \le \dfrac{\hat{p} - 0.53}{\sqrt{0.53 \times 0.47/1055}} \le \dfrac{-0.01}{\sqrt{0.53 \times 0.47/1055}}\right)$
$= 1 - \Pr\left(-3.25 \le \dfrac{\hat{p} - 0.53}{\sqrt{0.53 \times 0.47/1055}} \le -0.65\right)$
$= 0.74,$

where the final equality uses the central limit theorem approximation.

(b) (i) $t = \dfrac{0.54 - 0.5}{\sqrt{0.54 \times 0.46/1055}} = 2.61;$ $\Pr(|t| > 2.61) = 0.01,$ so that the null is rejected at the 5% level.

(ii) $\Pr(t > 2.61) = 0.004,$ so that the null is rejected at the 5% level.

(iii) $0.54 \pm 1.96\sqrt{0.54 \times 0.46/1055} = 0.54 \pm 0.03,$ or 0.51 to 0.57.

(iv) $0.54 \pm 2.58\sqrt{0.54 \times 0.46/1055} = 0.54 \pm 0.04,$ or 0.50 to 0.58.

(v) $0.54 \pm 0.67\sqrt{0.54 \times 0.46/1055} = 0.54 \pm 0.01,$ or 0.53 to 0.55.

(c) (i) The probability is 0.95 in any single survey; there are 20 independent surveys, so the probability that all 20 intervals contain the true value is $0.95^{20} = 0.36.$

(ii) 95% of the 20 confidence intervals, or 19.


(d) The relevant equation is $1.96 \times \mathrm{SE}(\hat{p}) < 0.01,$ or $1.96 \times \sqrt{p(1 - p)/n} < 0.01.$ Thus *n* must be chosen so that

$n > \dfrac{1.96^2\,p(1 - p)}{0.01^2},$

so the answer depends on the value of *p*. Note that the largest value that $p(1 - p)$ can take on is 0.25 (that is, $p = 0.5$ makes $p(1 - p)$ as large as possible). Thus if

$n > \dfrac{1.96^2 \times 0.25}{0.01^2} = 9604,$

then the margin of error is less than 0.01 for all values of *p*.

6. (a) No. Because the *p*-value is less than 5%, $\mu = 5$ is rejected at the 5% level and is therefore not contained in the 95% confidence interval.

(b) No. This would require calculation of the *t*-statistic for $\mu = 5,$ which requires $\bar{Y}$ and $\mathrm{SE}(\bar{Y}).$ Only the *p*-value for $\mu = 5$ is given in the problem.

7. The null hypothesis is that the survey is a random draw from a population with $p = 0.11.$ The *t*-statistic is

$t = \dfrac{\hat{p} - 0.11}{\mathrm{SE}(\hat{p})},$

where $\mathrm{SE}(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/n}.$ (An alternative formula for $\mathrm{SE}(\hat{p})$ is $\sqrt{0.11 \times (1 - 0.11)/n},$ which is valid under the null hypothesis that $p = 0.11.$) The value of the *t*-statistic is $-2.71,$ which has a *p*-value that is less than 0.01. Thus the null hypothesis $p = 0.11$ (the survey is unbiased) can be rejected at the 1% level.

8. $1110 \pm 1.96\left(\dfrac{123}{\sqrt{1000}}\right)$ or $1110 \pm 7.62.$
9. Denote the life of a light bulb from the new process by *Y*. The mean of *Y* is $\mu$ and the standard deviation of *Y* is $\sigma_Y = 200$ hours. $\bar{Y}$ is the sample mean with a sample size $n = 100.$ The standard deviation of the sampling distribution of $\bar{Y}$ is $\sigma_{\bar{Y}} = \dfrac{\sigma_Y}{\sqrt{n}} = \dfrac{200}{\sqrt{100}} = 20$ hours. The hypothesis test is $H_0: \mu = 2000$ vs. $H_1: \mu > 2000.$ The manager will accept the alternative hypothesis if $\bar{Y} > 2100$ hours.
(a) The size of a test is the probability of erroneously rejecting a null hypothesis when it is valid. The size of the manager's test is

$\text{size} = \Pr(\bar{Y} > 2100 \mid \mu = 2000) = 1 - \Pr(\bar{Y} \le 2100 \mid \mu = 2000)$
$= 1 - \Pr\left(\dfrac{\bar{Y} - 2000}{20} \le \dfrac{2100 - 2000}{20} \,\Big|\, \mu = 2000\right)$
$= 1 - \Phi(5) = 1 - 0.999999713 = 2.87 \times 10^{-7}.$

$\Pr(\bar{Y} > 2100 \mid \mu = 2000)$ means the probability that the sample mean is greater than 2100 hours when the new process has a mean of 2000 hours.

(b) The power of a test is the probability of correctly rejecting a null hypothesis when it is invalid. We calculate first the probability of the manager erroneously accepting the null hypothesis when it is invalid:

$\beta = \Pr(\bar{Y} \le 2100 \mid \mu = 2150) = \Pr\left(\dfrac{\bar{Y} - 2150}{20} \le \dfrac{2100 - 2150}{20} \,\Big|\, \mu = 2150\right)$
$= \Phi(-2.5) = 1 - \Phi(2.5) = 1 - 0.9938 = 0.0062.$

The power of the manager's test is $1 - \beta = 1 - 0.0062 = 0.9938.$


(c) For a test with size 5%, the rejection region for the null hypothesis contains those values of the *t*-statistic exceeding 1.645.

$t^{act} = \dfrac{\bar{Y}^{act} - 2000}{20} > 1.645 \Rightarrow \bar{Y}^{act} > 2000 + 1.645 \times 20 = 2032.9.$

The manager should believe the inventor's claim if the sample mean life of the new product is greater than 2032.9 hours if she wants the size of the test to be 5%.

10. (a) New Jersey sample size $n_1 = 100,$ sample average $\bar{Y}_1 = 58,$ sample standard deviation $s_1 = 8.$ The standard error of $\bar{Y}_1$ is $\mathrm{SE}(\bar{Y}_1) = \dfrac{s_1}{\sqrt{n_1}} = \dfrac{8}{\sqrt{100}} = 0.8.$ The 95% confidence interval for the mean score of all New Jersey third graders is

$\mu_1 = \bar{Y}_1 \pm 1.96\,\mathrm{SE}(\bar{Y}_1) = 58 \pm 1.96 \times 0.8 = (56.432, 59.568).$

(b) Iowa sample size $n_2 = 200,$ sample average $\bar{Y}_2 = 62,$ sample standard deviation $s_2 = 11.$ The standard error of $\bar{Y}_1 - \bar{Y}_2$ is $\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{64}{100} + \dfrac{121}{200}} = 1.1158.$ The 90% confidence interval for the difference in mean score between the two states is

$\mu_1 - \mu_2 = (\bar{Y}_1 - \bar{Y}_2) \pm 1.64\,\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2) = (58 - 62) \pm 1.64 \times 1.1158 = (-5.8299, -2.1701).$

(c) The hypothesis test for the difference in mean scores is

$H_0: \mu_1 - \mu_2 = 0 \text{ vs. } H_1: \mu_1 - \mu_2 \ne 0.$

From part (b) the standard error of the difference in the two sample means is $\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2) = 1.1158.$ The *t*-statistic for testing the null hypothesis is

$t^{act} = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2)} = \dfrac{58 - 62}{1.1158} = -3.5849.$

Use Equation (3.14) in the text to compute the *p*-value:

$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-3.5849) = 2 \times 0.00017 = 0.00034.$

Because of the extremely low *p*-value, we can reject the null hypothesis with a very high degree of confidence. That is, the population means for Iowa and New Jersey students are different.

11. Assume that *n* is an even number. Then $\tilde{Y}$ is constructed by applying a weight of $\tfrac{1}{2}$ to the $\tfrac{n}{2}$ "odd" observations and a weight of $\tfrac{3}{2}$ to the remaining $\tfrac{n}{2}$ observations.

$E(\tilde{Y}) = \dfrac{1}{n}\left[\dfrac{1}{2}E(Y_1) + \dfrac{3}{2}E(Y_2) + \cdots + \dfrac{1}{2}E(Y_{n-1}) + \dfrac{3}{2}E(Y_n)\right]$
$= \dfrac{1}{n}\left[\dfrac{1}{2} \cdot \dfrac{n}{2} \cdot \mu_Y + \dfrac{3}{2} \cdot \dfrac{n}{2} \cdot \mu_Y\right] = \mu_Y,$

$\mathrm{var}(\tilde{Y}) = \dfrac{1}{n^2}\left[\dfrac{1}{4}\mathrm{var}(Y_1) + \dfrac{9}{4}\mathrm{var}(Y_2) + \cdots + \dfrac{1}{4}\mathrm{var}(Y_{n-1}) + \dfrac{9}{4}\mathrm{var}(Y_n)\right]$
$= \dfrac{1}{n^2}\left[\dfrac{1}{4} \cdot \dfrac{n}{2} \cdot \sigma_Y^2 + \dfrac{9}{4} \cdot \dfrac{n}{2} \cdot \sigma_Y^2\right] = 1.25\,\dfrac{\sigma_Y^2}{n}.$


12. Sample size for men $n_1 = 100,$ sample average $\bar{Y}_1 = 3100,$ sample standard deviation $s_1 = 200.$ Sample size for women $n_2 = 64,$ sample average $\bar{Y}_2 = 2900,$ sample standard deviation $s_2 = 320.$ The standard error of $\bar{Y}_1 - \bar{Y}_2$ is $\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{200^2}{100} + \dfrac{320^2}{64}} = 44.721.$

(a) The hypothesis test for the difference in mean monthly salaries is

$H_0: \mu_1 - \mu_2 = 0 \text{ vs. } H_1: \mu_1 - \mu_2 \ne 0.$

The *t*-statistic for testing the null hypothesis is

$t^{act} = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2)} = \dfrac{3100 - 2900}{44.721} = 4.4722.$

Use Equation (3.14) in the text to get the *p*-value:

$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-4.4722) = 2 \times (3.8744 \times 10^{-6}) = 7.7488 \times 10^{-6}.$

The extremely low *p*-value implies that the difference in the monthly salaries for men and women is statistically significant. We can reject the null hypothesis with a high degree of confidence.

(b) From part (a), there is overwhelming statistical evidence that mean earnings for men differ from mean earnings for women. To examine whether there is gender discrimination in the compensation policies, we take the following one-sided alternative test

$H_0: \mu_1 - \mu_2 = 0 \text{ vs. } H_1: \mu_1 - \mu_2 > 0.$

With the *t*-statistic $t^{act} = 4.4722,$ the *p*-value for the one-sided test is

$p\text{-value} = 1 - \Phi(t^{act}) = 1 - \Phi(4.4722) = 1 - 0.999996126 = 3.874 \times 10^{-6}.$

With the extremely small *p*-value, the null hypothesis can be rejected with a high degree of
confidence. There is overwhelming statistical evidence that mean earnings for men are greater
than mean earnings for women. However, by itself, this does not imply gender discrimination by
the firm. Gender discrimination means that two workers, identical in every way but gender, are
paid different wages. The data description suggests that some care has been taken to make sure
that workers with similar jobs are being compared. But, it is also important to control for
characteristics of the workers that may affect their productivity (education, years of experience,
etc.). If these characteristics are systematically different between men and women, then they may
be responsible for the difference in mean wages. (If this is true, it raises an interesting and
important question of why women tend to have less education or less experience than men, but
that is a question about something other than gender discrimination by this firm.) Since these
characteristics are not controlled for in the statistical analysis, it is premature to reach a
conclusion about gender discrimination.

13. (a) Sample size $n = 420,$ sample average $\bar{Y} = 654.2,$ sample standard deviation $s_Y = 19.5.$ The standard error of $\bar{Y}$ is $\mathrm{SE}(\bar{Y}) = \dfrac{s_Y}{\sqrt{n}} = \dfrac{19.5}{\sqrt{420}} = 0.9515.$ The 95% confidence interval for the mean test score in the population is

$\mu = \bar{Y} \pm 1.96\,\mathrm{SE}(\bar{Y}) = 654.2 \pm 1.96 \times 0.9515 = (652.34, 656.06).$


(b) The data are: sample size for small classes $n_1 = 238$, sample average $\bar{Y}_1 = 657.4$, sample
standard deviation $s_1 = 19.4$; sample size for large classes $n_2 = 182$, sample average $\bar{Y}_2 = 650.0$,
sample standard deviation $s_2 = 17.9$. The standard error of $\bar{Y}_1 - \bar{Y}_2$ is

$SE(\bar{Y}_1 - \bar{Y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{19.4^2}{238} + \frac{17.9^2}{182}} = 1.8281.$

The hypothesis test for higher average scores in smaller classes is

$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 > 0.$

The *t*-statistic is

$t^{act} = \frac{\bar{Y}_1 - \bar{Y}_2}{SE(\bar{Y}_1 - \bar{Y}_2)} = \frac{657.4 - 650.0}{1.8281} = 4.0479.$

The *p*-value for the one-sided test is

$p\text{-value} = 1 - \Phi(t^{act}) = 1 - \Phi(4.0479) = 1 - 0.999974147 = 2.5853 \times 10^{-5}.$

With the small *p*-value, the null hypothesis can be rejected with a high degree of confidence.
There is statistically significant evidence that the districts with smaller classes have higher
average test scores.
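The standard error, *t*-statistic, and one-sided *p*-value in part (b) can be reproduced with a short stdlib-only sketch (again using `math.erf` for $\Phi$; the numbers are the summary statistics from the text):

```python
import math

n1, ybar1, s1 = 238, 657.4, 19.4   # small classes
n2, ybar2, s2 = 182, 650.0, 17.9   # large classes

se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)          # about 1.828
t_act = (ybar1 - ybar2) / se_diff                     # about 4.05
p_value = 1.0 - 0.5 * (1.0 + math.erf(t_act / math.sqrt(2.0)))  # one-sided
```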

14. We have the following relations: 1 in = 0.0254 m (or 1 m = 39.37 in) and 1 lb = 0.4536 kg
(or 1 kg = 2.2046 lb). The summary statistics in the metric system are $\bar{X} = 70.5 \times 0.0254 = 1.79$ m;
$\bar{Y} = 158 \times 0.4536 = 71.669$ kg; $s_X = 1.8 \times 0.0254 = 0.0457$ m; $s_Y = 14.2 \times 0.4536 = 6.4411$ kg;
$s_{XY} = 21.73 \times 0.0254 \times 0.4536 = 0.2504$ m $\times$ kg; and $r_{XY} = 0.85$.
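The conversions in 14 are linear, so means and standard deviations scale by the conversion factor, the covariance scales by the product of the two factors, and the correlation does not change at all. A quick sketch:

```python
M_PER_IN, KG_PER_LB = 0.0254, 0.4536

xbar_m = 70.5 * M_PER_IN                     # mean height, about 1.79 m
ybar_kg = 158.0 * KG_PER_LB                  # mean weight, about 71.67 kg
sx_m = 1.8 * M_PER_IN                        # about 0.0457 m
sy_kg = 14.2 * KG_PER_LB                     # about 6.44 kg
sxy_metric = 21.73 * M_PER_IN * KG_PER_LB    # about 0.2504 m x kg

r_original = 21.73 / (1.8 * 14.2)            # correlation in inch-pound units
r_metric = sxy_metric / (sx_m * sy_kg)       # identical: correlation is unit-free
```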

15. Let *p *denote the fraction of the population that preferred Bush.

(a) $\hat{p} = 405/755 = 0.536$; $SE(\hat{p}) = 0.0181$; the 95% confidence interval is $\hat{p} \pm 1.96\,SE(\hat{p})$, or $0.536 \pm 0.036$.

(b) $\hat{p} = 378/756 = 0.500$; $SE(\hat{p}) = 0.0182$; the 95% confidence interval is $\hat{p} \pm 1.96\,SE(\hat{p})$, or $0.500 \pm 0.036$.

(c) $\hat{p}_{Sep} - \hat{p}_{Oct} = 0.036$; $SE(\hat{p}_{Sep} - \hat{p}_{Oct}) = \sqrt{\frac{0.536(1-0.536)}{755} + \frac{0.5(1-0.5)}{756}} = 0.0257$ (because the surveys are
independent). The 95% confidence interval for the change in *p* is $(\hat{p}_{Sep} - \hat{p}_{Oct}) \pm 1.96\,SE(\hat{p}_{Sep} - \hat{p}_{Oct})$, or
$0.036 \pm 0.050$. The confidence interval includes $p_{Sep} - p_{Oct} = 0$, so there is not statistically
significant evidence of a change in voters’ preferences.

16. (a) The 95% confidence interval is $\bar{Y} \pm 1.96\,SE(\bar{Y})$, or $1013 \pm 1.96 \times \frac{108}{\sqrt{453}}$, or $1013 \pm 9.95$.

(b) The confidence interval in (a) does not include $\mu = 1000$, so the null hypothesis that $\mu = 1000$ (Florida students have the same average performance as students in the U.S.) can be rejected at the 5% level.

(c) (i) The 95% confidence interval is $(\bar{Y}_{prep} - \bar{Y}_{NonPrep}) \pm 1.96\,SE(\bar{Y}_{prep} - \bar{Y}_{NonPrep})$, where

$SE(\bar{Y}_{prep} - \bar{Y}_{NonPrep}) = \sqrt{\frac{s_{prep}^2}{n_{prep}} + \frac{s_{nonprep}^2}{n_{nonprep}}} = \sqrt{\frac{95^2}{503} + \frac{108^2}{453}} = 6.61;$

the 95% confidence interval is $(1019 - 1013) \pm 12.96$, or $6 \pm 12.96$.
(ii) No. The 95% confidence interval includes $\mu_{prep} - \mu_{nonprep} = 0$.


(d) (i) Let *X* denote the change in the test score. The 95% confidence interval for $\mu_X$ is
$\bar{X} \pm 1.96\,SE(\bar{X})$, where $SE(\bar{X}) = \frac{60}{\sqrt{453}} = 2.82$; thus, the confidence interval is $9 \pm 5.52$.

(ii) Yes. The 95% confidence interval does not include $\mu_X = 0$.
(iii) Randomly select *n* students who have taken the test only one time. Randomly select one half
of these students and have them take the prep course. Administer the test again to all of the *n*
students. Compare the gain in performance of the prep-course second-time test takers to that of
the non-prep-course second-time test takers.
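The interval half-widths used throughout 16 reduce to two formulas, $1.96\,s/\sqrt{n}$ for one sample and $1.96\sqrt{s_1^2/n_1 + s_2^2/n_2}$ for two; a quick numeric check of parts (a), (c)(i), and (d)(i):

```python
import math

# (a): one-sample 95% CI half-width
half_a = 1.96 * 108 / math.sqrt(453)             # about 9.95

# (c)(i): SE and half-width for the prep vs. non-prep comparison
se_c = math.sqrt(95**2 / 503 + 108**2 / 453)     # about 6.61
half_c = 1.96 * se_c                             # about 12.96

# (d)(i): half-width for the change in test score
half_d = 1.96 * 60 / math.sqrt(453)              # about 5.52
```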

17. (a) The 95% confidence interval is $(\bar{Y}_{m,2004} - \bar{Y}_{m,1992}) \pm 1.96\,SE(\bar{Y}_{m,2004} - \bar{Y}_{m,1992})$, where

$SE(\bar{Y}_{m,2004} - \bar{Y}_{m,1992}) = \sqrt{\frac{s_{m,2004}^2}{n_{m,2004}} + \frac{s_{m,1992}^2}{n_{m,1992}}} = \sqrt{\frac{10.39^2}{1901} + \frac{8.70^2}{1592}} = 0.32;$

the 95% confidence interval is $(21.99 - 20.33) \pm 0.63$, or $1.66 \pm 0.63$.
(b) The 95% confidence interval is $(\bar{Y}_{w,2004} - \bar{Y}_{w,1992}) \pm 1.96\,SE(\bar{Y}_{w,2004} - \bar{Y}_{w,1992})$, where

$SE(\bar{Y}_{w,2004} - \bar{Y}_{w,1992}) = \sqrt{\frac{s_{w,2004}^2}{n_{w,2004}} + \frac{s_{w,1992}^2}{n_{w,1992}}} = \sqrt{\frac{8.16^2}{1739} + \frac{6.90^2}{1370}} = 0.27;$

the 95% confidence interval is $(18.47 - 17.60) \pm 0.53$, or $0.87 \pm 0.53$.

(c) The 95% confidence interval is

$(\bar{Y}_{m,2004} - \bar{Y}_{m,1992}) - (\bar{Y}_{w,2004} - \bar{Y}_{w,1992}) \pm 1.96\,SE[(\bar{Y}_{m,2004} - \bar{Y}_{m,1992}) - (\bar{Y}_{w,2004} - \bar{Y}_{w,1992})],$

where

$SE[(\bar{Y}_{m,2004} - \bar{Y}_{m,1992}) - (\bar{Y}_{w,2004} - \bar{Y}_{w,1992})] = \sqrt{\frac{10.39^2}{1901} + \frac{8.70^2}{1592} + \frac{8.16^2}{1739} + \frac{6.90^2}{1370}} = 0.42.$

The 95% confidence interval is $(21.99 - 20.33) - (18.47 - 17.60) \pm 1.96 \times 0.42$, or $0.79 \pm 0.82$.
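Because all four samples in (c) are independent, the variance of the difference-in-differences is just the sum of the four variance terms; a numeric check of the three standard errors:

```python
import math

se_m = math.sqrt(10.39**2 / 1901 + 8.70**2 / 1592)     # men, about 0.32
se_w = math.sqrt(8.16**2 / 1739 + 6.90**2 / 1370)      # women, about 0.27
se_dd = math.sqrt(10.39**2 / 1901 + 8.70**2 / 1592
                  + 8.16**2 / 1739 + 6.90**2 / 1370)   # difference, about 0.42
gap_change = (21.99 - 20.33) - (18.47 - 17.60)         # about 0.79
```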

18. $Y_1, \ldots, Y_n$ are i.i.d. with mean $\mu_Y$ and variance $\sigma_Y^2$. The covariance $\mathrm{cov}(Y_j, Y_i) = 0$ for $j \neq i$. The
sampling distribution of the sample average $\bar{Y}$ has mean $\mu_Y$ and variance $\mathrm{var}(\bar{Y}) = \sigma_{\bar{Y}}^2 = \frac{\sigma_Y^2}{n}$.

(a)

$$\begin{aligned}
E[(Y_i - \bar{Y})^2] &= E\{[(Y_i - \mu_Y) - (\bar{Y} - \mu_Y)]^2\} \\
&= E[(Y_i - \mu_Y)^2 - 2(Y_i - \mu_Y)(\bar{Y} - \mu_Y) + (\bar{Y} - \mu_Y)^2] \\
&= E[(Y_i - \mu_Y)^2] - 2E[(Y_i - \mu_Y)(\bar{Y} - \mu_Y)] + E[(\bar{Y} - \mu_Y)^2] \\
&= \mathrm{var}(Y_i) - 2\,\mathrm{cov}(Y_i, \bar{Y}) + \mathrm{var}(\bar{Y}).
\end{aligned}$$


(b)

$$\begin{aligned}
\mathrm{cov}(\bar{Y}, Y_i) &= E[(\bar{Y} - \mu_Y)(Y_i - \mu_Y)] \\
&= E\left[\left(\frac{1}{n}\sum_{j=1}^{n} Y_j - \mu_Y\right)(Y_i - \mu_Y)\right] \\
&= E\left[\frac{1}{n}\sum_{j=1}^{n} (Y_j - \mu_Y)(Y_i - \mu_Y)\right] \\
&= \frac{1}{n} E[(Y_i - \mu_Y)^2] + \frac{1}{n}\sum_{j \neq i} E[(Y_j - \mu_Y)(Y_i - \mu_Y)] \\
&= \frac{\sigma_Y^2}{n} + \frac{1}{n}\sum_{j \neq i} \mathrm{cov}(Y_j, Y_i) \\
&= \frac{\sigma_Y^2}{n}.
\end{aligned}$$

(c)

$$\begin{aligned}
E(s_Y^2) &= E\left[\frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2\right] \\
&= \frac{1}{n-1}\sum_{i=1}^{n} E[(Y_i - \bar{Y})^2] \\
&= \frac{1}{n-1}\sum_{i=1}^{n} [\mathrm{var}(Y_i) - 2\,\mathrm{cov}(Y_i, \bar{Y}) + \mathrm{var}(\bar{Y})] \\
&= \frac{1}{n-1}\left[n\sigma_Y^2 - 2n \times \frac{\sigma_Y^2}{n} + n \times \frac{\sigma_Y^2}{n}\right] \\
&= \frac{n-1}{n-1}\,\sigma_Y^2 \\
&= \sigma_Y^2.
\end{aligned}$$
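As a sanity check (not part of the original solution), the unbiasedness result in (c) can be illustrated by simulation: averaging $s_Y^2$ over many samples should land near $\sigma_Y^2$. The distribution, sample size, and seed below are arbitrary choices:

```python
import random

random.seed(0)
n, reps = 5, 100_000
mu, sigma = 10.0, 2.0                      # so sigma^2 = 4
sum_s2 = 0.0
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(y) / n
    # divisor n - 1, as in the definition of the sample variance
    sum_s2 += sum((yi - ybar) ** 2 for yi in y) / (n - 1)
mean_s2 = sum_s2 / reps                    # close to sigma^2 = 4.0
```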

19. (a) No. $E(Y_i^2) = \sigma_Y^2 + \mu_Y^2$ and $E(Y_iY_j) = \mu_Y^2$ for $i \neq j$. Thus

$$E(\bar{Y}^2) = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right)^2\right] = \frac{1}{n^2}\sum_{i=1}^{n} E(Y_i^2) + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j \neq i} E(Y_iY_j) = \frac{\sigma_Y^2}{n} + \mu_Y^2.$$

(b) Yes. If *Y *gets arbitrarily close to μ*Y* with probability approaching 1 as *n *gets large, then 2*Y *gets
arbitrarily close to 2*Y*μ with probability approaching 1 as *n *gets large. (As it turns out, this is an
example of the “continuous mapping theorem” discussed in Chapter 17.)
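The claim in 19, that $\bar{Y}^2$ is biased for $\mu_Y^2$ by exactly $\sigma_Y^2/n$, can be illustrated with a small Monte Carlo; the values of $\mu$, $\sigma$, $n$, and the seed are arbitrary:

```python
import random

random.seed(1)
mu, sigma, n, reps = 2.0, 3.0, 10, 100_000
total = 0.0
for _ in range(reps):
    ybar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    total += ybar ** 2
mean_ybar2 = total / reps                  # simulated E(Ybar^2)
theory = sigma**2 / n + mu**2              # 0.9 + 4.0 = 4.9, not mu^2 = 4.0
```

As *n* grows, the bias term $\sigma^2/n$ shrinks to zero, which is the consistency claim in (b).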


20. Using analysis like that in equation (3.29),

$$s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \frac{n}{n-1}\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu_X)(Y_i - \mu_Y)\right] - \frac{n}{n-1}(\bar{X} - \mu_X)(\bar{Y} - \mu_Y).$$

Because $\bar{X} \xrightarrow{p} \mu_X$ and $\bar{Y} \xrightarrow{p} \mu_Y$, the final term converges in probability to zero.

Let $W_i = (X_i - \mu_X)(Y_i - \mu_Y)$. Note that $W_i$ is i.i.d. with mean $\sigma_{XY}$ and second moment
$E[(X_i - \mu_X)^2(Y_i - \mu_Y)^2]$. But

$$E[(X_i - \mu_X)^2(Y_i - \mu_Y)^2] \leq \sqrt{E[(X_i - \mu_X)^4]\,E[(Y_i - \mu_Y)^4]}$$

from the Cauchy–Schwarz inequality. Because *X* and *Y* have finite fourth moments, the second moment of $W_i$ is finite,
so that it has finite variance. Thus $\frac{1}{n}\sum_{i=1}^{n} W_i \xrightarrow{p} E(W_i) = \sigma_{XY}$. Thus $s_{XY} \xrightarrow{p} \sigma_{XY}$ (because the term
$\frac{n}{n-1} \to 1$).
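The consistency of $s_{XY}$ can be illustrated by simulation; constructing $Y = X + \varepsilon$ with $X$ and $\varepsilon$ independent standard normals makes the population covariance equal to 1 by construction (sample size and seed are arbitrary):

```python
import random

random.seed(2)
n = 100_000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(x + random.gauss(0.0, 1.0))   # cov(X, Y) = var(X) = 1

xbar, ybar = sum(xs) / n, sum(ys) / n
s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
# s_xy should be close to the population covariance of 1
```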

21. Set $n_m = n_w = n$, and use equation (3.19) to write the squared SE of $\bar{Y}_m - \bar{Y}_w$ as

$$[SE(\bar{Y}_m - \bar{Y}_w)]^2 = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (Y_{mi} - \bar{Y}_m)^2}{n} + \frac{\frac{1}{n-1}\sum_{i=1}^{n} (Y_{wi} - \bar{Y}_w)^2}{n} = \frac{\sum_{i=1}^{n} (Y_{mi} - \bar{Y}_m)^2 + \sum_{i=1}^{n} (Y_{wi} - \bar{Y}_w)^2}{n(n-1)}.$$

Similarly, using equation (3.23),

$$[SE_{pooled}(\bar{Y}_m - \bar{Y}_w)]^2 = \frac{2}{n} \times \frac{1}{2(n-1)}\left[\sum_{i=1}^{n} (Y_{mi} - \bar{Y}_m)^2 + \sum_{i=1}^{n} (Y_{wi} - \bar{Y}_w)^2\right] = \frac{\sum_{i=1}^{n} (Y_{mi} - \bar{Y}_m)^2 + \sum_{i=1}^{n} (Y_{wi} - \bar{Y}_w)^2}{n(n-1)},$$

so the two standard errors coincide when the group sizes are equal.
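The equality in 21 holds for any data with equal group sizes, not just in expectation, so it can be checked exactly on arbitrary simulated groups:

```python
import math
import random

random.seed(3)
n = 50
men = [random.gauss(20.0, 5.0) for _ in range(n)]     # arbitrary test data
women = [random.gauss(18.0, 4.0) for _ in range(n)]

def ssd(y):
    """Sum of squared deviations about the sample mean."""
    ybar = sum(y) / len(y)
    return sum((yi - ybar) ** 2 for yi in y)

# unpooled: s_m^2/n + s_w^2/n
se_sq = ssd(men) / ((n - 1) * n) + ssd(women) / ((n - 1) * n)
# pooled: s_pooled^2 * (1/n + 1/n)
s2_pooled = (ssd(men) + ssd(women)) / (2 * (n - 1))
se_sq_pooled = s2_pooled * (1 / n + 1 / n)
# the two squared SEs agree (up to floating-point rounding)
```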

**Chapter 4
Linear Regression with One Regressor
**

** Solutions to Exercises
**1. (a) The predicted average test score is

$\widehat{TestScore} = 520.4 - 5.82 \times 22 = 392.36.$

(b) The predicted change in the classroom average test score is

$\Delta\widehat{TestScore} = (-5.82 \times 19) - (-5.82 \times 23) = 23.28.$

(c) Using the formula for $\hat{\beta}_0$ in Equation (4.8), we know the sample average of the test scores across
the 100 classrooms is

$\overline{TestScore} = \hat{\beta}_0 + \hat{\beta}_1 \times \overline{CS} = 520.4 - 5.82 \times 21.4 = 395.85.$

(d) Use the formula for the standard error of the regression (SER) in Equation (4.19) to get the sum of squared residuals:

$SSR = (n - 2)\,SER^2 = (100 - 2) \times 11.5^2 = 12961.$

Use the formula for $R^2$ in Equation (4.16) to get the total sum of squares:

$TSS = \frac{SSR}{1 - R^2} = \frac{12961}{1 - 0.08} = 14088.$

The sample variance is $s_Y^2 = \frac{TSS}{n-1} = \frac{14088}{99} = 142.3$. Thus, the standard deviation is
$s_Y = \sqrt{s_Y^2} = 11.9$.
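The arithmetic in (d) follows mechanically from the givens SER = 11.5 and $R^2 = 0.08$, recomputing TSS via $TSS = SSR/(1 - R^2)$; a quick sketch:

```python
import math

n, ser, r2 = 100, 11.5, 0.08
ssr = (n - 2) * ser**2        # sum of squared residuals, about 12961
tss = ssr / (1 - r2)          # total sum of squares, about 14088
s2_y = tss / (n - 1)          # sample variance of Y, about 142.3
s_y = math.sqrt(s2_y)         # sample standard deviation, about 11.9
```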

2. The sample size is $n = 200$. The estimated regression equation is (standard errors in parentheses)

$$\widehat{Weight} = \underset{(2.15)}{-99.41} + \underset{(0.31)}{3.94}\,Height, \qquad R^2 = 0.81, \quad SER = 10.2.$$

(a) Substituting $Height = 70$, 65, and 74 inches into the equation, the predicted weights are 176.39,
156.69, and 192.15 pounds.

(b) $\Delta\widehat{Weight} = 3.94 \times \Delta Height = 3.94 \times 1.5 = 5.91$ pounds.
(c) We have the following relations: 1 in = 2.54 cm and 1 lb = 0.4536 kg. Suppose the regression
equation in the centimeter–kilogram space is

$\widehat{Weight} = \hat{\gamma}_0 + \hat{\gamma}_1\,Height.$

The coefficients are $\hat{\gamma}_0 = -99.41 \times 0.4536 = -45.092$ kg and $\hat{\gamma}_1 = 3.94 \times \frac{0.4536}{2.54} = 0.7036$ kg per cm. The
$R^2$ is unit free, so it remains at $R^2 = 0.81$. The standard error of the regression is

$SER = 10.2 \times 0.4536 = 4.6267$ kg.
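The conversion rule in (c) generalizes: the intercept and SER scale with the units of the dependent variable, while the slope scales by the ratio of the two conversion factors. A quick numeric check:

```python
CM_PER_IN, KG_PER_LB = 2.54, 0.4536

b0, b1, ser = -99.41, 3.94, 10.2           # estimates in pound/inch units
g0 = b0 * KG_PER_LB                        # intercept in kg, about -45.09
g1 = b1 * KG_PER_LB / CM_PER_IN            # slope in kg per cm, about 0.7036
ser_kg = ser * KG_PER_LB                   # SER in kg, about 4.6267
```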


3. (a) The coefficient 9.6 shows the marginal effect of *Age *on *AWE*; that is, *AWE *is expected to
increase by $9.6 for each additional year of age. 696.7 is the intercept of the regression line. It
determines the overall level of the line.

(b) *SER* is in the same units as the dependent variable (*Y*, or *AWE* in this example). Thus *SER* is
measured in dollars per week.

(c) *R*2 is unit free.
(d) (i) $696.7 + 9.6 \times 25 = \$936.7$;
(ii) $696.7 + 9.6 \times 45 = \$1{,}128.7$.
(e) No. The oldest worker in the sample is 65 years old. 99 years is far outside the range of the

sample data.
(f) No. The distribution of earnings is positively skewed and has kurtosis larger than that of the normal distribution.
(g) $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$, so that $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1\bar{X}$. Thus the sample mean of *AWE* is $696.7 + 9.6 \times 41.6 = \$1{,}096.06$.

4. (a) $(R - R_f) = \beta(R_m - R_f) + u$, so that

$\mathrm{var}(R - R_f) = \beta^2 \times \mathrm{var}(R_m - R_f) + \mathrm{var}(u) + 2\beta \times \mathrm{cov}(u, R_m - R_f).$

But $\mathrm{cov}(u, R_m - R_f) = 0$, thus $\mathrm{var}(R - R_f) = \beta^2 \times \mathrm{var}(R_m - R_f) + \mathrm{var}(u)$. With $\beta > 1$, $\mathrm{var}(R - R_f) >
\mathrm{var}(R_m - R_f)$ follows because $\mathrm{var}(u) \geq 0$.

(b) Yes. Using the expression in (a),

$\mathrm{var}(R - R_f) - \mathrm{var}(R_m - R_f) = (\beta^2 - 1) \times \mathrm{var}(R_m - R_f) + \mathrm{var}(u),$

which will be positive if $\mathrm{var}(u) > (1 - \beta^2) \times \mathrm{var}(R_m - R_f)$.

(c) $R_m - R_f = 7.3\% - 3.5\% = 3.8\%$. Thus, the predicted returns are

$\hat{R} = R_f + \hat{\beta}(R_m - R_f) = 3.5\% + \hat{\beta} \times 3.8\%:$

Kellogg: 3.5% + 0.24 × 3.8% = 4.4%
Waste Management: 3.5% + 0.38 × 3.8% = 4.9%
Sprint: 3.5% + 0.59 × 3.8% = 5.7%
Walmart: 3.5% + 0.89 × 3.8% = 6.9%
Barnes and Noble: 3.5% + 1.03 × 3.8% = 7.4%
Best Buy: 3.5% + 1.80 × 3.8% = 10.3%
Microsoft: 3.5% + 1.83 × 3.8% = 10.5%
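The whole table follows from one formula, $\hat{R} = R_f + \hat{\beta}(R_m - R_f)$, so it can be generated in a loop (values in percent, betas as given in the exercise):

```python
rf, market_premium = 3.5, 7.3 - 3.5        # percent; premium = 3.8
betas = {
    "Kellogg": 0.24, "Waste Management": 0.38, "Sprint": 0.59,
    "Walmart": 0.89, "Barnes and Noble": 1.03, "Best Buy": 1.80,
    "Microsoft": 1.83,
}
# CAPM predicted return for each firm: R = Rf + beta * (Rm - Rf)
predicted = {firm: rf + b * market_premium for firm, b in betas.items()}
```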

5. (a) $u_i$ represents factors other than time that influence the student’s performance on the exam,
including amount of time studying, aptitude for the material, and so forth. Some students will
have studied more than average, others less; some students will have higher than average aptitude
for the subject, others lower, and so forth.

(b) Because of random assignment, $u_i$ is independent of $X_i$. Since $u_i$ represents deviations from
the average, $E(u_i) = 0$. Because *u* and *X* are independent, $E(u_i|X_i) = E(u_i) = 0$.

(c) (2) is satisfied if this year’s class is typical of other classes, that is, students in this year’s class
can be viewed as random draws from the population of students that enroll in the class. (3) is
satisfied because 0 ≤ *Yi* ≤ 100 and *Xi* can take on only two values (90 and 120).

(d) (i) $49 + 0.24 \times 90 = 70.6$; $49 + 0.24 \times 120 = 77.8$; $49 + 0.24 \times 150 = 85.0$.
(ii) $0.24 \times 10 = 2.4$.