Statistical Inference from Two Samples: Hypothesis Testing and Confidence Intervals - Prof, Assignments of Mathematical Statistics

Solutions to various statistical exercises related to hypothesis testing and confidence intervals based on two samples. Topics include calculating test statistics, degrees of freedom, p-values, and constructing confidence intervals.

Typology: Assignments

Pre 2010

Uploaded on 12/20/2009

mlazybum

CHAPTER 9

Section 9.1

1.

a. $E(\bar X - \bar Y) = E(\bar X) - E(\bar Y) = 4.1 - 4.5 = -.4$, irrespective of sample sizes.

b. $V(\bar X - \bar Y) = V(\bar X) + V(\bar Y) = \dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n} = .0724$, and the s.d. of $\bar X - \bar Y$ is $\sqrt{.0724} = .2691$.

c. A normal curve with mean and s.d. as given in a and b (because m = n = 100, the CLT implies that both $\bar X$ and $\bar Y$ have approximately normal distributions, so $\bar X - \bar Y$ does also). The shape is not necessarily that of a normal curve when m = n = 10, because the CLT cannot be invoked. So if the two lifetime population distributions are not normal, the distribution of $\bar X - \bar Y$ will typically be quite complicated.

2. The test statistic value is $z = \dfrac{\bar x - \bar y}{\sqrt{s_1^2/m + s_2^2/n}}$, and $H_0$ will be rejected if either $z \ge 1.96$ or $z \le -1.96$. We compute $z = \dfrac{42{,}500 - 40{,}400}{\sqrt{s_1^2/m + s_2^2/n}} = \dfrac{2100}{433.33} = 4.85$. Since 4.85 > 1.96, reject $H_0$ and conclude that the two brands differ with respect to true average tread lives.

3. The test statistic value is $z = \dfrac{(\bar x - \bar y) - 5000}{\sqrt{s_1^2/m + s_2^2/n}}$, and $H_0$ will be rejected at level .01 if $z \ge 2.33$. We compute $z = \dfrac{(42{,}500 - 36{,}800) - 5000}{\sqrt{s_1^2/m + s_2^2/n}} = \dfrac{700}{396.93} = 1.76$, which is not > 2.33, so we don't reject $H_0$ and conclude that the true average life for radials does not exceed that for the economy brand by significantly more than 5000.

4.

a. From Exercise 2, the C.I. is $(\bar x - \bar y) \pm 1.96\sqrt{s_1^2/m + s_2^2/n} = 2100 \pm 1.96(433.33) = 2100 \pm 849.33 = (1250.67,\ 2949.33)$. In the context of this problem situation, the interval is moderately wide (a consequence of the standard deviations being large), so the information about $\mu_1$ and $\mu_2$ is not as precise as might be desirable.

b. From Exercise 3, the upper bound is $5700 + 1.645(396.93) = 5700 + 652.95 = 6352.95$.

5.

a. $H_a$ says that the average calorie output for sufferers is more than 1 cal/cm²/min below that for non-sufferers. With $\sigma_{\bar X - \bar Y} = \sqrt{\sigma_1^2/m + \sigma_2^2/n}$, $z = \dfrac{(.64 - 2.05) - (-1)}{\sigma_{\bar X - \bar Y}} = -2.90$. At level .01, $H_0$ is rejected if $z \le -2.33$; since -2.90 < -2.33, reject $H_0$.

b. P-value $= \Phi(-2.90) = .0019$.

c. $\beta = 1 - \Phi\!\left(-2.33 - \dfrac{\Delta' - \Delta_0}{\sigma_{\bar X - \bar Y}}\right) = 1 - \Phi(-.92) = .8212$.

d. $m = n = \dfrac{.2\,(2.33 + 1.28)^2}{(-.2)^2} \approx 65.2$, so use 66.
6.

a. $H_0$ should be rejected if $z \ge 2.33$. Since $z = \dfrac{18.12 - 16.87}{\sqrt{s_1^2/m + s_2^2/n}} = 3.53 \ge 2.33$, $H_0$ should be rejected at level .01.

b. $\beta = \Phi\!\left(2.33 - \dfrac{\Delta' - \Delta_0}{\sigma_{\bar X - \bar Y}}\right) = \Phi(-.50) = .3085$.

c. With $z_{.05} = 1.645$ and $z_{.10} = 1.28$, the sample-size condition reduces to a term in $n$ equal to .0529, which gives $n = 37.06$, so use n = 38.

d. Since n = 32 is not a large sample, it would no longer be appropriate to use the large sample z test of Section 9.1. A small sample t procedure should be used (Section 9.2), and the appropriate conclusion would follow. Note, however, that the test statistic of 3.53 would not change, and thus it shouldn't come as a surprise that we would still reject $H_0$ at the .01 significance level.

7.

1 Parameter of interest: $\mu_1 - \mu_2$ = the true difference of means for males and females on the Boredom Proneness Rating. Let $\mu_1$ = men's average and $\mu_2$ = women's average.

2 $H_0: \mu_1 - \mu_2 = 0$

3 $H_a: \mu_1 - \mu_2 > 0$

4 $z = \dfrac{(\bar x - \bar y) - \Delta_0}{\sqrt{s_1^2/m + s_2^2/n}} = \dfrac{(\bar x - \bar y) - 0}{\sqrt{s_1^2/m + s_2^2/n}}$

5 RR: $z \ge 1.645$

6 $z = \dfrac{(10.40 - 9.26) - 0}{\sqrt{s_1^2/m + s_2^2/n}}$, which falls in the rejection region.

7 Reject $H_0$. The data indicates the average Boredom Proneness Rating is higher for males than for females.

8.

a.

1 Parameter of interest: $\mu_1 - \mu_2$ = the true difference of mean tensile strength of the 1064 grade and the 1078 grade wire rod. Let $\mu_1$ = 1064 grade average and $\mu_2$ = 1078 grade average.

2 $H_0: \mu_1 - \mu_2 = -10$

3 $H_a: \mu_1 - \mu_2 < -10$

4 $z = \dfrac{(\bar x - \bar y) - \Delta_0}{\sqrt{s_1^2/m + s_2^2/n}} = \dfrac{(\bar x - \bar y) - (-10)}{\sqrt{s_1^2/m + s_2^2/n}}$

5 RR: p-value < $\alpha$

6 $z = \dfrac{(107.6 - 123.6) - (-10)}{\sqrt{s_1^2/m + s_2^2/n}} = \dfrac{-6}{.210} = -28.57$

7 For a lower-tailed test, the p-value = $\Phi(-28.57) \approx 0$, which is less than any reasonable $\alpha$, so reject $H_0$. There is very compelling evidence that the mean tensile strength of the 1078 grade exceeds that of the 1064 grade by more than 10.

b. The requested information can be provided by a 95% confidence interval for $\mu_1 - \mu_2$: $(\bar x - \bar y) \pm 1.96\sqrt{s_1^2/m + s_2^2/n} = -16 \pm 1.96(.210) = (-16.412,\ -15.588)$.

9.

a. The point estimate is $\bar x - \bar y = 19.9 - 13.7 = 6.2$. It appears that there could be a difference.

b. $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \ne 0$, $z = \dfrac{19.9 - 13.7}{\sqrt{s_1^2/m + s_2^2/n}} = 1.14$, and the p-value = 2[P(z > 1.14)] = 2(.1271) = .2542. The p-value is larger than any reasonable $\alpha$, so we do not reject $H_0$. There is no significant difference.

c. No. With a normal distribution, we would expect most of the data to be within 2 standard deviations of the mean, and the distribution should be symmetric. 2 sd's above the mean is 98.1, but the distribution stops at zero on the left. The distribution is positively skewed.

d. We will calculate a 95% confidence interval for μ, the true average length of stays for patients given the treatment: $19.9 \pm 1.96\,\dfrac{s}{\sqrt{n}} = 19.9 \pm 9.9 = (10.0,\ 29.8)$.
10.

a. The hypotheses are $H_0: \mu_1 - \mu_2 = 5$ and $H_a: \mu_1 - \mu_2 > 5$. At level .001, $H_0$ should be rejected if $z \ge 3.08$. Since $z = \dfrac{(65.6 - 59.8) - 5}{\sigma_{\bar X - \bar Y}} = 2.89 < 3.08$, $H_0$ cannot be rejected in favor of $H_a$ at this level, so the use of the high purity steel cannot be justified.

b. Here $(\mu_1 - \mu_2) - \Delta_0 = 1$, so $\beta = \Phi\!\left(3.08 - \dfrac{1}{\sigma_{\bar X - \bar Y}}\right) = \Phi(-.53) = .2981$.

11. The interval is $(\bar X - \bar Y) \pm z_{\alpha/2}\sqrt{s_1^2/m + s_2^2/n}$. Standard error = $s/\sqrt{n}$. Substitution yields $(\bar x - \bar y) \pm z_{\alpha/2}\sqrt{(SE_1)^2 + (SE_2)^2}$. Using $\alpha = .05$, $z_{\alpha/2} = 1.96$, so $(5.5 - 3.8) \pm 1.96\sqrt{(0.3)^2 + (0.2)^2} = (0.99,\ 2.41)$. We are 95% confident that the true average blood lead level for male workers is between 0.99 and 2.41 higher than the corresponding average for female workers.
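The substitution in Exercise 11 can be checked numerically. The following is a minimal sketch, assuming only the summary values quoted in the solution (means 5.5 and 3.8, standard errors 0.3 and 0.2); the function name `ci_from_standard_errors` is illustrative and not part of the original solution.

```python
import math

def ci_from_standard_errors(xbar, ybar, se_x, se_y, z=1.96):
    """Large-sample CI for mu1 - mu2 when each mean's standard error is given."""
    diff = xbar - ybar
    # The two samples are independent, so the variances (squared SEs) add.
    half_width = z * math.sqrt(se_x**2 + se_y**2)
    return diff - half_width, diff + half_width

low, high = ci_from_standard_errors(5.5, 3.8, 0.3, 0.2)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, the endpoints agree with the interval (0.99, 2.41) stated above.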

12. The C.I. is $(\bar x - \bar y) \pm 2.58\sqrt{s_1^2/m + s_2^2/n} = -8.77 \pm 2.58\sqrt{.9104} = -8.77 \pm 2.46 = (-11.23,\ -6.31)$. With 99% confidence we may say that the true difference between the average 7-day and 28-day strengths is between -11.23 and -6.31 N/mm².

13. $\sigma_1 = \sigma_2 = .05$, d = .04, $\alpha = .01$, $\beta = .05$, and the test is one-tailed, so $n = \dfrac{(.0025 + .0025)(2.33 + 1.645)^2}{(.04)^2} = 49.38$, so use n = 50.
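The sample-size arithmetic in Exercise 13 is easy to reproduce. This is a small sketch assuming the one-tailed formula $n = (\sigma_1^2 + \sigma_2^2)(z_\alpha + z_\beta)^2 / d^2$ with equal sample sizes; the helper name `sample_size_equal_n` is hypothetical.

```python
import math

def sample_size_equal_n(sigma1, sigma2, z_alpha, z_beta, d):
    """Common sample size for a one-tailed two-sample z test with target beta."""
    raw = (sigma1**2 + sigma2**2) * (z_alpha + z_beta)**2 / d**2
    # A sample size must be a whole number, so round up.
    return raw, math.ceil(raw)

raw_n, n = sample_size_equal_n(0.05, 0.05, 2.33, 1.645, 0.04)
print(round(raw_n, 2), n)
```

Rounding up rather than to the nearest integer guarantees the type II error probability does not exceed the target.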

14. The appropriate hypotheses are $H_0: \theta = 0$ vs. $H_a: \theta < 0$, where $\theta = 2\mu_1 - \mu_2$. ($\theta < 0$ is equivalent to $2\mu_1 < \mu_2$, so normal is more than twice schizophrenic.) The estimator of $\theta$ is $\hat\theta = 2\bar X - \bar Y$, with $Var(\hat\theta) = 4\,Var(\bar X) + Var(\bar Y) = \dfrac{4\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}$; $\sigma_{\hat\theta}$ is the square root of $Var(\hat\theta)$, and $\hat\sigma_{\hat\theta}$ is obtained by replacing each $\sigma_i^2$ with $S_i^2$. The test statistic is then $z = \hat\theta / \hat\sigma_{\hat\theta}$ (since $\theta_0 = 0$), and $H_0$ is rejected if $z \le -2.33$. With $\hat\theta = 2(2.69) - 6.35 = -.97$ and $\hat\sigma_{\hat\theta} = \sqrt{4s_1^2/m + s_2^2/n}$, $z = -1.05$. Because -1.05 > -2.33, $H_0$ is not rejected.

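15.

a. As either m or n increases, $\sigma$ decreases, so $\dfrac{\mu_1 - \mu_2 - \Delta_0}{\sigma}$ increases (the numerator is positive), so $z_\alpha - \dfrac{\mu_1 - \mu_2 - \Delta_0}{\sigma}$ decreases, so $\beta = \Phi\!\left(z_\alpha - \dfrac{\mu_1 - \mu_2 - \Delta_0}{\sigma}\right)$ decreases.

b. As $\beta$ decreases, $z_\beta$ increases, and since $z_\beta$ is in the numerator of n, n increases also.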

16. $z = \dfrac{\bar x - \bar y}{\sqrt{s_1^2/n + s_2^2/n}}$. For n = 100, z = 1.41 and p-value = $2[1 - \Phi(1.41)] = .1586$. For n = 400, z = 2.83 and p-value = .0046. From a practical point of view, the closeness of $\bar x$ and $\bar y$ suggests that there is essentially no difference between true average fracture toughness for type I and type II steels. The very small difference in sample averages has been magnified by the large sample sizes: statistical rather than practical significance. The p-value by itself would not have conveyed this message.

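Section 9.2

17. Each part uses $\nu = \dfrac{(s_1^2/m + s_2^2/n)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$, rounded down.

a. $\nu = \dfrac{37.21}{.694 + 1.44} = 17.43 \approx 17$ (with m - 1 = 9, n - 1 = 9)

b. $\nu = \dfrac{24.01}{.694 + .411} = 21.7 \approx 21$ (with m - 1 = 9, n - 1 = 14)

c. $\nu = \dfrac{7.84}{.018 + .411} = 18.27 \approx 18$ (with m - 1 = 9, n - 1 = 14)

d. $\nu = \dfrac{12.84}{.395 + .098} = 26.05 \approx 26$ (with m - 1 = 11, n - 1 = 23)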

18. With $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$, we will reject $H_0$ if p-value < $\alpha$. $\nu = \dfrac{\left(\frac{.164^2}{6} + \frac{.240^2}{5}\right)^2}{\frac{(.164^2/6)^2}{5} + \frac{(.240^2/5)^2}{4}} = 6.8 \approx 6$, and the test statistic $t = \dfrac{22.73 - 21.95}{\sqrt{\frac{.164^2}{6} + \frac{.240^2}{5}}} = \dfrac{.78}{.1265} = 6.17$ leads to a p-value of 2[P(t > 6.17)] < 2(.0005) = .001, which is less than most reasonable $\alpha$'s, so we reject $H_0$ and conclude that there is a difference in the densities of the two brick types.
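The two-sample t statistic and the estimated degrees of freedom in Exercise 18 can be computed together. Below is a minimal sketch using the summary values that appear in the solution (means 22.73 and 21.95, standard deviations .164 and .240, sample sizes 6 and 5); the function name `welch_t_and_df` is an illustrative choice, not something from the text.

```python
import math

def welch_t_and_df(xbar, s1, m, ybar, s2, n, delta0=0.0):
    """Unpooled two-sample t statistic and its approximate degrees of freedom."""
    v1 = s1**2 / m          # estimated variance of the first sample mean
    v2 = s2**2 / n          # estimated variance of the second sample mean
    t = (xbar - ybar - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return t, df

t, df = welch_t_and_df(22.73, 0.164, 6, 21.95, 0.240, 5)
print(round(t, 2), math.floor(df))
```

The degrees of freedom are rounded down, as in the solutions, which makes the critical value slightly conservative.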

19. For the given hypotheses, the test statistic is $t = \dfrac{(115.7 - 129.3) + 10}{\sqrt{\frac{5.03^2}{6} + \frac{5.38^2}{6}}} = \dfrac{-3.6}{3.007} = -1.20$, and the d.f. is $\nu = \dfrac{(4.2168 + 4.8241)^2}{\frac{(4.2168)^2}{5} + \frac{(4.8241)^2}{5}} \approx 9.96$, so use d.f. = 9. We will reject $H_0$ if $t \le -t_{.01,9} = -2.764$; since -1.20 > -2.764, we don't reject $H_0$.

20. We want a 95% confidence interval for $\mu_1 - \mu_2$. $t_{.025,9} = 2.262$, so the interval is $-13.6 \pm 2.262(3.007) = (-20.40,\ -6.80)$. Because the interval is so wide, it does not appear that precise information is available.

21. Let $\mu_1$ = the true average gap detection threshold for normal subjects, and $\mu_2$ = the corresponding value for CTS subjects. The relevant hypotheses are $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 < 0$, and the test statistic is $t = \dfrac{1.71 - 2.53}{\sqrt{.0351125 + .07569}} = -2.46$. Using d.f. $\nu = \dfrac{(.0351125 + .07569)^2}{\frac{(.0351125)^2}{m-1} + \frac{(.07569)^2}{n-1}}$, or 15, the rejection region is $t \le -t_{.01,15} = -2.602$. Since -2.46 is not $\le -2.602$, we fail to reject $H_0$. We have insufficient evidence to claim that the true average gap detection threshold for CTS subjects exceeds that for normal subjects.

22. Let $\mu_1$ = the true average strength for wire-brushing preparation and let $\mu_2$ = the average strength for hand-chisel preparation. Since we are concerned about any possible difference between the two means, a two-sided test is appropriate. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$. We need the degrees of freedom to find the rejection region: $\nu = \dfrac{2.3964}{.0039 + .1632} = 14.33$ (each denominator term divided by 11), which we round down to 14, so we reject $H_0$ if $|t| \ge t_{.025,14} = 2.145$. The test statistic is $t = \dfrac{19.20 - 23.13}{\sqrt{\frac{1.58^2}{12} + \frac{4.01^2}{12}}} = \dfrac{-3.93}{1.244} \approx -3.16$, which is $\le -2.145$, so we reject $H_0$ and conclude that there does appear to be a difference between the two population average strengths.

23.

a. Using Minitab to generate normal probability plots, we see that both plots illustrate sufficient linearity. Therefore, it is plausible that both samples have been selected from normal population distributions.

[Figure: Normal Probability Plot for High Quality Fabric. Average: 1.50833, StDev: 0.444206, N: 24. Anderson-Darling Normality Test: A-Squared: 0.396, P-Value: 0.344.]

[Figure: Normal Probability Plot for Poor Quality Fabric. Average: 1.58750, StDev: 0.530330, N: 24. Anderson-Darling Normality Test: A-Squared: -10.670, P-Value: 1.000.]

b.

[Figure: Comparative Box Plot for High Quality and Poor Quality Fabric. Axis: extensibility (%).]

The comparative boxplot does not suggest a difference between average extensibility for the two types of fabrics.

c. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$. With degrees of freedom $\nu$ from the usual two-sample formula, which we round down to 10, and using significance level .05 (not specified in the problem), we reject $H_0$ if $|t| \ge t_{.025,10} = 2.228$. The test statistic $t$ is not $\ge 2.228$ in absolute value, so we cannot reject $H_0$. There is insufficient evidence to claim that the true average extensibility differs for the two types of fabrics.

24.

a. 95% upper confidence bound: $\bar x + t_{.05,65-1}\,SE = 13.4 + 1.671(2.05) = 16.83$ seconds.

b. Let μ1 and μ2 represent the true average time spent by blackbirds at the experimental and natural locations, respectively. We wish to test $H_0: \mu_1 - \mu_2 = 0$ v. $H_a: \mu_1 - \mu_2 > 0$. The relevant test statistic is $t = \dfrac{13.4 - 9.7}{\sqrt{2.05^2 + 1.76^2}} = 1.37$, with estimated df $= \dfrac{(2.05^2 + 1.76^2)^2}{\frac{2.05^4}{64} + \frac{1.76^4}{49}} \approx 112.9$. Rounding to t = 1.4 and df = 120, the tabulated P-value is very roughly .082. Hence, at the 5% significance level, we fail to reject the null hypothesis. The true average time spent by blackbirds at the experimental location is not statistically significantly higher than at the natural location.

c. 95% CI for silvereyes' average time minus blackbirds' average time at the natural location: $(38.4 - 9.7) \pm (2.00)\sqrt{1.76^2 + 5.06^2}$ = (17.96 sec, 39.44 sec). The t-value 2.00 is based on estimated df = 55.

25. We calculate the degrees of freedom $\nu = \dfrac{(s_1^2/m + s_2^2/n)^2}{\frac{(s_1^2/m)^2}{27} + \frac{(s_2^2/n)^2}{30}} = 53.95$, or about 54 (normally we would round down to 53, but this number is very close to 54; of course for this large number of df, using either 53 or 54 won't make much difference in the critical t value), so the desired confidence interval is $(91.5 - 88.3) \pm 1.68\sqrt{s_1^2/m + s_2^2/n} = 3.2 \pm 2.931 = (.269,\ 6.131)$. Because 0 does not lie inside this interval, we can be reasonably certain that the true difference $\mu_1 - \mu_2$ is not 0 and, therefore, that the two population means are not equal. For a 95% interval, the t value increases to about 2.01 or so, which results in the interval $3.2 \pm 3.506$. Since this interval does contain 0, we can no longer conclude that the means are different if we use a 95% confidence interval.

26. Let $\mu_1$ = the true average potential drop for alloy connections and let $\mu_2$ = the true average potential drop for EC connections. Since we are interested in whether the potential drop is higher for alloy connections, an upper tailed test is appropriate. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 > 0$. Using the SAS output provided, the test statistic, when assuming unequal variances, is t = 3.6362, the corresponding df is 37.5, and the p-value for our upper tailed test would be ½(two-tailed p-value) = ½(.0008) = .0004. Our p-value of .0004 is less than the significance level of .01, so we reject $H_0$. We have sufficient evidence to claim that the true average potential drop for alloy connections is higher than that for EC connections.

27. The approximate degrees of freedom for this estimate are $\nu = \dfrac{893.59}{101.175} = 8.83$ (the two denominator terms are divided by 5 and 7), which we round down to 8, so $t_{.025,8} = 2.306$ and the desired interval is $(40.3 - 21.4) \pm 2.306\sqrt{s_1^2/m + s_2^2/n} = 18.9 \pm 2.306(5.4674) = 18.9 \pm 12.607 = (6.3,\ 31.5)$. Because 0 is not contained in this interval, there is strong evidence that $\mu_1 - \mu_2$ is not 0; i.e., we can conclude that the population means are not equal. Calculating a confidence interval for $\mu_2 - \mu_1$ would change only the order of subtraction of the sample means, but the standard error calculation would give the same result as before. Therefore, the 95% interval estimate of $\mu_2 - \mu_1$ would be (-31.5, -6.3), just the negatives of the endpoints of the original interval. Since 0 is not in this interval, we reach exactly the same conclusion as before; the population means are not equal.

28. We will test the hypotheses $H_0: \mu_1 - \mu_2 = 10$ vs. $H_a: \mu_1 - \mu_2 > 10$. The test statistic is $t = \dfrac{(\bar x - \bar y) - 10}{\sqrt{\frac{2.75^2}{10} + \frac{4.44^2}{5}}}$. The degrees of freedom are $\nu = \dfrac{22.08}{3.95} = 5.59 \approx 5$ (the two denominator terms are divided by 9 and 4), and the p-value from Table A.8 is approx .045, which is < .10, so we reject $H_0$ and conclude that the true average lean angle for older females is more than 10 degrees smaller than that of younger females.

29. Let $\mu_1$ = the true average compression strength for strawberry drink and let $\mu_2$ = the true average compression strength for cola. A lower tailed test is appropriate. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 < 0$. The test statistic is $t = -2.10$. The usual two-sample formula for $\nu$ gives a value that rounds down to 25, so use df = 25. The p-value $\approx P(t < -2.10) = .023$. This p-value indicates strong support for the alternative hypothesis. The data does suggest that the extra carbonation of cola results in a higher average compression strength.

30.

a. We desire a 99% confidence interval. First we calculate the degrees of freedom: $\nu = 37.24$, which we would round down to 37, except that there is no df = 37 row in Table A.5. Using 36 degrees of freedom (a more conservative choice), $t_{.005,36} = 2.719$, and the 99% C.I. is $(33.4 - 42.8) \pm 2.719\sqrt{s_1^2/26 + s_2^2/26} = -9.4 \pm 2.576 = (-11.98,\ -6.83)$. We are 99% confident that the true average load for carbon beams exceeds that for fiberglass beams by between 6.83 and 11.98 kN.

b. The upper limit of the interval in part a does not give a 99% upper confidence bound. The 99% upper bound would be $-9.4 + 2.434(.9473) = -7.09$, meaning that the true average load for carbon beams exceeds that for fiberglass beams by at least 7.09 kN.

31.

a. The most notable feature of these boxplots is the larger amount of variation present in the mid-range data compared to the high-range data. Otherwise, both look reasonably symmetric with no outliers present.

b. Using df = 23, a 95% confidence interval for $\mu_{mid\text{-}range} - \mu_{high\text{-}range}$ is $(438.3 - 437.45) \pm 2.069\sqrt{s_1^2/m + s_2^2/n} = .85 \pm 8.69 = (-7.84,\ 9.54)$. Since plausible values for $\mu_{mid\text{-}range} - \mu_{high\text{-}range}$ are both positive and negative (i.e., the interval spans zero) we would conclude that there is not sufficient evidence to suggest that the average value for mid-range and the average value for high-range differ.

32. Let $\mu_1$ = the true average proportional stress limit for red oak and let $\mu_2$ = the true average proportional stress limit for Douglas fir. We test $H_0: \mu_1 - \mu_2 = 1$ vs. $H_a: \mu_1 - \mu_2 > 1$. The test statistic is $t = \dfrac{(8.48 - 6.65) - 1}{\sqrt{\frac{.79^2}{14} + \frac{1.28^2}{10}}} = \dfrac{.83}{\sqrt{.2084}} = 1.818$. With degrees of freedom $\nu = \dfrac{(.2084)^2}{\frac{(.79^2/14)^2}{13} + \frac{(1.28^2/10)^2}{9}} = 13.85 \approx 13$, the p-value $\approx$ P(t > 1.8) = .048. We would reject $H_0$ at significance levels greater than .048 (e.g., the standard 5% significance level). At α = .05, there is sufficient evidence to claim that true average proportional stress limit for red oak exceeds that of Douglas fir by more than 1 MPa.

[Figure: Comparative Box Plot for High Range and Mid Range. Groups: mid range, high range; vertical axis from 420 to 470.]

33. Let μ1 and μ2 represent the true mean body mass decrease for the vegan diet and the control diet, respectively. We wish to test the hypotheses $H_0: \mu_1 - \mu_2 \le 1$ v. $H_a: \mu_1 - \mu_2 > 1$. The relevant test statistic is $t = \dfrac{(5.8 - 3.8) - 1}{\sqrt{s_1^2/m + s_2^2/n}} = 1.33$, with estimated df = 60 using the formula. Rounding to t = 1.3, Table A.8 gives a one-sided P-value of .098 (a computer will give the more accurate P-value of .094). Since our P-value > α = .05, we fail to reject $H_0$ at the 5% level. We do not have statistically significant evidence that the true average weight loss for the vegan diet exceeds that for the control diet by more than 1 kg.

34.

a. Following the usual format for most confidence intervals, statistic ± (critical value)(standard error), a pooled variance confidence interval for the difference between two means is $(\bar x - \bar y) \pm t_{\alpha/2,\,m+n-2}\cdot s_p\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}$.

b. The sample means and standard deviations of the two samples are $\bar x = 13.90$, $s_1 = 1.225$, $\bar y = 12.20$, $s_2 = 1.010$. The pooled variance estimate is $s_p^2 = \left(\dfrac{m-1}{m+n-2}\right)s_1^2 + \left(\dfrac{n-1}{m+n-2}\right)s_2^2 = 1.260$, so $s_p = 1.1227$. With df = m + n - 2 = 6 for this interval, $t_{.025,6} = 2.447$ and the desired interval is $(13.90 - 12.20) \pm 2.447(1.1227)\sqrt{\tfrac14 + \tfrac14} = 1.7 \pm 1.943 = (-.24,\ 3.64)$. This interval contains 0, so it does not support the conclusion that the two population means are different.

c. Using the two-sample t interval discussed earlier, we use the CI as follows: First, we need to calculate the degrees of freedom, $\nu = \dfrac{.3971}{.0686} = 5.78 \approx 5$ (each denominator term divided by 3), so $t_{.025,5} = 2.571$. Then the interval is $(13.9 - 12.2) \pm 2.571\sqrt{s_1^2/4 + s_2^2/4} = 1.70 \pm 2.571(.7938) = (-.34,\ 3.74)$. This interval is slightly wider, but it still supports the same conclusion.
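The pooled interval in part b can be reproduced from the quantities quoted there. This is a minimal sketch, assuming the values $s_p = 1.1227$, $t_{.025,6} = 2.447$, and m = n = 4 taken from the solution; `pooled_ci` is an illustrative name.

```python
import math

def pooled_ci(xbar, ybar, sp, m, n, t_crit):
    """Pooled-variance t interval for mu1 - mu2 (assumes equal population variances)."""
    diff = xbar - ybar
    half_width = t_crit * sp * math.sqrt(1 / m + 1 / n)
    return diff - half_width, diff + half_width

low, high = pooled_ci(13.90, 12.20, 1.1227, 4, 4, 2.447)
print(round(low, 2), round(high, 2))
```

The only differences from the unpooled interval in part c are the standard error and the degrees of freedom used for the critical value.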

35. There are two changes that must be made to the procedure we currently use. First, the equation used to compute the value of the t test statistic is $t = \dfrac{(\bar x - \bar y) - \Delta}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$, where $s_p$ is defined as in Exercise 34 above. Second, the degrees of freedom = m + n - 2. Assuming equal variances in the situation from Exercise 33, we calculate $s_p = 2.544$. The value of the test statistic is, then, $t = \dfrac{(32.8 - 40.5) - (-5)}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}} = -2.24 \approx -2.2$. The degrees of freedom = 16, and the p-value is P(t < -2.2) = .021. Since .021 > .01, we fail to reject $H_0$.

Section 9.3

36. $\bar d = 7.25$, $s_D = 11.8628$

1 Parameter of interest: $\mu_D$ = true average difference of breaking load for fabric in unabraded or abraded condition.

2 $H_0: \mu_D = 0$

3 $H_a: \mu_D > 0$

4 $t = \dfrac{\bar d - \mu_D}{s_D/\sqrt{n}} = \dfrac{\bar d - 0}{s_D/\sqrt{n}}$

5 RR: $t \ge t_{.01,7} = 2.998$

6 $t = \dfrac{7.25 - 0}{11.8628/\sqrt{8}} = 1.73$

7 Fail to reject $H_0$. The data does not indicate a significant mean difference in breaking load for the two fabric load conditions.
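Step 6 of Exercise 36 is a one-line computation. The sketch below assumes the summary values from the solution ($\bar d = 7.25$, $s_D = 11.8628$, n = 8) and the tabled critical value $t_{.01,7} = 2.998$; the function name `paired_t` is hypothetical.

```python
import math

def paired_t(dbar, s_d, n, mu0=0.0):
    """Paired t statistic computed from the mean and s.d. of the n differences."""
    return (dbar - mu0) / (s_d / math.sqrt(n))

t = paired_t(7.25, 11.8628, 8)
critical = 2.998  # t_{.01,7}, taken from a t table
print(round(t, 2), t >= critical)
```

Because the statistic falls short of the critical value, the code reaches the same "fail to reject" decision as step 7.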

37.

a. This exercise calls for paired analysis. First, compute the difference between indoor and outdoor concentrations of hexavalent chromium for each of the 33 houses. These 33 differences are summarized as follows: n = 33, $\bar d = -.4239$, $s_d = .3868$, where d = (indoor value - outdoor value). Then $t_{.025,32} = 2.037$, and a 95% confidence interval for the population mean difference between indoor and outdoor concentration is $-.4239 \pm 2.037\left(\dfrac{.3868}{\sqrt{33}}\right) = -.4239 \pm .13715 = (-.5611,\ -.2868)$. We can be highly confident, at the 95% confidence level, that the true average concentration of hexavalent chromium outdoors exceeds the true average concentration indoors by between .2868 and .5611 nanograms/m³.

b. A 95% prediction interval for the difference in concentration for the 34th house is $\bar d \pm t_{.025,32}\, s_d\sqrt{1 + \tfrac{1}{n}} = -.4239 \pm 2.037(.3868)\sqrt{1 + \tfrac{1}{33}} = (-1.224,\ .3758)$. This prediction interval means that the indoor concentration may exceed the outdoor concentration by as much as .3758 nanograms/m³ and that the outdoor concentration may exceed the indoor concentration by as much as 1.224 nanograms/m³, for the 34th house. Clearly, this is a wide prediction interval, largely because of the amount of variation in the differences.
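The contrast between the confidence interval and the prediction interval in Exercise 37 comes down to one factor under the square root. A minimal sketch, assuming the summary values quoted above (n = 33, $\bar d = -.4239$, $s_d = .3868$, $t_{.025,32} = 2.037$); the function and variable names are illustrative only.

```python
import math

def paired_ci_and_pi(dbar, s_d, n, t_crit):
    """Return the CI for the mean difference and the PI for one new difference."""
    ci_half = t_crit * s_d / math.sqrt(n)          # uncertainty in the mean only
    pi_half = t_crit * s_d * math.sqrt(1 + 1 / n)  # adds variability of a single new value
    return (dbar - ci_half, dbar + ci_half), (dbar - pi_half, dbar + pi_half)

conf, pred = paired_ci_and_pi(-0.4239, 0.3868, 33, 2.037)
print([round(v, 4) for v in conf], [round(v, 3) for v in pred])
```

The prediction interval stays wide no matter how large n becomes, since the $s_d$ term for a single new house does not shrink.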

38.

a. The median of the "Normal" data is 46.80 and the lower and upper quartiles are 45.55 and 49.55, which yields an IQR of 49.55 - 45.55 = 4.00. The median of the "High" data is 90.1 and the lower and upper quartiles are 88.55 and 90.95, which yields an IQR of 90.95 - 88.55 = 2.40. The most significant feature of these boxplots is the fact that their locations (medians) are far apart.

[Figure: Comparative Boxplots for Normal and High Strength Concrete Mix. Groups: High, Normal; axis from 40 to 90.]

b. This data is paired because the two measurements are taken for each of 15 test conditions. Therefore, we have to work with the differences of the two samples. A normal probability plot of the 15 differences shows that the data follows (approximately) a straight line, indicating that it is reasonable to assume that the differences follow a normal distribution. Taking differences in the order "Normal" - "High", we find $\bar d = -42.23$ and $s_d = 4.34$. With $t_{.025,14} = 2.145$, a 95% confidence interval for the difference between the population means is $-42.23 \pm 2.145\left(\dfrac{4.34}{\sqrt{15}}\right) = -42.23 \pm 2.404 = (-44.63,\ -39.83)$. Because 0 is not contained in this interval, we can conclude that the difference between the population means is not 0; i.e., we conclude that the two population means are not equal.

39.

a. A normal probability plot shows that the data could easily follow a normal distribution.

b. We test $H_0: \mu_d = 0$ vs. $H_a: \mu_d \ne 0$, with test statistic $t = \dfrac{\bar d - 0}{s_D/\sqrt{n}} = \dfrac{\bar d}{228/\sqrt{14}} = 2.74 \approx 2.7$. The two-tailed p-value is 2[P(t > 2.7)] = 2[.009] = .018. Since .018 < .05, we can reject $H_0$. There is strong evidence to support the claim that the true average difference between intake values measured by the two methods is not 0. There is a difference between them.

40. From the data, n = 10, $\bar d$ = 105.7, $s_d$ = 103.845.

a. Let $\mu_d$ = true mean difference in TBBMC, postweaning minus lactation. We wish to test the hypotheses $H_0: \mu_d \le 25$ v. $H_a: \mu_d > 25$. The test statistic is $t = \dfrac{105.7 - 25}{103.845/\sqrt{10}} = 2.46$; at 9 df, the corresponding P-value is around .018. Hence, at the 5% significance level, we reject $H_0$ and conclude that true average TBBMC during postweaning does exceed the average during lactation by more than 25 grams.

b. A 95% upper confidence bound for $\mu_d$ is $\bar d + t_{.05,9}\, s_d/\sqrt{n} = 105.7 + 1.833(103.845)/\sqrt{10} = 165.89$ grams.

c. No. If we pretend the two samples are independent, the new standard error is roughly 235, far greater than $103.845/\sqrt{10}$. In turn, the resulting t statistic is just t = 0.45, with estimated df = 17 and P-value = .329 (all using a computer).

41. We test $H_0: \mu_d = 5$ vs. $H_a: \mu_d > 5$. With $\bar d = 7.600$ and $s_d = 4.178$, $t = \dfrac{7.600 - 5}{4.178/\sqrt{9}} = 1.87 \approx 1.9$. With degrees of freedom n - 1 = 8, the corresponding p-value is P(t > 1.9) = .047. We would reject $H_0$ at any alpha level greater than .047. So, at the typical significance level of .05, we would reject $H_0$, and conclude that the data indicates that the higher level of illumination yields a decrease of more than 5 seconds in true average task completion time.

42.

1 Parameter of interest: $\mu_d$ denotes the true average difference of spatial ability in brothers exposed to DES and brothers not exposed to DES. Let $\mu_d = \mu_{exposed} - \mu_{unexposed}$.

2 $H_0: \mu_D = 0$

3 $H_a: \mu_D < 0$

4 $t = \dfrac{\bar d - \mu_D}{s_D/\sqrt{n}} = \dfrac{\bar d - 0}{s_D/\sqrt{n}}$

5 RR: P-value < .05, df = 9

6 $t = \dfrac{(12.6 - 13.7) - 0}{s_D/\sqrt{n}}$, with corresponding p-value .028 (from Table A.8)

7 Reject $H_0$. The data supports the idea that exposure to DES reduces spatial ability.

43.

a. Although there is a "jump" in the middle of the Normal Probability plot, the data follow a reasonably straight path, so there is no strong reason for doubting the normality of the population of differences.

b. A 95% lower confidence bound for the population mean difference is $\bar d - t_{.05,14}\,\dfrac{s_d}{\sqrt{n}} = -38.60 - 1.761\,\dfrac{s_d}{\sqrt{n}} = -38.60 - 10.54 = -49.14$. We are 95% confident that the true mean difference between age at onset of Cushing's disease symptoms and age at diagnosis is greater than -49.14.

c. A 95% upper confidence bound for the corresponding population mean difference is 38.60 + 10.54 = 49.14.

44. We need to check the differences to see if the assumption of normality is plausible. A normal probability plot validates our use of the t distribution. A 95% upper confidence bound for $\mu_D$ is $\bar d + t_{.05,15}\,\dfrac{s_d}{\sqrt{n}} = 2635.63 + 1.753\,\dfrac{s_d}{\sqrt{n}} = 2635.63 + 222.91 = 2858.54$. We are 95% confident that the true mean difference between modulus of elasticity after 1 minute and after 4 weeks is at most 2858.54.

45. From the data, n = 12, $\bar d$ = -0.73, $s_d$ = 2.81.

a. Let $\mu_d$ = the true mean difference in strength between curing under moist conditions and laboratory drying conditions. A 95% CI for $\mu_d$ is $\bar d \pm t_{.025,11}\, s_d/\sqrt{n} = -0.73 \pm 2.201(2.81)/\sqrt{12}$ = (-2.52 MPa, 1.05 MPa). In particular, this interval estimate includes the value zero, suggesting that true mean strength is not significantly different under these two conditions.

b. Since n = 12, we must check that the differences are plausibly from a normal population. The normal probability plot below strongly substantiates that condition.

[Figure: Normal Probability Plot of Differences (Normal). Horizontal axis: Differences; vertical axis: Percent.]

46. With $(x_1, y_1) = (6, 5)$, $(x_2, y_2) = (15, 14)$, $(x_3, y_3) = (1, 0)$, and $(x_4, y_4) = (21, 20)$, $\bar d = 1$ and $s_d = 0$ (the $d_i$'s are 1, 1, 1, and 1), while $s_1 = s_2 = 8.96$, so $s_p = 8.96$ and t = .16.
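The numbers in Exercise 46 can be verified directly from the four pairs. The sketch below uses only Python's standard library; the variable names are illustrative, and the pooled statistic is the usual equal-variance two-sample t with hypothesized difference 0.

```python
import math
import statistics

x = [6, 15, 1, 21]
y = [5, 14, 0, 20]
d = [xi - yi for xi, yi in zip(x, y)]   # paired differences

dbar = statistics.mean(d)
s_d = statistics.stdev(d)               # every difference equals 1, so this is 0
s1 = statistics.stdev(x)
s2 = statistics.stdev(y)
sp = math.sqrt((s1**2 + s2**2) / 2)     # pooled s.d. for equal sample sizes
t_pooled = (statistics.mean(x) - statistics.mean(y)) / (sp * math.sqrt(1 / 4 + 1 / 4))

print(dbar, s_d, round(s1, 2), round(sp, 2), round(t_pooled, 2))
```

Ignoring the pairing buries a perfectly consistent difference of 1 under the large between-pair variation, which is why the pooled t is so small.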

Section 9.4

47. $H_0$ will be rejected if $z \le -z_{.01} = -2.33$. With $\hat p_1 = .150$ and $\hat p_2 = .300$, $\hat p = .263$ and $\hat q = .737$. The calculated test statistic is $z = \dfrac{.150 - .300}{\sqrt{(.263)(.737)\left(\frac{1}{200} + \frac{1}{600}\right)}} = -4.18$. Because $-4.18 \le -2.33$, $H_0$ is rejected; the proportion of those who repeat after inducement appears lower than those who repeat after no inducement.

48.

a. $H_0$ will be rejected if $|z| \ge 1.96$. With $\hat p_1 = .2100$, $\hat p_2 = .4167$, and $\hat p = .2875$, $z = \dfrac{.2100 - .4167}{\sqrt{(.2875)(.7125)\left(\frac{1}{300} + \frac{1}{180}\right)}} = -4.84$. Since $-4.84 \le -1.96$, $H_0$ is rejected.

b. $\bar p = .275$ and $\sigma = .0432$, so power $= 1 - \left[\Phi\!\left(\dfrac{(1.96)(.0421) + .2}{.0432}\right) - \Phi\!\left(\dfrac{-(1.96)(.0421) + .2}{.0432}\right)\right] = 1 - [\Phi(6.54) - \Phi(2.72)] = .9967$.
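The power calculation in part b can be checked with the standard normal cdf from Python's standard library. This sketch takes the three quantities quoted in the solution (.0421, .0432, and the .2 shift) as given inputs; the function name `two_sided_power` and its parameter names are assumptions made for illustration.

```python
from statistics import NormalDist

def two_sided_power(z_crit, se_null, se_alt, shift):
    """Power of a two-sided z test: 1 minus the probability of landing in the acceptance region."""
    phi = NormalDist().cdf
    upper = (z_crit * se_null + shift) / se_alt
    lower = (-z_crit * se_null + shift) / se_alt
    return 1 - (phi(upper) - phi(lower))

power = two_sided_power(1.96, 0.0421, 0.0432, 0.2)
print(round(power, 4))
```

The first cdf term is essentially 1, so the power is driven almost entirely by the second term, $\Phi(2.72)$.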

49.

1 Parameter of interest: $p_1 - p_2$ = true difference in proportions of those responding to two different survey covers. Let $p_1$ = Plain, $p_2$ = Picture.

2 $H_0: p_1 - p_2 = 0$

3 $H_a: p_1 - p_2 < 0$

4 $z = \dfrac{\hat p_1 - \hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{m} + \frac{1}{n}\right)}}$

5 Reject $H_0$ if p-value < .10

6 $z = \dfrac{\frac{104}{207} - \frac{109}{213}}{\sqrt{\left(\frac{213}{420}\right)\left(\frac{207}{420}\right)\left(\frac{1}{207} + \frac{1}{213}\right)}} = -.1910$; p-value = .4247

7 Fail to reject $H_0$. The data does not indicate that plain cover surveys have a lower response rate.
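The pooled two-proportion z test of Exercise 49 can be reproduced from the counts in step 6. This is a minimal sketch assuming 104 responses out of 207 plain covers and 109 out of 213 picture covers; `two_proportion_z` is an illustrative name, and the lower-tailed p-value is computed with the standard library's normal cdf rather than a table.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, m, x2, n):
    """Pooled z statistic for H0: p1 - p2 = 0."""
    p1, p2 = x1 / m, x2 / n
    p_pool = (x1 + x2) / (m + n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / m + 1 / n))
    return (p1 - p2) / se

z = two_proportion_z(104, 207, 109, 213)
p_value = NormalDist().cdf(z)   # lower-tailed alternative
print(round(z, 4), round(p_value, 4))
```

The computed p-value differs from the tabled .4247 only in the fourth decimal, because the table rounds z to -.19 before the lookup.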

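50. Let $\alpha = .05$. A 95% confidence interval is $(\hat p_1 - \hat p_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat p_1\hat q_1}{m} + \dfrac{\hat p_2\hat q_2}{n}} = \left(\dfrac{224}{395} - \dfrac{126}{266}\right) \pm 1.96\sqrt{\dfrac{\left(\frac{224}{395}\right)\left(\frac{171}{395}\right)}{395} + \dfrac{\left(\frac{126}{266}\right)\left(\frac{140}{266}\right)}{266}} = .0934 \pm .0774 = (.0160,\ .1708)$.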

51.

a. Let $p_1$ and $p_2$ denote the true incidence rates of GI problems for the olestra and control groups, respectively. We wish to test $H_0: p_1 - p_2 = 0$ v. $H_a: p_1 - p_2 \ne 0$. The pooled proportion is $\hat p = \dfrac{529(.176) + 563(.158)}{529 + 563} = .1667$, from which the relevant test statistic is $z = \dfrac{.176 - .158}{\sqrt{(.1667)(.8333)\left[529^{-1} + 563^{-1}\right]}} = 0.78$. The two-sided P-value is 2P(Z ≥ 0.78) = .433 > α = .05, hence we fail to reject the null hypothesis. The data do not suggest a statistically significant difference between the incidence rates of GI problems between the two groups.

b. $n = \dfrac{\left[1.96\sqrt{(.35)(1.65)/2} + 1.28\sqrt{(.15)(.85) + (.2)(.8)}\right]^2}{(.05)^2} = 1210.39$, so a common sample size of m = n = 1211 would be required.
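The sample-size expression in part b can be evaluated step by step. The sketch below assumes the alternative values $p_1 = .15$ and $p_2 = .20$ implied by the terms (.15)(.85) and (.2)(.8), with $z_{.025} = 1.96$ and $z_{.10} = 1.28$; the function name `n_two_proportions` is hypothetical.

```python
import math

def n_two_proportions(p1, p2, z_half_alpha, z_beta):
    """Common sample size for a two-sided, two-proportion z test at a stated alternative."""
    d = p1 - p2
    # Null-hypothesis term: (p1 + p2)(q1 + q2)/2, here (.35)(1.65)/2
    null_term = math.sqrt((p1 + p2) * (2 - p1 - p2) / 2)
    # Alternative-hypothesis term: p1*q1 + p2*q2
    alt_term = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    raw = (z_half_alpha * null_term + z_beta * alt_term) ** 2 / d**2
    return raw, math.ceil(raw)

raw_n, n = n_two_proportions(0.15, 0.20, 1.96, 1.28)
print(round(raw_n, 1), n)
```

Detecting a difference as small as .05 between two proportions requires well over a thousand subjects per group at these error rates.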

52. Let $p_1$ = true proportion of irradiated bulbs that are marketable; $p_2$ = true proportion of untreated bulbs that are marketable. The hypotheses are $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 > 0$. The test statistic is $z = \dfrac{\hat p_1 - \hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{m} + \frac{1}{n}\right)}}$. With $\hat p_1 = .850$, $\hat p_2 = .661$, and $\hat p = .756$, $z = \dfrac{.850 - .661}{\sqrt{(.756)(.244)\left(\frac{1}{180} + \frac{1}{180}\right)}} = 4.2$. The p-value = $1 - \Phi(4.2) \approx 0$, so reject $H_0$ at any reasonable level. Radiation appears to be beneficial.

a. A 95% large sample confidence interval formula for $\ln(\theta)$ is $\ln(\hat{\theta}) \pm z_{\alpha/2}\sqrt{\dfrac{m - x}{mx} + \dfrac{n - y}{ny}}$. Taking the antilogs of the upper and lower bounds gives the confidence interval for $\theta$ itself.

b. $\hat{\theta} = \dfrac{189/11{,}034}{104/11{,}037} = 1.818$, $\ln(\hat{\theta}) = .598$, and the standard deviation is $\sqrt{\dfrac{10{,}845}{(11{,}034)(189)} + \dfrac{10{,}933}{(11{,}037)(104)}} = .1213$, so the CI for $\ln(\theta)$ is $.598 \pm 1.96(.1213) = (.360, .836)$. Then taking the antilogs of the two bounds gives the CI for $\theta$ to be $(1.43, 2.31)$. We are 95% confident that people who do not take the aspirin treatment are between 1.43 and 2.31 times more likely to suffer a heart attack than those who do. This suggests aspirin therapy may be effective in reducing the risk of a heart attack.
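The log-scale interval and its back-transformation can be sketched in Python as follows. The function name `relative_risk_ci` and its argument names are hypothetical. Here x of m is taken as the heart-attack count in the group without aspirin (189 of 11,034) and y of n as the count in the aspirin group (104 of 11,037), matching the numbers in part b.

```python
from math import exp, log, sqrt


def relative_risk_ci(x, m, y, n, z_crit=1.96):
    """Large-sample CI for theta = p1/p2, built on the log scale."""
    theta_hat = (x / m) / (y / n)
    # Estimated standard deviation of ln(theta_hat)
    se_log = sqrt((m - x) / (m * x) + (n - y) / (n * y))
    center = log(theta_hat)
    lower = exp(center - z_crit * se_log)  # antilog of the lower bound
    upper = exp(center + z_crit * se_log)  # antilog of the upper bound
    return theta_hat, lower, upper


theta_hat, lower, upper = relative_risk_ci(189, 11034, 104, 11037)
print(round(theta_hat, 3), round(lower, 2), round(upper, 2))
```

Working on the log scale keeps both bounds positive, which a symmetric interval built directly around $\hat{\theta}$ would not guarantee.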

a. The “after” success probability is $p_1 + p_3$ while the “before” probability is $p_1 + p_2$, so $p_1 + p_3 > p_1 + p_2$ becomes $p_3 > p_2$; thus we wish to test $H_0: p_3 = p_2$ versus $H_a: p_3 > p_2$.

b. The estimator of $(p_1 + p_3) - (p_1 + p_2)$ is $\dfrac{(X_1 + X_3) - (X_1 + X_2)}{n} = \dfrac{X_3 - X_2}{n}$.

c. When $H_0$ is true, $p_2 = p_3$, so $\mathrm{Var}\left(\dfrac{X_3 - X_2}{n}\right) = \dfrac{p_2 + p_3}{n}$, which is estimated by $\dfrac{\hat{p}_2 + \hat{p}_3}{n}$. The Z statistic is then $\dfrac{(X_3 - X_2)/n}{\sqrt{(\hat{p}_2 + \hat{p}_3)/n}} = \dfrac{X_3 - X_2}{\sqrt{X_2 + X_3}}$.

d. The computed value of Z is 2.68, so P-value = $1 - \Phi(2.68) = .0037$. At level .01, $H_0$ can be rejected but at level .001 $H_0$ would not be rejected.
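The statistic from part c depends only on the two "changed response" counts, which a few lines of Python make concrete. This is a sketch, not the original computation: the function name `paired_proportion_z` is hypothetical, and the counts `x2=30` and `x3=50` are invented for illustration and are not the data of this exercise.

```python
from math import sqrt
from statistics import NormalDist


def paired_proportion_z(x2, x3):
    """Z = (X3 - X2) / sqrt(X2 + X3) for H0: p3 = p2 versus Ha: p3 > p2."""
    return (x3 - x2) / sqrt(x2 + x3)


# Hypothetical counts, NOT taken from the exercise
z = paired_proportion_z(x2=30, x3=50)
p_value = 1 - NormalDist().cdf(z)  # upper-tailed P-value
print(round(z, 3), round(p_value, 4))

# Upper-tail area for the z value reported in part d
p_part_d = 1 - NormalDist().cdf(2.68)
print(round(p_part_d, 4))
```

The sample size n cancels out of the statistic, which is why only the counts $X_2$ and $X_3$ are needed.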

55. $\hat{p}_1 = .550$, $\hat{p}_2 = .690$, and the 95% C.I. is $(.550 - .690) \pm 1.96(.106) = -.14 \pm .21 = (-.35, .07)$.

56. Using $p_1 = q_1 = p_2 = q_2 = .5$, $L = 2(1.96)\sqrt{\dfrac{.25}{n} + \dfrac{.25}{n}} = \dfrac{2.7719}{\sqrt{n}}$, so L = .1 requires n = 769.
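The width calculation can be sketched in Python alongside the interval it comes from, $(\hat{p}_1 - \hat{p}_2) \pm z\sqrt{\hat{p}_1\hat{q}_1/m + \hat{p}_2\hat{q}_2/n}$, which is the large-sample interval used in Exercises 50 and 55. Both function names are hypothetical, and 1.96 is the table value for 95% confidence. As a check, the first call uses the Exercise 50 counts (224 of 395 and 126 of 266).

```python
from math import ceil, sqrt


def two_proportion_ci(x1, m, x2, n, z_crit=1.96):
    """Large-sample CI for p1 - p2 using unpooled standard errors."""
    p1, p2 = x1 / m, x2 / n
    half_width = z_crit * sqrt(p1 * (1 - p1) / m + p2 * (1 - p2) / n)
    return (p1 - p2) - half_width, (p1 - p2) + half_width


def n_for_width(width, z_crit=1.96, p=0.5):
    """Common n giving total width L when p1 = p2 = p (worst case p = .5)."""
    # L = 2 z sqrt(2 p q / n)  =>  n = (2 z)^2 (2 p q) / L^2
    return (2 * z_crit) ** 2 * 2 * p * (1 - p) / width ** 2


lower, upper = two_proportion_ci(224, 395, 126, 266)
n_required = ceil(n_for_width(0.1))  # round up so the width is at most L
print(round(lower, 4), round(upper, 4), n_required)
```

Setting p = .5 maximizes pq, so the resulting n is conservative for any true proportions.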

Section 9.5

a. From Table A.9, column 5, row 8, $F_{.01,5,8} = 6.63$.

b. From column 8, row 5, $F_{.01,8,5} = 10.29$.

c. $F_{.95,5,8} = \dfrac{1}{F_{.05,8,5}} = \dfrac{1}{4.82} = .207$.

d. $F_{.95,8,5} = \dfrac{1}{F_{.05,5,8}} = \dfrac{1}{3.69} = .271$.

e. $F_{.01,10,12} = 4.30$.

f. $F_{.99,10,12} = \dfrac{1}{F_{.01,12,10}} = \dfrac{1}{4.71} = .212$.

g. $F_{.05,6,4} = 6.16$, so $P(F \leq 6.16) = .95$.

h. Since $F_{.99,10,5} = \dfrac{1}{F_{.01,5,10}} = \dfrac{1}{5.64} = .177$, $P(.177 \leq F \leq 4.74) = P(F \leq 4.74) - P(F \leq .177) = .95 - .01 = .94$.
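The table lookups in parts g and h, and the reciprocal relation $F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$, can be checked numerically without a table. The sketch below is a rough stand-in for a statistics library: it integrates the F density with composite Simpson's rule using only the Python standard library. The names `f_pdf` and `f_cdf` are hypothetical, and the integration assumes numerator d.f. greater than 2 so that the density is 0 at the origin.

```python
from math import exp, lgamma, log


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct at x = 0 only when d1 > 2
    log_density = (lgamma((d1 + d2) / 2) - lgamma(d1 / 2) - lgamma(d2 / 2)
                   + (d1 / 2) * log(d1 / d2)
                   + (d1 / 2 - 1) * log(x)
                   - ((d1 + d2) / 2) * log(1 + d1 * x / d2))
    return exp(log_density)


def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by composite Simpson's rule; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3


print(round(f_cdf(6.16, 6, 4), 3))                              # part g
print(round(f_cdf(4.74, 10, 5) - f_cdf(0.177, 10, 5), 3))       # part h
# Reciprocal relation: P(F(10,5) <= x) + P(F(5,10) <= 1/x) should equal 1
print(round(f_cdf(0.177, 10, 5) + f_cdf(1 / 0.177, 5, 10), 6))
```

Because the table critical values are rounded, the first two results agree with .95 and .94 only to about three decimal places.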

a. Since the given f value of 4.75 falls between $F_{.05,5,10} = 3.33$ and $F_{.01,5,10} = 5.64$, we can say that the upper-tailed p-value is between .01 and .05.

b. Since the given f of 2.00 is less than $F_{.10,5,10} = 2.52$, the p-value > .10.

c. The two-tailed p-value = $2P(F \geq 5.64) = 2(.01) = .02$.

d. For a lower-tailed test, we must first use formula 9.9 to find the critical values: $F_{.90,5,10} = \dfrac{1}{F_{.10,10,5}} = .3030$, $F_{.95,5,10} = \dfrac{1}{F_{.05,10,5}} = .2110$, $F_{.99,5,10} = \dfrac{1}{F_{.01,10,5}} = .0995$. Since .0995 < f = .200 < .2110, .01 < p-value < .05 (but obviously closer to .05).

e. There is no column for numerator d.f. of 35 in Table A.9; however, looking at both the df = 30 and df = 40 columns, we see that for denominator df = 20, our f value is between $F_{.01}$ and $F_{.001}$. So we can say .001 < p-value < .01.
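The table-bracketing logic used in parts a and b can be written as a small Python helper. This is a sketch with a hypothetical function name, `bracket_upper_tail_p`; the critical values passed in are the Table A.9 entries for (5, 10) d.f. quoted in parts a and b.

```python
def bracket_upper_tail_p(f, critical_values):
    """Bound an upper-tailed p-value from tabled critical values.

    critical_values maps a tail area alpha to F_alpha for fixed d.f.
    """
    lower, upper = 0.0, 1.0
    for alpha, crit in critical_values.items():
        if f >= crit:
            upper = min(upper, alpha)  # f at or beyond F_alpha: p <= alpha
        else:
            lower = max(lower, alpha)  # f short of F_alpha: p > alpha
    return lower, upper


table_5_10 = {0.10: 2.52, 0.05: 3.33, 0.01: 5.64}
bounds_a = bracket_upper_tail_p(4.75, table_5_10)
bounds_b = bracket_upper_tail_p(2.00, table_5_10)
print(bounds_a, bounds_b)
```

An upper bound of 1.0 simply means the table gives no tighter statement than "p-value > .10".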

59. We test $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \neq \sigma_2^2$. The calculated test statistic is $f = \dfrac{s_1^2}{s_2^2} = .384$. With numerator d.f. = m – 1 = 10 – 1 = 9, and denominator d.f. = n – 1 = 5 – 1 = 4, we reject $H_0$ if $f \geq F_{.05,9,4} = 6.00$ or $f \leq F_{.95,9,4} = \dfrac{1}{F_{.05,4,9}} = \dfrac{1}{3.63} = .275$. Since .384 is in neither rejection region, we do not reject $H_0$ and conclude that there is no significant difference between the two standard deviations.
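The two-tailed rejection rule can be sketched in Python. The function name `two_tailed_f_decision` is hypothetical. The upper critical value 6.00 is $F_{.05,9,4}$ from the solution; the value 3.63 for $F_{.05,4,9}$ is taken from a standard F table and is an assumption of this sketch. The lower critical value follows from the reciprocal relation.

```python
def two_tailed_f_decision(f, upper_crit, swapped_upper_crit):
    """Return (lower critical value, reject?) for a two-tailed F test.

    upper_crit is F_{alpha, v1, v2}; swapped_upper_crit is F_{alpha, v2, v1},
    so the lower critical value F_{1-alpha, v1, v2} is its reciprocal.
    """
    lower_crit = 1 / swapped_upper_crit
    reject = f >= upper_crit or f <= lower_crit
    return lower_crit, reject


lower_crit, reject = two_tailed_f_decision(0.384, upper_crit=6.00,
                                           swapped_upper_crit=3.63)
print(round(lower_crit, 3), reject)
```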

60. With $\sigma_1$ = true standard deviation for not-fused specimens and $\sigma_2$ = true standard deviation for fused specimens, we test $H_0: \sigma_1 = \sigma_2$ vs. $H_a: \sigma_1 > \sigma_2$. The calculated test statistic is $f = \dfrac{s_1^2}{s_2^2} = 1.814$. With numerator d.f. = m – 1 = 10 – 1 = 9, and denominator d.f. = n – 1 = 8 – 1 = 7, $f = 1.814 < 2.72 = F_{.10,9,7}$. We can say that the p-value > .10, which is obviously > .01, so we cannot reject $H_0$. There is not sufficient evidence that the standard deviation of the strength distribution for fused specimens is smaller than that of not-fused specimens.

61. Let $\sigma_1^2$ = variance in weight gain for low-dose treatment, and $\sigma_2^2$ = variance in weight gain for control condition. We wish to test $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 > \sigma_2^2$. The test statistic is $f = \dfrac{s_1^2}{s_2^2}$, and we reject $H_0$ at level .05 if $f \geq F_{.05,19,22} \approx 2.08$. Here $f = 2.85 \geq 2.08$, so reject $H_0$ at level .05. The data does suggest that there is more variability in the low-dose weight gains.

62. For the hypotheses $H_0: \sigma_1 = \sigma_2$ versus $H_a: \sigma_1 \neq \sigma_2$, we find a test statistic of f = 1.22. At df = (47,44) ≈ (40,40), 1.22 < 1.51 indicates the P-value is greater than 2(.10) = .20. Hence, $H_0$ is not rejected. The data does not suggest a significant difference in the two population variances.