Statistical Inference from Two Samples: Hypothesis Testing and Confidence Intervals - Prof, Assignments of Mathematical Statistics

Solutions to various statistical exercises related to hypothesis testing and confidence intervals based on two samples. Topics include calculating test statistics, degrees of freedom, p-values, and constructing confidence intervals.

Typology: Assignments

Pre 2010

Uploaded on 12/20/2009

mlazybum
CHAPTER 9
Section 9.1
1.
a. $E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = 4.1 - 4.5 = -.4$, irrespective of sample sizes.
b. $V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}) = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} = \frac{(1.8)^2}{100} + \frac{(2.0)^2}{100} = .0724$, and the s.d. of $\bar{X} - \bar{Y}$ is $\sqrt{.0724} = .2691$.
c. A normal curve with mean and s.d. as given in a and b (because m = n = 100, the CLT implies that both $\bar{X}$ and $\bar{Y}$ have approximately normal distributions, so $\bar{X} - \bar{Y}$ does also). The shape is not necessarily that of a normal curve when m = n = 10, because the CLT cannot be invoked. So if the two lifetime population distributions are not normal, the distribution of $\bar{X} - \bar{Y}$ will typically be quite complicated.
2. The test statistic value is $z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$, and $H_0$ will be rejected if either $z \ge 1.96$ or $z \le -1.96$. We compute $z = \frac{42{,}500 - 40{,}400}{\sqrt{\frac{2200^2}{45} + \frac{1900^2}{45}}} = \frac{2100}{433.33} = 4.85$. Since 4.85 > 1.96, reject $H_0$ and conclude that the two brands differ with respect to true average tread lives.
3. The test statistic value is $z = \frac{(\bar{x} - \bar{y}) - 5000}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$, and $H_0$ will be rejected at level .01 if $z \ge 2.33$. We compute $z = \frac{(42{,}500 - 36{,}800) - 5000}{\sqrt{\frac{2200^2}{45} + \frac{1500^2}{45}}} = \frac{700}{396.93} = 1.76$, which is not > 2.33, so we don't reject $H_0$ and conclude that the true average life for radials does not exceed that for economy brand by significantly more than 5000.
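The arithmetic in Exercises 2 and 3 can be checked with a short Python sketch. The function name `two_sample_z` and its parameter names are illustrative choices, not part of the solutions; the inputs are the sample means, standard deviations, and sample sizes used above.

```python
import math

def two_sample_z(xbar, ybar, s1, s2, m, n, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0."""
    se = math.sqrt(s1**2 / m + s2**2 / n)  # estimated s.d. of xbar - ybar
    return (xbar - ybar - delta0) / se

# Exercise 2: two-tailed test of mu1 - mu2 = 0
z2 = two_sample_z(42500, 40400, 2200, 1900, 45, 45)
# Exercise 3: upper-tailed test of mu1 - mu2 = 5000
z3 = two_sample_z(42500, 36800, 2200, 1500, 45, 45, delta0=5000)

print(round(z2, 2), round(z3, 2))
```

Comparing `z2` with 1.96 and `z3` with 2.33 reproduces the two decisions above.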

4.

a. From Exercise 2, the C.I. is $(\bar{x} - \bar{y}) \pm 1.96\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = 2100 \pm 1.96(433.33) = 2100 \pm 849.33 = (1250.67, 2949.33)$. In the context of this problem situation, the interval is moderately wide (a consequence of the standard deviations being large), so the information about $\mu_1$ and $\mu_2$ is not as precise as might be desirable.

b. From Exercise 3, the upper bound is $5700 + 1.645(396.93) = 5700 + 652.95 = 6352.95$.
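As a rough check of the interval and the upper bound, the following sketch recomputes both from the standard deviations (2200, 1900, 1500) and the common sample size 45 used in Exercises 2 and 3. The variable names are illustrative.

```python
import math

# Standard errors of xbar - ybar for Exercises 2 and 3
se_ex2 = math.sqrt(2200**2 / 45 + 1900**2 / 45)
se_ex3 = math.sqrt(2200**2 / 45 + 1500**2 / 45)

# Two-sided 95% interval around the observed difference 2100
half_width = 1.96 * se_ex2
ci = (2100 - half_width, 2100 + half_width)

# One-sided 95% upper bound around the observed difference 5700
upper_bound = 5700 + 1.645 * se_ex3

print(tuple(round(v, 2) for v in ci), round(upper_bound, 2))
```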

5.

a. $H_a$ says that the average calorie output for sufferers is more than 1 cal/cm$^2$/min below that for non-sufferers. With $\sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$, the test statistic is $z = -2.90$. At level .01, $H_0$ is rejected if $z \le -2.33$; since -2.90 < -2.33, reject $H_0$.

b. $P = \Phi(-2.90) = .0019$

c. $\beta = 1 - \Phi(-.92) = .8212$

d. The sample-size formula gives a non-integer value for $m = n$, so use 66.
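The two normal-curve areas quoted in parts b and c can be reproduced without a table. This is a minimal sketch; the helper name `phi` is an assumption of this example, and it uses the standard identity relating the normal CDF to the error function.

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_value = phi(-2.90)       # part b: lower-tailed P-value
beta = 1.0 - phi(-0.92)    # part c: type II error probability

print(round(p_value, 4), round(beta, 4))
```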

6.

a.

1 Parameter of interest: $\mu_1 - \mu_2$ = the true difference of mean tensile strength of the 1064 grade and the 1078 grade wire rod. Let $\mu_1$ = 1064 grade average and $\mu_2$ = 1078 grade average.

2 $H_0: \mu_1 - \mu_2 = -10$

3 $H_a: \mu_1 - \mu_2 < -10$

4 $z = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}} = \frac{(\bar{x} - \bar{y}) - (-10)}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$

5 RR: p-value $< \alpha$

6 $z = \frac{-16 - (-10)}{.210} = -28.57$

7 For a lower-tailed test, the p-value $= \Phi(-28.57) \approx 0$, which is less than any $\alpha$, so reject $H_0$. There is very compelling evidence that the mean tensile strength of the 1078 grade exceeds that of the 1064 grade by more than 10.

b. The requested information can be provided by a 95% confidence interval for $\mu_1 - \mu_2$: $(\bar{x} - \bar{y}) \pm 1.96\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = (-16) \pm 1.96(.210) = (-16.412, -15.588)$.
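A small sketch of the wire-rod calculation, assuming only the summary values that appear in part b (observed difference -16 and standard error .210) and the null value -10. The variable names are illustrative.

```python
diff = -16.0     # observed xbar - ybar (from part b)
se = 0.210       # estimated standard error of xbar - ybar (from part b)
delta0 = -10.0   # null value of mu1 - mu2

# Lower-tailed z statistic
z = (diff - delta0) / se

# Two-sided 95% confidence interval for mu1 - mu2
ci = (diff - 1.96 * se, diff + 1.96 * se)

print(round(z, 2), tuple(round(v, 3) for v in ci))
```

A z value this far into the lower tail corresponds to a p-value that is zero to any practical number of decimals, which matches the conclusion in step 7.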

a. Point estimate: $\bar{x} - \bar{y} = 19.9 - 13.7 = 6.2$. It appears that there could be a difference.

b. $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \ne 0$, $z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}} = 1.14$, and the p-value = 2[P(z > 1.14)] = 2(.1271) = .2542. The p-value is larger than any reasonable $\alpha$, so we do not reject $H_0$. There is no significant difference.

c. No. With a normal distribution, we would expect most of the data to be within 2 standard deviations of the mean, and the distribution should be symmetric. Two s.d.'s above the mean is 98.1, but the distribution stops at zero on the left. The distribution is positively skewed.

d. We will calculate a 95% confidence interval for μ, the true average length of stays for patients given the treatment: $19.9 \pm 9.9 = (10.0, 29.8)$.

a. The hypotheses are $H_0: \mu_1 - \mu_2 = 5$ and $H_a: \mu_1 - \mu_2 > 5$. At level .001, $H_0$ should be rejected if $z \ge 3.08$. Since the computed $z$ falls below 3.08, $H_0$ cannot be rejected in favor of $H_a$ at this level, so the use of the high purity steel cannot be justified.

b. $\mu_1 - \mu_2 - \Delta_0 = 1$, so $\beta = \Phi(-.53) = .2981$.

11. $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$. Standard error $= \frac{s}{\sqrt{n}}$. Substitution yields $(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{(SE_1)^2 + (SE_2)^2}$. Using $\alpha = .05$, $z_{\alpha/2} = 1.96$, so the interval is $(0.99, 2.41)$. We are 95% confident that the true average blood lead level for male workers is between 0.99 and 2.41 higher than the corresponding average for female workers.

12. The C.I. is $(\bar{x} - \bar{y}) \pm z_{.005}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = (-11.23, -6.31)$. With 99% confidence we may say that the true difference between the average 7-day and 28-day strengths is between -11.23 and -6.31 N/mm$^2$.

13. $\sigma_1 = \sigma_2$, $d = .04$, $\alpha = .01$, $\beta = .05$, and the test is one-tailed, so $n = \frac{(\sigma_1^2 + \sigma_2^2)(z_\alpha + z_\beta)^2}{d^2}$, which is rounded up, so use n = 50.

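14. The appropriate hypotheses are $H_0: \theta = 0$ vs. $H_a: \theta < 0$, where $\theta = 2\mu_1 - \mu_2$. ($\theta < 0$ is equivalent to $2\mu_1 < \mu_2$, so normal is more than twice schizophrenic.) The estimator of $\theta$ is $\hat{\theta} = 2\bar{X} - \bar{Y}$, with $Var(\hat{\theta}) = 4Var(\bar{X}) + Var(\bar{Y}) = \frac{4\sigma_1^2}{m} + \frac{\sigma_2^2}{n}$; $\sigma_{\hat{\theta}}$ is the square root of $Var(\hat{\theta})$, and $\hat{\sigma}_{\hat{\theta}}$ is obtained by replacing each $\sigma_i^2$ with $S_i^2$. The test statistic is then $z = \frac{\hat{\theta}}{\hat{\sigma}_{\hat{\theta}}}$ (since $\theta_0 = 0$), and $H_0$ is rejected if $z \le -2.33$. With the computed values of $\hat{\theta}$ and $\hat{\sigma}_{\hat{\theta}}$, $z = -1.05$. Because -1.05 > -2.33, $H_0$ is not rejected.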

d.

2 2

2 2

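18. With $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$, we will reject $H_0$ if p-value $< \alpha$. The degrees of freedom are $\nu = \frac{\left(\frac{s_1^2}{m} + \frac{s_2^2}{n}\right)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$, and the test statistic $t = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{.164^2}{6} + \frac{.240^2}{5}}} = 6.17$ leads to a p-value of 2[P(t > 6.17)] < 2(.0005) = .001, which is less than most reasonable $\alpha$'s, so we reject $H_0$ and conclude that there is a difference in the densities of the two brick types.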
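The degrees-of-freedom formula for the two-sample t procedure is easy to mis-key by hand, so a sketch may help. The function name `welch_se_df` is hypothetical; the inputs are the standard deviations (.164, .240) and sample sizes (6, 5) that appear in the test statistic of Exercise 18. The sample means are not reproduced here, so only the standard error and df are computed.

```python
import math

def welch_se_df(s1, m, s2, n):
    """Standard error of xbar - ybar and the estimated df for the
    two-sample t procedure with unequal variances."""
    a = s1**2 / m
    b = s2**2 / n
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a**2 / (m - 1) + b**2 / (n - 1))
    return se, df

se, df = welch_se_df(0.164, 6, 0.240, 5)
df_used = math.floor(df)  # the solutions round df down to an integer

print(round(se, 4), round(df, 2), df_used)
```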

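19. For the given hypotheses, the test statistic is $t = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\frac{s_1^2}{6} + \frac{s_2^2}{6}}} = -1.20$, and the d.f. is $\nu = \frac{\left(\frac{s_1^2}{m} + \frac{s_2^2}{n}\right)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$, so use d.f. = 9. We will reject $H_0$ if $t \le -t_{.01,9} = -2.764$; since -1.20 > -2.764, we don't reject $H_0$.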

20. We want a 95% confidence interval for $\mu_1 - \mu_2$. $t_{.025,9} = 2.262$, so the interval is $-13.6 \pm 2.262(3.007) = (-20.40, -6.80)$. Because the interval is so wide, it does not appear that precise information is available.

21. Let $\mu_1$ = the true average gap detection threshold for normal subjects, and $\mu_2$ = the corresponding value for CTS subjects. The relevant hypotheses are $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 < 0$, and the test statistic is $t = -2.46$. Using d.f. $\nu \approx 15$, the rejection region is $t \le -t_{.01,15} = -2.602$. Since -2.46 is not $\le -2.602$, we fail to reject $H_0$. We have insufficient evidence to claim that the true average gap detection threshold for CTS subjects exceeds that for normal subjects.

b.

[Figure: Comparative Box Plot for High Quality and Poor Quality Fabric; quality (Poor, High) against extensibility (%)]

The comparative boxplot does not suggest a difference between average extensibility for the two types of fabrics.

c. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$. With degrees of freedom $\nu$, which we round down to 10, and using significance level .05 (not specified in the problem), we reject $H_0$ if $|t| \ge t_{.025,10} = 2.228$. The computed test statistic $t$ is not $\ge 2.228$ in absolute value, so we cannot reject $H_0$. There is insufficient evidence to claim that the true average extensibility differs for the two types of fabrics.

a. 95% upper confidence bound: $\bar{x} + t_{.05,65-1}\,SE = 13.4 + 1.671(2.05) = 16.83$ seconds.

b. Let $\mu_1$ and $\mu_2$ represent the true average time spent by blackbirds at the experimental and natural locations, respectively. We wish to test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 > 0$. The relevant test statistic is $t = \frac{\bar{x} - \bar{y}}{\sqrt{2.05^2 + 1.76^2}} = 1.37$, with estimated df $= \frac{(2.05^2 + 1.76^2)^2}{\frac{2.05^4}{m-1} + \frac{1.76^4}{n-1}} \approx 112.9$. Rounding to t = 1.4 and df = 120, the tabulated P-value is very roughly .082. Hence, at the 5% significance level, we fail to reject the null hypothesis. The true average time spent by blackbirds at the experimental location is not statistically significantly higher than at the natural location.

c. 95% CI for silvereyes' average time minus blackbirds' average time at the natural location: $(\bar{x}_{\text{silvereyes}} - \bar{x}_{\text{blackbirds}}) \pm 2.00\sqrt{1.76^2 + 5.06^2}$ = (17.96 sec, 39.44 sec). The t-value 2.00 is based on estimated df = 55.

25. We calculate the degrees of freedom $\nu = \frac{\left(\frac{s_1^2}{m} + \frac{s_2^2}{n}\right)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$, or about 54 (normally we would round down to 53, but this number is very close to 54; of course for this large number of df, using either 53 or 54 won't make much difference in the critical t value), so the desired confidence interval is $(\bar{x} - \bar{y}) \pm t\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = 3.2 \pm 2.931 = (.269, 6.131)$. Because 0 does not lie inside this interval, we can be reasonably certain that the true difference $\mu_1 - \mu_2$ is not 0 and, therefore, that the two population means are not equal. For a 95% interval, the t value increases to about 2.01 or so, which results in the interval $3.2 \pm 3.506$. Since this interval does contain 0, we can no longer conclude that the means are different if we use a 95% confidence interval.

26. Let $\mu_1$ = the true average potential drop for alloy connections and let $\mu_2$ = the true average potential drop for EC connections. Since we are interested in whether the potential drop is higher for alloy connections, an upper-tailed test is appropriate. We test $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 > 0$. Using the SAS output provided, the test statistic, when assuming unequal variances, is t = 3.6362, the corresponding df is 37.5, and the p-value for our upper-tailed test would be ½ (two-tailed p-value) $= \frac{1}{2}(.0008) = .0004$. Our p-value of .0004 is less than the significance level of .01, so we reject $H_0$. We have sufficient evidence to claim that the true average potential drop for alloy connections is higher than that for EC connections.

27. The approximate degrees of freedom for this estimate are $\nu = \frac{\left(\frac{s_1^2}{m} + \frac{s_2^2}{n}\right)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$, which we round down to 8, so $t_{.025,8} = 2.306$ and the desired interval is $(\bar{x} - \bar{y}) \pm t_{.025,8}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = 18.9 \pm 12.607 = (6.3, 31.5)$. Because 0 is not contained in this interval, there is strong evidence that $\mu_1 - \mu_2$ is not 0; i.e., we can conclude that the population means are not equal.

a. $(33.4 - 42.8) \pm 2.719\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = -9.4 \pm 2.576 = (-11.98, -6.83)$. We are 99% confident that the true average load for carbon beams exceeds that for fiberglass beams by between 6.83 and 11.98 kN.

b. The upper limit of the interval in part a does not give a 99% upper confidence bound. The 99% upper bound would be $-9.4 + 2.434(.9473) = -7.09$, meaning that the true average load for carbon beams exceeds that for fiberglass beams by at least 7.09 kN.

a.

The most notable feature of these boxplots is the larger amount of variation present in the mid-range data compared to the high-range data. Otherwise, both look reasonably symmetric with no outliers present.

b. Using df = 23, a 95% confidence interval for $\mu_{\text{mid-range}} - \mu_{\text{high-range}}$ is $(438.3 - 437.45) \pm 2.069\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}} = .85 \pm 8.69 = (-7.84, 9.54)$. Since plausible values for $\mu_{\text{mid-range}} - \mu_{\text{high-range}}$ are both positive and negative (i.e., the interval spans zero), we would conclude that there is not sufficient evidence to suggest that the average value for mid-range and the average value for high-range differ.

32. Let $\mu_1$ = the true average proportional stress limit for red oak and let $\mu_2$ = the true average proportional stress limit for Douglas fir. We test $H_0: \mu_1 - \mu_2 = 1$ vs. $H_a: \mu_1 - \mu_2 > 1$. The test statistic is $t = \frac{(8.48 - 6.65) - 1}{\sqrt{\frac{(.79)^2}{14} + \frac{(1.28)^2}{10}}} = \frac{.83}{\sqrt{.2084}} = 1.818$. With degrees of freedom $\nu = \frac{(.2084)^2}{\frac{\left(\frac{(.79)^2}{14}\right)^2}{13} + \frac{\left(\frac{(1.28)^2}{10}\right)^2}{9}} = 13.85 \approx 13$, the p-value = P(t > 1.8) = .048. We would reject $H_0$ at significance levels greater than .048 (e.g., the standard 5% significance level). At α = .05, there is sufficient evidence to claim that true average proportional stress limit for red oak exceeds that of Douglas fir by more than 1 MPa.
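The test statistic and degrees of freedom in Exercise 32 can be recomputed as follows. This is a sketch under stated assumptions: the red oak mean 8.48 is inferred from the extraction-damaged numerator (it is the value consistent with the Douglas fir mean 6.65, the variance sum .2084, and t = 1.818), and the variable names are illustrative.

```python
import math

xbar, ybar = 8.48, 6.65   # sample means (8.48 is an inferred value)
s1, m = 0.79, 14          # red oak s.d. and sample size
s2, n = 1.28, 10          # Douglas fir s.d. and sample size
delta0 = 1.0              # null value of mu1 - mu2

a = s1**2 / m
b = s2**2 / n

# Two-sample t statistic for H0: mu1 - mu2 = delta0
t = ((xbar - ybar) - delta0) / math.sqrt(a + b)

# Estimated degrees of freedom, rounded down before using the t table
df = (a + b) ** 2 / (a**2 / (m - 1) + b**2 / (n - 1))
df_used = math.floor(df)

print(round(t, 3), round(df, 2), df_used)
```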

[Figure: Comparative Box Plot for High Range and Mid Range; vertical scale from 420 to 470]

35. There are two changes that must be made to the procedure we currently use. First, the equation used to compute the value of the t test statistic is $t = \frac{(\bar{x} - \bar{y}) - \Delta}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$, where $s_p$ is defined as in Exercise 34 above. Second, the degrees of freedom = m + n - 2. Assuming equal variances in the situation from Exercise 33, we calculate $s_p = 2.544$. The value of the test statistic is then $t \approx -2.2$. The degrees of freedom = 16, and the p-value is P(t < -2.2) = .021. Since .021 > .01, we fail to reject $H_0$.

Section 9.

36. $\bar{d} = 7.25$, $s_D = 11.8628$

1 Parameter of interest: $\mu_D$ = true average difference of breaking load for fabric in unabraded or abraded condition.

2 $H_0: \mu_D = 0$

3 $H_a: \mu_D > 0$

4 $t = \frac{\bar{d} - \mu_D}{s_D/\sqrt{n}} = \frac{\bar{d} - 0}{s_D/\sqrt{n}}$

5 RR: $t \ge t_{.01,7} = 2.998$

6 $t = \frac{7.25 - 0}{11.8628/\sqrt{8}} = 1.73$

7 Fail to reject $H_0$. The data does not indicate a significant mean difference in breaking load for the two fabric load conditions.
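A minimal sketch of the paired t calculation in Exercise 36, using the summary statistics given above. The sample size n = 8 is inferred here from the 7 degrees of freedom in the rejection region, and 2.998 is the tabled value of $t_{.01,7}$.

```python
import math

d_bar = 7.25     # mean of the paired differences
s_d = 11.8628    # standard deviation of the differences
n = 8            # number of pairs (df = n - 1 = 7)
t_crit = 2.998   # t_{.01,7} from a t table

# Paired t statistic for H0: mu_D = 0
t = (d_bar - 0) / (s_d / math.sqrt(n))
reject = t >= t_crit

print(round(t, 2), reject)
```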

a. This exercise calls for paired analysis. First, compute the difference between indoor and outdoor concentrations of hexavalent chromium for each of the 33 houses. These 33 differences are summarized as follows: n = 33, $\bar{d} = -.4239$, $s_d = .3868$, where d = (indoor value - outdoor value). Then $t_{.025,32} = 2.037$, and a 95% confidence interval for the population mean difference between indoor and outdoor concentration is $-.4239 \pm 2.037\left(\frac{.3868}{\sqrt{33}}\right) = -.4239 \pm .13715 = (-.5611, -.2868)$. We can be highly confident, at the 95% confidence level, that the true average concentration of hexavalent chromium outdoors exceeds the true average concentration indoors by between .2868 and .5611 nanograms/m$^3$.

b. A 95% prediction interval for the difference in concentration for the 34th house is $\bar{d} \pm t_{.025,32}\, s_d\sqrt{1 + \frac{1}{n}} = -.4239 \pm 2.037(.3868)\sqrt{1 + \frac{1}{33}} = (-1.224, .3758)$. This prediction interval means that the indoor concentration may exceed the outdoor concentration by as much as .3758 nanograms/m$^3$ and that the outdoor concentration may exceed the indoor concentration by as much as 1.224 nanograms/m$^3$, for the 34th house. Clearly, this is a wide prediction interval, largely because of the amount of variation in the differences.
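The contrast between the confidence interval and the prediction interval comes entirely from the factor multiplying $s_d$. The sketch below recomputes both from the summary statistics quoted above; the variable names are illustrative, and small differences in the last digit relative to the hand calculation are due to rounding of intermediate values.

```python
import math

n = 33
d_bar = -0.4239   # mean of (indoor - outdoor) differences
s_d = 0.3868      # standard deviation of the differences
t_crit = 2.037    # t_{.025,32}

# CI for the mean difference: s_d is scaled by 1/sqrt(n)
ci_half = t_crit * s_d / math.sqrt(n)
ci = (d_bar - ci_half, d_bar + ci_half)

# PI for a single new difference: s_d is scaled by sqrt(1 + 1/n)
pi_half = t_crit * s_d * math.sqrt(1 + 1 / n)
pi = (d_bar - pi_half, d_bar + pi_half)

print(ci, pi)
```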

a. The median of the “Normal” data is 46.80 and the lower and upper quartiles are 45.55 and 49.55, which yields an IQR of 49.55 - 45.55 = 4.00. The median of the “High” data is 90.1 and the lower and upper quartiles are 88.55 and 90.95, which yields an IQR of 90.95 - 88.55 = 2.40. The most significant feature of these boxplots is the fact that their locations (medians) are far apart.

[Figure: Comparative Boxplots for Normal and High Strength Concrete Mix]

1 Parameter of interest: $\mu_d$ denotes the true average difference of spatial ability in brothers exposed to DES and brothers not exposed to DES. Let $\mu_d = \mu_{\text{exposed}} - \mu_{\text{unexposed}}$.

2 $H_0: \mu_D = 0$

3 $H_a: \mu_D < 0$

4 $t = \frac{\bar{d} - \mu_D}{s_D/\sqrt{n}} = \frac{\bar{d} - 0}{s_D/\sqrt{n}}$

5 RR: P-value < .05, df = 9

6 The computed value of $t$ has corresponding p-value .028 (from Table A.8).

7 Reject $H_0$. The data supports the idea that exposure to DES reduces spatial ability.

a. Although there is a “jump” in the middle of the Normal Probability plot, the data follow a

reasonably straight path, so there is no strong reason for doubting the normality of the

population of differences.

b. A 95% lower confidence bound for the population mean difference is $\bar{d} - t_{.05,14}\left(\frac{s_d}{\sqrt{n}}\right) = -49.14$. We are 95% confident that the true mean difference between age at onset of Cushing's disease symptoms and age at diagnosis is greater than -49.14.

c. A 95% upper confidence bound for the corresponding population mean difference is

44. We need to check the differences to see if the assumption of normality is plausible. A normal probability plot validates our use of the t distribution. A 95% upper confidence bound for $\mu_D$ is $\bar{d} + t_{.05,15}\left(\frac{s_d}{\sqrt{n}}\right) = 2635.63 + 222.91 = 2858.54$. We are 95% confident that the true mean difference between modulus of elasticity after 1 minute and after 4 weeks is at most 2858.54.

45. From the data, n = 12, $\bar{d}$ = -0.73, $s_d$ = 2.81.

a. Let $\mu_d$ = the true mean difference in strength between curing under moist conditions and laboratory drying conditions. A 95% CI for $\mu_d$ is $\bar{d} \pm t_{.025,11}\, s_d/\sqrt{n} = -0.73 \pm 2.201(2.81)/\sqrt{12}$ = (-2.52 MPa, 1.05 MPa). In particular, this interval estimate includes the value zero, suggesting that true mean strength is not significantly different under these two conditions.

b. Since n = 12, we must check that the differences are plausibly from a normal population.

The normal probability plot below strongly substantiates that condition.

[Figure: Normal Probability Plot of Differences; percent against differences]

46. With $(x_1, y_1) = (6, 5)$, $(x_2, y_2) = (15, 14)$, $(x_3, y_3) = (1, 0)$, and $(x_4, y_4) = (21, 20)$, $\bar{d} = 1$ and $s_d = 0$ (the $d_i$'s are 1, 1, 1, and 1), while $s_1 = s_2 = 8.96$, so $s_p = 8.96$ and t = .16.
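The point of Exercise 46 is that a pooled two-sample analysis can miss a perfectly consistent paired difference. The following sketch, using only Python's standard library and the four pairs listed above, recomputes the quantities quoted in the solution; the variable names are illustrative.

```python
import math
import statistics

x = [6, 15, 1, 21]
y = [5, 14, 0, 20]
m, n = len(x), len(y)

# Paired view: every difference equals 1, so the differences have no spread
d = [xi - yi for xi, yi in zip(x, y)]
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)

# Pooled two-sample view: large within-sample spread swamps the difference
s1 = statistics.stdev(x)
s2 = statistics.stdev(y)
sp = math.sqrt(((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2))
t_pooled = (statistics.mean(x) - statistics.mean(y)) / (sp * math.sqrt(1 / m + 1 / n))

print(d_bar, s_d, round(sp, 2), round(t_pooled, 2))
```

With $s_d = 0$ the paired t statistic is undefined (division by zero), which is the extreme case of the pairing removing all variability; the pooled statistic, by contrast, is close to zero.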

Section 9.

47. $H_0$ will be rejected if $z \le -z_{.01} = -2.33$. With $\hat{p}_1 = .150$ and $\hat{p}_2 = .300$, $\hat{p} = .263$ and $\hat{q} = .737$. The calculated test statistic is $z = \frac{.150 - .300}{\sqrt{(.263)(.737)\left(\frac{1}{200} + \frac{1}{600}\right)}} = -4.18$. Because $-4.18 \le -2.33$, $H_0$ is rejected; the proportion of those who repeat after inducement appears lower than those who repeat after no inducement.
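The pooled two-proportion z statistic in Exercise 47 can be sketched as below. The function name `two_proportion_z` is hypothetical, and pairing the sample size 200 with $\hat{p}_1$ and 600 with $\hat{p}_2$ is an assumption of this example (it is the assignment that reproduces the pooled value .263).

```python
import math

def two_proportion_z(p1, m, p2, n):
    """Pooled estimate and z statistic for H0: p1 - p2 = 0."""
    p_pool = (m * p1 + n * p2) / (m + n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / m + 1 / n))
    return p_pool, (p1 - p2) / se

p_pool, z = two_proportion_z(0.150, 200, 0.300, 600)
reject = z <= -2.33   # lower-tailed test at level .01

print(round(p_pool, 4), round(z, 2), reject)
```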

a. $H_0$ will be rejected if $|z| \ge 1.96$. With $\hat{p}_1 = .2100$ and $\hat{p}_2 = .4167$, $\hat{p} = .2875$, and $z = \frac{.2100 - .4167}{\sqrt{(.2875)(.7125)\left(\frac{1}{300} + \frac{1}{180}\right)}} = -4.84$. Since $-4.84 \le -1.96$, $H_0$ is rejected.

b. $\bar{p} = .275$ and $\sigma = .0432$, so power =