Comparing Two Population Means: Hypothesis Testing and Confidence Intervals - Prof. Dongme, Study notes of Statistics

An in-depth analysis of the methods used to compare two population means, including independent sampling, large and small sample cases, and paired differences. It covers hypothesis testing using both critical-value and p-value approaches, as well as constructing confidence intervals.

Typology: Study notes

2009/2010

Uploaded on 04/29/2010

gbarb002

Chapter 9: Inferences based on two samples: confidence intervals and tests of hypotheses

9.1 The target parameter

$\mu_1 - \mu_2$: difference between two population means

$p_1 - p_2$: difference between two population proportions

$\sigma_1^2 / \sigma_2^2$: ratio of two population variances

9.2 Comparing two population means: independent sampling

Case 1, Large samples:

Conditions required for valid large-sample inference:

1. The two samples are randomly and independently selected from two independent populations.
2. $n_1 \ge 30$ and $n_2 \ge 30$.

The sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately normal with:

Mean: $\mu_{(\bar{x}_1 - \bar{x}_2)} = \mu_1 - \mu_2$

Standard error:

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
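As a brief sketch of where this standard error comes from (assuming, as the first condition states, that the two samples are independent): the variance of a difference of independent sample means is the sum of their variances, and each sample mean has variance $\sigma_i^2 / n_i$.

$$\operatorname{Var}(\bar{x}_1 - \bar{x}_2) = \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Taking the square root gives the standard error, with $s_1^2$ and $s_2^2$ substituted for the unknown population variances when the samples are large.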

$(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$ (difference of two population means):

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Interpret: We are $(1-\alpha)100\%$ confident that the true difference between the two population means falls in this interval.

Example 1, DIETSTUDY

To investigate the effect of a new low-fat diet on weight loss, two random samples of 100 people each are selected. One group of 100 is placed on the low-fat diet, while the other group follows a regular diet. For each person, the amount of weight lost (or gained) in a 3-week period is recorded.

| Diet | Weight loss |
| --- | --- |
| Low-fat diet (1) | 8, 21, 13, …, 10 (100 observations) |
| Regular diet (2) | 6, 14, 4, …, 8 (100 observations) |

Q: Form a 95% confidence interval for $(\mu_1 - \mu_2)$ to estimate the difference between the population mean weight losses for the two diets. Interpret the result.

Sample information:

Group Statistics (WTLOSS)

| DIET | N | Mean | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- |
| LOWFAT | $n_1 = 100$ | $\bar{x}_1 = 9.31$ | $s_1 = 4.668$ | . |
| REGULAR | $n_2 = 100$ | $\bar{x}_2 = 7.40$ | $s_2 = 4.035$ | . |

95% confidence interval for the difference in means $(\mu_1 - \mu_2)$:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{0.025}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (9.31 - 7.40) \pm 1.96\sqrt{\frac{4.668^2}{100} + \frac{4.035^2}{100}} = 1.91 \pm 1.96(0.62) = 1.91 \pm 1.22 = (0.69,\ 3.13)$$

Interpret: We are 95% confident that the difference between the mean weight loss on the low-fat diet and on the regular diet is between 0.69 pounds and 3.13 pounds. Note: $(\mu_1 - \mu_2)$ is at least 0.69 and at most 3.13 pounds, so we can infer $\mu_1 > \mu_2$.
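The interval can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_sample_z_ci` and its parameters are illustrative choices, not part of the notes.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample CI for mu1 - mu2, using sample SDs in place of sigmas."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)      # standard error of the difference
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# DIETSTUDY summary statistics from the Group Statistics table
lo, hi = two_sample_z_ci(9.31, 4.668, 100, 7.40, 4.035, 100)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
```

Carrying more decimals than the hand calculation (which rounds the standard error to 0.62) moves each endpoint by about 0.01.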

How can we make inferences based on a $(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$?

If the confidence interval for $(\mu_1 - \mu_2)$ includes 0, there is no significant difference between the two population means. If the confidence interval for $(\mu_1 - \mu_2)$ does not include 0, there is a significant difference between the two population means.

Example: A confidence interval for $(\mu_1 - \mu_2)$ is (-10, 4). What inference can we make?

$\mu_1$ and $\mu_2$ are not significantly different.

A confidence interval for $(\mu_1 - \mu_2)$ is (-18, -9). What inference can we make?

$\mu_1 < \mu_2$

A confidence interval for $(\mu_1 - \mu_2)$ is (3, 12). What inference can we make?

$\mu_1 > \mu_2$
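The rule above reduces to checking where 0 sits relative to the interval. A small sketch (the function name and the returned labels are illustrative, not from the notes):

```python
def infer_from_ci(lower, upper):
    """Classify a confidence interval for mu1 - mu2 by its position relative to 0."""
    if lower > 0:
        return "mu1 > mu2"                    # whole interval above 0
    if upper < 0:
        return "mu1 < mu2"                    # whole interval below 0
    return "no significant difference"        # interval contains 0

for interval in [(-10, 4), (-18, -9), (3, 12)]:
    print(interval, "->", infer_from_ci(*interval))
```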

Hypothesis test for $(\mu_1 - \mu_2)$:

Critical-value approach:

1. $H_0: (\mu_1 - \mu_2) = D_0$
   $H_a: (\mu_1 - \mu_2) \ne D_0$ (or $(\mu_1 - \mu_2) > D_0$, or $(\mu_1 - \mu_2) < D_0$)

2. Significance level $\alpha$.

3. Test statistic:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

4. Rejection region:

   $|z| > z_{\alpha/2}$ when $H_a: \mu_1 - \mu_2 \ne D_0$

   $z < -z_{\alpha}$ when $H_a: \mu_1 - \mu_2 < D_0$

   $z > z_{\alpha}$ when $H_a: \mu_1 - \mu_2 > D_0$

5. Conclusion: If the value of the test statistic falls in the rejection region (R.R.), reject $H_0$ and conclude that, at the $\alpha$ level, there is sufficient evidence that $H_a$ is true. If the value of the test statistic does not fall in the R.R., do not reject $H_0$ and conclude that, at the $\alpha$ level, there is insufficient evidence that $H_a$ is true.
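The rejection-region step can be sketched in code. This is only an illustration under the large-sample normal approximation; the function name and the labels `"two-sided"`, `"less"`, and `"greater"` are assumptions made for this sketch.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha, alternative):
    """Return True if z falls in the rejection region for the given alternative."""
    std_normal = NormalDist()
    if alternative == "two-sided":            # Ha: mu1 - mu2 != D0
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if alternative == "less":                 # Ha: mu1 - mu2 < D0
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "greater":              # Ha: mu1 - mu2 > D0
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

# Illustrative z values, not taken from a particular data set
print(in_rejection_region(1.5, 0.05, "greater"))
print(in_rejection_region(-2.0, 0.05, "less"))
```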

P-value approach:

1. $H_0: (\mu_1 - \mu_2) = D_0$
   $H_a: (\mu_1 - \mu_2) \ne D_0$ (or $(\mu_1 - \mu_2) > D_0$, or $(\mu_1 - \mu_2) < D_0$)

2. Significance level $\alpha$.

3. Test statistic:

$$z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

4. p-value $= 2P(z > |z_0|)$ when $H_a: \mu_1 - \mu_2 \ne D_0$

   p-value $= P(z < z_0)$ when $H_a: \mu_1 - \mu_2 < D_0$

   p-value $= P(z > z_0)$ when $H_a: \mu_1 - \mu_2 > D_0$

5. Conclusion: If the p-value is smaller than $\alpha$, reject $H_0$ and conclude that, at the $\alpha$ level, there is sufficient evidence that $H_a$ is true. If the p-value is no less than $\alpha$, do not reject $H_0$ and conclude that, at the $\alpha$ level, there is insufficient evidence that $H_a$ is true.
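The three p-value cases can be written as one small function. This is a sketch using the standard normal CDF from the Python standard library; the name `z_p_value` and the alternative labels are illustrative.

```python
from statistics import NormalDist

def z_p_value(z0, alternative):
    """p-value for an observed z statistic under the standard normal distribution."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z0)))   # 2 P(z > |z0|)
    if alternative == "less":
        return cdf(z0)                  # P(z < z0)
    if alternative == "greater":
        return 1 - cdf(z0)              # P(z > z0)
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

p = z_p_value(3.09, "two-sided")
print(f"two-sided p-value for z0 = 3.09: {p:.3f}")
```

The decision rule is then simply `p < alpha`.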

Examples for comparing two population means: independent, large-samples:

Example 1, DIETSTUDY: Recall the two random samples of 100 people each, one placed on the low-fat diet (1) and the other on a regular diet (2), with the amount of weight lost (or gained) in a 3-week period recorded for each person.

a. At $\alpha = 0.05$, conduct a test of hypothesis to determine whether the mean weight loss for the low-fat diet is different from that of the regular diet.

Sample information: see the SPSS output for DIETSTUDY.

Step 1. $H_0: \mu_1 - \mu_2 = 0$; $H_a: \mu_1 - \mu_2 \ne 0$ (different)

Step 2. Test statistic:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{9.31 - 7.40}{\sqrt{\dfrac{4.668^2}{100} + \dfrac{4.035^2}{100}}} \approx 3.09$$

Step 3. Rejection region: $|z| > z_{\alpha/2} = z_{0.025} = 1.96$

Step 4. Since 3.09 > 1.96, reject $H_0$.

At $\alpha = 0.05$, there is sufficient evidence to conclude that the mean weight loss on the low-fat diet is different from that on the regular diet.

p-value $= 2P(z > 3.09) = 2(0.5 - 0.4990) = 2(0.001) = 0.002$

*(SPSS output: p-value = 0.002 < 0.05)

b. At $\alpha = 0.05$, conduct a test of hypothesis to determine whether the mean weight loss for the low-fat diet is greater than that of the regular diet.

Step 1. $H_0: \mu_1 - \mu_2 = 0$; $H_a: \mu_1 - \mu_2 > 0$ ($\mu_1$ greater than $\mu_2$)

Step 2. Test statistic:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{9.31 - 7.40}{\sqrt{\dfrac{4.668^2}{100} + \dfrac{4.035^2}{100}}} \approx 3.09$$

Step 3. Rejection region: $z > z_{\alpha} = z_{0.05} = 1.645$

Step 4. Since 3.09 > 1.645, reject $H_0$.

*(p-value = 0.002/2 = 0.001 < 0.05)

At $\alpha = 0.05$, there is sufficient evidence to conclude that the mean weight loss on the low-fat diet is greater than that on the regular diet.

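SPSS output for DIETSTUDY

Group Statistics (WTLOSS)

| DIET | N | Mean | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- |
| LOWFAT | 100 | 9.31 | 4.668 | . |
| REGULAR | 100 | 7.40 | 4.035 | . |

Independent Samples Test (WTLOSS). F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.

| | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Equal variances assumed | 1.367 | .244 | 3.095 | 198 | .002 | 1.910 | .617 | .693 | 3. |
| Equal variances not assumed | | | 3.095 | 193.940 | .002 | 1.910 | .617 | .693 | 3. |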

Case 2, Small samples with equal variances

Conditions required for valid small-sample inference:

1. The two samples are randomly and independently selected from the two target populations. (sampling procedure)
2. Both sampled populations have distributions that are approximately normal. (normal probability plots)
3. The population variances are equal, $\sigma_1^2 = \sigma_2^2$. (side-by-side box plots)
4. The sample sizes are small ($n_1 < 30$, $n_2 < 30$).

Since the two populations have equal variance ($\sigma_1^2 = \sigma_2^2$), it is reasonable to use the information contained in both samples to construct a pooled variance estimator for use in confidence intervals and test statistics:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

$(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \quad t_{\alpha/2} \text{ with } df = n_1 + n_2 - 2$$

Hypothesis test for the mean difference $(\mu_1 - \mu_2)$:

1. $H_0: (\mu_1 - \mu_2) = D_0$
   $H_a: (\mu_1 - \mu_2) \ne D_0$ (or $(\mu_1 - \mu_2) > D_0$, or $(\mu_1 - \mu_2) < D_0$)

2. Level of significance $\alpha$.

3. Test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad df = n_1 + n_2 - 2$$

4. Rejection region:

   $|t| > t_{\alpha/2}$ when $H_a: \mu_1 - \mu_2 \ne D_0$

   $t < -t_{\alpha}$ when $H_a: \mu_1 - \mu_2 < D_0$

   $t > t_{\alpha}$ when $H_a: \mu_1 - \mu_2 > D_0$

5. Conclusion.
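One way to read the pooled variance estimator $s_p^2$ used in this procedure, offered here as a rearrangement of its definition rather than something stated in the notes, is as a weighted average of the two sample variances with weights proportional to their degrees of freedom:

$$s_p^2 = \frac{n_1 - 1}{n_1 + n_2 - 2}\,s_1^2 + \frac{n_2 - 1}{n_1 + n_2 - 2}\,s_2^2$$

The two weights sum to 1, so $s_p^2$ always lies between $s_1^2$ and $s_2^2$, and it equals their simple average when $n_1 = n_2$.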

Examples for comparing two population means: independent, small samples:

Example 1: READING

Suppose we wish to compare a new method of teaching reading to "slow learners" with the current standard method. The response variable is the reading test score after 6 months. Twenty-two slow learners are randomly selected; 10 are taught by the new method and 12 by the standard method. The test scores are listed below.

| Method | Test scores |
| --- | --- |
| New method (1) | 80, 80, 79, 81, 76, 66, 71, 76, 70, 85 |
| Standard method (2) | 79, 62, 70, 68, 73, 76, 86, 73, 72, 68, 75, 66 |

a. Use a 95% confidence interval to estimate the true mean difference between the test scores for the new method and the standard method. Interpret the interval.

95% confidence interval for the mean difference $(\mu_1 - \mu_2)$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)5.835^2 + (12 - 1)6.344^2}{10 + 12 - 2} = 37.457$$

$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025}^{(10+12-2)}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (76.4 - 72.33) \pm 2.086\sqrt{37.457\left(\frac{1}{10} + \frac{1}{12}\right)} = 4.07 \pm 2.086(2.621) = (-1.396,\ 9.536)$$

Interpret: We are 95% confident that the difference in mean test score between the two methods falls between -1.396 and 9.536. Because the interval includes 0, there is no significant difference between $\mu_1$ and $\mu_2$.

b. Conduct a test of hypothesis to determine whether the new method leads to a higher test score than the standard method. Use $\alpha = 0.05$.

Step 1. $H_0: \mu_1 - \mu_2 = 0$; $H_a: \mu_1 - \mu_2 > 0$ ($\mu_1$ higher than $\mu_2$)

Step 2. Test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{76.4 - 72.33}{\sqrt{37.457\left(\dfrac{1}{10} + \dfrac{1}{12}\right)}} = 1.552$$

Step 3. Rejection region: $t > t_{\alpha} = t_{0.05}^{(10+12-2)} = 1.725$

Step 4. Since 1.552 < 1.725, fail to reject $H_0$. *(p-value = 0.136/2 = 0.068)

At $\alpha = 0.05$, there is insufficient evidence to conclude that the new method leads to a higher test score than the standard method.
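The READING calculations can be reproduced from the raw scores. The sketch below uses only the Python standard library; because it has no t-distribution quantile function, the critical values 2.086 and 1.725 are taken from the notes, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

new = [80, 80, 79, 81, 76, 66, 71, 76, 70, 85]
standard = [79, 62, 70, 68, 73, 76, 86, 73, 72, 68, 75, 66]

n1, n2 = len(new), len(standard)
# Pooled variance: df-weighted combination of the two sample variances
sp2 = ((n1 - 1) * variance(new) + (n2 - 1) * variance(standard)) / (n1 + n2 - 2)
se = sqrt(sp2 * (1 / n1 + 1 / n2))   # standard error of the difference
diff = mean(new) - mean(standard)
t_stat = diff / se                    # test statistic for H0: mu1 - mu2 = 0

T_025_DF20 = 2.086   # t_{0.025}, df = 20 (table value quoted in the notes)
T_05_DF20 = 1.725    # t_{0.05},  df = 20 (table value quoted in the notes)

ci = (diff - T_025_DF20 * se, diff + T_025_DF20 * se)
reject_h0 = t_stat > T_05_DF20        # one-sided test, Ha: mu1 - mu2 > 0

print(f"sp^2 = {sp2:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Working from unrounded means and variances gives endpoints that differ from the hand calculation in the third decimal place.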

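SPSS output for READING

Group Statistics (SCORE)

| METHOD | N | Mean | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- |
| NEW | 10 | 76.40 | 5.835 | 1. |
| STD | 12 | 72.33 | 6.344 | 1. |

Independent Samples Test (SCORE). F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.

| | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Equal variances assumed | .002 | .967 | 1.552 | 20 | .136 | 4.067 | 2.620 | -1.399 | 9. |
| Equal variances not assumed | | | 1.564 | 19.769 | .134 | 4.067 | 2.600 | -1.360 | 9. |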

Case 3, Small samples with unequal variances

Conditions:

1. The two samples are randomly and independently selected from the two target populations.
2. Both sampled populations are approximately normal.
3. The population variances are not equal ($\sigma_1^2 \ne \sigma_2^2$).

The procedure is given in the textbook, pp. 422-423.
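The notes defer this procedure to the textbook. A common approach in this situation is the unpooled (Welch) t statistic with Satterthwaite's approximate degrees of freedom; treating that as the intended procedure is an assumption here, although it does appear to match the "Equal variances not assumed" row of the SPSS output for READING. A minimal sketch using the READING scores (the function name `welch_t` is illustrative):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Unpooled t statistic, approximate df, and standard error of the difference."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1       # s1^2 / n1
    v2 = variance(sample2) / n2       # s2^2 / n2
    se = sqrt(v1 + v2)
    t = (mean(sample1) - mean(sample2)) / se
    # Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

new = [80, 80, 79, 81, 76, 66, 71, 76, 70, 85]
standard = [79, 62, 70, 68, 73, 76, 86, 73, 72, 68, 75, 66]
t, df, se = welch_t(new, standard)
print(f"t = {t:.3f}, df = {df:.3f}, std. error = {se:.3f}")
```

The approximate df is generally not an integer, so a table lookup would round it down to the nearest whole number.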

9.3 Comparing two population means: paired difference experiments

Comparing two sampling designs:

Example 1: To estimate the difference $(\mu_1 - \mu_2)$ in mean test score between the new teaching method and the standard method in reading:

1. Randomly select 16 slow learners; 8 are assigned to the new method and the other 8 to the standard method. (independent sampling)
2. Select 8 pairs of slow learners (not randomly) so that the two learners in each pair have similar reading IQs. In each pair, one uses the new method and the other uses the standard method. The difference between the test scores within each pair can then be used to make inferences about $(\mu_1 - \mu_2)$. (The two subjects in each pair are at a similar level; treatments are then assigned to see the effect.)

Example 2: To estimate the difference $(\mu_1 - \mu_2)$ in mean car mileage between using a new gasoline additive and using no additive:

1. Randomly choose 10 cars, regardless of brand; 5 are assigned to use the new additive and the other 5 are not. (independent sampling)
2. Suppose mileage differs among car brands. Then choose 5 pairs of cars, one pair for each brand: Honda, Ford, Toyota, Nissan, Hyundai. Within each brand, one car uses the new additive and the other does not. The difference between the mileages within each pair can then be used to make inferences about $(\mu_1 - \mu_2)$. (The two subjects in each pair are at a similar level; treatments are then assigned to see the effect.)

Paired difference experiment: each pair consists of two similar experimental units; the observations are paired and the differences are analyzed.

Blocking: making comparisons within groups of similar experimental units.

A paired difference experiment is a simple example of a randomized block design.

(Read textbook pp. 432-433.)

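Data layout:

| Pair | Sample 1 (New) | Sample 2 (Old) | Difference $x_D$ | $x_D^2$ |
| --- | --- | --- | --- | --- |
| 1 | 18 | 16 | 18 - 16 = 2 | 4 |
| 2 | 31 | 28 | 31 - 28 = 3 | 9 |
| 3 | 25 | 26 | 25 - 26 = -1 | 1 |
| 4 | 23 | 21 | 23 - 21 = 2 | 4 |
| 5 | 26 | 22 | 26 - 22 = 4 | 16 |
| Total | | | $\sum x_D = 10$ | $\sum x_D^2 = 34$ |

$$n_D = 5, \quad \bar{x}_D = \frac{\sum x_D}{n_D} = \frac{10}{5} = 2, \quad s_D = \sqrt{\frac{\sum x_D^2 - \dfrac{(\sum x_D)^2}{n_D}}{n_D - 1}} = \sqrt{\frac{34 - \dfrac{10^2}{5}}{5 - 1}} \approx 1.87$$

Note: The variable of interest is the paired difference $x_D$.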
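The layout above amounts to reducing two columns to a single column of differences and then summarizing it as a one-sample problem. A short sketch with the same five pairs (variable names are illustrative):

```python
from statistics import mean, stdev

new = [18, 31, 25, 23, 26]
old = [16, 28, 26, 21, 22]

# One difference per pair; all later inference uses only this column
diffs = [a - b for a, b in zip(new, old)]
mean_d = mean(diffs)    # sample mean of the differences
sd_d = stdev(diffs)     # sample standard deviation (n - 1 in the denominator)

print(diffs, mean_d, round(sd_d, 3))
```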

Inference based on paired differences (large sample):

Conditions required for a valid large-sample inference about $\mu_D$:

1. A random sample of differences is selected from the target population of differences.
2. The sample size is large ($n_D \ge 30$).

Paired difference $(1-\alpha)100\%$ confidence interval for $\mu_D = (\mu_1 - \mu_2)$:

$$\bar{x}_D \pm z_{\alpha/2}\frac{\sigma_D}{\sqrt{n_D}} \approx \bar{x}_D \pm z_{\alpha/2}\frac{s_D}{\sqrt{n_D}}$$

Paired difference test of hypothesis for $\mu_D = (\mu_1 - \mu_2)$:

1. $H_0: \mu_D = D_0$
   $H_a: \mu_D \ne D_0$ (or $H_a: \mu_D > D_0$, or $H_a: \mu_D < D_0$)

2. Significance level $\alpha$.

3. Test statistic:

$$z = \frac{\bar{x}_D - D_0}{\sigma_D / \sqrt{n_D}} \approx \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}}$$

4. Rejection region:

   $|z| > z_{\alpha/2}$ when $H_a: \mu_D \ne D_0$

   $z < -z_{\alpha}$ when $H_a: \mu_D < D_0$

   $z > z_{\alpha}$ when $H_a: \mu_D > D_0$

5. Conclusion.

Examples of making inferences based on paired differences (large sample):

Example: To investigate which supermarket (A or B) has the lower prices in town, an agency randomly selected 100 items common to the two supermarkets and recorded the price charged by each supermarket. The summary results are provided below.

$\bar{x}_A = 2.09$, $\bar{x}_B = 1.99$, $\bar{x}_D = 0.10$

$s_A = 0.24$, $s_B = 0.19$, $s_D = 0.03$

a. Use a 95% confidence interval for $\mu_D = \mu_A - \mu_B$ to estimate the mean price difference between supermarket A and supermarket B. Interpret the result.

$$\bar{x}_D \pm z_{0.025}\frac{s_D}{\sqrt{n_D}} = 0.10 \pm 1.96\,\frac{0.03}{\sqrt{100}} = 0.10 \pm 1.96(0.003) = (0.09,\ 0.11)$$

Interpret: We are 95% confident that the mean price difference between supermarkets A and B falls between 0.09 and 0.11 dollars. ($\mu_A > \mu_B$)

b. Conduct a test of hypothesis to determine whether the mean price at supermarket B is lower than that at supermarket A. Use $\alpha = 0.05$.

Step 1. $H_0: \mu_D = 0$; $H_a: \mu_D = \mu_A - \mu_B > 0$

Step 2. Test statistic:

$$z = \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}} = \frac{0.10 - 0}{0.03 / \sqrt{100}} = \frac{0.10}{0.003} = 33.3$$

Step 3. Rejection region: $z > z_{\alpha} = z_{0.05} = 1.645$

Step 4. Since 33.3 > 1.645, reject $H_0$.

At $\alpha = 0.05$, there is sufficient evidence to conclude that the mean price at supermarket A is higher than that at supermarket B.
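Both parts of the supermarket example follow from the three summary numbers $\bar{x}_D$, $s_D$, and $n_D$. A minimal sketch under the large-sample normal approximation (the function name `paired_z` and its parameters are illustrative):

```python
from math import sqrt
from statistics import NormalDist

def paired_z(mean_d, sd_d, n_d, d0=0.0, confidence=0.95):
    """Large-sample z statistic and confidence interval for mu_D."""
    se = sd_d / sqrt(n_d)                            # standard error of the mean difference
    z = (mean_d - d0) / se
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z, (mean_d - z_crit * se, mean_d + z_crit * se)

z, (lo, hi) = paired_z(0.10, 0.03, 100)
print(f"z = {z:.1f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```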

Inference based on paired differences (small sample):

Conditions required for a valid small-sample inference about $\mu_D$:

1. A random sample of differences is selected from the target population of differences.
2. The population of differences is approximately normally distributed.
3. The sample size is small ($n_D < 30$).

$(1-\alpha)100\%$ confidence interval for the paired difference $\mu_D = (\mu_1 - \mu_2)$:

$$\bar{x}_D \pm t_{\alpha/2}\frac{s_D}{\sqrt{n_D}}, \quad t_{\alpha/2} \text{ with } df = n_D - 1$$

Test of hypothesis for the paired difference $\mu_D = (\mu_1 - \mu_2)$:

1. $H_0: \mu_D = D_0$
   $H_a: \mu_D \ne D_0$ (or $H_a: \mu_D > D_0$, or $H_a: \mu_D < D_0$)

2. Significance level $\alpha$.

3. Test statistic:

$$t = \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}}$$

4. Rejection region:

   $|t| > t_{\alpha/2}$ when $H_a: \mu_D \ne D_0$

   $t < -t_{\alpha}$ when $H_a: \mu_D < D_0$

   $t > t_{\alpha}$ when $H_a: \mu_D > D_0$

5. Conclusion.

Examples of making inferences based on paired differences (small sample):

Example 1, NEW PROTEIN DIET: To investigate the effect of a new protein diet on weight loss, the FDA randomly chooses five individuals and records their weights (in pounds), then instructs them to follow the protein diet for three weeks. At the end of this period, their weights are recorded again.

| Person | Weight before (1) | Weight after (2) | Difference $x_D$ | $x_D^2$ |
| --- | --- | --- | --- | --- |
| 1 | 148 | 141 | 148 - 141 = 7 | 49 |
| 2 | 193 | 188 | 193 - 188 = 5 | 25 |
| 3 | 186 | 183 | 186 - 183 = 3 | 9 |
| 4 | 195 | 189 | 195 - 189 = 6 | 36 |
| 5 | 202 | 198 | 202 - 198 = 4 | 16 |
| Total | | | $\sum x_D = 25$ | $\sum x_D^2 = 135$ |

a. Calculate a 95% confidence interval for the difference between the mean weights before and after the diet is used. Interpret the interval.

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$, $df = n_D - 1 = 5 - 1 = 4$, $t_{0.025}^{(4)} = 2.776$

$$n_D = 5, \quad \bar{x}_D = \frac{\sum x_D}{n_D} = \frac{25}{5} = 5, \quad s_D = \sqrt{\frac{\sum x_D^2 - \dfrac{(\sum x_D)^2}{n_D}}{n_D - 1}} = \sqrt{\frac{135 - \dfrac{25^2}{5}}{5 - 1}} = 1.581$$

$$\bar{x}_D \pm t_{\alpha/2}\frac{s_D}{\sqrt{n_D}} = 5 \pm 2.776\,\frac{1.581}{\sqrt{5}} = 5 \pm 2.776(0.707) = 5 \pm 1.96 = (3.04,\ 6.96)$$

Interpret: We are 95% confident that the difference between the mean weights before and after this diet falls between 3.04 pounds and 6.96 pounds.

b. Do the data provide sufficient evidence that the protein diet has an effect on weight loss? Use $\alpha = 0.05$. (p-value = ?)

Step 1. $H_0: \mu_D = 0$; $H_a: \mu_D = \mu_1 - \mu_2 > 0$

Step 2. Test statistic:

$$t = \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}} = \frac{5 - 0}{1.581 / \sqrt{5}} = 7.07$$

Step 3. Rejection region: $t > t_{\alpha} = t_{0.05}^{(5-1)} = 2.132$

Step 4. Since 7.07 > 2.132, reject $H_0$. *(p-value = 0.002/2 = 0.001)

At $\alpha = 0.05$, there is sufficient evidence to conclude that the protein diet has an effect on weight loss.
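The protein-diet numbers can be reproduced from the raw weights. This sketch uses the Python standard library, which has no t-distribution quantiles, so the critical value 2.776 is passed in from the notes; the function name `paired_t` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, t_crit_two_sided):
    """Paired t statistic for H0: mu_D = 0 and a CI using a supplied t critical value."""
    diffs = [a - b for a, b in zip(first, second)]
    se = stdev(diffs) / sqrt(len(diffs))      # s_D / sqrt(n_D)
    t = mean(diffs) / se
    margin = t_crit_two_sided * se
    return t, (mean(diffs) - margin, mean(diffs) + margin)

before = [148, 193, 186, 195, 202]
after = [141, 188, 183, 189, 198]

# 2.776 is t_{0.025} with df = 4, as quoted in the notes
t_stat, (lo, hi) = paired_t(before, after, t_crit_two_sided=2.776)
print(f"t = {t_stat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```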

SPSS output for Example 1, NEW PROTEIN DIET

Paired Samples Statistics

| Pair 1 | Mean | N | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- |
| W1 | 184.80 | 5 | 21.347 | 9. |
| W2 | 179.80 | 5 | 22.354 | 9. |

Paired Samples Test (Pair 1: W1 - W2)

| Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper | t | df | Sig. (2-tailed) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 5.000 | 1.581 | .707 | 3.037 | 6.963 | 7.071 | 4 | . |

Example 2, PAIREDSCORES

To investigate the effect of a new teaching method on improving reading test scores, 8 pairs of slow learners are selected (not randomly) so that the two learners in each pair have similar reading IQs. In each pair, one uses the new method and the other uses the standard method. After 6 months, the test scores are recorded.

| Pair | New method (1) | Standard method (2) | $x_D$ |
| --- | --- | --- | --- |
| 1 | 77 | 72 | 5 |
| 2 | 74 | 68 | 6 |
| 3 | 82 | 76 | 6 |
| 4 | 73 | 68 | 5 |
| 5 | 87 | 84 | 3 |
| 6 | 69 | 68 | 1 |
| 7 | 66 | 61 | 5 |
| 8 | 80 | 76 | 4 |

a. Construct a 95% confidence interval to estimate the difference in mean test scores between the new method and the standard method. Interpret the result.

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$, $df = n_D - 1 = 8 - 1 = 7$, $t_{0.025}^{(7)} = 2.365$

$\bar{x}_D = 4.375$, $s_D = 1.685$ (SPSS output)

$$\bar{x}_D \pm t_{0.025}^{(8-1)}\frac{s_D}{\sqrt{n_D}} = 4.375 \pm 2.365\,\frac{1.685}{\sqrt{8}} = 4.375 \pm 2.365(0.596) = 4.375 \pm 1.409 = (2.966,\ 5.784)$$

Interpret: We are 95% confident that the mean difference in test score between the new and standard methods falls between 2.966 and 5.784 points.

b. Do the data provide sufficient evidence that the new method leads to higher test scores than the standard method? Use $\alpha = 0.05$. (p-value = ?)

Step 1. $H_0: \mu_D = 0$; $H_a: \mu_D = \mu_1 - \mu_2 > 0$

Step 2. Test statistic:

$$t = \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}} = \frac{4.375 - 0}{1.685 / \sqrt{8}} = \frac{4.375}{0.596} = 7.34$$

Step 3. Rejection region: $t > t_{\alpha} = t_{0.05}^{(8-1)} = 1.895$

Step 4. Since 7.34 > 1.895, reject $H_0$. *(p-value = 0.000/2 = 0.000)

At $\alpha = 0.05$, there is sufficient evidence to conclude that the new method leads to a higher test score than the standard method.

SPSS output for Example 2, PAIREDSCORES

Paired Samples Statistics

| Pair 1 | Mean | N | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- |
| NEW | 76.00 | 8 | 6.928 | 2. |
| STD | 71.63 | 8 | 7.009 | 2. |

Paired Samples Test (Pair 1: NEW - STD)

| Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper | t | df | Sig. (2-tailed) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4.375 | 1.685 | .596 | 2.966 | 5.784 | 7.344 | 7 | . |

9.4 Comparing two population proportions: independent sampling

Conditions required for valid large-sample inferences about $(p_1 - p_2)$:

1. The two samples are randomly and independently selected from the two target populations.
2. The sample sizes $n_1$ and $n_2$ are both large. (This condition is satisfied if $n_1\hat{p}_1 \ge 15$, $n_1\hat{q}_1 \ge 15$, $n_2\hat{p}_2 \ge 15$, and $n_2\hat{q}_2 \ge 15$.)

With large samples, by the Central Limit Theorem, the sampling distribution of $(\hat{p}_1 - \hat{p}_2)$ is approximately normal with:

Mean: $\mu_{(\hat{p}_1 - \hat{p}_2)} = p_1 - p_2$

Standard deviation: $\sigma_{(\hat{p}_1 - \hat{p}_2)} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$

Large-sample $100(1-\alpha)\%$ confidence interval for $(p_1 - p_2)$:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\,\sigma_{(\hat{p}_1 - \hat{p}_2)} = (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \approx (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

Large-sample test of hypothesis about $(p_1 - p_2)$:

1. $H_0: (p_1 - p_2) = 0$
   $H_a: (p_1 - p_2) \ne 0$ (or $(p_1 - p_2) > 0$, or $(p_1 - p_2) < 0$)

2. Level of significance $\alpha$.

3. Test statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \ \hat{q} = 1 - \hat{p}$$

4. Rejection region:

   $|z| > z_{\alpha/2}$ when $H_a: p_1 - p_2 \ne 0$

   $z < -z_{\alpha}$ when $H_a: p_1 - p_2 < 0$

   $z > z_{\alpha}$ when $H_a: p_1 - p_2 > 0$

5. Conclusion.

Examples for comparing two population proportions (large sample):

Example: Smoking Survey. Suppose the American Cancer Society randomly sampled 1500 adults in 1995 and then sampled 1750 adults in 2005 in a smoking survey, to determine whether there was evidence that the percentage of smokers had decreased.

| | 1995 (1) | 2005 (2) |
| --- | --- | --- |
| Sample size | $n_1 = 1500$ | $n_2 = 1750$ |
| Smokers in sample | $x_1 = 555$ | $x_2 = 578$ |

Define: $p_1$: the true proportion of adult smokers in 1995

$p_2$: the true proportion of adult smokers in 2005

a. Give a point estimate of $(p_1 - p_2)$.

A point estimate of $p_1$: $\hat{p}_1 = \dfrac{x_1}{n_1} = \dfrac{555}{1500} = 0.37$

A point estimate of $p_2$: $\hat{p}_2 = \dfrac{x_2}{n_2} = \dfrac{578}{1750} = 0.33$

A point estimate of $(p_1 - p_2)$: $\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2} = \dfrac{555}{1500} - \dfrac{578}{1750} = 0.37 - 0.33 = 0.04$

b. Do the data indicate that the proportion of adult smokers decreased over this 10-year period? Use $\alpha = 0.05$.

Check whether the sample sizes are large enough:

$n_1\hat{p}_1 = n_1 \cdot \dfrac{x_1}{n_1} = x_1 = 555 \ge 15$, $\quad n_1\hat{q}_1 = n_1 - x_1 = 1500 - 555 = 945 \ge 15$

$n_2\hat{p}_2 = n_2 \cdot \dfrac{x_2}{n_2} = x_2 = 578 \ge 15$, $\quad n_2\hat{q}_2 = n_2 - x_2 = 1750 - 578 = 1172 \ge 15$

The two sample sizes are large enough.

Step 1. $H_0: p_1 - p_2 = 0$; $H_a: p_1 - p_2 > 0$ (decreased, $p_1 > p_2$)

Step 2. Test statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\dfrac{555}{1500} - \dfrac{578}{1750}}{\sqrt{0.349(0.651)\left(\dfrac{1}{1500} + \dfrac{1}{1750}\right)}} = 2.37$$

where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{555 + 578}{1500 + 1750} = 0.349$ and $\hat{q} = 1 - \hat{p} = 1 - 0.349 = 0.651$.

Step 3. Rejection region: $z > z_{\alpha} = z_{0.05} = 1.645$

Step 4. Since 2.37 > 1.645, reject $H_0$.

At $\alpha = 0.05$, there is sufficient evidence to conclude that the proportion of adult smokers decreased from 1995 to 2005.
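The sample-size check and the pooled z statistic for the smoking survey can be sketched as follows. The function names `large_enough` and `two_proportion_z` are illustrative, not from the notes.

```python
from math import sqrt

def large_enough(x, n, threshold=15):
    """n * p_hat = x and n * q_hat = n - x must both reach the threshold."""
    return x >= threshold and (n - x) >= threshold

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)            # pooled estimate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

ok = large_enough(555, 1500) and large_enough(578, 1750)
z = two_proportion_z(555, 1500, 578, 1750)
print(f"samples large enough: {ok}, z = {z:.2f}")
```

The pooled proportion is used in the standard error because the null hypothesis asserts that both populations share a single proportion.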

c. Form a 95% confidence interval for $(p_1 - p_2)$ to estimate the extent of the decrease. Interpret it.

$$(\hat{p}_1 - \hat{p}_2) \pm z_{0.025}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = (0.37 - 0.33) \pm 1.96\sqrt{\frac{0.37(1 - 0.37)}{1500} + \frac{0.33(1 - 0.33)}{1750}} = 0.04 \pm 1.96(0.0168) = 0.04 \pm 0.033 = (0.007,\ 0.073)$$

Interpret: We are 95% confident that the difference in the proportion of adult smokers between 1995 and 2005 falls between 0.007 and 0.073. (We can infer $p_1 > p_2$.)
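Unlike the test statistic, the confidence interval uses the unpooled standard error, since it does not assume $p_1 = p_2$. A minimal sketch (the function name `two_proportion_ci` is illustrative):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se

lo, hi = two_proportion_ci(555, 1500, 578, 1750)
print(f"95% CI for p1 - p2: ({lo:.3f}, {hi:.3f})")
```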

9.6 Comparing Two Population Variances: Independent Sampling

In some instances, we are interested in comparing two population variances.

The common statistical procedure for comparing population variances $\sigma_1^2$ and $\sigma_2^2$ uses:

$H_0: \dfrac{\sigma_1^2}{\sigma_2^2} = 1$ $(\sigma_1^2 = \sigma_2^2)$

$H_a: \dfrac{\sigma_1^2}{\sigma_2^2} \ne 1$ $(\sigma_1^2 \ne \sigma_2^2)$

Test statistic: $F = \dfrac{s_1^2}{s_2^2}$

What is the sampling distribution of $\dfrac{s_1^2}{s_2^2}$?

When

1. the two sampled populations are normally distributed,
2. the samples are randomly and independently selected from their respective populations, and
3. the null hypothesis is true ($\sigma_1^2 = \sigma_2^2$),

then the sampling distribution of $F = \dfrac{s_1^2}{s_2^2}$ is the F-distribution with $(n_1 - 1)$ numerator degrees of freedom and $(n_2 - 1)$ denominator degrees of freedom.

Properties of the F-distribution:

1. It is right-skewed.
2. The total area under the F-curve equals 1.
3. It depends on two degrees of freedom: $(n_1 - 1)$ numerator degrees of freedom and $(n_2 - 1)$ denominator degrees of freedom.

Note: 1. Tables VIII-XI, pp. 799-806, give the upper-tail F-values. To make use of them, we always place the larger sample variance in the numerator of the F-test statistic.

2. We always define $\sigma_1^2$ as the population variance associated with the larger sample variance $s_1^2$.

For example: $F_{0.05,\,5,\,8} = 3.69$

Testing of hypothesis:

$H_0: \sigma_1^2 = \sigma_2^2$

$H_a: \sigma_1^2 \ne \sigma_2^2$ (or $H_a: \sigma_1^2 > \sigma_2^2$)

Test statistic: $F = \dfrac{s_1^2}{s_2^2}$ $(s_1^2 > s_2^2)$

Rejection region: $F > F_{\alpha/2,\,(n_1 - 1),\,(n_2 - 1)}$ when $H_a: \sigma_1^2 \ne \sigma_2^2$

$F > F_{\alpha,\,(n_1 - 1),\,(n_2 - 1)}$ when $H_a: \sigma_1^2 > \sigma_2^2$

Conclusion.
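The "larger variance in the numerator" convention is easy to get wrong when the samples arrive in arbitrary order. The sketch below makes the ordering explicit; the sample variances are hypothetical, chosen only to illustrate the rule, and the critical value must still come from an F table because the Python standard library has no F-distribution quantiles.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger sample variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Hypothetical sample variances and sizes, for illustration only
F, df_num, df_den = f_statistic(2.0, 18, 8.0, 13)

F_CRIT = 2.38  # upper-tail table value F_{0.05, 12, 17}, i.e. alpha = 0.10 two-sided
print(f"F = {F}, df = ({df_num}, {df_den}), reject H0: {F > F_CRIT}")
```

Here the second sample has the larger variance, so it supplies both the numerator and the numerator degrees of freedom.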

Since the sample variance for supplier 1 is larger than that for supplier 2, we define $\sigma_1^2$ as the population variance for supplier 1.

Test statistic: $F = \dfrac{s_1^2}{s_2^2} = 4.24$

Rejection region: $F > F_{\alpha/2,\,(n_1 - 1),\,(n_2 - 1)} = F_{0.10/2,\,(13 - 1),\,(18 - 1)} = F_{0.05,\,12,\,17} = 2.38$

Since 4.24 > 2.38, reject $H_0$.

At $\alpha = 0.10$, there is sufficient evidence to conclude that the two population variances differ.

We would advise the experimenter to purchase the mice from supplier 2, since they tend to be more homogeneous.