Statistical Inference for Proportions and Means: Confidence and Hypothesis Testing, Study notes of Statistics

An overview of statistical inference for two proportions and means, focusing on confidence intervals and hypothesis testing. It covers the concepts of sampling distributions, standard deviations, and confidence intervals for proportions and means. Additionally, it discusses hypothesis testing, null distributions, and conducting hypothesis tests to determine if the null hypothesis should be rejected.

Typology: Study notes

Pre 2010

Uploaded on 10/01/2009

koofers-user-y37

Review of Hypothesis Testing and Confidence Intervals

STA570 Spring 2006

1 First step - know your situation

This handout covers statistical inference in four situations: one proportion, two proportions, one mean, and two means. Proportions refer to situations where the data are dichotomous (two-valued), such as “yes/no”, “male/female”, “success/failure”, and so forth. Means are used for interval/ratio data. In what follows, we color code the text by using red for one proportion, blue for two proportions, purple for one mean, and green for two means. Text remaining in black applies to all four scenarios.

One-population problems are fairly rare in practice. Typically we are interested in comparing two populations (ANOVA and regression generalize this to multiple populations), such as comparing people given a placebo (one population) to people given a new drug (the second population).

For one proportion, we are interested in the population proportion $p$, which is estimated by the sample proportion $\hat{p}$. For two proportions, we are interested in the difference of the two population proportions $p_X - p_Y$, which is estimated by $\hat{p}_X - \hat{p}_Y$. For one mean, we are interested in the population mean $\mu$, which is estimated by $\bar{X}$, and for two means we are interested in the difference of the population means $\mu_X - \mu_Y$, which is estimated by $\bar{X} - \bar{Y}$. Note that the differences $p_X - p_Y$ and $\mu_X - \mu_Y$ are directly related to questions such as “is $p_X$ equal to $p_Y$?” (equivalent to $p_X - p_Y = 0$) or “is $\mu_X$ greater than $\mu_Y$?” (equivalent to $\mu_X - \mu_Y > 0$).

2 Next step - know the sampling distribution

The sampling distribution describes the variation we observe in our estimator as a result of sampling variability, which is the tendency of any statistic to vary in repeated sampling. While statistics vary from sample to sample, they do so in a clear pattern which can be mathematically determined. This pattern is useful for determining how close we think the parameter is to the statistic (e.g. how close the population proportion $p$ is to the estimator $\hat{p}$, or how close the estimated difference $\bar{X} - \bar{Y}$ is to the true difference $\mu_X - \mu_Y$). All of the normal distributions below require sample sizes greater than 30. Note that in all cases the center of the sampling distribution is the parameter of interest.

  1. For one proportion we are estimating $p$ with $\hat{p}$. The sampling distribution of $\hat{p}$ is

$$\hat{p} \sim N\left(p,\ \sqrt{\frac{p(1-p)}{n}}\right)$$

  2. For two proportions we are estimating $p_X - p_Y$ with $\hat{p}_X - \hat{p}_Y$. The sampling distribution is

$$\hat{p}_X - \hat{p}_Y \sim N\left(p_X - p_Y,\ \sqrt{\frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}}\right)$$

  3. For one mean we are estimating $\mu$ with $\bar{X}$. The sampling distribution is

$$\bar{X} \sim N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)$$

where $\sigma$ is the population standard deviation.

  4. For two means we are estimating $\mu_X - \mu_Y$ with $\bar{X} - \bar{Y}$. The sampling distribution is

$$\bar{X} - \bar{Y} \sim N\left(\mu_X - \mu_Y,\ \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}\right)$$

where $\sigma_X$ and $\sigma_Y$ are the respective population standard deviations.
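The claim that the estimator varies around the parameter in a predictable pattern can be checked by simulation. The following is a minimal sketch for the one-proportion case only, not part of the original handout: the function name `simulate_p_hat`, the values $p = 0.4$ and $n = 200$, the number of repetitions, and the seed are all illustrative assumptions.

```python
import math
import random
import statistics


def simulate_p_hat(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return the sample proportions."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


# Illustrative values, not taken from the handout's theory section.
p, n = 0.4, 200
p_hats = simulate_p_hat(p, n, reps=2000)

# Second argument of N(., .) in the handout's notation: a standard deviation.
theoretical_sd = math.sqrt(p * (1 - p) / n)

print("mean of simulated p-hats:", statistics.mean(p_hats))
print("sd of simulated p-hats:  ", statistics.stdev(p_hats))
print("theoretical sd:          ", theoretical_sd)
```

The simulated mean should land close to $p$ and the simulated standard deviation close to $\sqrt{p(1-p)/n}$, which is what the sampling distribution for one proportion predicts.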

3 Confidence Intervals

To construct a confidence interval, note that every one of the sampling distributions is normal, with a mean equal to the parameter of interest and some standard deviation. All confidence intervals we discussed have the form

$$\text{estimate} \pm z_{1-(\alpha/2)} \times (\text{standard deviation of estimate})$$

Because population parameters are unknown, any population parameters appearing in the standard deviations must be estimated.

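  1. For one proportion, the standard deviation of $\hat{p}$ is $\sqrt{\frac{p(1-p)}{n}}$, estimated by

$$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

The confidence interval for $p$ is

$$\hat{p} \pm z_{1-(\alpha/2)} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

  2. For two proportions, the standard deviation of $\hat{p}_X - \hat{p}_Y$ is $\sqrt{\frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}}$, estimated by

$$\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}$$

The confidence interval for $p_X - p_Y$ is

$$(\hat{p}_X - \hat{p}_Y) \pm z_{1-(\alpha/2)} \sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}$$

  3. For one mean, the standard deviation of $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$, estimated by $\frac{s}{\sqrt{n}}$.

The confidence interval for $\mu$ is

$$\bar{X} \pm z_{1-(\alpha/2)} \frac{s}{\sqrt{n}}$$

  4. For two means, the standard deviation of $\bar{X} - \bar{Y}$ is $\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$, estimated by

$$\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$$

The confidence interval for $\mu_X - \mu_Y$ is

$$(\bar{X} - \bar{Y}) \pm z_{1-(\alpha/2)} \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$$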
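The interval formulas in this section translate directly into code. The sketch below is an illustration rather than part of the handout: the function names are hypothetical, only the one-proportion and two-means cases are shown, and `statistics.NormalDist().inv_cdf` supplies an exact $z_{1-(\alpha/2)}$ instead of a two-decimal Z-table value, so results can differ slightly from hand calculation.

```python
import math
from statistics import NormalDist


def z_quantile(prob):
    """Quantile of the standard normal, replacing a Z-table lookup."""
    return NormalDist().inv_cdf(prob)


def ci_one_proportion(successes, n, alpha=0.05):
    """Interval for p: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z = z_quantile(1 - alpha / 2)
    return p_hat - z * se, p_hat + z * se


def ci_two_means(xbar, ybar, s_x, s_y, n_x, n_y, alpha=0.05):
    """Interval for mu_X - mu_Y using the estimated standard deviations."""
    diff = xbar - ybar
    se = math.sqrt(s_x**2 / n_x + s_y**2 / n_y)
    z = z_quantile(1 - alpha / 2)
    return diff - z * se, diff + z * se


# Data borrowed from the sample problems in section 4.3.
print(ci_one_proportion(64, 200))
print(ci_two_means(109.32, 106.13, 15.3, 11.3, 130, 155))
```

For 64 successes in 200 trials, the 95% interval runs from roughly 0.255 to 0.385. The two-proportion and one-mean intervals follow the same pattern with their own standard deviation formulas.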

4 Hypothesis Testing

Hypothesis testing is a much broader subject than confidence intervals, since it includes investigating the properties of the tests, such as power and sample size determination. As before, you need to know whether you are investigating one proportion, two proportions, one mean, or two means.

4.1 Know your hypotheses

In all hypothesis tests, there is a “null value” available for the parameter, so you are testing

  1. One proportion: $H_0: p = p_0$ (null value is $p_0$)
  2. Two proportions: $H_0: p_X - p_Y = 0$ (we never use anything but 0 as the null value in this course)
  3. One mean: $H_0: \mu = \mu_0$ (null value is $\mu_0$)
  4. Two means: $H_0: \mu_X - \mu_Y = 0$ (we never use anything but 0 as the null value in this course)

In addition to the null hypothesis, we also have an alternative hypothesis, which specifies the direction in which we “have something to prove”. The alternative hypothesis is often called the “research hypothesis” and you want it to be the “interesting” result. The alternative hypothesis can include “>”, “<”, or “≠” (for example, the null hypothesis $H_0: \mu = \mu_0$ can be matched with any of $H_1: \mu > \mu_0$, $H_1: \mu < \mu_0$, or $H_1: \mu \neq \mu_0$).

4.2 The null distribution

The null distribution is defined as the sampling distribution of our estimate when the null hypothesis is true. The only difference between the null distribution and the sampling distributions in section 2 is that, knowing the null value, we can plug it in the formula.

  1. For one proportion, we know $H_0: p = p_0$, so the null distribution is

$$\hat{p} \sim N\left(p_0,\ \sqrt{\frac{p_0(1-p_0)}{n}}\right)$$

  2. For two proportions, we know $H_0: p_X - p_Y = 0$. Furthermore, the knowledge that $p_X - p_Y = 0$ means $p_X = p_Y$, and thus while we have to estimate the common value from the data, we can utilize both samples. The null distribution is

$$\hat{p}_X - \hat{p}_Y \sim N\left(0,\ \sqrt{\frac{\hat{p}_0(1-\hat{p}_0)}{n_X} + \frac{\hat{p}_0(1-\hat{p}_0)}{n_Y}}\right)$$

where

$$\hat{p}_0 = \frac{(\#X \text{ successes}) + (\#Y \text{ successes})}{n_X + n_Y}$$

  3. For one mean, we know $H_0: \mu = \mu_0$ and $\sigma$ must still be estimated by $s$, so the null distribution is

$$\bar{X} \sim N\left(\mu_0,\ \frac{s}{\sqrt{n}}\right)$$

  4. For two means, we know $H_0: \mu_X - \mu_Y = 0$ and both $\sigma_X$ and $\sigma_Y$ must be estimated by $s_X$ and $s_Y$, so the null distribution is

$$\bar{X} - \bar{Y} \sim N\left(0,\ \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}\right)$$

4.3 Conducting a hypothesis test and computing a p-value

If you are given data and asked to conduct the hypothesis test (i.e. determine whether or not to reject the null hypothesis), then you need to compute your estimate, compute the cutoff(s) for the rejection region, and determine if the estimate lies in the rejection region. Once you have the null distribution from section 4.2, the hypothesis test proceeds by the same rules regardless of whether you are working with one proportion, two proportions, one mean, or two means.

If the alternative hypothesis is “>”, you reject if your estimate is greater than the $1 - \alpha$ percentile of the null distribution. If the alternative hypothesis is “<”, you reject if your estimate is less than the $\alpha$ percentile of the null distribution. If the alternative hypothesis is “≠”, you reject if your estimate is less than the $\alpha/2$ percentile of the null distribution OR if your estimate is greater than the $1 - (\alpha/2)$ percentile of the null distribution.

The “p-value” of a hypothesis test is defined as the point where 1) you would reject $H_0$ for any $\alpha$ above the p-value, and 2) you would not reject $H_0$ for any $\alpha$ below the p-value. To find this value, you have to equate your observed estimate to the cutoff of the hypothesis test, and then solve for $\alpha$. The cutoff, of course, depends on the direction of the alternative hypothesis. If the alternative hypothesis is “≠” and your estimate is greater than the null value, the cutoff is $\sigma_{\text{null}}\, z_{1-(\alpha/2)} + (\text{null value})$, where $\sigma_{\text{null}}$ is the standard deviation of the null distribution. You need to set your estimate equal to this cutoff, which results in

$$\sigma_{\text{null}}\, z_{1-(\alpha/2)} + (\text{null value}) = \text{estimate}$$

$$Z = \frac{\text{estimate} - (\text{null value})}{\sigma_{\text{null}}} = z_{1-(\alpha/2)}$$

Thus, find $Z$ in the margin of the Z-table, and equate the resulting probability to $1 - (\alpha/2)$.

Example: suppose the null value is 6, the null standard deviation is 0.14, and the estimate is 6.25. Compute

$$\frac{6.25 - 6}{0.14} = 1.79 = z_{1-(\alpha/2)}$$

The probability a Z is less than 1.79 is 0.9633. Equating 0.9633 to $1 - (\alpha/2)$, we find $\alpha = 0.0734$, which is the p-value.
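The rejection rule and the p-value calculation described in this section can be sketched in a few lines. This is an illustrative sketch, not the handout's own procedure: the function name `z_test` and the string codes `">"`, `"<"`, `"!="` for the alternative are assumptions, and the exact normal CDF from `statistics.NormalDist` replaces the Z-table, so p-values can differ from table-based answers in the third decimal place.

```python
from statistics import NormalDist


def z_test(estimate, null_value, sd_null, alternative, alpha=0.05):
    """Return (Z, p-value, reject?) for a test based on a normal null distribution.

    `alternative` is one of ">", "<", "!=" (hypothetical codes for the
    direction of the alternative hypothesis).
    """
    z = (estimate - null_value) / sd_null
    std_normal = NormalDist()
    if alternative == ">":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "<":
        p_value = std_normal.cdf(z)
    elif alternative == "!=":
        # Two-sided: double the tail area beyond |Z|.
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be '>', '<', or '!='")
    # Rejecting when p-value < alpha is equivalent to the cutoff rule.
    return z, p_value, p_value < alpha


# The worked example: null value 6, null standard deviation 0.14, estimate 6.25.
z, p_value, reject = z_test(6.25, 6, 0.14, "!=")
print(z, p_value, reject)
```

For the worked example this gives a Z near 1.79 and a p-value near 0.074, in line with the table-based 0.0734. Comparing the p-value with $\alpha$ gives the same decision as comparing the estimate with the percentile cutoffs.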

Sample problems

  1. You want to test $H_0: p = 0.4$ against $H_1: p < 0.4$. You observe 64 successes in 200 trials. Conduct the hypothesis test using $\alpha = 0.1$ and report the p-value.

Here we have a test for a single proportion. The sample proportion is $\hat{p} = 64/200 = 0.32$. The null distribution is

$$\hat{p} \sim N\left(0.4,\ \sqrt{(0.4)(1-0.4)/200} = 0.0346\right)$$

Since the alternative is “<”, we reject if $\hat{p}$ is less than the $\alpha$ percentile of the null distribution. The 0.1 percentile of a Z is $-1.28$, so the 0.1 percentile of the null distribution is $(0.0346)(-1.28) + 0.4 = 0.3557$. Our observed $\hat{p} = 0.32$ is less than 0.3557, so we reject $H_0$. As for the p-value, note that our observed data corresponds to a Z of

$$\frac{0.32 - 0.40}{0.0346} = -2.31$$

Equating $-2.31 = z_\alpha$, we find the probability below $Z = -2.31$ is $0.0104 = \alpha$, so 0.0104 is the p-value.

  2. Suppose you are testing $H_0: p_X - p_Y = 0$ against $H_1: p_X - p_Y \neq 0$. You observe 78 successes out of 160 trials from the X population, and 92 successes out of 176 trials from the Y population. Determine the result of the hypothesis test using $\alpha = 0.01$ and report the p-value.

Here we have $\hat{p}_X = 78/160 = 0.4875$ and $\hat{p}_Y = 92/176 = 0.5227$, for an estimated difference of $\hat{p}_X - \hat{p}_Y = 0.4875 - 0.5227 = -0.0352$. To find the null distribution we also need the value of $\hat{p}_0 = (78 + 92)/(160 + 176) = 0.5060$. The null distribution is

$$\hat{p}_X - \hat{p}_Y \sim N\left(0,\ \sqrt{\frac{(0.5060)(1-0.5060)}{160} + \frac{(0.5060)(1-0.5060)}{176}} = 0.0546\right)$$

The alternative is “≠”, so the cutoffs are the $\alpha/2$ and $1 - (\alpha/2)$ percentiles of the null distribution, where $\alpha = 0.01$ here. The $\alpha/2$ and $1 - (\alpha/2)$ percentiles of a Z are $-2.57$ and $2.57$. The corresponding percentiles of the null distribution are $(0.0546)(-2.57) + 0 = -0.1403$ and $(0.0546)(2.57) + 0 = 0.1403$. The observed difference of $-0.0352$ is well within those cutoffs, so we do not reject $H_0$. As for the p-value, the observed difference is less than the null value, so we compute the Z value and equate it to $z_{\alpha/2}$. The Z value is

$$\frac{-0.0352 - 0}{0.0546} = -0.64$$

The probability a Z is less than $-0.64$ is 0.2611. Equating this to $\alpha/2$, we find the p-value is 0.5222.

  3. Suppose we are testing $H_0: \mu = 3$ against $H_1: \mu < 3$. You observe $\bar{X} = 2.78$, $n = 85$, and $s = 0.65$. Determine the result of the hypothesis test using $\alpha = 0.05$ and compute the p-value. The null distribution is

$$\bar{X} \sim N\left(3,\ s/\sqrt{n} = 0.65/\sqrt{85} = 0.0705\right)$$

Since the alternative hypothesis is “<”, we want to find the $\alpha = 0.05$ percentile of the null distribution. The 0.05 percentile of a Z is $-1.64$, and the 0.05 percentile of the null distribution is $(0.0705)(-1.64) + 3 = 2.8844$. The observed $\bar{X} = 2.78$ is less than the cutoff, so we reject $H_0$. As for the p-value, we compute Z and equate it to $z_\alpha$.

$$Z = \frac{2.78 - 3}{0.0705} = -3.12$$

The probability a Z is less than $-3.12$ is 0.0009, which we equate to $\alpha$, and thus the p-value is 0.0009.

  4. Suppose we are testing $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y > 0$. We observe $\bar{X} = 109.32$, $\bar{Y} = 106.13$, $n_X = 130$, $n_Y = 155$, $s_X = 15.3$, and $s_Y = 11.3$. The null distribution is

$$\bar{X} - \bar{Y} \sim N\left(0,\ \sqrt{\frac{15.3^2}{130} + \frac{11.3^2}{155}} = 1.6200\right)$$

Since the alternative is “>”, the cutoff is the $1 - \alpha$ percentile of the null distribution. When $\alpha$ is not given, we use $\alpha = 0.05$. The 0.95 percentile of a Z is 1.64, and the 0.95 percentile of the null distribution is $(1.62)(1.64) + 0 = 2.6568$. The observed difference of $\bar{X} - \bar{Y} = 109.32 - 106.13 = 3.19$ is greater than the cutoff, so we reject $H_0$. As for the p-value, we need to compute Z and equate it to $z_{1-\alpha}$. We find

$$Z = \frac{3.19 - 0}{1.62} = 1.97$$

The probability a Z is less than 1.97 is 0.9756. Equating that to $1 - \alpha$, we find $\alpha = 0.0244$, and thus the p-value is 0.0244.
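The two-proportion problem is the one case where the null standard deviation needs an extra step, the pooled estimate $\hat{p}_0$. A minimal sketch of that calculation follows, using the counts from the second sample problem. The function name `two_proportion_z_test` is hypothetical, only the “≠” alternative is handled, and the exact normal CDF is used instead of a Z-table, so the p-value differs a little from the table-based 0.5222.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x_successes, n_x, y_successes, n_y):
    """Two-sided test of H0: pX - pY = 0 using the pooled proportion."""
    p_x = x_successes / n_x
    p_y = y_successes / n_y
    # Pooled estimate: under H0 both samples estimate the same proportion.
    p_pool = (x_successes + y_successes) / (n_x + n_y)
    sd_null = math.sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
    z = (p_x - p_y) / sd_null
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_x - p_y, sd_null, z, p_value


# 78 successes in 160 trials (X) and 92 successes in 176 trials (Y).
diff, sd_null, z, p_value = two_proportion_z_test(78, 160, 92, 176)
print(diff, sd_null, z, p_value)
```

The difference, null standard deviation, and Z agree with the hand calculation ($-0.0352$, $0.0546$, about $-0.64$), and the p-value is far above $\alpha = 0.01$, so the conclusion not to reject $H_0$ is unchanged.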

4.4 Calculating Power with given sample sizes

In a power calculation, you will be given enough information to construct the hypotheses, $\alpha$, a value of the parameter in the alternative hypothesis, and possibly some more information depending on the setting (see below). To compute the power, you need to compute the alternative distribution, and then determine the probability, under the alternative distribution, that the null hypothesis would be rejected.

The alternative distribution is very similar to the null distribution, EXCEPT for the key difference that the parameter is assumed to be the alternative value, not the null value.

  1. One proportion: The alternative value is $p_1$, and the alternative distribution is

$$\hat{p} \sim N\left(p_1,\ \sqrt{\frac{p_1(1-p_1)}{n}}\right)$$

  2. Two proportions: The alternative value $d_1$ specifies a difference of the parameters (recall the null specified that this difference is 0). Also required here is an expectation of the value of $\hat{p}_0$. Example: “You are testing $H_0: p_X - p_Y = 0$ against $H_1: p_X - p_Y < 0$. Compute the power of the test when $p_X - p_Y = -0.04$ and $\hat{p}_0$ is around 0.3.” In the example, $d_1 = -0.04$. The alternative distribution is

$$\hat{p}_X - \hat{p}_Y \sim N\left(d_1,\ \sqrt{\frac{\hat{p}_0(1-\hat{p}_0)}{n_X} + \frac{\hat{p}_0(1-\hat{p}_0)}{n_Y}}\right)$$

where $\hat{p}_0$ is the value proposed in the problem.

  3. One mean: The alternative value is $\mu_1$. You also need to be given a guess as to the value of $\sigma$ (or $s$). The alternative distribution is

$$\bar{X} \sim N\left(\mu_1,\ \frac{s}{\sqrt{n}}\right)$$

where $\sigma$ can be used in place of $s$ if $\sigma$ (or $\sigma^2$) is given as part of the problem.

  4. Two means: The alternative value is $d_1$, which is an alternative value for the difference $\mu_X - \mu_Y$. You also need to be given an expectation of the values of $\sigma_X$ (or $s_X$) and $\sigma_Y$ (or $s_Y$). The alternative distribution is

$$\bar{X} - \bar{Y} \sim N\left(d_1,\ \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}\right)$$

where $\sigma_X$ and $\sigma_Y$ may be used instead of $s_X$ and $s_Y$ if they are given.

To actually compute the power, follow these steps. First, compute the null distribution and the cutoffs for the hypothesis test (the cutoffs found as in section 4.3). Then compute the probability the alternative distribution assigns to the rejection region.
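The two steps above (find the cutoffs from the null distribution, then find the probability the alternative distribution puts on the rejection region) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `power` and the codes `">"`, `"<"`, `"!="` are hypothetical, and exact normal quantiles replace Z-table lookups, so results can differ slightly from hand calculations that use rounded percentiles.

```python
from statistics import NormalDist


def power(null_value, sd_null, alt_value, sd_alt, alternative, alpha=0.05):
    """Probability, under the alternative distribution, of rejecting H0."""
    null = NormalDist(null_value, sd_null)
    alt = NormalDist(alt_value, sd_alt)
    if alternative == ">":
        cutoff = null.inv_cdf(1 - alpha)
        return 1 - alt.cdf(cutoff)
    if alternative == "<":
        cutoff = null.inv_cdf(alpha)
        return alt.cdf(cutoff)
    if alternative == "!=":
        lower = null.inv_cdf(alpha / 2)
        upper = null.inv_cdf(1 - alpha / 2)
        # Rejection region is both tails; add the two probabilities.
        return alt.cdf(lower) + (1 - alt.cdf(upper))
    raise ValueError("alternative must be '>', '<', or '!='")


# One proportion: H0: p = 0.2 vs H1: p > 0.2, alpha = 0.01, n = 300, p1 = 0.22.
print(power(0.2, 0.0231, 0.22, 0.0239, ">", alpha=0.01))

# One mean: H0: mu = 50 vs H1: mu != 50, alpha = 0.01, n = 100, mu1 = 45, sigma = 10.
print(power(50, 1, 45, 1, "!=", alpha=0.01))
```

Passing the null and alternative standard deviations separately matters for one proportion, where they differ ($p_0$ versus $p_1$ in the formula); in the other three settings the same value is passed twice.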

Sample problems

  1. One proportion: Suppose you are testing $H_0: p = 0.2$ against $H_1: p > 0.2$ using $\alpha = 0.01$ and $n = 300$. What is the power at $p_1 = 0.22$? The null distribution is

$$\hat{p} \sim N\left(0.2,\ \sqrt{(0.2)(1-0.2)/300} = 0.0231\right)$$

and the alternative distribution is

$$\hat{p} \sim N\left(0.22,\ \sqrt{(0.22)(1-0.22)/300} = 0.0239\right)$$

The cutoff is the $1 - \alpha$ percentile of the null distribution. The $1 - 0.01 = 0.99$ percentile of a Z is 2.33, so the 0.99 percentile of the null distribution is $(0.0231)(2.33) + 0.2 = 0.2538$. We reject if $\hat{p}$ is greater than 0.2538. According to the alternative distribution, the probability that $\hat{p} > 0.2538$ is

$$P\left(Z > \frac{0.2538 - 0.22}{0.0239} = 1.41\right) = 1 - 0.9207 = 0.0793$$

So the power (0.0793) is very small in this example.

  2. Two proportions: Suppose you are testing $H_0: p_X - p_Y = 0$ against $H_1: p_X - p_Y < 0$ using $\alpha = 0.05$, $n_X = 150$, $n_Y = 200$. You expect $\hat{p}_0$ to be around 0.7. What is the power of this test if the true difference is $-0.10$? The null distribution (assuming $\hat{p}_0$ to be around 0.7) is

$$\hat{p}_X - \hat{p}_Y \sim N\left(0,\ \sqrt{\frac{(0.7)(1-0.7)}{150} + \frac{(0.7)(1-0.7)}{200}} = 0.0495\right)$$

while the alternative distribution is (note the second term, the standard deviation, does not change)

$$\hat{p}_X - \hat{p}_Y \sim N\left(-0.10,\ \sqrt{\frac{(0.7)(1-0.7)}{150} + \frac{(0.7)(1-0.7)}{200}} = 0.0495\right)$$

The cutoff is the $\alpha = 0.05$ percentile of the null distribution. The 0.05 percentile of a Z is $-1.64$, so the 0.05 percentile of the null distribution is $(0.0495)(-1.64) + 0 = -0.08118$. Thus, we reject if the observed difference $\hat{p}_X - \hat{p}_Y$ is less than $-0.08118$. The power of the test is the probability the alternative distribution places below $-0.08118$. To find this, compute

$$P\left(Z < \frac{-0.08118 - (-0.10)}{0.0495} = 0.38\right) = 0.6480$$

So the power is 0.6480.

  3. One mean: Suppose you are testing $H_0: \mu = 50$ against $H_1: \mu \neq 50$ using $\alpha = 0.01$ and $n = 100$. What is the power of the test at $\mu_1 = 45$, assuming $\sigma$ is around 10? The null distribution is

$$\bar{X} \sim N\left(50,\ 10/\sqrt{100} = 1\right)$$

while the alternative distribution is

$$\bar{X} \sim N\left(45,\ 10/\sqrt{100} = 1\right)$$

The cutoffs are the $\alpha/2 = 0.005$ and $1 - (\alpha/2) = 0.995$ percentiles of the null distribution. The corresponding percentiles of a Z are $-2.58$ and $2.58$. Thus, the cutoffs from the null distribution are $(1)(-2.58) + 50 = 47.42$ and $(1)(2.58) + 50 = 52.58$. We reject if $\bar{X}$ is outside that region. The probability the alternative distribution places on the rejection region is the sum of the probability of being below 47.42 and the probability of being above 52.58.

$$P\left(Z < \frac{47.42 - 45}{1} = 2.42\right) = 0.9922$$

$$P\left(Z > \frac{52.58 - 45}{1} = 7.58\right) = 0$$

Note the second value is off the chart. Thus, the power here is the sum of these two values (one of which is 0), so the power is 0.9922.

  4. Two means: Suppose you are testing $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y \neq 0$ using $\alpha = 0.02$, $n_X = 70$, $n_Y = 90$. What is the power of the test if the true difference is 3, assuming $\sigma_X$ is around 10 and $\sigma_Y$ is around 15? The null distribution is

$$\bar{X} - \bar{Y} \sim N\left(0,\ \sqrt{\frac{10^2}{70} + \frac{15^2}{90}} = 1.9821\right)$$

while the alternative distribution is (note the second term, the standard deviation, does not change)

$$\bar{X} - \bar{Y} \sim N\left(3,\ \sqrt{\frac{10^2}{70} + \frac{15^2}{90}} = 1.9821\right)$$

The cutoffs are the $\alpha/2 = 0.01$ and $1 - (\alpha/2) = 0.99$ percentiles of the null distribution. The corresponding percentiles of a Z are $-2.33$ and $2.33$. Thus, the cutoffs from the null distribution are $(1.9821)(-2.33) + 0 = -4.6183$ and $(1.9821)(2.33) + 0 = 4.6183$. We reject if $\bar{X} - \bar{Y}$ is outside that region. The probability the alternative distribution places on the rejection region is the sum of the probability of being below $-4.6183$ and the probability of being above $4.6183$.

$$P\left(Z < \frac{(-4.6183) - 3}{1.9821} = -3.84\right) = 0$$

$$P\left(Z > \frac{4.6183 - 3}{1.9821} = 0.82\right) = 1 - 0.7939 = 0.2061$$

Adding these values together (one of which is 0), the power is 0.2061. The power is below one half because the alternative value (3) lies inside the cutoffs, closer to the null value than to the upper cutoff of 4.6183.

4.5 Determining a sample size

Another problem faced in hypothesis testing is determining a sample size sufficient to achieve both a specified type I error rate ($\alpha$) and a specified power. The solution to this problem is based on equating percentiles of the null distribution (corresponding to the cutoffs of the hypothesis test) with percentiles of the alternative distribution (those required to reach the desired power). As before, you will need the null and alternative distributions (and whatever information was required to compute the power alone).

Your equation depends on the direction of the alternative hypothesis. If the alternative hypothesis is “>”, you need to equate the $1 - \alpha$ percentile of the null distribution to the $1 - POW$ percentile of the alternative distribution (where $POW$ is the desired power), then solve for $n$. If the alternative hypothesis is “<”, equate the $\alpha$ percentile of the null distribution to the $POW$ percentile of the alternative distribution.

If the alternative hypothesis is “≠”, then remember there are two cutoffs. We only use the one in the direction of the specified alternative value. If the alternative value is less than the null value, equate the $\alpha/2$ percentile of the null distribution to the $POW$ percentile of the alternative distribution. If the alternative value is greater than the null value, equate the $1 - (\alpha/2)$ percentile of the null distribution to the $1 - POW$ percentile of the alternative distribution.
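When both standard deviations have the form $c/\sqrt{n}$, the percentile equation can be solved for $n$ once and reused. Writing the null standard deviation as $c_0/\sqrt{n}$, the alternative one as $c_1/\sqrt{n}$, and $\delta$ for the alternative value minus the null value, equating the percentiles and using $z_{1-POW} = -z_{POW}$ gives $\sqrt{n} = (z\, c_0 + z_{POW}\, c_1)/|\delta|$, where $z$ is $z_{1-\alpha}$ for a one-sided test and $z_{1-(\alpha/2)}$ for a “≠” test. The sketch below implements that formula. The function name `sample_size` and its parameter names are hypothetical, and it uses exact normal quantiles, so its answers can differ by a few observations from hand calculations based on two-decimal table values.

```python
import math
from statistics import NormalDist


def sample_size(c_null, c_alt, delta, alpha, power, two_sided=False):
    """Smallest n (per group, for equal group sizes) reaching the desired power.

    c_null, c_alt: the constants c in sd = c / sqrt(n) for the null and
    alternative distributions (e.g. sigma for one mean).
    delta: alternative value minus null value.
    """
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_power = NormalDist().inv_cdf(power)
    root_n = (z_alpha * c_null + z_power * c_alt) / abs(delta)
    return math.ceil(root_n**2)


# One mean: H0: mu = 4 vs H1: mu > 4, alpha = 0.05, 70% power at mu = 4.5, sigma = 2.3.
print(sample_size(2.3, 2.3, 0.5, alpha=0.05, power=0.70))

# Two means, equal group sizes: sigma_X = 5, sigma_Y = 7, difference 2,
# alpha = 0.05 two-sided, 90% power.
c = math.sqrt(5**2 + 7**2)
print(sample_size(c, c, 2, alpha=0.05, power=0.90, two_sided=True))
```

For one proportion, `c_null` and `c_alt` would be $\sqrt{p_0(1-p_0)}$ and $\sqrt{p_1(1-p_1)}$; for two proportions with equal group sizes, both are $\sqrt{2\hat{p}_0(1-\hat{p}_0)}$.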

Sample Problems

  1. One proportion: Suppose you are testing $H_0: p = 0.4$ against $H_1: p \neq 0.4$ using $\alpha = 0.01$. You also want to have 90% power at $p_1 = 0.42$. What is the minimal sample size required? The null distribution is

$$\hat{p} \sim N\left(0.40,\ \sqrt{(0.4)(1-0.4)/n} = 0.4899/\sqrt{n}\right)$$

and the alternative distribution is

$$\hat{p} \sim N\left(0.42,\ \sqrt{(0.42)(1-0.42)/n} = 0.4936/\sqrt{n}\right)$$

To find the required sample size, we note that we have a “≠” alternative and that the alternative value is greater than the null value. Thus, we need to equate the $1 - (\alpha/2) = 0.995$ percentile of the null distribution to the $1 - POW = 0.10$ percentile of the alternative distribution. The 0.995 percentile of a Z is 2.58, so the 0.995 percentile of the null distribution is $(0.4899/\sqrt{n})(2.58) + 0.40$. The 0.10 percentile of a Z is $-1.28$, so the 0.10 percentile of the alternative distribution is $(0.4936/\sqrt{n})(-1.28) + 0.42$. Equating these values and solving for $n$:

$$\frac{0.4899}{\sqrt{n}}(2.58) + 0.40 = \frac{0.4936}{\sqrt{n}}(-1.28) + 0.42$$

$$1.2639 + 0.6318 = 0.02\sqrt{n}$$

$$n = 8984.2$$

So we need at least 8985 observations (a lot, because we are asking for a very small $\alpha$ and simultaneously trying to detect a small difference).

  2. Two proportions: Suppose you are testing $H_0: p_X - p_Y = 0$ against $H_1: p_X - p_Y < 0$ using $\alpha = 0.05$. Suppose you also want to have 80% power when the true difference of proportions is $-0.03$ and you expect $\hat{p}_0$ to be around 0.6. What is the minimum required sample size in each group, assuming the sample sizes are equal? The null distribution (assuming $\hat{p}_0$ to be around 0.6) is

$$\hat{p}_X - \hat{p}_Y \sim N\left(0,\ \sqrt{\frac{(0.6)(1-0.6)}{n} + \frac{(0.6)(1-0.6)}{n}} = \frac{0.6928}{\sqrt{n}}\right)$$

while the alternative distribution is (note the second term, the standard deviation, does not change)

$$\hat{p}_X - \hat{p}_Y \sim N\left(-0.03,\ \sqrt{\frac{(0.6)(1-0.6)}{n} + \frac{(0.6)(1-0.6)}{n}} = \frac{0.6928}{\sqrt{n}}\right)$$

To find the minimum sample size for a “<” alternative, we need to equate the $\alpha = 0.05$ percentile of the null distribution to the $POW = 0.8$ percentile of the alternative distribution. The 0.05 percentile of a Z is $-1.64$, so the 0.05 percentile of the null distribution is

$$\frac{0.6928}{\sqrt{n}}(-1.64) + 0$$

while the 0.80 percentile of a Z is 0.84, so the 0.80 percentile of the alternative distribution is

$$\frac{0.6928}{\sqrt{n}}(0.84) - 0.03$$

Equating these

$$\frac{0.6928}{\sqrt{n}}(-1.64) + 0 = \frac{0.6928}{\sqrt{n}}(0.84) - 0.03$$

$$-1.7181 = (-0.03)\sqrt{n}$$

$$n = 3279.85$$

so $n$ must be at least 3280 in each group.

  3. One mean: Suppose we are testing $H_0: \mu = 4$ against $H_1: \mu > 4$ using $\alpha = 0.05$ and we want to have 70% power when $\mu = 4.5$. We expect $\sigma$ is close to 2.3. The null distribution is

$$\bar{X} \sim N\left(4,\ 2.3/\sqrt{n}\right)$$

while the alternative distribution is

$$\bar{X} \sim N\left(4.5,\ 2.3/\sqrt{n}\right)$$

Since the alternative is “>”, we want to equate the $1 - \alpha = 0.95$ percentile of the null distribution with the $1 - POW = 0.3$ percentile of the alternative distribution. The 0.95 percentile of a Z is 1.64, and the 0.95 percentile of the null distribution is

$$\frac{2.3}{\sqrt{n}}(1.64) + 4$$

while the 0.30 percentile of a Z is $-0.52$, so the 0.30 percentile of the alternative distribution is

$$\frac{2.3}{\sqrt{n}}(-0.52) + 4.5$$

Equating these

$$\frac{2.3}{\sqrt{n}}(1.64) + 4 = \frac{2.3}{\sqrt{n}}(-0.52) + 4.5$$

$$4.968 = 0.5\sqrt{n}$$

$$n = 98.72$$

We need at least 99 observations.

  4. Two means: Suppose we are testing $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y \neq 0$. You want to have $\alpha = 0.05$ and 90% power when the true difference is 2. You expect $\sigma_X$ to be close to 5 and $\sigma_Y$ to be close to 7. Assuming you place the same number of observations in each group, what is the minimum sample size in each group? The null distribution is

$$\bar{X} - \bar{Y} \sim N\left(0,\ \sqrt{\frac{5^2}{n} + \frac{7^2}{n}} = \frac{8.6023}{\sqrt{n}}\right)$$

while the alternative distribution is (note the second term, the standard deviation, does not change)

$$\bar{X} - \bar{Y} \sim N\left(2,\ \sqrt{\frac{5^2}{n} + \frac{7^2}{n}} = \frac{8.6023}{\sqrt{n}}\right)$$

Because the alternative is “≠” and the alternative value (2) is greater than the null value (0), we equate the $1 - (\alpha/2) = 0.975$ percentile of the null distribution to the $1 - POW = 0.10$ percentile of the alternative distribution.

The 0.975 percentile of a Z is 1.96, so the 0.975 percentile of the null distribution is

$$\frac{8.6023}{\sqrt{n}}(1.96) + 0$$

while the 0.10 percentile of a Z is $-1.28$, and thus the 0.10 percentile of the alternative distribution is

$$\frac{8.6023}{\sqrt{n}}(-1.28) + 2$$

Equating these

$$\frac{8.6023}{\sqrt{n}}(1.96) + 0 = \frac{8.6023}{\sqrt{n}}(-1.28) + 2$$

$$27.8715 = 2\sqrt{n}$$

$$n = 194.20$$

So we need a sample size of at least 195 in each group.