Lecture File 16: Confidence Intervals and Hypothesis Testing for Proportions in Math 243

This document, Math 243 Lecture File 16 by N. Christopher Phillips, dated May 21, 2009, covers confidence intervals and hypothesis testing for proportions. It gives examples and explanations of how to calculate confidence intervals for sample proportions and perform hypothesis tests for a single proportion. It also discusses the importance of having a simple random sample and the limitations of the large sample procedure for small sample sizes.


Math 243: Lecture File 16

N. Christopher Phillips

21 May 2009

N. Christopher Phillips () Math 243: Lecture File 16 21 May 2009 1 / 36

Proposed review sessions for the final exam

Friday 5 June, 7:00 pm. Saturday 6 June, 11:00 am. Saturday 6 June, 2:00 pm. Sunday 7 June, 9:00 am.

This week’s quiz will include at least one review problem.

The next two lectures will be given by Dev Sinha. My office hours next week are cancelled. (I do expect to have limited email access.)


A commonly made error

You choose a simple random sample of size 3 from a Math 251 class. Suppose your sample consists of John, Jane, and Mei-Chu. You do not have “three samples”. You have one sample with three members (or elements, or... ).

In particular, John is not a “sample”.

Compare:

A family has five members: John, Jane, John Jr., Jane Jr., and Jasmine. You do not have “five families”, Jane is not a family, etc.

Do not make this mistake when writing about sample sizes in homework or on quizzes or exams.

Large sample confidence interval

The procedures we describe are called the one proportion z procedures (for confidence intervals and hypothesis tests).

The usual form of confidence intervals applies; here it is:

(sample proportion) ± (critical value) · (standard error).

(We don’t have the standard deviation because we don’t know the true population proportion.) Thus, it is:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$

This requires large samples: at least 15 successes and at least 15 failures. (We will modify it below.)

Example: Polling on Senator Snort

I choose a simple random sample of 2500 registered voters. Of these, 1493 say they intend to vote for Senator Snort. Construct a 98% confidence interval for the true proportion of all registered voters who would say they intend to vote for Senator Snort.


Example: Polling on Senator Snort

1493 out of a simple random sample of 2500 registered voters say they intend to vote for Senator Snort. We want 98% confidence.

We use $\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p})/n}$. We have

$$\hat{p} = \frac{1493}{2500} = 0.5972 \quad\text{and}\quad \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \approx 0.009809223.$$

We take $z^* = 2.326$ (from the “$z^*$” row of Table C). So we get

$$0.5972 \pm (2.326)(0.009809223) \approx 0.5972 \pm 0.0228163.$$

As an interval, this is

$$(0.5972 - 0.0228163,\ 0.5972 + 0.0228163) = (0.574384,\ 0.620016).$$

So, with 98% confidence, between 57.4% and 62.0% of registered voters would say they intend to vote for Senator Snort.
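The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the lecture: the function name `one_proportion_z_interval` is made up for illustration, and the critical value is passed in by hand so that it matches the Table C value used above.

```python
from math import sqrt

def one_proportion_z_interval(successes, n, z_star):
    """Large sample interval: p_hat ± z* · sqrt(p_hat(1 − p_hat)/n)."""
    failures = n - successes
    # The large sample method needs at least 15 successes and 15 failures.
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and at least 15 failures")
    p_hat = successes / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Senator Snort poll: 1493 of 2500, 98% confidence, z* = 2.326 from Table C.
low, high = one_proportion_z_interval(1493, 2500, z_star=2.326)
print(f"({low:.6f}, {high:.6f})")
```

Calling the function with fewer than 15 successes or failures raises an error instead of returning an interval, which mirrors the condition stated for the large sample method.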

Reminder of the warnings

As usual, this is valid only if we truly have a simple random sample.

The announced margin of error does not account for the following problems: Some of the people in our original simple random sample were never home when we called them, or refused to talk to us. Some registered voters don’t have telephones, have unlisted numbers, or only have cell phones. Some people lied to us. (Probably not such a big problem here, but possibly serious in surveys about sensitive topics or criminal activities.) Etc.

Example: Defective widgets

I choose a simple random sample of 400 widgets manufactured by Wang’s Widgets Inc., and 8 of them are defective. Construct a 90% confidence interval for the true proportion of defective widgets among all widgets manufactured by Wang’s Widgets Inc.

There are fewer than 15 successes (only 8), so we can’t use the large sample method.

For small numbers of successes or failures, numerical simulations show that the large sample procedure often produces confidence intervals that are too small, so less likely to contain the true proportion than is claimed. Moreover, they may not improve with increasing sample size n. See page 499 of the book.

“Plus four” method

Numerical simulations show that the following “plus four” method works well under the following conditions: The population is much larger than the sample size. The requested confidence is at least 90%. The sample size is at least 10.

For the “plus four” method, simply add four imaginary observations, two successes and two failures, to the observed results. Thus, everywhere the formulas above call for $\hat{p}$, use instead

$$\tilde{p} = \frac{(\text{number of successes}) + 2}{(\text{sample size}) + 4}.$$
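The coverage claims can also be checked by direct computation instead of simulation. The sketch below is not from the lecture: it adds up binomial probabilities to get the exact chance that each kind of interval contains the true $p$. The function name `coverage` and the inputs $n = 400$, $p = 0.02$ (loosely modeled on the widget example) are assumptions chosen for illustration, and $z^*$ comes from Python's `statistics.NormalDist` rather than Table C.

```python
from math import comb, sqrt
from statistics import NormalDist

def coverage(n, p, confidence, plus_four=False):
    """Exact probability that the interval contains p when X ~ Binomial(n, p).

    Intended for moderate n (a few hundred); very large n would overflow
    when the binomial coefficient is converted to a float.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    total = 0.0
    for x in range(n + 1):
        if plus_four:
            estimate, size = (x + 2) / (n + 4), n + 4
        else:
            estimate, size = x / n, n
        margin = z_star * sqrt(estimate * (1 - estimate) / size)
        if estimate - margin <= p <= estimate + margin:
            # Add the probability of seeing exactly x successes.
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total

large_sample = coverage(400, 0.02, 0.90)
plus_four = coverage(400, 0.02, 0.90, plus_four=True)
print(f"large sample: {large_sample:.3f}, plus four: {plus_four:.3f}")
```

Comparing each printed value with the nominal 0.90 shows how far a method's actual coverage is from what it claims for these particular inputs; other choices of $n$ and $p$ can behave differently.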


“Plus four” method (continued)

This method does no damage even for large samples. The book therefore recommends always using it.

In exam and homework problems, you must specify whether or not you are using the “plus four” method.

The quantity $\tilde{p}$ in the “plus four” method is a biased estimator. For example, for $p = 0$ and sample size 20, the mean of the sampling distribution is $\frac{2}{24} \neq p$.

Warning: There is no “plus four” method for hypothesis tests! See below.


Example: Defective widgets (continued)

8 out of a simple random sample of 400 widgets were defective. Want 90% confidence.

We use the “plus four” method, so use $\tilde{p} \pm z^* \sqrt{\tilde{p}(1 - \tilde{p})/(n + 4)}$. We have

$$\tilde{p} = \frac{8 + 2}{400 + 4} \approx 0.02475248 \quad\text{and}\quad \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}} \approx 0.007729939.$$

We take $z^* = 1.645$ (from the “$z^*$” row of Table C). So we get about

$$0.02475248 \pm (1.645)(0.007729939) \approx 0.02475248 \pm 0.0127157.$$

Example: Defective widgets (continued)

We got about

$$0.02475248 \pm (1.645)(0.007729939) \approx 0.02475248 \pm 0.0127157.$$

As an interval, this is

$$(0.02475248 - 0.0127157,\ 0.02475248 + 0.0127157) = (0.0120367,\ 0.0374682).$$

So, with 90% confidence, between 1.20% and 3.75% of all widgets are defective.
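The “plus four” computation differs from the large sample one only in the two added successes and four added observations. The following sketch shows this under the same conventions as the worked example; the function name `plus_four_interval` is hypothetical, and $z^*$ is again supplied from Table C.

```python
from math import sqrt

def plus_four_interval(successes, n, z_star):
    """'Plus four' interval: add 2 successes and 2 failures, then proceed as usual."""
    p_tilde = (successes + 2) / (n + 4)
    # The standard error uses n + 4 in the denominator, not n.
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin

# Defective widgets: 8 of 400, 90% confidence, z* = 1.645 from Table C.
low, high = plus_four_interval(8, 400, z_star=1.645)
print(f"({low:.7f}, {high:.7f})")
```

The function does not check the stated conditions (population much larger than the sample, confidence at least 90%, sample size at least 10); those still have to be verified separately.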

Example: Polling on Senator Snort (revisited)

1493 out of a simple random sample of 2500 registered voters say they intend to vote for Senator Snort. We want 98% confidence.

We use the “plus four” method, so use $\tilde{p} \pm z^* \sqrt{\tilde{p}(1 - \tilde{p})/(n + 4)}$. We have

$$\tilde{p} = \frac{1493 + 2}{2500 + 4} \approx 0.5970447 \quad\text{and}\quad \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}} \approx 0.009802000.$$

We take $z^* = 2.326$ (as before). So we get about

$$0.5970447 \pm (2.326)(0.009802000) \approx 0.5970447 \pm 0.0227995.$$

As an interval, this is about

$$(0.5970447 - 0.0227995,\ 0.5970447 + 0.0227995) = (0.574245,\ 0.619844).$$

So, with 98% confidence, between 57.4% and 62.0% of registered voters would say they intend to vote for Senator Snort.

Example: Polling on Senator Snort (comparison)

The 98% confidence interval from above: $(0.574245,\ 0.619844)$.

The large sample method gave the 98% confidence interval $(0.574384,\ 0.620016)$.

So there is very little difference.


Choosing the sample size

How big a sample do we need?

We base this on the large sample method. (If you want a small margin of error, you will need a large sample.)

The margin of error is

$$z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$

It depends on $\hat{p}$, which of course we don’t know before choosing not only the sample size but also the sample. So we have to use a guess $p^*$ for $\hat{p}$.

The margin of error is largest when $\hat{p} = \frac{1}{2}$. So taking $p^* = \frac{1}{2}$ never gives too small a sample size. Moreover, if you use $p^* = \frac{1}{2}$ and $\hat{p}$ turns out to be in the range $0.3 \leq \hat{p} \leq 0.7$, the sample size based on $p^* = \frac{1}{2}$ will not be too big by very much.
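A short derivation, added here as a sketch, shows why $\frac{1}{2}$ is the worst case and why the range from 0.3 to 0.7 is singled out. It uses only the margin of error formula; the ratio at the end is computed from that formula and is not a figure quoted from the book.

```latex
p(1 - p) = \tfrac{1}{4} - \left(p - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4},
\quad \text{with equality exactly when } p = \tfrac{1}{2}.

\text{At } p = 0.3 \text{ or } p = 0.7: \quad p(1 - p) = 0.21,
\qquad \frac{0.25}{0.21} \approx 1.19.
```

For a fixed margin of error and confidence level, the sample size needed is proportional to $p(1 - p)$. So if the proportion turns out to lie anywhere between 0.3 and 0.7, planning with $\frac{1}{2}$ asks for at most about 19% more observations than were strictly necessary.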

Choosing the sample size: details

There are two choices:

Use $p^* = \frac{1}{2}$. If you expect $0.3 \leq \hat{p} \leq 0.7$, this is usually the appropriate choice.

Use previous experience or a pilot study to obtain a guess $p^*$ for your expected $\hat{p}$. This is recommended if you do not expect $\hat{p}$ to be in the range $0.3 \leq \hat{p} \leq 0.7$.

Computing the sample size estimate

Our estimate for the margin of error is

$$m = z^* \sqrt{\frac{p^*(1 - p^*)}{n}}.$$

Solve for $n$:

$$m^2 = (z^*)^2 \left( \frac{p^*(1 - p^*)}{n} \right)$$

$$n = \left( \frac{z^*}{m} \right)^2 p^*(1 - p^*)$$

Then round up.

(You don’t need to memorize this formula. Just solve the first equation when it comes up.)


Example: A sample for Senator Snort

We want a 99% confidence interval for the proportion of registered voters who would say they intend to vote for Senator Snort, with margin of error 1%.

Two person races for Senate in the US hardly ever give one side more than 70%, so we are probably justified in taking $p^* = \frac{1}{2}$.

Take $z^* = 2.576$ (from the “$z^*$” row of Table C).

$$n = \left( \frac{z^*}{m} \right)^2 p^*(1 - p^*) = \left( \frac{2.576}{0.01} \right)^2 \cdot \frac{1}{2} \left( 1 - \frac{1}{2} \right) \approx 16{,}589.4.$$

We can’t take $n = 16{,}589.4$, because it isn’t an integer. So we actually need a sample size of $n = 16{,}590$. (Always round up.)

(This is much bigger than most polls.)


Example: A sample of widgets

We want a 95% confidence interval for the proportion of defective widgets among all widgets manufactured by Wang’s Widgets Inc., with margin of error 0.5%.

Previous research (see above) suggests that between 1.20% and 3.75% of all widgets are defective. I will take $p^* = 0.0375$, since that gives the largest sample size of anything between 0.0120 and 0.0375. (The margin of error gets bigger the closer $\hat{p}$ is to $\frac{1}{2}$.)

Take $z^* = 1.960$ (from the “$z^*$” row of Table C).

$$n = \left( \frac{z^*}{m} \right)^2 p^*(1 - p^*) = \left( \frac{1.960}{0.005} \right)^2 (0.0375)(1 - 0.0375) \approx 5546.31.$$

We can’t take $n = 5546.31$, because it isn’t an integer. So we actually need a sample size of $n = 5547$. (Always round up.)
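Both sample size examples follow the same recipe: plug $z^*$, the desired margin of error, and the guess $p^*$ into the solved formula, then round up. A minimal Python sketch of that recipe is below; the name `sample_size` and the default guess of 0.5 are illustrative choices, not something specified in the lecture.

```python
from math import ceil

def sample_size(z_star, margin, p_guess=0.5):
    """Smallest n with z* · sqrt(p*(1 − p*)/n) <= margin, using the guess p*."""
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    # Always round up: rounding down would give a margin of error that is too large.
    return ceil(n_exact)

# Senator Snort: 99% confidence (z* = 2.576), margin 1%, p* = 1/2.
print(sample_size(2.576, 0.01))
# Widgets: 95% confidence (z* = 1.960), margin 0.5%, p* = 0.0375.
print(sample_size(1.960, 0.005, p_guess=0.0375))
```

Leaving `p_guess` at its default corresponds to the conservative choice $p^* = \frac{1}{2}$, which never gives too small a sample.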

Hypothesis testing for proportions

The hypotheses take the following form, for a given proportion $p_0$ (the proportion with which we want to compare the true proportion).

The null hypothesis is: $H_0\colon p = p_0$.

The alternate hypothesis is one of: $H_a\colon p \neq p_0$, $H_a\colon p < p_0$, or $H_a\colon p > p_0$.

Example: Did the ads affect Senator Snort’s popularity?

You are working for Senator Snort’s opponent’s campaign. You have just run a series of negative ads featuring Senator Snort’s recent conviction for drunken driving. According to previous polls, 60% of registered voters in the state would say they intend to vote for Senator Snort. You want to know whether your ads have decreased his popularity.

Let p be the proportion of registered voters in the state who would now say they intend to vote for Senator Snort.

$H_0\colon p = 0.6$.

$H_a\colon p < 0.6$.


Example: Horn damage

Studies have shown that 11.5% of spiral-horned snorkacks have significant damage to their horns. You want to know whether the rate of horn damage in crumple-horned snorkacks is different.

Let p be the proportion of crumple-horned snorkacks which have significant damage to their horns.

$H_0\colon p = 0.115$.

$H_a\colon p \neq 0.115$.


Hypothesis testing for proportions

There is no “plus four” method for hypothesis testing.

The “plus four” method corrects for errors in confidence intervals that are caused by using $\hat{p}$ in place of $p$. In a hypothesis test, all computations are done under the assumption that the null hypothesis is true, and the null hypothesis claims an exact value for $p$.

One proportion test statistic

The test statistic for a one proportion hypothesis test is

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}.$$

It is a z-score (actually, only approximately, because the sampling distribution is only approximately normal). Note that we take for the sampling mean the value $p_0$ which the null hypothesis says it is supposed to have, and for the sampling standard deviation the value $\sqrt{p_0(1 - p_0)/n}$ which the null hypothesis says it is supposed to have.

Conditions for use

We can use the one proportion z hypothesis test when the population is much larger than the sample, and when both $np_0$ and $n(1 - p_0)$ are at least 10.

(The quantities $np_0$ and $n(1 - p_0)$ are what the null hypothesis says the numbers of successes and failures are expected to be.)

Of course, all the nonmathematical warnings and hazards apply as well.


Example: Did the ads affect Senator Snort’s popularity?

We want to know whether Senator Snort’s popularity decreased from 60% after we ran negative ads featuring his recent conviction for drunken driving.

$H_0\colon p = 0.6$.

$H_a\colon p < 0.6$.

Let’s ask for significance α = 0.05. (There is nothing implausible about the suggestion that the ads decreased Senator Snort’s popularity.)

We choose a simple random sample of 250 registered voters in the state, and find that 131 of them say they intend to vote for Senator Snort.

(Through extremely good fortune, it turns out that state law mandates that all registered voters have telephones, and moreover everybody actually responded to the survey. Also, nobody claimed to be undecided.)


Example: Did the ads affect Senator Snort’s popularity?

(continued)

$H_0\colon p = 0.6$; $H_a\colon p < 0.6$. $\alpha = 0.05$; 131 out of 250 intend to vote for Senator Snort.

Is the test safe to use? We have

$$np_0 = (250)(0.6) = 150 \quad\text{and}\quad n(1 - p_0) = (250)(1 - 0.6) = 100,$$

both of which are certainly bigger than 10. Also, surely the population is much bigger than the sample size. So the mathematical conditions are satisfied.

Example: Did the ads affect Senator Snort’s popularity?

(continued)

The statement says we have a simple random sample, and that there is no problem with nonresponse. We didn’t see the question which was asked, but asking someone to choose a candidate from a list is unlikely to be biased as long as the list is in random order (randomized separately for each person called).

We also assumed nobody was undecided.

Example: Did the ads affect Senator Snort’s popularity?

(continued)

$H_0\colon p = 0.6$; $H_a\colon p < 0.6$. $\alpha = 0.05$; 131 out of 250 intend to vote for Senator Snort. The sampling standard deviation, assuming $H_0$ is true, is

$$\sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.6(1 - 0.6)}{250}} \approx 0.0309839.$$

The sample proportion is $\hat{p} = \frac{131}{250} = 0.524$. So

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \approx \frac{0.524 - 0.6}{0.0309839} \approx -2.45289.$$

From Table A, this gives a P-value (using $-2.45$) of 0.0071. This is less than $\alpha$, so we reject the null hypothesis. We have found strong evidence that Senator Snort’s popularity has indeed decreased.
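The same test can be run without Table A. The sketch below is an illustration, not the lecture's procedure: the function name `lower_tail_z_test` is invented, and the P-value comes from Python's `statistics.NormalDist` using the unrounded $z$, so it can differ slightly from the table value obtained after rounding $z$ to two decimal places.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(successes, n, p0):
    """z statistic and P-value for H0: p = p0 against Ha: p < p0."""
    p_hat = successes / n
    # The standard deviation uses p0 (the null value), not p_hat.
    std_dev = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_dev
    # For Ha: p < p0, the P-value is the area to the left of z.
    p_value = NormalDist().cdf(z)
    return z, p_value

# Senator Snort: 131 of 250 against p0 = 0.6, with alpha = 0.05.
z, p_value = lower_tail_z_test(131, 250, 0.6)
print(f"z = {z:.5f}, P-value = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

Note that no “plus four” adjustment appears anywhere in the function, in line with the warning that the adjustment applies only to confidence intervals.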

Example: Did the ads affect Senator Snort’s popularity?

(continued)

At the significance level 0.05, we rejected the null hypothesis. We found strong evidence that Senator Snort’s popularity has indeed decreased.

You can’t say, without further information, that the observed change was caused by your negative ads. Possibly your ads backfired, but Senator Snort was discovered to have taken bribes from the government of Iran. Possibly your ads were extremely effective, but your candidate was discovered to have taken bribes from the government of Iran.


Example: Horn damage

You want to know whether the rate of significant horn damage in crumple-horned snorkacks is different from the rate of 11.5% known to hold for spiral-horned snorkacks.

$H_0\colon p = 0.115$. $H_a\colon p \neq 0.115$.

Let’s ask for significance α = 0.01. (There is nothing implausible about the suggestion that the damage rate is different, but I want to use a different value of α for variety.)

With some difficulty, we locate 13 crumple-horned snorkacks, and find that 5 of them have significant horn damage.

Example: Horn damage (continued)

$H_0\colon p = 0.115$; $H_a\colon p \neq 0.115$. $\alpha = 0.01$; 5 out of 13 have significant horn damage.

Is the test safe to use? We have

$$np_0 = (13)(0.115) = 1.495 \quad\text{and}\quad n(1 - p_0) = (13)(1 - 0.115) = 11.505.$$

Since $np_0 < 10$, the mathematical conditions are not satisfied, and the test can’t be used.

Example: Horn damage (continued)

$H_0\colon p = 0.115$; $H_a\colon p \neq 0.115$. $\alpha = 0.01$. Suppose that, with enormous additional effort, we locate another 74 crumple-horned snorkacks, bringing our sample size to 87. Suppose now 8 out of 87 have significant horn damage.

Is the test now safe to use? We have

$$np_0 = (87)(0.115) = 10.005 \quad\text{and}\quad n(1 - p_0) = (87)(1 - 0.115) = 76.995.$$

Both are at least 10, so this mathematical condition is satisfied (just barely). We will just have to hope that the population is much bigger than the sample size (although, to be honest, I am doubtful).
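The check performed in the last few slides is mechanical enough to write down as code. This is a small sketch with a made-up function name, `z_test_counts_ok`; it covers only the expected-count condition, not the requirement that the population be much larger than the sample or that the sample be a simple random sample.

```python
def z_test_counts_ok(n, p0, minimum=10):
    """True when both n*p0 and n*(1 - p0) are at least `minimum`."""
    expected_successes = n * p0        # count of successes predicted by H0
    expected_failures = n * (1 - p0)   # count of failures predicted by H0
    return expected_successes >= minimum and expected_failures >= minimum

print(z_test_counts_ok(13, 0.115))   # the first snorkack sample
print(z_test_counts_ok(87, 0.115))   # the enlarged snorkack sample
print(z_test_counts_ok(250, 0.6))    # the Senator Snort poll
```

The function takes only $n$ and $p_0$ as inputs. The observed number of successes plays no role in this condition.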


Example: Horn damage (continued)

$H_0\colon p = 0.115$; $H_a\colon p \neq 0.115$. $\alpha = 0.01$; 8 out of 87 have significant horn damage.

It doesn’t matter that there are only 8 successes. What must be at least 10 is the number of successes (as well as the number of failures) predicted by the null hypothesis.

We will have to hope that the sample can be treated as a simple random sample.


Example: Horn damage (continued)

$H_0\colon p = 0.115$; $H_a\colon p \neq 0.115$. $\alpha = 0.01$; 8 out of 87 have significant horn damage.

The sampling standard deviation, assuming $H_0$ is true, is

$$\sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.115(1 - 0.115)}{87}} \approx 0.0342027.$$

The sample proportion is $\hat{p} = \frac{8}{87} \approx 0.091954$. So

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \approx \frac{0.091954 - 0.115}{0.0342027} \approx -0.673805.$$

Example: Horn damage (continued)

$H_0\colon p = 0.115$; $H_a\colon p \neq 0.115$. $\alpha = 0.01$; 8 out of 87 have significant horn damage.

We got $z \approx -0.673805$.

From Table A, we find $P(z < -0.67) \approx 0.2514$. We are, however, doing a two sided test, so the correct P-value is

$$P(z < -0.67) + P(z > 0.67) = 2P(z < -0.67) \approx 0.5028.$$

Since this isn’t less than $\alpha$, we fail to reject the null hypothesis. We conclude that there isn’t sufficient evidence to say that the rate of significant horn damage among crumple-horned snorkacks differs from 0.115.
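For a two sided alternative the only change in the computation is the doubling of the tail area. The sketch below illustrates this; the function name `two_sided_z_test` is hypothetical, and because it uses the unrounded $z$ with `statistics.NormalDist` instead of Table A, its P-value is close to, but not exactly, the 0.5028 found above.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(successes, n, p0):
    """z statistic and P-value for H0: p = p0 against Ha: p != p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Two sided: count both tails, i.e. twice the area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Horn damage: 8 of 87 against p0 = 0.115, with alpha = 0.01.
z, p_value = two_sided_z_test(8, 87, 0.115)
print(f"z = {z:.6f}, P-value = {p_value:.4f}, reject H0: {p_value < 0.01}")
```

Using `-abs(z)` makes the function give the same P-value whether the sample proportion falls below or above $p_0$.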