A lecture file from Math 243, a college-level statistics course. It covers confidence intervals and hypothesis tests for proportions and means, with examples of calculating confidence intervals and testing hypotheses using z and t procedures. It also discusses the conditions for using these procedures and gives examples of applying them to different populations.
N. Christopher Phillips
2 June 2009
N. Christopher Phillips () Math 243: Lecture File 19 2 June 2009 1 / 26
Sample problems for the final exam have been posted.
Partial review session information has been posted.
Instructions for the final exam will be given either Thursday or in the discussion sections.
Deadline for all extra credit: Midnight the night of Tuesday 9 June 2009. (We need to get it soon enough to get grades in by the university’s deadline.)
There is a quiz in the discussion sections this week.
The sample problem collection contains almost no problems from before the midterm. The final exam itself will, however, be cumulative, although with greater emphasis on material covered since the midterm. Look at the sample problem list for the midterm for sample problems on earlier material.
The sample problems are in a rather random order.
The emphasis that a topic receives on the final exam will not be the same as in this list of sample problems, for at least two reasons. First, the final exam will be much shorter. Second, some groups of related problems in this sample collection are longer because they illustrate a number of possible outcomes.
Caution: Some confidence interval and hypothesis test problems present information in a form that can’t be directly entered into a calculator, or ask for the result in a form different from the usual calculator output.
For example, the TI-83 calculator does not, as far as I can tell, give confidence intervals in the form
(estimator) ± (margin of error).
Also, there are some proportion problems which give the sample proportion instead of the number of successes. If you try to calculate the number of successes from the sample proportion, you might not get an integer. (This can occur because of rounding errors.) However, my calculator does not allow one to enter a noninteger number of successes.
Questions asked on the sample final problems or on the sample midterm problems about one of the procedures we have learned, say the one sample z procedure, may be asked on the actual exam about a different procedure (such as the matched pairs t procedure).
There are no problems on the sample problem list which ask for the standard error in a test, but there will be some on the final exam.
Also, review the comments on the sample midterm problems.
As on the midterm, you may bring one two-sided page of notes and a calculator, but no cell phones.
Exams will be available for inspection after they are graded. The original final exams will not be returned, but you can get a copy of your exam on request.
You choose a simple random sample of 100 students at the University of Oregon, and find that 72% of them think it is ridiculous to have problems in Math 243 which involve crumple-horned snorkacks. You also choose a simple random sample of 200 students at Oregon State University, and find that 67.5% of them think it is ridiculous to have problems in Math 243 which involve crumple-horned snorkacks.
Find a 98% confidence interval for the difference in the proportions of students at the two universities who think such problems are ridiculous.
Let $p_1$ be the true proportion of students at the University of Oregon who think it is ridiculous to have problems in Math 243 which involve crumple-horned snorkacks. Let $p_2$ be the true proportion of students at Oregon State University who think it is ridiculous to have such problems in Math 243.
Note: There is no particular reason we couldn’t have reversed the labels. I chose $p_1$ for the University of Oregon proportion for no better reason than that it was mentioned first.
72% of 100 students at the University of Oregon think my problems are ridiculous. So do 67.5% of 200 students at Oregon State University.
Can we use the large sample confidence interval?
Both samples must have at least 10 successes and at least 10 failures.
The University of Oregon sample has $0.72 \cdot 100 = 72$ successes and $100 - 72 = 28$ failures. The Oregon State University sample has $0.675 \cdot 200 = 135$ successes and $200 - 135 = 65$ failures. So we can use the large sample confidence interval.
Note: There is nothing wrong with using the “plus four” confidence interval. But be sure to make clear which you do!
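The count check above can be sketched in a few lines of Python. The function name `counts_ok` and the use of `round` to recover the number of successes from a reported proportion are illustrative assumptions, not part of the lecture. The rounding is there because, as cautioned earlier, a sample proportion times the sample size need not come out as an exact integer.

```python
def counts_ok(p_hat, n, minimum=10):
    """Return (successes, failures, ok) for one sample.

    p_hat is the reported sample proportion and n is the sample size.
    The success count is rounded to the nearest integer because the
    reported proportion may itself have been rounded.
    """
    successes = round(p_hat * n)
    failures = n - successes
    ok = successes >= minimum and failures >= minimum
    return successes, failures, ok


# The two samples from the problem above.
uo = counts_ok(0.72, 100)     # University of Oregon
osu = counts_ok(0.675, 200)   # Oregon State University
print(uo, osu)
```

Both samples pass the check, in agreement with the counts worked out by hand.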
72% of 100 students at the University of Oregon think my problems are ridiculous. So do 67.5% of 200 students at Oregon State University. We use the large sample confidence interval.
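We are given $\hat{p}_1 = 0.72$ and $\hat{p}_2 = 0.675$. The sampling standard error is
$$\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} = \sqrt{\frac{(0.72)(0.28)}{100} + \frac{(0.675)(0.325)}{200}} \approx 0.0557931.$$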
We had $\hat{p}_1 = 0.72$ and $\hat{p}_2 = 0.675$, and sampling standard error 0.0557931. We were supposed to find a 98% confidence interval.
We take $z^* = 2.326$ (from the “$z^*$” row of Table C). The confidence interval is
$$\hat{p}_1 - \hat{p}_2 \pm z^* \cdot (\text{sampling standard error}) \approx 0.72 - 0.675 \pm (2.326)(0.0557931) \approx 0.045 \pm 0.129775.$$
In interval form, this is about
$(-0.0847749,\ 0.174775)$.
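As a cross-check on the arithmetic, here is a minimal Python sketch of the large sample two proportion interval. The function name `two_proportion_ci` and its return layout are assumptions made for this illustration. The critical value 2.326 is passed in by hand, as read from Table C, rather than computed.

```python
import math


def two_proportion_ci(p1_hat, n1, p2_hat, n2, z_star):
    """Large sample confidence interval for p1 - p2.

    Returns (difference, standard error, margin of error, interval).
    """
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * se
    return diff, se, margin, (diff - margin, diff + margin)


# University of Oregon is sample 1, Oregon State University is sample 2.
diff, se, margin, interval = two_proportion_ci(0.72, 100, 0.675, 200, 2.326)
print(f"standard error  = {se:.7f}")
print(f"margin of error = {margin:.6f}")
print(f"interval        = ({interval[0]:.6f}, {interval[1]:.6f})")
```

The printed values should agree with the hand computation above to the precision shown there.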
We had $\hat{p}_1 = 0.72$ and $\hat{p}_2 = 0.675$, and sampling standard error 0.0557931. Our 98% confidence interval is $(-0.0847749,\ 0.174775)$.
This is a confidence interval for $p_1 - p_2$, which is the amount by which the proportion at the University of Oregon exceeds the proportion at Oregon State University. Thus, we can say, with 98% confidence, that the proportion of University of Oregon students who think such problems are ridiculous is somewhere between 0.0847749 less (that is, $-0.0847749$ more) and 0.174775 more than the proportion at Oregon State University.
Suppose we reversed the labels.
Thus let $p_1$ be the true proportion of students at Oregon State University who think it is ridiculous to have problems in Math 243 which involve crumple-horned snorkacks. Let $p_2$ be the true proportion of students at the University of Oregon who think it is ridiculous to have such problems in Math 243.
We previously got a 98% confidence interval of
$0.045 \pm 0.129775$ or $(-0.0847749,\ 0.174775)$.
Now, however, we get
$-0.045 \pm 0.129775$ or $(-0.174775,\ 0.0847749)$.
(Check this!)
However, the interpretation is the same. Before, we said, with 98% confidence, that the proportion of University of Oregon students who think such problems are ridiculous is somewhere between 0.0847749 less (that is, $-0.0847749$ more) and 0.174775 more than the proportion at Oregon State University. Now, we say, with 98% confidence, that the proportion of Oregon State University students who think such problems are ridiculous is somewhere between 0.174775 less (that is, $-0.174775$ more) and 0.0847749 more than the proportion at the University of Oregon.
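The effect of reversing the labels can be checked mechanically: an interval for $p_1 - p_2$ becomes an interval for $p_2 - p_1$ by negating both endpoints and swapping them. The helper name `reverse_labels` below is an assumption made for this sketch.

```python
def reverse_labels(interval):
    """Turn an interval (low, high) for p1 - p2 into one for p2 - p1."""
    low, high = interval
    # Negating a number reverses order, so the endpoints trade places.
    return (-high, -low)


original = (-0.0847749, 0.174775)   # University of Oregon minus Oregon State
reversed_interval = reverse_labels(original)
print(reversed_interval)
```

Applying the helper twice gives back the original interval, which is one way to see that the two intervals carry the same information.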
For all tests, the data must come from a simple random sample of the population (if appropriate, two simple random samples), or at least one must be able to treat the data as if it were a simple random sample.
For all tests, the population(s) must be much larger than the sample(s), at least 10 or 20 times the size(s) of the sample(s).
One sample z procedure: confidence interval and hypothesis test (for a population mean). The data must come from a simple random sample of the population. You must know the population standard deviation. Other conditions as for the one sample t procedure (below).
One sample t procedure: confidence interval and hypothesis test (for a population mean).
The data must come from a simple random sample of the population. (This is more important than normality of the population distribution.)
For sample sizes $n < 15$, the data must appear roughly normal (roughly symmetric, single peak, no outliers). Check with a stemplot or histogram.
For sample sizes $n$ with $15 \le n < 40$, the one sample t procedure can be used unless there are outliers or strong skewness.
For sample sizes $n \ge 40$, the one sample t procedure can be used even for strongly skewed distributions. (But you need an even bigger sample size if there are outliers.)
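The sample size guidelines above can be summarized as a small decision function. This is only a sketch: the function name, the flag names, and the three return strings are assumptions made for illustration, and the flags stand for judgments made by eye from a stemplot or histogram. It also takes for granted that the data come from a simple random sample.

```python
def one_sample_t_guideline(n, roughly_normal=False, outliers=False,
                           strong_skew=False):
    """Apply the sample size rules of thumb for the one sample t procedure.

    Returns "ok", "not ok", or "needs larger sample".
    """
    if n < 15:
        # Small samples: the data must look roughly normal, with no outliers.
        return "ok" if (roughly_normal and not outliers) else "not ok"
    if n < 40:
        # Medium samples: fine unless there are outliers or strong skewness.
        return "not ok" if (outliers or strong_skew) else "ok"
    # n >= 40: strong skewness is tolerated, but outliers call for a
    # bigger sample. The guideline does not say how much bigger.
    return "needs larger sample" if outliers else "ok"


print(one_sample_t_guideline(10, roughly_normal=True))
print(one_sample_t_guideline(25, strong_skew=True))
print(one_sample_t_guideline(60, strong_skew=True))
print(one_sample_t_guideline(60, outliers=True))
```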
Matched pairs t procedure: confidence interval and hypothesis test (for comparing two population means).
Carry out a one sample t procedure for the set of differences. (For example, the difference between the change in blood pressure for the twin who got the drug and the change in blood pressure for the twin who got the placebo.) The test can be used if the differences satisfy the conditions for the one sample t procedure. The distribution of the populations doesn’t matter, only the distribution of the differences matters.
Of course, the data must come from a simple random sample of the population.
Two sample t procedure: confidence interval and hypothesis test (for comparing two population means).
The data must come from simple random samples of the populations.
Use the guidelines for the one sample t procedure with the sum $n_1 + n_2$ of the two sample sizes in place of $n$, but considering the shapes of both distributions.
For sample sizes as small as $n_1 = n_2 = 5$, and with equal sample sizes, some skewness can be tolerated as long as both distributions have similar shapes.
As examples of the last item:
$n_1 = n_2 = 5$ and both distributions somewhat skewed left: OK.
$n_1 = 5$, $n_2 = 8$, and both distributions somewhat skewed left: No.
$n_1 = n_2 = 5$, one distribution somewhat skewed left, and the other somewhat skewed right: No.
One proportion z procedure: confidence interval (for a population proportion).
The data must come from a simple random sample of the population.
For the large sample version, there must be at least 15 successes and at least 15 failures.
For the “plus four” version: Sample size $n \ge 10$. Confidence level $C \ge 0.90$ (that is, 90%).
One proportion z procedure: hypothesis test (for a population proportion).
The data must come from a simple random sample of the population.
Let $p_0$ be what the null hypothesis says the true proportion is supposed to be. Then the sample size $n$ must be large enough that $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.
Remember that the test is carried out assuming that the null hypothesis is true. Thus, the condition is that, if the null hypothesis is true, we expect at least 10 successes and at least 10 failures in a sample of size $n$.
There is no “plus four” version of a hypothesis test!
Two proportion z procedure: confidence interval (for comparing two population proportions).
The data must come from simple random samples of the populations.
For the large sample version, there must be at least 10 successes and at least 10 failures in each sample.
For the “plus four” version, both sample sizes must be at least 5.
Two proportion z procedure: hypothesis test (for comparing two population proportions).
The data must come from simple random samples of the populations.
There must be at least 5 successes and at least 5 failures in each sample.
There is no “plus four” version of a hypothesis test!
Out of a simple random sample of 200 high school seniors in Megalopolis (a very large city), 25 are taking calculus.
Out of a simple random sample of 50 high school seniors in Gorman (a moderate size town with about 2000 high school students), 10 are taking calculus.
Out of a simple random sample of 20 high school seniors in Snailsville (also a moderate size town with about 2000 high school students), 9 are taking calculus.
Out of a simple random sample of 20 high school seniors in East Snailsville (which has one high school with about 600 students), 6 are taking calculus.
Out of a simple random sample of 200 high school seniors in Megalopolis (a very large city), 25 are taking calculus.
There are surely more than 20,000 high school students in Megalopolis, so more than 5000 high school seniors, and the population is much bigger than the sample.
There are at least 15 successes and at least 15 failures in the sample (25 and 175), so both confidence interval procedures apply.
We can also do most reasonable hypothesis tests. We can’t, however, test whether at least 4% of Megalopolis high school seniors are taking calculus. This would give $p_0 = 0.04$, so $np_0 = (200)(0.04) = 8$, which is less than 10.
Out of a simple random sample of 20 high school seniors in East Snailsville (which has one high school with about 600 students), 6 are taking calculus.
There are probably only about 150 high school seniors in East Snailsville. So the population is less than 10 times the sample size, and we can do no tests at all.
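The population size condition used in the last two examples is a single comparison. A minimal sketch follows, with the function name `population_large_enough` and the default factor of 10 chosen for illustration. The population figures are the rough estimates made in the text, not exact counts.

```python
def population_large_enough(population, sample, factor=10):
    """Check that the population is at least `factor` times the sample."""
    return population >= factor * sample


# Megalopolis: more than 5000 seniors, sample of 200.
megalopolis = population_large_enough(5000, 200)
# East Snailsville: about 150 seniors, sample of 20.
east_snailsville = population_large_enough(150, 20)
print(megalopolis, east_snailsville)
```

Passing `factor=20` applies the stricter end of the "10 or 20 times" guideline.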
Out of a simple random sample of 20 high school seniors in Snailsville (also a moderate size town with about 2000 high school students), 9 are taking calculus.
There are probably about 400 high school seniors in Snailsville. So the population is about 20 times the sample size, and we can do tests.
We can’t use the large sample confidence interval, since there are fewer than 15 successes (only 9). We can use the “plus four” confidence interval to get a 95% confidence interval or a 90% confidence interval, but not an 80% confidence interval.
We can test for whether less than half (or more than half, or different from a half) of high school seniors in Snailsville are taking calculus, since then $np_0 = 10$ and $n(1 - p_0) = 10$. We can’t do any other hypothesis tests, because with $n = 20$ any other value of $p_0$ makes one of these two numbers less than 10.
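For the Snailsville sample, a "plus four" 95% confidence interval could be computed as sketched below. This lecture file names the plus four interval but does not write out its formula, so the form used here is an assumption: add two successes and two failures, then apply the usual large sample formula to the adjusted counts. The function name `plus_four_ci` is also an assumption, and the critical value 1.960 for 95% confidence is entered by hand.

```python
import math


def plus_four_ci(successes, n, z_star):
    """Plus four interval for one proportion (assumed form).

    Adds 2 successes and 2 failures, then uses the large sample formula
    with the adjusted proportion and the adjusted sample size n + 4.
    """
    n_tilde = n + 4
    p_tilde = (successes + 2) / n_tilde
    se = math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    margin = z_star * se
    return p_tilde, margin, (p_tilde - margin, p_tilde + margin)


# Snailsville: 9 of 20 sampled seniors are taking calculus.
p_tilde, margin, interval = plus_four_ci(9, 20, 1.960)
print(f"adjusted proportion = {p_tilde:.4f}")
print(f"margin of error     = {margin:.4f}")
print(f"interval            = ({interval[0]:.4f}, {interval[1]:.4f})")
```

The sketch does not itself check the conditions stated earlier (sample size at least 10, confidence level at least 90%); those still have to be verified before using it.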
Out of a simple random sample of 50 high school seniors in Gorman (a moderate size town with about 2000 high school students), 10 are taking calculus.
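As in the Snailsville problem, there are probably about 400 high school seniors in Gorman. With a sample of 50, the population is only about 8 times the sample size, a little short of the guideline of at least 10 times given earlier. The questions below concern only the conditions on $np_0$ and $n(1 - p_0)$.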
Are more than 10% of high school seniors in Gorman taking calculus?
To use the one proportion z hypothesis test procedure, we need $np_0 \ge 10$ and $n(1 - p_0) \ge 10$. Here $np_0 = (50)(0.10) = 5$, so we can’t use the test.
Are less than 30% of high school seniors in Gorman taking calculus?
Here $np_0 = (50)(0.30) = 15$ and $n(1 - p_0) = (50)(0.70) = 35$. Both are at least 10, so we can use the test.
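The sample size condition for the one proportion z hypothesis test can be checked for all of the hypothesized proportions discussed in these examples at once. The function name `one_proportion_test_ok`, the dictionary of cases, and the small floating point tolerance are assumptions made for this sketch.

```python
def one_proportion_test_ok(n, p0, minimum=10):
    """Check n*p0 >= minimum and n*(1 - p0) >= minimum.

    These are the expected numbers of successes and failures if the
    null hypothesis is true. The tiny tolerance guards against floating
    point round-off when a product lands exactly on the minimum.
    """
    tolerance = 1e-9
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return (expected_successes >= minimum - tolerance
            and expected_failures >= minimum - tolerance)


cases = {
    "Megalopolis, p0 = 0.04": (200, 0.04),
    "Snailsville, p0 = 0.50": (20, 0.50),
    "Gorman, p0 = 0.10": (50, 0.10),
    "Gorman, p0 = 0.30": (50, 0.30),
}
results = {label: one_proportion_test_ok(n, p0)
           for label, (n, p0) in cases.items()}
for label, ok in results.items():
    print(label, "->", "can use the test" if ok else "cannot use the test")
```

The condition depends only on the sample size and on $p_0$, not on the observed number of successes, which is why the same sample can support one test and not another.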