An overview of hypothesis testing and confidence intervals for comparing two populations' means and proportions. It includes examples and formulas for independent and dependent populations, as well as instructions for calculating test statistics and determining p-values. From a university statistics course (math 382).
Suppose that we want to compare two population means. For example, we might conduct a study comparing two treatments for depression (using a completely randomized design), or compare the mean math scores of girls and boys by taking a random sample of standardized test scores from each group.
We can show that the sampling distribution of $\bar{X} - \bar{Y}$, where $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ are independent random samples, is approximately normal with mean $\mu_x - \mu_y$ and variance

$$V(\bar{X} - \bar{Y}) = \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}$$

if n and m are sufficiently large.
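The variance of the difference follows from independence of the two samples. A short derivation, written here as a sketch with $\mu_x, \sigma_x^2$ and $\mu_y, \sigma_y^2$ denoting the mean and variance of the X and Y populations:

```latex
\begin{align*}
E(\bar{X} - \bar{Y}) &= E(\bar{X}) - E(\bar{Y}) = \mu_x - \mu_y \\
V(\bar{X} - \bar{Y}) &= V(\bar{X}) + (-1)^2\, V(\bar{Y})
    && \text{(the samples are independent)} \\
  &= \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m} \\
\text{so}\quad
Z &= \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}
          {\sqrt{\sigma_x^2/n + \sigma_y^2/m}}
   \;\text{ is approximately } N(0, 1) \text{ for large } n \text{ and } m.
\end{align*}
```

Note that the variances add even though the means are subtracted.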
4.1 Hypothesis Test for the Difference Between Two Means, $\mu_1 - \mu_2$, from Independent Populations
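Two-tailed: $H_o: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \ne 0$
Left-tailed: $H_o: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 < 0$
Right-tailed: $H_o: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$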
Note: Using zero for the difference simply asks whether the means are the same or different. If you thought one mean was 5 more than the other, you could use a difference of 5.
$$Z_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
$Z_c$ is distributed approximately standard normal if both sample sizes are at least 30, and exactly standard normal if the populations are normal. Also, if $\sigma_1^2$ and $\sigma_2^2$ are unknown, the sample variances $s_1^2$ and $s_2^2$ can be used as long as both sample sizes are at least 30.
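P-value = $2P(Z \ge |Z_c|)$ for $H_a: \mu_1 - \mu_2 \ne 0$, $P(Z \le Z_c)$ for $H_a: \mu_1 - \mu_2 < 0$, and $P(Z \ge Z_c)$ for $H_a: \mu_1 - \mu_2 > 0$.
We will reject $H_o$ if p-value $\le \alpha$; otherwise we fail to reject $H_o$.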
Another way to draw a conclusion from the test statistic is to simply compare it to values of the standard normal distribution.
Two-tailed: Reject $H_o$ if $|Z_c| \ge Z_{\alpha/2}$
Left-tailed: Reject $H_o$ if $Z_c \le -Z_\alpha$
Right-tailed: Reject $H_o$ if $Z_c \ge Z_\alpha$
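The test statistic and p-value can be sketched in Python using only the standard library. This is an illustration, not part of the handout: the function name `two_sample_z` and the summary statistics passed to it are hypothetical, and the sample standard deviations are used in place of the population values on the assumption that both samples have at least 30 observations.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, sd1, n1, xbar2, sd2, n2, delta0=0.0, alternative="two-sided"):
    """Return (z, p_value) for H_o: mu1 - mu2 = delta0 (hypothetical helper)."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = ((xbar1 - xbar2) - delta0) / standard_error
    cdf = NormalDist().cdf  # standard normal CDF
    if alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    elif alternative == "less":
        p_value = cdf(z)
    elif alternative == "greater":
        p_value = 1 - cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return z, p_value


# Hypothetical summary statistics, chosen only to illustrate the calculation.
z, p_value = two_sample_z(52.0, 6.0, 64, 50.0, 8.0, 64, alternative="greater")
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H_o" if p_value <= alpha else "fail to reject H_o")
```

With these made-up numbers the standard error is 1.25 and z = 1.6, which falls just short of the right-tailed critical value of about 1.645, so the p-value rule and the rejection-region rule agree.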
Example: A medical researcher wishes to see whether the pulse rates of smokers are higher than the pulse rates of nonsmokers. Independent samples of 100 smokers and 100 nonsmokers are selected and the results are shown below. Can the researcher conclude, at $\alpha = 0.05$, that smokers have higher pulse rates on average than nonsmokers?
Smokers: $\bar{x}_1 =$ , $s_1 =$
Nonsmokers: $\bar{x}_2 =$ , $s_2 =$
Confidence Interval for the Difference Between Two Means, $\mu_1 - \mu_2$, from Independent Populations

$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

When $n_1 \ge 30$ and $n_2 \ge 30$, $s_1^2$ and $s_2^2$ can be used in place of $\sigma_1^2$ and $\sigma_2^2$.
Example: Two brands of cigarettes are selected, and their mean nicotine content is compared. The summary statistics are shown below. Find a 99% confidence interval for the true difference in means.
Brand A: $\bar{x}_A = 28.6$ milligrams, $s_A = 5.1$ milligrams, $n_A = 30$
Brand B: $\bar{x}_B = 32.9$ milligrams, $s_B = 4.4$ milligrams, $n_B = 40$
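One way to carry out this calculation is sketched below in Python. The function name `z_interval_diff_means` is hypothetical; the sketch assumes the large-sample interval applies (both samples have at least 30 observations, so the sample standard deviations stand in for the population values) and takes the difference as Brand A minus Brand B.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_means(xbar1, s1, n1, xbar2, s2, n2, confidence):
    """Large-sample confidence interval for mu1 - mu2 (hypothetical helper)."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z_crit * standard_error, diff + z_crit * standard_error


# Brand A minus Brand B, nicotine content in milligrams.
lower, upper = z_interval_diff_means(28.6, 5.1, 30, 32.9, 4.4, 40, confidence=0.99)
print(f"99% CI for mu_A - mu_B: ({lower:.2f}, {upper:.2f}) milligrams")
```

Both endpoints come out negative, so the interval excludes zero, which suggests Brand A has the lower mean nicotine content at this confidence level.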
4.2 Two Means for Independent Samples with $\sigma_1$ and $\sigma_2$ Unknown
Possible hypotheses are as above. The test statistic, assuming $\sigma_1^2 \ne \sigma_2^2$,

$$t_c = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

has a t-distribution with df = min{$n_1 - 1$, $n_2 - 1$}.
Confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

with df = min{$n_1 - 1$, $n_2 - 1$}. Use the same rejection regions we specified with the t-distribution for tests of one population mean.
The test statistic, assuming $\sigma_1^2 = \sigma_2^2$,

$$t_c = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2},$$

has a t-distribution with df = $n_1 + n_2 - 2$. Use the same rejection regions we specified with the t-distribution for tests of one population mean.
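Confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ and df = $n_1 + n_2 - 2$.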
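The two versions of the t statistic differ only in the standard error and the degrees of freedom. The sketch below computes both for a hypothesized difference of zero; the function names and the summary statistics are hypothetical, chosen only for illustration. Python's standard library has no t-distribution quantile function, so the critical value would still be read from a t table.

```python
from math import sqrt


def t_unequal_variances(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic and conservative df when sigma1^2 != sigma2^2 is assumed."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / standard_error, min(n1 - 1, n2 - 1)


def t_pooled(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic and df when sigma1^2 == sigma2^2 is assumed."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / standard_error, n1 + n2 - 2


# Hypothetical small-sample summary statistics.
sample = dict(xbar1=20.0, s1=3.0, n1=10, xbar2=17.0, s2=4.0, n2=12)
t_unequal, df_unequal = t_unequal_variances(**sample)
t_equal, df_equal = t_pooled(**sample)
print(f"unequal variances: t = {t_unequal:.3f}, df = {df_unequal}")
print(f"pooled:            t = {t_equal:.3f}, df = {df_equal}")
```

The pooled version gains degrees of freedom (20 versus 9 here), which is why it is preferred when the equal-variance assumption is reasonable.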
4.3 Hypothesis Test for the Difference Between Two Means, $\mu_1 - \mu_2$, from Dependent Populations
Simply find the difference between the observations from pairs of data and analyze as a Hypothesis Test from One Population Mean, Handout 3. If you want an estimate of the mean difference, find a Confidence Interval for a Mean, Handout 2.
Example: In an effort to improve the vocabulary of 10 students, a teacher provides a 1-hour tutoring session for them. A pretest is given before the sessions, and a posttest is given afterward. The results are shown below. At the 0.01 level of significance, can the teacher conclude that the tutoring sessions helped improve the students' vocabularies?
Student      1   2   3   4   5   6   7   8   9  10
Pretest     83  76  92  64  82  68  70  71  72  63
Posttest    88  82 100  72  81  75  79  68  81  70
Difference
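The paired analysis reduces to a one-sample t test on the differences. The sketch below takes each difference as posttest minus pretest (an assumption about direction, so that improvement is positive) and tests $H_a: \mu_d > 0$. The critical value 2.821 is the standard t-table entry for df = 9 with a one-tailed area of 0.01; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

pretest = [83, 76, 92, 64, 82, 68, 70, 71, 72, 63]
posttest = [88, 82, 100, 72, 81, 75, 79, 68, 81, 70]

# Difference for each student: posttest minus pretest.
differences = [post - pre for pre, post in zip(pretest, posttest)]

n = len(differences)
d_bar = mean(differences)   # sample mean of the differences
s_d = stdev(differences)    # sample standard deviation (divides by n - 1)
t_stat = d_bar / (s_d / sqrt(n))

T_CRIT = 2.821  # t table: df = n - 1 = 9, one-tailed alpha = 0.01
reject = t_stat >= T_CRIT

print("differences:", differences)
print(f"mean = {d_bar}, sd = {s_d:.3f}, t = {t_stat:.3f}")
print("reject H_o" if reject else "fail to reject H_o")
```

The test statistic exceeds the critical value, so under these assumptions the data support the claim that tutoring improved the mean score.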
4.4 Hypothesis Test for the Difference Between Two Proportions, $p_1 - p_2$, from Independent Populations
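Two-tailed: $H_o: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \ne 0$
Left-tailed: $H_o: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 < 0$
Right-tailed: $H_o: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$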
Note: Using zero for the difference simply asks whether the proportions are the same or different.
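$$Z_c = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\tilde{p}(1 - \tilde{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where $\tilde{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$ and $\hat{p}_i = \dfrac{X_i}{n_i}$. $Z_c$ is distributed approximately standard normal if both samples are sufficiently large.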
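P-value = $2P(Z \ge |Z_c|)$ for $H_a: p_1 - p_2 \ne 0$, $P(Z \le Z_c)$ for $H_a: p_1 - p_2 < 0$, and $P(Z \ge Z_c)$ for $H_a: p_1 - p_2 > 0$.
We will reject $H_o$ if p-value $\le \alpha$; otherwise we fail to reject $H_o$.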
Another way to draw a conclusion from the test statistic is to simply compare it to values of the standard normal distribution.
Two-tailed: Reject $H_o$ if $|Z_c| \ge Z_{\alpha/2}$
Left-tailed: Reject $H_o$ if $Z_c \le -Z_\alpha$
Right-tailed: Reject $H_o$ if $Z_c \ge Z_\alpha$
Example: To test the effectiveness of a new pain-relieving drug, 80 patients at a clinic were given a pill containing the drug and 80 others were given a placebo. At the 0. level of significance, what can we conclude about the effectiveness of the drug if 56 of the first group felt a beneficial effect while 38 of those who received a placebo felt a beneficial effect?
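The test statistic for this example can be sketched as follows. The function name `two_proportion_z` is hypothetical, and the sketch assumes a right-tailed alternative (the drug group has the larger proportion reporting a benefit). Because the significance level is not legible in the example, the code reports the p-value and leaves the comparison with $\alpha$ to the reader.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H_o: p1 - p2 = 0 (hypothetical helper)."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # same as (n1*p_hat1 + n2*p_hat2)/(n1 + n2)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / standard_error


# Group 1: drug (56 of 80 felt a benefit). Group 2: placebo (38 of 80).
z_c = two_proportion_z(56, 80, 38, 80)
p_value = 1 - NormalDist().cdf(z_c)  # right-tailed: H_a: p1 - p2 > 0
print(f"z = {z_c:.3f}, one-sided p-value = {p_value:.4f}")
```

A two-tailed analysis would simply double this p-value.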
Confidence Interval for the Difference Between Two Proportions, $p_1 - p_2$, from Independent Populations
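$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$

when $n \hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$ for both samples.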
Example: In a random sample of 200 tractors from one assembly line and 400 tractors from another, there were 16 and 20 tractors respectively, which required extensive adjustments before they could be shipped. Construct a 95% confidence interval for the difference between proportions.
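A sketch of this interval in Python follows. The function name `z_interval_diff_props` is hypothetical, and the difference is taken as the first assembly line minus the second. The helper also checks the large-sample condition, read here as at least 5 successes and 5 failures in each sample.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_props(x1, n1, x2, n2, confidence):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    # Large-sample check: n*p_hat >= 5 and n*(1 - p_hat) >= 5 in both samples.
    if min(x1, n1 - x1, x2, n2 - x2) < 5:
        raise ValueError("large-sample condition not met")
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    standard_error = sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    diff = p_hat1 - p_hat2
    return diff - z_crit * standard_error, diff + z_crit * standard_error


# 16 of 200 tractors from the first line, 20 of 400 from the second.
lower, upper = z_interval_diff_props(16, 200, 20, 400, confidence=0.95)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

The resulting interval contains zero, so at the 95% level these data do not show a difference between the two assembly lines.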
Homework 16 (part II) - see Handout 3 for Part I Due April 28
8. Two brands of batteries are tested and their voltage is compared. The summary statistics are below. Find and interpret a 95% confidence interval of the true difference in means.
Brand X: $\bar{x}_X = 9.2$ volts, $s_X = 0.3$ volts, $n_X = 32$
Brand Y: $\bar{x}_Y = 8.8$ volts, $s_Y = 0.1$ volts, $n_Y = 30$
9. The U.S. Department of Agriculture uses many types of surveys to obtain important economic estimates. In one pilot survey they estimated wheat prices in July and in September using independent samples. The summary is below:
July: $\bar{x}_J =$ $3. , $\sigma_J =$ $0. , $n_J = 45$
September: $\bar{x}_S =$ , $\sigma_S =$ , $n_S =$
Use a significance test to examine whether or not the mean price of wheat changed from July to September.
10. In a sample of 80 workers from a factory in city A, it was found that 5% were unable to read, while in a sample of 50 workers from city B, 8% were unable to read.
a. Can it be concluded that there is a difference in the proportions of workers unable to read in the two cities? Use $\alpha = 0.10$.
b. Find a 90% confidence interval for the difference of the two proportions.
11. In an effort to increase production of an automobile part factory, the manager decides to play music in the manufacturing area. Eight workers are selected, and the number of items each produced for a specific day is recorded. After one week of music, the same workers are monitored again. The data are below. At α = 0.05, can the manager conclude that music increased mean production?
Worker    1   2   3   4   5   6   7   8
Before    6   8  10   9   5  12   9   7
After    10  12   9  12   8  13   8  10