




An overview of hypothesis testing for population proportions and means using independent samples. It covers the notation, hypotheses, test statistic, critical values, and decisions for testing the equality of proportions and means. The document also includes examples of testing the difference in proportions between two populations and the difference in means between two populations.
Typology: Study notes





Chapter 9
Key Ideas
Hypothesis Test (Two Populations)
Section 9-1: Overview
In Chapter 8, discussion centered around hypothesis tests for the proportion, mean, and standard deviation/variance of a single
population. However, often researchers want to compare two different populations. For example, surgeons may want to try out a new
surgical technique for a certain ailment, but they are not sure if it will be better. To test this, they could take two samples of people.
In one sample, they could use the existing technique, and in the other, they could use the new technique. By comparing survival
proportions of the two groups, they could then determine whether the new technique is any better. In this case, the first population is all
people who would receive the existing technique. The second population is all people who would receive the new technique. The
methods discussed in this chapter make this kind of inference possible.
Section 9-2: Inferences About Two Proportions
To use the methods described in this section, we first need to rely on a few conditions that must be met for everything to work
properly.
Conditions
The two samples are independent: observations from the first population are not related to, or paired with, observations from the second population.
Notation
$p_1$ = Population Proportion (Population 1)
$p_2$ = Population Proportion (Population 2)
$n_1$ = Sample Size (Population 1)
$n_2$ = Sample Size (Population 2)
$x_1$ = Number of Successes (Population 1)
$x_2$ = Number of Successes (Population 2)
$\hat{p}_1 = x_1 / n_1$ = Sample Proportion (Population 1)
$\hat{p}_2 = x_2 / n_2$ = Sample Proportion (Population 2)
$\hat{q}_1 = 1 - \hat{p}_1$
$\hat{q}_2 = 1 - \hat{p}_2$
$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ = Pooled Sample Proportion
$\bar{q} = 1 - \bar{p}$
The Test
The goal is to test the hypotheses given by:
$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$ (or $p_1 > p_2$ or $p_1 < p_2$)
In other words, we test whether the proportions for each population are equal vs. whether they are different in some way.
Test Statistic:
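$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$, or $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$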
The critical values and P-Values come from the standard normal distribution.
Decisions are made in exactly the same way as in Chapter 8.
(For the test statistic, we assume $H_0$ is true, which means $p_1 - p_2 = 0$.)
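The pooled test statistic can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes: the function name `two_proportion_z` and the counts passed to it are hypothetical, chosen only to show the calculation.

```python
from math import sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion Z statistic, computed under H0: p1 = p2."""
    p_hat1 = x1 / n1                    # sample proportion, population 1
    p_hat2 = x2 / n2                    # sample proportion, population 2
    p_bar = (x1 + x2) / (n1 + n2)       # pooled sample proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    # Under H0 the term (p1 - p2) is 0, so only the sample difference remains.
    return (p_hat1 - p_hat2) / standard_error


# Hypothetical counts, for illustration only.
z = two_proportion_z(x1=30, n1=120, x2=18, n2=100)
print(f"Z = {z:.3f}")
```

The pooled proportion appears in both terms of the standard error because, under the null hypothesis, both samples estimate the same underlying proportion.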
Example
Close to an election, ballot issues #3 and #4 are very controversial. Researchers want to see whether there is a difference in the
proportions of people who support issue #3 and those who support issue #4. They randomly sample 200 total people. They ask 100 of
these people whether they support issue #3, to which 56 say “yes”. They ask the other 100 people whether they support issue #4, to
which 45 say “yes”. Is there a difference in the proportions of supporters for each issue? Test this with α = 0.05.
Solution
From the information given, we see that:
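$n_1 = 100$, $x_1 = 56$, $\hat{p}_1 = 56/100 = 0.56$, $n_2 = 100$, $x_2 = 45$, $\hat{p}_2 = 45/100 = 0.45$
$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{56 + 45}{100 + 100} = 0.505$, $\bar{q} = 1 - 0.505 = 0.495$
$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$
Test Statistic:
$Z = \dfrac{0.56 - 0.45}{\sqrt{\dfrac{(0.505)(0.495)}{100} + \dfrac{(0.505)(0.495)}{100}}} = \dfrac{0.11}{0.0707} \approx 1.56$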
Traditional Method
Since $H_1$ has a “≠” sign, the test is two-tailed: we want the critical value that has an area of α/2
above it (by symmetry, its negative has an area of α/2 below it) on the standard normal distribution (i.e. we want the value
$Z_{\alpha/2}$).
From the table, this cut-off value with an area of 0.025 above is 1.96.
Now we compare the test statistic to 1.96 and find that Z = 1.56 < 1.96.
This means that Z is not in the critical region.
Therefore, we do not reject $H_0$.
In other words, we conclude that there is not enough
evidence to claim that there is a difference in proportions of supporters for the two
issues.
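The table lookup in the traditional method can be reproduced with Python's standard library. The sketch below assumes the rounded test statistic Z = 1.56 from the example, and the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = 1.56  # rounded test statistic from the ballot example

# Two-tailed test: alpha/2 goes in each tail of the standard normal,
# so the critical value has cumulative area 1 - alpha/2 below it.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_stat) > z_crit
print(f"critical value = {z_crit:.2f}, reject H0: {reject_h0}")
# prints: critical value = 1.96, reject H0: False
```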
P-Value Method
Since $H_1$ has a “≠” sign, we want to find the area above |Z| = 1.56 and below −|Z| = −1.56
for the standard normal distribution.
From the Z-Table, this area is 2(0.0594) = 0.1188.
Now we compare this area to α and see that 0.1188 > 0.05.
Therefore, we do not reject $H_0$.
In other words, we conclude that there is not enough
evidence to claim that there is a difference in proportions of supporters for the two
issues.
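The two-tailed P-value can also be computed directly instead of being read from the Z-Table. This is a small sketch using the rounded statistic Z = 1.56. The result may differ from the table-based 0.1188 in the last decimal place because the table rounds the tail area to four digits.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = 1.56  # rounded test statistic from the ballot example

# Area above |Z|; the area below -|Z| is the same by symmetry.
upper_tail = 1 - NormalDist().cdf(abs(z_stat))
p_value = 2 * upper_tail

print(f"P-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```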
Confidence Interval for $p_1 - p_2$
Sometimes, one would like to estimate the difference between the proportions, rather than just seeing whether the proportions differ
significantly. To construct a confidence interval for $p_1 - p_2$, we have the following:
Point Estimate: $\hat{p}_1 - \hat{p}_2$
Critical Value: $Z_{\alpha/2}$
Standard Error: $\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
This gives the following confidence interval:
$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
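As a rough sketch, the interval can be computed as follows. The function name `proportion_diff_ci` is hypothetical, and applying it to the ballot counts (56 of 100 and 45 of 100) is an illustration added here, not a worked example from the notes. Unlike the test statistic, the standard error uses each sample's own proportion rather than the pooled one.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    q_hat1, q_hat2 = 1 - p_hat1, 1 - p_hat2
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    standard_error = sqrt(p_hat1 * q_hat1 / n1 + p_hat2 * q_hat2 / n2)
    margin = z_crit * standard_error
    point_estimate = p_hat1 - p_hat2
    return point_estimate - margin, point_estimate + margin


# Counts from the ballot-issue example.
low, high = proportion_diff_ci(56, 100, 45, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

For the ballot data the interval runs from roughly -0.028 to 0.248. It contains 0, which agrees with the decision not to reject the null hypothesis in that example.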
Example
Consider the Gloria/Jules hairdresser example from the previous page.
We had the following quantities:
$n_1 = 92$, $x_1 = \ldots$, $\hat{p}_1 = \ldots$, $n_2 = 67$, $x_2 = \ldots$, $\hat{p}_2 = \ldots$
Therefore, a 95% confidence interval would be:
$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
Notice that 0 is not included in this confidence interval. Due to this fact, we could say that there is a difference between the two
proportions (i.e. we would reject $H_0$ in favor of $H_1: p_1 \neq p_2$).
Section 9-3: Inferences About Two Means: Independent Samples
In similar fashion to testing for differences in proportions, one may also wish to test for a difference in the means of two populations.
In the interests of time, we will consider the most general case, where the population standard deviations are unknown, and no
assumptions are made about them. Better hypothesis tests exist when these values are both known, or are unknown but assumed to be
equal. Consult the textbook for more information on these tests.
Again, certain requirements must be met for the techniques discussed in this section to be theoretically sound.
Conditions
$\sigma_1$ and $\sigma_2$ are unknown and no assumption is made about the equality of $\sigma_1$ and $\sigma_2$.
Notation
$\mu_1$ = Population Mean (Population 1)
$\mu_2$ = Population Mean (Population 2)
$n_1$ = Sample Size (Population 1)
$n_2$ = Sample Size (Population 2)
$\bar{x}_1$ = Sample Mean (Population 1)
$\bar{x}_2$ = Sample Mean (Population 2)
$s_1$ = Sample Standard Deviation (Population 1)
$s_2$ = Sample Standard Deviation (Population 2)
Degrees of Freedom = df = min($n_1$, $n_2$) − 1
(so df is still n − 1, but here n is the smaller of the two sample sizes)
The Test
The goal is to test the hypotheses given by:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$ (or $\mu_1 > \mu_2$ or $\mu_1 < \mu_2$)
In other words, we test whether the means for each population are equal vs. whether they are different in some way.
Test Statistic:
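$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, or $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$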
The critical values and P-Values come from the Student t distribution with Degrees of Freedom = min($n_1$, $n_2$) − 1.
Decisions are made in exactly the same way as in Chapter 8.
(For the test statistic, we assume $H_0$ is true, which means $\mu_1 - \mu_2 = 0$.)
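A minimal Python sketch of this test statistic, together with the df = min(n1, n2) − 1 rule used in these notes, is shown here. The function name `two_sample_t` and the summary statistics passed to it are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic for H0: mu1 = mu2 with unpooled sample variances.

    Returns the statistic and the degrees of freedom min(n1, n2) - 1.
    """
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    # Under H0 the term (mu1 - mu2) is 0, so only the sample difference remains.
    t_stat = (mean1 - mean2) / standard_error
    df = min(n1, n2) - 1
    return t_stat, df


# Hypothetical summary statistics, for illustration only.
t_stat, df = two_sample_t(mean1=10.0, s1=2.0, n1=16, mean2=9.0, s2=3.0, n2=25)
print(f"t = {t_stat:.3f}, df = {df}")
```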
Example
One question on everyone’s mind is whether there is a difference in the average number of pets owned by Columbus families and the
average number of pets owned by Cleveland families. Researchers set out to answer this important question. They sampled 35
Columbus families and 48 Cleveland families. Of the Columbus families, the average number of pets was 2.4, with a standard
deviation of 1.4. Of the Cleveland families, the average number of pets was 1.9, with a standard deviation of 0.9. Run a hypothesis
test (α = 0.05) to see if there is a difference in the average number of pets owned by families in the two cities.
Solution
From the information given, we see that:
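$n_1 = 35$, $\bar{x}_1 = 2.4$, $s_1 = 1.4$, $n_2 = 48$, $\bar{x}_2 = 1.9$, $s_2 = 0.9$, df = min($n_1$, $n_2$) − 1 = 34
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Test Statistic:
$t = \dfrac{2.4 - 1.9}{\sqrt{\dfrac{1.4^2}{35} + \dfrac{0.9^2}{48}}} = \dfrac{0.5}{0.270} \approx 1.852$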
Traditional Method
Since $H_1$ has a “≠” sign, the test is two-tailed: we want the critical value that has an area of α/2
above it (by symmetry, its negative has an area of α/2 below it) on the t distribution (i.e. we want the value
$t_{\alpha/2}$ with df = 34).
From the table, this cut-off value with a one-tailed area of 0.025 is 2.032.
Now we compare the test statistic to 2.032 and find that t = 1.852 < 2.032.
This means that t is not in the critical region.
Therefore, we do not reject $H_0$.
There is not sufficient evidence to conclude that the
average number of pets is different in the two cities.
P-Value Method
Since $H_1$ has a “≠” sign, the p-value will be the area above |t| = 1.852 and below
−|t| = −1.852 for the t distribution.
Again, notice that the t-Table does not allow one to directly find this area. However,
for df = 34 we see that 1.691 has a two-tailed area of 0.10 and 2.032 has a two-tailed
area of 0.05.
Since 1.691 < t < 2.032, the p-value will fall between 0.05 and 0.10.
Now we compare this area to α and see that p > 0.05 = α.
Since p > 0.05, we do not reject $H_0$.
There is not sufficient evidence to conclude that
the average number of pets is different in the two cities.
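The bracket read from the t-Table can be checked numerically. The sketch below is not the method used in the notes: it integrates the Student t density with Simpson's rule in plain Python, and the helper names `t_pdf` and `two_tailed_p` are hypothetical. It relies on `math.gamma`, which overflows for very large arguments, so this version is only suitable for moderate degrees of freedom.

```python
from math import gamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of the Student t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 1 minus the area between -|t| and |t|.

    The area over [0, |t|] is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


p_value = two_tailed_p(1.852, df=34)
print(f"p-value = {p_value:.3f}")
print("between 0.05 and 0.10:", 0.05 < p_value < 0.10)
```

A statistics package would normally supply this value directly. The hand-rolled integration is used here only to keep the example self-contained.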
Confidence Interval for $\mu_1 - \mu_2$
As in the previous section, one might like to estimate the difference between the means, rather than just testing for the difference.
To construct a confidence interval for $\mu_1 - \mu_2$, we have the following:
Point Estimate: $\bar{x}_1 - \bar{x}_2$
Critical Value: $t_{\alpha/2}$ (df = min($n_1$, $n_2$) − 1)
Standard Error: $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
This gives the following confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Example
Consider the avg. number of pets example from the previous page.
We had the following quantities:
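$n_1 = 35$, $\bar{x}_1 = 2.4$, $s_1 = 1.4$, $n_2 = 48$, $\bar{x}_2 = 1.9$, $s_2 = 0.9$, df = min($n_1$, $n_2$) − 1 = 34
Therefore, a 95% confidence interval would be:
$(2.4 - 1.9) \pm 2.032\sqrt{\dfrac{1.4^2}{35} + \dfrac{0.9^2}{48}} = 0.5 \pm 0.549$, i.e. $-0.049 < \mu_1 - \mu_2 < 1.049$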
Notice that 0 is included in this confidence interval. Due to this fact, we could say that there is not a significant difference between
the two means (i.e. we do not reject $H_0$ in favor of $H_1: \mu_1 \neq \mu_2$, as in the example on the previous page).
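The interval for the pets example can be reproduced with a short Python sketch. The function name `mean_diff_ci` is hypothetical, and the critical value 2.032 is passed in as read from the t-Table for df = 34 rather than computed.

```python
from math import sqrt


def mean_diff_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 with unpooled sample variances.

    t_crit is the table value t_{alpha/2} for df = min(n1, n2) - 1.
    """
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_crit * standard_error
    point_estimate = mean1 - mean2
    return point_estimate - margin, point_estimate + margin


# Columbus (sample 1) vs. Cleveland (sample 2), 95% confidence, df = 34.
low, high = mean_diff_ci(2.4, 1.4, 35, 1.9, 0.9, 48, t_crit=2.032)
print(f"({low:.3f}, {high:.3f})")  # prints: (-0.049, 1.049)
print("contains 0:", low < 0 < high)
```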