Hypothesis Testing for Population Proportions and Means: Independent Samples, Study notes of Business Statistics

An overview of hypothesis testing for population proportions and means using independent samples. It covers the notation, hypotheses, test statistic, critical values, and decisions for testing the equality of proportions and means. The document also includes examples of testing the difference in proportions between two populations and the difference in means between two populations.

Typology: Study notes

Pre 2010

Uploaded on 07/22/2009

koofers-user-vtx

Chapter 9
Key Ideas
Hypothesis Test (Two Populations)
Section 9-1: Overview
In Chapter 8, discussion centered around hypothesis tests for the proportion, mean, and standard deviation/variance of a single
population. However, often researchers want to compare two different populations. For example, surgeons may want to try out a new
surgical technique for a certain ailment, but they are not sure if it will be better. To test this, they could take two samples of people.
In one sample, they could use the existing technique, and in the other, they could use the new technique. By comparing survival
proportions of the two groups, they could then determine whether the new technique is any better. In this case, the first population is all
people who would receive the existing technique. The second population is all people who would receive the new technique. By
using the methods discussed in this chapter, such inference can be done.
Section 9-2: Inferences About Two Proportions
The methods described in this section are valid only when the following conditions are met.
Conditions
1. The proportions are taken from two simple random samples which are independent. Here, independent means that observations
from the first population are not related to, or paired with, observations from the second population.
2. For each of the two samples, there are at least 5 successes and 5 failures.
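As a small illustration, condition 2 can be checked mechanically before running the test. The sketch below assumes the counts are already known; the function name and the sample counts are hypothetical and do not come from the notes.

```python
def meets_success_failure_condition(x: int, n: int, minimum: int = 5) -> bool:
    """Return True when a sample of size n with x successes has at least
    `minimum` successes and at least `minimum` failures."""
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum


# Hypothetical samples, given as (number of successes, sample size).
samples = [(12, 30), (3, 40)]
results = [meets_success_failure_condition(x, n) for x, n in samples]
print(results)
```

The second hypothetical sample fails the check because it has only 3 successes, so the normal-based test in this section would not be appropriate for it.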
Notation
$p_1$ = Population Proportion (Population 1), $p_2$ = Population Proportion (Population 2)
$n_1$ = Sample Size (Population 1), $n_2$ = Sample Size (Population 2)
$x_1$ = Number of Successes (Population 1), $x_2$ = Number of Successes (Population 2)
$\hat{p}_1 = x_1 / n_1$ = Sample Proportion (Population 1)
$\hat{p}_2 = x_2 / n_2$ = Sample Proportion (Population 2)
$\hat{q}_1 = 1 - \hat{p}_1$, $\hat{q}_2 = 1 - \hat{p}_2$
$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ = Pooled Sample Proportion
$\bar{q} = 1 - \bar{p}$
The Test
The goal is to test the hypotheses given by:
$H_0$: $p_1 = p_2$
$H_1$: $p_1 \neq p_2$ (or $p_1 > p_2$ or $p_1 < p_2$)
In other words, we test whether the proportions for each population are equal vs. whether they are different in some way.
Test Statistic:
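$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}, \quad \text{or} \quad Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$$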
The critical values and P-Values come from the standard normal distribution.
Decisions are made in exactly the same way as in Chapter 8.
(For the test statistic, we assume $H_0$ is true, which means $p_1 - p_2 = 0$.)

Example

Close to an election, ballot issues #3 and #4 are very controversial. Researchers want to see whether there is a difference in the

proportions of people who support issue #3 and those who support issue #4. They randomly sample 200 total people. They ask 100 of

these people whether they support issue #3, to which 56 say “yes”. They ask the other 100 people whether they support issue #4, to

which 45 say “yes”. Is there a difference in the proportions of supporters for each issue? Test this with α = 0.05.

Solution

From the information given, we see that:

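$n_1 = 100$, $x_1 = 56$, $\hat{p}_1 = 56/100 = 0.56$, $n_2 = 100$, $x_2 = 45$, $\hat{p}_2 = 45/100 = 0.45$

$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{56 + 45}{100 + 100} = 0.505$, $\bar{q} = 1 - \bar{p} = 0.495$

$H_0$: $p_1 = p_2$
$H_1$: $p_1 \neq p_2$

Test Statistic:

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} = \frac{0.56 - 0.45}{\sqrt{\dfrac{(0.505)(0.495)}{100} + \dfrac{(0.505)(0.495)}{100}}} = 1.56$$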

Traditional Method

• Since $H_1$ has a “≠” sign, the test is two-tailed: we want the critical value that has an area of α/2 above it, with its negative having an area of α/2 below it, on the standard normal distribution (i.e. we want the value $Z_{\alpha/2}$).
• From the table, this cut-off value with an area of 0.025 above is 1.96.
• Now we compare the test statistic to 1.96 and find that Z = 1.56 < 1.96.
• This means that Z is not in the critical region (shaded area).
• Therefore, we do not reject $H_0$. In other words, we conclude that there is not enough evidence to claim that there is a difference in proportions of supporters for the two issues.
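The traditional method for this example can be sketched in Python as follows. This is only an illustration under the assumption that `statistics.NormalDist` from the standard library stands in for the Z-table; the variable names are ours, not the notes'.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the ballot-issue example.
x1, n1 = 56, 100
x2, n2 = 45, 100
alpha = 0.05

p1_hat = x1 / n1
p2_hat = x2 / n2
p_bar = (x1 + x2) / (n1 + n2)  # pooled sample proportion
q_bar = 1 - p_bar

# Test statistic under H0, where p1 - p2 = 0.
z = (p1_hat - p2_hat) / sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)

# Two-tailed critical value: the point with area alpha/2 above it.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

For a one-sided alternative, the same comparison would use the point with the full area α in a single tail instead of α/2.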

P-Value Method

• Since $H_1$ has a “≠” sign, we want to find the area above |Z| = 1.56 and below -|Z| = -1.56 for the standard normal distribution.
• From the Z-Table, this area is 2(0.0594) = 0.1188.
• Now we compare this area to α and see that 0.1188 > 0.05.
• Therefore, we do not reject $H_0$. In other words, we conclude that there is not enough evidence to claim that there is a difference in proportions of supporters for the two issues.
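The P-value lookup can also be sketched in code. The helper below is a hypothetical illustration (the function name and the `alternative` labels are our own choices); it uses the standard normal distribution from Python's standard library and covers the three forms of $H_1$ listed earlier.

```python
from statistics import NormalDist


def z_test_p_value(z: float, alternative: str = "two-sided") -> float:
    """P-value for a Z statistic under the standard normal distribution.

    alternative: "two-sided" (H1 uses "not equal"), "greater" (H1: p1 > p2)
    or "less" (H1: p1 < p2).
    """
    std_normal = NormalDist()
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    raise ValueError(f"unknown alternative: {alternative!r}")


alpha = 0.05
p_value = z_test_p_value(1.56)  # Z rounded as in the worked example
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With Z = 1.56 the two-sided result agrees with the table value of 0.1188 to four decimal places, so the decision is the same as with the traditional method.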

Confidence Interval for $p_1 - p_2$

Sometimes, one would like to estimate the difference between the proportions, rather than just seeing whether the proportions differ significantly. To construct a confidence interval for $p_1 - p_2$, we have the following:

Point Estimate: $\hat{p}_1 - \hat{p}_2$

Critical Value: $Z_{\alpha/2}$

Standard Error: $\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$

This gives the following confidence interval:

$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
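As a rough illustration of this interval, the sketch below applies it to the ballot-issue counts from the earlier example (56 of 100 and 45 of 100). Pairing the interval with that data is our own choice, not something the notes do, and `statistics.NormalDist` is assumed in place of the Z-table.

```python
from math import sqrt
from statistics import NormalDist

# Ballot-issue counts, reused here purely for illustration.
x1, n1 = 56, 100
x2, n2 = 45, 100
confidence = 0.95

p1_hat, p2_hat = x1 / n1, x2 / n2
q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat

point_estimate = p1_hat - p2_hat
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Unpooled standard error: each sample keeps its own proportion.
standard_error = sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)

margin = z_crit * standard_error
lower, upper = point_estimate - margin, point_estimate + margin
print(f"{confidence:.0%} CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

The interval comes out to roughly (-0.028, 0.248). It contains 0, which is consistent with the earlier decision not to reject $H_0$. Note that the interval uses the separate sample proportions rather than the pooled proportion used in the test statistic.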

Example

Consider the Gloria/Jules hairdresser example from the previous page.

We had the following quantities:

$n_1 = 92$, $x_1$ = …, $\hat{p}_1$ = …, $n_2 = 67$, $x_2$ = …, $\hat{p}_2$ = …

Therefore, a 95% confidence interval would be:

$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \ldots$$

Notice that 0 is not included in this confidence interval. Due to this fact, we could say that there is a difference between the two proportions (i.e. we would reject $H_0$ in favor of $H_1$: $p_1 \neq p_2$).

Section 9-3: Inferences About Two Means: Independent Samples

In similar fashion to testing for differences in proportions, one may also wish to test for a difference in the means of two populations.

In the interests of time, we will consider the most general case, where the population standard deviations are unknown, and no

assumptions are made about them. Better hypothesis tests exist when these values are both known, or are unknown but assumed to be

equal. Consult the textbook for more information on these tests.

Again, certain requirements must be met for the techniques discussed in this section to be theoretically sound.

Conditions

1. $\sigma_1$ and $\sigma_2$ are unknown and no assumption is made about the equality of $\sigma_1$ and $\sigma_2$.
2. The two samples are independent.
3. Both samples are simple random samples.
4. Either both populations are normally distributed or $n_1$ and $n_2$ are both greater than 30.

Notation

$\mu_1$ = Population Mean (Population 1), $\mu_2$ = Population Mean (Population 2)
$n_1$ = Sample Size (Population 1), $n_2$ = Sample Size (Population 2)
$\bar{x}_1$ = Sample Mean (Population 1), $\bar{x}_2$ = Sample Mean (Population 2)
$s_1$ = Sample Standard Deviation (Population 1), $s_2$ = Sample Standard Deviation (Population 2)
Degrees of Freedom = df = $\min(n_1, n_2) - 1$ (so df is still $n - 1$, but here $n$ is the smaller of the two sample sizes)

The Test

The goal is to test the hypotheses given by:

$H_0$: $\mu_1 = \mu_2$
$H_1$: $\mu_1 \neq \mu_2$ (or $\mu_1 > \mu_2$ or $\mu_1 < \mu_2$)

In other words, we test whether the means for each population are equal vs. whether they are different in some way.

Test Statistic:

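$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \quad \text{or} \quad t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The critical values and P-Values come from the Student t distribution with Degrees of Freedom = $\min(n_1, n_2) - 1$.

Decisions are made in exactly the same way as in Chapter 8.

(For the test statistic, we assume $H_0$ is true, which means $\mu_1 - \mu_2 = 0$.)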

Example

One question on everyone’s mind is whether there is a difference in the average number of pets owned by Columbus families and the

average number of pets owned by Cleveland families. Researchers set out to answer this important question. They sampled 35

Columbus families and 48 Cleveland families. Of the Columbus families, the average number of pets was 2.4, with a standard

deviation of 1.4. Of the Cleveland families, the average number of pets was 1.9, with a standard deviation of 0.9. Run a hypothesis

test (α = 0.05) to see if there is a difference in the average number of pets owned by families in the two cities.

Solution

From the information given, we see that:

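$n_1 = 35$, $\bar{x}_1 = 2.4$, $s_1 = 1.4$, $n_2 = 48$, $\bar{x}_2 = 1.9$, $s_2 = 0.9$, df = $\min(n_1, n_2) - 1 = 34$

$H_0$: $\mu_1 = \mu_2$
$H_1$: $\mu_1 \neq \mu_2$

Test Statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{2.4 - 1.9}{\sqrt{\dfrac{1.4^2}{35} + \dfrac{0.9^2}{48}}} = 1.852$$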

Traditional Method

• Since $H_1$ has a “≠” sign, the test is two-tailed: we want the critical value that has an area of α/2 above it, with its negative having an area of α/2 below it, on the t distribution (i.e. we want the value $t_{\alpha/2}$).
• From the table, this cut-off value with a one-tailed area of 0.025 is 2.032.
• Now we compare the test statistic to 2.032 and find that t = 1.852 < 2.032.
• This means that t is not in the critical region (shaded area).
• Therefore, we do not reject $H_0$. There is not sufficient evidence to conclude that the average number of pets is different in the two cities.
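The calculation behind this decision can be sketched with the standard library alone. Because Python's standard library has no t distribution, the critical value 2.032 is taken from the table as quoted in the notes rather than computed; the variable names are our own.

```python
from math import sqrt

# Summary statistics from the pets example.
n1, mean1, s1 = 35, 2.4, 1.4  # Columbus families
n2, mean2, s2 = 48, 1.9, 0.9  # Cleveland families

# Conservative degrees of freedom used in these notes.
df = min(n1, n2) - 1

# Test statistic under H0, where mu1 - mu2 = 0.
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = (mean1 - mean2) / standard_error

# Table value for df = 34 with a one-tailed area of 0.025 (from the notes).
t_crit = 2.032

reject_h0 = abs(t_stat) > t_crit
print(f"df = {df}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```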

P-Value Method

• Since $H_1$ has a “≠” sign, the p-value will be the area above |t| = 1.852 and below -|t| = -1.852 for the t distribution.
• Again, notice that the t-Table does not allow one to directly find this area. However, for df = 34 we see that 1.691 has a two-tailed area of 0.10 and 2.032 has a two-tailed area of 0.05.
• Since 1.691 < t < 2.032, the p-value will fall between 0.05 and 0.10. (see picture)
• Now we compare this area to α and see that p > 0.05 = α.
• Since p > 0.05, we do not reject $H_0$. There is not sufficient evidence to conclude that the average number of pets is different in the two cities.
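Where the t-Table only brackets the p-value, the area can be approximated numerically. The sketch below is one possible approach, not the method used in the notes: it integrates the Student t density with Simpson's rule using only the standard library, and the function names are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of the Student t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p_value(t_stat: float, df: int, steps: int = 2000) -> float:
    """Approximate the area above |t| plus the area below -|t|.

    Simpson's rule (steps must be even) gives the area between 0 and |t|;
    by symmetry the central area is twice that, and the two tails are
    whatever remains.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


p_value = two_tailed_p_value(1.852, df=34)
print(f"approximate two-tailed p-value: {p_value:.4f}")
```

The result should land between 0.05 and 0.10, in line with the bracketing argument above. Feeding in the table cut-offs 1.691 and 2.032 should return areas close to 0.10 and 0.05 respectively.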

Confidence Interval for $\mu_1 - \mu_2$

As in the previous section, one might like to estimate the difference between the means, rather than just testing for the difference.

To construct a confidence interval for $\mu_1 - \mu_2$, we have the following:

Point Estimate: $\bar{x}_1 - \bar{x}_2$

Critical Value: $t_{\alpha/2}$ (df = $\min(n_1, n_2) - 1$)

Standard Error: $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

This gives the following confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example

Consider the avg. number of pets example from the previous page.

We had the following quantities:

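$n_1 = 35$, $\bar{x}_1 = 2.4$, $s_1 = 1.4$, $n_2 = 48$, $\bar{x}_2 = 1.9$, $s_2 = 0.9$, df = $\min(n_1, n_2) - 1 = 34$

Therefore, a 95% confidence interval would be:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (2.4 - 1.9) \pm 2.032\sqrt{\frac{1.4^2}{35} + \frac{0.9^2}{48}} = 0.5 \pm 0.549$$

that is, approximately (-0.049, 1.049).

Notice that 0 is included in this confidence interval. Due to this fact, we could say that there is not a significant difference between the two means (i.e. we do not reject $H_0$ in favor of $H_1$: $\mu_1 \neq \mu_2$, as in the example on the previous page).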
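The interval for the pets example can be reproduced with a short sketch. As before, the critical value 2.032 is taken from the t-Table as quoted in the notes, since the standard library has no t distribution; the variable names are our own.

```python
from math import sqrt

# Summary statistics from the pets example.
n1, mean1, s1 = 35, 2.4, 1.4  # Columbus families
n2, mean2, s2 = 48, 1.9, 0.9  # Cleveland families

# Table value of t for df = min(n1, n2) - 1 = 34 and 95% confidence.
t_crit = 2.032

point_estimate = mean1 - mean2
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
margin = t_crit * standard_error

lower, upper = point_estimate - margin, point_estimate + margin
contains_zero = lower <= 0 <= upper
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f}), contains 0: {contains_zero}")
```

Checking whether the interval contains 0 gives the same decision as the two-tailed test at α = 0.05, because both use the same standard error and the same critical value.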