
Comparing Population Means: Inferences, Confidence Intervals, and Hypothesis Tests - Prof., Study notes of Data Analysis & Statistical Methods

An overview of inferential statistics for comparing two population means, focusing on independent and paired samples. It covers the identification of the target parameter, the standard deviation requirement, the normal distribution approximation, and the construction of confidence intervals and hypothesis tests for large and small samples. The document also includes examples and calculations for both independent and paired samples.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009

koofers-user-07o

Chapter - Part A
Inferences Based on Two Samples: Confidence Intervals and Tests of Hypotheses

Identifying the Target Parameter

Parameter | Key words or phrases | Type of data
$\mu_1 - \mu_2$ | Mean difference; difference in averages | Quantitative data
$p_1 - p_2$ | Difference between proportions, percentages, fractions or rates; compare proportions | Qualitative data
$\sigma_1^2 / \sigma_2^2$ | Ratio of variances; difference in variability or spread; compare variation | Quantitative data

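Comparing Two Population Means: Independent Sampling

To construct a confidence interval or conduct a hypothesis test, we need the standard deviation (standard error) of the point estimator:

Single sample: point estimator $\bar{x}$, with estimated standard error
$$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}$$

Two samples: point estimator $\bar{x}_1 - \bar{x}_2$, with estimated standard error
$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$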

The Sampling Distribution for Independent Sampling

1. The mean of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$.

2. If the two samples are independent, the standard deviation of the sampling distribution (the standard error of the difference) is
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}, \qquad \text{estimated by} \qquad SED = \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

3. The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal for large samples, i.e.
$$\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
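The three properties above can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: the population means and standard deviations are hypothetical values chosen only for illustration, and the function names are made up for this example. It draws many pairs of independent samples and compares the spread of $\bar{x}_1 - \bar{x}_2$ with the formula $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

```python
import math
import random
import statistics


def standard_error_of_difference(sd1, n1, sd2, n2):
    """Standard error of x1_bar - x2_bar for two independent samples."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def simulate_differences(mu1, sd1, n1, mu2, sd2, n2, reps, seed=0):
    """Draw `reps` pairs of independent normal samples and return the
    list of observed differences x1_bar - x2_bar."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        mean1 = statistics.fmean(rng.gauss(mu1, sd1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sd2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return diffs


# Hypothetical population values (assumptions for illustration only).
theoretical_se = standard_error_of_difference(0.32, 50, 0.24, 50)
diffs = simulate_differences(10.4, 0.32, 50, 9.8, 0.24, 50, reps=2000)
simulated_mean = statistics.fmean(diffs)   # should be close to mu1 - mu2 = 0.6
simulated_se = statistics.stdev(diffs)     # should be close to theoretical_se

print(f"theoretical SE = {theoretical_se:.4f}")
print(f"simulated mean = {simulated_mean:.4f}, simulated SE = {simulated_se:.4f}")
```

With a few thousand replications the simulated mean and standard deviation of the differences should land close to the theoretical values; a histogram of `diffs` would look roughly bell-shaped.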

Large Sample Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example (Part I)

Company officials are concerned about the length of time a particular drug retains its potency. A random sample (sample 1) of $n_1 = 50$ bottles of the product is drawn from the current production and analyzed for potency. A second sample (sample 2) of $n_2 = 50$ bottles is obtained, stored for 1 year, and then analyzed. The summary readings are:

Sample 1: $n_1 = 50$, $\bar{x}_1 = 10.37$, $s_1 = \hat{\sigma}_1 = 0.32$
Sample 2: $n_2 = 50$, $\bar{x}_2 = 9.83$, $s_2 = \hat{\sigma}_2 = 0.24$

Obtain a 95% confidence interval for the difference between the two population means.

Example (Part I): solution

With $n_1 = n_2 = 50$, $\bar{x}_1 = 10.37$, $\bar{x}_2 = 9.83$, $s_1 = 0.32$, $s_2 = 0.24$ and $\alpha = 0.05$ (so $z_{\alpha/2} = 1.96$):

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$= (10.37 - 9.83) \pm 1.96\sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.54 \pm 1.96 \times 0.0566 = [0.429;\ 0.651]$$

Here the standard error is $0.0566 \approx 0.057$.

Note that 0 is not contained in this interval; therefore, at the 95% confidence level, we can conclude that there is a significant difference between the two population means.
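The interval above can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function name `large_sample_ci` is made up for this illustration, and the sample standard deviations are used in place of the unknown population values, as in the example.

```python
import math
from statistics import NormalDist


def large_sample_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (independent samples).

    Uses the z critical value and the standard error
    sqrt(s1^2/n1 + s2^2/n2).
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)        # z_{alpha/2}
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)      # standard error
    diff = mean1 - mean2                               # point estimate
    return diff - z * se, diff + z * se


# Summary readings from Example (Part I).
lower, upper = large_sample_ci(10.37, 0.32, 50, 9.83, 0.24, 50)
print(f"95% CI for mu1 - mu2: [{lower:.3f}; {upper:.3f}]")
```

Because the whole interval lies above 0, the code leads to the same conclusion as the hand calculation.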

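Comparing Two Population Means: Independent Sampling (large-sample test of hypothesis)

One-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) > D_0$ (or $< D_0$)
Rejection region: $z_0 > z_\alpha$ (or $z_0 < -z_\alpha$), that is, $|z_0| > z_\alpha$ in the direction stated by $H_a$

Two-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Rejection region: $|z_0| > z_{\alpha/2}$

Test Statistic:
$$z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{\bar{x}_1 - \bar{x}_2}}, \qquad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Conditions:
The two samples are randomly selected from the target populations and are independent of each other.
The sample sizes are both large.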

Example (Part II)

Is there enough evidence to think that there is a change in the drug potency after being stored for 1 year? Use $\alpha = 0.05$.

Sample 1: $n_1 = 50$, $\bar{x}_1 = 10.37$, $s_1 = \hat{\sigma}_1 = 0.32$
Sample 2: $n_2 = 50$, $\bar{x}_2 = 9.83$, $s_2 = \hat{\sigma}_2 = 0.24$

$H_0: (\mu_1 - \mu_2) = 0$ or $H_0: \mu_1 = \mu_2$
$H_a: (\mu_1 - \mu_2) \neq 0$ or $H_a: \mu_1 \neq \mu_2$

$$\sigma_{\bar{x}_1 - \bar{x}_2} \approx \sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.057$$
$$z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(10.37 - 9.83) - 0}{0.057} = 9.55$$

The value 9.55 is obtained with the unrounded standard error (0.0566) in the division.

Since $9.55 = |z_0| > z_{\alpha/2} = 1.96$, we reject $H_0$.

Note that we can also calculate a p-value for this test.
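The test above, including the p-value mentioned in the last sentence, can be sketched as follows. This is an illustrative sketch using only the Python standard library; the function name `two_sample_z_test` is an assumption of this example, not something taken from the notes.

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Large-sample z test of H0: mu1 - mu2 = d0 (independent samples).

    Returns the test statistic z0 and the two-sided p-value.
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z0 = ((mean1 - mean2) - d0) / se
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z0)))
    return z0, p_two_sided


alpha = 0.05
z0, p_value = two_sample_z_test(10.37, 0.32, 50, 9.83, 0.24, 50)
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}, about 1.96
reject = abs(z0) > z_crit

print(f"z0 = {z0:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
print(f"two-sided p-value = {p_value:.3g}")
```

With a statistic this far into the tail, the p-value is so small that it underflows to zero in double precision, which agrees with the decision to reject $H_0$.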

Comparing Two Population Means: Independent Sampling (small samples)

For small samples, the t distribution can be used with a pooled sample estimator of $\sigma^2$, $s_p^2$:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Small Sample Confidence Interval for $\mu_1 - \mu_2$
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\,\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

The value of $t$ is based on $df = n_1 + n_2 - 2$.

Comparing Two Population Means: Independent Sampling (small-sample test of hypothesis)

One-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) > D_0$ (or $< D_0$)
Rejection region: $t_0 > t_\alpha$ (or $t_0 < -t_\alpha$), that is, $|t_0| > t_\alpha$ in the direction stated by $H_a$

Two-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Rejection region: $|t_0| > t_{\alpha/2}$

Test Statistic:
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Conditions:
The two samples are randomly selected from the target populations and are independent of each other.
Both sampled populations have distributions that are approximately normal.
The population variances are equal.

Example

An experiment was conducted to evaluate the effectiveness of a treatment for tapeworm in the stomachs of sheep. A random sample of 24 worm-infected lambs of approximately the same age and health was randomly divided into two groups. Twelve of the lambs were injected with the drug and the remaining twelve were left untreated. After a period of months, the following worm counts were recorded:

Drug (sample 1): $n_1 = 12$, $\bar{x}_1 = 26.58$, $s_1 = 14.36$
Untreated (sample 2): $n_2 = 12$, $\bar{x}_2 = 39.67$, $s_2 = 13.86$

Is there enough evidence to think that the drug is effective to treat stomach tapeworms? Use $\alpha = 0.05$.

Example: solution

$n_1 = 12$, $\bar{x}_1 = 26.58$, $s_1 = 14.36$; $n_2 = 12$, $\bar{x}_2 = 39.67$, $s_2 = 13.86$; $\alpha = 0.05$

$H_0: (\mu_1 - \mu_2) = 0$
$H_a: (\mu_1 - \mu_2) < 0$

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{11 \times 14.36^2 + 11 \times 13.86^2}{12 + 12 - 2} \approx 199.2, \qquad s_p = 14.11$$

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(26.58 - 39.67) - 0}{14.11\sqrt{\dfrac{1}{12} + \dfrac{1}{12}}} = -2.272$$

Since $2.272 = |t_0| > t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,22} = 1.717$ and $t_0$ is negative, we reject $H_0$.

Note that we can also calculate a p-value for this test.
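As a cross-check on the hand calculation, the pooled variance and the test statistic can be computed directly from the raw worm counts. The sketch below is illustrative only: the function name is made up, the twelve counts per group are the ones given in the SAS data step of this example (consistent with $n_1 = n_2 = 12$ and the reported means), and the critical value $t_{0.05,22} = 1.717$ is hard-coded from a t table because the Python standard library has no t quantile function.

```python
import math
import statistics

# Worm counts per lamb (12 treated, 12 untreated).
drug = [18, 43, 28, 50, 16, 32, 13, 35, 38, 33, 6, 7]
untreated = [40, 54, 26, 63, 21, 37, 39, 23, 48, 58, 28, 39]


def pooled_t_statistic(sample1, sample2, d0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = d0.

    Returns (t0, pooled standard deviation, degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # s1^2 (divisor n1 - 1)
    var2 = statistics.variance(sample2)   # s2^2 (divisor n2 - 1)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = statistics.fmean(sample1) - statistics.fmean(sample2)
    return (diff - d0) / se, math.sqrt(pooled_var), n1 + n2 - 2


t0, s_p, df = pooled_t_statistic(drug, untreated)

T_CRIT_05_22 = 1.717          # tabulated t_{0.05, 22} (assumed from a t table)
reject = t0 < -T_CRIT_05_22   # lower-tailed test, Ha: mu1 - mu2 < 0

print(f"s_p = {s_p:.2f}, df = {df}, t0 = {t0:.3f}, reject H0: {reject}")
```

Working from unrounded means and variances gives a statistic of about -2.271, while the slide's -2.272 comes from rounded summary values; the decision is the same.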

Example (SAS)

$H_0: (\mu_1 - \mu_2) = 0$, $H_a: (\mu_1 - \mu_2) < 0$, $\alpha = 0.05$

/* One row per lamb: treatment group and worm count */
data Tapeworms;
  input Treatment $ Count;
  datalines;
Drug 18
Drug 43
Drug 28
Drug 50
Drug 16
Drug 32
Drug 13
Drug 35
Drug 38
Drug 33
Drug 6
Drug 7
Untreated 40
Untreated 54
Untreated 26
Untreated 63
Untreated 21
Untreated 37
Untreated 39
Untreated 23
Untreated 48
Untreated 58
Untreated 28
Untreated 39
;
/* Two-sample t test comparing mean Count between the Treatment groups */
proc ttest data=Tapeworms alpha=0.10;
  class Treatment;
  var Count;
run;

Note that we need to look at the output for the Pooled method, and that the p-value reported needs to be divided by 2, as we are dealing with a 1-sided hypothesis.
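The halving rule in the note can be illustrated numerically. The sketch below is an assumption-laden illustration, not SAS output: it approximates the tail area of the t distribution by integrating its density with Simpson's rule (the helper names are made up), and it uses the statistic $t_0 = -2.272$ with 22 degrees of freedom from the pooled test above.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    `steps` must be even. The area from 0 to t is subtracted from 0.5.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3.0


t0, df = -2.272, 22   # pooled t statistic and degrees of freedom from the example

# Two-sided p-value, as a default two-sided report would give it.
p_two_sided = 2.0 * t_upper_tail(abs(t0), df)

# One-sided p-value for Ha: mu1 - mu2 < 0. Halving is valid here because
# the observed statistic falls on the side named by Ha (t0 < 0).
p_one_sided = p_two_sided / 2.0

print(f"two-sided p = {p_two_sided:.4f}, one-sided p = {p_one_sided:.4f}")
```

Halving only applies when the sign of the statistic agrees with the alternative; if $t_0$ had been positive here, the one-sided p-value would be one minus half of the two-sided value.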

Comparing Two Population Means: Paired Difference Experiments

Also known as paired data.
Pairs of observations are dependent (i.e. correlated).
A paired experiment can provide more information about the difference between population means than an independent samples experiment.
The population means are compared by looking at the differences between pairs of experimental units that were similar prior to the experiment.
Differencing removes some sources of variation (mainly those that make the paired observations correlated).

Paired Difference Confidence Interval for $\mu_d = \mu_1 - \mu_2$

New variable: $d_i = x_{i1} - x_{i2}$, with
$$E(d_i) = \mu_d = \mu_1 - \mu_2, \qquad V(d_i) = \sigma_d^2$$

Large Sample:
$$\bar{x}_d \pm z_{\alpha/2}\frac{\sigma_d}{\sqrt{n_d}} \approx \bar{x}_d \pm z_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$$

Small Sample:
$$\bar{x}_d \pm t_{\alpha/2,\,n_d-1}\frac{s_d}{\sqrt{n_d}}$$

where
$\bar{x}_d$ = sample mean difference
$s_d$ = sample standard deviation of differences
$n_d$ = number of pairs observed

The value of $t$ is based on $df = n_d - 1$.

Example

Long distance runners have contended that moderate exposure to ozone increases lung capacity. To investigate this possibility, a researcher exposed 12 rats to ozone at a fixed rate (in parts per million) over a period of days. The lung capacity of each rat was determined at the beginning of the study and again after the days of ozone exposure. The lung capacities (in ml) are given below:

Rat   Before   After
1     8.7      9.4
2     7.9      9.8
3     8.3      9.9
4     8.4      10.3
5     9.2      8.9
6     9.1      8.8
7     8.2      9.8
8     8.1      8.2
9     8.9      9.4
10    8.2      9.9
11    8.9      12.2
12    7.5      9.3

Obtain a 95% confidence interval for the effect in lung capacity in this study.

Example: solution

$d_i = x_{i1} - x_{i2}$ (Before minus After), $n = 24$ observations, $n_d = 12$ pairs, $\alpha = 0.05$

Rat        Before   After    Diff
1          8.7      9.4      -0.7
2          7.9      9.8      -1.9
3          8.3      9.9      -1.6
4          8.4      10.3     -1.9
5          9.2      8.9      0.3
6          9.1      8.8      0.3
7          8.2      9.8      -1.6
8          8.1      8.2      -0.1
9          8.9      9.4      -0.5
10         8.2      9.9      -1.7
11         8.9      12.2     -3.3
12         7.5      9.3      -1.8
Mean       8.450    9.658    -1.208
St. dev.   0.516    0.988    1.077

$$\bar{x}_d \pm t_{\alpha/2,\,n_d-1}\frac{s_d}{\sqrt{n_d}} = -1.208 \pm t_{0.025,\,11}\frac{1.077}{\sqrt{12}} = -1.208 \pm 2.201 \times 0.311 = -1.208 \pm 0.684 = [-1.89;\ -0.52]$$
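The paired interval can be recomputed from the raw Before and After readings. This is a minimal sketch using the Python standard library; the variable names are illustrative, and the critical value $t_{0.025,11} = 2.201$ is hard-coded from a t table (an assumption of this sketch, since the standard library has no t quantile function).

```python
import math
import statistics

# Lung capacities (ml) for the 12 rats, before and after ozone exposure.
before = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
after = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]

# Paired differences d_i = x_i1 - x_i2 (Before minus After).
diffs = [b - a for b, a in zip(before, after)]

n_d = len(diffs)                 # number of pairs
mean_d = statistics.fmean(diffs) # sample mean difference
sd_d = statistics.stdev(diffs)   # sample standard deviation of differences

T_CRIT_025_11 = 2.201            # tabulated t_{0.025, 11} (assumed from a t table)
margin = T_CRIT_025_11 * sd_d / math.sqrt(n_d)
lower, upper = mean_d - margin, mean_d + margin

print(f"mean diff = {mean_d:.3f}, sd = {sd_d:.3f}, n_d = {n_d}")
print(f"95% CI for mu_d: [{lower:.2f}; {upper:.2f}]")
```

Since Diff is Before minus After, an interval that lies entirely below zero points to higher lung capacity after exposure.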

Comparing Two Population Means: Paired Difference Experiments

Hypothesis test for $\mu_d = \mu_1 - \mu_2$

One-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d < D_0$ (or $> D_0$)
Rejection region: $t_0 < -t_\alpha$ (or $t_0 > t_\alpha$)

Two-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d \neq D_0$
Rejection region: $|t_0| > t_{\alpha/2}$

Test Statistic:
$$t_0 = \frac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}} \quad \text{for small sample sizes}$$
$$z_0 = \frac{\bar{x}_d - D_0}{\sigma_d/\sqrt{n_d}} \quad \text{for large sample sizes}$$

Example

Is there sufficient evidence to support the conjecture that ozone exposure increases lung capacity? Use $\alpha = 0.05$ and report the p-value of your test.

From the Diff column (Before minus After) of the rat data: $\bar{x}_d = -1.208$, $s_d = 1.077$, $n_d = 12$.

$H_0: \mu_d = 0$
$H_a: \mu_d < 0$ (this depends on how Diff was obtained; with Diff = Before minus After, an increase in lung capacity corresponds to a negative mean difference)

$$t_0 = \frac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}} = \frac{-1.208 - 0}{1.077/\sqrt{12}} = -3.89$$

$$t_{\alpha,\,n_d-1} = t_{0.05,\,11} = 1.796$$

Since $3.89 = |t_0| > t_\alpha = 1.796$ and $t_0$ is negative, we reject $H_0$.

$$p\text{-value} = P(t > |t_0|) = P(t > 3.89) = 0.00126$$

The p-value is below $\alpha = 0.05$; hence, we reject $H_0$.
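The paired test and its p-value can be sketched in code as follows. This is an illustration under stated assumptions: the critical value 1.796 is the $t_{0.05,11}$ quoted in the example, and the tail area is approximated by integrating the t density with Simpson's rule, using helper functions whose names are made up for this sketch.

```python
import math
import statistics

before = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
after = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 via Simpson's rule (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3.0


# Differences d_i = Before - After, so Ha: mu_d < 0 means capacity increased.
diffs = [b - a for b, a in zip(before, after)]
n_d = len(diffs)
d0 = 0.0
t0 = (statistics.fmean(diffs) - d0) / (statistics.stdev(diffs) / math.sqrt(n_d))

T_CRIT_05_11 = 1.796                    # t_{0.05, 11}, as quoted in the example
reject = t0 < -T_CRIT_05_11             # lower-tailed rejection region
p_value = t_upper_tail(abs(t0), n_d - 1)  # P(t > |t0|); valid because t0 < 0

print(f"t0 = {t0:.3f}, reject H0: {reject}, one-sided p-value = {p_value:.5f}")
```

The computed p-value should agree with the reported 0.00126 up to the rounding of $t_0$.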

Example (SAS)

/* One row per rat: lung capacity (ml) before and after ozone exposure */
data Lung;
  input Rat Before After;
  datalines;
1 8.7 9.4
2 7.9 9.8
3 8.3 9.9
4 8.4 10.3
5 9.2 8.9
6 9.1 8.8
7 8.2 9.8
8 8.1 8.2
9 8.9 9.4
10 8.2 9.9
11 8.9 12.2
12 7.5 9.3
;
/* Paired t analysis of the differences Before - After */
proc ttest data=Lung alpha=0.05;
  paired Before*After;
run;

The output reports a 95% confidence interval for the effect in lung capacity.
Hypotheses of interest: $H_0: \mu_d = 0$, $H_a: \mu_d < 0$