An overview of inferential statistics for comparing two population means, focusing on independent and paired samples. It covers the identification of the target parameter, the standard deviation requirement, the normal distribution approximation, and the construction of confidence intervals and hypothesis tests for large and small samples. The document also includes examples and calculations for both independent and paired samples.
Typology: Study notes
Target parameter           Key words or phrases                                                                    Type of data
$\mu_1 - \mu_2$            Mean difference; difference in averages                                                 Quantitative
$p_1 - p_2$                Difference between proportions, percentages, fractions, or rates; compare proportions   Qualitative
$\sigma_1^2 / \sigma_2^2$  Ratio of variances; difference in variability or spread; compare variation              Quantitative
                 Parameter          Point estimator            Standard error
Single sample    $\mu$              $\bar{x}$                  $\sigma / \sqrt{n}$
Two samples      $\mu_1 - \mu_2$    $\bar{x}_1 - \bar{x}_2$    $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
To construct a confidence interval or conduct a hypothesis test, we need the standard deviation of the point estimator:
The Sampling Distribution for $\bar{x}_1 - \bar{x}_2$

1. The mean of the sampling distribution is $\mu_1 - \mu_2$.

2. If the two samples are independent, the standard deviation of the sampling distribution (the standard error of the difference) is:

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
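As a quick numerical illustration of the standard error of the difference for independent samples, here is a minimal Python sketch. The function name `se_diff` and the input values are hypothetical, chosen only to show the arithmetic.

```python
import math

def se_diff(sd1, n1, sd2, n2):
    """Standard error of (xbar1 - xbar2) for two independent samples."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Hypothetical inputs (not taken from the notes): sd1 = 2, sd2 = 3, n1 = n2 = 100
se = se_diff(2.0, 100, 3.0, 100)
print(round(se, 4))  # sqrt(0.04 + 0.09) = sqrt(0.13)
```

Because the variances are divided by the sample sizes before being added, the standard error shrinks as either sample grows.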
3. The sampling distribution for $\bar{x}_1 - \bar{x}_2$ is approximately normal for large samples, i.e., when both sample sizes are large (conventionally $n_1 \geq 30$ and $n_2 \geq 30$).

Large Sample Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
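Example: with $n_1 = n_2 = 50$, $\bar{x}_1 = 10.37$, $s_1 = 0.32$, $\bar{x}_2 = 9.83$, and $s_2 = 0.24$, the 95% confidence interval for $\mu_1 - \mu_2$ is

$$(10.37 - 9.83) \pm 1.96\sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.54 \pm 1.96 \times 0.057 = [0.429;\ 0.651]$$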
Note that 0 is not contained in this interval; therefore, at the 95% confidence level, we can conclude that there is a significant difference between the two population means.
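The interval above can be checked with a short Python sketch. It uses the summary statistics from the example (10.37, 0.32, 9.83, 0.24, n = 50 each); the function name `large_sample_ci` is an assumption made for illustration, and the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def large_sample_ci(x1, s1, n1, x2, s2, n2, alpha=0.05):
    """Large-sample CI for mu1 - mu2, using s1 and s2 in place of sigma1 and sigma2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    diff = x1 - x2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # estimated standard error of the difference
    return diff - z * se, diff + z * se

lo, hi = large_sample_ci(10.37, 0.32, 50, 9.83, 0.24, 50)
print(f"[{lo:.3f}; {hi:.3f}]")  # [0.429; 0.651]
print(lo > 0)                   # True: 0 lies outside the interval
```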
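One-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) > D_0$ (or $< D_0$)
Rejection region: $z_o > z_\alpha$ (or $z_o < -z_\alpha$)

Two-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Rejection region: $|z_o| > z_{\alpha/2}$

Test Statistic:

$$z_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{\bar{x}_1 - \bar{x}_2}}, \qquad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$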
Conditions:
The two samples are randomly selected from the target population and independent of each other.
The sample sizes are both large (conventionally $\geq 30$).
Example: Is there enough evidence to think that there is a change in the drug potency after being stored for 1 year? Use $\alpha = 0.05$.

Sample    n     Mean     St. dev.
1         50    10.37    0.32
2         50    9.83     0.24

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$

$$\sigma_{\bar{x}_1 - \bar{x}_2} \cong \sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.057$$

$$z_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(10.37 - 9.83) - 0}{0.057} = 9.55$$

(The value 9.55 is obtained with the unrounded standard error, 0.0566.)

Since $|z_o| = 9.55 > z_{\alpha/2} = 1.96$, we reject $H_0$.

Note that we can also calculate a p-value for this test.
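The test statistic and the p-value mentioned above can be sketched in Python as follows. The function name `two_sample_z` is an assumption for illustration; the inputs are the summary statistics of the drug potency example. The two-sided p-value is computed with `math.erfc`, since $2\,(1 - \Phi(|z|)) = \operatorname{erfc}(|z|/\sqrt{2})$ and this form avoids rounding to zero for very large $|z|$.

```python
import math

def two_sample_z(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((x1 - x2) - d0) / se

z = two_sample_z(10.37, 0.32, 50, 9.83, 0.24, 50)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)

print(round(z, 2))         # 9.55
print(p_two_sided < 0.05)  # True, so H0 is rejected at alpha = 0.05
```

The p-value here is far below any usual significance level, which agrees with the rejection-region conclusion.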
For small samples, the t-distribution can be used with a pooled sample estimator of $\sigma^2$, $s_p^2$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Small Sample Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\,\sigma_{\bar{x}_1 - \bar{x}_2} \approx (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
One-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) > D_0$ (or $< D_0$)
Rejection region: $t_o > t_\alpha$ (or $t_o < -t_\alpha$)

Two-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Rejection region: $|t_o| > t_{\alpha/2}$

Test Statistic:

$$t_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

The critical values $t_\alpha$ and $t_{\alpha/2}$ use $n_1 + n_2 - 2$ degrees of freedom.
Conditions:
The two samples are randomly selected from the target population and independent of each other.
Both sampled populations have distributions that are approximately normal.
The population variances are equal.
Example: An experiment was conducted to evaluate the effectiveness of a treatment for tapeworm in the stomachs of sheep. A random sample of 24 worm-infected lambs of approximately the same age and health was randomly divided into two groups. Twelve of the lambs were injected with the drug and the remaining twelve were left untreated. After a period of months, the following worm counts were recorded:

Group            n     Mean     St. dev.
1 (Drug)         12    26.58    14.36
2 (Untreated)    12    39.67    13.86

Is there enough evidence to think that the drug is effective to treat stomach tapeworms? Use $\alpha = 0.05$.

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 < 0$

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{11(14.36)^2 + 11(13.86)^2}{22}} = 14.11$$

$$t_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(26.58 - 39.67) - 0}{14.11\sqrt{\dfrac{1}{12} + \dfrac{1}{12}}} = -2.272$$

Since $t_o = -2.272 < -t_{\alpha,\,n_1+n_2-2} = -t_{0.05,\,22} = -1.717$, we reject $H_0$.

Note that we can also calculate a p-value for this test.
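The pooled t statistic for the tapeworm example can also be computed from the raw worm counts (the twelve counts per group listed in the SAS data step of these notes). The sketch below is a minimal illustration: the function name `pooled_t` is an assumption, and the critical value 1.717 is taken from a standard t table for 22 degrees of freedom rather than computed.

```python
import math
from statistics import mean, variance

# Worm counts, twelve lambs per group
drug = [18, 43, 28, 50, 16, 32, 13, 35, 38, 33, 6, 7]
untreated = [40, 54, 26, 63, 21, 37, 39, 23, 48, 58, 28, 39]

def pooled_t(a, b, d0=0.0):
    """Pooled two-sample t statistic for H0: mu_a - mu_b = d0."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b) - d0) / se, math.sqrt(sp2), df

t_o, s_p, df = pooled_t(drug, untreated)
T_CRIT = 1.717  # t_{0.05, 22} from a standard t table (assumed, not computed here)

print(round(s_p, 2), df)   # pooled standard deviation and degrees of freedom
print(round(t_o, 2))       # about -2.27
print(t_o < -T_CRIT)       # True: reject H0 in the lower-tailed test
```

Working from the raw data gives a statistic that differs from the hand calculation only in the third decimal, because the hand calculation uses means and standard deviations rounded to two decimals.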
data Tapeworms;
  /* "Untreated" has 9 characters; the default character length of 8 would truncate it */
  length Treatment $ 9;
  input Treatment $ Count;
  /* 12 counts per group, matching n1 = n2 = 12 in the example */
  datalines;
Drug 18
Drug 43
Drug 28
Drug 50
Drug 16
Drug 32
Drug 13
Drug 35
Drug 38
Drug 33
Drug 6
Drug 7
Untreated 40
Untreated 54
Untreated 26
Untreated 63
Untreated 21
Untreated 37
Untreated 39
Untreated 23
Untreated 48
Untreated 58
Untreated 28
Untreated 39
;

/* alpha=0.10 requests 90% confidence limits */
proc ttest data=Tapeworms alpha=0.10;
  class Treatment;
  var Count;
run;

Note that we need to look at the output for the Pooled method and that the p-value reported needs to be divided by 2, as we are dealing with a 1-sided hypothesis.
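With `alpha=0.10`, the confidence limits in the PROC TTEST output are 90% limits. As a rough cross-check of the Pooled interval for the difference Drug minus Untreated, the following Python sketch applies the small-sample formula to the same worm counts. The variable names are illustrative, and the critical value $t_{0.05,\,22} = 1.717$ is taken from a standard t table rather than computed.

```python
import math
from statistics import mean, variance

drug = [18, 43, 28, 50, 16, 32, 13, 35, 38, 33, 6, 7]
untreated = [40, 54, 26, 63, 21, 37, 39, 23, 48, 58, 28, 39]

n1, n2 = len(drug), len(untreated)
# Pooled variance, assuming equal population variances
sp2 = ((n1 - 1) * variance(drug) + (n2 - 1) * variance(untreated)) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))

T_CRIT = 1.717  # t_{alpha/2, 22} with alpha = 0.10, from a standard t table (assumed)
diff = mean(drug) - mean(untreated)
lo, hi = diff - T_CRIT * se, diff + T_CRIT * se

print(f"90% CI for mu1 - mu2: [{lo:.2f}; {hi:.2f}]")
print(hi < 0)  # True: the whole interval lies below 0
```

An interval entirely below zero is consistent with rejecting $H_0$ in favor of a lower mean worm count in the treated group.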
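For paired samples, the analysis is based on the differences within each pair:

$$d_i = x_{i1} - x_{i2}, \qquad E(d_i) = \mu_d = \mu_1 - \mu_2, \qquad V(d_i) = \sigma_d^2$$

Confidence interval for $\mu_d$, large samples:

$$\bar{d} \pm z_{\alpha/2}\frac{\sigma_d}{\sqrt{n_d}} \approx \bar{d} \pm z_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$$

Confidence interval for $\mu_d$, small samples:

$$\bar{d} \pm t_{\alpha/2,\,n_d-1}\frac{s_d}{\sqrt{n_d}}$$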
Example: Long distance runners have contended that moderate exposure to ozone increases lung capacity. To investigate this possibility, a researcher exposed 12 rats to ozone, at a fixed rate in parts per million, over a period of days. The lung capacity of each rat was determined at the beginning of the study and again after the ozone exposure. The lung capacities (in ml) are given below:

Rat         Before    After     Diff
1           8.7       9.4       -0.7
2           7.9       9.8       -1.9
3           8.3       9.9       -1.6
4           8.4       10.3      -1.9
5           9.2       8.9        0.3
6           9.1       8.8        0.3
7           8.2       9.8       -1.6
8           8.1       8.2       -0.1
9           8.9       9.4       -0.5
10          8.2       9.9       -1.7
11          8.9       12.2      -3.3
12          7.5       9.3       -1.8
Mean        8.450     9.658     -1.208
St. dev.    0.516     0.988      1.077

Obtain a 95% confidence interval for the effect in lung capacity in this study.

Here $d_i = x_{i1} - x_{i2}$ (Before minus After), $n_d = 12$ (12 pairs, not the 24 individual measurements), and $\alpha = 0.05$:

$$\bar{d} \pm t_{\alpha/2,\,n_d-1}\frac{s_d}{\sqrt{n_d}} = -1.208 \pm t_{0.025,\,11}\frac{1.077}{\sqrt{12}} = -1.208 \pm 2.201 \times 0.311 \approx [-1.89;\ -0.52]$$
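The paired confidence interval can be reproduced from the Before and After columns with a short Python sketch. The variable names are illustrative, and the critical value $t_{0.025,\,11} = 2.201$ is taken from a standard t table rather than computed.

```python
import math
from statistics import mean, stdev

before = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
after = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]

# Paired differences, Before minus After (one per rat)
diffs = [b - a for b, a in zip(before, after)]
n_d = len(diffs)
d_bar, s_d = mean(diffs), stdev(diffs)

T_CRIT = 2.201  # t_{0.025, 11} from a standard t table (assumed, not computed here)
half_width = T_CRIT * s_d / math.sqrt(n_d)
lo, hi = d_bar - half_width, d_bar + half_width

print(round(d_bar, 3), round(s_d, 3))        # mean and st. dev. of the differences
print(f"95% CI for mu_d: [{lo:.2f}; {hi:.2f}]")
```

Only the twelve differences enter the calculation, which is why the degrees of freedom are $n_d - 1 = 11$ rather than 22.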
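One-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d < D_0$ (or $> D_0$)
Rejection region: $t_o < -t_\alpha$ (or $t_o > t_\alpha$)

Two-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d \neq D_0$
Rejection region: $|t_o| > t_{\alpha/2}$

Test Statistic:

for small sample sizes

$$t_o = \frac{\bar{d} - D_0}{s_d / \sqrt{n_d}}$$

for large sample sizes

$$z_o = \frac{\bar{d} - D_0}{\sigma_d / \sqrt{n_d}}$$

The t critical values use $n_d - 1$ degrees of freedom; for large samples, $\sigma_d$ is replaced by $s_d$ and the t critical values by the corresponding z values.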
Example: Is there sufficient evidence to support the conjecture that ozone exposure increases lung capacity? Use $\alpha = 0.05$ and report the p-value of your test.

$H_0: \mu_d = 0$
$H_a: \mu_d < 0$ (this depends on how Diff was obtained; here Diff = Before minus After)

With $\bar{d} = -1.208$, $s_d = 1.077$, and $n_d = 12$:

$$t_o = \frac{\bar{d} - 0}{s_d / \sqrt{n_d}} = \frac{-1.208}{1.077 / \sqrt{12}} = -3.89$$

$$t_{\alpha,\,n_d-1} = t_{0.05,\,11} = 1.796$$

Since $t_o = -3.89 < -1.796$, we reject $H_0$.

$$p\text{-value} = P(t > |t_o|) = P(t > 3.89) = 0.00126$$
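The paired test statistic and its one-sided p-value can be checked without a statistics package. The sketch below is one possible approach, not the method used in the notes: it integrates the Student t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are assumptions for illustration.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

before = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
after = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]
diffs = [b - a for b, a in zip(before, after)]  # Before minus After

n_d = len(diffs)
t_o = mean(diffs) / (stdev(diffs) / math.sqrt(n_d))  # D0 = 0
p_value = t_upper_tail(abs(t_o), n_d - 1)            # one-sided p-value

print(round(t_o, 2))    # about -3.89
print(p_value < 0.05)   # True: reject H0 at alpha = 0.05
```

The p-value obtained this way is close to the 0.00126 reported above; small differences come from rounding $t_o$ to 3.89 in the hand calculation.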
data Lung;
  input Rat Before After;
  datalines;
1 8.7 9.4
2 7.9 9.8
3 8.3 9.9
4 8.4 10.3
5 9.2 8.9
6 9.1 8.8
7 8.2 9.8
8 8.1 8.2
9 8.9 9.4
10 8.2 9.9
11 8.9 12.2
12 7.5 9.3
;

proc ttest data=Lung alpha=0.05;
  paired Before*After;
run;

The PAIRED statement analyzes the differences Before minus After. The output gives the 95% confidence interval for the effect in lung capacity and the t test of $H_0: \mu_d = 0$.