



These notes cover sampling distributions, focusing on the sampling distribution of a proportion and the sampling distribution of the mean. They explain how to calculate the probability that a sample statistic is close to the population parameter, using the binomial distribution and its normal approximation, including the continuity correction.
Cécile Ané
Stat 371
Spring 2006
1. Introduction
2. Sampling distribution of a proportion
3. Sampling distribution of the mean
4. Normal approximation to the binomial
5. The continuity correction
What does it mean to take a sample of size $n$?
$Y_1, \dots, Y_n$ form a random sample if they are independent and have a common distribution.
From a sample, we can calculate a sample statistic such as the sample mean $\bar{Y}$.
$\bar{Y}$ is random too! It can differ from sample to sample. The textbook refers to a meta-experiment.
The distribution of $\bar{Y}$ is called a sampling distribution.
Example: cross of two heterozygotes, $Aa \times Aa$. Probability distribution of the offspring's genotype:

Offspring genotype   AA     Aa     aa
Probability          1/4    1/2    1/4

An offspring is dominant if it has genotype $AA$ or $Aa$.
Experiment: get $n = 2$ offspring, count the number $Y$ of dominant offspring, and calculate the sample proportion $\hat{p} = Y/2$.
We would like $\hat{p}$ to be close to the "true" value $p = 0.75$, but $\hat{p}$ is random.
Distribution of $\hat{p}$ (from the binomial distribution):

$Y$           0        1        2
$\hat{p}$     0        0.5      1
Probability   0.0625   0.375    0.5625
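The distribution of $\hat{p}$ for $n = 2$ can be checked numerically. The sketch below assumes that $Y$, the number of dominant offspring, is Binomial(2, 0.75), where $p = 0.75$ is the total probability of genotypes $AA$ and $Aa$ in an $Aa \times Aa$ cross. The variable names are illustrative only.

```r
# Sketch: sampling distribution of p-hat for n = 2 offspring.
# Assumption: Y ~ Binomial(n, p) with p = 0.75 (genotype AA or Aa).
n <- 2
p <- 0.75
y <- 0:n                                 # possible counts of dominant offspring
p_hat <- y / n                           # corresponding sample proportions
prob <- dbinom(y, size = n, prob = p)    # probability of each count
dist_n2 <- data.frame(y = y, p_hat = p_hat, prob = prob)
print(dist_n2)
```

The `prob` column sums to 1, and each row gives the probability of one possible value of $\hat{p}$.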
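Larger sample size: $n = 20$, with $\hat{p} = Y/20$ the sample proportion.
We still want $\hat{p}$ to be close to the "true" value $p = 0.75$, and $\hat{p}$ is still random.
What is the probability that $\hat{p}$ is within 0.05 of $p$? Translate into a binomial question:
$\mathbb{P}\{0.70 \le \hat{p} \le 0.80\} = \mathbb{P}\{14 \le Y \le 16\} \approx 0.56$.
A sample size of 20 is better than a sample size of 2! (With $n = 2$, $\hat{p}$ can only equal 0, 0.5 or 1, so the same probability is 0.)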
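To compare the two sample sizes directly, one can sum the binomial probabilities of all counts $Y$ for which $Y/n$ lies within 0.05 of $p$. This is a minimal sketch assuming $Y \sim$ Binomial$(n, 0.75)$; the helper name `prob_close` is hypothetical.

```r
# Sketch: P(p-hat within tol of p) for a given sample size.
# Assumption: Y ~ Binomial(n, p) with p = 0.75.
p <- 0.75
prob_close <- function(n, p, tol = 0.05) {
  y <- 0:n
  # small slack guards against floating-point error at the boundaries
  close <- abs(y / n - p) <= tol + 1e-9
  sum(dbinom(y[close], size = n, prob = p))
}
close_n2  <- prob_close(2, p)    # no possible value of p-hat is within 0.05 of 0.75
close_n20 <- prob_close(20, p)   # same event as 14 <= Y <= 16
c(n2 = close_n2, n20 = close_n20)
```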
Example: weight of seeds of some variety of beans. Sample size $n = 4$.
[Table: Student # | Observations | sample mean $\bar{y}$ (entries not recoverable)]
$\bar{Y}$ is random. How do we know its distribution?
We will see 3 key facts.
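If $Y_1, \dots, Y_n$ is a random sample, and if the $Y_i$'s have mean $\mu$ and standard deviation $\sigma$, then $\bar{Y}$ has mean $\mu_{\bar{Y}} = \mu$ and variance $\operatorname{var}(\bar{Y}) = \sigma^2/n$, i.e. standard deviation $\sigma_{\bar{Y}} = \sigma/\sqrt{n}$.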
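Both formulas follow from linearity of the mean and, for the variance, from the independence of the observations. A short derivation, using only the random-sample assumptions stated here:

```latex
\begin{align*}
\mu_{\bar{Y}} &= \mathrm{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} Y_i\right]
  = \frac{1}{n}\sum_{i=1}^{n} \mathrm{E}[Y_i] = \frac{n\mu}{n} = \mu, \\
\operatorname{var}(\bar{Y}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{var}(Y_i)
  = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n},
\qquad \sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```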
Seed weight example: assume beans have mean $\mu = 500$ mg and $\sigma = 120$ mg. In a sample of size $n = 4$, the sample mean $\bar{Y}$ has mean $\mu_{\bar{Y}} = 500$ mg and standard deviation $\sigma_{\bar{Y}} = 120/\sqrt{4} = 60$ mg.
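A small simulation can make the meta-experiment concrete: draw many samples of 4 seeds and look at the spread of the sample means. This sketch assumes, for simulation purposes only, that individual weights are normally distributed; the seed value and the number of replicates are arbitrary choices.

```r
# Sketch: simulate many samples of n = 4 seed weights and summarize the sample means.
# Assumptions: weights drawn from a normal distribution (for the simulation only),
# with mu = 500 mg and sigma = 120 mg; n_rep is an arbitrary number of replicates.
set.seed(371)
mu <- 500; sigma <- 120; n <- 4
n_rep <- 10000
sample_means <- replicate(n_rep, mean(rnorm(n, mean = mu, sd = sigma)))
theory_sd <- sigma / sqrt(n)   # 120 / 2 = 60 mg
c(mean_of_means = mean(sample_means),
  sd_of_means   = sd(sample_means),
  theory_sd     = theory_sd)
```

The simulated mean and standard deviation of the sample means should land close to 500 mg and 60 mg, up to simulation noise.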
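If $Y_1, \dots, Y_n$ is a random sample, and if the $Y_i$'s are all from $N(\mu, \sigma)$, then $\bar{Y} \sim N(\mu, \sigma/\sqrt{n})$.
Actually, $Y_1 + \dots + Y_n \sim N(n\mu, \sqrt{n}\,\sigma)$ too.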
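When the observations are normal, probabilities about the sample mean come straight from `pnorm` with standard deviation $\sigma/\sqrt{n}$. The sketch below reuses the seed weight numbers and assumes, hypothetically, that individual weights are $N(500, 120)$; the 50 mg window is an illustrative choice, not a value from the notes.

```r
# Sketch: if seed weights were N(500, 120), the mean of n = 4 seeds is N(500, 60).
# The 50 mg window is a hypothetical choice for illustration.
mu <- 500; sigma <- 120; n <- 4
sd_mean <- sigma / sqrt(n)
# P(450 <= Y-bar <= 550): the sample mean falls within 50 mg of mu
p_mean <- pnorm(mu + 50, mean = mu, sd = sd_mean) -
          pnorm(mu - 50, mean = mu, sd = sd_mean)
# Same window for a single seed, for comparison
p_single <- pnorm(mu + 50, mean = mu, sd = sigma) -
            pnorm(mu - 50, mean = mu, sd = sigma)
c(sample_mean = p_mean, single_seed = p_single)
```

The sample mean is more likely than a single seed to fall in the window, because its standard deviation is smaller.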
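Seed weight example: 100 students do the same experiment.
[Figure: histograms of the 100 sample means, one panel per sample size $n$ (the values of $n$ are truncated in the source); horizontal axis: sample mean]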
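Example: $Y$ = number of children with side effects, out of $n = 200$ children. Probability of side effect: $p = 0.05$.
What is $\mathbb{P}\{Y \le 15\}$?
Direct calculation:
$$\mathbb{P}\{Y \le 15\} = \binom{200}{0}\, 0.05^{0}\, 0.95^{200} + \dots + \binom{200}{15}\, 0.05^{15}\, 0.95^{185}$$
Heavy! Or we can use a trick: the binomial might be close to a normal distribution. Pretend $Y$ is normally distributed!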
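The direct calculation is heavy by hand but short in R. This sketch assumes $Y \sim$ Binomial(200, 0.05), consistent with $np = 10$ in this example, writes out the 16 terms of the sum explicitly, and compares the result with R's cumulative binomial function `pbinom`.

```r
# Sketch: P(Y <= 15) for Y ~ Binomial(200, 0.05), computed term by term.
n <- 200; p <- 0.05
k <- 0:15
terms <- choose(n, k) * p^k * (1 - p)^(n - k)   # the 16 terms of the direct sum
direct <- sum(terms)
builtin <- pbinom(15, size = n, prob = p)       # R's cumulative binomial probability
c(direct = direct, builtin = builtin)
```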
[Figure: binomial probability distributions for several $(n, p)$. Panel titles read "n = 10", "n = 50", "n = 200" and three panels with "n = 20"; the values of $p$ are truncated in the source. Horizontal axis: some possible values; vertical axis: probability.]
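$Y = Y_1 + \dots + Y_{200}$, where
$Y_1 = 1$ if child #1 has side effects, 0 otherwise.
...
$Y_{200} = 1$ if child #200 has side effects, 0 otherwise.
Apply key result #3: if $n$ (# of children) is large enough, then $Y_1 + \dots + Y_n$ has approximately a normal distribution.
Use the normal distribution with $Y$'s mean and variance: $\mu = np$ and $\sigma^2 = np(1-p)$, i.e. $\sigma = \sqrt{np(1-p)}$.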
If $Y \sim B(n, p)$ and if $n$ is large enough: if $np \ge 5$ and $n(1-p) \ge 5$ (rule of thumb), then $Y$'s distribution is approximately $N\big(np, \sqrt{np(1-p)}\big)$.
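The rule of thumb is easy to encode as a check to run before using the normal approximation. This is a sketch only: the function name `normal_approx_ok` is hypothetical, and the threshold of 5 is the rule of thumb used in these notes.

```r
# Sketch of the rule of thumb; `normal_approx_ok` is a hypothetical helper name.
normal_approx_ok <- function(n, p, threshold = 5) {
  # both the expected number of successes and of failures must reach the threshold
  (n * p >= threshold) && (n * (1 - p) >= threshold)
}
normal_approx_ok(200, 0.05)   # np = 10 and n(1 - p) = 190: TRUE
normal_approx_ok(20, 0.05)    # np = 1 is too small: FALSE
```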
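Back to our question: $np = 10$ and $n(1-p) = 190$ are both $\ge 5$, so
$$\mathbb{P}\{Y \le 15\} \approx \mathbb{P}\left\{Z \le \frac{15 - 10}{\sqrt{9.5}}\right\} = \mathbb{P}\{Z \le 1.62\} \approx 0.947.$$
True value:
> sum(dbinom(0:15, size=200, prob=0.05))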
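The standardization step behind the normal approximation can be written out as follows. This sketch assumes $n = 200$ and $p = 0.05$, standardizes the cutoff 15 using $\mu = np$ and $\sigma = \sqrt{np(1-p)}$, and sets the result next to the exact binomial sum.

```r
# Sketch: normal approximation to P(Y <= 15), without continuity correction.
n <- 200; p <- 0.05
mu <- n * p                        # mean of Y: 10
sigma <- sqrt(n * p * (1 - p))     # standard deviation of Y: sqrt(9.5)
z <- (15 - mu) / sigma             # standardized cutoff
approx_plain <- pnorm(z)           # P(Z <= z) for a standard normal Z
exact <- sum(dbinom(0:15, size = n, prob = p))
c(z = z, normal_approx = approx_plain, exact = exact)
```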
[Figure: binomial probabilities and the approximating normal curve; horizontal axis from 0 to 20]
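No continuity correction: $\mathbb{P}\{Y \le 15\} \approx \mathbb{P}\{Z \le (15 - 10)/\sqrt{9.5}\} = \mathbb{P}\{Z \le 1.62\} \approx 0.947$.
With continuity correction: $\mathbb{P}\{Y \le 15\} \approx \mathbb{P}\{Z \le (15.5 - 10)/\sqrt{9.5}\} = \mathbb{P}\{Z \le 1.78\} \approx 0.963$.
The continuity correction gives a better approximation (true value was 0.9556).
[Figure: binomial probabilities and the approximating normal curve]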
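The effect of the continuity correction on $\mathbb{P}\{Y \le 15\}$ can be checked by comparing both approximations with the exact binomial value. This is a minimal sketch assuming $Y \sim$ Binomial(200, 0.05). The correction replaces the cutoff 15 by 15.5, so that the whole bar for $Y = 15$ is included under the normal curve.

```r
# Sketch: normal approximation to P(Y <= 15) with and without continuity correction.
n <- 200; p <- 0.05
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
plain     <- pnorm(15,   mean = mu, sd = sigma)   # cutoff at 15
corrected <- pnorm(15.5, mean = mu, sd = sigma)   # cutoff moved to 15.5
exact     <- pbinom(15, size = n, prob = p)
errors <- c(plain = abs(plain - exact), corrected = abs(corrected - exact))
print(errors)
```

In this example the corrected approximation has the smaller absolute error, although the two errors are of similar size.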
What is the probability that between 8 and 15 children get side effects?
True value:
> sum(dbinom(8:15, size=200, prob=0.05))
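For an interval such as $8 \le Y \le 15$, the continuity correction widens both ends by one half, giving the normal probability between 7.5 and 15.5. The sketch below assumes $Y \sim$ Binomial(200, 0.05) and compares the exact value with the two normal approximations.

```r
# Sketch: P(8 <= Y <= 15) exactly and by normal approximation.
n <- 200; p <- 0.05
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
exact <- sum(dbinom(8:15, size = n, prob = p))
# without continuity correction: normal probability between 8 and 15
plain <- pnorm(15, mean = mu, sd = sigma) - pnorm(8, mean = mu, sd = sigma)
# with continuity correction: normal probability between 7.5 and 15.5
corrected <- pnorm(15.5, mean = mu, sd = sigma) - pnorm(7.5, mean = mu, sd = sigma)
c(exact = exact, plain = plain, corrected = corrected)
```

The correction matters more here than for $\mathbb{P}\{Y \le 15\}$, because the uncorrected interval loses half a bar at each end.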