An in-depth exploration of sampling and hypothesis testing concepts. It covers the properties of estimators, the central limit theorem, and the calculation of confidence intervals for population means and proportions. The document also discusses the importance of knowing the mean, standard error, and shape of sampling distributions.
Typology: Study notes
Population: an entire set of objects or units of observation of one sort or another.
Sample: a subset of a population.
Parameter versus statistic:

                Population    Sample
  size          N             n
  mean          μ             x̄
  variance      σ²            s²
  proportion    π             p
The sample mean:

x̄ = (1/n) ∑_{i=1}^{n} x_i
To make inferences regarding the population mean, μ, we need to know something about the probability distribution of this sample statistic, x̄.
The distribution of a sample statistic is known as a sampling distribution. Two of its characteristics are of particular interest: the mean or expected value, and the variance or standard deviation.

E(x̄): Thought experiment: sample repeatedly from the given population, each time recording the sample mean, and take the average of those sample means.
If the sampling procedure is unbiased, deviations of x̄ from μ in the upward and downward directions should be equally likely; on average, they should cancel out. Then

E(x̄) = μ

The sample mean is then an unbiased estimator of the population mean.
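The thought experiment above is easy to run. A minimal sketch (the population parameters μ = 50, σ = 10 and the sample/replication counts are illustrative choices, not from the notes):

```python
import random

# Sample repeatedly from the population, record each sample mean, and
# average those means. Unbiasedness means the average approaches mu.
random.seed(1)
mu, sigma, n, reps = 50.0, 10.0, 25, 20000

means = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

avg_of_means = sum(means) / reps
print(round(avg_of_means, 1))  # close to mu = 50
```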
One estimator is more efficient than another if its values are more tightly clustered around its expected value. E.g. alternative estimators for the population mean: x̄ versus the average of the largest and smallest values in the sample.

The degree of dispersion of an estimator is generally measured by the standard deviation of its probability distribution (sampling distribution). This goes under the name standard error. The standard error of the sample mean is

σ_x̄ = σ/√n
The more widely dispersed are the population values around their mean (larger σ), the greater the scope for sampling error (i.e. drawing by chance an unrepresentative sample whose mean differs substantially from μ). A larger sample size (greater n) narrows the dispersion of x̄.
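Both claims, the σ/√n formula and the narrowing effect of larger n, can be checked by simulation. A sketch, reusing the μ = 100, σ = 12 population from the worked example later in the notes:

```python
import random, statistics, math

# For several sample sizes, simulate many sample means and compare the
# observed dispersion of x-bar with the theoretical sigma / sqrt(n).
random.seed(2)
mu, sigma, reps = 100.0, 12.0, 5000

for n in (9, 36, 144):
    means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(reps)]
    observed = statistics.stdev(means)
    theory = sigma / math.sqrt(n)
    print(n, round(observed, 2), round(theory, 2))  # dispersion shrinks as n grows
```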
Population proportion, π: the corresponding sample statistic is the proportion of the sample having the characteristic in question, p.

The sample proportion is an unbiased estimator of the population proportion:

E(p) = π

Its standard error is given by

σ_p = √(π(1 − π)/n)
Population variance:

σ² = (1/N) ∑_{i=1}^{N} (x_i − μ)²

Estimator, sample variance:

s² = (1/(n − 1)) ∑_{i=1}^{n} (x_i − x̄)²
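The n − 1 divisor is what makes s² unbiased. A simulation sketch (the population σ² = 25 and the sample size are illustrative choices):

```python
import random

# Average many sample variances: the divisor-(n-1) version recovers
# sigma^2 = 25, while the divisor-n version falls short (about 20 here,
# since its expectation is sigma^2 * (n-1)/n = 25 * 4/5).
random.seed(3)
n, reps = 5, 40000

biased, unbiased = 0.0, 0.0
for _ in range(reps):
    x = [random.gauss(0.0, 5.0) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    biased += ss / n
    unbiased += ss / (n - 1)

print(round(biased / reps, 1), round(unbiased / reps, 1))  # ~20.0 vs ~25.0
```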
We also need to know the shape of a sampling distribution in order to put it to use.

Sample mean: the Central Limit Theorem implies a Gaussian distribution, for “large enough” samples. Reminder:
[Figure: sampling distribution P(x̄) of the mean of five dice]
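The five-dice distribution can be regenerated by simulation; a sketch (the replication count is an arbitrary choice):

```python
import random
from collections import Counter

# The mean of five dice rolls is already bell-shaped, even though a
# single die is uniform on 1..6: mass concentrates near the mean, 3.5.
random.seed(4)
reps = 100000
counts = Counter(round(sum(random.randint(1, 6) for _ in range(5)) / 5, 1)
                 for _ in range(reps))

# A mean near the center (3.4) is far more common than an extreme one (1.0)
print(counts[3.4], counts[1.0])
```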
Not all sampling distributions are Gaussian, e.g. the sample variance as estimator of the population variance. In this case the ratio

(n − 1)s²/σ²

follows a skewed distribution known as χ² (chi-square), with n − 1 degrees of freedom. If the sample size is large, the χ² distribution converges towards the normal.
Example: suppose μ = 100 and σ = 12 for a certain population, and we draw a sample with n = 36 from that population.

The standard error of x̄ is σ/√n = 12/√36 = 2, and a sample size of 36 is large enough to justify the assumption of a Gaussian sampling distribution. We know that the range μ ± 2σ encloses the central 95 percent of a normal distribution, so we can state

P(96 < x̄ < 104) ≈ 0.95

There’s a 95 percent probability that the sample mean lies within 4 units (= 2 standard errors) of the population mean, 100.
Even though μ is unknown, we can still say

P(μ − 4 < x̄ < μ + 4) ≈ 0.95

With probability .95 the sample mean will be drawn from within 4 units of the unknown population mean. We go ahead and draw the sample, and calculate a sample mean of (say) 97. If there’s a probability of .95 that our x̄ came from within 4 units of μ, we can turn that around: we’re entitled to be 95 percent confident that μ lies between 93 and 101.

We draw up a 95 percent confidence interval for the population mean as x̄ ± 2σ_x̄.
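The interval just described can be computed directly; a sketch using the example’s numbers (σ = 12, n = 36, observed x̄ = 97):

```python
import math

# 95 percent confidence interval with known sigma: xbar +/- 2 * SE
sigma, n, xbar = 12.0, 36, 97.0
se = sigma / math.sqrt(n)          # standard error = 2.0
lo, hi = xbar - 2 * se, xbar + 2 * se
print(se, lo, hi)  # 2.0 93.0 101.0
```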
In practice σ is unknown, so we have to estimate the standard error of x̄:

s_x̄ = s/√n  (our estimate of σ_x̄)

We can now reformulate our 95 percent confidence interval for μ:

x̄ ± 2s_x̄
Strictly speaking, the substitution of s for σ alters the shape of the sampling distribution. Instead of being Gaussian it now follows the t distribution, which looks very much like the Gaussian except that it’s a bit “fatter in the tails”.

Unlike the Gaussian, the t distribution is not fully characterized by its mean and standard deviation: there is an additional factor, namely the degrees of freedom (df). For estimating a population mean the df term is the sample size minus 1.

At low degrees of freedom the t distribution is noticeably more “dispersed” than the Gaussian, meaning that a 95 percent confidence interval would have to be wider (greater uncertainty). As the degrees of freedom increase, the t distribution converges towards the Gaussian.
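The widening effect at low df shows up in the two-tailed 2.5 percent critical values. The figures below are taken from standard t tables (df = ∞ corresponds to the normal’s 1.96):

```python
# Two-tailed 2.5% critical values from standard t tables.
t_crit = {4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}
z_crit = 1.960

for df, t in sorted(t_crit.items()):
    # the ratio t/z shows how much wider the 95% interval must be
    print(df, t, round(t / z_crit, 2))  # ratio shrinks toward 1 as df grows
```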
Values enclosing the central 95 percent of the distribution:

Normal: μ ± 1.96σ
t(df):  μ ± t_{.025}(df)·σ, where t_{.025}(df) > 1.96 at low df

For the normal distribution,

P(μ − 2.58σ < x < μ + 2.58σ) = 0.99

Thus the 99 percent interval is x̄ ± 2.58σ_x̄. If we want greater confidence that our interval straddles the unknown parameter value (99 percent versus 95 percent), then our interval must be wider (2.58 standard errors versus 2 standard errors).
Sample info: p = 0.56.

Our single best guess at the population proportion, π, is then 0.56, but we can quantify our uncertainty.

The standard error of p is √(π(1 − π)/n). The value of π is unknown but we can substitute p or, to be conservative, we can put π = 0.5, which maximizes the value of π(1 − π).

On the latter procedure, the estimated standard error is √(0.25/n). The large sample justifies the Gaussian assumption for the sampling distribution; the 95 percent confidence interval is 0.56 ± 2√(0.25/n).
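A sketch of the conservative procedure. Note that the sample size did not survive extraction, so n = 400 below is a hypothetical stand-in:

```python
import math

# Conservative CI for a proportion: put pi = 0.5, maximizing pi*(1 - pi).
# p = 0.56 is from the notes; n = 400 is a hypothetical stand-in.
p, n = 0.56, 400
se_conservative = math.sqrt(0.5 * 0.5 / n)   # 0.025
se_plugin = math.sqrt(p * (1 - p) / n)       # slightly smaller
lo, hi = p - 1.96 * se_conservative, p + 1.96 * se_conservative
print(round(se_conservative, 3), round(lo, 3), round(hi, 3))
```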
Let θ denote a “generic parameter”, with estimator θ̂ (point estimate). The generic interval estimate is

θ̂ ± maximum error, with (1 − α) confidence

“Maximum error” equals so many standard errors of such and such a size. The number of standard errors depends on the chosen confidence level (possibly also the degrees of freedom). The size of the standard error, σ_θ̂, depends on the nature of the parameter being estimated and the sample size.
Often the sampling distribution of θ̂ is Gaussian. The following notation is useful:

z = (x − μ)/σ

The “standard normal score” or “z-score” expresses the value of a variable in terms of its distance from the mean, measured in standard deviations.

Example: if μ = 1000 and σ = 50, then x = 850 has a z-score of −3.0: it lies 3 standard deviations below the mean.
Where the distribution of θ̂ is Gaussian we can write the (1 − α) confidence interval for θ as

θ̂ ± σ_θ̂ · z_{α/2}

where, for example, z_{.975} = −1.96 and z_{.025} = 1.96.

This is about as far as we can go in general terms. The specific formula for σ_θ̂ depends on the parameter.
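The generic Gaussian interval can be wrapped in a small helper; a sketch using the standard library’s NormalDist for the critical value:

```python
from statistics import NormalDist

# Generic Gaussian interval: theta_hat +/- z_{alpha/2} * se.
def z_interval(theta_hat, se, conf=0.95):
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # 1.96 for conf = 0.95
    return theta_hat - z * se, theta_hat + z * se

# Reusing the earlier mean example (xbar = 97, SE = 2):
lo, hi = z_interval(97.0, 2.0)
print(round(lo, 2), round(hi, 2))  # 93.08 100.92
```

With the exact 1.96 critical value the interval is slightly narrower than the ±2-standard-error rule of thumb used in the notes.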
A hypothesis test starts from a null hypothesis, some definite claim regarding a parameter of interest. Just as the defendant is presumed innocent until proved guilty, the null hypothesis (H₀) is assumed true (at least for the sake of argument) until the evidence goes against it.

                        H₀ is in fact:
  Decision:             True                   False
  Reject                Type I error (α)       Correct decision
  Fail to reject        Correct decision       Type II error (β)
1 − β is the power of a test; there is a trade-off between α and β. How do we choose α (the probability of Type I error)?
The calculations that compose a hypothesis test are condensed in a key number, namely a conditional probability: the probability of observing the given sample data, on the assumption that the null hypothesis is true.

This is called the p-value. If it is small, we can place one of two interpretations on the situation:

(a) The null hypothesis is true and the sample we drew is an improbable, unrepresentative one.
(b) The null hypothesis is false.

The smaller the p-value, the less comfortable we are with alternative (a). To reach a conclusion we must specify the limit of our comfort zone, a p-value below which we’ll reject H₀.

Say we use a cutoff of .01: we’ll reject the null hypothesis if the p-value for the test is ≤ .01.

If the null hypothesis is in fact true, what is the probability of our rejecting it? It’s the probability of getting a p-value less than or equal to .01, which is (by definition) .01. In selecting our cutoff we selected α, the probability of Type I error.
Example: a first thought might be H₀: μ = 60 versus H₁: μ ≠ 60. Well, we don’t mind if the chips are faster than advertised. So instead we adopt the asymmetrical hypotheses:

H₀: μ ≤ 60 versus H₁: μ > 60

Let α = .05.
The p-value is

P(x̄ ≥ 63 | μ = 60)

where the observed sample mean is 63, n = 100 and s = 2.

If the null hypothesis is true, E(x̄) is no greater than 60. The estimated standard error of x̄ is s/√n = 2/10 = 0.2. With n = 100 we can take the sampling distribution to be normal.
With a Gaussian sampling distribution the test statistic is the z-score:

z = (x̄ − μ_{H₀}) / s_x̄ = (63 − 60) / 0.2 = 15

The corresponding p-value is effectively zero, so we reject the null hypothesis.

If instead the sample size were only 10, then since the population standard deviation, σ, is unknown, we could not justify the assumption of a Gaussian sampling distribution for x̄. Rather, we’d have to use the t distribution with df = 9. The estimated standard error is s_x̄ = 2/√10 = 0.632, and the test statistic is

t(9) = (x̄ − μ_{H₀}) / s_x̄ = (63 − 60) / 0.632 ≈ 4.75

The p-value for this statistic is 0.000529—a lot larger than for z, but still much smaller than the chosen significance level of 5 percent, so we still reject the null hypothesis.
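A quick check of the example’s arithmetic (x̄ = 63 and s = 2 are the values implied by the quoted standard error of 0.632 and the sample mean of 63 mentioned later):

```python
import math

# The chip-speed example: H0: mu <= 60, observed xbar = 63, s = 2.
xbar, mu0, s = 63.0, 60.0, 2.0

# Large sample (n = 100): z-score test statistic
n = 100
se = s / math.sqrt(n)                 # 0.2
z = (xbar - mu0) / se                 # 15.0

# Small sample (n = 10): t statistic with df = 9
n_small = 10
se_small = s / math.sqrt(n_small)     # ~0.632
t = (xbar - mu0) / se_small           # ~4.74; p ~ 0.000529 per the notes
print(round(z, 1), round(se_small, 3), round(t, 2))  # 15.0 0.632 4.74
```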
In general the test statistic can be written as

test = (θ̂ − θ_{H₀}) / s_θ̂

That is, the sample statistic minus the value stated in the null hypothesis—which by assumption equals E(θ̂)—divided by the (estimated) standard error of θ̂.

The distribution to which “test” must be referred, in order to obtain the p-value, depends on the situation.
Suppose instead the hypotheses were H₀: μ = 60 versus H₁: μ ≠ 60. We have to think: what sort of values of the test statistic should count against the null hypothesis?

In the asymmetrical case only values of x̄ greater than 60 counted against H₀. A sample mean of (say) 57 would be consistent with μ ≤ 60; it is not even prima facie evidence against the null. Therefore the critical region of the sampling distribution (the region containing values that would cause us to reject the null) lies strictly in the upper tail.

But if the null hypothesis were μ = 60, then values of x̄ both substantially below and substantially above 60 would count against it. The critical region would be divided into two portions, one in each tail of the sampling distribution.
With H₀: μ = 60 we must recompute the p-value under the null, before comparing it to α. The sample mean was 63, and the p-value was defined as the probability of drawing a sample “like this or worse”, from the standpoint of H₀. In the symmetrical case, “like this or worse” means “with a sample mean this far away from the hypothesized population mean, or farther, in either direction”.

So the p-value is

P(x̄ ≤ 57) + P(x̄ ≥ 63)

which is double the value we found previously.
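The doubling is a direct consequence of the symmetry of the Gaussian; a sketch with a milder z-score than the example’s, so the numbers are visible:

```python
from statistics import NormalDist

# By symmetry, the two-tailed p-value is exactly double the one-tailed.
z = 2.0
one_tailed = 1 - NormalDist().cdf(z)
two_tailed = 2 * one_tailed
print(round(one_tailed, 4), round(two_tailed, 4))  # 0.0228 0.0455
```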
Let E denote the sample evidence and H denote the null hypothesis that is “on trial”. The p-value can then be expressed as P(E|H).

This may seem awkward. Wouldn’t it be better to calculate the conditional probability the other way round, P(H|E)? Instead of working with the probability of obtaining a sample like the one we in fact obtained, assuming the null hypothesis to be true, why can’t we think in terms of the probability that the null hypothesis is true, given the sample evidence?

Recall the multiplication rule for probabilities, which we wrote as

P(A ∩ B) = P(A|B) · P(B)

Swapping the positions of A and B we can equally well write

P(B ∩ A) = P(B|A) · P(A)

And taking these two equations together we can infer that

P(A|B) · P(B) = P(B|A) · P(A)

or

P(A|B) = P(B|A) · P(A) / P(B)

This is Bayes’ rule. It provides a means of converting from a conditional probability one way round to the inverse conditional probability.

Substituting E (Evidence) and H (null Hypothesis) for A and B, we get

P(H|E) = P(E|H) · P(H) / P(E)

We know how to find the p-value, P(E|H). To obtain the probability we’re now canvassing as an alternative, P(H|E), we have to supply in addition P(H) and P(E): P(H) is the marginal probability of the null hypothesis and P(E) is the marginal probability of the sample evidence. Where are these going to come from??
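To see what the extra inputs would buy, here is Bayes’ rule with made-up numbers; P(H) and P(E) below are pure assumptions, since nothing in the notes fixes them:

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E).
p_E_given_H = 0.01   # the p-value
p_H = 0.5            # assumed prior on the null (pure assumption)
p_E = 0.10           # assumed marginal probability of the evidence

p_H_given_E = p_E_given_H * p_H / p_E
print(round(p_H_given_E, 2))  # 0.05
```

Different (equally arbitrary) choices of P(H) and P(E) give different answers, which is exactly the difficulty the notes point at.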
Note that α is used both for the significance level of a hypothesis test (the probability of Type I error) and in denoting the confidence level (1 − α) for interval estimation.

There is an equivalence between a two-tailed hypothesis test at significance level α and an interval estimate using confidence level 1 − α.
Suppose μ is unknown and a sample of size 64 yields x̄ = 50, s = 10. The standard error is then 10/√64 = 1.25, and the 95 percent confidence interval for μ is 50 ± 1.96 × 1.25, or 47.55 to 52.45.

Suppose we want to test H₀: μ = 55 using the 5 percent significance level. No additional calculation is needed: the value 55 lies outside of the 95 percent confidence interval, so we can conclude that H₀ is rejected.
In a two-tailed test at the 5 percent significance level, we fail to reject H₀ if and only if x̄ falls within the central 95 percent of the sampling distribution, according to H₀. But since 55 exceeds 50 by more than the “maximum error”, 2.45, we can see that, conversely, the central 95 percent of a sampling distribution centered on 55 will not include 50, so a finding of x̄ = 50 must lead to rejection of the null.

“Significance level” and “confidence level” are complementary.
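The equivalence can be verified directly with the example’s numbers (n = 64, x̄ = 50, s = 10, H₀: μ = 55):

```python
import math

# Two-tailed 5% test versus 95% CI: both routes give the same verdict.
n, xbar, s, mu0 = 64, 50.0, 10.0, 55.0
se = s / math.sqrt(n)                    # 1.25
max_err = 1.96 * se                      # 2.45
ci = (xbar - max_err, xbar + max_err)    # roughly (47.55, 52.45)

reject_by_ci = not (ci[0] <= mu0 <= ci[1])   # 55 outside the CI
z = abs(xbar - mu0) / se                     # 4.0 > 1.96
reject_by_test = z > 1.96
print(reject_by_ci, reject_by_test)  # True True
```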
[Figure: normal pdf and the corresponding p-value region]

The further we are from the center of the sampling distribution, according to H₀, the smaller the p-value.
Back to the main discussion.