











Material Type: Notes; Professor: Davenport; Class: APPLIED STAT FOR ENGINR & SCI; Subject: Statistics; University: Virginia Commonwealth University; Term: Unknown 1989;
Typology: Study notes












Lecture 12 -VCU's Stat 541 James M. Davenport,Copyright 2008 1
Instructor: Dr. James M. Davenport Lecture # 12
Information in today’s lecture corresponds to the following sections in your textbook (in addition to my notes):
We can now use the Central Limit Theorem to construct a large sample confidence interval for an unknown proportion.
With $\hat{p}$ denoting the sample proportion,
$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
is approximately distributed as N( 0 , 1 ).
The resulting confidence interval for p is
$$-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le z_{\alpha/2}$$
Note that this is a function of the unknown proportion, p.
Confidence Interval for a Proportion, p
If we solve this inequality for the unknown proportion, p, we obtain the following for the confidence limits.
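$$LCL = \frac{n\hat{p} + \dfrac{z_{\alpha/2}^2}{2} - z_{\alpha/2}\sqrt{n\hat{p}\hat{q} + \dfrac{z_{\alpha/2}^2}{4}}}{n + z_{\alpha/2}^2}, \qquad \hat{q} = 1 - \hat{p}$$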
Confidence Interval for a Proportion, p
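The upper confidence limit is as follows:
$$UCL = \frac{n\hat{p} + \dfrac{z_{\alpha/2}^2}{2} + z_{\alpha/2}\sqrt{n\hat{p}\hat{q} + \dfrac{z_{\alpha/2}^2}{4}}}{n + z_{\alpha/2}^2}, \qquad \hat{q} = 1 - \hat{p}$$
These agree with those given in several texts, and should be used for small n. They are sometimes called the Wilson estimators.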
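As a rough illustration of the two limits, the sketch below computes them with Python's standard library. The function name `wilson_limits` and the counts x = 7, n = 20 are made up for this example and are not from the lecture; the formula coded is the form obtained by solving |p̂ − p| / sqrt(p(1 − p)/n) ≤ z for p.

```python
import math
from statistics import NormalDist

def wilson_limits(x, n, conf=0.95):
    """Return (LCL, UCL) for a proportion, given x successes in n trials.

    Hypothetical helper for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    p_hat = x / n
    q_hat = 1 - p_hat
    center = n * p_hat + z**2 / 2
    half = z * math.sqrt(n * p_hat * q_hat + z**2 / 4)
    denom = n + z**2
    return (center - half) / denom, (center + half) / denom

# Hypothetical data: 7 successes in 20 trials.
lcl, ucl = wilson_limits(x=7, n=20)
print(f"LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```

Unlike the large-sample interval, these limits are not symmetric about the sample proportion, and they always stay inside [0, 1].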
Sample Size for a Confidence Interval for a Proportion, p
The sample size formula is given by:
$$n = \frac{z_{\alpha/2}^2}{L^2}\left[\,2\hat{p}\hat{q} - L^2 + \sqrt{4\hat{p}\hat{q}\,(\hat{p}\hat{q} - L^2) + L^2}\,\right]$$
where L is the width of the interval.
If the sample size n is large, these limits reduce to the following:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where $\hat{q} = 1 - \hat{p}$. This is an approximate (100)( 1 - α )% confidence interval for p.
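A minimal sketch of the large-sample interval p̂ ± z sqrt(p̂q̂/n), using only the standard library. The function name `large_sample_limits` and the counts x = 350, n = 1000 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def large_sample_limits(x, n, conf=0.95):
    """Approximate confidence limits for a proportion when n is large.

    Hypothetical helper for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 350 successes in 1000 trials.
low, high = large_sample_limits(x=350, n=1000)
print(f"({low:.4f}, {high:.4f})")
```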
Sample Size for a Confidence Interval for a Proportion, p
And the sample size formula reduces to:
$$n = \frac{4\, z_{\alpha/2}^2\, \hat{p}\hat{q}}{L^2}$$
where L is the width of the interval.
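The reduced formula n = 4 z² p̂q̂ / L² is easy to evaluate directly. The sketch below is one way to do it; the function name `sample_size`, the default planning value p̂ = 0.5, and rounding up to the next whole observation are assumptions of this example rather than part of the lecture.

```python
import math
from statistics import NormalDist

def sample_size(L, p_hat=0.5, conf=0.95):
    """Sample size so the large-sample interval has total width L.

    Hypothetical helper; p_hat is a planning guess for the proportion.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    n = 4 * z**2 * p_hat * (1 - p_hat) / L**2
    return math.ceil(n)   # round up to a whole observation

print(sample_size(L=0.10))             # 385
print(sample_size(L=0.10, p_hat=0.2))  # 246
```

Using p̂ = 0.5 gives the largest possible n for a given L, since p̂q̂ is maximized there.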
At this point, the only thing we can do is simply substitute an estimate of σ^2, namely s^2, and hence use s in place of σ.
What we really must investigate is the sampling distribution of this statistic:
$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$
as opposed to
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
First done by W. S. Gosset.
But before we can adequately do that, we need to introduce the sampling distribution of S^2 (the sample variance), which is computed from random samples X_1, X_2, ..., X_n that arise from normal distributions N( μ , σ^2 ).
That is, the sampling distribution of the random variable defined by
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
We can prove that W has a very special sampling distribution called the Chi-square distribution with ( n – 1 ) degrees of freedom, where
$$W = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$
The density function of W is given by
$$f(w) = \frac{1}{\Gamma(r/2)\, 2^{r/2}}\; w^{\,r/2 - 1}\, e^{-w/2}, \qquad 0 < w < \infty$$
where r = the degrees of freedom.
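To make the density concrete, the sketch below codes the chi-square p.d.f. f(w) = w^(r/2 − 1) e^(−w/2) / (Γ(r/2) 2^(r/2)) and checks numerically that it integrates to about 1 and has mean about r (a standard property of this distribution, stated here as a known fact rather than something from the slides). The helper names, the choice r = 4, the midpoint rule, and the upper cutoff of 80 are all assumptions made for this illustration.

```python
import math

def chi2_pdf(w, r):
    """Chi-square density with r degrees of freedom (zero for w <= 0)."""
    if w <= 0:
        return 0.0
    return w ** (r / 2 - 1) * math.exp(-w / 2) / (math.gamma(r / 2) * 2 ** (r / 2))

def integrate(f, a, b, steps=20000):
    """Simple midpoint-rule integration of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

r = 4  # hypothetical degrees of freedom
area = integrate(lambda w: chi2_pdf(w, r), 0.0, 80.0)
mean = integrate(lambda w: w * chi2_pdf(w, r), 0.0, 80.0)
print(f"area = {area:.4f}, mean = {mean:.4f}")
```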
[Figure: The Chi-square curve. Plot of a Chi-square p.d.f.; horizontal axis: w variable with r degrees of freedom; vertical axis: density.]
The percentage points $\chi^2_{\alpha, r}$ for selected probabilities α such that
$$P\left[\, W \ge \chi^2_{\alpha, r} \,\right] = \alpha \quad \text{for } 0 < \alpha < 1$$
are given in Table II in your textbook.
[Figure: The Chi-square curve. Plot of a Chi-square p.d.f. (density versus w variable with r degrees of freedom), with the point $\chi^2_{\alpha, r}$ marked; area 1 − α to its left and area α to its right.]
These percentage points can be found using NCSS’s Probability Calculator, along with the cumulative probabilities of the Chi-square distribution.
This was the starting place for deriving the confidence interval for μ, assuming σ is known:
$$P\left[ -z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2} \right] = 1 - \alpha$$
But if we substitute s for σ and the sample size is small to moderate in size, the percentiles $\pm z_{\alpha/2}$ are not correct, since for small sample sizes the sampling distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$ is no longer adequately described by the normal distribution, N( 0 , 1 ).
If we substitute s for σ, and we wish to maintain a valid probability statement with probability 1 - α , then we must use different percentage points in place of $\pm z_{\alpha/2}$.
What we really must investigate is the sampling distribution of the following statistic:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
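This statistic can be written as
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)\big/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}} = \frac{Z}{\sqrt{\chi^2_{n-1}\big/(n-1)}}$$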
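The algebra relating T = (X̄ − μ)/(S/√n) to the ratio Z / sqrt(W/(n − 1)), with Z = (X̄ − μ)/(σ/√n) and W = (n − 1)S²/σ², can be checked numerically on any sample. The sketch below does this; the sample values, μ, and σ are invented purely for the check, and the function name `t_two_ways` is hypothetical.

```python
import math
from statistics import mean, variance

def t_two_ways(sample, mu, sigma):
    """Compute T directly and as Z / sqrt(W / (n - 1)); the two should agree."""
    n = len(sample)
    x_bar = mean(sample)
    s2 = variance(sample)                       # sample variance, divisor n - 1
    t_direct = (x_bar - mu) / math.sqrt(s2 / n)
    z = (x_bar - mu) / (sigma / math.sqrt(n))   # standardized mean, sigma known
    w = (n - 1) * s2 / sigma**2                 # chi-square variable, n - 1 d.f.
    t_ratio = z / math.sqrt(w / (n - 1))
    return t_direct, t_ratio

# Hypothetical sample and parameters, for illustration only.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2]
t_direct, t_ratio = t_two_ways(sample, mu=10.0, sigma=0.5)
print(t_direct, t_ratio)
```

The unknown σ cancels in the ratio, which is why T can be computed from the data alone.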
The random variable T has a very special sampling distribution which is called the “Student’s T” distribution with r = (n – 1) degrees of freedom.
n = 2, r = n – 1 = 1, α = 0.05. Find $t_{0.025,\,1}$ such that P[ T > $t_{0.025,\,1}$ ] = 0.025.
From Table IV, $t_{0.025,\,1}$ = 12.706.
By contrast, $z_{0.025}$ = 1.96.
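The contrast between the t and z percentage points for one degree of freedom can be reproduced without a table. The sketch below relies on a standard fact not stated in the slides: Student's t with 1 degree of freedom is the standard Cauchy distribution, whose upper-tail point is tan(π(1/2 − tail)). The variable names are arbitrary.

```python
import math
from statistics import NormalDist

upper_tail = 0.025

# Assumption used here: t with 1 d.f. is standard Cauchy,
# with c.d.f. F(t) = 1/2 + arctan(t) / pi.
t_1df = math.tan(math.pi * (0.5 - upper_tail))

# Standard normal percentage point for the same upper-tail area.
z = NormalDist().inv_cdf(1 - upper_tail)

print(f"t(0.025, 1) = {t_1df:.3f}")   # 12.706
print(f"z(0.025)    = {z:.3f}")       # 1.960
```

The closed form applies only to r = 1; for other degrees of freedom a table or a probability calculator is needed.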
[Figure: The Student's t curve. Plot of Student's t p.d.f. (density versus t variable) for r = 1, α = 0.05, with $t_{0.025,\,1}$ = 12.706 marked.]
And, of course, these percentage points can be found for any value of the cumulative or upper tail probability and for any degree of freedom (including non-integer degrees of freedom) using NCSS’s Probability Calculator.
[Figure: The Student's t curve. Plot of Student's t p.d.f. (density versus t variable) for n = 5, r = 4, with $-t_{0.025,\,4}$ = −2.776 and $t_{0.025,\,4}$ = 2.776 marked; area 0.025 in each tail.]
Find $d_1$ such that P[ T > $d_1$ ] = 0.05 (or equivalently, P[ T < $d_1$ ] = 0.95 ) for degrees of freedom r = 11. From NCSS’s Probability Calculator, we find $d_1$ = 1.796.
[Figure: The Student's t curve. Plot of Student's t p.d.f. (density versus t variable) for n = 12, r = 11, with $d_1$ = 1.796 marked; area 0.95 to its left and 0.05 to its right.]
Let r = 11. Find $d_2$ such that P[ −$d_2$ < T < $d_2$ ] = 0.95.
1 - α = 0.95, α = 0.05, α / 2 = 0.025
This implies that $d_2$ = $t_{0.025,\,11}$ = 2.201.
[Figure: The Student's t curve. Plot of Student's t p.d.f. (density versus t variable) for n = 12, r = 11, with $-d_2 = -t_{0.025,\,11}$ = −2.201 and $d_2 = t_{0.025,\,11}$ = 2.201 marked; area 0.025 in each tail.]
$$\bar{x} \pm t_{\alpha/2,\, r}\, \frac{s}{\sqrt{n}}, \qquad r = n - 1,$$
is a (100)( 1 - α )% confidence interval for the mean μ. This, of course, assumes that we are sampling from a N( μ , σ^2 ) population.
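A small sketch of the interval x̄ ± t s/√n follows. Because the standard library has no Student's t quantile function, the percentage point is passed in as an argument, as one would read it from Table IV or a probability calculator; the value 2.201 is the t point for 0.025 upper-tail area and 11 degrees of freedom quoted in this lecture. The data values and the function name `t_interval` are hypothetical.

```python
import math
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """Confidence limits x_bar +/- t_crit * s / sqrt(n).

    t_crit must be the t percentage point for r = n - 1 degrees of freedom,
    looked up by the caller.
    """
    n = len(sample)
    x_bar = mean(sample)
    half_width = t_crit * stdev(sample) / math.sqrt(n)   # stdev uses divisor n - 1
    return x_bar - half_width, x_bar + half_width

# Hypothetical measurements, n = 12, so r = 11 and t_crit = 2.201 for 95%.
data = [2.1, 2.6, 2.4, 2.3, 2.7, 2.2, 2.5, 2.4, 2.6, 2.3, 2.5, 2.2]
low, high = t_interval(data, t_crit=2.201)
print(f"({low:.3f}, {high:.3f})")
```

Substituting 1.96 for 2.201 would give a narrower interval whose actual coverage falls below 95% for a sample this small.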
The mean tearing strength of a certain brand of paper is under investigation by a manufacturer of laser printers. We assume that this measurement is normally distributed. A random sample of n = 22 sheets of paper was tested and the sample mean tearing strength was 2.4 pounds
1 - α = 0.95, α = 0.05, α / 2 = 0.025. X = tearing strength of paper (pounds)
$$\bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
Hence, if that population is relatively mound-shaped and somewhat symmetric, then use the percentage points from the Student’s T distribution.
If you find yourself in situations where the original population is decidedly non-normally distributed and the sample size is small, then you must use other methods that we will not discuss here (exact sampling distributions and nonparametric methods).
Using NCSS’s Probability Calculator to find the cumulative probability and p-value for a Chi-square distribution with 2 degrees of freedom.
Enter the degrees of freedom of the Chi-square distribution.
This is the non-centrality parameter.
Enter the value of the Chi-square distribution with degrees of freedom given above at left. When computing p-values, this is the observed value of the test statistic computed from the data.
This output window provides the cumulative probability associated with the value entered at the lower left; i.e. P [ x < entered value ]. For a lower tailed test, this will be the p-value.
This is the complement of the probability given above; i.e. 1 – P [ x < entered value ] = P [ x > entered value ]. For an upper tailed test, this will be the p-value.
Note: these two probabilities sum to one.
For a two tailed test, you must double the appropriate value given above at right; i.e. p-value = 2 ( 0.001526 ) = 0.003052.
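For the special case of 2 degrees of freedom, the chi-square tail probabilities have a closed form, since the density reduces to e^(−w/2)/2 and so P[X > x] = e^(−x/2). The sketch below uses that to mimic the calculator's two output windows and the doubling rule; the function names and the test value 5.9915 are assumptions for illustration, not values from the NCSS screen.

```python
import math

def chi2_2df_lower(x):
    """P[X < x] for a chi-square variable with 2 degrees of freedom."""
    return 1.0 - math.exp(-x / 2) if x > 0 else 0.0

def chi2_2df_upper(x):
    """P[X > x]; the complement of the cumulative probability."""
    return 1.0 - chi2_2df_lower(x)

x_obs = 5.9915  # hypothetical observed test statistic
lower = chi2_2df_lower(x_obs)
upper = chi2_2df_upper(x_obs)
print(f"P[X < x] = {lower:.4f}, P[X > x] = {upper:.4f}")

# Doubling rule quoted in the notes for a two tailed test.
two_tailed = 2 * 0.001526
print(f"two tailed p-value = {two_tailed:.6f}")   # 0.003052
```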
Using NCSS's Probability Calculator to find the cumulative probability and p-value for a Student's t distribution with 18 degrees of freedom.
Enter the degrees of freedom of the Student's t distribution.
This is the non-centrality parameter, which for most of the applications in an introductory course in statistics, will be equal to zero.
Enter the value, say T, of the Student's t distribution with degrees of freedom given above at left. When computing p-values, this is the observed value of the test statistic computed from the data.
This output window provides the cumulative probability associated with the value, T, entered at the lower left; i.e. P [ t < T ]. For a lower tailed test, this will be the p-value.
This is the complement of the probability given above; i.e. 1 - P [ t < T ] = P [ t > T ]. For an upper tailed test, this will be the p-value.
Note: these two probabilities sum to one.
For a two tailed test, you must double the appropriate value given above at right; i.e. p-value = 2 ( 0.0304 ) = 0.0608.
Using NCSS’s Probability Calculator to find the 95th percentile from a Student’s t-distribution with 18 degrees of freedom.
Output is the percentage point from the Student’s t-distribution with the degrees of freedom at upper left and the cumulative probability given at lower left.
Enter the degrees of freedom for the Student’s t-distribution.
This window is for the non-centrality parameter.
Enter the cumulative probability corresponding to the percentage point desired.
symmetric. 2) As the sample size grows, assumption (b) is even less important; the Central Limit Theorem tells us that the sampling distribution of the mean approaches the normal regardless of the underlying population distribution. With a large sample, the assumption of underlying Normality is not important at all.
-- Paul Velleman
P.S. In ActivStats, our multimedia materials for the introductory statistics course, we do not state any arbitrary rule for df and instead advise students to follow the advice I give here. It is what statisticians do in practice.