Confidence Interval for a Proportion - Lecture Slides | STAT 541, Study notes of Statistics

Virginia Commonwealth University (VCU), Statistics

Prof. James Davenport

Material Type: Notes; Professor: Davenport; Class: APPLIED STAT FOR ENGINR & SCI; Subject: Statistics; University: Virginia Commonwealth University; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 02/12/2009

koofers-user-570 🇺🇸


Lecture 12 - VCU's Stat 541, James M. Davenport

Virginia Commonwealth University

STAT 541

APPLIED STATISTICS FOR ENGINEERS & SCIENTISTS

Instructor: Dr. James M. Davenport

Lecture # 12


Today’s Lecture

Information in today’s lecture corresponds to the following sections in your textbook (in addition to my notes):

5.4, 8.1 & 10.1: Large Sample Confidence Intervals for Proportions, Confidence Intervals on Means with Unknown Variance, Intro. to the Chi-Square Distribution & Student’s T Distribution, & confidence intervals for the Mean Using Small Samples

Central Limit Theorem Application

We can now use the Central Limit Theorem to construct a large sample confidence interval for an unknown proportion.

Confidence Interval for a Proportion, p

$$Z = \frac{Y - np}{\sqrt{npq}} = \frac{\frac{Y}{n} - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$

is approximately distributed as N( 0 , 1 ).

Confidence Interval for a Proportion, p

The resulting confidence interval for p is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

Note that this is a function of the unknown proportion, p.

Confidence Interval for a Proportion, p

If we solve this inequality for the unknown proportion, p, we obtain the following for the confidence limits.

$$LCL = \frac{2n\hat{p} + z_{\alpha/2}^2 - z_{\alpha/2}\sqrt{z_{\alpha/2}^2 + 4n\hat{p}\hat{q}}}{2\left(n + z_{\alpha/2}^2\right)}$$

Confidence Interval for a Proportion, p

The upper confidence limit is as follows:

$$UCL = \frac{2n\hat{p} + z_{\alpha/2}^2 + z_{\alpha/2}\sqrt{z_{\alpha/2}^2 + 4n\hat{p}\hat{q}}}{2\left(n + z_{\alpha/2}^2\right)}$$

These agree with those given in several texts, and should be used for small n; they are sometimes called the Wilson Estimators.
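The two limits can be evaluated directly. The following is a minimal Python sketch, assuming the closed-form limits $\left(2n\hat{p} + z^2 \mp z\sqrt{z^2 + 4n\hat{p}\hat{q}}\right) / \left(2(n + z^2)\right)$ with $z = z_{\alpha/2}$; the function name `wilson_limits` and the counts y = 7, n = 20 are illustrative assumptions, not values from the lecture.

```python
import math
from statistics import NormalDist

def wilson_limits(y, n, conf=0.95):
    """Return (LCL, UCL) for a proportion from y successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    p_hat = y / n
    q_hat = 1 - p_hat
    centre = 2 * n * p_hat + z ** 2
    spread = z * math.sqrt(z ** 2 + 4 * n * p_hat * q_hat)
    denom = 2 * (n + z ** 2)
    return (centre - spread) / denom, (centre + spread) / denom

# Hypothetical data: 7 successes in 20 trials.
lcl, ucl = wilson_limits(y=7, n=20)
print(f"95% limits for p: ({lcl:.4f}, {ucl:.4f})")
```

Each limit is a root of $(\hat{p} - p)^2 = z_{\alpha/2}^2\, p(1-p)/n$, which gives a quick way to check the arithmetic.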

Sample Size for a Confidence Interval for a Proportion, p

The sample size formula is given by:

$$n = \frac{2z_{\alpha/2}^2\hat{p}\hat{q} - z_{\alpha/2}^2 L^2 \pm \sqrt{4z_{\alpha/2}^4\hat{p}\hat{q}\left(\hat{p}\hat{q} - L^2\right) + L^2 z_{\alpha/2}^4}}{L^2}$$

where L is the width of the interval.

Confidence Interval for a Proportion, p

If the sample size n is large, these limits reduce to the following:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

where $\hat{q} = 1 - \hat{p}$. This is an approximate (100)( 1 - α )% confidence interval for p.

Sample Size for a Confidence Interval for a Proportion, p

And the sample size formula reduces to:

$$n = \frac{4z_{\alpha/2}^2\hat{p}\hat{q}}{L^2}$$

where L is the width of the interval.
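As a rough illustration of the large-sample results, the sketch below computes the interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and the sample size $n = 4z_{\alpha/2}^2\hat{p}\hat{q}/L^2$. The function names and the planning values ($\hat{p} = 0.5$, L = 0.10) are assumptions chosen for the example; rounding n up to the next integer is also an assumption, made so that the width does not exceed L.

```python
import math
from statistics import NormalDist

def wald_interval(p_hat, n, conf=0.95):
    """Large-sample interval p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def sample_size(p_hat, width, conf=0.95):
    """n = 4 z^2 p_hat q_hat / L^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(4 * z ** 2 * p_hat * (1 - p_hat) / width ** 2)

# Hypothetical planning values: p_hat = 0.5 and desired total width L = 0.10.
n_needed = sample_size(p_hat=0.5, width=0.10)
low, high = wald_interval(0.5, n_needed)
print(n_needed, round(high - low, 4))
```

Using $\hat{p} = 0.5$ maximizes $\hat{p}\hat{q}$, so it gives the most conservative sample size when no prior estimate is available.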

Confidence Intervals Using Estimates of $\sigma^2$: Large Sample Results

What we discussed in the previous lecture requires that $\sigma^2$ be known.
What if $\sigma^2$ is unknown?

Confidence Intervals Using Estimates of $\sigma^2$: Large Sample Results

At this point, the only thing we can do is simply substitute an estimate of $\sigma^2$, namely $s^2$, and hence use

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

Confidence Intervals Using Estimates of $\sigma^2$: Large Sample Results

Also note that if we use s in place of $\sigma$, then the length of the confidence interval will also be a random quantity. Hence, if we determine the sample size from a “known” or “historical value” for $\sigma$, and compute the interval using the sample standard deviation s, then the confidence interval will most likely not be of length L.

Confidence Intervals for $\mu$ using s

What we really must investigate is the sampling distribution of this statistic:

$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

as opposed to

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

First done by W. S. Gosset.

The Sampling Distribution of $S^2$

But before we can adequately do that, we need to introduce the sampling distribution of $S^2$ (the sample variance), computed from random samples $X_1, X_2, \ldots, X_n$ that arise from NORMAL DISTRIBUTIONS $N(\mu, \sigma^2)$.

The Sampling Distribution of $S^2$

That is, the sampling distribution of the random variable defined by

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$

The Sampling Distribution of $S^2$

We can prove that W has a very special sampling distribution called the Chi-square distribution with ( n – 1 ) degrees of freedom, where

$$W = \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2}$$
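A small simulation can make this result plausible without proving it. The sketch below draws repeated normal samples, computes $W = (n-1)S^2/\sigma^2$ for each, and compares the empirical mean and variance of W with the values n - 1 and 2(n - 1) that a Chi-square variable with n - 1 degrees of freedom is known to have. The sample size, parameters, replication count, and seed are arbitrary assumptions.

```python
import random
import statistics

def simulate_w(n, mu, sigma, reps, seed=0):
    """Simulate W = (n - 1) S^2 / sigma^2 for `reps` normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)   # sample variance S^2
        values.append((n - 1) * s2 / sigma ** 2)
    return values

# Hypothetical settings: n = 5 observations from N(10, 2^2), 20000 replications.
w = simulate_w(n=5, mu=10.0, sigma=2.0, reps=20000)
w_mean = statistics.fmean(w)
w_var = statistics.variance(w)
print(f"mean of W = {w_mean:.3f} (theory: 4), variance of W = {w_var:.3f} (theory: 8)")
```

Agreement of the first two moments is only a sanity check, not evidence of the full distributional claim.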

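The Sampling Distribution of $S^2$

The density function of W is given by

$$f(w) = \begin{cases} \dfrac{1}{\Gamma\left(\frac{r}{2}\right) 2^{r/2}}\, w^{\frac{r}{2} - 1} e^{-w/2}, & 0 < w < \infty \\ 0, & \text{elsewhere} \end{cases}$$

where r = the degrees of freedom.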

The Chi-square Distribution

The parameter r denotes the degrees of freedom.
For brevity, we write … " W has $\chi^2_r$ "

[Figure: THE CHI-SQUARE CURVE. Plot of a Chi-square p.d.f.; horizontal axis: w variable with r degrees of freedom; vertical axis: density.]

Percentage Points of the Chi-square Distribution

The percentage points $\chi^2_{\alpha, r}$ for selected probabilities $\alpha$ such that

$$P\left[W \ge \chi^2_{\alpha, r}\right] = \alpha \quad \text{for } 0 < \alpha < 1$$

are given in Table II in your textbook.

[Figure: THE CHI-SQUARE CURVE. Plot of a Chi-square p.d.f. with r degrees of freedom; the area to the right of $\chi^2_{\alpha, r}$ is $\alpha$ and the area to the left is $1 - \alpha$.]

Percentage Points of the Chi-square Distribution

These percentage points can be found using NCSS’s Probability Calculator, along with the cumulative probabilities of the Chi-square distribution.
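For readers without NCSS, the same two quantities can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the cumulative probability is computed from the standard series for the regularized incomplete gamma function, the percentage point is found by bisection, and the function names are made up for this example. It is not a description of how NCSS computes its values.

```python
import math

def chi2_cdf(w, r):
    """P[W < w] for a Chi-square variable with r degrees of freedom (series form)."""
    if w <= 0:
        return 0.0
    a, x = r / 2.0, w / 2.0
    term = 1.0 / a
    total = term
    k = 1
    # Series: gamma(a, x) = x^a e^{-x} * sum_k x^k / (a (a+1) ... (a+k))
    while term > 1e-17 * total and k < 10000:
        term *= x / (a + k)
        total += term
        k += 1
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_upper_tail(w, r):
    """P[W > w], the complement of the cumulative probability."""
    return 1.0 - chi2_cdf(w, r)

def chi2_percentage_point(alpha, r):
    """Find the value c with P[W >= c] = alpha by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, r) > alpha:   # expand until the tail is small enough
        lo, hi = hi, 2.0 * hi
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, r) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

point = chi2_percentage_point(0.05, 2)
print(round(point, 3), round(chi2_upper_tail(point, 2), 4))
```

For r = 2 the upper tail has the closed form $e^{-w/2}$, which gives an independent check on the numerical result.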

S is Unbiased for $\sigma$?

So $S^2$ is unbiased for $\sigma^2$.
Is S unbiased for $\sigma$?
The answer is NO!
S is a biased estimator of $\sigma$. But the bias is not great, and we use S as THE ESTIMATOR of $\sigma$.
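The size of the bias can be made concrete. For samples from a normal population, $E[S] = c_4(n)\,\sigma$ with $c_4(n) = \sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$; this constant and its name $c_4$ come from the quality-control literature and are not part of the lecture notes. The sketch below tabulates it for a few sample sizes chosen arbitrarily.

```python
import math

def c4(n):
    """E[S] / sigma for a random sample of size n from a normal population."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

# c4(n) < 1 means S underestimates sigma on average; the gap shrinks as n grows.
factors = {n: c4(n) for n in (2, 5, 10, 30, 100)}
for n, factor in factors.items():
    print(f"n = {n:3d}   E[S]/sigma = {factor:.4f}")
```

The factor is always below one and approaches one as n grows, which is consistent with the statement that the bias is not great.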

Confidence Interval for $\mu$ with small sample sizes

If the sample size is small to moderate in size and the variance $\sigma^2$ is unknown, what do we do?
Simply substituting S for $\sigma$ is fine provided the sample size is large enough for S to be a reasonably good estimator of $\sigma$.

Small Sample Conf. Int.

This was the starting place for deriving the confidence interval for μ, assuming σ is known:

$$P\left[-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right] = 1 - \alpha$$

Small Sample Conf. Int.

But if we substitute s for σ and the sample size is small to moderate in size, then the percentiles $z_{\alpha/2}$ are not correct.

Small Sample Conf. Int.

These percentiles are not correct, since for small sample sizes with s substituted for σ, the sampling distribution of $\bar{X}$ is no longer adequately described by the normal distribution, $N\left(\mu, \frac{\sigma^2}{n}\right)$.

Small Sample Conf. Int.

If we substitute s for σ, and we wish to maintain a valid probability statement with probability 1 - α, then we must use a different percentage point, not $z_{\alpha/2}$.

The T Statistic

What we really must investigate is the sampling distribution of the following statistic:

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

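T Ratio Can Be Written As Follows

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \times \frac{\sigma/\sqrt{n}}{S/\sqrt{n}}$$

$$T = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2} \Big/ (n-1)}} = \frac{Z}{\sqrt{W/(n-1)}}, \qquad Z \sim N(0, 1), \quad W \sim \chi^2_{n-1}$$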

Student’s T Distribution

The random variable T has a very special sampling distribution which is called the “Student’s T” distribution with r = (n – 1) degrees of freedom.

Percentage Points of the Student’s T

n = 2, r = n – 1 = 1, α = 0.025. Find $t_{0.025, 1}$ such that

$$P\left[T \ge t_{0.025, 1}\right] = 0.025$$

From Table IV, $t_{0.025, 1} = 12.706$.

By contrast, $z_{0.025} = 1.96$.

[Figure: THE STUDENT's T CURVE. Plot of Student's t p.d.f. (density against the t variable) for r = 1, with upper-tail area α = 0.025 to the right of $t_{0.025, 1} = 12.706$.]

Percentage Points of the Student’s T

And, of course, these percentage points can be found for any value of the cumulative or upper tail probability and for any degree of freedom (including non integer degrees of freedom) using NCSS’s Probability Calculator.

Lecture 12 - VCU's Stat 541, James M. Davenport, Copyright 2008

[Figure: THE STUDENT's T CURVE. Plot of Student's t p.d.f. (density against the t variable) for n = 5, r = 4, with area 0.025 in each tail beyond $-t_{0.025, 4} = -2.776$ and $t_{0.025, 4} = 2.776$.]

Moments of Student’s T

• E[ T ] = 0, for r > 1
• Var[ T ] = r / ( r – 2 ), for r > 2

Exercise

Find $d_1$ such that P[ T > $d_1$ ] = 0.05 (or equivalently, P[ T < $d_1$ ] = 0.95 ) for degrees of freedom r = 11. From NCSS’s Probability Calculator, we find $d_1$ = 1.796.

[Figure: THE STUDENT's T CURVE. Plot of Student's t p.d.f. (density against the t variable) for n = 12, r = 11, with area 0.95 to the left of $d_1 = 1.796$ and area 0.05 to the right.]

Exercise

Let r = 11. Find $d_2$ such that P[ -$d_2$ < T < $d_2$ ] = 0.95.

1 - α = 0.95, α = 0.05, α / 2 = 0.025

This implies that $d_2$ = 2.201.

[Figure: THE STUDENT's T CURVE. Plot of Student's t p.d.f. (density against the t variable) for n = 12, r = 11, with area 0.025 in each tail beyond $-d_2 = -t_{0.025, 11} = -2.201$ and $d_2 = t_{0.025, 11} = 2.201$.]
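The percentage points used in these exercises can be checked numerically without a table. The sketch below is one possible approach, not the method used by NCSS: it integrates the Student's t density with Simpson's rule and solves for the percentage point by bisection. The function names, the integration step, and the iteration count are assumptions made for this illustration.

```python
import math

def t_pdf(t, r):
    """Density of Student's t with r degrees of freedom (r need not be an integer)."""
    log_c = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return math.exp(log_c - (r + 1) / 2 * math.log1p(t * t / r))

def t_cdf(t, r, step=0.01):
    """P[T < t] via Simpson's rule on [0, |t|], using symmetry about zero."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    m = 2 * max(1, math.ceil(a / (2 * step)))   # even number of panels
    h = a / m
    total = t_pdf(0.0, r) + t_pdf(a, r)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * t_pdf(i * h, r)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def t_percentage_point(alpha, r):
    """Find t_{alpha, r} with P[T >= t_{alpha, r}] = alpha, for 0 < alpha < 0.5."""
    lo, hi = 0.0, 1.0
    while 1 - t_cdf(hi, r) > alpha:             # expand the bracket
        lo, hi = hi, 2 * hi
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, r) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for alpha, r in [(0.025, 1), (0.025, 4), (0.05, 11), (0.025, 11)]:
    print(f"t_({alpha}, {r}) = {t_percentage_point(alpha, r):.3f}")
```

The four cases correspond to the percentage points quoted in the slides for r = 1, 4 and 11, so the printed values can be compared against them.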

Confidence Interval for the Mean μ using $S^2$

$$\bar{X} \pm t_{\alpha/2,\, r}\frac{S}{\sqrt{n}}$$

is a (100)( 1 - α )% confidence interval for the mean μ. This, of course, assumes that we are sampling from a $N(\mu, \sigma^2)$ population.
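A minimal sketch of the computation follows, assuming the interval $\bar{X} \pm t_{\alpha/2, r}\, S/\sqrt{n}$ with r = n - 1 and with the percentage point supplied by the user from a table or calculator. The function name `t_interval` and the five data values are hypothetical; the value 2.776 is the $t_{0.025, 4}$ percentage point quoted earlier in the lecture.

```python
import math
import statistics

def t_interval(sample, t_crit):
    """Return x_bar -/+ t_crit * S / sqrt(n) for a small sample."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)          # sample standard deviation (divisor n - 1)
    half = t_crit * s / math.sqrt(n)
    return x_bar - half, x_bar + half

# Hypothetical sample of n = 5, so r = 4 and t_{0.025, 4} = 2.776 for 95% confidence.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
lower_t, upper_t = t_interval(data, t_crit=2.776)
print(f"95% interval for the mean: ({lower_t:.3f}, {upper_t:.3f})")
```

Passing the percentage point as an argument keeps the sketch independent of any particular table or software.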

Example

The mean tearing strength of a certain brand of paper is under investigation by a manufacturer of laser printers. We assume that this measurement is normally distributed. A random sample of n = 22 sheets of paper was tested and the sample mean tearing strength was 2.4 pounds.

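Example: known $\sigma^2$

1 - α = 0.95, α = 0.05, α / 2 = 0.025

X = tearing strength of paper (pounds)

X has $N(\mu, \sigma^2 = 0.04)$, n = 22, $\bar{x} = 2.4$

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.4 \pm (1.96)\left(\frac{0.2}{\sqrt{22}}\right) = 2.4 \pm 0.0836$$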
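The arithmetic for the known-variance case can be reproduced with a short Python sketch. It uses only the quantities stated in the example (n = 22, $\bar{x} = 2.4$, $\sigma^2 = 0.04$, 95% confidence); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 22            # sheets of paper tested
x_bar = 2.4       # sample mean tearing strength (pounds)
sigma2 = 0.04     # known population variance
alpha = 0.05      # for a 95% confidence interval

z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}
half_width = z * math.sqrt(sigma2 / n)         # z * sigma / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"95% interval for the mean: ({lower:.4f}, {upper:.4f}) pounds")
```

Because $\sigma^2$ is treated as known here, the normal percentage point is appropriate even though n is only 22.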

When do you use the Student’s T percentage points?

Hence, if that population is relatively mound-shaped and somewhat symmetric, then use the percentage points from the Student’s T distribution.

When do you use the Student’s T percentage points?

If you find yourself in situations where the original population is decidedly non-normally distributed and the sample size is small, then you must use other methods that we will not discuss here (exact sampling dist. & non parametric methods).

Using NCSS’s Probability Calculator to find the cumulative probability and p-value for a Chi-square distribution with 2 degrees of freedom.

Enter the degrees of freedom of the Chi-square distribution.

This is the non-centrality parameter.

Enter the value of the Chi-square distribution with degrees of freedom given above at left. When computing p-values, this is the observed value of the test statistic computed from the data.

This output window provides the cumulative probability associated with the value entered at the lower left; i.e. P [ x < value ]. For a lower tailed test, this will be the p-value.

This is the complement of the probability given above; i.e. 1 – P [ x < value ] = P [ x > value ]. For an upper tailed test, this will be the p-value.

Note: these two probabilities sum to one.

For a two tailed test, you must double the appropriate value given above at right; i.e. p-value = 2 ( 0.001526 ) = 0.003052.

Using NCSS's Probability Calculator to find the cumulative probability and p-value for a Student's t distribution with 18 degrees of freedom.

Enter the degrees of freedom of the Student's t distribution.

This is the non-centrality parameter, which for most of the applications in an introductory course in statistics, will be equal to zero.

Enter the value, say T, of the Student's t distribution with degrees of freedom given above at left. When computing p-values, this is the observed value of the test statistic computed from the data.

This output window provides the cumulative probability associated with the value, T, entered at the lower left; i.e. P [ t < T ]. For a lower tailed test, this will be the p-value.

This is the complement of the probability given above; i.e. 1 - P [ t < T ] = P [ t > T ]. For an upper tailed test, this will be the p-value.

Note: these two probabilities sum to one.

For a two tailed test, you must double the appropriate value given above at right; i.e. p-value = 2 ( 0.0304 ) = 0.0608.

Using NCSS’s Probability Calculator to find the 95th percentile from a Student’s t-distribution with 18 degrees of freedom.

Output is the percentage point from the Student’s t-distribution with the degrees of freedom at upper left and the cumulative probability given at lower left.

Enter the degrees of freedom for the Student’s t-distribution.

This window is for the non-centrality parameter.

Enter the cumulative probability corresponding to the percentage point desired.

symmetric.

2) As the sample size grows, assumption (b) is even less important; the Central Limit Theorem tells us that the sampling distribution of the mean approaches the normal regardless of the underlying population distribution. With a large sample, the assumption of underlying Normality is not important at all.

No amount of data will make up for a failure of assumption (a). A biased sample or experiment without randomization cannot be fixed by getting still more biased data.

As the df grow, the difference between using t and using the Normal becomes indiscernible. In fact, for well-behaved data the difference in p-values and confidence interval width is negligible around 15 df or above.

In practice, statisticians and researchers using statistics find p-values from the t distribution using computer statistics packages and never (well, hardly ever) refer to the normal distribution or tables of any sort.

-- Paul Velleman

P.S. In ActivStats, our multimedia materials for the introductory statistics course, we do not state any arbitrary rule for df and instead advise students to follow the advice I give here. It is what statisticians do in practice.
