Understanding Variability of Statistic Estimates: Sampling Distribution, Papers of Statistics

These notes introduce the sampling distribution of a sample mean and how it is used to estimate a population mean with varying degrees of accuracy. They cover the law of large numbers, the difference between the population mean and the sample mean, and the effect of sample size on the accuracy of the estimate. They also touch on the central limit theorem and its role in statistical inference.

Typology: Papers

Pre 2010

Uploaded on 09/02/2009

koofers-user-fe0



1

Chapter 4: Probability and Sampling Distributions

2

Random Variable

• Definition: A random variable is a variable whose value is a numerical outcome of a random phenomenon.

• The statistic calculated from a randomly chosen sample is an example of a random variable.

• A statistic from a random sample will take different values if we take more samples from the same population.

3

Section 4.4

The Sampling Distribution of a Sample Mean

4

Sampling Distributions

• In this class we will focus on three sampling distributions:
  • The sampling distribution of the sample mean, x-bar.
  • The sampling distribution of the sample proportion, p-hat.
  • The sampling distribution of the sample slope, b1.

• We start with the sampling distribution of x-bar.

5

Sampling Distribution of x-bar

• To get the sampling distribution of x-bar we must investigate three things:
  • The center of the distribution
  • The spread of the distribution
  • The shape of the distribution

• We start with the center.

6

Example

• Suppose that we are interested in the workout times of ISU students at the Recreation center.

• Let's assume that μ is the average workout time of all ISU students.

• To estimate μ, let's take a simple random sample (SRS) of 100 students at ISU.
  • We will record each student's workout time (xi).
  • Then we find the average workout time for the 100 students.
    • The population mean μ is the parameter of interest.
    • The sample mean, x-bar, is the statistic (which is a random variable).
    • Use x-bar to estimate μ (this seems like a sensible thing to do).
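The workout-time example can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the population of 20,000 workout times, its gamma shape, and the names `population` and `sample_mean` are hypothetical and do not come from the slides. It shows that two SRSs of 100 students give two different values of x-bar, both near μ.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: workout times in minutes for 20,000 students.
# The distribution and its parameters are invented for illustration only.
population = [random.gammavariate(4.0, 12.0) for _ in range(20_000)]
mu = statistics.mean(population)  # the parameter (unknown in practice)

def sample_mean(pop, n):
    """Return x-bar for one simple random sample of size n (no replacement)."""
    return statistics.mean(random.sample(pop, n))

x_bar_1 = sample_mean(population, 100)
x_bar_2 = sample_mean(population, 100)

print(f"mu                 = {mu:.2f}")
print(f"first SRS,  x-bar  = {x_bar_1:.2f}")
print(f"second SRS, x-bar  = {x_bar_2:.2f}")
```

Because x-bar depends on which 100 students happen to be drawn, it behaves as a random variable, while μ stays fixed.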

7

Example

• An SRS should be a fairly good representation of the population, so x-bar should be somewhere near μ.

• x-bar from an SRS is an unbiased estimate of μ.

• We don't expect x-bar to be exactly equal to μ.

• There is variability in x-bar from sample to sample.

• If we take another simple random sample (SRS) of 100 students, then x-bar will probably be different.

8

Statistical Estimation

• If x-bar is rarely exactly right and varies from sample to sample, why is it a reasonable estimate of the population mean μ?
  • Answer: if we keep on taking larger and larger samples, the statistic x-bar is guaranteed to get closer and closer to the parameter μ.

• We have the comfort of knowing that if we can afford to keep on measuring more subjects, eventually we will estimate the mean workout time for all ISU students very accurately.

9

The Law of Large Numbers

• Law of Large Numbers (LLN):
  • Draw independent observations at random from any population with finite mean μ.
  • As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to the mean μ of the population.
  • If n is the sample size, $\bar{x} \to \mu$ as n gets large.

• The Law of Large Numbers holds for any population, not just for special classes such as Normal distributions.

10

Example

• Suppose we have a bowl with 21 small pieces of paper inside. Each paper is labeled with a number from 0 to 20. We will draw several random samples of size n out of the bowl and record the sample mean, x-bar, for each sample.

• What is the population?

• Since we know the values for each individual in the population (i.e. for each paper in the bowl), we can actually calculate the value of μ, the true population mean: μ = 10.

• Draw a random sample of size n = 1. Calculate x-bar for this sample.
  • x-bar =

11

Example

• Draw a second random sample of size n = 5. Calculate x-bar for this sample.

• Draw a third random sample of size n = 10. Calculate x-bar for this sample.

• Draw a fourth random sample of size n = 15. Calculate x-bar for this sample.

• Draw a fifth random sample of size n = 20. Calculate x-bar for this sample.

What can we conclude about the value of x-bar as the sample size increases?

THIS IS CALLED THE LAW OF LARGE NUMBERS.
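The bowl exercise can be mimicked with a short simulation. This is a sketch, assuming the papers are drawn without replacement (the slides do not say which way the draw is made). The names `bowl` and `draw_x_bar` are illustrative.

```python
import random
import statistics

random.seed(4)  # arbitrary seed, used only for reproducibility

bowl = list(range(21))       # papers labeled 0, 1, ..., 20
mu = statistics.mean(bowl)   # true population mean, equal to 10

def draw_x_bar(n):
    """Return x-bar for n papers drawn from the bowl without replacement."""
    return statistics.mean(random.sample(bowl, n))

for n in (1, 5, 10, 15, 20):
    print(f"n = {n:2d}: x-bar = {draw_x_bar(n):.2f}   (mu = {mu})")

# Drawing all 21 papers leaves no sampling variability at all.
full_bowl_mean = draw_x_bar(21)
```

A single run will not necessarily move closer to 10 at every step, but the range of values x-bar can take shrinks as n grows: with n = 20 it must lie between 9.5 and 10.5, and with all 21 papers it equals μ exactly.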

12

Another Example

Example: If we roll a pair of dice and sum the number of dots showing, the long-run average of the sums is 7.
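As a quick check of the dice claim, the sketch below first enumerates all 36 equally likely outcomes to get the exact mean, then simulates many rolls to watch the observed average settle near it. It assumes two fair six-sided dice. The variable names are illustrative.

```python
import random
from fractions import Fraction
from itertools import product

# Exact mean: average the sum over all 36 equally likely (die1, die2) pairs.
sums = [a + b for a, b in product(range(1, 7), repeat=2)]
exact_mean = Fraction(sum(sums), len(sums))
print(exact_mean)  # 7

# Simulated mean: average of the sums from many rolls of two fair dice.
random.seed(0)
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(10_000)]
observed_mean = sum(rolls) / len(rolls)
print(round(observed_mean, 3))
```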

 Go to applet

19

How Large is a Large Number?

• The Law of Large Numbers says that the actual mean outcome of many trials gets close to the distribution mean μ as more trials are made.

• It doesn't say how many trials are needed to guarantee a mean outcome close to μ.
  • That depends on the variability of the random outcomes.
  • The more variable the outcomes, the more trials are needed to ensure that the mean outcome x-bar is close to the distribution mean μ.

20

More Laws of Large Numbers

• The Law of Large Numbers is one of the central facts about probability.
  • It assures us that statistical estimation will be accurate if we can afford enough observations.

• The basic Law of Large Numbers applies to independent observations that all have the same distribution.
  • Mathematicians have extended the law to many more general settings.

21

Sample Statistic Facts

• Some sample statistic facts:
  • A statistic varies from sample to sample.
  • A statistic almost always differs from the parameter.
  • A statistic approaches the parameter as sample size increases.

• Question:
  • How do we investigate the behavior of the statistic?

22

Questions about x-bar

• How well does x-bar (statistic) estimate μ (parameter)?

• Does x-bar vary about μ?

• How much could x-bar differ from μ?

• How much could x-bar vary from sample to sample?

• Can we compute probabilities on x-bar?

• The sampling distribution of x-bar will answer all these questions.

23

Sampling Distribution

• Recall:
  • Theoretical sampling distribution of x-bar:
    • The distribution of all x-bar values from all possible samples of the same size from the same population.

• Can we take all possible samples?
  • Maybe. But if the population size is 100 and n = 10, then there are over 17,300,000,000,000 possible samples.

• What do we do then?
  • Take many, many SRSs.
  • Compute x-bar for each.
  • Approximate the theoretical sampling distribution of x-bar.
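The count of possible samples and the "take many, many SRSs" recipe can both be sketched in Python. The population of 100 values below is hypothetical, and 5,000 repetitions is an arbitrary choice; neither comes from the slides.

```python
import math
import random
import statistics

# Number of distinct samples of size 10 from a population of 100.
n_possible = math.comb(100, 10)
print(f"{n_possible:,}")  # 17,310,309,456,440

random.seed(7)

# Hypothetical population of 100 values, invented for illustration.
population = [random.uniform(0, 60) for _ in range(100)]
mu = statistics.mean(population)

# Approximate the sampling distribution of x-bar with many SRSs of size 10.
x_bars = [statistics.mean(random.sample(population, 10)) for _ in range(5_000)]
approx_center = statistics.mean(x_bars)
approx_spread = statistics.stdev(x_bars)

print(f"mu = {mu:.2f}, center of simulated x-bars = {approx_center:.2f}")
print(f"spread (sd) of simulated x-bars = {approx_spread:.2f}")
```

A histogram of `x_bars` would give the approximate shape; the center and spread printed here are the other two features the slides ask about.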

24

Sampling Distributions

• The idea of a sampling distribution is the foundation of statistical inference.
  • The laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.

25

Mean and Standard Deviation of a Sample Mean

• Suppose that x-bar is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ.
  • The mean of the sampling distribution of x-bar is μ.
  • The standard deviation of the sampling distribution of x-bar is $\sigma/\sqrt{n}$.

• In short: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, where n is the sample size.
  • Notice: averages are less variable than individual observations!
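These two facts can be checked by simulation. The sketch below assumes a Normal population with μ = 50 and σ = 10 and samples of size n = 25; these values are hypothetical, and independent draws stand in for an SRS from a large population.

```python
import math
import random
import statistics

random.seed(11)

mu, sigma, n = 50.0, 10.0, 25  # hypothetical population values and sample size

def one_x_bar():
    """Return the mean of n independent draws from the assumed population."""
    return statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])

x_bars = [one_x_bar() for _ in range(5_000)]

sim_mean = statistics.fmean(x_bars)   # should be close to mu
sim_sd = statistics.stdev(x_bars)     # should be close to sigma / sqrt(n)
theory_sd = sigma / math.sqrt(n)      # 10 / 5 = 2.0

print(f"mean of x-bars = {sim_mean:.3f}  (mu = {mu})")
print(f"sd of x-bars   = {sim_sd:.3f}  (sigma/sqrt(n) = {theory_sd})")
```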

26

• The mean of the statistic x-bar is always the same as the mean μ of the population.
  • The sampling distribution of x-bar is centered at μ.
  • In repeated sampling, x-bar will sometimes fall above the true value of the parameter μ and sometimes below, but there is no systematic tendency to overestimate or underestimate the parameter.
  • Because the mean of x-bar is equal to μ, we say that the statistic x-bar is an unbiased estimator of the parameter μ.

27

• An unbiased estimator is "correct on the average" in many samples.
  • How close the estimator falls to the parameter in most samples is determined by the spread of the sampling distribution.
  • If individual observations have standard deviation σ, then sample means x-bar from samples of size n have standard deviation $\sigma/\sqrt{n}$.

• Again, notice that averages are less variable than individual observations.

28

• Not only is the standard deviation of the distribution of x-bar smaller than the standard deviation of individual observations, but it gets smaller as we take larger samples.
  • The results of large samples are less variable than the results of small samples when dealing with sample means.
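To see how quickly the spread shrinks, the sketch below tabulates $\sigma/\sqrt{n}$ for a few sample sizes. The value σ = 7 and the helper name `sd_of_mean` are illustrative choices.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of x-bar for samples of size n: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 7.0  # illustrative population standard deviation

for n in (1, 4, 16, 64, 256):
    print(f"n = {n:3d}: sd of x-bar = {sd_of_mean(sigma, n):.4f}")
```

Each fourfold increase in n only halves the standard deviation of x-bar, so gains in precision become more expensive as samples grow.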

29

• If n is large, the standard deviation of x-bar is small and almost all samples will give values of x-bar that lie very close to the true parameter μ.
  • The sample mean from a large sample can be trusted to estimate the population mean accurately.

30

Example

• Suppose we take samples of size 15 from a distribution with mean 25 and standard deviation 7.

• The distribution of x-bar has:
  • A mean of: $\mu_{\bar{x}} = \mu = 25$
  • And a standard deviation of: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 7/\sqrt{15} \approx 1.81$

37

How Large a Sample is Needed?

• How large n must be for the distribution of x-bar to be approximately Normal depends on how close to Normal the population distribution is.
  • More observations are required if the shape of the population distribution is far from Normal.

38

Example

• The time X that a technician requires to perform preventive maintenance on an air-conditioning unit is governed by a right-skewed distribution (see figure 4.17 (a)) with mean time μ = 1 hour and standard deviation σ = 1 hour.

• Your company operates 70 of these units.

• The distribution of the mean time your company spends on preventive maintenance is approximately:

$N(\mu, \sigma/\sqrt{n}) = N(1, 1/\sqrt{70}) = N(1, 0.12)$

39

Example

• What is the probability that the average maintenance time for your company's units exceeds 50 minutes?
  • 50/60 = 0.83 hour
  • So we want to know P(x-bar > 0.83).
  • Use the Normal distribution calculations we learned in Chp 2!

$P(\bar{x} > 0.83) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} > \frac{0.83 - 1}{0.12}\right) = P(z > -1.42) = 1 - P(z < -1.42) = 1 - 0.0778 = 0.9222$
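The same probability can be computed without a table using Python's standard library. This is a sketch that relies on the Normal approximation for x-bar with μ = 1, σ = 1 and n = 70 as stated in the example. Because it does not round 50/60 or $1/\sqrt{70}$ along the way, the result comes out near 0.92 rather than exactly matching the table-based 0.9222.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.0, 1.0, 70  # hours; values taken from the example

# Approximate sampling distribution of x-bar: N(mu, sigma / sqrt(n)).
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))

threshold = 50 / 60  # 50 minutes expressed in hours
p_exceeds = 1 - x_bar_dist.cdf(threshold)

print(f"sd of x-bar      = {x_bar_dist.stdev:.4f}")
print(f"P(x-bar > 50 min) = {p_exceeds:.4f}")
```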

40

4.86 ACT scores

• The scores of students on the ACT college entrance examination in a recent year had the Normal distribution with mean μ = 18.6 and standard deviation σ = 5.9.

41

• What is the probability that a single student randomly chosen from all those taking the test scores 21 or higher?

$P(x \ge 21) = P\left(\frac{x - \mu}{\sigma} \ge \frac{21 - 18.6}{5.9}\right) = P(z \ge 0.4068) = 1 - P(z < 0.41) = 1 - 0.6591 = 0.3409$

42

• About 34% of students (from this population) scored a 21 or higher on the ACT.

• The probability that a single student randomly chosen from this population would have a score of 21 or higher is 0.3409.

43

• Now take an SRS of 50 students who took the test. What are the mean and standard deviation of the sample mean score x-bar of these 50 students?
  • Mean = 18.6 [same as μ]
  • Standard Deviation = 0.8344 [$\sigma/\sqrt{50} = 5.9/\sqrt{50}$]

44

• What is the probability that the mean score x-bar of these students is 21 or higher?

$P(\bar{x} \ge 21) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \ge \frac{21 - 18.6}{0.834}\right) = P(z \ge 2.8778) = 1 - P(z < 2.88) = 1 - 0.9980 = 0.002$

45

• About 0.2% of all random samples of size 50 (from this population) would have a mean score x-bar of 21 or higher.

• The probability of having a mean score x-bar of 21 or higher from a sample of 50 students (from this population) is 0.002.
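The contrast between one student and the mean of 50 students can be reproduced with a short sketch. It assumes σ = 5.9, a value inferred from the z-scores and the 0.8344 figure in this example, and it skips the table rounding, so the first probability comes out near 0.34 and the second near 0.002.

```python
import math
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50  # sigma = 5.9 is inferred from the worked z-scores

# One randomly chosen student: scores follow N(mu, sigma).
p_single = 1 - NormalDist(mu, sigma).cdf(21)

# Mean of an SRS of 50 students: x-bar follows N(mu, sigma / sqrt(n)).
sd_x_bar = sigma / math.sqrt(n)
p_mean = 1 - NormalDist(mu, sd_x_bar).cdf(21)

print(f"P(single score >= 21) = {p_single:.4f}")
print(f"sd of x-bar           = {sd_x_bar:.4f}")
print(f"P(x-bar >= 21)        = {p_mean:.4f}")
```

The same cutoff of 21 is unremarkable for an individual but very unusual for an average of 50, because the sampling distribution of x-bar is about seven times narrower.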

46

Section 4.4 Summary

• When we want information about the population mean μ for some variable, we often take an SRS and use the sample mean x-bar to estimate the unknown parameter μ.

47

• The Law of Large Numbers states that the actually observed mean outcome x-bar must approach the mean μ of the population as the number of observations increases.

48

• The sampling distribution of x-bar describes how the statistic x-bar varies in all possible samples of the same size from the same population.