Understanding the Binomial Distribution as a Sampling Distribution, Study notes of Data Analysis & Statistical Methods

These notes present the binomial distribution as a sampling distribution, derived from a sequence of Bernoulli trials with a fixed number of trials. They cover the binomial probability formula, mean, and variance, and provide histograms of binomial distributions for different sample sizes. They also discuss the relationship between the binomial distribution and statistical inference, and the Central Limit Theorem.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

koofers-user-eskvay2b3r
Sampling Distributions

You have seen probability distributions of various types. The normal distribution is an example of a continuous distribution that is often used for quantitative measures such as weights, heights, etc. The binomial distribution is an example of a discrete distribution because the possible outcomes are only a “discrete” set of values, 0, 1, … , n. The value of the binomial random variable is the number of “successes” out of a random sample of n trials, in which the probability of success on a particular trial is π.

The Binomial Distribution as a Sampling Distribution

The nature of the binomial distribution makes it a sampling distribution. In other words, the value of the binomial random variable is derived from a sample of size n from a population. You can think of the population as being a set containing 0’s and 1’s, with a 1 representing a “success” and 0 representing a “failure.” (The term “success” does not necessarily imply something good, nor does “failure” imply something bad.) The sample is obtained as a sequence of n trials, with each trial being a draw from the population of 0’s and 1’s. (Such trials are called “Bernoulli” trials.)

On the first trial, you draw a value from {0, 1}, with P(1) = π. Then you draw again in the same way, and do this repeatedly for a total of n trials. For n = 6, your sample might be {1, 1, 0, 1, 0, 1}. This sample has four 1’s and two 0’s, so the value of the binomial random variable is y = 4. The value of the binomial random variable is the sum of the outcomes from the trials; that is, the number of 1’s.
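The drawing procedure just described can be sketched in a few lines of Python. This is only an illustration: the function names, the seed, and the value π = 0.6 used for the random draw are assumptions made for the example, not part of the notes.

```python
import random

def bernoulli_sample(n, pi, rng):
    """Draw n values from {0, 1}, each equal to 1 with probability pi."""
    return [1 if rng.random() < pi else 0 for _ in range(n)]

def binomial_value(sample):
    """The binomial random variable is the sum of the 0/1 outcomes."""
    return sum(sample)

# The sample quoted in the text, for n = 6: four 1's, so y = 4.
example = [1, 1, 0, 1, 0, 1]
y_example = binomial_value(example)

# A freshly drawn sample; the seed and pi = 0.6 are arbitrary choices.
rng = random.Random(0)
drawn = bernoulli_sample(6, 0.6, rng)
y_drawn = binomial_value(drawn)
print(example, "->", y_example)
print(drawn, "->", y_drawn)
```

Each call to `bernoulli_sample` plays the role of one sample of n Bernoulli trials, and `binomial_value` turns that sample into a single observed value of y.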

The Binomial Distribution as a Sampling Distribution

Consider a population that consists of 60% 1’s and 40% 0’s. If you draw a value at random, you get a 1 with probability .6 and a 0 with probability .4. Such a draw would constitute a Bernoulli trial with P(1) = .6.

Suppose you draw a sample of size n and add up the values you obtain. This gives you one value of a binomial random variable. Now suppose you do this again and again for a very large number of samples. The resulting values make up a conceptual population. Here are the histograms that correspond to this population for values of n equal to 1, 2, 3, 6, 10, and 20.

Notice how the shapes of the histograms change as n increases. The distributions become more “mound-shaped” and symmetric.
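As a rough stand-in for the histograms, the sketch below computes the B(n, .6) probabilities for the same values of n and prints a text bar chart for each. The helper names and the skewness measure (the third standardized moment, used here only to put a number on "more symmetric") are choices made for this illustration.

```python
from math import comb

def binomial_pmf(n, pi):
    """Return [P(y = 0), ..., P(y = n)] for a B(n, pi) distribution."""
    return [comb(n, k) * pi**k * (1 - pi)**(n - k) for k in range(n + 1)]

def skewness(pmf):
    """Third standardized moment of a distribution on 0, 1, ..., n."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    third = sum((k - mean) ** 3 * p for k, p in enumerate(pmf))
    return third / var ** 1.5

PI = 0.6
for n in (1, 2, 3, 6, 10, 20):
    pmf = binomial_pmf(n, PI)
    print(f"B({n}, {PI}), skewness = {skewness(pmf):.3f}")
    for k, p in enumerate(pmf):
        # One '#' per 2% of probability, as a rough text histogram
        print(f"  {k:2d} {p:.3f} {'#' * round(p * 50)}")
```

The printed skewness moves toward 0 as n grows, which is one way to see the distributions becoming more symmetric.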

[Figure: histograms of the B(1, .6), B(2, .6), and B(3, .6) distributions. Horizontal axis: Successes; vertical axis: Relative Frequency.]

The Binomial Sampling Distribution and Statistical Inference

Here are the values of P(y = k) and P(y ≤ k) for n = 20 and k = 6, … , 16 (P(y = k) < .005 for k < 6 or k > 17):

k            6     7     8     9    10    11    12    13    14    15    16
P(y = k)   .005  .015  .035  .071  .117  .160  .180  .165  .124  .075  .035
P(y ≤ k)   .006  .021  .057  .128  .245  .404  .584  .750  .874  .949  .984

Remember that these are the probabilities for π=.6.

Suppose you are sampling from a distribution with unknown π.

If you drew a sample with y = 12, would you have reason to doubt that π = .6? (Note: y/n = .6)

If you drew a sample with y = 10, would you have reason to doubt that π = .6? (Note: y/n = .5)

If you drew a sample with y = 6, would you have reason to doubt that π = .6? (Note: y/n = .3)
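One informal way to weigh these questions is to ask how probable it would be to see a value of y this small or smaller if π really were .6. The sketch below computes P(y ≤ k) exactly for n = 20; the function name is hypothetical, and using the lower tail probability as the yardstick is an assumption of this example rather than a rule stated in the notes.

```python
from math import comb

def binomial_cdf(k, n, pi):
    """P(y <= k) for y ~ B(n, pi), summed directly from the binomial formula."""
    return sum(comb(n, j) * pi**j * (1 - pi)**(n - j) for j in range(k + 1))

n, pi = 20, 0.6
tail = {y: binomial_cdf(y, n, pi) for y in (12, 10, 6)}
for y, p in tail.items():
    print(f"y = {y:2d}, y/n = {y / n:.1f}, P(y <= {y}) = {p:.3f}")
```

The results agree with the P(y ≤ k) values tabulated above for k = 12, 10, and 6. A value of y as small as 6 would be very unusual if π = .6, whereas y = 10 would not be.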

The Binomial Sampling Distribution and Statistical Inference

Here are the numeric values of P(y = k) and P(y ≤ k) for n = 10, to 3 decimal places:

k            0     1     2     3     4     5     6     7     8     9    10
P(y = k)   .000  .002  .011  .042  .111  .201  .251  .215  .121  .040  .006
P(y ≤ k)   .000  .002  .012  .055  .166  .367  .618  .833  .954  .994  1.000

Remember that these are the probabilities for π=.6.

Suppose you are sampling from a distribution with unknown π.

If you drew a sample with y = 6, would you have reason to doubt that π = .6? (Note: y/n = .6)

If you drew a sample with y = 5, would you have reason to doubt that π = .6? (Note: y/n = .5)

If you drew a sample with y = 3, would you have reason to doubt that π = .6? (Note: y/n = .3)
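The questions for n = 10 use the same sample proportions y/n as the questions for n = 20, so it can help to put the two sample sizes side by side. The sketch below does this with exact binomial probabilities; the function name and the pairing of cases are choices made for this illustration.

```python
from math import comb

def binomial_cdf(k, n, pi):
    """P(y <= k) for y ~ B(n, pi)."""
    return sum(comb(n, j) * pi**j * (1 - pi)**(n - j) for j in range(k + 1))

PI = 0.6
# (n, y) pairs with the same proportion y/n at each sample size
cases = [(10, 6), (20, 12), (10, 5), (20, 10), (10, 3), (20, 6)]
results = {(n, y): binomial_cdf(y, n, PI) for n, y in cases}
for (n, y), p in results.items():
    print(f"n = {n:2d}, y = {y:2d}, y/n = {y / n:.1f}, P(y <= {y}) = {p:.3f}")
```

For the same proportion y/n = .3, the lower tail probability is much smaller with n = 20 than with n = 10, so the larger sample gives stronger grounds for doubting π = .6.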

Normal Distributions and Sampling Distributions

If the distribution from which the samples were obtained is not normal, then the sampling distribution of the sample mean ȳ is only approximately normal, but the distribution becomes more nearly normal as n increases. This is the Central Limit Theorem.

Central Limit Theorem: If y is a random variable with mean μ and variance σ^2, then the sampling distribution of the mean ȳ of a random sample of size n is approximately normal with mean μ and variance σ^2/n.
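To connect the theorem to the binomial case: a single Bernoulli draw has mean π and variance π(1 − π), so the sample proportion y/n is a sample mean with approximate mean π and variance π(1 − π)/n. The sketch below compares exact binomial probabilities with this normal approximation. The function names are hypothetical, and the half-unit continuity correction is an addition for this example that the notes do not mention.

```python
from math import comb, erf, sqrt

def binomial_cdf(k, n, pi):
    """Exact P(y <= k) for y ~ B(n, pi)."""
    return sum(comb(n, j) * pi**j * (1 - pi)**(n - j) for j in range(k + 1))

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

def clt_approx_cdf(k, n, pi):
    """Approximate P(y <= k) by treating y/n as normal with
    mean pi and variance pi * (1 - pi) / n (continuity-corrected)."""
    sd = sqrt(pi * (1 - pi) / n)
    return normal_cdf((k + 0.5) / n, pi, sd)

PI = 0.6
for n in (10, 20, 100):
    k = n // 2  # compare at y/n = .5
    exact = binomial_cdf(k, n, PI)
    approx = clt_approx_cdf(k, n, PI)
    print(f"n = {n:3d}, k = {k:3d}: exact = {exact:.4f}, normal approx = {approx:.4f}")
```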