1 Population and Sample Proportion

Consider categorical data for a population of size $N$. If $M$ individuals from the population belong to a certain group, we say that the proportion of the population that belongs to this group is $p = M/N$.

Now suppose that a sample of size $n$ is randomly selected and $m$ individuals from the sample belong to the group in question. We say that the proportion of the sample that belongs to this group is $\bar{p} = m/n$. The sample proportion may or may not equal the population proportion.
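The two definitions can be illustrated with a minimal sketch. The population below (20 individuals, 8 of them in the group), the sample size of 5, and the helper name `proportion` are all made up for illustration; the seed is fixed only so that the run is repeatable.

```python
import random
from fractions import Fraction


def proportion(group_flags):
    """Fraction of True entries in a non-empty sequence of booleans."""
    return Fraction(sum(group_flags), len(group_flags))


# Hypothetical population: N = 20 individuals, M = 8 of them in the group.
population = [True] * 8 + [False] * 12
p = proportion(population)          # population proportion M/N

rng = random.Random(0)              # fixed seed, only for repeatability
sample = rng.sample(population, 5)  # simple random sample, n = 5, without replacement
p_bar = proportion(sample)          # sample proportion m/n

print("population proportion:", p)
print("sample proportion:    ", p_bar)
```

A different seed will generally give a different sample proportion, while the population proportion stays fixed.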

Since $\bar{p}$ was obtained through a random process, it is a random variable. Therefore, it has a set of possible values, a probability distribution, an expected value or mean, a variance, and a standard deviation. Since $\bar{p}$ represents a proportion, its set of possible values is limited to the interval between 0 and 1. We let $\mu_{\bar{p}}$ denote the mean of $\bar{p}$ and we let $\sigma_{\bar{p}}$ denote the standard deviation of $\bar{p}$.

It turns out that the mean and standard deviation of the sample proportion are related to the population

proportion in the following way:

$$\mu_{\bar{p}} = p$$

That is, the mean or expected value of the sample proportion is the same as the population proportion.

Notice that this does not depend on the sample size or the population size.

$$\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \; \underbrace{\sqrt{\frac{N-n}{N-1}}}_{\text{FPCF}}$$

The finite population correction factor (FPCF) appears again. We can ignore it in the same three cases that we did when considering the sample mean. Observe that, as the sample size $n$ increases, the standard deviation of the sample proportion gets smaller. That is, as the sample size increases, the sample proportion becomes more likely to be close to the population proportion.
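The formula for the standard deviation can be sketched as a small function. The name `sd_sample_proportion` and the convention of passing `N=None` to leave out the correction factor are assumptions made for this illustration, not part of the notes.

```python
import math


def sd_sample_proportion(p, n, N=None):
    """Standard deviation of the sample proportion.

    p: population proportion, n: sample size, N: population size.
    If N is None, the finite population correction factor is omitted.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    sd = math.sqrt(p * (1 - p) / n)
    if N is not None:
        if N < 2 or n > N:
            raise ValueError("need N >= 2 and n <= N")
        sd *= math.sqrt((N - n) / (N - 1))  # FPCF
    return sd


# The standard deviation shrinks as n grows (hypothetical p = 0.4, N = 1000).
for n in (10, 100, 1000):
    print(n, sd_sample_proportion(0.4, n, N=1000))
```

When the sample is the whole population (`n == N`), the correction factor is zero, so the sample proportion equals the population proportion with certainty.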

Notice that we have not said anything about the distribution of $\bar{p}$ so far other than its mean and standard deviation. For all we know at this point, it could follow a normal distribution, or a uniform distribution, or any distribution really. We will give a more precise description of the distribution of $\bar{p}$ later.

As an example, suppose that a family has five people, A, B, C, D, and E. A and D are women and B, C, and E are men. This is our population data. The proportion of the population that is male is $p = 3/5$.

Now suppose that we obtain a simple random sample of 2 people from the family, without replacement. That is, the sample must consist of 2 different people. From Lecture 7, we know that there are ${}_5C_2 = \frac{5 \cdot 4}{2 \cdot 1} = 10$ possible ways of doing this. Each pair of people is equally likely to occur, with probability 1/10. For each different sample, we will get a (perhaps) different value for $\bar{p}$, the proportion of men in the sample. For example, if the sample consists of people A and B, then $\bar{p}$ is 1/2. We can then fill in the rest of the table below.

sample    $\bar{p}$
A,B       1/2
A,C       1/2
A,D       0/2
A,E       1/2
B,C       2/2
B,D       1/2
B,E       2/2
C,D       1/2
C,E       2/2
D,E       1/2

In the second column, we see all the possible values of $\bar{p}$. The probability distribution of $\bar{p}$ is:

$k$       $P(\bar{p} = k)$
0/2       1/10
1/2       6/10
2/2       3/10
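The family example is small enough to check by brute force. The sketch below lists all 10 samples, rebuilds the probability distribution of the sample proportion, and compares its mean and standard deviation with the two formulas given earlier. The variable names are chosen for this illustration only.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import combinations

family = ["A", "B", "C", "D", "E"]
men = {"B", "C", "E"}
N, n = len(family), 2
p = Fraction(len(men), N)  # population proportion of men, 3/5

# All equally likely samples of 2 different people.
samples = list(combinations(family, n))

# Count how many samples give each value of the sample proportion.
counts = Counter(
    Fraction(sum(person in men for person in s), n) for s in samples
)
dist = {value: Fraction(c, len(samples)) for value, c in counts.items()}

mean = sum(v * pr for v, pr in dist.items())
variance = sum((v - mean) ** 2 * pr for v, pr in dist.items())
sd = math.sqrt(variance)

# Standard deviation from the formula, including the FPCF.
formula_sd = math.sqrt(p * (1 - p) / n) * math.sqrt((N - n) / (N - 1))

for value in sorted(dist):
    print(value, dist[value])
print("mean:", mean, "population proportion:", p)
print("sd from distribution:", sd, "sd from formula:", formula_sd)
```

Exact fractions are used for the distribution and the mean so that the equality of the mean with $p$ is not blurred by rounding; only the square roots are computed in floating point.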


2 Central Limit Theorem for Proportion

3 Main Ideas for Confidence Intervals for Proportion


6 Sample Size Determination