Population and Sample, Exams of Design

Typology: Exams

2022/2023

Uploaded on 03/01/2023

sohail
Chapter 8. Producing data: Sampling
Population and Sample
The population in a statistical study is the entire group of
individuals about which we want information.
A sample is the part of the population from which we actually
collect information. We use information from a sample to draw
conclusions about the entire population.
[Diagram: Population → Sample. Collect data from a representative sample, then make an inference about the population.]

Ex. Describe the population and the sample. For each of the following situations, describe the population and the sample.

(a) A survey of 17,096 students in U.S. four-year colleges reported that 19.4% were binge drinkers.
Population: students in U.S. four-year colleges. Sample: the 17,096 students surveyed.

(b) In a study of work stress, 100 female restaurant workers were asked about the impact of work stress on their personal lives.
Population: female restaurant workers. Sample: the 100 workers who were asked.

(c) A tract of forest has 584 longleaf pine trees. The diameters of 40 of these trees were measured.
Population: the 584 longleaf pine trees in the tract. Sample: the 40 trees that were measured.

Sample Survey

  • The first step in planning a sample survey is to say exactly what population we want to describe.
  • The second step is to say exactly what we want to measure, that is, to give exact definitions of our variables.
  • The final step in planning a sample survey is the sampling design. A sampling design is a specific method for choosing a sample from the population.

How to Sample Badly

The design of a sample is biased if it systematically favors certain outcomes.

A voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.

Choosing individuals who are easiest to reach results in a convenience sample.

Ex. A sample of mall shoppers is fast and cheap. But people at shopping malls tend to be more prosperous than typical Americans. They are also more likely to be teenagers or retired. Moreover, unless interviewers are carefully trained, they tend to question well-dressed, respectable-looking people and avoid poorly dressed or tough-looking individuals. In short, mall interviews will not contact a sample that is representative of the entire population. Interviews at shopping malls will almost always overrepresent middle-class and retired people and underrepresent the poor. This is bias: the outcomes of mall surveys will repeatedly miss the truth about the population in the same ways.

Suppose, for example, that a news show asks viewers to participate in an online poll. This would be a voluntary response sample. The sample is chosen by the viewers, not by the survey administrator. Voluntary response samples are always biased: they include only people who choose to volunteer, whereas a random sample would need to include people whether or not they choose to volunteer. Often, voluntary response samples oversample people who have strong opinions and undersample people who don't care much about the topic of the survey.
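The effect of voluntary response can be sketched with a small simulation. Everything in it is a made-up assumption for illustration: a population of 10,000 viewers in which 20% oppose some proposal, opponents who respond to the poll 30% of the time, and supporters who respond 3% of the time. The function name `simulate_poll` is hypothetical.

```python
import random

def simulate_poll(seed=0):
    """Compare a voluntary response poll with an SRS on a made-up population."""
    rng = random.Random(seed)

    # Hypothetical population: True means the person opposes the proposal.
    population = [True] * 2000 + [False] * 8000
    true_share = sum(population) / len(population)

    # Voluntary response: people select themselves. Assumed response rates:
    # 30% for opponents (strong opinions), 3% for supporters.
    volunteers = [
        opposes for opposes in population
        if rng.random() < (0.30 if opposes else 0.03)
    ]
    voluntary_share = sum(volunteers) / len(volunteers)

    # Simple random sample of 500: chance, not the respondent, does the choosing.
    srs = rng.sample(population, 500)
    srs_share = sum(srs) / len(srs)

    return true_share, voluntary_share, srs_share

true_share, voluntary_share, srs_share = simulate_poll(seed=1)
print(f"true share opposed:      {true_share:.2f}")
print(f"voluntary response poll: {voluntary_share:.2f}")
print(f"simple random sample:    {srs_share:.2f}")
```

Under these assumed response rates the voluntary poll reports a large majority opposed even though only 20% of the population is, while the random sample lands near the true share.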

Simple Random Samples

Random sampling, the use of chance to select a sample, is the central principle of statistical sampling.

A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.

In practice, people use random numbers generated by a computer or calculator to choose samples. If you don’t have technology handy, you can use a table of random digits.
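As a minimal sketch of choosing an SRS with computer-generated random numbers, Python's standard `random.sample` draws n distinct individuals so that every set of n is equally likely. The population of 200 student labels and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: n distinct individuals chosen by chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: 200 students labeled student_001 .. student_200.
students = [f"student_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(students, 10, seed=8)
print(sample)
```

Passing a seed is optional. It is used here so the same sample can be reproduced when checking the work.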

How to Choose an SRS

A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:

  • Each entry in the table is equally likely to be any of the 10 digits.
  • The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.

Step 1: Label. Give each member of the population a numerical label of the same length.

Step 2: Table. To choose an SRS, read from Table B successive groups of digits of the length you used as labels.

Your sample contains the individuals whose labels you find.
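The two steps above can be sketched in code. The digit string below is made up and is not taken from Table B, and the 12-item population is hypothetical. The sketch also assumes the usual convention, not spelled out in these slides, that groups matching no label and labels already chosen are skipped.

```python
def srs_from_digit_table(digits, population, n):
    """Choose an SRS of size n by reading labels from a string of random digits."""
    # Step 1: Label. Every individual gets a label of the same length
    # (here 01, 02, ..., assuming labels start at 1).
    label_len = len(str(len(population)))
    labels = {
        str(i + 1).zfill(label_len): item
        for i, item in enumerate(population)
    }

    # Step 2: Table. Read successive groups of label_len digits,
    # ignoring the spaces that only make the table easier to read.
    clean = [ch for ch in digits if ch.isdigit()]
    chosen_labels = []
    for start in range(0, len(clean) - label_len + 1, label_len):
        group = "".join(clean[start:start + label_len])
        # Assumed convention: skip unused labels and repeats.
        if group in labels and group not in chosen_labels:
            chosen_labels.append(group)
            if len(chosen_labels) == n:
                return [labels[label] for label in chosen_labels]

    raise ValueError("ran out of digits before the sample was complete")

# Hypothetical population of 12 items, labeled 01..12.
population = list("ABCDEFGHIJKL")
# Made-up digits; the groups read are 07, 45, 11, 07, 93, 02, 60.
print(srs_from_digit_table("07451 10793 0260", population, 3))
# -> ['G', 'K', 'B']
```

Groups 45, 93, and 60 match no label and the second 07 is a repeat, so the sample is the individuals labeled 07, 11, and 02.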

How to Choose an SRS Using Table B

Table B at the back of the book is a table of random digits. To make the table easier to read, the digits appear in groups of five and in numbered rows. The groups and rows have no meaning — the table is just a long list of randomly chosen digits.

The walk to your statistics class takes about 10 minutes, about the amount of time needed to listen to three songs on your iPod. You decide to take a simple random sample of three songs from a Billboard list of Rock Songs. [Song list not reproduced.] Select the three songs for your iPod using a simple random sample, reading from line 104 of Table B. [Table B excerpt not reproduced.]

Random sampling

Random sampling, the use of impersonal chance to avoid bias, is the basic idea of sampling design. Online polls and mall interviews also produce samples, but these samples are chosen in ways that invite bias. The first question to ask about any sample is whether it was chosen at random.

Why should we rely on random sampling?

1. To eliminate bias in selecting samples from the list of available individuals.

2. The laws of probability allow trustworthy inference about the population.

  • Results from random samples come with a margin of error that sets bounds on the size of the likely error.
  • Larger random samples give better information about the population than smaller samples.
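The claim that larger random samples give better information can be checked with a rough simulation. The sketch assumes a very large population in which half of the individuals have some characteristic (`true_p=0.5` is a made-up value), draws many random samples of each size, and measures how much the sample proportions vary. The function name `sample_proportions` is hypothetical.

```python
import random
import statistics

def sample_proportions(true_p, n, repetitions, seed=0):
    """Simulate many random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    results = []
    for _rep in range(repetitions):
        # Each individual independently has the characteristic with
        # probability true_p (assumes a very large population).
        successes = sum(1 for _draw in range(n) if rng.random() < true_p)
        results.append(successes / n)
    return results

for n in (100, 1000):
    props = sample_proportions(true_p=0.5, n=n, repetitions=500, seed=1)
    print(
        f"n = {n:4d}  "
        f"mean of sample proportions = {statistics.mean(props):.3f}  "
        f"spread (std. dev.) = {statistics.stdev(props):.3f}"
    )
```

Both sample sizes center on the true proportion, which is what unbiased means here. The spread of the results is noticeably smaller for n = 1000, which is why the margin of error shrinks as the sample grows.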

Random sampling requires an accurate and complete list of the population. Because such a list is rarely available, most samples suffer from some degree of undercoverage. Ex. A sample survey of households will miss not only homeless people but also prison inmates and students in dormitories. Ex. An opinion poll conducted by calling landline telephone numbers will miss households that have only cell phones as well as households without a phone. The results of national sample surveys therefore have some bias if the people not covered differ from the rest of the population.

A more serious source of bias in most sample surveys is nonresponse , which occurs when a selected individual cannot be contacted or refuses to cooperate. Ex. Nonresponse to sample surveys often exceeds 50%, even with careful planning and several callbacks. If the people contacted differ from those who are rarely at home or who refuse to answer questions, some bias remains. Most national sample surveys are carried out by telephone, using random digit dialing to choose residential telephone numbers at random. Call screening is increasing nonresponse to such surveys, and the rise of cell-phone-only households is increasing undercoverage.

The wording of questions (wording effects) is the most important influence on the answers given to a sample survey. Confusing or leading questions can introduce strong bias, and changes in wording can greatly change a survey’s outcome. Even the order in which questions are asked matters. Ex. Ask a sample of college students these two questions: “How happy are you with your life in general?” (answers on a scale of 1 to 5) and “How many dates did you have last month?” The correlation between answers is r = −0.012 when the questions are asked in this order. It appears that dating has little to do with happiness. Reverse the order of the questions, however, and r = 0.66. Asking a question that brings dating to mind makes dating success a big factor in happiness.

Cautions About Sample Surveys

Good sampling technique includes the art of reducing all sources of error.

Undercoverage occurs when some groups in the population are left out of the process of choosing the sample.

Nonresponse occurs when an individual chosen for the sample can’t be contacted or refuses to participate.

A systematic pattern of incorrect responses in a sample survey leads to response bias.

The wording of questions is the most important influence on the answers given to a sample survey.