Sampling Concepts and Techniques: Random vs. Non-Random Samples, Lecture notes of Designs and Groups

An excerpt from the International Program for Development Evaluation Training (2007). It discusses various sampling concepts, including populations, samples, sampling frames, sample designs, parameters, and statistics. The document also explains the differences between random and non-random samples and lists and defines the types of each. It emphasizes the importance of statistics in determining confidence and precision and provides guidelines for determining sample size.


IPDET

Handbook

Module 9

Sampling

Introduction

In the previous three modules, you have learned about writing evaluation questions, and choosing an evaluation design and data collection approach. In this module, you will look at sampling strategies to help you determine how much data you need to collect, and where to collect it so that you can answer your evaluation questions. This module has five topics. They are:

  • Introduction to Sampling
  • Sampling Glossary
  • Types of Samples: Random and Non-random
  • How Confident and Precise Do You Need to Be?
  • How Large a Sample Do You Need?


Module 9

Page 442 International Program for Development Evaluation Training − 2007

Learning Objectives

By the end of the module, you should be able to:

  • define sampling concepts, including: population, sample, sampling frame, sample design, parameter, statistic, and random sample
  • discuss the differences between random and non-random samples
  • list and define the types of random samples, including: simple, stratified, and cluster
  • list and define the types of non-random samples, including: quota, accidental, snowball, judgmental, and convenience
  • describe the role of statistics to determine confidence and precision
  • describe the guidelines to determine the size of sample that is needed.


Introduction to Sampling

When we begin planning our data collection strategy, we have to decide whether it is possible to collect data from the entire population we intend to study: every document, for example, or every farmer, or every mile of a road system. Can we review every file, observe every farmer, examine every road? If we can, we can then accurately report the qualifications of every teacher in our school system, the number of paved miles on all our roads, or the views of all the citizens of the country. If we collect all the data accurately and reliably, then there is little chance of error. The complete coverage of the population in question is called a census.

However, most often we are unable to collect data from every file, farmer, or person. It takes too much time and costs too much. Instead, we take a sample, a subset of the entire population. If we select a sample, we may be able to draw inferences about a population based on our sample results; that is, we can estimate what the population is like based on our sample results. We call this “generalizing to a population.”

We use samples all the time. For example, when we have a blood test to check on our health, the laboratory takes a sample rather than all our blood. Tests are run using that sample and it is assumed that what they find in the sample is an accurate reflection of what is in all our blood.

Sampling is not just something that applies to large, quantitative studies. Even when conducting a highly qualitative, one-week field visit to assess a program that is spread out over a large geographic region, for example, you still need to be thoughtful about which areas of that region to investigate. Consider, as another example, the biases that might be introduced if program officials select the participants to be studied and how those biases could be avoided with a randomly selected sample.

A modest understanding of the basic concepts of systematic sampling can greatly enhance the extent to which your assessment reflects what is really going on in the real world.


Sampling Glossary

  • population: The total set of units. It could be all the citizens in a country, all farms in a region, or all children under the age of five living without running water in a particular area.
  • census: A count of (or collection of information from) the entire population.
  • sample: A subset of units selected from a larger set of the same units.
  • sampling frame: The list from which you can select your sample.
  • systematic sampling: A sample drawn from a list using a random start followed by a fixed sampling interval.
  • sample design: The method of sample selection.
  • random sample: A sample in which each unit in the population has an equal chance of being selected.
  • quota sample: A sample in which a specific number of different types of units are selected.
  • accidental sample: A sample in which the units are selected “by accident.”
  • snowball sample: A type of sampling strategy typically used in interviews, where you ask interviewees who else you should talk to.
  • judgmental sample: A sample in which selections are made based on pre-determined criteria.
  • convenience sample: A sample in which selections are based on the convenience to the evaluator (e.g., on easy geographic or organizational access).
  • parameter: Characteristic of the population.
  • statistic: Characteristic of a sample.
  • confidence level: How certain you are (or need to be) that the statistic obtained from your sample is an accurate estimate of the population as a whole.
  • confidence interval: The calculated range within which the true population value lies and for which we can express 95% confidence (the standard, but may vary).


Example of systematic sampling

You want to review records but the records are in boxes and there is no way you can go through and number them all to select a sample. A systematic selection with a random start is acceptable, as long as there is nothing about the original order of the documents that is systematic. For example, you would take a random start and then pick every 20th file until you get the total number of files you want to analyze.
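The selection described above can be sketched in Python. This is a minimal illustration, not part of the handbook: the function name `systematic_sample`, the file labels, and the total of 500 files are hypothetical, and the interval of 20 is taken from the example.

```python
import random

def systematic_sample(records, interval, sample_size, seed=None):
    """Pick every `interval`-th record after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(interval)      # random start within the first interval
    picked = records[start::interval]    # then every `interval`-th record
    return picked[:sample_size]          # stop once we have enough files

# Hypothetical box of 500 files kept in their original order.
files = [f"file-{i:03d}" for i in range(1, 501)]
sample = systematic_sample(files, interval=20, sample_size=25, seed=1)
print(len(sample), sample[:3])
```

As the text warns, this only behaves like a random sample if the original order of the files has no pattern that lines up with the interval.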

Types of Random Samples

There are five types of random samples. They are:

  • simple random samples
  • stratified random samples
  • multi-stage samples
  • cluster samples
  • combination random samples.

Simple Random Samples

A simple random sample is the simplest sample. We establish a sample size and then proceed to randomly select units until we reach that number. Let us say we want to select 100 files from a population of 500. All the files have been consecutively numbered from 001 to 500 and are filed in numerical order. We could then use a random numbers table, mentally block it off into three-digit numbers, and then select the first 100 distinct numbers that fall between 001 and 500. These are the files we select for our study.
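The random numbers table procedure can be imitated in a few lines of Python. This is a sketch under stated assumptions: the function name `pick_from_random_digits` is hypothetical, and a pseudo-random generator stands in for the printed table.

```python
import random

def pick_from_random_digits(population_size, sample_size, rng):
    """Read three-digit blocks until enough distinct, in-range numbers are found."""
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randrange(1000)     # one three-digit block, 000 to 999
        # Skip blocks outside 001..population_size and numbers already drawn.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen

rng = random.Random(42)                  # fixed seed so the draw can be repeated
selected = pick_from_random_digits(population_size=500, sample_size=100, rng=rng)
print([f"{n:03d}" for n in selected[:5]])
```

In practice, `random.sample(range(1, 501), k=100)` from the Python standard library gives the same kind of sample in one call.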

Stratified Random Samples

Sometimes we want to make sure specific groups are included that might otherwise be missed by using a simple random sample; those groups are usually a small proportion of the population. In this case, we would divide the population into strata based on some meaningful characteristic. This kind of sample is called a stratified random sample. For example, you may want to make sure you have enough people from rural areas in your study. If selected by a simple random sample, you may not get enough people from rural areas if they are a small proportion of all the people in the area. This is especially important if you want to have sufficient numbers in each stratum so you can make meaningful comparisons. For example, we may want to take a stratified sample of farmers at various distances from a major city.


To do a stratified random sample, divide the population into non-overlapping groups (i.e., strata) $n_1, n_2, n_3, \ldots, n_i$, such that $n_1 + n_2 + n_3 + \cdots + n_i = n$. Then do a simple random sample in each stratum. Figure 9.1 illustrates this process.

Fig. 9.1: Stratified Random Sample.
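A stratified random sample can be sketched as a simple random sample drawn separately inside each stratum. The strata names, their sizes, and the function name `stratified_sample` below are hypothetical and chosen only to mirror the rural example.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of the requested size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical population: rural residents are only 10% of the total.
strata = {
    "urban": [f"urban-{i}" for i in range(900)],
    "rural": [f"rural-{i}" for i in range(100)],
}
# Ask for 50 people per stratum so both groups are large enough to compare.
sample = stratified_sample(strata, {"urban": 50, "rural": 50}, seed=7)
print({name: len(units) for name, units in sample.items()})
```

A simple random sample of 100 from this population would be expected to include only about 10 rural residents, which is the gap stratification is meant to close.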

Cluster and Multi-stage Samples

Cluster sampling is another form of random sampling. A “cluster” is any naturally occurring aggregate of the units that are to be sampled. Thus households (or homes) are clusters of people, and towns are clusters of households. Cluster samples are most often used when:

  • you do not have a complete list of everyone in the population of interest but do have a complete list of the clusters in which they occur, or
  • you have a complete list of everyone, but they are so widely dispersed that it would be too time consuming and expensive to send data collectors out to a simple random sample.

In a cluster sample, the clusters (such as towns or households) are randomly sampled and then data are collected on all the target units in each sampled cluster. For instance, if the evaluation needs to collect data on the height and weight of children ages 2-5 in the program sites scattered across a large rural region, the evaluators might randomly sample 20 villages from the 100 villages receiving the program, and then collect data on all the children ages 2-5 in those villages.
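The village example can be sketched as follows. The village names and the number of children per village are made up for illustration; only the figures of 20 sampled villages out of 100 come from the text.

```python
import random

rng = random.Random(3)

# Hypothetical sampling frame: 100 villages, each with a list of children ages 2-5.
villages = {
    f"village-{v:03d}": [f"v{v:03d}-child-{c}" for c in range(rng.randint(5, 15))]
    for v in range(1, 101)
}

# Stage 1: randomly sample the clusters (villages), not the children.
sampled_villages = rng.sample(sorted(villages), k=20)

# Stage 2: collect data on every target unit inside each sampled cluster.
children = [child for village in sampled_villages for child in villages[village]]
print(len(sampled_villages), len(children))
```

Only a list of villages is needed to draw the sample; no list of all children in the region is required.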


For example, you may want to interview 200 AIDS patients, but these 200 may be selected from only four randomly sampled clinics because of resource constraints. It is possible that the clinics will serve populations that are too similar in terms of economic background or other characteristics, and therefore may not be representative of all AIDS patients. Likewise, it is possible that the selected clinics are too similar in their level of care to patients to accurately represent the total population of patients. When you want to interview people living on small, dispersed, and remote farms, it would be very time consuming to sample the people and then travel to all the farms in which they reside. In a cluster sample, you might sample 10 of the 50 farms and then interview all the people at each of those sampled farms.

Combination Random Samples

Sometimes combinations of methods are used. The group may be divided into strata; all the people in one stratum might be selected (as a census) and a random sample selected from the other strata. The program in Ghana (see Case 9-1) gives an example of a complex use of a combination random sample.
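A combination of a census in one stratum and random samples in the others might look like the sketch below. The strata names, sizes, and the function name `combination_sample` are hypothetical.

```python
import random

def combination_sample(strata, census_strata, sizes, seed=None):
    """Take every unit in the census strata and a random sample from the rest."""
    rng = random.Random(seed)
    result = {}
    for name, units in strata.items():
        if name in census_strata:
            result[name] = list(units)                    # census: keep every unit
        else:
            result[name] = rng.sample(units, sizes[name]) # simple random sample
    return result

strata = {
    "participants": [f"p-{i}" for i in range(40)],        # small group: take all
    "non_participants": [f"n-{i}" for i in range(400)],
}
sample = combination_sample(strata, {"participants"}, {"non_participants": 60}, seed=5)
print({name: len(units) for name, units in sample.items()})
```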

Examples of selecting random samples

For example, you want to observe classroom activities to measure the amount of time spent doing hands-on learning activities. You can:

  • randomly select classrooms
  • randomly select times of day
  • randomly select days of the week.

In another example, you might want to observe the amount of traffic on the road from the village to a major town. You can:

  • randomly select times and days of the week
  • randomly select times of the year
  • randomly select observation points or select a single observation point along the road.
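One way to combine these choices is to list every possible observation slot and draw from that list at random. The classroom names, days, times, and the number of slots drawn below are hypothetical.

```python
import itertools
import random

classrooms = ["Room A", "Room B", "Room C", "Room D"]
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
times = ["09:00", "11:00", "14:00"]

# Every classroom/day/time combination is one possible observation slot.
slots = list(itertools.product(classrooms, days, times))

rng = random.Random(11)
observation_plan = rng.sample(slots, k=10)   # 10 distinct slots, chosen at random
for classroom, day, time in sorted(observation_plan):
    print(classroom, day, time)
```

The same approach fits the traffic example by replacing classrooms with observation points and adding times of the year.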


Table 9.1 gives you a summary of the random sampling process.

Table 9.1: Summary of Random Sampling Process

Step  Process

  1. Obtain a complete listing of the entire population.
  2. Assign each case a unique number.
  3. Randomly select the sample using a random numbers table.
  4. When no numbered listing exists or is not practical to create, use systematic random sampling:
       - make a random start
       - select every nth case.

Case 9-1: Impact on Nutrition: Lower Pra Rural Bank Credit with Education Program in Ghana

The intent of the program is to increase the nutritional status and food security of poor households in Ghana. The Credit with Education Program combines: (1) providing credit to participants with (2) education on the basics of health, nutrition, birth timing, and small business skills.

Evaluation Questions: Did the program have an impact on the nutritional status of children, women’s economic capacity, women’s knowledge of health issues, and ability to offer a healthy diet to their children?

Overall Design: A quasi-experimental design using two surveys. Nineteen communities that did not yet have the Credit with Education Program were the focus of this study. The communities were divided into groups (strata) based on set criteria. Within each of the strata, communities were assigned either to a treatment group (which will receive the Credit with Education Program) or to a control group (which will not receive the program). They were not randomly assigned; three were assigned for political reasons and three were assigned as matched controls.

Sampling Within the Communities: Three groups of women with children were surveyed: those who participated at least one year (all participants were selected); those who did not participate but were in the program communities (random sample); and those in control communities (random sample). In all, ten mother/child pairs with children aged 12-23 months were chosen from each of the small communities; 30 from the large communities.


When using a non-random sample, examining the issue of bias is important. Is there something about this particular sample that might be different from the population as a whole? You might want to gather demographic information so you can describe the characteristics of your sample. Ideally, there will be no obvious differences between the sample and the population. But, when you report the demographics of the sample, your audience can then make a judgment as to how similar the sample is to the population.

When using a non-random sample, you need to report your results in terms of the respondents. For example, “Of the mothers interviewed, 70% are satisfied with the quality of the healthcare their children are receiving.” Without random sampling, you have to be careful about generalizing to a larger population. However, the data may be very useful and may be the best given your situation. Always make your sample selection criteria and procedures clear.

Combinations

Random and non-random methods can be combined. For example, you may be collecting data on schools. You can select two schools from the poorest communities and two from the wealthiest communities. Then from these four schools, you can randomly select students for your data collection.

How Confident and Precise Do You Need to Be?

Even when you use a random sample, there is some possibility of error. It is possible that your sample will be different from the population. This is where statistics come in (see Module 11, Data Analysis and Interpretation). The narrowest definition of statistics concerns the validity of data derived from random samples. More specifically, it is concerned with estimating the probability that the sample results are representative of the population as a whole. Statisticians have developed theories and formulas for making these estimates and selecting sample size. While we will present some statistics in the next module, we will not present or discuss statistical formulas here. Rather, we will focus on understanding the basic concepts of statistical analysis, and how to apply them to designing evaluations.


You have some options in deciding how accurate and precise you need to be in inferring results to the larger population. The first thing you want to do is decide how confident you wish to be that your sample results are an accurate estimate of what is true for the entire population. The standard confidence level is 95%. This means you want to be 95% certain that your sample results are an accurate estimate of the population as a whole. If you are willing to be 90% certain, your sample size will be smaller. If you want to be 99% confident (only a 1% chance of having the sample be very different from the population as a whole), you will need a larger sample.

Your next choice is about how precise you need your estimates to be. This is sometimes called sampling error or margin of error. We often see this when results from polls are reported. For example, you might read in the paper that 48% favor raising taxes and 52% oppose raising taxes (+/- 3%). What this means is that if everyone in the population were asked, the actual proportions would be somewhere between 45% and 51% (48 +/- 3) favoring raising taxes, and between 49% and 55% (52 +/- 3) opposing. Most evaluations accept a sampling error of 5%. In the tax example, if we had a 5% margin of error, then the true picture of opinions would be between 43% and 53% favoring raising taxes and between 47% and 57% opposing raising taxes. As you can see, there is more variability (less precision) in our estimates with a +/- 5% margin of error as compared to a +/- 3% margin. The more precise you want to be, the larger your sample will need to be. In both examples, however, note that the two ranges overlap. This means that these results are too close to call.

When working with real numbers, such as age or income, precision is presented in terms of the confidence interval. (Note: this is not to be confused with the concept of confidence level explained earlier.) We use this when we want to estimate the mean of the population based on our sample results. For example, if the average per capita income of the rural poor in our sample is 2,000 South African Rand per year, the computer might calculate a 95% confidence interval as between R1,800 and R2,200. We can then say that we are 95% certain (this is the confidence level) that the true population's average income is between R1,800 and R2,200 (this is the confidence interval).
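The arithmetic behind the tax poll example can be checked with a short sketch. The helper names `interval` and `overlaps` are hypothetical; the percentages and margins are the ones used in the text.

```python
def interval(estimate, margin):
    """Range implied by an estimate and its margin of error (percentage points)."""
    return (estimate - margin, estimate + margin)

def overlaps(a, b):
    """True when two ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

for margin in (3, 5):
    favor = interval(48, margin)
    oppose = interval(52, margin)
    print(f"+/- {margin}: favor {favor}, oppose {oppose}, overlap: {overlaps(favor, oppose)}")
```

With either margin the two ranges overlap, which is why the result is described as too close to call.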


However, low response rates always carry the threat of non-response bias. No amount of over-sampling can control for this bias if you have a low response rate (e.g., only 20%). Rather than over-sampling, the evaluator should put extra resources into doing everything possible to obtain a high response rate. Such efforts should include incentives and multiple follow-ups to non-respondents. While samples are used to keep costs of data collection down, go for as large a sample as you can manage. This will make your estimates of the population as accurate as possible. If you can do the entire population, opt for that, because there will then be no sampling error involved. However, keep in mind that censuses can also yield biased data if there are low response rates.

Summary of Sampling Size

  • Accuracy and precision can be improved by increasing your sample size: a larger sample increases accuracy and reduces the margin of error.
  • The standard you should aim for is a 95% confidence level and a margin of error of +/- 5%.
  • The larger the margin of error, the less precise your results will be.
  • The smaller the population, the larger the needed ratio of the sample size to the population size (See Table 9.2)
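The last point can be illustrated with the standard finite population correction. This is an assumption for illustration, not necessarily the method behind the handbook's tables; the function name `adjusted_sample_size` is hypothetical, and the starting size of 384 is the large-population figure for a 95% confidence level and a +/- 5% margin of error.

```python
import math

def adjusted_sample_size(population, base_size=384):
    """Shrink a large-population sample size for a finite population.

    Uses n = n0 / (1 + (n0 - 1) / N), rounded up to a whole unit.
    """
    return math.ceil(base_size / (1 + (base_size - 1) / population))

for population in (100, 1_000, 10_000, 1_000_000):
    n = adjusted_sample_size(population)
    print(f"population {population:>9,}: sample {n:>3} ({n / population:.1%} of population)")
```

Under this formula the required sample is a large share of a small population and a tiny share of a very large one.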


Table 9.3 gives you a summary of sample sizes for very large populations (those of 1 million or larger). Many national surveys use samples of about 1,100 because that makes the margin of error +/- 3 percentage points with a 95% confidence level.

Table 9.3: Sampling Sizes for Large Populations

Precision                          Confidence Level
(margin of error, +/-%)        99%        95%        90%
± 1%                        16,576      9,604         6,
± 2%                         4,144      2,401         1,
± 3%                         1,848      1,067        752
± 5%                           666        384        271
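The sketch below uses the standard large-population formula for a proportion, $n = z^2 p(1-p)/e^2$, with the most conservative proportion $p = 0.5$. Both the formula and the z-values (1.96 for 95%, 1.645 for 90%) are assumptions brought in for illustration, since the handbook deliberately leaves formulas out; rounded to the nearest whole number, they agree with the legible 95% and 90% entries of Table 9.3. The 99% column depends on which z-value is used and is left out here.

```python
def sample_size(z, margin, p=0.5):
    """Large-population sample size for a proportion: z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / margin ** 2

# (confidence level, conventional z-value, margin of error) for the legible cells.
cases = [
    ("95%", 1.96, 0.01), ("95%", 1.96, 0.02), ("95%", 1.96, 0.03), ("95%", 1.96, 0.05),
    ("90%", 1.645, 0.03), ("90%", 1.645, 0.05),
]
for level, z, margin in cases:
    print(f"{level} confidence, +/- {margin:.0%}: n = {round(sample_size(z, margin))}")
```

Halving the margin of error roughly quadruples the sample size, which is why high precision is expensive.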

Tables 9.4 and 9.5 [1] show the confidence intervals for two population sizes (100 and 50), for a few sample sizes and various proportions found in the sample. Table 9.4 shows confidence intervals for a population of 100.

Table 9.4: 95% Confidence Intervals for a Population of 100

                        Proportion Found in the Sample
Sample Size      .5      .4 or .6    .3 or .7    .2 or .8    .1 or.
75             ± .06      ± .06       ± .05       ± .05       ±.
50             ± .10      ± .10       ± .09       ± .08       ±.
30             ± .15      ± .15       ± .14       ± .12       ±.

Table 9.5: 95% Confidence Intervals for a Population of 50

                        Proportion Found in the Sample
Sample Size      .5      .4 or .6    .3 or .7    .2 or .8    .1 or.
30             ± .11      ± .11       ± .10       ± .09       ±.
20             ± .17      ± .17       ± .16       ± .14       ±.
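The legible entries in the two small-population tables above are consistent with the usual margin of error for a proportion multiplied by a finite population correction. That formula and the z-value of 1.96 are assumptions used for illustration, not something the handbook states; the function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(population, sample, proportion, z=1.96):
    """Margin of error for a proportion, corrected for a small population."""
    standard_error = math.sqrt(proportion * (1 - proportion) / sample)
    # Finite population correction: shrinks the error as the sample nears the population.
    correction = math.sqrt((population - sample) / (population - 1))
    return z * standard_error * correction

for population, sample in [(100, 75), (100, 50), (100, 30), (50, 30), (50, 20)]:
    moe = margin_of_error(population, sample, proportion=0.5)
    print(f"population {population}, sample {sample}: +/- {moe:.2f}")
```

Even when more than half of a small population is sampled, the interval stays fairly wide, which is why small populations need a large sampling ratio.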

[1] Gregg B. Jackson (2005). Sampling in Development Evaluations. Presentation at IPDET, July 5 and 6, 2005, p. 28.


Summary

In this module, you learned about sampling. Review the following checklist. Check those items that you can complete and review those that you cannot.

☐ define the sampling concepts, including: population, sample, sampling frame, sample design, parameter, statistic, and random sample

☐ discuss the differences between random and non-random samples

☐ list and define the types of random samples: simple random samples, stratified random samples, multi-stage random samples, and cluster random samples

☐ list and define the types of non-random samples: quota, accidental, snowball, judgmental, and convenience

☐ describe the role of statistics to determine confidence and precision

☐ describe the guidelines to determine the size of sample that is needed.


Hints for Development Evaluators

  • You can enhance the extent to which your appraisal reflects what is really going on in the real world by using what you know about appropriate sampling techniques and strategies.
  • Consider each of the following concepts when deciding upon one or more sampling techniques:

      − random samples: simple, stratified, cluster, or combination random samples
      − non-random samples: quota, accidental, snowball, judgmental, convenience, and combination non-random samples
      − confidence and precision
      − sample size.

Hints for Development Evaluation Managers

  • Go back and consider the evaluation questions again, making sure the evaluator is choosing an appropriate sample strategy that fits with the needed information.
  • Consider whether different sample strategies will be needed for different questions.
