Sampling Weights and Design Effect in Complex Surveys - Prof. Christopher J. Williams, Study notes of Survey Sampling Techniques

These notes cover the concepts of sampling weights and the design effect in complex survey sampling. Sampling weights are used to simplify calculations and obtain accurate estimates, while the design effect measures the efficiency loss (or gain) of a complex design compared to simple random sampling. The notes include examples of their use and calculation, with reference to Lohr's "Sampling: Design and Analysis".

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009



Two practical tools: sample weights and the design effect

Use of sample weights: We have learned about many important topics in designing sample surveys, such as stratification, clustering, and the use of auxiliary information with ratio estimation. These ideas can be combined in many ways to obtain very complex multistage sampling designs, and our chapter on two-stage cluster sampling gave us a glimpse of the complexity that can be involved in more sophisticated designs. As we saw in that chapter, the usual estimators (particularly variance estimators) can become very complicated. In actual sampling studies, researchers often simplify calculations by using sampling weights and computational approximations for variance calculations. Here we will introduce sampling weights by rewriting the expression for the estimator of the mean in stratified random sampling (StRS). Recall from the chapter on stratified random sampling that the estimator of $\mu$ is:

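$$\hat{\mu} = \bar{y}_{st} = \frac{1}{N}\sum_{i=1}^{L} N_i \bar{y}_i,$$

which can be written as

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{L}\sum_{j=1}^{n_i}\frac{N_i}{n_i}\,y_{ij} = \frac{\sum_{i=1}^{L}\sum_{j=1}^{n_i} w_{ij}\,y_{ij}}{\sum_{i=1}^{L}\sum_{j=1}^{n_i} w_{ij}},$$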

where $w_{ij} = N_i/n_i$ is the weight for the $j$th observation in stratum $i$, and has the interpretation that each observation in the sample represents $w_{ij} = N_i/n_i$ members of the population. Thus if a population of $N = 1000$ elements is divided into four strata of size $N_i = 250$ each, and if equal sample sizes of $n_i = 100$ are used for each stratum, then each observation in the sample represents $N_i/n_i = 2.5$ elements of the population. The general idea is that a sampling weight is the reciprocal of a selection probability, so for the StRS example above, $\delta_{ij} = 1/w_{ij} = n_i/N_i$ is the probability of being sampled for a member of the $i$th stratum. In many sampling studies, the sampling weights are calculated as the sampling design is developed. Once the sampling weights are calculated, any quantity of interest can be calculated as a weighted sum, as exemplified in the StRS expression above, and computational methods such as Taylor series, jackknife, or bootstrap methods can be used to calculate an approximate variance estimate. For multistage samples, weights are generated for each stage and then multiplied to obtain an overall weight.

The SAS example for this lecture shows the use of SAS Proc SURVEYSELECT to generate a two-stage cluster sample. The sampling weights produced from this sample are combined with the data in Proc SURVEYMEANS to obtain an estimate of the mean and an approximate variance estimate, which can be compared to the result obtained using the explicit formulas for estimating the mean from two-stage cluster sampling.

Use of a design effect: Computing sample sizes for complex surveys that are repeated over time is made easier with the concept of a design effect (denoted by deff). The design effect for a sampling plan and a statistic of interest is defined to be the ratio of the estimated variance of the statistic under the sampling plan to the estimated variance of the statistic under simple random sampling (SRS) with the same sample size. As an example, consider estimating a proportion from a complex multistage design. The design effect for the complex design would be:

$$\text{deff}(\text{complex design}, \hat{p}) = \frac{\hat{V}(\text{estimate from complex design})}{\hat{V}(\text{SRS with same sample size})} = \frac{\hat{V}(\text{estimate from complex design})}{\hat{p}(1-\hat{p})/n}$$
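As a rough illustration of both tools, the sketch below reuses the stratified example from these notes ($N = 1000$, four strata of $N_i = 250$, $n_i = 100$ sampled per stratum). The success counts are made-up numbers, not data from the lecture. The stratified design stands in for "the complex design", and the finite population correction is ignored in both variances so that the SRS denominator matches the $\hat{p}(1-\hat{p})/n$ form above. The stratum variances use the common $n_i - 1$ divisor, which is an assumption rather than something stated in the notes.

```python
# Hypothetical binary data from the stratified design described in the notes.
N_i = [250, 250, 250, 250]      # stratum population sizes
n_i = [100, 100, 100, 100]      # stratum sample sizes
successes = [20, 35, 50, 65]    # assumed count of y = 1 in each stratum sample

N = sum(N_i)
n = sum(n_i)

# Sampling weight shared by every unit in stratum i: w_i = N_i / n_i.
w = [Ni / ni for Ni, ni in zip(N_i, n_i)]

# Weighted estimator: sum(w * y) / sum(w). Within a stratum, sum(y) = successes.
weighted_total = sum(wi * s for wi, s in zip(w, successes))
weight_sum = sum(wi * ni for wi, ni in zip(w, n_i))   # equals N
p_hat = weighted_total / weight_sum

# Estimated variance under the stratified design (fpc ignored, an assumption).
p_strata = [s / ni for s, ni in zip(successes, n_i)]
v_complex = sum(
    (Ni / N) ** 2 * p * (1 - p) / (ni - 1)
    for Ni, ni, p in zip(N_i, n_i, p_strata)
)

# Estimated variance under SRS with the same total sample size.
v_srs = p_hat * (1 - p_hat) / n

deff = v_complex / v_srs

print("weights:", w)
print("p_hat:", p_hat)
print("deff:", deff)
```

With these assumed counts the strata differ in their proportions, so the stratified variance comes out smaller than the SRS variance and the design effect falls below 1. A clustered design would typically push the ratio above 1 instead.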

The design effect is similar to a relative efficiency, and measures the loss (or gain) in efficiency of the complex design relative to an SRS design: a value above 1 indicates a loss, and a value below 1 indicates a gain. This is extremely useful when computing sample sizes for a future sample survey. For a future survey, the required sample size is just the sample size needed by an SRS to achieve a given bound, multiplied by the design effect. For example, suppose a multistage sampling plan that involved clustering and stratification was used to estimate a proportion, and the design effect was 1.7. Then for the next survey, the sample size for an SRS and the given bound can be calculated and multiplied by 1.7 to give a sample size for the complex sample design.

Reference: Lohr, S.L. 1999. Sampling: Design and Analysis. Pacific Grove, CA: Brooks/Cole.
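The sample-size rule described above can be sketched in a few lines. Only the design effect of 1.7 comes from the notes. The planning proportion of 0.5, the bound of 0.05, the multiplier of 2 standard errors for the bound, and the omission of the finite population correction are all assumptions made for illustration, and both function names are hypothetical.

```python
import math


def srs_sample_size_proportion(p, bound, z=2.0):
    """SRS sample size to estimate a proportion within +/- bound.

    Assumes bound = z * sqrt(p * (1 - p) / n) and ignores the finite
    population correction.
    """
    return z ** 2 * p * (1 - p) / bound ** 2


def complex_sample_size(n_srs, deff):
    """Inflate (or deflate) an SRS sample size by the design effect."""
    # Rounding first guards against floating-point noise before taking the ceiling.
    return math.ceil(round(n_srs * deff, 6))


n_srs = srs_sample_size_proportion(p=0.5, bound=0.05)   # assumed planning values
n_complex = complex_sample_size(n_srs, deff=1.7)        # deff from the notes

print("SRS sample size:", n_srs)
print("Complex-design sample size:", n_complex)
```

The design choice here is simply to keep the SRS calculation and the design-effect adjustment as separate steps, mirroring the order in which the notes describe them.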