

These notes cover the concepts of sampling weights and the design effect in complex survey sampling. Sampling weights are used to simplify calculations and obtain accurate estimates, while the design effect measures the efficiency loss (or gain) of a complex design compared to simple random sampling. The notes include examples of their use and calculation, with reference to Lohr's "Sampling: Design and Analysis".

Two practical tools: sample weights and the design effect
Use of sample weights: We have learned about many important topics in designing sample surveys, such as stratification, clustering, and the use of auxiliary information with ratio estimation. These ideas can be combined in many ways to obtain very complex multistage sampling designs, and our chapter on two-stage cluster sampling gave us a glimpse of the complexity that can be involved in more sophisticated designs. As we saw in that chapter, the usual estimators (particularly variance estimators) can become very complicated. In actual sampling studies, researchers often simplify calculations by using sampling weights and computational approximations for variance calculations. Here we will introduce sampling weights by rewriting the expression for the estimator of the mean in stratified random sampling (StRS). Recall from the chapter on stratified random sampling that the estimator of μ is:
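$$\hat{\mu} = \bar{y}_{st} = \frac{1}{N}\sum_{i=1}^{L} N_i\,\bar{y}_i,$$

which can be written as

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{L}\sum_{j=1}^{n_i}\frac{N_i}{n_i}\,y_{ij} = \frac{\sum_{i=1}^{L}\sum_{j=1}^{n_i} w_{ij}\,y_{ij}}{\sum_{i=1}^{L}\sum_{j=1}^{n_i} w_{ij}}$$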
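The equivalence of the two forms of the stratified estimator can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the stratum sizes follow the text's example (four strata of 250, with 100 sampled from each), while the y-values are simulated and the variable names are purely illustrative.

```python
import random

random.seed(0)  # fixed seed so the simulated data are reproducible

# Stratum population sizes N_i and sample sizes n_i (values from the text's example)
N_i = [250, 250, 250, 250]
n_i = [100, 100, 100, 100]

# Simulated observations y_ij for each stratum (hypothetical data, for illustration only)
samples = [[random.gauss(10 + 5 * i, 2) for _ in range(n)] for i, n in enumerate(n_i)]

# Classical form: (1/N) * sum_i N_i * ybar_i
N = sum(N_i)
mean_classical = sum(Ni * sum(s) / len(s) for Ni, s in zip(N_i, samples)) / N

# Weighted form: sum(w_ij * y_ij) / sum(w_ij), with w_ij = N_i / n_i
weights = [[Ni / ni] * ni for Ni, ni in zip(N_i, n_i)]
num = sum(w * y for ws, ys in zip(weights, samples) for w, y in zip(ws, ys))
den = sum(w for ws in weights for w in ws)
mean_weighted = num / den

print(mean_classical, mean_weighted)  # the two forms agree up to rounding error
print(den)                            # the weights sum to the population size N
```

Because every weight is N_i/n_i, the weights in stratum i sum to N_i, so the denominator of the weighted form recovers N.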
In this expression, $w_{ij} = N_i/n_i$ is the weight for the $j$th observation in stratum $i$, with the interpretation that each observation in the sample represents $w_{ij} = N_i/n_i$ members of the population. Thus if a population of $N = 1000$ elements is divided into four strata of $N_i = 250$ elements each, and a sample of $n_i = 100$ is drawn from each stratum, then each observation in the sample represents $N_i/n_i = 2.5$ elements of the population. The general idea is that a sampling weight is the reciprocal of a selection probability, so for the StRS example above, $\delta_{ij} = 1/w_{ij} = n_i/N_i$ is the probability that a member of the $i$th stratum is sampled.

In many sampling studies, the sampling weights are calculated as the sampling design is developed. Once the sampling weights are calculated, any quantity of interest can be calculated as a weighted sum, as exemplified by the StRS expression above, and computational methods such as Taylor series, jackknife, or bootstrap methods can be used to calculate an approximate variance estimate. For multistage samples, weights are generated for each stage and then multiplied to obtain an overall weight.

The SAS example for this lecture shows the use of SAS Proc SURVEYSELECT to generate a two-stage cluster sample. The sampling weights produced for this sample are combined with the data in Proc SURVEYMEANS to obtain an estimate of the mean and an approximate variance estimate, which can be compared to the result obtained using the explicit formulas for estimating the mean from two-stage cluster sampling.

Use of a design effect: Computing sample sizes for complex surveys that are repeated over time is made easier with the concept of a design effect (denoted by deff). The design effect for a sampling plan and a statistic of interest is defined to be the ratio of the estimated variance of the statistic under the sampling plan to the estimated variance of the statistic under simple random sampling (SRS) with the same sample size. As an example, consider estimating a proportion from a complex multistage design. The design effect for the complex design would be:
$$\text{deff}(\text{complex design}, \hat{p}) = \frac{\hat{V}(\text{estimate from complex design})}{\hat{V}(\text{SRS with same sample size})} = \frac{\hat{V}(\text{estimate from complex design})}{\hat{p}(1-\hat{p})/n}$$
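The ratio above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `design_effect` and all input values are hypothetical, and in practice the complex-design variance estimate would come from survey software or from the explicit formulas for the design.

```python
def design_effect(var_complex, p_hat, n):
    """Return deff for a proportion: the complex-design variance estimate
    divided by the SRS variance estimate p_hat * (1 - p_hat) / n."""
    var_srs = p_hat * (1 - p_hat) / n
    return var_complex / var_srs


# Hypothetical inputs: an estimated proportion of 0.4 from n = 1000 sampled units,
# with an estimated variance of 0.000408 under the complex design.
deff = design_effect(var_complex=0.000408, p_hat=0.4, n=1000)
print(round(deff, 3))  # a value above 1 means the complex design is less efficient than SRS
```

With these made-up inputs the SRS variance estimate is 0.4 × 0.6 / 1000 = 0.00024, so the ratio is about 1.7.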
The design effect is similar to a relative efficiency: it measures the loss (deff > 1) or gain (deff < 1) in efficiency of the complex design relative to an SRS design. This is extremely useful when computing sample sizes for a future sample survey. For a future survey, the required sample size is simply the sample size that an SRS would need to achieve a given bound on the error of estimation, multiplied by the design effect. For example, suppose a multistage sampling plan involving clustering and stratification was used to estimate a proportion, and the design effect was 1.7. Then for the next survey, the sample size for an SRS with the given bound can be calculated and multiplied by 1.7 to give the sample size for the complex design. For instance, if the SRS calculation called for 400 units, the complex design would need about 1.7 × 400 = 680 units.

Reference: Lohr, S.L. 1999. Sampling: Design and Analysis. Pacific Grove, CA: Brooks/Cole.