

Material Type: Notes; Class: Sample Survey Methods; Subject: Statistics; University: University of Idaho; Term: Unknown 1989;
Typology: Study notes


Two-stage cluster sampling; unbiased estimation of a population mean and total
Two-stage cluster sampling: In this chapter we examine two-stage cluster sampling, a very simple kind of multi-stage sampling. In real-life surveys, it is common to have several levels of sampling, including use of stratification and ratio estimators. Here we will consider two-stage cluster sampling where a simple random sample of clusters is taken, then a simple random sample of elements in the selected clusters is taken. One example would be to estimate the proportion of U.S. college students who like Korean food by taking a SRS of U.S. colleges, then taking a SRS of students in each selected college.

Notation: The extra stage of sampling here gives us notation that is slightly changed from that used for single-stage cluster sampling:

$N$ = the number of clusters in the population
$n$ = the number of clusters selected in a SRS
$M_i$ = the number of elements in cluster $i$
$m_i$ = the number of elements selected in a SRS from cluster $i$
$M = \sum_{i=1}^{N} M_i$ = the number of elements in the population
$\bar{M} = M/N$ = the average cluster size for the population
$y_{ij}$ = the $j$th observation in the sample from the $i$th cluster
$\bar{y}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} y_{ij}$ = the sample mean for the $i$th cluster
Unbiased estimation of the population mean $\mu$: In single-stage cluster sampling the unbiased estimator of $\tau$ was $\frac{N}{n} \sum_{i=1}^{n} y_i$, where the $y_i$ term was the total of observations in the $i$th cluster. We are now sampling from each cluster, so we do not know these totals. However, we can estimate the cluster total by multiplying the cluster average ($\bar{y}_i$) by the number of elements in the cluster ($M_i$). We can divide by $M$ to estimate $\mu$. Then we have:

$$\frac{N}{Mn} \sum_{i=1}^{n} y_i \quad \text{is estimated by} \quad \frac{N}{Mn} \sum_{i=1}^{n} M_i \bar{y}_i,$$

so

$$\hat{\mu} = \frac{N}{Mn} \sum_{i=1}^{n} M_i \bar{y}_i = \frac{1}{\bar{M}} \cdot \frac{\sum_{i=1}^{n} M_i \bar{y}_i}{n}$$
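The point estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_mean` and the small data set are hypothetical and do not come from the notes or the course's SAS code.

```python
from statistics import mean

def estimate_mean(N, M, cluster_sizes, samples):
    """Two-stage estimate of the population mean.

    N: number of clusters in the population
    M: number of elements in the population
    cluster_sizes: M_i for each sampled cluster
    samples: one list of observations y_ij per sampled cluster
    """
    n = len(samples)
    # Estimated cluster totals: M_i times the sample mean of cluster i
    est_totals = [M_i * mean(y) for M_i, y in zip(cluster_sizes, samples)]
    # mu-hat = (N / (M n)) * sum of estimated cluster totals
    return N * sum(est_totals) / (M * n)

# Hypothetical data (an assumption for illustration only):
# N = 10 clusters, M = 100 elements, n = 2 sampled clusters.
N, M = 10, 100
cluster_sizes = [10, 20]        # M_1, M_2
samples = [[1, 2, 3], [4, 6]]   # m_1 = 3, m_2 = 2
mu_hat = estimate_mean(N, M, cluster_sizes, samples)
print(mu_hat)
```

Here the estimated cluster totals are 10 × 2 = 20 and 20 × 5 = 100, so the estimate is 10 × 120 / (100 × 2).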
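and the estimated variance is:

$$\hat{V}(\hat{\mu}) = \left(\frac{N - n}{N}\right) \frac{1}{\bar{M}^2} \cdot \frac{s_b^2}{n} + \frac{1}{nN\bar{M}^2} \sum_{i=1}^{n} M_i^2 \left(\frac{M_i - m_i}{M_i}\right) \frac{s_i^2}{m_i}$$

where

$$s_b^2 = \frac{\sum_{i=1}^{n} (M_i \bar{y}_i - \bar{M}\hat{\mu})^2}{n - 1} \quad \text{and} \quad s_i^2 = \frac{\sum_{j=1}^{m_i} (y_{ij} - \bar{y}_i)^2}{m_i - 1}$$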
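As a rough sketch, the two components of the variance estimator for the mean (between clusters and within clusters) can be computed as follows. The function name and the data are hypothetical, chosen only for illustration, and the code assumes at least two sampled clusters and at least two observations per sampled cluster so that both sample variances are defined.

```python
from statistics import mean, variance

def estimate_mean_and_variance(N, M, cluster_sizes, samples):
    """Return (mu_hat, estimated variance of mu_hat) for a two-stage sample."""
    n = len(samples)
    M_bar = M / N  # average cluster size in the population
    est_totals = [M_i * mean(y) for M_i, y in zip(cluster_sizes, samples)]
    mu_hat = N * sum(est_totals) / (M * n)

    # Between-cluster component, using s_b^2
    s2_b = sum((t - M_bar * mu_hat) ** 2 for t in est_totals) / (n - 1)
    between = ((N - n) / N) * s2_b / (n * M_bar ** 2)

    # Within-cluster component; variance(y) is s_i^2 with divisor m_i - 1
    within = sum(
        M_i ** 2 * ((M_i - len(y)) / M_i) * variance(y) / len(y)
        for M_i, y in zip(cluster_sizes, samples)
    ) / (n * N * M_bar ** 2)

    return mu_hat, between + within

# Hypothetical data (an assumption for illustration only)
mu_hat, var_mu = estimate_mean_and_variance(
    N=10, M=100, cluster_sizes=[10, 20], samples=[[1, 2, 3], [4, 6]]
)
print(mu_hat, var_mu)
```

With this toy data the between-cluster term dominates, which is typical when only a few clusters are sampled.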
In the variance estimator, $s_b^2$ measures variation between clusters, and $s_i^2$ measures variation within cluster $i$. We can obtain an unbiased estimator of $\tau$ by multiplying $\hat{\mu}$ by $M$, and the corresponding variance estimator by multiplying $\hat{V}(\hat{\mu})$ by $M^2$:
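$$\hat{\tau} = \frac{N}{n} \sum_{i=1}^{n} M_i \bar{y}_i = N \cdot \frac{\sum_{i=1}^{n} M_i \bar{y}_i}{n}$$

with variance estimator:

$$\hat{V}(\hat{\tau}) = \left(\frac{N - n}{N}\right) N^2 \, \frac{s_b^2}{n} + \frac{N}{n} \sum_{i=1}^{n} M_i^2 \left(\frac{M_i - m_i}{M_i}\right) \frac{s_i^2}{m_i}$$

where $s_b^2$ and $s_i^2$ are defined as above. See the text and SAS code on the web for example calculations.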
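For readers without SAS, a minimal Python sketch of the estimator of the total and its variance estimator might look like the following. The function name and data are hypothetical and are not taken from the text or the course's SAS code. The sketch relies on the fact that the centering term in the between-cluster variance (average cluster size times the estimated mean) equals the mean of the estimated cluster totals, so the population size is not needed here.

```python
from statistics import mean, variance

def estimate_total_and_variance(N, cluster_sizes, samples):
    """Return (tau_hat, estimated variance of tau_hat) for a two-stage sample."""
    n = len(samples)
    est_totals = [M_i * mean(y) for M_i, y in zip(cluster_sizes, samples)]
    tau_hat = N * sum(est_totals) / n

    # The centering term equals the average of the estimated cluster totals
    center = sum(est_totals) / n
    s2_b = sum((t - center) ** 2 for t in est_totals) / (n - 1)
    between = ((N - n) / N) * N ** 2 * s2_b / n

    # Within-cluster component, scaled by N / n
    within = (N / n) * sum(
        M_i ** 2 * ((M_i - len(y)) / M_i) * variance(y) / len(y)
        for M_i, y in zip(cluster_sizes, samples)
    )
    return tau_hat, between + within

# Hypothetical data (an assumption for illustration only)
tau_hat, var_tau = estimate_total_and_variance(
    N=10, cluster_sizes=[10, 20], samples=[[1, 2, 3], [4, 6]]
)
print(tau_hat, var_tau)
```

If the population contained 100 elements in these 10 clusters, the same toy data would give an estimated mean of 6, and the total and variance printed here are 100 and 100 squared times the corresponding values for the mean.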