Lecture Notes on Cluster Sampling | STAT 422, Study notes of Survey Sampling Techniques

Material Type: Notes; Professor: Williams; Class: Sample Survey Methods; Subject: Statistics; University: University of Idaho; Term: Unknown 1989;


Uploaded on 08/19/2009

Cluster Sampling
A cluster sample is a probability sample in which each sampling unit is
a collection of elements. Two common reasons for using cluster sampling
are i) a frame of elements is either impossible or very costly to obtain,
and ii) the cost of sampling increases with the distance between the
elements. When using cluster sampling, the first decision is what to use
as a cluster; several examples of these considerations are discussed in
the text. Once the clusters are chosen, a frame of clusters is obtained
and then a simple random sample of clusters is taken.
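The selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `sample_clusters`, the seed, and the frame of 150 numbered clusters are hypothetical and not part of the notes.

```python
import random


def sample_clusters(frame, n, seed=None):
    """Draw a simple random sample of n clusters without replacement."""
    rng = random.Random(seed)      # a seeded generator makes the draw repeatable
    return rng.sample(frame, n)    # every subset of size n is equally likely


# Hypothetical frame: cluster labels 1..150 stand in for the real clusters.
frame = list(range(1, 151))
sampled = sample_clusters(frame, n=10, seed=422)
```

Every element of each selected cluster is then observed, so no frame of individual elements is needed.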
Notation for cluster sampling:
$N$ = the number of clusters in the population,
$n$ = the number of clusters sampled,
$m_i$ = the number of elements in cluster $i$, $i = 1, 2, 3, \ldots, N$,
$\bar{m} = \frac{1}{n}\sum_{i=1}^{n} m_i$ = the average cluster size for the sample,
$M = \sum_{i=1}^{N} m_i$ = the number of elements in the population,
$\bar{M} = M/N$ = the average cluster size for the population,
$y_i$ = the total of all observations in the $i$th cluster.
Estimation of a population mean $\mu$:
Our estimator of the population mean is just the total of all elements in
the sample divided by the number of elements in the sample:
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} m_i}$$
with
$$\hat{V}(\bar{y}) = \left(\frac{N-n}{N}\right)\frac{1}{\bar{M}^2}\,\frac{s_r^2}{n}, \quad \text{where } s_r^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y}\,m_i)^2}{n-1}.$$
Note that the estimator $\bar{y}$ is a ratio estimator. The estimated variance
above is biased, so it is advisable to have $n \geq 20$ unless the $m_i$ are equal.
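As a minimal sketch of the two formulas above, the following Python function computes the ratio estimate and its estimated variance from the sampled cluster totals and sizes. The name `cluster_mean_estimate` and its arguments are illustrative assumptions, not from the notes. When the population average cluster size is not supplied, the sketch substitutes the sample average, which is a common choice but an assumption here.

```python
def cluster_mean_estimate(y_totals, sizes, N, M_bar=None):
    """Ratio estimate of the population mean and its estimated variance.

    y_totals -- cluster totals y_i for the n sampled clusters
    sizes    -- cluster sizes m_i for the same clusters
    N        -- number of clusters in the population
    M_bar    -- average cluster size in the population; if None, the
                sample average m-bar is used in its place (assumption)
    """
    n = len(y_totals)
    if n != len(sizes) or n < 2:
        raise ValueError("need matching y_i and m_i for at least 2 clusters")

    # Total of all sampled observations over the number of sampled elements.
    y_bar = sum(y_totals) / sum(sizes)
    if M_bar is None:
        M_bar = sum(sizes) / n

    # s_r^2: spread of the cluster totals around y_bar * m_i.
    s2_r = sum((y - y_bar * m) ** 2 for y, m in zip(y_totals, sizes)) / (n - 1)

    # ((N - n) / N) * (1 / M_bar^2) * (s_r^2 / n)
    var_hat = ((N - n) / N) * s2_r / (n * M_bar ** 2)
    return y_bar, var_hat
```

The function returns the pair `(y_bar, var_hat)`, so a call such as `cluster_mean_estimate(totals, sizes, N=150)` yields both quantities at once.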
Example: the number of hours of television watched per day
Suppose we visit a small community of $N = 150$ households, and we
randomly sample $n = 10$ households. For each sampled household we find
out how many people live in the household and how many hours of TV each
of them watches per day:

Household   m_i   Hours            y_i
1           2     4, 5               9
2           4     2, 4, 2, 3        11
3           2     1, 2               3
4           3     3, 4, 4           11
5           5     4, 4, 5, 2, 2     17
6           1     2                  2
7           3     4, 3, 3           10
8           2     4, 3               7
9           1     6                  6
10          4     3, 3, 5, 6        17
Totals      27                      93

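Now we have $\bar{y} = 93/27 = 3.44$, $\bar{m} = 27/10 = 2.7$, and
$\sum_{i=1}^{n}(y_i - \bar{y}\,m_i)^2 = 46.91$, so that $s_r^2 = 46.91/9 = 5.21$.
Using $\bar{m}$ in place of the unknown $\bar{M}$, we have:
$$\hat{V}(\bar{y}) = \left(\frac{150-10}{150}\right)\frac{1}{(2.7)^2}\,\frac{5.21}{10} = 0.0667,$$
so that the bound on the error of estimation is $B = 2\sqrt{\hat{V}(\bar{y})} = 0.52$.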
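The arithmetic of this example can be checked with a short script. It is a sketch under two assumptions that the notes do not state outright: the sample average cluster size stands in for the population average, and B is taken as two estimated standard errors. The variable names are illustrative. Carried at full precision, the sum of squared deviations comes out near 46.91, which rounds to the same value of 5.21 for the variance term.

```python
from math import sqrt

# (m_i, y_i) for the 10 sampled households, taken from the table.
households = [(2, 9), (4, 11), (2, 3), (3, 11), (5, 17),
              (1, 2), (3, 10), (2, 7), (1, 6), (4, 17)]
N = 150                     # households in the community
n = len(households)         # households sampled

total_m = sum(m for m, _ in households)   # people in the sample
total_y = sum(y for _, y in households)   # hours watched in the sample
y_bar = total_y / total_m                 # estimated mean hours per person
m_bar = total_m / n                       # average household size in the sample

# Sum of squared deviations of y_i from y_bar * m_i, then s_r^2.
ss = sum((y - y_bar * m) ** 2 for m, y in households)
s2_r = ss / (n - 1)

var_hat = ((N - n) / N) * s2_r / (n * m_bar ** 2)
bound = 2 * sqrt(var_hat)   # B as two standard errors (assumption)

print(f"y_bar = {y_bar:.2f}, s2_r = {s2_r:.2f}, "
      f"V_hat = {var_hat:.4f}, B = {bound:.2f}")
# prints: y_bar = 3.44, s2_r = 5.21, V_hat = 0.0667, B = 0.52
```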

We can also plot $y_i$ against $m_i$ to check that the relationship is linear and that the regression line appears to pass through the origin.
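A numerical companion to that plot is to fit an ordinary least-squares line to the pairs and look at the intercept. The sketch below uses only the Python standard library; fitting a line with an intercept is an illustrative choice here, not a procedure given in the notes, and it supplements the plot without replacing it.

```python
# (m_i, y_i) pairs for the sampled households in the example.
pairs = [(2, 9), (4, 11), (2, 3), (3, 11), (5, 17),
         (1, 2), (3, 10), (2, 7), (1, 6), (4, 17)]

n = len(pairs)
mean_m = sum(m for m, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n

# Ordinary least-squares line: y = intercept + slope * m.
s_mm = sum((m - mean_m) ** 2 for m, _ in pairs)
s_my = sum((m - mean_m) * (y - mean_y) for m, y in pairs)
slope = s_my / s_mm
intercept = mean_y - slope * mean_m

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
# prints: slope = 3.41, intercept = 0.09
```

For these data the intercept is close to zero and the slope is close to the ratio estimate of 3.44, which is consistent with a line through the origin.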