

Material Type: Notes; Professor: Williams; Class: Sample Survey Methods; Subject: Statistics; University: University of Idaho; Term: Unknown 1989;
Typology: Study notes


Cluster Sampling
A cluster sample is a probability sample in which each sampling unit is a collection of elements. Two common reasons for using cluster sampling are i) a frame of elements is either impossible or very costly to obtain, and ii) the cost of sampling increases with the distance between the elements. When using cluster sampling, the first decision is what to use as a cluster; several examples of these considerations are discussed in the text. Once the clusters are chosen, a frame of clusters is obtained and then a simple random sample of clusters is taken.

Notation for cluster sampling:
$N$ = the number of clusters in the population,
$n$ = the number of clusters sampled,
$m_i$ = the number of elements in cluster $i$, $i = 1, 2, 3, \ldots, N$,
$\bar{m} = \frac{1}{n} \sum_{i=1}^{n} m_i$ = the average cluster size for the sample,
$M = \sum_{i=1}^{N} m_i$ = the number of elements in the population,
$\bar{M} = M/N$ = the average cluster size for the population,
$y_i$ = the total of all observations in the $i$th cluster.

Estimation of a population mean: Our estimator of the population mean is just the total of all elements in the sample divided by the number of elements in the sample:
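$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} m_i}, \qquad \text{with} \qquad \hat{V}(\bar{y}) = \left( \frac{N - n}{N \bar{M}^2} \right) \frac{s_r^2}{n}, \qquad \text{where} \qquad s_r^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y} m_i)^2}{n - 1}.$$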
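The estimator and its estimated variance can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `cluster_mean_estimate` and its parameters are hypothetical, and when the population average cluster size is not supplied the sketch assumes it can be replaced by the sample average cluster size.

```python
def cluster_mean_estimate(m, y, N, M_bar=None):
    """Ratio estimate of a population mean from a simple random sample of clusters.

    m     : sizes m_i of the sampled clusters
    y     : totals y_i of the sampled clusters
    N     : number of clusters in the population
    M_bar : average cluster size in the population (optional)
    """
    n = len(m)
    if n != len(y) or n < 2:
        raise ValueError("m and y must have the same length, at least 2")

    y_bar = sum(y) / sum(m)   # total of sampled elements / number of sampled elements
    m_bar = sum(m) / n        # average cluster size in the sample

    # Assumption: if M_bar is unknown, substitute the sample value m_bar.
    if M_bar is None:
        M_bar = m_bar

    # s_r^2: squared deviations of y_i from y_bar * m_i, divided by n - 1.
    s2_r = sum((yi - y_bar * mi) ** 2 for mi, yi in zip(m, y)) / (n - 1)

    # Estimated variance with the finite population correction (N - n) / N.
    var_hat = (N - n) / (N * M_bar ** 2) * s2_r / n
    return y_bar, var_hat


# Hypothetical toy data: two sampled clusters out of N = 10.
toy_mean, toy_var = cluster_mean_estimate(m=[2, 3], y=[4, 9], N=10)
print(toy_mean, toy_var)
```

Passing `M_bar` explicitly is preferable whenever the population size $M$ is known, since the substitution of the sample average is only an approximation.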
Note that the estimator $\bar{y}$ is a ratio estimator. The estimated variance above is biased, so it is advisable to have $n \geq 20$ unless the $m_i$ are equal.

Example: the number of hours of television watched per day. Suppose we visit a small community of $N = 150$ households, and we randomly sample $n = 10$ households. For each sampled household we find out how many people live in the household and how many hours of TV each of them watches per day:
Household   m_i   Hours watched    y_i
1           2     4, 5             9
2           4     2, 4, 2, 3       11
3           2     1, 2             3
4           3     3, 4, 4          11
5           5     4, 4, 5, 2, 2    17
6           1     2                2
7           3     4, 3, 3          10
8           2     4, 3             7
9           1     6                6
10          4     3, 3, 5, 6       17
Totals      27                     93
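Now we have $\bar{y} = 93/27 = 3.44$, $\bar{m} = 27/10 = 2.7$, and $\sum_{i=1}^{n} (y_i - \bar{y} m_i)^2 = 46.91$, so that $s_r^2 = 46.91/9 = 5.21$. Then, estimating $\bar{M}$ by $\bar{m}$, we have:

$$\hat{V}(\bar{y}) = \left( \frac{150 - 10}{150 \, (2.7)^2} \right) \frac{5.21}{10} = 0.0667,$$

so that the bound on the error of estimation is $B = 2\sqrt{\hat{V}(\bar{y})} = 0.52$.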
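The arithmetic of this example can be checked with a short Python script. It is only a sketch: the variable names are illustrative, the data are the cluster sizes and totals from the household table, and it assumes, as in the example, that the unknown population average cluster size is replaced by the sample average, with the bound taken as two estimated standard errors.

```python
import math

# Cluster sizes m_i and cluster totals y_i from the household table.
m = [2, 4, 2, 3, 5, 1, 3, 2, 1, 4]
y = [9, 11, 3, 11, 17, 2, 10, 7, 6, 17]
N = 150          # households in the community
n = len(m)       # households sampled

y_bar = sum(y) / sum(m)   # estimated mean hours per person
m_bar = sum(m) / n        # average household size in the sample

# Sum of squared deviations of y_i from y_bar * m_i, and s_r^2.
ss = sum((yi - y_bar * mi) ** 2 for mi, yi in zip(m, y))
s2_r = ss / (n - 1)

# Assumption: the population average cluster size is replaced by m_bar.
var_hat = (N - n) / (N * m_bar ** 2) * s2_r / n
bound = 2 * math.sqrt(var_hat)

print(f"y_bar = {y_bar:.2f}, m_bar = {m_bar:.1f}")
print(f"sum of squares = {ss:.2f}, s2_r = {s2_r:.2f}")
print(f"var_hat = {var_hat:.4f}, bound = {bound:.2f}")
```

Carrying full precision in $\bar{y}$ gives a sum of squares of about 46.91; the rounded values of $s_r^2$, the variance, and the bound are unaffected by small rounding differences at that step.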
We can also plot $y_i$ against $m_i$ to check that the relationship is roughly linear and that the regression line appears to pass through the origin.
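As a numerical companion to the plot, one possible check is to fit an ordinary least-squares line of $y_i$ on $m_i$ and see whether the intercept is near zero. The sketch below uses only the standard library and the data from the household table; the variable names are illustrative, and what counts as "near zero" is left to judgment rather than a formal test.

```python
# Cluster sizes m_i and cluster totals y_i from the household table.
m = [2, 4, 2, 3, 5, 1, 3, 2, 1, 4]
y = [9, 11, 3, 11, 17, 2, 10, 7, 6, 17]
n = len(m)

sum_m = sum(m)
sum_y = sum(y)
sum_mm = sum(mi * mi for mi in m)
sum_my = sum(mi * yi for mi, yi in zip(m, y))

# Ordinary least-squares line y = intercept + slope * m.
slope = (n * sum_my - sum_m * sum_y) / (n * sum_mm - sum_m ** 2)
intercept = (sum_y - slope * sum_m) / n

# Least-squares slope when the line is forced through the origin.
slope_origin = sum_my / sum_mm

print(f"fitted line: y = {intercept:.2f} + {slope:.2f} m")
print(f"slope through the origin: {slope_origin:.2f}")
```

For these data the fitted intercept is about 0.09 and both slopes are close to $\bar{y} = 3.44$, which is consistent with a line through the origin.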