Chapter 11 Systematic Sampling, Exams of Survey Sampling Techniques


Typology: Exams

2021/2022

Uploaded on 09/07/2022



Sampling Theory | Chapter 11 | Systematic Sampling | Shalabh, IIT Kanpur

Chapter 11

Systematic Sampling

The systematic sampling technique is operationally more convenient than simple random sampling. At the same time, it ensures that each unit has an equal probability of inclusion in the sample. In this method of sampling, the first unit is selected with the help of random numbers, and the remaining units are selected automatically according to a predetermined pattern. This method is known as systematic sampling.

Suppose the $N$ units in the population are numbered 1 to $N$ in some order. Suppose further that $N$ is expressible as a product of two integers $n$ and $k$, so that $N = nk$.

To draw a sample of size $n$,

  • select a random number between 1 and $k$.
  • Suppose it is $i$.
  • Select the first unit, whose serial number is $i$.
  • Select every $k$th unit after the $i$th unit.
  • The sample will contain the units with serial numbers $i, i+k, i+2k, \ldots, i+(n-1)k$.

So the first unit is selected at random and the other units are selected systematically. This systematic sample is called the $k$th systematic sample, and $k$ is termed the sampling interval. This is also known as linear systematic sampling.

The observations in the systematic sampling are arranged as in the following table:

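$$
\begin{array}{c|c|ccccccc}
\text{Systematic sample number} & & 1 & 2 & 3 & \cdots & i & \cdots & k \\ \hline
 & 1 & y_1 & y_2 & y_3 & \cdots & y_i & \cdots & y_k \\
\text{Sample composition} & 2 & y_{k+1} & y_{k+2} & y_{k+3} & \cdots & y_{k+i} & \cdots & y_{2k} \\
 & \vdots & \vdots & \vdots & \vdots & & \vdots & & \vdots \\
 & n & y_{(n-1)k+1} & y_{(n-1)k+2} & y_{(n-1)k+3} & \cdots & y_{(n-1)k+i} & \cdots & y_{nk} \\ \hline
\text{Probability} & & \frac{1}{k} & \frac{1}{k} & \frac{1}{k} & \cdots & \frac{1}{k} & \cdots & \frac{1}{k} \\ \hline
\text{Sample mean} & & \bar{y}_1 & \bar{y}_2 & \bar{y}_3 & \cdots & \bar{y}_i & \cdots & \bar{y}_k
\end{array}
$$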


Example: Let $N = 50$ and $n = 5$, so $k = 10$. Suppose the first number selected between 1 and 10 is 3. Then the systematic sample consists of the units with serial numbers 3, 13, 23, 33, 43.
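The selection rule can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `linear_systematic_sample` and its `start` and `rng` parameters are hypothetical, and the code assumes $N = nk$ exactly.

```python
import random

def linear_systematic_sample(N, n, start=None, rng=random):
    """Return the 1-based serial numbers of a linear systematic sample.

    Assumes N = n * k exactly; `start` is the random start i in 1..k.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes N is a multiple of n")
    k = N // n  # sampling interval
    i = rng.randint(1, k) if start is None else start
    if not 1 <= i <= k:
        raise ValueError("start must lie between 1 and k")
    # Units i, i + k, i + 2k, ..., i + (n - 1)k
    return [i + j * k for j in range(n)]

print(linear_systematic_sample(50, 5, start=3))  # [3, 13, 23, 33, 43]
```

Passing `start=3` reproduces the example above; omitting it draws the start at random from 1 to $k$.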

Systematic sampling in two dimensions:

Assume that the units in a population are arranged in the form of $m\ell$ rows, and each row contains $nk$ units. A sample of size $mn$ is required. Then

  • select a pair of random numbers $(i, j)$ such that $i \le \ell$ and $j \le k$.
  • Select the $(i, j)$th unit, i.e., the $j$th unit in the $i$th row, as the first unit.
  • Then the rows to be selected are $i, i+\ell, i+2\ell, \ldots, i+(m-1)\ell$ and the columns to be selected are $j, j+k, j+2k, \ldots, j+(n-1)k$.
  • The points at which the $m$ selected rows and $n$ selected columns intersect determine the positions of the $mn$ selected units in the sample.

Such a sample is called an aligned sample.
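A minimal sketch of the aligned scheme follows. It assumes a row interval (called `ell` here) so that the population has `m * ell` rows and `n * k` columns; the function name and argument names are hypothetical.

```python
def aligned_sample(m, ell, n, k, i, j):
    """Positions (row, column) of an aligned two-dimensional systematic sample.

    Assumes m * ell rows and n * k columns; (i, j) is the random start
    with 1 <= i <= ell and 1 <= j <= k.
    """
    if not (1 <= i <= ell and 1 <= j <= k):
        raise ValueError("need 1 <= i <= ell and 1 <= j <= k")
    rows = [i + a * ell for a in range(m)]  # i, i + ell, ..., i + (m - 1) ell
    cols = [j + b * k for b in range(n)]    # j, j + k, ..., j + (n - 1) k
    # Every selected row is crossed with every selected column.
    return [(r, c) for r in rows for c in cols]

positions = aligned_sample(m=2, ell=3, n=2, k=4, i=2, j=3)
print(positions)  # [(2, 3), (2, 7), (5, 3), (5, 7)]
```

The result always has $mn$ positions, one at each intersection of a selected row and a selected column.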

An alternative approach to select the sample is

  • independently select $n$ random integers $i_1, i_2, \ldots, i_n$ such that each of them is less than or equal to $\ell$ (the row interval).
  • Independently select $m$ random integers $j_1, j_2, \ldots, j_m$ such that each of them is less than or equal to $k$.
  • The units selected in the sample will have the following coordinates, for $r = 0, 1, \ldots, m-1$:

$$(i_1 + r\ell,\ j_{r+1}),\ (i_2 + r\ell,\ j_{r+1} + k),\ (i_3 + r\ell,\ j_{r+1} + 2k),\ \ldots,\ (i_n + r\ell,\ j_{r+1} + (n-1)k).$$

Such a sample is called an unaligned sample.

Under certain conditions, an unaligned sample is often superior to an aligned sample as well as a stratified random sample.

Advantages of systematic sampling:

  1. It is easier to draw a sample and often easier to execute it without mistakes. This is more advantageous when the drawing is done in fields and offices, as there may be a substantial saving in time.

  2. The cost is low, and the selection of units is simple. Much less training is needed for surveyors to collect units through systematic sampling.

  3. The systematic sample is spread more evenly over the population, so no large part will fail to be represented in the sample. The sample is evenly spread and the cross-section is better. Systematic sampling fails in case of too many blanks.


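Consider

$$
\begin{aligned}
(N-1)S^2 &= \sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{Y})^2 \\
&= \sum_{i=1}^{k}\sum_{j=1}^{n} \left[(y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{Y})\right]^2 \\
&= \sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2 + n\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2 \\
&= k(n-1)S^2_{wsy} + n\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2
\end{aligned}
$$

where

$$S^2_{wsy} = \frac{1}{k(n-1)}\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2$$

is the variation among the units that lie within the same systematic sample. Thus

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{N-1}{N}S^2 - \frac{k(n-1)}{N}S^2_{wsy} \\
&= \frac{N-1}{N}S^2 - \frac{n-1}{n}S^2_{wsy}
\end{aligned}
$$

with $N = nk$. The first term is the variation as a whole, and the second term is the pooled within variation of the $k$ systematic samples. This expression indicates that when the within variation is large, $Var(\bar{y}_{sy})$ becomes smaller. Thus higher heterogeneity within the systematic samples makes the estimator more efficient, and higher heterogeneity is well expected in a systematic sample.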
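The identity $Var(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{n-1}{n}S^2_{wsy}$ can be checked numerically on a small population. The sketch below uses exact rational arithmetic; the helper names and the twelve population values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def systematic_samples(y, k):
    """Split y (N = n*k values in serial order) into the k systematic samples."""
    return [y[i::k] for i in range(k)]

def mean(values):
    return Fraction(sum(values), len(values))

def var_sy_direct(y, k):
    """(1/k) * sum over the k samples of (sample mean - population mean)^2."""
    Y = mean(y)
    return sum((mean(s) - Y) ** 2 for s in systematic_samples(y, k)) / k

def var_sy_decomposed(y, k):
    """(N-1)/N * S^2 - (n-1)/n * S^2_wsy."""
    N = len(y)
    n = N // k
    Y = mean(y)
    S2 = sum((v - Y) ** 2 for v in y) / (N - 1)
    within = sum((v - mean(s)) ** 2 for s in systematic_samples(y, k) for v in s)
    S2_wsy = within / (k * (n - 1))
    return Fraction(N - 1, N) * S2 - Fraction(n - 1, n) * S2_wsy

y = [3, 7, 1, 8, 2, 9, 4, 6, 5, 10, 12, 11]  # hypothetical population, N = 12
print(var_sy_direct(y, 3) == var_sy_decomposed(y, 3))  # True
```

Because `Fraction` avoids rounding, the two sides agree exactly rather than approximately.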

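Alternative form of variance:

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2 \\
&= \frac{1}{k}\sum_{i=1}^{k} \left[\frac{1}{n}\sum_{j=1}^{n} y_{ij} - \bar{Y}\right]^2 \\
&= \frac{1}{kn^2}\sum_{i=1}^{k} \left[\sum_{j=1}^{n} (y_{ij} - \bar{Y})\right]^2 \\
&= \frac{1}{kn^2}\left[\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{Y})^2 + \sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{Y})(y_{i\ell} - \bar{Y})\right] \\
&= \frac{1}{kn^2}\left[(nk-1)S^2 + \sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{Y})(y_{i\ell} - \bar{Y})\right]
\end{aligned}
$$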


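The intraclass correlation between the pairs of units that are in the same systematic sample is

$$
\begin{aligned}
\rho_w &= \frac{E(y_{ij} - \bar{Y})(y_{i\ell} - \bar{Y})}{E(y_{ij} - \bar{Y})^2} \\
&= \frac{\dfrac{1}{nk(n-1)}\displaystyle\sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{Y})(y_{i\ell} - \bar{Y})}{\left(\dfrac{nk-1}{nk}\right)S^2}
\end{aligned}
$$

So substituting

$$\sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{Y})(y_{i\ell} - \bar{Y}) = (n-1)(nk-1)\rho_w S^2$$

in $Var(\bar{y}_{sy})$ gives

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{nk-1}{nk}\,\frac{S^2}{n}\left[1 + \rho_w(n-1)\right] \\
&= \frac{N-1}{N}\,\frac{S^2}{n}\left[1 + \rho_w(n-1)\right]
\end{aligned}
$$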
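As a numerical check of the intraclass form $Var(\bar{y}_{sy}) = \frac{N-1}{N}\frac{S^2}{n}[1 + \rho_w(n-1)]$, the following sketch computes $\rho_w$ from its definition and compares the two expressions. The function name and the population values are hypothetical.

```python
from fractions import Fraction

def rho_w_check(y, k):
    """Return (rho_w, Var computed directly, Var from the rho_w formula)."""
    N = len(y)
    n = N // k
    Y = Fraction(sum(y), N)
    samples = [y[i::k] for i in range(k)]
    S2 = sum((v - Y) ** 2 for v in y) / (N - 1)
    # Sum of cross products over ordered pairs j != l within each sample.
    cross = sum((s[j] - Y) * (s[l] - Y)
                for s in samples
                for j in range(n) for l in range(n) if j != l)
    rho_w = cross / ((n - 1) * (N - 1) * S2)
    var_direct = sum((Fraction(sum(s), n) - Y) ** 2 for s in samples) / k
    var_formula = Fraction(N - 1, N) * S2 / n * (1 + (n - 1) * rho_w)
    return rho_w, var_direct, var_formula

rho, direct, formula = rho_w_check([3, 7, 1, 8, 2, 9, 4, 6, 5, 10, 12, 11], 3)
print(direct == formula)  # True
```

Since the variance cannot be negative, the formula also implies $\rho_w \ge -1/(n-1)$, which the returned `rho` can be checked against.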

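Comparison with SRSWOR:

For a SRSWOR sample of size $n$,

$$
\begin{aligned}
Var(\bar{y}_{SRS}) &= \frac{N-n}{Nn}S^2 \\
&= \frac{nk-n}{Nn}S^2 \\
&= \frac{k-1}{N}S^2.
\end{aligned}
$$

Since

$$Var(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{n-1}{n}S^2_{wsy}, \qquad N = nk,$$

$$
\begin{aligned}
Var(\bar{y}_{SRS}) - Var(\bar{y}_{sy}) &= \left(\frac{k-1}{N} - \frac{N-1}{N}\right)S^2 + \frac{n-1}{n}S^2_{wsy} \\
&= \frac{n-1}{n}\left(S^2_{wsy} - S^2\right).
\end{aligned}
$$

Thus $\bar{y}_{sy}$ is

  • more efficient than $\bar{y}_{SRS}$ when $S^2_{wsy} > S^2$.

  • less efficient than $\bar{y}_{SRS}$ when $S^2_{wsy} < S^2$.

  • equally efficient as $\bar{y}_{SRS}$ when $S^2_{wsy} = S^2$.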
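A small worked illustration may help. The population below ($y_t = t$ for $t = 1, \ldots, 6$, with $N = 6$, $n = 2$, $k = 3$) is hypothetical and chosen only to keep the arithmetic short.

```latex
\begin{align*}
&\text{Systematic samples: } \{1,4\},\ \{2,5\},\ \{3,6\};\quad
  \bar{y}_1 = 2.5,\ \bar{y}_2 = 3.5,\ \bar{y}_3 = 4.5,\ \bar{Y} = 3.5 \\
Var(\bar{y}_{sy}) &= \tfrac{1}{3}\left[(-1)^2 + 0^2 + 1^2\right] = \tfrac{2}{3} \\
S^2 &= \tfrac{1}{5}\sum_{t=1}^{6}(t - 3.5)^2 = \tfrac{17.5}{5} = 3.5,
  \qquad Var(\bar{y}_{SRS}) = \tfrac{k-1}{N}S^2 = \tfrac{2}{6}(3.5) = \tfrac{7}{6} \\
S^2_{wsy} &= \tfrac{1}{k(n-1)}\sum_{i=1}^{3}\sum_{j=1}^{2}(y_{ij} - \bar{y}_i)^2
  = \tfrac{3 \times 4.5}{3} = 4.5 \\
Var(\bar{y}_{SRS}) - Var(\bar{y}_{sy}) &= \tfrac{7}{6} - \tfrac{4}{6} = \tfrac{1}{2}
  = \tfrac{n-1}{n}\left(S^2_{wsy} - S^2\right) = \tfrac{1}{2}(4.5 - 3.5)
\end{align*}
```

Here $S^2_{wsy} = 4.5 > S^2 = 3.5$, so the systematic sample mean is the more efficient of the two for this ordering of the population.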


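Recall that in the case of stratified sampling with $k$ strata, the stratified sample mean

$$\bar{y}_{st} = \frac{1}{N}\sum_{j=1}^{k} N_j \bar{y}_j$$

is an unbiased estimator of the population mean.

Considering the set up of a stratified sample in the set up of a systematic sample, we have

  • Number of strata = $n$
  • Size of strata = $k$ (row size)
  • Sample size to be drawn from each stratum = 1

and $\bar{y}_{st}$ becomes

$$
\begin{aligned}
\bar{y}_{st} &= \frac{1}{nk}\sum_{j=1}^{n} k\,y_j \\
&= \frac{1}{n}\sum_{j=1}^{n} y_j.
\end{aligned}
$$

$$
\begin{aligned}
Var(\bar{y}_{st}) &= \frac{1}{n^2}\sum_{j=1}^{n} Var(y_j) \\
&= \frac{1}{n^2}\sum_{j=1}^{n} \frac{k-1}{k \cdot 1}S_j^2 \qquad \left(\text{using } Var(\bar{y}_{SRS}) = \frac{N-n}{Nn}S^2\right) \\
&= \frac{k-1}{kn^2}\sum_{j=1}^{n} S_j^2 \\
&= \frac{k-1}{nk}S^2_{wst} \\
&= \frac{N-n}{Nn}S^2_{wst}
\end{aligned}
$$

where

$$S_j^2 = \frac{1}{k-1}\sum_{i=1}^{k} (y_{ij} - \bar{y}_j)^2$$

is the mean sum of squares of units in the $j$th stratum, and

$$S^2_{wst} = \frac{1}{n}\sum_{j=1}^{n} S_j^2 = \frac{1}{n(k-1)}\sum_{j=1}^{n}\sum_{i=1}^{k} (y_{ij} - \bar{y}_j)^2$$

is the mean sum of squares within strata (or rows).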


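The variance of the systematic sample mean is

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2 \\
&= \frac{1}{k}\sum_{i=1}^{k} \left[\frac{1}{n}\sum_{j=1}^{n} y_{ij} - \frac{1}{n}\sum_{j=1}^{n} \bar{y}_j\right]^2 \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k} \left[\sum_{j=1}^{n} (y_{ij} - \bar{y}_j)\right]^2 \\
&= \frac{1}{n^2 k}\left[\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_j)^2 + \sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{y}_j)(y_{i\ell} - \bar{y}_\ell)\right].
\end{aligned}
$$

Here $\bar{y}_i$ is the mean of the $i$th systematic sample, while $\bar{y}_j$ and $\bar{y}_\ell$ are stratum (row) means.

Now we simplify this expression and express it in terms of the intraclass correlation coefficient. The intraclass correlation coefficient between the pairs of deviations of units which lie along the same row, measured from their stratum means, is defined as

$$
\begin{aligned}
\rho_{wst} &= \frac{E(y_{ij} - \bar{y}_j)(y_{i\ell} - \bar{y}_\ell)}{E(y_{ij} - \bar{y}_j)^2} \\
&= \frac{\dfrac{1}{nk(n-1)}\displaystyle\sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{y}_j)(y_{i\ell} - \bar{y}_\ell)}{\dfrac{1}{nk}\displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_j)^2} \\
&= \frac{\displaystyle\sum_{i=1}^{k}\sum_{j(\ne \ell)=1}^{n}\sum_{\ell=1}^{n} (y_{ij} - \bar{y}_j)(y_{i\ell} - \bar{y}_\ell)}{(N-n)(n-1)S^2_{wst}}.
\end{aligned}
$$

So

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{1}{n^2 k}\left[(N-n)S^2_{wst} + (N-n)(n-1)\rho_{wst}S^2_{wst}\right] \\
&= \frac{N-n}{Nn}S^2_{wst}\left[1 + (n-1)\rho_{wst}\right] \qquad (\text{using } N = nk).
\end{aligned}
$$

Thus

$$Var(\bar{y}_{sy}) - Var(\bar{y}_{st}) = \frac{N-n}{Nn}(n-1)\rho_{wst}S^2_{wst}$$

and the relative efficiency of systematic sampling relative to equivalent stratified sampling is given by

$$RE = \frac{Var(\bar{y}_{st})}{Var(\bar{y}_{sy})} = \frac{1}{1 + (n-1)\rho_{wst}}.$$

So systematic sampling is

  • more efficient than the corresponding equivalent stratified sample when $\rho_{wst} < 0$.
  • less efficient than the corresponding equivalent stratified sample when $\rho_{wst} > 0$.
  • as efficient as the corresponding equivalent stratified sample when $\rho_{wst} = 0$.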
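The relation $Var(\bar{y}_{sy}) = \frac{N-n}{Nn}S^2_{wst}[1 + (n-1)\rho_{wst}]$ can be checked on a small population, treating each block of $k$ consecutive units as one stratum (row). The function name and the data are hypothetical; the sketch assumes $N = nk$.

```python
from fractions import Fraction

def compare_with_stratified(y, k):
    """Return (Var_st, Var_sy from the rho_wst formula, Var_sy direct, rho_wst)."""
    N = len(y)
    n = N // k
    # yij[i][j]: unit in the i-th systematic sample and the j-th stratum (row).
    yij = [[Fraction(y[j * k + i]) for j in range(n)] for i in range(k)]
    row_mean = [sum(yij[i][j] for i in range(k)) / k for j in range(n)]
    dev = [[yij[i][j] - row_mean[j] for j in range(n)] for i in range(k)]
    S2_wst = sum(d * d for row in dev for d in row) / (n * (k - 1))
    cross = sum(dev[i][j] * dev[i][l]
                for i in range(k)
                for j in range(n) for l in range(n) if j != l)
    rho_wst = cross / ((N - n) * (n - 1) * S2_wst)
    var_st = Fraction(N - n, N * n) * S2_wst
    var_sy = var_st * (1 + (n - 1) * rho_wst)
    Y = Fraction(sum(y), N)
    var_sy_direct = sum((sum(yij[i]) / n - Y) ** 2 for i in range(k)) / k
    return var_st, var_sy, var_sy_direct, rho_wst

var_st, var_sy, var_sy_direct, rho = compare_with_stratified(
    [3, 7, 1, 8, 2, 9, 4, 6, 5, 10, 12, 11], 3)
print(var_sy == var_sy_direct)  # True
```

The sign of the returned `rho_wst` decides which of the two designs has the smaller variance, as in the list above.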


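Under systematic sampling

Earlier, $y_{ij}$ denoted the value of the study variable for the $j$th unit in the $i$th systematic sample. Now $y_{ij}$ represents the value of the $[i + (j-1)k]$th unit of the population, so

$$y_{ij} = a + b\,[i + (j-1)k], \qquad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n,$$

$$\bar{y}_{sy} = \bar{y}_i, \qquad Var(\bar{y}_{sy}) = \frac{1}{k}\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2,$$

$$
\begin{aligned}
\bar{y}_i &= \frac{1}{n}\sum_{j=1}^{n} y_{ij} \\
&= \frac{1}{n}\sum_{j=1}^{n} \left\{a + b\,[i + (j-1)k]\right\} \\
&= a + b\left[i + \frac{(n-1)k}{2}\right].
\end{aligned}
$$

$$
\begin{aligned}
\sum_{i=1}^{k} (\bar{y}_i - \bar{Y})^2 &= \sum_{i=1}^{k} \left[a + b\left(i + \frac{(n-1)k}{2}\right) - a - b\,\frac{nk+1}{2}\right]^2 \\
&= b^2 \sum_{i=1}^{k} \left(i - \frac{k+1}{2}\right)^2 \\
&= b^2 \left[\sum_{i=1}^{k} i^2 - (k+1)\sum_{i=1}^{k} i + \frac{k(k+1)^2}{4}\right] \\
&= b^2 \left[\frac{k(k+1)(2k+1)}{6} - \frac{k(k+1)^2}{2} + \frac{k(k+1)^2}{4}\right] \\
&= \frac{b^2}{12}\,k(k^2 - 1)
\end{aligned}
$$

$$
\begin{aligned}
Var(\bar{y}_{sy}) &= \frac{1}{k}\cdot\frac{b^2}{12}\,k(k^2 - 1) \\
&= \frac{b^2}{12}(k^2 - 1).
\end{aligned}
$$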


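Under stratified sampling

$$y_{ij} = a + b\,[i + (j-1)k], \qquad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n,$$

$$\bar{y}_{st} = \frac{1}{N}\sum_{i=1}^{k} N_i \bar{y}_i,$$

$$Var(\bar{y}_{st}) = \frac{N-n}{Nn}S^2_{wst} = \frac{k-1}{nk}S^2_{wst}$$

where

$$
\begin{aligned}
S^2_{wst} &= \frac{1}{n}\sum_{j=1}^{n} S_j^2 \\
&= \frac{1}{n(k-1)}\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_j)^2 \\
&= \frac{1}{n(k-1)}\sum_{i=1}^{k}\sum_{j=1}^{n} \left[a + b\,\{i + (j-1)k\} - a - b\left\{\frac{k+1}{2} + (j-1)k\right\}\right]^2 \\
&= \frac{b^2}{n(k-1)}\sum_{i=1}^{k}\sum_{j=1}^{n} \left(i - \frac{k+1}{2}\right)^2 \\
&= \frac{b^2}{n(k-1)}\cdot\frac{nk(k^2-1)}{12} \\
&= \frac{k(k+1)}{12}\,b^2.
\end{aligned}
$$

$$
\begin{aligned}
Var(\bar{y}_{st}) &= \frac{k-1}{nk}\cdot\frac{k(k+1)}{12}\,b^2 \\
&= \frac{b^2}{12}\cdot\frac{k^2-1}{n}.
\end{aligned}
$$

If $k$ is large, so that $1/k$ is negligible, then comparing $Var(\bar{y}_{st})$, $Var(\bar{y}_{sy})$ and $Var(\bar{y}_{SRS})$, and dropping the common factor $b^2/12$ (the SRSWOR term corresponds to $Var(\bar{y}_{SRS}) = \frac{b^2}{12}(k-1)(nk+1)$),

$$Var(\bar{y}_{st}) : Var(\bar{y}_{sy}) : Var(\bar{y}_{SRS})$$

or

$$\frac{k^2-1}{n} : k^2-1 : (k-1)(1+nk)$$

or

$$\frac{k+1}{n} : k+1 : nk+1$$

or

$$\frac{k+1}{n(k+1)} : \frac{k+1}{k+1} : \frac{nk+1}{k+1} \ \approx\ \frac{1}{n} : 1 : n.$$

Thus

$$Var(\bar{y}_{st}) : Var(\bar{y}_{sy}) : Var(\bar{y}_{SRS}) :: \frac{1}{n} : 1 : n.$$

So stratified sampling is best for a linearly trended population. The next best is systematic sampling.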
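The three closed forms for a linear trend can be confirmed by brute force. The sketch below builds $y_t = a + bt$ for $t = 1, \ldots, N$ and computes each variance from its definition; the function name and the values $a = 2$, $b = 3$, $n = 4$, $k = 5$ are hypothetical.

```python
from fractions import Fraction

def linear_trend_variances(a, b, n, k):
    """Return (Var_st, Var_sy, Var_SRS) for y_t = a + b*t, t = 1..N, N = n*k."""
    N = n * k
    y = [Fraction(a + b * t) for t in range(1, N + 1)]
    Y = sum(y) / N
    S2 = sum((v - Y) ** 2 for v in y) / (N - 1)
    var_srs = Fraction(N - n, N * n) * S2
    # Systematic: average squared deviation of the k sample means.
    var_sy = sum((sum(y[i::k]) / n - Y) ** 2 for i in range(k)) / k
    # Stratified: n strata of k consecutive units, one unit drawn from each.
    strata = [y[j * k:(j + 1) * k] for j in range(n)]
    S2_wst = sum((v - sum(s) / k) ** 2 for s in strata for v in s) / (n * (k - 1))
    var_st = Fraction(N - n, N * n) * S2_wst
    return var_st, var_sy, var_srs

print(linear_trend_variances(2, 3, 4, 5))
# (Fraction(9, 2), Fraction(18, 1), Fraction(63, 1))
```

These match $\frac{b^2}{12}\frac{k^2-1}{n} = 4.5$, $\frac{b^2}{12}(k^2-1) = 18$ and $\frac{b^2}{12}(k-1)(nk+1) = 63$ for the chosen values, so the ordering stratified, systematic, SRSWOR is visible even for a small $k$.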


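Systematic sampling when $N \ne nk$.

When $N$ is not expressible as $nk$, suppose $N$ can be expressed as

$$N = nk + p; \quad p < k.$$

Then consider the following sample mean as an estimator of the population mean:

$$
\bar{y}_{sy} = \bar{y}_i =
\begin{cases}
\dfrac{1}{n+1}\displaystyle\sum_{j=1}^{n+1} y_{ij} & \text{if } i \le p \\[2ex]
\dfrac{1}{n}\displaystyle\sum_{j=1}^{n} y_{ij} & \text{if } i > p.
\end{cases}
$$

In this case

$$
E(\bar{y}_i) = \frac{1}{k}\left[\sum_{i=1}^{p} \frac{1}{n+1}\sum_{j=1}^{n+1} y_{ij} + \sum_{i=p+1}^{k} \frac{1}{n}\sum_{j=1}^{n} y_{ij}\right] \ne \bar{Y}.
$$

So $\bar{y}_{sy}$ is a biased estimator of $\bar{Y}$.

An unbiased estimator of $\bar{Y}$ is

$$\bar{y}^*_{sy} = \frac{k}{N}\sum_{j} y_{ij} = \frac{k}{N}C_i$$

where $C_i = \sum_j y_{ij}$ is the total of values of the $i$th column, that is, the number of units in the column ($n+1$ if $i \le p$, $n$ otherwise) times $\bar{y}_i$.

$$
\begin{aligned}
E(\bar{y}^*_{sy}) &= \frac{k}{N}E(C_i) \\
&= \frac{k}{N}\cdot\frac{1}{k}\sum_{i=1}^{k} C_i \\
&= \bar{Y}
\end{aligned}
$$

$$Var(\bar{y}^*_{sy}) = \frac{k^2}{N^2}\left(\frac{k-1}{k}\right)S_c^2$$

where

$$S_c^2 = \frac{1}{k-1}\sum_{i=1}^{k} \left(C_i - \frac{N\bar{Y}}{k}\right)^2.$$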


Now we consider another procedure which is adopted when $N \ne nk$.

[Reference: Theory of Sample Surveys, A.K. Gupta, D.G. Kabe, 2011, World Scientific Publishing Co.]

When the population size $N$ is not expressible as the product of $n$ and $k$, then let

$$N = nq + r.$$

Then take the sampling interval as

$$
k =
\begin{cases}
q & \text{if } r \le \dfrac{n}{2} \\[1.5ex]
q + 1 & \text{if } r > \dfrac{n}{2}.
\end{cases}
$$

Let $\left[\dfrac{M}{g}\right]$ denote the largest integer contained in $\dfrac{M}{g}$.

If $k = q^*$ ($= q$ or $q+1$), then the number of units expected in the sample is

$$
\begin{cases}
\left[\dfrac{N}{q^*}\right] & \text{with probability } \left[\dfrac{N}{q^*}\right] + 1 - \dfrac{N}{q^*} \\[2ex]
\left[\dfrac{N}{q^*}\right] + 1 & \text{with probability } \dfrac{N}{q^*} - \left[\dfrac{N}{q^*}\right].
\end{cases}
$$

If $q^* = q$, then we get

$$
\begin{cases}
n + \left[\dfrac{r}{q}\right] & \text{with probability } \left[\dfrac{r}{q}\right] + 1 - \dfrac{r}{q} \\[2ex]
n + \left[\dfrac{r}{q}\right] + 1 & \text{with probability } \dfrac{r}{q} - \left[\dfrac{r}{q}\right].
\end{cases}
$$

Similarly, if $q^* = q + 1$, then

$$
\begin{cases}
n - \left[\dfrac{n-r}{q+1}\right] & \text{with probability } \left[\dfrac{n-r}{q+1}\right] + 1 - \dfrac{n-r}{q+1} \\[2ex]
n - \left[\dfrac{n-r}{q+1}\right] - 1 & \text{with probability } \dfrac{n-r}{q+1} - \left[\dfrac{n-r}{q+1}\right].
\end{cases}
$$

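The sample-size distribution for a given interval can be checked by enumerating every possible random start. The sketch below compares that enumeration with the rule that the size is $[N/q^*]$ or $[N/q^*] + 1$, the latter with probability $N/q^* - [N/q^*]$; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def sample_size_distribution(N, interval):
    """Tally sample sizes over every random start 1..interval."""
    sizes = Counter(len(range(start, N + 1, interval))
                    for start in range(1, interval + 1))
    return {size: Fraction(count, interval) for size, count in sizes.items()}

def predicted_distribution(N, interval):
    """Sizes floor(N/interval) and floor(N/interval) + 1 with their probabilities."""
    base = N // interval
    frac = Fraction(N, interval) - base
    dist = {base: 1 - frac, base + 1: frac}
    return {size: p for size, p in dist.items() if p > 0}

# N = 23, n = 5 gives q = 4, r = 3 > n/2, so the interval is q + 1 = 5.
print(sample_size_distribution(23, 5) == predicted_distribution(23, 5))  # True
```

With $N = 23$ and interval 5, three of the five starts give 5 units and two give 4 units, in agreement with the stated probabilities 3/5 and 2/5.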

When $N \ne nk$, the systematic samples are not of the same size and the sample mean is not an unbiased estimator of the population mean. To overcome these disadvantages of systematic sampling when $N \ne nk$, circular systematic sampling is proposed. Circular systematic sampling consists of selecting a random number from 1 to $N$ and then selecting the unit corresponding to this random number. After that, every $k$th unit in a cyclical manner is selected until a sample of $n$ units is obtained, $k$ being the nearest integer to $\dfrac{N}{n}$.

In other words, if $i$ is a number selected at random from 1 to $N$, then the circular systematic sample consists of units with serial numbers

$$
\begin{cases}
i + jk, & \text{if } i + jk \le N \\
i + jk - N, & \text{if } i + jk > N
\end{cases}
\qquad j = 0, 1, 2, \ldots, (n-1).
$$

This sampling scheme ensures an equal probability of inclusion in the sample for every unit.

Example:

Let $N = 14$ and $n = 5$. Then $k$ = nearest integer to $\dfrac{14}{5} = 3$. Let the first number selected at random from 1 to 14 be 7. Then the circular systematic sample consists of units with serial numbers 7, 10, 13, $16 - 14 = 2$, $19 - 14 = 5$, i.e., 7, 10, 13, 2, 5.

This procedure is illustrated diagrammatically in the following figure.

[Figure: the units arranged on a circle, illustrating the cyclical selection of every $k$th unit.]
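The circular selection rule can be sketched as follows. The function name and its `start` and `rng` parameters are hypothetical, and rounding ties upward when $N/n$ falls exactly halfway between two integers is an assumption, since the text only says "nearest integer".

```python
import random

def circular_systematic_sample(N, n, start=None, rng=random):
    """Return the 1-based serial numbers of a circular systematic sample."""
    # Nearest integer to N/n; ties are rounded up (an assumption of this sketch).
    k = (2 * N + n) // (2 * n)
    i = rng.randint(1, N) if start is None else start
    # i + j*k, wrapping around past N; the modulus reproduces the two-case rule.
    return [(i + j * k - 1) % N + 1 for j in range(n)]

print(circular_systematic_sample(14, 5, start=7))  # [7, 10, 13, 2, 5]
```

With $N = 14$, $n = 5$ and random start 7, this selects the units 7, 10, 13, 2, 5.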


Theorem: In circular systematic sampling, the sample mean is an unbiased estimator of the population mean.

Proof: If $i$ is the number selected at random, then the circular systematic sample mean is

$$\bar{y} = \frac{1}{n}\left(\sum^{n} y\right)_i$$

where $\left(\sum^{n} y\right)_i$ denotes the total of the $y$ values in the $i$th circular systematic sample, $i = 1, 2, \ldots, N$. We note here that in circular systematic sampling, there are $N$ circular systematic samples, each having probability $\dfrac{1}{N}$ of its selection. Hence,

$$E(\bar{y}) = \sum_{i=1}^{N} \frac{1}{n}\left(\sum^{n} y\right)_i \times \frac{1}{N} = \frac{1}{Nn}\sum_{i=1}^{N}\left(\sum^{n} y\right)_i.$$

Clearly, each unit of the population occurs in $n$ of the $N$ possible circular systematic sample means. Hence,

$$\sum_{i=1}^{N}\left(\sum^{n} y\right)_i = n\sum_{i=1}^{N} Y_i$$

which on substitution in $E(\bar{y})$ proves the theorem.
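The theorem can be illustrated by enumerating all $N$ circular systematic samples of a small population and averaging their means. The function name and the fourteen population values below are hypothetical; ties in the "nearest integer" rule are rounded up as an assumption.

```python
from fractions import Fraction

def circular_sample_means(y, n):
    """Means of all N circular systematic samples of size n (0-based starts)."""
    N = len(y)
    k = (2 * N + n) // (2 * n)  # nearest integer to N/n, ties rounded up
    return [Fraction(sum(y[(i + j * k) % N] for j in range(n)), n)
            for i in range(N)]

y = [3, 8, 1, 9, 4, 7, 2, 6, 5, 10, 14, 11, 13, 12]  # hypothetical, N = 14
means = circular_sample_means(y, 5)
print(sum(means) / len(means) == Fraction(sum(y), len(y)))  # True
```

Each start is equally likely, so the simple average of the $N$ sample means is the expectation of the estimator, and it equals the population mean exactly.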

What to do when $N \ne nk$

One of the following possible procedures may be adopted when $N \ne nk$.

(i) Drop one unit at random if the sample has $(n+1)$ units.

(ii) Eliminate some units so that $N = nk$.

(iii) Adopt the circular systematic sampling scheme.

(iv) Round off the fractional interval $k$.