Chapter 3: Simple Random Sampling and Systematic Sampling, Study notes of Design

Design

Both designs involve selecting n sample units from the N units in the population and can be implemented with or without replacement. Simple Random Sampling.

Typology: Study notes

2022/2023

Uploaded on 02/28/2023

ekadant 🇺🇸

Chapter 3: Simple Random Sampling and Systematic Sampling

Simple random sampling and systematic sampling provide the foundation for almost all of the more complex sampling designs that are based on probability sampling. They are also usually the easiest designs to implement. These two designs highlight a trade-off inherent in all sampling designs: do we select sample units at random to minimize the risk of introducing biases into the sample, or do we select sample units systematically to ensure that sample units are well-distributed throughout the population?

Both designs involve selecting n sample units from the N units in the population and can be implemented with or without replacement.

Simple Random Sampling

When the population of interest is relatively homogeneous, simple random sampling works well: it provides estimates that are unbiased and have high precision. When little is known about a population in advance, such as in a pilot study, simple random sampling is a common design choice.

Advantages:

• Easy to implement
• Requires little advance knowledge about the target population

Disadvantages:

• Imprecise relative to other designs if the population is heterogeneous
• More expensive to implement than other designs if entities are clumped and the cost to travel among units is appreciable

How it is implemented:

• Select n sample units at random from N available in the population

Every possible sample of size n drawn from the population must have an equal chance of being selected; consequently, all units within the population have the same probability of being included in the sample.

There are many strategies available for selecting a random sample. For large finite populations (i.e., those where every potential sampling unit can be identified in advance), this can involve generating pseudorandom numbers with a computer. For small finite populations it might involve using a table of random numbers or even writing a unique identifier for every sample unit in the population on a scrap of paper, placing those numbers in a jar, shaking it, then selecting n scraps of paper from the jar blindly. The approach used for selecting the sample matters little provided there are no constraints on how the sample units are selected and all units have an equal chance of being selected.
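As a minimal sketch of the computer-based approach, the snippet below draws a simple random sample without replacement using Python's standard pseudorandom number generator. The population labels (1 to 100), the sample size, and the function name `simple_random_sample` are assumptions made for illustration only.

```python
import random

def simple_random_sample(unit_ids, n, seed=None):
    """Draw n units without replacement so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # seeded generator so the draw can be reproduced
    return rng.sample(list(unit_ids), n)

# Hypothetical sampling frame: units labelled 1..100
population = range(1, 101)
sample = simple_random_sample(population, n=10, seed=42)
print(sorted(sample))
```

Sampling with replacement could be sketched the same way with `rng.choices(list(unit_ids), k=n)`, in which case a unit may appear more than once.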

Estimating the Population Mean

The population mean ($\mu$) is the true average number of entities per sample unit and is estimated with the sample mean ($\hat{\mu}$ or $\bar{y}$), which has an unbiased estimator:

$$\hat{\mu} = \frac{\sum_{i=1}^{n} y_i}{n}$$

where $y_i$ is the value from each unit in the sample and $n$ is the number of units in the sample.

The population variance ($\sigma^2$) is estimated with the sample variance ($s^2$), which has an unbiased estimator:

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$

Variance of the estimate $\hat{\mu}$ is:

$$\widehat{\mathrm{var}}(\hat{\mu}) = \left(\frac{N - n}{N}\right)\frac{s^2}{n}.$$

The standard error of the estimate is the square root of the variance of the estimate; as always, it is the standard deviation of the sampling distribution of the estimate. Standard error is a useful gauge of how precisely a parameter has been estimated, as it is a function of the variation inherent in the population ( σ^2 ) and the size of the sample ( n ).

Standard error of $\hat{\mu}$ is:

$$SE(\hat{\mu}) = \sqrt{\left(\frac{N - n}{N}\right)\frac{s^2}{n}}.$$
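The estimators above can be sketched in a few lines of Python. The data values and the population size N = 50 are hypothetical, and the function name `srs_mean_estimate` is an assumption for illustration.

```python
import math

def srs_mean_estimate(y, N):
    """Return (mean, s2, var_mean, se) for a simple random sample y from N units."""
    n = len(y)
    mean = sum(y) / n                                  # sample mean
    s2 = sum((yi - mean) ** 2 for yi in y) / (n - 1)   # sample variance
    var_mean = ((N - n) / N) * s2 / n                  # variance of the mean, with FPC
    return mean, s2, var_mean, math.sqrt(var_mean)

# Hypothetical counts of entities on n = 5 sample units from N = 50
y = [2, 4, 4, 6, 9]
mean, s2, var_mean, se = srs_mean_estimate(y, N=50)
print(mean, s2, var_mean, se)
```

For these made-up data the sample mean is 5 and the sample variance is 7, so the estimated variance of the mean is 0.9 × 7 / 5 = 1.26.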

The quantity $\left(\frac{N - n}{N}\right)$ is the finite population correction factor, which adjusts the variance of the estimator (not the variance of the population, which does not change with $n$) to reflect the amount of information that is known about the population through the sample. Simply put, as the amount of information we know about the population through sampling increases, the remaining uncertainty decreases. The correction factor therefore reflects the proportion of the population that remains unknown. Consequently, as the number of sampling units measured ($n$) approaches the total number of sampling units in the population ($N$), the finite population correction factor approaches zero, so the amount of uncertainty in the estimate also approaches zero.
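To make the behavior of the correction factor concrete, the short sketch below tabulates it for a population of N = 100 units. The particular sample sizes and the function name `fpc` are chosen only for illustration.

```python
def fpc(n, N):
    """Finite population correction factor (N - n) / N."""
    return (N - n) / N

N = 100  # illustrative population size
for n in (1, 10, 25, 50, 75, 100):
    # The factor falls linearly from near 1 (tiny sample) to 0 (complete census)
    print(f"n = {n:3d}  n/N = {n / N:.2f}  FPC = {fpc(n, N):.2f}")
```

When n/N is small the factor stays close to 1, and it reaches 0 when every unit has been measured.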

When the sample size n is small relative to the population size N , the fraction of the population being sampled n / N also is small, therefore the correction factor has little effect on the variance of the estimator (Fig. 2 - FPC.xls). If the finite population correction factor is ignored, which is what

[Figure 2: FPC with N = 100. Horizontal axis: n (0 to 100); vertical axis: FPC (0 to 1).]

In the case of simple random sampling, the population proportion follows the mean exactly; that is, p = μ. If this idea is new to you, convince yourself by working through an example. Say we generate a sample of size 10, where 4 entities have a value of 1 and 6 entities have a value of 0 (e.g., 1 = presence of a trait, 0 = absence of a trait). The proportion of entities in the sample with the trait is 4/10 or 0.40, which is also equal to the sample mean: [1+1+1+1+0+0+0+0+0+0]/10 = 4/10 = 0.40. Cosmic.
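The worked example can be checked directly; this small sketch simply encodes the ten 0/1 values described above and compares the proportion with the mean.

```python
# 1 = trait present, 0 = trait absent (the ten values from the example)
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

proportion = y.count(1) / len(y)  # share of units with the trait
mean = sum(y) / len(y)            # ordinary sample mean of the 0/1 values

print(proportion, mean)  # both are 0.4
```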

It follows that the population proportion ($p$) is estimated with the sample proportion ($\hat{p}$), which has an unbiased estimator:

$$\hat{p} = \hat{\mu} = \frac{\sum_{i=1}^{n} y_i}{n}.$$

Because we are dealing with dichotomous proportions (a sample unit does or does not have the trait), the population variance $\sigma^2$ is computed based on the variance for a binomial, which is the proportion of the population with the trait ($p$) times the proportion that does not have that trait ($1 - p$), or $p(1 - p)$. The estimate of the population variance $s^2$ is $\hat{p}(1 - \hat{p})$.

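Variance of the estimate $\hat{p}$ is:

$$\widehat{\mathrm{var}}(\hat{p}) = \left(\frac{N - n}{N}\right)\frac{s^2}{n} = \left(\frac{N - n}{N}\right)\frac{\hat{p}(1 - \hat{p})}{n - 1}.$$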

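Standard error of $\hat{p}$ is:

$$SE(\hat{p}) = \sqrt{\left(\frac{N - n}{N}\right)\frac{s^2}{n}} = \sqrt{\left(\frac{N - n}{N}\right)\frac{\hat{p}(1 - \hat{p})}{n - 1}}.$$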
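A minimal sketch of the proportion estimator follows, using the form with n − 1 in the denominator given above. The 0/1 data and the population size N = 200 are hypothetical, as is the function name `srs_proportion_estimate`.

```python
import math

def srs_proportion_estimate(y, N):
    """Return (p_hat, var_p, se_p) for 0/1 data y sampled from N units."""
    n = len(y)
    p_hat = sum(y) / n                                      # sample proportion
    var_p = ((N - n) / N) * p_hat * (1 - p_hat) / (n - 1)   # variance with FPC
    return p_hat, var_p, math.sqrt(var_p)

# Hypothetical sample: 4 of 10 units have the trait, population of N = 200
y01 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p_hat, var_p, se_p = srs_proportion_estimate(y01, N=200)
print(p_hat, var_p, se_p)
```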

Example. (to be added)

Determining Sample Sizes

How many sample units should we measure from the population so that we have confidence that parameters have been estimated adequately?

Determining how many sample units ( n ) to measure requires that we first establish the degree of precision we require for the estimate we wish to generate. We denote by B the desired bound on the error of estimation, which we define as the half-width of the confidence interval we want around the estimate generated by the proposed sampling effort.

To establish the number of sample units to measure to estimate the population mean μ at a desired level of precision B with simple random sampling, we set Z × SE( ȳ ) (the half-width of a confidence interval) equal to B and solve this expression for n. For simplicity we use Z to denote the upper α/2 point of the standard normal distribution (although we could use the Student’s t distribution), where α is the same value we used to establish the width of a confidence interval: the rate at which we are willing to tolerate Type I errors.

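We set

$$B = Z\sqrt{\left(\frac{N - n}{N}\right)\frac{\sigma^2}{n}}$$

and solve for $n$:

$$n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}}, \qquad \text{where } n_0 = \frac{Z^2 \sigma^2}{B^2}.$$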

If we anticipate n to be small relative to N, we can ignore the population correction factor and use only the formula for n_0 to gauge sample size.
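For readers who want the intermediate algebra, one way to reach the sample-size formula from the condition that B equals Z times the standard error is sketched here. It uses only the symbols already defined, with $n_0 = Z^2\sigma^2/B^2$ denoting the sample size that ignores the correction factor.

$$B^2 = Z^2 \left(\frac{N - n}{N}\right)\frac{\sigma^2}{n} = Z^2 \sigma^2 \left(\frac{1}{n} - \frac{1}{N}\right)$$

$$\frac{1}{n} = \frac{B^2}{Z^2 \sigma^2} + \frac{1}{N} = \frac{1}{n_0} + \frac{1}{N}
\quad\Longrightarrow\quad
n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}} = \frac{n_0}{1 + n_0 / N}$$

As N grows large relative to $n_0$, the term 1/N vanishes and n reduces to $n_0$.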

Example : Estimate the average amount of money μ for a hospital’s accounts receivable. Note, however, that no prior information exists with which to estimate population variance σ^2 but we know that most receivables lie within a range of about $100 and there are N = 1000 accounts. How many samples are needed to estimate μ with a bound on the error of estimation B = $3 with 95% confidence (α = 0.05, Z = 1.96) using simple random sampling?

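Although it is ideal to have data with which to estimate σ^2, the range is often approximately equal to 4σ, so one-fourth of the range might be used as an approximate value of σ:

$$\sigma \approx \frac{\text{range}}{4} = \frac{100}{4} = 25.$$

Substituting into the formula above, with Z rounded to 2:

$$n_0 = \frac{Z^2 \sigma^2}{B^2} = \frac{(2)^2 (25)^2}{3^2} \approx 277.8, \qquad n = \frac{1}{\dfrac{1}{277.8} + \dfrac{1}{1000}} \approx 217.4.$$

Therefore, about 218 samples are needed to estimate μ with a bound on the error of estimation B = $3. With Z = 1.96 exactly, n_0 ≈ 266.8 and n ≈ 210.6, or about 211 samples.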
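The hospital example can be reproduced with a short sketch. The function name `sample_size_mean` is an assumption for illustration; the inputs are the values stated in the example (range of about $100, N = 1000, B = $3). Note that Z = 1.96 gives about 211 units, while rounding Z to 2 gives the 218 quoted above.

```python
import math

def sample_size_mean(sigma, B, Z, N=None):
    """Sample size needed to estimate the mean to within B.

    Without N, returns n0 = (Z * sigma / B)^2, which ignores the finite
    population correction; with N, returns 1 / (1/n0 + 1/N).
    """
    n0 = (Z * sigma / B) ** 2
    if N is None:
        return n0
    return 1 / (1 / n0 + 1 / N)

sigma = 100 / 4  # one-fourth of the range as a rough value of sigma

n_exact_z = sample_size_mean(sigma, B=3, Z=1.96, N=1000)
n_round_z = sample_size_mean(sigma, B=3, Z=2, N=1000)

# Round up, since a fraction of a sample unit cannot be measured
print(math.ceil(n_exact_z), math.ceil(n_round_z))
```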

To establish the number of sample units to measure to estimate the population total $\tau$ at a desired level of precision $B$ with simple random sampling, we set

$$B = Z\sqrt{N(N - n)\frac{\sigma^2}{n}}$$

and solve for $n$:

$$n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}}, \qquad \text{where } n_0 = \frac{N^2 Z^2 \sigma^2}{B^2}.$$

And as with establishing n for the population mean, if N is large relative to n, the population correction factor can be ignored, and the formula for sample size reduces to n_0.
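A sketch of the sample-size calculation for a population total is given below. The inputs are hypothetical: they reuse N = 1000 and σ = 25, with a bound on the total of B = 3000 (that is, 3 per unit times N units), so the answer should match the corresponding calculation for the mean. The function name `sample_size_total` is an assumption.

```python
import math

def sample_size_total(sigma2, B, Z, N):
    """Return (n0, n) for estimating a population total to within B."""
    n0 = N**2 * Z**2 * sigma2 / B**2   # ignores the finite population correction
    n = 1 / (1 / n0 + 1 / N)           # adjusts for the finite population
    return n0, n

# Hypothetical inputs: sigma^2 = 25^2, bound on the total B = 3000, N = 1000
n0_total, n_total = sample_size_total(sigma2=625, B=3000, Z=1.96, N=1000)
print(round(n0_total, 1), math.ceil(n_total))
```

Because a bound of B on the total is equivalent to a bound of B/N on the mean, the two sample-size formulas agree when the inputs are scaled this way.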

Example : What sample size is necessary to estimate the caribou population we examined to within d = 2000 animals of the true total with 90% confidence ( α = 0.10)?

Using $s^2 = 919$ from earlier and $Z = 1.645$, which is the upper $\alpha/2 = 0.10/2 = 0.05$ point of the normal distribution:

$$n_0 = \frac{N^2 Z^2 s^2}{d^2} = \ldots \approx \ldots$$

To adjust for the size of the finite population:

a 1-in-10 systematic sample. The example in the figure is a 1-in-8 sample drawn from a population of N = 300; this yields n = 28. Note that the sample size drawn will vary and depends on the location of the first unit drawn.
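To illustrate why the realized sample size depends on the first unit drawn, here is a minimal sketch of a 1-in-k systematic sample from units arranged in a single ordered list. The one-dimensional layout, the population size N = 95, and the function name `systematic_sample` are assumptions made for illustration and do not reproduce the figure.

```python
import random

def systematic_sample(N, k, seed=None):
    """Select every k-th unit (0-based indices) after a random start in 0..k-1."""
    rng = random.Random(seed)
    start = rng.randrange(k)            # random starting unit
    return list(range(start, N, k))     # start, start + k, start + 2k, ...

# Realized sample sizes for every possible starting unit with N = 95, k = 10
sizes = {start: len(range(start, 95, 10)) for start in range(10)}
print(sizes)

sample_units = systematic_sample(N=95, k=10, seed=1)
print(sample_units)
```

With these assumed values, starts 0 through 4 yield 10 units and starts 5 through 9 yield 9 units, because N is not a multiple of k.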

Estimating the Population Mean

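The population mean ($\mu$) is estimated with:

$$\hat{\mu} = \frac{\sum_{i=1}^{n} y_i}{n}$$

The population variance ($\sigma^2$) is estimated with:

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$

Variance of the estimate $\hat{\mu}$ is:

$$\widehat{\mathrm{var}}(\hat{\mu}) = \left(\frac{N - n}{N}\right)\frac{s^2}{n}.$$

Standard error of $\hat{\mu}$ is:

$$SE(\hat{\mu}) = \sqrt{\left(\frac{N - n}{N}\right)\frac{s^2}{n}}$$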

Estimating the Population Total

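The population total $\tau$ is estimated with:

$$\hat{\tau} = N\hat{\mu} = \frac{N}{n}\sum_{i=1}^{n} y_i$$

Variance of the estimate $\hat{\tau}$ is:

$$\widehat{\mathrm{var}}(\hat{\tau}) = N^2\,\widehat{\mathrm{var}}(\hat{\mu}) = N^2\left(\frac{N - n}{N}\right)\frac{s^2}{n}.$$

Standard error of $\hat{\tau}$ is:

$$SE(\hat{\tau}) = \sqrt{\widehat{\mathrm{var}}(\hat{\tau})} = \sqrt{N^2\left(\frac{N - n}{N}\right)\frac{s^2}{n}}$$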
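The total estimator scales the mean estimator by N, which the following sketch makes explicit. The data values, the population size N = 50, and the function name `total_estimate` are hypothetical.

```python
import math

def total_estimate(y, N):
    """Return (tau_hat, var_tau, se_tau) for sample values y from N units."""
    n = len(y)
    mean = sum(y) / n
    s2 = sum((yi - mean) ** 2 for yi in y) / (n - 1)   # sample variance
    var_mean = ((N - n) / N) * s2 / n                  # variance of the mean
    tau_hat = N * mean                                 # estimated total
    var_tau = N**2 * var_mean                          # variance of the total
    return tau_hat, var_tau, math.sqrt(var_tau)

# Hypothetical counts on n = 5 sample units from N = 50
counts = [2, 4, 4, 6, 9]
tau_hat, var_tau, se_tau = total_estimate(counts, N=50)
print(tau_hat, var_tau, se_tau)
```

For these made-up data the estimated total is 50 × 5 = 250 and its estimated variance is 2500 × 1.26 = 3150.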

Estimating the Population Proportion

The population proportion ($p$) is estimated with the sample proportion ($\hat{p}$), which has an unbiased estimator:

$$\hat{p} = \hat{\mu} = \frac{\sum_{i=1}^{n} y_i}{n}.$$

Because we are estimating a dichotomous proportion, the population variance $\sigma^2$ is again computed with a binomial, which is the proportion of the population with the trait ($p$) times the proportion without that trait ($1 - p$), or $p(1 - p)$. The estimate of the population variance $s^2$ is $\hat{p}(1 - \hat{p})$.

Variance of the estimate $\hat{p}$ is:

$$\widehat{\mathrm{var}}(\hat{p}) = \left(\frac{N - n}{N}\right)\frac{s^2}{n} = \left(\frac{N - n}{N}\right)\frac{\hat{p}(1 - \hat{p})}{n - 1}.$$

Examples. (to be added)
