Population vs. Sample
We draw samples from a population because we
are interested in inferring something about the
population based on the sample. We sample
when a census is impractical.
In order to draw a sample, we identify the
population of interest and then choose a sample
that is representative (hopefully) of the population
of interest.
POPULATION = the group we WANT information
about
SAMPLE = the group we GET information about
SAMPLING FRAME = the group of individuals
from which we will draw our sample (ideally the
entire population - but not always)
Why do we sample?
Parameters* of interest        Corresponding sample statistics**
*Parameter: a value describing the entire population
**Statistic: a value describing the sample
Sample Surveys
A “sample survey” is a study that uses an organized plan to
choose a sample that represents some specific population.
1. Define a population we want to describe.
2. Say exactly what we want to measure (the variable(s)).
3. Decide how to choose a sample from the population.
This is known as the sampling design.
When sampling goes badly...
Sampling Errors:
Mistakes made in the process of taking a sample that could
lead to inaccurate information about the population
Non-Sampling Errors:
Errors that arise in the data collection process as a result of
factors OTHER THAN taking a sample
Bad Sampling Designs
Voluntary Response Sampling:
- Consists of people who choose themselves by responding to a general appeal.
- Biased because people with strong opinions (often in the same direction) are most likely to respond.
Example:
Convenience Sampling:
- Consists of individuals who are easiest to reach
- Tend to have similar opinions and typically are not representative of the population
Example:
These are SAMPLING ERRORS
Undercoverage - when some groups in the population are left out of the process of choosing the sample.
Example:
This is an example of a SAMPLING ERROR
When sampling goes well...
The statistician’s remedy to sampling error is to allow
impersonal chance to choose the sample. A sample chosen by
chance rules out both favoritism by the sampler and self-
selection by respondents.
Random sampling, the use of chance to select a sample, is the
central principle of statistical sampling.
Simple Random Samples
A simple random sample (SRS) of size n consists of n
individuals from the population chosen in such a way
that every set of n individuals has an equal chance to
be the sample actually selected.
In practice it is easier to show that a sampling method is NOT an SRS than to show that it is one: proving that EVERY set of n individuals is equally likely to be the sample is hard. The easiest way to show that a sampling method DOES NOT SATISFY THE DEFINITION OF AN SRS is to find some group of n individuals that has NO POSSIBLE WAY of being chosen. If even one group of n is impossible, then not all groups are equally likely to be the sample selected.
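The "find an impossible group" argument can be checked by brute force on a tiny population. The sketch below is only an illustration: the four-person population, the two groups, and the function names are assumptions made up for this example. It lists every sample that a "pick one from each group" design can produce and compares that list with every possible set of n = 2 individuals.

```python
from itertools import combinations, product

# Hypothetical population of 4 individuals split into two groups.
group_a = ["A1", "A2"]
group_b = ["B1", "B2"]
population = group_a + group_b
n = 2

def all_possible_sets(pop, size):
    """Every set of `size` individuals; an SRS must allow all of them."""
    return {frozenset(c) for c in combinations(pop, size)}

def sets_from_one_per_group(a, b):
    """Every sample the design 'pick one from each group' can produce."""
    return {frozenset(pair) for pair in product(a, b)}

srs_sets = all_possible_sets(population, n)
design_sets = sets_from_one_per_group(group_a, group_b)

# Sets of size n that the design can never select.
impossible = srs_sets - design_sets

print(len(srs_sets), "sets of size 2 exist")
print(len(design_sets), "of them can be produced by the one-per-group design")
print("Never chosen:", sorted(sorted(s) for s in impossible))
```

Because at least one set of 2 individuals can never be chosen, the one-per-group design is not an SRS, even though every individual has the same chance of being picked.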
In order to gather an SRS, we can use a table of random digits or our calculators to generate random digits.
Using the Table (what you will always do in this class):
- The table contains a long string of digits 0, 1, 2, 3, 4, 5, 6, 7, 8, & 9
- Each entry in the table is equally likely to be any of the 10 digits 0 to 9
- The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part
Using the Calculator:
- Select MATH, arrow over to PRB, select #5 randInt
- The inputs need to be your starting value (typically 0 or 1), the ending value, and how many random values you want
- Keep in mind, if there are repeats you may need to generate more random digits
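The calculator steps can be imitated in software. The following is a minimal sketch in Python, not the calculator's own algorithm: the function name `rand_int_labels` and the seed are assumptions for illustration. It draws integers between a starting and an ending value and, as the last bullet suggests, keeps drawing whenever a repeat turns up.

```python
import random

def rand_int_labels(low, high, how_many, rng=None):
    """Draw integers from low to high (both included) one at a time,
    skipping repeats, until `how_many` distinct labels are collected."""
    if how_many > high - low + 1:
        raise ValueError("cannot draw more distinct labels than exist")
    if rng is None:
        rng = random.Random()
    chosen = []
    while len(chosen) < how_many:
        value = rng.randint(low, high)  # randint can return either endpoint
        if value not in chosen:         # a repeat: ignore it and draw again
            chosen.append(value)
    return chosen

# Hypothetical use: pick 5 distinct labels from 1 to 30.
labels = rand_int_labels(1, 30, 5, rng=random.Random(2024))
print(labels)
```

Passing a seeded `random.Random` makes the draw reproducible; leaving `rng` out gives a fresh draw each time.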
To gather data on a 1200-acre pine forest in Louisiana, the U.S. Forest Service laid a grid of 1410 equally spaced circular plots over a map of the forest. A ground survey visited a sample of 10% of these plots.
(a) How would you label the plots?
(b) What is the total sample size for this study?
(c) Use your calculator to choose the first 10 plots.
Stratified Random Sample
- Divide the population (or sampling frame) into strata (like sub-populations), which are homogeneous groups of individuals that are similar in some way that is important to the response
- Choose an SRS from each stratum, with each stratum's sample size proportional to that stratum's share of the population.
For example: If females represent 30% of the population and you are
stratifying based on gender, then 30% of your final sample should be
females.
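The proportional rule above can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: the function name, the made-up population of 300 females and 700 males, and the total sample size of 50. Each stratum gets an SRS whose size matches its share of the population.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """strata maps a stratum name to the list of its individuals.
    Returns, for each stratum, an SRS sized in proportion to the stratum."""
    rng = random.Random(seed)
    pop_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can shift the overall total slightly when shares are not whole numbers.
        k = round(total_n * len(members) / pop_size)
        sample[name] = rng.sample(members, k)  # SRS within the stratum
    return sample

# Hypothetical population: 30% female, 70% male.
strata = {
    "female": [f"F{i}" for i in range(300)],
    "male": [f"M{i}" for i in range(700)],
}
result = proportional_stratified_sample(strata, total_n=50, seed=1)
print({name: len(chosen) for name, chosen in result.items()})
```

With these numbers, 30% of the 50 sampled individuals (15 people) come from the female stratum and the remaining 35 from the male stratum.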
EXAMPLE: The manager of a beach-front hotel wants to survey guests in the hotel to estimate overall customer satisfaction. The hotel has two towers, an older one to the south and a newer one to the north. Each tower has 10 floors of standard rooms (40 rooms per floor) and 2 floors of suites (20 suites per floor). Half of the rooms in each tower face the beach, while the other half of the rooms face the street. This means there are (2 towers)(10 floors)(40 rooms) + (2 towers)(2 floors)(20 suites) = 880 total rooms. Describe a method for gathering a stratified random sample - there are multiple options!
A club has 30 students:
Abel Fisher Huber Miranda Reinmann
Carson Ghosh Jimenez Moskowitz Santos
Chen Griswold Jones Neyman Shaw
David Hein Kim O'Brien Thompson
Deming Hernandez Klotz Pearl Utts
Elashoff Holland Liu Potter Varga
and 10 faculty members:
Andrew Fernandez Kim Moore West
Besicovitch Gupta Lightman Vicario Yang
The club can send 4 students and 2 faculty members to a
convention. It decides to choose those who will go by random
selection. Use the table of random digits, starting at line 18, to
choose a stratified random sample of 4 students and 2 faculty
members.
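For comparison with the table method, here is a sketch of the same stratified design in Python. It swaps the table of random digits for Python's pseudo-random generator, so its picks will not match the answer obtained from line 18 of the table; the seed and the label format are assumptions for illustration.

```python
import random

students = [
    "Abel", "Carson", "Chen", "David", "Deming", "Elashoff",
    "Fisher", "Ghosh", "Griswold", "Hein", "Hernandez", "Holland",
    "Huber", "Jimenez", "Jones", "Kim", "Klotz", "Liu",
    "Miranda", "Moskowitz", "Neyman", "O'Brien", "Pearl", "Potter",
    "Reinmann", "Santos", "Shaw", "Thompson", "Utts", "Varga",
]
faculty = [
    "Andrew", "Besicovitch", "Fernandez", "Gupta", "Kim",
    "Lightman", "Moore", "Vicario", "West", "Yang",
]

# Label each stratum separately, as one would before reading a digit table:
# two-digit labels 01-30 for students, one-digit labels 0-9 for faculty.
student_labels = {f"{i:02d}": name for i, name in enumerate(students, start=1)}
faculty_labels = {str(i): name for i, name in enumerate(faculty)}

rng = random.Random(0)  # arbitrary seed, unrelated to the table of random digits

# One SRS per stratum: 4 of the 30 students, 2 of the 10 faculty members.
chosen_students = [student_labels[k] for k in rng.sample(sorted(student_labels), 4)]
chosen_faculty = [faculty_labels[k] for k in rng.sample(sorted(faculty_labels), 2)]

print("Students:", chosen_students)
print("Faculty:", chosen_faculty)
```

Sampling the two lists separately is what makes this a stratified sample: the result always contains exactly 4 students and 2 faculty members, which an SRS of 6 from all 40 people would not guarantee.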
Multistage Sampling
Multistage sampling designs combine several different sampling methods.
Examples:
Inference for Sampling
The purpose of a sample is to give us information about a larger population. The process of drawing conclusions about a population on the basis of sample data is called inference.
Why should we rely on random sampling?
- To eliminate bias in selecting samples from the list of available individuals.
- The laws of probability allow trustworthy inference about the population.
- Results from random samples come with a margin of error that sets bounds on the size of the likely error.
- Larger random samples give better information about the population than smaller samples.
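The claim that larger random samples give better information can be illustrated with a small simulation. This is a sketch under made-up assumptions: a hypothetical population of 10,000 individuals, 40% of whom have some trait, sample sizes of 25 and 400, and 500 repetitions of each. It measures how much the sample proportion varies from one random sample to the next.

```python
import random
import statistics

def sample_proportions(population, n, repetitions, rng):
    """Proportion of 1s in each of `repetitions` simple random samples of size n."""
    return [sum(rng.sample(population, n)) / n for _ in range(repetitions)]

rng = random.Random(42)

# Hypothetical population: 4,000 individuals with the trait (1), 6,000 without (0).
population = [1] * 4000 + [0] * 6000

# Standard deviation of the sample proportion across repeated samples.
spread_small = statistics.stdev(sample_proportions(population, 25, 500, rng))
spread_large = statistics.stdev(sample_proportions(population, 400, 500, rng))

print(f"n = 25:  spread of sample proportions = {spread_small:.3f}")
print(f"n = 400: spread of sample proportions = {spread_large:.3f}")
```

The spread for n = 400 should come out clearly smaller than for n = 25, which is the sense in which a larger random sample has a smaller margin of error.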
Homework: p. 226 #s 1-11 odd, 17-27 odd, 28, 29,
QUIZ (FOR A GRADE) ON SECTION 4.1 THE CLASS RIGHT
AFTER WE FINISH THE NOTES