







These notes cover the concept of sampling distributions, focusing on the sampling distribution of the mean and the central limit theorem. They provide examples and formulas to help understand how the distribution of a statistic approaches a normal distribution as the sample size increases.








The concept of a sampling distribution is difficult to understand at first, but it is an important statistical concept.

Motivating Example: Suppose we wanted to know the probabilities of the following events. How would we find them?

Event of Interest / How we’d find the probability?
  1. It rains on a July day in Boone.
  2. Height of an adult male is above 70 inches.
  3. A voter favors increasing the local sales tax.
  4. The average height of a group of 10 adult males is above 70 inches.
  5. A sample of 200 voters has a majority that favor the sales tax increase.

The last two examples are examples of sampling distributions.
Definition: The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
Example 1: Let X represent a man’s height. Suppose that X follows a N(68, 4) distribution. What is the probability that one randomly selected man will be over 70 inches tall? Now, suppose we observe 30 randomly selected values of X. What is the probability that the average of these 30 heights will be larger than 70 inches?

Example 2: Flight prices from Charlotte to Boston average $525 with a standard deviation of $75. A business traveler will fly this route 50 times in the next year. Supposing each of these 50 flights is a random selection from the population of all CLT-BOS flights, what is the probability that her average cost per flight is above $550?
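A minimal Python sketch of the flight-price calculation in Example 2 follows. It assumes the usual central limit theorem result for a sample mean (approximately Normal with mean equal to the population mean and standard deviation equal to the population standard deviation divided by the square root of n), which is not stated explicitly in this part of the notes. The helper `normal_cdf` and all variable names are illustrative choices, not part of the notes.

```python
import math

def normal_cdf(z):
    """Standard Normal CDF, P(Z <= z), via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Values taken from Example 2 (flight prices, Charlotte to Boston).
mu = 525.0     # population mean price, in dollars
sigma = 75.0   # population standard deviation, in dollars
n = 50         # number of flights treated as a random sample

# Assumption: the sample mean is approximately Normal with mean mu
# and standard deviation sigma / sqrt(n) (central limit theorem).
se = sigma / math.sqrt(n)
z = (550.0 - mu) / se
prob_above_550 = 1.0 - normal_cdf(z)

print(f"standard deviation of the sample mean: {se:.2f}")
print(f"z-score for an average of $550: {z:.2f}")
print(f"P(average cost > $550) is about {prob_above_550:.4f}")
```

Under these assumptions the probability comes out a little under 1%, far smaller than the chance that any single flight costs more than $550.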
Sampling Distribution Result 2: Let $\hat{p}$ represent a sample proportion, and p represent a population proportion. For large n,

$$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right),$$

at least approximately. This approximation tends to be reasonable when np and n(1 - p) are both at least 10.

Example 1: Suppose that 54% of the eligible voters in a city will vote for a candidate on Election Day. Before the election, a newspaper is interested in estimating the candidate’s support, so they take a poll of 300 random voters in the city.
(a) Let $\hat{p}$ represent the proportion of voters in a poll of 300 voters that support the candidate. What is the approximate sampling distribution of $\hat{p}$?
(b) Interpret the result in part (a) in words.
(c) Find the probability that the newspaper’s poll shows a minority that support the candidate.

Example 2:
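For the voter poll in Example 1 of this section (p = 0.54, n = 300), the calculation might be sketched in Python as below. The sketch reads the second argument of N(·, ·) in Result 2 as a standard deviation, and it interprets "a minority" as a sample proportion below 0.50; both readings are assumptions, as are the helper and variable names.

```python
import math

def normal_cdf(z):
    """Standard Normal CDF, P(Z <= z), via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

p = 0.54   # population proportion supporting the candidate (from the example)
n = 300    # poll size (from the example)

# Rule of thumb from Result 2: np and n(1 - p) should both be at least 10.
large_enough = n * p >= 10 and n * (1 - p) >= 10

# Assumed standard deviation of the sample proportion: sqrt(p(1 - p) / n).
sd_p_hat = math.sqrt(p * (1 - p) / n)

# Assumption: "a minority" means the sample proportion falls below 0.50.
z = (0.50 - p) / sd_p_hat
prob_minority = normal_cdf(z)

print(f"rule of thumb satisfied: {large_enough}")
print(f"standard deviation of p-hat: {sd_p_hat:.4f}")
print(f"P(p-hat < 0.50) is about {prob_minority:.4f}")
```

With these inputs the standard deviation of the sample proportion is about 0.029, so a poll showing minority support is unlikely but not rare (roughly 8%).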
The idea of a confidence interval is to draw a conclusion about a population parameter from a sample statistic. For STT 1810, we focus on the following situation:

Population Parameter: p, a proportion of interest (unknown)
Sample Statistic: $\hat{p}$, a sample proportion (calculated from a random sample of size n from the population of interest)

We will use the key result

$$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$$

to form our confidence intervals for p, based upon the sample proportion.

Recall that the empirical rule tells us that 95% of the time, any Normally distributed variable is within 2 standard deviations of its mean. A more precise value from the Z table is 1.96: 95% of the time, any Normally distributed variable is within 1.96 standard deviations of its mean. How can we use this result here? What are our conclusions?
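One way the 1.96 fact can be turned into an interval for p is sketched below in LaTeX. This is a standard argument offered as an illustration, not necessarily the exact route taken in class; the symbol $\sigma_{\hat{p}}$ is shorthand introduced here.

```latex
% \sigma_{\hat{p}} is shorthand introduced here for the standard deviation of \hat{p}.
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

% About 95% of samples give a \hat{p} within 1.96 standard deviations of p:
P\left( p - 1.96\,\sigma_{\hat{p}} \le \hat{p} \le p + 1.96\,\sigma_{\hat{p}} \right) \approx 0.95

% "\hat{p} is within 1.96 sd of p" is the same event as "p is within 1.96 sd of \hat{p}":
P\left( \hat{p} - 1.96\,\sigma_{\hat{p}} \le p \le \hat{p} + 1.96\,\sigma_{\hat{p}} \right) \approx 0.95

% Since p is unknown, estimate \sigma_{\hat{p}} using \hat{p} in place of p:
\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
```

The last line has the form "estimate plus or minus a margin of error", with 95% as the confidence level.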
Can we repeat these calculations with a different probability level (like, say, 90% or 99%)? What would change in our previous argument?

We will use these general findings to produce what we call confidence intervals for p. The general form of these intervals is: estimate ± margin of error. These intervals also have a confidence level, which is a measurement of certainty that the interval actually contains the parameter of interest.
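To see what changes with the probability level, the multiplier that replaces 1.96 can be looked up with Python's standard library instead of the Z table. This is a small sketch; the function name `critical_value` and the dictionary `z_stars` are illustrative names, not notation from the notes.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Multiplier z such that the central `confidence` area of N(0, 1) lies in (-z, z)."""
    tail = (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(1.0 - tail)

# Multipliers for the levels mentioned in the text.
z_stars = {level: critical_value(level) for level in (0.90, 0.95, 0.99)}

for level, z_star in z_stars.items():
    print(f"{level:.0%} confidence: multiplier = {z_star:.3f}")
```

Only the multiplier changes from one level to another: higher confidence requires a larger multiplier, and therefore a wider interval.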
(e) Do these calculations provide convincing evidence that more than half of all students prefer to remain on the quarter system? Explain.
(f) Do we believe this CI formula is reasonably accurate in this case? Why or why not?

Example 2: Example 10.63 in the book shows an interesting example of question wording. Gallup used two different questions to assess support for the death penalty in a February 1999 poll:

Question 1: Are you in favor of the death penalty for a person convicted of murder?
n = 543; for: 71%; against: 22%; no opinion: 7%

Question 2: What do you think should be the penalty for murder – the death penalty or life imprisonment with absolutely no possibility of parole?
n = 511; death penalty: 56%; life imprisonment: 38%; no opinion: 6%

Compute 95% CIs for the true proportion of adults who favor the death penalty in each case:
Question 1:
Question 2:
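The two Gallup intervals could be computed along the following lines. The sketch assumes the standard large-sample interval for a proportion (sample proportion plus or minus 1.96 times the square root of p-hat(1 - p-hat)/n); the function `proportion_ci` and the variable names are hypothetical helpers introduced for illustration.

```python
import math

Z_95 = 1.96  # multiplier for 95% confidence, from the Z table

def proportion_ci(p_hat, n, z=Z_95):
    """Large-sample confidence interval for a proportion: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Question 1: 71% in favor, n = 543 (values from the example).
ci_q1 = proportion_ci(0.71, 543)

# Question 2: 56% chose the death penalty, n = 511 (values from the example).
ci_q2 = proportion_ci(0.56, 511)

print(f"Question 1: ({ci_q1[0]:.3f}, {ci_q1[1]:.3f})")
print(f"Question 2: ({ci_q2[0]:.3f}, {ci_q2[1]:.3f})")
```

Under these assumptions the intervals are roughly 0.67 to 0.75 for Question 1 and 0.52 to 0.60 for Question 2, so they do not overlap.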
How Confidence Intervals Behave: The ideal scenario for a confidence interval would be high confidence AND a small margin of error (hence, an interval with small width). Unfortunately, achieving both these objectives is difficult. Often, researchers are forced to choose between one or the other. Let’s focus on the margin of error. The margin of error in our confidence intervals for p is:

$$z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where z is the multiplier from the Z table for the chosen confidence level (1.96 for 95%).
A small margin of error is always desirable. How can this margin of error be small? Ways the margin of error can be small:
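How each ingredient affects the margin of error can be explored numerically. The sketch below uses the large-sample margin of error for a proportion, z times the square root of p-hat(1 - p-hat)/n; the sample sizes, proportions, and names are made-up illustrative values, not data from the notes.

```python
import math

def margin_of_error(p_hat, n, z):
    """Margin of error z * sqrt(p_hat(1 - p_hat)/n) for a proportion interval."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical baseline: p_hat = 0.5, n = 400, 95% confidence.
base = margin_of_error(0.5, 400, 1.96)

# Quadrupling the sample size cuts the margin of error in half.
quadruple_n = margin_of_error(0.5, 1600, 1.96)

# Lowering the confidence level to 90% shrinks the multiplier.
lower_conf = margin_of_error(0.5, 400, 1.645)

# A sample proportion far from 0.5 makes p_hat(1 - p_hat) smaller.
far_from_half = margin_of_error(0.1, 400, 1.96)

for label, value in [("baseline", base), ("n x 4", quadruple_n),
                     ("90% level", lower_conf), ("p_hat = 0.1", far_from_half)]:
    print(f"{label:12s} margin of error = {value:.4f}")
```

Of these three levers, only the sample size and the confidence level are under the researcher's control; the value of the sample proportion is determined by the data.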
Example 2: Repeat the calculation(s) from Example 1, but now assume that $\hat{p}$ will be:

So, earlier in the course we learned that the margin of error for a poll was $1/\sqrt{n}$. Why is this formula for CIs different?
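One possible way to connect the two formulas is sketched below in LaTeX. It is offered as an illustration rather than as the notes' own answer, and it assumes the large-sample margin of error for a proportion with the empirical-rule multiplier 2 used in place of 1.96.

```latex
% With \hat{p} = 0.5 and the empirical-rule multiplier 2 in place of 1.96:
2\sqrt{\frac{0.5\,(1-0.5)}{n}} = 2 \cdot \frac{0.5}{\sqrt{n}} = \frac{1}{\sqrt{n}}

% For any \hat{p} between 0 and 1, \hat{p}(1-\hat{p}) \le \tfrac{1}{4}, so
2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le \frac{1}{\sqrt{n}}
```

Read this way, the quick formula is the largest value the 95% margin of error can take, reached when the sample proportion is 0.5.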