Understanding Population Parameters and Sample Statistics: Confidence Intervals and Sampling

This study guide covers population parameters versus sample statistics, uncertainty in estimates, and confidence intervals. It uses Barack Obama's approval rating from a Gallup survey to illustrate sample size, sample proportion, point estimate, interval estimate, and margin of error. It also discusses sampling distributions and how they help assess the accuracy of point estimates.

Typology: Study Guides, Projects, Research

2021/2022

Uploaded on 09/27/2022

shyrman
Questions about the Assignment

If your answer is wrong but you show your work, you can get more partial credit.

Population parameter versus sample statistic

Uncertainty in estimates

Sampling distribution

Confidence interval

Understanding Inference: Confidence Intervals I

[Figure: population, sample, sampling, statistical inference]

The Big Picture

A sample statistic is a number computed from sample data.
(e.g., sample mean: mean income of the people in the sample)

A population parameter is a number that describes some aspect of a population.
(e.g., population mean: mean income of the entire population)

We usually have a sample statistic and want to make inferences about the population parameter.

Statistic vs. Parameter

[Figure: population, sample, sampling, statistical inference, with the statistic and the parameter labeled]

The Big Picture

Statistic vs. Parameter

                 Sample statistic    Population parameter
Mean             x̄                   μ (mu)
Proportion       p̂                   p
Std. deviation   s                   σ (sigma)
Correlation      r                   ρ (rho)
Slope            b                   β (beta)

Gallup surveyed 1,500 Americans between June 9 and June 11, 2012, and 49% of these people approved of the job Barack Obama was doing as president.

What is the population? All Americans (~330 million)

What is the sample size? 1,500

Is this a categorical or quantitative variable? Categorical

For categorical variables, what sample statistic are we interested in? The sample proportion

Sample statistic: p̂ = 0.49 (sample proportion)

Based on this sample statistic, what do you think is the true proportion of Americans who approve of the job Barack Obama is doing as president?

Population parameter: p = ? (population proportion)

Obama’s Approval Rating

http://www.gallup.com/poll/113980/Gallup-Daily-Obama-Job-Approval.aspx

The sample statistic gives a point estimate (a single number) for the population parameter.

Usually, it is more useful to provide an interval estimate, which gives a range of plausible values for the population parameter:

interval estimate = point estimate ± margin of error

How do we determine the margin of error?

Point and Interval Estimates

Point estimate: p̂ = 0.49

Interval estimate: 0.49 ± 0.03 (point estimate ± margin of error)

Between 46% and 52% of Americans currently approve of the job Obama is doing as president.

Obama’s Approval Rating
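The interval estimate for the Gallup example can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function name `interval_estimate` is illustrative, and the 0.03 margin of error is inferred from the 46% to 52% range quoted in the example.

```python
def interval_estimate(point_estimate, margin_of_error):
    """Return (lower, upper) = point estimate +/- margin of error."""
    return (point_estimate - margin_of_error, point_estimate + margin_of_error)

# Gallup example: 49% approval, with an assumed margin of error of 3 points.
lower, upper = interval_estimate(0.49, 0.03)
print(f"Between {lower:.0%} and {upper:.0%}")
```

The function only packages the formula; the open question of where the margin of error comes from is what the sampling distribution answers.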

The population parameter is a fixed value.

Sample statistics vary from sample to sample. They will not match the population parameter exactly.

For a given sample statistic, what are plausible values for the population parameter? How much uncertainty surrounds the sample statistic?

It depends on how much the sample statistic varies from sample to sample!

Important Points
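A small simulation makes the contrast between a fixed parameter and a varying statistic concrete. This is a sketch under stated assumptions: the population proportion `TRUE_P = 0.50` is a made-up value chosen for illustration, and `sample_proportion` is a hypothetical helper, not part of the course material.

```python
import random

def sample_proportion(population_p, n, rng):
    """Draw n independent yes/no outcomes and return the proportion of yeses."""
    hits = sum(1 for _ in range(n) if rng.random() < population_p)
    return hits / n

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_P = 0.50           # hypothetical population parameter (fixed)
N = 1500                # sample size, as in the Gallup example

# Five different samples from the same population give five different statistics.
statistics_seen = [sample_proportion(TRUE_P, N, rng) for _ in range(5)]
for i, p_hat in enumerate(statistics_seen, start=1):
    print(f"sample {i}: p-hat = {p_hat:.3f}")
```

Each printed value should be near 0.50 without matching it exactly, which is the sample-to-sample variation the points above describe.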

What proportion of Reese’s Pieces are orange?

Reese’s Pieces

The sample size does not affect the shape of the sampling distribution.

The sample size does not affect the center of the sampling distribution.

The sample size does affect the spread of the sampling distribution.

As the sample size increases, the spread decreases.

The Importance of Sample Size

n = 1,

n = 200

n = 50

Each dot represents a sample statistic. The number of samples taken to generate these sampling distributions is the same. What varies for each sampling distribution is the size of the sample taken to calculate the sample statistic.

3 Sampling Distributions

The sample size influences the spread of the sampling distribution (i.e., the variation in sample statistics), which influences the margin of error for our estimate of the population parameter.
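The effect of sample size on spread can be checked by simulation. The sketch below assumes a hypothetical population in which half the pieces are orange (`TRUE_P = 0.5`) and uses three illustrative sample sizes (50, 200, and 1,000); the helper name `sampling_distribution` is also an assumption.

```python
import random
import statistics

def sampling_distribution(population_p, n, num_samples, rng):
    """Return num_samples sample proportions, each from a sample of size n."""
    return [
        sum(rng.random() < population_p for _ in range(n)) / n
        for _ in range(num_samples)
    ]

rng = random.Random(1)
TRUE_P = 0.5        # hypothetical proportion of orange pieces
NUM_SAMPLES = 1000  # same number of samples for every sample size

spread = {}
for n in (50, 200, 1000):
    props = sampling_distribution(TRUE_P, n, NUM_SAMPLES, rng)
    spread[n] = statistics.stdev(props)  # spread of the sampling distribution
    center = statistics.mean(props)
    print(f"n = {n:4d}: center = {center:.3f}, spread (SD) = {spread[n]:.4f}")
```

The centers should stay close to the assumed population proportion while the spread shrinks as n grows.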

If we increased the sample size to 100, the standard deviation of the sampling distribution would... A. increase B. decrease C. remain the same

and the margin of error for our point estimate would... A. increase B. decrease C. remain the same

Sample Size

For each sample, the sample statistic (i.e., the proportion of orange pieces) would be closer to the population proportion, so the sample statistics would also be closer to each other.

Increasing the sample size will cause the standard deviation of the sampling distribution to decrease.

Let’s Test Our Hypothesis

http://www.rossmanchance.com/applets/Reeses3/ReesesPieces.html

Hypothesis

If you take random samples, the sampling distribution will be centered around the true population parameter.

If sampling bias exists (if you do not take random samples), the sampling distribution may provide inaccurate information about the true population parameter.

Random Samples
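The difference between random and biased sampling can be illustrated with a simulated population. Everything specific here is an assumption made for the sketch: a population of 1,000 candies with half of them orange, and a bias in which orange candies are twice as likely to be picked.

```python
import random
import statistics

rng = random.Random(2)
# Hypothetical population: 1,000 candies, half of them orange (1 = orange).
population = [1] * 500 + [0] * 500
TRUE_P = sum(population) / len(population)
# Hypothetical bias: orange candies are twice as likely to be picked.
weights = [2 if candy == 1 else 1 for candy in population]

def center_of_sampling_distribution(draw, num_samples=1000):
    """Mean of the sample proportions produced by repeated calls to draw()."""
    return statistics.mean(statistics.mean(draw()) for _ in range(num_samples))

random_center = center_of_sampling_distribution(
    lambda: rng.sample(population, 50)  # simple random sample
)
biased_center = center_of_sampling_distribution(
    lambda: rng.choices(population, weights=weights, k=50)  # biased draw
)
print(f"true p = {TRUE_P}, random center = {random_center:.3f}, "
      f"biased center = {biased_center:.3f}")
```

With random samples the center should sit near the true proportion; with the biased draw it should sit well above it, no matter how many samples are taken.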

A confidence interval for a population parameter is an interval computed from sample data by a method that will contain the true population parameter for a specified proportion of all samples.

The confidence level is the proportion of samples whose intervals contain the true population parameter.

The confidence level indicates how confident we are that our interval contains the population parameter.

A 95% confidence interval will contain the true population parameter for 95% of all samples. We are 95% confident that the true population parameter falls within this range.

Confidence Intervals

The population parameter (p) is fixed. It is typically not known.

The sample statistic (x_i) is random. It depends on the sample.

The confidence interval (x_i ± 2 SD)* is random. It depends on the sample statistic.

The sampling distribution is made up of the sample statistics and is centered on the population parameter.

About 95% of the sample statistics will fall within 2 standard deviations of the population parameter, so about 95% of the sample intervals will contain the population parameter.

Confidence Intervals

http://bcs.whfreeman.com/ips4e/cat_010/applets/confidenceinterval.html

[Figure: sampling distribution centered on the population parameter p, with 95% of the sample statistics within 2 SDs; a sample statistic is shown with its confidence interval.]

*The standard deviation used to calculate the confidence interval is the standard deviation of the sampling distribution (not the sample distribution).
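The coverage claim can be checked by simulation: build an interval of ± 2 SDs around every sample statistic and count how many intervals contain the parameter. This is a sketch under assumptions, with a made-up population proportion (`TRUE_P = 0.5`) and sample size (`N = 200`) that do not come from the slides.

```python
import random
import statistics

rng = random.Random(3)
TRUE_P = 0.5        # hypothetical population parameter (fixed, normally unknown)
N = 200             # size of each sample
NUM_SAMPLES = 2000  # number of samples drawn

# One sample statistic (a sample proportion) per sample.
stats = [
    sum(rng.random() < TRUE_P for _ in range(N)) / N
    for _ in range(NUM_SAMPLES)
]
sd = statistics.stdev(stats)  # SD of the sampling distribution

# An interval "covers" when the parameter lies within 2 SDs of the statistic.
covered = sum(1 for s in stats if s - 2 * sd <= TRUE_P <= s + 2 * sd)
coverage = covered / NUM_SAMPLES
print(f"share of intervals containing p: {coverage:.3f}")
```

The printed share should be close to 0.95. It will not be exactly 0.95, because 2 is a rounded multiplier and proportions from a finite sample take only discrete values.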

Population Proportion

A 95% confidence interval can be created by:

sample statistic ± 2 × standard deviation

point estimate ± margin of error

The point estimate is calculated from our sample.

The margin of error is calculated from the sampling distribution.

Confidence Intervals

The standard deviation of the sampling distribution (i.e., the distribution of sample statistics) is called the standard error (SE). This name clearly distinguishes it from the standard deviation of the sample distribution.

Standard Error: The Standard Deviation of the Sampling Distribution

To create a plausible range of values for a parameter:

  1. Take many random samples from the population, and compute the sample statistic for each sample.
  2. Compute the standard error as the standard deviation of all these statistics.
  3. Use: sample statistic ± 2 × standard error
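The three steps above can be sketched in code for the Gallup example. Because the true population proportion is unknown, the simulation needs a stand-in value; using 0.49 for `ASSUMED_P` is an assumption made purely so that the steps can be run.

```python
import random
import statistics

rng = random.Random(4)
ASSUMED_P = 0.49    # assumption: stand-in for the unknown population proportion
N = 1500            # sample size from the Gallup example
NUM_SAMPLES = 1000  # number of simulated samples

# Step 1: take many random samples and compute the statistic for each.
sample_stats = [
    sum(rng.random() < ASSUMED_P for _ in range(N)) / N
    for _ in range(NUM_SAMPLES)
]

# Step 2: the standard error is the standard deviation of those statistics.
standard_error = statistics.stdev(sample_stats)

# Step 3: sample statistic +/- 2 x standard error.
p_hat = 0.49
lower = p_hat - 2 * standard_error
upper = p_hat + 2 * standard_error
print(f"SE = {standard_error:.4f}, interval = ({lower:.3f}, {upper:.3f})")
```

Under this assumption, two standard errors should come out a little under 0.03, in line with the roughly 3-point margin of error in the approval-rating example.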

One small problem…

Often we only have one sample!

How can we calculate the variation in sample statistics if we only have one sample?

Summary

Part I: Graded Problems

3.12, 3.16, 3.24, and 3.

Part II: (Type up this assignment in a Word document)

Go to http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss

Find 3 quantitative variables, and for each one find another quantitative variable that you think is associated with it. Conduct a correlation test to see how correlated they are.

For each pair of variables, provide the following information:

  1. Variable names
  2. Question related to the variable
  3. An explanation, in your own words, of what the variable is measuring
  4. The unit used to measure the variable (e.g., years, dollars, inches, etc.)
  5. Min, Max, Mean, Median, Standard Deviation (Std Dev)
  6. The correlation score
  7. An interpretation of the correlation score

Assignment

Calculating Correlations from the GSS

Under the “Analysis” tab, click on the “Correlation matrix” tab.

Enter the names of two quantitative variables here.

Click on this button and the correlation statistics will open up in a new window.

Calculating Correlations from the GSS

This is what will pop up in the new window. This is the correlation (r) score for the two variables.
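For readers who want to see what the r score measures, here is a minimal sketch of the sample correlation and the requested summary statistics. The two variables and their values are made up for illustration and are not GSS data; `pearson_r` is a hypothetical helper written out by hand rather than the tool the site uses.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Sample correlation r between two equally long lists of numbers."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for illustration only; not GSS data.
years_of_education = [10, 12, 12, 14, 16, 16, 18, 20]
hours_of_tv_per_day = [5, 4, 4, 3, 3, 2, 2, 1]

for name, values in [("education", years_of_education),
                     ("tv hours", hours_of_tv_per_day)]:
    print(name, min(values), max(values), statistics.mean(values),
          statistics.median(values), round(statistics.stdev(values), 2))

r = pearson_r(years_of_education, hours_of_tv_per_day)
print(f"r = {r:.2f}")
```

A value of r near -1 or 1 indicates a strong linear association, and a value near 0 indicates a weak one; the sign gives the direction.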