Sampling Distributions: Understanding Mean, Proportion, and Total in Statistics

This document introduces the concept of sampling distributions, focusing on the mean, proportion, and total. It covers the terms 'sampling distribution', 'law of large numbers', and 'central limit theorem'. It also includes examples of calculating the mean and standard deviation of the sampling distribution for the sample mean and sample proportion, and of finding probabilities using the central limit theorem.

Typology: Study Guides, Projects, Research

2021/2022

Uploaded on 09/27/2022

ryangosling
ryangosling 🇺🇸



Sampling Distributions

Module 7

Statistics 251: Statistical Methods

Updated 2021

Three Types of Distributions

data distribution: the distribution of a variable in a sample

population distribution: the probability distribution of a single observation of a variable

sampling distribution: the probability distribution of a statistic

Terms I

sampling distribution: the probability distribution of a statistic. It describes the values the statistic takes over all possible random samples of the same size $n$ from a population, and how often each value occurs in repeated sampling. Given simple random samples of size $n$ from a given population, with a characteristic such as the mean ($\bar{X}$), proportion ($\hat{\pi}$)^1, or standard deviation ($s$) measured for each sample, the probability distribution of all the measured values is called a sampling distribution. It is the distribution of all possible outcomes of that statistic.

Using a statistic to estimate a parameter is the main function of inferential statistics, and the sampling distribution provides the properties of that statistic.

Terms II

law of large numbers: states that as the number of repetitions of an experiment is increased, the relative frequency obtained in the experiment tends to become ever closer to the theoretical probability. Even though the outcomes do not happen according to any set pattern or order (overall), the long-term observed relative frequency will approach the theoretical probability.

^1 $\pi$ and $\hat{\pi}$ are NOT 3.14159...; they are being used like $\mu$ and the other Greek letters we are using for notation.

Simulation of LLN

[Figure: proportion of successes plotted against count of trials]
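The law of large numbers can be illustrated with a short simulation. The sketch below is not the original slide's simulation; it assumes a success probability of 0.5 (a fair coin) and uses a hypothetical helper, `running_proportion`, to track the proportion of successes as the count of trials grows.

```python
import random

def running_proportion(p, n_trials, seed=0):
    """Simulate n_trials independent success/failure trials with success
    probability p and return the running proportion of successes."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    successes = 0
    proportions = []
    for trial in range(1, n_trials + 1):
        if rng.random() < p:           # success with probability p
            successes += 1
        proportions.append(successes / trial)
    return proportions

# Assumed setup: fair coin, 100,000 trials.
props = running_proportion(p=0.5, n_trials=100_000)

# The proportion wanders early on and settles near p as trials accumulate.
for count in (10, 100, 1_000, 10_000, 100_000):
    print(count, round(props[count - 1], 4))
```

Early proportions can be far from 0.5; the later ones should sit close to it, which is the long-run behavior the law describes.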

Central Limit Theorem (CLT)

Definition: The sampling distribution of the sample mean is approximately normal with mean $\mu_{\bar{X}} = \mu$ and standard deviation (of the sampling distribution of the sample mean) $se = \frac{\sigma}{\sqrt{n}}$, provided $n$ is sufficiently large.

Sampling distribution of the Sample Mean

If we take $n$ observations of a quantitative variable and then compute the mean ($\bar{x}$) of those observations in the sample, then $\bar{x}$ is the sample mean statistic.

Assumptions: Each observation $x$ has the same probability distribution with mean $\mu$ and standard deviation $\sigma$, and the observations are independent.

Properties of the Sampling Distribution of $\bar{x}$

(1) The mean of the sampling distribution is μ

(2) The standard deviation of the sampling distribution is $se = \frac{\sigma}{\sqrt{n}}$

(3) The shape of the sampling distribution becomes more like a normal distribution as n increases

Sampling distribution of the Sample Mean

$\bar{X} \sim N(\mu, se_{\text{mean}})$

Standard error of the mean: $\sigma_{\bar{X}} = se_{\text{mean}} = \frac{\sigma}{\sqrt{n}}$

$z = \frac{\bar{X} - \mu}{se_{\text{mean}}}$

Sample sizes should be $n \ge 30$ for the sample mean. If the population distribution is already inherently normal, the sample size stipulation can be ignored.

Simulation example

The linked file shows how taking multiple random samples of the same size from the same population will produce an approximately normal distribution of the sample means. The examples show a normal distribution, an exponential distribution, and a binomial distribution.

Link: CLT simulation
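The linked simulation file is not reproduced here. As a rough stand-in, the sketch below draws repeated samples from an exponential population and checks that the sample means behave as the CLT predicts. The rate of 1 (so $\mu = 1$ and $\sigma = 1$), the sample size of 30, the 5,000 repetitions, and the helper name `sample_means` are all assumptions made for illustration.

```python
import random
import statistics
from math import sqrt

def sample_means(draw, n, reps, seed=1):
    """Take `reps` random samples of size n using draw(rng) and
    return the list of their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)])
            for _ in range(reps)]

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
n = 30
means = sample_means(lambda rng: rng.expovariate(1.0), n=n, reps=5_000)

print("mean of sample means:", round(statistics.fmean(means), 4))  # near mu
print("sd of sample means:  ", round(statistics.stdev(means), 4))  # near sigma/sqrt(n)
print("sigma / sqrt(n):     ", round(1 / sqrt(n), 4))
```

Although the exponential population is strongly right-skewed, a histogram of `means` should look roughly bell-shaped, centered near $\mu$ with spread near $\sigma/\sqrt{n}$.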

CLT for sample mean ($\bar{X}$) and sample sum/total ($\hat{\tau}$)

The level of a particular pollutant, nitrogen dioxide (NO₂), in the exhaust of a hypothetical model of car driven in city traffic has a mean of 2.1 grams per mile (g/m) and a standard deviation of 0.3 g/m. Suppose a company has a fleet of 35 of these cars.

(a) What is the mean and standard deviation of the sampling distribution of the sample mean?

mean: $\mu_{\bar{X}} = \mu = 2.1$ and $se_{\text{mean}} = \frac{\sigma}{\sqrt{n}} = \frac{0.3}{\sqrt{35}} = 0.0507$

$\bar{X} \sim N(\mu, se_{\text{mean}})$, so $\bar{X} \sim N(2.1, 0.0507)$

CLT for $\bar{X}$ and $\hat{\tau}$ solutions

(b) Find the probability that the mean NO₂ level is less than 2.03 g/m

$P(\bar{X} < 2.03) = P\left(Z < \frac{2.03 - 2.1}{0.0507}\right) = P(Z < -1.38) = 0.083793$

(c) EPA mandates state that the average for a fleet of these cars cannot exceed 2.2 g/m. Find the probability that the fleet's mean NO₂ level exceeds the EPA mandate.

$P(\bar{X} > 2.2) = 1 - P\left(Z < \frac{2.2 - 2.1}{0.0507}\right) = 1 - P(Z < 1.97) = 1 - 0.975581 = 0.024419$
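Parts (a) through (c) can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the value `sigma = 0.3` is an assumption inferred from the worked standard error of 0.0507, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Fleet example. sigma = 0.3 is an assumption inferred from the worked values.
mu, sigma, n = 2.1, 0.3, 35

se_mean = sigma / sqrt(n)          # (a) standard error of the sample mean
xbar = NormalDist(mu, se_mean)     # CLT approximation for the sample mean

p_below = xbar.cdf(2.03)           # (b) P(mean < 2.03)
p_above = 1 - xbar.cdf(2.2)        # (c) P(mean > 2.2)

print(f"se_mean = {se_mean:.4f}")
print(f"P(mean < 2.03) = {p_below:.4f}")
print(f"P(mean > 2.2)  = {p_above:.4f}")
```

Because the code does not round the z-scores to two decimals, its results can differ from the table-based answers in the fourth decimal place.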


(d) At most, 25% of these cars exceed what mean NO₂ value?

Find the z-score that cuts off the top 25%, which is the same as the bottom 75% (this is also $Q_3$): $z_{0.75} = 0.67449$. Next use $z = \frac{\bar{X} - \mu}{se_{\text{mean}}}$ and solve for $\bar{X}$: $\bar{X} = z(se_{\text{mean}}) + \mu$

$\bar{X} = (0.67449)(0.0507) + 2.1 = 2.134197$
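The same cutoff can be found by asking for the 75th percentile directly. A small sketch, again assuming `sigma = 0.3` and using `NormalDist.inv_cdf` from the standard library:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.1, 0.3, 35            # sigma = 0.3 is an assumed value
se_mean = sigma / sqrt(n)

z_75 = NormalDist().inv_cdf(0.75)      # z-score with 75% of the area below it
cutoff = mu + z_75 * se_mean           # solve z = (xbar - mu) / se for xbar

# Equivalent one-step form: the 75th percentile of N(mu, se_mean).
cutoff_direct = NormalDist(mu, se_mean).inv_cdf(0.75)

print(f"z_0.75 = {z_75:.5f}")
print(f"cutoff = {cutoff:.4f}")
```

Both routes give the same value; the two-step version mirrors the hand calculation.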


(e) What is the mean and standard deviation of the total amount (sum), in g/m, of NO₂ in the exhaust for the fleet?

$\tau = n\mu = 35(2.1) = 73.5$

$se_{\text{sum}} = \sqrt{n}\,\sigma = \sqrt{35}\,(0.3) = 1.7748$

$\hat{\tau} \sim N(\tau, se_{\text{sum}})$, so $\hat{\tau} \sim N(73.5, 1.7748)$


(f) Find the probability that the total amount of NO₂ for the fleet is between 70 and 75 g/m

$P(70 < \hat{\tau} < 75) = P\left(\frac{70 - 73.5}{1.7748} < Z < \frac{75 - 73.5}{1.7748}\right) = P(-1.97 < Z < 0.85) = P(Z < 0.85) - P(Z < -1.97) \approx 0.7779$
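Parts (e) and (f) follow the same pattern with the total in place of the mean. The sketch below assumes `sigma = 0.3` (inferred from the worked $se_{\text{sum}} = 1.7748$) and computes the interval probability without rounding the z-scores, so it may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.1, 0.3, 35        # sigma = 0.3 is an assumed value

tau = n * mu                       # (e) mean of the sample total
se_sum = sqrt(n) * sigma           # (e) standard error of the sample total
total = NormalDist(tau, se_sum)    # CLT approximation for the total

# (f) P(70 < total < 75) as a difference of two cumulative probabilities
p_between = total.cdf(75) - total.cdf(70)

print(f"tau = {tau:.1f}, se_sum = {se_sum:.4f}")
print(f"P(70 < total < 75) = {p_between:.4f}")
```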

CLT for proportion ($\hat{\pi}$)

Mars company claims that 10% of the M&M’s it produces are green. Suppose that candies are packaged at random in bags that contain 60 candies.

(a) Describe the sampling distribution of the sample proportion (what should the distribution look like?); calculate the mean proportion and standard deviation of the sampling distribution of the sample proportion of green M&M’s in bags that contain 60 candies (calculate $\pi$ and $se$).

(b) What is the probability that a bag of 60 candies will have more than 13% green M&M’s?

CLT for $\hat{\pi}$ solutions

(a) Describe the sampling distribution of the sample proportion; calculate the mean proportion and standard deviation of the sampling distribution of the sample proportion of green M&M’s in bags that contain 60 candies.

The distribution of the sample proportion will be approximately normal since $n \ge 60$. The mean proportion is $\pi = 0.1$ and the standard error is

$se = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{(0.1)(1 - 0.1)}{60}} = 0.0387$

(the standard deviation of the sampling distribution of the sample proportion). Thus

$\hat{\pi} \sim N(0.1, 0.0387)$


(b) What is the probability that a bag of 60 candies will have more than 13% green M&M’s?

$P(\hat{\pi} > 0.13) = P\left(Z > \frac{0.13 - 0.1}{0.0387}\right) = P(Z > 0.78) = 1 - P(Z < 0.78) = 1 - 0.782305 = 0.217695$
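The proportion example can be checked the same way. This is a minimal sketch using the values stated in the problem ($\pi = 0.1$, $n = 60$); the variable names are illustrative, and `pi_green` is used to avoid confusion with the constant 3.14159....

```python
from math import sqrt
from statistics import NormalDist

pi_green, n = 0.10, 60                     # claimed proportion and bag size

se_prop = sqrt(pi_green * (1 - pi_green) / n)   # standard error of the sample proportion
p_hat = NormalDist(pi_green, se_prop)           # CLT approximation for the sample proportion

p_more_than_13 = 1 - p_hat.cdf(0.13)       # P(sample proportion > 0.13)

print(f"se = {se_prop:.4f}")
print(f"P(p_hat > 0.13) = {p_more_than_13:.4f}")
```

The result uses the unrounded z-score (about 0.775 rather than 0.78), so it lands slightly above the table-based value.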