Facts and Formulas - Statistical Inference - Study Guide | STAT 431, Study notes of Statistics

University of Pennsylvania (UPenn), Statistics

Material Type: Notes; Class: STATISTICAL INFERENCE; Subject: Statistics; University: University of Pennsylvania; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010


Statistics 431: Statistical Inference

Facts and Formulas

Probability foundations

The normal distribution and its samples

• The probability density function of a $N(\mu, \sigma^2)$ rv is

\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{1}{2}\,\frac{(x-\mu)^2}{\sigma^2} \right). \]

• The population mean is $\mu$; the population SD is $\sigma$.

• For a sample $X_1, \ldots, X_n$ of size $n$ from a normal population,

– The sample estimate of $\mu$ is the sample mean

\[ \bar X = \frac{1}{n} \sum_{i=1}^{n} X_i. \]

– The sample estimate of $\sigma^2$ is the sample variance

\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2 = \frac{n}{n-1} \left[ \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar X^2 \right]. \]

• The distribution of the normal sample mean: $\bar X \sim N(\mu, \sigma^2/n)$. The SD of $\bar X$, $\sigma/\sqrt{n}$, is also called the standard error (SE) of $\bar X$. We estimate it as $S/\sqrt{n}$.

• $\bar X$ and $S^2$ are independent as rvs. The distribution of the sample variance: $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$.

• The sample histogram and the normal quantile plot are two graphical tools to judge whether a sample comes from an approximately normal population. Prefer the quantile plot.
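As a small illustration of the sample mean, the two equivalent forms of the sample variance, and the estimated standard error, here is a minimal Python sketch. The function name `sample_summary` and the data values are made up for the example and are not part of the course notes.

```python
import math

def sample_summary(xs):
    """Return (mean, S^2 by definition, S^2 by shortcut, estimated SE of the mean)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    # Definition: S^2 = sum (x_i - mean)^2 / (n - 1)
    var_def = sum((x - mean) ** 2 for x in xs) / (n - 1)
    # Shortcut: S^2 = n/(n-1) * [ (1/n) * sum x_i^2 - mean^2 ]
    var_short = n / (n - 1) * (sum(x * x for x in xs) / n - mean ** 2)
    # Estimated standard error of the sample mean: S / sqrt(n)
    se = math.sqrt(var_def / n)
    return mean, var_def, var_short, se

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
mean, var_def, var_short, se = sample_summary([2.0, 4.0, 4.0, 5.0, 7.0])
print(mean, var_def, var_short, se)
```

The two variance forms agree up to floating-point rounding. The shortcut form can lose precision when the mean is large relative to the spread, so the definition is usually the safer one to compute.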

The binomial distribution and its samples

• Let $X$ count the number of "successes" in $n$ independent Bernoulli trials, each with probability $p$ of "success." Then $X$ has the binomial distribution, $X \sim \mathrm{Bin}(n, p)$. The probability mass function of a binomial rv is

\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, \ldots, n. \]


• The population mean is $np$; the population SD is $\sqrt{np(1-p)}$.

• For a binomial rv $X$ based on a Bernoulli sample $Z_1, \ldots, Z_n$ from a population, the estimate of $p$ is $\hat p = X/n$.

• The SD of $\hat p$ is $\sqrt{p(1-p)/n}$; it is also called the SE of $\hat p$. We estimate it as $\sqrt{\hat p(1-\hat p)/n}$.

• For large $n$, the distribution of $\hat p$ is approximately $N(p, p(1-p)/n)$.
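The binomial facts above can be checked numerically. The following Python sketch evaluates the probability mass function and the estimate of $p$ with its estimated SE. The helper names `binom_pmf` and `proportion_estimate` and the numbers used are assumptions for illustration only.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def proportion_estimate(x, n):
    """Return (p_hat, estimated SE of p_hat) from x successes in n trials."""
    p_hat = x / n
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se_hat

# Hypothetical example: n = 10 trials with success probability p = 0.3.
n, p = 10, 0.3
probs = [binom_pmf(k, n, p) for k in range(n + 1)]
total = sum(probs)                                   # should be 1
mean = sum(k * q for k, q in enumerate(probs))       # should equal n * p
var = sum((k - mean) ** 2 * q for k, q in enumerate(probs))  # should equal n * p * (1 - p)
print(total, mean, math.sqrt(var))

print(proportion_estimate(30, 100))
```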

Chapter 7: Confidence intervals

One sample mean

• A $100\gamma\% = 100(1-\alpha)\%$ confidence interval (CI) for an unknown population mean $\mu$ has the general form

\[ \bar X \pm C^* \cdot \frac{\sigma^*}{\sqrt n} = \left[ \bar X - C^* \cdot \frac{\sigma^*}{\sqrt n},\ \bar X + C^* \cdot \frac{\sigma^*}{\sqrt n} \right], \]

where $C^*$ is an appropriate upper quantile and $\sigma^*$ is an appropriate population SD or estimate thereof. The meaning of the confidence statement is that $P(\mu \in \text{Interval}) = \gamma$, at least approximately. The important situations are:

– $\sigma$ known and either population normal or $n$ large: $\sigma^* = \sigma$ and $C^* = z_{\alpha/2}$, the $(\alpha/2)$ upper quantile of the standard normal.

– $\sigma$ unknown and $n$ large: $\sigma^* = S$ and $C^* = z_{\alpha/2}$.

– $\sigma$ unknown and population normal: $\sigma^* = S$ and $C^* = t_{\alpha/2;\,n-1}$, the $(\alpha/2)$ upper quantile of the $t$ distribution with $n-1$ degrees of freedom (df).

• The sample size needed to get an interval of width $w$ is (approximately)

\[ n(w) = \left( 2 z_{\alpha/2}\, \frac{\sigma}{w} \right)^2, \]

rounded up to the nearest integer. When $\sigma$ is unknown, use an estimate from previous experience or from the corresponding value of $S$ in a pilot experiment.
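To make the interval and sample-size formulas concrete, here is a minimal Python sketch for the cases that use a normal quantile ($C^* = z_{\alpha/2}$). It relies only on `statistics.NormalDist` from the standard library. The function names and the numbers are assumptions for illustration. The $t$-based case is not covered because the standard library has no $t$ quantile function.

```python
import math
from statistics import NormalDist

def z_upper(a):
    """Upper-a quantile of the standard normal: P(Z > z_a) = a."""
    return NormalDist().inv_cdf(1 - a)

def z_interval(xbar, sigma_star, n, alpha=0.05):
    """100(1 - alpha)% CI: xbar +/- z_{alpha/2} * sigma_star / sqrt(n)."""
    half = z_upper(alpha / 2) * sigma_star / math.sqrt(n)
    return xbar - half, xbar + half

def sample_size_for_width(w, sigma, alpha=0.05):
    """Smallest n with total interval width at most w (approximately)."""
    return math.ceil((2 * z_upper(alpha / 2) * sigma / w) ** 2)

# Hypothetical numbers: xbar = 10, sigma = 2, n = 25, 95% confidence.
lo, hi = z_interval(10.0, 2.0, 25)
print(lo, hi)

# How many observations for a 95% interval of total width 1 when sigma = 2?
n_needed = sample_size_for_width(1.0, 2.0)
print(n_needed)
```

Note that $w$ here is the full width of the interval, not the half-width (margin of error), which is why the factor 2 appears inside the square.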

One population proportion

• When $n$ is large (say, $n\hat p(1-\hat p) > 20$) and $\hat p$ is not too near 0 or 1, you can use the classical large-sample CI formula,

\[ \hat p \pm z_{\alpha/2} \sqrt{\frac{\hat p(1-\hat p)}{n}}. \]
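The large-sample interval for a proportion can be sketched in Python as follows. The function names `large_sample_ok` and `proportion_ci` and the counts used are hypothetical. The check simply encodes the rule of thumb $n\hat p(1-\hat p) > 20$ quoted above.

```python
import math
from statistics import NormalDist

def large_sample_ok(x, n, threshold=20.0):
    """Rule of thumb from the text: n * p_hat * (1 - p_hat) > 20."""
    p_hat = x / n
    return n * p_hat * (1 - p_hat) > threshold

def proportion_ci(x, n, alpha=0.05):
    """Classical large-sample CI: p_hat +/- z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

# Hypothetical data: 120 successes in 400 trials.
print(large_sample_ok(120, 400))
print(proportion_ci(120, 400))
```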

[…]

and $P_{\mu \in H_A}(\text{Do not reject } H_0) = P(\text{Type II error}) = \beta$ for this $\mu$. The significance level $\alpha$ is the probability of a Type I error at the boundary value $\mu_0$. The power of the test is $P_{\mu \in H_A}(\text{Reject } H_0) = 1 - \beta$.

• If we observe the test statistic value $T = t$, the p-value is the smallest $\alpha$ at which we can reject $H_0$ using $t$. If you know the p-value of a test, you know the outcome for every level $\alpha$:

p-value $< \alpha \Rightarrow$ reject; p-value $\ge \alpha \Rightarrow$ do not reject.

• Duality: for the usual two-sided tests, a level $\alpha$ test does not reject $H_0: \mu = \mu_0$ exactly when a $100(1-\alpha)\%$ CI for $\mu$ contains $\mu_0$. (There is a similar relationship between one-sided tests and upper/lower confidence bounds.)

Particular tests: one population mean

• A test of $H_0: \mu = \mu_0$ vs $H_A: \mu \ne \mu_0$ rejects when $|T| > C^*$, where

\[ T = \frac{\bar X - \mu_0}{\sigma^*/\sqrt n}. \]

Here $\sigma^*$ and $C^*$ are as in the above discussion of two-sided confidence intervals.

• The p-value corresponding to $T = t$ can be found by looking up $P(|T| > |t|)$ for the normal distribution (when $n$ is large) or the $t_{n-1}$ distribution (when $n$ is small). NOTE: because of the absolute value signs, you must multiply the tabled one-tail value by 2.

• The sample size $n$ at which a two-sided level $\alpha$ test has power $1-\beta$ under the alternative $\mu'$ is approximately

\[ n = \left[ \frac{\sigma^* (z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'} \right]^2. \]

In the one-sided case, put $z_\alpha$ in place of $z_{\alpha/2}$. (The resulting $n$ is only valid if it is large, since the formula uses large-sample normality.)

• A test of $H_0: \mu \ge \mu_0$ vs $H_A: \mu < \mu_0$ would reject when $T < C^*$, where $C^*$ is from the lower confidence bound case discussed above. p-values can be found analogously (do not multiply the tabled value by 2).
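The two-sided test, its p-value, and the power-based sample size can be sketched as below for the large-sample (normal) case. The function names and numeric inputs are assumptions made for the example. A small-sample $t$ version would need a $t$ distribution, which the Python standard library does not provide.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal

def z_test_two_sided(xbar, mu0, sigma_star, n, alpha=0.05):
    """Return (t, p_value, reject) for H0: mu = mu0 vs HA: mu != mu0."""
    t = (xbar - mu0) / (sigma_star / math.sqrt(n))
    # Two-sided p-value: twice the upper-tail area beyond |t|.
    p_value = 2 * (1 - Z.cdf(abs(t)))
    return t, p_value, p_value < alpha

def sample_size_for_power(mu0, mu_alt, sigma_star, alpha=0.05, beta=0.2,
                          two_sided=True):
    """Approximate n giving power 1 - beta at the alternative mu_alt."""
    z_a = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_b = Z.inv_cdf(1 - beta)
    return math.ceil((sigma_star * (z_a + z_b) / (mu0 - mu_alt)) ** 2)

# Hypothetical numbers: xbar = 10.5, mu0 = 10, sigma* = 2, n = 64.
t, p_value, reject = z_test_two_sided(10.5, 10.0, 2.0, 64)
print(t, p_value, reject)

# Duality check: the 95% CI excludes mu0 exactly when the test rejects.
half = Z.inv_cdf(0.975) * 2.0 / math.sqrt(64)
ci_contains_mu0 = (10.5 - half) <= 10.0 <= (10.5 + half)
print(ci_contains_mu0)

# Sample size for 80% power against mu' = 11.
print(sample_size_for_power(10.0, 11.0, 2.0))
print(sample_size_for_power(10.0, 11.0, 2.0, two_sided=False))
```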

Particular tests: one population proportion

• To test $H_0: p = p_0$ vs $H_A: p \ne p_0$, use

\[ T = \frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}, \]

with the critical value determined in the usual manner as a standard normal upper quantile. p-values are also determined from $T = t$ in an analogous manner. Here, $n$ should not be too small; $np_0(1-p_0) > 5$ should suffice. For smaller $n$, there is a procedure we have not covered based on the binomial distribution.

• The $n$ for which a two-sided test of $p_0$ has power $1-\beta$ under the alternative $p = p'$ is approximately

\[ n = \left[ \frac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p'(1-p')}}{p' - p_0} \right]^2, \]

rounded up to the nearest integer. For a one-sided test, replace $z_{\alpha/2}$ with $z_\alpha$.
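A minimal Python sketch of the proportion test and its sample-size formula follows. The function names and the example counts are hypothetical. Note that the test statistic uses the null value $p_0$ in its standard error, unlike the confidence interval, which uses $\hat p$.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal

def proportion_z_test(x, n, p0, alpha=0.05):
    """Return (t, p_value, reject) for H0: p = p0 vs HA: p != p0."""
    p_hat = x / n
    t = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - Z.cdf(abs(t)))
    return t, p_value, p_value < alpha

def proportion_sample_size(p0, p_alt, alpha=0.05, beta=0.2, two_sided=True):
    """Approximate n giving power 1 - beta at the alternative p = p_alt."""
    z_a = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_b = Z.inv_cdf(1 - beta)
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p_alt * (1 - p_alt))
    return math.ceil((num / (p_alt - p0)) ** 2)

# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
print(proportion_z_test(60, 100, 0.5))

# Sample size for 80% power against p' = 0.6 at the 5% level.
print(proportion_sample_size(0.5, 0.6))
```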

Chapter 9: Inferences based on two samples

Inferences about the difference of two population means

• A two-sided hypothesis test has the form $H_0: \mu_1 - \mu_2 = \Delta_0$ vs $H_A: \mu_1 - \mu_2 \ne \Delta_0$. Often, $\Delta_0 = 0$. If the sample from population A is independent of the sample from population B, reject when $|T| > C^*$, where

\[ T = \frac{\bar X - \bar Y - \Delta_0}{\sqrt{(\sigma_1^*)^2/n_1 + (\sigma_2^*)^2/n_2}}. \]

Here $\sigma_1^*$, $\sigma_2^*$, and $C^*$ are like the values in the one-sample procedures. However, if $n_1$ or $n_2$ is small and $\sigma$ is unknown, you need to assume normal population distributions and treat $T$ as having a $t$ distribution with $\nu$ df. Here

\[ \nu = \frac{\left( \dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2} \right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}, \]

rounded to the nearest integer. Note that $\min\{n_1 - 1,\ n_2 - 1\} \le \nu \le n_1 + n_2 + 1$.

• A $100(1-\alpha)\%$ CI for $\mu_1 - \mu_2$ takes the form $\bar X - \bar Y \pm C^* \cdot \mathrm{SE}$, where SE is the denominator of the test statistic $T$.

• p-values can be found in the usual way from the value $T = t$ and the corresponding table.

• Formulas for $\beta$ can be derived from the structure of the test. Sample size calculations are complicated, except in special cases. We do not give general formulas here, or for two proportions below.

• If the additional assumption $\sigma_1 = \sigma_2$ is tenable, use

\[ T = \frac{\bar X - \bar Y - \Delta_0}{\sqrt{S^2_{\text{pooled}}\,(1/n_1 + 1/n_2)}} \]
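For the unequal-variance two-sample procedure described earlier in this section, the test statistic and its approximate degrees of freedom can be computed as in the following Python sketch. The function name `two_sample_stat` and the summary statistics are assumptions for illustration. The sketch stops at $T$ and $\nu$, since looking up a $t$ quantile needs a table or a library outside the standard one.

```python
import math

def two_sample_stat(xbar, s1, n1, ybar, s2, n2, delta0=0.0):
    """Return (T, nu) for H0: mu1 - mu2 = delta0 with unequal variances.

    s1 and s2 are the sample SDs; nu is rounded to the nearest integer.
    """
    v1 = s1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2   # estimated variance of the second sample mean
    t = (xbar - ybar - delta0) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, round(nu)

# Hypothetical summary statistics for two independent samples.
t_stat, nu = two_sample_stat(xbar=5.0, s1=2.0, n1=10, ybar=3.0, s2=3.0, n2=15)
print(t_stat, nu)
```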
