Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Comparing Population Means: Two-Sample Inference in Statistics, Study notes of Statistics

University of Pennsylvania (UPenn)Statistics

The methods for inferring differences between the means of two populations based on samples. Large-sample and small-sample tests, as well as confidence intervals for the difference between means. The document also introduces the concept of pooled t-procedures and power calculations.

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-6za 🇺🇸

10 documents

1 / 10

This page cannot be seen from the preview

Don't miss anything!

Statistics 431:

Statistical Inference

Lecture 7: Two-sample inference

Discover Study notes of Statistics University of Pennsylvania (UPenn)

Partial preview of the text

Download Comparing Population Means: Two-Sample Inference in Statistics and more Study notes Statistics in PDF only on Docsity!

Statistics 431:

Statistical Inference

Lecture 7: Two-sample inference

Two populations: basics

(^) Up to now, we have looked at samples from a single population, like

X 1 ,... , Xn ∼ N (μ, σ 2 ). We asked

what are plausible values of μ? (confidence interval)
is it reasonable to say μ = μ 0 , or do the data strongly suggest μ 6 = μ 0? (hypothesis test)
(^) Often, we instead want to compare the properties of two different

population distrns.

distrn of years lived for people who (1) receive or (2) do not receive a clinical treatment (eg, compare means of distrns)
distrn of annual returns from investment vehicles (A) and (B) (eg, compare the variance-adjusted means)
distrn of party affiliation for American males and females (eg, compare proportion of Republicans in each group)
(^) We want to work out CI and testing ideas for comparative quantities like

μ 1 − μ 2.

Difference between means: large-sample test

(^) It’s natural to estimate μ 1 − μ 2 using X ¯ − ¯ Y.
(^) To conduct tests, we need to know the null distrn of X ¯ − ¯ Y.

To form CIs, we need to build a pivot based on X ¯ − ¯ Y.

(^) If m and n are large, then X ¯ ≈ N (μ 1 , σ 12 / m ) and Y ¯ ≈ N (μ 2 , σ 22 / n ).
(^) So X ¯ − ¯ Y ≈ N (μ 1 − μ 2 , σ 12 / m + σ 22 / n ) (why?).
(^) Under H 0 : μ 1 − μ 2 = 10 , and substituting in sample variances, we get

T =

X ¯ − ¯ Y − 10

S 12 / m + S 22 / n

≈ N ( 0 , 1 ).

(^) To set critical values for significance level α:

1 for H 0 : μ 1 − μ 2 = 10 vs HA : μ 1 − μ 2 6 = 10 , reject when | T | > c (α) = z α/ 2 2 for H 0 : μ 1 − μ 2 ≤ 10 vs HA : μ 1 − μ 2 > 1 0 , reject when T > c (α) = z α 3 for H 0 : μ 1 − μ 2 ≥ 10 vs HA : μ 1 − μ 2 < 1 0 , reject when T < c (α) = − z α

(^) Example: Are Japanese cars more fuel-efficient than American cars?
(^) We gather a set of miles per gallon ratings X 1 ,... , Xm for m = 249

American models, and another set Y 1 ,... , Yn for n = 79 Japanese models.

(^) The sample statistics: x ¯ = 20. 14 mpg, s 1 = 6. 41 mpg; y ¯ = 30. 48 mpg,

s 2 = 6. 11 mpg.

(^) H 0 : μ 1 − μ 2 = 10 = 0 vs HA : μ 1 − μ 2 6 = 0. The test statistic:

T =

x ¯ − ¯ y − (^10) √ s 12 / m + s 22 / n

(^) This is a huge negative value. It will clearly lead to rejection at any usual

significance level, for a two-tailed or lower-tailed test: the p-value is 2 ( 1 − 8( 12. 95 )) ≈ 2 × 10 −^38 , where 8 is the cdf of the N ( 0 , 1 ) distrn.

(^) What are the two “populations” that we “sampled”?
(^) Is this evidence that American auto engineers are less talented than their

Japanese counterparts?

Difference between means: large-sample CI

(^) As always, to derive a CI we need a pivot: a quantity that
- includes the data, X ¯ − ¯ Y
- includes the unknown parameter, μ 1 − μ 2
- has a known distrn
(^) For n and m large, we just discussed that T ≈ N ( 0 , 1 ), so T is a pivot. We

can write

− z α/ 2 <^

X ¯ − ¯ Y − (μ 1 − μ 2 ) √ S 12 / m + S 22 / n

< z α/ 2

 ≈^1 −^ α.

(^) Rearranging the event in the usual way, we get a 100 ( 1 − α)% CI for μ 1 − μ 2 :

X^ ¯ − ¯ Y ± z α/ 2

S 12

S 22

(^) Exactly the same idea as one-sample CI: sample mean ± normal quantile

times SE(sample mean).

Difference between means: small-sample CI

(^) Again, assume X 1 ,... , Xm ∼ N (μ 1 , σ 12 ) and Y 1 ,... , Yn ∼ N (μ 2 , σ 22 ).
(^) Suppose either m or n (or both) are small.
(^) Then T is once more a pivot, but its pivot distrn is t ν rather than N ( 0 , 1 ):

P

− t α/ 2 ;ν <^

X ¯ − ¯ Y − (μ 1 − μ 2 ) √ S 12 / m + S 22 / n

< t α/ 2 ;ν

 ≈^1 −^ α.

(^) This leads to a small-sample 100 ( 1 − α)% CI for μ 1 − μ 2 :

X^ ¯ − ¯ Y ± t α/ 2 ;ν

S 12

S 22

Power calculations

(^) Assume population distrns are N (μ 1 , σ 12 ) and N (μ 2 , σ 22 ).
(^) We conduct a test of H 0 : μ 1 − μ 2 ≤ 10 vs HA : μ 1 − μ 2 > 1 0 , using the test

statistic Z = ( X ¯ − ¯ Y − 10 )/σ (^) X ¯ − ¯ Y. (We define σ (^) X ¯ − ¯ Y =

σ 12 / m + σ 22 / n .)

(^) What is the power against the particular alternative μ 1 − μ 2 = 1 ′? (Here

1 ′^ ∈ HA , ie 1 ′^ > 1 0 .)

(^) Power = 1 − β(1′).

β(1′) = P 1 ′(accept H 0 ) = P 1 ′( X ¯ − ¯ Y < 1 0 + z α · σ (^) X ¯ − ¯ Y )

= 8

z α −

1 ′^ − 10

σ (^) X ¯ − ¯ Y

(^) Can derive similar expressions for lower-tailed and two-tailed tests (see

Devore).

Comparing Population Means: Two-Sample Inference in Statistics, Study notes of Statistics

Related documents

Partial preview of the text

Download Comparing Population Means: Two-Sample Inference in Statistics and more Study notes Statistics in PDF only on Docsity!

Statistics 431:

Statistical Inference

Lecture 7: Two-sample inference

Two populations: basics

Difference between means: large-sample test

T =

X ¯ − ¯ Y − 10

≈ N ( 0 , 1 ).

T =

Difference between means: large-sample CI

S 12

S 22

Difference between means: small-sample CI

P

S 12

S 22

Power calculations

1 ′^ − 10