
ISYE 2028 A and B

Lecture 12

Confidence Intervals and Hypothesis Testing, cont'd.

Dr. Kobi Abayomi

March 25, 2009

We have looked at hypothesis testing generally, but we have used only the specific example of a test for the population mean. For instance, if X ∼ (μ, σ²) is a random variable [model], and we collect some data with sample mean x̄ = (1/n) ∑ᵢⁿ xᵢ, then the hypotheses

H₀ : μ = μ₀ vs. Hₐ : μ ≠ μ₀

are the ones we use in a two-sided test of the population mean. You will recall that we use the sampling distribution x̄ ∼ N(μ, σ²/n) to construct the test statistic

Z = (x̄ − μ₀)/√(σ²/n)

which has the standard normal distribution N(0, 1).

This setup is often sufficient: the Z statistic is the deviation of the data from the null hypothesis, over its standard deviation. In words,

Z ≡ (obs − exp)/S.D.(obs)

is the statistic we want to use to test the proportion of people who vote for Pedro, or the mean income of Njoroges in Kisumu; that is, whether the sample mean is representative of our population mean.
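As a quick illustration, here is a minimal sketch of the two-sided Z test in R. All of the numbers are hypothetical:

xbar <- 10.4                        # hypothetical sample mean
mu0 <- 10                           # null value
sigma2 <- 4                         # variance, assumed known
n <- 25
z <- (xbar - mu0)/sqrt(sigma2/n)    # observed Z statistic
2*pnorm(abs(z), lower.tail=FALSE)   # two-sided p-value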

Situations often arise where the sample mean cannot sufficiently describe, or test for, important hypothetical differences in populations. We must appeal to other distributions, and to other quantifications of difference, to test other hypotheses. A useful alternative is...

1 The Chi-Squared Distribution and Associated Hypothesis Tests

Recall this example from Lecture 10: say we are interested in the fairness of a die. Here is the observed distribution after 120 tosses:

Die Face      1    2    3    4    5    6
Obs. Count   30   17   15   23   24   21

The appropriate test statistic here is the Chi-square.

1.1 The Chi-Square test for Goodness of Fit

Formally, here, we are going to test

H₀ : The die is fair vs. Hₐ : The die is not fair

In general, the hypotheses are

H₀ : πᵢ = nᵢ/n for all i vs. Hₐ : πᵢ ≠ nᵢ/n for at least one i

Remember here, our observed test statistic is

χ²ₒ = (25 − 20)²/20 + ··· + (16 − 20)²/20 = 18.00.

The number of degrees of freedom is n − 1, here 6 − 1 = 5. Notice that the total number of observations is fixed; that is how we calculate the expected frequency. Once the total is set, we lose a degree of freedom.

From the table, χ²_{.95, 5} = 11.07. Here χ²ₒ > χ²_{.95, 5} = 11.07, so we reject the null hypothesis. We conclude the die is unfair.¹

¹From the table in the back of the book, which you should familiarize yourself with.
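The same mechanics in R, as a sketch using the counts printed in the table above (chisq.test computes the expected counts under equal cell probabilities itself):

obs <- c(30, 17, 15, 23, 24, 21)   # observed counts from the table
expected <- rep(sum(obs)/6, 6)     # 20 per face under H0
sum((obs - expected)^2/expected)   # the observed chi-square statistic
qchisq(.95, df = 5)                # critical value, 11.07
chisq.test(obs)                    # the built-in equivalent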

1.2 The Chi-Square test for independence, the Two-Way layout

The Chi-Square test is useful for the contingency table, or two-way, setup. Remember the contingency table from Lecture 10: a variable on the rows, a variable on the columns; each cell holds the observed count for each bivariate value of the variables.

We used this example:

                     Fashion Level
Class Level    Low   Middle   High   Total
Graduate         6        4      1      11
PhD              5        1      2       8
Pre-K           30       25     75     130
Total           41       30     78     149

The formal hypotheses are

H₀ : πᵢⱼ = nᵢⱼ/n.. for all i, j vs. Hₐ : πᵢⱼ ≠ nᵢⱼ/n.. for at least one (i, j) pair

For our data here we calculated

χ²ₒ = (6 − 11.89)²/11.89 + ··· + (75 − 82.65)²/82.65 = 14.92.

The degrees of freedom are (3 − 1) × (3 − 1) = 4, and χ²_{.95, 4} = 9.49. Since χ²ₒ > χ²_{.95, 4}, we reject the null hypothesis and conclude that class level and fashion are not independent.
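In R, a sketch of the same test (chisq.test derives the expected counts from the table's margins):

fashion <- matrix(c( 6,  4,  1,
                     5,  1,  2,
                    30, 25, 75),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("Graduate", "PhD", "Pre-K"),
                                  c("Low", "Middle", "High")))
chisq.test(fashion)   # df = (3-1)*(3-1) = 4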

1.3 The T Distribution

Remember: if we cannot assume that we know the variance, we appeal to the t distribution as the sampling distribution for the sample mean. We often have too few samples to apply the central limit theorem to the sampling distribution; in these situations we construct the t statistic as well.

Formally, the two-sided hypothesis test is still one of location of the true mean:

H₀ : μ = μ₀, σ² unknown vs. Hₐ : μ ≠ μ₀, σ² unknown

A confidence interval here is

x̄ ± t_{α/2, df} · √(s²/n)

with the associated margin of error

ME = t_{α/2, df} · √(s²/n)

and the appropriate number of samples for a fixed 1 − α confidence level

n = t²_{α/2, df} s² / ME²

For the hypothesis testing setup of H₀ : μ = μ₀ vs. Hₐ : μ ≠ μ₀, our observed test statistic is

tₒ = (x̄ − μ₀)/(s/√n)
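In R, t.test carries out this test and reports the matching confidence interval; a sketch on simulated data (the sample and the null value μ₀ = 5 are hypothetical):

set.seed(1)
x <- rnorm(12, mean = 5.5, sd = 2)   # hypothetical small sample
t.test(x, mu = 5)                    # two-sided t test of H0: mu = 5, with 95% CI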

2 Samples: Independent or Dependent? 1 or 2 or many?

In general, always remember: (1) the sampling distribution, which yields (2) the confidence interval, which is immediately analogous to (3) the test statistic. Everything is a variation on this theme, just in a slightly different scenario.

2.1 Scenario 1: Two sample proportions

Say we wish to gain inference on the support for election reform in California and Georgia. Let p₁ ≡ the proportion who support it in Georgia and p₂ ≡ the proportion who support it in California. We estimate these in the usual way, p̂₁ = x₁/n₁ and p̂₂ = x₂/n₂: the sample proportions of voters who supported the reform over total voters, for each state.

We know from the sampling distribution of p̂: E(p̂₁) = p₁, E(p̂₂) = p₂, and Var(p̂₁) = p₁q₁/n₁, Var(p̂₂) = p₂q₂/n₂.

The difference p̂₁ − p̂₂ is distributed

p̂₁ − p̂₂ ∼ N(p₁ − p₂, p₁q₁/n₁ + p₂q₂/n₂)

This is the sampling distribution for the difference in proportions. The appropriate rescaled statistic is

Z = (p̂₁ − p̂₂ − (p₁ − p₂))/S.D.(p̂₁ − p̂₂)

and it will have a standard normal distribution.

Thus, a confidence interval for the difference in two proportions is

p̂₁ − p̂₂ ± Z_{α/2} √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂)

For the two-tailed hypothesis test

H₀ : p₁ = p₂ vs. Hₐ : p₁ ≠ p₂

we exploit the fact that p₁ = p₂ implies p₁ − p₂ = 0, and write

p̂_pooled = p̂ₚ = (x₁ + x₂)/(n₁ + n₂)

to pool the estimate of the population proportion, since, under the null, p₁ = p₂.

Then our test statistic is

zₒ = (p̂₁ − p̂₂)/√(p̂ₚ q̂ₚ (1/n₁ + 1/n₂))
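A sketch of the pooled two-proportion test in R; the vote counts are made up for illustration:

x1 <- 620; n1 <- 1000                  # hypothetical Georgia: supporters / sample size
x2 <- 555; n2 <- 1000                  # hypothetical California
phat1 <- x1/n1; phat2 <- x2/n2
pp <- (x1 + x2)/(n1 + n2)              # pooled proportion under H0
z0 <- (phat1 - phat2)/sqrt(pp*(1 - pp)*(1/n1 + 1/n2))
2*pnorm(abs(z0), lower.tail = FALSE)   # two-sided p-value
# prop.test(c(x1, x2), c(n1, n2)) runs the equivalent chi-square version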

2.2 Scenario 2: Two samples, in general

In general, if we have data coming from two samples X₁ ∼ (μ₁, σ₁²) and X₂ ∼ (μ₂, σ₂²) and we cannot assume knowledge of the variances, we get a sampling distribution for the difference in the population means μ₁ − μ₂ as

x̄₁ − x̄₂ ∼ (μ₁ − μ₂, s₁²/n₁ + s₂²/n₂)

which we approximate with a t distribution with n₁ + n₂ − 2 degrees of freedom.²

²The exact calculation of the degrees of freedom here is more involved; using n₁ + n₂ − 2 is a good approximation.

Thus the confidence interval is

x̄₁ − x̄₂ ± t_{α/2, n₁+n₂−2} · √(s₁²/n₁ + s₂²/n₂)

The two-sided hypothesis test for a difference in the population means

H₀ : μ₁ − μ₂ = Δ₀ vs. Hₐ : μ₁ − μ₂ ≠ Δ₀

would use this test statistic:

t₀ = (x̄₁ − x̄₂ − Δ₀)/√(s₁²/n₁ + s₂²/n₂)

Of course one sided tests are the usual variations on this.

If you are willing to assume that σ₁ = σ₂, then you can pool the variance estimates with

Sₚ² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2)

and use this test statistic:

t₀ = (x̄₁ − x̄₂ − Δ₀)/√(Sₚ²(1/n₁ + 1/n₂))
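In R both versions are one call; a sketch on simulated data (everything hypothetical). Note that t.test's default is the Welch test, the "more involved" degrees-of-freedom calculation mentioned in footnote 2, while var.equal=TRUE gives the pooled version above:

set.seed(2)
x1 <- rnorm(20, mean = 10, sd = 2)   # hypothetical sample 1
x2 <- rnorm(25, mean =  9, sd = 2)   # hypothetical sample 2
t.test(x1, x2, var.equal = TRUE)     # pooled test, df = n1 + n2 - 2
t.test(x1, x2)                       # Welch test, approximate df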

2.3 Scenario 3: Two samples, "dependent"

In many cases it is not reasonable to assume that your two samples arrived independently. We call data paired when it is natural to think of each sample as one coordinate of a bivariate observation: like errors while playing piano with the right hand versus the left hand. In these cases we believe that the samples come from one element, perhaps, but two separate samplings.

Let

D = X₁ − X₂

thus

dᵢ = xᵢ₁ − xᵢ₂

and

D̄ = ((X₁₁ − X₁₂) + ··· + (Xₙ₁ − Xₙ₂))/n

Here we have taken the differences in each observation, and then computed the average difference. A sampling distribution for D̄ is

D̄ ∼ (μ₁ − μ₂, S_D²/n)

where

S_D² = (1/(n − 1)) ∑ᵢⁿ (dᵢ − d̄)²

We again approximate with the t distribution. Here the degrees of freedom are the number of pairs minus 1: df = n − 1.

The confidence interval for paired differences of the population mean is then

d̄ ± t_{α/2, n−1} · √(s_d²/n)

And the hypothesis test for paired differences of the population mean, also known as a paired t-test,

H₀ : Δ = Δ₀ vs. Hₐ : Δ ≠ Δ₀

uses this test statistic:

tₒ = (d̄ − Δ₀)/(S_d/√n)
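A paired t test in R; the error counts below are made up to mirror the piano example:

right <- c(12, 9, 14, 10, 8, 11)     # hypothetical errors, right hand
left  <- c(15, 11, 16, 12, 11, 13)   # hypothetical errors, left hand
t.test(right, left, paired = TRUE)   # paired t test on d = right - left
# identical to t.test(right - left, mu = 0); df = number of pairs - 1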

3 Beyond the sample mean: S²

Thus far all of our confidence intervals and hypothesis tests have been restricted to tests of the mean (tests of location), μ. We have used the sample mean x̄ as the natural estimator. Now we introduce tests and intervals based upon the variance (tests of scale).

3.1 A confidence interval for σ^2

We have to accept as fact³ that for a random sample of size n from a normal distribution with parameters μ, σ²,

(n − 1)S²/σ² ∼ χ²(n − 1)    (1)

i.e. chi-squared with n − 1 degrees of freedom. We use this fact to set up a confidence interval, now, for σ², using the estimator S².

³The proof involves techniques not introduced in this class, but look at Lectures 7-9 and you'll get the flavor.

Since P(χ²_{1−α/2, n−1} < (n − 1)S²/σ² < χ²_{α/2, n−1}) = 1 − α, a 1 − α percent confidence interval is (for α fixed)⁴

[ (n − 1)s²/χ²_{α/2, n−1} , (n − 1)s²/χ²_{1−α/2, n−1} ]    (2)

⁴Notice that χ²_{1−α/2, n−1} ≠ −χ²_{α/2, n−1}: the chi-squared distribution is not symmetric, and neither is the associated confidence interval.

An intelligent reader like you understands that the interval for σ is just the square root of that for σ^2.
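A sketch of interval (2) in R, on a hypothetical normal sample. qchisq takes lower-tail probabilities, so χ²_{α/2, n−1} in the upper-tail notation above is qchisq(1 − α/2, n − 1):

set.seed(3)
x <- rnorm(20, mean = 0, sd = 3)   # hypothetical sample, true sigma = 3
n <- length(x); s2 <- var(x)
lower <- (n - 1)*s2/qchisq(.975, n - 1)
upper <- (n - 1)*s2/qchisq(.025, n - 1)
c(lower, upper)                    # 95% CI for sigma^2
sqrt(c(lower, upper))              # the corresponding interval for sigma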

3.2 Ratio of variances

Remember from Lecture 10: if X₁, ..., Xₘ is distributed N(μ₁, σ₁²) and Y₁, ..., Yₙ is distributed N(μ₂, σ₂²), then the ratio

F = (S₁²/σ₁²)/(S₂²/σ₂²)    (3)

has what we call an F distribution with numerator degrees of freedom m − 1 and denominator degrees of freedom n − 1. From what we just learned in the previous section, F is the ratio of two chi-squared variables, each divided by its degrees of freedom. Call them U ∼ χ²(m − 1) and V ∼ χ²(n − 1): if U = (m − 1)S₁²/σ₁² then U ∼ χ²(m − 1), and if V = (n − 1)S₂²/σ₂² then V ∼ χ²(n − 1). Then

F = (U/(m − 1))/(V/(n − 1)) = [((m − 1)S₁²/σ₁²)/(m − 1)] / [((n − 1)S₂²/σ₂²)/(n − 1)]

which just simplifies to (3).⁵

⁵It turns out that E(F) = ν₂/(ν₂ − 2) and Var(F) = 2ν₂²(ν₁ + ν₂ − 2)/(ν₁(ν₂ − 2)²(ν₂ − 4)), where U ∼ χ²(ν₁), V ∼ χ²(ν₂), F = (U/ν₁)/(V/ν₂), and U is independent of V.

Remember this identity for the F distribution:

F_{1−α, ν₁, ν₂} = 1/F_{α, ν₂, ν₁}    (4)

You'll notice you have to use this fact when looking up values in the F table in some books.
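You can check identity (4) numerically in R; note that qf takes lower-tail probabilities, so the upper-tail value F_{α, ν₂, ν₁} is qf(1 − α, ν₂, ν₁):

qf(.05, 5, 10)     # F_{.95, 5, 10}: lower 5% point of F(5, 10)
1/qf(.95, 10, 5)   # 1/F_{.05, 10, 5}: the same number, per identity (4)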

Lastly, we can construct a confidence interval for the ratio of two variances using this fact (we've seen this reasoning before): P(F_{1−α/2, ν₁, ν₂} < F < F_{α/2, ν₁, ν₂}) = 1 − α. Rewriting so that we get a statement about σ₂²/σ₁²,

P(F_{1−α/2, ν₁, ν₂} · S₂²/S₁² < σ₂²/σ₁² < F_{α/2, ν₁, ν₂} · S₂²/S₁²) = 1 − α.

This yields

( F_{1−α/2, ν₁, ν₂} · s₂²/s₁² , F_{α/2, ν₁, ν₂} · s₂²/s₁² )    (5)

as a 1 − α percent confidence interval for the ratio σ₂²/σ₁².
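A sketch of interval (5) in R on hypothetical samples, again translating the upper-tail notation into qf's lower-tail probabilities:

set.seed(4)
x <- rnorm(15, sd = 2); y <- rnorm(12, sd = 3)   # hypothetical samples
s1sq <- var(x); s2sq <- var(y)
nu1 <- length(x) - 1; nu2 <- length(y) - 1
c(qf(.025, nu1, nu2)*s2sq/s1sq,                  # F_{1-alpha/2, nu1, nu2} * s2^2/s1^2
  qf(.975, nu1, nu2)*s2sq/s1sq)                  # F_{alpha/2, nu1, nu2} * s2^2/s1^2
# var.test(x, y) gives the related interval for sigma1^2/sigma2^2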

4 R example continued from lecture 11

4.1 part b

Here we are comparing costs of accidents in the non-ABS year 1991 and the ABS year 1992. We can treat the cost as a continuous non-proportion variable. Remember the data from lecture 4.

A hypothesis test:

H₀ : Δμ = μ_NoABS − μ_ABS = 0

vs.

H₁ : Δμ = μ_NoABS − μ_ABS > 0

The variances are unknown, so we know we need to use a t-test. But can we assume they are equal and use a pooled variance estimator?

First things first: this data has missing values:

mean(data)
# Cost1991 Cost1992
# 2074.952       NA

# we could also use data[37:42,]

mean(data,na.rm=T)
# Cost1991 Cost1992
# 2074.952    1714.

# here we have removed the missing values

var(data,na.rm=T)
#           Cost1991   Cost1992
# Cost1991  441529.    -7008.193
# Cost1992  -7008.193  390409.

# this is the covariance matrix; for now we only need the diagonal elements

We should do an F-test for equality of variances (I’ll skip the hypothesis notation for this intermediate test) to know which form of the t-test to apply.

var(data,na.rm=T)[1,1]/var(data,na.rm=T)[2,2]
# [1] 1.
# our observed value of the f-statistic

pf(var(data,na.rm=T)[1,1]/var(data,na.rm=T)[2,2],41,37,lower.tail=FALSE)
# [1] 0.
# the p-value for our (inherently) two-tailed test

We can assume that the variances are equal, so our test statistic is

t = (x̄_NoABS − x̄_ABS − 0)/√(sₚ²(1/n_NoABS + 1/n_ABS))

where

sₚ² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2)

The calculations in R:

mean(data[,1],na.rm=T)-mean(data[,2],na.rm=T)
# [1] 360.
# the difference in the sample means

s1squared<-var(data,na.rm=T)[1,1]
s2squared<-var(data,na.rm=T)[2,2]
spsquared<-((42-1)*s1squared+(38-1)*s2squared)/(42+38-2)
spsquared
# [1] 417280.
# the sample variances, and pooled sample variance

tstat<-(mean(data[,1],na.rm=T)-mean(data[,2],na.rm=T))/sqrt(spsquared*(1/42+1/38))
tstat
# [1] 2.
# the calculated value for the t-statistic

pt(2.49,df=(42+38-2),lower.tail=FALSE)
# [1] 0.
# the p-value for the observed t-statistic
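For reference, the built-in call does all of this in one step (a sketch; t.test drops the NA values itself, and var.equal=TRUE gives the pooled test above):

t.test(data[,1], data[,2], var.equal=TRUE, alternative="greater")
# pooled two-sample t test of H0: mu_NoABS - mu_ABS = 0 vs. the one-sided H1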

The difference in sample variances between non-ABS and ABS equipped cars was statistically insignificant at the .05 level. The test for equality of mean repair costs was statistically significant at the same level, so we reject the null hypothesis. Translated into the narrative: there is enough evidence, at the .05 level, to conclude that the cost of repairs for non-ABS equipped cars is higher than that for ABS cars. We could perhaps conjecture that ABS helps the driver lessen the severity of an accident.

4.2 part c

In R the confidence limits are:

mean(data[,1],na.rm=T)-mean(data[,2],na.rm=T)-qt(.975,(42+38-2))*sqrt(spsquared*(1/42+1/38))
# [1] 72.

mean(data[,1],na.rm=T)-mean(data[,2],na.rm=T)+qt(.975,(42+38-2))*sqrt(spsquared*(1/42+1/38))
# [1] 648.
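Equivalently (a sketch), the pooled t.test reports the same two-sided 95% limits directly:

t.test(data[,1], data[,2], var.equal=TRUE)$conf.int
# the same lower and upper confidence limits in one call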

5 Exercises

  • Look at the chart on page 351. I wouldn't commit it to memory (you can always derive the appropriate test statistics by reasoning!), but it is a nice summary.
  • Do exercises 10.30, 10.35, 10.42, 10.52 on pages 358-361.
  • Do exercise 10.72(b) and exercise 10.67, both on page 370.
  • Do exercise 10.73 on page 370.
  • Exercises 10.79, 10.83, 10.90, 10.93 on pages 383-384.
  • Exercises 10.106-10.109 on page 386.