We have looked at hypothesis testing generally, but we have used only the specific example of a test for the population mean. For instance, suppose X ∼ (μ, σ²) is a random variable [model], and we collect some data and compute x̄ = (1/n) Σᵢⁿ xᵢ. Then the hypotheses

H0 : μ = μ0   vs.   Ha : μ ≠ μ0

are what we use in a two-sided test of the population mean. You will recall that we use the sampling distribution x̄ ∼ N(μ, σ²/n) to construct the test statistic

Z = (x̄ − μ0) / √(σ²/n),

which has the standard normal distribution N(0, 1).
This setup is often sufficient: the Z statistic is the deviation of the data from the null hypothesis, divided by its standard deviation. In words,

(obs − exp) / S.D.(obs)

is the statistic we use to test, for example, the proportion of people who vote for Pedro or the mean income of Njoroges in Kisumu: any situation where the sample mean is representative of our population mean.
Situations often arise where the sample mean cannot sufficiently describe, or test for, important hypothetical differences in populations. We must appeal to other distributions, and to other quantifications of difference, to test other hypotheses. A useful alternative is...
1 The Chi-Squared Distribution and Associated Hypothesis Tests
Recall this example from Lecture 10: Say we are interested in the fairness of a die. Here is the observed distribution after 120 tosses:
Die Face     1   2   3   4   5   6
Obs. Count  30  17  15  23  24  21
The appropriate test statistic here is the Chi-square.
Formally, here, we are going to test

H0 : The die is fair   vs.   Ha : The die is not fair.

In general, the hypotheses are

H0 : πi = ni/n for all i   vs.   Ha : πi ≠ ni/n for at least one i.
Remember, here our observed test statistic is

χ²_o = (25 − 20)²/20 + ⋯ + (16 − 20)²/20 = 18.00.

The number of degrees of freedom is n − 1, here 6 − 1 = 5. Notice that the total number of observations is fixed; that is how we calculate the expected frequency. Once the total is set we lose a degree of freedom.

From the table, χ²_{0.95, 5} = 11.07. Here χ²_o > χ²_{0.95, 5} = 11.07, so we reject the null hypothesis and conclude the die is not fair.¹
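The same computation can be done directly in R. This is a minimal sketch using the observed counts from the table above; chisq.test is the built-in equivalent.

obs <- c(30, 17, 15, 23, 24, 21)               # observed counts from the table
expected <- rep(sum(obs) / 6, 6)               # expected counts under H0: fair die
chisq_o <- sum((obs - expected)^2 / expected)  # observed chi-square statistic
qchisq(0.95, df = 6 - 1)                       # critical value with 5 df; reject H0 if chisq_o exceeds it
chisq.test(obs)                                # built-in test (equal probabilities by default)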
The Chi-Square test is also useful for the contingency-table, or two-way, setup. Remember the contingency table from Lecture 10: one variable on the rows, one variable on the columns, and each cell holds the observed count for that pair of values.
We used this example:
                      Fashion Level
Class level    Low   Middle   High   Total
Graduate         6        4      1      11
PhD.             5        1      2       8
Pre-K           30       25     75     130
Total           41       30     78     149
The formal hypotheses are

H0 : πij = nij / n..  for all i, j   vs.   Ha : πij ≠ nij / n..  for at least one (i, j) pair.
For our data here we calculated

χ²_o = (6 − 11.89)²/11.89 + ⋯ + (75 − 82.65)²/82.65,

with (3 − 1)(3 − 1) = 4 degrees of freedom. This exceeds χ²_{0.95, 4} = 9.48, so we reject the null hypothesis and conclude that class level and fashion are not independent.
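In R, chisq.test applied to the table of counts performs this test of independence directly. A minimal sketch using the counts from the table above (R will warn that some expected counts are small):

tab <- matrix(c( 6,  4,  1,
                 5,  1,  2,
                30, 25, 75),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Graduate", "PhD", "Pre-K"),
                              c("Low", "Middle", "High")))
chisq.test(tab)          # chi-square statistic, df = (3-1)*(3-1) = 4, and p-value
qchisq(0.95, df = 4)     # the 0.95 critical value with 4 df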
Remember: if we cannot assume that we know the variance, we appeal to the t distribution as the sampling distribution for the sample mean.
¹ From the table in the back of the book, which you should familiarize yourself with.
We often do not have enough samples to apply the central limit theorem to the sampling distribution. In these situations we construct the t-statistic as well.

Formally, the two-sided hypothesis test is still one about the location of the true mean:
H0 : μ = μ0   vs.   Ha : μ ≠ μ0,   σ² unknown.
A confidence interval here is

x̄ ± t_{α/2, df} √(s²/n),

the associated margin of error is

ME = t_{α/2, df} √(s²/n),

and the appropriate number of samples for a fixed 1 − α confidence level is

n = t²_{α/2, df} s² / ME².
For the hypothesis testing setup of H0 : μ = μ0 vs. Ha : μ ≠ μ0, our observed test statistic is

t_o = (x̄ − μ0) / (s/√n).
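A sketch of the same interval and test in R. The vector x and the null value mu0 below are made up for illustration; t.test is the built-in equivalent.

x   <- c(5.1, 4.8, 5.6, 5.0, 4.7, 5.3)            # hypothetical sample
mu0 <- 5                                           # hypothetical null value
n   <- length(x)
me  <- qt(0.975, df = n - 1) * sqrt(var(x) / n)    # margin of error at the 95% level
c(mean(x) - me, mean(x) + me)                      # confidence interval
(mean(x) - mu0) / (sd(x) / sqrt(n))                # observed t statistic
t.test(x, mu = mu0)                                # built-in equivalent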
2 Samples - Independent or Dependent? One, Two, or Many?
In general, always remember: (1) the sampling distribution, which yields (2) the confidence interval, which is immediately analogous to (3) the test statistic. Everything is a variation on this theme, just in a slightly different scenario.
Say we wish to gain inference on the support for election reform in California and Georgia. Let p1 ≡ the proportion who support reform in Georgia and p2 ≡ the proportion who support it in California. We estimate these, in the usual way, as p̂1 = x1/n1 and p̂2 = x2/n2: the sample proportions of voters who supported the reform over total voters, for each state.

We know from the sampling distribution of p̂: E(p̂1) = p1, E(p̂2) = p2, and Var(p̂1) = p1 q1/n1, Var(p̂2) = p2 q2/n2.
The difference p̂1 − p̂2 is distributed

p̂1 − p̂2 ∼ N(p1 − p2, p1 q1/n1 + p2 q2/n2).

This is the sampling distribution for the difference in proportions. The appropriate rescaled statistic is

(p̂1 − p̂2 − (p1 − p2)) / S.D.(p̂1 − p̂2),

and it will have a standard normal distribution.
Thus, a confidence interval for the difference in two proportions is

p̂1 − p̂2 ± Z_{α/2} √(p̂1 q̂1/n1 + p̂2 q̂2/n2).
For the two-tailed hypothesis test

H0 : p1 = p2   vs.   Ha : p1 ≠ p2

we exploit the fact that p1 = p2 implies p1 − p2 = 0 and write

p̂_pooled = p̂_p = (x1 + x2) / (n1 + n2)

to pool the estimate of the population proportion, since, under the null, p1 = p2. Then our test statistic is

z_o = (p̂1 − p̂2) / √( p̂_p q̂_p (1/n1 + 1/n2) ).
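A sketch of this in R with made-up counts: x1, n1 (Georgia) and x2, n2 (California) below are hypothetical; prop.test is the built-in chi-square form of the same test.

x1 <- 620; n1 <- 1000          # hypothetical supporters / sample size, Georgia
x2 <- 570; n2 <- 1000          # hypothetical supporters / sample size, California
p1 <- x1 / n1; p2 <- x2 / n2
pp <- (x1 + x2) / (n1 + n2)                              # pooled proportion under H0
zo <- (p1 - p2) / sqrt(pp * (1 - pp) * (1/n1 + 1/n2))    # observed z statistic
2 * pnorm(abs(zo), lower.tail = FALSE)                   # two-sided p-value
(p1 - p2) + c(-1, 1) * qnorm(0.975) *
  sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)          # 95% CI for p1 - p2
prop.test(c(x1, x2), c(n1, n2), correct = FALSE)         # built-in equivalent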
In general, if we have data coming from two samples X1 ∼ (μ1, σ1²) and X2 ∼ (μ2, σ2²) and we cannot assume knowledge of the variances, we get a sampling distribution for the difference in sample means, which estimates μ1 − μ2, as

x̄1 − x̄2 ∼ (μ1 − μ2, s1²/n1 + s2²/n2),

which we approximate with a t-distribution with n1 + n2 − 2 degrees of freedom.²
Thus the confidence interval is

x̄1 − x̄2 ± t_{α/2, n1+n2−2} √(s1²/n1 + s2²/n2).
The two-sided hypothesis test for a difference in population means,

H0 : μ1 − μ2 = Δ0   vs.   Ha : μ1 − μ2 ≠ Δ0,

would use this test statistic:

t0 = (x̄1 − x̄2 − Δ0) / √(s1²/n1 + s2²/n2).
Of course one sided tests are the usual variations on this.
If you are willing to assume that σ1 = σ2, then you can pool the variance estimates with

S_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)
and use this test statistic:

t0 = (x̄1 − x̄2 − Δ0) / √( S_p² (1/n1 + 1/n2) ).

² The exact calculation for the degrees of freedom here is more involved. Using n1 + n2 − 2 is good.
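In R, t.test covers both versions. A sketch with two hypothetical samples x1 and x2:

x1 <- c(12.1, 11.4, 13.0, 12.7, 11.9)   # hypothetical sample 1
x2 <- c(10.8, 11.2, 10.5, 11.9, 10.9)   # hypothetical sample 2
t.test(x1, x2, var.equal = TRUE)        # pooled-variance test, df = n1 + n2 - 2
t.test(x1, x2)                          # Welch version: no equal-variance assumption, adjusted df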
In many cases it is not reasonable to assume that your two samples have arrived independently. We call data paired when it is natural to think of each sample as bivariate, like errors while playing piano with the right hand versus the left hand. In these cases we believe that the samples come from one element, perhaps, but two separate samplings.
Let

d_i = x_{i1} − x_{i2},

and thus

d̄ = ((x_{11} − x_{12}) + ⋯ + (x_{n1} − x_{n2})) / n.

Here we have taken the difference within each observation, and then computed the average difference. A sampling distribution for d̄ is

d̄ ∼ (μ1 − μ2, S_D²/n),

where

S_D² = (1/(n − 1)) Σᵢⁿ (d_i − d̄)².
We again approximate with the t-distribution. Here the degrees of freedom are the number of pairs minus 1: df = n − 1.
The confidence interval for paired differences of the population mean is then

d̄ ± t_{α/2, n−1} √(s_d²/n).
And the hypothesis test for paired differences of the population mean, also known as a paired t-test,

H0 : Δ = Δ0   vs.   Ha : Δ ≠ Δ0,

uses this test statistic:

t_o = (d̄ − Δ0) / (S_d/√n).
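A sketch of the paired test in R. The vectors right and left are hypothetical paired measurements (say, errors with the right hand versus the left hand for the same players); t.test(..., paired = TRUE) is the built-in equivalent.

right <- c(8, 6, 9, 5, 7, 10)            # hypothetical measurements, condition 1
left  <- c(11, 7, 12, 6, 9, 13)          # hypothetical measurements, condition 2 (paired)
d <- right - left                        # within-pair differences
n <- length(d)
mean(d) / (sd(d) / sqrt(n))                                   # observed t statistic, Delta0 = 0
mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * sd(d) / sqrt(n)  # 95% CI for the mean difference
t.test(right, left, paired = TRUE)                            # built-in equivalent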
3 Beyond the Sample Mean: S²
Thus far all of our confidence intervals and hypothesis tests have been restricted to tests of the mean (tests of location) μ. We have used the sample mean x̄ as the natural estimator. Now we introduce tests and intervals based upon the variance (tests of scale).
We have to accept as fact³ that, for a random sample of size n from a normal distribution with parameters μ and σ²,

(n − 1)S² / σ² ∼ χ²(n − 1),    (1)
i.e. chi-squared with n − 1 degrees of freedom. We use this fact to set up a confidence interval, now, for σ^2 - using the estimator S^2.
Since P( χ²_{1−α/2, n−1} < (n − 1)S²/σ² < χ²_{α/2, n−1} ) = 1 − α, a 1 − α confidence interval is (for α fixed)⁴:

( (n − 1)s² / χ²_{α/2, n−1} ,  (n − 1)s² / χ²_{1−α/2, n−1} )

³ The proof involves techniques not introduced in this class, but look at Lectures 7-9 and you'll get the flavor.
⁴ Notice that χ²_{1−α/2, n−1} ≠ −χ²_{α/2, n−1}. The chi-squared distribution is not symmetric, nor is the associated confidence interval.
An intelligent reader like you understands that the interval for σ is just the square root of that for σ^2.
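A sketch of this interval in R, using a hypothetical normal sample x. Note that the two endpoints use different chi-square quantiles, because the distribution is not symmetric.

x  <- c(4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0)          # hypothetical sample from a normal population
n  <- length(x)
s2 <- var(x)
lower <- (n - 1) * s2 / qchisq(0.975, df = n - 1)   # divide by the upper chi-square quantile
upper <- (n - 1) * s2 / qchisq(0.025, df = n - 1)   # divide by the lower chi-square quantile
c(lower, upper)          # 95% interval for sigma^2; take sqrt() for an interval on sigma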
Remember from Lecture 10: if X1, ..., Xm is distributed N(μ1, σ1²) and Y1, ..., Yn is distributed N(μ2, σ2²), then the ratio

(S1²/σ1²) / (S2²/σ2²)    (3)

has what we call an F distribution with numerator degrees of freedom m − 1 and denominator degrees of freedom n − 1. From what we just learned in the previous section, F is the ratio of two chi-squared variables, each divided by its degrees of freedom; call them U ∼ χ²(m − 1) and V ∼ χ²(n − 1). If U = (m − 1)S1²/σ1², then U ∼ χ²(m − 1). If V = (n − 1)S2²/σ2², then V ∼ χ²(n − 1). Then

( (m − 1)S1²/σ1² / (m − 1) ) / ( (n − 1)S2²/σ2² / (n − 1) ) = (U/(m − 1)) / (V/(n − 1)),

which just simplifies to (3).⁵
Remember this identity for the F-distribution:

F_{1−α, ν1, ν2} = 1 / F_{α, ν2, ν1}    (4)

You'll notice you have to use this fact when looking up values in the F-table in some books.
Lastly, we can construct a confidence interval for the ratio of two variances using this fact (we've seen this reasoning before): P( F_{1−α/2, ν1, ν2} < F < F_{α/2, ν1, ν2} ) = 1 − α, and rewriting, we get the statement

P( F_{1−α/2, ν1, ν2} · S2²/S1²  <  σ2²/σ1²  <  F_{α/2, ν1, ν2} · S2²/S1² ) = 1 − α.

This yields

( F_{1−α/2, ν1, ν2} · s2²/s1² ,  F_{α/2, ν1, ν2} · s2²/s1² )

as a 1 − α confidence interval for the ratio σ2²/σ1².

⁵ It turns out that E(F) = ν2/(ν2 − 2) and Var(F) = 2ν2²(ν1 + ν2 − 2) / (ν1(ν2 − 2)²(ν2 − 4)), where U ∼ χ²(ν1), V ∼ χ²(ν2), F = (U/ν1)/(V/ν2), and U is independent of V.
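A sketch in R with two hypothetical normal samples, showing the interval for σ2²/σ1² and a numerical check of identity (4). R's qf takes the lower-tail probability, so the upper-α point F_{α, ν1, ν2} corresponds to qf(1 − α, ν1, ν2).

set.seed(1)
x <- rnorm(12, mean = 0, sd = 2)    # hypothetical sample 1, m = 12
y <- rnorm(10, mean = 0, sd = 3)    # hypothetical sample 2, n = 10
nu1 <- length(x) - 1; nu2 <- length(y) - 1
ratio <- var(y) / var(x)                                 # s2^2 / s1^2
c(qf(0.025, nu1, nu2), qf(0.975, nu1, nu2)) * ratio      # 95% interval for sigma2^2 / sigma1^2
qf(0.05, nu1, nu2)                  # F_{1-alpha, nu1, nu2} with alpha = 0.05
1 / qf(0.95, nu2, nu1)              # = 1 / F_{alpha, nu2, nu1}; the two lines agree, identity (4)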
4 R Example, Continued from Lecture 11
Here we are comparing costs of accidents in the non-ABS year 1991 and the ABS year 1992. We can treat the cost as a continuous non-proportion variable. Remember the data from lecture 4.
A hypothesis test:
H0 : Δμ = μ_NoABS − μ_ABS = 0   vs.   H1 : Δμ = μ_NoABS − μ_ABS > 0
The variances are unknown - so we know we need to use a t-test. But can we assume they are equal and use a pooled variance estimator?
First things first: this data has missing values:
colMeans(data)     # column means (mean() no longer works on a whole data frame)
# Cost1991 Cost1992
# 2074.952       NA
# we could also use data[37:42,]

colMeans(data, na.rm=T)
# Cost1991 Cost1992
# 2074.952    1714.
# here we have removed the missing values

var(data, na.rm=T)
#            Cost1991   Cost1992
# Cost1991   441529.   -7008.193
# Cost1992  -7008.193  390409.
# this is the covariance matrix; for now we only need the diagonal elements
We should do an F-test for equality of variances (I’ll skip the hypothesis notation for this intermediate test) to know which form of the t-test to apply.
var(data, na.rm=T)[1,1] / var(data, na.rm=T)[2,2]
# [1] 1.
# our observed value of the F-statistic: the ratio of the two sample variances

pf(var(data, na.rm=T)[1,1] / var(data, na.rm=T)[2,2], 41, 37, lower.tail=FALSE)
# [1] 0.
# the p-value for our (inherently) two-tailed test
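The built-in var.test runs the same F test; a sketch, assuming the data frame and column names shown above, with the missing values dropped first:

var.test(na.omit(data$Cost1991), na.omit(data$Cost1992))   # F test for equality of variances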
We can assume that the variances are equal, so our test statistic is

t = (x̄_NoABS − x̄_ABS − 0) / √( s_p² (1/n_NoABS + 1/n_ABS) ),

where

s_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2).
The calculations in R:
mean(data[,1], na.rm=T) - mean(data[,2], na.rm=T)
# [1] 360.
# the difference in the sample means

s1squared <- var(data, na.rm=T)[1,1]
s2squared <- var(data, na.rm=T)[2,2]
spsquared <- ((42-1)*s1squared + (38-1)*s2squared) / (42+38-2)
spsquared
# the sample variances, and the pooled sample variance

tstat <- (mean(data[,1], na.rm=T) - mean(data[,2], na.rm=T)) / sqrt(spsquared*(1/42 + 1/38))
tstat
# [1] 2.
# the calculated value of the t-statistic

pt(2.49, df=(42+38-2), lower.tail=FALSE)
# [1] 0.
# the p-value for the observed t-statistic
The difference in sample variances between non-ABS and ABS-equipped cars was not statistically significant at the .05 level. The test for equality of mean repair costs was statistically significant at the same level, so we reject the null hypothesis. Translated into the narrative: there is enough evidence at the .05 level to conclude that the cost of repairs for non-ABS-equipped cars is higher than that for ABS cars. We could perhaps conjecture that ABS-equipped cars help the driver lessen the severity of an accident.
In R the confidence limits are:
mean(data[,1], na.rm=T) - mean(data[,2], na.rm=T) - qt(.975, (42+38-2)) * sqrt(spsquared*(1/42 + 1/38))
# [1] 72.
mean(data[,1], na.rm=T) - mean(data[,2], na.rm=T) + qt(.975, (42+38-2)) * sqrt(spsquared*(1/42 + 1/38))
# [1] 648.
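The built-in t.test reproduces both the one-sided pooled test and the confidence limits (a sketch, assuming the same data frame as above; t.test drops missing values itself):

t.test(data$Cost1991, data$Cost1992, var.equal = TRUE, alternative = "greater")   # pooled one-sided test
t.test(data$Cost1991, data$Cost1992, var.equal = TRUE)$conf.int                   # two-sided 95% limits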
5 Exercises