











Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
An overview of three widely used statistical tests for a hypothesis that fixes the values of some parameters in a parametric model. The tests discussed are the wald test, likelihood ratio test, and score test. The differences between these tests, their assumptions, and their applications. It also covers the use of these tests in linear and generalized linear models, as well as their implementation in r.
Typology: Study notes
1 / 19
This page cannot be seen from the preview
Don't miss anything!












2005-10-
Given a well-behaved parametric model there are three widely used types of test for a hypothesis that fixes the values of some parameters, ie, θ = (β, η) = (β 0 , η). Write p for the dimension of β.
(β 0 ) −(ˆβ)) to a χ^2 p distributionThe Wald test requires estimation in the full model, the score test requires estimation in the reduced model with β = β 0 and the LRT requires both.
β=β 0 β=β
score
The Wald test is very easy to use for confidence interval generation, the LRT is slightly harder and the score test is much harder.
The Wald confidence interval is always symmetric around β 0 , which is its main defect. It can often be improved by creating the confidence interval on a transformed parameter where symmetry is more appropriate.
The confint function in the MASS package gives likelihood ratio confidence intervals. The R package hoa provides higher-order approximations to the likelihood for some regression models. If the distributional assumptions are correct, the confidence intervals will be substantially more accurate in small samples.
For generalized linear models with V (μ) known exactly the likelihood ratio test has a χ^2 distribution.
When V (μ) is known only up to a dispersion parameter, as in overdispersed Poisson regression, it is usual to use an F or t distribution, but this is not particularly well justified by theory.
By analogy with the residual sum of squares in linear models, the log likelihood is usually reported in terms of the deviance, the difference in − 2 ×loglikelihood between the fitted model and a model with a separate mean parameter for each observation.
The LRT statistic comparing two nested models is simply the difference in deviance
When V (μ) is known only up to a dispersion parameter k it is necessary to estimate k. There are two popular estimators
∑ i
(Yi − μˆi)^2 V (ˆμi)
The Pearson residual estimator is probably better. It is valid even when the data do not come from the assumed exponential family model. Also, the deviance estimator can be badly biased, eg for Poisson regression with small means.
A small simulation for data really from a Poisson distribution (so k = 1) gave mean residual deviance of ˆk = 0.77 for μ = 0.5 and ˆk = 0.49 for μ = 0.25.
The Wald test for a single coefficient is printed by summary on a glm object. Most statistical software will automatically display all these single-term Wald tests.
The function regTermTest on the class web page (and taken from the ‘survey’ package) performs Wald tests for specific regression terms.
data(airquality) a<-glm(Ozone~Solar.R+cut(Temp,3)cut(Wind,3),data=airquality, family=Gamma("log")) regTermTest(a, ~Solar.R) Wald test for Solar.R in glm(formula = Ozone ~ Solar.R + cut(Temp, 3) * cut(Wind, 3), family = Gamma("log"), data = airquality) Chisq = 17.48064 on 1 df: p= 2.9025e- regTermTest(a, ~cut(Temp, 3):cut(Wind, 3)) Wald test for cut(Temp, 3):cut(Wind, 3) in glm(formula = Ozone ~ Solar.R + cut(Temp, 3) * cut(Wind, 3), family = Gamma("log"), data = airquality) Chisq = 1.610642 on 4 df: p= 0. regTermTest(a, ~cut(Temp, 3)cut(Wind, 3)) Wald test for cut(Temp, 3) cut(Wind, 3) cut(Temp, 3):cut(Wind, 3) in glm(formula = Ozone ~ Solar.R + cut(Temp, 3) * cut(Wind, 3), family = Gamma("log"), data = airquality) Chisq = 122.8512 on 8 df: p= < 2.22e-
Likelihood ratio tests are done with the anova function. It is necessary to specify which test you want (χ^2 or F ). anova applied to a single model gives sequential tests, applied to two models it compares them.
anova(a, test="F") Analysis of Deviance Table
Model: Gamma, link: log Response: Ozone
Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev F Pr(>F) NULL 110 71. Solar.R 1 13.040 109 58.910 48.5287 3.383e- cut(Temp, 3) 2 25.238 107 33.672 46.9616 3.803e- cut(Wind, 3) 2 6.766 105 26.906 12.5894 1.313e- cut(Temp, 3):cut(Wind, 3) 4 0.447 101 26.459 0.4155 0.
There is no automatic user-friendly function for score tests in R (or in other software AFAIK), but they are not hard to do from first principles.
The primary difficulty is computing the information matrix for the full parameter under the reduced model. This is not part of the standard model output from either the full or the reduced model.
A sneaky trick is that the score test can be computed as the Wald test based on a single iteration of IWLS. If we start IWLS at the fitted μ from the null-hypothesis model and take one iteration, then
βˆ − β 0 =
∑ i
∂Ui(β 0 ) ∂β
− 1 ∑ i
Ui(β 0 )
and the sandwich estimator is ∑ i
∂Ui(β 0 ) ∂β
− 1 ∑ i
Ui(β 0 )Ui(β 0 )T
∑ i
∂Ui(β 0 ) ∂β
− 1
so the ‘Wald test’ formula gives the score test if we are using the sandwich estimator.
If we are using the model-based estimator and estimating a dispersion parameter then we need to make sure that ˆσ^2 is computed at the null hypothesis not at the fitted values.
null<-glm(Ozone~cut(Temp,3)cut(Wind,3),data=na.omit(airquality), family=Gamma("log")) onestep<-glm(Ozone~Solar.R+cut(Temp,3)cut(Wind,3),data=na.omit(airquality), family=Gamma("log"), Warning message: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, regTermTest(onestep, ~Solar.R) Wald test for Solar.R in glm(formula = Ozone ~ Solar.R + cut(Temp, 3) * cut(Wind, 3), family = Gamma("log"), data = na.omit(airquality), mustart = fitted(null), maxit = 1) Chisq = 14.29433 on 1 df: p= 0.
Z<-predict(null,type="link")+resid(null,"working") W<-null$weights iwls1<-lm(Z~Solar.R + cut(Temp, 3) * cut(Wind, 3), weight=W,data=airquality) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ iwls1<-lm(Z~Solar.R + cut(Temp, 3) * cut(Wind, 3), weight=W,data=na.omit(airquality)) regTermTest(iwls1,~Solar.R) Wald test for Solar.R in lm(formula = Z ~ Solar.R + cut(Temp, 3) * cut(Wind, 3), data = na.omit(airquality), weights = W) Chisq = 14.29433 on 1 df: p= 0.
A famous experiment by Fisher tested a woman’s claim that she could tell if milk had been added first or last in a cup of tea. The data are
TeaTasting Truth Guess Milk Tea Milk 3 1 Tea 1 3
Fitting a logistic regression we obtain