Lecture Slides on Missing Data with Examples | BIOST 570, Study notes of Biostatistics

Material Type: Notes; Class: ADV APPL LIN MODELS; Subject: Biostatistics; University: University of Washington - Seattle; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

koofers-user-kfa-1
koofers-user-kfa-1 🇺🇸

5

(1)

10 documents

1 / 16

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Missing data: examples
Thomas Lumley
BIOST 570
2005-11-14
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff

Partial preview of the text

Download Lecture Slides on Missing Data with Examples | BIOST 570 and more Study notes Biostatistics in PDF only on Docsity!

Missing data: examples

Thomas Lumley

BIOST 570

Example

(X, Z) are bivariate Normal with mean zero, variance 1, correla-

tion 0.5.

Z is missing with probability exp(x)/(1 + exp(x))

Complete-case linear regression for Z ∼ X is not biased

Complete-case linear regression for X ∼ Z is biased.

Z is Missing At Random, so multiple imputation will work

R depends on observable X, so weighting will work.

Example

Set up data

> px<-exp(dd$x)/(1+exp(dd$x))

> drop<-rbinom(nrow(dd),1,1-px)

Example

> lm(z~x,data=dd)

Call:

lm(formula = z ~ x, data = dd)

Coefficients:

(Intercept) x

> lm(z~x,data=dd,subset=!drop)

Call:

lm(formula = z ~ x, data = dd, subset = !drop)

Coefficients:

(Intercept) x

Weighting

> lm(x~z,data=dd,subset=!drop,weights=1/px)

Call:

lm(formula = x ~ z, data = dd, subset = !drop, weights = 1/px)

Coefficients:

(Intercept) z

> pmodel<-glm(drop~x,data=dd,family=binomial)

> estp<-1-fitted(pmodel)

> lm(x~z,data=dd,subset=!drop,weights=1/estp)

Call:

lm(formula = x ~ z, data = dd, subset = !drop, weights = 1/px)

Coefficients:

(Intercept) z

Imputation

> impmodel<-lm(z~x,data=dd,subset=!drop)

> summary(impmodel)

Call:

lm(formula = z ~ x, data = dd, subset = !drop)

Residuals:

Min 1Q Median 3Q Max

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -0.01012 0.07117 -0.142 0.

x 0.55677 0.07007 7.946 1.46e-13 ***

Signif. codes: 0 ’’ 0.001 ’’ 0.01 ’’ 0.05 ’.’ 0.1 ’ ’ 1

Residual standard error: 0.9003 on 197 degrees of freedom

Multiple imputation

> dd1<-dd

> dd1$z<-ifelse(drop,

rnorm(nrow(dd),m=predict(impmodel,newdata=dd),s=0.9003),

dd$z)

> dd2<-dd

> dd2$z<-ifelse(drop,

rnorm(nrow(dd),m=predict(impmodel,newdata=dd),s=0.9003),

dd$z)

> dd3<-dd

> dd3$z<-ifelse(drop,

rnorm(nrow(dd),m=predict(impmodel,newdata=dd),s=0.9003),

dd$z)

> dd4<-dd

> dd4$z<-ifelse(drop,

rnorm(nrow(dd),m=predict(impmodel,newdata=dd),s=0.9003),

dd$z)

Multiple imputation

> dd5<-dd

> dd5$z<-ifelse(drop,

rnorm(nrow(dd),m=predict(impmodel,newdata=dd),s=0.9003),

dd$z)

> library(mitools)

> ddimp<-imputationList(list(dd1,dd2,dd3,dd4,dd5))

> models<-with(ddimp, lm(x~z))

> summary(MIcombine(models))

Multiple imputation results:

with.imputationList(ddimp, lm(x ~ z))

MIcombine.default(models)

results se (lower upper) missInfo

(Intercept) 0.03576729 0.04634838 -0.05618825 0.1277228 22 %

z 0.47037529 0.04374016 0.38360450 0.5571461 21 %

Likelihood

The observed data loglikelihood is univariate normal for X alone,

bivariate normal for (X, Z)

> ell<-function(theta){

xmu<-theta[1]

mu<-theta[1:2]

Sigma<-matrix(theta[c(3,4,4,5)],2)

xsigma<-sqrt(theta[3])

ellx<-sum(dnorm(dd$x[!!drop],m=xmu,s=xsigma,log=TRUE))

ellxz<-

for(i in which(!drop)){

xz<-as.matrix(dd[i,c("x","z")])

ellxz<-ellxz-log(2pidet(Sigma))

-(xz-mu)%%solve(2Sigma,t(xz-mu))

ellxz+ellx}

Likelihood

Direct optimization gives

> optim(c(0,0.25,0.9,0.5,0.7),ell,method="BFGS",

control=list(trace=TRUE,fnscale=-1))

initial value 759.

iter 10 value 724.

final value 724.

converged

$par

[1] 0.04750160 0.01633095 0.62181402 0.34618559 0.

$value

[1] -724.

$counts

function gradient

EM algorithm

The EM algorithm would be more work that direct maximization

in this case. For the cases with Z missing we need to

compute E[Z|X] and E[Z

|X] as functions of the bivariate normal

parameters in order to compute

E

[

((X, Z) − μ)Σ

((X, Z) − μ)

T

|X

]

If we were computing analytic rather than numeric first deriva-

tives (which often helps optimization substantially) the EM

algorithm would mean that we only needed one set of derivatives

rather than two.

The EM algorithm is most used not for real missing data but

for likelihoods depending on latent variables, where treating the

latent variables as missing data simplifies the likelihood.