Nonparametric Estimation of Hazard-Based Survivor Function: Analysis by R. Prentice - Prof, Study notes of Statistics

An overview of hazard-based nonparametric survivor function estimation, a statistical method for estimating the joint survivor function F(t1, t2) = P(T1 > t1, T2 > t2), the probability that two failure times T1 and T2 exceed t1 and t2, respectively. The notes cover the theory behind the method, including the survivor function F(t) and hazard function Λ(dt), and discuss estimators such as the Bickel, Prentice-Cai, and Dabrowska estimators. The document also includes simulation comparisons and further research possibilities.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009

HAZARD-BASED NONPARAMETRIC
SURVIVOR FUNCTION ESTIMATION
Ross L. Prentice
Seattle, WA, U.S.A.

Overview

  • Joint work with Zoe Moodie and Jianrong Wu
  • NP estimation of $F$, where $F(t_1, t_2) = P(T_1 > t_1, T_2 > t_2)$
  • Review of estimators $\hat F = \Phi(\hat F_1, \hat F_2, \hat\Lambda)$, where $\Lambda(dt_1, dt_2) = F(dt_1, dt_2)/F(t_1^-, t_2^-)$
  • Hazard-based survivor function representation $F = \Phi(\Lambda)$ for truncated $T_1$ and $T_2$, $(t_1, t_2) \in [0, \tau_1) \times [0, \tau_2)$
  • Show $\hat F = \Phi(\hat\Lambda)$ readily obtained using a simple matrix calculation
  • Consider $\hat\Lambda_E$ and $\hat\Lambda_{VDL}$ special cases in simulation studies
  • Comment on further research possibilities


Introduction

  • $T = (T_1, T_2)$, and let $s = (s_1, s_2) < t = (t_1, t_2) \Leftrightarrow s_1 < t_1$ and $s_2 < t_2$
  • Survivor function $F$ defined by $F(t) = P(T > t)$
  • Hazard function $\Lambda$ defined by $\Lambda(dt) = F(dt)/F(t^-)$
  • $F = \Phi(F_1, F_2, \Lambda)$:

$$F(ds) = F(s^-)\,\Lambda(ds)$$

$$F(t) = \Psi(t) + \int_0^t F(s^-)\,\Lambda(ds) \qquad (*)$$

    where $\Psi(t) = F_1(t_1) + F_2(t_2) - 1$ and $F_i(t_i) = P(T_i > t_i)$, $i = 1, 2$.

  • $(*)$ is an inhomogeneous Volterra equation having a unique Peano series solution

$$F(t) = \Psi(t) + \int_0^t \Psi(s^-)\,P\{(s, t]; \Lambda\}\,\Lambda(ds)$$

    where

$$P\{(s, t]; \Lambda\} = 1 + \sum_{m=1}^{\infty} \int_{s < u_1 < \cdots < u_m \le t} \prod_{j=1}^{m} \Lambda(du_j)$$
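As a numerical illustration, the Volterra equation for $F$ in this section can be checked on a small discrete bivariate distribution. The following is a minimal sketch, not part of the slides: the support points and masses are made-up values, and the helper names `F`, `F_minus`, `hazard`, and `volterra_rhs` are hypothetical.

```python
# Toy discrete bivariate distribution (illustrative values, not from the source).
points = [(1, 1), (2, 1), (1, 2), (3, 2), (3, 3)]
masses = [0.2, 0.1, 0.3, 0.15, 0.25]

def F(t1, t2):
    """Survivor function F(t) = P(T1 > t1, T2 > t2)."""
    return sum(p for (u1, u2), p in zip(points, masses) if u1 > t1 and u2 > t2)

def F_minus(t1, t2):
    """Left limit F(t^-) = P(T1 >= t1, T2 >= t2)."""
    return sum(p for (u1, u2), p in zip(points, masses) if u1 >= t1 and u2 >= t2)

def hazard(u1, u2, p):
    """Discrete hazard Lambda(Delta u) = F(Delta u) / F(u^-)."""
    return p / F_minus(u1, u2)

def volterra_rhs(t1, t2):
    """Psi(t) plus the sum of F(s^-) * Lambda(Delta s) over support points s <= t."""
    # Psi(t) = F1(t1) + F2(t2) - 1; every support coordinate is > 0,
    # so F(t1, 0) and F(0, t2) are the marginal survivor functions.
    psi = F(t1, 0) + F(0, t2) - 1.0
    integral = sum(F_minus(u1, u2) * hazard(u1, u2, p)
                   for (u1, u2), p in zip(points, masses)
                   if u1 <= t1 and u2 <= t2)
    return psi + integral

# Largest discrepancy between the two sides over an integer grid.
grid = [(a, b) for a in range(4) for b in range(4)]
max_gap = max(abs(F(a, b) - volterra_rhs(a, b)) for a, b in grid)
print(max_gap)
```

In the discrete case the integral is just the total mass at support points $s \le t$, so the two sides should agree up to floating-point rounding.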

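Representation F = Φ(Λ)

Since $F(0) = 1 = F(\tau^-)\,\tilde P\{(0, \tau); \alpha\}$,

$$F(t^-) = \{1 - \Lambda(\Delta t)\}^{-1}\,\tilde P\{(t, \tau); \alpha\}\,/\,\tilde P\{(0, \tau); \alpha\}$$

where $\alpha(dt) = \Lambda(dt)/\{1 - \Lambda(\Delta t)\}$.

Notes:

  • $F(t) = \lim_{s \downarrow t} F(s^-)$
  • $F(dt) = F(t^-)\,\Lambda(dt) = \alpha(dt)\,\tilde P\{(t, \tau); \alpha\}\,/\,\tilde P\{(0, \tau); \alpha\}$
  • $\tilde P$ is compact differentiable (Gill and Johansen, 1990, AS), as is $\Phi$, so that $\hat F = \Phi(\hat\Lambda)$ inherits such properties as strong consistency, weak convergence to a Gaussian process, and bootstrap applicability from $\hat\Lambda$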

Calculation of $\hat F = \Phi(\hat\Lambda)$

  • Denote
      $u_{11} < u_{12} < \cdots < u_{1I} = \tau_1$: uncensored $T_1$ values
      $u_{21} < u_{22} < \cdots < u_{2J} = \tau_2$: uncensored $T_2$ values
  • Nonparametric $\hat\Lambda(\Delta t)$ estimators take positive values on a subset of the grid formed by these uncensored failure times, zero elsewhere
  • Video order (raster-scan order) the points having positive hazard, starting from bottom left and moving across rows. Denote these points by $t_f = (t_{1f}, t_{2f})$, $f = 1, \dots, s$, and let $\hat\lambda_f$ denote the hazard rate at $t_f$. Note that $t_s = (\tau_1, \tau_2)$ and $\hat\lambda_s = 1$, whereas $\hat\lambda_f < 1$ for $f < s$
  • Denote by $\hat p_f = \hat F(\Delta t_f)$ the mass assignment at $t_f$, $f = 1, \dots, s$. Then

$$\hat\lambda_f = \hat p_f / \hat F(t_f^-) = \hat p_f \Big/ \Bigl\{1 - \sum_{\{i:\ t_{1i} < t_{1f}\ \text{or}\ t_{2i} < t_{2f}\}} \hat p_i\Bigr\}, \quad f = 1, \dots, s$$

    so that

$$A\hat p = 1$$

    where $\hat p' = (\hat p_1, \dots, \hat p_s)$, $1' = (1 \cdots 1)$ and $A = A(\hat\Lambda) = (a_{fm})$ with

$$a_{fm} = \begin{cases} \hat\lambda_f^{-1}, & \text{if } f = m; \\ 1, & \text{if } t_{1f} > t_{1m} \text{ or } t_{2f} > t_{2m}; \\ 0, & \text{otherwise.} \end{cases}$$
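The matrix calculation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the support points, the "true" masses, and the helper names `build_A`, `solve`, and `hazards_from_masses` are all hypothetical, and a plain Gaussian elimination stands in for whatever linear solver one would use in practice.

```python
def build_A(points, hazards):
    """A = (a_fm): 1/lambda_f on the diagonal, 1 where point m is strictly
    smaller than point f in at least one coordinate, 0 otherwise."""
    s = len(points)
    A = [[0.0] * s for _ in range(s)]
    for f, (t1f, t2f) in enumerate(points):
        for m, (t1m, t2m) in enumerate(points):
            if f == m:
                A[f][m] = 1.0 / hazards[f]
            elif t1m < t1f or t2m < t2f:
                A[f][m] = 1.0
    return A

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hazards_from_masses(points, masses):
    """lambda_f = p_f / F(t_f^-), where F(t_f^-) is one minus the mass at
    points strictly smaller than t_f in at least one coordinate."""
    out = []
    for (t1f, t2f), pf in zip(points, masses):
        below = sum(p for (t1, t2), p in zip(points, masses)
                    if t1 < t1f or t2 < t2f)
        out.append(pf / (1.0 - below))
    return out

# Toy support in raster order (bottom row first, left to right);
# the last point plays the role of (tau_1, tau_2). Values are illustrative.
points = [(1, 1), (2, 1), (1, 2), (3, 2), (3, 3)]
true_p = [0.2, 0.1, 0.3, 0.15, 0.25]

lam = hazards_from_masses(points, true_p)      # hazard rates implied by true_p
p_hat = solve(build_A(points, lam), [1.0] * len(points))
print(lam)
print(p_hat)
```

Starting from known masses, computing the implied hazard rates, and solving $A\hat p = 1$ should return the same masses; the hazard at the final point comes out as 1, matching the note that $\hat\lambda_s = 1$.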

Calculation of $\hat F = \Phi(\hat\Lambda)$ (continued)

$$\hat F(t) = \sum_{\{f:\ t_{1f} > t_1\ \text{and}\ t_{2f} > t_2\}} \hat p_f$$

provides the nonparametric estimator of $F$ for all $t \in [0, \tau_1) \times [0, \tau_2)$
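Once mass assignments are available, evaluating the survivor function estimate is a single sum over the support points lying strictly above and to the right of $t$. A minimal sketch follows, assuming the masses have already been computed; the points, masses, and the function name `survivor` are illustrative and not taken from the slides.

```python
# Illustrative support points and mass assignments (assumed already estimated).
points = [(1, 1), (2, 1), (1, 2), (3, 2), (3, 3)]
masses = [0.2, 0.1, 0.3, 0.15, 0.25]

def survivor(points, masses, t1, t2):
    """F_hat(t) = sum of p_f over points with t_1f > t1 and t_2f > t2."""
    return sum(p for (u1, u2), p in zip(points, masses) if u1 > t1 and u2 > t2)

joint = survivor(points, masses, 1, 1)       # P(T1 > 1, T2 > 1)
marginal_2 = survivor(points, masses, 0, 1)  # P(T2 > 1), since all T1 values exceed 0
print(joint, marginal_2)
```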

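Linkage to representation

$$\hat F(\Delta t) = \hat\alpha(\Delta t)\,\tilde P\{(t, \tau); \hat\alpha\}\,/\,\tilde P\{(0, \tau); \hat\alpha\}$$

  • Denote $\hat\alpha_f = \hat\lambda_f / \{1 - \hat\lambda_f\} = \hat\gamma_f^{-1}$, $f = 1, \dots, s - 1$. Then

$$\tilde P\{(0, \tau); \hat\alpha\} = (\hat\alpha_1 \cdots \hat\alpha_{s-1}) \det A$$

    and

$$\tilde P\{(t_k, \tau); \hat\alpha\} = (\hat\alpha_{k+1} \cdots \hat\alpha_{s-1})\,\hat p_k \det A(\hat\alpha_1, \dots, \hat\alpha_{k-1}, \dots)$$

    so

$$\hat F(\Delta t_k) = \hat\alpha_k\,\frac{\bigl(\prod_{f \ne k} \hat\alpha_f\bigr) \det A}{\bigl(\prod_{f=1}^{s-1} \hat\alpha_f\bigr) \det A}\,\hat p_k = \hat p_k, \text{ as desired}$$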

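Generalization to Other Dimensions

  • $T = (T_1, \dots, T_m)$ for $m = 1$ or $m > 2$
  • Same argument gives, for any $m \ge 1$,

$$F(t^-) = \{1 - \Lambda(\Delta t)\}^{-1}\,\tilde P\{(t, \tau); \alpha\}\,/\,\tilde P\{(0, \tau); \alpha\}$$

  • Univariate:

$$\tilde P\{(t, \tau); \alpha\} = 1 + \sum_{n=1}^{\infty} \int_{t < v_1 < \cdots < v_n < \tau} \prod_{i=1}^{n} \alpha(dv_i) = \prod_{t < u < \tau} \{1 + \alpha(du)\} \quad \text{(Gill and Johansen, 1990)}$$

$$= \prod_{t < u < \tau} \{1 - \Lambda(du)\}^{-1}$$

    So

$$F(t^-) = \{1 - \Lambda(\Delta t)\}^{-1}\,\prod_{t < u < \tau} \{1 - \Lambda(du)\}^{-1} \Big/ \prod_{0 < u < \tau} \{1 - \Lambda(du)\}^{-1}$$

$$= \{1 - \Lambda(\Delta t)\}^{-1} \prod_{0 < u \le t} \{1 - \Lambda(du)\} = \prod_{0 < u < t} \{1 - \Lambda(du)\} \quad \text{(K-M)}$$

  • For any $m$, compact differentiability of the transformation from $\Lambda$ to $F$ allows asymptotic properties held by $\hat\Lambda$ to be inherited by $\hat F = \Phi(\hat\Lambda)$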
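The univariate reduction to the Kaplan-Meier product form can be checked numerically. The sketch below is illustrative only: the jump times and hazard values are made up, and `P_tilde`, `F_minus_representation`, and `F_minus_km` are hypothetical helper names. It assumes a discrete hazard with jumps strictly inside $(0, \tau)$ and uses $\alpha = \Lambda/(1 - \Lambda)$ at each jump.

```python
import math

TAU = 4.0
# Discrete hazard jumps Lambda(Delta u) at times strictly inside (0, TAU).
# Illustrative values, not from the source.
lam = {1.0: 0.1, 2.0: 0.25, 3.0: 0.5}

def P_tilde(t):
    """Product over t < u < TAU of {1 + alpha(du)}, with alpha = lam / (1 - lam)."""
    return math.prod(1.0 + lam[u] / (1.0 - lam[u]) for u in lam if t < u < TAU)

def F_minus_representation(t):
    """F(t^-) = {1 - Lambda(Delta t)}^{-1} * P_tilde(t, tau) / P_tilde(0, tau)."""
    return P_tilde(t) / P_tilde(0.0) / (1.0 - lam.get(t, 0.0))

def F_minus_km(t):
    """Kaplan-Meier product form: product over 0 < u < t of {1 - Lambda(du)}."""
    return math.prod(1.0 - lam[u] for u in lam if u < t)

check_times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
gaps = [abs(F_minus_representation(t) - F_minus_km(t)) for t in check_times]
print(max(gaps))
```

The two forms should agree at jump times and between them, up to floating-point rounding.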

Special Cases of $\hat\Lambda$

  • $\hat\Lambda_{RE}$, $\hat\Lambda_{VDL}$ with bivariate censoring: $\hat\Lambda$ requires iterative calculation, and provides no advantage over direct calculation of $\hat F$ via E-M algorithm

  • Van der Laan's (1996 AS) repaired NPMLE maximizes

$$L = \prod_{i=1}^{I} \prod_{j=1}^{J} p_{ij}^{\,n_{ij}^{11}} \Bigl(\sum_{f \in S_{1i}} \sum_{m > j} p_{fm}\Bigr)^{n_{ij}^{10}} \Bigl(\sum_{f > i} \sum_{m \in S_{2j}} p_{fm}\Bigr)^{n_{ij}^{01}} \Bigl(\sum_{f > i} \sum_{m > j} p_{fm}\Bigr)^{n_{ij}^{00}}$$

    after imposition of additional censoring, replacement of any undistributed singly censored observation by an uncensored observation, and coarsening of the uncensored component of the singly censored observation to the pertinent partition element

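  • $p_{ij} = 0$ if $n_{ij}^{11} = 0$. Otherwise $p_{ij} > 0$ and satisfies

$$\frac{n_{ij}^{11}}{p_{ij}} + \sum_{m < j} \frac{\bar n_{im}^{10}}{\sum_{u \in S_{1i}} \sum_{v > m} p_{uv}} + \sum_{f < i} \frac{\bar n_{fj}^{01}}{\sum_{u > f} \sum_{v \in S_{2j}} p_{uv}} + \sum_{f < i} \sum_{m < j} \frac{n_{fm}^{00}}{\sum_{u > f} \sum_{v > m} p_{uv}} - n = 0$$

  • Rewrite as

$$p_{ij} = \frac{n_{ij}^{11}}{n} + \sum_{m < j} \frac{\bar n_{im}^{10}}{n} \left\{ \frac{p_{ij}}{\sum_{v > m} \sum_{u \in S_{1i}} p_{uv}} \right\} + \sum_{f < i} \frac{\bar n_{fj}^{01}}{n} \left\{ \frac{p_{ij}}{\sum_{u > f} \sum_{v \in S_{2j}} p_{uv}} \right\} + \sum_{f < i} \sum_{m < j} \frac{n_{fm}^{00}}{n} \left\{ \frac{p_{ij}}{\sum_{u > f} \sum_{v > m} p_{uv}} \right\}$$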

  • Simple iterative calculation using mass assignments from reduced data $\hat F_E$ for starting values
  • Greenwood-like asymptotic variance estimator for $\hat F_{VDL}$ from $\partial^2 \log L / \partial p_{ij}\,\partial p_{fm}$

Simulation Comparisons

(see also Van der Laan, 1997, Statistica Neerlandica)

(i) $F$:
    - Independence: $F(t_1, t_2) = F_1(t_1) F_2(t_2)$
    - Clayton (1978, Bmka): $F(t_1, t_2) = \{F_1(t_1)^{-\theta} + F_2(t_2)^{-\theta} - 1\}^{-\theta^{-1}}$ with $\theta = 4$
    - Unit exponential margins

(ii) Censoring:
    - $C_1$: exponential, mean 0.5 (2/3 probability of censoring)
    - $C_2$: none; univariate ($C_1 = C_2$); independent exponential with mean 0.

(iii) Truncation: $\tau_1 = \tau_2 = 0.5978$, $F_i(\tau_i) = 0.55$, $i = 1, 2$. Exclude sample from summary statistics if risk set empty at $\tau$

(iv) Sample size: $n = 30, 60, 120, 240$; 1000 repetitions

(v) Partitions for $\hat F_{VDL}$: vertical and horizontal strips
    $[0, -\log(.85))$, $[-\log(.85), -\log(.70))$, $[-\log(.70), -\log(.55))$, $[-\log(.55), -\log(.55)]$
    Also considered wider and narrower strips
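One way to generate data under the Clayton configuration with censoring on $T_1$ only is sketched below. This is an assumption-laden illustration rather than the simulation code behind the slides: it uses conditional inversion of the Clayton copula (a standard sampling technique chosen here, not stated in the source), and the names `clayton_pair`, `simulate`, `THETA`, and `TAU` are hypothetical. Truncation at $\tau$ and the $\hat F_{VDL}$ partitions are not implemented.

```python
import math
import random

THETA = 4.0    # Clayton parameter from the simulation design
TAU = 0.5978   # truncation point, chosen so that exp(-TAU) is about 0.55

def clayton_pair(theta, rng):
    """Draw (T1, T2) with unit exponential margins whose joint survivor
    function is the Clayton model, by conditional inversion of the copula."""
    u1 = 1.0 - rng.random()   # uniform on (0, 1]
    w = 1.0 - rng.random()
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    # T_i = -log(U_i) is unit exponential with F_i(T_i) = U_i.
    return -math.log(u1), -math.log(u2)

def simulate(n, theta=THETA, censor_mean=0.5, seed=0):
    """Return n tuples (x1, delta1, t2, t1): x1 = min(T1, C1), delta1 = 1 if
    T1 is uncensored, t2 uncensored, and the latent t1 kept only for checking."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t1, t2 = clayton_pair(theta, rng)
        c1 = rng.expovariate(1.0 / censor_mean)  # exponential with mean 0.5
        data.append((min(t1, c1), int(t1 <= c1), t2, t1))
    return data

sample = simulate(100000, seed=2024)
censored_fraction = 1.0 - sum(d for _, d, _, _ in sample) / len(sample)
joint_surv = sum(1 for _, _, t2, t1 in sample if t1 > TAU and t2 > TAU) / len(sample)
# Clayton survivor function at (TAU, TAU): {2 * exp(theta * tau) - 1}^(-1/theta)
clayton_value = (2.0 * math.exp(THETA * TAU) - 1.0) ** (-1.0 / THETA)
print(censored_fraction, joint_surv, clayton_value)
```

With a unit exponential failure time and an exponential censoring time of mean 0.5, the censoring probability is $2/(1 + 2) = 2/3$, which the empirical censored fraction should approximate; the empirical joint survival at $(\tau, \tau)$ should be close to the Clayton formula.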

Table 2. Sample means and standard deviations (in parentheses) for various estimators of the marginal survivor functions $F_1$ and $F_2$. Each row is based on 1000 simulations at sample size n = 120. Column headings give the true $T_1$ and $T_2$ survival probabilities.

| Censoring | Failure model | Estimator | $T_1$: .85 | $T_1$: .70 | $T_1$: .55 | $T_2$: .85 | $T_2$: .70 | $T_2$: .55 |
|---|---|---|---|---|---|---|---|---|
| On $T_1$ only | Independence | $\hat F_E$ | .850 (.038) | .700 (.055) | .552 (.070) | .850 (.054) | .702 (.072) | .551 (.081) |
| | | $\hat F_{KM}$ | .850 (.036) | .701 (.049) | .552 (.064) | .851 (.033) | .700 (.042) | .550 (.046) |
| | | $\hat F_{RE}$ | .851 (.035) | .701 (.050) | .553 (.066) | .850 (.039) | .700 (.050) | .550 (.047) |
| | Clayton | $\hat F_E$ | .850 (.036) | .700 (.049) | .551 (.063) | .850 (.042) | .702 (.055) | .555 (.066) |
| | | $\hat F_{KM}$ | .850 (.036) | .701 (.049) | .552 (.064) | .851 (.033) | .701 (.041) | .553 (.044) |
| | | $\hat F_{RE}$ | .851 (.036) | .702 (.048) | .559 (.060) | .851 (.037) | .701 (.045) | .553 (.046) |
| Univariate | Independence | $\hat F_E$ | .852 (.053) | .703 (.067) | .554 (.078) | .850 (.053) | .702 (.070) | .552 (.079) |
| | | $\hat F_{KM}$ | .850 (.036) | .701 (.049) | .552 (.064) | .851 (.035) | .702 (.050) | .553 (.065) |
| | | $\hat F_{RE}$ | .851 (.044) | .701 (.056) | .552 (.071) | .850 (.042) | .702 (.056) | .553 (.070) |
| | Clayton | $\hat F_E$ | .851 (.043) | .701 (.057) | .552 (.071) | .850 (.042) | .701 (.056) | .555 (.069) |
| | | $\hat F_{KM}$ | .850 (.036) | .701 (.049) | .552 (.064) | .851 (.036) | .702 (.049) | .555 (.064) |
| | | $\hat F_{RE}$ | .851 (.041) | .702 (.054) | .556 (.069) | .850 (.041) | .702 (.055) | .560 (.067) |
| Bivariate | Independence | $\hat F_E$ | .847 (.063) | .701 (.092) | .552 (.117) | .853 (.061) | .700 (.095) | .550 (.122) |
| | | $\hat F_{KM}$ | .850 (.035) | .702 (.051) | .552 (.064) | .852 (.033) | .700 (.051) | .551 (.065) |
| | | $\hat F_{RE}$ | .849 (.052) | .702 (.070) | .558 (.072) | .852 (.049) | .702 (.069) | .557 (.075) |
| | | (Greenwood)° | (.042) | (.057) | (.068) | (.042) | (.057) | (.069) |
| | Clayton | $\hat F_E$ | .849 (.048) | .701 (.072) | .548 (.100) | .851 (.046) | .703 (.071) | .552 (.097) |
| | | $\hat F_{KM}$ | .850 (.035) | .702 (.051) | .552 (.064) | .850 (.034) | .700 (.050) | .548 (.063) |
| | | $\hat F_{RE}$ | .850 (.047) | .704 (.065) | .563 (.069) | .849 (.047) | .704 (.063) | .561 (.068) |
| | | (Greenwood)° | (.040) | (.054) | (.065) | (.040) | (.054) | (.065) |

° Entries are the square root of the average across samples of Greenwood-like variance estimators.

Discussion/Further Research

  • Compact differentiable transformation

$$F(t^-) = \{1 - \Lambda(\Delta t)\}^{-1}\,\tilde P\{(t, \tau); \alpha\}\,/\,\tilde P\{(0, \tau); \alpha\}$$

    developed, where $\alpha(dt) = \Lambda(dt)/\{1 - \Lambda(\Delta t)\}$, for truncated failure time data
  • Nonparametric plug-in estimators $\hat F = \Phi(\hat\Lambda)$ calculated using a simple matrix calculation
  • $\hat F_E$ is too inefficient, and the asymptotically efficient $\hat F_{VDL}$ may require very large sample sizes to outperform the simple plug-in estimators $\hat F_{PC}$ and $\hat F_D$
  • Further research possibilities:
    - Obtain $\hat F_1$, $\hat F_2$ estimators via partitioning $[0, \tau_1]$, $[0, \tau_2]$, followed by Prentice-Cai or Dabrowska procedures
    - Impose hazard rate form of Prentice-Cai or Dabrowska estimators (with restriction to avoid negative mass) in the $\hat F = \Phi(\hat\Lambda)$ calculation