Notes on Censored EL and Hazard


Mai Zhou

In survival analysis, statistics involving the hazard function are usually easier to handle mathematically than those involving the distribution function. For example, it is easier to show that the Nelson-Aalen estimator is an NPMLE of the cumulative hazard function than to show that the Kaplan-Meier estimator is an NPMLE of the distribution function. However, there is a catch: the hazard function $\Lambda$ has two distinct pairs of formulas connecting it with the distribution function $F$, one for a continuous hazard and one for a discrete hazard. The continuous version of the empirical likelihood is also called the Poisson empirical likelihood; the discrete version is also called the binomial empirical likelihood. The discrete version is a true likelihood; the continuous version is an approximation.

Discrete:
$$ \Delta\Lambda(t) = \frac{\Delta F(t)}{1-F(t-)}, \qquad 1-F(t) = \prod_{s\le t}[1-\Delta\Lambda(s)]. $$
Continuous:
$$ \Lambda(t) = -\log[1-F(t)], \qquad 1-F(t) = \exp[-\Lambda(t)]. $$

Therefore there are two versions of everything related to the hazard: two versions of the empirical likelihood and two versions of the null hypothesis. Later we will prove two versions of the empirical likelihood ratio Wilks theorem.

1 Hazard Empirical Likelihood: continuous version

Suppose that $X_1, X_2, \dots, X_n$ are i.i.d. nonnegative random variables denoting the lifetimes, with a continuous distribution function $F_0$ and cumulative hazard function $\Lambda_0(t)$. Independent of the lifetimes there are censoring times $C_1, C_2, \dots, C_n$, which are i.i.d. with distribution $G_0$. Only the censored observations $(T_i, \delta_i)$ are available to us: $T_i = \min(X_i, C_i)$ and $\delta_i = I[X_i \le C_i]$ for $i = 1, 2, \dots, n$.
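As a quick numerical illustration of the two conversion formulas (my own addition, not part of the original notes; the jump values below are made up), the following Python sketch turns a set of discrete hazard jumps into a survival function via the product-limit formula and compares the result with the continuous-hazard exponential formula.

\begin{verbatim}
import numpy as np

# Made-up discrete hazard jumps at ordered times t_1 < ... < t_m.
t = np.array([1.0, 2.0, 3.5, 5.0])
dLam = np.array([0.10, 0.20, 0.25, 1.00])  # last jump of a discrete cumulative hazard is one

# Discrete relation: 1 - F(t_k) = prod_{s <= t_k} [1 - dLambda(s)]
surv_discrete = np.cumprod(1.0 - dLam)

# Continuous relation (an approximation when the jumps are small): 1 - F(t) = exp(-Lambda(t))
surv_continuous = np.exp(-np.cumsum(dLam))

for tk, sd, sc in zip(t, surv_discrete, surv_continuous):
    print(f"t={tk:4.1f}   product-limit 1-F={sd:.4f}   exp(-Lambda)={sc:.4f}")
\end{verbatim}

The two survival functions agree closely while the jumps are small and disagree at the last point, where the discrete hazard jump equals one; this is exactly the finite-sample discrepancy between the Poisson and binomial versions discussed above.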

For the empirical likelihood in terms of the hazard, we use the Poisson extension of the likelihood (Murphy 1995), and it is defined as
$$ EL(\Lambda) = \prod_{i=1}^n [\Delta\Lambda(T_i)]^{\delta_i}\exp\{-\Lambda(T_i)\}
            = \prod_{i=1}^n [\Delta\Lambda(T_i)]^{\delta_i}\exp\Big\{-\sum_{j: T_j\le T_i}\Delta\Lambda(T_j)\Big\}, $$
where $\Delta\Lambda(t) = \Lambda(t+)-\Lambda(t-)$ is the jump of $\Lambda$ at $t$.

Remark: The term $\exp(-\Lambda(T_i))$ in the first expression above has its origin in the continuous formula, yet in the second expression we assume a discrete $\Lambda(\cdot)$.

Let $w_i = \Delta\Lambda(T_i)$ for $i = 1, 2, \dots, n$, where we notice that $w_n = \delta_n$ because the last jump of a discrete cumulative hazard function must be one. The likelihood at this $\Lambda$ can be written in terms of the jumps as
$$ EL = \prod_{i=1}^n [w_i]^{\delta_i}\exp\Big\{-\sum_{j=1}^n w_j I[T_j\le T_i]\Big\}, $$
and the log likelihood is
$$ \log EL = \sum_{i=1}^n\Big\{\delta_i\log w_i - \sum_{j=1}^n w_j I[T_j\le T_i]\Big\}. $$
If we maximize the $\log EL$ above (without constraint) we see that $w_i = \delta_i/R_i$, where $R_i = \sum_j I[T_j\ge T_i]$. This is the well known Nelson-Aalen estimator: $\Delta\hat\Lambda_{NA}(T_i) = \delta_i/R_i$. If we define $R(t) = \sum_k I[T_k\ge t]$ then $R_i = R(T_i)$.

The first step in our analysis is to find a (discrete) cumulative hazard function that maximizes $\log EL(\Lambda)$ under the constraints (1):
$$ \int_0^\infty g_1(t)\,d\Lambda(t) = \theta_1,\quad \int_0^\infty g_2(t)\,d\Lambda(t) = \theta_2,\quad\dots,\quad \int_0^\infty g_p(t)\,d\Lambda(t) = \theta_p, \qquad (1) $$
where the $g_i(t)$ ($i = 1, 2, \dots, p$) are given functions satisfying some moment conditions (specified later), and the $\theta_i$ ($i = 1, 2, \dots, p$) are given constants. For a discrete hazard, the constraints (1) can be written as
$$ \sum_i g_1(T_i)w_i = \theta_1,\quad \sum_i g_2(T_i)w_i = \theta_2,\quad\dots,\quad \sum_i g_p(T_i)w_i = \theta_p. \qquad (2) $$
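To make the unconstrained maximizer concrete, here is a small self-contained Python sketch (my own addition, with simulated data and variable names of my choosing) that computes the risk set sizes $R_i$ and the Nelson-Aalen jumps $\delta_i/R_i$ from right-censored pairs $(T_i,\delta_i)$.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.exponential(scale=2.0, size=n)   # lifetimes; true cumulative hazard Lambda_0(t) = t/2
C = rng.exponential(scale=3.0, size=n)   # censoring times
T = np.minimum(X, C)
delta = (X <= C).astype(float)

order = np.argsort(T)
T, delta = T[order], delta[order]

# R_i = #{j : T_j >= T_i}; with sorted, tie-free data this is n - i (0-based index i)
R = n - np.arange(n)

dLam_NA = delta / R                      # Nelson-Aalen jumps
Lam_NA = np.cumsum(dLam_NA)              # Nelson-Aalen estimate of Lambda_0

k = n // 2
print("estimated Lambda at T_(k):", Lam_NA[k])
print("true Lambda at T_(k)     :", T[k] / 2.0)
\end{verbatim}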

An argument similar to Owen (1988) will show that we may restrict our attention in the EL analysis to those discrete hazard functions that are dominated by the Nelson-Aalen estimator: $\Lambda(t) \ll \hat\Lambda_{NA}(t)$. [Owen (1988) restricted his attention to those distribution functions $F(t)$ that are dominated by the empirical distribution.] Since for discrete hazard functions the last jump must be one, this implies that $w_n = \delta_n = \Delta\hat\Lambda_{NA}(T_n)$ always. The next theorem gives the other jumps.

Theorem 1 If the constraints above are feasible (which means the maximization problem has a hazard solution), then the maximum of $\log EL(\Lambda)$ under the constraints is attained at
$$ w_i = \frac{\delta_i}{R_i + n\lambda^T G(T_i)\delta_i}
       = \frac{\delta_i}{R_i}\cdot\frac{1}{1 + \lambda^T\big(\delta_i G(T_i)/(R_i/n)\big)}
       = \Delta\hat\Lambda_{NA}(T_i)\,\frac{1}{1 + \lambda^T Z_i}, $$
where
$$ G(T_i) = \{g_1(T_i),\dots,g_p(T_i)\}^T, \qquad
   Z_i = \frac{\delta_i G(T_i)}{R_i/n} = \{Z_{1i},\dots,Z_{pi}\}^T \quad\text{for } i = 1,2,\dots,n, $$
and $\lambda = \{\lambda_1,\dots,\lambda_p\}^T$ is the solution of the following equations:
$$ \frac{1}{n}\sum_{i=1}^{n-1}\frac{Z_{ki}}{1+\lambda^T Z_i} + g_k(T_n)\delta_n = \theta_k
   \qquad\text{for } k = 1,\dots,p. \qquad (3) $$

Proof. Use the Lagrange multiplier method to find the constrained maximum of $\log EL$. See Pan and Zhou (2002) for details.

Similar to the proof in that paper, the following Wilks theorem can also be shown to hold.

Theorem 2 Let $(T_1,\delta_1),\dots,(T_n,\delta_n)$ be $n$ pairs of i.i.d. random variables as defined above. Suppose the $g_i$, $i = 1,\dots,p$, are left continuous functions satisfying
$$ 0 < \int\frac{g_i(x)g_j(x)}{(1-F_0(x))(1-G_0(x-))}\,d\Lambda_0(x) < \infty \qquad\text{for all } 1\le i,j\le p. \qquad (4) $$
Furthermore, assume the matrix $\Sigma$, defined in Lemma 2 below, is invertible. Then
$$ \theta_0 = \Big\{\int g_1(t)\,d\Lambda_0(t),\dots,\int g_p(t)\,d\Lambda_0(t)\Big\}^T $$
will be a feasible vector with probability approaching one as $n\to\infty$, and
$$ -2\log ELR(\theta_0) \xrightarrow{D} \chi^2(p) \quad\text{as } n\to\infty, $$
where $\log ELR(\theta_0) = \max\log EL\,(\text{with constraints (2)}) - \log EL(\hat\Lambda_{NA})$.

Proof. Here we briefly outline the proof. The complete proof is just a multivariate version of Pan and Zhou (2002). First, we need the following two lemmas. They are the Law of Large Numbers and the CLT for the Nelson-Aalen estimator, and can be proved via counting process techniques.

Lemma 1 Under the assumptions of Theorem 2, we have, for $1\le k,r\le p$,
$$ \frac{1}{n}\sum_{i=1}^n Z_{ki}Z_{ri}
   = \int\frac{g_k(t)g_r(t)}{R(t)/n}\,d\hat\Lambda_{NA}(t)
   \xrightarrow{P} \int\frac{g_k(x)g_r(x)}{(1-F_0(x))(1-G_0(x-))}\,d\Lambda_0(x)
   \quad\text{as } n\to\infty, $$
where $R(t) = \sum_i I[T_i\ge t]$.

Lemma 2 Under the assumptions of Theorem 2, we have
$$ \sqrt{n}\Big(\frac{1}{n}\sum_{i=1}^n Z_i - \theta_0\Big)
   = \sqrt{n}\Big(\sum_{i=1}^n G(T_i)\,\Delta\hat\Lambda_{NA}(T_i) - \theta_0\Big)
   \xrightarrow{D} MVN(0,\Sigma) \quad\text{as } n\to\infty, $$
where the limiting variance-covariance matrix is
$$ \Sigma_{kr} = \int\frac{g_k(x)g_r(x)}{(1-F_0(x))(1-G_0(x-))}\,d\Lambda_0(x)
   \qquad\text{for } 1\le k,r\le p, \qquad (5) $$
and $\theta_0 = \{\int g_1(t)\,d\Lambda_0(t),\dots,\int g_p(t)\,d\Lambda_0(t)\}^T$.

We define the matrix $A$ below. Since $A\to\Sigma$ as $n\to\infty$ (Lemma 1) and we assumed $\Sigma$ is invertible and thus positive definite, we conclude that for $n$ large enough the symmetric matrix $A$ is invertible. Next, we show the solution $\lambda^*$ of the constraint equations (3) is
$$ \lambda = \lambda^* = A^{-1}b + o_p(n^{-1/2}), \qquad (6) $$
where
$$ A_{kr} = \frac{1}{n}\sum_{i=1}^n Z_{ki}Z_{ri} \quad\text{for } 1\le k,r\le p, \qquad
   b = \Big\{\frac{1}{n}\sum_i Z_{1i}-\theta_1,\dots,\frac{1}{n}\sum_i Z_{pi}-\theta_p\Big\}^T. $$
This can be proved by an expansion of equation (3). The stickier question is why an expansion of (3) is valid and why the remainder term is $o_p(n^{-1/2})$. We deal with this in the appendix.

Define
$$ f(\lambda) = \log EL(w_i(\lambda))
   = \sum_i\Big\{\delta_i\log w_i(\lambda) - \sum_j w_j(\lambda)I[T_j\le T_i]\Big\}, $$
and then the test statistic $-2\log ELR(\theta_0)$ can be expressed as
$$ -2\log ELR = 2[f(0)-f(\lambda^*)]
   = 2\big[f(0)-f(0)-\lambda^{*T}f'(0)-\tfrac12\lambda^{*T}f''(0)\lambda^* + \cdots\big]. $$
Straightforward calculation shows $f'(0) = 0$ and $f''(0) = -nA$. Therefore
$$ -2\log ELR = -\lambda^{*T}f''(0)\lambda^* + \cdots \qquad (7) $$

Simplify it, using (6), to the following:
$$ -2\log ELR(\theta_0) = n\,b^T A^{-1} b + o_p(1). $$
Finally, by Lemma 1 and Lemma 2, we get
$$ -2\log ELR(\theta_0) \xrightarrow{D} \chi^2(p) \quad\text{as } n\to\infty. $$
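The whole continuous-version (Poisson) procedure can be sketched numerically. The following Python code is my own illustration, not from the notes: it simulates censored data, uses a single constraint function ($p = 1$, $g(t) = I[t\le 1.5]$) together with its true value $\theta_0$ under $H_0$, forms the constrained jumps of Theorem 1, solves equation (3) for $\lambda$ with a root finder, and evaluates $-2\log ELR$; the helper names (jumps, log_el, constraint) are hypothetical.

\begin{verbatim}
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
n = 300
X = rng.exponential(scale=2.0, size=n)      # true cumulative hazard Lambda_0(t) = t/2
C = rng.exponential(scale=3.0, size=n)
T = np.minimum(X, C)
delta = (X <= C).astype(float)
order = np.argsort(T)
T, delta = T[order], delta[order]
R = n - np.arange(n)                        # risk set sizes R_i (continuous data, no ties)

g = (T <= 1.5).astype(float)                # constraint function g(t) = I[t <= 1.5]
theta0 = 1.5 / 2.0                          # = int g dLambda_0, the true value under H_0

def jumps(lam):
    # constrained hazard jumps from Theorem 1; the last jump stays at delta_n
    w = np.zeros(n)
    w[:-1] = delta[:-1] / (R[:-1] + n * lam * g[:-1] * delta[:-1])
    w[-1] = delta[-1]
    return w

def log_el(w):
    # Poisson log EL: sum_i delta_i log w_i - sum_j w_j R_j
    uncensored = delta == 1
    return np.sum(np.log(w[uncensored])) - np.sum(w * R)

def constraint(lam):
    # equation (3) with p = 1, written as sum_i g(T_i) w_i(lam) - theta0 = 0
    return np.sum(g * jumps(lam)) - theta0

mask = (delta[:-1] == 1) & (g[:-1] > 0)
lo = -np.min(R[:-1][mask]) / n * 0.999      # keep all denominators in Theorem 1 positive
lam_hat = brentq(constraint, lo, 50.0)      # the constraint is monotone in lambda here
m2logELR = 2.0 * (log_el(jumps(0.0)) - log_el(jumps(lam_hat)))
print("lambda =", lam_hat, "   -2 log ELR =", m2logELR)  # approx chi-square(1) under H_0
\end{verbatim}

Under repeated simulation the printed statistic should behave approximately like a $\chi^2(1)$ draw, in line with Theorem 2.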

Appendix.

We now give a proof of Lemma 1, the law of large numbers for the $Z_i$, or for the integral of the Nelson-Aalen estimator. The result is obviously true if we impose stronger moment conditions. We, however, try to give a proof that assumes only the finiteness of the limiting integral, without any extra moment condition. Also, we allow the function $g(t)$ to be a random sequence of functions $g_n(t)$. Notice that the random variables $Z_i$ here are not independent.

Lemma 1 Under the assumptions below, for a given $k = 1,2,\dots,p$ we have
$$ \frac{1}{n}\sum_i Z_{ki}^2 = \int\frac{g_k^2(t)}{R(t)/n}\,d\hat\Lambda_{NA}(t)
   \xrightarrow{P} \int\frac{g_k^2(x)}{(1-F_0(x))(1-G_0(x-))}\,d\Lambda_0(x). $$

Assumptions (we omit the subscript $k$; these conditions should hold for all $k = 1,2,\dots,p$):

(1) The limit that you hope to converge to must be finite, i.e.
$$ \int_0^\infty\frac{g^2(x)}{(1-F_0(x))(1-G_0(x-))}\,d\Lambda_0(x) < \infty. $$

(2) If we use $g_n(t)$ on the left side, then we need to assume that it converges uniformly on any finite interval, i.e. for any finite $\tau$, $\sup_{t<\tau}|g_n(t)-g(t)|$ goes to zero in probability, and that the ratio $\sup_i g_n(T_i)/g(T_i)$ is bounded in probability. These two conditions are satisfied by the empirical distribution, the Kaplan-Meier estimator and the Nelson-Aalen estimator. Notice that for the martingale CLT (Lemma 2) we will further require the $g_n(t)$ to be predictable functions.

Proof: We first prove the LLN for $\int_0^\tau$ for any given finite $\tau$:
$$ \int_0^\tau\frac{g_n^2(t)}{R(t)/n}\,d\hat\Lambda(t)
   = \sum_i I[T_i<\tau]\,\frac{g_n^2(T_i)}{R(T_i)/n}\cdot\frac{\Delta N(T_i)}{R(T_i)}. \qquad (8) $$
Minus and plus the term (recall $\Delta N(T_i) = \delta_i$; here $H$ denotes the distribution of $T$, so $1-H(t) = (1-F_0(t))(1-G_0(t))$)
$$ \frac{1}{n}\sum_i I[T_i<\tau]\,\frac{g^2(T_i)}{[1-H(T_i-)]^2}\,\delta_i $$
in the above and regroup, to get
$$ = \frac{1}{n}\sum_i I[T_i<\tau]\,\delta_i\left(\frac{g_n^2(T_i)}{[R(T_i)/n]^2}-\frac{g^2(T_i)}{[1-H(T_i-)]^2}\right)
   + \frac{1}{n}\sum_i I[T_i<\tau]\,\frac{g^2(T_i)\,\delta_i}{[1-H(T_i-)]^2}. \qquad (9) $$
The first term above is bounded by
$$ \frac{1}{n}\sum_i I[T_i<\tau]\left|\frac{g_n^2(T_i)}{[R(T_i)/n]^2}-\frac{g^2(T_i)}{[1-H(T_i-)]^2}\right|\delta_i
   \le \sup_{t<\tau}\left|\frac{g_n^2(t)}{[R(t)/n]^2}-\frac{g^2(t)}{[1-H(t-)]^2}\right|. $$
The term inside the absolute value converges uniformly to zero, by assumption (2) on $g_n(t)$. And it is well known that $R(t)/n\to 1-H(t-)$ uniformly, so its reciprocal is also uniformly convergent on $t\le\tau$.

The last term in (9) above is an i.i.d. sum with respect to $(T_i,\delta_i)$. By the classic LLN, it converges to its expectation, which is
$$ E\left(I[T_i<\tau]\,\frac{g^2(T_i)\,\delta_i}{[1-H(T_i-)]^2}\right) = \int_0^\tau\frac{g^2(t)}{1-H(t-)}\,d\Lambda_0(t), $$
which by assumption (1) is finite. This proves that the Lemma holds for any finite $\tau$.

We need to take care of the tail: $\int_\tau^\infty$. By assumption (1),
$$ \int_\tau^\infty\frac{g^2(t)}{1-H(t-)}\,d\Lambda_0(t) $$
can be made arbitrarily small by selecting a large $\tau$ (say, smaller than $\epsilon/C$). Since the ratios $g_n(T_i)/g(T_i)$ and $[1-H(T_i-)]/[R(T_i)/n]$ are both uniformly (in $i$) bounded in probability (assumption (2), and a property of the empirical distribution function), the tail term
$$ \sum_i I[T_i\ge\tau]\,\frac{g_n^2(T_i)}{R(T_i)/n}\cdot\frac{\Delta N(T_i)}{R(T_i)} $$
is bounded in probability by
$$ C\,\frac{1}{n}\sum_i I[T_i\ge\tau]\,\frac{g^2(T_i)\,\delta_i}{[1-H(T_i-)]^2}. $$
This average converges to its mean (since it is an i.i.d. average),
$$ C\int_\tau^\infty\frac{g^2(t)}{1-H(t-)}\,d\Lambda_0(t), $$
which, in turn, is smaller than the pre-selected $\epsilon$. This finishes the proof.

Remark: For future work on the Edgeworth expansion/Bartlett correction, we need an LLN like the above but with rates. Under suitable assumptions, the following should be true (a law of the iterated logarithm):
$$ \int g^2(t)\,d\hat\Lambda(t) - \int g^2(t)\,d\Lambda(t) = O\Big(\sqrt{\tfrac{\log\log n}{n}}\Big)\ \text{a.s.}, $$
or, similar to Lemma 1,
$$ \int\frac{g^2(t)}{R(t)/n}\,d\hat\Lambda(t) - \int\frac{g^2(t)}{1-H(t-)}\,d\Lambda(t) = O\Big(\sqrt{\tfrac{\log\log n}{n}}\Big)\ \text{a.s.} $$

The proof of Lemma 2 is a direct consequence of the martingale central limit theorem. See, for example, Kalbfleisch and Prentice (2002), Chapter 5, Theorem 5.1 in particular.

Remark: A better normal approximation for the martingales, one that has an Edgeworth expansion, is given by Lai and Wang:
$$ P\big(\sqrt{n}\,(\hat\Lambda(t)-\Lambda(t))\le\sigma z\big)
   = \Phi(z) + n^{-1/2}\phi(z)p_1(z) + n^{-1}\phi(z)p_2(z) + o(n^{-1}). $$

Appendix. PROOF THAT $\lambda_0$ IS SMALL.

We give a proof that validates the expansion of (3). In other words, we show that the solution of (3) is small; we want to show that $\lambda^T Z_i$ is small uniformly over $i$. We shall denote the solution by $\lambda_0$.

Lemma 3: Suppose $M = o_p(n^{1/2})$. Then $\lambda = O_p(n^{-1/2})$ if and only if
$$ \frac{|\lambda|}{1+|\lambda|M} = O_p(n^{-1/2}). $$
Proof: Homework.

Lemma 4: If $X_1,\dots,X_n$ are identically distributed and $E(X_1)^2<\infty$, then
$$ M = \max_{1\le i\le n}|X_i| = o_p(n^{1/2}). $$
Proof: Since $\{M>a\} = \bigcup_i\{|X_i|>a\}$, we compute, for any $\epsilon>0$,
$$ P(M>\epsilon n^{1/2}) = P\Big(\bigcup_i\{|X_i|>\epsilon n^{1/2}\}\Big)
   \le \sum_i P(|X_i|>\epsilon n^{1/2}) = nP(|X_1|>\epsilon n^{1/2}) = nP(X_1^2>\epsilon^2 n), $$
by the identical distribution assumption. Since $EX_1^2<\infty$, the right hand side goes to $0$ as $n\to\infty$. A similar proof shows that if $E|X|^p<\infty$ then $M = o_p(n^{1/p})$.

Lemma 5: We compute
$$ E\,\frac{\delta_i\,g^2(T_i)}{[1-F(T_i)]^2[1-G(T_i)]^2}
   = \int\frac{g^2(t)}{[1-F(t-)][1-G(t)]}\,d\Lambda(t). $$
Therefore, if we assume
$$ \int\frac{g^2(t)}{[1-F(t-)][1-G(t)]}\,d\Lambda_0(t) < \infty, $$
then
$$ M = \max_{1\le i\le n}\frac{\delta_i\,|g(T_i)|}{[1-F(T_i)][1-G(T_i)]} = o_p(n^{1/2}) $$
by Lemma 4 and Lemma 5.

Now, using a theorem of Zhou (1992), we can replace the denominator of $M$ by $R(T_i)/n$:
$$ M^* = \max_i|Z_i| = \max_i\frac{\delta_i\,|g(T_i)|}{R_i/n}
   \le M\cdot\max_i\frac{[1-F(T_i)][1-G(T_i)]}{R_i/n} = o_p(n^{1/2}). $$

Now we proceed; denote the solution by $\lambda_0$. We notice that for all $i$, $1+\lambda_0^T Z_i\ge 0$, since the solution $w_i$ given in Theorem 1 must give rise to a legitimate jump of the hazard function, which must be $\ge 0$; clearly $w_i\ge 0$ implies $1+\lambda_0^T Z_i\ge 0$. First we rewrite equation (3) (written for $p = 1$ for simplicity) and notice that $\lambda_0$ is the solution of the equation $0 = l(\eta)$:
$$ 0 = l(\lambda_0) = \Big(\theta_0 - \frac{1}{n}\sum_i Z_i\Big) + \lambda_0\,\frac{1}{n}\sum_i\frac{Z_i^2}{1+\lambda_0 Z_i}. \qquad (10) $$

Therefore,
$$ \theta_0 - \frac{1}{n}\sum_i Z_i = -\lambda_0\,\frac{1}{n}\sum_i\frac{Z_i^2}{1+\lambda_0 Z_i}, \qquad (11) $$
$$ \Big|\theta_0 - \frac{1}{n}\sum_i Z_i\Big| = |\lambda_0|\,\frac{1}{n}\Big|\sum_i\frac{Z_i^2}{1+\lambda_0 Z_i}\Big|. \qquad (12) $$
Since every term satisfies $Z_i^2/(1+\lambda_0 Z_i)\ge 0$ (at least when $\delta_i = 1$, i.e. $Z_i^2>0$), we have
$$ \Big|\theta_0 - \frac{1}{n}\sum_i Z_i\Big| = |\lambda_0|\,\frac{1}{n}\sum_i\frac{Z_i^2}{1+\lambda_0 Z_i}. $$
Replacing the denominators $1+\lambda_0 Z_i$ by their upper bound $1+|\lambda_0|M^*$ (valid for every $i$, since $|Z_i|\le M^*$), we get a lower bound on the fraction:
$$ \Big|\theta_0 - \frac{1}{n}\sum_i Z_i\Big| \ge \frac{|\lambda_0|}{1+|\lambda_0|M^*}\,\frac{1}{n}\sum_i Z_i^2 \ge 0. $$
Since $\theta_0 - \frac1n\sum_i Z_i = O_p(n^{-1/2})$ (CLT, Lemma 2), we see that
$$ \frac{|\lambda_0|}{1+|\lambda_0|M^*}\,\frac{1}{n}\sum_i Z_i^2 = O_p(n^{-1/2}), $$
and by Lemma 1, $\frac1n\sum_i Z_i^2$ converges in probability to a positive finite limit; thus we must have
$$ \frac{|\lambda_0|}{1+|\lambda_0|M^*} = O_p(n^{-1/2}). $$
By Lemma 3 above we must finally have $\lambda_0 = O_p(n^{-1/2})$. As a consequence, we also have $\lambda_0 M^* = o_p(1)$ and thus $\lambda_0 Z_i = o_p(1)$ uniformly for all $i$.

Problem: Using similar techniques, show that the empirical likelihood ratio under a sequence of local alternative hypotheses has a non-central chi-squared limiting distribution (similar to Owen 1988).

The Poisson likelihood we defined in the previous chapter has received some criticism, since we assumed a discrete hazard/distribution function but at the same time used a formula connecting the hazard and the CDF that is only valid in the continuous case. The discrepancy vanishes asymptotically, but for finite samples it is not an exact likelihood. The binomial likelihood we shall discuss here always sticks strictly to a discrete CDF/hazard function, and the likelihood is a true probability. However, the class of statistics/parameters we shall be testing has a strange integration format.

1 Censored Empirical Likelihood with ($k\ge 1$) Constraints, Binomial likelihood

We will first study the one sample case. The results extend straightforwardly to the two sample situation in the next section.

1.1 One Sample Censored Empirical Likelihood

For independent, identically distributed observations $X_1,\dots,X_n$, assume that the distribution of the $X_i$ is $F_{x0}(t)$ and the cumulative hazard function of the $X_i$ is $\Lambda_{x0}(t)$. With right censoring, we only observe
$$ T_i = \min(X_i,C_i) \quad\text{and}\quad \delta_i = I[X_i\le C_i], \qquad (1) $$
where the $C_i$ are the censoring times, assumed to be independent, identically distributed, and independent of the $X_i$. Based on the censored observations, the log empirical likelihood pertaining to a distribution $F_x$ is
$$ \log EL(F_x) = \sum_i\big[\delta_i\log\Delta F_x(T_i) + (1-\delta_i)\log\{1-F_x(T_i)\}\big]. \qquad (2) $$
As shown in Pan and Zhou (2002), computations are much easier with the empirical likelihood reformulated in terms of the corresponding (cumulative) hazard function. However, the formulas relating the CDF and the cumulative hazard function differ between the discrete and the continuous cases. Since the maximization of the EL will force the distribution to be discrete (for example, the empirical distribution or the Kaplan-Meier estimator), we shall use the discrete formula relating $F$ to $\Lambda$. The equivalent hazard formulation of (2) will be denoted by $\log EL(\Lambda_x)$. Using the relations
$$ \Delta\Lambda(t) = \frac{\Delta F(t)}{1-F(t-)} \quad\text{and}\quad 1-F(t) = \prod_{s\le t}[1-\Delta\Lambda(s)], $$

we can rewrite the empirical likelihood (proof left as homework) as follows. Denoting the jumps $\Delta\Lambda(t_i) = v_i$, the EL is given by
$$ \log EL(\Lambda_x) = \sum_i\big\{d_i\log v_i + (R_i-d_i)\log(1-v_i)\big\}, \qquad (3) $$
where $d_i = \sum_{j=1}^n I[T_j=t_i]\delta_j$, $R_i = \sum_{j=1}^n I[T_j\ge t_i]$, and the $t_i$ are the ordered, distinct values of the $T_j$. This EL is called the binomial version of the hazard empirical likelihood. See, for example, Thomas and Grunkemeier (1975) and Li (1995) for similar notation. Here, $0<v_i\le 1$ are the discrete hazards at $t_i$. The maximization of (3) with respect to the $v_i$ is known to be attained at the jumps of the Nelson-Aalen estimator: $v_i = d_i/R_i$. We further denote the maximum value achieved by the EL as $EL(\hat\Lambda_{NA})$. Notice the similarity of this likelihood to the likelihood of a binomial sample, hence the name.

Let us consider a hypothesis testing problem for a $k$ dimensional parameter $\theta = (\theta_1,\dots,\theta_k)^T$ with
$$ \theta_r = \int g_r(t)\log(1-d\Lambda_x(t)), $$
where the $g_r(t)$ are given nonnegative functions. See Remark 1 after the theorem for the strange looking integration. We test
$$ H_0:\ \theta = \mu \quad\text{vs.}\quad H_A:\ \theta\ne\mu, $$
where $\mu = (\mu_1,\dots,\mu_k)^T$ is a vector of $k$ constants. The constraints we shall impose on the discrete hazards $v_i$ are: for given functions $g_1(\cdot),\dots,g_k(\cdot)$ and constants $\mu_1,\dots,\mu_k$,
$$ \sum_{i=1}^{N-1} g_1(t_i)\log(1-v_i) = \mu_1,\quad\dots,\quad \sum_{i=1}^{N-1} g_k(t_i)\log(1-v_i) = \mu_k, \qquad (4) $$
where $N$ is the total number of distinct observation values. We need to exclude the last value, as we always have $v_N = 1$ for discrete hazards.

Let us abbreviate the maximum likelihood estimators of $\Delta\Lambda_x(t_i)$ under the constraints (4) as $v_i^*$. Application of the Lagrange multiplier method shows
$$ v_i^* = v_i(\lambda) = \frac{d_i}{R_i + n\lambda^T G(t_i)}, $$
where $G(t_i) = \{g_1(t_i),\dots,g_k(t_i)\}^T$ and $\lambda$ is the solution to (4) when we replace $v_i$ by $v_i(\lambda)$ (Lemma 1 in the appendix). Then the likelihood ratio test statistic in terms of hazards is given by
$$ W_2 = -2\big\{\log\max EL(\Lambda_x)(\text{with constraint (4)}) - \log EL(\hat\Lambda_{NA})\big\}. $$
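The "proof left as homework" step above, rewriting (2) as (3), can be filled in as follows; this short derivation is my own addition, in the notation above, writing $c_i$ for the number of censored observations at $t_i$ (so that $R_i = d_i + c_i + \#\{T_j>t_i\}$) and grouping the $n$ terms of (2) by the distinct values $t_i$:
\begin{align*}
\log EL(F_x)
 &= \sum_i\big[\,d_i\log\Delta F_x(t_i) + c_i\log\{1-F_x(t_i)\}\,\big] \\
 &= \sum_i\Big[\,d_i\Big(\log v_i + \sum_{j<i}\log(1-v_j)\Big) + c_i\sum_{j\le i}\log(1-v_j)\Big]
   \quad\text{since }\Delta F_x(t_i)=v_i\prod_{j<i}(1-v_j),\ 1-F_x(t_i)=\prod_{j\le i}(1-v_j) \\
 &= \sum_j d_j\log v_j + \sum_j\Big(\sum_{i>j}d_i + \sum_{i\ge j}c_i\Big)\log(1-v_j)
   \quad\text{collecting the coefficient of }\log(1-v_j) \\
 &= \sum_j\big\{d_j\log v_j + (R_j-d_j)\log(1-v_j)\big\},
\end{align*}
because $\sum_{i>j}d_i + \sum_{i\ge j}c_i = \#\{\text{uncensored }T_l>t_j\} + \#\{\text{censored }T_l\ge t_j\} = R_j - d_j$. This is exactly (3).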

We have the following result, a version of the Wilks theorem for $W_2$, under some regularity conditions which include the standard conditions on censoring that allow the Nelson-Aalen estimator to have an asymptotically normal distribution (see, e.g., Gill, 1983; Andersen et al., 1993). The proof of the following theorem, along with a detailed set of conditions, is provided in the appendix.

Theorem 1. Suppose that the null hypothesis $H_0$ holds, i.e.
$$ \mu_r = \int g_r(t)\log\{1-d\Lambda_x(t)\}, \qquad r = 1,\dots,k. $$
Then, under the conditions specified in the appendix, the test statistic $W_2$ has asymptotically a chi-squared distribution with $k$ degrees of freedom.

Remark 1: The integration constraints are originally given as
$$ \theta_r = \int g_r(t)\,d\log\{1-F_x(t)\}, \qquad r = 1,\dots,k $$
(but this is not in terms of the hazard). The formulation above is obtained by using the identity
$$ d\log\{1-F(t)\} = \log\{1-d\Lambda(t)\}, $$
which holds for both continuous and discrete $F(t)$. Again, the point $t$ where $F(t) = 1$ has to be excluded from the integration.

Remark 2: If the functions $g_r(t)$ are random but predictable with respect to the filtration $\mathcal{F}_t = \sigma\{T_i I[T_i\le t];\,\delta_i I[T_i\le t];\,i = 1,\dots,n\}$, then Theorem 1 is still valid (see the appendix for details).

Remark 3: One of the conditions for Theorem 1 is that the matrix $\Sigma$ defined in Lemma 2 of the appendix is invertible. If $\Sigma$ is not invertible, then the $k$ constraints may contain redundancy, in which case we may handle it by using the theory of over-determined EL.
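A numerical sketch of the one-sample binomial version may help fix ideas; it is my own illustration (simulated data, $k = 1$, helper names of my choosing), mirroring the construction above: compute $(d_i,R_i)$ at the distinct ordered values, form the constrained hazards of Lemma 1 in the appendix, solve constraint (4) for $\lambda$, and evaluate $W_2$.

\begin{verbatim}
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(2)
n = 300
X = rng.exponential(scale=2.0, size=n)      # here log{1 - F_x0(t)} = -t/2
C = rng.exponential(scale=3.0, size=n)
T = np.minimum(X, C)
delta = (X <= C).astype(float)

t = np.unique(T)                                     # ordered distinct values t_1 < ... < t_N
d = np.array([delta[T == ti].sum() for ti in t])     # d_i: uncensored deaths at t_i
R = np.array([(T >= ti).sum() for ti in t])          # R_i: number at risk at t_i
g = (t <= 1.5).astype(float)                         # one constraint function, k = 1
mu = -0.75                                           # = int g(t) d log{1 - F_x0(t)} under H_0

def v(lam):
    # constrained discrete hazards v_i = d_i / (R_i + n*lam*g(t_i));
    # the last hazard is kept at its NPMLE value and excluded from the constraint
    vv = d / (R + n * lam * g)
    vv[-1] = d[-1] / R[-1]
    return vv

def constraint(lam):
    vl = v(lam)[:-1]
    return np.sum(g[:-1] * np.log(1.0 - vl)) - mu    # constraint (4) with k = 1

def log_el(vv):
    # binomial hazard log EL (3); terms with d_i = 0 or R_i = d_i contribute zero
    a = np.where(d > 0, d * np.log(np.where(vv > 0, vv, 1.0)), 0.0)
    b = np.where(R - d > 0, (R - d) * np.log(np.where(vv < 1, 1.0 - vv, 1.0)), 0.0)
    return np.sum(a + b)

lo = -np.min((R[:-1] - d[:-1])[g[:-1] > 0]) / n * 0.999   # keeps every v_i(lam) below one
lam_hat = brentq(constraint, lo, 50.0)
W2 = -2.0 * (log_el(v(lam_hat)) - log_el(d / R))
print("lambda =", lam_hat, "   W2 =", W2)                 # approx chi-square(1) under H_0
\end{verbatim}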

1.2 Two Sample Censored Empirical Likelihood

Suppose that in addition to the censored sample of $X$-observations, we have a second sample $Y_1,\dots,Y_m$ coming from a distribution function $F_{y0}(t)$ with cumulative hazard function $\Lambda_{y0}(t)$. Assume that the $Y_j$ are independent of the $X_i$. With censoring, we can only observe
$$ U_j = \min(Y_j,S_j) \quad\text{and}\quad \tau_j = I[Y_j\le S_j], \qquad (5) $$
where the $S_j$ are the censoring variables for the second sample. Denote the ordered, distinct values of the $U_j$ by $s_j$. Similar to (3), the log empirical likelihood function based on the two censored samples, pertaining to the cumulative hazard functions $\Lambda_x$ and $\Lambda_y$, is simply $\log EL(\Lambda_x,\Lambda_y) = L_1 + L_2$, where
$$ L_1 = \sum_i\big\{d_{1i}\log v_i + (R_{1i}-d_{1i})\log(1-v_i)\big\} \quad\text{and}\quad
   L_2 = \sum_j\big\{d_{2j}\log w_j + (R_{2j}-d_{2j})\log(1-w_j)\big\}, \qquad (6) $$
with $d_{1i}$, $R_{1i}$, $d_{2j}$ and $R_{2j}$ defined analogously to the one sample situation above.

Accordingly, let us consider a hypothesis testing problem for a $k$ dimensional parameter $\theta = (\theta_1,\dots,\theta_k)^T$ with respect to the cumulative hazard functions $\Lambda_x$ and $\Lambda_y$:
$$ H_0:\ \theta = \mu \quad\text{vs.}\quad H_A:\ \theta\ne\mu, $$
where
$$ \theta_r = \int g_{1r}(t)\log\{1-d\Lambda_x(t)\} - \int g_{2r}(t)\log\{1-d\Lambda_y(t)\}, \qquad r = 1,\dots,k, $$
for some predictable functions $g_{1r}(t)$ and $g_{2r}(t)$. Then the constraints imposed on the $v_i$ and $w_j$ are
$$ \mu_r = \sum_{i=1}^{N-1} g_{1r}(t_i)\log(1-v_i) - \sum_{j=1}^{M-1} g_{2r}(s_j)\log(1-w_j), \qquad r = 1,\dots,k, \qquad (7) $$
where $N$ and $M$ are the total numbers of distinct observation values in the two samples. As in the one sample case, we need to exclude the last value in each sample.

Let us abbreviate the maximum likelihood estimators of $\Delta\Lambda_x(t_i)$ and $\Delta\Lambda_y(s_j)$ under the constraints (7) as $v_i^*$ and $w_j^*$, respectively, where $i = 1,\dots,N$ and $j = 1,\dots,M$. Application of the Lagrange multiplier method shows
$$ v_i^* = v_i(\lambda) = \frac{d_{1i}}{R_{1i} + \min(n,m)\,\lambda^T G_1(t_i)}, \qquad
   w_j^* = w_j(\lambda) = \frac{d_{2j}}{R_{2j} - \min(n,m)\,\lambda^T G_2(s_j)}, $$
where $G_1(t_i) = \{g_{11}(t_i),\dots,g_{1k}(t_i)\}^T$, $G_2(s_j) = \{g_{21}(s_j),\dots,g_{2k}(s_j)\}^T$, and $\lambda$ is the solution to (7) when we plug in the $v_i^*$ and $w_j^*$. Then the two-sample test statistic is given by
$$ W_2 = -2\big\{\log\max EL(\Lambda_x,\Lambda_y)(\text{with constraint (7)}) - \log EL(\hat\Lambda_x^{NA},\hat\Lambda_y^{NA})\big\}, $$
analogous to the one-sample case. The following theorem provides the asymptotic distribution of $W_2$. The proof can be found in the appendix.

Theorem 2. Suppose that the null hypothesis $H_0:\theta_r = \mu_r$ holds, i.e.
$$ \mu_r = \int g_{1r}(t)\log\{1-d\Lambda_x(t)\} - \int g_{2r}(t)\log\{1-d\Lambda_y(t)\}, \qquad r = 1,\dots,k. $$
Then, as $\min(n,m)\to\infty$ and $n/m\to c\in(0,\infty)$, $W_2$ has asymptotically a chi-squared distribution with $k$ degrees of freedom.

Remark: The two-sample setup we studied in this section took a particularly simple form: the difference of two parameters. For more involved parameters, we may not be able to write the parameter as a simple difference; for example, a two sample U-statistic:
$$ \theta = \int\!\!\int g(s,t)\,d\Lambda_x(s)\,d\Lambda_y(t). $$
For the analysis of those, please see Barton (2010) and the R package emplik2.
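For completeness, here is a small Python sketch (my own addition, not the emplik2 implementation) of the plug-in point estimate of such a two-sample hazard functional, obtained by replacing $\Lambda_x$ and $\Lambda_y$ with their Nelson-Aalen estimators; it gives an estimate of $\theta$ only, not the empirical likelihood test.

\begin{verbatim}
import numpy as np

def na_jumps(time, status):
    """Nelson-Aalen jump sizes delta/R at the sorted observation times (assumes no ties)."""
    order = np.argsort(time)
    t, s = time[order], status[order]
    R = len(t) - np.arange(len(t))
    return t, s / R

rng = np.random.default_rng(3)
X, Cx = rng.exponential(2.0, 200), rng.exponential(3.0, 200)   # simulated first sample
Y, Cy = rng.exponential(1.5, 150), rng.exponential(3.0, 150)   # simulated second sample
Tx, dx = np.minimum(X, Cx), (X <= Cx).astype(float)
Ty, dy = np.minimum(Y, Cy), (Y <= Cy).astype(float)

tx, dLx = na_jumps(Tx, dx)
ty, dLy = na_jumps(Ty, dy)

# plug-in estimate of theta = int int g(s,t) dLambda_x(s) dLambda_y(t), with g(s,t) = I[s <= t]
g = (tx[:, None] <= ty[None, :]).astype(float)
theta_hat = float(dLx @ g @ dLy)
print("plug-in theta_hat =", theta_hat)
\end{verbatim}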

A Appendix

Assumptions for Theorem 1

Let $X_1,\dots,X_n$ be independent, identically distributed random variables with cumulative distribution function $F_{x0}(t)$ and cumulative hazard function $\Lambda_{x0}(t)$. We observe $T_i = \min(X_i,C_i)$ and $\delta_i = I[X_i\le C_i]$, where the $C_i$ are independent, identically distributed censoring times, independent of the $X_i$. The cumulative distribution function of the $C_i$ is $F_c(t)$. The distribution functions $F_{x0}(t)$ and $F_c(t)$ do not have common discontinuities.

Let $g_1(t),\dots,g_k(t)$ be non-negative left continuous functions with
$$ 0 < \int\frac{g_r^2(t)\,(1-\Delta\Lambda_{x0}(t))}{(1-F_{x0}(t))(1-F_c(t))}\,d\Lambda_{x0}(t) < \infty, \qquad r = 1,\dots,k. \qquad (9) $$
This condition guarantees asymptotic normality of the Nelson-Aalen estimator (cf. Theorem 2.1 of Gill, 1983). Note that the factor $(1-\Delta\Lambda_{x0}(t))$ is only needed for discrete distributions; it equals 1 when $F_{x0}$ is absolutely continuous. Also, under the above condition, $\mu_r = \int g_r(t)\log(1-d\Lambda_{x0}(t))$ is feasible with probability approaching 1 as $n\to\infty$.

Note that the functions $g_r(t)$ may be random, but they have to be predictable with respect to the filtration $\mathcal{F}_t = \sigma\{T_i I[T_i\le t];\,\delta_i I[T_i\le t];\,i = 1,\dots,n\}$, which makes $\hat\Lambda_{NA}(t)-\Lambda_{x0}(t)$ a martingale, so that the martingale central limit theorem can be applied. Here $\hat\Lambda_{NA}(t)$ denotes the Nelson-Aalen estimator of the hazard function. Furthermore, if the functions $g_r(t)$ are random, we require that there are non-random left continuous functions $g_{r0}(t)$ such that
$$ \sup_{t\le\max_i T_i}|g_r(t)-g_{r0}(t)| = o_p(1) \quad\text{and}\quad
   \sup_{t\le\max_i T_i}\Big|\frac{g_r(t)}{g_{r0}(t)}\Big| = O_p(1) \qquad\text{for } r = 1,\dots,k \text{ as } n\to\infty. $$

Remark: The above assumption (9) is quite weak. It is apparently of the same strength as the condition Akritas (2000) put on the mean function. Akritas was considering the CLT for $\int\phi(t)\,d\hat F(t)$ and requires $\int\phi^2(t)\,dF(t)/[1-G(t-)] < \infty$. If we put $g(t) = [1-F(t)]\phi(t)$, then our condition (9) above is the same as Akritas' condition. Why is $g(t) = [1-F(t)]\phi(t)$ the connection? See Akritas and some more discussion in the tech report.
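To spell out why $g(t) = [1-F(t)]\phi(t)$ is the connection, here is a short verification (my own addition) for the absolutely continuous case, writing $G = F_c$ for the censoring distribution so that the factor $1-\Delta\Lambda_{x0}(t)$ in (9) equals one. Substituting $g_r(t) = [1-F_{x0}(t)]\phi(t)$ into (9) gives
\begin{align*}
\int\frac{g_r^2(t)}{(1-F_{x0}(t))(1-F_c(t))}\,d\Lambda_{x0}(t)
 &= \int\frac{[1-F_{x0}(t)]^2\phi^2(t)}{(1-F_{x0}(t))(1-F_c(t))}\,d\Lambda_{x0}(t) \\
 &= \int\frac{\phi^2(t)\,[1-F_{x0}(t)]}{1-F_c(t)}\,d\Lambda_{x0}(t)
  = \int\frac{\phi^2(t)}{1-F_c(t)}\,dF_{x0}(t),
\end{align*}
using $dF_{x0}(t) = [1-F_{x0}(t-)]\,d\Lambda_{x0}(t)$ and $F_{x0}(t-) = F_{x0}(t)$ in the continuous case. Up to $t$ versus $t-$ (which coincide here), the last expression is exactly Akritas' condition.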

Mathematical Derivations and Proofs for Theorem 1

Recall the column vectors $G(t) = \{g_1(t),\dots,g_k(t)\}^T$ and $\lambda = \{\lambda_1,\dots,\lambda_k\}^T$.

Lemma 1. The hazards that maximize the log likelihood function (3) under the constraints (4) are given by
$$ v_i(\lambda) = \frac{d_i}{R_i + n\lambda^T G(t_i)}, \qquad (10) $$
where $\lambda$ is obtained as the solution of the following $k$ equations:
$$ \sum_{i=1}^{N-1} g_1(t_i)\log\{1-v_i(\lambda)\} = \mu_1,\quad\dots,\quad
   \sum_{i=1}^{N-1} g_k(t_i)\log\{1-v_i(\lambda)\} = \mu_k. \qquad (11) $$
Proof of Lemma 1. The result follows from a standard Lagrange multiplier argument applied to (3) and (4). See Fang and Zhou (2000) for some similar calculations. We denote the solution of (11) by $\lambda_x$.

Lemma 2. Assume the data are such that the Nelson-Aalen estimator is asymptotically normal and the variance-covariance matrix $\Sigma$ defined below is invertible. Then, for the solution $\lambda_x$ of the constrained problem (11), corresponding to the null hypothesis
$$ H_0:\ \mu_r = \int g_r(t)\log\{1-d\Lambda_{x0}(t)\}, \qquad r = 1,\dots,k, $$
we have that $n^{1/2}\lambda_x$ converges in distribution to $N(0,\Sigma)$.

Preparation for the proofs of Lemma 2 and Theorem 1. Let
$$ f(\lambda) = \sum_i\big[d_i\log v_i(\lambda) + (R_i-d_i)\log\{1-v_i(\lambda)\}\big]. \qquad (12) $$
In order to show that $f'(0) = 0$, we compute
$$ \frac{\partial f(\lambda)}{\partial\lambda_r}
   = \sum_i\left[\frac{d_i}{v_i(\lambda)}\frac{\partial v_i(\lambda)}{\partial\lambda_r}
   - \frac{R_i-d_i}{1-v_i(\lambda)}\frac{\partial v_i(\lambda)}{\partial\lambda_r}\right], \qquad r = 1,\dots,k. $$
Letting $\lambda = 0$, and after some simplification, we have
$$ \frac{\partial f(\lambda)}{\partial\lambda_r}\Big|_{\lambda=0}
   = \sum_i (R_i - R_i)\,\frac{n\,d_i\,g_r(t_i)}{R_i^2} \equiv 0. $$
We now compute $f''(0) = -D$. The $rl$-th element of the $k\times k$ matrix $D$ is
$$ D_{rl} = -\frac{\partial^2 f(\lambda)}{\partial\lambda_r\partial\lambda_l}\Big|_{\lambda=0}. $$
After straightforward but tedious calculations, we obtain
$$ D_{rl} = \sum_i\frac{n^2\,g_r(t_i)\,g_l(t_i)\,d_i}{R_i(R_i-d_i)}. $$

By a now standard counting process martingale argument, we see that $D_{rl}/n$ converges almost surely to $D^*_{rl}$.

Proof of Lemma 2. We derive the asymptotic distribution of $\lambda$. The argument is similar to, for example, Owen (1990) and Pan and Zhou (2002). Define a vector function $h(s) = \{h_1(s),\dots,h_k(s)\}^T$ by
$$ h_1(s) = \sum_i g_1(t_i)\log\{1-v_i(s)\} - \mu_1,\quad\dots,\quad
   h_k(s) = \sum_i g_k(t_i)\log\{1-v_i(s)\} - \mu_k. \qquad (13) $$
Then $\lambda$ is the solution of $h(s) = 0$. Thus we have
$$ 0 = h(\lambda) = h(0) + h'(0)\lambda + o_p(n^{-1/2}), \qquad (14) $$
where $h'(0)$ is a $k\times k$ matrix. Indeed, if we write $\lambda = \rho\tilde\lambda$, where $\|\tilde\lambda\| = 1$, then
$$ 0 = \tilde\lambda^T h(\lambda)
   = \sum_i\tilde\lambda^T G(t_i)\log\{1-v_i(\lambda)\} - \tilde\lambda^T\mu
   = \sum_i\tilde\lambda^T G(t_i)\log\Big(1-\frac{d_i}{R_i}\Big) - \tilde\lambda^T\mu
     + \sum_i\tilde\lambda^T G(t_i)\log\left[\frac{1-d_i/\{R_i+n\lambda^T G(t_i)\}}{1-d_i/R_i}\right]
   = A + B, $$
where the first expression $A$ is of order $O_p(n^{-1/2})$.

Considering the second expression $B$, and noting that for any pair of numbers $\varepsilon_1,\varepsilon_2\in(0,1]$ the inequality $|\varepsilon_1-\varepsilon_2|\le|\log(\varepsilon_1)-\log(\varepsilon_2)|$ holds, we have
$$ |B| = \left|\sum_i\tilde\lambda^T G(t_i)\log\left[\frac{1-d_i/\{R_i+n\rho\tilde\lambda^T G(t_i)\}}{1-d_i/R_i}\right]\right|
   \ge \sum_i\frac{n\rho\,(\tilde\lambda^T G(t_i))^2\,d_i}{R_i\big(R_i+n\rho\tilde\lambda^T G(t_i)\big)}
   \ge \frac{\rho}{1+\rho\max_i n|\tilde\lambda^T G(t_i)|/R_i}\,\sum_i\frac{n(\tilde\lambda^T G(t_i))^2 d_i}{R_i^2}. $$
The sum in the last expression is of order $O_p(1)$, and under assumption (9) the maximum in the denominator is of order $o_p(n^{1/2})$. Therefore $\rho$ is of order $O_p(n^{-1/2})$, and hence the expansion (14) is valid. Therefore,
$$ n^{1/2}\lambda = -\{h'(0)\}^{-1}\{n^{1/2}h(0)\} + o_p(1). $$

The elements of $h'(0)$ are easily computed:
$$ h'_{rl} = \sum_i\frac{n\,g_r(t_i)\,g_l(t_i)\,d_i}{R_i(R_i-d_i)}. $$
Notice that we have verified $h'_{rl} = D_{rl}/n$. By the counting process martingale central limit theorem (see, for example, Gill, 1980; Andersen et al., 1993; or Fang and Zhou, 2000), we can show that $n^{1/2}h(0)$ converges in distribution to $N(0,\Sigma_h)$ with $\Sigma_h = \lim h'(0)$. Finally, putting it together, we have that $n^{1/2}\lambda = -\{h'(0)\}^{-1}\{n^{1/2}h(0)\} + o_p(1)$ converges in distribution to $N(0,\Sigma)$ with $\Sigma = \lim\{h'(0)\}^{-1}$. Recalling $h'_{rl} = D_{rl}/n$, we see that $\Sigma^{-1} = D^*$.

Proof of Theorem 1. Let $f(\lambda)$ be defined as in (12). Then we have $W_2 = -2\{f(\lambda_x)-f(0)\}$. By Taylor expansion, we obtain
$$ W_2 = -2\Big\{f(0)-f(0)+f'(0)^T\lambda_x-\tfrac12\lambda_x^T D\lambda_x\Big\} + o_p(1), \qquad (15) $$
where $-D$ denotes the matrix of second derivatives of $f(\cdot)$ with respect to $\lambda$. The expansion is valid in view of Lemma 2 ($\lambda_x$ is close to zero). Since $f'(0) = 0$ (see above), the expression reduces to
$$ W_2 = \lambda_x^T D\lambda_x + o_p(1). \qquad (16) $$
Notice that $D$ is symmetric and positive definite for $n$ large enough, because $D/n$ converges to a positive definite matrix (see below). Therefore we may write
$$ W_2 = \lambda_x^T D^{1/2}D^{1/2}\lambda_x + o_p(1). \qquad (17) $$
Recalling the distributional result for $\lambda_x$ in Lemma 2, and noticing that $D/n$ converges almost surely to $D^*$ with $D^* = \Sigma^{-1}$ (see the proof of Lemma 2 above), it is not hard to show that $n^{1/2}\lambda_x^T(D/n)^{1/2}$ converges in distribution to $N(0,I)$. This together with (16) implies that $W_2$ converges in distribution to $\chi^2_k$.

About feasibility: Clearly, when $\lambda = 0$ all the $v_i$ are between 0 and 1, or equivalently, $\mu^{NPMLE}$ is feasible. For the constraint imposed by a true $H_0$, as shown above we have the order $\lambda_x = O_p(n^{-1/2})$. This implies $\lambda_x^T G = O_p(n^{-1/2})$. Notice $R(t) = O_p(n)$, so as $n\to\infty$ we always have $0 < d_i/(R_i+n\lambda_x^T G(t_i)) < 1$; that is, a true null hypothesis is feasible.

Assumptions for Theorem 2

Let $X_1,\dots,X_n$ be independent, identically distributed random variables with cumulative distribution function $F_{x0}(t)$ and cumulative hazard function $\Lambda_{x0}(t)$. We observe $T_i = \min(X_i,C_i)$ and $\delta_i = I[X_i\le C_i]$, where the $C_i$ are independent, identically distributed censoring times, independent of the $X_i$. The cumulative distribution function of the $C_i$ is $F_c(t)$. The distribution functions $F_{x0}(t)$ and $F_c(t)$ do not have common discontinuities.

Further, let $Y_1,\dots,Y_m$ be independent, identically distributed random variables with cumulative distribution function $F_{y0}(t)$ and cumulative hazard function $\Lambda_{y0}(t)$. We observe $U_j = \min(Y_j,S_j)$ and $\tau_j = I[Y_j\le S_j]$, where the $S_j$ are independent, identically distributed censoring times, independent of the $Y_j$. The cumulative distribution function of the $S_j$ is $F_s(t)$. The distribution functions $F_{y0}(t)$ and $F_s(t)$ do not have common discontinuities. The $(Y_j,S_j)$ are independent of the $(X_i,C_i)$.

Let $g_{1r}(t)$ and $g_{2r}(t)$, $r = 1,\dots,k$, be non-negative left continuous functions with
$$ 0 < \int\frac{g_{1r}^2(t)\,(1-\Delta\Lambda_{x0}(t))}{(1-F_{x0}(t))(1-F_c(t))}\,d\Lambda_{x0}(t) < \infty, \qquad r = 1,\dots,k, $$
and
$$ 0 < \int\frac{g_{2r}^2(t)\,(1-\Delta\Lambda_{y0}(t))}{(1-F_{y0}(t))(1-F_s(t))}\,d\Lambda_{y0}(t) < \infty, \qquad r = 1,\dots,k. $$
The functions $g_{lr}(t)$, $l = 1,2$, $r = 1,\dots,k$, may be random, but they have to be predictable with respect to the filtration $\mathcal{F}_t = \sigma\{T_i I[T_i\le t];\,\delta_i I[T_i\le t];\,U_j I[U_j\le t];\,\tau_j I[U_j\le t];\,i = 1,\dots,n;\,j = 1,\dots,m\}$. Furthermore, if the functions $g_{lr}(t)$ are random, we require that there are non-random left continuous functions $g_{lr0}(t)$ such that
$$ \sup_{t\le V}|g_{lr}(t)-g_{lr0}(t)| = o_p(1) \quad\text{and}\quad
   \sup_{t\le V}\Big|\frac{g_{lr}(t)}{g_{lr0}(t)}\Big| = O_p(1) $$
for $r = 1,\dots,k$ as $\min(m,n)\to\infty$. Here $V = \min(\max_i T_i,\,\max_j U_j)$.

Mathematical Derivations and Proofs for Theorem 2

The proof of Theorem 2 is very similar to the one for the one-sample situation. In the two-sample case, the constraints are defined by
$$ \mu_r = \sum_{i=1}^{N-1} g_{1r}(t_i)\log(1-v_i) - \sum_{j=1}^{M-1} g_{2r}(s_j)\log(1-w_j), \qquad r = 1,\dots,k. $$
Define $G_1(t_i) = \{g_{11}(t_i),\dots,g_{1k}(t_i)\}^T$ and $G_2(s_j) = \{g_{21}(s_j),\dots,g_{2k}(s_j)\}^T$. The vector $\lambda_{xy}$ is the solution to maximizing $\log EL(\Lambda_x,\Lambda_y) = L_1 + L_2$ under the above constraints. Similar to Lemma 1, application of the Lagrange multiplier method yields the maximum likelihood estimators
$$ v_i(\lambda_{xy}) = \frac{d_{1i}}{R_{1i} + \min(n,m)\,\lambda_{xy}^T G_1(t_i)} \quad\text{and}\quad
   w_j(\lambda_{xy}) = \frac{d_{2j}}{R_{2j} - \min(n,m)\,\lambda_{xy}^T G_2(s_j)}. $$

In the two-sample situation, the function $f(\lambda)$ defined in (12) becomes
$$ f(\lambda) = \sum_i\big[d_{1i}\log v_i(\lambda) + (R_{1i}-d_{1i})\log\{1-v_i(\lambda)\}\big]
             + \sum_j\big[d_{2j}\log w_j(\lambda) + (R_{2j}-d_{2j})\log\{1-w_j(\lambda)\}\big]. $$
The same calculation as in the one-sample case above yields $f'(0) = 0$ and $f''(0) = -D$, where the $rl$-th element of the $k\times k$ matrix $D$ is
$$ D_{rl} = \sum_i\frac{\min(n,m)^2\,g_{1r}\,g_{1l}\,d_{1i}}{R_{1i}(R_{1i}-d_{1i})}
          + \sum_j\frac{\min(n,m)^2\,g_{2r}\,g_{2l}\,d_{2j}}{R_{2j}(R_{2j}-d_{2j})}. $$
Since we assume that $n/m\to c\in(0,\infty)$ as $\min(m,n)\to\infty$, we have again that $D_{rl}/\min(n,m)$ converges almost surely to $D^*_{rl}$.

In order to show the asymptotic normality of $\min(n,m)^{1/2}\lambda_{xy}$, we proceed analogously to the proof of Lemma 2. Define $h(u) = \{h_1(u),\dots,h_k(u)\}^T$, where
$$ h_r(u) = \sum_i g_{1r}(t_i)\log\{1-v_i(u)\} - \sum_j g_{2r}(s_j)\log\{1-w_j(u)\} - \mu_r, \qquad r = 1,\dots,k, $$
let $\lambda_{xy} = \rho\tilde\lambda$, where $\|\tilde\lambda\| = 1$, and notice that $0 = \tilde\lambda^T h(\lambda_{xy}) = A + B$, where
$$ A = \sum_i\tilde\lambda^T G_1(t_i)\log\Big\{1-\frac{d_{1i}}{R_{1i}}\Big\}
     - \sum_j\tilde\lambda^T G_2(s_j)\log\Big\{1-\frac{d_{2j}}{R_{2j}}\Big\} - \tilde\lambda^T\mu = O_p(n^{-1/2}) $$
and
$$ B = \sum_i\tilde\lambda^T G_1(t_i)\log\left[\frac{1-d_{1i}/\{R_{1i}+\min(m,n)\rho\tilde\lambda^T G_1(t_i)\}}{1-d_{1i}/R_{1i}}\right]
     - \sum_j\tilde\lambda^T G_2(s_j)\log\left[\frac{1-d_{2j}/\{R_{2j}-\min(m,n)\rho\tilde\lambda^T G_2(s_j)\}}{1-d_{2j}/R_{2j}}\right]. $$
A calculation similar to the one in the proof of Lemma 2 yields
$$ |B| \ge \frac{\rho}{1+\rho\max\big(\max_i\min(m,n)|\tilde\lambda^T G_1(t_i)|/R_{1i},\ \max_j\min(m,n)|\tilde\lambda^T G_2(s_j)|/R_{2j}\big)}
   \left(\sum_i\frac{\min(m,n)(\tilde\lambda^T G_1(t_i))^2 d_{1i}}{R_{1i}^2}
        + \sum_j\frac{\min(m,n)(\tilde\lambda^T G_2(s_j))^2 d_{2j}}{R_{2j}^2}\right). $$
Again, the sum in the last expression is of order $O_p(1)$, and $\rho$ is therefore of order $O_p(n^{-1/2})$. Thus the expansion
$$ 0 = h(\lambda_{xy}) = h(0) + h'(0)\lambda_{xy} + o_p(n^{-1/2}) $$
is valid, where $h'(0)$ is a $k\times k$ matrix.

Application of the counting process martingale central limit theorem shows that $\min(n,m)^{1/2}\lambda_{xy}$ converges in distribution to $N(0,\Sigma)$ with $\Sigma = \lim\{h'(0)\}^{-1}$. The final step in the proof of Theorem 2 is a Taylor expansion of $W_2 = -2\big(f(\lambda_{xy})-f(0)\big)$,
$$ W_2 = -2\Big(f'(0)^T\lambda_{xy} - \tfrac12\lambda_{xy}^T D\lambda_{xy}\Big) + o_p(1)
       = \lambda_{xy}^T D^{1/2}D^{1/2}\lambda_{xy} + o_p(1), $$
and noticing that $\min(n,m)^{1/2}\lambda_{xy}^T\big(D/\min(n,m)\big)^{1/2}$ converges in distribution to $N(0,I)$.