
COXREG

Cox (1972) first suggested the models in which factors related to lifetime have a multiplicative effect on the hazard function. These models are called proportional hazards (PH) models. Under the proportional hazards assumption, the hazard function $h$ of $t$ given $X$ is of the form

$$h(t \mid x) = h_0(t)\, e^{x'\beta} \tag{1}$$

where $x$ is a known vector of regressor variables associated with the individual, $\beta$ is a vector of unknown parameters, and $h_0(t)$ is the baseline hazard function for an individual with $x = 0$. Hence, for any two covariate sets $x_1$ and $x_2$, the log hazard functions $h(t \mid x_1)$ and $h(t \mid x_2)$ should be parallel across time.

When a factor does not affect the hazard function multiplicatively, stratification may be useful in model building. Suppose that individuals can be assigned to one of $m$ different strata, defined by the levels of one or more factors. The hazard function for an individual in the $j$th stratum is defined as

$$h_j(t \mid x) = h_{0j}(t)\, e^{x'\beta} \tag{2}$$

Estimation

There are two unknown components in the model: the regression parameter $\beta$ and the baseline hazard function $h_{0j}(t)$. The estimation for the parameters is described below.

We begin by considering a nonnegative random variable $T$ representing the lifetimes of individuals in some population. Let $f(t \mid x)$ denote the probability density function (pdf) of $T$ given a regressor $x$ and let $S(t \mid x)$ be the survivor function (the probability of an individual surviving until time $t$). Hence

$$S(t \mid x) = \int_t^{\infty} f(u \mid x)\, du \tag{3}$$

The hazard $h(t \mid x)$ is then defined by

$$h(t \mid x) = \frac{f(t \mid x)}{S(t \mid x)} \tag{4}$$

Another useful expression for $S(t \mid x)$ in terms of $h(t \mid x)$, derived from equations (3) and (4), is

$$S(t \mid x) = \exp\left(-\int_0^t h(u \mid x)\, du\right) \tag{5}$$

Thus,

$$\ln S(t \mid x) = -\int_0^t h(u \mid x)\, du \tag{6}$$

For some purposes, it is also useful to define the cumulative hazard function

$$H(t \mid x) = \int_0^t h(u \mid x)\, du = -\ln S(t \mid x) \tag{7}$$

Assume that the hazard function has the form of equation (1). The survivor function can be written as

$$S(t \mid x) = \left[S_0(t)\right]^{\exp(x'\beta)} \tag{8}$$

where $S_0(t)$ is the baseline survivor function defined by

$$S_0(t) = \exp\left(-H_0(t)\right) \tag{9}$$

and

$$H_0(t) = \int_0^t h_0(u)\, du$$

Some relationships between $S(t \mid x)$, $H(t \mid x)$ and $H_0(t)$, $S_0(t)$ and $h_0(t)$ which will be used later are

$$\ln S(t \mid x) = -H(t \mid x) = -\exp(x'\beta)\, H_0(t) \tag{10}$$

$$\ln\left(-\ln S(t \mid x)\right) = x'\beta + \ln H_0(t) \tag{11}$$

To estimate the survivor function $S(t \mid x)$, we can see from equation (8) that there are two components, $\beta$ and $S_0(t)$, which need to be estimated. The approach we use here is to estimate $\beta$ from the partial likelihood function and then to maximize the full likelihood for $S_0(t)$.

Estimation of Beta

Assume that

- There are $m$ levels for the stratification variable.
- Individuals in the same stratum have proportional hazard functions.
- The relative effect of the regressor variables is the same in each stratum.

Let $t_{j1} < \cdots < t_{jk_j}$ be the observed uncensored failure times of the $n_j$ individuals in the $j$th stratum and $x_{j1}, \ldots, x_{jk_j}$ be the corresponding covariates. Then the partial likelihood function is defined by

$$L(\beta) = \prod_{j=1}^{m} \prod_{i=1}^{k_j} \frac{e^{S_{ji}'\beta}}{\left(\sum_{l \in R_{ji}} w_l\, e^{x_l'\beta}\right)^{d_{ji}}} \tag{12a}$$

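where $d_{ji}$ is the sum of case weights of individuals whose lifetime is equal to $t_{ji}$ and $S_{ji}$ is the weighted sum of the regression vector $x$ for those $d_{ji}$ individuals, $w_l$ is the case weight of individual $l$, and $R_{ji}$ is the set of individuals alive and uncensored just prior to $t_{ji}$ in the $j$th stratum. Thus the log-likelihood arising from equation (12a) is

$$l = \ln L(\beta) = \sum_{j=1}^{m} \sum_{i=1}^{k_j} S_{ji}'\beta - \sum_{j=1}^{m} \sum_{i=1}^{k_j} d_{ji} \ln\left(\sum_{l \in R_{ji}} w_l\, e^{x_l'\beta}\right) \tag{12b}$$

and the first derivatives of $l$ are

$$D_{\beta_r} = \frac{\partial l}{\partial \beta_r} = \sum_{j=1}^{m} \sum_{i=1}^{k_j} \left( S_{ji}^{(r)} - d_{ji}\, \frac{\sum_{l \in R_{ji}} w_l\, x_{lr}\, e^{x_l'\beta}}{\sum_{l \in R_{ji}} w_l\, e^{x_l'\beta}} \right), \quad r = 1, \ldots, p \tag{13}$$

In equation (13), $S_{ji}^{(r)}$ is the $r$th component of $S_{ji} = \left(S_{ji}^{(1)}, \ldots, S_{ji}^{(p)}\right)'$. The maximum partial likelihood estimate (MPLE) of $\beta$ is obtained by setting $\partial l / \partial \beta_r$ equal to zero for $r = 1, \ldots, p$, where $p$ is the number of independent variables in the model. The equations $\partial l / \partial \beta_r = 0$ $(r = 1, \ldots, p)$ can usually be solved by using the Newton-Raphson method.

Note that from equation (12a) the partial likelihood function $L(\beta)$ is invariant under translation. All the covariates are centered by their corresponding overall mean. The overall mean of a covariate is defined as the sum of the product of weight and covariate for all the censored and uncensored cases in each stratum. For notational simplicity, $x_l$ used in the Estimation section denotes centered covariates.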

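Three convergence criteria for the Newton-Raphson method are available:

- Absolute value of the largest difference in parameter estimates between iterations ($\delta$) divided by the value of the parameter estimate for the previous iteration; that is,

$$\text{BCON} = \left| \frac{\delta}{\text{parameter estimate for previous iteration}} \right|$$

- Absolute difference of the log-likelihood function between iterations divided by the log-likelihood function for the previous iteration.
- Maximum number of iterations.

The asymptotic covariance matrix for the MPLE $\hat\beta = \left(\hat\beta_1, \ldots, \hat\beta_p\right)'$ is estimated by $I^{-1}$, where $I$ is the information matrix containing minus the second partial derivatives of $\ln L$. The $(r, s)$-th element of $I$ is defined by

$$I_{rs} = -E\left[\frac{\partial^2 \ln L}{\partial \beta_r\, \partial \beta_s}\right] = \sum_{j=1}^{m} \sum_{i=1}^{k_j} d_{ji} \left[ \frac{\sum_{l \in R_{ji}} w_l\, x_{lr}\, x_{ls}\, e^{x_l'\beta}}{\sum_{l \in R_{ji}} w_l\, e^{x_l'\beta}} - \frac{\left(\sum_{l \in R_{ji}} w_l\, x_{lr}\, e^{x_l'\beta}\right)\left(\sum_{l \in R_{ji}} w_l\, x_{ls}\, e^{x_l'\beta}\right)}{\left(\sum_{l \in R_{ji}} w_l\, e^{x_l'\beta}\right)^2} \right] \tag{14}$$

We can also write $I$ in a matrix form as

$$I = \sum_{j=1}^{m} \sum_{i=1}^{k_j} d_{ji}\, x(t_{ji})'\, V(t_{ji})\, x(t_{ji})$$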

where $x(t_{ji})$ is an $n_{ji} \times p$ matrix which represents the $p$ covariate variables in the model evaluated at time $t_{ji}$, $n_{ji}$ is the number of distinct individuals in $R_{ji}$, and $V(t_{ji})$ is an $n_{ji} \times n_{ji}$ matrix with the $l$th diagonal element $v_{ll}(t_{ji})$ defined by

$$v_{ll}(t_{ji}) = p_l(t_{ji})\, w_l \left(1 - w_l\, p_l(t_{ji})\right), \qquad p_l(t_{ji}) = \frac{\exp\left(x_l'\hat\beta\right)}{\sum_{h \in R_{ji}} w_h \exp\left(x_h'\hat\beta\right)}$$

and the $(l, k)$ element $v_{lk}(t_{ji})$ defined by

$$v_{lk}(t_{ji}) = -w_l\, p_l(t_{ji})\, w_k\, p_k(t_{ji})$$

Estimation of the Baseline Function

After the MPLE $\hat\beta$ of $\beta$ is found, the baseline survivor function $S_{0j}(t)$ is estimated separately for each stratum. Assume that, for a stratum, $t_1 < \cdots < t_k$ are observed lifetimes in the sample. There are $n_i$ at risk and $d_i$ deaths at $t_i$, and in the interval $[t_{i-1}, t_i)$ there are $\lambda_i$ censored times. Since $S_0(t)$ is a survivor function, it is non-increasing and left continuous, and thus $\hat S_0(t)$ must be constant except for jumps at the observed lifetimes $t_1, \ldots, t_k$.

Further, it follows that

$$\hat S_0(t_1) = 1$$

and

$$\hat S_0(t_i+) = \hat S_0(t_{i+1})$$

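Writing $\hat S_0(t_i+) = p_i$ $(i = 1, \ldots, k)$, with $p_0 = 1$, the observed likelihood function is of the form

$$L_1 = \prod_{i=1}^{k} \left\{ \prod_{l \in D_i} \left( p_{i-1}^{\exp(x_l'\beta)} - p_i^{\exp(x_l'\beta)} \right)^{w_l} \prod_{l \in C_i} p_{i-1}^{\,w_l \exp(x_l'\beta)} \right\} \prod_{l \in C_{k+1}} p_k^{\,w_l \exp(x_l'\beta)}$$

where $D_i$ is the set of individuals dying at $t_i$ and $C_i$ is the set of individuals with censored times in $[t_{i-1}, t_i)$. (Note that if the last observation is uncensored, $C_{k+1}$ is empty and $p_k = 0$.)

If we let $\alpha_i = p_i / p_{i-1}$ $(i = 1, \ldots, k)$, $L_1$ can be written as

$$L_1 = \prod_{i=1}^{k} \prod_{l \in D_i} \left( 1 - \alpha_i^{\exp(x_l'\beta)} \right)^{w_l} \prod_{l \in R_i - D_i} \alpha_i^{\,w_l \exp(x_l'\beta)}$$

Differentiating $\ln L_1$ with respect to $\alpha_1, \ldots, \alpha_k$ and setting the equations equal to zero, we get

$$\sum_{l \in D_i} \frac{w_l \exp(x_l'\beta)}{1 - \alpha_i^{\exp(x_l'\beta)}} = \sum_{l \in R_i} w_l \exp(x_l'\beta), \quad i = 1, \ldots, k \tag{15}$$

We then plug the MPLE $\hat\beta$ of $\beta$ into equation (15) and solve these $k$ equations separately.

There are two things worth noting:

- If any $|D_i| = 1$, $\hat\alpha_i$ can be solved explicitly:

$$\hat\alpha_i = \left[ 1 - \frac{w_i \exp\left(x_i'\hat\beta\right)}{\sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right)} \right]^{\exp\left(-x_i'\hat\beta\right)} \tag{16}$$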

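- If $|D_i| > 1$, equation (15) must be solved iteratively for $\hat\alpha_i$. A good initial value for $\hat\alpha_i$ is

$$\hat\alpha_i = \exp\left( \frac{-d_i}{\sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right)} \right) \tag{17}$$

where $d_i = \sum_{l \in D_i} w_l$ is the weight sum for set $D_i$. (See Lawless, 1982, p. 361.)

Once the $\hat\alpha_i$, $i = 1, \ldots, k$, are found, $S_0(t)$ is estimated by

$$\hat S_0(t) = \prod_{i:\, t_i < t} \hat\alpha_i \tag{18}$$

Since the above estimate of $S_0(t)$ requires some iterative calculations when ties exist, Breslow (1974) suggests using equation (17) as an estimate for $\alpha_i$; however, we will use this as an initial estimate.

The asymptotic variance for $-\ln \hat S_0(t)$ can be found in Chapter 4 of Kalbfleisch and Prentice (1980). At a specified time $t$, it is consistently estimated by

$$\operatorname{var}\left(-\ln \hat S_0(t)\right) = \sum_{t_i < t} D_i \left( \sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right) \right)^{-2} + a' I^{-1} a \tag{19}$$

where $a$ is a $p \times 1$ vector with the $j$th element defined by

$$\sum_{t_i < t} D_i\, \frac{\sum_{l \in R_i} w_l\, x_{lj} \exp\left(x_l'\hat\beta\right)}{\left( \sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right) \right)^2}$$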

and $I$ is the information matrix. The asymptotic variance of $\hat S(t \mid x)$ is estimated by

$$e^{2x'\hat\beta} \left(\hat S(t \mid x)\right)^2 \operatorname{var}\left(-\ln \hat S_0(t)\right) \tag{20}$$

Selection Statistics for Stepwise Methods

COX REGRESSION offers the same methods for variable selection as LOGISTIC REGRESSION. For the details of these methods, and stepwise algorithms, see the LOGISTIC REGRESSION chapter. Here we will only define the three removal statistics (Wald, LR, and Conditional) and the Score entry statistic.

Score Statistic

The score statistic is calculated for every variable not in the model to decide which variable should be added to the model. First we compute the information matrix $I$ for all eligible variables based on the parameter estimates for the variables in the model and zero parameter estimates for the variables not in the model. Then we partition the resulting $I$ into four submatrices as follows:

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \tag{21}$$

where $A_{11}$ and $A_{22}$ are square matrices for variables in the model and variables not in the model, respectively, and $A_{12}$ is the cross-product matrix for variables in and out. The score statistic for variable $x_i$ is defined by

$$D_{x_i}'\, B_{22,i}\, D_{x_i}$$

where $D_{x_i}$ is the first derivative of the log-likelihood with respect to all the parameters associated with $x_i$, $B_{22,i}$ is equal to $\left(A_{22,i} - A_{21,i} A_{11}^{-1} A_{12,i}\right)^{-1}$, and $A_{22,i}$ and $A_{12,i}$ are the submatrices in $A_{22}$ and $A_{12}$ associated with variable $x_i$.
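The score statistic above can be illustrated for the simplest case of a candidate variable with a single parameter, where $A_{22,i}$ and $D_{x_i}$ are scalars. The sketch below is a hedged, standard-library example: the helper names `solve` and `score_statistic` are hypothetical, and the small Gaussian elimination routine only stands in for whatever linear algebra an actual implementation uses.

```python
def solve(a, b):
    """Solve a x = b for a small dense system (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def score_statistic(a11, a12_i, a22_i, d_i):
    """Score statistic D' B D for one single-parameter candidate variable.

    a11   : q x q information block for the variables already in the model
    a12_i : length-q cross-information between the model and the candidate
    a22_i : scalar information for the candidate
    d_i   : first derivative of the log-likelihood for the candidate
    """
    z = solve(a11, a12_i) if a11 else []                   # A11^{-1} A12,i
    schur = a22_i - sum(u * v for u, v in zip(a12_i, z))   # A22,i - A21,i A11^{-1} A12,i
    return d_i * d_i / schur                               # B22,i = 1 / schur
```

When no variables are in the model yet, `a11` is empty and the statistic reduces to the squared derivative divided by the candidate's own information.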

Wald Statistic

The Wald statistic is calculated for the variables in the model to select variables for removal. The Wald statistic for variable $x_j$ is defined by

$$\hat\beta_j'\, B_{11,j}^{-1}\, \hat\beta_j$$

where $\hat\beta_j$ is the parameter estimate associated with $x_j$ and $B_{11,j}$ is the submatrix of $A_{11}^{-1}$ associated with $x_j$.

LR (Likelihood Ratio) Statistic

The LR statistic is defined as twice the log of the ratio of the likelihood functions of two models evaluated at their own MPLEs. Assume that $r$ variables are in the current model and let us call the current model the full model. Based on the MPLEs of parameters for the full model, $l(\text{full})$ is defined in equation (12b). For each of the $r$ variables deleted from the full model, MPLEs are found and the reduced log-likelihood function, $l(\text{reduced})$, is calculated. Then the LR statistic is defined as

$$-2\left( l(\text{reduced}) - l(\text{full}) \right)$$

Conditional Statistic

The conditional statistic is also computed for every variable in the model. The formula for the conditional statistic is the same as for the LR statistic except that the parameter estimates for each reduced model are conditional estimates, not MPLEs. The conditional estimates are defined as follows. Let $\hat\beta = \left(\hat\beta_1, \ldots, \hat\beta_r\right)'$ be the MPLEs for the $r$ variables (blocks) and $C$ be the asymptotic covariance. The conditional estimate for the parameters left in the model given $\hat\beta_i$ is

$$\tilde\beta_{(i)} = \hat\beta_{(i)} - C_{12}^{(i)} \left( C_{22}^{(i)} \right)^{-1} \hat\beta_i$$

where $\hat\beta_i$ is the MPLE for the parameter(s) associated with $x_i$ and $\hat\beta_{(i)}$ is $\hat\beta$ without $\hat\beta_i$, $C_{12}^{(i)}$ is the covariance between the parameter estimates left in the model $\hat\beta_{(i)}$ and $\hat\beta_i$, and $C_{22}^{(i)}$ is the covariance of $\hat\beta_i$. Then the conditional statistic for variable $x_i$ is defined by

$$-2\left( l\left(\tilde\beta_{(i)}\right) - l(\text{full}) \right)$$

where $l\left(\tilde\beta_{(i)}\right)$ is the log-likelihood function evaluated at $\tilde\beta_{(i)}$.

Note that all these four statistics have a chi-square distribution with degrees of freedom equal to the number of parameters the corresponding model has.

Statistics

Initial Model Information

The initial model for the first method is for a model that does not include covariates. The log-likelihood function $l$ is equal to

$$l(0) = -\sum_{j=1}^{m} \sum_{i=1}^{k_j} d_{ji} \ln\left(n_{ji}\right)$$

where $n_{ji}$ is the sum of weights of individuals in set $R_{ji}$.

Model Information

When a stepwise method is requested, at each step, the -2 log-likelihood function and three chi-square statistics (model chi-square, improvement chi-square, and overall chi-square) and their corresponding degrees of freedom and significance are printed.

-2 Log-Likelihood

$$-2 \sum_{j=1}^{m} \sum_{i=1}^{k_j} \left( S_{ji}'\hat\beta - d_{ji} \ln\left( \sum_{l \in R_{ji}} w_l \exp\left(x_l'\hat\beta\right) \right) \right)$$

where $\hat\beta$ is the MPLE of $\beta$ for the current model.

Improvement Chi-Square

(-2 log-likelihood function for previous model) - (-2 log-likelihood function for current model).

The previous model is the model from the last step. The degrees of freedom are equal to the absolute value of the difference between the number of parameters estimated in these two models.

Model Chi-Square

(-2 log-likelihood function for initial model) - (-2 log-likelihood function for current model).

The initial model is the final model from the previous method. The degrees of freedom are equal to the absolute value of the difference between the number of parameters estimated in these two models.

Note: The values of the model chi-square and improvement chi-square can be less than or equal to zero. If the degrees of freedom are equal to zero, the chi-square is not printed.

Overall Chi-Square

The overall chi-square statistic tests the hypothesis that all regression coefficients for the variables in the model are identically zero. This statistic is defined as

$$u(0)'\, I^{-1}(0)\, u(0)$$

where $u(0)$ represents the vector of first derivatives of the partial log-likelihood function evaluated at $\beta = 0$. The elements of $u$ are defined in equation (13) and $I$ is defined in equation (14).
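The improvement and model chi-square statistics described above share one computation and differ only in which model serves as the reference. A minimal sketch, assuming the -2 log-likelihood values and parameter counts are already available (the function name `chi_square_change` is hypothetical):

```python
def chi_square_change(neg2ll_reference, neg2ll_current,
                      n_params_reference, n_params_current):
    """Chi-square change between a reference model and the current model.

    Pass the previous step's model as the reference for the improvement
    chi-square, or the initial model for the model chi-square.
    Returns (chi_square, df), or None when df is zero, mirroring the rule
    that the chi-square is not printed in that case.
    """
    df = abs(n_params_current - n_params_reference)
    if df == 0:
        return None
    # The difference may be zero or negative, as noted in the text.
    return neg2ll_reference - neg2ll_current, df
```

For example, `chi_square_change(210.4, 198.1, 0, 2)` would give a chi-square of about 12.3 on 2 degrees of freedom; the numbers are made up for illustration.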

Information for Variables in the Equation

For each of the single variables in the equation, MPLE, SE for MPLE, Wald statistic, and its corresponding df, significance, and partial R are given. For a single variable, R is defined by

$$R = \left[ \frac{\text{Wald} - 2}{-2\ \text{log-likelihood for the initial model}} \right]^{1/2} \times \text{sign of MPLE}$$

if Wald > 2. Otherwise R is set to zero. For a multiple category variable, only the Wald statistic, df, significance, and partial R are printed, where R is defined by

$$R = \left[ \frac{\text{Wald} - 2\,\text{df}}{-2\ \text{log-likelihood for the initial model}} \right]^{1/2}$$

if Wald > 2 df. Otherwise R is set to zero.

Information for the Variables Not in the Equation

For each of the variables not in the equation, the Score statistic is calculated and its corresponding degrees of freedom, significance, and partial R are printed. The partial R for variables not in the equation is defined similarly to the R for the variables in the equation by changing the Wald statistic to the Score statistic.

There is one overall statistic called the residual chi-square. This statistic tests if all regression coefficients for the variables not in the equation are zero. It is defined by

$$u\left(\hat\beta\right)'\, B_{22}\, u\left(\hat\beta\right)$$

where $u\left(\hat\beta\right)$ is the vector of first derivatives of the partial log-likelihood function with respect to all the parameters not in the equation evaluated at the MPLE $\hat\beta$, $B_{22}$ is equal to $\left(A_{22} - A_{21} A_{11}^{-1} A_{12}\right)^{-1}$, and $A$ is defined in equation (21).
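The partial R definitions above, for variables in and not in the equation, reduce to one small function. This is a sketch under the assumption that the caller supplies the Wald or Score statistic, its df, and the -2 log-likelihood of the initial model; the name `partial_r` and the `sign` argument are illustrative only.

```python
import math


def partial_r(statistic, df, neg2ll_initial, sign=1.0):
    """Partial R from a Wald or Score statistic.

    statistic      : Wald statistic (or Score statistic for variables not in the equation)
    df             : degrees of freedom of the statistic
    neg2ll_initial : -2 log-likelihood for the initial model
    sign           : pass the MPLE for a single variable so R takes its sign;
                     leave at the default for a multiple category variable
    """
    threshold = 2.0 * df
    if statistic <= threshold:
        return 0.0  # R is set to zero unless the statistic exceeds 2 * df
    magnitude = math.sqrt((statistic - threshold) / neg2ll_initial)
    return math.copysign(magnitude, sign)
```

For a single variable, `df` is 1 and the threshold is the 2 that appears in the first formula.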

Survival Table

For each stratum, the estimates of the baseline cumulative survival $S_0$ and hazard $H_0$ function and their standard errors are computed. The estimate $\hat S_0$ for $S_0$ has been discussed in equations (15) through (18). It is easy to see from equation (9) that $H_0(t)$ is estimated by

$$\hat H_0(t) = -\ln \hat S_0(t)$$

and the asymptotic variance of $\hat H_0(t)$ is defined in equation (19). Finally, the cumulative hazard function $H(t \mid x)$ and survival function $S(t \mid x)$ are estimated by

$$\hat H(t \mid x) = \exp\left(x'\hat\beta\right) \hat H_0(t)$$

and, for a given $x$,

$$\hat S(t \mid x) = \left[\hat S_0(t)\right]^{\exp\left(x'\hat\beta\right)}$$

The asymptotic variances are

$$\operatorname{var}\left(\hat H(t \mid x)\right) = \exp\left(2x'\hat\beta\right) \operatorname{var}\left(\hat H_0(t)\right)$$

and

$$\operatorname{var}\left(\hat S(t \mid x)\right) = \exp\left(2x'\hat\beta\right) \left(\hat S(t \mid x)\right)^2 \operatorname{var}\left(\hat H_0(t)\right)$$
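The survival table quantities can be sketched for one stratum as follows. This is a simplified illustration, not the COXREG procedure itself: single deaths use the explicit solution (16), while tied deaths use the starting value (17) directly (Breslow's suggestion) instead of iterating equation (15). The function names and the `risk_scores` argument, assumed to hold $\exp(x_l'\hat\beta)$ for each case, are hypothetical.

```python
import math


def baseline_survivor(times, events, risk_scores, weights):
    """Return [(t_i, alpha_i, S0 just after t_i, H0 just after t_i)].

    risk_scores[l] is exp(x_l' beta_hat), assumed to be precomputed.
    """
    n = len(times)
    table = []
    s0 = 1.0
    for t in sorted({times[l] for l in range(n) if events[l] == 1}):
        dead = [l for l in range(n) if events[l] == 1 and times[l] == t]
        # Sum over the risk set R_i of w_l exp(x_l' beta_hat).
        risk_total = sum(weights[l] * risk_scores[l]
                         for l in range(n) if times[l] >= t)
        if len(dead) == 1:
            l = dead[0]
            frac = weights[l] * risk_scores[l] / risk_total
            alpha = (1.0 - frac) ** (1.0 / risk_scores[l])  # equation (16)
        else:
            d = sum(weights[l] for l in dead)
            alpha = math.exp(-d / risk_total)               # equation (17)
        s0 *= alpha                                         # running product, equation (18)
        h0 = -math.log(s0) if s0 > 0.0 else math.inf        # H0 = -ln S0
        table.append((t, alpha, s0, h0))
    return table


def survivor_given_x(s0, risk_score):
    """S(t | x) = S0(t) ** exp(x' beta_hat), as in equation (8)."""
    return s0 ** risk_score
```

When the last observation in a stratum is an uncensored single death, the final factor is zero, so the estimated baseline survivor function drops to zero and the cumulative hazard is reported as infinite in this sketch.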

Diagnostic Statistics

Three casewise diagnostic statistics, Residual, Partial Residual, and DFBETAs, are produced. Both Residual and DFBETA are computed for all distinct individuals. Partial Residuals are calculated only for uncensored individuals.

Assume that there are $n_j$ subjects in stratum $j$ and $k_j$ distinct observed events $t_1 < \cdots < t_{k_j}$. Define the selected probability for the $l$th individual at time $t_i$ as

$$p_l(t_i) = \begin{cases} \dfrac{\exp\left(x_l'(t_i)\hat\beta\right)}{\sum_{h \in R_i} w_h \exp\left(x_h'(t_i)\hat\beta\right)} & \text{if the } l\text{th individual is in } R_i \\[2ex] 0 & \text{otherwise} \end{cases}$$

and

$$u_l^2 = \sum_{i=1}^{k_j} d_i\, p_l(t_i) \left(1 - p_l(t_i)\right)$$

$$y_l(t_i) = \begin{cases} 1 & \text{if the } l\text{th individual is in } D_i \\ 0 & \text{otherwise} \end{cases}$$

$$r_l = \sum_{i=1}^{k_j} \left( y_l(t_i) - d_i\, p_l(t_i) \right)$$

DFBETA

The changes in the maximum partial likelihood estimate of beta due to the deletion of a single observation have been discussed in Cain and Lange (1984) and Storer and Crowley (1985). The estimate of DFBETA computed is derived from augmented regression models. The details can be found in Storer and Crowley (1985). When the $l$th individual in the $j$th stratum is deleted, the change $\Delta\beta_l$ is estimated by

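$$\Delta\beta_l = -m_l^{-1}\, I^{-1}\, v_l\, r_l$$

where

$$w = \operatorname{diag}\left(w_1, \ldots, w_{n_j}\right)$$

$$v_l = \sum_{i=1}^{k_j} d_i\, p_l(t_i) \left\{ x_l(t_i) - x'(t_i)\, w\, p(t_i) \right\}$$

$$p(t_i) = \left(p_1(t_i), \ldots, p_{n_j}(t_i)\right)'$$

$$m_l = u_l^2 - v_l'\, I^{-1}\, v_l$$

and $x(t_i)$ is an $n_j \times p$ matrix which represents the $p$ covariate variables in the model evaluated at $t_i$, and $n_j$ is the number of individuals in $R_j$.

Partial Residuals

Partial residuals can only be computed for the covariates which are not time dependent. At time $t_i$ in stratum $j$, $x_g$ is the $p \times 1$ observed covariate vector for any $g$th individual in set $D_i$, where $D_i$ is the set of individuals dying at $t_i$. The partial residual $\gamma_g$ is defined by

$$\gamma_g = \left(\gamma_{g1}, \ldots, \gamma_{gp}\right)' = x_g - \frac{\sum_{l \in R_i} w_l\, x_l \exp\left(x_l'\hat\beta\right)}{\sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right)}$$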

Rewriting the above formula in a univariate form, we get

$$\gamma_{gh} = x_{gh} - \frac{\sum_{l \in R_i} w_l\, x_{lh} \exp\left(x_l'\hat\beta\right)}{\sum_{l \in R_i} w_l \exp\left(x_l'\hat\beta\right)}, \quad h = 1, \ldots, p, \quad g \in D_i$$

where $x_{gh}$ is the $h$th component of $x_g$. For every variable, the residuals can be plotted against times to test the proportional hazards assumption.

Residuals

The residuals $e_i$ are computed by

$$e_i = \hat H(t_i \mid x_i) = \exp\left(x_i'\hat\beta\right) \hat H_0(t_i)$$

which is the same as the estimate of the cumulative hazard function.

Plots

For a specified pattern, the covariate values $x_c$ are determined and $x_c'\hat\beta$ is computed. There are three plots available in COXREG.

Survival Plot

For stratum $j$, $\left(t_i, \hat S(t_i \mid x_c)\right)$, $i = 1, \ldots, k_j$, are plotted where

$$\hat S(t_i \mid x_c) = \left[\hat S_0(t_i)\right]^{\exp\left(x_c'\hat\beta\right)}$$

When PATTERN(ALL) is requested, for every uncensored time $t_i$ in stratum $j$ the survival function is estimated by

$$\hat S(t_i) = \frac{\sum_{l} w_l\, \hat S(t_i \mid x_l)}{\sum_{l} w_l} = \frac{\sum_{l} w_l \left[\hat S_0(t_i)\right]^{\exp\left(x_l'\hat\beta\right)}}{\sum_{l} w_l}$$

where the sums run over the individuals in stratum $j$. Then $\left(t_i, \hat S(t_i)\right)$, $i = 1, \ldots, k_j$, are plotted for stratum $j$.

Hazard Plot

For stratum $j$, $\left(t_i, \hat H(t_i \mid x_c)\right)$, $i = 1, \ldots, k_j$, are plotted where

$$\hat H(t_i \mid x_c) = \exp\left(x_c'\hat\beta\right) \hat H_0(t_i)$$

LML Plot

The log-minus-log plot is used to see whether the stratification variable should be included as a covariate. For stratum $j$, $\left(t_i,\ x_c'\hat\beta + \ln \hat H_0(t_i)\right)$, $i = 1, \ldots, k_j$, are plotted. If the plot shows parallelism among strata, then the stratum variable should be a covariate.

References

Breslow, N. E. 1974. Covariance analysis of censored survival data. Biometrics, 30: 89-99.

Cain, K. C., and Lange, N. T. 1984. Approximate case influence for the proportional hazards regression model with censored data. Biometrics, 40: 493-499.

Cox, D. R. 1972. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B, 34: 187-220.

Kalbfleisch, J. D., and Prentice, R. L. 1980. The statistical analysis of failure time data. New York: John Wiley & Sons, Inc.

Lawless, J. F. 1982. Statistical models and methods for lifetime data. New York: John Wiley & Sons, Inc.

Storer, B. E., and Crowley, J. 1985. A diagnostic for Cox regression and general conditional likelihoods. Journal of the American Statistical Association, 80: 139-147.