Estimation of accelerated failure time models with random effects


Retrospective Theses and Dissertations, Iowa State University Capstones, Theses and Dissertations, 2006

Estimation of accelerated failure time models with random effects
Yaqin Wang, Iowa State University

Part of the Biostatistics Commons

Recommended Citation: Wang, Yaqin, "Estimation of accelerated failure time models with random effects" (2006). Retrospective Theses and Dissertations.

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact digirep@iastate.edu.

Estimation of accelerated failure time models with random effects

by Yaqin Wang

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Major: Statistics

Program of Study Committee: Kenneth J. Koehler, Major Professor; Song Xi Chen; Richard Evans; Heike Hofmann; Terry Therneau

Iowa State University, Ames, Iowa, 2006

Copyright Yaqin Wang, 2006. All rights reserved.

UMI Number:

UMI Microform Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI

TABLE OF CONTENTS

ABSTRACT

GENERAL INTRODUCTION
    Introduction
    Cox Proportional Hazards Model with Random Effects
    Accelerated Failure Time Models
        AFT Models
        Inference for AFT Models
        AFT Models with Shared Frailty
        AFT Models with Random Effects
    Dissertation Organization
    References for General Introduction

ESTIMATION OF ACCELERATED FAILURE TIME MODELS WITH RANDOM EFFECTS
    Abstract
    Introduction
    Accelerated Failure Time Models with Random Effects
        AFT Models with Shared Frailty
        AFT Models with Random Effects
    Estimation
        Approximate Likelihood
        Asymptotic Properties of Laplace-Based Estimation
            Consistency of the Laplace-Based Estimator
            Asymptotic Normality
        Estimation
    Simulation Studies
        Description of Simulation I
        Results of Simulation I
        Description of Simulation II
        Results of Simulation II
    Approximate Grouped Jackknife Estimator
    Discussion
    References
    Appendix: The Accuracy of the Laplace Approximation
    Appendix: Programs for AFT Models with Random Effects
        Algorithm Description
        Algorithm Testing

AFT MODELS WITH RANDOM EFFECTS FOR CORRELATED SURVIVAL DATA AND AN APPLICATION TO BREAST CANCER FAMILY DATA
    Abstract
    Introduction
    Minnesota Breast Cancer Family Studies
        Minnesota Breast Cancer Family Resource
        Kinship
    Mixed Effects Cox Models
    Modeling the Breast Cancer Data Using Mixed Effects Cox Models
    AFT Models with Random Effects
    Modeling the Breast Cancer Data Using AFT Models with Random Effects
    Discussion
    References

GENERAL CONCLUSIONS

ABSTRACT

Correlated survival data with possible censoring are frequently encountered in survival analysis. This includes multi-center studies where subjects are clustered by clinical or other environmental factors that influence expected survival time, studies where times to several different events are monitored on each subject, and studies using groups of genetically related subjects. To analyze such data, we propose accelerated failure time (AFT) models based on lognormal frailties. AFT models provide a linear relationship between the log of the failure time and covariates that affect the expected time to failure by contracting or expanding the time scale. These models account for within-cluster association by incorporating random effects with dependence structures that may be functions of unknown covariance parameters. They can be applied to right, left or interval-censored survival data.

To estimate model parameters, we consider an approximate maximum likelihood estimation procedure derived from the Laplace approximation. This avoids the computationally intensive methods needed to evaluate the exact log-likelihood, such as MCMC methods or numerical integration, which are not feasible for large data sets. Asymptotic properties of the proposed estimators are established and small sample performance is evaluated through several simulation studies. The fixed effects parameters are estimated well with little absolute bias. Asymptotic formulas tend to underestimate the standard errors for small cluster sizes. Reliable estimates depend on both the number of clusters and cluster size. The methodology is used to analyze data taken from the Minnesota Breast Cancer Family Resource to examine age-at-onset of breast cancer for women in 426 families.

GENERAL INTRODUCTION

1 Introduction

There are two important classes of regression models for survival data, Cox proportional hazards (PH) models (Cox, 1972) and accelerated failure time (AFT) models (Collett, 2003). Cox proportional hazards models relate the hazard function to covariates, while AFT models specify a direct relationship between the failure time and covariates. Cox models have been extensively applied in medical research. AFT models are especially useful in industrial applications in which failure is accelerated by thermal, high-voltage or other factors.

The theme of this dissertation is the application of accelerated failure time models to correlated survival data. Traditional applications and development of the proportional hazards and AFT models have relied on the assumption of independent responses from the monitored units that are subject to failure. Correlated survival data with possible censoring, however, are frequently encountered in survival analysis, and models for correlated survival data are receiving increasing attention. Correlated data may arise from multiple observations on the same individuals, for instance, recurrent infections in clinical trials. The lack of independence also appears when observations are clustered. For example, in a multi-center study of kidney transplant survival (Lambert et al., 2004), survival times of patients from the same transplant center were associated since the transplants might be carried out by the same surgical team. Correlated survival times may also arise when genetically or socially related subjects, such as family members or classmates, are followed until some specific event occurs. Traditional methods of estimation that treat observations as independent are inappropriate for such data.

Various methods have been developed for the analysis of correlated observations. One basic approach introduces random effects into models to induce correlations. In survival analysis such random effects models are commonly referred to as frailty models.

Another approach is to use estimation methods developed for independent observations, such as partial likelihood estimation, and then adjust the covariance matrix of the resulting estimators to reflect the correlations. Robust (sandwich) covariance estimators, or appropriate resampling methods, can be used to obtain consistent estimates of covariance matrices and standard errors. While this approach provides appropriate large sample inferences, the estimators tend to be inefficient because information provided by the correlations among the survival times is not fully incorporated into the estimating equations. This is a special case of generalized estimating equations. It has the advantage of not requiring a specific model for the joint distribution of the correlated responses, which may be difficult to assess for small or moderate samples. Estimating equations that incorporate information about the correlation structure of the observations can be developed without completely specifying a model for the joint distribution of the observations, and such equations can improve the efficiency of estimators.

By completely specifying joint distributions for correlated observations, maximum likelihood, maximum partial likelihood, or Bayesian estimation methods can be used. Although efficiency may be gained, one practical problem with this approach is that the derivation of the marginal likelihood, or marginal partial likelihood, for the observed data may be intractable. Numerical integration is usually not feasible, and marginal likelihoods, or marginal partial likelihoods, are either evaluated with simulation techniques or approximated. The former may be quite expensive computationally, and the latter is an approximation that may reduce efficiency of estimation.

The concept of frailty was initially used to explain variability due to heterogeneity of members of a population in the context of mortality studies (Vaupel et al., 1979). Frailties are basically random effects in survival models. Hougaard (1986) examined a shared frailty model with Weibull hazards. Whitmore and Lee (1991) discussed an inverse Gaussian shared frailty model with constant individual hazards. A shared frailty describes some common effects on the members of a cluster. The shared frailty model has gained broad acceptance over the last few years for clustered survival data.
When there are dependences among observed survival times, traditional partial likelihood estimation for the Cox proportional hazards model that assumes independent responses may not provide reliable inferences. Although parameter estimates are generally consistent, ignoring the dependence of correlated survival data adversely affects the precision of the parameter estimates (Wei, Lin, and Weissfeld, 1989). More importantly, the estimated variances of parameter estimates are biased. Therefore, the Cox proportional hazards model with random effects was proposed to account for such dependences. Many approaches have been developed to estimate parameters in the Cox proportional hazards model with random effects. Next, we will briefly review several estimation procedures for this model.

2 Cox Proportional Hazards Model with Random Effects

Let $T^*_{ij}$ denote the event time or survival time for the $j$th ($j = 1, \ldots, n_i$) subject from the $i$th cluster ($i = 1, \ldots, N$), and let $C^*_{ij}$ represent the censoring time. Then, the observed time is $T_{ij} = \min(T^*_{ij}, C^*_{ij})$, and the indicator function $\delta_{ij} = I(T^*_{ij} \le C^*_{ij})$ is 1 if the response time is uncensored and 0 if the response time is censored. Given random effects, survival times are assumed to be conditionally independent. The hazard function for the $j$th subject from the $i$th cluster of a shared frailty model is given by

$$\lambda_{ij}(t) = \lambda_0(t)\,\omega_i \exp(x_{ij}^{\top}\beta) \tag{1}$$

where $\lambda_0$ is the baseline hazard function, $\beta$ is a vector of fixed effects corresponding to covariate vector $x_{ij}$, and the $\omega_i$ are independent, identically distributed random variables with some common density function.

Shared frailty models have some limitations. For example, they cannot accommodate the situation where the frailty is not the same for all the individuals in a cluster. In order to account for more complicated frailty structure, the shared frailty model needs to be extended. The hazard function for a more general mixed-effects proportional hazards model can be defined as

$$\lambda_{ij}(t) = \lambda_0(t) \exp(x_{ij}^{\top}\beta + z_{ij}^{\top} b_i) \tag{2}$$

where $b_i$ is a vector of random cluster effects associated with individual vectors of covariates $z_{ij}$. The random effects $b_i$ are assumed to be distributed according to some distribution with mean 0 and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters.

Several approaches have been proposed to estimate the parameters of model (2). McGilchrist and Aisbett (1991) and McGilchrist (1993) used a penalized partial likelihood approach to estimate the fixed effects and an approximate residual maximum likelihood (REML) approach to estimate the variance-covariance parameters based on a normal approximation to the distribution of the residuals. They only considered the special case where the random effects are normally distributed with mean zero and diagonal variance-covariance matrix $D$.

In an animal-breeding context, Ducrocq and Casella (1996) introduced a Bayesian approach to estimate the parameters of a special form of model (2) with Weibull baseline hazards and one set of random sire effects with either log-gamma or Gaussian distributions. For those models, the sire effects can be integrated out of the posterior distribution algebraically. The marginal posterior distribution for the dispersion parameter cannot be obtained algebraically, and a Laplace approximation was considered. Simulation results showed that the estimation procedure performed well when there were few sires and many daughters per sire, but did not always perform well when there were many sires with only a few daughters per sire.

Ripatti and Palmgren (2000) proposed an approximate marginal likelihood approach for a multivariate lognormal frailty model based on a penalized partial likelihood. Their approach allows for more complex frailty dependence structures. The random effects are assumed to be log-normally distributed with positive definite variance-covariance matrix $D(\theta)$. The Laplace approximation was applied to get an approximate marginal likelihood, as the integral cannot be evaluated analytically. This leads to estimating equations based on a penalized partial likelihood. The estimating procedure is simple, but it tends to underestimate the variance of the estimated fixed effects parameters.

EM-algorithm based estimation approaches have been applied by several authors. Ripatti, Larsen and Palmgren (2002) developed an estimation procedure based on a Monte Carlo EM algorithm with the aim of obtaining the maximum marginal likelihood estimate rather than an estimate based on an approximation of the marginal likelihood (Ripatti and Palmgren, 2000).
The frailties are treated as missing data and imputed in the E-step. The expectation in the E-step cannot be solved analytically, and it is approximated by sampling from the conditional distribution of the frailties given the observed data. The M-step maximizes the complete data log-likelihood using the imputed frailties as if they were observed. This procedure alternates between the E-step and the M-step. It is computationally intensive. The more complicated the frailty structure, the more computationally involved the evaluation of the E-step becomes. Cortinas and Burzykowski (2004) proposed a modified EM algorithm, using a Laplace approximation at the E-step to numerically simplify the estimation procedure.

Also, Cortinas (2004) used simulations to compare the performance of the estimation procedures proposed by McGilchrist and Aisbett (1991), Ducrocq and Casella (1996), Ripatti and Palmgren (2000), and Cortinas and Burzykowski (2004). This study assumed that model (3) was correctly specified with a given baseline hazard $\lambda_0$. Parameters of the model were chosen to mimic data from a real bladder cancer clinical trial (Royston, Parmar, and Sylvester, 2004) with 33 patients distributed over 37 centers. The data were generated according to the proportional hazards model

$$\lambda_{ij}(t \mid \beta, b_i) = \lambda_0(t) \exp\big(b_{0i} + x_{ij}(\beta + b_{1i})\big) \tag{3}$$

with $b_i = (b_{0i}, b_{1i})^{\top} \sim N\big(0, \mathrm{diag}(\theta_0, \theta_1)\big)$. There were 37 random effects for center-specific baseline hazards and 37 random coefficients for the center-specific covariate.

All four methods produced comparable regression parameter point estimates. The McGilchrist and Aisbett approach has problems with the estimation of the standard errors of the variance components. Their variance component estimation has large bias in the heavy censoring setting, especially when variances of random effects are large. Ducrocq and Casella's approach provides good estimates of standard errors for regression parameters, whereas the standard errors tend to be slightly underestimated by Cortinas's EM algorithm and the Ripatti and Palmgren approach. The method proposed by Ducrocq and Casella yields conservative estimates of the standard errors of the variance components. Cortinas's EM algorithm and the Ripatti and Palmgren method tend to underestimate the standard errors of the variance components. This study also found that Ducrocq and Casella's approach does not suffer from the convergence problems that occurred with the other two methods.
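To make the data-generating mechanism of model (3) concrete, the following is a minimal sketch of how clustered survival times of this kind could be simulated. It is not the code used in the cited study. The constant baseline hazard (`base_rate`), the independence of the two random effects, the binary covariate, the administrative censoring time, and all function and parameter names are assumptions introduced for illustration.

```python
import math
import random


def simulate_ph_random_effects(n_centers, n_per_center, beta, theta0, theta1,
                               base_rate=0.1, censor_time=20.0, seed=1):
    """Simulate (center, x, time, delta) rows from a PH model with a random
    center intercept b0 and a random center-specific covariate effect b1:
        hazard = base_rate * exp(b0 + x * (beta + b1)).
    Assumes a constant baseline hazard and independent normal random effects.
    """
    rng = random.Random(seed)
    rows = []
    for center in range(n_centers):
        b0 = rng.gauss(0.0, math.sqrt(theta0))  # center-specific baseline shift
        b1 = rng.gauss(0.0, math.sqrt(theta1))  # center-specific covariate effect
        for _ in range(n_per_center):
            x = rng.randint(0, 1)               # hypothetical binary covariate
            rate = base_rate * math.exp(b0 + x * (beta + b1))
            t_event = rng.expovariate(rate)     # constant hazard gives exponential times
            time = min(t_event, censor_time)    # observed time T = min(T*, C*)
            delta = 1 if t_event <= censor_time else 0
            rows.append((center, x, time, delta))
    return rows


rows = simulate_ph_random_effects(n_centers=37, n_per_center=5,
                                  beta=0.5, theta0=0.2, theta1=0.1)
```

Each row carries the observed time and the censoring indicator in the notation of this section, so the output can be passed to any of the estimation procedures being compared.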

3 Accelerated Failure Time Models

Although the Cox proportional hazards model has been extensively used in medical research, the assumption of proportional hazard functions is rather strong and may often be violated. The omission of important covariates can lead to deviations from proportional hazards and bias in the estimation of regression parameters in Cox models (Solomon, 1984). Accelerated failure time models are an important alternative to the Cox proportional hazards model even though they have rarely been considered in the medical literature. Chapman et al. (1992) applied four parametric survival models (exponential, Weibull, log-logistic, and lognormal) to the effects of prognostic factors on breast cancer survival and concluded that the lognormal model provided the best fit to the data. Royston (2001) demonstrated the practical value of the lognormal AFT model in the analysis of survival times of breast and ovarian cancer patients. More recently, an AFT model has been applied to the analysis of the time to AIDS onset in the Women's Interagency HIV Study (Komarek et al., 2004). Lambert et al. (2004) applied AFT models with shared frailty to determine prognostic factors for the survival time of a kidney graft in patients from 3 transplant centers in the UK.

An advantage of AFT models, and other parametric approaches, is that the shape of the hazard function can be characterized. AFT models specify a direct linear relationship between the log of the failure time and covariates, which may be appropriate when a covariate acts to speed up or slow down the expected time to failure by contracting or expanding the time scale. The regression parameters can be more intuitively interpreted with respect to expected change in median survival time. For example, a natural way of expressing a treatment effect in an AFT model is as a percentage improvement in median survival time. Also, the log-linear formulation of AFT models yields the independence of regression parameter estimates and random frailty effects (Keiding et al., 1997).
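The time-scale interpretation can be illustrated numerically. The sketch below assumes a lognormal AFT model, $\log T = x^{\top}\beta + \sigma\varepsilon$ with standard normal $\varepsilon$, and hypothetical parameter values. It shows that a treatment coefficient $\beta_1$ multiplies the median, and every other quantile, of the survival time by $\exp(\beta_1)$.

```python
import math
from statistics import NormalDist


def lognormal_aft_quantile(p, x, beta, sigma):
    """p-th quantile of T when log T = x'beta + sigma * eps, eps ~ N(0, 1)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return math.exp(linear_predictor + sigma * NormalDist().inv_cdf(p))


# Hypothetical values: intercept 2.0, treatment effect 0.3, scale 0.8.
beta = [2.0, 0.3]
sigma = 0.8
control = [1.0, 0.0]
treated = [1.0, 1.0]

# Ratio of median survival times, treated versus control.
median_ratio = (lognormal_aft_quantile(0.5, treated, beta, sigma)
                / lognormal_aft_quantile(0.5, control, beta, sigma))

# The same ratio holds at other quantiles: the whole time scale is stretched.
ratio_p10 = (lognormal_aft_quantile(0.1, treated, beta, sigma)
             / lognormal_aft_quantile(0.1, control, beta, sigma))
```

With these illustrative values the ratio equals $\exp(0.3)$, which corresponds to roughly a 35% longer median survival time in the treated group.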
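Misspecification of a parametric family for the frailty distribution may not be a serious issue. Empirical results of Lambert et al. (2004) demonstrated the robustness of regression parameter estimates with respect to misspecification of the frailty distribution for Weibull, Gamma, lognormal, and log-logistic models. Compared to Cox proportional hazards models, AFT models for correlated survival data have received much less attention. In this dissertation, we will incorporate random effects into the AFT model to allow for correlations and propose an estimation procedure for AFT models with random effects.

3.1 AFT Models

Accelerated failure time models are useful in many fields of application. Given the values of the covariates $x$, the density function has the following form,

$$f(t) = (\sigma t)^{-1} f_{\varepsilon}\!\left(\frac{\log t - \log \psi(x)}{\sigma}\right) \tag{4}$$

where $\sigma$ is the scale parameter and $\psi(x)$ is some function of covariates. One of the most common choices for $\psi(x)$ is

$$\psi(x) = \exp(x^{\top}\beta) \tag{5}$$

The corresponding AFT model can be expressed in a regression form as

$$\log T = x^{\top}\beta + \sigma\varepsilon \tag{6}$$

where $\varepsilon$ is a random variable with density function $f_{\varepsilon}(\cdot)$ and corresponding baseline survivor function $S_{\varepsilon}(\cdot)$. Accelerated failure time models allow a wide range of parametric forms for the density function. The standard normal distribution is a common choice for the random variable $\varepsilon$. The extreme value and logistic distributions are also frequently used. These three distributions have the property that the logarithmic transformation of the lifetime, $\log T$, has a location-scale distribution on $(-\infty, \infty)$. AFT models assume a survivor function of the following form,

$$\Pr(T \ge t) = S(t) = S^{*}\!\left[\left(\frac{t}{\psi(x)}\right)^{1/\sigma}\right] \tag{7}$$

where $S^{*}$ is the baseline survivor function. The Weibull, lognormal, and log-logistic distributions for lifetime correspond to extreme value, normal, and logistic distributions for the log of the lifetime, and the survivor function is given by

$$S(t) = S_{\varepsilon}\!\left(\frac{\log t - \log \psi(x)}{\sigma}\right) \tag{8}$$

If $\psi(x) = \exp(x^{\top}\beta)$, the survivor function can be rewritten as

$$S(t) = S_{\varepsilon}\!\left(\frac{\log t - x^{\top}\beta}{\sigma}\right) \tag{9}$$

The $S_{\varepsilon}(\cdot)$ functions for some common distributions are:

$$\text{Normal: } S_{\varepsilon}(\varepsilon) = 1 - \Phi(\varepsilon), \qquad \text{Extreme value: } S_{\varepsilon}(\varepsilon) = \exp(-e^{\varepsilon}), \qquad \text{Logistic: } S_{\varepsilon}(\varepsilon) = (1 + e^{\varepsilon})^{-1} \tag{10}$$

3.2 Inference for AFT Models

For random lifetimes $T_i$ of subjects $i = 1, \ldots, n$, with possible right-censoring, the likelihood function under model (9) is given by Lawless (2003) as

$$L(\beta, \sigma) = \prod_{i=1}^{n} \left[\frac{1}{\sigma} f_{\varepsilon}\!\left(\frac{\log t_i - x_i^{\top}\beta}{\sigma}\right)\right]^{\delta_i} S_{\varepsilon}\!\left(\frac{\log t_i - x_i^{\top}\beta}{\sigma}\right)^{1-\delta_i} \tag{11}$$

Using $\varepsilon_i = (\log t_i - x_i^{\top}\beta)/\sigma$, the log-likelihood function assumes the form

$$l(\beta, \sigma) = -r \log \sigma + \sum_{i=1}^{n} \left[\delta_i \log f_{\varepsilon}(\varepsilon_i) + (1-\delta_i) \log S_{\varepsilon}(\varepsilon_i)\right] \tag{12}$$

where $r = \sum_i \delta_i$ is the number of uncensored event times. Let $x_i = (x_{i1}, \ldots, x_{ij}, \ldots, x_{ip})^{\top}$ denote the set of covariates under which the $i$th subject responds. The first partial derivatives of $l(\beta, \sigma)$ are

$$\frac{\partial l}{\partial \beta_j} = -\frac{1}{\sigma} \sum_{i=1}^{n} \left[\delta_i \frac{\partial \log f_{\varepsilon}(\varepsilon_i)}{\partial \varepsilon_i} + (1-\delta_i) \frac{\partial \log S_{\varepsilon}(\varepsilon_i)}{\partial \varepsilon_i}\right] x_{ij} \tag{13}$$

$$\frac{\partial l}{\partial \sigma} = -\frac{r}{\sigma} - \frac{1}{\sigma} \sum_{i=1}^{n} \left[\delta_i \varepsilon_i \frac{\partial \log f_{\varepsilon}(\varepsilon_i)}{\partial \varepsilon_i} + (1-\delta_i) \varepsilon_i \frac{\partial \log S_{\varepsilon}(\varepsilon_i)}{\partial \varepsilon_i}\right] \tag{14}$$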
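As a minimal sketch of how the right-censored log-likelihood (12) and the score (13) can be evaluated, the code below takes $\varepsilon$ to be standard normal (the lognormal AFT model), one of the choices listed in (10). The data values and function names are hypothetical and serve only to illustrate the formulas.

```python
import math


def log_f(e):
    """Log density of the standard normal error."""
    return -0.5 * e * e - 0.5 * math.log(2.0 * math.pi)


def log_S(e):
    """Log survivor function of the standard normal error, log(1 - Phi(e))."""
    return math.log(0.5 * math.erfc(e / math.sqrt(2.0)))


def residual(ti, xi, beta, sigma):
    """Standardized residual eps_i = (log t_i - x_i'beta) / sigma."""
    return (math.log(ti) - sum(b * x for b, x in zip(beta, xi))) / sigma


def loglik(beta, sigma, t, X, delta):
    """Log-likelihood (12): -r log(sigma) + sum of censored/uncensored terms."""
    total = -sum(delta) * math.log(sigma)
    for ti, xi, di in zip(t, X, delta):
        e = residual(ti, xi, beta, sigma)
        total += di * log_f(e) + (1 - di) * log_S(e)
    return total


def score_beta(beta, sigma, t, X, delta):
    """Score (13) for a normal error: d log f / d eps = -eps and
    d log S / d eps = -phi(eps) / S(eps)."""
    grad = [0.0] * len(beta)
    for ti, xi, di in zip(t, X, delta):
        e = residual(ti, xi, beta, sigma)
        weight = di * (-e) + (1 - di) * (-math.exp(log_f(e) - log_S(e)))
        for j, xij in enumerate(xi):
            grad[j] -= weight * xij / sigma
    return grad


# Hypothetical data: intercept plus one binary covariate, one censored time.
t = [2.0, 5.5, 1.2, 8.0]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
delta = [1, 0, 1, 1]
value = loglik([0.5, 0.4], 0.7, t, X, delta)
gradient = score_beta([0.5, 0.4], 0.7, t, X, delta)
```

Comparing `score_beta` with a finite-difference derivative of `loglik` is a simple way to check an implementation before passing it to an optimizer.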

The maximum likelihood estimators $\hat\beta$ and $\hat\sigma$ are found by solving the equations $\partial l/\partial \beta = 0$ and $\partial l/\partial \sigma = 0$. The observed information matrix is

$$I(\beta, \sigma) = \begin{pmatrix} -\dfrac{\partial^2 l}{\partial \beta\, \partial \beta^{\top}} & -\dfrac{\partial^2 l}{\partial \beta\, \partial \sigma} \\[2ex] -\dfrac{\partial^2 l}{\partial \sigma\, \partial \beta^{\top}} & -\dfrac{\partial^2 l}{\partial \sigma^2} \end{pmatrix} \tag{15}$$

Assuming needed smoothness conditions on $S_{\varepsilon}$, we can use the approximate normality of the m.l.e.'s or a chi-squared approximation to likelihood ratio tests to test hypotheses about regression coefficients. This application is illustrated by Lawless (2003). For testing $H_0: \beta_1 = \beta_1^0$, a Wald test statistic is constructed as

$$\Lambda_1 = (\hat\beta_1 - \beta_1^0)^{\top} V_{11}^{-1} (\hat\beta_1 - \beta_1^0) \tag{16}$$

Here $\beta = (\beta_1^{\top}, \beta_2^{\top})^{\top}$ and $V = I(\hat\beta, \hat\sigma)^{-1}$ is partitioned as

$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$$

An alternative method for testing $\beta_1 = \beta_1^0$ is to use the likelihood ratio statistic

$$\Lambda_2 = 2\, l(\hat\beta_1, \hat\beta_2, \hat\sigma) - 2\, l(\beta_1^0, \tilde\beta_2, \tilde\sigma) \tag{17}$$

where $\tilde\beta_2$ and $\tilde\sigma$ maximize the log-likelihood under the null hypothesis. When the null hypothesis is true, both tests have asymptotic central chi-squared distributions with degrees of freedom equal to the rank of $V_{11}$. Unless otherwise stated, we will assume that the model is parameterized so that $V_{11}$ has full rank.

3.3 AFT Models with Shared Frailty

For clustered failure time data with $N$ clusters, let $T^*_{ij}$ represent the survival time for the $j$th ($j = 1, \ldots, n_i$) individual from the $i$th ($i = 1, \ldots, N$) cluster and let $C^*_{ij}$ represent the censoring time. Then, the observed time is $T_{ij} = \min(T^*_{ij}, C^*_{ij})$. Censoring is indicated by the indicator function $\delta_{ij} = I(T^*_{ij} \le C^*_{ij})$, which is 1 if the individual is uncensored and 0 if the individual is censored. In a classical AFT model, the survivor function at time $t$ is assumed to be of the form

$$S_{ij}(t) = S^{*}\!\left[\left(\frac{t}{\psi(x_{ij})}\right)^{1/\sigma}\right] \tag{18}$$

where $\sigma$ is an unknown scale parameter, $S^{*}$ is the baseline survivor function, and $\psi(x_{ij})$ is some function of covariates $x_{ij}$. Here, it is assumed that

$$\psi(x_{ij}) = \exp(x_{ij}^{\top}\beta) \tag{19}$$

The AFT regression model can equivalently be expressed as a log-linear model for the random variable $T_{ij}$, the lifetime of the $j$th individual in the $i$th cluster. Similar to equation (6), the AFT model can be written as

$$\log T_{ij} = x_{ij}^{\top}\beta + \sigma\varepsilon_{ij} \tag{20}$$

where the $\varepsilon_{ij}$ are random variables. For clustered data, subjects are correlated within a cluster. Shared frailty models account for the lack of independence by introducing a random component in Equation (20), which could be modified as

$$\log T_{ij} = \omega_i + x_{ij}^{\top}\beta + \sigma\varepsilon_{ij} \tag{21}$$

Here, $\alpha_i = \exp(\omega_i)$ is a random frailty distributed across clusters with some distribution. Usually, the frailty distribution is assumed to be gamma, inverse Gaussian, lognormal, or positive stable. AFT models with shared frailty are applied in situations where the unexplained survival time heterogeneity is common to all individuals within a cluster. This model can be fitted using standard software packages such as R, S-Plus or SAS.

3.4 AFT Models with Random Effects

Shared frailty AFT models have some limitations. Firstly, these models require the frailty to be the same for all the subjects within a cluster. Another restriction is that shared frailty can only induce positive association within the cluster, which might not always reflect reality. Limited resources shared by individuals in a cluster could result in some competition, and negative correlations among some response times. Therefore, AFT models with shared frailty need to be extended to incorporate more complicated covariance structure. AFT models that include random effects in the regression expression, as in a classical linear mixed model, have been considered. The basic model is

$$\log T_{ij} = x_{ij}^{\top}\beta + z_{ij}^{\top} b_i + \sigma\varepsilon_{ij} \tag{22}$$

where $\beta$ is the vector of unknown regression coefficients corresponding to the covariate vector for fixed effects $x_{ij}$, and $b_i = (b_{i1}, \ldots, b_{iq})^{\top}$ is the random effects vector associated with a second set of covariate values denoted by $z_{ij}$. It is assumed that the $b_i$ are distributed with mean 0 and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters. The density function for $b_i$ is denoted by $f(b_i)$.

Pan and Louis (2000) proposed an estimation procedure that iterates between (a) estimating the marginal distribution of $(\log T_{ij} - x_{ij}^{\top}\beta)$ using Kaplan-Meier estimation and imputation of censored event times, and (b) estimation of regression coefficients using a Monte Carlo EM algorithm. But only a univariate random effect with $z_{ij} = 1$ is considered in their approach.

To account for more complicated frailty structure, Komarek and Lesaffre (2004) have developed a full Bayesian approach to estimate the parameters of model (22). The advantage of this approach is that a general random effect vector is included in the model. Also, this approach can be applied not only to right or left censored survival data but also to interval censored survival data. In the Bayesian context, the distribution of error terms $\varepsilon_{ij}$ is modeled as a mixture of an unknown number of normal distributions. A Markov Chain Monte Carlo (MCMC) algorithm is used to estimate the number of normal components as well as the parameters of the normal distributions. The density $f(\varepsilon_{ij})$ of the error term $\varepsilon_{ij}$ in model (22) is specified as

$$f(\varepsilon_{ij}) = \sum_{k=1}^{K} \omega_k\, \varphi(\varepsilon_{ij} \mid \mu_k, \sigma_k^2) \tag{23}$$

where $\varphi(\cdot \mid \mu_k, \sigma_k^2)$ is the density of $N(\mu_k, \sigma_k^2)$. The number of mixture components $K$, mixture weights $\omega = (\omega_1, \ldots, \omega_K)^{\top}$, means $\mu = (\mu_1, \ldots, \mu_K)^{\top}$ and variances $\sigma^2 = (\sigma_1^2, \ldots, \sigma_K^2)^{\top}$ are unknown. Let $r_{ij}$ be the label of the group from which the random error $\varepsilon_{ij}$ is drawn. That is, $\varepsilon_{ij}$ is drawn from $N(\mu_{r_{ij}}, \sigma^2_{r_{ij}})$. The prior for the mixture weights $\omega$ is assumed to be a symmetric $K$-dimensional Dirichlet distribution, and the mean and variance of each component distribution are drawn independently from priors with normal and inverse-gamma distributions. The estimates of $K$, $\omega$, $\mu$ and $\sigma^2$ are updated by a reversible jump MCMC algorithm of Green (1995). The conditional distribution of the log-event times is

$$y_{ij} \mid r_{ij}, \mu, \sigma^2, \beta, b_i, x_{ij}, z_{ij} \sim N\big(\mu_{r_{ij}} + x_{ij}^{\top}\beta + z_{ij}^{\top} b_i,\ \sigma^2_{r_{ij}}\big) \tag{24}$$

The prior distribution for each regression coefficient is assumed to be independently and normally distributed. The distribution for the random effect vector $b_i$ is assumed to be multivariate normal,

$$b_i \mid \gamma, D \sim N_q(\gamma, D) \tag{25}$$

and independently distributed for $i = 1, \ldots, N$, where $\gamma = (\gamma_1, \ldots, \gamma_q)^{\top}$. Each $\gamma_j$ has an independent normal prior $N(\nu_{\gamma,j}, \psi_{\gamma,j})$. The covariance matrix $D$ of random effects is assumed to have an inverse-Wishart prior. The regression part of the model is updated using the Gibbs sampler. However, this method is computationally intensive and cannot be practically applied when the dimension of $D$ is large.

In the next chapter, we will propose a method of estimation for model (22) based on a penalized likelihood developed by applying the Laplace approximation to the marginal likelihood function. It is possible to include random effects with general variance structure in the analyses of survival data through this method. This method makes analyses of correlated survival data feasible and computationally efficient, even for large data sets.
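The idea behind a Laplace approximation to a marginal likelihood can be sketched for a single cluster. The code below is an illustration under simplifying assumptions, not the estimation procedure developed in the next chapter: it takes model (22) with a scalar random intercept $b_i \sim N(0, \theta)$, standard normal errors, right-censoring only, and hypothetical data. The mode is found by Newton steps with finite-difference derivatives, and the result is compared with a brute-force trapezoid rule.

```python
import math


def log_norm_pdf(e):
    """Log density of the standard normal distribution."""
    return -0.5 * e * e - 0.5 * math.log(2.0 * math.pi)


def log_norm_sf(e):
    """Log survivor function of the standard normal distribution."""
    return math.log(0.5 * math.erfc(e / math.sqrt(2.0)))


def cluster_log_integrand(b, y, mu, delta, sigma, theta):
    """log{ prod_j f(y_j | b) * N(b; 0, theta) } for one cluster, where
    y_j = log time, mu_j = x_j'beta, delta_j = 1 if the time is uncensored."""
    total = -0.5 * b * b / theta - 0.5 * math.log(2.0 * math.pi * theta)
    for yj, mj, dj in zip(y, mu, delta):
        e = (yj - mj - b) / sigma
        if dj == 1:
            total += log_norm_pdf(e) - math.log(sigma)
        else:
            total += log_norm_sf(e)
    return total


def laplace_log_integral(h, start=0.0, eps=1e-4, max_iter=100):
    """Laplace approximation to log( integral of exp(h(b)) db ), scalar b:
    h(b_hat) + 0.5 * log(2 * pi / (-h''(b_hat))), with b_hat the mode of h."""
    b = start
    for _ in range(max_iter):
        grad = (h(b + eps) - h(b - eps)) / (2.0 * eps)
        hess = (h(b + eps) - 2.0 * h(b) + h(b - eps)) / (eps * eps)
        step = grad / hess
        b -= step                      # Newton update toward the mode
        if abs(step) < 1e-8:
            break
    hess = (h(b + eps) - 2.0 * h(b) + h(b - eps)) / (eps * eps)
    return h(b) + 0.5 * math.log(2.0 * math.pi / -hess)


def trapezoid_log_integral(h, lo=-6.0, hi=6.0, n=4000):
    """Reference value by the trapezoid rule on a fixed grid."""
    width = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(h(lo + k * width))
    return math.log(total * width)


# Hypothetical cluster of four subjects, two of them right-censored.
y = [1.0, 1.4, 0.8, 1.6]
mu = [1.0, 1.0, 1.2, 1.2]
delta = [1, 0, 1, 0]


def h(b):
    return cluster_log_integrand(b, y, mu, delta, sigma=0.5, theta=0.3)


laplace_value = laplace_log_integral(h)
reference_value = trapezoid_log_integral(h)
```

For a scalar random effect the quadrature is cheap, so the comparison is easy. The appeal of the Laplace form is that it only needs the mode and curvature of the integrand, which remains practical when the random effects vector has many components.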

4 Dissertation Organization

This dissertation is organized into four major parts in the paper format. The first part is the general introduction, including literature reviews of past work on the Cox proportional hazards models for correlated survival data, the motivation for this research, and an introduction to AFT models. The next two parts are two papers in the form to be submitted to journals. The final part summarizes the results of the previous chapters and discusses additional issues. The first paper proposes an estimation approach for the AFT model with random effects. Simulation studies are used to evaluate the performance of the estimation approach for AFT models with shared frailty and AFT models with nested frailties. In the second paper, we apply the method to a dataset from the Minnesota Breast Cancer Family Resource using the AFT model with random effects.

5 References for General Introduction

Chapman, J. W., Trudeau, M. E., Pritchard, K. I., Sawka, C. A., Mobbs, B. G., Hanna, W. M., Kahn, H., McCready, D. R., Lickley, L. A., A comparison of all-subset Cox and accelerated failure time models with Cox step-wise regression for node-positive breast cancer, Breast Cancer Research and Treatment, 22(3): 263-272, 1992.

Collett, D., Modelling Survival Data in Medical Research, 2nd ed., Chapman & Hall/CRC, CRC Press LLC, 2003.

Cortinas Abrahantes, J., Estimation procedures for mixed-effects models with applications to normally distributed and survival data, Ph.D. Thesis, 2004.

Cortinas Abrahantes, J. and Burzykowski, T., A version of the EM algorithm for proportional hazards model with random effects, Technical Report 455, IAP statistics network, 2004.

Cox, D. R., Regression models and life-tables (with discussion), Journal of the Royal Statistical Society, Series B, 34: 187-220, 1972.

Ducrocq, V. and Casella, G., A Bayesian analysis of mixed survival models, Genet. Sel. Evol., 28: 505-529, 1996.

Green, P. J., Reversible jump Markov chain computation and Bayesian model determination, Biometrika, 82: 711-732, 1995.

Hougaard, P., A class of multivariate failure time distributions, Biometrika, 73: 671-678, 1986.

Keiding, N., Andersen, P. K. and Klein, J. P., The role of frailty models and accelerated failure time models in describing heterogeneity due to omitted covariates, Statistics in Medicine, 16: 215-224, 1997.

Komarek, A., Lesaffre, E., and Hilton, J. F., Bayesian accelerated failure time model for correlated censored data with a normal mixture as an error distribution, Technical Report 45, IAP statistics network, 2004.

Lambert, P., Collett, D., Kimber, A., and Johnson, R., Parametric accelerated failure time models with random effects and an application to kidney transplant survival, Statistics in Medicine, vol. 23, 2004.

Lawless, J. F., Statistical Models and Methods for Lifetime Data, New York: John Wiley & Sons, Inc., 2003.

McGilchrist, C. A. and Aisbett, C. W., Regression with frailty in survival analysis, Biometrics, 47, 1991.

McGilchrist, C. A., REML estimation for survival models with frailty, Biometrics, 49: 221-225, 1993.

Pan, W. and Louis, T. A., A linear mixed-effects model for multivariate censored data, Biometrics, 56: 160-166, 2000.

Ripatti, S. and Palmgren, J., Estimation of multivariate frailty models using penalized partial likelihood, Biometrics, 56: 1016-1022, 2000.

Ripatti, S., Larsen, K., and Palmgren, J., Maximum likelihood inference for multivariate frailty models using an automated Monte Carlo EM algorithm, Lifetime Data Analysis, 8: 349-360, 2002.

Royston, P., The lognormal distribution as a model for survival time in cancer, with an emphasis on prognostic factors, Statistica Neerlandica, 55: 89-104, 2001.

Royston, P., Parmar, M. K. B. and Sylvester, R., Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer, Statistics in Medicine, 23: 907-926, 2004.

Solomon, P. J., Effect of misspecification of regression models in the analysis of survival data, Biometrika, 71: 291-298, 1984.

Vaupel, J. W., Manton, K. G., and Stallard, E., The impact of heterogeneity in individual frailty on the dynamics of mortality, Demography, 16, 1979.

Wei, L. J., Lin, D. Y., and Weissfeld, L., Regression analysis of multivariate incomplete failure time data by modeling marginal distributions, Journal of the American Statistical Association, 84: 1065-1073, 1989.

Whitmore, G. A. and Lee, M.-L. T., A multivariate survival distribution generated by an inverse Gaussian mixture of exponentials, Technometrics, 33: 39-50, 1991.

22 6 ESTIMATION OF ACCELERATED FAILURE TIME MODELS WITH RANDOM EFFECTS Yaqn Wang, Kenneth J. Koehler, Terry M. Therneau A paper to be submtted to Bometrcs Abstract There s an ncreasng nterest n ncorporatng multvarate fraltes nto the analyss of survval data to account for correlated outcomes. We propose accelerated falure tme (AFT) models based on fraltes wth a multvarate lognormal jont dstrbuton. It allows for random effects wth a complcated dependence structure that may be a functon of unknown covarance parameters. The proposed models can be appled to rght, left or nterval-censored survval data. An estmaton procedure s developed for AFT models wth random effects, whch s based on the Laplace approxmaton to the margnal lkelhood. The performance of ths approxmaton s evaluated through several smulaton studes. Key Words: AFT models; multvarate fraltes; correlated survval data; random effects; Laplace approxmaton. Introducton Correlated survval data wth possble censorng are frequently encountered n survval analyss. The observatons may be clustered n mult center studes, e.g., a group of patents may share unobserved envronmental, procedural, or genetc factors that nduce wthn cluster assocaton among response tmes. Correlated data may also arse from takng multple observatons on ndvdual subjects. Alternatvely, event tmes may be montored for socally related subjects, such as classmates, or genetcally related subjects, such as famly members n human studes, or lttermates n anmal studes.

In survival analysis, one of the most common assumptions is that event times are independent from one observation to another, given survival to a specific time and observed covariate values. When there are dependences among observed event times, models based on this assumption are not plausible. Common regression models for survival analysis are Cox proportional hazards (PH) models (Cox, 1972) and accelerated failure time models (Collett, 2003). For either Cox models or AFT models, ignoring dependences in the analysis of the data may result in misleading inferences. Although parameter estimates may be generally consistent, estimates of their variability may be biased.

Many methods that deal with correlations among survival times have appeared in the literature. Because the Cox proportional hazards model is so widely used, most of the attention has been given to extensions of that model that incorporate random effects, known as frailties, to account for correlations among response times. There is a rather extensive literature on the Cox proportional hazards model with random effects. We will consider clustered failure-time data with $N$ clusters. Given the random effects, or frailties, the conditional hazard function for the $j$-th observation from the $i$-th cluster is generally assumed to have the form

$$\lambda_{ij}(t \mid \beta, b_i) = \lambda_0(t)\,\exp\!\left(x_{ij}^{\top}\beta + z_{ij}^{\top} b_i\right) \tag{1}$$

where $\lambda_0(\cdot)$ is the baseline hazard, $t$ is the event time, $\beta$ is the unknown regression coefficient vector, $x_{ij}$ is the covariate vector of fixed effects for the $j$-th observation from the $i$-th cluster, and $b_i$ is a vector of random effects associated with a vector of covariates $z_{ij}$. The random effects are assumed to follow some distribution with mean $0$ and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters unrelated to $\beta$. For a shared frailty model, $b_i$ is a scalar that expresses a cluster-specific deviation, where $z_{ij}$ is an indicator variable defining cluster membership. More complex patterns of association can be modeled by allowing $z_{ij}$ to define additional sub-clusters.
Several approaches have been proposed to estimate the parameters of the proportional hazards model with random effects. McGilchrist and Aisbett (1991) and McGilchrist (1993) used a penalized partial likelihood approach to estimate the fixed effects parameters and an approximate residual maximum likelihood (REML) approach to estimate the covariance parameters for the random effects. This approach has difficulty estimating the standard errors of the variance components, and the variance component estimates have large bias under heavy censoring, especially when the variances of the random effects are large. Ducrocq and Casella (1996) introduced a Bayesian approach that yields conservative estimates of the standard errors of the variance components. Ripatti and Palmgren (2000) proposed estimation based on penalized partial likelihood for the Cox proportional hazards model. Their approach allows for a more complex frailty dependence structure and the estimation procedure is simple, but it tends to underestimate the standard errors of the variance components. EM-algorithm based estimation approaches have been applied by several authors. Ripatti, Larsen and Palmgren (2002) developed an estimation procedure based on a Monte Carlo EM algorithm, but this approach is numerically intensive. Cortinas and Burzykowski (2004) proposed a modified EM algorithm, using a Laplace approximation in the E-step to simplify the estimation procedure. However, this approach also tends to underestimate the standard errors of the variance components.

Although the Cox model has been extensively applied in medical research, the assumption of proportional hazards is rather strong and may often be violated. A useful alternative to proportional hazards models is the class of accelerated failure time models. Accelerated failure time models use expansion and contraction of time scales to relate the lifetime distribution to the covariates. The distribution of the event times can be defined through the survivor function or the hazard function. In typical AFT models, the logarithms of the event times are assumed to be independently and identically drawn from some distribution such as the normal distribution (lognormal regression), the extreme value distribution (Weibull regression), or the logistic distribution (log-logistic regression). Chapman et al. (1992) applied four parametric survival models (exponential, Weibull, log-logistic, and lognormal) to prognostic factors in breast cancer and concluded that the lognormal model provided the best fit to the data.
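The time-scale interpretation of the three log-location-scale families named above can be illustrated with a short sketch. It assumes the log-linear form $\log T = x^{\top}\beta + \sigma\varepsilon$ with a standardized error $\varepsilon$; the function names `aft_quantile` and `acceleration_factor` and all numerical values are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist

# Quantile functions of the standardized error term for three AFT families.
ERROR_QUANTILE = {
    "lognormal": lambda p: NormalDist().inv_cdf(p),       # standard normal
    "weibull": lambda p: math.log(-math.log(1.0 - p)),    # standard (minimum) extreme value
    "loglogistic": lambda p: math.log(p / (1.0 - p)),     # standard logistic
}


def aft_quantile(p, x, beta, sigma, family):
    """p-th quantile of T when log T = x'beta + sigma * eps (illustrative)."""
    linear_predictor = sum(xk * bk for xk, bk in zip(x, beta))
    return math.exp(linear_predictor + sigma * ERROR_QUANTILE[family](p))


def acceleration_factor(x_new, x_ref, beta):
    """Factor multiplying every quantile of T when x_ref is changed to x_new."""
    return math.exp(sum((a - b) * bk for a, b, bk in zip(x_new, x_ref, beta)))


if __name__ == "__main__":
    beta, sigma = [1.0, 0.5], 0.8  # hypothetical intercept, slope and scale
    for family in ERROR_QUANTILE:
        t_ref = aft_quantile(0.5, [1.0, 0.0], beta, sigma, family)
        t_new = aft_quantile(0.5, [1.0, 1.0], beta, sigma, family)
        # The ratio of medians equals the acceleration factor in every family.
        print(family, t_new / t_ref, acceleration_factor([1.0, 1.0], [1.0, 0.0], beta))
```

In this sketch a covariate change rescales every quantile of the event time by the same factor, whichever error distribution is chosen; only the shape of the distribution around that scale differs between families.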
These models provide a wide variety of hazard function shapes, which can be further extended by using mixtures of distributions.

In this paper, we consider AFT models with random effects to allow for possible correlations among the survival times. The variability in survival times is generally modeled as arising from two different sources. The first is the usual variability associated with the baseline hazard function. The second is induced by variation in random effects and fixed covariates. Conditionally on the random effects, the survival times are often assumed to be statistically independent across observations in these random effects models. We propose an estimation procedure based on an approximate penalized log-likelihood, which is similar to that used by Breslow and Clayton (1993) for generalized linear mixed models with Gaussian random effects. Estimates of variance components can be used to assess the strength of association among event times within clusters. Under the proposed random effects models, the regression parameters $\beta$ express the effect of covariates both conditionally (given the random effects) and marginally (after integrating the random effects out). Keiding et al. (1997) reported that estimates of the regression parameters are robust against misspecification of the frailty distribution for Weibull AFT models. This finding is supported by the empirical results of Lambert et al. (2004) for AFT models with shared frailty.

The organization of the article is as follows. A description of the parametric accelerated failure time models with shared frailty is given in Section 2.1. Section 2.2 continues with an extension to AFT models with more general random effects. In Section 3.1, an estimation procedure for AFT models with random effects is introduced, and the asymptotic properties of the estimators are reviewed in Section 3.2. Section 4 is devoted to simulation studies that provide empirical validation of the estimation procedures. Section 5 summarizes the results and discusses some additional issues.

2 Accelerated Failure Time Models with Random Effects

In this paper, the data are assumed to consist of right-censored event time observations from $N$ clusters with $n_i$ observations from the $i$-th cluster. Let $T_{ij}^{*}$ represent the event time corresponding to the $j$-th ($j = 1, \ldots, n_i$) individual from the $i$-th cluster ($i = 1, \ldots, N$), and let $C_{ij}^{*}$ represent a corresponding censoring time that is independent of the event time.
Thus, the observed data consist of the observed follow-up time $T_{ij} = \min(T_{ij}^{*}, C_{ij}^{*})$ and a censoring indicator $\delta_{ij} = I(\{T_{ij}^{*} \le C_{ij}^{*}\})$, which is 1 if the individual is uncensored and 0 otherwise. In this setting, it is natural to assume that observations within a cluster will be correlated. In the literature, many authors have proposed using a shared frailty model to account for within-cluster dependences.

2.1 AFT Models with Shared Frailty

Shared frailty models are appropriate when observations within a cluster share a common unobservable frailty. In these models, each observation belongs to only one cluster, and frailties of different clusters are independent. Many different frailty distributions have been considered in generalizations of the Cox proportional hazards model that implement random effects: the gamma distribution (Clayton, 1991; Klein, 1992), the positive stable distribution (Hougaard, 1986a), the inverse Gaussian distribution (Hougaard, 1986b) and the lognormal distribution (McGilchrist and Aisbett, 1991). AFT models with shared frailty have also received some attention recently. Klein et al. (1999) considered a lognormal regression model with a shared lognormal frailty, and Pan (2001) explored AFT models with gamma frailty. Conditional on the frailty, within-cluster survival times are assumed to be independent. The AFT models with shared frailty can be expressed as a log-linear model for the logarithm of the event time as follows

$$\log T_{ij} = x_{ij}^{\top}\beta + b_i + \sigma\,\varepsilon_{ij} \tag{2}$$

where $\beta$ is a vector of fixed effects corresponding to the covariate vector $x_{ij}$, $\sigma$ is a scale parameter, the $\varepsilon_{ij}$'s are independent and identically distributed random errors, and the $b_i$'s are the cluster-specific random effects, which are assumed to be independent, identically distributed random variables with density function $p(b_i)$. In these models, frailty can be considered as an unobserved covariate that is additive on the log failure time scale and describes reduced or increased event times for different clusters. All observations within a cluster share a common unobserved random effect. AFT models with shared frailty specify a direct linear relationship between the log of the failure time and the covariates. The regression parameters can be intuitively interpreted with respect to the expected log of the failure time.
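The data structure described above, model (2) combined with right censoring, can be sketched by simulation. The example below is a minimal illustration under assumptions that are not from the paper: a normal random intercept (lognormal frailty), standard normal errors, one binary covariate, and exponential censoring times. The names `simulate_shared_frailty_aft` and `intraclass_correlation` are hypothetical.

```python
import math
import random


def simulate_shared_frailty_aft(n_clusters, cluster_size, beta, sigma, sd_b,
                                censor_rate, seed=0):
    """Simulate right-censored clustered data from
    log T* = beta[0] + beta[1] * x + b_i + sigma * eps (illustrative only)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_clusters):
        b_i = rng.gauss(0.0, sd_b)                 # frailty shared by the whole cluster
        for _ in range(cluster_size):
            x = rng.randint(0, 1)                  # hypothetical binary covariate
            log_t = beta[0] + beta[1] * x + b_i + sigma * rng.gauss(0.0, 1.0)
            t_star = math.exp(log_t)               # latent event time T*
            c_star = rng.expovariate(censor_rate)  # latent censoring time C*
            rows.append({
                "cluster": i,
                "x": x,
                "time": min(t_star, c_star),       # observed follow-up time T
                "delta": int(t_star <= c_star),    # 1 = uncensored, 0 = censored
            })
    return rows


def intraclass_correlation(sd_b, sigma):
    """Within-cluster correlation of log T* given covariates, for unit-variance eps."""
    return sd_b ** 2 / (sd_b ** 2 + sigma ** 2)


if __name__ == "__main__":
    data = simulate_shared_frailty_aft(50, 4, [1.0, 0.5], 0.8, 0.6, 0.1, seed=42)
    print(len(data), sum(r["delta"] for r in data))
    print(intraclass_correlation(0.6, 0.8))
```

Under these assumptions the shared term $b_i$ is the only source of within-cluster dependence, so the correlation of log event times within a cluster is the ratio of the frailty variance to the total variance and is never negative.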
However, the formulation based on the survivor function and hazard function is more convenient for the description in the next section. The survivor function for an AFT model at time $t$ has the form

$$\Pr(T_{ij} \ge t) = S^{*}\!\left[\left(\frac{t}{\psi_{ij}}\right)^{1/\sigma}\right] = S_0\!\left(\frac{\log t - \log \psi_{ij}}{\sigma}\right) \tag{3}$$

where $\sigma$ is the scale parameter, $S^{*}$ is a survivor function defined on $(0, \infty)$, $S_0$ is the baseline survivor function satisfying the relationship $S^{*}(\omega) = S_0(\log \omega)$, and $\psi_{ij}$ is some function of the covariates. One of the most common choices for AFT models with shared frailty is

$$\psi_{ij} = \exp\!\left(x_{ij}^{\top}\beta + b_i\right) \tag{4}$$

Some failure time distributions, such as the lognormal, Weibull, and log-logistic distributions, have the property that the log of the failure time has a location-scale distribution. Conditional on the random effects, the survivor function in (3) can be rewritten in the following form:

$$S_{ij}(t \mid b_i) = S_0\!\left(\frac{\log t - x_{ij}^{\top}\beta - b_i}{\sigma} \,\Big|\, b_i\right) \tag{5}$$

AFT models with shared frailty have some limitations. First, a shared frailty model forces the frailty to be the same for all the observations within a cluster. Clearly, there is a need for extensions of shared frailty models that incorporate more complicated frailty structures, e.g., one may wish to use a hierarchical nested frailty model. Another restriction is that shared frailty can only induce positive association within the cluster, which might not always reflect reality. To deal with more complex association structures, AFT models with random effects are proposed.

2.2 AFT Models with Random Effects

Given a $q$-dimensional vector of random effects $b_i$, the within-cluster event times are assumed independent. For the AFT models with random effects, the regression model in equation (2) can be extended as follows,

$$\log T_{ij} = x_{ij}^{\top}\beta + z_{ij}^{\top} b_i + \sigma\,\varepsilon_{ij} \tag{6}$$

The conditional survivor function of observation $j$ from cluster $i$ has the form

$$S_{ij}(t \mid b_i) = S_0\!\left(\frac{\log t - x_{ij}^{\top}\beta - z_{ij}^{\top} b_i}{\sigma} \,\Big|\, b_i\right) \tag{7}$$

where $S_0(\cdot)$ is the survivor function of $\varepsilon_{ij}$ and $\beta$ is a vector of fixed effects associated with a vector of covariates $x_{ij}$ measured on the $j$-th observation in the $i$-th cluster. We assume that the random effect $b_i$ follows a multivariate normal distribution with mean zero and covariance matrix $D_i(\theta)$, where $\theta$ is an unknown vector of parameters. The density function for $b_i$ is denoted by $p(b_i; D_i(\theta))$. With

$$\varepsilon_{ij} = \frac{\log T_{ij} - x_{ij}^{\top}\beta - z_{ij}^{\top} b_i}{\sigma},$$

the conditional survivor and hazard functions are

$$S_{ij}(t \mid b_i) = S_0(\varepsilon_{ij} \mid b_i) \tag{8}$$

$$h_{ij}(t \mid b_i) = \frac{1}{\sigma t}\, h_0(\varepsilon_{ij} \mid b_i) \tag{9}$$

respectively, where $h_0(\cdot)$ is the hazard function of $\varepsilon_{ij}$. Let $N$ denote the number of clusters and $n_i$ denote the sample size within the $i$-th cluster. If, conditional on the random effects, the censoring is assumed to be independent of survival, the conditional likelihood for the observed data is

$$L_c = \prod_{i=1}^{N} \prod_{j=1}^{n_i} \left[\frac{1}{\sigma t_{ij}}\, h_0(\varepsilon_{ij} \mid b_i)\right]^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i) \tag{10}$$

Integrating out the unobserved frailties $b_i$, the marginal likelihood function for all clusters can be expressed as

$$L_m = \prod_{i=1}^{N} \int \prod_{j=1}^{n_i} \left[\frac{1}{\sigma t_{ij}}\, h_0(\varepsilon_{ij} \mid b_i)\right]^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i)\; p(b_i; D_i)\, db_i \tag{11}$$

Our aim is to maximize this marginal likelihood with respect to the unknown parameters $\sigma$, $\beta$ and $\theta$ and to make inferences from the resulting estimates. The integral in (11) is multidimensional and will be difficult to evaluate analytically. Computationally intensive methods, such as MCMC methods or numerical integration, can be used to evaluate the exact log-likelihood numerically. However, these methods may not be feasible for large data sets with correlated observations. In this paper, we propose an approximate maximum likelihood estimation procedure derived from a Laplace approximation to the marginal likelihood.
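To make (10) and (11) concrete, the following sketch evaluates one cluster's conditional log-likelihood and its marginal likelihood by brute-force numerical integration. It is an illustration under simplifying assumptions that are not part of the paper: a scalar random intercept ($q = 1$, $z_{ij} = 1$), standard normal errors (a lognormal AFT model), and a plain trapezoidal rule. All function names and the grid settings are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def log_f0(eps):
    """Log density of a standard normal error."""
    return -0.5 * eps * eps - LOG_SQRT_2PI


def log_S0(eps):
    """Log survivor function of a standard normal error (floored to avoid log(0))."""
    return math.log(max(0.5 * math.erfc(eps / SQRT2), 1e-300))


def conditional_loglik(b, times, deltas, lin_preds, sigma):
    """Log of one cluster's factor in (10) for a scalar random intercept b.

    Uses the identity delta*log h0 + log S0 = delta*log f0 + (1 - delta)*log S0.
    """
    total = 0.0
    for t, d, lp in zip(times, deltas, lin_preds):
        eps = (math.log(t) - lp - b) / sigma
        if d == 1:
            total += log_f0(eps) - math.log(sigma * t)  # uncensored observation
        else:
            total += log_S0(eps)                        # right-censored observation
    return total


def marginal_loglik(times, deltas, lin_preds, sigma, var_b, n_grid=2001, width=8.0):
    """Log of one cluster's integral in (11), by the trapezoidal rule over b."""
    sd_b = math.sqrt(var_b)
    lo = -width * sd_b
    step = 2.0 * width * sd_b / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * step
        log_prior = -0.5 * b * b / var_b - LOG_SQRT_2PI - math.log(sd_b)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(
            conditional_loglik(b, times, deltas, lin_preds, sigma) + log_prior)
    return math.log(total * step)
```

A grid like this is workable for a scalar random effect, but its cost grows exponentially with the dimension $q$ of $b_i$, which is the practical motivation for an analytic approximation to the integral.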

3 Estimation

When the integration in equation (11) is analytically intractable, one option is to maximize an approximate likelihood obtained from the Laplace approximation to the integral. The Laplace approximation has been widely used to obtain approximate posterior distributions (Tierney and Kadane, 1986) and approximate likelihoods (Solomon and Cox, 1992; Shun and McCullagh, 1995). First partial derivatives of the approximated log-likelihood yield a set of estimating equations that produce consistent parameter estimates with large-sample normal distributions under relatively broad conditions.

3.1 Approximate Likelihood

To simplify the discussion, we restrict the $q$-dimensional vector $b_i$ to follow a multivariate normal distribution, as set forth by Ripatti and Palmgren (2000). Thus, we can use arbitrary covariance matrices and handle negative dependences within clusters. Following the application of the Laplace approximation for the generalized linear mixed model (Breslow and Clayton, 1993), an approximate integrated log-likelihood can be derived. We assume the conditional independence of the observations within a cluster given $b_i$. Then, up to a constant factor, the conditional likelihood for the $i$-th cluster is

$$L_i^{c}(b_i) = \prod_{j=1}^{n_i} \left[\frac{1}{\sigma}\, h_0(\varepsilon_{ij} \mid b_i)\right]^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i) \tag{12}$$

and the corresponding marginal likelihood is

$$L_i = \int L_i^{c}(b_i)\, p(b_i; D_i(\theta))\, db_i = (2\pi)^{-q/2}\, |D_i(\theta)|^{-1/2} \int L_i^{c}(b_i)\, e^{-\frac{1}{2} b_i^{\top} D_i(\theta)^{-1} b_i}\, db_i = (2\pi)^{-q/2}\, |D_i(\theta)|^{-1/2} \int e^{K_i(b_i)}\, db_i \tag{13}$$

where $K_i(b_i)$ is the penalized log-likelihood given by

$$K_i(b_i) = \log\!\left[L_i^{c}(b_i)\right] - \tfrac{1}{2}\, b_i^{\top} D_i(\theta)^{-1} b_i = -\sum_{j=1}^{n_i} \delta_{ij} \log \sigma + \sum_{j=1}^{n_i} \left[\delta_{ij} \log h_0(\varepsilon_{ij} \mid b_i) + \log S_0(\varepsilon_{ij} \mid b_i)\right] - \tfrac{1}{2}\, b_i^{\top} D_i(\theta)^{-1} b_i \tag{14}$$
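The role of the penalized log-likelihood in (14) can be sketched numerically. The example below assumes a scalar random intercept ($q = 1$) with variance `var_b` and standard normal errors (a lognormal AFT model); it maximizes $K_i(b)$ by Newton's method and then applies the standard Laplace formula to the integral in (13). The constant $-\log t_{ij}$ that (12) drops is kept here so that the result is a full log-likelihood. The function names are hypothetical, and this is a sketch under these assumptions rather than the authors' implementation.

```python
import math

SQRT2 = math.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def _survivor(eps):
    """Standard normal survivor function, floored to avoid division by zero."""
    return max(0.5 * math.erfc(eps / SQRT2), 1e-300)


def _hazard(eps):
    """Standard normal hazard function h0 = f0 / S0."""
    return math.exp(-0.5 * eps * eps - LOG_SQRT_2PI) / _survivor(eps)


def penalized_loglik(b, times, deltas, lin_preds, sigma, var_b):
    """K_i(b) with its first and second derivatives in b (scalar random intercept)."""
    value = -0.5 * b * b / var_b   # penalty term -b' D^{-1} b / 2
    grad = -b / var_b
    hess = -1.0 / var_b
    for t, d, lp in zip(times, deltas, lin_preds):
        eps = (math.log(t) - lp - b) / sigma
        if d == 1:   # uncensored: log f0(eps) - log(sigma * t)
            value += -0.5 * eps * eps - LOG_SQRT_2PI - math.log(sigma * t)
            grad += eps / sigma
            hess += -1.0 / sigma ** 2
        else:        # censored: log S0(eps)
            h = _hazard(eps)
            value += math.log(_survivor(eps))
            grad += h / sigma
            hess += -h * (h - eps) / sigma ** 2
    return value, grad, hess


def laplace_loglik(times, deltas, lin_preds, sigma, var_b, tol=1e-10, max_iter=50):
    """Laplace approximation to one cluster's log marginal likelihood, and the mode."""
    b = 0.0
    for _ in range(max_iter):
        _, grad, hess = penalized_loglik(b, times, deltas, lin_preds, sigma, var_b)
        step = grad / hess
        b -= step                  # Newton step towards the maximizer of K_i
        if abs(step) < tol:
            break
    value, _, hess = penalized_loglik(b, times, deltas, lin_preds, sigma, var_b)
    # (2*pi)^{q/2} from the Laplace formula cancels (2*pi)^{-q/2} in (13).
    return value - 0.5 * math.log(var_b) - 0.5 * math.log(-hess), b
```

When every observation in the cluster is uncensored, $K_i$ is exactly quadratic in $b$ under these assumptions, so the Laplace value coincides with the exact log marginal likelihood; with censored observations it is an approximation whose quality depends on how close $K_i$ is to a quadratic near its maximum.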

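Although the penalized log-likelihood is a function of all unknown parameters, we simplify the notation to $K_i(b_i)$ in the following derivation. Writing the contribution to the marginal likelihood from the $i$-th cluster in the form of (13) with $K_i(b_i) = n_i\,[K_i(b_i)/n_i]$, we can apply the Laplace method for integral approximation. The Laplace method is a family of asymptotic methods used to approximate integrals of the form $\int e^{m\,l(b)}\, db$ (see Appendix). The approximation is given by

$$\int e^{m\,l(b)}\, db \approx (2\pi)^{q/2}\, \left| -m\, l''(\tilde{b}) \right|^{-1/2} e^{m\,l(\tilde{b})} \tag{15}$$

where $b$ is a $q$-dimensional vector and $\tilde{b}$ denotes the solution to the equations obtained by setting the first partial derivatives of $m\,l(b)$ with respect to $b$ equal to zero. Therefore, the contribution of the $i$-th cluster to the overall log marginal likelihood can be approximated as

$$l_i^{*}(\tilde{b}_i) = K_i(\tilde{b}_i) - \tfrac{1}{2} \log |D_i(\theta)| - \tfrac{1}{2} \log \left| -K_i''(\tilde{b}_i) \right| \tag{16}$$

The order of accuracy associated with the Laplace approximation is $O(n_i^{-1})$. Let $\tilde{b}$ denote the vector obtained by stacking the $\tilde{b}_i$ vectors for all clusters. The covariance matrix, $D(\theta)$, can capture the structure of within-cluster dependence and between-cluster heterogeneity. Here $\theta$ is a vector of unknown parameters, which do not depend on $\beta$. Across all clusters, the approximate log marginal likelihood is given by

$$l^{*}(\tilde{b}) = \sum_{i=1}^{N} \left( K_i(\tilde{b}_i) - \tfrac{1}{2} \log |D_i(\theta)| - \tfrac{1}{2} \log \left| -K_i''(\tilde{b}_i) \right| \right) \tag{17}$$

Alternatively, the approximate log marginal likelihood can be rewritten as

$$l^{*}(\tilde{b}) = K(\tilde{b}) - \tfrac{1}{2} \log |D(\theta)| - \tfrac{1}{2} \log \left| -K''(\tilde{b}) \right| \tag{18}$$

where $\tilde{b}$ is a function of all unknown parameters $(\beta, \sigma, \theta)$, and $K(\tilde{b})$ is the penalized log-likelihood given by

$$K(\tilde{b}) = -r \log \sigma + \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left[\delta_{ij} \log h_0(\varepsilon_{ij}) + \log S_0(\varepsilon_{ij})\right] - \tfrac{1}{2}\, \tilde{b}^{\top} D(\theta)^{-1} \tilde{b},$$

with $\varepsilon_{ij}$ evaluated at $\tilde{b}$ and, by comparison with (14), $r = \sum_{i}\sum_{j} \delta_{ij}$ the total number of uncensored observations. $K''(\tilde{b})$ denotes the matrix of second partial derivatives of $K(b)$ with respect to $b$ evaluated at $\tilde{b}$, given by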

$$K''(\tilde{b}) = \frac{1}{\sigma^{2}} \sum_{i=1}^{N} \sum_{j=1}^{n_i} z_{ij} z_{ij}^{\top} \left[\delta_{ij}\, \frac{\partial^{2} \log h_0(\varepsilon_{ij})}{\partial \varepsilon_{ij}^{2}} + \frac{\partial^{2} \log S_0(\varepsilon_{ij})}{\partial \varepsilon_{ij}^{2}}\right] - D(\theta)^{-1} \tag{19}$$

3.2 Asymptotic Properties of Laplace-Based Estimation

Maximizing the approximate log-likelihood obtained using the Laplace method results in approximate maximum likelihood estimation. The corresponding estimates differ from those obtained by maximizing the true likelihood and are not necessarily consistent. However, the estimates are shown to be consistent under some conditions, and the rate of convergence depends on both the number of clusters and the cluster sizes. Also, under some regularity conditions, we can establish the asymptotic normal distribution of the estimated parameters.

3.2.1 Consistency of the Laplace-Based Estimator

The Laplace approximation is applied to the random effects of the integrated likelihood for each cluster. This approach allows the random effects to have a $q$-dimensional distribution within each cluster and to be correlated. Let $\gamma = (\sigma, \beta)$. Up to a constant, the $i$-th cluster's contribution to the overall log-likelihood (see Appendix) is equivalent to

$$l_i(\gamma) = -\tfrac{1}{2} \log |D_i(\theta)| - \tfrac{1}{2} \log \left| -K_i''(\gamma) \right| + K_i(\gamma) + O_p(n_i^{-1}) \tag{20}$$

where

$$K_i(\gamma) = -\sum_{j=1}^{n_i} \delta_{ij} \log \sigma + \sum_{j=1}^{n_i} \left[\delta_{ij} \log h_0(\varepsilon_{ij}) + \log S_0(\varepsilon_{ij})\right] - \tfrac{1}{2}\, \tilde{b}_i^{\top} D_i(\theta)^{-1} \tilde{b}_i$$

is the penalized log-likelihood. Let $l(\gamma)$ denote $\sum_{i=1}^{N} l_i(\gamma)$. Here, we assume homogeneous cluster sizes for convenience. Up to a constant, the true log-likelihood with respect to $\gamma$ can be written as

$$l(\gamma) = l^{*}(\gamma) + O_p(N n^{-1}) \tag{21}$$

where $l^{*}(\gamma) = \sum_{i=1}^{N} \left( -\tfrac{1}{2} \log |D_i(\theta)| - \tfrac{1}{2} \log \left| -K_i''(\gamma) \right| + K_i(\gamma) \right)$, $N$ is the number of clusters and $n$ is the common cluster size. For fixed $q$, the omitted terms in the approximation of the log-likelihood are of order $O_p(N n^{-1})$. A more accurate approximation could be obtained by using higher-order terms in the expansion of the logarithm of the integrand (see Appendix).

Let $n = O_p(N^{\alpha})$ for $\alpha > 1$, so that by (21) the error of the Laplace approximation to the marginal log-likelihood is approximately $O_p(N^{1-\alpha}) = o_p(1)$. That is, the error of the Laplace approximation to the marginal log-likelihood is $o_p(1)$ if the cluster size, $n$, grows faster than the number of clusters, $N$. Then $l^{*}(\gamma)$ converges to $l(\gamma)$. The consistency of the Laplace-based maximum likelihood estimator can be established by arguments similar to those used by Vonesh (1996) for nonlinear mixed-effects models. The following conditions are assumed:

(i) $b_i$ and $\varepsilon_{ij}$ are independent of one another.

(ii) Let $l(\gamma)$, the true but unspecified log marginal likelihood function, satisfy the following regularity conditions:

C1: The distributions of log-event times have common support for all $\gamma \in \mathrm{B}$, where $\mathrm{B}$ is the parameter space for $\gamma$.

C2: There exists an open subset $\omega$ of $\mathrm{B}$ containing the true parameter point $\gamma_T$ such that $l(\gamma)$ is three times differentiable as a function of $\gamma$ for all $\gamma \in \omega$.

C3: $E[l'(\gamma_T)] = 0$ and $-\frac{1}{Nn}\,[l''(\gamma)] \to I(\gamma)$, where the Fisher information matrix, $I(\gamma)$, is finite and positive definite for all $\gamma \in \omega$.

(iii) The fifth-order derivatives of $l_i(\gamma)$ exist and are continuous in an open neighborhood of $\gamma_T$ for all clusters.

(iv) Let $\|A\|$ be the Euclidean norm of a matrix $A$ and assume that $E\,\big\| \big(l_i'(\gamma_T) - E[l_i'(\gamma_T)]\big) / \sqrt{n} \big\|^{2+\delta} < \Delta < \infty$ for some $\Delta > 0$ and $\delta > 0$ and for all $i \le N$. Let $B_{N,n}(\gamma_T) = \frac{1}{N} \sum_{i=1}^{N} \operatorname{var}\!\big(l_i'(\gamma_T)\big) / n$ and assume that $B(\gamma_T) = \lim_{N, n \to \infty} B_{N,n}(\gamma_T)$ is positive definite with minimum eigenvalue $\lambda_{\min} > 0$.
