
A NOTE ON BAYESIAN ANALYSIS OF THE POLY-WEIBULL MODEL

F. Louzada-Neto, University of Oxford
and
A. C. Davison, Swiss Federal Institute of Technology

March 10, 1998

Summary

We consider approximate Bayesian analysis for the poly-Weibull model, which arises in competing-risks scenarios when the risks are independent and there is no knowledge about which factor was responsible for failure. Real and generated datasets illustrate the methodology, which is based on Laplace's method for integrals.

Some key words: Approximate Bayesian analysis; bi-Weibull hazard model; competing risks; Laplace's method for integrals; Weibull distribution.

Author's footnote: F. Louzada-Neto is a research student at the University of Oxford, Department of Statistics, 1 South Parks Road, Oxford OX1 3TG, UK, and A. C. Davison is Professor of Statistics, Swiss Federal Institute of Technology, Department of Mathematics, 1015 Lausanne, Switzerland. F. Louzada-Neto is grateful to the Brazilian institutions UFSCar and CNPq for their financial support.

1 INTRODUCTION

In a recent paper Berger and Sun (1993) describe Bayesian analysis via the Gibbs sampler for a poly-Weibull model, which arises in scenarios of competing risks (Cox and Oakes, 1984, Ch. 7) when only the minimum of several Weibull failure times is observed. The advantage of the poly-Weibull model in relation to the standard Weibull model is that it allows not only increasing, constant or decreasing hazard functions, but also nonmonotone ones. This is an attractive property since such hazards are not uncommon in practice.

The purpose of this paper is to point out that iterative simulation is not necessary for Bayesian analysis of poly-Weibull models, which can be dealt with more simply using Laplace's method (Tierney and Kadane, 1986) or the Bayesian bootstrap (Smith and Gelfand, 1992). This reduces considerably the computational burden in obtaining posterior probabilities, densities, or moments, and makes possible the use of a wider class of prior densities because there is no need to cast the problem in the Gibbs sampler framework. By the same token, and despite the comments in Berger and Sun (1993, Section 1.3), standard likelihood analysis of the model is entirely straightforward, the only difficulty arising because of the possible non-identifiability of the model, which we discuss in the next section.

Model formulation is discussed in Section 2, where we briefly discuss the identifiability of the parameters, and the use of Laplace's method is outlined in Section 3. In Section 4, the methodology is illustrated with datasets, two generated and one real, and an alternative analysis via the Bayesian bootstrap is briefly outlined.

2 MODEL FORMULATION

Suppose that an individual or system is composed of $m \ge 2$ elements, and that the lifetime $X_j$ of the $j$th element has hazard function $h_j(t)$. The quantity observed is $T = \min(X_1, \ldots, X_m)$, which is said to have a poly-Weibull model if its hazard function is given by

$$h(t) = \sum_{j=1}^{m} h_j(t) = \sum_{j=1}^{m} \frac{\beta_j t^{\beta_j - 1}}{\theta_j^{\beta_j}}, \qquad t > 0, \tag{1}$$

where $\theta_j, \beta_j > 0$ are parameters associated with the $j$th element.
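The hazard (1) and the definition $T = \min(X_1, \ldots, X_m)$ can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names are hypothetical and not part of the original analysis, and the parameter values in the usage lines are those quoted for the left panel of Figure 1.

```python
import random


def poly_weibull_hazard(t, thetas, betas):
    """Hazard (1): sum over j of beta_j * t**(beta_j - 1) / theta_j**beta_j."""
    return sum(b * t ** (b - 1.0) / th ** b for th, b in zip(thetas, betas))


def simulate_poly_weibull(n, thetas, betas, rng):
    """Draw n lifetimes T = min(X_1, ..., X_m), X_j Weibull with scale theta_j, shape beta_j.

    random.weibullvariate(alpha, beta) takes the scale first and the shape second.
    """
    return [
        min(rng.weibullvariate(th, b) for th, b in zip(thetas, betas))
        for _ in range(n)
    ]


# Illustrative usage (hypothetical seed; parameters as for the left panel of Figure 1).
rng = random.Random(1)
thetas, betas = (1000.0, 3200.0), (0.5, 5.0)
sample = simulate_poly_weibull(5, thetas, betas, rng)
hazards = [poly_weibull_hazard(t, thetas, betas) for t in (10.0, 1000.0, 4000.0)]
```

With one shape parameter below one and one above, the first term of the hazard decreases in $t$ while the second increases, which is how the nonmonotone hazards mentioned in Section 1 can arise.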
This arises if the individual elements $X_j$ have independent Weibull distributions with parameters $\theta_j$ and $\beta_j$. An identifiability problem arises if the shape parameters $\beta_1, \ldots, \beta_m$ are equal to a common value $\beta$, for then (1) can be written as $\beta t^{\beta - 1} \theta^{-\beta}$, which is the hazard function of a single Weibull random variable; in this case $\theta^{-\beta} = \sum_{j=1}^{m} \theta_j^{-\beta}$. The cause of the problem is that a redundant vector of parameters has been kept in the parametrization (Bernardo and Smith, 1994, p. 239). In practice the parameters are only identifiable when the risk factors have sufficiently separated hazard functions. We have found that graphical goodness-of-fit methods such as hazard plots (Lawless, 1982, pp. 88, 278) are very useful for checking when poly-hazard models are likely to be identifiable.

To illustrate this, Figure 1 shows hazard plots for two datasets generated from a poly-Weibull model with $m = 2$ elements; we call this a bi-Weibull model. The left panel of the figure shows a clear departure from the straight line that corresponds to a single Weibull distribution, but this is not the case for the sample in the right panel, which was simulated by Berger and Sun (1993) from a bi-Weibull model. This suggests nonidentifiable parameters or parameters that are only identifiable through functions $q(\theta, \beta)$ (Basu and Klein, 1982), where $\theta = (\theta_1, \ldots, \theta_m)^T$ and $\beta = (\beta_1, \ldots, \beta_m)^T$.

Figure 1: Hazard plots ($\log H(t)$ against $\log t$) for simulated bi-Weibull data. Left panel: data generated with $\theta_1 = 1000$, $\beta_1 = 0.5$, $\theta_2 = 3200$ and $\beta_2 = 5$. Right panel: data generated with $\theta_1 = 1000$, $\beta_1 = 0.6$, $\theta_2 = 3200$ and $\beta_2 = 1.6$ (Berger and Sun, 1993, p. 1416). The dotted lines represent maximum likelihood fits of a single Weibull distribution.

3 ESTIMATION

Consider a sample of independent positive random variables $T_1, \ldots, T_n$ with common hazard function $h(t)$, and such that $T_i$ has an associated indicator variable $\delta_i$ defined by $\delta_i = 1$ if $T_i$ is an observed failure time and $\delta_i = 0$ if $T_i$ is a right-censored observation. Up to an additive constant, the log likelihood function for the parameters of any set of randomly-censored survival data from this model, $(t_1, \delta_1), \ldots, (t_n, \delta_n)$, may be expressed as

$$\log L(\theta, \beta) = \sum_{i=1}^{n} \left\{ \delta_i \log h(t_i) - H(t_i) \right\}, \tag{2}$$

where $H(t)$ is the cumulative hazard function $H(t) = \int_0^t h(u)\, du$. On substituting the poly-Weibull hazard (1) into (2) we see that the log likelihood function for the parameter vectors $\theta$, $\beta$ is

$$\log L(\theta, \beta) = \sum_{i=1}^{n} \left\{ \delta_i \log A_i - \sum_{j=1}^{m} \left( \frac{t_i}{\theta_j} \right)^{\beta_j} \right\}, \tag{3}$$

where $A_i = \sum_{j=1}^{m} \beta_j t_i^{\beta_j - 1} \theta_j^{-\beta_j}$. Contrary to the discussion in Sections 1.3 and 1.4 of Berger and Sun (1993) there is no difficulty in direct calculation of (3), which simply involves two nested do-loops and can be performed in a single S-Plus statement, and consequently there is no obstacle to standard frequentist likelihood analysis of the model, provided its parameters are identifiable. Tests for non-identifiability could be developed, but seem unlikely to be necessary in most applications, where plots like those in Figure 1 will suffice to detect potential problems with likelihood analysis.

The maximum likelihood estimates of $\theta$ and $\beta$ can be obtained by solving the system of $2m$ nonlinear equations

$$\sum_{i=1}^{n} \left( \frac{t_i}{\beta_r} - \frac{\delta_i}{A_i} \right) g_{ir} = 0, \qquad r = 1, \ldots, m, \tag{4}$$

$$\sum_{i=1}^{n} \left[ \frac{\delta_i g_{ir}}{A_i} \left\{ \beta_r^{-1} + \log(t_i/\theta_r) \right\} - \frac{t_i}{\beta_r}\, g_{ir} \log(t_i/\theta_r) \right] = 0, \qquad r = 1, \ldots, m, \tag{5}$$

where $g_{ir} = \beta_r \theta_r^{-\beta_r} t_i^{\beta_r - 1}$, so that $A_i = \sum_{r=1}^{m} g_{ir}$.

From a frequentist viewpoint interval inference for the parameters is probably best conducted using their profile log likelihoods, as experience suggests that the asymptotic approximations to the distributions of likelihood ratio statistics are likely to be more accurate in small samples than are asymptotic approximations to maximum likelihood estimators. The parametric bootstrap can be applied to avoid dependence on asymptotic approximations when samples are small; see Davison and Hinkley (1997, p. 346) for a related example. As the methods involved are entirely standard, we turn to Bayesian inference rather than consider them further.

When there is no strong prior information, the simplest possibility is to assume a constant prior density for $(\log\theta, \log\beta)$, in which case the joint log posterior density for $\theta$ and $\beta$ is of course (3), apart from the normalizing constant. This also corresponds to a Jeffreys prior for the parameters of the separate Weibull distributions, because the log Weibull distribution is a location-scale family with location parameter $\log\theta$ and scale parameter $\beta^{-1}$. Unfortunately this results in an improper posterior distribution, and less vague prior information must be employed.
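The remark that (3) needs only two nested loops can be made concrete as follows. This is a minimal sketch in Python rather than S-Plus, and the function name and argument layout are assumptions made for illustration; it is not the authors' code.

```python
import math


def poly_weibull_loglik(times, deltas, thetas, betas):
    """Log likelihood (3) for right-censored data under the poly-Weibull model.

    times  : observed times t_i
    deltas : indicators, 1 for an observed failure and 0 for right-censoring
    thetas : theta_1, ..., theta_m
    betas  : beta_1, ..., beta_m
    """
    total = 0.0
    for t, d in zip(times, deltas):          # outer loop over observations i
        a_i = 0.0                            # A_i, the hazard at t_i
        cum_hazard = 0.0                     # H(t_i) = sum_j (t_i / theta_j)**beta_j
        for th, b in zip(thetas, betas):     # inner loop over risks j
            a_i += b * t ** (b - 1.0) * th ** (-b)
            cum_hazard += (t / th) ** b
        total += d * math.log(a_i) - cum_hazard
    return total


# Illustrative call with made-up data: two failures and one censored time.
value = poly_weibull_loglik([120.0, 340.0, 500.0], [1, 1, 0],
                            (1000.0, 3200.0), (0.5, 5.0))
```

In practice one would pass the negative of this function, reparametrized in terms of $(\log\theta, \log\beta)$, to a general-purpose optimizer instead of solving (4) and (5) directly; that choice is a suggestion here, not something prescribed above.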
In the examples below we used independent normal distributions with means at the maximum likelihood estimates and standard deviations 100 as priors for the parameters $\log\theta$ and $\log\beta$. This is purely intended to represent very weak but proper prior information; of course in an application one would not use a data-based prior such as this.

With the understanding that $\log L'(\theta, \beta)$ henceforth denotes the sum of (2) and the log prior density, and using Laplace's method for approximation of the two integrals involved in the marginal density of $(\theta_r, \beta_r)$ (Tierney and Kadane, 1986; Kass, Tierney and Kadane, 1990), the approximate log joint posterior density for $(\theta_r, \beta_r)$, for $r = 1, 2, \ldots, m$, is

$$\log L'(\tilde\theta, \tilde\beta) - \tfrac{1}{2} \log D_{-r} - \left\{ \log L'(\hat\theta, \hat\beta) - \tfrac{1}{2} \log \hat D \right\} - \log(2\pi). \tag{6}$$

Here $\tilde\theta = (\theta_r, \hat\theta_{-r})$ and $\tilde\beta = (\beta_r, \hat\beta_{-r})$ are the solutions of the analogues of (4) and (5) with $(\theta_r, \beta_r)$ held fixed, $D_{-r}$ is the determinant of the Hessian matrix of minus the log joint posterior density for $\theta_{-r}$ and $\beta_{-r}$ evaluated at $\tilde\theta$ and $\tilde\beta$, and the corresponding quantities maximizing over all the parameters are $\hat\theta$, $\hat\beta$ and $\hat D$.

Expression (6) must be evaluated over a grid of values of $(\theta_r, \beta_r)$, and then contoured, but the effort involved is much less than using Markov chain Monte Carlo methods, and there is no need to restrict the prior to particular classes chosen so that the necessary conditional distributions are easy to simulate from. Laplace's method may fail if the posterior density is seriously multi-modal, but we have not encountered this problem in any of the examples to which we have applied the approximation. If genuine prior knowledge about the parameters is available in the form of a density $\pi(\theta, \beta)$, it can be incorporated into the analysis in the obvious way. Improper posteriors can be observed when ratios of the $\beta$'s are near one, a phenomenon related to the near-non-identifiability of the parameters in that situation. This can be alleviated if there is strong prior information on the parameters (Bernardo and Smith, 1994, p. 239). Berger and Sun (1993) outline methods of eliciting the parameters linked with the prior.

4 EXAMPLES

We now test the methodology proposed in this paper on some datasets. In each case we use the parametrization $(\log\theta, \log\beta)$ for the Laplace approximation. This has the benefits that the ranges of the log-transformed parameters are unbounded, and that the Hessian matrix of the log posterior is roughly independent of the parameters.

For the data in the left panel of Figure 1 the joint posterior modes for $\theta_1$ and $\beta_1$, and for $\theta_2$ and $\beta_2$, with the prior outlined in the previous section, are $(\tilde\theta_1, \tilde\beta_1) = (1304, 0.48)$ and $(\tilde\theta_2, \tilde\beta_2) = (3236, 4.88)$.
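As an aside on the mechanics of expression (6) in Section 3, the following toy sketch applies the same recipe to a two-parameter problem where the answer is known exactly. Everything in it is an assumption made for illustration: the log posterior is a bivariate normal rather than a poly-Weibull posterior, derivatives are taken by finite differences, and the function names are hypothetical. Because the parameter of interest is one-dimensional here, the final constant is $\tfrac{1}{2}\log(2\pi)$ rather than the $\log(2\pi)$ that appears in (6) for the pair $(\theta_r, \beta_r)$.

```python
import math


def log_post(x, y, rho=0.6):
    """Toy unnormalised log posterior: bivariate normal, unit variances, correlation rho."""
    return -0.5 * (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho * rho)


def gradient(f, x, y, h=1e-3):
    """Central-difference gradient of f at (x, y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))


def hessian(f, x, y, h=1e-3):
    """Central-difference second derivatives (fxx, fxy, fyy) of f at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return fxx, fxy, fyy


def joint_mode(f, x=0.5, y=-0.5, steps=20):
    """Newton iterations over both parameters; returns (x_hat, y_hat, D_hat)."""
    for _ in range(steps):
        gx, gy = gradient(f, x, y)
        fxx, fxy, fyy = hessian(f, x, y)
        det = fxx * fyy - fxy ** 2
        x -= (fyy * gx - fxy * gy) / det
        y -= (fxx * gy - fxy * gx) / det
    fxx, fxy, fyy = hessian(f, x, y)
    # For a 2 x 2 matrix the determinant of the Hessian of -f equals that of f.
    return x, y, fxx * fyy - fxy ** 2


def conditional_mode(f, x, y, steps=20):
    """Newton iterations in the nuisance parameter y with x fixed; returns (y_tilde, D_minus)."""
    for _ in range(steps):
        _, gy = gradient(f, x, y)
        _, _, fyy = hessian(f, x, y)
        y -= gy / fyy
    _, _, fyy = hessian(f, x, y)
    return y, -fyy


def laplace_log_marginal(f, x, mode):
    """One-dimensional analogue of (6): approximate log marginal posterior density of x."""
    x_hat, y_hat, d_hat = mode
    y_tilde, d_minus = conditional_mode(f, x, y_hat)
    return (f(x, y_tilde) - 0.5 * math.log(d_minus)
            - (f(x_hat, y_hat) - 0.5 * math.log(d_hat))
            - 0.5 * math.log(2.0 * math.pi))


mode = joint_mode(log_post)
grid = [-2.0 + 0.5 * k for k in range(9)]
approx = [laplace_log_marginal(log_post, x, mode) for x in grid]
```

For this Gaussian toy problem the approximation reproduces the standard normal log density of $x$, which gives a convenient check; for a poly-Weibull posterior the same steps would be repeated over a grid of $(\log\theta_r, \log\beta_r)$ and the results contoured.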
For the same data (left panel of Figure 1), the approximate 95% highest posterior density (HPD) intervals for $\theta_1$, $\beta_1$, $\theta_2$ and $\beta_2$ are $(657, 3191)$, $(0.34, 0.69)$, $(2400, 4023)$, and $(1.58, 13.39)$, respectively. The Schwarz criterion for comparison of the two models, 3.33, gives an approximate Bayes factor of 27.9, which according to the rough classification in Section 3.2 of Kass and Raftery (1995) is 'strong' evidence in favour of the bi-Weibull model.

For the data in the right panel of Figure 1, the HPD interval for $\theta_2$ is $(1636, 50000)$ using our choice of prior; here we give 50000 as a likely upper bound for an exact value that is difficult to establish. In practice this would be regarded as an uninformative posterior inference for $\theta_2$, and this is related to the identifiability problem pointed out in Section 2. However, if we use the same priors as Berger and Sun (1993), the joint posterior modes using Laplace approximation are $(\tilde\theta_1, \tilde\beta_1) = (866, 0.50)$ and $(\tilde\theta_2, \tilde\beta_2) = (3464, 1.92)$, and approximate 95% HPD intervals for $\theta_1$, $\beta_1$, $\theta_2$, and $\beta_2$ are $(508, 1894)$, $(0.44, 0.56)$, $(2513, 7090)$, and $(1.76, 2.03)$, respectively, similar to the results of Berger and Sun (1993). This suggests that their results depend crucially on the choice of prior, because the parameters cannot be identified from the data, as Figure 1 shows. Further evidence of this is given by the Schwarz criterion for comparison of the models, which is $-2.85$, corresponding to a rough Bayes factor of 17.3 in favour of the single Weibull model; this is 'positive' evidence in terms of the classification mentioned above. Of course this evidence is crude, but more detailed calculations depend crucially on the choice of prior. It is no accident that the single Weibull model is adequate in this case: data sets simulated using the parameter values of Berger and Sun are effectively samples from a single Weibull distribution, so that any information that they are in fact bi-Weibull must come from the prior. The prior used by Berger and Sun effectively rules out the single Weibull model, thereby injecting strong information that is not corroborated by the data.

For a real example, we consider data from Table 1 of Lagakos and Louis (1988) on the survival of 50 male rats receiving 60 mg/kg of toluene diisocyanate. The rats may have died from side-effects of the drug, as well as from the tumours of the testis it was administered to prevent, but the cause of death is unknown. The hazard plot in the upper left panel of Figure 2 suggests that a single Weibull model will be inadequate, and that different risk factors may be at work. The lower panels show the approximate contour plots for the log joint posterior density (6) for $(\log\theta_1, \log\beta_1)$, and for $(\log\theta_2, \log\beta_2)$, assuming no prior information about the parameters. The joint posterior modes are $(\tilde\theta_1, \tilde\beta_1) = (122, 0.84)$ and $(\tilde\theta_2, \tilde\beta_2) = (107.6, 6.5)$. Approximate 95% HPD intervals for $\theta_1$ and $\beta_1$, and $\theta_2$ and $\beta_2$ are $(78.0, 256.7)$ and $(0.55, 1.17)$, and $(98.8, 122.6)$ and $(3.28, 12.1)$, respectively.
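The Bayes factors quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that the rough Bayes factor is the exponential of the absolute value of the Schwarz criterion, a reading inferred from the reported pairs (3.33 with 27.9, and $-2.85$ with 17.3) rather than stated explicitly, and it uses the categories for $2\log B$ given by Kass and Raftery (1995). The function names are hypothetical; the value 3.45 is the Schwarz criterion reported for the rat data.

```python
import math


def approx_bayes_factor(schwarz):
    """Rough Bayes factor for the model favoured by the sign of the criterion.

    Assumption: Bayes factor = exp(|S|), inferred from the figures quoted in the text.
    """
    return math.exp(abs(schwarz))


def kass_raftery_label(bayes_factor):
    """Evidence category based on 2 log B (Kass and Raftery, 1995, Section 3.2)."""
    two_log_b = 2.0 * math.log(bayes_factor)
    if two_log_b < 2.0:
        return "not worth more than a bare mention"
    if two_log_b < 6.0:
        return "positive"
    if two_log_b < 10.0:
        return "strong"
    return "very strong"


for s in (3.33, -2.85, 3.45):
    bf = approx_bayes_factor(s)
    print(f"S = {s:+.2f}: Bayes factor ~ {bf:.1f}, evidence: {kass_raftery_label(bf)}")
```

A positive criterion is read here as favouring the bi-Weibull model and a negative one as favouring the single Weibull model, matching the way the values are interpreted in the text.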
For the rat data the Schwarz criterion equals 3.45, giving 'strong' evidence for the bi-Weibull model. This is borne out by comparison of the Kaplan–Meier estimate and the modal fitted models, shown in the upper right panel of Figure 2.

For a limited assessment of the numerical accuracy of Laplace approximation in this context, we estimated the survivor function for the rat data using sampling-importance resampling. We generated 10,000 values of the parameters from a 4-variate normal distribution estimated from the Laplace approximation to the posterior density, and then used a Bayesian bootstrap (Smith and Gelfand, 1992) to obtain a sample of size 1000 from the true posterior distribution. This is easily achieved because the prior and likelihood are readily calculated. We then used this sample to give the expected marginal posterior survivor function for the rat data, which is shown as a dashed line in the upper right panel of Figure 2. The curve is very close to the result of the Laplace approximation, which seems adequate for practical use in this setting. We would not expect these curves to be equal, as sampling-importance resampling gives the expected posterior survivor function, whereas our application of Laplace's method gives the modal posterior survivor function. A large number of applications of Laplace's method would be required to approximate the expected posterior survivor function, which should not differ from the modal value by much. The difference illustrates an interesting facet of the two approaches, that modal quantities are difficult to obtain by simulation methods, but relatively easily obtained using asymptotic approximation.

REFERENCES

Basu, A. and Klein, J. (1982). Some recent developments in competing risks theory, in Crowley, J. and Johnson, R. A. (eds), Survival Analysis. Hayward, CA: Institute of Mathematical Statistics, 216–229.

Berger, J. O. and Sun, D. (1993). Bayesian analysis for the poly-Weibull distribution. Journal of the American Statistical Association 88, 1412–1417.

Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. New York: Wiley.

Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data. London: Chapman & Hall.

Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association 90, 773–795.

Kass, R. E., Tierney, L. and Kadane, J. B. (1990). The validity of posterior expansions based on Laplace's method, in S. Geisser, J. S. Hodges, S. J. Press and A. Zellner (eds), Bayesian and Likelihood Methods in Statistics and Econometrics: Essays in Honor of George A. Barnard. Amsterdam: North-Holland, 473–488.

Lagakos, S. W. and Louis, T. A. (1988). Use of tumour lethality to interpret tumorigenicity experiments lacking cause-of-death data. Applied Statistics 37, 169–179.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. New York: Wiley.

Smith, A. F. M. and Gelfand, A. E. (1992). Bayesian statistics without tears: A sampling-resampling perspective. The American Statistician 46, 84–88.

Tierney, L. and Kadane, J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association 81, 82–86.

Figure 2: Results for fits to rat data (Lagakos and Louis, 1988). Upper left: hazard plot ($\log H(t)$ against $\log t$) with fitted single Weibull model shown by the dotted line. Upper right: Kaplan–Meier estimate of the survivor function $S(t)$ (jagged solid), with posterior modal bi-Weibull survivor function obtained using Laplace approximation (smooth solid) and posterior expected bi-Weibull survivor function obtained using sampling-importance resampling (dashes), and fit of single Weibull (dots). Lower panels: approximate 50, 70, 90 and 95% contours for the log joint posterior densities for $(\log\theta_1, \log\beta_1)$ (left) and for $(\log\theta_2, \log\beta_2)$ (right).
