Power-Law Adjusted Failure-Time Models

Proceedings of the World Congress on Engineering 2012 Vol I, July 4-6, 2012, London, U.K.

William J. Reed

Abstract: A simple adjustment to parametric failure-time distributions, which allows for much greater flexibility in the shape of the hazard-rate function, is considered. Analytical expressions for the distributions of the power-law adjusted Weibull, gamma, log-gamma, generalized gamma, lognormal and Pareto distributions are given. Most of these allow for bathtub-shaped and other multimodal forms of the hazard rate. The new distributions are fitted to real failure-time data which exhibit a multimodal hazard-rate function and the fits are compared.

Index Terms: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.

I. INTRODUCTION

Parametric distributions play an important role in the analysis of lifetime data, especially in accelerated failure time (AFT) regression models. Generally speaking, analysis based on a parametric model will be more precise than that based on a nonparametric or semiparametric model, because it will have fewer unknown parameters. However this is contingent on it being possible to find a suitable parametric model to fit the data. Unfortunately, for most of the common distributions employed there is very little flexibility in the shape of the hazard rate function. In particular, none of the two-parameter distributions customarily employed can be used to model a bathtub-shaped hazard. There are a number of three-parameter distributions which allow a bathtub-shaped hazard, including the exponentiated Weibull [3], the generalized Weibull [4] and the generalized gamma (see e.g. [1]) distributions. An addition to these was proposed in a recent article by Reed [5]. This distribution, which is a special case of a double Pareto-lognormal distribution [6], can be characterised as the product of independent random variables, one with a lognormal distribution and the other with a power-law distribution on $[0, 1]$.
For this reason the new distribution was called the lognormal-power function distribution. It can be thought of as an extension of the lognormal distribution.

In this article it is shown how any simple parametric failure-time distribution can be extended in a similar way to allow for much greater flexibility in its form, including in most cases the possibility of bathtub-shaped hazard-rate functions. Precisely, the failure time $T$ is modelled as the product $T \stackrel{d}{=} T_0 U$, where $T_0$ follows the simple failure-time distribution and $U$ follows the power-law distribution with density $\lambda u^{\lambda-1}$ on $[0, 1]$. Alternatively this can be expressed as $T \stackrel{d}{=} T_0/V$ where $V$ has a Pareto distribution, with density $\lambda/v^{\lambda+1}$ on $[1, \infty)$.

As might be expected, it is not possible for every parametrically specified distribution (of $T_0$) to obtain an analytical expression for the resulting power-law modified density. However it turns out to be possible to do so for a number of the more common failure-time distributions, including the lognormal (Reed, 2011), exponential, Weibull, gamma, log-gamma, Pareto and generalized gamma distributions. These distributions are considered in this article. In all cases, except the lognormal and Pareto, the resulting power-function modified densities can be expressed in terms of an incomplete gamma function.

In Sec. 2 the distribution theory associated with the power-law modification is presented, and in Sec. 3 maximum likelihood estimation is discussed. In Sec. 4 the results of fitting the various power-law modified failure-time distributions to data with a multimodal hazard rate are presented.

Manuscript received March 9, 2012; revised March. This work was supported in part by NSERC Grant OGP. W. J. Reed is emeritus professor at Department of Mathematics and Statistics, University of Victoria, PO Box 3060 STN CSC, Victoria, B.C., Canada V8W 3R4.

II. THEORY

Let $T_0$ be a random variable with a known continuous failure-time distribution.
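The power-law modified form of this distribution can be represented by a random variable $T$ with $T \stackrel{d}{=} T_0 U$ where $U$, independent of $T_0$, follows the power-law distribution with density $\lambda u^{\lambda-1}$ $(\lambda > 0)$ on the interval $[0, 1]$. Taking logarithms leads to

$$X = \log T \stackrel{d}{=} Z_0 - \frac{1}{\lambda} E$$

where $Z_0 = \log T_0$ (with survivor function and density $S_0(z)$ and $f_0(z)$, say) and $E$ is a standard (unit mean) exponential random variable. The survivor function for $X$ can be found as a convolution as follows:

$$\begin{aligned}
S_X(x) &= P(Z_0 - E/\lambda \ge x) = P(E \le \lambda(Z_0 - x)) \\
&= \mathrm{E}\{P(E \le \lambda(Z_0 - x) \mid Z_0)\} \\
&= \mathrm{E}\{[1 - e^{-\lambda(Z_0 - x)}]\, I[Z_0 - x > 0]\} \\
&= \int_x^\infty [1 - e^{-\lambda(z - x)}]\, f_0(z)\, dz \\
&= S_0(x) - e^{\lambda x} \int_x^\infty e^{-\lambda z} f_0(z)\, dz
\end{aligned} \tag{1}$$

where the expectation $\mathrm{E}$ is with respect to $Z_0$ and $I$ is a Bernoulli indicator random variable. Upon integrating by parts one obtains

$$S_X(x) = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} S_0(z)\, dz. \tag{2}$$

From this, by differentiation and using (1), one obtains the corresponding formula for the density of $X$:

$$f_X(x) = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} f_0(z)\, dz. \tag{3}$$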
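For baselines where no closed form is at hand, the representation $X \stackrel{d}{=} Z_0 - E/\lambda$ can be evaluated numerically. The following is a minimal sketch, not the author's implementation. It assumes that $S_X(x) = \mathrm{E}[S_0(x + E/\lambda)]$ is computed by writing $E = -\log v$ with $v$ uniform on $(0, 1)$ and applying the midpoint rule, and it uses the identity $f_X(x) = \lambda[S_0(x) - S_X(x)]$, which follows from combining (1) and (3). All function names are hypothetical, and the Weibull-type baseline $S_0(z) = \exp(-\alpha e^{\beta z})$ is chosen purely for illustration.

```python
import math


def survivor_x(s0, x, lam, n=20000):
    """Approximate S_X(x) = E[S_0(x + E/lam)] with E standard exponential.

    Writing E = -log(v) with v uniform on (0, 1) turns the expectation into an
    integral over (0, 1); the midpoint rule never evaluates at v = 0.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        total += s0(x - math.log(v) / lam)
    return total * h


def density_x(s0, x, lam, n=20000):
    """f_X(x) = lam * (S_0(x) - S_X(x)), obtained by combining (1) and (3)."""
    return lam * (s0(x) - survivor_x(s0, x, lam, n))


def hazard_t(s0, t, lam, n=20000):
    """Hazard of T = exp(X): h_T(t) = f_X(log t) / (t * S_X(log t))."""
    x = math.log(t)
    s_x = survivor_x(s0, x, lam, n)
    return lam * (s0(x) - s_x) / (t * s_x)


def weibull_log_survivor(alpha, beta):
    """Illustrative baseline on the log scale: S_0(z) = exp(-alpha * exp(beta * z))."""
    def s0(z):
        if beta * z > 700.0:  # exp() would overflow; the survivor value is 0 here
            return 0.0
        return math.exp(-alpha * math.exp(beta * z))
    return s0


if __name__ == "__main__":
    # Arbitrary illustrative parameter values.
    s0 = weibull_log_survivor(alpha=1.0, beta=2.0)
    for t in (0.25, 0.5, 1.0, 2.0):
        print(t, survivor_x(s0, math.log(t), lam=2.0), hazard_t(s0, t, lam=2.0))
```

The midpoint rule is used only for simplicity. For very small $\lambda$ the integrand changes rapidly near $v = 1$, so a larger `n` or an adaptive quadrature routine may be preferable.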
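From (2) and (3) the survivor function and density of $T$ in terms of those of $T_0$ ($S_{T_0}(t)$ and $f_{T_0}(t)$) can be easily obtained:

$$S_T(t) = \lambda t^{\lambda} \int_t^\infty u^{-\lambda-1} S_{T_0}(u)\, du, \tag{4}$$

$$f_T(t) = \lambda t^{\lambda-1} \int_t^\infty u^{-\lambda} f_{T_0}(u)\, du. \tag{5}$$

We now consider power-law modified forms of some specific failure-time distributions.

Weibull and exponential model. If $T_0$ has a Weibull distribution with hazard rate function $h_{T_0}(t) = \alpha\beta t^{\beta-1}$, its survivor function and density are $S_{T_0}(t) = \exp(-\alpha t^{\beta})$ and $f_{T_0}(t) = \alpha\beta t^{\beta-1} \exp(-\alpha t^{\beta})$. The hazard rate is monotone increasing for $\beta > 1$ and monotone decreasing for $\beta < 1$. In the case $\beta = 1$ it is constant and the Weibull distribution reduces to an exponential distribution. The survivor function and density for $Z_0 = \log T_0$ are $S_0(z) = \exp(-\alpha e^{\beta z})$ and $f_0(z) = \alpha\beta \exp(\beta z - \alpha e^{\beta z})$. From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted Weibull distribution, are

$$S_X(x) = \frac{\lambda \alpha^{\lambda/\beta}}{\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, -\lambda/\beta), \qquad f_X(x) = \lambda \alpha^{\lambda/\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, 1 - \lambda/\beta)$$

where $I$ is the incomplete gamma function

$$I(y, \theta) = \int_y^\infty u^{\theta-1} e^{-u}\, du. \tag{6}$$

Note that although the ordinary gamma function can be expressed as the integral $\Gamma(\theta) = \int_0^\infty u^{\theta-1} e^{-u}\, du$ only for $\theta > 0$, the incomplete gamma function $I(y, \theta)$ evaluated at $y > 0$ converges for all real $\theta$. Thus $S_X(x)$ and $f_X(x)$ above are well defined since $\alpha e^{\beta x} > 0$.

The survivor function, density and hazard-rate function for $T$ are easily computed from the above as

$$S_T(t) = S_X(\log t); \qquad f_T(t) = \frac{1}{t} f_X(\log t); \qquad h_T(t) = \frac{f_T(t)}{S_T(t)}.$$

Fig. 1 (top row) illustrates three shapes that the hazard rate function of the power-law adjusted Weibull distribution can assume.

Gamma model. If $T_0$ follows a gamma distribution with scale parameter $\theta^{-1}$ and shape parameter $\kappa$, then the survivor function and density of $Z_0 = \log T_0$ are

$$S_0(z) = \frac{I(\theta e^{z}, \kappa)}{\Gamma(\kappa)} \quad \text{and} \quad f_0(z) = \frac{\theta^{\kappa}}{\Gamma(\kappa)} \exp(\kappa z - \theta e^{z}).$$

From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted gamma distribution, are

$$S_X(x) = \frac{1}{\Gamma(\kappa)} \left[ I(\theta e^{x}, \kappa) - \theta^{\lambda} e^{\lambda x} I(\theta e^{x}, \kappa - \lambda) \right], \qquad f_X(x) = \frac{\lambda \theta^{\lambda}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{x}, \kappa - \lambda).$$

Fig. 1 (second row) illustrates some shapes that the hazard rate function of the power-law adjusted gamma distribution can assume.

Log-gamma model. If $Z_0 = \log T_0$ follows a gamma distribution, so that $T_0$ has density $f_{T_0}(t) = \frac{\theta^{\kappa}}{\Gamma(\kappa)}\, t^{-(\theta+1)} (\log t)^{\kappa-1}$ with support on $[1, \infty)$, then from (2) and (3) it is easy to show that the power-law adjusted random variable $T$ has support on $(0, \infty)$ and that $X = \log T$ has survivor function and density

$$S_X(x) = \begin{cases} 1 - e^{\lambda x} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} & \text{if } x \le 0 \\[2ex] \dfrac{1}{\Gamma(\kappa)} \left[ I(\theta x, \kappa) - \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} e^{\lambda x}\, I([\theta+\lambda] x, \kappa) \right] & \text{if } x > 0 \end{cases}$$

and

$$f_X(x) = \begin{cases} \lambda e^{\lambda x} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} & \text{if } x \le 0 \\[2ex] \dfrac{\lambda e^{\lambda x}}{\Gamma(\kappa)} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} I([\theta+\lambda] x, \kappa) & \text{if } x > 0. \end{cases}$$

Fig. 1 (third row) illustrates some shapes that the hazard rate function of the power-law adjusted log-gamma distribution can assume.

Fig. 1. Some shapes of the hazard rate function for various power-law adjusted distributions. Top row: Weibull distribution with $\alpha = 1$: (left) $\beta = 1$ (exponential distribution) and $\lambda = 0.02$; (centre) $\beta = 2$ and $\lambda = 2$; (right) $\beta = 3$ and $\lambda = .02$. Second row: gamma distribution with $\theta = 0.25$: (left) $\kappa = .01$ and $\lambda = 1$; (centre) $\kappa = .01$ and $\lambda = 2.5$; (right) $\kappa = .1$ and $\lambda = 7$. Third row: log-gamma distribution with $\theta = 20$: (left) $\kappa = 50$ and $\lambda = .01$; (centre) $\kappa = 10$ and $\lambda = .01$; (right) $\kappa = 5$ and $\lambda = .5$. Bottom row: Pareto distribution with $\tau_0 = 1.5$: (left) $\alpha = 1$ and $\lambda = 0.1$; (centre) $\alpha = 15$ and $\lambda = 2$; (right) $\alpha = 15$ and $\lambda = 0.2$.

Pareto model. If $T_0$ follows a Pareto distribution with support on $(\tau_0, \infty)$ and pdf $f_{T_0}(t) = \frac{\alpha}{\tau_0} \left( \frac{t}{\tau_0} \right)^{-(\alpha+1)}$ thereon, one can show that the power-law adjusted form has support on $(0, \infty)$ and (using (4)) that the survivor function of the power-law adjusted form is

$$S_T(t) = \begin{cases} 1 - \dfrac{\alpha}{\alpha+\lambda} \left( \dfrac{t}{\tau_0} \right)^{\lambda} & \text{if } t \le \tau_0 \\[2ex] \dfrac{\lambda}{\alpha+\lambda} \left( \dfrac{t}{\tau_0} \right)^{-\alpha} & \text{if } t > \tau_0 \end{cases}$$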
and using (5) that the corresponding pdf is

$$f_T(t) = \begin{cases} \dfrac{\alpha\lambda}{\alpha+\lambda}\, \dfrac{1}{\tau_0} \left( \dfrac{t}{\tau_0} \right)^{\lambda-1} & \text{if } t \le \tau_0 \\[2ex] \dfrac{\alpha\lambda}{\alpha+\lambda}\, \dfrac{1}{\tau_0} \left( \dfrac{t}{\tau_0} \right)^{-\alpha-1} & \text{if } t > \tau_0. \end{cases}$$

Fig. 1 (bottom row) illustrates some shapes that the hazard rate function of the power-law adjusted Pareto distribution can assume.

Lognormal model. Consider the case where $Z_0 = \log T_0$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$. The power-law adjusted version of this distribution (the lognormal-power function or lnpf distribution) was considered by Reed (2011) [5], where it is shown that the survivor function and density of $X = \log T$, where $T$ follows the lnpf distribution, are

$$S_X(x) = \phi\!\left( \frac{x-\mu}{\sigma} \right) \left[ R\!\left( \frac{x-\mu}{\sigma} \right) - R\!\left( \lambda\sigma + \frac{x-\mu}{\sigma} \right) \right]$$

and

$$f_X(x) = \lambda\, \phi\!\left( \frac{x-\mu}{\sigma} \right) R\!\left( \lambda\sigma + \frac{x-\mu}{\sigma} \right)$$

where $R$ is Mills' ratio of the complementary cumulative distribution function (cdf) to the pdf of a standard normal distribution: $R(z) = \Phi^c(z)/\phi(z)$.

Generalized gamma model. The three-parameter generalized gamma distribution includes the Weibull, gamma and lognormal models as special or limiting cases. It has density

$$f_{T_0}(t) = \alpha \theta^{\kappa} t^{\alpha\kappa-1} \exp(-\theta t^{\alpha}) / \Gamma(\kappa).$$

With some work using (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted generalized gamma distribution, can be shown to be

$$S_X(x) = \frac{1}{\Gamma(\kappa)} \left[ I(\theta e^{\alpha x}, \kappa) - \theta^{\lambda/\alpha} e^{\lambda x} I(\theta e^{\alpha x}, \kappa - \lambda/\alpha) \right], \qquad f_X(x) = \frac{\lambda \theta^{\lambda/\alpha}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{\alpha x}, \kappa - \lambda/\alpha).$$

It should be noted that while the (unadjusted) log-gamma and Pareto distributions have support bounded away from zero, their power-law adjusted versions have support on $[0, \infty)$, as indeed occurs in all of the power-law adjusted models discussed in this paper. Thus in these models there are no problems with the range of support depending on a parameter, as occurs for example with the generalized Weibull distribution.

Fig. 2. Kernel-smoothed nonparametric estimate of the hazard rate function for the electrical appliances data (vertical axis: smoothed estimated hazard rate; horizontal axis: time in number of cycles). The Epanechnikov kernel with a bandwidth of 1500 was used.
Note that the right-hand part (> 6000) of the estimated hazard is unreliable, being based on only two observations.

III. PARAMETER ESTIMATION BY MAXIMUM LIKELIHOOD.

The parametric likelihood for much failure-time data is proportional to

$$\prod_i [f_{T_i}(t_i)]^{\delta_i} [S_{T_i}(t_i)]^{1-\delta_i}$$

where $\delta_i$ is an indicator variable with value 1 for an observed failure time, and value 0 for a right-censored observation. If there are no covariates and the failure times are considered to be identically distributed following a power-law adjusted distribution with pdf and survivor function $f_T$ and $S_T$, then up to an additive constant the log-likelihood is

$$\sum_i \delta_i \log f_T(t_i) + \sum_i (1-\delta_i) \log S_T(t_i)$$

which is the same as

$$\sum_i \delta_i \log f_X(\log t_i) + \sum_i (1-\delta_i) \log S_X(\log t_i) - \sum_i \delta_i \log t_i.$$

The final sum does not involve the parameters. Thus for each of the models discussed above an analytical expression for the log-likelihood can be obtained. This will need to be maximized numerically to obtain maximum likelihood estimates, using an optimization routine such as optim in R. For starting values one can use the MLEs of the two parameters of the unadjusted distribution and an arbitrary value (say 1) for $\lambda$.

Covariates $\mathbf{Z}^T = (Z_1, Z_2, \ldots, Z_p)$ can be incorporated in an accelerated failure time (AFT) regression model:

$$\log T = \beta_0 + \boldsymbol{\beta}^T \mathbf{Z} + X \tag{7}$$

where $X$ is a random variable with one of the power-law adjusted distributions of the previous section. Note that for all but the log-gamma these distributions can be reparameterized in terms of a location parameter and two other parameters. In these cases the intercept term $\beta_0$ in (7) is not needed (and indeed will result in a nonidentifiable model if it is included).

IV. AN EXAMPLE.

Electrical appliances. Lawless (p. 256) [2] presents data on the numbers of cycles to failure for 60 electrical appliances put on test. All of the sixty appliances eventually failed, the largest failure times being 6065 and 9701 cycles. Fig. 2 shows a kernel-smoothed nonparametric estimate of the hazard rate for these data. There is clearly a suggestion of multimodality.
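As an illustration of the censored log-likelihood of Section III, the sketch below codes it for the power-law adjusted Pareto model, using the closed-form survivor function and pdf given for that model in Section II. It is a hedged example rather than the code used for the paper (which was fitted in R): the function names are hypothetical, and the failure times and censoring indicators in the demonstration are made up for illustration and are not the Lawless data.

```python
import math


def pareto_adj_log_density(t, alpha, lam, tau0):
    """log f_T(t) for the power-law adjusted Pareto model (t > 0)."""
    r = math.log(t / tau0)
    const = math.log(alpha * lam / (alpha + lam)) - math.log(tau0)
    if t <= tau0:
        return const + (lam - 1.0) * r
    return const - (alpha + 1.0) * r


def pareto_adj_log_survivor(t, alpha, lam, tau0):
    """log S_T(t) for the power-law adjusted Pareto model (t > 0)."""
    if t <= tau0:
        return math.log1p(-alpha / (alpha + lam) * (t / tau0) ** lam)
    return math.log(lam / (alpha + lam)) - alpha * math.log(t / tau0)


def log_likelihood(params, times, delta):
    """Sum of delta_i * log f_T(t_i) + (1 - delta_i) * log S_T(t_i).

    params = (alpha, lam, tau0); delta_i is 1 for an observed failure and
    0 for a right-censored time. Returns -inf outside the parameter space so
    that a generic maximizer can reject such points.
    """
    alpha, lam, tau0 = params
    if min(alpha, lam, tau0) <= 0.0:
        return -math.inf
    total = 0.0
    for t, d in zip(times, delta):
        if d == 1:
            total += pareto_adj_log_density(t, alpha, lam, tau0)
        else:
            total += pareto_adj_log_survivor(t, alpha, lam, tau0)
    return total


if __name__ == "__main__":
    # Made-up example values, NOT the electrical appliance data.
    times = [120.0, 450.0, 980.0, 2100.0, 3500.0, 6000.0]
    delta = [1, 1, 1, 1, 0, 1]
    print(log_likelihood((1.0, 0.5, 1000.0), times, delta))
```

The negative of `log_likelihood` could be handed to any derivative-free minimizer, in the same spirit as the Nelder-Mead option of optim in R.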
To assess and compare the various power-law adjusted models discussed in the previous section, each was fitted to these data. Maximization of the log-likelihood was performed in R using the Nelder-Mead method in the routine optim and in all cases required only a minute or two of computation. The values of the maximized log-likelihood and of the Akaike Information Criterion (AIC) for the power-law adjusted forms of the two-parameter models are given in Table 1. In all cases, the improvement in fit obtained by including the power-law adjustment was highly significant (P << .001), as one would expect since none of the two-parameter forms allows for a bathtub shape. From Table 1 it can be seen that the power-law adjusted Pareto distribution provides the best fit of these models.

Fig. 3 shows the MLEs of the hazard rate for (clockwise from upper left) the power-law adjusted Weibull, log-gamma, lognormal and Pareto distributions. While these plots may appear very different from the nonparametric estimate of the hazard function (Fig. 2) at the upper end, it should be noted that the upper part of the nonparametric estimate is not very precise, since in the dataset there are only two observations greater than 6000 (with values 6065 and 9701). Fig. 4 shows the fitted power-law adjusted Pareto hazard rate function superimposed on the nonparametric estimate on the range 0 to 6000 cycles. Also Fig. 5 shows the Kaplan-Meier estimate of the survivor function and the fitted survivor function for the power-law adjusted Pareto distribution. Both plots suggest a good fit.

Fig. 3. Maximum likelihood estimates of the hazard rate for various power-law adjusted distributions fitted to the electrical appliance data (vertical axis: estimated hazard rate; horizontal axis: time in number of cycles). They are (clockwise from upper left) Weibull, log-gamma, lognormal and Pareto.

Fig. 4. Kernel-smoothed nonparametric estimate of the hazard rate function for the electrical appliances data and the MLE of the power-law adjusted Pareto hazard rate.

Attempts at fitting the four-parameter power-law adjusted generalized gamma distribution were not successful, with different maxima arising with different starting values.
This suggests the possibility of identifiability problems with this model. Indeed the generalized gamma distribution without the power-law adjustment is capable of exhibiting a bathtub-shaped hazard.

For comparison purposes the three 3-parameter distributions mentioned in the introduction which have been previously used to model data with a bathtub-shaped hazard (exponentiated Weibull, generalized Weibull and generalized gamma) were fitted to the electrical appliances data. The results are shown in Table 2. From comparison with Table 1 it can be seen that of all eight models the best fitting is the power-law adjusted Pareto, followed by the generalized Weibull. Furthermore all of the power-law adjusted 2-parameter models, save the Weibull, have a better fit than the generalized gamma and the exponentiated Weibull distributions, suggesting that the consideration of power-law adjusted models may provide a useful addition to the toolkit of practitioners.

Fig. 5. Nonparametric Kaplan-Meier estimate (step function) of the survivor function for the electrical appliance data and the maximum likelihood estimate of the survivor function using the power-law adjusted Pareto distribution (vertical axis: estimated survival probability).

V. CONCLUSIONS.

This article shows how existing parametric failure-time distributions can be modified by a simple power-law adjustment, thereby rendering them more flexible, including in many cases having the possibility of a bathtub-shaped hazard-rate function. The power-law adjustment involves the introduction of an extra parameter. While the article considers only distributions for which there are analytical expressions for the density and survivor function, the idea could still be applied to other common failure distributions (e.g. log-logistic, Gompertz, etc.). In such cases the density and survivor function would need to be computed numerically, using quadrature methods for evaluating the integrals (2) and (3). This would replace the computation involved in evaluating the incomplete gamma functions which occur in the distributions discussed in this paper, and so the extra computation involved might not be too great.

REFERENCES

[1] Cox, C., Chu, H., Schneider, M. & Muñoz, A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution, Statist. Med. 26, 2007.
[2] Lawless, J. F. Statistical Models and Methods for Lifetime Data. New York: John Wiley and Sons.
[3] Mudholkar, G. S. & Srivastava, D. K. Exponentiated Weibull family for analyzing bathtub failure-rate data, IEEE Trans. Rel. 42.
[4] Mudholkar, G. S., Srivastava, D. K. & Kollia, G. D. A generalization of the Weibull distribution with application to the analysis of survival data, J. Amer. Stat. Assoc.
[5] Reed, W. J. A flexible parametric survival model which allows a bathtub shaped hazard rate function, J. Appl. Stat. 38.
[6] Reed, W. J. & Jorgensen, M. The double Pareto-lognormal distribution: a new parametric model for size distributions, Comm. Stats - Theory & Methods 33.