Power-Law Adjusted Failure-Time Models


Proceedings of the World Congress on Engineering 2012 Vol I, July 4-6, 2012, London, U.K.

William J. Reed

Abstract: A simple adjustment to parametric failure-time distributions, which allows for much greater flexibility in the shape of the hazard-rate function, is considered. Analytical expressions for the distributions of the power-law adjusted Weibull, gamma, log-gamma, generalized gamma, lognormal and Pareto distributions are given. Most of these allow for bathtub-shaped and other multi-modal forms of the hazard rate. The new distributions are fitted to real failure-time data which exhibit a multi-modal hazard-rate function, and the fits are compared.

Index Terms: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.

I. INTRODUCTION

Parametric distributions play an important role in the analysis of lifetime data, especially in accelerated failure time (AFT) regression models. Generally speaking, analysis based on a parametric model will be more precise than that based on a nonparametric or semi-parametric model, because it will have fewer unknown parameters. However, this is contingent on it being possible to find a suitable parametric model to fit the data. Unfortunately, for most of the common distributions employed there is very little flexibility in the shape of the hazard rate function. In particular, none of the two-parameter distributions customarily employed can be used to model a bathtub-shaped hazard. There are a number of three-parameter distributions which allow a bathtub-shaped hazard, including the exponentiated Weibull [3], the generalized Weibull [4] and the generalized gamma (see e.g. [1]) distributions. An addition to these was proposed in a recent article by Reed [5]. This distribution, which is a special case of a double Pareto-lognormal distribution [6], can be characterised as the product of independent random variables, one with a lognormal distribution and the other with a power-law distribution on $[0, 1]$.
For this reason the new distribution was called the lognormal-power function distribution. It can be thought of as an extension of the lognormal distribution.

In this article it is shown how any simple parametric failure-time distribution can be extended in a similar way to allow for much greater flexibility in its form, including in most cases the possibility of bathtub-shaped hazard-rate functions. Precisely, the failure time $T$ is modelled as the product $T \stackrel{d}{=} T_0 U$, where $T_0$ follows the simple failure-time distribution and $U$ follows the power-law distribution with density $\lambda u^{\lambda-1}$ on $[0, 1]$. Alternatively this can be expressed as $T \stackrel{d}{=} T_0/V$ where $V$ has a Pareto distribution, with density $\lambda/v^{\lambda+1}$ on $[1, \infty)$.

As might be expected, it is not possible for every parametrically specified distribution (of $T_0$) to obtain an analytical expression for the resulting power-law modified density. However, it turns out to be possible to do so for a number of the more common failure-time distributions, including the lognormal (Reed, 2011), exponential, Weibull, gamma, log-gamma, Pareto and generalized gamma distributions. These distributions are considered in this article. In all cases, except the lognormal and Pareto, the resulting power-function modified densities can be expressed in terms of an incomplete gamma function.

In Sec. II the distribution theory associated with the power-law modification is presented, and in Sec. III maximum likelihood estimation is discussed. In Sec. IV the results of fitting the various power-law modified failure-time distributions to data with a multi-modal hazard rate are presented.

Manuscript received March 9, 2012; revised March. This work was supported in part by NSERC Grant OGP. W. J. Reed is emeritus professor at the Department of Mathematics and Statistics, University of Victoria, PO Box 3060 STN CSC, Victoria, B.C., Canada V8W 3R4 (e-mail: reed@math.uvic.ca). ISSN: (Print); ISSN: (Online)

II. THEORY

Let $T_0$ be a random variable with a known continuous failure-time distribution.
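The power-law modified form of this distribution can be represented by a random variable $T$ with $T \stackrel{d}{=} T_0 U$ where $U$, independent of $T_0$, follows the power-law distribution with density $\lambda u^{\lambda-1}$ ($\lambda>0$) on the interval $[0, 1]$. Taking logarithms leads to

$$X = \log(T) \stackrel{d}{=} Z_0 - \frac{1}{\lambda}E$$

where $Z_0 = \log T_0$ (with survivor function and density $S_0(z)$ and $f_0(z)$, say) and $E$ is a standard (unit mean) exponential random variable. The survivor function for $X$ can be found as a convolution as follows:

$$\begin{aligned}
S_X(x) &= P(Z_0 - E/\lambda \ge x) = P(E \le \lambda(Z_0 - x)) \\
&= \mathrm{E}\{P(E \le \lambda(Z_0 - x) \mid Z_0)\} \\
&= \mathrm{E}\{[1 - e^{-\lambda(Z_0 - x)}]\, I[Z_0 - x > 0]\} \\
&= \int_x^\infty [1 - e^{-\lambda(z - x)}]\, f_0(z)\,dz \\
&= S_0(x) - e^{\lambda x}\int_x^\infty e^{-\lambda z} f_0(z)\,dz \qquad (1)
\end{aligned}$$

where the expectation $\mathrm{E}$ is with respect to $Z_0$ and $I$ is a Bernoulli indicator random variable. Upon integrating by parts one obtains

$$S_X(x) = \lambda e^{\lambda x}\int_x^\infty e^{-\lambda z} S_0(z)\,dz. \qquad (2)$$

From this, by differentiation and using (1), one obtains the corresponding formula for the density of $X$:

$$f_X(x) = \lambda e^{\lambda x}\int_x^\infty e^{-\lambda z} f_0(z)\,dz. \qquad (3)$$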
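Formulas (2) and (3) can also be evaluated numerically when no closed form is available. The following is a minimal sketch, not part of the paper: it assumes the substitution $w = e^{-\lambda(z-x)}$, which turns (2) into $\int_0^1 S_0(x - \log(w)/\lambda)\,dw$ and (3) into $\int_0^1 f_0(x - \log(w)/\lambda)\,dw$, and applies a plain midpoint rule. The function names, the number of nodes `n`, and the Pareto base distribution with `ALPHA = 2` and `TAU0 = 1.5` are illustrative assumptions.

```python
import math

def adjusted_survivor(x, S0, lam, n=4000):
    """Approximate S_X(x) of eq. (2) by the midpoint rule on (0, 1).

    With w = exp(-lam * (z - x)), eq. (2) becomes
    the integral over (0, 1) of S0(x - log(w) / lam) dw.
    """
    h = 1.0 / n
    return h * sum(S0(x - math.log((k + 0.5) * h) / lam) for k in range(n))

def adjusted_density(x, f0, lam, n=4000):
    """Approximate f_X(x) of eq. (3) with the same substitution."""
    h = 1.0 / n
    return h * sum(f0(x - math.log((k + 0.5) * h) / lam) for k in range(n))

# Illustrative base distribution (an assumption for this sketch):
# T0 is Pareto on (TAU0, inf) with index ALPHA, and Z0 = log T0.
ALPHA, TAU0 = 2.0, 1.5

def S0_pareto(z):
    """Survivor function of Z0 = log T0."""
    d = z - math.log(TAU0)
    return 1.0 if d <= 0.0 else math.exp(-ALPHA * d)

def f0_pareto(z):
    """Density of Z0 = log T0."""
    d = z - math.log(TAU0)
    return 0.0 if d <= 0.0 else ALPHA * math.exp(-ALPHA * d)

if __name__ == "__main__":
    lam = 1.0
    for t in (1.0, 3.0):
        x = math.log(t)
        print(t, adjusted_survivor(x, S0_pareto, lam), adjusted_density(x, f0_pareto, lam))
```

The Pareto base is chosen only because its adjusted survivor function is elementary, so the quadrature can be compared against a known value; any other `S0` and `f0` on the log scale could be passed in the same way.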

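From (2) and (3) the survivor function and density of $T$ in terms of those of $T_0$ ($S_{T_0}(t)$ and $f_{T_0}(t)$) can be easily obtained:

$$S_T(t) = \lambda t^{\lambda}\int_t^\infty u^{-\lambda-1} S_{T_0}(u)\,du. \qquad (4)$$

$$f_T(t) = \lambda t^{\lambda-1}\int_t^\infty u^{-\lambda} f_{T_0}(u)\,du. \qquad (5)$$

We now consider power-law modified forms of some specific failure-time distributions.

Weibull and exponential model. If $T_0$ has a Weibull distribution with hazard rate function $h_{T_0}(t) = \alpha\beta t^{\beta-1}$, its survivor function and density are $S_{T_0}(t) = \exp(-\alpha t^{\beta})$ and $f_{T_0}(t) = \alpha\beta t^{\beta-1}\exp(-\alpha t^{\beta})$. The hazard rate is monotone increasing for $\beta>1$ and monotone decreasing for $\beta<1$. In the case $\beta=1$ it is constant and the Weibull distribution reduces to an exponential distribution. The survivor function and density for $Z_0 = \log T_0$ are $S_0(z) = \exp(-\alpha e^{\beta z})$ and $f_0(z) = \alpha\beta\exp(\beta z - \alpha e^{\beta z})$. From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted Weibull distribution, are

$$S_X(x) = \frac{\lambda\alpha^{\lambda/\beta}}{\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, -\lambda/\beta)$$

$$f_X(x) = \lambda\alpha^{\lambda/\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, 1-\lambda/\beta)$$

where $I$ is the incomplete gamma function

$$I(y,\theta) = \int_y^\infty u^{\theta-1} e^{-u}\,du. \qquad (6)$$

Note that although the ordinary gamma function can be expressed as the integral $\Gamma(\theta) = \int_0^\infty u^{\theta-1}e^{-u}\,du$ only for $\theta>0$, the incomplete gamma function $I(y,\theta)$ evaluated at $y>0$ converges for all real $\theta$. Thus $S_X(x)$ and $f_X(x)$ above are well-defined since $\alpha e^{\beta x} > 0$.

The survivor function, density and hazard-rate function for $T$ are easily computed from the above as

$$S_T(t) = S_X(\log t); \qquad f_T(t) = \frac{1}{t} f_X(\log t); \qquad h_T(t) = \frac{f_T(t)}{S_T(t)}.$$

Fig. 1 (top row) illustrates three shapes that the hazard rate function of the power-law adjusted Weibull distribution can assume.

Fig. 1. Some shapes of the hazard rate function for various power-law adjusted distributions. Top row: Weibull distribution with $\alpha=1$: (left) $\beta=1$ (exponential distribution) and $\lambda=0.02$; (centre) $\beta=2$ and $\lambda=2$; (right) $\beta=3$ and $\lambda=.02$. Second row: gamma distribution with $\theta=0.25$: (left) $\kappa=.01$ and $\lambda=1$; (centre) $\kappa=.01$ and $\lambda=2.5$; (right) $\kappa=.1$ and $\lambda=7$. Third row: log-gamma distribution with $\theta=20$: (left) $\kappa=50$ and $\lambda=.01$; (centre) $\kappa=10$ and $\lambda=.01$; (right) $\kappa=5$ and $\lambda=.5$. Bottom row: Pareto distribution with $\tau_0=1.5$: (left) $\alpha=1$ and $\lambda=0.1$; (centre) $\alpha=15$ and $\lambda=2$; (right) $\alpha=15$ and $\lambda=0.2$.

Gamma model. If $T_0$ follows a gamma distribution with scale parameter $\theta^{-1}$ and shape parameter $\kappa$, then the survivor function and density of $Z_0 = \log T_0$ are

$$S_0(z) = \frac{I(\theta e^{z},\kappa)}{\Gamma(\kappa)} \quad\text{and}\quad f_0(z) = \frac{\theta^{\kappa}}{\Gamma(\kappa)}\exp(\kappa z - \theta e^{z}).$$

From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted gamma distribution, are

$$S_X(x) = \frac{1}{\Gamma(\kappa)}\left[I(\theta e^{x},\kappa) - \theta^{\lambda} e^{\lambda x}\, I(\theta e^{x},\kappa-\lambda)\right]$$

$$f_X(x) = \frac{\lambda\theta^{\lambda}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{x},\kappa-\lambda)$$

Fig. 1 (second row) illustrates some shapes that the hazard rate function of the power-law adjusted gamma distribution can assume.

Log-gamma model. If $Z_0 = \log T_0$ follows a gamma distribution, so that $T_0$ has density $f_{T_0}(t) = \frac{\theta^{\kappa}}{\Gamma(\kappa)}\, t^{-(\theta+1)} (\log t)^{\kappa-1}$ with support on $[1, \infty)$, then from (2) and (3) it is easy to show that the power-law adjusted random variable $T$ has support on $(0, \infty)$ and that $X = \log T$ has survivor function and density

$$S_X(x) = \begin{cases}
1 - e^{\lambda x}\left(\dfrac{\theta}{\theta+\lambda}\right)^{\kappa} & \text{if } x \le 0 \\[2ex]
\dfrac{1}{\Gamma(\kappa)}\left[I(\theta x,\kappa) - \left(\dfrac{\theta}{\theta+\lambda}\right)^{\kappa} e^{\lambda x}\, I([\theta+\lambda]x,\kappa)\right] & \text{if } x > 0
\end{cases}$$

and

$$f_X(x) = \begin{cases}
\lambda e^{\lambda x}\left(\dfrac{\theta}{\theta+\lambda}\right)^{\kappa} & \text{if } x \le 0 \\[2ex]
\dfrac{\lambda e^{\lambda x}}{\Gamma(\kappa)}\left(\dfrac{\theta}{\theta+\lambda}\right)^{\kappa} I([\theta+\lambda]x,\kappa) & \text{if } x > 0
\end{cases}$$

Fig. 1 (third row) illustrates some shapes that the hazard rate function of the power-law adjusted log-gamma distribution can assume.

Pareto model. If $T_0$ follows a Pareto distribution with support on $(\tau_0, \infty)$ and pdf $f_{T_0}(t) = \frac{\alpha}{\tau_0}\left(\frac{t}{\tau_0}\right)^{-(\alpha+1)}$ thereon, one can show that the power-law adjusted form has support on $(0, \infty)$ and (using (4)) that its survivor function is

$$S_T(t) = \begin{cases}
1 - \dfrac{\alpha}{\alpha+\lambda}\left(\dfrac{t}{\tau_0}\right)^{\lambda} & \text{if } t \le \tau_0 \\[2ex]
\dfrac{\lambda}{\alpha+\lambda}\left(\dfrac{t}{\tau_0}\right)^{-\alpha} & \text{if } t > \tau_0.
\end{cases}$$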
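The step from (4) to the Pareto survivor function can be filled in as follows. This is a worked sketch using only the Pareto base survivor function $S_{T_0}(u) = 1$ for $u \le \tau_0$ and $S_{T_0}(u) = (u/\tau_0)^{-\alpha}$ for $u > \tau_0$; for $t \le \tau_0$ the integral in (4) is split at $\tau_0$.

```latex
% Case t <= tau_0: split the integral in (4) at tau_0
\begin{aligned}
S_T(t) &= \lambda t^{\lambda}\left[\int_t^{\tau_0} u^{-\lambda-1}\,du
          + \tau_0^{\alpha}\int_{\tau_0}^{\infty} u^{-(\alpha+\lambda)-1}\,du\right] \\
       &= \lambda t^{\lambda}\left[\frac{t^{-\lambda}-\tau_0^{-\lambda}}{\lambda}
          + \frac{\tau_0^{-\lambda}}{\alpha+\lambda}\right] \\
       &= 1 - \left(\frac{t}{\tau_0}\right)^{\lambda}
          + \frac{\lambda}{\alpha+\lambda}\left(\frac{t}{\tau_0}\right)^{\lambda}
        = 1 - \frac{\alpha}{\alpha+\lambda}\left(\frac{t}{\tau_0}\right)^{\lambda}.
\end{aligned}

% Case t > tau_0: only the Pareto tail contributes
S_T(t) = \lambda t^{\lambda}\,\tau_0^{\alpha}\int_t^{\infty} u^{-(\alpha+\lambda)-1}\,du
       = \frac{\lambda}{\alpha+\lambda}\left(\frac{t}{\tau_0}\right)^{-\alpha}.
```

Both branches give $S_T(\tau_0) = \lambda/(\alpha+\lambda)$, so the survivor function is continuous at $\tau_0$.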

Using (5), the corresponding pdf of the power-law adjusted Pareto distribution is

$$f_T(t) = \begin{cases}
\dfrac{\alpha\lambda}{\alpha+\lambda}\,\dfrac{1}{\tau_0}\left(\dfrac{t}{\tau_0}\right)^{\lambda-1} & \text{if } t \le \tau_0 \\[2ex]
\dfrac{\alpha\lambda}{\alpha+\lambda}\,\dfrac{1}{\tau_0}\left(\dfrac{t}{\tau_0}\right)^{-\alpha-1} & \text{if } t > \tau_0
\end{cases}$$

Fig. 1 (bottom row) illustrates some shapes that the hazard rate function of the power-law adjusted Pareto distribution can assume.

Lognormal model. Consider the case where $Z_0 = \log T_0$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$. The power-law adjusted version of this distribution (the lognormal-power function or lnpf distribution) was considered in [5], where it is shown that the survivor function and density of $X = \log T$, where $T$ follows the lnpf distribution, are

$$S_X(x) = \phi\!\left(\frac{x-\mu}{\sigma}\right)\left[R\!\left(\frac{x-\mu}{\sigma}\right) - R\!\left(\lambda\sigma + \frac{x-\mu}{\sigma}\right)\right]$$

and

$$f_X(x) = \lambda\,\phi\!\left(\frac{x-\mu}{\sigma}\right) R\!\left(\lambda\sigma + \frac{x-\mu}{\sigma}\right)$$

where $R$ is Mills' ratio of the complementary cumulative distribution function (cdf) to the pdf of a standard normal distribution: $R(z) = \Phi^c(z)/\phi(z)$.

Generalized gamma model. The three-parameter generalized gamma distribution includes the Weibull, gamma and lognormal models as special or limiting cases. It has density

$$f_{T_0}(t) = \alpha\theta^{\kappa} t^{\alpha\kappa-1}\exp(-\theta t^{\alpha})/\Gamma(\kappa).$$

With some work using (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted generalized gamma distribution, can be shown to be

$$S_X(x) = \frac{1}{\Gamma(\kappa)}\left[I(\theta e^{\alpha x},\kappa) - \theta^{\lambda/\alpha} e^{\lambda x}\, I(\theta e^{\alpha x},\kappa-\lambda/\alpha)\right]$$

$$f_X(x) = \frac{\lambda\theta^{\lambda/\alpha}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{\alpha x},\kappa-\lambda/\alpha)$$

It should be noted that while the (unadjusted) log-gamma and Pareto distributions have support bounded away from zero, their power-law adjusted versions have support on $[0, \infty)$, as indeed occurs in all of the power-law adjusted models discussed in this paper. Thus in these models there are no problems with the range of support depending on a parameter, as occurs for example with the generalized Weibull distribution.

III. PARAMETER ESTIMATION BY MAXIMUM LIKELIHOOD

The parametric likelihood for much failure-time data is proportional to

$$\prod_i [f_{T_i}(t_i)]^{\delta_i}\,[S_{T_i}(t_i)]^{1-\delta_i}$$

where $\delta_i$ is an indicator variable with value 1 for an observed failure time, and value 0 for a right-censored observation. If there are no covariates and the failure times are considered to be identically distributed following a power-law adjusted distribution with pdf and survivor function $f_T$ and $S_T$, then up to an additive constant the log-likelihood is

$$\sum_i \delta_i \log f_T(t_i) + \sum_i (1-\delta_i)\log S_T(t_i)$$

which is the same as

$$\sum_i \delta_i \log f_X(\log t_i) + \sum_i (1-\delta_i)\log S_X(\log t_i) - \sum_i \delta_i \log t_i,$$

where the last term does not involve the parameters. Thus for each of the models discussed above an analytical expression for the log-likelihood can be obtained. This will need to be maximized numerically to obtain maximum likelihood estimates, using an optimization routine such as optim in R. For starting values one can use the MLEs of the two parameters of the unadjusted distribution and an arbitrary value (say 1) for $\lambda$.

Covariates $Z^T = (Z_1, Z_2, \ldots, Z_p)$ can be incorporated in an accelerated failure time (AFT) regression model:

$$\log T = \beta_0 + \beta^T Z + X \qquad (7)$$

where $X$ is a random variable with one of the power-law adjusted distributions of the previous section. Note that for all but the log-gamma these distributions can be reparameterized in terms of a location parameter and two other parameters. In these cases the intercept term $\beta_0$ in (7) is not needed (and indeed will result in a non-identifiable model if it is included).

IV. AN EXAMPLE

Electrical appliances. Lawless (p. 256) [2] presents data on the numbers of cycles to failure for 60 electrical appliances put on test. All of the sixty appliances eventually failed, the largest failure times being 6065 and 9701 cycles. Fig. 2 shows a kernel-smoothed non-parametric estimate of the hazard rate for these data. There is clearly a suggestion of multi-modality.

Fig. 2. Kernel smoothed non-parametric estimate of the hazard rate function for electrical appliances data (vertical axis: smoothed estimated hazard rate; horizontal axis: time, number of cycles). The Epanechnikov kernel with a bandwidth of 1500 was used. Note that the right-hand part (> 6000) of the estimated hazard is unreliable, being based on only two observations.
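A kernel-smoothed hazard estimate of the kind shown in Fig. 2 can be sketched as below. The paper names only the Epanechnikov kernel and the bandwidth; this sketch assumes the common construction that smooths the Nelson-Aalen increments $\delta_i / Y(t_i)$, where $Y(t_i)$ is the number of units still at risk at $t_i$, and applies no boundary correction. The function names are illustrative, and the data in the example are made up, not the appliance data.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def smoothed_hazard(t, times, delta, bandwidth):
    """Kernel-smoothed hazard estimate at time t.

    times: observed times; delta: 1 for a failure, 0 for right censoring.
    Each failure contributes its Nelson-Aalen increment 1 / (number at risk),
    spread around its failure time by the kernel.
    """
    total = 0.0
    for t_i, d_i in zip(times, delta):
        if d_i:
            at_risk = sum(1 for s in times if s >= t_i)  # units still under observation at t_i
            total += epanechnikov((t - t_i) / bandwidth) / at_risk
    return total / bandwidth

if __name__ == "__main__":
    # Made-up toy data for illustration only
    toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
    toy_delta = [1, 1, 1, 1, 1]
    print(smoothed_hazard(3.0, toy_times, toy_delta, bandwidth=0.5))
```

For data like those described above, `bandwidth=1500` would correspond to the setting quoted in the Fig. 2 caption. The rapid drop in the number at risk explains why the estimate becomes unreliable beyond 6000 cycles.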
To assess and compare the various power-law adjusted models discussed in Sec. II, each was fitted to these data. Maximization of the log-likelihood was performed in R using the Nelder-Mead method in the routine optim and in all cases required only a minute or two of computation. The values of the maximized log-likelihood and of the Akaike Information Criterion (AIC) for the power-law adjusted forms of the two-parameter models are given in Table 1. In all cases, the improvement in fit obtained by including the power-law adjustment was highly significant ($P \ll .001$), as one would expect since none of the two-parameter forms allows for a bathtub shape. From Table 1 it can be seen that the power-law adjusted Pareto distribution provides the best fit of these models.

Fig. 3 shows the MLEs of the hazard rate for (clockwise from upper left) the power-law adjusted Weibull, log-gamma, lognormal and Pareto distributions. While these plots may appear very different from the non-parametric estimate of the hazard function (Fig. 2) at the upper end, it should be noted that the upper part of the non-parametric estimate is not very precise, since in the dataset there are only two observations greater than 6000 (with values 6065 and 9701). Fig. 4 shows the fitted power-law adjusted Pareto hazard rate function superimposed on the non-parametric estimate on the range 0 to 6000 cycles. Also, Fig. 5 shows the Kaplan-Meier estimate of the survivor function and the fitted survivor function for the power-law adjusted Pareto distribution. Both plots suggest a good fit.

Fig. 3. Maximum likelihood estimates of various power-law adjusted distributions for the electrical appliance data (vertical axis: estimated hazard rate; horizontal axis: time, number of cycles). They are (clockwise from upper left) Weibull, log-gamma, lognormal and Pareto.

Fig. 4. Kernel smoothed non-parametric estimate of the hazard rate function for electrical appliances data and the MLE of the power-law adjusted Pareto hazard rate.
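The quantity being maximized can be sketched for the power-law adjusted Pareto model, whose density and survivor function (Sec. II) are elementary. This is a minimal sketch under stated assumptions rather than the author's code: the paper used R, whereas this uses Python, and the function names, the simulated data and the parameter values in the example are all hypothetical.

```python
import math
import random

def pareto_adj_logpdf(t, alpha, lam, tau0):
    """Log density of the power-law adjusted Pareto distribution."""
    log_c = math.log(alpha) + math.log(lam) - math.log(alpha + lam) - math.log(tau0)
    r = math.log(t / tau0)
    # The exponent of t / tau0 is lam - 1 below tau0 and -(alpha + 1) above it.
    return log_c + ((lam - 1.0) * r if t <= tau0 else -(alpha + 1.0) * r)

def pareto_adj_logsf(t, alpha, lam, tau0):
    """Log survivor function of the power-law adjusted Pareto distribution."""
    if t <= tau0:
        return math.log1p(-alpha / (alpha + lam) * (t / tau0) ** lam)
    return math.log(lam / (alpha + lam)) - alpha * math.log(t / tau0)

def hazard(t, alpha, lam, tau0):
    """Hazard rate h_T(t) = f_T(t) / S_T(t)."""
    return math.exp(pareto_adj_logpdf(t, alpha, lam, tau0) - pareto_adj_logsf(t, alpha, lam, tau0))

def log_likelihood(params, times, delta):
    """Censored log-likelihood: log f_T for failures, log S_T for censored times."""
    alpha, lam, tau0 = params
    if min(alpha, lam, tau0) <= 0.0:
        return -math.inf  # outside the parameter space
    return sum(
        pareto_adj_logpdf(t, alpha, lam, tau0) if d else pareto_adj_logsf(t, alpha, lam, tau0)
        for t, d in zip(times, delta)
    )

def simulate(n, alpha, lam, tau0):
    """Draw T = T0 * U with T0 Pareto(alpha, tau0) and U power-law(lam) on (0, 1]."""
    return [
        tau0 * (1.0 - random.random()) ** (-1.0 / alpha) * (1.0 - random.random()) ** (1.0 / lam)
        for _ in range(n)
    ]

if __name__ == "__main__":
    random.seed(1)
    sample = simulate(500, 2.0, 1.0, 1.5)  # simulated data, not the appliance data
    print(log_likelihood((2.0, 1.0, 1.5), sample, [1] * len(sample)))
```

In practice the negative of `log_likelihood` would be passed to a general-purpose optimizer, as the paper does with the Nelder-Mead method in optim. For $t > \tau_0$ the hazard of this model reduces to $\alpha/t$, which gives a quick check on the two branches.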
Attempts at fitting the four-parameter power-law adjusted generalized gamma distribution were not successful, with different maxima arising with different starting values. This suggests the possibility of identifiability problems with this model. Indeed the generalized gamma distribution without the power-law adjustment is capable of exhibiting a bathtub-shaped hazard.

For comparison purposes the three 3-parameter distributions mentioned in the introduction which have been previously used to model data with a bathtub-shaped hazard (exponentiated Weibull, generalized Weibull and generalized gamma) were fitted to the electrical appliances data. The results are shown in Table 2. From comparison with Table 1 it can be seen that of all eight models the best fitting is the power-law adjusted Pareto, followed by the generalized Weibull. Furthermore, all of the power-law adjusted 2-parameter models, save the Weibull, have a better fit than the generalized gamma and the exponentiated Weibull distributions, suggesting that the consideration of power-law adjusted models may provide a useful addition to the toolkit of practitioners.

Fig. 5. Non-parametric Kaplan-Meier estimate (step function) of the survivor function for the electrical appliance data and the maximum likelihood estimate of the survivor function using the power-law adjusted Pareto distribution (vertical axis: estimated survival probability).

V. CONCLUSIONS

This article shows how existing parametric failure-time distributions can be modified by a simple power-law adjustment, thereby rendering them more flexible, including in many cases having the possibility of a bathtub-shaped hazard-rate function. The power-law adjustment involves the introduction of an extra parameter. While the article considers only distributions for which there are analytical expressions for the density and survivor function, the idea could still be applied to other common failure distributions (e.g. log-logistic, Gompertz, etc.).
In such cases the density and survivor function would need to be computed numerically, using quadrature methods for evaluating the integrals (2) and (3). This would replace the computation involved in evaluating the incomplete gamma functions which occur in the distributions discussed in this paper, and so the extra computation involved might not be too great.

REFERENCES

[1] Cox, C., Chu, H., Schneider, M. & Muñoz, A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution, Statist. Med. 26, 2007.
[2] Lawless, J. F. Statistical Models and Methods for Lifetime Data. New York: John Wiley and Sons.
[3] Mudholkar, G. S. & Srivastava, D. K. Exponentiated Weibull family for analyzing bathtub failure-rate data, IEEE Trans. Rel., 42.
[4] Mudholkar, G. S., Srivastava, D. K. & Kollia, G. D. A generalization of the Weibull distribution with application to the analysis of survival data. J. Amer. Stat. Assoc.
[5] Reed, W. J. A flexible parametric survival model which allows a bathtub shaped hazard rate function. J. Appl. Stat. 38.
[6] Reed, W. J. & Jorgensen, M. The double Pareto-lognormal distribution - A new parametric model for size distributions, Comm. Stats - Theory & Methods, 33.
