Parameter Estimation for Partially Complete Time and Type of Failure Data

Size: px
Start display at page:

Download "Parameter Estimation for Partially Complete Time and Type of Failure Data"

Transcription

1 Biometrical Journal 46 (004), DOI 0.00/bimj arameter Estimation for artially Complete Time and Type of Failure Data Debasis Kundu Department of Mathematics, Indian Institute of Technology Kanpur, in 0806, INDIA Received August 00, revised July 03, accepted December 003 Summary The theory of competing risks has been developed to asses a specific risk in presence of other risk factors. In this paper we consider the parametric estimation of different failure modes under partially complete time and type of failure data using latent failure times and cause specific hazard functions models. Uniformly minimum variance unbiased estimators and maximum likelihood estimators are obtained when latent failure times and cause specific hazard functions are exponentially distributed. We also consider the case when they follow Weibull distributions. One data set is used to illustrate the proposed techniques. Key words Competing risks; Incomplete data; Failure distribution; Exponential distribution; Maximum likelihood estimators; Minimum variance unbiased estimator.. Introduction In medical studies an investigator is often concerned with the assessment of a specific risk in presence of other risk factors. In statistical literature it is well known as the competing risks model. In analyzing the competing risks model, ideally data consist of a failure time and an indicator denoting the failure type. But in many situations data may be incomplete, both in failure time and failure type. Moreover, under some circumstances, the failure type can be determined even when the failure time is censored. For example, consider an analysis of prostatic cancer in older man. Since, prostatic cancer is not rapidly lethal and can be diagnosed long before death, a man known to have the prostatic cancer who is alive at the time of analysis contributes complete information on the failure type, but incomplete in failure time. On the other hand, without an autopsy, the time of death provides a complete information on the failure time but no information on the failure type. Note that in presence of complete information, several studies have been carried out for the last two decades both under the parametric and non-parametric set up. Dinse [3] considered the nonparametric estimation of the survival function of a particular cause for incomplete time and type of failure data. Because of the non-parametric formulation, it was difficult to consider estimations and construction of confidence intervals of several other parameters of interests. Here we adopt the parametric formulation of the problem and consider statistical inference of several important parameters. We approach the problem in two different ways. First we use the latent failure times model, as suggested by Cox []. In the latent failure times model, it is assumed that the competing risks are independent. Secondly, we use the cause specific hazard functions model as suggested by rentice et al. [9], where the distributions of the competing causes may not be independent. It is assumed that the latent failure time distributions can be either exponential or Weibull. Similarly, under the cause specific hazard functions model, it is assumed that the cause specific hazard functions also can be either exponential or Weibull. Interestingly, in both the formulations, likelihood functions of the observed data are same in each case. Therefore, the estimation procedures of the different unknown * Corresponding author kundu@iitk.ac.in

2 66 D. Kundu Competing Risks parameters and their statistical properties remain unchanged, although the interpretations of the different parameters might be different. In this paper, we make similar assumptions as of Dinse [3]. It is assumed that every member of a certain target population either dies of a particular disease, say cancer or by other causes. A proportion p of the population die of cancer and the proportion ( p) die due to other causes. Suppose that a subject can experience one of J failure types and let us define X be the time of failure and let d be the failure type. At the end of the study, we have the following types of observations fx ¼ t; d ¼ jg fx > tg 3 fx > t; d ¼ jg 4 fx ¼ tg It may be mentioned that Miyakawa [8] and recently Kundu and Basu [5] considered a similar problem in presence of only type and type 4 observations. The rest of the paper is organized as follows. In Section, we provide the formulation in terms of the latent failure times model and describe different notation, which we are going to use throughout this paper. In Section 3, we consider the estimation of different parameters when the latent failure time distributions are exponential. The corresponding confidence intervals are considered in Section 4. The Weibull latent failure distributions are considered in Section 5. One data set is analyzed in Section 6 for illustration purposes. In Section 7, we formulate the problem using the cause specific hazard functions model and finally we draw conclusions from out work in Section 8.. Model Description and Notation Without loss of generality, we assume that there are only two failures types. We use the following notation; X i lifetime of system i X ji lifetime of mode j (¼ or ) of system i FðÞ cumulative distribution function of X i f ðþ density function of X i F j ðþ distribution function of X ji f j ðþ density function of X ji F j ðþ ¼ F j ðþ d i indicator variable denoting cause of failure of system i I½Š indicator function of event ½Š Exp ðlþ denotes exponential random variable with density function l e lx Gamma ða; lþ denotes gamma random variable with density function la GðaÞ xa e lx Weibull ða; lþ denotes Weibull random variable with density function al x a e lxa bin ðn; pþ denotes binomial random variable with parameters n and p. In this Section we formulate the problem using the latent failure times model of Cox []. It is assumed that ðx i ; X i Þ; i ¼ ;...N are N independent and identically distributed (i.i.d.) random variables. It is also assumed that X i and X i are independent for all i ¼ ; ;...N and X i ¼ min fx i ; X i g. Without loss of generality we assume that there are only two failure types, namely and. We assume that we have following observations. The first r observations have complete failure times and corresponding cause of failure is for all of them. We denote this set as I, i.e., the observations are of the type ðx i ; d i Þ; d i ¼ for all and the number of elements in I, j I j,isr. Similarly the next r observations have complete failure times and the corresponding cause of failure is for all of them. We denote this set as I. The next r 3 observations have complete failure times but corresponding causes of failure are unknown and we denote this set as I 3. For the next r 4 observations we know that the failure type will be but failure times have been right censored. We denote this set as I 4. Similarly for the next r 5 observations we have failure type, but corresponding failure times are right censored. Finally we have last r 6 right censored observations,

3 Biometrical Journal 46 (004) 67 where exact failure time and failure type both are unknown and we denote this set as I 6. Therefore, in summary we have following types of observations a ðx i ; Þ; ; ji j¼r ; b ðx i ; Þ; ; ji j¼r ; c ðx i ; Þ; 3 ; ji 3 j¼r 3 ; d ðx i ; Þ; 4 ; ji 4 j¼r 4 ; ðþ e ðx i ; Þ; 5 ; ji 5 j¼r 5 ; f ðx i ; Þ; 6 ; ji 6 j¼r 6 We denote r þ r þ r 3 ¼ n, r 4 þ r 5 þ r 6 ¼ m and therefore, m þ n ¼ N. In order to analyze incomplete data it is assumed that failure times are from the same population as the complete data. We also assume throughout that r þ r ¼ n, r 3, r 4 þ r 5 ¼ m, r 6 are fixed and not random and n > 0. If they are assumed to be random, then all the results are valid conditionally, see Kundu and Basu [5] for details. The likelihood contributions from the observations a; b; c; d; e and f are respectively. f ðxþ F ðxþ ; f ðxþ F ðxþ ; f ðxþ ; Ð x f ðyþ F ðyþ dy ; Ð x f ðyþ F ðyþ dy ; 3. Exponential Latent Failure Time Distributions oint Estimation In this Section, we assume that X ji s are exponential random variables with parameters l j for j ¼ and and for i ¼ ;...; N. The distribution function, F j ðtþ, ofx ji has the following form F j ðtþ ¼ð e ljt Þ ; j ¼ ; In this case the likelihood function of the observed data () is L ¼ l r exp ðl þ l Þ x i l r exp ðl þ l Þ x i ðl þ l Þ r 3 exp ðl þ l Þ r4 l x i exp ðl þ l Þ x i 3 l þ l 4 l r5 exp ðl þ l Þ x i exp ðl þ l Þ x i l þ l 5 6 ¼ l r þr 4 l r þr 5 ðl þ l Þ r 3 r 4 r 5 exp ðl þ l Þ x i ðþ Here I ¼ I [ I [...[ I 6. From the likelihood function (), we obtain MLEs of l and l as nðr þ r 4 Þ nðr ^l ¼ þ r 4 Þ and ^l ¼ ðn þ m Þ ðn þ m Þ x i x i Now to compute mean and variances of ^l and ^l, we observe the following l l r bin n ; ; r bin n ; ; ð3þ l þ l l þ l l l r 4 bin m ; ; r 5 bin m ; ; ð4þ l þ l l þ l nþ m i ¼ X i Gðn þ m; l þ l Þ FðxÞ; ð5þ

4 68 D. Kundu Competing Risks Note that using (3) (5), the exact distributions of ^l and ^l can be easily calculated. Using (3) (5), we obtain n Eð^l Þ¼ n þ m l n l l ; Vð^l Þ¼ ðn þ m Þðnþm Þ n þ m þ l ; n þ m n Eð^l Þ¼ n þ m l n l l ; Vð^l Þ¼ ðn þ m Þðnþm Þ n þ m þ l ð6þ n þ m From (6), it is immediate that ~l ¼ ðr þ r 4 Þðnþm Þ and ~ l ¼ ðr þ r 5 Þðnþm Þ ðn þ m Þ ðn þ m Þ x i x i are uniformly minimum variance unbiased estimators (UMVUEs) of l and l respectively. Note that these results match with those of Miyakawa [8], when r 4 ¼ r 5 ¼ r 6 ¼ 0. The relative risk rate, p, due to cause is ½X i < X i Š¼ ð 0 l exp ð ðl þ l Þ xþdx ¼ l l þ l Observe that r n and r 4 m are unbiased estimators of p, but the MLE of p, ^p, is ^p ¼ ^l ^l þ ^l ¼ r þ r 4 n þ m Note that the distribution ^p is a scaled version of a Binomial distribution and ^p is an unbiased estimator of p. Now, we consider the problem of estimating the mean lifetimes q ¼ and q ¼ due to l l cause and cause respectively. Note that although MLEs and UMVUEs of l and l always exist but the same is not true for q and q. UMVUEs of q and q do not exist and MLEs of q and q exist when r þ r 4 > 0 and r þ r 5 > 0 respectively. First we give the results for q. The conditional MLE (conditioning that r þ r 4 > 0) of q, say ^q, is as follows ^q ¼ x i n þ m nðr þ r 4 Þ Now, we obtain the conditional distribution of ^q, conditioning on r þ r 4 > 0, which in turn helps us in constructing confidence interval q. We need the following lemma for further development. Lemma The conditional moment generating function (mgf) of ^q, say f^q ðtþ, is of the following form f^q ðtþ ¼ E½exp ðt^q Þjr þ r 4 > 0Š " ¼ q n þ m n þ m i n þm ðn þ m Þ! q q i i ¼ i!ðn þ m iþ! q þ q q þ q tðn ðnþmþ # þ m Þ q q ¼ n þ m p i tðn þ m Þ q q ðn þ mþ ni q þ q i ¼ ni q þ q Here q ¼ q and q þ q i n p i ¼ q n þ m ð Þ þ m ðn þ m Þ! q q i i!ðn þ m iþ! q þ q q þ q

5 Biometrical Journal 46 (004) 69 roof of lemma Note that nþ m random variables. Therefore, i ¼ f^q ðtþ ¼ E½ exp ðt^q Þjr þ r 4 > 0 Š ¼ n þ m i ¼ X i is Gamma n þ m; q þ q and r þ r 4 is bin n þ m ; q q E½ exp ðt^q Þ j r þ r 4 ¼ i Šr ½ þ r 4 ¼ i j r þ r 4 > 0 Š l l þ l Note that, p i ¼ r ½ þ r 4 ¼ i j r þ r 4 > 0 Šfor i ¼ ;...; n þ m. Since the moment generating function of Gamma ða; lþ is t a, the result immediately follows. Therefore, we have the following theorem. l Theorem The conditional probability density function (pdf) of ^q, f^q ðxþ, and the conditional probability distribution function, F^q ðxþ, becomes f^q ðxþ ¼ n þ m i ¼ p i g i ðxþ ; F^q ðxþ ¼ n þ m i ¼ p i G i ðxþ Here g i ðxþ and G i ðxþ denote the density function and the distribution function respectively of a gamma niðq þ q Þ random variable with shape parameter ðn þ mþ and scale parameter for i ¼ ;...; n ðn þ m Þðq q Þ þ m. roof of theorem The proof follows immediately from lemma. From theorem, it is clear that the conditional distribution of ^q is a mixture of gamma random variables. It is easy to compute different conditional moments of ^q. We provide the first and the second conditional moments, others can be computed along the same line. We denote two conditional moments as Eð^q Þ and Eð^q Þ respectively. We have n þ m p i Eð^q Þ¼ i ¼ i n ðn þ mþðn q q þ m Þ ð7þ q þ q and! Eð^q Þ¼ n þ m p i i ¼ i n ðn þ mþðnþmþþðn þ m Þ ðq q Þ ðq þ q Þ Note that ^q is not an unbiased estimator of q and since the summation sign denotes inverse moments of positive binomial random variables, it is not possible to give exact expressions. However, it is known (Mendenhall and Lehmann [7]) that if Z is a binomial random variable with parameters N and p, then for large N, E Z j Z > 0 EðZÞ ¼ Np and E Z j Z > 0 ðeðzþþ ¼ ðnpþ. Using these approximations, we have Eð^q Þ n þ m q and Eð^q þ mþðnþmþþ Þðn n n q. Therefore, when m n tends to zero, then ^q is an asymptotically unbiased and consistent estimator of q. For fixed m and n (large), the expression (7) of Eð^q Þ can be used for biased correction. Same results are true for ^q also and they can be obtained simply by interchanging the roles of q and q. 4. Exponential Latent Failure Time Distributions Confidence Intervals In this Section we obtain confidence bounds of l and l using the asymptotic distributions of ^l and ^l.the Fisher Information of l and l is Iðl ; l Þ¼ðI ij ðl ; l ÞÞ for i; j ¼ and. Here I ij ðl ; l Þ¼ ln ðl ; l j

6 70 D. Kundu Competing Risks and I ðl ; l Þ¼ nl þðn þ m Þ l l ðl þ l Þ ; I ðl ; l Þ¼ I ðl ; l Þ¼ ðn m Þ n ðl þ l Þ and I ðl ; l Þ¼ nl þðn þ m Þ l l ðl þ l Þ Therefore, if l ¼ðl ; l Þ and ^l ¼ð^l ; ^l Þ, then we have, ð^l lþ!n 0; I ðl ; l Þ ; where I ðl ; l Þ¼ðIij ðl ; l ÞÞ and I ; l Þ¼ l ðnl þðn þ m Þ l Þ ; nðn þ m Þ I ; l Þ¼ I ; l Þ¼ l l ð ðn m Þþn Þ ; nðn þ m Þ I ; l Þ¼ l ðnl þðn þ m Þ l Þ nðn þ m Þ Similarly, we can have the Fisher Information matrix, Iðq ; q Þ, for q and q by using simply Jacobian transformation. Some other important parameters are, say ½Š p ¼ l ¼ q l þ l q þ q robability of death due to cause. ½Š t ¼ pq þð pþq ¼ pl þð pþl The expected lifetime of all individuals. ½3Š t ¼ pt The cause () specific mortality index. ½4Š t 3 ¼ exp ð l xþ The survival function due to cause. The MLEs of p, t, t and t 3 can be obtained very easily by using the in-variance property of the MLEs and their asymptotic distributions can be obtained using the d-method. 5. Weibull Latent Failure Time Distributions 5. Estimation of parameters In this Section we assume that X ji s are Weibull random variables with parameters ða; l j Þ for j ¼ ; and for i ¼ ;...; n. The distribution function F j ðþ of X ji has the following form F j ðtþ ¼ exp ð l j t a Þ We assume that the lifetime distributions of the different causes follow Weibull distributions with different scale parameters but they have the same shape parameter. The hazard rates due to cause and cause are given ðtþ F ðtþ ¼ al t a ðtþ F ðtþ ¼ al t a respectively. The mean lifetimes due to cause and cause are EðX i Þ¼ G þ and EðX a l i Þ¼ G þ a a l a

7 Biometrical Journal 46 (004) 7 respectively. The relative risk rate, p, due to cause is ð p ¼ X ½ i < X i Š ¼ al x a exp ð ðx a ðl þ l ÞÞ dx ¼ l l þ l 0 and it is independent of the shape parameter. The log-likelihood of the observed data is ln ðlþ ¼ ðr þ r 4 Þ ln ðl Þþðr þ r 5 Þ ln ðl Þþðr 3 r 4 r 5 Þ ln ðl þ l Þ ðl þ l Þ x a i þða Þ ln ðx i Þþðr þ r þ r 3 Þ ln ðaþ [ I [ I 3 Taking derivatives of ln ðlþ with respect to l, l and a and making them equal to zero, give ^l ðaþ ¼ nðr þ r 4 Þ x a i ðn and þ m Þ ^l ðaþ ¼ nðr þ r 5 Þ x a i ðn þ m Þ utting values of ^l ðaþ and ^l ðaþ in ln (L), we obtain ln ðlðaþþ ¼ K þðr þ r þ r 3 Þ ln ðaþ ln x a i þða Þ ln ðx i Þ ; ð8þ [ I [ I 3 here K is a constant independent of a. Now differentiating (8) with respect to a and equating it to zero, we obtain x a i ln ðx i Þ d da ln ðlðaþþ ¼ ðr þ r þ r 3 Þ þ ln ðx i Þþ r þ r þ r 3 ¼ 0 ð9þ [ I [ I 3 a x a i The equation (9) can be written equivalently as 3 x a i ln ðx i Þ ln ðx i Þ 6 a ¼ [ I [ I r þ r þ r 3 x a i ¼ hðaþ (say) Therefore, a simple iterative scheme may be used to solve (0). From the ith iterate a ðiþ, a ði þþ can be obtained as hða ði þþ Þ. Once we obtain ^a, ^l ð^aþ and ^l ð^aþ can be obtained as MLEs of l and l respectively. Alternatively, we can use EM algorithm of Dempster, Laird and Rubin [] to compute MLEs of unknown parameters. The EM algorithm consists of two steps. The first one is the estimation (E) step, where complete observations are left intact and pseudo observations are framed. In this case we consider the E-step as follows. All data belong to category a, b, d, e, and f are left intact, only if the observation x belongs to category c, we form pseudo observation by fractioning x to two partially complete pseudo observation of the form ðx; w ðx; gþþ, ðx; w ðx; gþþ, where g ¼ðl ; l ; aþ. Specifically, the fractional mass, ðw j ðx; gþþ, assigned to this pseudo observation x, is the conditional probability that the individual died from risk j given that the individual had died at time point point x. Therefore, w ðx; gþ ¼ l and w ðx; gþ ¼ l l þ l l þ l From now on we denote w ðx; gþ, w ðx; gþ as w and w respectively. The log-likelihood function of the pseudo data can be written as ln ðl S ðl ; l ; aþþ, where ln ðl S ðl ; l ; aþþ ¼ ðr þ r þ r 3 Þ ln ðaþþðr þ w r 3 Þ ln ðl Þþðr þ w r 3 Þ ln ðl Þ þða Þ ln ðx i Þ ðl þ l Þ x a i ðþ [ I [ I 3 ð0þ

8 7 D. Kundu Competing Risks The maximization step (M) involves the maximization of the log-likelihood function of the pseudo data () with respect to l, l and a. Assuming w j s to be known, we need to maximize L S ðl ; l ; aþ. It has to be performed iteratively. If ðl ðiþ ; lðiþ ; aðiþ Þ is the ith iterate, then ði þ Þth iterate, ðl ðiþþ ; l ðiþþ ; a ðiþþ Þ can be obtained as l ðiþþ ¼ r þ r 3 w x aðiþ i and l ðiþþ Finally a ðiþþ can be obtained as follows; where a ðiþþ ¼ arg max gðaþ ; ¼ r þ r 3 w x aðiþ i gðaþ ¼ ðr þ r þ r 3 Þ ln ðaþþðr þ w r 3 Þ ln ðl ðiþ Þþðr þ w r 3 Þ ln ðl ðiþ þða Þ ln ðx i Þ ðl ðiþ þ lðiþ Þ x a i [ I [ I 3 Equating g 0 ðaþ ¼0, we obtain a ¼ ðl ðiþ þ lðiþ Þ Note that () also can be solved similarly as (0). r þ r þ r 3 ¼ vðaþðsayþ ðþ x a i ln ðx i Þ ln ðx i Þ [ I [ I 3 Þ 5. Confidence intervals In this Section we provide confidence intervals of different parameters. Since it is not possible to obtain the exact distribution of MLEs, we use the asymptotic method. The asymptotic results can be stated as follows ð^l l ; ^l l ; ^a aþ!n 3 ð0; I ðl ; l ; aþþ ð3þ Here Iðl ; l ; aþ is the Fisher Information matrix for the unknown parameters ðl ; l ; aþ. The elements of the 3 3 matrix I ¼ððI ij ÞÞ are as follows I ðl ; l ; aþ ¼ nl þðn þ m Þ l l ðl þ l Þ ; I ðl ; l ; aþ ¼ nl þðn þ m Þ l l ðl þ l Þ ; I ðl ; l ; aþ ¼ n ðn þ m Þ ðl þ l Þ ¼ I ða; l ; l Þ ; I 33 ðl ; l ; aþ ¼ n a þ Vðn þ mþðl þ l Þ ; I 3 ðl ; l ; aþ ¼ ðn þ mþ U ¼ I 3 ðl ; l ; aþ ¼I 3 ða; l ; l Þ¼I 3 ðl ; l ; aþ Here U ¼ EðX a ln ðxþþ and V ¼ EðX a ðln ðxþþ Þ, where X is distributed as Weibull ða; l þ l Þ. Note that U and V can be written as U ¼ aðl þ l Þ wðþ ln l ð þ l Þ V ¼ a ðl þ l Þ ½w0 ðþþwðþðln ðl þ l ÞÞ wðþ ln ðl þ l ÞŠ

9 Biometrical Journal 46 (004) 73 Here wðþ and w 0 ðþ are the digamma and polygamma functions respectively. We can use (3) to compute asymptotic confidence intervals of all unknown parameters. Since it is observed that in many cases (see Efron and Hinkley [4]), it is more appropriate to use the observed information matrix than the expected information matrix, we provide the observed information matrix also. It is possible to obtain the observed information matrix when the EM algorithm is used (Louis [6]). Using the same notation of Louis [6], the observed information matrix, ^I, takes the form ^I ¼ B SS 0 Here B is the 3 3 negative of the second derivative matrix and S is the 3 gradient vector. They are as follows and SðÞ ¼ r þ w r 3 l Sð3Þ ¼ r þ r þ r 3 þ a x a I ; SðÞ ¼r þ w r 3 l x a I ; ln ðx i Þ ðl þ l Þ x a i ln ðx i Þ [ I [ I 3 Bð; Þ ¼ r þ w r 3 l ; Bð; Þ ¼ r þ w r 3 l ; Bð; Þ ¼Bð; Þ ¼0 ; Bð; 3Þ ¼Bð3; Þ ¼Bð; 3Þ ¼Bð3; Þ ¼ x a i ln ðx i Þ ; I I Bð3; 3Þ ¼ r þ r þ r 3 a 6. Data Analysis ðl þ l Þ x a i ðln ðx i ÞÞ We illustrate the proposed parametric estimation technique using one cancer clinical trials. The data was originally analyzed by Dinse [3] using the non-parametric approach. Data Set This example is concerned with the relationship between time until progression and an indicator function of patient status at the time of progression for patients with glioblastoma, a cancer of the brain. Define X as the time until disease progression, as measured in month from the date of randomization. Suppose, d indicates whether a patient is non ambulatory (d ¼ ) or ambulatory (d ¼ ) at the time of their most recent examination within the month before progression. If a patient does not experience a progression by the time of analysis, the patient belongs to category f. For a patient with a known progression time, if there is no record of his/her ambulatory status within the month of preceding progression falls into category c. For detailed discussions on the data set the readers are referred to Dinse [3]. The data set presents times and indicators for 7 patients from a clinical trial. The summary of the data set is as follows r ¼ 4, r ¼ 7, r 3 ¼ 3, r 4 ¼ r 5 ¼ 0, r 6 ¼ 83, n ¼ 58, n ¼ 89, m ¼ 0 and m ¼ 83. Also x i ¼ 3, x i ¼ 50, ¼ 76, x i ¼ 885 and x i ¼ Before fitting our model to the complete data set we would like to test the validity of the proposed parametric models. Since the closeness between the empirical survival function and the estimated survival function is a good measure for model validity for uncensored data, we consider a subset of the data which has only a, b or c types of failure. Assuming that the latent failure time distributions are exponential we obtain the MLEs of the parameters of the restricted data as ^l r ¼ and ^l r ¼ The corresponding UMVUEs are ~ l r ¼ and ~ l r ¼ respectively. In this case MLEs and UMVUEs are quite close to each other and we consider MLEs only. The Kolmogorov-Smirnov (K-S) distance between the empirical distribution function of the restricted data and the estimated distribution function using the exponential model is 0.9 and the corresponding p

10 74 D. Kundu Competing Risks Empirical survival function Estimated exponential survival function Estimated Weibull survival function Fig. Empirical survival function and the estimated survival functions using exponential and Weibull models. value is Similarly assuming that the latent failure time distributions are Weibull, we obtain the MLEs of the parameters of the restricted data as ^l r ¼ ; ^l r ¼ and ^a r ¼ In this case the K-S distance between the empirical distribution function and the estimated distribution function using the Weibull model is 0.73 and the corresponding p value is Since the p value corresponding to the Weibull model is quite high, therefore the preliminary data analysis suggests that the proposed Weibull model can be used to analyze this data set. We plot the empirical survival function and the estimated survival functions using exponential model and 0.8 Empirical survival function Estimated survival function using UMVUE, s Estimated survival function using MLE, s Fig. Empirical survival function and estimated survival functions using MLE s and UMVUE s. Here lifetime distributions are exponential.

11 Biometrical Journal 46 (004) Empirical survival function Estimated survival function using MLE, s Estimated survival function using modified MLE, s Fig. 3 Empirical survival function and the estimated survival functions using MLE s and modified MLE s. Here the lifetime distributions are Weibull. Weibull model in Figure. Since it is assumed that the incomplete data are the coming from the same population as the complete data, therefore it is not unreasonable to use the proposed models to analyze this data set. Now we consider the full data set. Under the assumption that the latent failure times are exponentially distributed, MLEs of l and l are ^l ¼ and ^l ¼ 0059 respectively, the log-likelihood value is Also Vð^l Þ¼5 0 6 and Vð^l Þ¼ Moreover, UMVUEs of l and l are ~ l ¼ and ~ l ¼ respectively, with Vð ~ l Þ¼ and Vð ~ l Þ¼ 0 5. Clearly MLEs and UMVUEs are quite different and it is mainly due to heavy ð00384 þ 0059Þ x censoring. Estimated survival function obtained using the MLEs is e and the corresponding 95% confidence bounds are e ð00384 þ 0059Þ x 003 e ð00384 þ 0059Þ x. The Kolmogorov- Smirnov (K-S) distance between the empirical distribution function and the estimated distribution function is 0.96 and the corresponding p value is almost zero. Similarly the estimated survival function using the UMVUEs is e and the corresponding 95% confidence bands are ð00738 þ 00305Þ x e ð00738 þ 00305Þx 0046 e ð00738 þ 00305Þ x. In this case the K-S distance is 0.5 and the corresponding p value is We provide the empirical survival function and the estimated survival functions using MLEs and UMVUEs in the Figure. Clearly, UMVUEs provide a much better fit to the empirical survival function but still its p value is quite small. Instead of MLEs we prefer to use the bias corrected UMVUEs in this case. Conditional estimates of mean times until progression of patients with non ambulatory status and ambulatory status are q ~ ¼ 3480 and q ~ ¼ 353 respectively. The corresponding 95% confidence intervals are (4.8404,.99) and (6.4388, ) respectively. Clearly the mean progression time of the patients with ambulatory status is significantly more than the patients with non ambulatory status. The biased corrected estimate of the mean progression time of all patients, say ~t ¼ and the corresponding 95% asymptotic confidence interval is (5.094, 3.049). The estimate of the relative risk rate of the non ambulatory status is and the corresponding 95% asymptotic confidence interval is (0.6076, ). Clearly it is siginificantly greater than 0.5. The estimates of the survival functions of the time until progression of the patients with non-ambulatory and ambulatory status at the time point x are e 00738x and e 00305x respectively. The corresponding 95% confidence intervals are e 00738x 0096x e 00738x and e 00305x 0038x e 00305x respectively.

12 76 D. Kundu Competing Risks Now we analyze the data under the assumptions that the latent failure time distributions are Weibull. In this case, we obtain the MLEs of the unknown parameters using the methods proposed in the Section 5. Both the methods converge to the same solutions and they are as follows, ^a ¼ 45; ^l ¼ 00 and ^l ¼ 0009, the corresponding log-likelihood value is We obtain the empirical survival function and the estimated survival function using the MLEs as e ð00 þ 0009Þx45. We provide it in the Figure 3. Unfortunately it does not match very well. The K-S distance between the empirical distribution function and the estimated distribution function is 0.33 and the corresponding p value is almost zero. Similar things happen even in the exponential case also. We feel that the scale parameters are highly biased. We make the bias correction as follows. Assuming that the shape parameter is known to be.45, we transform the date as X a and then estimate the scale parameters by UMVUEs. It gives the improved estimates of l and l, say ~l ¼ 0049 and ~ l ¼ 0078 respectively. We obtain the empirical survival function and the estimated survival function using the improved estimates as e ð0049 þ 0078Þx45. It is also plotted in Figure 3. The K-S distance between the empirical survival function and the estimated survival function in this case is and the corresponding p value is 0.7. We will be using these new estimates from now on. The asymptotic variance covariance matrix of ð^l ; ^l ; ^aþ obtained from the observed information matrix using the EM algorithm is Therefore, by proper multiplication we obtain the variance covariance matrix of ð ~ l ; ~ l ; ~aþ as The asymptotic 95% confidence bounds for l, l and a are (0.0366, 0.049), (0.09, 0.036) and (.350,.940) respectively. Since the 95% confidence interval of a does not include, therefore it is reasonable to use the Weibull model rather than the exponential model for this data set. Even the log-likelihood values also suggest (performing c testing) that the Weibull model is a preferred model than the exponential model in this case. The estimates of the mean lifetime of the patients with nonambulatory and ambulatory status are.5349 and and the corresponding 95% confidence bounds are (5.099, ) and (4.5047, 37.) respectively. The estimate of the relative risk rate of the patients with non-ambulatory status is and the corresponding asymptotic 95% confidence band is (0.5885, 0.85). The estimates of the survival functions of the time until progression of the patients with non-ambulatory and ambulatory status at the time point x are e 0049x45 and e 0078x45 respectively. The corresponding 95% asymptotic confidence bounds are e 0049x45 e 0049x45 x 45 ð ln ðxþþ59570 ðln ðxþþ Þ 0 3 and e 0078x45 e 0078x45 x 45 ð ln ðxþþ07 ðln ðxþþ Þ 0 3. They are plotted in Figure 4 and it is clear from the figure that the two survival functions are significantly different. For illustrations purposes we assume that 0% of the category f data have known status of the patients. We randomly choose 0% of the data from the category f and assign status and status with probability 07 ( p) and 03 respectively. Now the summary of the new data set is as follows r ¼ 4, r ¼ 7, r 3 ¼ 3, r 4 ¼ 7, r 5 ¼ 5, r 6 ¼ 7, n ¼ 58, n ¼ 89, m ¼ and m ¼ 83. Also x i ¼ 3, x i ¼ 50, ¼ 76, ¼ 09, ¼ 76, x i ¼ 700 and x i ¼ 639. Assuming that the lifetime distributions are exponential, the estimates of the scale parameters are as follows ^l ¼ 0037; ^l ¼ 007; ~ l ¼ 0074 and ~ l ¼ Similarly assuming that the lifetime distri-

13 Biometrical Journal 46 (004) Estimated survival function of the ambulatory status patients 95% confidence bands Estimated survival function of the non ambulatory status patients 0. 95% confidence bands Fig. 4 Estimated survival functions and the corresponding 95% confidence bounds for the non-ambulatory and the ambulatory status patients. The lifetime distributions are assumed to be Weibull. butions are Weibull the estimates are ^a ¼ 45; ^l ¼ 004; ^l ¼ 00099; ~ l ¼ 0043 and ~l ¼ Therefore the estimates of the other parameters can be obtained similarly as before, using these estimates. Now we compare ours with Dinse s [3] findings. First of all the non-parametric estimate of FðÞ as obtained by Dinse [3] is quite close to the parametric estimate of the survival function, particularly when the latent failure time distributions are assumed to be Weibull (see Figure 3). This indicates that both analysis should provide comparable results. Dinse [3] also mentioned that the non-parametric estimates of d j ðtþ ¼½d ¼ j j X ¼ tš are very erratic and therefore, he has provided the smoothed estimates of d j ðtþ. In our formulation, d j ðtþ is constant for all t. Therefore, the parametric estimate of d j ðtþ can be considered as the smoothed version of the non-parametric estimates. In our case using Weibull distributions, we obtain ^d ðtþ 070 and ^d ðtþ 030. From Figure of Dinse [3], it is clear that the nonparametric smoothed estimate of d ðtþ is close to Therefore, both methods provide similar results. 7. Cause Specific Hazard Functions Model In this Section, we formulate the problem, using the cause specific hazard functions, as originally proposed by rentice et al. [9]. We assume cause specific hazard functions to be exponential or Weibull. We use lðtþ as the overall hazard function and l j ðtþ as the cause specific hazard function for j ¼ and. It is known that lðtþ ¼l ðtþþl ðtþ. Suppose X, d and FðÞ are same as defined in Sections and, the likelihood contributions of the observations a, b, c, d, e and f are l ðtþ FðtÞ; l ðtþ FðtÞ; lðtþ FðtÞ; Ð respectively. t lðuþ FðuÞ du ; Ð t l ðuþ FðuÞ du; Ð t l ðuþ FðuÞ du;

14 78 D. Kundu Competing Risks 7. Exponential cause specific hazard functions In this subsection, it is assumed l ðtþ ¼l and l ðtþ ¼l In this case the likelihood contributions from the observations a, b, c, d, e and f are l e ðl þl Þ t ; l e ðl þl Þ t ; ðl þ l Þ e ðl þl Þ t ; l l þ l e ðl þl Þ t ; e ðl þl Þ t l l þ l e ðl þl Þ t ; respectively. Therefore, under this formulation, the likelihood function of the observed data () is same as (). 7. Weibull cause specific hazard functions In this subsection, it is assumed, l ðtþ ¼al t a and l ðtþ ¼al t a Therefore, FðtÞ ¼e ðl þl Þ t a In this case the likelihood contributions from the observations a, b, c, d, e and f are al t a e ðl þl Þ t a ; al t a e ðl þl Þ t a ; aðl þ l Þ t a e ðl þl Þ t a ; l e ðl þl Þ t l a ; e ðl þl Þ t a ; e ðl þl Þ t a ; l þ l l þ l respectively. In this case also the likelihood function of the data is same as the likelihood function obtained using independent Weibull latent failure times distributions in Section 5. Since the likelihood functions are equal in both cases, the estimation procedures of the different unknown parameters, namely ðl ; l Þ or ðl ; l ; aþ, and the statistical properties of these estimates are same in both the formulation. 8. Conclusions In this paper we consider the analysis of the partially complete time and type of failure data in presence of competing risks. We formulate the problem in two different ways and it is observed that both of them lead to the same likelihood function. It raises the important question of the identifiability problem of competing risks. Our analysis shows, as expected, that in the above situation it is not possible to identify from the given data whether the competing causes are independent or not without the presence of covariates. References Cox, D. R. (959). The analysis of exponentially distributed lifetimes with two types of failures. Journal of the Royal Statistical Society, Series B, 4 4. Dempster, A.., Laird, N. M., and Rubin, D. B. (977). Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39,. Dinse, G. E. (98). Non-parametric estimation of partially incomplete time and types of failure data. Biometrics 38,

15 Biometrical Journal 46 (004) 79 Efron, B. and Hinkley, D. V. (978). The observed versus expected information. Biometrika 65, Kundu, D. and Basu, S. (000). Analysis of incomplete data in presence of competing risks. Journal of Statistical lanning and Inference 87, 39. Louis, T. A. (98). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B 44,, Mendenhall, W. and Lehmann, E. H. Jr. (960). An approximation of the negative moments of the positive binomial useful in life testing. Technometrics, 7 4. Miyakawa, M. (984). Analysis of incomplete data in competing risks model. IEEE Transactions on Reliability Analysis 33, rentice, R. L., Kalbfleisch, J. D., eterson, Jr., A. V., Flournoy, N., Farewell, V. T. and Breslow, N. E. (978). The analysis of failure times in the presence of competing risks. Biometrics 34,

Analysis of incomplete data in presence of competing risks

Analysis of incomplete data in presence of competing risks Journal of Statistical Planning and Inference 87 (2000) 221 239 www.elsevier.com/locate/jspi Analysis of incomplete data in presence of competing risks Debasis Kundu a;, Sankarshan Basu b a Department

More information

Bayesian Analysis for Partially Complete Time and Type of Failure Data

Bayesian Analysis for Partially Complete Time and Type of Failure Data Bayesian Analysis for Partially Complete Time and Type of Failure Data Debasis Kundu Abstract In this paper we consider the Bayesian analysis of competing risks data, when the data are partially complete

More information

On the Comparison of Fisher Information of the Weibull and GE Distributions

On the Comparison of Fisher Information of the Weibull and GE Distributions On the Comparison of Fisher Information of the Weibull and GE Distributions Rameshwar D. Gupta Debasis Kundu Abstract In this paper we consider the Fisher information matrices of the generalized exponential

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY

COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY Annales Univ Sci Budapest, Sect Comp 45 2016) 45 55 COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY Ágnes M Kovács Budapest, Hungary) Howard M Taylor Newark, DE, USA) Communicated

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Chapter 2 Reliability and Confidence Levels of Fatigue Life

Chapter 2 Reliability and Confidence Levels of Fatigue Life Chapter Reliability and Confidence Levels of Fatigue Life.1 Introduction As is well known, fatigue lives of nominally identical specimens subjected to the same nominal cyclic stress display scatter as

More information

A New Two Sample Type-II Progressive Censoring Scheme

A New Two Sample Type-II Progressive Censoring Scheme A New Two Sample Type-II Progressive Censoring Scheme arxiv:609.05805v [stat.me] 9 Sep 206 Shuvashree Mondal, Debasis Kundu Abstract Progressive censoring scheme has received considerable attention in

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Step-Stress Models and Associated Inference

Step-Stress Models and Associated Inference Department of Mathematics & Statistics Indian Institute of Technology Kanpur August 19, 2014 Outline Accelerated Life Test 1 Accelerated Life Test 2 3 4 5 6 7 Outline Accelerated Life Test 1 Accelerated

More information

On Weighted Exponential Distribution and its Length Biased Version

On Weighted Exponential Distribution and its Length Biased Version On Weighted Exponential Distribution and its Length Biased Version Suchismita Das 1 and Debasis Kundu 2 Abstract In this paper we consider the weighted exponential distribution proposed by Gupta and Kundu

More information

Analysis of Type-II Progressively Hybrid Censored Data

Analysis of Type-II Progressively Hybrid Censored Data Analysis of Type-II Progressively Hybrid Censored Data Debasis Kundu & Avijit Joarder Abstract The mixture of Type-I and Type-II censoring schemes, called the hybrid censoring scheme is quite common in

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017 Department of Statistical Science Duke University FIRST YEAR EXAM - SPRING 017 Monday May 8th 017, 9:00 AM 1:00 PM NOTES: PLEASE READ CAREFULLY BEFORE BEGINNING EXAM! 1. Do not write solutions on the exam;

More information

Modified maximum likelihood estimation of parameters in the log-logistic distribution under progressive Type II censored data with binomial removals

Modified maximum likelihood estimation of parameters in the log-logistic distribution under progressive Type II censored data with binomial removals Modified maximum likelihood estimation of parameters in the log-logistic distribution under progressive Type II censored data with binomial removals D.P.Raykundaliya PG Department of Statistics,Sardar

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Analysis of Middle Censored Data with Exponential Lifetime Distributions

Analysis of Middle Censored Data with Exponential Lifetime Distributions Analysis of Middle Censored Data with Exponential Lifetime Distributions Srikanth K. Iyer S. Rao Jammalamadaka Debasis Kundu Abstract Recently Jammalamadaka and Mangalam (2003) introduced a general censoring

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

Burr Type X Distribution: Revisited

Burr Type X Distribution: Revisited Burr Type X Distribution: Revisited Mohammad Z. Raqab 1 Debasis Kundu Abstract In this paper, we consider the two-parameter Burr-Type X distribution. We observe several interesting properties of this distribution.

More information

Asymptotic efficiency and small sample power of a locally most powerful linear rank test for the log-logistic distribution

Asymptotic efficiency and small sample power of a locally most powerful linear rank test for the log-logistic distribution Math Sci (2014) 8:109 115 DOI 10.1007/s40096-014-0135-4 ORIGINAL RESEARCH Asymptotic efficiency and small sample power of a locally most powerful linear rank test for the log-logistic distribution Hidetoshi

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS. Ping Sa and S.J.

SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS. Ping Sa and S.J. SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS " # Ping Sa and S.J. Lee " Dept. of Mathematics and Statistics, U. of North Florida,

More information

Statistical Analysis of Competing Risks With Missing Causes of Failure

Statistical Analysis of Competing Risks With Missing Causes of Failure Proceedings 59th ISI World Statistics Congress, 25-3 August 213, Hong Kong (Session STS9) p.1223 Statistical Analysis of Competing Risks With Missing Causes of Failure Isha Dewan 1,3 and Uttara V. Naik-Nimbalkar

More information

Generalized Exponential Distribution: Existing Results and Some Recent Developments

Generalized Exponential Distribution: Existing Results and Some Recent Developments Generalized Exponential Distribution: Existing Results and Some Recent Developments Rameshwar D. Gupta 1 Debasis Kundu 2 Abstract Mudholkar and Srivastava [25] introduced three-parameter exponentiated

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM

More information

Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively Hybrid Censoring Scheme

Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively Hybrid Censoring Scheme Applied Mathematical Sciences, Vol. 12, 2018, no. 18, 879-891 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ams.2018.8691 Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively

More information

Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples

Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples 90 IEEE TRANSACTIONS ON RELIABILITY, VOL. 52, NO. 1, MARCH 2003 Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples N. Balakrishnan, N. Kannan, C. T.

More information

A Quasi Gamma Distribution

A Quasi Gamma Distribution 08; 3(4): 08-7 ISSN: 456-45 Maths 08; 3(4): 08-7 08 Stats & Maths www.mathsjournal.com Received: 05-03-08 Accepted: 06-04-08 Rama Shanker Department of Statistics, College of Science, Eritrea Institute

More information

Statistical Estimation

Statistical Estimation Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from

More information

SAMPLE SIZE AND OPTIMAL DESIGNS IN STRATIFIED COMPARATIVE TRIALS TO ESTABLISH THE EQUIVALENCE OF TREATMENT EFFECTS AMONG TWO ETHNIC GROUPS

SAMPLE SIZE AND OPTIMAL DESIGNS IN STRATIFIED COMPARATIVE TRIALS TO ESTABLISH THE EQUIVALENCE OF TREATMENT EFFECTS AMONG TWO ETHNIC GROUPS MARCEL DEKKER, INC. 70 MADISON AVENUE NEW YORK, NY 006 JOURNAL OF BIOPHARMACEUTICAL STATISTICS Vol., No. 4, pp. 553 566, 00 SAMPLE SIZE AND OPTIMAL DESIGNS IN STRATIFIED COMPARATIVE TRIALS TO ESTABLISH

More information

Hybrid Censoring; An Introduction 2

Hybrid Censoring; An Introduction 2 Hybrid Censoring; An Introduction 2 Debasis Kundu Department of Mathematics & Statistics Indian Institute of Technology Kanpur 23-rd November, 2010 2 This is a joint work with N. Balakrishnan Debasis Kundu

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly

More information

exclusive prepublication prepublication discount 25 FREE reprints Order now!

exclusive prepublication prepublication discount 25 FREE reprints Order now! Dear Contributor Please take advantage of the exclusive prepublication offer to all Dekker authors: Order your article reprints now to receive a special prepublication discount and FREE reprints when you

More information

A THREE-PARAMETER WEIGHTED LINDLEY DISTRIBUTION AND ITS APPLICATIONS TO MODEL SURVIVAL TIME

A THREE-PARAMETER WEIGHTED LINDLEY DISTRIBUTION AND ITS APPLICATIONS TO MODEL SURVIVAL TIME STATISTICS IN TRANSITION new series, June 07 Vol. 8, No., pp. 9 30, DOI: 0.307/stattrans-06-07 A THREE-PARAMETER WEIGHTED LINDLEY DISTRIBUTION AND ITS APPLICATIONS TO MODEL SURVIVAL TIME Rama Shanker,

More information

Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA

Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA P. VENKATESAN* K. VISWANATHAN + R. PRABHAKAR* * Tuberculosis Research Centre,

More information

the Presence of k Outliers

the Presence of k Outliers RESEARCH ARTICLE OPEN ACCESS On the Estimation of the Presence of k Outliers for Weibull Distribution in Amal S. Hassan 1, Elsayed A. Elsherpieny 2, and Rania M. Shalaby 3 1 (Department of Mathematical

More information

Bayes Estimation and Prediction of the Two-Parameter Gamma Distribution

Bayes Estimation and Prediction of the Two-Parameter Gamma Distribution Bayes Estimation and Prediction of the Two-Parameter Gamma Distribution Biswabrata Pradhan & Debasis Kundu Abstract In this article the Bayes estimates of two-parameter gamma distribution is considered.

More information

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the

More information

Number of Complete N-ary Subtrees on Galton-Watson Family Trees

Number of Complete N-ary Subtrees on Galton-Watson Family Trees Methodol Comput Appl Probab (2006) 8: 223 233 DOI: 10.1007/s11009-006-8549-6 Number of Complete N-ary Subtrees on Galton-Watson Family Trees George P. Yanev & Ljuben Mutafchiev Received: 5 May 2005 / Revised:

More information

Constant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring

Constant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring Algorithms Research 013, (): 43-49 DOI: 10.593/j.algorithms.01300.0 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, Arif-Ul-Islam Department of Statistics & Operations

More information

3003 Cure. F. P. Treasure

3003 Cure. F. P. Treasure 3003 Cure F. P. reasure November 8, 2000 Peter reasure / November 8, 2000/ Cure / 3003 1 Cure A Simple Cure Model he Concept of Cure A cure model is a survival model where a fraction of the population

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Information Matrix for Pareto(IV), Burr, and Related Distributions

Information Matrix for Pareto(IV), Burr, and Related Distributions MARCEL DEKKER INC. 7 MADISON AVENUE NEW YORK NY 6 3 Marcel Dekker Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker

More information

Statistical Methods for Alzheimer s Disease Studies

Statistical Methods for Alzheimer s Disease Studies Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations

More information

Small sample corrections for LTS and MCD

Small sample corrections for LTS and MCD Metrika (2002) 55: 111 123 > Springer-Verlag 2002 Small sample corrections for LTS and MCD G. Pison, S. Van Aelst*, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

On Sarhan-Balakrishnan Bivariate Distribution

On Sarhan-Balakrishnan Bivariate Distribution J. Stat. Appl. Pro. 1, No. 3, 163-17 (212) 163 Journal of Statistics Applications & Probability An International Journal c 212 NSP On Sarhan-Balakrishnan Bivariate Distribution D. Kundu 1, A. Sarhan 2

More information

Analysis of Progressive Type-II Censoring. in the Weibull Model for Competing Risks Data. with Binomial Removals

Analysis of Progressive Type-II Censoring. in the Weibull Model for Competing Risks Data. with Binomial Removals Applied Mathematical Sciences, Vol. 5, 2011, no. 22, 1073-1087 Analysis of Progressive Type-II Censoring in the Weibull Model for Competing Risks Data with Binomial Removals Reza Hashemi and Leila Amiri

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

More information

Inferences on stress-strength reliability from weighted Lindley distributions

Inferences on stress-strength reliability from weighted Lindley distributions Inferences on stress-strength reliability from weighted Lindley distributions D.K. Al-Mutairi, M.E. Ghitany & Debasis Kundu Abstract This paper deals with the estimation of the stress-strength parameter

More information

Inference for the dependent competing risks model with masked causes of

Inference for the dependent competing risks model with masked causes of Inference for the dependent competing risks model with masked causes of failure Radu V. Craiu and Benjamin Reiser Abstract. The competing risks model is useful in settings in which individuals/units may

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Inference based on the em algorithm for the competing risk model with masked causes of failure

Inference based on the em algorithm for the competing risk model with masked causes of failure Inference based on the em algorithm for the competing risk model with masked causes of failure By RADU V. CRAIU Department of Statistics, University of Toronto, 100 St. George Street, Toronto Ontario,

More information

Dr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests)

Dr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests) Dr. Maddah ENMG 617 EM Statistics 10/15/12 Nonparametric Statistics (2) (Goodness of fit tests) Introduction Probability models used in decision making (Operations Research) and other fields require fitting

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Inference for the dependent competing risks model with masked causes of failure

Inference for the dependent competing risks model with masked causes of failure Lifetime Data Anal (6) : 33 DOI.7/s985-5-78-3 Inference for the dependent competing risks model with masked causes of failure Radu V. Craiu Æ Benamin Reiser Received: January 5 / Accepted: 3 November 5

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), 1605 1620 Comparing of some estimation methods for parameters of the Marshall-Olkin generalized exponential distribution under progressive

More information

Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data

Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data International Mathematical Forum, 3, 2008, no. 33, 1643-1654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Al-khedhairi Department of Statistics and O.R. Faculty

More information

Bivariate Weibull-power series class of distributions

Bivariate Weibull-power series class of distributions Bivariate Weibull-power series class of distributions Saralees Nadarajah and Rasool Roozegar EM algorithm, Maximum likelihood estimation, Power series distri- Keywords: bution. Abstract We point out that

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing

More information

Review Quiz. 1. Prove that in a one-dimensional canonical exponential family, the complete and sufficient statistic achieves the

Review Quiz. 1. Prove that in a one-dimensional canonical exponential family, the complete and sufficient statistic achieves the Review Quiz 1. Prove that in a one-dimensional canonical exponential family, the complete and sufficient statistic achieves the Cramér Rao lower bound (CRLB). That is, if where { } and are scalars, then

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs 11-755 Machine Learning for Signal rocessing Expectation Maximization Mixture Models HMMs Class 9. 21 Sep 2010 1 Learning Distributions for Data roblem: Given a collection of examples from some data, estimate

More information

The Accelerated Failure Time Model Under Biased. Sampling

The Accelerated Failure Time Model Under Biased. Sampling The Accelerated Failure Time Model Under Biased Sampling Micha Mandel and Ya akov Ritov Department of Statistics, The Hebrew University of Jerusalem, Israel July 13, 2009 Abstract Chen (2009, Biometrics)

More information

Maximum Likelihood Estimation. only training data is available to design a classifier

Maximum Likelihood Estimation. only training data is available to design a classifier Introduction to Pattern Recognition [ Part 5 ] Mahdi Vasighi Introduction Bayesian Decision Theory shows that we could design an optimal classifier if we knew: P( i ) : priors p(x i ) : class-conditional

More information

inferences on stress-strength reliability from lindley distributions

inferences on stress-strength reliability from lindley distributions inferences on stress-strength reliability from lindley distributions D.K. Al-Mutairi, M.E. Ghitany & Debasis Kundu Abstract This paper deals with the estimation of the stress-strength parameter R = P (Y

More information

Parametric Maximum Likelihood Estimation of Cure Fraction Using Interval-Censored Data

Parametric Maximum Likelihood Estimation of Cure Fraction Using Interval-Censored Data Columbia International Publishing Journal of Advanced Computing (2013) 1: 43-58 doi:107726/jac20131004 Research Article Parametric Maximum Likelihood Estimation of Cure Fraction Using Interval-Censored

More information

Discriminating Between the Bivariate Generalized Exponential and Bivariate Weibull Distributions

Discriminating Between the Bivariate Generalized Exponential and Bivariate Weibull Distributions Discriminating Between the Bivariate Generalized Exponential and Bivariate Weibull Distributions Arabin Kumar Dey & Debasis Kundu Abstract Recently Kundu and Gupta ( Bivariate generalized exponential distribution,

More information

Survey Sampling: A Necessary Journey in the Prediction World

Survey Sampling: A Necessary Journey in the Prediction World Official Statistics in Honour of Daniel Thorburn, pp. 1 14 Survey Sampling: A Necessary Journey in the Prediction World Jan F. Bjørnstad 1 The design approach is evaluated, using a likelihood approach

More information

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring Communications for Statistical Applications and Methods 2014, Vol. 21, No. 6, 529 538 DOI: http://dx.doi.org/10.5351/csam.2014.21.6.529 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation for Mean

More information

Bayesian Life Test Planning for the Weibull Distribution with Given Shape Parameter

Bayesian Life Test Planning for the Weibull Distribution with Given Shape Parameter Statistics Preprints Statistics 10-8-2002 Bayesian Life Test Planning for the Weibull Distribution with Given Shape Parameter Yao Zhang Iowa State University William Q. Meeker Iowa State University, wqmeeker@iastate.edu

More information

Evaluation of Variance Approximations and Estimators in Maximum Entropy Sampling with Unequal Probability and Fixed Sample Size

Evaluation of Variance Approximations and Estimators in Maximum Entropy Sampling with Unequal Probability and Fixed Sample Size Journal of Official Statistics, Vol. 21, No. 4, 2005, pp. 543 570 Evaluation of Variance Approximations and Estimators in Maximum Entropy Sampling with Unequal robability and Fixed Sample Size Alina Matei

More information

I I FINAL, 01 Jun 8.4 to 31 May TITLE AND SUBTITLE 5 * _- N, '. ', -;

I I FINAL, 01 Jun 8.4 to 31 May TITLE AND SUBTITLE 5 * _- N, '. ', -; R AD-A237 850 E........ I N 11111IIIII U 1 1I!til II II... 1. AGENCY USE ONLY Leave 'VanK) I2. REPORT DATE 3 REPORT TYPE AND " - - I I FINAL, 01 Jun 8.4 to 31 May 88 4. TITLE AND SUBTITLE 5 * _- N, '.

More information

Introduction of Shape/Skewness Parameter(s) in a Probability Distribution

Introduction of Shape/Skewness Parameter(s) in a Probability Distribution Journal of Probability and Statistical Science 7(2), 153-171, Aug. 2009 Introduction of Shape/Skewness Parameter(s) in a Probability Distribution Rameshwar D. Gupta University of New Brunswick Debasis

More information

A Bivariate Weibull Regression Model

A Bivariate Weibull Regression Model c Heldermann Verlag Economic Quality Control ISSN 0940-5151 Vol 20 (2005), No. 1, 1 A Bivariate Weibull Regression Model David D. Hanagal Abstract: In this paper, we propose a new bivariate Weibull regression

More information

Probability Distributions Columns (a) through (d)

Probability Distributions Columns (a) through (d) Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

ST495: Survival Analysis: Maximum likelihood

ST495: Survival Analysis: Maximum likelihood ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping

More information

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your answer

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Point and Interval Estimation of Weibull Parameters Based on Joint Progressively Censored Data

Point and Interval Estimation of Weibull Parameters Based on Joint Progressively Censored Data Point and Interval Estimation of Weibull Parameters Based on Joint Progressively Censored Data Shuvashree Mondal and Debasis Kundu Abstract Analysis of progressively censored data has received considerable

More information

1. (Regular) Exponential Family

1. (Regular) Exponential Family 1. (Regular) Exponential Family The density function of a regular exponential family is: [ ] Example. Poisson(θ) [ ] Example. Normal. (both unknown). ) [ ] [ ] [ ] [ ] 2. Theorem (Exponential family &

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Machine Learning for Signal Processing Expectation Maximization Mixture Models. Bhiksha Raj 27 Oct /

Machine Learning for Signal Processing Expectation Maximization Mixture Models. Bhiksha Raj 27 Oct / Machine Learning for Signal rocessing Expectation Maximization Mixture Models Bhiksha Raj 27 Oct 2016 11755/18797 1 Learning Distributions for Data roblem: Given a collection of examples from some data,

More information

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS040) p.4828 Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

More information