Parameter Estimation for Partially Complete Time and Type of Failure Data
Biometrical Journal 46 (2004), DOI 10.1002/bimj

Debasis Kundu
Department of Mathematics, Indian Institute of Technology Kanpur, Pin 208016, INDIA

Received August 00, revised July 03, accepted December 003

Summary

The theory of competing risks has been developed to assess a specific risk in the presence of other risk factors. In this paper we consider the parametric estimation of different failure modes under partially complete time and type of failure data, using the latent failure times model and the cause specific hazard functions model. Uniformly minimum variance unbiased estimators and maximum likelihood estimators are obtained when the latent failure times and the cause specific hazard functions are exponentially distributed. We also consider the case when they follow Weibull distributions. One data set is used to illustrate the proposed techniques.

Key words: Competing risks; Incomplete data; Failure distribution; Exponential distribution; Maximum likelihood estimators; Minimum variance unbiased estimator.

1. Introduction

In medical studies an investigator is often concerned with the assessment of a specific risk in the presence of other risk factors. In the statistical literature this is well known as the competing risks model. In analyzing the competing risks model, ideally the data consist of a failure time and an indicator denoting the failure type. But in many situations the data may be incomplete, both in failure time and in failure type. Moreover, under some circumstances, the failure type can be determined even when the failure time is censored. For example, consider an analysis of prostatic cancer in older men. Since prostatic cancer is not rapidly lethal and can be diagnosed long before death, a man known to have prostatic cancer who is alive at the time of analysis contributes complete information on the failure type, but incomplete information on the failure time.
On the other hand, without an autopsy, the time of death provides complete information on the failure time but no information on the failure type. Note that in the presence of complete information, several studies have been carried out over the last two decades under both the parametric and the non-parametric set up. Dinse [3] considered the non-parametric estimation of the survival function of a particular cause for incomplete time and type of failure data. Because of the non-parametric formulation, it was difficult to consider estimation and the construction of confidence intervals for several other parameters of interest. Here we adopt the parametric formulation of the problem and consider statistical inference for several important parameters.

We approach the problem in two different ways. First we use the latent failure times model, as suggested by Cox [1]. In the latent failure times model, it is assumed that the competing risks are independent. Secondly, we use the cause specific hazard functions model as suggested by Prentice et al. [9], where the distributions of the competing causes need not be independent. It is assumed that the latent failure time distributions can be either exponential or Weibull. Similarly, under the cause specific hazard functions model, it is assumed that the cause specific hazard functions can be either exponential or Weibull. Interestingly, in both formulations the likelihood functions of the observed data are the same in each case. Therefore, the estimation procedures of the different unknown parameters and their statistical properties remain unchanged, although the interpretations of the different parameters might be different.

* Corresponding author: kundu@iitk.ac.in

In this paper, we make assumptions similar to those of Dinse [3]. It is assumed that every member of a certain target population dies either of a particular disease, say cancer, or of other causes. A proportion $p$ of the population die of cancer and the proportion $(1 - p)$ die due to other causes. Suppose that a subject can experience one of $J$ failure types, and let $X$ be the time of failure and $\delta$ the failure type. At the end of the study, we have the following types of observations:

1. $\{X = t, \delta = j\}$
2. $\{X > t\}$
3. $\{X > t, \delta = j\}$
4. $\{X = t\}$

It may be mentioned that Miyakawa [8] and recently Kundu and Basu [5] considered a similar problem in the presence of only type 1 and type 4 observations.

The rest of the paper is organized as follows. In Section 2, we provide the formulation in terms of the latent failure times model and describe the notation used throughout this paper. In Section 3, we consider the estimation of different parameters when the latent failure time distributions are exponential. The corresponding confidence intervals are considered in Section 4. The Weibull latent failure distributions are considered in Section 5. One data set is analyzed in Section 6 for illustration purposes. In Section 7, we formulate the problem using the cause specific hazard functions model and finally we draw conclusions from our work in Section 8.

2. Model Description and Notation

Without loss of generality, we assume that there are only two failure types.
We use the following notation:

$X_i$: lifetime of system $i$
$X_{ji}$: lifetime of mode $j$ ($= 1$ or $2$) of system $i$
$F(\cdot)$: cumulative distribution function of $X_i$
$f(\cdot)$: density function of $X_i$
$F_j(\cdot)$: distribution function of $X_{ji}$
$f_j(\cdot)$: density function of $X_{ji}$
$\bar F_j(\cdot) = 1 - F_j(\cdot)$
$\delta_i$: indicator variable denoting the cause of failure of system $i$
$I[\cdot]$: indicator function of the event $[\cdot]$
$\mathrm{Exp}(\lambda)$: exponential random variable with density function $\lambda e^{-\lambda x}$
$\mathrm{Gamma}(\alpha, \lambda)$: gamma random variable with density function $\frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x}$
$\mathrm{Weibull}(\alpha, \lambda)$: Weibull random variable with density function $\alpha\lambda x^{\alpha-1} e^{-\lambda x^\alpha}$
$\mathrm{bin}(n, p)$: binomial random variable with parameters $n$ and $p$

In this Section we formulate the problem using the latent failure times model of Cox [1]. It is assumed that $(X_{1i}, X_{2i})$, $i = 1, \ldots, N$, are $N$ independent and identically distributed (i.i.d.) random variables. It is also assumed that $X_{1i}$ and $X_{2i}$ are independent for all $i = 1, 2, \ldots, N$ and that $X_i = \min\{X_{1i}, X_{2i}\}$. Without loss of generality we assume that there are only two failure types, namely 1 and 2.

We assume that we have the following observations. The first $r_1$ observations have complete failure times and the corresponding cause of failure is 1 for all of them. We denote this set as $I_1$, i.e., the observations are of the type $(x_i, \delta_i)$ with $\delta_i = 1$ for all $i \in I_1$, and the number of elements in $I_1$, $|I_1|$, is $r_1$. Similarly the next $r_2$ observations have complete failure times and the corresponding cause of failure is 2 for all of them. We denote this set as $I_2$. The next $r_3$ observations have complete failure times but the corresponding causes of failure are unknown, and we denote this set as $I_3$. For the next $r_4$ observations we know that the failure type will be 1 but the failure times have been right censored. We denote this set as $I_4$. Similarly for the next $r_5$ observations we have failure type 2, but the corresponding failure times are right censored. Finally we have the last $r_6$ right censored observations,
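where exact failure time and failure type both are unknown and we denote this set as $I_6$. Therefore, in summary we have the following types of observations:

a: failure time $x_i$ observed, cause 1, $i \in I_1$, $|I_1| = r_1$;
b: failure time $x_i$ observed, cause 2, $i \in I_2$, $|I_2| = r_2$;
c: failure time $x_i$ observed, cause unknown, $i \in I_3$, $|I_3| = r_3$;
d: failure time right censored at $x_i$, cause 1, $i \in I_4$, $|I_4| = r_4$;
e: failure time right censored at $x_i$, cause 2, $i \in I_5$, $|I_5| = r_5$;
f: failure time right censored at $x_i$, cause unknown, $i \in I_6$, $|I_6| = r_6$.   (1)

We denote $r_1 + r_2 + r_3 = n$ and $r_4 + r_5 + r_6 = m$, so that $m + n = N$. In order to analyze incomplete data it is assumed that their failure times are from the same population as the complete data. We also assume throughout that $r_1 + r_2 = n_1$, $r_3$, $r_4 + r_5 = m_1$ and $r_6$ are fixed and not random, and that $n_1 > 0$. If they are assumed to be random, then all the results are valid conditionally; see Kundu and Basu [5] for details. The likelihood contributions from the observations a, b, c, d, e and f are respectively

$$f_1(x)\,\bar F_2(x);\quad f_2(x)\,\bar F_1(x);\quad f(x);\quad \int_x^\infty f_1(y)\,\bar F_2(y)\,dy;\quad \int_x^\infty f_2(y)\,\bar F_1(y)\,dy;\quad \bar F_1(x)\,\bar F_2(x).$$

3. Exponential Latent Failure Time Distributions: Point Estimation

In this Section, we assume that the $X_{ji}$'s are exponential random variables with parameters $\lambda_j$ for $j = 1$ and $2$ and for $i = 1, \ldots, N$. The distribution function, $F_j(t)$, of $X_{ji}$ has the following form:

$$F_j(t) = 1 - e^{-\lambda_j t}, \quad j = 1, 2.$$

In this case the likelihood function of the observed data (1) is

$$L = \lambda_1^{r_1} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_1} x_i\Big) \cdot \lambda_2^{r_2} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_2} x_i\Big) \cdot (\lambda_1+\lambda_2)^{r_3} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_3} x_i\Big)$$
$$\quad \cdot \Big(\frac{\lambda_1}{\lambda_1+\lambda_2}\Big)^{r_4} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_4} x_i\Big) \cdot \Big(\frac{\lambda_2}{\lambda_1+\lambda_2}\Big)^{r_5} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_5} x_i\Big) \cdot \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I_6} x_i\Big)$$
$$= \lambda_1^{r_1+r_4}\,\lambda_2^{r_2+r_5}\,(\lambda_1+\lambda_2)^{r_3-r_4-r_5} \exp\Big(-(\lambda_1+\lambda_2)\sum_{i \in I} x_i\Big). \qquad (2)$$

Here $I = I_1 \cup I_2 \cup \ldots \cup I_6$. From the likelihood function (2), we obtain the MLEs of $\lambda_1$ and $\lambda_2$ as

$$\hat\lambda_1 = \frac{n(r_1+r_4)}{(n_1+m_1)\sum_{i \in I} x_i} \quad\text{and}\quad \hat\lambda_2 = \frac{n(r_2+r_5)}{(n_1+m_1)\sum_{i \in I} x_i}.$$

Now to compute the means and variances of $\hat\lambda_1$ and $\hat\lambda_2$, we observe the following:

$$r_1 \sim \mathrm{bin}\Big(n_1, \frac{\lambda_1}{\lambda_1+\lambda_2}\Big), \quad r_2 \sim \mathrm{bin}\Big(n_1, \frac{\lambda_2}{\lambda_1+\lambda_2}\Big), \qquad (3)$$

$$r_4 \sim \mathrm{bin}\Big(m_1, \frac{\lambda_1}{\lambda_1+\lambda_2}\Big), \quad r_5 \sim \mathrm{bin}\Big(m_1, \frac{\lambda_2}{\lambda_1+\lambda_2}\Big), \qquad (4)$$

$$\sum_{i=1}^{n+m} X_i \sim \mathrm{Gamma}(n+m, \lambda_1+\lambda_2). \qquad (5)$$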
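The closed-form MLEs above can be checked numerically. The following Python sketch is an illustration only: the function names `exponential_mle` and `score`, and the counts and total time used, are hypothetical and do not come from the paper. It evaluates the two estimators and confirms that they set both partial derivatives of the log of likelihood (2) to zero.

```python
def exponential_mle(r, total_time):
    """MLEs of (lambda1, lambda2) from counts r = (r1, ..., r6) and the sum of all x_i."""
    r1, r2, r3, r4, r5, _r6 = r
    n = r1 + r2 + r3                 # number of observed failure times
    known = r1 + r2 + r4 + r5        # n1 + m1: observations with known cause
    if known == 0 or total_time <= 0:
        raise ValueError("need at least one known cause and a positive total time")
    lam1 = n * (r1 + r4) / (known * total_time)
    lam2 = n * (r2 + r5) / (known * total_time)
    return lam1, lam2


def score(r, total_time, lam1, lam2):
    """Partial derivatives of the log of likelihood (2) w.r.t. lambda1 and lambda2."""
    r1, r2, r3, r4, r5, _r6 = r
    common = (r3 - r4 - r5) / (lam1 + lam2) - total_time
    return (r1 + r4) / lam1 + common, (r2 + r5) / lam2 + common


# Hypothetical counts (r1, ..., r6) and total time, chosen only for illustration.
counts = (5, 3, 4, 1, 1, 4)
lam1_hat, lam2_hat = exponential_mle(counts, total_time=40.0)
s1, s2 = score(counts, 40.0, lam1_hat, lam2_hat)
print(lam1_hat, lam2_hat, s1, s2)
```

Both score components should vanish up to floating-point error, and the ratio `lam1_hat / (lam1_hat + lam2_hat)` equals $(r_1+r_4)/(n_1+m_1)$ by construction.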
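Note that using (3)-(5), the exact distributions of $\hat\lambda_1$ and $\hat\lambda_2$ can be easily calculated. Using (3)-(5), we obtain

$$E(\hat\lambda_1) = \frac{n}{n+m-1}\,\lambda_1, \quad V(\hat\lambda_1) = \frac{n^2}{(n+m-1)(n+m-2)}\left[\frac{\lambda_1\lambda_2}{n_1+m_1} + \frac{\lambda_1^2}{n+m-1}\right],$$

$$E(\hat\lambda_2) = \frac{n}{n+m-1}\,\lambda_2, \quad V(\hat\lambda_2) = \frac{n^2}{(n+m-1)(n+m-2)}\left[\frac{\lambda_1\lambda_2}{n_1+m_1} + \frac{\lambda_2^2}{n+m-1}\right]. \qquad (6)$$

From (6), it is immediate that

$$\tilde\lambda_1 = \frac{(r_1+r_4)(n+m-1)}{(n_1+m_1)\sum_{i \in I} x_i} \quad\text{and}\quad \tilde\lambda_2 = \frac{(r_2+r_5)(n+m-1)}{(n_1+m_1)\sum_{i \in I} x_i}$$

are uniformly minimum variance unbiased estimators (UMVUEs) of $\lambda_1$ and $\lambda_2$ respectively. Note that these results match those of Miyakawa [8] when $r_4 = r_5 = r_6 = 0$.

The relative risk rate, $p$, due to cause 1 is

$$P[X_{1i} < X_{2i}] = \int_0^\infty \lambda_1 \exp(-(\lambda_1+\lambda_2)x)\,dx = \frac{\lambda_1}{\lambda_1+\lambda_2}.$$

Observe that $r_1/n_1$ and $r_4/m_1$ are unbiased estimators of $p$, but the MLE of $p$, $\hat p$, is

$$\hat p = \frac{\hat\lambda_1}{\hat\lambda_1+\hat\lambda_2} = \frac{r_1+r_4}{n_1+m_1}.$$

Note that the distribution of $\hat p$ is a scaled version of a binomial distribution and $\hat p$ is an unbiased estimator of $p$.

Now, we consider the problem of estimating the mean lifetimes $\theta_1 = 1/\lambda_1$ and $\theta_2 = 1/\lambda_2$ due to cause 1 and cause 2 respectively. Note that although the MLEs and UMVUEs of $\lambda_1$ and $\lambda_2$ always exist, the same is not true for $\theta_1$ and $\theta_2$. UMVUEs of $\theta_1$ and $\theta_2$ do not exist, and the MLEs of $\theta_1$ and $\theta_2$ exist when $r_1 + r_4 > 0$ and $r_2 + r_5 > 0$ respectively. First we give the results for $\theta_1$. The conditional MLE (conditioning on $r_1 + r_4 > 0$) of $\theta_1$, say $\hat\theta_1$, is as follows:

$$\hat\theta_1 = \frac{(n_1+m_1)\sum_{i \in I} x_i}{n(r_1+r_4)}.$$

Now, we obtain the conditional distribution of $\hat\theta_1$, conditioning on $r_1 + r_4 > 0$, which in turn helps us in constructing a confidence interval for $\theta_1$. We need the following lemma for further development.

Lemma 1 The conditional moment generating function (mgf) of $\hat\theta_1$, say $\phi_{\hat\theta_1}(t)$, is of the following form:

$$\phi_{\hat\theta_1}(t) = E[\exp(t\hat\theta_1) \mid r_1+r_4 > 0]$$
$$= \left(1 - q^{\,n_1+m_1}\right)^{-1} \sum_{i=1}^{n_1+m_1} \frac{(n_1+m_1)!}{i!\,(n_1+m_1-i)!} \left(\frac{\theta_2}{\theta_1+\theta_2}\right)^{i} \left(\frac{\theta_1}{\theta_1+\theta_2}\right)^{n_1+m_1-i} \left(1 - \frac{t(n_1+m_1)}{n i}\,\frac{\theta_1\theta_2}{\theta_1+\theta_2}\right)^{-(n+m)}$$
$$= \sum_{i=1}^{n_1+m_1} p_i \left(1 - \frac{t(n_1+m_1)}{n i}\,\frac{\theta_1\theta_2}{\theta_1+\theta_2}\right)^{-(n+m)}.$$

Here $q = \dfrac{\theta_1}{\theta_1+\theta_2}$ and

$$p_i = \left(1 - q^{\,n_1+m_1}\right)^{-1} \frac{(n_1+m_1)!}{i!\,(n_1+m_1-i)!} \left(\frac{\theta_2}{\theta_1+\theta_2}\right)^{i} \left(\frac{\theta_1}{\theta_1+\theta_2}\right)^{n_1+m_1-i}.$$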
Proof of Lemma 1 Note that $\sum_{i=1}^{n+m} X_i$ is a $\mathrm{Gamma}\Big(n+m, \dfrac{\theta_1+\theta_2}{\theta_1\theta_2}\Big)$ random variable and $r_1 + r_4$ is a $\mathrm{bin}\Big(n_1+m_1, \dfrac{\lambda_1}{\lambda_1+\lambda_2}\Big)$ random variable. Therefore,

$$\phi_{\hat\theta_1}(t) = E[\exp(t\hat\theta_1) \mid r_1+r_4 > 0] = \sum_{i=1}^{n_1+m_1} E[\exp(t\hat\theta_1) \mid r_1+r_4 = i]\; P[r_1+r_4 = i \mid r_1+r_4 > 0].$$

Note that $p_i = P[r_1+r_4 = i \mid r_1+r_4 > 0]$ for $i = 1, \ldots, n_1+m_1$. Since the moment generating function of $\mathrm{Gamma}(\alpha, \lambda)$ is $\left(1 - \dfrac{t}{\lambda}\right)^{-\alpha}$, the result immediately follows.

Therefore, we have the following theorem.

Theorem 1 The conditional probability density function (pdf) of $\hat\theta_1$, $f_{\hat\theta_1}(x)$, and the conditional probability distribution function, $F_{\hat\theta_1}(x)$, are

$$f_{\hat\theta_1}(x) = \sum_{i=1}^{n_1+m_1} p_i\, g_i(x), \quad F_{\hat\theta_1}(x) = \sum_{i=1}^{n_1+m_1} p_i\, G_i(x).$$

Here $g_i(x)$ and $G_i(x)$ denote the density function and the distribution function respectively of a gamma random variable with shape parameter $(n+m)$ and scale parameter $\dfrac{n i(\theta_1+\theta_2)}{(n_1+m_1)\,\theta_1\theta_2}$ for $i = 1, \ldots, n_1+m_1$.

Proof of Theorem 1 The proof follows immediately from Lemma 1.

From Theorem 1, it is clear that the conditional distribution of $\hat\theta_1$ is a mixture of gamma random variables. It is easy to compute different conditional moments of $\hat\theta_1$. We provide the first and the second conditional moments; others can be computed along the same lines. We denote the two conditional moments by $E(\hat\theta_1)$ and $E(\hat\theta_1^2)$ respectively. We have

$$E(\hat\theta_1) = \sum_{i=1}^{n_1+m_1} \frac{p_i}{i}\;\frac{(n+m)(n_1+m_1)}{n}\;\frac{\theta_1\theta_2}{\theta_1+\theta_2} \qquad (7)$$

and

$$E(\hat\theta_1^2) = \sum_{i=1}^{n_1+m_1} \frac{p_i}{i^2}\;\frac{(n+m)(n+m+1)(n_1+m_1)^2}{n^2}\;\frac{(\theta_1\theta_2)^2}{(\theta_1+\theta_2)^2}.$$

Note that $\hat\theta_1$ is not an unbiased estimator of $\theta_1$, and since the summations involve inverse moments of positive binomial random variables, it is not possible to give exact closed-form expressions. However, it is known (Mendenhall and Lehmann [7]) that if $Z$ is a binomial random variable with parameters $N$ and $p$, then for large $N$,

$$E\Big(\frac{1}{Z} \,\Big|\, Z > 0\Big) \approx \frac{1}{E(Z)} = \frac{1}{Np} \quad\text{and}\quad E\Big(\frac{1}{Z^2} \,\Big|\, Z > 0\Big) \approx \frac{1}{(E(Z))^2} = \frac{1}{(Np)^2}.$$

Using these approximations, we have

$$E(\hat\theta_1) \approx \frac{n+m}{n}\,\theta_1 \quad\text{and}\quad E(\hat\theta_1^2) \approx \frac{(n+m)(n+m+1)}{n^2}\,\theta_1^2.$$

Therefore, when $m/n$ tends to zero, $\hat\theta_1$ is an asymptotically unbiased and consistent estimator of $\theta_1$.
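To see how the exact conditional mean (7) compares with the large-sample approximation $\frac{n+m}{n}\theta_1$, the mixture weights $p_i$ can be evaluated directly. The sketch below is illustrative only: the function names and all numerical values are hypothetical, not taken from the paper.

```python
from math import comb


def positive_binomial_weights(trials, prob):
    """p_i = P[Z = i | Z > 0] for Z ~ bin(trials, prob), i = 1, ..., trials."""
    norm = 1.0 - (1.0 - prob) ** trials
    return [comb(trials, i) * prob**i * (1.0 - prob) ** (trials - i) / norm
            for i in range(1, trials + 1)]


def conditional_mean_theta1(theta1, theta2, n, m, n1, m1):
    """Conditional mean (7) of the MLE of theta1, given r1 + r4 > 0."""
    known = n1 + m1
    prob = theta2 / (theta1 + theta2)      # equals lambda1 / (lambda1 + lambda2)
    weights = positive_binomial_weights(known, prob)
    inverse_moment = sum(p / i for i, p in enumerate(weights, start=1))
    return inverse_moment * (n + m) * known / n * theta1 * theta2 / (theta1 + theta2)


# Hypothetical parameter values and sample sizes, for illustration only.
theta1, theta2 = 10.0, 25.0
n, m, n1, m1 = 40, 20, 30, 10
exact = conditional_mean_theta1(theta1, theta2, n, m, n1, m1)
approx = (n + m) / n * theta1              # large-sample approximation
print(exact, approx, exact / approx)
```

For sample sizes of this order the two values should differ only by a small relative amount, which is the gap that a bias correction based on (7) would remove.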
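For fixed $m$ and $n$ (large), the expression (7) of $E(\hat\theta_1)$ can be used for bias correction. The same results hold for $\hat\theta_2$, and they can be obtained simply by interchanging the roles of $\theta_1$ and $\theta_2$.

4. Exponential Latent Failure Time Distributions: Confidence Intervals

In this Section we obtain confidence bounds of $\lambda_1$ and $\lambda_2$ using the asymptotic distributions of $\hat\lambda_1$ and $\hat\lambda_2$. The Fisher information matrix of $\lambda_1$ and $\lambda_2$ is $I(\lambda_1, \lambda_2) = (I_{ij}(\lambda_1, \lambda_2))$ for $i, j = 1$ and $2$. Here

$$I_{ij}(\lambda_1, \lambda_2) = -E\left[\frac{\partial^2 \ln L(\lambda_1, \lambda_2)}{\partial\lambda_i\,\partial\lambda_j}\right]$$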
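and

$$I_{11}(\lambda_1, \lambda_2) = \frac{n\lambda_1 + (n_1+m_1)\lambda_2}{\lambda_1(\lambda_1+\lambda_2)^2}, \quad I_{12}(\lambda_1, \lambda_2) = I_{21}(\lambda_1, \lambda_2) = \frac{(n - m_1) - n_1}{(\lambda_1+\lambda_2)^2}, \quad I_{22}(\lambda_1, \lambda_2) = \frac{n\lambda_2 + (n_1+m_1)\lambda_1}{\lambda_2(\lambda_1+\lambda_2)^2}.$$

Therefore, if $\lambda = (\lambda_1, \lambda_2)$ and $\hat\lambda = (\hat\lambda_1, \hat\lambda_2)$, then we have

$$(\hat\lambda - \lambda) \to N_2\big(0, I^{-1}(\lambda_1, \lambda_2)\big),$$

where $I^{-1}(\lambda_1, \lambda_2) = (I^{ij}(\lambda_1, \lambda_2))$ and

$$I^{11}(\lambda_1, \lambda_2) = \frac{\lambda_1\big(n\lambda_2 + (n_1+m_1)\lambda_1\big)}{n(n_1+m_1)}, \quad I^{12}(\lambda_1, \lambda_2) = I^{21}(\lambda_1, \lambda_2) = \frac{\lambda_1\lambda_2\big(-(n - m_1) + n_1\big)}{n(n_1+m_1)}, \quad I^{22}(\lambda_1, \lambda_2) = \frac{\lambda_2\big(n\lambda_1 + (n_1+m_1)\lambda_2\big)}{n(n_1+m_1)}.$$

Similarly, we can obtain the Fisher information matrix, $I(\theta_1, \theta_2)$, for $\theta_1$ and $\theta_2$ by simply using the Jacobian transformation. Some other important parameters are:

[1] $p = \dfrac{\lambda_1}{\lambda_1+\lambda_2} = \dfrac{\theta_2}{\theta_1+\theta_2}$: probability of death due to cause 1.
[2] $\tau_1 = p\theta_1 + (1-p)\theta_2 = p\lambda_1^{-1} + (1-p)\lambda_2^{-1}$: the expected lifetime of all individuals.
[3] $\tau_2 = p\tau_1$: the cause (1) specific mortality index.
[4] $\tau_3 = \exp(-\lambda_1 x)$: the survival function due to cause 1.

The MLEs of $p$, $\tau_1$, $\tau_2$ and $\tau_3$ can be obtained very easily by using the invariance property of the MLEs, and their asymptotic distributions can be obtained using the $\delta$-method.

5. Weibull Latent Failure Time Distributions

5.1 Estimation of parameters

In this Section we assume that the $X_{ji}$'s are Weibull random variables with parameters $(\alpha, \lambda_j)$ for $j = 1, 2$ and for $i = 1, \ldots, n$. The distribution function $F_j(\cdot)$ of $X_{ji}$ has the following form:

$$F_j(t) = 1 - \exp(-\lambda_j t^\alpha).$$

We assume that the lifetime distributions of the different causes follow Weibull distributions with different scale parameters but the same shape parameter. The hazard rates due to cause 1 and cause 2 are given by

$$\frac{f_1(t)}{\bar F_1(t)} = \alpha\lambda_1 t^{\alpha-1} \quad\text{and}\quad \frac{f_2(t)}{\bar F_2(t)} = \alpha\lambda_2 t^{\alpha-1}$$

respectively. The mean lifetimes due to cause 1 and cause 2 are

$$E(X_{1i}) = \frac{1}{\lambda_1^{1/\alpha}}\,\Gamma\Big(1 + \frac{1}{\alpha}\Big) \quad\text{and}\quad E(X_{2i}) = \frac{1}{\lambda_2^{1/\alpha}}\,\Gamma\Big(1 + \frac{1}{\alpha}\Big)$$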
respectively. The relative risk rate, $p$, due to cause 1 is

$$p = P[X_{1i} < X_{2i}] = \int_0^\infty \alpha\lambda_1 x^{\alpha-1} \exp\big(-x^\alpha(\lambda_1+\lambda_2)\big)\,dx = \frac{\lambda_1}{\lambda_1+\lambda_2}$$

and it is independent of the shape parameter. The log-likelihood of the observed data is

$$\ln L = (r_1+r_4)\ln\lambda_1 + (r_2+r_5)\ln\lambda_2 + (r_3-r_4-r_5)\ln(\lambda_1+\lambda_2) - (\lambda_1+\lambda_2)\sum_{i \in I} x_i^\alpha + (\alpha-1)\sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i + (r_1+r_2+r_3)\ln\alpha.$$

Taking derivatives of $\ln L$ with respect to $\lambda_1$, $\lambda_2$ and $\alpha$ and setting them equal to zero gives

$$\hat\lambda_1(\alpha) = \frac{n(r_1+r_4)}{(n_1+m_1)\sum_{i \in I} x_i^\alpha} \quad\text{and}\quad \hat\lambda_2(\alpha) = \frac{n(r_2+r_5)}{(n_1+m_1)\sum_{i \in I} x_i^\alpha}.$$

Putting the values of $\hat\lambda_1(\alpha)$ and $\hat\lambda_2(\alpha)$ in $\ln L$, we obtain

$$\ln L(\alpha) = K + (r_1+r_2+r_3)\ln\alpha - (r_1+r_2+r_3)\ln\Big(\sum_{i \in I} x_i^\alpha\Big) + (\alpha-1)\sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i, \qquad (8)$$

where $K$ is a constant independent of $\alpha$. Now differentiating (8) with respect to $\alpha$ and equating it to zero, we obtain

$$\frac{d}{d\alpha}\ln L(\alpha) = -(r_1+r_2+r_3)\,\frac{\sum_{i \in I} x_i^\alpha \ln x_i}{\sum_{i \in I} x_i^\alpha} + \sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i + \frac{r_1+r_2+r_3}{\alpha} = 0. \qquad (9)$$

Equation (9) can be written equivalently as

$$\alpha = \left[\frac{\sum_{i \in I} x_i^\alpha \ln x_i}{\sum_{i \in I} x_i^\alpha} - \frac{1}{r_1+r_2+r_3}\sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i\right]^{-1} = h(\alpha) \text{ (say)}. \qquad (10)$$

Therefore, a simple iterative scheme may be used to solve (10). From the $i$th iterate $\alpha^{(i)}$, the next iterate $\alpha^{(i+1)}$ can be obtained as $h(\alpha^{(i)})$. Once we obtain $\hat\alpha$, the MLEs of $\lambda_1$ and $\lambda_2$ can be obtained as $\hat\lambda_1(\hat\alpha)$ and $\hat\lambda_2(\hat\alpha)$ respectively.

Alternatively, we can use the EM algorithm of Dempster, Laird and Rubin [2] to compute the MLEs of the unknown parameters. The EM algorithm consists of two steps. The first one is the estimation (E) step, where complete observations are left intact and pseudo observations are formed. In this case we consider the E-step as follows. All data belonging to categories a, b, d, e and f are left intact; only if the observation $x$ belongs to category c do we form pseudo observations, by fractioning $x$ into two partially complete pseudo observations of the form $(x, w_1(x; \gamma))$, $(x, w_2(x; \gamma))$, where $\gamma = (\lambda_1, \lambda_2, \alpha)$.
Specifically, the fractional mass, $w_j(x; \gamma)$, assigned to this pseudo observation $x$ is the conditional probability that the individual died from risk $j$ given that the individual died at time point $x$. Therefore,

$$w_1(x; \gamma) = \frac{\lambda_1}{\lambda_1+\lambda_2} \quad\text{and}\quad w_2(x; \gamma) = \frac{\lambda_2}{\lambda_1+\lambda_2}.$$

From now on we denote $w_1(x; \gamma)$ and $w_2(x; \gamma)$ as $w_1$ and $w_2$ respectively. The log-likelihood function of the pseudo data can be written as $\ln L_S(\lambda_1, \lambda_2, \alpha)$, where

$$\ln L_S(\lambda_1, \lambda_2, \alpha) = (r_1+r_2+r_3)\ln\alpha + (r_1 + w_1 r_3)\ln\lambda_1 + (r_2 + w_2 r_3)\ln\lambda_2 + (\alpha-1)\sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i - (\lambda_1+\lambda_2)\sum_{i \in I} x_i^\alpha. \qquad (11)$$
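As a concrete companion to the direct fixed-point scheme for equation (10), the following Python sketch iterates $\alpha \leftarrow h(\alpha)$ and then checks the profile score (9). It is a minimal illustration under stated assumptions: the function names `h`, `solve_alpha` and `profile_score`, the starting value, the tolerance and the data are all hypothetical, and convergence of the plain iteration is not guaranteed in general, hence the iteration cap.

```python
import math


def h(alpha, all_times, complete_times):
    """Right-hand side of the fixed-point equation (10)."""
    powered = [x**alpha for x in all_times]
    weighted_log = sum(p * math.log(x) for p, x in zip(powered, all_times)) / sum(powered)
    mean_log = sum(math.log(x) for x in complete_times) / len(complete_times)
    return 1.0 / (weighted_log - mean_log)


def solve_alpha(all_times, complete_times, start=1.0, tol=1e-10, max_iter=500):
    """Iterate alpha <- h(alpha) until successive values agree to within tol."""
    alpha = start
    for _ in range(max_iter):
        new_alpha = h(alpha, all_times, complete_times)
        if abs(new_alpha - alpha) < tol:
            return new_alpha
        alpha = new_alpha
    raise RuntimeError("fixed-point iteration did not converge")


def profile_score(alpha, all_times, complete_times):
    """Left-hand side of equation (9): derivative of the profile log-likelihood (8)."""
    k = len(complete_times)                      # r1 + r2 + r3
    powered = [x**alpha for x in all_times]
    weighted_log = sum(p * math.log(x) for p, x in zip(powered, all_times)) / sum(powered)
    return -k * weighted_log + sum(math.log(x) for x in complete_times) + k / alpha


# Hypothetical data: four observed failure times and two right-censored times.
complete = [1.0, 2.0, 3.0, 4.0]
censored = [5.0, 5.0]
alpha_hat = solve_alpha(complete + censored, complete)
print(alpha_hat, profile_score(alpha_hat, complete + censored, complete))
```

Here `all_times` plays the role of the index set $I$ and `complete_times` the role of $I_1 \cup I_2 \cup I_3$. Once `alpha_hat` is available, $\hat\lambda_1(\hat\alpha)$ and $\hat\lambda_2(\hat\alpha)$ follow from the closed-form expressions in terms of $\sum_{i \in I} x_i^{\hat\alpha}$.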
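The maximization (M) step involves the maximization of the log-likelihood function of the pseudo data (11) with respect to $\lambda_1$, $\lambda_2$ and $\alpha$. Assuming the $w_j$'s to be known, we need to maximize $L_S(\lambda_1, \lambda_2, \alpha)$. It has to be performed iteratively. If $(\lambda_1^{(i)}, \lambda_2^{(i)}, \alpha^{(i)})$ is the $i$th iterate, then the $(i+1)$th iterate, $(\lambda_1^{(i+1)}, \lambda_2^{(i+1)}, \alpha^{(i+1)})$, can be obtained as

$$\lambda_1^{(i+1)} = \frac{r_1 + r_3 w_1}{\sum_{i \in I} x_i^{\alpha^{(i)}}} \quad\text{and}\quad \lambda_2^{(i+1)} = \frac{r_2 + r_3 w_2}{\sum_{i \in I} x_i^{\alpha^{(i)}}}.$$

Finally $\alpha^{(i+1)}$ can be obtained as follows:

$$\alpha^{(i+1)} = \arg\max g(\alpha),$$

where

$$g(\alpha) = (r_1+r_2+r_3)\ln\alpha + (r_1 + w_1 r_3)\ln\lambda_1^{(i)} + (r_2 + w_2 r_3)\ln\lambda_2^{(i)} + (\alpha-1)\sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i - (\lambda_1^{(i)} + \lambda_2^{(i)})\sum_{i \in I} x_i^\alpha.$$

Equating $g'(\alpha) = 0$, we obtain

$$\alpha = \frac{r_1+r_2+r_3}{(\lambda_1^{(i)} + \lambda_2^{(i)})\sum_{i \in I} x_i^\alpha \ln x_i - \sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i} = v(\alpha) \text{ (say)}. \qquad (12)$$

Note that (12) can also be solved in the same way as (10).

5.2 Confidence intervals

In this Section we provide confidence intervals for the different parameters. Since it is not possible to obtain the exact distribution of the MLEs, we use the asymptotic method. The asymptotic results can be stated as follows:

$$(\hat\lambda_1 - \lambda_1, \hat\lambda_2 - \lambda_2, \hat\alpha - \alpha) \to N_3\big(0, I^{-1}(\lambda_1, \lambda_2, \alpha)\big). \qquad (13)$$

Here $I(\lambda_1, \lambda_2, \alpha)$ is the Fisher information matrix for the unknown parameters $(\lambda_1, \lambda_2, \alpha)$. The elements of the $3 \times 3$ matrix $I = ((I_{ij}))$ are as follows:

$$I_{11}(\lambda_1, \lambda_2, \alpha) = \frac{n\lambda_1 + (n_1+m_1)\lambda_2}{\lambda_1(\lambda_1+\lambda_2)^2}, \quad I_{22}(\lambda_1, \lambda_2, \alpha) = \frac{n\lambda_2 + (n_1+m_1)\lambda_1}{\lambda_2(\lambda_1+\lambda_2)^2},$$

$$I_{12}(\lambda_1, \lambda_2, \alpha) = \frac{n - (n_1+m_1)}{(\lambda_1+\lambda_2)^2} = I_{21}(\lambda_1, \lambda_2, \alpha), \quad I_{33}(\lambda_1, \lambda_2, \alpha) = \frac{n}{\alpha^2} + V(n+m)(\lambda_1+\lambda_2),$$

$$I_{13}(\lambda_1, \lambda_2, \alpha) = (n+m)\,U = I_{31}(\lambda_1, \lambda_2, \alpha) = I_{23}(\lambda_1, \lambda_2, \alpha) = I_{32}(\lambda_1, \lambda_2, \alpha).$$

Here $U = E(X^\alpha \ln X)$ and $V = E(X^\alpha (\ln X)^2)$, where $X$ is distributed as $\mathrm{Weibull}(\alpha, \lambda_1+\lambda_2)$. Note that $U$ and $V$ can be written as

$$U = \frac{1}{\alpha(\lambda_1+\lambda_2)}\big[\psi(2) - \ln(\lambda_1+\lambda_2)\big],$$

$$V = \frac{1}{\alpha^2(\lambda_1+\lambda_2)}\big[\psi'(2) + \psi^2(2) + (\ln(\lambda_1+\lambda_2))^2 - 2\psi(2)\ln(\lambda_1+\lambda_2)\big].$$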
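The closed forms for $U$ and $V$ can be sanity-checked numerically. The sketch below is illustrative only; the function names and parameter values are hypothetical. It uses the standard facts that the digamma function at 2 equals $1 - \gamma_E$ (with $\gamma_E$ the Euler-Mascheroni constant) and that its derivative at 2 equals $\pi^2/6 - 1$, and compares the closed forms with a midpoint-rule integration after the substitution $Y = (\lambda_1+\lambda_2)X^\alpha$, which is standard exponential.

```python
import math

EULER_GAMMA = 0.5772156649015329
PSI_2 = 1.0 - EULER_GAMMA            # digamma function evaluated at 2
PSI1_2 = math.pi**2 / 6.0 - 1.0      # derivative of the digamma function at 2


def u_closed(alpha, lam):
    """Closed form of U = E[X^alpha ln X], with lam = lambda1 + lambda2."""
    return (PSI_2 - math.log(lam)) / (alpha * lam)


def v_closed(alpha, lam):
    """Closed form of V = E[X^alpha (ln X)^2], with lam = lambda1 + lambda2."""
    log_lam = math.log(lam)
    return (PSI1_2 + PSI_2**2 + log_lam**2 - 2.0 * PSI_2 * log_lam) / (alpha**2 * lam)


def weibull_moment(alpha, lam, power, steps=200_000, upper=50.0):
    """E[X^alpha (ln X)^power] for X ~ Weibull(alpha, lam), by the midpoint rule.

    Substitutes Y = lam * X^alpha, so the integral is taken against exp(-y).
    """
    width = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * width
        log_x = math.log(y / lam) / alpha
        total += (y / lam) * log_x**power * math.exp(-y)
    return total * width


# Hypothetical parameter values, for illustration only.
alpha, lam = 1.5, 0.1
u_num = weibull_moment(alpha, lam, power=1)
v_num = weibull_moment(alpha, lam, power=2)
print(u_closed(alpha, lam), u_num, v_closed(alpha, lam), v_num)
```

Agreement between the two columns of output supports the reconstruction of the bracketed terms in $U$ and $V$; it does not replace the analytical derivation.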
Here $\psi(\cdot)$ and $\psi'(\cdot)$ are the digamma and polygamma functions respectively. We can use (13) to compute asymptotic confidence intervals for all the unknown parameters. Since it has been observed that in many cases (see Efron and Hinkley [4]) it is more appropriate to use the observed information matrix than the expected information matrix, we provide the observed information matrix also. It is possible to obtain the observed information matrix when the EM algorithm is used (Louis [6]). Using the same notation as Louis [6], the observed information matrix, $\hat I$, takes the form

$$\hat I = B - SS'.$$

Here $B$ is the $3 \times 3$ negative of the second derivative matrix and $S$ is the $3 \times 1$ gradient vector. They are as follows:

$$S(1) = \frac{r_1 + w_1 r_3}{\lambda_1} - \sum_{i \in I} x_i^\alpha, \quad S(2) = \frac{r_2 + w_2 r_3}{\lambda_2} - \sum_{i \in I} x_i^\alpha,$$

$$S(3) = \frac{r_1+r_2+r_3}{\alpha} + \sum_{i \in I_1 \cup I_2 \cup I_3} \ln x_i - (\lambda_1+\lambda_2)\sum_{i \in I} x_i^\alpha \ln x_i$$

and

$$B(1,1) = \frac{r_1 + w_1 r_3}{\lambda_1^2}, \quad B(2,2) = \frac{r_2 + w_2 r_3}{\lambda_2^2}, \quad B(1,2) = B(2,1) = 0,$$

$$B(1,3) = B(3,1) = B(2,3) = B(3,2) = \sum_{i \in I} x_i^\alpha \ln x_i,$$

$$B(3,3) = \frac{r_1+r_2+r_3}{\alpha^2} + (\lambda_1+\lambda_2)\sum_{i \in I} x_i^\alpha (\ln x_i)^2.$$

6. Data Analysis

We illustrate the proposed parametric estimation technique using data from one cancer clinical trial. The data were originally analyzed by Dinse [3] using the non-parametric approach.

Data Set

This example is concerned with the relationship between time until progression and an indicator of patient status at the time of progression for patients with glioblastoma, a cancer of the brain. Define $X$ as the time until disease progression, measured in months from the date of randomization. Suppose $\delta$ indicates whether a patient is non-ambulatory ($\delta = 1$) or ambulatory ($\delta = 2$) at the time of the most recent examination within the month before progression. If a patient does not experience a progression by the time of analysis, the patient belongs to category f. A patient with a known progression time but no record of ambulatory status within the month preceding progression falls into category c.
For detailed discussions on the data set the readers are referred to Dinse [3]. The data set presents times and indicators for 172 patients from a clinical trial. The summary of the data set is as follows: $r_1 = 41$, $r_2 = 17$, $r_3 = 31$, $r_4 = r_5 = 0$, $r_6 = 83$, $n_1 = 58$, $n = 89$, $m_1 = 0$ and $m = 83$. Also $\sum x_i = 3$, $\sum x_i = 50$, $\sum x_i = 76$, $\sum x_i = 885$ and $\sum x_i =$ .

Before fitting our model to the complete data set we would like to test the validity of the proposed parametric models. Since the closeness between the empirical survival function and the estimated survival function is a good measure of model validity for uncensored data, we consider a subset of the data which has only a, b or c types of failure. Assuming that the latent failure time distributions are exponential, we obtain the MLEs of the parameters of the restricted data as $\hat\lambda_{1r} =$ and $\hat\lambda_{2r} =$ . The corresponding UMVUEs are $\tilde\lambda_{1r} =$ and $\tilde\lambda_{2r} =$ respectively. In this case the MLEs and UMVUEs are quite close to each other and we consider the MLEs only. The Kolmogorov-Smirnov (K-S) distance between the empirical distribution function of the restricted data and the estimated distribution function using the exponential model is 0.9 and the corresponding p
value is . Similarly, assuming that the latent failure time distributions are Weibull, we obtain the MLEs of the parameters of the restricted data as $\hat\lambda_{1r} =$ ; $\hat\lambda_{2r} =$ and $\hat\alpha_r =$ . In this case the K-S distance between the empirical distribution function and the estimated distribution function using the Weibull model is 0.73 and the corresponding p value is . Since the p value corresponding to the Weibull model is quite high, the preliminary data analysis suggests that the proposed Weibull model can be used to analyze this data set. We plot the empirical survival function and the estimated survival functions using the exponential model and the Weibull model in Figure 1. Since it is assumed that the incomplete data come from the same population as the complete data, it is not unreasonable to use the proposed models to analyze this data set.

Fig. 1 Empirical survival function and the estimated survival functions using exponential and Weibull models.

Fig. 2 Empirical survival function and estimated survival functions using MLEs and UMVUEs. Here the lifetime distributions are exponential.

Fig. 3 Empirical survival function and the estimated survival functions using MLEs and modified MLEs. Here the lifetime distributions are Weibull.

Now we consider the full data set. Under the assumption that the latent failure times are exponentially distributed, the MLEs of $\lambda_1$ and $\lambda_2$ are $\hat\lambda_1 = 0.0384$ and $\hat\lambda_2 = 0.0159$ respectively, and the log-likelihood value is . Also $V(\hat\lambda_1) =$ 5 0 6 and $V(\hat\lambda_2) =$ . Moreover, the UMVUEs of $\lambda_1$ and $\lambda_2$ are $\tilde\lambda_1 = 0.0738$ and $\tilde\lambda_2 = 0.0305$ respectively, with $V(\tilde\lambda_1) =$ and $V(\tilde\lambda_2) =$ 0 5. Clearly the MLEs and UMVUEs are quite different, and this is mainly due to heavy censoring. The estimated survival function obtained using the MLEs is $e^{-(0.0384 + 0.0159)x}$ and the corresponding 95% confidence bounds are $e^{-(0.0384 + 0.0159)x}$ 003 $e^{-(0.0384 + 0.0159)x}$. The Kolmogorov-Smirnov (K-S) distance between the empirical distribution function and the estimated distribution function is 0.96 and the corresponding p value is almost zero. Similarly the estimated survival function using the UMVUEs is $e^{-(0.0738 + 0.0305)x}$ and the corresponding 95% confidence bands are $e^{-(0.0738 + 0.0305)x}$ 0046 $e^{-(0.0738 + 0.0305)x}$. In this case the K-S distance is 0.5 and the corresponding p value is . We provide the empirical survival function and the estimated survival functions using MLEs and UMVUEs in Figure 2. Clearly, the UMVUEs provide a much better fit to the empirical survival function, but the p value is still quite small. Instead of MLEs we prefer to use the bias corrected UMVUEs in this case. Conditional estimates of the mean times until progression of patients with non-ambulatory status and ambulatory status are $\tilde\theta_1 =$ 3480 and $\tilde\theta_2 =$ 353 respectively.
The corresponding 95% confidence intervals are (4.8404,.99) and (6.4388, ) respectively. Clearly the mean progression time of the patients with ambulatory status is significantly greater than that of the patients with non-ambulatory status. The bias corrected estimate of the mean progression time of all patients is, say, $\tilde\tau_1 =$ and the corresponding 95% asymptotic confidence interval is (5.094, 3.049). The estimate of the relative risk rate of the non-ambulatory status is and the corresponding 95% asymptotic confidence interval is (0.6076, ). Clearly it is significantly greater than 0.5. The estimates of the survival functions of the time until progression of the patients with non-ambulatory and ambulatory status at the time point $x$ are $e^{-0.0738x}$ and $e^{-0.0305x}$ respectively. The corresponding 95% confidence intervals are $e^{-0.0738x}$ 0096 $x\,e^{-0.0738x}$ and $e^{-0.0305x}$ 0038 $x\,e^{-0.0305x}$ respectively.
Now we analyze the data under the assumption that the latent failure time distributions are Weibull. In this case, we obtain the MLEs of the unknown parameters using the methods proposed in Section 5. Both methods converge to the same solutions, and they are as follows: $\hat\alpha =$ 45; $\hat\lambda_1 =$ 00 and $\hat\lambda_2 =$ 0009; the corresponding log-likelihood value is . We obtain the empirical survival function and the estimated survival function using the MLEs as $e^{-(00 + 0009)x^{45}}$. We provide it in Figure 3. Unfortunately it does not match very well. The K-S distance between the empirical distribution function and the estimated distribution function is 0.33 and the corresponding p value is almost zero. A similar thing happens in the exponential case also. We feel that the estimates of the scale parameters are highly biased. We make the bias correction as follows. Assuming that the shape parameter is known to be .45, we transform the data as $X^\alpha$ and then estimate the scale parameters by UMVUEs. It gives the improved estimates of $\lambda_1$ and $\lambda_2$, say $\tilde\lambda_1 =$ 0049 and $\tilde\lambda_2 =$ 0078 respectively. We obtain the empirical survival function and the estimated survival function using the improved estimates as $e^{-(0049 + 0078)x^{45}}$. It is also plotted in Figure 3. The K-S distance between the empirical survival function and the estimated survival function in this case is and the corresponding p value is 0.7. We will be using these new estimates from now on. The asymptotic variance covariance matrix of $(\hat\lambda_1, \hat\lambda_2, \hat\alpha)$ obtained from the observed information matrix using the EM algorithm is . Therefore, by proper multiplication we obtain the variance covariance matrix of $(\tilde\lambda_1, \tilde\lambda_2, \tilde\alpha)$ as . The asymptotic 95% confidence bounds for $\lambda_1$, $\lambda_2$ and $\alpha$ are (0.0366, 0.049), (0.09, 0.036) and (.350,.940) respectively. Since the 95% confidence interval of $\alpha$ does not include 1, it is reasonable to use the Weibull model rather than the exponential model for this data set.
The log-likelihood values also suggest (performing $\chi^2$ testing) that the Weibull model is preferable to the exponential model in this case. The estimates of the mean lifetime of the patients with non-ambulatory and ambulatory status are .5349 and and the corresponding 95% confidence bounds are (5.099, ) and (4.5047, 37.) respectively. The estimate of the relative risk rate of the patients with non-ambulatory status is and the corresponding asymptotic 95% confidence band is (0.5885, 0.85). The estimates of the survival functions of the time until progression of the patients with non-ambulatory and ambulatory status at the time point $x$ are $e^{-0049x^{45}}$ and $e^{-0078x^{45}}$ respectively. The corresponding 95% asymptotic confidence bounds are $e^{-0049x^{45}}$ $e^{-0049x^{45}} x^{45}$ ( ln(x) + 59570 (ln(x)) ) 0 3 and $e^{-0078x^{45}}$ $e^{-0078x^{45}} x^{45}$ ( ln(x) + 07 (ln(x)) ) 0 3. They are plotted in Figure 4 and it is clear from the figure that the two survival functions are significantly different.

Fig. 4 Estimated survival functions and the corresponding 95% confidence bounds for the non-ambulatory and the ambulatory status patients. The lifetime distributions are assumed to be Weibull.

For illustration purposes we assume that 0% of the category f data have known status of the patients. We randomly choose 0% of the data from category f and assign status 1 and status 2 with probability 0.7 ($\approx \hat p$) and 0.3 respectively. Now the summary of the new data set is as follows: $r_1 = 41$, $r_2 = 17$, $r_3 = 31$, $r_4 = 7$, $r_5 = 5$, $r_6 = 71$, $n_1 = 58$, $n = 89$, $m_1 = 12$ and $m = 83$. Also $\sum x_i = 3$, $\sum x_i = 50$, $\sum x_i = 76$, $\sum x_i = 09$, $\sum x_i = 76$, $\sum x_i = 700$ and $\sum x_i = 639$. Assuming that the lifetime distributions are exponential, the estimates of the scale parameters are as follows: $\hat\lambda_1 =$ 0037; $\hat\lambda_2 =$ 007; $\tilde\lambda_1 =$ 0074 and $\tilde\lambda_2 =$ . Similarly, assuming that the lifetime distributions are Weibull, the estimates are $\hat\alpha =$ 45; $\hat\lambda_1 =$ 004; $\hat\lambda_2 =$ 00099; $\tilde\lambda_1 =$ 0043 and $\tilde\lambda_2 =$ . Therefore the estimates of the other parameters can be obtained as before, using these estimates.

Now we compare our findings with those of Dinse [3]. First of all, the non-parametric estimate of $\bar F(\cdot)$ obtained by Dinse [3] is quite close to the parametric estimate of the survival function, particularly when the latent failure time distributions are assumed to be Weibull (see Figure 3). This indicates that both analyses should provide comparable results. Dinse [3] also mentioned that the non-parametric estimates of $\delta_j(t) = P[\delta = j \mid X = t]$ are very erratic and therefore he provided smoothed estimates of $\delta_j(t)$. In our formulation, $\delta_j(t)$ is constant for all $t$. Therefore, the parametric estimate of $\delta_j(t)$ can be considered as a smoothed version of the non-parametric estimates. In our case, using Weibull distributions, we obtain $\hat\delta_1(t) \approx 0.70$ and $\hat\delta_2(t) \approx 0.30$. From the corresponding figure of Dinse [3], it is clear that the non-parametric smoothed estimate of $\delta_1(t)$ is close to . Therefore, both methods provide similar results.

7. Cause Specific Hazard Functions Model

In this Section, we formulate the problem using the cause specific hazard functions, as originally proposed by Prentice et al. [9]. We assume the cause specific hazard functions to be exponential or Weibull. We use $\lambda(t)$ as the overall hazard function and $\lambda_j(t)$ as the cause specific hazard function for $j = 1$ and $2$. It is known that $\lambda(t) = \lambda_1(t) + \lambda_2(t)$.
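Suppose $X$, $\delta$ and $\bar F(\cdot)$ are the same as defined in Sections 1 and 2. The likelihood contributions of the observations a, b, c, d, e and f are

$$\lambda_1(t)\,\bar F(t);\quad \lambda_2(t)\,\bar F(t);\quad \lambda(t)\,\bar F(t);\quad \int_t^\infty \lambda_1(u)\,\bar F(u)\,du;\quad \int_t^\infty \lambda_2(u)\,\bar F(u)\,du;\quad \int_t^\infty \lambda(u)\,\bar F(u)\,du$$

respectively.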
7.1 Exponential cause specific hazard functions

In this subsection, it is assumed that $\lambda_1(t) = \lambda_1$ and $\lambda_2(t) = \lambda_2$. In this case the likelihood contributions from the observations a, b, c, d, e and f are

$$\lambda_1 e^{-(\lambda_1 + \lambda_2) t}, \quad \lambda_2 e^{-(\lambda_1 + \lambda_2) t}, \quad (\lambda_1 + \lambda_2)\, e^{-(\lambda_1 + \lambda_2) t}, \quad \frac{\lambda_1}{\lambda_1 + \lambda_2}\, e^{-(\lambda_1 + \lambda_2) t}, \quad \frac{\lambda_2}{\lambda_1 + \lambda_2}\, e^{-(\lambda_1 + \lambda_2) t}, \quad e^{-(\lambda_1 + \lambda_2) t},$$

respectively. Therefore, under this formulation, the likelihood function of the observed data is the same as the one obtained using exponential latent failure times.

7.2 Weibull cause specific hazard functions

In this subsection, it is assumed that $\lambda_1(t) = \alpha \lambda_1 t^{\alpha - 1}$ and $\lambda_2(t) = \alpha \lambda_2 t^{\alpha - 1}$. Therefore, $\bar{F}(t) = e^{-(\lambda_1 + \lambda_2) t^{\alpha}}$. In this case the likelihood contributions from the observations a, b, c, d, e and f are

$$\alpha \lambda_1 t^{\alpha - 1} e^{-(\lambda_1 + \lambda_2) t^{\alpha}}, \quad \alpha \lambda_2 t^{\alpha - 1} e^{-(\lambda_1 + \lambda_2) t^{\alpha}}, \quad \alpha (\lambda_1 + \lambda_2)\, t^{\alpha - 1} e^{-(\lambda_1 + \lambda_2) t^{\alpha}}, \quad \frac{\lambda_1}{\lambda_1 + \lambda_2}\, e^{-(\lambda_1 + \lambda_2) t^{\alpha}}, \quad \frac{\lambda_2}{\lambda_1 + \lambda_2}\, e^{-(\lambda_1 + \lambda_2) t^{\alpha}}, \quad e^{-(\lambda_1 + \lambda_2) t^{\alpha}},$$

respectively. In this case also, the likelihood function of the data is the same as the likelihood function obtained using independent Weibull latent failure time distributions in Section 5. Since the likelihood functions are equal in both cases, the estimation procedures for the unknown parameters, namely $(\lambda_1, \lambda_2)$ or $(\lambda_1, \lambda_2, \alpha)$, and the statistical properties of these estimates are the same in both formulations.

8. Conclusions

In this paper we consider the analysis of partially complete time and type of failure data in the presence of competing risks. We formulate the problem in two different ways and observe that both lead to the same likelihood function. This raises the important question of the identifiability of competing risks. Our analysis shows, as expected, that in the above situation it is not possible to identify from the given data whether the competing causes are independent or not without the presence of covariates.

References

Cox, D. R. (1959). The analysis of exponentially distributed lifetimes with two types of failures. Journal of the Royal Statistical Society, Series B 21, 411-421.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39, 1-22.
Dinse, G. E. (1982). Non-parametric estimation of partially incomplete time and types of failure data. Biometrics 38.
Efron, B. and Hinkley, D. V. (1978). The observed versus expected information. Biometrika 65.
Kundu, D. and Basu, S. (2000). Analysis of incomplete data in presence of competing risks. Journal of Statistical Planning and Inference 87, 221-239.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B 44.
Mendenhall, W. and Lehmann, E. H. Jr. (1960). An approximation of the negative moments of the positive binomial useful in life testing. Technometrics 2, 227-242.
Miyakawa, M. (1984). Analysis of incomplete data in competing risks model. IEEE Transactions on Reliability Analysis 33.
Prentice, R. L., Kalbfleisch, J. D., Peterson, Jr., A. V., Flournoy, N., Farewell, V. T. and Breslow, N. E. (1978). The analysis of failure times in the presence of competing risks. Biometrics 34.