Extending the Iterative Convex Minorant Algorithm to the Cox Model for Interval-Censored Data

Wei Pan (1)

The iterative convex minorant (ICM) algorithm (Groeneboom and Wellner, 1992) is fast in computing the NPMLE of the distribution function for interval-censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support the Lasso (Tibshirani, 1996). Some simulation results are shown, and for illustration we reanalyze two real datasets.

Key Words: Cross-validation; EM; Generalized gradient projection; ICM; Interval censoring; Isotonic regression; Lasso; NPMLE.

1. INTRODUCTION

There has been increasing research interest in interval-censored data recently, largely due to its prevalence in studies of AIDS and other chronic diseases, in which the event of interest is never observed exactly but is only known to occur within some time interval. In these studies a baseline examination is conducted; after some fixed time the first follow-up takes place, with second, third, etc. follow-ups (if any) after intervals of several years each. If the event happens, we can only know that it happened between two examinations; otherwise, it may happen only after the end of the last follow-up. This produces interval censoring. One fundamental statistical problem in survival analysis is to estimate the distribution function of the event time from censored data. In a more general framework, Turnbull (1976) presents an EM algorithm (Dempster, Laird and Rubin, 1977) to compute the nonparametric maximum likelihood estimator (NPMLE) of the distribution function for this kind of data without covariates. Groeneboom and Wellner (1992, abbreviated G&W in the sequel) propose an iterative convex minorant (ICM) algorithm, which they suggest is much faster than the EM.
In the Cox model with interval-censored data, Finkelstein (1986) gives a modified Newton-Raphson method after reparameterization. [(1) Division of Biostatistics, University of Minnesota, A46 Mayo Building (Box 33), Minneapolis, MN; weip@biostat.umn.edu]

This approach uses the inverse Hessian matrix at each iteration. However, for a continuous-time model without grouping, the dimension of the unknown parameter is of the order of the number of observations, often more than several hundred. Many of today's ordinary workstations cannot afford this storage space. Moreover, the Hessian matrix is sometimes near singular, which may cause numerical problems when inverting it. In contrast, neither the EM nor the ICM needs to store and invert the Hessian matrix; in fact, as analyzed in the next section, the ICM uses only the diagonal elements of the Hessian. On the other hand, in spite of its slowness, the EM is hard to use in the Cox model for continuous survival times (Alioum and Commenges, 1996). Hence an extension of the ICM to the Cox model has practical importance. For low-dimensional regression coefficients, a profile likelihood approach has been suggested (Huang 1996; Huang and Wellner 1995). It works by fixing the regression coefficient at some possible values, then for each fixed value maximizing the likelihood with respect to the baseline distribution by a modified ICM algorithm; the point achieving the maximum of this profile likelihood (possibly after smoothing) is taken as the final estimate. Recently Huang and Wellner (1997) suggested a more general method based on the back-fitting scheme (Hastie and Tibshirani 1990), though no simulation results are shown. We will compare our approach with theirs later. Originally the ICM was formulated in terms of some special stochastic processes and isotonic regression algorithms (G&W; Huang and Wellner 1997). In Section 2 we identify it as a special case of the generalized gradient projection method (GGP) in the optimization literature (Mangasarian, 1996; Bertsekas, 1982). From this connection a natural extension of the ICM to the Cox model is almost trivial.
We further extend it to support the Lasso, a biased regression scheme proposed by Tibshirani (1996, 1997). In Section 3 simulations are conducted to assess the performance of the extended ICM methods. Finally, two real-world problems are used to illustrate the methods, followed by a brief discussion.

2. COMPUTATIONAL PROCEDURES

2.1 GGP

We first briefly review the GGP, a general optimization scheme (Mangasarian, 1996; Bertsekas, 1982). It is helpful to regard it as a Newton-type iterative algorithm for constrained optimization.

Specifically, suppose we want to maximize a function f(x) over a closed convex set X. Let ∇f denote the first derivative of f, and H a symmetric positive definite matrix, which can often be taken as the negative Hessian of f when f is strictly concave. The GGP updates its current estimate x^(m) by

x^(m+1) = Proj[ x^(m) + α^(m) (H^(m))^{-1} ∇f(x^(m)); H^(m), X ],

where α^(m) > 0 is a suitably chosen step size, and Proj is the projection operation defined by

Proj[ x; H, X ] := argmin_{x' in X} (x − x')' H (x − x').

The iterate x^(m) at convergence is taken as a solution of the maximization problem. The rationale behind the GGP is intuitively explained by the following result:

1) x* in argmax_{x in X} f(x);
2) x* = Proj[ x* + α H^{-1} ∇f(x*); H, X ] for any α > 0;
3) x* = Proj[ x* + α H^{-1} ∇f(x*); H, X ] for some α > 0.

If f is differentiable, then 1) implies 2); on the other hand, if f is concave, then 3) implies 1). We note in passing that taking H to be an identity matrix (i.e. completely ignoring the second-order information of f) yields the gradient projection method (Polak 1971). Some theoretical properties of the GGP, such as its superlinear or linear convergence rate (depending on the choice of H), have been established in Bertsekas (1982). Furthermore, Bertsekas has demonstrated its effectiveness via computational examples involving large numbers of variables. This feature is especially attractive in our setting, which often involves more than several hundred parameters for medium to large samples.

2.2 ICM for Interval-Censored Data without Covariates

For interval-censored data, the time of the event of interest, say survival time X_i, is not observed exactly. The available observations are censoring times (U_i, V_i) for i = 1, ..., n; we only know that U_i < X_i <= V_i. As usual we assume that X_i and (U_i, V_i) are independent, and that {X_i} and {(U_i, V_i)} are each iid samples. We are interested in estimating the distribution function F of X_i from {(U_i, V_i)}.
The (conditional) log-likelihood can be written (Turnbull, 1976) as

L(F | {(U_i, V_i)}) = Σ_{i=1}^{n} log( F(V_i) − F(U_i−) ).
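As a concreteness check, the log-likelihood above is straightforward to evaluate once candidate cdf values at the interval endpoints are supplied. The helper below is an illustrative sketch (the function name and interface are ours, not from the paper):

```python
import math

def turnbull_loglik(FU_minus, FV):
    """Turnbull's log-likelihood sum_i log( F(V_i) - F(U_i-) ), given the
    cdf values F(U_i-) and F(V_i) for each censoring interval (U_i, V_i]."""
    return sum(math.log(fv - fu) for fu, fv in zip(FU_minus, FV))

# Toy check with a standard exponential cdf F(t) = 1 - exp(-t) and two
# censoring intervals (0, 1] and (1, 2]:
FU_minus = [0.0, 1 - math.exp(-1)]
FV = [1 - math.exp(-1), 1 - math.exp(-2)]
ll = turnbull_loglik(FU_minus, FV)
```

Each summand is the probability mass the candidate F assigns to the observed interval, so the sum is finite only while every interval carries positive mass; this is exactly the issue the damping discussed below guards against.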

Notice that L can be further expressed in terms of the cumulative hazard function; this is in fact what we did in implementing our algorithms. There are two advantages to this parametrization: i) concavity of the likelihood function is readily available, even in the Cox model (Huang and Wellner, 1997); ii) in contrast to the restriction F <= 1, there is no upper bound on the cumulative hazard. For simplicity of notation, however, we still use the distribution function throughout. The nonparametric maximum likelihood estimator (NPMLE) ^F of F is a right-continuous function maximizing L(F). Turnbull (1976) shows that ^F can only have jumps at the order statistics t_i, i = 1, ..., k, of the U_i and V_i, 1 <= i <= n. However, the probability distribution of ^F within some intervals is nonidentifiable. This nonidentifiability does not affect the computation of ^F by the EM algorithm. To facilitate the computation of ^F by the ICM, G&W define the NPMLE ^F as piecewise constant with possible discontinuities only at the order statistics {t_i}; moreover, ^F may be less than 1 at the largest order statistic t_k. These two different definitions appear to be non-critical; in the sequel we adopt the second. Turnbull (1976) also suggests how to reduce the number of order statistics {t_i} at which ^F may have jumps. We denote the possible support of the NPMLE as a disjoint union of intervals [a_i, b_i), i = 1, ..., m. (It is possible that the NPMLE has no probability mass in some of these intervals.) Generally we use v to denote a column vector with components v_i, and similarly M_ij denotes the (i,j)th element of a matrix M. So far the ICM is applicable only to interval-censored data without covariates. The original exposition of the ICM was in terms of stochastic processes (Groeneboom and Wellner, 1992). We now reformulate it as a GGP method (Mangasarian 1996; Bertsekas 1982), so that its connection with the Newton-Raphson algorithm and the gradient projection method becomes obvious. Let the first derivative (i.e. gradient) and second derivative (i.e. Hessian) of L with respect to F = (F(t_1), ..., F(t_k))' be ∇L and ∇²L respectively, and let G be the k x k diagonal matrix with the same diagonal elements as −∇²L. To have a proper (sub)distribution function we need to restrict F in R = {F : 0 <= F(t_1) <= ... <= F(t_k) <= 1}. Denote the estimate from the mth iteration by F^(m). Then the ICM iterates as

F^(m+1) = Proj[ F^(m) + G(F^(m))^{-1} ∇L(F^(m)); G(F^(m)), R ],

where the projection by definition is

Proj[ y; G, R ] := argmin_x { Σ_{i=1}^{k} (y_i − x_i)² G_ii : 0 <= x_1 <= x_2 <= ... <= x_k <= 1 },

which is just an isotonic least squares regression problem and hence can be efficiently accomplished by well-known algorithms such as the pool-adjacent-violators algorithm (PAVA) (Robertson et al., 1988). The time complexity of the PAVA is O(k²), and it is easy to program. Graphically, the solution of the isotonic regression (with the simple order defining R) can be represented by the greatest convex minorant of the cumulative sum diagram {(0, 0)} together with the points (Σ_{i=1}^{j} G_ii, Σ_{i=1}^{j} G_ii y_i) for j = 1, ..., k. This may be why the above (iterative) algorithm is called the ICM. The projection guarantees that the new F^(m+1) is still a proper (sub)distribution function; moreover, the projection seems to speed up convergence, as suggested in G&W. Evidently this is a special case of the GGP method: without the projection, the above iteration is just a Newton-like step in which only the diagonal elements of the Hessian matrix are used. The estimate F^(m) at convergence is taken as the NPMLE. We follow Aragon and Eberly (1992) in further defining the damped iterative convex minorant algorithm (DICM) as

F^(m+1) = Proj[ F^(m) + λ_j G(F^(m))^{-1} ∇L(F^(m)); G(F^(m)), R ],

where λ_j is an Armijo-like step size (Polak 1971), or can simply be chosen by

λ_j = max{ 1/2^i : L(F^(m+1)) > L(F^(m)), i = 0, 1, 2, ... },

which tries to take a large step size while increasing the likelihood. The appeal of the DICM is its global convergence and increasing likelihood at each step, though Zhan and Wellner (1995) point out some possible problems in the proof of global convergence in Aragon and Eberly (1992). On the other hand, G&W (page 7) suggest using a "buffer" to keep the estimate a proper distribution and to avoid the log-likelihood becoming −∞; however, they do not say what to do if the new estimate does make the log-likelihood −∞. Damping is a natural way to avoid this problem.
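The projection Proj[y; G, R] above is a weighted isotonic regression followed by clipping into [0, 1] (clipping an isotonic fit stays isotonic and solves the bounded problem for a simple order). A minimal PAVA sketch, in our own illustrative code rather than G&W's implementation:

```python
def pava(y, w):
    """Weighted pool-adjacent-violators: minimize sum_i w_i (y_i - x_i)^2
    subject to x_1 <= ... <= x_k. Returns the isotonic fit as a list."""
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while the isotonic order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    out = []
    for mean, _, n in blocks:
        out.extend([mean] * n)
    return out

def icm_projection(y, g, lo=0.0, hi=1.0):
    """Proj[y; G, R]: isotonic regression with weights G_ii, then clip
    each fitted value into [0, 1]."""
    return [min(max(v, lo), hi) for v in pava(y, g)]

fit = pava([3.0, 1.0, 2.0], [1.0, 1.0, 1.0])          # pools to a constant
proj = icm_projection([1.2, 1.5, 2.0], [1.0, 1.0, 1.0])  # clipped to the bound
```

With G_ii as the weights, this is exactly the diagonal-metric GGP projection used in each ICM step.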
Hence in our discussion we refer to the ICM and DICM interchangeably.

2.3 Extending the ICM to the Cox Model

The Cox proportional hazards model is probably the most widely used model in survival analysis. It assumes that the hazard function h(x|z) with covariate (vector) Z is proportional to an unknown

baseline hazard h_0(x):

h(x|z) = h_0(x) exp(z'β),

or, formulated in terms of distribution functions,

1 − F(x|z) = [1 − F(x)]^{exp(z'β)},

where F is the corresponding (unknown) baseline distribution and β is the regression coefficient (vector). Let the n observations be labelled (U_1, V_1, Z_1), ..., (U_n, V_n, Z_n), where (U_i, V_i) as before gives the lower and upper boundaries of the censoring interval, and Z_i is the covariate (vector). The survival time X_i is only observed to lie inside (U_i, V_i]. As usual we assume that X_i and (U_i, V_i) are independent. The joint log-likelihood can be written (up to a constant) as (Finkelstein 1986):

L(F, β) = Σ_{i=1}^{n} log{ (1 − F(U_i−))^{exp(z_i'β)} − (1 − F(V_i))^{exp(z_i'β)} }.

L is maximized to obtain the NPMLE ^β of the regression coefficient along with that of the baseline distribution, ^F. Here we approach this maximization problem by extending the ICM. Computationally this differs from Finkelstein (1986), where the Newton-Raphson method is adopted after reparametrization. As before, we would like to avoid using the full Hessian matrix, due to the possible difficulties mentioned earlier. Our extended ICM is simple (it inverts no high-dimensional matrix) and reasonably fast in this setting. Extending the ICM is straightforward once the regression coefficient is viewed as just another component of the parameters, alongside the baseline distribution. Let ∇_1 L(F, β) = ∂L(F, β)/∂F and ∇_2 L(F, β) = ∂L(F, β)/∂β denote the first derivatives, and G_1(F, β) and G_2(F, β) the corresponding diagonal matrices of the negative second derivatives. Then our extended ICM iterates as

F^(m+1) = Proj[ F^(m) + λ_j G_1(F^(m), β^(m))^{-1} ∇_1 L(F^(m), β^(m)); G_1(F^(m), β^(m)), R ],
β^(m+1) = β^(m) + λ_j G_2(F^(m), β^(m))^{-1} ∇_2 L(F^(m), β^(m)),

where λ_j and Proj are defined as before. Again the projection step ensures that the estimate of F is a proper (sub)distribution, i.e. that ^F is nondecreasing and lies between 0 and 1.
Since there is no constraint on β, no projection is performed on it. We will discuss another application of the GGP to constrained β next.
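The joint log-likelihood above is again easy to evaluate for a candidate (F, β); the sketch below uses a single scalar covariate, and the names and simplifications are ours:

```python
import math

def cox_ic_loglik(F_left, F_right, z, beta):
    """Joint log-likelihood for the Cox model with interval-censored data:
    sum_i log{ (1 - F(U_i-))^{exp(z_i*beta)} - (1 - F(V_i))^{exp(z_i*beta)} }.

    F_left[i]  = F(U_i-) and F_right[i] = F(V_i) for the baseline cdf F;
    z and beta are scalars here (a single covariate) for simplicity.
    """
    ll = 0.0
    for fl, fr, zi in zip(F_left, F_right, z):
        e = math.exp(zi * beta)
        ll += math.log((1.0 - fl) ** e - (1.0 - fr) ** e)
    return ll

# With beta = 0 the model carries no covariate effect and the expression
# reduces to Turnbull's likelihood sum_i log(F(V_i) - F(U_i-)).
ll0 = cox_ic_loglik([0.0, 0.2], [0.5, 0.8], [1, 0], 0.0)
```

In an actual extended-ICM step one would also need ∇_1 L, ∇_2 L and the diagonals G_1, G_2, which follow from differentiating each summand; they are omitted here to keep the sketch short.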

Notice that inverting the diagonal matrices G_1 and G_2 is trivial. To ensure numerical stability in inverting G_1 and G_2 (and positive definiteness of G_1 and G_2 for truncated data, as mentioned at the end of the paper), the Levenberg-Marquardt adjustment is adopted (Thisted, 1988): specifically, G_1^{-1} and G_2^{-1} are replaced by (G_1 + εI)^{-1} and (G_2 + εI)^{-1}, with a suitably chosen small ε > 0, when they are near singular (or non-positive-definite for truncated data). We note that our extended ICM differs from that outlined by Huang and Wellner (1997), which is basically a back-fitting algorithm (Hastie and Tibshirani 1990): L(F, β^(m)) is first maximized with respect to F at the fixed β^(m) (by an ICM-like algorithm), yielding F^(m+1); then L(F^(m+1), β) is maximized with respect to β at the fixed F^(m+1) to obtain β^(m+1); these two steps are repeated until convergence. In contrast, we propose to maximize the likelihood jointly in F and β, which seems more common and likely more efficient. Huang and Wellner (1997) did not give any specific numerical results; some numerical comparison is presented in Section 3.3. Intuitively the (extended) ICM will converge, since the likelihood function is bounded above and each step of the (extended) ICM increases the likelihood. When we do not use second-order information (i.e. G is an identity matrix), the GGP reduces to the GP, which has only a linear convergence rate. On the other hand, the GGP can have a superlinear convergence rate if G is suitably chosen (part of it diagonal and the remaining part equal to the corresponding part of the Hessian) (Propositions 3 and 4, Bertsekas 1982).
Since the information G used in our (extended) ICM lies between these two choices, we can expect the convergence rate of the (extended) ICM to lie between linear and superlinear; furthermore, if the diagonal elements of the Hessian dominate the off-diagonals, the convergence rate of the (extended) ICM should be close to superlinear. Rigorous derivations are not pursued here, though our limited experience seems positive.

2.4 Extending the ICM to Support the Lasso

Tibshirani (1996) proposes an estimation procedure called the Lasso in linear regression and then extends it to the Cox model for right-censored data (Tibshirani, 1997). The Lasso works by constraining the sum of the absolute values of the regression coefficients to be less than a constant. This is immediately reminiscent of ridge regression, where the constraint is instead on the sum of the squared regression coefficients. As in ridge regression, the Lasso is a biased estimation

method which nonetheless may lead to smaller mean squared errors by reducing the variability of the estimates. However, to our knowledge, there is as yet no application of the Lasso to the Cox model with interval-censored data in the literature. Unlike in linear regression or in the Cox model with right-censored data, here we have a high-dimensional (and ordered) nuisance parameter, the baseline distribution. (The baseline distribution is eliminated from the partial likelihood in the Cox model for right-censored data, but we have no explicit partial likelihood for interval-censored data.) The ICM is readily adapted to support the Lasso using the ideas of the GGP. We now need to constrain β to lie in C = {β : Σ_{i=1}^{r} |β_i| <= t}, where t is a specified constant. This only requires modifying our previous algorithm by

β^(m+1) = Proj[ β^(m) + λ_j G_2(F^(m), β^(m))^{-1} ∇_2 L(F^(m), β^(m)); G_2(F^(m), β^(m)), C ].

The projection of a point onto C is a standard quadratic programming problem; in particular, Tibshirani (1996) gives two efficient algorithms. We call the β^(m) at convergence the Lasso estimate. The constant t in the constraint can be chosen by generalized cross-validation for right-censored data (Tibshirani 1997); here we use likelihood cross-validation (Silverman 1986).

3. SIMULATION

In our simulations, the baseline distribution F is Weibull with shape parameter 2 and scale parameter 1. Only binary covariates are used, each component taking 0 or 1 with equal probability. The regression coefficient β is also a vector of 0's and 1's. The first examination time T_i is Uniform(0, τ). To mimic many panel studies, we take the length of the time interval between two follow-up examinations to be constant, len = .5. Generally, if there are k + 1 examinations, survival time X_i is censored into one of (0, T_i], (T_i, T_i + len], ..., (T_i + k·len, ∞). k and β jointly determine the censoring pattern. In our simulation studies there are four configurations:

1. k = 1, τ = 2, β = (1, 0, 0)
2. k = 1, τ = 2, β = (.333, .333, .333)
3. k = 2, τ = 1, β = (1, 0, 0)
4. k = 2, τ = 1, β = (.333, .333, .333)

It is evident that in the first two cases the average length of the censoring intervals tends to be longer than in the last two, which implies more information loss from censoring. The first and third cases represent one large covariate effect, while the other two represent many smaller covariate effects; they are used to compare the NPMLE with the Lasso estimate. The sample size is always 1. All the programs are implemented in C on a Sun Sparc 1. The starting value for the baseline distribution is the discrete uniform on the possible support of the NPMLE, whereas it is zero for the regression coefficient. An algorithm is stopped when both the log-likelihood increment and the change of the regression coefficient between two consecutive iterations are less than a small tolerance.

3.1 NPMLE and Bootstrap

For cases 1 and 3, suppose we know the true model; in other words, we know there is only one non-zero coefficient β_1. The ICM is so fast that it often converges within several seconds. This speed facilitates the use of computing-intensive methods such as the bootstrap and cross-validation. (Cross-validation is used in choosing the Lasso constraint.) We now apply the bootstrap to estimate the variability of the NPMLE ^β_1. For related work in the context of right-censored data see Burr (1994). As reviewed in Huang and Wellner (1997), in practice there are two approaches to estimating the variance of ^β: one uses the observed information matrix, the other the profile likelihood for low-dimensional β. As we pointed out earlier, for a continuous-time model without grouping the observed information matrix often has too high a dimension, and is sometimes near singular. There is also no theoretical justification for these uses. Bootstrapping seems a natural alternative, though we are not aware of any empirical studies yet, probably owing to the lack of faster algorithms.
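Such a nonparametric bootstrap can be sketched as follows; the sample mean serves as a toy stand-in for refitting the NPMLE on each resample, and the code is illustrative only, not the paper's C implementation:

```python
import random
import statistics

def bootstrap_se(data, estimator, n_boot=200, seed=0):
    """Nonparametric bootstrap: resample the observations with
    replacement (same size n), re-estimate on each resample, and take
    the standard deviation of the replicates as the standard error."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # resample with replacement
        reps.append(estimator(sample))
    return statistics.stdev(reps)

# Toy usage: bootstrap SE of the sample mean.
data = [0.4, 1.1, 0.7, 2.3, 0.9, 1.6, 0.2, 1.8]
se = bootstrap_se(data, statistics.mean)
```

For the NPMLE, `estimator` would refit the extended ICM on each resampled set of triples (U_i, V_i, Z_i) and return ^β; this is exactly where the speed of the ICM pays off.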
Our bootstrapping scheme is as discussed by Efron and Tibshirani (1986): the same number of observations are randomly drawn with replacement from the original sample to form a bootstrap sample. We used 1 bootstrap samples. The results are reported in Table 1. Noticeably, the NPMLE is biased upwards and has large variability in case 1. The reason is too much information loss from censoring, particularly the overwhelming 6% left-censoring (and approximately 1% right-censoring). For a left-censored observation (V_i, Z_i), its contribution to the

log-likelihood is

L_i = log{ 1 − (1 − F(V_i))^{exp(z_i'β)} },

which is maximized as β tends to infinity for a positive Z_i. Nevertheless, the bootstrap estimate of the variance of the NPMLE is still sensible. In case 3, the left-censoring percentage is reduced to 3% (and there is about 1% right-censoring); the bootstrap estimate improves along with the NPMLE. Overall, the bootstrap works reasonably well.

Table 1: NPMLE of the regression coefficient, ^β_1, and the bootstrap estimate of its standard error, ^se; their means and Monte Carlo standard errors were calculated from 5 replications, with β_1 = 1. (Columns: Case, mean(^β_1), se(^β_1), mean(^se), se(^se); the numerical entries did not survive extraction.)

3.2 NPMLE and Lasso Estimate

In this subsection we compare the performance of the NPMLE and the Lasso estimate of the regression coefficient. Unlike in the last subsection, all three covariates are now used; in other words, we do not know in advance which β_i is zero. Our results are presented in Table 2; they are quite similar to those of Tibshirani (1996) for ordinary least squares regression. The Lasso estimates tend to have larger biases (shrunk towards zero) but much smaller variances, and hence smaller mean squared errors (MSE), than the NPMLE. However, as the performance of the NPMLE improves in cases 3 and 4, the Lasso estimate retains only a slight edge.

3.3 A Comparison with Other Algorithms

As pointed out earlier, Finkelstein's Newton-Raphson algorithm is impractical when the sample size is large, whereas the profile likelihood approach (Huang 1996; Huang and Wellner 1995) is applicable only with low-dimensional regression coefficients. Hence the only remaining competitor is the algorithm outlined in Huang and Wellner (1997). Essentially their method alternates

maximizing the likelihood with respect to the baseline distribution and with respect to the regression coefficients. In contrast, we propose to maximize the likelihood with respect to the baseline distribution and the regression coefficients simultaneously. The advantage of our algorithm can be explained intuitively as follows: to maximize a bivariate function f(x, y) in x and y, we would, if possible, rather maximize f(x, y) jointly in x and y than maximize with respect to x, then with respect to y, and repeat. We note that, because of this connection between the two algorithms, our proposal may appear to be a trivial extension of Huang and Wellner's; however, the connection is established largely by our formulation of the ICM as a GGP.

Table 2: NPMLEs and Lasso estimates of the regression coefficients (with their standard errors in parentheses) from 5 replications; β = (1, 0, 0) for Cases 1 and 3 and β = (.333, .333, .333) for the other two. (Columns: Case, then MSE and ^β_1, ^β_2, ^β_3 for the NPMLE, and MSE and ~β_1, ~β_2, ~β_3 for the Lasso; the point estimates did not survive extraction.)

We compared these two algorithms in the configuration of Case 3. Both algorithms used the same starting values and convergence criteria as before. The results are shown in Table 3. Our algorithm is apparently more than twice as fast as Huang and Wellner's. Moreover, the log-likelihoods at convergence of the two algorithms are very close; in fact, all the log-likelihood values at convergence of our algorithm are slightly larger than their counterparts from Huang and Wellner's. This guarantees that our algorithm was not favorably stopped

earlier.

Table 3: Speed of our new algorithm and Huang and Wellner's in computing the NPMLE in Case 3, with 5 replications; the CPU time is in seconds, and the log-likelihood is that at convergence. Standard errors are in parentheses. (Recoverable entries: Huang-Wellner, CPU time 1.36 (.24), ^β (.293); New, CPU time .44 (.8), ^β (.295); the remaining entries did not survive extraction.)

To investigate our algorithm's (in)dependence on starting values, we used random starting values for the first ten datasets in Case 3, each one replicated 1 times. The initial baseline hazard increment in each of its support intervals was taken as a uniform random number between .1 and .1, while the starting regression coefficients were drawn from a uniform distribution between −2 and 2. For each dataset, the algorithm always converged to the neighborhood of the same point: the estimated regression coefficients agreed to the third decimal digit, and the log-likelihood values were identical. However, more work is needed to clarify whether there are local maxima to which the algorithm may converge, as one referee pointed out.

4. REAL EXAMPLES

We now apply our extended ICM to reanalyze two real datasets via the NPMLE. They have been discussed in, among others, Finkelstein (1986), Huang (1994), and Huang and Wellner (1995); Finkelstein focuses on hypothesis testing, while Huang (1994) and Huang and Wellner (1995) establish the asymptotic normality and asymptotic efficiency of the NPMLE of the regression coefficient. The first dataset is from a tumorigenicity study in which 144 mice were assigned to either a germ-free or a conventional environment. The survival time here is the tumor onset time, which cannot be observed directly: the presence of a tumor can be detected only at the time of sacrifice or death of the animal. Hence the survival time is either left-censored or right-censored (i.e. interval censoring, case 1).
If we take the germ-free group as the baseline, with covariate Z = 0, and give the other group Z = 1, then the NPMLE ^β of the regression coefficient from the Cox model

is ^β = −.68, with estimated standard error .41 by the bootstrap (with 1 bootstrap replications). These differ from Huang's results: −.55 and .29. Huang uses the profile likelihood to compute the NPMLE and approximates its variance by asymptotics, which involves some density estimation. Our 95% bootstrap percentile confidence interval for β is [−1.26, .34], compared with Huang's asymptotic normal approximation [−1.12, .1]. The second example is the Breast Cosmesis Study dataset (Finkelstein 1986; Finkelstein and Wolfe 1985). Two medical treatments were given to 9 early breast cancer patients after tumorectomy: one a combination of primary radiotherapy and adjuvant chemotherapy, the other radiotherapy alone. The interest of this clinical trial is in which treatment has the better long-term cosmetic effect. Interval censoring arises because the patients could only be visited every 4 to 6 months, and those living farther from the clinic had even longer follow-up intervals. If we take the radiotherapy group as the baseline, the NPMLE of the regression coefficient from the Cox model is ^β = .735, with estimated standard error .35 by the bootstrap. These results are not much different from the corresponding .795 and .29 of Huang and Wellner (1995), but the bootstrap inference seems more conservative than theirs: the 95% bootstrap percentile and asymptotic confidence intervals are [.11, 1.49] and [.23, 1.36], respectively. Further studies are needed to clarify these discrepancies.

5. SUMMARY

In computing the NPMLE of the distribution function for interval-censored data, the dimension of the parameter is of the order of the sample size, often more than several hundred. Hence the space and time overheads incurred by using the Hessian (or its approximation by a full matrix) are prohibitive. This motivated us to investigate algorithms that avoid using such large matrices. Clearly we are trading time for space.
In particular, we do not claim that our extended ICM is faster than Finkelstein's Newton-Raphson method. The ICM is fast yet simple, avoiding any high-dimensional matrix such as the Hessian, for interval-censored data with no covariates. Our formulation of the ICM as a GGP helps not only to establish its connection with the Newton-Raphson algorithm, but also to ease its extension to the Cox model and to support the Lasso. Its speed also facilitates combining it with computing-intensive methods such as the bootstrap and cross-validation. In particular, our limited

simulation studies seem to favor the Lasso estimates of the regression coefficients in terms of mean squared error, as in the linear regression setting (Tibshirani, 1996). Finally, we note that our extended ICM (with the Levenberg-Marquardt adjustment) is readily applicable to truncated and interval-censored data (Pan and Chappell 1998).

Acknowledgements

The author would like to thank Olvi Mangasarian for introducing the GGP; thanks also go to Rick Chappell for stimulating discussions and support. Many insightful comments from the Associate Editor and three anonymous referees dramatically improved the presentation.

References

Alioum, A. and Commenges, D. (1996), "A Proportional Hazards Model for Arbitrarily Censored and Truncated Data", Biometrics, 52.

Aragon, J. and Eberly, D. (1992), "On Convergence of Convex Minorant Algorithms for Distribution Estimation with Interval-Censored Data", Journal of Computational and Graphical Statistics, 1.

Bertsekas, D.P. (1976), "On the Goldstein-Levitin-Polyak Gradient Projection Method", IEEE Transactions on Automatic Control, 21.

Bertsekas, D.P. (1982), "Projected Newton Methods for Optimization Problems with Simple Constraints", SIAM Journal on Control and Optimization, 20.

Burr, D. (1994), "A Comparison of Certain Bootstrap Confidence Intervals in the Cox Model", Journal of the American Statistical Association, 89.

Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977), "Maximum Likelihood from Incomplete Data via the EM Algorithm", Journal of the Royal Statistical Society, Series B, 39.

Efron, B. and Tibshirani, R.J. (1986), "Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy" (with discussion), Statistical Science, 1.

Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap, New York: Chapman & Hall.

Finkelstein, D.M. (1986), "A Proportional Hazards Model for Interval-Censored Failure Time Data", Biometrics, 42.

Finkelstein, D.M. and Wolfe, R.A. (1985), "A Semiparametric Model for Regression Analysis of Interval-Censored Failure Time Data", Biometrics, 41.

Groeneboom, P. and Wellner, J.A. (1992), Information Bounds and Nonparametric Maximum Likelihood Estimation, Basel: Birkhäuser Verlag.

Hastie, T. and Tibshirani, R. (1990), Generalized Additive Models, London: Chapman & Hall.

Huang, J. (1996), "Efficient Estimation for the Cox Model with Interval Censoring", Annals of Statistics, 24.

Huang, J. and Wellner, J.A. (1995), "Efficient Estimation for the Proportional Hazards Model with `Case 2' Interval Censoring", Tech. Rept., Dept. of Statistics, Univ. of Washington, Seattle, WA.

Huang, J. and Wellner, J.A. (1997), "Interval Censored Survival Data: A Review of Recent Progress", Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis, New York: Springer.

Mangasarian, O.L. (1996), "Algorithms of Nonlinear Mathematical Programming", CS73 Course Notes, Computer Sciences Dept., Univ. of Wisconsin, Madison, WI.

Pan, W. and Chappell, R. (1998), "Computation of the NPMLE of Distribution Functions for Interval Censored and Truncated Data with Applications to the Cox Model", Computational Statistics and Data Analysis, 28.

Polak, E. (1971), Computational Methods in Optimization, New York: Academic Press.

Robertson, T., Wright, F.T. and Dykstra, R.L. (1988), Order Restricted Statistical Inference, New York: Wiley.

Silverman, B.W. (1986), Density Estimation for Statistics and Data Analysis, New York: Chapman & Hall.

Thisted, R.A. (1988), Elements of Statistical Computing, New York: Chapman & Hall.

Tibshirani, R. (1996), "Regression Shrinkage and Selection via the Lasso", Journal of the Royal Statistical Society, Series B, 58.

Tibshirani, R. (1997), "A Proposal for Variable Selection in the Cox Model", Statistics in Medicine, 16.

Turnbull, B.W. (1976), "The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data", Journal of the Royal Statistical Society, Series B, 38.

Zhan, Y. and Wellner, J.A. (1995), "Double Censoring: Characterization and Computation of the Nonparametric Maximum Likelihood Estimator", Tech. Rept., Dept. of Statistics, Univ. of Washington, Seattle, WA.

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Generalized Boosted Models: A guide to the gbm package

Generalized Boosted Models: A guide to the gbm package Generalized Boosted Models: A guide to the gbm package Greg Ridgeway April 15, 2006 Boosting takes on various forms with different programs using different loss functions, different base models, and different

More information

1/sqrt(B) convergence 1/B convergence B

1/sqrt(B) convergence 1/B convergence B The Error Coding Method and PICTs Gareth James and Trevor Hastie Department of Statistics, Stanford University March 29, 1998 Abstract A new family of plug-in classication techniques has recently been

More information

Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively Hybrid Censoring Scheme

Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively Hybrid Censoring Scheme Applied Mathematical Sciences, Vol. 12, 2018, no. 18, 879-891 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ams.2018.8691 Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Simple techniques for comparing survival functions with interval-censored data

Simple techniques for comparing survival functions with interval-censored data Simple techniques for comparing survival functions with interval-censored data Jinheum Kim, joint with Chung Mo Nam jinhkim@suwon.ac.kr Department of Applied Statistics University of Suwon Comparing survival

More information

Independent Increments in Group Sequential Tests: A Review

Independent Increments in Group Sequential Tests: A Review Independent Increments in Group Sequential Tests: A Review KyungMann Kim kmkim@biostat.wisc.edu University of Wisconsin-Madison, Madison, WI, USA July 13, 2013 Outline Early Sequential Analysis Independent

More information

of the Mean of acounting Process with Panel Count Data Jon A. Wellner 1 and Ying Zhang 2 University of Washington and University of Central Florida

of the Mean of acounting Process with Panel Count Data Jon A. Wellner 1 and Ying Zhang 2 University of Washington and University of Central Florida Two Estimators of the Mean of acounting Process with Panel Count Data Jon A. Wellner 1 and Ying Zhang 2 University of Washington and University of Central Florida November 1, 1998 Abstract We study two

More information

Estimating Causal Effects of Organ Transplantation Treatment Regimes

Estimating Causal Effects of Organ Transplantation Treatment Regimes Estimating Causal Effects of Organ Transplantation Treatment Regimes David M. Vock, Jeffrey A. Verdoliva Boatman Division of Biostatistics University of Minnesota July 31, 2018 1 / 27 Hot off the Press

More information