Extending the Iterative Convex Minorant Algorithm to the Cox Model for Interval-Censored Data

Wei Pan 1

The iterative convex minorant (ICM) algorithm (Groeneboom and Wellner, 1992) is fast in computing the NPMLE of the distribution function for interval censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support the Lasso (Tibshirani, 1996). Some simulation results are shown, and for illustration we reanalyze two real datasets.

Key Words: Cross-validation; EM; Generalized gradient projection; ICM; Interval censoring; Isotonic regression; Lasso; NPMLE.

1. INTRODUCTION

There has been increasing research interest in interval censored data recently, largely due to its prevalence in studies of AIDS and other chronic diseases, in which the event of interest is never observed exactly but is only known to occur within some time interval. In these studies a baseline examination is conducted; then, after some fixed time, the first follow-up takes place, followed by second, third, etc. follow-ups (if any) at intervals of several years each. If the event happens, we can only be aware of its happening between two examinations; otherwise, it may only happen after the end of the last follow-up. This produces interval censoring. One fundamental statistical problem in survival analysis is to estimate the distribution function of the event time from censored data. In a more general framework, Turnbull (1976) presents an EM algorithm (Dempster, Laird and Rubin, 1977) to compute the nonparametric maximum likelihood estimator (NPMLE) of the distribution function for this kind of data without covariates. Groeneboom and Wellner (1992, shortened as G&W in the sequel) propose an iterative convex minorant (ICM) algorithm, which they suggest is much faster than the EM.
In the Cox model with interval censored data, Finkelstein (1986) gives a modified Newton-Raphson method after reparameterization. This approach uses the

1 Division of Biostatistics, University of Minnesota, A46 Mayo Building (Box 33), Minneapolis, MN. weip@biostat.umn.edu
inverse Hessian matrix at each iteration. However, for a continuous time model without grouping, the dimension of the unknown parameters is on the order of the number of observations, often several hundred or more. Many of today's ordinary workstations cannot afford this storage space. Moreover, the Hessian matrix is sometimes near singular, which may cause numerical problems when inverting it. In contrast, neither the EM nor the ICM needs to store and invert the Hessian matrix; in fact, as analyzed in the next section, the ICM only uses the diagonal elements of the Hessian. On the other hand, in spite of its slowness the EM is hard to use in the Cox model for continuous survival times (Alioum and Commenges, 1996). Hence an extension of the ICM to the Cox model has practical importance. For low-dimensional regression coefficients, a profile likelihood approach has been suggested (Huang, 1996; Huang and Wellner, 1995). It works by fixing the regression coefficient at some possible values; for each fixed value, the likelihood is maximized with respect to the baseline distribution by a modified ICM algorithm, and the point achieving the maximum of this profile likelihood (possibly after smoothing) is taken as the final estimate. Just recently, Huang and Wellner (1997) suggested a more general method based on the back-fitting scheme (Hastie and Tibshirani, 1990), though no simulation results were shown. We will compare our approach with theirs later. Originally the ICM was formulated in terms of some special stochastic processes and isotonic regression algorithms (G&W; Huang and Wellner, 1997). In Section 2 we identify it as a special case of the generalized gradient projection method (GGP) in the optimization literature (Mangasarian, 1996; Bertsekas, 1982). From this connection a natural extension of the ICM to the Cox model is almost trivial.
We further extend it to support the Lasso, a biased regression scheme proposed by Tibshirani (1996, 1997). In Section 3 simulations are conducted to assess the performance of the extended ICM methods. Finally, two real world problems are used to illustrate the methods, followed by a brief discussion.

2. COMPUTATIONAL PROCEDURES

2.1 GGP

We first briefly review the GGP, a general optimization scheme (Mangasarian, 1996; Bertsekas, 1982). It is helpful to regard it as a Newton-type iterative algorithm for constrained optimization.
Specifically, suppose we want to maximize a function $f(x)$ over a closed convex set $\mathcal{X}$. Let $\nabla f$ denote the first derivative of $f$, and let $H$ be a symmetric positive definite matrix, which can often be taken as the negative Hessian matrix of $f$ if $f$ is strictly concave. The GGP updates its current estimate $x^{(m)}$ by
$$x^{(m+1)} = \mathrm{Proj}[\, x^{(m)} + \alpha^{(m)} (H^{(m)})^{-1} \nabla f(x^{(m)});\ H^{(m)},\ \mathcal{X} \,],$$
where $\alpha^{(m)} > 0$ is a suitably chosen stepsize and Proj is the projection operation defined by
$$\mathrm{Proj}[x;\ H,\ \mathcal{X}] := \arg\min_{x^* \in \mathcal{X}} (x - x^*)' H (x - x^*).$$
The iterate $x^{(m)}$ at convergence is taken as a solution of the maximization problem. The rationale behind the GGP is intuitively explained by the following result. Consider the statements:
1) $x^* \in \arg\max_{x \in \mathcal{X}} f(x)$;
2) $x^* = \mathrm{Proj}[x^* + \alpha H^{-1} \nabla f(x^*);\ H,\ \mathcal{X}]$ for any $\alpha > 0$;
3) $x^* = \mathrm{Proj}[x^* + \alpha H^{-1} \nabla f(x^*);\ H,\ \mathcal{X}]$ for some $\alpha > 0$.
If $f$ is differentiable, then 1) implies 2); on the other hand, if $f$ is concave, then 3) implies 1). We note in passing that taking $H$ as an identity matrix (i.e., completely ignoring the second order information of $f$) results in the gradient projection method (Polak, 1971). Some theoretical properties of the GGP, such as its superlinear or linear convergence rate (depending on the choice of $H$), have been established in Bertsekas (1982). Furthermore, Bertsekas has shown its effectiveness via computational examples involving very many variables. This feature is especially attractive in our current setting (often involving several hundred or more parameters for medium to large samples).

2.2 ICM for Interval Censored Data without Covariates

For interval censored data, the time of the event of interest, say survival time $X_i$, is not observed exactly. The available observations are censoring times $(U_i, V_i)$ for $i = 1, \ldots, n$; we only know that $U_i < X_i \le V_i$. As usual we assume that $X_i$ and $(U_i, V_i)$ are independent, and that $\{X_i\}$ and $\{(U_i, V_i)\}$ are each iid samples. We are interested in estimating the distribution function $F$ of $X_i$ from $\{(U_i, V_i)\}$.
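The generic GGP update of Section 2.1 can be illustrated with a toy Python sketch. All ingredients here are our own illustrative assumptions: a concave quadratic $f$, a diagonal $H$, and a box constraint, for which the $H$-weighted projection reduces to componentwise clipping because $H$ is diagonal.

```python
import numpy as np

def ggp_step(x, grad, H_diag, project, step=1.0):
    """One generalized gradient projection (GGP) update:
    x_new = Proj[x + step * H^{-1} grad(x); H, X].
    H_diag holds the diagonal of a positive definite H."""
    return project(x + step * grad(x) / H_diag)

# Toy example: maximize f(x) = -||x - c||^2 over the box [0, 1]^2.
c = np.array([1.5, -0.5])
grad = lambda x: -2.0 * (x - c)           # gradient of f
H_diag = np.array([2.0, 2.0])             # diagonal of the negative Hessian of f
# With a diagonal H, the H-weighted projection onto a box is plain clipping.
project = lambda x: np.clip(x, 0.0, 1.0)

x = np.array([0.5, 0.5])
for _ in range(50):
    x = ggp_step(x, grad, H_diag, project)
print(x)  # converges to the box-constrained maximizer [1.0, 0.0]
```

Here one Newton-scaled step lands on the unconstrained maximizer $c$, and the projection pulls it back to the nearest feasible point.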
The (conditional) log-likelihood can be written down as (Turnbull, 1976):
$$L(F \mid \{(U_i, V_i)\}) = \sum_{i=1}^{n} \log\left( F(V_i) - F(U_i-) \right).$$
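This log-likelihood is easy to evaluate once $F$ is represented by its values at the ordered candidate jump points. The padding of $F$ with 0 on the left and 1 on the right, and the index convention below, are our own illustrative choices (they let left- and right-censored observations be handled uniformly), not the paper's notation.

```python
import numpy as np

def loglik(F, l_idx, r_idx):
    """Interval-censored log-likelihood sum_i log(F(V_i) - F(U_i-)).
    F holds the values of the distribution function at the ordered
    candidate jump points tau_1 < ... < tau_k; index 0 of the padded
    vector means F = 0 (before tau_1), index k+1 means F = 1."""
    Fext = np.concatenate(([0.0], F, [1.0]))
    return np.sum(np.log(Fext[r_idx] - Fext[l_idx]))

F = np.array([0.2, 0.5, 0.9])
# Three observations: event in (-inf, tau_1], in (tau_1, tau_3], after tau_3.
val = loglik(F, np.array([0, 1, 3]), np.array([1, 3, 4]))
print(val)  # log(.2) + log(.7) + log(.1)
```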
Notice that $L$ can be further expressed in terms of the cumulative hazard function, which is actually what we did in implementing our algorithms. There are two advantages to this parametrization: i) concavity of the likelihood function is readily available, even in the Cox model (Huang and Wellner, 1997); ii) in contrast to the restriction $F \le 1$, there is no upper bound on the cumulative hazard. But for simplicity of notation we still use the distribution function throughout. The nonparametric maximum likelihood estimator (NPMLE) $\hat F$ of $F$ is a right-continuous function maximizing $L(F)$. Turnbull (1976) shows that $\hat F$ can only have jumps between the order statistics $\tau_i$, $i = 1, \ldots, k$, of the $U_i$ and $V_i$, $1 \le i \le n$. However, the probability distribution of $\hat F$ within some intervals is nonidentifiable. This nonidentifiability does not influence the computation of $\hat F$ by the EM algorithm. To facilitate the computation of $\hat F$ by the ICM, G&W define the NPMLE $\hat F$ as piecewise constant with possible discontinuities only at the order statistics $\{\tau_i\}$; moreover, $\hat F$ may be less than 1 at the largest order statistic $\tau_k$. These two different definitions seem non-critical; in the sequel we adopt the second one. Turnbull (1976) also suggests how to reduce the number of the order statistics $\{\tau_i\}$ on which $\hat F$ may have jumps. We denote the possible support of the NPMLE as a disjoint union of intervals $[a_i, b_i)$, $i = 1, \ldots, m$. (It is possible that the NPMLE has no probability mass in some of these intervals.) Generally we use $v$ to denote a column vector with components $v_i$, and similarly denote the $(i,j)$th element of a matrix $M$ by $M_{ij}$. So far the ICM is only applicable to interval censored data without covariates. The original exposition of the ICM was in terms of stochastic processes (Groeneboom and Wellner, 1992). Here we reformulate it as a GGP method (Mangasarian, 1996; Bertsekas, 1982) so that its connection with the Newton-Raphson algorithm and the gradient projection method becomes obvious. Let the first derivative (i.e.
gradient) and second derivative (i.e., the Hessian) of $L$ with respect to $F = (F(\tau_1), \ldots, F(\tau_k))$ be $\nabla L$ and $\nabla^2 L$ respectively. Let $G$ be a $k \times k$ diagonal matrix with the same diagonal elements as $-\nabla^2 L$. To have a proper (sub)distribution function we need to restrict $F \in \mathcal{R} = \{F : 0 \le F(\tau_1) \le \cdots \le F(\tau_k) \le 1\}$. Denote the estimate from the $m$th iteration by $F^{(m)}$. Then the ICM iterates as
$$F^{(m+1)} = \mathrm{Proj}[\, F^{(m)} + G(F^{(m)})^{-1} \nabla L(F^{(m)});\ G(F^{(m)}),\ \mathcal{R} \,],$$
where the projection by definition is
$$\mathrm{Proj}[y;\ G,\ \mathcal{R}] := \arg\min_x \left\{ \sum_{i=1}^{k} G_{ii} (y_i - x_i)^2 : x_1 \le x_2 \le \cdots \le x_k \le 1 \right\},$$
which is just a weighted isotonic least squares regression problem, and hence can be efficiently accomplished by well-known algorithms such as the pool-adjacent-violators algorithm (PAVA) (Robertson et al., 1988). The time complexity of the PAVA is $O(k^2)$ and it is easy to program. Graphically, the solution of the isotonic regression (with a simple order as in $\mathcal{R}$) can be represented by the greatest convex minorant of the cumulative sum diagram $\{(0,0),\ (\sum_{i=1}^{j} G_{ii},\ \sum_{i=1}^{j} G_{ii} y_i)\ \text{for } j = 1, \ldots, k\}$. This may be the reason why the above (iterative) algorithm is named the ICM. The projection guarantees that the new $F^{(m+1)}$ is still a proper (sub)distribution function; moreover, the projection seems to speed up convergence, as suggested in G&W. Evidently this is a special case of the GGP method: without the projection, the above iteration is just a Newton-like step in which only the diagonal elements of the Hessian matrix are used. The estimate $F^{(m)}$ at convergence is taken as the NPMLE. We follow Aragon and Eberly (1992) to further define the damped iterative convex minorant algorithm (DICM) as
$$F^{(m+1)} = \mathrm{Proj}[\, F^{(m)} + \lambda_j G(F^{(m)})^{-1} \nabla L(F^{(m)});\ G(F^{(m)}),\ \mathcal{R} \,],$$
where $\lambda_j$ is an Armijo-like stepsize (Polak, 1971), or can simply be chosen by
$$\lambda_j = \max\{ 1/2^i : L(F^{(m+1)}) > L(F^{(m)}),\ i = 0, 1, 2, \ldots \},$$
which tries to take a large stepsize while increasing the likelihood. The appeal of the DICM is its global convergence and increasing likelihood in each step, though Zhan and Wellner (1995) point out some possible problems in the proof of global convergence in Aragon and Eberly (1992). On the other hand, G&W (page 7) suggest using a "buffer" to keep the estimate a proper distribution and to avoid the likelihood becoming $-\infty$; however, they do not say what is to be done if the new estimate does make the likelihood $-\infty$. Damping is a natural way to avoid this problem.
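The weighted isotonic regression in the projection step can be computed by the PAVA. The following is a minimal Python sketch of our own (not the paper's code): equal weights correspond to $G = I$, and the upper bound of 1 in $\mathcal{R}$ can be imposed afterwards by clipping.

```python
import numpy as np

def pava(y, w):
    """Weighted isotonic regression by pool-adjacent-violators:
    minimize sum_i w_i (y_i - x_i)^2 subject to x_1 <= ... <= x_k."""
    # Each block stores (weighted mean, total weight, number of points).
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool adjacent blocks while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2), w1 + w2, n1 + n2])
    return np.concatenate([np.full(n, m) for m, _, n in blocks])

print(pava([1.0, 3.0, 2.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
# the violating pair (3, 2) is pooled to its mean 2.5: [1. 2.5 2.5 4.]
```

Unequal weights shift the pooled value toward the heavier observations, which is exactly how the diagonal of $G$ enters the projection.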
Hence in our discussion we refer to the ICM and DICM interchangeably.

2.3 Extending ICM to the Cox Model

The Cox proportional hazards model is probably the most widely used model in survival analysis. It assumes that the hazard function $h(x \mid z)$ with covariate (vector) $Z$ is proportional to an unknown baseline hazard $h_0(x)$:
$$h(x \mid z) = h_0(x) \exp(z' \beta),$$
or, formulated in terms of distribution functions,
$$1 - F(x \mid z) = [1 - F(x)]^{\exp(z' \beta)},$$
where $F$ is the (unknown) baseline distribution and $\beta$ is the regression coefficient (vector). Let the $n$ observations be labelled $(U_1, V_1, Z_1), \ldots, (U_n, V_n, Z_n)$, where $(U_i, V_i)$ as before gives the lower and upper boundaries of the censoring interval, and $Z_i$ is the covariate (vector). The survival time $X_i$ is only observed to be inside $(U_i, V_i]$. As usual we assume that $X_i$ and $(U_i, V_i)$ are mutually independent. The joint log-likelihood can be written down (up to a constant) as (Finkelstein, 1986):
$$L(F, \beta) = \sum_{i=1}^{n} \log\left\{ (1 - F(U_i-))^{\exp(Z_i' \beta)} - (1 - F(V_i))^{\exp(Z_i' \beta)} \right\}.$$
$L$ is maximized to obtain the NPMLE $\hat\beta$ of the regression coefficient along with that of the baseline distribution, $\hat F$. Here we approach this maximization problem by extending the ICM. Computationally this is different from Finkelstein (1986), where the Newton-Raphson method is adopted after reparametrization. As before, we would like to avoid using the full Hessian matrix, due to the possible difficulties mentioned earlier. Our extended ICM is simple (without inverting a high-dimensional matrix) and reasonably fast when applied in this setting. Extending the ICM is straightforward: consider the regression coefficient as another component of the parameters in addition to the baseline distribution. Let $\nabla_1 L(F, \beta) = \partial L(F, \beta)/\partial F$ and $\nabla_2 L(F, \beta) = \partial L(F, \beta)/\partial \beta$ denote the first derivatives, and $G_1(F, \beta)$ and $G_2(F, \beta)$ the corresponding diagonal matrices of the negative second derivatives. Then our extended ICM iterates as
$$F^{(m+1)} = \mathrm{Proj}[\, F^{(m)} + \lambda_j G_1(F^{(m)}, \beta^{(m)})^{-1} \nabla_1 L(F^{(m)}, \beta^{(m)});\ G_1(F^{(m)}, \beta^{(m)}),\ \mathcal{R} \,],$$
$$\beta^{(m+1)} = \beta^{(m)} + \lambda_j G_2(F^{(m)}, \beta^{(m)})^{-1} \nabla_2 L(F^{(m)}, \beta^{(m)}),$$
where $\lambda_j$ and Proj are defined as before. Again the projection step is taken to ensure that the estimate of $F$ is a proper (sub)distribution, which requires that $\hat F$ be nondecreasing and between 0 and 1.
Since there is no constraint on $\beta$, no projection is performed on it. We will discuss another application of the GGP, to a constrained $\beta$, next.
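The joint update can be sketched end to end on simulated data. The following Python sketch is ours, not the paper's implementation: it uses finite differences for the gradients and the diagonal of the Hessian (the paper uses analytic derivatives), a PAVA-based projection, and DICM-style step halving; the simulated dataset and all tuning constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- toy interval-censored Cox data (our own simulation) ---
tau = np.array([1.0, 2.0, 3.0, 4.0])        # candidate jump points of baseline F
n = 40
z = rng.integers(0, 2, n).astype(float)     # binary covariate
beta_true = 1.0
# Weibull(shape 2) event times; scaling time by exp(-z*beta/2) yields exact
# proportional hazards with coefficient beta_true for shape 2.
x = 2.0 * np.exp(-z * beta_true / 2.0) * rng.weibull(2.0, n)
j = np.searchsorted(tau, x)                 # event falls in (tau[j-1], tau[j]]
l_idx, r_idx = j, j + 1                     # indices into the padded survival vector

def loglik(F, beta):
    """Finkelstein-type log-likelihood; padding S with 1 (left) and 0 (right)
    handles left- and right-censored observations uniformly."""
    S = np.concatenate(([1.0], 1.0 - F, [0.0]))
    w = np.exp(z * beta)
    p = S[l_idx] ** w - S[r_idx] ** w
    return -np.inf if np.any(p <= 0) else np.sum(np.log(p))

def fd_grad_diag(f, v, eps=1e-5):
    """Central-difference gradient and floored negative diagonal Hessian
    (a crude stand-in for the analytic G_1, G_2 matrices)."""
    g, d = np.zeros_like(v), np.zeros_like(v)
    f0 = f(v)
    for i in range(v.size):
        e = np.zeros_like(v); e[i] = eps
        fp, fm = f(v + e), f(v - e)
        g[i] = (fp - fm) / (2 * eps)
        d[i] = max((2 * f0 - fp - fm) / eps ** 2, 1e-3)   # floor, LM-style
    return g, d

def pava(y, w):
    """Weighted isotonic regression by pool-adjacent-violators."""
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2), w1 + w2, n1 + n2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

F, beta = np.array([0.2, 0.4, 0.6, 0.8]), 0.0
L_start = loglik(F, beta)
for _ in range(25):
    gF, dF = fd_grad_diag(lambda v: loglik(v, beta), F)
    gB, dB = fd_grad_diag(lambda v: loglik(F, v[0]), np.array([beta]))
    step = 1.0
    while step > 1e-8:                       # damped (DICM-style) step halving
        Fn = np.clip(pava(F + step * gF / dF, dF), 1e-6, 1 - 1e-6)
        bn = beta + step * gB[0] / dB[0]
        if loglik(Fn, bn) > loglik(F, beta):
            F, beta = Fn, bn
            break
        step /= 2

print("log-likelihood gain:", loglik(F, beta) - L_start, " beta_hat:", beta)
```

Note that only the $F$-part is projected (via the PAVA plus clipping), while $\beta$ takes an unconstrained damped Newton-like step, mirroring the two displayed updates above.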
Notice that inverting the diagonal matrices $G_1$ and $G_2$ is trivial. But to ensure numerical stability of inverting $G_1$ and $G_2$ (and positive-definiteness of $G_1$ and $G_2$ for truncated data, as mentioned at the end of the paper), the Levenberg-Marquardt adjustment is adopted (Thisted, 1988). Specifically, $G_1^{-1}$ and $G_2^{-1}$ are respectively replaced by $(G_1 + \epsilon I)^{-1}$ and $(G_2 + \epsilon I)^{-1}$, with a suitably chosen small $\epsilon > 0$, when they are near singular (or non-positive-definite for truncated data). We note that our extended ICM is different from that outlined by Huang and Wellner (1997), which is basically a back-fitting algorithm (Hastie and Tibshirani, 1990): $L(F, \beta^{(m)})$ is first maximized with respect to $F$ at the fixed $\beta^{(m)}$ (by an ICM-like algorithm), which leads to $F^{(m+1)}$; then $L(F^{(m+1)}, \beta)$ is maximized with respect to $\beta$ at the fixed $F^{(m+1)}$ to get another estimate $\beta^{(m+1)}$; the above two steps are repeated until convergence. In contrast, we propose to maximize the likelihood jointly over $F$ and $\beta$, which seems more common and likely more efficient. Huang and Wellner (1997) did not give any specific numerical results; some numerical comparison is presented in Section 3.3. Intuitively, the (extended) ICM will converge, since the likelihood function is upper-bounded and each step of the (extended) ICM increases the likelihood. When we do not use the second order information (i.e., $G$ is an identity matrix), the GGP reduces to the GP, which only has a linear convergence rate. On the other hand, the GGP can have a superlinear convergence rate if $G$ is suitably chosen (part of which is diagonal and the remaining part equals the corresponding Hessian) (Propositions 3 and 4, Bertsekas, 1982).
Since the second order information $G$ used in our (extended) ICM is between these two choices, we can expect the convergence rate of the (extended) ICM to be between linear and superlinear; furthermore, if the diagonal elements of the Hessian dominate the non-diagonals, the convergence rate of the (extended) ICM will at least be close to superlinear. Rigorous derivations are not pursued here, though our limited experience seems to be positive.

2.4 Extending ICM to Support Lasso

Tibshirani (1996) proposes an estimation procedure called the Lasso in linear regression, and then extends it to the Cox model for right-censored data (Tibshirani, 1997). The Lasso works by constraining the sum of the absolute values of the regression coefficients to be less than a constant. This immediately reminds one of ridge regression, where the constraint is instead on the sum of the squared regression coefficients. As in ridge regression, the Lasso is a biased estimation method which nonetheless may lead to smaller mean squared errors by reducing the variability of the estimates. However, to our knowledge, there is still no application of the Lasso method to the Cox model with interval censored data in the literature. Unlike in linear regression or in the Cox model with right censored data, here we have a high-dimensional (and ordered) nuisance parameter, the baseline distribution. (The baseline distribution is eliminated from the partial likelihood in the Cox model for right censored data, but we do not have an explicit partial likelihood for interval censored data.) The ICM is readily adapted to support the Lasso by using the ideas of the GGP. Now we need to constrain $\beta$ in $\mathcal{C} = \{\beta : \sum_{i=1}^{r} |\beta_i| \le t\}$ (where $t$ is a specified constant). This only requires modifying our previous algorithm by
$$\beta^{(m+1)} = \mathrm{Proj}[\, \beta^{(m)} + \lambda_j G_2(F^{(m)}, \beta^{(m)})^{-1} \nabla_2 L(F^{(m)}, \beta^{(m)});\ G_2(F^{(m)}, \beta^{(m)}),\ \mathcal{C} \,].$$
The projection of a point into $\mathcal{C}$ is a standard quadratic programming problem; in particular, Tibshirani (1996) gives two efficient algorithms. We term the $\beta^{(m)}$ at convergence the Lasso estimate. The constant $t$ in the constraint can be chosen by generalized cross-validation for right censored data (Tibshirani, 1997); here we use a 10-fold likelihood cross-validation (Silverman, 1986).

3. SIMULATION

In our simulations, the baseline distribution $F$ is Weibull with shape parameter 2 and scale parameter 1. Only binary covariates are used, each component of which takes 0 or 1 with equal probability. The regression coefficient is also a vector of 0's and 1's. The first examination time $T$ is Uniform$(0, \theta)$. To mimic many panel studies, we take the length of the time interval between two follow-up examinations to be constant, len = .5. Generally, if we have $k + 1$ examinations, survival time $X_i$ is censored into one of $(0, T_i], (T_i, T_i + \text{len}], \ldots, (T_i + k \cdot \text{len}, \infty)$. $k$ and $\theta$ jointly determine the censoring pattern. In our simulation studies there are four configurations:
1. $k = 1$, $\theta = 2$, $\beta = (1, 0, 0)$
2. $k = 1$, $\theta = 2$, $\beta = (.333, .333, .333)$
3. $k = 2$, $\theta = 1$, $\beta = (1, 0, 0)$
4. $k = 2$, $\theta = 1$, $\beta = (.333, .333, .333)$
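Returning to the Lasso step of Section 2.4: the Euclidean (unweighted) projection onto $\{\beta : \sum_i |\beta_i| \le t\}$ has a well-known sorting-based solution, sketched below in Python. This is a simplified illustration of our own; the projection in the algorithm itself uses the weighted norm induced by $G_2$ and is solved by quadratic programming.

```python
import numpy as np

def project_l1(v, t):
    """Euclidean projection of v onto the L1 ball {x : sum(|x|) <= t},
    computed by sorting and soft-thresholding at a data-driven level."""
    if np.abs(v).sum() <= t:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]            # sorted magnitudes, descending
    css = np.cumsum(u)
    # Largest index rho with u_rho * rho > (cumsum_rho - t).
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - t)[0][-1]
    theta = (css[rho] - t) / (rho + 1)      # threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

b = project_l1(np.array([3.0, -1.0, 0.5]), 2.0)
print(b, np.abs(b).sum())  # lands on the boundary: [2. 0. 0.], L1 norm 2.0
```

Note how the projection zeroes out the small coefficients, which is the source of the Lasso's variable-selection behavior.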
It is evident that in the first two cases the average length of the censoring intervals tends to be longer than in the other two, which implies more information loss from censoring. The first and third cases represent one large covariate effect, while the other two represent many smaller covariate effects; they are used to compare the NPMLE with the Lasso estimate. The sample size is always 100. All the programs are implemented in C on a SunSparc. The starting value for the baseline distribution is the discrete uniform on the possible support of the NPMLE, whereas it is zero for the regression coefficient. An algorithm is stopped when both the log-likelihood increment and the change of the regression coefficient between two consecutive iterations are less than a prescribed small tolerance.

3.1 NPMLE and Bootstrap

For cases 1 and 3, suppose we know the true model; in other words, we know there is only one non-zero coefficient, $\beta_1$. The ICM is so fast that it often converges within several seconds. The speed of the ICM facilitates the use of computing-intensive methods such as the bootstrap and cross-validation. Cross-validation is used in choosing the Lasso constraint; here we apply the bootstrap to estimate the variability of the NPMLE $\hat\beta$. For related work in the context of right-censored data see Burr (1994). As reviewed in Huang and Wellner (1997), in practice there are two approaches to estimating the variance of $\hat\beta$: one uses the observed information matrix; the other uses the profile likelihood for low-dimensional $\beta$. As we pointed out earlier, for a continuous time model without grouping, the observed information matrix is often of too high a dimension and sometimes is near singular. There is also no theoretical justification for their use. Bootstrapping seems a natural alternative, though we are not aware of any empirical studies yet, which probably results from the lack of faster algorithms.
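The resampling mechanics can be sketched generically. In the Python sketch below, the estimator (a sample mean on synthetic normal data) is only a stand-in for the NPMLE $\hat\beta$, used to show the structure of the bootstrap standard-error computation.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_se(data, estimator, B=200):
    """Nonparametric bootstrap standard error: draw n observations with
    replacement from the original sample, re-estimate, and take the
    standard deviation across the B replicates."""
    n = len(data)
    stats = [estimator(data[rng.integers(0, n, n)]) for _ in range(B)]
    return np.std(stats, ddof=1)

x = rng.normal(0.0, 1.0, 100)
se = bootstrap_se(x, np.mean, B=500)
print(se)  # close to the textbook value 1/sqrt(100) = 0.1
```

For the NPMLE, `estimator` would refit the extended ICM on each bootstrap sample, which is feasible precisely because the algorithm is fast.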
Our bootstrapping scheme is as discussed by Efron and Tibshirani (1986): the same number of observations are randomly drawn with replacement from the original sample to form a bootstrap sample. We used 1,000 bootstrap samples. The results are reported in Table 1. Noticeably, the NPMLE is biased upwards and has a large variability in case 1. The reason is too much information loss from censoring, particularly the overwhelming 60% left-censoring (and approximately 10% right-censoring). For a left-censored observation $(V_i, Z_i)$, its contribution to the
log-likelihood is
$$L_i = \log\left\{ 1 - (1 - F(V_i))^{\exp(Z_i' \beta)} \right\},$$
which is maximized as $\beta$ tends to infinity for a positive $Z_i$. Nevertheless, the bootstrap estimate of the variance of the NPMLE is still sensible. In case 3, the left-censoring percentage is reduced to 30% (and there is about 10% right-censoring); the bootstrap estimate improves along with the NPMLE. Overall, we can see that the bootstrap works reasonably well.

Table 1: NPMLE of the regression coefficient, $\hat\beta$, and the bootstrap estimate of its standard error, $\widehat{se}$. Their means and Monte Carlo standard errors were calculated from 500 replications; $\beta_1 = 1$. [Columns: Case, mean($\hat\beta$), se($\hat\beta$), mean($\widehat{se}$), se($\widehat{se}$).]

3.2 NPMLE and Lasso Estimate

In this subsection we compare the performance of the NPMLE and the Lasso estimate of the regression coefficient. Unlike in the last subsection, all three covariates are now used; in other words, we do not know in advance which $\beta_i$ is zero. Our results are presented in Table 2, and are quite similar to those of Tibshirani (1996) for ordinary least squares regression. The Lasso estimates tend to have larger biases (shrunk down towards zero) but much smaller variances, and hence smaller mean squared errors (MSE), than the NPMLE. However, as the performance of the NPMLE improves in cases 3 and 4, the Lasso estimate holds only a slight edge.

3.3 A Comparison with Other Algorithms

As pointed out earlier, Finkelstein's Newton-Raphson algorithm is not practical when the sample size is large, whereas the profile likelihood approach (Huang, 1996; Huang and Wellner, 1995) is only applicable with low-dimensional regression coefficients. Hence the only remaining competitor is the algorithm outlined in Huang and Wellner (1997). Essentially their method is to alternate
maximizing the likelihood with respect to the baseline distribution and with respect to the regression coefficients. In contrast, we propose to maximize the likelihood with respect to the baseline distribution and the regression coefficients simultaneously.

Table 2: NPMLEs and Lasso estimates of regression coefficients (with their standard errors in parentheses) from 500 replications. $\beta = (1, 0, 0)$ for Cases 1 and 3; $\beta = (.333, .333, .333)$ for the other two. [Columns: Case; NPMLE MSE, $\hat\beta_1$, $\hat\beta_2$, $\hat\beta_3$; Lasso MSE, $\tilde\beta_1$, $\tilde\beta_2$, $\tilde\beta_3$.]

The advantage of our algorithm can be explained intuitively as follows: suppose we want to maximize a bivariate function, say $f(x, y)$, with respect to $x$ and $y$. If possible, we would like to maximize $f(x, y)$ with respect to $x$ and $y$ jointly, rather than maximize $f(x, y)$ with respect to $x$, then with respect to $y$, and repeat this process. Due to this connection between the two algorithms, it may appear that our proposal is a trivial extension of Huang and Wellner's; however, this connection is established largely by our formulation of the ICM as a GGP. We compared these two algorithms in the configuration of Case 3. Both algorithms used the same starting values and convergence criteria as before. The results are shown in Table 3. It is apparent that our algorithm is over two times faster than Huang and Wellner's. Moreover, the log-likelihoods at the convergence of the two algorithms are very close to each other; in fact, all the log-likelihood values at convergence of our algorithm are slightly larger than their counterparts from Huang and Wellner's. This guarantees that our algorithm was not favorably stopped
earlier.

Table 3: Speed of our new algorithm and Huang and Wellner's in computing the NPMLE in Case 3, with 500 replications. The CPU time is in seconds; the log-likelihood is that at convergence; standard errors are in parentheses. [Columns: Algorithm, CPU time, log-likelihood, $\hat\beta$; CPU times: Huang-Wellner 1.36 (.24), New .44 (.8).]

To investigate our algorithm's (in)dependence on starting values, we used random starting values for the first ten datasets in Case 3, each one replicated 10 times. The initial baseline hazard increment in each of its support intervals was taken as a uniform random number between .01 and .1, while the starting regression coefficients were drawn from a uniform distribution between -2 and 2. For each dataset, the algorithm always converged to the neighborhood of the same point: the estimated regression coefficients agreed up to the third decimal digit, while the log-likelihood value was the same. However, more work is needed to clarify whether there are any local maxima to which the algorithm may converge, as one referee pointed out.

4. REAL EXAMPLES

Now we apply our extended ICM to reanalyze two real datasets via the NPMLE. They have been discussed in, among others, Finkelstein (1986), Huang (1994) and Huang and Wellner (1995). Finkelstein focuses on hypothesis testing, while Huang (1994) and Huang and Wellner (1995) establish the asymptotic normality and asymptotic efficiency of the NPMLE of the regression coefficient. The first dataset is from a tumorigenicity study, where 144 mice were assigned to either a germfree or a conventional environment. The survival time here is the tumor onset time, which cannot be observed directly: the presence of the tumor can only be detected at the time of the sacrifice or death of the animal. Hence the survival time is either left censored or right censored (i.e., "case 1" interval censoring).
If we take the germfree group as the baseline with covariate $Z = 0$ and the other group as $Z = 1$, then the NPMLE $\hat\beta$ of the regression coefficient from the Cox model
is $\hat\beta = -.68$, with estimated standard error .41 by the bootstrap (with 1,000 bootstrap replications). These differ from Huang's results: $-.55$ and .29. Huang uses the profile likelihood to compute the NPMLE and approximates its variance by asymptotics, which involves some density estimation. Our 95% bootstrap percentile confidence interval for $\beta$ is $[-1.26, .34]$, compared with Huang's asymptotic normal approximation $[-1.12, .1]$. The second example considered is the Breast Cosmesis Study dataset (Finkelstein, 1986; Finkelstein and Wolfe, 1985). Two medical treatments were given to 94 early breast cancer patients after tumorectomy: one a mix of primary radiotherapy and adjuvant chemotherapy, the other radiotherapy only. The interest of this clinical trial is in which treatment has better long-term cosmetic effects. Interval censoring results because the patients could only be visited every 4 to 6 months, and those living farther from the clinic had even longer follow-up intervals. If we take the radiotherapy group as the baseline group, the NPMLE of the regression coefficient from the Cox model is $\hat\beta = .735$, with estimated standard error .35 by the bootstrap. These results are not much different from the corresponding .795 and .29 from Huang and Wellner (1995), but the bootstrap inference seems more conservative than theirs: the 95% bootstrap percentile and asymptotic confidence intervals are respectively [.11, 1.49] and [.23, 1.36]. Further studies are needed to clarify these discrepancies.

5. SUMMARY

In computing the NPMLE of the distribution function for interval censored data, the dimension of the parameters is on the order of the sample size, often several hundred or more. Hence the space and time overheads incurred by using the Hessian (or its approximation by a full matrix) are prohibitive. This motivated us to investigate algorithms that avoid using these large matrices. Clearly we are trading time for space.
In particular, we do not claim that our extended ICM is faster than Finkelstein's Newton-Raphson method. The ICM is fast yet simple, avoiding high-dimensional matrices such as the Hessian, for interval censored data with no covariates. Our formulation of the ICM as a GGP helps not only to establish its connection with the Newton-Raphson algorithm, but also to ease its extension to the Cox model and to support the Lasso. Its speed also facilitates combining it with computing-intensive methods such as the bootstrap and cross-validation. In particular, our limited
simulation studies seem to favor the Lasso estimates of the regression coefficients in terms of mean squared error, as in the linear regression setting (Tibshirani, 1996). Finally, we note that our extended ICM (with the Levenberg-Marquardt adjustment) is readily applicable to truncated and interval censored data (Pan and Chappell, 1998).

Acknowledgements

The author would like to thank Olvi Mangasarian for introducing the GGP; thanks also go to Rick Chappell for stimulating discussions and support. Many insightful comments from the Associate Editor and three anonymous referees dramatically improved the presentation.
References

Alioum, A. and Commenges, D. (1996), "A Proportional Hazards Model for Arbitrarily Censored and Truncated Data," Biometrics, 52.
Aragon, J. and Eberly, D. (1992), "On Convergence of Convex Minorant Algorithms for Distribution Estimation with Interval-Censored Data," Journal of Computational and Graphical Statistics, 1.
Bertsekas, D.P. (1976), "On the Goldstein-Levitin-Polyak Gradient Projection Method," IEEE Transactions on Automatic Control, 21.
Bertsekas, D.P. (1982), "Projected Newton Methods for Optimization Problems with Simple Constraints," SIAM Journal on Control and Optimization, 20.
Burr, D. (1994), "A Comparison of Certain Bootstrap Confidence Intervals in the Cox Model," Journal of the American Statistical Association, 89.
Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977), "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B, 39.
Efron, B. and Tibshirani, R.J. (1986), "Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy" (with discussion), Statistical Science, 1.
Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap, New York: Chapman & Hall.
Finkelstein, D.M. (1986), "A Proportional Hazards Model for Interval-Censored Failure Time Data," Biometrics, 42.
Finkelstein, D.M. and Wolfe, R.A. (1985), "A Semiparametric Model for Regression Analysis of Interval-Censored Failure Time Data," Biometrics, 41.
Groeneboom, P. and Wellner, J.A. (1992), Information Bounds and Nonparametric Maximum Likelihood Estimation, Basel: Birkhäuser Verlag.
Hastie, T. and Tibshirani, R. (1990), Generalized Additive Models, London: Chapman & Hall.
Huang, J. (1996), "Efficient Estimation for the Cox Model with Interval Censoring," Annals of Statistics, 24.
Huang, J. and Wellner, J.A. (1995), "Efficient Estimation for the Proportional Hazards Model with 'Case 2' Interval Censoring," Tech. Rept., Dept. of Statistics, Univ. of Washington, Seattle, WA.
Huang, J. and Wellner, J.A. (1997), "Interval Censored Survival Data: A Review of Recent Progress," Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis, New York: Springer.
Mangasarian, O.L. (1996), "Algorithms of Nonlinear Mathematical Programming," CS73 Course Notes, Computer Sciences Dept., Univ. of Wisconsin, Madison, WI.
Pan, W. and Chappell, R. (1998), "Computation of the NPMLE of Distribution Functions for Interval Censored and Truncated Data with Applications to the Cox Model," Computational Statistics and Data Analysis, 28.
Polak, E. (1971), Computational Methods in Optimization, New York: Academic Press.
Robertson, T., Wright, F.T. and Dykstra, R.L. (1988), Order Restricted Statistical Inference, New York: Wiley.
Silverman, B.W. (1986), Density Estimation for Statistics and Data Analysis, New York: Chapman & Hall.
Thisted, R.A. (1988), Elements of Statistical Computing, New York: Chapman & Hall.
Tibshirani, R. (1996), "Regression Shrinkage and Selection via the Lasso," Journal of the Royal Statistical Society, Series B, 58.
Tibshirani, R. (1997), "A Proposal for Variable Selection in the Cox Model," Statistics in Medicine, 16.
Turnbull, B.W. (1976), "The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data," Journal of the Royal Statistical Society, Series B, 38.
Zhan, Y. and Wellner, J.A. (1995), "Double Censoring: Characterization and Computation of the Nonparametric Maximum Likelihood Estimator," Tech. Rept., Dept. of Statistics, Univ. of Washington, Seattle, WA.