A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood


Noname manuscript No. (will be inserted by the editor)

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood

Mai Zhou · Yifan Yang

Received: date / Accepted: date

Abstract The Kaplan-Meier estimator is very popular in the analysis of survival data. However, it is not easy to compute the constrained Kaplan-Meier estimator. The current computational method uses the expectation-maximization (EM) algorithm, which can be slow in many situations. In this note we give a recursive computational algorithm for the constrained Kaplan-Meier estimator, where the constraints are assumed given in terms of linear estimating equations or mean functions. We also illustrate how this leads to the empirical likelihood ratio test with right censored data. Speed comparisons to the EM based algorithm favour the proposed procedure.

Keywords Wilks test · Empirical likelihood ratio · Right censored data · NPMLE

Mai Zhou, Department of Statistics, University of Kentucky. E-mail: mai@ms.uky.edu
Yifan Yang, Department of Statistics, University of Kentucky. E-mail: yifan.yang@uky.edu

1 Introduction

Suppose that X_1, X_2, ..., X_n are i.i.d. non-negative random variables denoting the lifetimes, with a continuous distribution function F_0. Independent of the lifetimes, there are censoring times C_1, C_2, ..., C_n which are i.i.d. with a distribution G_0. Only the censored observations (T_i, δ_i) are available to us:

    T_i = min(X_i, C_i)  and  δ_i = I[X_i ≤ C_i]  for i = 1, 2, ..., n.    (1)

Here I[A] is the indicator function of A. The empirical likelihood (EL) of the censored data in terms of a distribution F is defined as

    EL(F) = ∏_{i=1}^n [ΔF(T_i)]^{δ_i} [1 − F(T_i)]^{1−δ_i}
          = ∏_{i=1}^n [ΔF(T_i)]^{δ_i} { ∑_{j: T_j > T_i} ΔF(T_j) }^{1−δ_i},

where ΔF(t) = F(t+) − F(t−) is the jump of F at t; see for example Kaplan and Meier (1958) and Owen (2001). The second line above assumes a discrete F(·). It is well known that the constrained and unconstrained maxima of the empirical likelihood are both attained by a discrete F (Zhou, 2005). Let w_i = ΔF(T_i) for i = 1, 2, ..., n. The likelihood at this F can be written in terms of the jumps,

    EL = ∏_{i=1}^n [w_i]^{δ_i} { ∑_{j=1}^n w_j I[T_j > T_i] }^{1−δ_i},

and the log likelihood is

    logEL = ∑_{i=1}^n ( δ_i log w_i + (1 − δ_i) log ∑_{j=1}^n w_j I[T_j > T_i] ).    (2)

If we maximize the log EL above without extra constraints (the probability constraints w_i ≥ 0 and ∑ w_i = 1 are always imposed), it is well known (Kaplan and Meier, 1958) that the Kaplan-Meier estimator, w_i = ΔF̂_KM(T_i), achieves the maximum value of the log EL.

The empirical likelihood ratio method was first proposed by Thomas and Grunkemeier (1975) in the context of the Kaplan-Meier estimator. This method has been studied by Owen (1988, 2001); Li (1995); Murphy and van der Vaart (1997) and Pan and Zhou (1999), among many others. When using the empirical likelihood with right censored data to test a general hypothesis, Zhou (2005) gave an EM algorithm to compute the likelihood ratio. He compared the EM algorithm to the sequential quadratic programming method (Chen and Zhou, 2007) and concluded that the EM was better. Though quite stable, the EM can be slow in certain data settings; see the examples in Section 3.

We shall give a new recursive computational procedure for the constrained maximum of the log empirical likelihood above, which leads to the computation of the empirical likelihood ratio for testing. Section 2 contains the derivation of the recursive algorithm, as well as its application to the censored data empirical likelihood test. Section 3 reports the simulation results and a real data example. Finally, we end the paper with a discussion of further issues.
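To make (2) concrete, here is a small sketch in Python (the paper's software is in R; this toy data set and code are ours, not from the kmc package) that computes the Kaplan-Meier jumps by the product-limit formula and evaluates the censored-data log EL at them.

```python
# Toy illustration (hypothetical data; assumes T sorted and no ties):
# Kaplan-Meier jumps and the censored-data log EL (2) evaluated at them.
import math

def km_jumps(T, d):
    """Jumps w_i of the Kaplan-Meier estimator; T sorted, d = 1 if uncensored."""
    n = len(T)
    surv = 1.0                     # current value of the survival estimate
    w = [0.0] * n
    for i in range(n):
        at_risk = n - i            # size of the risk set at T_i
        if d[i] == 1:
            w[i] = surv / at_risk  # product-limit jump at an uncensored time
        surv *= 1.0 - d[i] / at_risk
    return w

def log_el(w, T, d):
    """Log empirical likelihood (2) at jumps w."""
    total = 0.0
    for i in range(len(T)):
        if d[i] == 1:
            total += math.log(w[i])
        else:                      # censored: log of the mass beyond T_i
            total += math.log(sum(w[j] for j in range(len(T)) if T[j] > T[i]))
    return total

T = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [1, 0, 1, 1, 1]
w = km_jumps(T, d)
```

On this toy sample the jumps are (0.2, 0, 0.8/3, 0.8/3, 0.8/3), and any other set of jumps supported on the uncensored times gives a smaller log EL, in line with the Kaplan-Meier estimator being the NPMLE.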
2 The mean constrained Kaplan-Meier estimator

In order to compute the empirical likelihood ratio, we need two empirical likelihoods: one with constraints and one without. The maximum of the empirical likelihood without constraints is achieved by an F equal to the Kaplan-Meier estimator, as is well known. It remains to find the maximum of logEL under the constraints, or equivalently, the Kaplan-Meier estimator under the constraints.

An argument similar to that in Owen (1988) shows that in the EL analysis we may restrict our attention, i.e. search for the constrained maximum, to those discrete CDFs F that are dominated by the Kaplan-Meier estimator: F ≪ F̂_KM. (Owen (1988) restricted his attention to those distribution functions that are dominated by the empirical distribution.) The first step in our analysis is to find a discrete CDF that maximizes log EL(F) under the mean constraints (3), which are specified as follows:

    ∫ g_1(t) dF(t) = μ_1
    ∫ g_2(t) dF(t) = μ_2
        ⋮                    (3)
    ∫ g_p(t) dF(t) = μ_p

[Fig. 1 Kaplan-Meier plot with and without mean constraint for the Stanford 5 data discussed in Section 3, time scaled by 500. The plotted curves are the Kaplan-Meier curve and the constrained curves under H_0: ∫_0^τ x dF(x) = 2.2 and H_0: S(3) = …]

where the g_i(t) (i = 1, 2, ..., p) are given functions satisfying some moment conditions (specified later), and the μ_i (i = 1, 2, ..., p) are given constants. Without loss of generality, we shall assume all μ_i = 0. Examples of such constrained Kaplan-Meier estimators are shown in Figure 1.

The constraints (3) can be written (for discrete CDFs, and in terms of w_i = ΔF(T_i) = F(T_i) − F(T_i−)) as

    ∑_{i=1}^n g_1(T_i) w_i = 0
        ⋮                    (4)
    ∑_{i=1}^n g_p(T_i) w_i = 0.

We must find the maximum of logEL(F) under these constraints. We shall use the Lagrange multiplier method to find this constrained maximum. Since F ≪ F̂_KM, w_i is positive only at the uncensored observations T_i, except possibly at the last observation. Without loss of generality, assume the observations are already ordered according to T and the smallest observation is an uncensored one (δ_1 = 1). To see why this is no loss of generality, suppose the first observation is right censored and the second one is uncensored. In this case δ_1 = 0, and thus w_1 = 0. Hence

    w_1 = 0,  ∑_{i=2}^n w_i = 1,  and  δ_1 log w_1 ≡ 0.    (5)

The i-th term in the log empirical likelihood is

    δ_i log w_i + (1 − δ_i) log ∑_{j=i+1}^n w_j ,

which holds because the observations are sorted according to T. Since δ_1 log w_1 ≡ 0, the log empirical likelihood only depends on w_2, ..., w_n. Additionally, the first observation, having δ = 0, makes no contribution to the constraints.

Now assume the sample of n observations is already ordered according to T and δ_1 = 1. Let I = {i_1, ..., i_k} be the index set of the censored observations among the n, such that T_{i_1} < ... < T_{i_k}, where k is the number of elements in I. Thus we have only n − k positive probabilities w_i. Introduce k new variables S_1, ..., S_k, one for each censored T observation: for j ∈ {1, ..., k} (so δ_{i_j} = 0), let

    S_j = ∑_{i: T_i > T_{i_j}} w_i = 1 − ∑_{i: T_i ≤ T_{i_j}} w_i .    (6)

This adds k new constraints to the optimization problem; we write this vector of k constraints as S_j − ∑_{i: T_i > T_{i_j}} w_i = 0, j = 1, ..., k. With these k new variables S_j, the log empirical likelihood in Section 1 can be written simply as

    logEL = ∑_{i=1}^n δ_i log w_i + ∑_{j=1}^k log S_j .    (7)

Remark 1 Obviously, if two censored observations T_{i_j} and T_{i_m} are such that no uncensored observation is sandwiched between them, then S_j = S_m. This is obvious from (6) and the fact that w_i > 0 only when δ_i = 1. We comment that this does not change our final recursive formulas (12) and (6).

The Lagrangian function for the constrained maximum is

    G = logEL(w, S) + λ' ( ∑_i w_i g(T_i) ) − η ( ∑_i w_i − 1 ) − γ' ( S − W ).

Here λ ∈ R^p and g(T_i) = (g_1(T_i), ..., g_p(T_i))' is the vector corresponding to the constraints (4); η ∈ R is a scalar; γ, S, W ∈ R^k, where S is the vector of the S_j defined in (6) and the j-th entry of W is ∑_{i: T_i > T_{i_j}} w_i.

Next we shall take the partial derivatives and set them to zero. We shall show that η = n. First we compute

    ∂G/∂S_j = (1 − δ_{i_j})/S_j − γ_j .

Setting this derivative to zero, we have

    γ_j = (1 − δ_{i_j})/S_j = 1/S_j .    (8)

Furthermore,

    ∂G/∂w_l = δ_l/w_l + δ_l λ'g(T_l) − η + γ'U^{(l)} ,

where U^{(l)} ∈ {0,1}^k is a vector whose j-th entry is the indicator I[T_{i_j} < T_l](1 − δ_{i_j}) = I[T_{i_j} < T_l]. Setting this derivative to zero and writing l as i gives

    η = δ_i/w_i + δ_i λ'g(T_i) + γ'U^{(i)} .

Multiplying both sides by w_i and summing over i,

    η ∑_i w_i = ∑_i δ_i + ∑_i δ_i w_i λ'g(T_i) + ∑_i w_i γ'U^{(i)} .

Making use of the other constraints, this simplifies to

    η = (n − k) + 0 + ∑_i w_i γ'U^{(i)} .    (9)

We now focus on the last term above.
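On a toy sample (made-up data; a Python sketch, not the kmc code) one can check numerically that the form (7), with the auxiliary variables S_j computed from the jumps by (6), agrees with the original log EL (2):

```python
# Check that log EL written as (7), with S_j from (6), equals form (2).
# Hypothetical sorted data with δ_1 = 1 and one censored observation.
import math

T = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [1, 0, 1, 1, 1]                       # index 1 is censored
w = [0.2, 0.0, 0.3, 0.3, 0.2]             # jumps supported on uncensored times

# form (2): sum of δ_i log w_i + (1 − δ_i) log Σ_{j: T_j > T_i} w_j
loglik2 = sum(math.log(w[i]) if d[i] == 1
              else math.log(sum(w[j] for j in range(len(T)) if T[j] > T[i]))
              for i in range(len(T)))

# form (7): Σ_i δ_i log w_i + Σ_j log S_j, with S_j computed by (6)
S = [1.0 - sum(w[i] for i in range(len(T)) if T[i] <= T[j])
     for j in range(len(T)) if d[j] == 0]
loglik7 = (sum(math.log(w[i]) for i in range(len(T)) if d[i] == 1)
           + sum(math.log(s) for s in S))
```

The two quantities coincide because each censored factor in (2) is exactly one of the S_j defined in (6).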
Plug in the γ_j expression we obtained in (8) above,

switch the order of summation, and it is not hard to see that

    ∑_{i=1}^n w_i γ'U^{(i)} = ∑_{j=1}^k γ_j ∑_{i=1}^n w_i I[T_{i_j} < T_i] I[δ_{i_j} = 0]
                            = ∑_{j=1}^k (1/S_j) ∑_{i: T_i > T_{i_j}} w_i = ∑_{j=1}^k (1/S_j) S_j = k ,

where the second-to-last equality uses (6). Therefore equation (9) becomes η = (n − k) + 0 + k, and hence η = n. Thus we have

    w_i = δ_i / ( n − δ_i λ'g(T_i) − γ'U^{(i)} ) ,    (11)

where we further note (plugging in the γ_j, and using δ_{i_j} = 0) that

    γ'U^{(i)} = ∑_{j=1}^k [(1 − δ_{i_j})/S_j] I[T_{i_j} < T_i] = ∑_{j=1}^k I[T_{i_j} < T_i]/S_j .    (10)

This finally gives rise to

    w_i = w_i(λ) = δ_i / ( n − δ_i λ'g(T_i) − ∑_{j=1}^k I[T_{i_j} < T_i]/S_j ) ,    (12)

which, together with (6), provides a recursive computation method for the probabilities w_i, provided λ is given:

1. Start from the left-most observation; without loss of generality (as noted above) we can assume it is an uncensored data point, δ_1 = 1. Thus

    w_1 = 1 / ( n − λ'g(T_1) ).

2. Once we have w_i for all i ≤ l, we also have all the S_j with T_{i_j} < T_{l+1} and δ_{i_j} = 0, by using S_j = 1 − ∑_i I[T_i ≤ T_{i_j}] w_i; then we can compute

    w_{l+1} = δ_{l+1} / ( n − λ'g(T_{l+1}) − ∑_{j=1}^k I[T_{i_j} < T_{l+1}]/S_j ) .    (13)

So this recursive calculation gives us the w_i and S_j as functions of λ.

Remark 2 In the special case of no mean constraint, there is no λ (or λ = 0) and we have

    w_{l+1} = δ_{l+1} / ( n − ∑_{j=1}^k I[T_{i_j} < T_{l+1}]/S_j ) ,

which is exactly the jump of the Kaplan-Meier estimator.

Proof Using the identity (1 − F̂)(1 − Ĝ) = 1 − Ĥ, we first get a formula for the jump of the Kaplan-Meier estimator: ΔF̂(T_i) = (1/n) / (1 − Ĝ(T_i−)). This works for F̂ as well as for Ĝ. We next show that

    1 − (1/n) ∑_{j=1}^k I[T_{i_j} < T_{l+1}]/S_j = 1 − Ĝ(T_{l+1}−) ,

since, by the same formula applied to Ĝ (the jump of Ĝ at a censored T_{i_j} is (1/n)/(1 − F̂(T_{i_j})) = (1/n)/S_j), the sum on the left hand side is just the summation of the jumps of Ĝ before T_{l+1}.
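The recursion (13) with (6) can be sketched as follows (Python; a simplified one-constraint illustration under our standing assumptions of sorted data, no ties and δ_1 = 1 — not the kmc implementation). As Remark 2 states, setting λ = 0 reproduces the Kaplan-Meier jumps.

```python
# Recursive computation of w_i(λ) by (13), maintaining the running sums
# Σ_{i≤l} w_i and Σ_j I[T_{i_j} < T_{l+1}]/S_j; toy data below are made up.
def constrained_km_weights(T, d, g, lam):
    n = len(T)
    w = [0.0] * n
    cum = 0.0          # Σ_{i ≤ l} w_i so far
    inv_S_sum = 0.0    # Σ_j I[T_{i_j} < T_{l+1}] / S_j so far
    for i in range(n):
        if d[i] == 1:
            w[i] = 1.0 / (n - lam * g(T[i]) - inv_S_sum)   # formula (13)
            cum += w[i]
        else:
            S_j = 1.0 - cum          # formula (6) at this censored point
            inv_S_sum += 1.0 / S_j
    return w

T = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [1, 0, 1, 1, 1]
w0 = constrained_km_weights(T, d, lambda t: t - 3.0, 0.0)  # λ = 0
```

With λ = 0, w0 equals the Kaplan-Meier jumps (0.2, 0, 0.8/3, 0.8/3, 0.8/3) on this toy sample.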

Finally, the λ value needs to be determined from the constraint equations:

    0 = ∑_i δ_i w_i(λ) g(T_i) = ∑_i δ_i g(T_i) / ( n − δ_i λ'g(T_i) − ∑_{j=1}^k I[T_{i_j} < T_i]/S_j ) .    (14)

More formally, the recursion proceeds as follows:

(1) Initialization: pick a λ value near zero, or identical to zero (recall λ = 0 gives the Kaplan-Meier estimator).
(2) Updating w and S: with this λ, find all the w_i's and S_j's by the recursive formulas (13) and (6).
(3) Updating λ: plug those w_i into the right hand side of equation (14) above and call the result θ. The w_i's obtained in step (2) are actually the constrained Kaplan-Meier with the constraints being these θ instead of zero.
(4) Checking λ and repeating: check whether θ is zero. If not, change the λ value and repeat, until you find a λ which gives rise to w_i and S_j that satisfy θ = 0.

For a one dimensional λ this is easily handled by any routine that computes the root of a univariate function, such as the uniroot function in R. For a multi-dimensional λ this calls for a Newton-type iteration. The empirical likelihood ratio is then obtained as

    −2 logELR = −2 { logEL(w_i, S_j) − logEL(w_i = ΔF̂_KM(T_i)) } ;

for the first logEL inside the curly brackets above we use the expression (7) with the w_i, S_j computed by the recursive method in this section, and the second logEL is obtained by (2) with w_i = ΔF̂_KM(T_i), the regular Kaplan-Meier estimator.

Observations (T_i, δ_i) are first ordered according to their T_i values. If there are several observations with identical T_i values, we then order them according to their δ_i values. That is, an uncensored T_i is considered to come before a censored T_j even when T_i = T_j. This is the usual convention in the calculation of the Kaplan-Meier estimator. When there are observations with identical T_i and δ_i values, we can either merge the tied observations and record the number of ties in another vector u_i, or we may just leave the tied observations as they are, in the order of their input.
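The whole procedure can be put together in a short self-contained sketch (Python; one-dimensional λ, made-up data, a hypothetical constraint function g, and plain bisection standing in for uniroot — not the kmc implementation):

```python
# Solve (14) for λ by bisection inside an assumed feasible bracket, then form
# the -2 log empirical likelihood ratio against the unconstrained (λ = 0) fit.
import math

def weights(T, d, g, lam):
    # recursion (13) with (6); data sorted, first observation uncensored
    n, w, cum, inv_S = len(T), [0.0] * len(T), 0.0, 0.0
    for i in range(n):
        if d[i] == 1:
            w[i] = 1.0 / (n - lam * g(T[i]) - inv_S)
            cum += w[i]
        else:
            inv_S += 1.0 / (1.0 - cum)      # 1/S_j, with S_j from (6)
    return w

def theta(lam, T, d, g):                    # right hand side of (14)
    return sum(wi * g(t) for wi, t, di in zip(weights(T, d, g, lam), T, d)
               if di == 1)

def log_el(w, T, d):                        # log EL (2)
    return sum(math.log(w[i]) if d[i] == 1
               else math.log(sum(w[j] for j in range(len(T)) if T[j] > T[i]))
               for i in range(len(T)))

def solve_lambda(T, d, g, lo=-1.0, hi=1.0, tol=1e-12):
    flo = theta(lo, T, d, g)                # assumes a sign change on [lo, hi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = theta(mid, T, d, g)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

T = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [1, 0, 1, 1, 1]
g = lambda t: t - 3.5                       # hypothetical constraint function
lam = solve_lambda(T, d, g)
w_c = weights(T, d, g, lam)                 # constrained Kaplan-Meier
w_km = weights(T, d, g, 0.0)                # λ = 0: ordinary Kaplan-Meier
m2llr = -2.0 * (log_el(w_c, T, d) - log_el(w_km, T, d))
```

In practice m2llr would then be compared with the chi-square 95th percentile with one degree of freedom.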
When there are substantial tied observations, it may save computation time to first merge the tied data and record the number of ties in u_i before the recursive computation. However, in most applications of survival analysis the Kaplan-Meier estimator will be computed along with some covariates, as in regression analysis. So the actual data will likely look like (T_i, δ_i, x_i), where x_i are the covariates, e.g. treatment, gender, age, blood pressure, etc. of the i-th patient. In this case, even if two observations have identical T_i and δ_i, they should not be merged because they have different covariates. Therefore, in the current implementation of the kmc package, we choose not to merge any tied data.

Under the assumption that the variance of ∫ g(t) dF̂_KM(t) is finite, the above −2 log empirical likelihood ratio has a chi-square limiting distribution under the null hypothesis, as stated in the Wilks theorem. Therefore we reject the null hypothesis if the computed −2 log empirical likelihood ratio exceeds the 95th percentile of the chi-square distribution with p degrees of freedom. See Zhou (2010) for a proof of this theorem.

3 Simulation

If the Jacobian matrix of the mean-zero requirement (14) is not singular, the Newton method can be used to find the root of (14). As mentioned previously, the null parameter space with no constraint corresponds to λ = 0. We shall call the recursive computation for the w_i's plus the Newton iteration for λ the Kaplan-Meier constrained (KMC) method, which is also the name of the R package kmc available on the Comprehensive R Archive Network (CRAN).
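The merge option can be sketched like this (Python; a hypothetical helper and toy data, not the kmc code, which as noted above leaves ties unmerged):

```python
# Sort by (T, -δ) so uncensored observations precede censored ones at equal
# times, then merge identical (T, δ) pairs and record multiplicities u_i.
def merge_ties(T, d):
    order = sorted(range(len(T)), key=lambda i: (T[i], -d[i]))
    Tm, dm, u = [], [], []
    for i in order:
        if Tm and Tm[-1] == T[i] and dm[-1] == d[i]:
            u[-1] += 1                      # same time and status: count it
        else:
            Tm.append(T[i]); dm.append(d[i]); u.append(1)
    return Tm, dm, u

T = [3.0, 1.0, 3.0, 2.0, 3.0]
d = [1, 1, 0, 0, 1]
Tm, dm, u = merge_ties(T, d)                # two uncensored ties at t = 3
```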

In each Newton iteration, when we try to find the root, it is helpful to know the bounds of the solution, the so-called feasible region. In the KMC computation, it is obvious that those λ for which a denominator in (12) vanishes, i.e.

    n − λ' δ_i g(T_i) − ∑_{j=1}^k I[T_{i_j} < T_i]/S_j = 0

for some i, will lead (14) to ±∞. Denote these poles by λ^i ∈ R^p, i = 1, ..., n, and for each coordinate i = 1, ..., p sort their i-th entries as λ_{1i} < ... < λ_{ni}, with λ_{0,i} = −∞ and λ_{n+1,i} = +∞. The i-th entry of the desired root of (14) must satisfy λ_{ji} < λ_i < λ_{j+1,i} for some j. So our strategy is to start at 0 and try to stay within the feasible region at all times when carrying out the Newton iterations.

We can also calculate the analytic derivatives used in the Newton iteration. Denote the right hand side of (14) by f(λ), i.e.

    f(λ) = ∑_i δ_i w_i(λ) g(T_i).

To compute ∇_λ f(λ), we only need to calculate ∇_λ w_i and

    ∇_λ S_j = ∇_λ ( 1 − ∑_{k: T_k ≤ T_{i_j}} w_k ) = − ∑_{k: T_k ≤ T_{i_j}} ∇_λ w_k .

There is no closed form for these derivatives, but they can again be obtained recursively:

(1) w_1 = 1/(n − λ'g(T_1)), and ∇_λ w_1 = w_1² g(T_1);
(2) ∇_λ w_{l+1}(λ) = δ_{l+1} (w_{l+1})² ( g(T_{l+1}) + ∑_{j=1}^k I[T_{i_j} < T_{l+1}] (S_j)^{−2} ∑_{s: T_s ≤ T_{i_j}} ∇_λ w_s ).

To evaluate the performance of this algorithm, a series of simulations was carried out, comparing KMC with the standard EM algorithm (Zhou, 2005). Unless stated otherwise, all simulations were repeated 5,000 times and implemented in the R language (R Core Team, 2014), run on a Windows 7 (64-bit) computer with a 2.4 GHz Intel(R) i7-3630QM CPU.

Experiment 1: Consider right censored data with only one constraint:

    X ~ Exp(1),  C ~ Exp(β).    (15)

The censoring percentage of the data is determined by the choice of β. Three models are included in the experiments: (1) β = 1.5, then 40% of the data are uncensored; (2) β = 0.7, then 58.9% of the data are uncensored; (3) β = 0.2, then 83.3% of the data are uncensored. The common hypothesis is (4), where g(x) = (1 − x) 1_{(0 ≤ x ≤ 1)} − e^{−1}. We can verify that the true expectation is zero:

    ∫ g(x) dF(x) = ∫_0^1 (1 − x) e^{−x} dx − e^{−1} = 0.
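A quick numeric sanity check (Python sketch; the midpoint-rule integrator and truncation point are our own choices) confirms that this g has mean zero under X ~ Exp(1):

```python
# Verify ∫ g(x) e^{-x} dx ≈ 0 for g(x) = (1-x)1_{(0≤x≤1)} - e^{-1},
# using a midpoint rule on [0, 40] (the tail beyond 40 is negligible).
import math

def g(x):
    return (1.0 - x if 0.0 <= x <= 1.0 else 0.0) - math.exp(-1.0)

def mean_g_under_exp1(n=200000, upper=40.0):
    h = upper / n
    return sum(g((i + 0.5) * h) * math.exp(-(i + 0.5) * h)
               for i in range(n)) * h

val = mean_g_under_exp1()
```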
To compare the performance of the KMC and EM algorithms, we used four different sample sizes, i.e. 200, 1,000, 2,000, and 5,000, in the experiments. To make the comparisons fair, ||f^{(t)} − f^{(t+1)}|| is used as the convergence criterion for the EM algorithm, and ||f^{(t)}|| is used for KMC. The average running time is reported in Table 1 to compare computational efficiency. The uncensored case is included for reference; it is equivalent to Newton-solving for λ without the recursion. In all cases in our study, EM and KMC reported almost the same χ² test statistics. As shown in Table 1, we observed the following phenomena:

(1) KMC always outperformed the EM algorithm in speed across the different simulation settings.
(2) The computational cost of EM increased sharply as the percentage of censored data increased. This is reasonable, since more censored data require more E-step computation. The censoring rate, however, did not affect KMC much.
(3) Sample size is related to the computational cost: the running times of both EM and KMC increased with the sample size.

Table 1 Average running time of EM/KMC (in seconds). The No Censor column refers to the time spent solving the empirical likelihood without censoring in R (el.test in the emplik package); we use this as a comparison reference.

    Censoring Rate    N    EM    KMC(nuDev)    KMC(AnalyDev)    No Censor
    % β =
    % β =
    % β =

(4) Another observation is that the computation times of KMC with the numeric derivative and with the analytic derivative are similar. This is to be expected, as KMC depends on solving (14), which is defined recursively.

To summarize, when the sample size is small and the censoring rate is low, the performance of EM and KMC is similar. But in either the large sample case or the heavily censored case, KMC far outperformed the EM algorithm under the same stopping criterion.

Experiment 2: Consider a right censored data setting with two constraints. The i.i.d. right censored data are generated from (1) using

    X ~ Exp(1),  C ~ Exp(0.7),    (16)

with the following hypothesis:

    H_0: ∑_i g_j(T_i) w_i = 0,  j = 1, 2,  where
    g_1(x) = (1 − x) 1_{(0 ≤ x ≤ 1)} − e^{−1},
    g_2(x) = 1_{(0 ≤ x ≤ 1)} − 1 + e^{−1}.    (17)

It is straightforward to verify that both g functions have expectation zero.

Table 2 Average running time of EM/KMC (in seconds).

    Censoring Rate    N    EM    KMC(nuDev)
    41%
    %

In this simulation study, we observed that EM spent a great amount of time (3s–55s per case) to meet the convergence criterion, while the average running time of KMC was considerably shorter (0.03s–0.08s). This dramatic result shows that, in the multi-dimensional case, KMC runs much faster than the EM algorithm. Only numerical derivatives were used in our simulations; one could implement the analytic ones using the iteration shown previously, but in the multi-dimensional case the iterative derivatives have no advantage over numeric ones. We recommend using KMC with numeric derivatives whenever there is more than one hypothesis, even if the sample size is small.

Experiment 3: A non-exponential survival setting.
Consider right censored data with one constraint:

    X ~ Γ(3, 2),  C ~ U(0, η),    (18)

with hypothesis

    H_0: ∑_i g(T_i) w_i = 0,  with g(x) = x − 1.5

(the Γ(3, 2) distribution, with shape 3 and rate 2, has mean 1.5). We carried out two experiments with different censoring times C_i from the uniform distribution:

[Fig. 2 Contour plot of the −2 log likelihood ratio corresponding to the intercept and the slope of age for the Stanford Heart Transplant data.]

(1) η = 5, then 70.00% of the data are uncensored; (2) η = 3, then 51.34% of the data are uncensored.

Table 3 Average running time of EM/KMC (in seconds) with one constraint and uniformly distributed censoring times.

    Censoring Rate    N    EM    KMC(nuDev)    KMC(AnalyDev)    No Censor
    30.00% η =
    % η =

We found that the results shown in Table 3 are very similar to those in Table 1, which indicates that a different distribution of the censoring times does not affect the simulation results much.

Real Data Example: The speed advantage of the KMC algorithm is more apparent in time consuming analyses such as drawing contour plots. In this real data example, we illustrate the proposed algorithm by analyzing the Stanford Heart Transplant data described in Miller and Halpern (1982), considering a regression with an intercept and a slope term for age. There were 157 patients who received transplantation, among whom 55 were still alive and 102 were deceased. We deleted the cases whose survival times are less than 5 days and used only 152 cases in the analysis, as suggested by Miller and Halpern. To draw the contour plot in Figure 2, 2601 empirical likelihood ratios were calculated. In this example, we used KMC to calculate the empirical likelihoods instead of the EM algorithm described in Zhou and Li (2008).

4 Discussion and Conclusion

In this manuscript, we proposed a new recursive algorithm, KMC, to calculate the mean constrained Kaplan-Meier estimator and the log empirical likelihood ratio statistic for right censored data with linear constraints. Our algorithm uses the Lagrange multiplier method directly and recursively computes the constrained Kaplan-Meier estimator and the test statistic.

Numerical simulations show that this new method has an advantage over the traditional EM algorithm in terms of computational cost. Our simulation work also shows that the performance of KMC does not depend much on the censoring rate, and that it outperformed the EM algorithm in every simulation setting. We recommend using KMC in all cases, but particularly large gains are expected when: (1) the sample size is large (e.g. > 1,000 observations); (2) the data are heavily censored (e.g. censoring rate > 40%); (3) there is more than one constraint.

On the other hand, and somewhat surprisingly, the analytic derivative did not help speed up the computation in our simulation study. Moreover, since KMC with numeric derivatives extends readily to more than one constraint, we highly recommend using numeric derivatives in KMC rather than analytic ones.

One remaining issue with KMC is the choice of initial value, as is the case for most Newton algorithms. The performance of the root solving relies on the precision of the numerical derivative and of the Newton method. Our current strategy uses the M-step output of the EM algorithm after only two iterations. Other, better initial values are certainly possible. In addition, the current KMC only works with right censored data, while the EM algorithm works for right, left or doubly censored data, and even for interval censored data. We were unable to find a proper way to derive a similar recursive computational algorithm in these other censoring cases.

Software is available for download from CRAN as a standard R package (kmc).

Acknowledgements Mai Zhou's research was supported in part by US NSF grant DMS

References

Chen K, Zhou M (2007) Computation of the empirical likelihood ratio from censored data. Journal of Statistical Computation and Simulation 77(12)
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations.
Journal of the American Statistical Association 53(282)
Li G (1995) On nonparametric likelihood ratio estimation of survival probabilities for censored data. Statistics & Probability Letters 25(2)
Miller R, Halpern J (1982) Regression with censored data. Biometrika 69(3)
Murphy SA, van der Vaart AW (1997) Semiparametric likelihood ratio inference. The Annals of Statistics
Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75(2)
Owen AB (2001) Empirical Likelihood. CRC Press
Pan XR, Zhou M (1999) Using one-parameter sub-family of distributions in empirical likelihood ratio with censored data. Journal of Statistical Planning and Inference 75(2)
R Core Team (2014) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria
Thomas DR, Grunkemeier GL (1975) Confidence interval estimation of survival probabilities for censored data. Journal of the American Statistical Association 70(352)
Zhou M (2005) Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm. Journal of Computational and Graphical Statistics 14(3)
Zhou M (2010) A Wilks theorem for the censored empirical likelihood of means.
Zhou M, Li G (2008) Empirical likelihood analysis of the Buckley-James estimator. Journal of Multivariate Analysis 99(4)


More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

ST495: Survival Analysis: Hypothesis testing and confidence intervals

ST495: Survival Analysis: Hypothesis testing and confidence intervals ST495: Survival Analysis: Hypothesis testing and confidence intervals Eric B. Laber Department of Statistics, North Carolina State University April 3, 2014 I remember that one fateful day when Coach took

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria

More information

Pooling multiple imputations when the sample happens to be the population.

Pooling multiple imputations when the sample happens to be the population. Pooling multiple imputations when the sample happens to be the population. Gerko Vink 1,2, and Stef van Buuren 1,3 arxiv:1409.8542v1 [math.st] 30 Sep 2014 1 Department of Methodology and Statistics, Utrecht

More information

Estimation for nonparametric mixture models

Estimation for nonparametric mixture models Estimation for nonparametric mixture models David Hunter Penn State University Research supported by NSF Grant SES 0518772 Joint work with Didier Chauveau (University of Orléans, France), Tatiana Benaglia

More information

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,

More information

Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

More information

Analytical Bootstrap Methods for Censored Data

Analytical Bootstrap Methods for Censored Data JOURNAL OF APPLIED MATHEMATICS AND DECISION SCIENCES, 6(2, 129 141 Copyright c 2002, Lawrence Erlbaum Associates, Inc. Analytical Bootstrap Methods for Censored Data ALAN D. HUTSON Division of Biostatistics,

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1, Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

STAT 6350 Analysis of Lifetime Data. Probability Plotting

STAT 6350 Analysis of Lifetime Data. Probability Plotting STAT 6350 Analysis of Lifetime Data Probability Plotting Purpose of Probability Plots Probability plots are an important tool for analyzing data and have been particular popular in the analysis of life

More information

Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis.

Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis. 10.2 Product-Limit (Kaplan-Meier) Method Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis. Let T be a continuous random

More information

Eco517 Fall 2004 C. Sims MIDTERM EXAM

Eco517 Fall 2004 C. Sims MIDTERM EXAM Eco517 Fall 2004 C. Sims MIDTERM EXAM Answer all four questions. Each is worth 23 points. Do not devote disproportionate time to any one question unless you have answered all the others. (1) We are considering

More information

LSS: An S-Plus/R program for the accelerated failure time model to right censored data based on least-squares principle

LSS: An S-Plus/R program for the accelerated failure time model to right censored data based on least-squares principle computer methods and programs in biomedicine 86 (2007) 45 50 journal homepage: www.intl.elsevierhealth.com/journals/cmpb LSS: An S-Plus/R program for the accelerated failure time model to right censored

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

18.465, further revised November 27, 2012 Survival analysis and the Kaplan Meier estimator

18.465, further revised November 27, 2012 Survival analysis and the Kaplan Meier estimator 18.465, further revised November 27, 2012 Survival analysis and the Kaplan Meier estimator 1. Definitions Ordinarily, an unknown distribution function F is estimated by an empirical distribution function

More information

Proportional hazards regression

Proportional hazards regression Proportional hazards regression Patrick Breheny October 8 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/28 Introduction The model Solving for the MLE Inference Today we will begin discussing regression

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

Constrained estimation for binary and survival data

Constrained estimation for binary and survival data Constrained estimation for binary and survival data Jeremy M. G. Taylor Yong Seok Park John D. Kalbfleisch Biostatistics, University of Michigan May, 2010 () Constrained estimation May, 2010 1 / 43 Outline

More information

Nonparametric Hypothesis Testing and Condence Intervals with. Department of Statistics. University ofkentucky SUMMARY

Nonparametric Hypothesis Testing and Condence Intervals with. Department of Statistics. University ofkentucky SUMMARY Nonparametric Hypothesis Testing and Condence Intervals with Doubly Censored Data Kun Chen and Mai Zhou Department of Statistics University ofkentucky Lexington KY 46-7 U.S.A. SUMMARY The non-parametric

More information

exclusive prepublication prepublication discount 25 FREE reprints Order now!

exclusive prepublication prepublication discount 25 FREE reprints Order now! Dear Contributor Please take advantage of the exclusive prepublication offer to all Dekker authors: Order your article reprints now to receive a special prepublication discount and FREE reprints when you

More information

Stochastic Analogues to Deterministic Optimizers

Stochastic Analogues to Deterministic Optimizers Stochastic Analogues to Deterministic Optimizers ISMP 2018 Bordeaux, France Vivak Patel Presented by: Mihai Anitescu July 6, 2018 1 Apology I apologize for not being here to give this talk myself. I injured

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Biostat 2065 Analysis of Incomplete Data

Biostat 2065 Analysis of Incomplete Data Biostat 2065 Analysis of Incomplete Data Gong Tang Dept of Biostatistics University of Pittsburgh October 20, 2005 1. Large-sample inference based on ML Let θ is the MLE, then the large-sample theory implies

More information

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Journal of Data Science 7(2009), 365-380 Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Jiantian Wang and Pablo Zafra Kean University Abstract: For estimating

More information

A general mixed model approach for spatio-temporal regression data

A general mixed model approach for spatio-temporal regression data A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do

More information

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), 1605 1620 Comparing of some estimation methods for parameters of the Marshall-Olkin generalized exponential distribution under progressive

More information

Group Sequential Designs: Theory, Computation and Optimisation

Group Sequential Designs: Theory, Computation and Optimisation Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1

36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1 36. Multisample U-statistics jointly distributed U-statistics Lehmann 6.1 In this topic, we generalize the idea of U-statistics in two different directions. First, we consider single U-statistics for situations

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

(θ θ ), θ θ = 2 L(θ ) θ θ θ θ θ (θ )= H θθ (θ ) 1 d θ (θ )

(θ θ ), θ θ = 2 L(θ ) θ θ θ θ θ (θ )= H θθ (θ ) 1 d θ (θ ) Setting RHS to be zero, 0= (θ )+ 2 L(θ ) (θ θ ), θ θ = 2 L(θ ) 1 (θ )= H θθ (θ ) 1 d θ (θ ) O =0 θ 1 θ 3 θ 2 θ Figure 1: The Newton-Raphson Algorithm where H is the Hessian matrix, d θ is the derivative

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Bootstrap Procedures for Testing Homogeneity Hypotheses

Bootstrap Procedures for Testing Homogeneity Hypotheses Journal of Statistical Theory and Applications Volume 11, Number 2, 2012, pp. 183-195 ISSN 1538-7887 Bootstrap Procedures for Testing Homogeneity Hypotheses Bimal Sinha 1, Arvind Shah 2, Dihua Xu 1, Jianxin

More information

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

Exam C Solutions Spring 2005

Exam C Solutions Spring 2005 Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Approximation of Average Run Length of Moving Sum Algorithms Using Multivariate Probabilities

Approximation of Average Run Length of Moving Sum Algorithms Using Multivariate Probabilities Syracuse University SURFACE Electrical Engineering and Computer Science College of Engineering and Computer Science 3-1-2010 Approximation of Average Run Length of Moving Sum Algorithms Using Multivariate

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

Censoring mechanisms

Censoring mechanisms Censoring mechanisms Patrick Breheny September 3 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Fixed vs. random censoring In the previous lecture, we derived the contribution to the likelihood

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood

ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood ICSA Applied Statistics Symposium 1 Balanced adjusted empirical likelihood Art B. Owen Stanford University Sarah Emerson Oregon State University ICSA Applied Statistics Symposium 2 Empirical likelihood

More information

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

Testing Equality of Two Intercepts for the Parallel Regression Model with Non-sample Prior Information

Testing Equality of Two Intercepts for the Parallel Regression Model with Non-sample Prior Information Testing Equality of Two Intercepts for the Parallel Regression Model with Non-sample Prior Information Budi Pratikno 1 and Shahjahan Khan 2 1 Department of Mathematics and Natural Science Jenderal Soedirman

More information

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Statistica Sinica 14(2004, 533-546 NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Mai Zhou University of Kentucky Abstract: The non-parametric Bayes estimator with

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Masters Comprehensive Examination Department of Statistics, University of Florida

Masters Comprehensive Examination Department of Statistics, University of Florida Masters Comprehensive Examination Department of Statistics, University of Florida May 6, 003, 8:00 am - :00 noon Instructions: You have four hours to answer questions in this examination You must show

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

A Model for Correlated Paired Comparison Data

A Model for Correlated Paired Comparison Data Working Paper Series, N. 15, December 2010 A Model for Correlated Paired Comparison Data Manuela Cattelan Department of Statistical Sciences University of Padua Italy Cristiano Varin Department of Statistics

More information

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Working Paper 2013:9 Department of Statistics Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Ronnie Pingel Working Paper 2013:9 June

More information