Nonparametric Hypothesis Testing and Condence Intervals with. Department of Statistics. University ofkentucky SUMMARY

Size: px
Start display at page:

Download "Nonparametric Hypothesis Testing and Condence Intervals with. Department of Statistics. University ofkentucky SUMMARY"

Transcription

1 Nonparametric Hypothesis Testing and Condence Intervals with Doubly Censored Data Kun Chen and Mai Zhou Department of Statistics University ofkentucky Lexington KY 46-7 U.S.A. SUMMARY The non-parametric maximum likelihood estimator (NMLE) of the distribution function with doubly censored data can be computed using self-consistent algorithm (Turnbull 1974). We extend the self-consistent algorithm to include a constraint on the NMLE. We then show how this can be used to construct condence intervals and test hypotheses based on the NMLE via empirical likelihood ratio. Finally we present some numerical comparison of the performance of the above method with another method that make use of the inuence functions. AMS 1991 Subject Classication rimary 6G1; secondary 6G. Key Words and hrases Modied Self-consistency Empirical likelihood ratio Inuence function. 1. Introduction When data are doubly censored the nonparametric maximum likelihood estimator (NMLE) of the distribution function have no explicit form and have to be computed via some iteration [Turnbull(1976) Chang and Yang(1987) Zhou(199) Zhan and Wellner(1999)]. The estimation of the asymptotic variance of the NMLE is even more involved [Chang(199)]. Self-consistency algorithm is a general way of computing non-parametric distributional estimators when data are not completely observed [Efron (1967) Turnbull(1974) Tsai and Crowly(198)]. In particular a NMLE of distribution function based on an i.i.d. sample must satisfy the selfconsistent equation [Gill(1989)]. It is a special case of the EM algorithm when the parameter is the distribution function itself [Dempseter Laird and Rubin(1977)]. In this paper we extend the self-consistent algorithm to compute the NMLE under a constraint F (T ) for doubly censored data. This constrained estimator is useful in the construction of empirical likelihood ratio. As an application we show how this in turn will allow us to nd condence intervals and test hypotheses for doubly censored data. Another approach of constructing condence intervals based on the NMLE with doubly censored data is to estimate the inuence function/variance of the NMLE. But the inuence function is not easy to estimate and this approach may not be very practical for larger samples. We will compare the two approaches for one example in this paper. 1

2 Doubly censored data can arise when observations are subject to both right and left censoring i.e. the patients are only watched within a window of observational time otherwise we only know it is below or above the window. Double censoring also arise from pairwise comparisons when there are right censoring in both group as the following example explains. Example In paired comparison experiments we have n pairs of observations based on two treatments. To estimate the treatment dierence it is customary to focus on the n pairwise dierences. However when right censoring occurs in either treatments the pairwise dierence can be right censored or left censored as the following table shows. Trt 1 Trt di(1 ) For a pair of observations if the observation from treatment one is right-censored but treatment two is uncensored their dierence will be right-censored; if the observation from treatment one is uncensored but treatment two is right-censored their dierence will be left-censored; If both observations are uncensored their dierence is uncensored. If both observations are right-censored their dierence can take any value we drop those data from further analysis. Therefore the dierences of paired right-censored data can be doubly censored { having both left and right censored observations. Section 6 will treat this in more detail. } The rest of this section is to formally introduce the relevant notation and basic assumptions. Let 1 ; ; n be positive random variables denoting the sample of lifetimes which is independent and identically distributed with a continuous distribution F. The censoring mechanism is such that i is observable if and only if it lies inside the interval [Z i ;Y i ]. The Z i and Y i are positive random variables with continuous distribution functions G L and G R respectively andz i Y i with probability 1.If i is not inside [Z i ;Y i ] the exact value of i cannot be determined. We only know whether i is less than Z i or greater than Y i and we observe Z i or Y i correspondingly. The variable i is said to be left censored if i <Z i and right censored if i >Y i. The available information may be expressed by a pair of random variables T i ; i where T i max(min( i ;Y i );Z i ) and i 8 >< > 1 if Z i i Y i if i >Y i i 1; ; ;n (11) if i <Z i The modied (constrained) self-consistent equation for doubly censored data is derived in Section. We generalize the self-consistent algorithm to deal with several constraints in Section. We then explain how the constrained self-consistent estimate may be used to obtain condence intervals via empirical likelihood ratio in section 4. Inuence function is discussed in Section. We carry out simulations and apply our algorithm to some doubly censored data in Section 6.

3 . Modied Self-Consistent Equation Under a Constraint Suppose for a given T and we areinterested to test the hypothesis H F (T ) vs H 1 F (T ) 6 (1) As we will see later A key step to accomplish this is to be able to compute the NMLE of F (t) under H from the data (1.1). We shall focus on computing NMLE in this section. First let us write down the likelihood function ( L(F )) for doubly censored data. The likelihood function involves all n observations. However We can decompose the likelihood function into two parts for observations before and after time T. L(F ) L 1 +L (w i )+ [1 F (t i )] + [F (t i )] + i 1;in 1 i ;in 1 i ;in 1 i 1;in (v i )+ [1 F (s i )] + [F (s i )] i ;in i ;in (w i )+ [1 F (T )+F(T ) F (t i )] i 1;in 1 i ;in 1 + [F (t i )] + (v i )+ [1 F (s i )] i ;in 1 i 1;in i ;in + [F (s i ) F (T )+F(T )] () i ;in where L 1 denotes the likelihood function for n 1 observations before time T and L denotes the likelihood function for n observations after time T (n 1 + n n). w i is the jump at ith observation before time T. v i is the jump at ith observation after time T. t i is the observed time before time T. s i is the observed time after time T. Therefore w i and n 1 i1 w i corresponding to jumps before time T ; similarly v i and n i1 v i (1 ) for jumps after time T. Notice that because F (T ) wehave F (T ) F (t i )w i w n1 and F (s i ) F (T ) v 1 + v + + v (i1) The rst part of above likelihood L 1 only involves w i the jumps of F before time T as in (.). Similarly the second part of the likelihood only involves the jumps of F after time T as in (.4). Therefore we can maximize the whole likelihood in two independent steps maximize L 1 and then maximize L. If a distribution function say ^F did maximize the likelihood under H then it must satisfy that the derivative of the likelihood function at this ^F equal to. We dene a directional change of w i in the direction h() and parametrized by to facilitate derivative (under H ). Since w i must sum up to v i must sum up to (1 ) we dene them as follows

4 For jumps before time T we dene and C(). w i w i () ^F (t i ) 1+h(t i ) C() For jumps after time T we dene Clearly D() 1. v i v i () ^F (s i ) 1 1+h(s i ) D() with with C() D() n 1 i1 n i1 ^F (t i ) 1+h(t i ) ; () ^F (s i ) 1+h(s i ) (4) Remark.1 We may also use the following denition in the computation of derivatives. wi () ^F (ti )[1 + h(t i )] with C () ^F (ti )[1 + h(t i )] C () i1;t i <T Since W i ^F (ti ) is the maximum the derivative of the likelihood with respect to must be zero for any direction h(t). This leads to the equation (A.*)(see Appendix A) In particular the choice h(t) I [tu] for 1 <u<1 will give us the modied self-consistent equation under the constraint (1). We shall use the convention 1 F (t i ) jt j >t i F (t j ) ; F (t i ) jt j t i F (t j ) (a) The constrained self-consistent equation for ^F (u) when u T is ^F (u) n 1 8 < where all t i T. i 1;in 1 I[t i u]+ i ;in 1 ^F (u) ^F (ti ) + I[t i u] + 1 F (t i ;in i ) 1 The last term in (.) can also be simplied to 1 ^F (u) 1 F (t i ) i ;in 1 ^F (min(u; ti )) i ;in 1 ^F (t i ) (b) Similarly the constrained self-consistent equation for u>t is ^F (u) n 8 < i ;in t j <t i ^F (t j )I[t j u] ^F (t i ) 9 ; () ^F (u) ^F (si ) I[s i u]+ I[s i u] i 1;in i ;in 1 ^F (si ) 1 [ ^F (u) ] 9 s + j <s i ^F (sj )I[s j u] ^F (s i ) i ;in ^F (s i ) ; (6) 4

5 where all s i >T. Again the last term in (.6) can be simplied to i ;in ^F (min(u; si )) ^F (s i ) For the derivation of above self-consistent equations please see Appendix A. We summerize the result in the following theorem. Theorem.1 The NMLE of F () under hypothesis (.1) based on data (1.1) must satisfy the modied self-consistent equations (.) and (.6). } Using the above modied self-consistent equations we can iteratively compute the estimator of distribution F under the constrain equation by combining the estimator before and after time T. Theorem. The estimator obtained by the above modied self-consistent equations is at least a local maximum of the likelihood function under (1). roof See Appendix B.. Modied Self-Consistent Equations Under Many Constraints In this section we extend the modied self-consistent algorithm to handle hypotheses (constraints) at several times H F (T 1 ) 1 ;F(T ) ;;F(T k ) k H 1 F (T j ) 6 j foratleastonej; 1 j k For simplicity we only give the proof for k. Following the similar steps as in the simple hypothesis we can decompose the -likelihood function into three parts before time T 1 between time T 1 and T and after time T. L(F ) L 1 (F ) + L (F )+L (F ) (w i )+ [1 F (t i )] + [F (t i )] i 1;in 1 i ;in 1 i ;in 1 + (z i )+ [1 F (x i )] + [F (x i )] i 1;in i ;in i ;in + (v i )+ [1 F (s i )] + [F (s i )] i 1;in i ;in i ;in where w i z i and v i are dened as jumps on ith observation before time T 1 between time T 1 and T and after time T respectively; t i x i and s i are observations before time T 1 between time T 1 and T and after time T respectively. There are n 1 observations before time T 1 n observations between time T 1 and T andn observations after time T. The total number of observations is n 1 +n +n n. Note that the constraints are n 1 i1 w i 1 n i1 z i 1 and n i1 v i 1.

6 We dene the jumps before time T 1 and after time T similarly as in section. The jumps between time T 1 and T can be dened as and B() 1. z i z i () ^F (xi ) 1 1+h(x i ) with n i1 ^F (xi ) 1+h(x i ) ; L 1 (F ) and L (F ) here are the same as L 1 (F ) and L (F )insection. We shall focus on L (F ) the -likelihood function between time T 1 and T. The self-consistent equation for T 1 u T is ^F (u) n 8 < i ;in i 1;in I(x i u) 1 1 [ ^F (u) 1 ] [ ^F (u) ^F (xi )]I [ x i u] 1 ^F + (x i ) i ;in 1 ^F (x i ) [ + ^F (min(x i ;u)) 1 ] + i ;in ^F (x i ) where T 1 x i T (For the details see Appendix C). i ;in 4. Empirical Likelihood Ratio Tests And Condence Intervals 1 1 [ ^F (u) 1 ] ^F (x i ) 9 ; (1) Let ~ F be the NMLE of F which maximizes likelihood L(F ) dened in (.) over all distributions and ^F denote the NMLE of F under H which maximizes the likelihood only among distributions that satisfy H.We dene the empirical likelihood ratio function as Our likelihood ratio test statistic is R(H ) R(H ) L( ^F ) L( ~ F ) max H L(F ) max H +H 1 L(F ) ( max L(F )) (max L(F )) H +H 1 H h i (L( F ~ )) (L( ^F )) The method described in section will enable us to compute the constrained NMLE ^F which maximizes the likelihood under H F (T ). Using the usual self-consistent algorithm we can compute NMLE ~ F without any constraint. So once we have these estimates we can easily compute the empirical likelihood ratio. We use chi-square theory to carry out hypothesis testing 6

7 and construct condence intervals. For theory about empirical likelihood ratio see Owen (1987) and Murphy and Van der Varrt (199). If the observed R(H ) is greater than 1; (the 1(1)th percentile of with 1 degree freedom) we reject H at signicance level. To construct the condence interval for F (T ) we can test dierent hypotheses with xed T and various 's and form the 1 condence interval for F (T )as n o R(H F (T )) 1; (41) We can also the construct condence interval for percentile F 1 () as follows. Test many hypotheses with xed value and various T values and form the condence interval as n o T R(H F (T )) 1; (4) In particular a condence interval for the median can be obtained with 1.. Inuence Function of NMLE and Its Estimation Inuence function (or inuence curve) is a general technique to obtain the variance of a random process (and more). In the analysis of ~ F () with doubly censored samples there are three inuence functions corresponding to right left and non-censored observations. Chang (199) computed those asymptotic inuence functions for the process p n( ^F n (t) F (t)) he obtained the following where p n( ^Fn (t) F (t)) " q (n) j (t) p 1 n n i I [zi t; i j] Z T Z T +! IC 1 (t; s)dq (n) 1 (s)+ Z T IC (t; s)dq (n) (s)+o(n) p (1) ; E 1 n i I [zi t; i j]!# IC (t; s)dq (n) (s) j ; 1; >From the above we could try to estimate the three asymptotic inuence functions and then estimate the variance of p n( ^F (t) F (t)) by Z T Z T IC^ 1 (t; s)dq(n) (s)+ 1 Z T ^ IC 1 (t; s)dq (n) 1 (s)+ Z T Z T IC^ (t; s)dq(n) (s)+ ^ IC (t; s)dq (n) (s)+ Z T IC^ (t; s)dq(n) (s) ^ IC (t; s)dq (n) (s)! ^ IC j (t; s) However those inuence functions are only dened via Fredholm integral equations that involve unknown distribution. 7

8 We plug-in the (self consistent) estimate of those distribution functions discretize the Fredholm integral equations into matrix equations and solve for IC. ^ For details see Chang (199) and Numerical Recipes in C. In this approach we need to solve a matrix equation that resulted from discretizing corresponding integral equations for inuence functions. If we discretizing at a few points then the estimate would not be very good. If we use a lot of discretizing points computation is slow. Therefore for large sample sizes this approach is very computationally expensive (to nd the inverse of a large matrix). In our experience when censored observations (both right and left) total exceed it becomes slow in our implementation since we discretizing at observed censoring times. Nevertheless this approach is worth exploring and it is interesting to compare to the approach of empirical likelihood described in Section Applications Simulations and Examples 6.1 Applications aired Comparison In section 1 we gave an example indicating that doubly censored data may result from paired comparison experiments. We shall specify a model for this case and illustrate the use of the testing procedures to test the hypothesis for drug eect when we donotwant tomake parametric assumptions. A reasonable model for the paired experiment isasfollows for ith subject (or pair) (i 1; ;;n)we observe Y 1i and Y i where Y 1i d + S i + 1i ; Y i p + S i + i ; where d ( p ) is the main eect for drug (placebo) S i is the subject eect ki is the random error. The dierence of Y 1i and Y i is D i ( d p )+( 1i i ); which is free from S i a fact that lead many test procedures to be based on the D i 's. If we assume 1i and 1i are exchangable then the median of D i is d p.thusatestofh d p can be carried out by testing if the median of D i is zero. Double censoring on the D i requires our test as described in section 4 and. In the case where ki are i.i.d. with a distribution of exp() 1 (mean zero exponential) D i has double exponential distribution with location parameter d p. Since the sample median is 8

9 MLE of location parameter for double exponential distribution we can expect the test to perform well in this case. We carried out some simulations for this procedure summerized in Table 6.1. Table 6.1 The ercentage Of Rejecting H d p At Dierence Size No Censored Light Censored Medium Censored Heavy Censored n n n n n n When the null hypothesis is true the percentages of rejecting H are very close to the nominal level for small and large samples. When there is dierence between drug and placebo the rejecting percentages decrease with the increases of censoring observations. 6. Simulation Hypothesis Testing In our rst simulation we took normally distributed samples of size 1 and size respectively for each run and each entry in Table 6. was based on runs. Table 6. Generating Normally Distributed Samples i 1; ; ; n i Z i Y i light-censored N( 1;) N( 6;) exp(1) + Z i +8 (1% % censored) medium-censored N( 1;) N( 7;) exp(1) + Z i + (% 4% censored) heavy-censored N( 1;) N( 9;) exp(1) + Z i + (4% 6% censored) Table 6. illustrates the probabilities of rejecting H F (T ) at nominal signicance level (or 1). The percentages were computed as the number of -likelihood ratios greater than critical value 1; 84 (or 1;1 71) divided by. From Table 6. we can see the probabilities of rejecting H are pretty close to the nominal level (or 1). We took exponentially distributed samples of size 1 and size respectively for our second simulation. This simulation was also based on samples. 9

10 Table 6. The ercentage Of Rejecting H F (T ) At And 1 T Sample Size Light Censored Medium Censored Heavy Censored 1 n 1 8% / 16% % / 14% 466% / 16% n 46% / 176% 48% / 11% 49% / 114% n 1 4% / 1% % / 14% 7% / 144% n 1% / 18% 68% / 11% 8% / 17% Table 6.4 Generating Exponentially Distributed Samples i 1; ; ; n i Z i Y i light-censored exp() exp(1) exp(1) + Z i +1 (1% % censored) medium-censored exp() exp(8) exp(1) + Z i + (% 4% censored) heavy-censored exp() exp() exp(1) + Z i (4% 6% censored) Again Table 6. shows that the probabilities of rejecting H (or 1). are around the nominal level Table 6. The ercentage Of Rejecting H F (T ) At And 1 T Sample Size Light Censored Medium Censored Heavy Censored 616 n 1 418% / 96% 4% / 98% 494% / 998% n 48% / 968% 47% / 974% 4% / 1% n 1 47% / 864% 46% / 94% 448% / 98% n 4% / 118% 48% / 118% 46% / 168% Figure 6.1 and 6. are Q-Q plots of -likelihood ratios for the above twosimulations verse the percentiles. At the point 84 (or 71) if the -likelihood ratio line is above the (1) dashed line (4 line) the rejecting probability is greater than % (or 1%). Otherwise the rejecting probability is less than % (or 1%). Remark 6.1 The Q-Q plots for exponentially distributed light-censored simulations (Figure 6.1(d) and 6. (d)) are somehow more discrete than the others. We did uncensored simulations with exponential distribution the Q-Q plot is similar to Figure 6.1(d) and 6. (d). Since the constraint isf (T ) ^F and F ~ only depend on the number of observations(n1 ) before and (n ) after time T. If there are the same n 1 and n in two uncensored samples -likelihood ratio will be the same. That is why uncensored and light-censored plots look more discrete. The censoring somehow smoothed the plot and made the chi-square approximation better. 1

11 6. Example Condence intervals { A case study We compare the condence intervals obtained via empirical likelihood ratio as in section 4 to condence intervals obtained by directly estimating the asymptotic variance of ^F (T ) and then form the Wald (1 ) condence interval ^F (T ) Z (1) q ^ Var[ ^F (T )] (61) where Z (1) is the 1(1)th percentile of a standard normal distribution. Better condence intervals may be obtained on a transformed scale. We may use the - transformation to obtain ( ^S() 1 ^F ()) q [ ^S(T ) b ; ^S(T ) 1b ] ; where b exp4 Z (1) Var[ ^ ^S(T )] (6) ^S(T )j ^S(T )j or the it transformation to obtain where and a b e a [ 1+e a ; e b 1+e b ] ; (6) ^F (T ) 1 ^F (T ) Z q (1) ^F (T )(1 ^F Var[ ^ ^F (T )] (T )) ^F (T ) 1 ^F (T ) + Z (1) ^F (T )(1 ^F (T )) q ^ Var[ ^F (T )] It is not easy to obtain the Wald type condence interval for the median (or other quantiles). But we can invert the test of F (T ) with estimated variance. Therefore the condence set for median may be obtained as the set of points T that satisfy the following condition 8 < T Z (1) ^F (T ) q Z (1) Var[ ^ ^F (T )] ; (64) To construct condence intervals we will use the following doubly censored data as our example. Turnbull and Weiss (1978) reported part of a study conducted at Stanford-alo Alto eer Counseling rogram (see Hamburg et al. (197) for details of the study). In this study 191 California high school boys were asked \When did you rst use marijuana?" The answers are either the exact age (uncensored observations) or \I never used it" which are right-censored observations at the boys' current ages or \I have used it but can not recall just when the rst time was" which are left-censored observations. Table 6.6 shows the results of this study. The estimated median age of high school boys who use marijuana is 14 years old. 9 11

12 Suppose we are interested to obtain 9% condence interval for the median age of rst time marijuana use. Table 6.7 shows the 9% condence intervals for the median age with (4.) and (6.4). Table 6.8 shows four dierent kinds of 9% condence intervals for at the median age with (4.1) (6.1) (6.) and (6.). Table 6.7 9% Condence Intervals For Median Age(14) method Condence Interval T R(H F (T )) 84 (4) (11; ) T 196 ^F (t) p Var[ ^ ^F (t)] 196 (6.4) (1; 19) Table 6.8 9% Condence Intervals For F (14) method Condence Interval R(H F (14) ) 84 (41) (79876; 71987) [1 ^S(14) b ; 1 ^S(14) 1b ] (6) (786; ) e [ a e 1+e ; b ] (6) (171994; 9844) a 1+e b ^F (14) 196q Var[ ^ (14)] (61) (8; 111) From Table 6.8 we can see the 9% condence interval obtained by empirical likelihood ratio (4.1) is narrower than any other three intervals with (6.1) (6.) and (6.). The two transformed intervals obtain by (6.) and (6.) are wider than (4.1) but close to each other. We do not know which transformation is better. We believe the empirical likelihood ratio approach is better because we do not need to know what is the best transformation and simulation in the previous section show the chi square approximation is pretty accurate. The Wald condence interval (6.1) includes the numbers less than and greater than 1. These are not reasonable condence limits for a distribution since a distribution must be between and 1. Therefore in this example we conclude that the empirical likelihood ratio approach works better than Wald's approach. 6.4 Software Used The simulation was carried out using Splus.4 for Unix on the H workstations. The Splus functions that computes the constrained NMLE with doubly censored data will be uploaded to Statlib in the near future as an update to the function d9newr that was there since June 199. References Andersen.K. Borgan O. Gill R. and Keiding N. (199) Statistical Models Based on Counting rocesses. Springer New York. 1

13 Chang M. N. and Yang G. L. (1987) Strong Consistency of a Nonparametric Estimator of the Survival Function with Doubly Censored Data. Ann. Statist Chang M. N. (199). Weak Convergence in Doubly Censored Data. Ann. Statist Dempseter A.. Laird N. M. and Rubin D. B. (1977). Maximum Likelihood From Incomplete Data via The EM Algorithm with Discussion). J. Roy. Statist. Soc. Ser. B Efron B. (1967). The Two Sample roblem with Censored Data. roc. Fifth Berkeley Symp. Math. Statist. robab Gill R. (1989) Non-and Semi-parametric Maximum Likelihood Estimator and the Von Mises Method (I) Scand. J. Statist Hamberg B.A. Kraemer H.C. and Jahnke W. (197). A Hierarchy of Drug Use in Adolescence Behavioral and Attitudinal Correlates of Substantial Drug Use. American Journal of sychiatry 1 (197) Kaplan E. and Meier. (198) Non-parametric Estimator From Incomplete Observations JASA 47{ 481. Murphy S. and Van der Varrt (1997). Semiparametric Likelihood Ratio Inference. Ann. Statist Owen A. (1988). Empirical Likelihood Ratio Condence Intervals for a Single Functional. Biometrika ress W. Teukolsky S. Vetterling W. and Flannery B. (199) Numerical Recipes in C The Art of Scientic Computing. Cambridge Univ. ress. Turnbull B. (1974) Nonparametric Estimation of a Survivorship Function with Doubly Censored Data. JASA Turnbull B. (1976) The Empirical Distribution Function with Arbitrarily Grouped Censored and Truncated Data. JRSS B 9-9. Turnbull B. and Weiss (1978). A Likelihood Ratio Statistic for Testing Goodness of Fit with Randomly Censored Data. Biometrics 4 (1978) Wei-Yann Tsai and John Crowley (198) A Large Sample Study of Generalized Maximum Likelihood Estimator From Incomplete Data Via Self-Consistency. Ann. Statist. 1 No Zhan Y. and Wellner J. (1999). A hybrid algorithm for computation of the NMLE from censored data. JASA Zhou M.(199) http//lib.stat.cmu.edu/s/d9newr. Appendix A Derivation of the Self-consistent Equation Before Time T (a) The likelihood function before time T L 1 i1;in 1 (w i )+ i1;in 1 (w i )+ [1 F (t i )] + 4 (1 )+ To facilitate derivative we substitute w i () as in (.) L 1 i1;in 1 + i;in 1 ^F(ti) 1+h(t i ) C() + ^F(tj) 1+h(t j ) t j<t i;jn 1 C() 1 i;in 1 [F (t i )] t j>t i;jn 1 w j + 4 (1 )+ i;in 1 1+h(t j ) t j>t i;jn 1 t j<t i;jn 1 w j C() 1 A

14 C() + i1;in n 1 i;in 1 + C() + i;in C() + i1;in 1 C()+ i1;in 1 t j<t i;jn 1 ^F(ti) 1+h(t i ) + i;in 1 ^F(tj) 1+h(t j ) t j>t i;jn 1 ^F(ti) 1+h(t i ) + ^F(tj) 1+h(t j ) Now we are ready to take derivative with respect L n 1 C () C() + If we set the above can be simplied L j n 1 n 1 i1 i;in 1 ^F(t i )h(t i ) This derivative must be zero since ^F is the NMLE. Thus we get n 1 i1 ^F(t i )h(t i ) n 1 8 < ^F(tj) 1+h(t j ) t j<t i;jn 1 i1;in 1 4 h(t i ) 1+h(t i ) C() (1 )C() + 1 C () t j>t i ^F (t j)h(t j) [1+h(t j)] 1 C()+ t j>t i ^F (t j) 1+h(t j) t j<t i ^F (t j)h(t i) [1+h(t j)] t j<t i ^F (t j) 1+h(t j ) i1;in 1 h(t i ) t j>t i ^F(tj )h(t j ) (1 )+ t j>t i ^F(tj ) i1;in 1 h(t i )+ t + j>t i ^F(t j )h(t j ) 1 F (t i ) If we take h(t i )I[t i u] for u T (A) becomes equation (.). 1 i;in 1 ^F(tj) 1+h(t j ) t j>t i;jn 1 1 n1 ^F(ti i1 )h(t i ) (1 )+ t j>t i ^F(t j ) n1 i1 ^F(ti )h(t i ) 1 F (t i ) + i;in 1 t j<t i ^F(tj )h(t j ) t j<t i ^F(tj ) t j<t i ^F(t j )h(t j ) ^F (t i ) 9 ; (A) Derivation of the Self-consistent Equation After Time T Similarly we can obtain the constrained self-consistent equation for ^F when u>t. 14

15 (b) The likelihood function after time T is L i1;in (v i )+ i1;in (v i )+ Again substituting v i () asin(.4) L i1;in + i;in i;in [1 F (s i )] + i;in ( ^F(si) 1 1+h(s i ) 4 + i1;in 1 D() + + i;in + i;in n 1 D() + + i;in D() + s j>s i;jn v j )+ ^F(sj) 1+h(s j ) s j <s i i1;in ^F(sj) 1+h(s j ) + s j >s i 4 1 D()+ i1;in Taking derivative with respect to it i;in 1 D() ^F(si) 1+h(s i ) + i;in [F (s i )] i;in ( + ^F(sj) 1 1+h(s s j ) D() j>s i 1 D() i;in ^F(sj) 1+h(s j ) s j<s i ^F(si) 1+h(s i ) D()+ n D () D() i;in + i;in If we set and the derivative must be zero thus n i1 ^F(si )h(s i ) 1 n 8 < + i;in i1;in h(s i )+ 1 i;in 1 D() i;in ^F(sj) 1+h(s j ) s j <s i i1;in h(s i ) [1 + h(s i )] s j>s i ^F (s j )h(s j) [1+h(s j )] s j>s i ^F (s j) 1+h(s j) s j>s i () ^F (s j)h(s i) 1 D s j<s i [1+h(s j )] D()+ ^F (s j) 1 s j<s i 1+h(s j) i;in n i1 ^F(sj )h(s j ) + s j <s i ^F(sj ) s j>s i ^F(sj )h(s j ) s j>s i ^F(sj ) + i;in If we take h(s i )I[s i u] for u>t the above equation becomes (.6). s j<s i;jn v j ) ^F(sj) 1+h(s j ) s j <s i ^F(s j )h(s j ) + s j<s i ^F(sj ) 9 ; 1

16 Appendix B roof of Theorem. Taking the second derivative of likelihood function L 1 with respect to and setting and h(t i )I [tiu] we can simplify the derivative as L j n 1 ^F(u)+ n 1 [ ^F(u)] + Substituting ^F(u) as in (.) where L j n 1 a i i1 + + n 1 i;in 1 1 ^F(u) 1 ^F (ti ) + i1;in 1 I [tiu] ^F(u) ^F(ti ) 1 ^F (ti ) ^F(u)+[^F(u) ^F(t i )]I [tiu] ^F(min(u; t i ) ^F(t i ) i1;in 1 I [tiu] + n1 [ ^F(u)] i;in 1 8 > < n 1 > 1 n 1 + i;in 1 [1 ^F(t i )] i;in 1 n 1 ^F (u)+[^f(u) ^F (t i )]I [tiu] " ^F(min(u; ti )) < n 1 1 n 1 n 1 n 1 a i i1 ^F(t i ) i1;in 1 I [t iu] + i1 ^F(min(u; t i )) ^F(t i ) a i " 1 n 1 n 1 [1 ^F (t i )] # # 9 a i ; i1 I + [t iu] i1;in 1 " # ^F(min(u; ti )) + i;in 1 i1;in 1 I [tiu] + ^F(t i ) 16 o! " # 9 ^F(u) ; o I [tiu] ( ^F(min(u; ti ) ^F(t i ) ) n o 1 ^F (u)+[^f(u) ^F (ti )]I [tiu] [1 ^F(t i )] n o 1 ^F (u)+[^f(u) ^F (t i )]I [tiu] 1 [1 ^F(t i )] ^F (u)+[^f(u) ^F (t i )]I [tiu] [1 ^F (ti )]

17 + i;in 1 ^F (min(u; t i ) ^F (t i ) By (.) and Cauchy-Schwartz Inequality it is clear that the second derivative of L 1 is non-positive. The second derivative ofl is also non-positive by following the similar steps. } L Appendix C Derivation of the self-consistent Equations Between Time T 1 and T i1;in (z i )+ i;in [1 F (x i )] + i;in [F (x i )] (z i )+ [1 F (T )+F(T ) F (x i )] i1;in i;in + [F (x i ) F (T 1 )+F(T 1 )] i;in (z i )+ 4 (1 )+ i1;in i;in lug z i () into L into above equation we get L i1;in + " ^F(xi ) 1+h(x i ) 4 1 i;in x j<x i;jn 1 + i1;in i1;in i;in + i;in n i1;in # x j>x i;in z j (1 )+ i;in ^F(xj) h(x j ) ^F(xi) 1+h(x i ) 1 (1 )+ x j>x i;jn i;in + i;in 4 x j<x i;jn Taking derivative with respective n B () x j>x i;jn ^F(xj) 1+h(x j ) + 1 ^F(xi) 1+h(x i ) ^F(xj) 1+h(x j ) x j>x i;jn ^F(xj) 1+h(x j ) i1;in 1 17 ^F(xj) 1+h(x j ) h(x i ) 1+h(x i ) i;in 1 1 x j>x i;jn 1 4 x j<x i;in z j + 1 ^F(xj) 1+h(x j ) 1

18 + + i;in 1 B 1 () ^F (x j)h(x j ) x j>x i [1+h(x j )] ^F (x j) x j>x i 1+h(x j ) ^F (x j)h(x i) x j <x i;jn [1+h(x j + )] 1 x ^F (x j) + j <x i;jn 1+h(x j) 1 B () 1 1 If we set the above equation can be simplied j n n 1 i1 i;in i;in ^F(x i )h(x i ) i1;in h(x i ) 1 n ^F(x 1 i1 i )h(x i )+ x j >x i ^F(x j )h(x j ) (1 )+ x j>x i ^F(xj ) x j<x i ^F(xj )h(x j )+ 1 n 1 i1 ^F(xi )h(x i ) 1 + x j <x i ^F (x j ) Since the derivative is equal to zero we rearrange the terms and obtain n i1 ^F(xi )h(x i ) 1 n 8 < + i;in i1;in h(x i ) 1 n 1 i1 ^F(xi )h(x i )+ x j>x i ^F(xj )h(x j ) + i;in x j<x i ^F(x j )h(x j )+ (1 )+ x j >x i ^F (x j ) 1 n x j <x i ^F(xj ) i1 ^F(x i )h(x i ) Now we take h(x i )I [xiu] for T 1 u T the above equation can be rewritten as (.1). 9 ; 18

19 Table 6.6 Marijuana Use In High School Boys Age Number of Exact Number Who Have Yet Number Who Have Started Observations to Smoke Marijuana Smoking at an Earlier Age >

20 Normally Distributed Light-Censored Simulation Normally Distributed Medium-Censored Simulation -likelihood Ratios Ho F(1). Ho F(8).186 -likelihood Ratios Ho F(1). Ho F(8) Chi-Square ercentiles (a) Chi-Square ercentiles (b) Normally Distributed Heavy-Censored Simulation Exponentially Distributed Light-Censored Simulation -likelihood Ratios Ho F(1). Ho F(8).186 -likelihood Ratios Ho F(.).616 Ho F(.) Chi-Square ercentiles (c) Chi-Square ercentiles (d) Exponentially Distributed Medium-Censored Simulation Exponentially Distributed Heavy-Censored Simulation -likelihood Ratios Ho F(.).616 Ho F(.) likelihood Ratios Ho F(.).616 Ho F(.) Chi-Square ercentiles (e) Chi-Square ercentiles (f) Figure 6.1 Q-Q lot of -likelihood Ratios vs. (1) ercentiles For Sample Size 1

21 Normally Distributed Light-Censored Simulation Normally Distributed Medium-Censored Simulation -likelihood Ratios Ho F(1). Ho F(8).186 -likelihood Ratios Ho F(1). Ho F(8) Chi-Square ercentiles (a) Chi-Square ercentiles (b) Normally Distributed Heavy-Censored Simulation Exponentially Distributed Light-Censored Simulation -likelihood Ratios Ho F(1). Ho F(8).186 -likelihood Ratios Ho F(.).616 Ho F(.) Chi-Square ercentiles (c) Chi-Square ercentiles (d) Exponentially Distributed Medium-Censored Simulation Exponentially Distributed Heavy-Censored Simulation -likelihood Ratios Ho F(.).616 Ho F(.) likelihood Ratios Ho F(.).616 Ho F(.) Chi-Square ercentiles (e) Chi-Square ercentiles (f) Figure 6. Q-Q lot of likelihood Ratios vs. (1) ercentiles For Sample Size 1

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky Summary The empirical likelihood ratio method is a general nonparametric

More information

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky Summary Empirical likelihood ratio method (Thomas and Grunkmier 975, Owen 988,

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood Mai Zhou Yifan Yang Received:

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

and Comparison with NPMLE

and Comparison with NPMLE NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

Empirical Likelihood in Survival Analysis

Empirical Likelihood in Survival Analysis Empirical Likelihood in Survival Analysis Gang Li 1, Runze Li 2, and Mai Zhou 3 1 Department of Biostatistics, University of California, Los Angeles, CA 90095 vli@ucla.edu 2 Department of Statistics, The

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC Mantel-Haenszel Test Statistics for Correlated Binary Data by Jie Zhang and Dennis D. Boos Department of Statistics, North Carolina State University Raleigh, NC 27695-8203 tel: (919) 515-1918 fax: (919)

More information

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

Approximate Self Consistency for Middle-Censored Data

Approximate Self Consistency for Middle-Censored Data Approximate Self Consistency for Middle-Censored Data by S. Rao Jammalamadaka Department of Statistics & Applied Probability, University of California, Santa Barbara, CA 93106, USA. and Srikanth K. Iyer

More information

Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Depar

Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Depar Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Department of Mathematical Sciences, University of Bath,

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Approximate self consistency for middle-censored data

Approximate self consistency for middle-censored data Journal of Statistical Planning and Inference 124 (2004) 75 86 www.elsevier.com/locate/jspi Approximate self consistency for middle-censored data S. Rao Jammalamadaka a;, Srikanth K. Iyer b a Department

More information

ST495: Survival Analysis: Hypothesis testing and confidence intervals

ST495: Survival Analysis: Hypothesis testing and confidence intervals ST495: Survival Analysis: Hypothesis testing and confidence intervals Eric B. Laber Department of Statistics, North Carolina State University April 3, 2014 I remember that one fateful day when Coach took

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Journal of Data Science 7(2009), 365-380 Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Jiantian Wang and Pablo Zafra Kean University Abstract: For estimating

More information

Package emplik2. August 18, 2018

Package emplik2. August 18, 2018 Package emplik2 August 18, 2018 Version 1.21 Date 2018-08-17 Title Empirical Likelihood Ratio Test for Two Samples with Censored Data Author William H. Barton under the supervision

More information

Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data

Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data Mai Zhou and Julia Luan Department of Statistics University of Kentucky Lexington, KY 40506-0027, U.S.A.

More information

exclusive prepublication prepublication discount 25 FREE reprints Order now!

exclusive prepublication prepublication discount 25 FREE reprints Order now! Dear Contributor Please take advantage of the exclusive prepublication offer to all Dekker authors: Order your article reprints now to receive a special prepublication discount and FREE reprints when you

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

Analysis of incomplete data in presence of competing risks

Analysis of incomplete data in presence of competing risks Journal of Statistical Planning and Inference 87 (2000) 221 239 www.elsevier.com/locate/jspi Analysis of incomplete data in presence of competing risks Debasis Kundu a;, Sankarshan Basu b a Department

More information

Goodness-of-Fit Tests With Right-Censored Data by Edsel A. Pe~na Department of Statistics University of South Carolina Colloquium Talk August 31, 2 Research supported by an NIH Grant 1 1. Practical Problem

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Statistica Sinica 14(2004, 533-546 NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Mai Zhou University of Kentucky Abstract: The non-parametric Bayes estimator with

More information

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Jian Huang Department of Statistics and Actuarial Science University of Iowa, Iowa City, IA 52242 Email: jian@stat.uiowa.edu

More information

1. Introduction In many biomedical studies, the random survival time of interest is never observed and is only known to lie before an inspection time

1. Introduction In many biomedical studies, the random survival time of interest is never observed and is only known to lie before an inspection time ASYMPTOTIC PROPERTIES OF THE GMLE WITH CASE 2 INTERVAL-CENSORED DATA By Qiqing Yu a;1 Anton Schick a, Linxiong Li b;2 and George Y. C. Wong c;3 a Dept. of Mathematical Sciences, Binghamton University,

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

A Bivariate Weibull Regression Model

A Bivariate Weibull Regression Model c Heldermann Verlag Economic Quality Control ISSN 0940-5151 Vol 20 (2005), No. 1, 1 A Bivariate Weibull Regression Model David D. Hanagal Abstract: In this paper, we propose a new bivariate Weibull regression

More information

Wei Pan 1. the NPMLE of the distribution function for interval censored data without covariates. We reformulate the

Wei Pan 1. the NPMLE of the distribution function for interval censored data without covariates. We reformulate the Extending the Iterative Convex Minorant Algorithm to the Cox Model for Interval-Censored Data Wei Pan 1 1 The iterative convex minorant (ICM) algorithm (Groeneboom and Wellner, 1992) is fast in computing

More information

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Tak Wai Chau February 20, 2014 Abstract This paper investigates the nite sample performance of a minimum distance estimator

More information

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

Power Calculations for Preclinical Studies Using a K-Sample Rank Test and the Lehmann Alternative Hypothesis

Power Calculations for Preclinical Studies Using a K-Sample Rank Test and the Lehmann Alternative Hypothesis Power Calculations for Preclinical Studies Using a K-Sample Rank Test and the Lehmann Alternative Hypothesis Glenn Heller Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center,

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation

More information

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ ADJUSTED POWER ESTIMATES IN MONTE CARLO EXPERIMENTS Ji Zhang Biostatistics and Research Data Systems Merck Research Laboratories Rahway, NJ 07065-0914 and Dennis D. Boos Department of Statistics, North

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

Maximum likelihood estimation of a log-concave density based on censored data

Maximum likelihood estimation of a log-concave density based on censored data Maximum likelihood estimation of a log-concave density based on censored data Dominic Schuhmacher Institute of Mathematical Statistics and Actuarial Science University of Bern Joint work with Lutz Dümbgen

More information

Statistical Analysis of Competing Risks With Missing Causes of Failure

Statistical Analysis of Competing Risks With Missing Causes of Failure Proceedings 59th ISI World Statistics Congress, 25-3 August 213, Hong Kong (Session STS9) p.1223 Statistical Analysis of Competing Risks With Missing Causes of Failure Isha Dewan 1,3 and Uttara V. Naik-Nimbalkar

More information

Nonparametric Estimation of Distributions in a Large-p, Small-n Setting

Nonparametric Estimation of Distributions in a Large-p, Small-n Setting Nonparametric Estimation of Distributions in a Large-p, Small-n Setting Jeffrey D. Hart Department of Statistics, Texas A&M University Current and Future Trends in Nonparametrics Columbia, South Carolina

More information

Testing Goodness-of-Fit for Exponential Distribution Based on Cumulative Residual Entropy

Testing Goodness-of-Fit for Exponential Distribution Based on Cumulative Residual Entropy This article was downloaded by: [Ferdowsi University] On: 16 April 212, At: 4:53 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Sample Size Determination

Sample Size Determination Sample Size Determination 018 The number of subjects in a clinical study should always be large enough to provide a reliable answer to the question(s addressed. The sample size is usually determined by

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), 1605 1620 Comparing of some estimation methods for parameters of the Marshall-Olkin generalized exponential distribution under progressive

More information

GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1)

GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1) Working Paper 4-2 Statistics and Econometrics Series (4) July 24 Departamento de Estadística Universidad Carlos III de Madrid Calle Madrid, 26 2893 Getafe (Spain) Fax (34) 9 624-98-49 GOODNESS-OF-FIT TEST

More information

On the generalized maximum likelihood estimator of survival function under Koziol Green model

On the generalized maximum likelihood estimator of survival function under Koziol Green model On the generalized maximum likelihood estimator of survival function under Koziol Green model By: Haimeng Zhang, M. Bhaskara Rao, Rupa C. Mitra Zhang, H., Rao, M.B., and Mitra, R.C. (2006). On the generalized

More information

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Investigation of goodness-of-fit test statistic distributions by random censored samples

Investigation of goodness-of-fit test statistic distributions by random censored samples d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit

More information

A time-to-event random variable T has survivor function F de ned by: F (t) := P ft > tg. The maximum likelihood estimator of F is ^F.

A time-to-event random variable T has survivor function F de ned by: F (t) := P ft > tg. The maximum likelihood estimator of F is ^F. Empirical Likelihood F. P. Treasure 12-May-05 3010a Objective: to obtain point- and interval- estimates of time-to-event probabilities using a non-parametric, empirical, maximum likelihood estimate of

More information

Testing for a Global Maximum of the Likelihood

Testing for a Global Maximum of the Likelihood Testing for a Global Maximum of the Likelihood Christophe Biernacki When several roots to the likelihood equation exist, the root corresponding to the global maximizer of the likelihood is generally retained

More information

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Empirical Processes & Survival Analysis. The Functional Delta Method

Empirical Processes & Survival Analysis. The Functional Delta Method STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will

More information

Asymptotic Properties of Kaplan-Meier Estimator. for Censored Dependent Data. Zongwu Cai. Department of Mathematics

Asymptotic Properties of Kaplan-Meier Estimator. for Censored Dependent Data. Zongwu Cai. Department of Mathematics To appear in Statist. Probab. Letters, 997 Asymptotic Properties of Kaplan-Meier Estimator for Censored Dependent Data by Zongwu Cai Department of Mathematics Southwest Missouri State University Springeld,

More information

A comparison study of the nonparametric tests based on the empirical distributions

A comparison study of the nonparametric tests based on the empirical distributions 통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical

More information

The iterative convex minorant algorithm for nonparametric estimation

The iterative convex minorant algorithm for nonparametric estimation The iterative convex minorant algorithm for nonparametric estimation Report 95-05 Geurt Jongbloed Technische Universiteit Delft Delft University of Technology Faculteit der Technische Wiskunde en Informatica

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0, Accelerated failure time model: log T = β T Z + ɛ β estimation: solve where S n ( β) = n i=1 { Zi Z(u; β) } dn i (ue βzi ) = 0, Z(u; β) = j Z j Y j (ue βz j) j Y j (ue βz j) How do we show the asymptotics

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and

More information

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

More information

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 4 Fall 2012 4.2 Estimators of the survival and cumulative hazard functions for RC data Suppose X is a continuous random failure time with

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( )

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( ) COMSTA 28 pp: -2 (col.fig.: nil) PROD. TYPE: COM ED: JS PAGN: Usha.N -- SCAN: Bindu Computational Statistics & Data Analysis ( ) www.elsevier.com/locate/csda Transformation approaches for the construction

More information

Maximum likelihood estimation for Cox s regression model under nested case-control sampling

Maximum likelihood estimation for Cox s regression model under nested case-control sampling Biostatistics (2004), 5, 2,pp. 193 206 Printed in Great Britain Maximum likelihood estimation for Cox s regression model under nested case-control sampling THOMAS H. SCHEIKE Department of Biostatistics,

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Rene Tabanera y Palacios 4. Danish Epidemiology Science Center. Novo Nordisk A/S Gentofte. September 1, 1995

Rene Tabanera y Palacios 4. Danish Epidemiology Science Center. Novo Nordisk A/S Gentofte. September 1, 1995 Estimation of variance in Cox's regression model with gamma frailties. Per Kragh Andersen 2 John P. Klein 3 Kim M. Knudsen 2 Rene Tabanera y Palacios 4 Department of Biostatistics, University of Copenhagen,

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Editorial Manager(tm) for Lifetime Data Analysis Manuscript Draft. Manuscript Number: Title: On An Exponential Bound for the Kaplan - Meier Estimator

Editorial Manager(tm) for Lifetime Data Analysis Manuscript Draft. Manuscript Number: Title: On An Exponential Bound for the Kaplan - Meier Estimator Editorial Managertm) for Lifetime Data Analysis Manuscript Draft Manuscript Number: Title: On An Exponential Bound for the Kaplan - Meier Estimator Article Type: SI-honor of N. Breslow invitation only)

More information

Likelihood Ratio Tests and Intersection-Union Tests. Roger L. Berger. Department of Statistics, North Carolina State University

Likelihood Ratio Tests and Intersection-Union Tests. Roger L. Berger. Department of Statistics, North Carolina State University Likelihood Ratio Tests and Intersection-Union Tests by Roger L. Berger Department of Statistics, North Carolina State University Raleigh, NC 27695-8203 Institute of Statistics Mimeo Series Number 2288

More information

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2 Problem.) I will break this into two parts: () Proving w (m) = p( x (m) X i = x i, X j = x j, p ij = p i p j ). In other words, the probability of a specific table in T x given the row and column counts

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information