Lecture 5 Models and methods for recurrent event data


Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

Recurrent events (focused topic)
- time-to-events model (point process model)
- time-between-events model (gap times model)
- e.g. repeated infections/hospitalizations/tumor occurrences

Ordered multiple events
- HIV → AIDS → death
- birth → onset age of a genetic disease → death
- disease staging I → II → III → IV

Unordered multiple events

Time-to-events and time-between-events models

Time-to-events models
- Interest focuses on the occurrence rate of recurrent events over time.
- Time is measured from the time origin to the events.
- The time origin could be a fixed calendar time, the onset of treatment, or a biological event.

Time-between-events models
- The outcome variables of interest are the gap times between events.
- This type of model is more relevant when the cyclic pattern of recurrent events is strong; for example, women's menstrual cycles.

5.1 Time-to-events models

Consider a continuous point process $N(t)$, where $N(t)$ represents the number of events occurring at or prior to $t$, $0 \le t \le \tau$.

Intensity function. The intensity function of a continuous point process in $[0, \tau]$ is conventionally defined as the occurrence rate of events given the event history,

$$\lambda(t \mid N^H(t)) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t + \Delta) - N(t) > 0 \mid N^H(t))}{\Delta},$$

where $N^H(t) = \{N(u) : 0 \le u \le t\}$ represents the history of the point process before or at $t$, $t \in [0, \tau]$.

Remarks
- The intensity function uniquely determines the probability structure of the point process under regularity conditions.
- For recurrent events, the so-called conditional regression models are constructed on the basis of the intensity function.

Rate function. In contrast with the conditional interpretation of the intensity function, a rate function $\lambda(t)$, $t \in [0, \tau]$, is defined as the average number of events per unit time at $t$ for subjects in the population. More precisely,

$$\lambda(t) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t + \Delta) - N(t) > 0)}{\Delta},$$

namely, the occurrence rate at $t$, unconditional on the event history $N^H(t)$.

Remarks
- In general, a rate function by itself does not fully determine the probability structure of the point process.
- The rate function is conceptually and quantitatively different from the intensity function, and it coincides with the intensity function only when the process is memoryless.
- For recurrent events, the so-called marginal regression models are constructed on the basis of the rate function.

Define the cumulative rate function (CRF) as

$$\Lambda(t) = \int_0^t \lambda(u)\,du, \quad t \in [0, T_0].$$

The CRF $\Lambda(t)$ is also the expected number of recurrent events occurring in $[0, t]$. Since $E[N(t)] = \Lambda(t)$, we frequently write $E[dN(t)] = \lambda(t)\,dt$.

5.1.1 Poisson process models

A Poisson process is a counting process model for multiple events occurring over a fixed time interval $[0, \tau]$, $\tau > 0$. The Poisson distribution is the probability distribution of the total number of events, $M$. The Poisson distribution is also sometimes used for modelling a count variable in other situations.

A point process is a stationary Poisson process if the following three conditions are satisfied (sketch):
1. The probability that exactly one event occurs in a small interval $[t, t + h]$ is approximately $\lambda h$, where $\lambda > 0$ is called the intensity (or rate) of events.
2. The probability that 2 or more events occur at the same time is approximately 0.
3. The numbers of events in disjoint regions are independent.

Let $\mu = \lambda\tau > 0$. The probability mass function of $M$ is

$$f_M(m) = \frac{e^{-\mu}\mu^m}{m!}, \quad m = 0, 1, 2, \dots$$

A Poisson process is called a non-stationary Poisson process if the occurrence rate, $\lambda(t)$, is time dependent.
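The link between the stationary Poisson process and the Poisson distribution of $M$ can be checked by simulation. The following is a minimal sketch, not part of the lecture: it assumes the standard fact that gaps between events of a stationary Poisson process are independent exponential variables with rate $\lambda$, and the names `poisson_pmf` and `simulate_event_times`, the seed, and the values of `lam` and `tau` are all hypothetical choices for illustration.

```python
import math
import random


def poisson_pmf(m, mu):
    """Poisson probability f_M(m) = exp(-mu) * mu**m / m!."""
    return math.exp(-mu) * mu ** m / math.factorial(m)


def simulate_event_times(lam, tau, rng):
    """Simulate event times of a stationary Poisson process on [0, tau].

    Assumes gaps between successive events are independent
    exponential variables with rate lam.
    """
    times = []
    t = rng.expovariate(lam)
    while t <= tau:
        times.append(t)
        t += rng.expovariate(lam)
    return times


rng = random.Random(0)   # fixed seed, illustrative only
lam, tau = 2.0, 3.0      # hypothetical rate and follow-up length
counts = [len(simulate_event_times(lam, tau, rng)) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
print("empirical mean of M:", mean_count, "| mu = lam * tau:", lam * tau)
print("empirical Pr(M = 6):", counts.count(6) / len(counts),
      "| Poisson pmf:", poisson_pmf(6, lam * tau))
```

With enough replications the empirical mean of the counts should be close to $\mu = \lambda\tau$, and the empirical frequencies should be close to $f_M(m)$.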

5.1.2 Nonparametric estimation of CRF

Data. Let $t_{i1} \le \dots \le t_{i,m_i}$ be the ordered event times of subject $i$, with $m_i$ defined as the index of the last observed event. The observed data include $\{(m_i, c_i, t_{i1}, \dots, t_{i,m_i}) : i = 1, \dots, n\}$.

Population. Note that for a single event process (univariate survival time), the risk population at $t$ is composed of subjects who have not failed prior to $t$; thus the risk population varies with $t$. In contrast, for a recurrent event process, the risk population at every $t$ coincides with the target population defined at 0.

Risk set. Let $C_i$ represent the terminating time (censoring time) for observing $N_i(t)$. The risk set at $t$ is defined as $\{i : C_i \ge t\}$, which includes the subjects who are under observation at $t$. Define $R_i(t) = I(C_i \ge t)$ as the risk-set indicator, and $R(t) = \sum_{i=1}^n R_i(t)$.

Independent censoring. If $C_i$ is independent of $N_i(\cdot)$, the risk set forms a random sample from the risk population at $t$.

Under the independent censoring assumption, for $t > 0$ and $\Delta$ positive-valued but small, a crude estimate of the occurrence probability in $(t - \Delta, t]$ can be constructed as

$$\lambda(t)\Delta \approx \frac{\sum_{i=1}^n \sum_{j=1}^{m_i} I(t_{ij} \in (t - \Delta, t])}{R(t)}, \tag{1}$$

with $I(\cdot)$ representing the indicator function. The estimate is essentially an empirical measure with time-dependent sample size $R(t)$. A nonparametric estimate of the CRF corresponding to (1) can then be constructed as

$$\hat\Lambda(t) = \sum_{i=1}^n \sum_{j=1}^{m_i} \frac{I(t_{ij} \le t)}{R(t_{ij})}. \tag{2}$$

See Nelson (1988, JQT; 1995, Technometrics).
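The estimate of the cumulative rate function in (2) is a sum of $1/R(t_{ij})$ over all observed event times up to $t$. A minimal sketch follows; the function name `crf_estimate`, the data layout, and the toy data are assumptions made for illustration, and a subject is counted as at risk at $t$ when $C_i \ge t$.

```python
def crf_estimate(event_times, censor_times, t):
    """Nonparametric estimate of the cumulative rate function at time t.

    event_times[i] is the list of observed event times of subject i and
    censor_times[i] is that subject's censoring time C_i.
    """
    total = 0.0
    for times_i in event_times:
        for t_ij in times_i:
            if t_ij <= t:
                # R(t_ij): number of subjects still under observation at t_ij
                at_risk = sum(1 for c in censor_times if c >= t_ij)
                total += 1.0 / at_risk
    return total


# Hypothetical toy data: three subjects with censoring times 10, 5 and 8
events = [[2.0, 6.0], [3.0], []]
censor = [10.0, 5.0, 8.0]
for t in (1.0, 4.0, 7.0):
    print(t, crf_estimate(events, censor, t))
```

In the toy data, the events at times 2 and 3 each contribute 1/3 because all three subjects are still observed, while the event at time 6 contributes 1/2 because only two subjects remain under observation.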

5.1.3 Conditional Regression Models

Andersen and Gill (1982, AS) proposed a time-to-events model which extended Cox's proportional hazards model from single event data to recurrent event data. Suppose the dates of recurrent events are recorded on a continuous scale (e.g., by days or weeks), and the outcome measures of interest are recurrent events occurring in the time interval $[0, \tau]$, where the constant $\tau > 0$ is determined with the knowledge that recurrent events could potentially be observed up to $\tau$, say 3 years. Let $N^H(t)$ be the recurrent event history and $Z^H(t)$ the possibly time-dependent covariate history prior to $t$. For $t \in [0, \tau]$, the AG model assumes the events occur over time with the occurrence rate

$$\lambda(t \mid N^H(t), Z^H(t)) = \lambda_0(t)\exp\{X(t)\beta\}, \tag{3}$$

where $X(t) = \phi(N^H(t), Z^H(t))$ is a transformation of $(N^H(t), Z^H(t))$.

Pros and cons of conditional regression model

(i) The AG model can be thought of as a predictive model, since the event history is included as a part of the conditional statistics in the rate function.

(ii) Use of the AG model to identify treatment effects is subject to constraints, since the model identifies treatment effects adjusted for subject-specific event history. In general, the AG model is not ideal for identifying treatment effects or population risks.

(iii) If the AG model uses time-independent covariates, $X(t) = X$, the model is then required to be memoryless. For example, two subjects with the same $X$ but different event histories would be predicted to have the same occurrence rate of events. Thus, if $X$ is a treatment indicator, two patients who receive the same treatment but have different hospitalization records would have the same level of risk for rehospitalization according to the AG model.

Statistical methods for conditional regression model

Andersen and Gill (AG) extended the partial likelihood methods from univariate survival data to recurrent event data. The partial likelihood score function for estimating the true parameter $\beta_0$ can be derived as

$$U(\beta, t) = \sum_{i=1}^n \int_0^t \{X_i(u) - \bar X(\beta, u)\}\,dN_i(u), \tag{4}$$

where

$$\bar X(\beta, t) = \frac{\sum_{i=1}^n R_i(t) X_i(t) \exp\{X_i(t)\beta\}}{\sum_{i=1}^n R_i(t) \exp\{X_i(t)\beta\}}.$$

Martingale theory was also developed to establish the large sample properties (as an extension of martingale theory for univariate survival analysis).
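The score function (4) can be evaluated directly from the data: each observed event contributes the covariate of the subject who had the event minus a weighted average of covariates over the risk set at that time. The sketch below is a simplified illustration, not the authors' implementation. It assumes a scalar, time-independent covariate, evaluates the score over the whole follow-up, and counts a subject as at risk at $u$ when $C_i \ge u$; the name `ag_score` and the toy data are hypothetical.

```python
import math


def ag_score(beta, event_times, censor_times, x):
    """Partial likelihood score U(beta) for a scalar, time-fixed covariate.

    event_times[i]: observed event times of subject i
    censor_times[i]: censoring time C_i of subject i
    x[i]: covariate value of subject i (assumed constant over time)
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        for u in event_times[i]:
            # Subjects still under observation at time u
            at_risk = [k for k in range(n) if censor_times[k] >= u]
            weights = [math.exp(x[k] * beta) for k in at_risk]
            # Weighted covariate average over the risk set at u
            x_bar = sum(w * x[k] for w, k in zip(weights, at_risk)) / sum(weights)
            total += x[i] - x_bar
    return total


# Hypothetical toy data: the subject with x = 1 has two events and the
# subject with x = 0 has one, both observed over the same period.
events = [[2.0], [3.0, 5.0]]
censor = [10.0, 10.0]
x = [0.0, 1.0]
print(ag_score(0.0, events, censor, x))
print(ag_score(math.log(2.0), events, censor, x))
```

For this toy data the score is positive at $\beta = 0$ and vanishes at $\beta = \log 2$, which matches the 2:1 ratio of event counts between the two subjects.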

5.1.4 Marginal Regression Models

Instead of the conditional regression model, we may consider a marginal model where the event history, $N^H(t)$, is not included as part of the conditional statistics:

$$\lambda(t \mid Z(t)) = \lambda_0(t)\exp\{Z(t)\beta\}.$$

The marginal model is generally ideal for identifying treatment effects and risk factors, but the estimation procedure of Lin, Wei, Yang and Ying (LWYY; 2000, JRSS-B) depends heavily on the independent censoring assumption. The LWYY estimates could be very biased when the follow-up is terminated for reasons associated with the recurrent events, such as informative drop-out or death. Statistical inferences can be found in the articles of Pepe and Cai (1993, JASA) and Lin et al. (2000, JRSS-B).

5.1.5 Semi-parametric latent variable models

With the intention of dealing with censoring due to death or informative drop-out, Wang et al. (2001, JASA) proposed a semi-parametric latent variable model for time-to-events data:

$$\lambda(t \mid Z, X) = Z\,\lambda_0(t)\exp\{X\beta\}.$$

The model allows for informative censoring through the use of a latent variable $Z$. The model implies the marginal rate model

$$\lambda(t \mid X = x) = \lambda_0^*(t)\exp\{x\beta\},$$

where $\lambda_0^*(t) = E[Z]\,\lambda_0(t)$. The model has the feature of treating both the censoring and latent variable distributions as nonparametric components. The approach avoids modeling and estimating these nonparametric components by proper conditional likelihood techniques. As a related work, a joint model for recurrent events and a failure time was proposed and studied by Huang and Wang (2004, JASA).
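The step from the latent variable model to the marginal rate model can be sketched by averaging over the latent variable. The derivation below assumes that $Z$ is independent of $X$ (or at least that $E[Z \mid X] = E[Z]$), an assumption that is not stated explicitly above.

$$
\begin{aligned}
\lambda(t \mid X = x) &= E\big[\lambda(t \mid Z, X) \mid X = x\big] \\
&= E[Z \mid X = x]\,\lambda_0(t)\exp\{x\beta\} \\
&= E[Z]\,\lambda_0(t)\exp\{x\beta\}.
\end{aligned}
$$

Under this assumption the marginal baseline rate is $E[Z]\,\lambda_0(t)$ and the regression parameter $\beta$ keeps the same interpretation in the conditional and marginal models.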

5.2 Time-between-events models

Suppose the outcome measure of interest is the time between successive events (gap time). When time-between-events is the variable of interest, the occurrence of each recurrent event is considered as the time origin for the occurrence of the next event. Recurrence times could be considered as a type of correlated failure time data in survival analysis. This type of correlated data is, however, different from the correlated data collected from families (e.g., twin data or sibling data) due to the ordered nature of recurrent events.

5.2.1 Specific features of data

Informative $m$. For typical multivariate survival data such as family data, cluster size is usually assumed to be uncorrelated with the failure times of a cluster. For recurrence time data, the number of recurrent events, $m$, is typically correlated with the recurrence times in follow-up studies: a large $m$ is likely to imply shorter times and vice versa. In some applications, $m$ is even used as the outcome measurement for analysis; e.g., in a Poisson model, $m$ is the Poisson count variable.

Induced informative censoring. Induced informative censoring is a special feature of ordered events. When the observation of the recurrent event process is censored at $C$, the censoring time for $Y_j$ is $\max\{C - \sum_{k=1}^{j-1} Y_k, 0\}$, for each $j = 2, 3, \dots$. Because $\sum_{k=1}^{j-1} Y_k$ is correlated with $Y_j$ for $j \ge 2$, recurrence times of order greater than one are observed subject to informative censoring even if the censoring time $C$ is independent of $N(\cdot)$.

Intercepted sampling. Intercepted sampling is a well-known probability feature of renewal processes. It is a specific feature of recurrence time data because the sampling scheme used to observe recurrence times in longitudinal studies is similar to the intercepted sampling of renewal processes. For simplicity of understanding, assume the recurrence times $\{Y_j : j = 1, 2, \dots\}$ are independent and identically distributed (iid). Let $f$, $S$ and $\mu$ represent the density function, survival function and mean of $Y_j$. Let $T_j = Y_1 + \dots + Y_j$ denote the time of the $j$th event, and let $T = C - T_m$ and $R = T_{m+1} - C$ be the so-called backward and forward recurrence times. When the censoring time, $C$, is sufficiently large so that an equilibrium condition is reached, the joint density of $(T, R)$ can then be derived as

$$p_{T,R}(t, r) = f(t + r)\,I(t \ge 0, r \ge 0)/\mu. \tag{5}$$

The marginal density functions of $Y_{m+1}$, $T$ and $R$ can be derived, based on (5), as

$$p_{Y_{m+1}}(y) = y f(y)\,I(y \ge 0)/\mu, \tag{6}$$
$$p_T(t) = S(t)\,I(t \ge 0)/\mu, \tag{7}$$
$$p_R(r) = S(r)\,I(r \ge 0)/\mu. \tag{8}$$

The distribution of $Y_{m+1}$ is referred to as the length-biased distribution. In most longitudinal studies, however, the censoring time is not very large and therefore the equilibrium condition is not satisfied. In these cases, although the above distributional results do not hold, the bias from $Y_{m+1}$ is still significant and one should be careful when conducting statistical analysis. In general, because of the specific data features, standard statistical methods in survival analysis may or may not be appropriate for recurrence time data.
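The marginal densities follow from the equilibrium joint density $p_{T,R}(t, r) = f(t + r)/\mu$ on $t, r \ge 0$ by direct integration. A short worked sketch, using only the definitions above and the identity $Y_{m+1} = T + R$:

$$
\begin{aligned}
p_T(t) &= \int_0^\infty \frac{f(t + r)}{\mu}\,dr = \frac{1}{\mu}\int_t^\infty f(s)\,ds = \frac{S(t)}{\mu}, \quad t \ge 0, \\
p_{Y_{m+1}}(y) &= \int_0^y p_{T,R}(t, y - t)\,dt = \int_0^y \frac{f(y)}{\mu}\,dt = \frac{y f(y)}{\mu}, \quad y \ge 0.
\end{aligned}
$$

The density of $R$ follows by the same argument as for $T$, with the roles of $t$ and $r$ exchanged. As a consequence, $E[Y_{m+1}] = E[Y_j^2]/\mu = \mu + \mathrm{Var}(Y_j)/\mu \ge \mu$, so the gap time that straddles the censoring time tends to be longer than a typical gap time.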

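5.2.2 Transitional probability Model

Let $f_j(y \mid y_{i1}, \dots, y_{i,j-1})$ denote the pdf of $Y_{ij}$ conditioning on $(Y_{i1}, \dots, Y_{i,j-1}) = (y_{i1}, \dots, y_{i,j-1})$. Suppose the censoring time $C_i$ is independent of the recurrent event process $N_i(\cdot)$. Note that the likelihood function is

$$L \propto \prod_{i=1}^n \Big\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid y_{i1}, \dots, y_{i,j-1}) \Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid y_{i1}, \dots, y_{i,m_i}),$$

where $S_{m_i+1}$ is the conditional survival function corresponding to $f_{m_i+1}$ and $y^+_{i,m_i+1}$ is the censored gap time following the last observed event. A transitional probability model can be constructed by placing distributional assumptions on the conditional probability $f_j(y \mid y_{i1}, \dots, y_{i,j-1})$. In applications, when a transitional probability model is used, it is frequently accompanied by a further first-order Markov assumption, under which the conditional pdf of $Y_{ij}$ depends on $(Y_{i1}, \dots, Y_{i,j-1})$ only through $Y_{i,j-1}$ (or a second-order assumption, under which it depends only on the two most recent gap times).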

In a regression setting, when covariates $x_i$ are present, we assume that, conditioning on $x_i$, the censoring time $C_i$ is independent of $N_i(\cdot)$. The likelihood function is modified as

$$L \propto \prod_{i=1}^n \Big\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, y_{i1}, \dots, y_{i,j-1}) \Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, y_{i1}, \dots, y_{i,m_i}).$$

5.2.3 Parametric Frailty Model

Frailty models are basically random-effects or latent-variable models, where the frailty is used to characterize a subject. Assume the following conditions:

(i) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \dots\}$ are independent.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.

With Assumptions (i), (ii) and (iii), the likelihood function from the data can be formulated as

$$L \propto \prod_{i=1}^n \int \Big\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid z_i; \theta) \Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid z_i; \theta)\, h(z_i; \gamma)\,dz_i.$$

The likelihood function is then maximized to derive estimates (MLEs) of $\theta$ and $\gamma$. Large sample distributions of the MLEs can be derived based on normal approximation.
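For some parametric choices the integral over the frailty has a closed form. The sketch below is one illustrative special case, not a model specified in the lecture: it assumes that, given $Z = z$, the gap times are exponential with rate $z\lambda$, and that $Z$ follows a gamma distribution with shape $k$ and rate $k$ (so $E[Z] = 1$). Under these assumptions the contribution of a subject with $m$ uncensored gaps and total follow-up time $c$ is $\lambda^m k^k \Gamma(m + k) / \{\Gamma(k)(k + \lambda c)^{m+k}\}$. The function names, parameter names, and data are hypothetical.

```python
import math


def subject_loglik(gaps, censored_gap, lam, k):
    """Log-likelihood contribution of one subject under a gamma frailty.

    Assumptions (illustrative only): given Z = z the gap times are
    exponential with rate z * lam, and Z ~ Gamma(shape=k, rate=k).
    The frailty is integrated out in closed form.
    """
    m = len(gaps)                      # number of uncensored gap times
    total = sum(gaps) + censored_gap   # total follow-up time of the subject
    return (m * math.log(lam) + k * math.log(k)
            + math.lgamma(m + k) - math.lgamma(k)
            - (m + k) * math.log(k + lam * total))


def loglik(data, lam, k):
    """Sum of subject contributions; data is a list of (gaps, censored_gap)."""
    return sum(subject_loglik(g, c, lam, k) for g, c in data)


# Hypothetical data: (uncensored gap times, censored last gap) per subject
data = [([1.0, 0.5], 0.7), ([], 2.0), ([0.3, 0.4, 0.2], 1.1)]
print(loglik(data, lam=0.8, k=2.0))
```

In this special case the log-likelihood can be passed to any general-purpose optimizer to obtain the MLEs of `lam` and `k`; for other choices of $f_j$ and $h$ the integral usually has to be evaluated numerically.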

In a regression setting, when covariates $x$ are present, Assumptions (i)-(iii) can be modified as follows:

(i) Conditional on $x$ and a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \dots\}$ are independently distributed.

(ii) (Independent censoring) Conditional on $x$, $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $x$ and $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid x, z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.

With the modified assumptions, the likelihood function is expressed as

$$L \propto \prod_{i=1}^n \int \Big\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, z_i; \theta) \Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, z_i; \theta)\, h(z_i; \gamma)\,dz_i.$$

It is, however, generally difficult to compute the MLE. In the literature, EM algorithms and other computational algorithms have been developed to resolve the problem.

Appendix (optional reading)

A.1 Nonparametric estimation of the survival function

Recurrence times can be treated as a type of correlated survival data in statistical analysis. However, because of the ordered nature of recurrence times, statistical methods which are appropriate for clustered survival data may not be applicable to recurrence time data. In many medical papers, recurrence time data are frequently analyzed by inappropriate methods, as indicated by Aalen and Husebye (1991). Specifically, for estimating the marginal survival function, the Kaplan-Meier estimator derived from the pooled data is frequently used for exploratory analysis, although the estimator is generally inappropriate for such analyses.

Suppose recurrent events are of the same type and consider the problem of how to estimate the marginal survival function from univariate recurrence time data. Assume the following conditions are satisfied.

(i) (Conditional iid assumption) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \dots\}$ are independently and identically distributed.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

Define the univariate recurrent survival function of $Y_j$ as

$$S(y) \equiv \Pr(Y_j > y) = \int S(y \mid z)\,dH(z),$$

where $S(y \mid z)$ is the conditional survival function of $Y_j$ given $Z = z$, and $H$ is the distribution function of $Z$.

Under (i) and (ii), let $\bar S = 1 - S$. The nonparametric likelihood function can be formulated as

$$\prod_{i=1}^n \int \Big[ \prod_{j=1}^{m_i} d\bar S(u_{ij} \mid z_i) \Big]\, S(u^+_{i,m_i+1} \mid z_i)\,dH(z_i),$$

where $u_{ij}$, $j \le m_i$, denote the uncensored recurrence times of subject $i$ and $u^+_{i,m_i+1}$ the censored one. Conceptually, the likelihood function involves both infinite-dimensional parameters (the conditional cdfs $\bar S(\cdot \mid z_i)$) and a mixing distribution ($H$). With infinite-dimensional parameters, the maximization of the likelihood function could be problematic, and therefore it is not used as the tool for finding an estimator of $S$. Instead of the nonparametric likelihood approach, Wang and Chang (1999, JASA) proposed a class of nonparametric estimators for estimating $S(y)$:

Define the observed recurrence times as

$$u^*_{ij} = \begin{cases} u_{ij} & \text{if } j = 1, \dots, m_i, \\ u^+_{i,m_i+1} & \text{if } j = m_i + 1. \end{cases}$$

Define

$$m^*_i = \begin{cases} m_i & \text{if } m_i = 0, \\ m_i - 1 & \text{if } m_i \ge 1. \end{cases}$$

Let $w_i = w(c_i)$, where $w(\cdot)$ is a positive-valued function. The total mass of the risk set at $y$ is calculated as

$$R^*(y) = \sum_{i=1}^n \Big[ \frac{w_i}{m^*_i + 1} \sum_{j=1}^{m^*_i + 1} I(u^*_{ij} \ge y) \Big],$$

and the mass evaluated at $y$ is

$$d^*(y) = \sum_{i=1}^n \Big[ \frac{w_i\, I(m_i \ge 1)}{m^*_i + 1} \sum_{j=1}^{m^*_i + 1} I(u^*_{ij} = y) \Big].$$

Let $u_{(1)}, u_{(2)}, \dots, u_{(K)}$ be the ordered and distinct uncensored times. The estimator takes the product limit expression

$$\hat S_n(y) = \prod_{u_{(i)} \le y} \Big\{ 1 - \frac{d^*(u_{(i)})}{R^*(u_{(i)})} \Big\},$$

which is non-increasing in $y$ and satisfies $0 \le \hat S_n(y) \le 1$. Further, this estimator also possesses proper large sample properties.
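The product limit estimator above can be sketched in a few lines. With the definitions of $u^*_{ij}$ and $m^*_i$, a subject with at least one observed event contributes its $m_i$ uncensored gap times, each with mass $w_i/m_i$, while a subject with no observed event contributes only its censored first gap to the risk set. The code below is an illustrative sketch under that reading, with weights defaulting to $w_i = 1$; the function name `wang_chang_survival`, the data layout, and the toy data are hypothetical.

```python
def wang_chang_survival(uncensored_gaps, censored_last_gap, y, weights=None):
    """Product limit estimate of S(y) from recurrence time data.

    uncensored_gaps[i]: list of uncensored gap times of subject i
    censored_last_gap[i]: censored gap time after the last observed event
    weights[i]: positive weight w_i (defaults to 1 for every subject)
    """
    n = len(uncensored_gaps)
    if weights is None:
        weights = [1.0] * n
    used = []  # (weight, has_event, gap times entering the estimator)
    for gaps, u_plus, w in zip(uncensored_gaps, censored_last_gap, weights):
        if gaps:
            used.append((w, True, list(gaps)))   # m_i >= 1: uncensored gaps only
        else:
            used.append((w, False, [u_plus]))    # m_i = 0: censored first gap
    # Ordered, distinct uncensored times
    times = sorted({u for _, has_event, g in used if has_event for u in g})
    surv = 1.0
    for u in times:
        if u > y:
            break
        risk = sum(w * sum(v >= u for v in g) / len(g) for w, _, g in used)
        mass = sum(w * sum(v == u for v in g) / len(g)
                   for w, has_event, g in used if has_event)
        surv *= 1.0 - mass / risk
    return surv


# Hypothetical toy data for three subjects
gaps = [[2.0], [], [4.0, 1.0]]
last = [1.5, 3.0, 0.5]
for y in (0.5, 1.0, 2.0, 4.0):
    print(y, wang_chang_survival(gaps, last, y))
```

When every subject has at most one observed event, each subject contributes a single gap time with full weight and the estimate reduces to the Kaplan-Meier estimate based on the first gap times.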

A.2 Semiparametric Regression Models

Conditional proportional hazards model. We now return to the general case in which the recurrent events may or may not be of the same type. Prentice, Williams and Peterson (PWP; 1981, Biometrika) modeled time-between-event data by a conditional proportional hazards model as an extension of the usual proportional hazards model for univariate failure time data:

$$\lambda(t \mid N(t^-) = j - 1, N^H(t), X^H(t)) = \lambda_{0j}(t - t_{j-1})\exp\{Z(t)\gamma_j\}, \tag{9}$$

for $t \ge t_{j-1}$. In the model,
- $N^H(t) = \{N(u) : 0 \le u \le t\}$ is the event history up to $t$
- $X^H(t) = \{X(u) : 0 \le u \le t\}$ is the covariate history up to $t$
- $\lambda_{0j}(\cdot)$ is the baseline hazard function
- $\gamma_j$ is the regression parameter for the $j$th recurrence time

The possibly time-dependent covariate history up to $t$ is denoted by $X^H(t)$. As an important requirement, the event history $N^H(t)$ must be part of the given knowledge (conditional statistics) in the PWP model. The time-dependent covariate vector $Z(t) = \phi(X^H(t), N^H(t))$ is a transformation of $(X^H(t), N^H(t))$. This model serves as a proper model for predicting future events given subject-specific covariates and event history information. However, since the event history is part of the conditional statistics in the model, the PWP model does not serve as an appropriate model for identifying treatment or prevention effects.

The PWP model has been further extended to include both globally defined parameters $\beta$ and episode-specific parameters $\gamma_j$ (Chang and Wang, 1999, JASA):

$$\lambda(t \mid N(t^-) = j - 1, N^H(t), X^H(t)) = \lambda_{0j}(t - t_{j-1})\exp\{Z(t)\gamma_j + W(t)\beta\}, \tag{10}$$

for $t \ge t_{j-1}$, where $Z(t)$ and $W(t)$ are functions of $(X^H(t), N^H(t))$.

Marginal regression models. In contrast with conditional regression models, marginal regression models do not include the event history $N^H(t)$ as part of the covariates and therefore serve as appropriate models for identifying treatment effects or population-based risk factors. Without conditioning on the event history, only limited techniques have been developed for the analysis of marginal regression models. Exceptions include Huang's accelerated failure time model (Y. Huang, 2000, JASA):

$$\log Y_j = \alpha_j + x_j\beta_j + \epsilon_j, \quad j = 1, 2, \dots,$$

Lin, Wei and Robins' bivariate accelerated failure time model (1998, Biometrika):

$$\log Y_1 = \alpha_1 + x_1\beta_1 + \epsilon_1, \quad \log Y_2 = \alpha_2 + x_2\beta_2 + \epsilon_2,$$

and Huang and Chen's proportional hazards model for $Y_j$ (2002, LIDA):

$$\lambda(y \mid x) = \lambda_0(y)\exp\{x\beta\},$$

where $x$ denotes the baseline covariates and $\lambda_0$ is the baseline hazard function shared by all the episodes. Note that the first two models only partially depend on $N(t)$, and the third model is essentially a renewal model.

Trend models. In many applications the distributional pattern of recurrence times can be used as an index for the progression of a disease. Such a distributional pattern is important for understanding the natural history of a disease or for confirming a long-term treatment effect. Assume

(i) within each subject, the recurrence times $Y_1, Y_2, \dots$ are independently distributed with the survival functions $S_0, S_1, S_2, \dots$, and

(ii) within each subject, the censoring time $C$ is independent of $N(\cdot)$.

Assumption (i) can be viewed as a frailty condition where the conditional independence of recurrence times holds within each subject. Assumption (ii) implies that, within each subject, the censoring mechanism is uninformative for the probability structure of the event process. In applications, one might be interested in testing, under (i), the null hypothesis that the duration distributions of the different episodes $Y_1, Y_2, \dots$ remain the same, in order to confirm the stability of the pattern of recurrence times or to identify the treatment efficacy over time; see Wang and Chen (2001, Biometrics) for nonparametric and semiparametric approaches to deal with the problem.