Estimating the cumulative incidence function of dynamic treatment regimes


J. R. Statist. Soc. A (2018) 181, Part 1

Estimating the cumulative incidence function of dynamic treatment regimes

Idil Yavuz
Dokuz Eylul University, Izmir, Turkey

and Yu Cheng and Abdus S. Wahed
University of Pittsburgh, USA

[Received June 2015. Final revision September 2016]

Summary. Recently personalized medicine and dynamic treatment regimes have drawn considerable attention. Dynamic treatment regimes are rules that govern the treatment of subjects depending on their intermediate responses or covariates. Two-stage randomization is a useful set-up for gathering data to make inference on such regimes. Meanwhile, the number of clinical trials involving competing risk censoring has risen, where subjects in a study are exposed to more than one possible failure and the specific event of interest may not be observed because of competing events. We aim to compare several treatment regimes from a two-stage randomized trial on survival outcomes that are subject to competing risk censoring. The cumulative incidence function (CIF) has been widely used to quantify the cumulative probability of occurrence of the target event over time. However, if we use only the data from those subjects who have followed a specific treatment regime to estimate the CIF, the resulting estimator may be biased. Hence, we propose alternative non-parametric estimators for the CIF by using inverse probability weighting, and we provide inference procedures, including procedures to compare the CIFs from two treatment regimes. We show the practicality and advantages of the proposed estimators through numerical studies.

Keywords: Competing risks; Inverse probability weighting; Multiple end points; Personalized medicine; Survival outcome

1. Introduction

Dynamic treatment regimes (DTRs), which are sets of rules that guide treatment on the basis of observed covariates and intermediate responses, have been studied extensively in recent years. In DTRs the treatment level and type can vary depending on evolving measurements of subject-specific need for treatment. Since these regimes provide treatment that is adapted to individual needs, they are cost effective and improve patients' compliance by avoiding overtreatment or undertreatment (Lavori and Dawson, 2000). When treatment is dynamic, the question becomes which regime will, in the end, produce the best outcome.

Different trial designs can be used to estimate the efficacy of various treatment regimes. The simplest is the single-stage randomization design, where participants are randomized to all possible treatment regimes on entry to the trial. However, this method is cost ineffective and, in general, requires a larger sample size. A better design is the so-called sequential multiple-assignment randomized trial (SMART), which was considered by Lavori and Dawson (2000, 2004), Murphy (2005a) and Murphy et al. (2007), among others.

Address for correspondence: Idil Yavuz, Department of Statistics, Dokuz Eylul University, Tinaztepe, Buca, Izmir, Turkey. E-mail: idil.yavuz@deu.edu.tr

© 2016 Royal Statistical Society

In these trials, patients are randomized to the initial treatment options at entry, those continuing to the next stage are randomized to the available treatment options on the basis of their intermediate response to the initial treatment, and randomization continues in this fashion. For example, in a two-stage randomization design, suppose that there are two treatment options $A_1$ and $A_2$ at the first stage, and two treatment options each for responders and non-responders, namely $B_1$ and $B_2$, and $B_1'$ and $B_2'$ respectively, at the second stage. In a SMART, on entry to the study all the patients are randomized to the initial treatments $A_1$ and $A_2$ at the first stage. In the second stage, patients who were initially randomized to $A_1$ and responded are further randomized to maintenance treatments $B_1$ and $B_2$, and the patients who did not respond to $A_1$ are further randomized to salvage treatments $B_1'$ and $B_2'$. A similar randomization rule is used for patients who were initially assigned $A_2$.

Statistical methods for analysing data from SMART studies are available in the literature. For example, Murphy et al. (2001) developed a marginal mean model for the mean response of a DTR and provided a methodology for constructing an optimal regime (Murphy, 2003). Later, Murphy and Bingham (2009) used screening experiments to help to develop DTRs. Q-learning (Watkins, 1989; Sutton and Barto, 1998; Murphy, 2005b; Schulte et al., 2014) is a popular approach that is used commonly when the primary goal is to estimate the optimal DTR. Much research has been carried out on models based on Q-learning and related approaches. For example, robust estimation techniques have been provided by Zhang et al. (2012b, 2013) and Barrett et al. (2013). Qian and Murphy (2011) considered a penalized least squares approach for estimation. Zhao et al. (2011) presented an adaptive reinforcement learning approach in which they also tried to determine the optimal time to initiate second-line therapy. Goldberg and Kosorok (2012) developed an adjusted Q-learning algorithm for censored data. Zhang et al. (2012a) generated a framework for estimating the optimal regime from a classification perspective. Zhao et al. (2012) used an outcome-weighted approach. Chakraborty et al. (2013) proposed an m out of n bootstrap algorithm for confidence intervals around the optimal regime index. Reviews of recent advances in this area can be found in Henderson et al. (2011) and Zhao and Laber (2014).

There is also much work concerning survival outcomes from DTRs. For example, Lunceford et al. (2002) used the inverse weighting method that was introduced by Robins et al. (1994) to propose a marginal mean model (Murphy et al., 2001) for analysing survival data from a two-stage setting. Later, Guo and Tsiatis (2005) proposed a weighted risk set estimator, which is a modified Nelson-Aalen estimator (Aalen, 1978). Although it may seem appropriate at first to apply the Kaplan-Meier estimator (Kaplan and Meier, 1958) to the subgroup of patients following a specific regime, Wahed and Tsiatis (2006) showed that such an estimator is biased, and to correct for this bias they proposed a weighted version of the Kaplan-Meier estimator, following Lunceford et al. (2002) and Lokhnygina and Helterbrand (2007). When a survival outcome is concerned, competing risk censoring is often involved and has recently drawn more attention from practitioners (Gooley et al., 1999; Klein, 2006; Koller et al., 2012).
Competing risk censoring refers to a situation where subjects in a study are exposed to more than one possible failure and the specific event of interest may be unobservable owing to the occurrence of competing events. As an example, in a neuroblastoma study (Matthay et al., 2009), which is a two-stage randomized trial, time to disease progression is often of interest as a target for developing disease-specific treatments. If a subject died free of progression, we would not be able to observe the progression time, and we refer to this case as being competing risk censored by death. When a subject is exposed to more than one risk, we are often interested in the instantaneous failure rate of one cause, the cause-specific hazard (CSH), or in the probability of occurrence of the target event by a specific time point, the cumulative incidence function (CIF). In this paper we focus on the CIF for two reasons. First, this quantity is intuitively interpretable and non-parametrically identifiable; hence it has been commonly used in the competing risks literature (Kalbfleisch and Prentice, 2002).

Second, it is clinically relevant for our data example, since we are interested in quantifying the cumulative risk of developing the progressive disorder, in the context that the subjects are high-risk patients and may die before progression. When competing risks are present in a SMART study, the objective then becomes finding a regime which results in a higher (or reduced) probability of occurrence of the desirable (or undesirable) event of interest.

Although methods for analysing time-to-event data that arise from two-stage randomization designs have been developed, to our knowledge no research has been carried out on how to analyse data from such designs under a competing risks setting as in the neuroblastoma study. Hence, we extend the inverse probability weighting estimators for survival outcomes in Miyahara and Wahed (2010) to competing risk settings. Though our methodology builds on the existing work of Miyahara and Wahed (2010), its unique contribution to the field is at least twofold. First, it is not trivial to extend from typical survival data to competing risk data: the CIF estimator is considerably more complicated than the Kaplan-Meier estimator, and so is its variance estimation. Second, we derive an explicit variance estimator for the time-varying weighted estimator based on counting process techniques, which significantly reduces the computation time compared with the bootstrap method that was used in Miyahara and Wahed (2010). Thus, our proposed methods provide useful tools for the unbiased estimation of the CIF under two-stage randomization, facilitating research in diseases that require complex treatment such as acquired immune deficiency syndrome and cancer. The programs that were used to analyse the data can be obtained from

2. Set-up and notation

In what follows, we focus on a two-stage SMART where there are two first-stage treatments $A_1$ and $A_2$, and two second-stage treatments $B_1$ and $B_2$ for responders and $B_1'$ and $B_2'$ for non-responders. This design creates a total of eight regimes $A_m B_k B_l'$ for $m, k, l = 1, 2$. Here $A_m B_k B_l'$ stands for the regime where the subject is treated with $A_m$, followed by $B_k$ if the subject responds to $A_m$ and by $B_l'$ if not. We assume that there are only two events, the primary event of interest (or cause 1 event) and the competing event (or cause 2 event), as multiple competing events can be grouped without affecting the analysis for the event of interest. Let $T$ be the event time and $\epsilon = 1, 2$ be the event type. Our objective is to estimate the causal effect of a particular treatment regime on the CIF of a specific type of event. This problem can be framed in terms of counterfactuals (for example, see Wahed and Tsiatis (2006)). For simplicity, we do not go into the details here. The goal is then to estimate the cause 1 CIF for a subject following the regime $A_m B_k B_l'$, $m, k, l = 1, 2$,
$$F_{1}(t \mid A_m B_k B_l') = \mathrm{pr}(T \le t,\ \epsilon = 1),$$
which is the probability that the event of interest occurs by time $t$.
For the $j$th subject, $j = 1, \ldots, n$, let $T^R_j$ be the time to intermediate response since the initial randomization, $R_j$ be the response indicator ($R_j = 1$ if the subject has responded to $A_m$; $R_j = 0$ otherwise), $Z_{1j}$ be the second treatment assignment indicator for responders ($Z_{1j} = k$ if the subject is assigned to $B_k$, $k = 1, 2$) and $Z_{2j}$ be the second treatment assignment indicator for non-responders ($Z_{2j} = l$ if the subject is assigned to $B_l'$, $l = 1, 2$). Let $T_j$ denote the time to the first event since the initial randomization and $\epsilon_j$ denote the corresponding event type (1 if the first event is the event of interest; 2 if the competing event occurs first). There may also be independent censoring $C_j$. Hence the observed event time is $V_j = \min(T_j, C_j)$, and the cause indicator $\Delta_j = \epsilon_j I(C_j \ge T_j)$ takes the value 1 or 2 if a cause 1 or cause 2 event occurs before censoring, and 0 if no event is observed before $C_j$.
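To fix ideas, the following minimal sketch (in Python; the type and field names are illustrative and not part of the paper) spells out the per-subject record implied by this notation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubjectRecord:
    """One subject's observed data in the two-stage trial (Section 2)."""
    TR: Optional[float]  # time to intermediate response, if the second stage was reached
    R: Optional[int]     # response indicator: 1 responded, 0 did not, None if never assessed
    Z1: Optional[int]    # second-stage arm for responders (k = 1 or 2), None otherwise
    Z2: Optional[int]    # second-stage arm for non-responders (l = 1 or 2), None otherwise
    V: float             # observed time: min(event time T, censoring time C)
    Delta: int           # 1 = event of interest, 2 = competing event, 0 = censored
```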

Then, the $j$th subject's data can be represented as $\{T^R_j, R_j, R_j Z_{1j}, (1 - R_j) Z_{2j}, V_j, \Delta_j\}$. The randomization probabilities $\pi_{B_k} = \mathrm{pr}(Z_{1j} = k \mid R_j = 1)$, $k = 1, 2$, and $\pi_{B_l'} = \mathrm{pr}(Z_{2j} = l \mid R_j = 0)$, $l = 1, 2$, are assumed to be independent of the observed data before the second randomization except for $R_j$. In some cases, the time to response may also be censored, but in such cases it is customary to treat those patients as non-responders (Lunceford et al., 2002).

3. Weighted estimators of the cumulative incidence function

3.1. Naive cumulative incidence function estimator

To estimate the cause 1 CIF for the regime $A_m B_k B_l'$, $m, k, l = 1, 2$, one may naively construct a standard non-parametric estimator by using the data only from those subjects whose treatments are consistent with $A_m B_k B_l'$. Considering the case where some subjects may have developed an event before the second-stage randomization, subjects who are consistent with $A_m B_k B_l'$ actually include three subgroups:

(1) subjects who were assigned to $A_m$ and developed an event before the second-stage randomization;
(2) subjects who were on treatment $A_m$, responded to it and received $B_k$;
(3) those who were on treatment $A_m$, did not respond to $A_m$ and received $B_l'$.

Let $t_1 < t_2 < \cdots < t_K$ be the distinct event times at which either the event of interest or the competing event occurs. Let $Y_i$ be the number of subjects at risk, $d_i$ be the number of subjects with the occurrence of the target event and $r_i$ be the number of subjects with the occurrence of the competing event at time $t_i$ among individuals consistent with $A_m B_k B_l'$. Then the cause 1 CIF for $A_m B_k B_l'$ would be estimated by
$$\hat F_{1,A_m B_k B_l'}(t) = \sum_{t_i \le t} \frac{d_i}{Y_i} \prod_{h=1}^{i-1} \left(1 - \frac{d_h + r_h}{Y_h}\right) \qquad (3.1)$$
for $t_1 \le t$, and 0 otherwise. For $t_1 \le t$ the CIF can also be represented as
$$\hat F_{1,A_m B_k B_l'}(t) = \sum_{t_i \le t} \hat S_{A_m B_k B_l'}(t_i-)\,\frac{d_i}{Y_i},$$
where $\hat S_{A_m B_k B_l'}(t_i-)$ is the Kaplan and Meier (1958) estimator evaluated just before $t_i$. The variance estimator of $\hat F_{1,A_m B_k B_l'}(t)$ is given in Klein and Moeschberger (2003) as
$$\hat\sigma^2\{\hat F_{1,A_m B_k B_l'}(t)\} = \sum_{t_i \le t}\left(\hat S_{A_m B_k B_l'}(t_i-)^2\,\{\hat F_{1,A_m B_k B_l'}(t)-\hat F_{1,A_m B_k B_l'}(t_i)\}^2\,\frac{r_i+d_i}{Y_i^2} + \hat S_{A_m B_k B_l'}(t_i-)^2\,\bigl[1-2\{\hat F_{1,A_m B_k B_l'}(t)-\hat F_{1,A_m B_k B_l'}(t_i)\}\bigr]\,\frac{d_i}{Y_i^2}\right).$$
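As a concrete illustration of the plug-in computation in equation (3.1), here is a minimal Python sketch (not the authors' code; function and variable names are illustrative), with status codes 1 for the event of interest, 2 for the competing event and 0 for censoring.

```python
import numpy as np

def naive_cif(time, status, t):
    """Naive cause-1 CIF estimate of equation (3.1) at time t, computed from
    the subjects whose observed treatments are consistent with the regime.
    status: 1 = event of interest, 2 = competing event, 0 = censored."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.sort(np.unique(time[status > 0]))
    surv = 1.0   # Kaplan-Meier estimate of overall survival just before t_i
    cif = 0.0
    for ti in event_times:
        if ti > t:
            break
        at_risk = np.sum(time >= ti)                   # Y_i
        d1 = np.sum((time == ti) & (status == 1))      # d_i
        d2 = np.sum((time == ti) & (status == 2))      # r_i
        if at_risk == 0:
            break
        cif += surv * d1 / at_risk                     # S(t_i-) d_i / Y_i
        surv *= 1.0 - (d1 + d2) / at_risk              # update overall survival
    return cif
```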

3.2. Cumulative incidence function estimator with fixed weights

The naive CIF estimator discards all the information from those subjects whose treatments are not consistent with $A_m B_k B_l'$; hence it loses efficiency and, more seriously, it is often biased. This bias arises because the subsample that is used to construct the naive estimator includes only the part of the responders to $A_m$ who were randomized to $B_k$ and only the part of the non-responders to $A_m$ who were randomized to $B_l'$. Whenever the randomization probabilities to $B_k$ and $B_l'$ differ, this subsample will have a different mixture of responders and non-responders from the original sample that received $A_m$. The bias may be better understood by comparison with an unbiased evaluation of $A_m B_k B_l'$ from the simpler design of assigning subjects up front to follow $A_m B_k B_l'$, where all subjects would receive treatment $A_m$, all responders to $A_m$ would receive treatment $B_k$ and all non-responders to $A_m$ would receive treatment $B_l'$. Because of the second randomization, part of the responders and non-responders are no longer consistent with this regime. To account for the loss of those subjects and to address the potential bias, we take a similar approach to that used by Lunceford et al. (2002), Guo and Tsiatis (2005) and Miyahara and Wahed (2010), and propose a weighted cumulative incidence function (WCIF) estimator
$$\hat F^w_{1,A_m B_k B_l'}(t) = \sum_{t_i \le t} \frac{d^w_i}{Y^w_i} \prod_{h=1}^{i-1}\left(1 - \frac{d^w_h + r^w_h}{Y^w_h}\right) \qquad (3.2)$$
for $t_1 \le t$, and 0 otherwise, where $d^w_i = \sum_{j=1}^n I(V_j = t_i, \Delta_j = 1)\,Q_{A_m B_k B_l',j}$, $r^w_i = \sum_{j=1}^n I(V_j = t_i, \Delta_j = 2)\,Q_{A_m B_k B_l',j}$ and $Y^w_i = \sum_{j=1}^n I(V_j \ge t_i)\,Q_{A_m B_k B_l',j}$, for
$$Q_{A_m B_k B_l',j} = \frac{R_j\,I(Z_{1j}=k)}{\pi_{B_k}} + \frac{(1-R_j)\,I(Z_{2j}=l)}{\pi_{B_l'}}.$$
If subject $j$ has developed the event of interest or a competing event before the second-stage randomization, their response status is not observable, and $Q_{A_m B_k B_l',j}$ is set to 1.

The CIF estimator in equation (3.2) is similar to the naive estimator in equation (3.1) except that those subjects following the regime are inversely weighted by the probability of being allocated to a specific treatment option during the second stage, to compensate for those subjects who have been assigned to alternative treatments but could have been consistent with the regime if there were no second randomization. In contrast with relying on the bootstrap for inference as in Miyahara and Wahed (2010), we have derived the following asymptotic linear representation:
$$\sqrt n\{\hat F^w_{1,A_m B_k B_l'}(t) - F_{1,A_m B_k B_l'}(t)\} = \frac{1}{\sqrt n}\sum_{j=1}^n I_j(t) + o_p(1),$$
where the expression for the influence function $I_j(t) = I_{1j}(t) - I_{2j}(t)$ is given in Appendix A. This leads to the variance estimator $\hat\sigma^2\{\hat F^w_{1,A_m B_k B_l'}(t)\} = (1/n^2)\sum_{j=1}^n \hat I^2_j(t)$, where $\hat I_j(t) = \hat I_{1j}(t) - \hat I_{2j}(t)$ with
$$\hat I_{1j}(t) = Q_{A_m B_k B_l',j}\sum_{t_m \le t}\left\{I(V_j = t_m, \Delta_j = 1) - I(V_j \ge t_m)\,\frac{d^w_m}{Y^w_m}\right\}$$
and
$$\hat I_{2j}(t) = Q_{A_m B_k B_l',j}\sum_{t_m \le t}\left[\hat S^w(t_m-)\sum_{t_d \le t_m}\frac{1}{\hat S^w(t_d-)}\left\{I(V_j = t_d) - I(V_j \ge t_d)\,\frac{d^w_d + r^w_d}{Y^w_d}\right\}\right]\frac{d^w_m}{Y^w_m},$$
where $\hat S^w(\cdot-)$ denotes the weighted Kaplan-Meier estimator of the overall survival function evaluated just before the indicated time. Details about this derivation can be found in Appendix A.
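A minimal sketch of the fixed-weight estimator in equation (3.2) is given below (illustrative Python, not the authors' code), reusing the structure of the naive computation above; replacing the design probabilities by the observed second-stage assignment proportions gives the estimated-weight version discussed in the next subsection.

```python
import numpy as np

def regime_weights(R, Z1, Z2, reached_stage2, k, l, pi_Bk, pi_Bl):
    """Fixed inverse probability weights Q for regime A_m B_k B_l'.
    Subjects with an event before the second randomization keep weight 1;
    R, Z1, Z2 may hold arbitrary placeholder values for those subjects."""
    R, Z1, Z2 = map(np.asarray, (R, Z1, Z2))
    w = R * (Z1 == k) / pi_Bk + (1 - R) * (Z2 == l) / pi_Bl
    return np.where(reached_stage2, w, 1.0)

def weighted_cif(time, status, Q, t):
    """Weighted CIF estimate of equation (3.2): the naive computation with
    counts replaced by Q-weighted sums."""
    time, status, Q = map(np.asarray, (time, status, Q))
    event_times = np.sort(np.unique(time[status > 0]))
    surv, cif = 1.0, 0.0
    for ti in event_times:
        if ti > t:
            break
        Yw = np.sum(Q * (time >= ti))                        # Y_i^w
        d1w = np.sum(Q * ((time == ti) & (status == 1)))     # d_i^w
        d2w = np.sum(Q * ((time == ti) & (status == 2)))     # r_i^w
        if Yw <= 0:
            break
        cif += surv * d1w / Yw
        surv *= 1.0 - (d1w + d2w) / Yw
    return cif
```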

3.3. Cumulative incidence function estimator with estimated fixed weights

In practice, owing to randomization, the proportion of subjects who responded to the initial treatment $A_m$ and were randomized to $B_k$ may not be exactly the same as $\pi_{B_k}$. Similarly, the proportion of subjects receiving $B_l'$ could be different from $\pi_{B_l'}$. Estimated randomization probabilities for the second stage provide better information about the randomization process than the intended assignment probabilities. Hence we also propose a WCIF estimator in which the weights are estimated by using the sample proportions instead of the true proportions:
$$\hat F^{ew}_{1,A_m B_k B_l'}(t) = \sum_{t_i \le t} \frac{d^{ew}_i}{Y^{ew}_i} \prod_{h=1}^{i-1}\left(1 - \frac{d^{ew}_h + r^{ew}_h}{Y^{ew}_h}\right) \qquad (3.3)$$
for $t_1 \le t$, and 0 otherwise. Here $d^{ew}_i = \sum_{j=1}^n I(\Delta_j = 1) I(V_j = t_i)\,\hat Q_{A_m B_k B_l',j}$, $r^{ew}_i = \sum_{j=1}^n I(\Delta_j = 2) I(V_j = t_i)\,\hat Q_{A_m B_k B_l',j}$ and $Y^{ew}_i = \sum_{j=1}^n I(V_j \ge t_i)\,\hat Q_{A_m B_k B_l',j}$, where $\hat Q_{A_m B_k B_l',j} = R_j I(Z_{1j}=k)/\hat\pi_{B_k} + (1-R_j) I(Z_{2j}=l)/\hat\pi_{B_l'}$ if $R_j$ is observed, and $\hat Q_{A_m B_k B_l',j} = 1$ otherwise. The variance of this estimator can be estimated by replacing the weights with their estimated values in the formula that was derived for the CIF with fixed weights.

3.4. Cumulative incidence function estimators with time-dependent weights

The proposed CIF estimators with fixed weights and estimated fixed weights can be further improved by including more subjects in the estimation. To do this, following the ideas of Guo and Tsiatis (2005), subjects are assigned weights of 1 until their response status is observed, because until then they remain consistent with the regime of interest. Once the response status is known and the second randomization has been carried out, the patients receive weights as before. The weights evaluated at time $t$ can be written as
$$Q_{A_m B_k B_l',j}(t) = \begin{cases} 1, & \text{if } T^R_j > t,\\[4pt] \dfrac{R_j\,I(Z_{1j}=k)}{\pi_{B_k}} + \dfrac{(1-R_j)\,I(Z_{2j}=l)}{\pi_{B_l'}}, & \text{if } T^R_j \le t. \end{cases}$$
Using the time-dependent weights, the CIF for a specific regime can be estimated as
$$\hat F^{tw}_{1,A_m B_k B_l'}(t) = \sum_{t_i \le t} \frac{d^{tw}_i}{Y^{tw}_i} \prod_{h=1}^{i-1}\left(1 - \frac{d^{tw}_h + r^{tw}_h}{Y^{tw}_h}\right). \qquad (3.4)$$
Here $\hat F^{tw}_{1,A_m B_k B_l'}$ denotes the estimated CIF with time-dependent weights, and $d^{tw}_i = \sum_{j=1}^n I(V_j = t_i, \Delta_j = 1)\,Q_{A_m B_k B_l',j}(t_i)$, $r^{tw}_i = \sum_{j=1}^n I(V_j = t_i, \Delta_j = 2)\,Q_{A_m B_k B_l',j}(t_i)$ and $Y^{tw}_i = \sum_{j=1}^n I(V_j \ge t_i)\,Q_{A_m B_k B_l',j}(t_i)$. The associated influence function can be obtained by a slight modification of the previous influence function, and the new variance estimator can be obtained by replacing the two parts of the influence function with
$$\hat I^{tw}_{1j}(t) = Q_{A_m B_k B_l',j}(t)\sum_{t_m \le t}\left\{I(V_j = t_m, \Delta_j = 1) - I(V_j \ge t_m)\,\frac{d^{tw}_m}{Y^{tw}_m}\right\}$$
and
$$\hat I^{tw}_{2j}(t) = Q_{A_m B_k B_l',j}(t)\sum_{t_m \le t}\left[\hat S^{tw}(t_m-)\sum_{t_d \le t_m}\frac{1}{\hat S^{tw}(t_d-)}\left\{I(V_j = t_d) - I(V_j \ge t_d)\,\frac{d^{tw}_d + r^{tw}_d}{Y^{tw}_d}\right\}\right]\frac{d^{tw}_m}{Y^{tw}_m}.$$
The CIF estimator with estimated time-dependent weights $\hat F^{tew}_{1,A_m B_k B_l'}$ and its variance can be obtained by replacing the weights in $\hat F^{tw}_{1,A_m B_k B_l'}$ and its variance with the estimated weights.

In practice, information may be available suggesting that some treatment options are more effective in treating certain diseases for particular subgroups of patients. For example, $B_1$ may be more efficacious among females than among males. If such information is available, researchers may be inclined to randomize more patients to the treatments from which they are more likely to benefit. All our proposed methods can be easily adapted to this type of covariate-dependent randomization, by using either the true or the estimated probabilities of assignment for subjects with certain characteristics to compute the proper weight functions.
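A sketch of the time-dependent weighting in equation (3.4) follows (illustrative Python, assumptions as before): each subject's weight is re-evaluated at every event time and stays at 1 until the response status is observed.

```python
import numpy as np

def time_dependent_cif(time, status, TR, fixed_Q, t):
    """CIF estimate with time-varying weights, equation (3.4).
    TR is the time of the second randomization (response assessment); subjects
    who never reached it can be given TR = np.inf so their weight stays 1.
    fixed_Q is the subject's fixed weight from the second-stage assignment."""
    time, status, TR, fixed_Q = map(np.asarray, (time, status, TR, fixed_Q))
    event_times = np.sort(np.unique(time[status > 0]))
    surv, cif = 1.0, 0.0
    for ti in event_times:
        if ti > t:
            break
        Qt = np.where(TR > ti, 1.0, fixed_Q)                 # Q_j(t_i)
        Yw = np.sum(Qt * (time >= ti))
        d1w = np.sum(Qt * ((time == ti) & (status == 1)))
        d2w = np.sum(Qt * ((time == ti) & (status == 2)))
        if Yw <= 0:
            break
        cif += surv * d1w / Yw
        surv *= 1.0 - (d1w + d2w) / Yw
    return cif
```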

4. Comparing two regimes

4.1. Asymptotic confidence intervals

Suppose that we are interested in comparing two regimes $A_1 B_1 B_1'$ and $A_1 B_1 B_2'$. These regimes share a common treatment path: responders under both strategies are treated with $B_1$. Therefore, the two estimators of the CIFs are correlated. To compare these two regimes, we would be interested in the difference $D(t) = F_{1,A_1 B_1 B_1'}(t) - F_{1,A_1 B_1 B_2'}(t)$, which can be consistently estimated by the difference between the two estimated CIFs, using any of the methods that were proposed in Sections 3.2-3.4. For example, if we use the estimators with time-dependent weights of the respective regimes, we would have $\hat D^{tw}(t) = \hat F^{tw}_{1,A_1 B_1 B_1'}(t) - \hat F^{tw}_{1,A_1 B_1 B_2'}(t)$. Denoting the influence functions for the two regimes by $I^{(1)}_j(t)$ and $I^{(2)}_j(t)$, we have
$$\sqrt n\{\hat D^{tw}(t) - D(t)\} = \frac{1}{\sqrt n}\sum_{j=1}^n \{I^{(1)}_j(t) - I^{(2)}_j(t)\} + o_p(1) = \frac{1}{\sqrt n}\sum_{j=1}^n I^D_j(t) + o_p(1).$$
Since $I^D_j(t)$ can be easily estimated by $\hat I^D_j(t) = \hat I^{(1)}_j(t) - \hat I^{(2)}_j(t)$ and the variance of $\hat D^{tw}$ can be estimated as $\hat\sigma^2_{\hat D}(t) = (1/n^2)\sum_{j=1}^n \hat I^D_j(t)^2$, the $100(1-\alpha)\%$ asymptotic confidence interval for $D(t)$ is
$$\{\hat F^{tw}_{1,A_1 B_1 B_1'}(t) - \hat F^{tw}_{1,A_1 B_1 B_2'}(t)\} \pm Z_{\alpha/2}\,\hat\sigma_{\hat D}(t),$$
where $Z_{\alpha/2}$ is the upper $\alpha/2$-percentile of the standard normal distribution. It may also be necessary to construct asymptotic confidence bands around specific functions of $D(t)$ to determine the time regions where the two CIFs differ. We have also provided techniques for constructing confidence bands; see Appendix B for details.

4.2. Time-averaged differences

In practice, one is often interested in summarizing the difference between two CIFs over time to obtain a global measure of the difference between two treatment regimes. Let $G$ be some general distance measure. To combine information in $G\{F_{1,A_1 B_1 B_1'}(t), F_{1,A_1 B_1 B_2'}(t)\}$ over time, Zhang and Fine (2008) proposed weighted average summaries:
$$G_M = \int_{\tau_l}^{\tau_u} G\{F_{1,A_1 B_1 B_1'}(t), F_{1,A_1 B_1 B_2'}(t)\}\,dW(t),$$
where $W(t) > 0$ is a deterministic weight function and $\int_{\tau_l}^{\tau_u} dW(t) = 1$. Following the ideas of the weighted log-rank tests for censored survival data, we may consider the class of weights based on the CIF that is calculated by pooling the data from both regimes. The estimator of the time-averaged difference is
$$\hat G_M = \int_{\tau_l}^{\tau_u} G\{\hat F^{tw}_{1,A_1 B_1 B_1'}(t), \hat F^{tw}_{1,A_1 B_1 B_2'}(t)\}\,dW(t).$$
Then $\sqrt n(\hat G_M - G_M)$ can be expressed as
$$\sqrt n(\hat G_M - G_M) = \int_{\tau_l}^{\tau_u} \sqrt n\,[G\{\hat F^{tw}_{1,A_1 B_1 B_1'}(t), \hat F^{tw}_{1,A_1 B_1 B_2'}(t)\} - G\{F_{1,A_1 B_1 B_1'}(t), F_{1,A_1 B_1 B_2'}(t)\}]\,dW(t) = \frac{1}{\sqrt n}\sum_{i=1}^{2}\sum_{j=1}^{n} \hat I^{(i)G_M}_j + o_p(1),$$
where $\hat I^{(i)G_M}_j = \int_{\tau_l}^{\tau_u} \hat I^{(i)G}_j(t)\,dW(t)$. Now the asymptotic variance can be estimated by
$$\hat\Sigma_{G_M} = \frac{1}{n^2}\sum_{i=1}^{2}\sum_{j=1}^{n}\bigl(\hat I^{(i)G_M}_j\bigr)^2.$$
When the distance measure is simply the difference, i.e. $G(u, v) = u - v$, the influence functions are $\hat I^{(1)G}_j(t) = \hat I^{(1)}_j(t)$ and $\hat I^{(2)G}_j(t) = -\hat I^{(2)}_j(t)$ respectively. Inference can easily be carried out based on these influence functions.
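Returning to the pointwise comparison of Section 4.1, a minimal sketch of the Wald interval for $D(t)$ is given below (illustrative Python; the per-subject influence-function values are taken as given).

```python
import numpy as np
from scipy.stats import norm

def wald_ci_difference(F1_hat, F2_hat, IF1, IF2, alpha=0.05):
    """100(1 - alpha)% Wald confidence interval for D(t) = F1(t) - F2(t).
    IF1 and IF2 are the estimated per-subject influence functions of the two
    regime-specific CIF estimators evaluated at time t."""
    IF1, IF2 = np.asarray(IF1), np.asarray(IF2)
    n = len(IF1)
    diff = F1_hat - F2_hat
    se = np.sqrt(np.sum((IF1 - IF2) ** 2)) / n   # sqrt of (1/n^2) * sum (I_j^D)^2
    z = norm.ppf(1 - alpha / 2)
    return diff - z * se, diff + z * se
```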

5. Simulation

Simulation studies were carried out to compare the proposed estimators with the naive estimator under various scenarios. In all simulations the two-stage randomization design was used. Only the subjects who received the initial treatment $A_1$ were considered, since the data obtained from these subjects are independent of the data obtained from the subjects receiving $A_2$. The comparisons were made under three settings. For the first setting a simple two-stage randomization design was simulated where the response status of all subjects could be observed, i.e. the subjects were censored or had events only during the second stage of treatment. For the second set of simulations a similar two-stage randomization design was used, but this time the design allowed subjects to experience events or to be censored during the first stage. The last set of simulations was run under the setting where the allocation probabilities of the two-stage randomization design were covariate dependent. All simulations were repeated for n = 300 and n = 700. The results with the larger sample size are given in Web appendix A. We considered two response rates, $R_j \sim \text{Bernoulli}(0.4)$ and $R_j \sim \text{Bernoulli}(0.7)$, and 4000 data sets were generated for each setting.

In the following tables, CI(t) stands for the naive estimate of the CIF in equation (3.1) evaluated at time t, WCI(t) stands for the proposed estimate of the CIF with fixed weights in equation (3.2), WCI2(t) is the proposed WCIF estimate with estimated fixed weights in equation (3.3), TWCI(t) is the estimate of the CIF with time-dependent weights in equation (3.4) and TWCI2(t) is the CIF estimate with estimated time-dependent weights. It is worth noting that, although we might think of the simulated sample sizes as large, because only the data from a specific regime were used in the estimation process, the sample that is used in the estimation of the CIF for a particular regime is significantly smaller than the entire sample.

5.1. A simple two-stage randomization design

For this setting the data were generated with $Z_{1j} = 1$ with probability 0.3 and $Z_{1j} = 2$ with probability 0.7, and similarly for $Z_{2j}$, for both response rates. For each combination, $\{(T^R_j, R_j, Z_{1j}, Z_{2j}, V_j, \Delta_j),\ j = 1, \ldots, n\}$ were generated. More specifically, the times to response $T^R_j$ were generated from the exponential(0.2) distribution and restricted to at most 1 year. The times to death from the second randomization ($T^*_{A_1 B_k j}$ or $T^*_{A_1 B_l' j}$, $k, l = 1, 2$) were drawn from exponential distributions with parameter values 1 for the sequence of treatments $A_1 B_1$, 0.75 for the sequence $A_1 B_2$, 0.5 for the sequence $A_1 B_1'$ and 0.25 for the $A_1 B_2'$ treatments. Following Miyahara and Wahed (2010), we then defined the overall survival time for subject $j$ as
$$T_j = T^R_j + R_j\{I(Z_{1j}=1)T^*_{A_1 B_1 j} + I(Z_{1j}=2)T^*_{A_1 B_2 j}\} + (1-R_j)\{I(Z_{2j}=1)T^*_{A_1 B_1' j} + I(Z_{2j}=2)T^*_{A_1 B_2' j}\}.$$
The times to censoring $C_j$ were generated from a uniform(1.5, 2) distribution, which resulted in 9% censoring for pr(R = 1) = 0.4 and 13% censoring for pr(R = 1) = 0.7. Event type indicators were generated as $\epsilon_j \sim \text{Bernoulli}(0.5) + 1$, and $\Delta_j$ was set equal to $\epsilon_j$ if $C_j > T_j$ and to 0 otherwise. Only the results for the regimes $A_1 B_1 B_1'$ and $A_1 B_1 B_2'$ are given, since the results for the other regimes were similar. The simulation results from this setting for n = 300 can be seen in Table 1.
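As an illustration, the following Python sketch (not the authors' code; variable names, and the reading of the exponential parameters as means rather than rates, are assumptions) generates one data set under the mechanism just described.

```python
import numpy as np

rng = np.random.default_rng(2018)

def simulate_simple_setting(n=300, p_resp=0.4):
    """Generate one data set for the simple two-stage setting of Section 5.1.
    Returns arrays (TR, R, Z1, Z2, V, Delta)."""
    TR = np.minimum(rng.exponential(0.2, n), 1.0)        # time to response, capped at 1 year
    R = rng.binomial(1, p_resp, n)                       # response indicator
    Z1 = 2 - rng.binomial(1, 0.3, n)                     # responders: B1 w.p. 0.3, B2 w.p. 0.7
    Z2 = 2 - rng.binomial(1, 0.3, n)                     # non-responders: B1' w.p. 0.3, B2' w.p. 0.7
    # mean survival after the second randomization, by second-stage arm
    mean_surv = np.where(R == 1,
                         np.where(Z1 == 1, 1.0, 0.75),   # A1B1, A1B2
                         np.where(Z2 == 1, 0.5, 0.25))   # A1B1', A1B2'
    T = TR + rng.exponential(mean_surv)                  # overall event time
    C = rng.uniform(1.5, 2.0, n)                         # independent censoring
    eps = rng.binomial(1, 0.5, n) + 1                    # event type: 1 or 2
    V = np.minimum(T, C)
    Delta = np.where(T <= C, eps, 0)                     # 0 if censored first
    return TR, R, Z1, Z2, V, Delta
```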
The results with n = 700 are given in Web appendix A. Under this scenario, for the first regime $A_1 B_1 B_1'$ the naive and weighted methods perform similarly because $\pi_{B_1} = \pi_{B_1'} = 0.3$. This is not surprising: when $\pi_{B_1} = \pi_{B_1'} = 0.3$, the responders receiving treatment $A_1 B_1$ and the non-responders receiving treatment $A_1 B_1'$ are assigned approximately equal weight. As a result, the pseudosample that is created for the first regime after weighting has roughly the same mixture of responders and non-responders as the original sample. Therefore, the weighted methods produce estimates that are similar to those of the naive estimator. However, for the second regime $A_1 B_1 B_2'$, the naive CIF estimate produces biased results.

Table 1. Simulation results from the simple setting, n = 300: true cumulative incidence (TCI), mean of estimates (EST), empirical standard deviation (ESD), mean of estimated standard errors (EstSE) and coverage rate of 95% confidence intervals (COV) for the estimators CI, WCI, WCI2, TWCI and TWCI2 at several time points t and response probabilities pr(R), for the regimes $A_1 B_1 B_1'$ and $A_1 B_1 B_2'$.

The naive estimate is obtained on the basis of the data from about 30% of responders, who actually received treatment $A_1 B_1$, and about 70% of non-responders, who actually received treatment $A_1 B_2'$. However, the remaining responders and non-responders would have been equally qualified to receive this treatment regime had there been no second-stage randomization. The weighted methods roughly generate a pseudosample that represents all responders and non-responders, which in turn produces more accurate estimates of the true CIF than the naive estimate. The coverage rates of the proposed weighted methods improve as the sample size is increased, unlike those of the naive method. TWCI2 performs slightly better than the other methods. In general we recommend the use of the time-varying weighted estimators, if feasible, since they can incorporate more subjects in the estimation and potentially gain some efficiency. Our simulations show a very slight advantage of the estimated time-varying estimator, which is consistent with the observations of other researchers (Wooldridge, 2002; Lunceford et al., 2002), suggesting that using the estimated weight may perform better than using the fixed weight.

5.2. A two-stage randomization design with events and censoring at the first stage

For this setting we generated initial times to event $T_{1j}$ from an exponential(0.8) distribution, and the times to response $T^R_j$ from an exponential(0.2) distribution, for all subjects. Those subjects with $T^R_j > T_{1j}$ were assumed to have events during the first stage, and those with $T^R_j < T_{1j}$ were labelled as subjects who went through the second randomization. For those subjects, the data were again generated with $Z_{1j} = 1$ with probability 0.3 and $Z_{2j} = 1$ with probability 0.3, for the various response rates. For the subjects who went through the second randomization, the times to death from the second randomization ($T^*_{A_1 B_k j}$ or $T^*_{A_1 B_l' j}$, $k, l = 1, 2$) were drawn from exponential distributions with mean values of 5 for the sequence of treatments $A_1 B_1$, 3 for the sequence $A_1 B_2$, 2 for the sequence $A_1 B_1'$ and 1 for the $A_1 B_2'$ treatments. We then defined the overall survival time for the subjects who went through the second randomization as
$$T_j = T^R_j + R_j\{I(Z_{1j}=1)T^*_{A_1 B_1 j} + I(Z_{1j}=2)T^*_{A_1 B_2 j}\} + (1-R_j)\{I(Z_{2j}=1)T^*_{A_1 B_1' j} + I(Z_{2j}=2)T^*_{A_1 B_2' j}\}.$$
For the subjects with $T^R_j > T_{1j}$, the overall survival time was simply chosen as $T_j = T_{1j}$. The times to censoring $C_j$ were generated from the exponential(5) distribution, which resulted in about 25% censoring for pr(R = 1) = 0.4 and 30% censoring for pr(R = 1) = 0.7. Event type indicators were generated as $\epsilon_j \sim \text{Bernoulli}(0.5) + 1$, and $\Delta_j$ was set equal to $\epsilon_j$ if $C_j > T_j$ and 0 otherwise. The simulation results from this setting for n = 300 can be seen in Table 2. The results with n = 700 are again given in Web appendix A.

Under this setting the naive estimator produces biased estimates even for the first regime. Without events and censoring at the first stage, the naive estimator was constructed on the basis of roughly the same mixture of responders and non-responders as the original sample and hence was unbiased for this specific regime. However, now the naive estimator assigns equal weights to the subjects who had events at the first stage and the subjects who went through the second randomization and stayed consistent with the regime. Thus, it fails to account for the subjects who followed other treatment combinations. The weighted estimators, in contrast, assign weights equal to 1 to the subjects who had events at the first stage and weight the subjects who went through the second randomization by the inverse probability of their subsequent assignment. As a result, the pseudosamples created by these estimators better represent what would have happened if all subjects had followed the same treatment regime. This can be clearly seen from Table 2, where the weighted estimators produce unbiased estimates at all time points for both regimes and have better coverage rates than the naive estimator. The coverage rates of the weighted estimators become better as the sample size is increased, whereas for the naive estimator they become worse.

5.3. A two-stage randomization design with covariate-dependent allocation

In some trials the second-stage randomization depends not only on the response status but also on subject characteristics. To investigate the performance of the proposed estimators under such a setting, the two-stage randomization was carried out depending on a covariate X which was generated from a Bernoulli(0.6) distribution.
It is assumed that the treatments $B_1$ and $B_2'$ work better for patients with X = 1 and the treatments $B_2$ and $B_1'$ work better for patients with X = 0. Thus the patients were given higher probabilities of being assigned to the treatment options that worked better for them. More specifically, $p_{11} = 0.75$, $p_{21} = 0.25$, $p_{10} = 0.25$ and $p_{20} = 0.75$, where $p_{11}$ is the probability that a patient with X = 1 will be assigned to $B_1$, $p_{21}$ is the probability that a patient with X = 1 will be assigned to $B_1'$, and $p_{10}$ and $p_{20}$ are the probabilities of being assigned to $B_1$ and $B_1'$ respectively for patients with X = 0. As before, the times to response $T^R_j$ were generated from the exponential(0.2) distribution.
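Under such covariate-dependent allocation, the inverse probability weights of Section 3 simply use each subject's own assignment probability. A minimal Python sketch (illustrative names, not code from the paper) is given here; the commented call shows the Section 5.3 probabilities for the regime $A_1 B_1 B_1'$.

```python
import numpy as np

def covariate_dependent_weights(R, Z1, Z2, X, reached_stage2, k, l, p_resp, p_nonresp):
    """Fixed weights Q for regime A_m B_k B_l' when second-stage assignment
    probabilities depend on a binary covariate X (Section 5.3).
    p_resp[x]    = pr(responder with X = x is assigned B_k)
    p_nonresp[x] = pr(non-responder with X = x is assigned B_l')
    Subjects who never reached the second randomization keep weight 1."""
    R, Z1, Z2, X = map(np.asarray, (R, Z1, Z2, X))
    p_resp, p_nonresp = np.asarray(p_resp), np.asarray(p_nonresp)
    w = R * (Z1 == k) / p_resp[X] + (1 - R) * (Z2 == l) / p_nonresp[X]
    return np.where(reached_stage2, w, 1.0)

# Example for regime A1 B1 B1' with the allocation probabilities of Section 5.3:
# responders:     pr(B1  | X=0) = 0.25, pr(B1  | X=1) = 0.75
# non-responders: pr(B1' | X=0) = 0.75, pr(B1' | X=1) = 0.25
# Q = covariate_dependent_weights(R, Z1, Z2, X, reached, k=1, l=1,
#                                 p_resp=[0.25, 0.75], p_nonresp=[0.75, 0.25])
```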

Table 2. Simulation results from the setting with events and censoring at the first stage, n = 300: true cumulative incidence (TCI), mean of estimates (EST), empirical standard deviation (ESD), mean of estimated standard errors (EstSE) and coverage rate of 95% confidence intervals (COV) for CI, WCI, WCI2, TWCI and TWCI2 at several time points t and response probabilities pr(R), for the regimes $A_1 B_1 B_1'$ and $A_1 B_1 B_2'$.

An initial time to event, $T_{1j}$, was generated for all subjects from the exponential(0.8) distribution. The subjects with $T^R_j < T_{1j}$ were labelled as subjects who went through the second randomization. For patients with X = 1, the times to death from the second randomization were drawn from exponential distributions with means 1 for the sequence of treatments $A_1 B_1$, 0.3 for the sequence $A_1 B_2$, 1 for the sequence $A_1 B_1'$ and 0.8 for the $A_1 B_2'$ treatments. For the patients with X = 0, a different set of exponential distributions was used, namely exponential distributions with means 0.2 for the $A_1 B_1$, 0.1 for the $A_1 B_2$, 0.8 for the $A_1 B_1'$ and 1 for the $A_1 B_2'$ sequences. We then defined the overall survival time for the subjects who went through the second randomization as the time to response plus the appropriate time to death and, for the subjects with $T^R_j > T_{1j}$, the overall survival time was simply chosen as $T_j = T_{1j}$. The times to censoring $C_j$ were generated from the exponential(1) distribution, which resulted in 21% censoring for pr(R = 1) = 0.4 and 29% censoring for pr(R = 1) = 0.7. Event type indicators were generated as $\epsilon_j \sim \text{Bernoulli}(0.5) + 1$, and $\Delta_j$ was set equal to $\epsilon_j$ if $C_j > T_j$ and 0 otherwise. The simulation results from this setting for n = 300 can be seen in Table 3. The results with n = 700 are again given in Web appendix A.

Table 3. Simulation results from the setting with covariate-dependent allocation, n = 300: true cumulative incidence (TCI), mean of estimates (EST), empirical standard deviation (ESD), mean of estimated standard errors (EstSE) and coverage rate of 95% confidence intervals (COV) for CI, WCI, WCI2, TWCI and TWCI2 at several time points t and response probabilities pr(R), for the regimes $A_1 B_1 B_1'$ and $A_1 B_1 B_2'$.

Under this setting the naive estimator again produces biased estimates for both regimes, whereas the weighted estimators stay unbiased and have better coverage rates. The coverage rates of the proposed estimators improve with increasing sample size, unlike those of the naive estimator.

6. Data analysis

Neuroblastoma is a type of cancer that starts in certain very early forms of nerve cells. It accounts for about 6% of all cancers in children and is the most common cancer in infants (less than 1 year old). Nearly 90% of cases are diagnosed by age 5 years. Therefore, finding effective treatment regimes for neuroblastoma can lead to a significant advancement in the welfare of these patients. We employ the proposed methods to investigate treatment regimes for neuroblastoma. The data from the neuroblastoma study were collected from a two-stage randomized trial conducted to assess the long-term effects of autologous bone marrow transplantation (ABMT) or chemotherapy, and subsequent treatment with 13-cis-retinoic acid (cis-RA), on children with high-risk neuroblastoma (Matthay et al., 2009).

Fig. 1. Different CIF estimators for the four regimes (naive CI; fixed-weight WCI; estimated fixed-weight WCI2): (a) ABMT cis-RA; (b) ABMT no RA; (c) chemotherapy cis-RA; (d) chemotherapy no RA. (Panels plot the CIF against time in years.)

A total of 379 patients who received the same induction chemotherapy were first randomized to consolidation with myeloablative chemotherapy, total body irradiation and ABMT (n = 189) versus three cycles of intensive chemotherapy (n = 190). The response for this trial was defined as being free of progressive disorder after the consolidation treatment and willing to be further randomized. Responders were randomized so that at the second stage 50 ABMT and 52 chemotherapy responders received cis-RA, and 48 ABMT and 53 chemotherapy responders were randomized to no further therapy (no RA). Non-responders were not randomized at the second stage. Therefore, there are four regimes in this study:

(a) ABMT cis-RA, denoting the treatment policy that a subject started with ABMT, and then received cis-RA if the subject responded and no further treatment if the subject did not respond;
(b) ABMT no RA, corresponding to the decision rule of starting with ABMT, and receiving no RA if response, and no therapy if no response;
(c) chemotherapy cis-RA;
(d) chemotherapy no RA.

Fig. 2. Fixed-weight CIF estimator for the four regimes: ABMT cis-RA, ABMT no RA, chemotherapy cis-RA and chemotherapy no RA. (The CIF is plotted against time in years.)

Regimes (c) and (d) are similar to regimes (a) and (b) except that patients started with chemotherapy in the first stage. The data involve two end points: disease progression and death. A total of 269 subjects developed progressive disorder, 134 of them before the second randomization, and 23 deaths occurred, 22 of them before the second randomization. When time to progression is of concern, the event time is competing risk censored by death. Hence we adopt the CIF to describe the cumulative risk of progression in the presence of death.

A complication of the study design is that the response status is related to disease progression, which is also our end point of interest. However, our proposed fixed-weight estimators are still applicable to this data set. For any treatment strategy, we are evaluating the CIF based on a mixture of responders and non-responders. Imagine that we assign all 379 eligible patients to the first treatment regime, where those subjects who do not have progressive disorder after the first consolidation treatment with ABMT would further receive cis-RA. In the end, when we evaluate the long-term effect of this treatment strategy on preventing disease progression, we shall use the data from all the responders and non-responders to estimate the CIF. Hence the key to an appropriate analysis is to weight the responders and non-responders who are consistent with this specific regime in this two-stage trial so as to reflect properly the mixture of responders and non-responders that there would be if only one treatment regime were involved.

Therefore, for this neuroblastoma study, we calculated the naive CIF estimator, the CIF estimate with fixed weights and the CIF estimate with estimated fixed weights for the time to progression under each of these four regimes. The results are shown in Fig. 1. It can be seen that the naive CIF estimator produces slightly higher estimates of the CIF, and the two weighted estimators produce very similar results. The naive estimates were constructed on the basis of the sample with all the non-responders and only about half of the responders.

The lack of a large difference between the naive and the weighted estimates suggests that those who did not develop progressive disorder after the first treatment may not do much better, in terms of long-term disease progression, than those who developed progressive disorder in the first stage or those who refused to be further randomized.

Fig. 2 shows the CIF estimates (fixed weights) of disease progression over time across all treatment regimes. If disease progression is the main end point of interest, we might select ABMT cis-RA as the optimal regime, since it has the lowest estimated cumulative incidence of disease progression over time. For example, the CIF estimates (with 95% Wald confidence intervals in parentheses) for the four regimes at 2 years are 0.47 (0.39, 0.56) for ABMT cis-RA, 0.50 (0.42, 0.59) for ABMT no RA, 0.62 (0.54, 0.71) for chemotherapy cis-RA and 0.59 (0.51, 0.68) for chemotherapy no RA. The same estimates at 5 years are 0.59 (0.51, 0.68), 0.65 (0.57, 0.74), 0.74 (0.66, 0.82) and 0.79 (0.71, 0.86) respectively. Since these confidence intervals overlap substantially, to take a closer look, the forest plots in Fig. 3 were produced for the differences in the CIFs at years 2 and 5 by using the derivations that were given in Section 4.1. For example, since the confidence interval for the difference between the ABMT cis-RA CIF and the chemotherapy no RA CIF at year 5 does not contain zero, on the basis of the 5-year CIFs the ABMT cis-RA regime can be preferred over the chemotherapy no RA regime.

Even though CIFs are commonly used in the literature and are preferred by practitioners because of their appealing probability interpretations, one must use caution when generalizing CIF-based results from a study to a future population, since competing events can affect the CIF of the primary event through their influence on the overall survival function. To have a more comprehensive comparison of disease progression over time among those regimes, we also present in Fig. 4 the estimated cumulative CSH of progression over time. Note that the estimation of the cumulative CSH is simply a by-product of the CIF estimators. The methods that were proposed in Section 3 can be readily adapted to estimate the cumulative CSH, resulting in the naive, fixed-weight and estimated-weight estimates of the cumulative CSH functions for each regime.

Fig. 3. Confidence intervals for the differences between the CIFs at years (a) 2 and (b) 5, for all six pairwise comparisons of the four regimes (ABMT cis-RA, ABMT no RA, chemotherapy cis-RA and chemotherapy no RA).

Fig. 4. Different cumulative CSH estimators for the four regimes (naive; fixed weight; estimated fixed weight): (a) ABMT cis-RA; (b) ABMT no RA; (c) chemotherapy cis-RA; (d) chemotherapy no RA. (Panels plot the cumulative CSH against time in years.)

The results for the CSH are similar to those from the CIFs, which further suggests that ABMT cis-RA is the optimal regime for preventing disease progression.

7. Discussion

In this paper we proposed and compared alternative non-parametric estimators of the CIFs by using inverse probability weighting, and we constructed inference procedures for the proposed estimators. The weighted methods are easy to implement, with explicit variance formulae, and produce unbiased estimates of the CIFs for DTRs.

In our proposed framework, we assumed that censoring times are independent of event times. This assumption can be relaxed to allow the censoring distribution K(t) to be independent of event times conditional on covariates, in which case K(t) can be estimated by fitting a proportional hazards model or other suitable models. The results that were presented in the paper remain valid as long as the model assumed for censoring is correct. Moreover, the proposed methods can also be used for observational studies that involve sequential treatments, though more care is needed in estimating the second-stage randomization probabilities.

For ease of illustration, we focus on a two-stage design, though our methods can easily be extended to more than two stages. For example, consider a simple three-stage design with only two choices for responders and two choices for non-responders at each stage. Let the treatment options at the first stage be $A_1$ and $A_2$. Let $B_1$ and $B_2$ be the treatment options at the second stage for responders and $B_1'$ and $B_2'$ be the options for non-responders.

Finally, let $C_1$ and $C_2$ be the treatment options at the third stage for patients who respond to the second-stage treatment and $C_1'$ and $C_2'$ be the options for non-responders at the second stage. There will be 32 DTRs, which can be denoted by $A_m B_k B_l' C_v C_y'$, $m, k, l, v, y = 1, 2$. Let the response status at the first stage be $R_1$ and the response status at the second stage be $R_2$. Then the proposed estimators can be modified by adding another layer of inverse probability weights. For example, the weights in the fixed-weights estimator for the three-stage design will be
$$Q_{A_m B_k B_l' C_v C_y',j} = R_{1j}\,\frac{I(Z_{1j}=k)}{\pi_{B_k}}\left\{R_{2j}\,\frac{I(Z_{3j}=v)}{\pi_{C_v}} + (1-R_{2j})\,\frac{I(Z_{4j}=y)}{\pi_{C_y'}}\right\} + (1-R_{1j})\,\frac{I(Z_{2j}=l)}{\pi_{B_l'}}\left\{R_{2j}\,\frac{I(Z_{3j}=v)}{\pi_{C_v}} + (1-R_{2j})\,\frac{I(Z_{4j}=y)}{\pi_{C_y'}}\right\},$$
where $Z_{1j}$ is the second treatment assignment indicator for responders at the first stage ($Z_{1j}=k$ if subject $j$ is assigned to $B_k$), $Z_{2j}$ is the second treatment assignment indicator for non-responders at the first stage, $Z_{3j}$ and $Z_{4j}$ are the third treatment assignment indicators for responders and non-responders to the second-stage treatments, $\pi_{B_k} = \mathrm{pr}(Z_{1j}=k \mid R_{1j}=1)$, $\pi_{B_l'} = \mathrm{pr}(Z_{2j}=l \mid R_{1j}=0)$, $\pi_{C_v} = \mathrm{pr}(Z_{3j}=v \mid R_{2j}=1)$ and $\pi_{C_y'} = \mathrm{pr}(Z_{4j}=y \mid R_{2j}=0)$. It is, however, worth noting that three-stage trials are rare in practice, as they often involve large numbers of potential treatment policies. There are only a few trials involving more than two stages. Such trials usually have simplified arms and naturally require large samples (for example, the Sequenced treatment alternatives to relieve depression trial started with 2876 patients receiving the same treatment at the first stage).

Furthermore, when practitioners deal with survival data, it is always of interest to model or control for covariate effects. For example, Moodie et al. (2014) developed a marginal structural model for estimating CSHs that takes into account the effects of time-varying variables, and their model yields CIF estimates for the discrete case. To the best of our knowledge, there has been no work directly modelling covariate effects on the CIFs of specific regimes in a two-stage randomized trial. These may be future research topics.

Acknowledgements

We are grateful to the editors and referees for their constructive comments, which helped us to improve the manuscript considerably. This project was partially supported by grant DMS from the National Science Foundation to Cheng.

Appendix A: Variance estimation

To estimate the variance of $\hat F^w_{1,A_m B_k B_l'}$, the following counting process formulation was used. For subject $j$, define the weighted cause-specific event processes $N^w_{1j}(s) = I(V_j \le s, \Delta_j = 1)\,Q_{A_m B_k B_l',j}$ and $N^w_{2j}(s) = I(V_j \le s, \Delta_j = 2)\,Q_{A_m B_k B_l',j}$, and the overall event process $N^w_j(s) = N^w_{1j}(s) + N^w_{2j}(s)$. Also define the weighted at-risk process $Y^w_j(s) = I(V_j \ge s)\,Q_{A_m B_k B_l',j}$. Summing over all subjects, we have $Y^w_\cdot(s) = \sum_{j=1}^n Y^w_j(s)$ and $N^w_{1\cdot}(s) = \sum_{j=1}^n N^w_{1j}(s)$; similarly $N^w_{2\cdot}(s)$, and $N^w_\cdot(s) = N^w_{1\cdot}(s) + N^w_{2\cdot}(s)$. Let $\Lambda_1(s) = \int_0^s \lambda_1(u)\,du$, where $\lambda_k(u) = \lim_{h\to 0} \mathrm{pr}(u \le V < u + h, \Delta = k \mid V \ge u)/h$ is the cause-specific hazard function for event $k$, and let $\Lambda(s) = \int_0^s \lambda(u)\,du$, where $\lambda(u) = \lim_{h\to 0} \mathrm{pr}(u \le V < u + h \mid V \ge u)/h$ is the all-cause hazard. Let $M^w_j(s) = N^w_j(s) - \int_0^s Y^w_j(u)\,d\Lambda(u)$. One can show that the $M^w_j$ are martingales and so is $M^w_\cdot(s) = N^w_\cdot(s) - \int_0^s Y^w_\cdot(u)\,d\Lambda(u)$. Similarly, $M^w_{1\cdot}(s) = N^w_{1\cdot}(s) - \int_0^s Y^w_\cdot(u)\,d\Lambda_1(u)$ is also a martingale.
The weighted survival and the WCIF estimator can be represented as
$$\hat S^w_{A_m B_k B_l'}(t) = \prod_{s \le t}\left\{1 - \frac{\Delta N^w_\cdot(s)}{Y^w_\cdot(s)}\right\}$$


More information

Multistate models in survival and event history analysis

Multistate models in survival and event history analysis Multistate models in survival and event history analysis Dorota M. Dabrowska UCLA November 8, 2011 Research supported by the grant R01 AI067943 from NIAID. The content is solely the responsibility of the

More information

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing

More information

A simulation study for comparing testing statistics in response-adaptive randomization

A simulation study for comparing testing statistics in response-adaptive randomization RESEARCH ARTICLE Open Access A simulation study for comparing testing statistics in response-adaptive randomization Xuemin Gu 1, J Jack Lee 2* Abstract Background: Response-adaptive randomizations are

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Estimation for Modified Data

Estimation for Modified Data Definition. Estimation for Modified Data 1. Empirical distribution for complete individual data (section 11.) An observation X is truncated from below ( left truncated) at d if when it is at or below d

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Part III Measures of Classification Accuracy for the Prediction of Survival Times Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples

More information

POWER AND SAMPLE SIZE DETERMINATIONS IN DYNAMIC RISK PREDICTION. by Zhaowen Sun M.S., University of Pittsburgh, 2012

POWER AND SAMPLE SIZE DETERMINATIONS IN DYNAMIC RISK PREDICTION. by Zhaowen Sun M.S., University of Pittsburgh, 2012 POWER AND SAMPLE SIZE DETERMINATIONS IN DYNAMIC RISK PREDICTION by Zhaowen Sun M.S., University of Pittsburgh, 2012 B.S.N., Wuhan University, China, 2010 Submitted to the Graduate Faculty of the Graduate

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan

More information

Nonparametric Model Construction

Nonparametric Model Construction Nonparametric Model Construction Chapters 4 and 12 Stat 477 - Loss Models Chapters 4 and 12 (Stat 477) Nonparametric Model Construction Brian Hartman - BYU 1 / 28 Types of data Types of data For non-life

More information

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Sarwar I Mozumder 1, Mark J Rutherford 1, Paul C Lambert 1,2 1 Biostatistics

More information

A multi-state model for the prognosis of non-mild acute pancreatitis

A multi-state model for the prognosis of non-mild acute pancreatitis A multi-state model for the prognosis of non-mild acute pancreatitis Lore Zumeta Olaskoaga 1, Felix Zubia Olaskoaga 2, Guadalupe Gómez Melis 1 1 Universitat Politècnica de Catalunya 2 Intensive Care Unit,

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes

A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes Thomas A. Murray, (tamurray@mdanderson.org), Ying Yuan, (yyuan@mdanderson.org), and Peter F. Thall (rex@mdanderson.org) Department

More information

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements [Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers

More information

Meei Pyng Ng 1 and Ray Watson 1

Meei Pyng Ng 1 and Ray Watson 1 Aust N Z J Stat 444), 2002, 467 478 DEALING WITH TIES IN FAILURE TIME DATA Meei Pyng Ng 1 and Ray Watson 1 University of Melbourne Summary In dealing with ties in failure time data the mechanism by which

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

Analytical Bootstrap Methods for Censored Data

Analytical Bootstrap Methods for Censored Data JOURNAL OF APPLIED MATHEMATICS AND DECISION SCIENCES, 6(2, 129 141 Copyright c 2002, Lawrence Erlbaum Associates, Inc. Analytical Bootstrap Methods for Censored Data ALAN D. HUTSON Division of Biostatistics,

More information

Estimating Optimal Dynamic Treatment Regimes from Clustered Data

Estimating Optimal Dynamic Treatment Regimes from Clustered Data Estimating Optimal Dynamic Treatment Regimes from Clustered Data Bibhas Chakraborty Department of Biostatistics, Columbia University bc2425@columbia.edu Society for Clinical Trials Annual Meetings Boston,

More information

Exercises. (a) Prove that m(t) =

Exercises. (a) Prove that m(t) = Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects

Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University September 28,

More information

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation

More information

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018 Sample Size Re-estimation in Clinical Trials: Dealing with those unknowns Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj University of Kyoto,

More information

JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA

JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA by Martin P. Houze B. Sc. University of Lyon, 2000 M. A.

More information

Residuals and model diagnostics

Residuals and model diagnostics Residuals and model diagnostics Patrick Breheny November 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/42 Introduction Residuals Many assumptions go into regression models, and the Cox proportional

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

More information

Sample Size Determination

Sample Size Determination Sample Size Determination 018 The number of subjects in a clinical study should always be large enough to provide a reliable answer to the question(s addressed. The sample size is usually determined by

More information

Survival Prediction Under Dependent Censoring: A Copula-based Approach

Survival Prediction Under Dependent Censoring: A Copula-based Approach Survival Prediction Under Dependent Censoring: A Copula-based Approach Yi-Hau Chen Institute of Statistical Science, Academia Sinica 2013 AMMS, National Sun Yat-Sen University December 7 2013 Joint work

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Multi-state models: prediction

Multi-state models: prediction Department of Medical Statistics and Bioinformatics Leiden University Medical Center Course on advanced survival analysis, Copenhagen Outline Prediction Theory Aalen-Johansen Computational aspects Applications

More information

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian

More information

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Regression analysis of interval censored competing risk data using a pseudo-value approach

Regression analysis of interval censored competing risk data using a pseudo-value approach Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 555 562 http://dx.doi.org/10.5351/csam.2016.23.6.555 Print ISSN 2287-7843 / Online ISSN 2383-4757 Regression analysis of interval

More information

Package Rsurrogate. October 20, 2016

Package Rsurrogate. October 20, 2016 Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

Constrained estimation for binary and survival data

Constrained estimation for binary and survival data Constrained estimation for binary and survival data Jeremy M. G. Taylor Yong Seok Park John D. Kalbfleisch Biostatistics, University of Michigan May, 2010 () Constrained estimation May, 2010 1 / 43 Outline

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

Randomization-Based Inference With Complex Data Need Not Be Complex!

Randomization-Based Inference With Complex Data Need Not Be Complex! Randomization-Based Inference With Complex Data Need Not Be Complex! JITAIs JITAIs Susan Murphy 07.18.17 HeartSteps JITAI JITAIs Sequential Decision Making Use data to inform science and construct decision

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Set-valued dynamic treatment regimes for competing outcomes

Set-valued dynamic treatment regimes for competing outcomes Set-valued dynamic treatment regimes for competing outcomes Eric B. Laber Department of Statistics, North Carolina State University JSM, Montreal, QC, August 5, 2013 Acknowledgments Zhiqiang Tan Jamie

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

Targeted Group Sequential Adaptive Designs

Targeted Group Sequential Adaptive Designs Targeted Group Sequential Adaptive Designs Mark van der Laan Department of Biostatistics, University of California, Berkeley School of Public Health Liver Forum, May 10, 2017 Targeted Group Sequential

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Asymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics.

Asymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics. Asymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics. Dragi Anevski Mathematical Sciences und University November 25, 21 1 Asymptotic distributions for statistical

More information

7 Sensitivity Analysis

7 Sensitivity Analysis 7 Sensitivity Analysis A recurrent theme underlying methodology for analysis in the presence of missing data is the need to make assumptions that cannot be verified based on the observed data. If the assumption

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation

More information

PASS Sample Size Software. Poisson Regression

PASS Sample Size Software. Poisson Regression Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Linear rank statistics

Linear rank statistics Linear rank statistics Comparison of two groups. Consider the failure time T ij of j-th subject in the i-th group for i = 1 or ; the first group is often called control, and the second treatment. Let n

More information