Concordance for prognostic models with competing risks

Size: px
Start display at page:

Download "Concordance for prognostic models with competing risks"

Transcription

1 Concordance for prognostic models with competing risks Marcel Wolbers Michael T. Koller Jacquline C. M. Witteman Thomas A. Gerds Research Report 13/3 Department of Biostatistics University of Copenhagen

2 Concordance for prognostic models with competing risks MARCEL WOLBERS 1, MICHAEL T. KOLLER 2, JACQUELINE C.M. WITTEMAN 3, and THOMAS A. GERDS 4 1 Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Programme, Ho Chi Minh City, Viet Nam, and Centre for Tropical Medicine, Nuffield Department of Medicine, University of Oxford, Oxford, UK 2 Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, 431 Basel, Switzerland 3 Erasmus Medical Center, 315 Rotterdam, The Netherlands 4 Department of Biostatistics, University of Copenhagen, 114 Copenhagen K, Denmark February 19, 213 Abstract The concordance probability is a widely used measure to assess discrimination of prognostic models with binary and survival endpoints. In this article we discuss and formalize an earlier definition of the concordance probability for regression models with competing risks endpoints (Wolbers et al, Epidemiology 29;2: ). For right censored data, we investigate inverse probability of censoring weighted (IPCW) estimates of a truncated concordance index. The estimates are based on a working model for the censoring distribution. The simplest model assumes that the censoring distribution is independent of the predictor variables. If the working model is correctly specified then the IPCW estimate consistently estimates the concordance probability. The small sample properties of the estimates are assessed in a simulation study also against misspecification of the working model. We further illustrate the methods by computing the concordance probability for a prognostic model of coronary heart disease (CHD) events in the presence of the competing risk of non-chd death. C index; Competing risks; Concordance probability; Coronary heart disease; Prognostic models. 1

3 1 Introduction Clinical decision-making and cost-effectiveness analyses often rely on prognostic models that quantify a subject s absolute risk of a disease event of interest over time. Traditionally, these models are based on survival analysis and a large body of research discusses the development and validation of prognostic models for time-to-event data [9, 22, 24]. However, study populations of common diseases increasingly consist of elderly individuals with varying degrees of co-morbidity who are likely to experience one of several disease endpoints other than the endpoint of main interest [14]. As an example, prediction of coronary heart disease (CHD) events in elderly subjects is complicated by the fact that subjects may die from other causes prior to the observation of the disease event of interest [25, 13]. Another example is the quantification of the benefit from implantable cardioverter-defibrillators (ICDs) in patients at risk of sudden cardiac death. Patients with an ICD profit from implantation if the device appropriately treats a lifethreatening cardiac arrhythmia event (e.g. ventricular fibrillation). However, any benefit from ICD implantation is abolished if a patient dies prior to appropriate ICD therapy. It is thus of interest whether death without prior appropriate ICD therapy occurs in a relevant proportion of patients and how such patients could be identified prospectively [15]. It is well known that the naive application of standard survival analysis leads to bias and risk over-estimation if competing risks are present [8, 18]. Prognostic models that take competing risk events into account are thus of growing importance. In many applications, including the examples mentioned above, the parameter of main interest for medical decision-making is the absolute risk of the event of interest as quantified by its (covariate-dependent) cumulative incidence function [4]. Thus, regression models are particularly attractive when they provide subject specific estimates of the absolute risks based on a set of covariates [7]. Several measures for quantifying the accuracy of prognostic models have been adapted from the standard survival setting with only one failure cause to competing risks. Measures include prediction error curves [21], time-dependent sensitivity, specificity and AUC [19, 27], and reclassification methods [25, 13]. Commonly these measures assess prognoses for the event status at a fixed followup duration t. In contrast, the concordance index assesses prognoses for the order of the event times. For survival data, the concordance index [1] is the most widely reported measure of discrimination and we have previously presented a simple adaptation of the concordance index to the competing risks setting [25]. In the present paper, we formally define a cause-specific concordance index in the presence of competing risks and derive estimators suitable for right censored data. Notably, the proposed concordance index depends only on the cumulative incidence function of the event of interest. To illustrate the definition, we describe the behaviour of the concordance index for data generated according to a proportional cause-specific hazards model as well as for data from a direct model for the cumulative incidence function. 2

4 We then study estimation of a truncated concordance index in the presence of right-censoring. We introduce an inverse probability of censoring weighted (IPCW) estimator and demonstrate its consistency if the working model for the censoring distribution is correctly specified. The empirical bias and mean-square error are examined in a simulation study. Finally, we illustrate the methods for an example of coronary risk prediction in older woman using data from the Rotterdam Study [12]. 2 Definition of concordance 2.1 Assessing scores in uncensored data Competing risks data without censoring are given by pairs (T, D) of data where T is the time to the event and D is the event type. For the purpose of discussing the definition and estimation of the cause-specific concordance index it is sufficient to assume that there are only two competing events. Thus, for simplicity of presentation we let D = 1 denote the event of interest and D = 2 the occurrence of any competing event. In applications it may be important to model all competing events separately. The concordance index is defined for any prognostic score M(X) which can be used to order subjects with respect to the risk of an event of type 1. For example M(X) could be the linear predictor of a regression model for the event of interest derived on a training data set. We assume that higher values of M(X) are associated with higher risk of the event 1. The concordance probability assesses the discriminatory power of the prognostic score. Intuitively, M(X) discriminates well if it assigns high risks to individuals experiencing the event of interest early, lower risks to individuals experiencing the event of interest late, and negligible risk to those who never experience the event of interest. The latter occurs when the individual experiences a competing event [25]. To formally define the concordance index for uncensored data we assume an independent test data set consisting of i.i.d. realizations of (X i, T i, D i ) (i = 1,..., n) from the joint distribution of predictor variables and competing risks data. We then define C 1 = P ( M(X i ) > M(X j ) D i = 1 and (T i < T j or D j = 2)) (1) for any randomly chosen pair of subjects i, j from the joint distribution of predictor variables and competing risks data. According to this definition, we count two types of pairs of individuals for which the ranking of the prognostic score can be concordant or discordant. The first type are pairs where individual i experiences the event of interest while individual j is still event free. The second type are pairs where individual i experiences the event of interest while individual j experiences the competing event at any time. For both types it is intuitive to call the pair concordant when the prognostic score is higher for individual i. Pairs of individuals which both experience the competing event 3

5 are not evaluable for C 1 but are counted for the cause-specific concordance C 2 for the competing event which is defined analogously. It is important to discuss modifications of definition (1) for tied data [see 26]. For example, it may happen that M(Xi ) = M(X j ). Depending on the application, it may then be sensible to count such pairs with a weight of 1/2: C 1 = P ( M(X i ) > M(X j ) D i = 1 and (T i < T j or D j = 2)) P ( M(X i ) = M(X j ) D i = 1 and (T i < T j or D j = 2)). To simplify notation, we used definition (1) as the basis for our further developments. 2.2 Assessing prediction models in right censored data We now generalize the concordance index defined in Section 2.1 in two ways. First, we replace the simple prognostic score M(X) by a more general prediction model M(t, X) for the risk of event 1 until time t. In this way we include regression models for the cumulative incidence of the event of interest at time t, i.e. estimates of F 1 (t X) = P (T t, D = 1 X), which can be obtained by combining cause-specific hazards models, by fitting a Fine and Gray regression model or by direct binomial regression [3, 2]. Second, to include the typical application where individuals have a limited duration of follow-up (i.e. in the presence of right censoring) we define a truncated version of the concordance index. Following [23] and [6] we define C 1 (t) := P (M(t, X i ) > M(t, X j ) D i = 1 and T i t and (T i < T j or D j = 2)). (2) The parameter C 1 (t) can be interpreted as the ability of the model to correctly rank events of interest up to time t and to distinguish them from competing events. The truncation is necessary to enable estimation of C 1 (t) from right censored data. Note that C 1 = C 1 ( ) in the special case where the order of a pair of predictions does not depend on time, i.e. M(t, X i ) = M(s, X i ) for all s, t. However, C 1 ( ) is only identifiable when the cumulative incidence F 1 ( X) can be identified on [, ) for all X. But in most real world applications even the marginal Aalen-Johansen estimator for F 1 (which is independent of X) is only defined on [, t max ) where t max is the maximal follow-up length of the study. Indeed, the truncated concordance (2) can be written as a functional of F 1 and of the marginal distribution of the predictor values of a pair of individuals (X i, X j ). To see this, we introduce notation for the order of the predicted risks for a pair of individuals, Q ij (t) = I{M(t, X i ) > M(t, X j )}. 4

6 We also note that I({s < T j or D j = 2}) = 1 I({T j s, D j = 1}). Thus, P{D i = 1 and T i t and (T i < T j or D j = 2)} t = E Xi,X j (1 F 1 (s X j ))df 1 (s X i ) (3) and we can rewrite the truncated concordance index (2) as C 1 (t) = E X i,x j (Q ij (t) t (1 F 1(s X j ))df 1 (s X i )) E Xi,X j ( t (1 F. (4) 1(s X j ))df 1 (s X i )) Of note, in the form given in equation (4) the cause-specific concordance for event 1 does not depend on the cumulative incidence F 2 of the competing event. This feature is not obvious in formula (2) but desirable when the aim is to assess the discriminative ability of a model for F 1. 3 Illustration for proportional cause-specific hazards models and direct models for the cumulative incidence Cause-specific Cox regression models and direct models for the cumulative incidence are the most frequently used regression models for competing risks [18]. In this section, we illustrate the properties of the concordance probability in setting where the true data-generating mechanism is either a set of cause-specific proportional hazards models or a Fine-Gray model. In both situations, we asses the concordance of a time-independent prognostic score. Specifically, we simulate a prognostic index M(X) = X by drawing values from the standard normal distribution. Using a Monte Carlo approach we then approximated C 1 = C 1 ( ) based on simulated data sets, each consisting of n = 1, uncensored i.i.d. realizations drawn from the respective data-generating mechanism. 3.1 Proportional cause-specific hazards models Data were generated under proportional cause-specific hazards models with constant baseline hazards: Event 1: λ 1 (t X) = λ 1 exp(β 1 X) Event 2: λ 2 (t X) = λ 2 exp(β 2 X) We set λ 1 = 1 and considered three values for λ 2 = {,.5, 2}. Note that λ 2 = corresponds to no competing risks. Four different scenarios for β 2 were considered: (1) β 2 =, (2) β 2 = β 1, (3) β 2 = β 1, (4) β 2 = 2β 1. For all scenarios, we calculated the concordance probability C 1 as depending on the strength of association of the covariate with the cause-specific hazard of 5

7 the event of interest by varying β 1 from to 1. The results are displayed in Figure 1. Figure 1 shows only weak dependence of the concordance probability on the cause-specific hazard of the competing risk if the latter is not affected by the covariate (black solid lines). If the covariate is associated with an increased hazard of the event of interest but a decreased hazard of the competing event, the discrimination ability improves (black dashed lines lines). In contrast, the discrimination ability is hampered if covariates affect both cause-specific hazards with regression coefficients of the same sign (gray lines). This occurs in particular if there is a strong association with the competing risk (gray dashed lines) or if the baseline competing hazard is high (right panel). These results can be explained by the fact that the overall effect of a covariate on the cumulative incidence function of the event of interest depends on the size of the cause-specific baseline hazard of the event of interest and the competing event as well as on the size of both cause-specific hazard ratios [1, 14]. In particularly, a covariate that is positively associated with the cause-specific hazard of the event of interest can simultaneously be negatively associated with the corresponding cumulative incidence function of that event if the cause-specific hazard of the competing event is larger and shows a stronger positive association with the covariate. This explains the observed concordance probabilities <.5. Importantly, these findings illustrate that a prognostic model for the absolute risk of the event of interest with good discrimination requires predictors which are only weakly or, even better, reversely associated with the cause-specific hazard of the competing event. 3.2 Direct models for the cumulative incidence We generated data from the following Fine-Gray model [3]: P (T t, D = 1 X) = 1 [1 p(1 exp( t))] exp(βx). Note that when X =, this is a unit exponential mixture with mass 1 p at, such that the probability of an event of type 1 in the interval [, ) is p. We varied the value of p between values.1,.5, and.9. As before we varied the strength of association β from to 1. Results are displayed in Figure 2. In addition to the expected positive association between β and the concordance probability there is also a weak dependence on p. 4 Estimation of concordance in the presence of right censoring 4.1 Right censored data To indicate the end of follow up for subject i we introduce a subject specific censoring time C i. Thus, we observe only (X i, T i, D i, i ) (i = 1... n) where 6

8 T i = min(t i, C i ), i = I{T i C i } and D i = i D i. We also use the following notation: Ñ 1 i (t) = I{ T i t, D i = 1} N 1 i (t) = I{T i t, D i = 1} Ñ 2 i (t) = I{ T i t, D i = 2} N 2 i (t) = I{T i t, D i = 2} Ã ij = I{ T i < T j } A ij = I{T i < T j } B ij = I{ T i T j and D j = 2} B ij = I{T i T j and D j = 2} The event-free survival probability conditional on the covariate X i is then given by S(t X i ) = 1 E(N 1 i (t) X i ) E(N 2 i (t) X i ) = 1 F 1 (t X i ) F 2 (t X i ). We allow the censoring distribution to depend on the covariates X i but assume throughout that C i is conditionally independent of (T i, D i ) given X i. This implies P( T i > t X i ) = G(t X i ) S(t X i ), (5) where G(t X i ) = P (C i > t X i ) is the conditional probability of being uncensored at time t. Noting G(t X i ) = P (C i t X i ) we also have for k = 1, 2: E(Ñ k i (t) X i ) = E( i N k i (t) X i ) = t 4.2 Ignoring non-evaluable pairs E(C i s X i ) E(dN k i (s) X i ) = t An asymptotically biased estimate of C 1 (t) is given by Ĉ 1,naive (t) = n n i=1 j=1 (Ãij + B ij )Q ij (t)ñ i 1(t) n n i=1 j=1 (Ãij + B ij )Ñ i 1(t) G(s X i ) df k (s X i ). (6) where Q ij (t) = I{M(t, X i ) > M(t, X j )} is an indicator for the order of predicted risks at time t. This can be interpreted as the proportion of definitely concordant pairs amongst evaluable pairs, i.e. pairs for which one individual experiences the event of interest and concordance can be decided based on the observed (potentially censored) data. The evaluable pairs are all pairs of the form (( D i = 1, T i, M(t, X i )), ( D j, T j, M(t, X j ))) with T i t and T j > T i or D j = 2. A good prognostic model should assign a lower risk to an individual that experiences an event (of any type) late ( T j > T i ) or never experiences the event of interest ( D j = 2). Thus, concordant pairs are those evaluable pairs with M(t, X i ) > M(t, X j ), i.e. Q ij (t) = 1. Pairs for which both individuals are either censored or experience the competing event as well as pairs for which 7

9 the first individual experiences the event of interest and the second is censored before that time point are considered non-evaluable because it is impossible to decide whether the predictions for such a pair are concordant or not. This simple estimate evaluated at the time t max of the maximum follow-up duration has been previously defined in (author?) [25]. While simple, a major problem of this estimator is that by ignoring non-evaluable pairs without any correction, bias is introduced. It is well known that Harrel s C depends on the censoring distribution [23, 6] and Ĉ1,naive(t) shares this limitation. 4.3 IPCW estimate We use a working model to derive an estimate Ĝ for G, the conditional probability of being uncensored given the predictors. For example, we could assume a Cox regression model: G(t X i ) = exp{ t exp(γ T X i )Γ (s)ds} where Γ is an unknown baseline hazard function and γ a vector of regression coefficients that quantify the effect of the predictors on the censoring hazard. Alternatively, we could assume that the censoring is independent of the predictors and estimate G with the Kaplan-Meier estimate for the censoring times. Based on Ĝ we define weights Ŵ ij,1 = Ĝ( T i X i )Ĝ( T i X j ) and Ŵij,2 = Ĝ( T i X i )Ĝ( T j X j ). and derive the following inverse probability of censoring weighted (IPCW) estimate of C 1 (t): n n i=1 j=1 (Ãij Ŵ 1 ij,1 + B ) ij Ŵ 1 ij,2 Q ij (t)ñ i 1(t) Ĉ 1 (t) = n n i=1 j=1 (Ãij Ŵ 1 ij,1 + B (7) ij Ŵij,2) 1 Ñ 1 i (t) Lemma 1. Under conditionally independent censoring, if the working model is correctly specified in the sense that Ĝ consistently estimates G, then Ĉ1(t) consistently estimates C 1 (t) for all t with G(t x) > ɛ > uniformly over x. A proof of the lemma is given in the supplementary material. 5 Simulation study A simulation study was performed to assess bias and root mean square error (RMSE) of the proposed IPCW estimator of C 1 (t). As in Section 3.1, the competing risks data were simulated based on a standard normal covariate X and Cox proportional hazards models with constant cause-specific baseline hazards. Specifically, we chose either cause-specific hazards λ 1 (t) = 1, λ 2 (t) = 2 and regression coefficients β 1 = β 2 = 1 (scenario CR1) or λ 1 (t) = 1, λ 2 (t) =.5, 8

10 and β 1 = 2, β 2 = 1 (scenario CR2). The truncation time point t was chosen as either the median or the upper quartile q 3 of the time to event distribution. The true concordance C 1 (t) between X and failure cause 1 ranged from.62 to.86 for the different competing risks scenarios and truncation points (Table 1). We simulated both independent and dependent right-censoring for the data. Independent censoring was based on an exponential censoring distribution with the rate of the exponential distribution chosen such that (on average) 4% of events prior to t were censored. Dependent censoring was chosen according to an exponential distribution with rate λ cens exp(x) with λ cens chosen such that 3% of events prior to t were censored. Finally, we varied the sample size N between 25 and 1 observations per simulated data set. Three different estimators of C(t) were investigated: the naive estimator Ĉ 1,naive (t) ignoring non-evaluable pairs, an IPCW estimator Ĉ1,KM(t) based on a Kaplan-Meier estimate of the censoring distribution, and an IPCW estimator Ĉ 1,Cox (t) which estimated the censoring distribution with a Cox proportional hazards model depending on X. Results for all 16 simulated scenarios based on 1 simulations per scenario are displayed in Table 1. Lower empirical bias for the IPCW-based estimators and reduced bias for the Cox-model based IPCW estimator compared to the Kaplan-Meier based estimator in situations with dependent censoring are evident for almost all scenarios. The IPCW estimates also tended to have lower RMSE with substantial improvements for some scenarios. No consistent improvements in RMSE of the Cox-model based estimator over the Kaplan-Meier estimator were apparent in the situation of dependent censoring. 6 Application to coronary risk prediction Specialist medical societies recommend initiation of preventive treatment for coronary heart disease (CHD) based on a subjects predicted 1-year risk for CHD [17]. To accurately predict the absolute risk of CHD in older people, prognostic models for CHD need to account for the competing risk of non- CHD death [13]. In this section, we revisit the example of [25] on coronary risk prediction based on data of elderly women from the Rotterdam Study, a prospective, population-based cohort of elderly subjects living in a suburb area of Rotterdam, the Netherlands [12]. We analysed data from 1 years of follow-up of 4144 women aged between 55 and 9 years who were free of CHD at baseline. During that follow-up period, 389 women experienced a CHD event and 921 women died without prior CHD event. Only 41 women of those event-free had less than 1 years of follow-up. We randomly split the data set into a training data set (2763 women with 249 CHD events) and a validation data set (1381 women with 14 CHD events). Using the training set we estimated the parameters of a Fine-Gray regression model which included the traditional baseline risk factors for CHD: age, treatment for high blood pressure (yes versus no), systolic blood pressure (separate slopes depending on whether the subject was on blood pressure treatment 9

11 or not), diabetes mellitus, log-transformed total cholesterol to HDL cholesterol ratio, and smoking status (current versus never or former smoker). All these risk factors were associated with an increased CHD risk and, except for diabetes, all reached conventional significance (p <.5). We also fitted a second Fine- Gray model with age, the strongest predictor, as the only covariate. Finally we estimated two Cox regression models, one for the cause-specific hazard of CHD and one for the cause-specific hazard of non-chd death. In both Cox analyses the same factors were included as for the multivariable Fine-Gray model. Here all risk factors were significantly associated with an increased hazard of CHD. Also, the factors age, smoking and diabetes were significantly associated with an increased hazards of non-chd death whereas the log-transformed total cholesterol to HDL cholesterol ratio showed a significant inverse association. Treatment for hypertension and blood pressure were not significantly associated with the hazard of non-chd death. The two Cox analyses yielded cause-specific hazard predictions λ 1 (t x) and λ 2 (t x), respectively, which were subsequently combined into a prediction of the cumulative incidence function of cause j based on the formula: F j (t x) = 1 t { exp s } (λ 1 (u x) + λ 2 (u x))du λ j (s x)ds. Concordance estimates were obtained for these models in the validation set. The dependence of the censoring distribution on the covariates was investigated with a Cox regression model which yielded no trends and non-significant Wald tests for all variables. Thus, all IPCW estimates of concordance were based on the marginal Kaplan-Meier estimator for the censoring distribution from the validation set. The left panel of Figure 3 shows the discrimination ability of the two Fine- Gray models for varying time horizons between 1 and 1 years. The multivariable model shows higher discriminative ability compared to the model based on age alone. For both models, concordance estimates stabilize after about 2.5 years of follow-up and remain fairly stable though slightly decreasing. The decrease may occur because earlier events are easier to predict than later events. The multivariable cause-specific Cox models showed a similar trend but slightly lower c-index in the first years compared to the multivariable Fine-Gray model (curve not shown). The right panel of Figure 3 shows corresponding prediction error curves [21, 7] in the validation set which assesses the overall performance of the models (i.e. both discrimination and calibration) whereas the concordance assesses discrimination only. The upper half of Table 2 shows the estimated concordance for the cumulative incidences of CHD and non-chd death, respectively, after 1, 5 and 1 years, as were obtained with the single split into training and validation data. The multivariable models were comparable and showed superior discrimination compared to the the age only model for the prediction of CHD whereas improvements were much smaller for the prediction of non-chd death. This is 1

12 not surprising as these additional covariates are established CHD-specific risk factors which would not be expected to strongly affect non-chd death (except for other deaths related to the cardiovascular system). The lower half of Table 2 shows bootstrap-crossvalidation estimates [6] of concordance obtained by averaging the results of 1 splits into a bootstrap training set (size 4144, same as the original data) and a corresponding validation set (size n 1525). The estimated concordance for CHD at the 1 year horizon was higher as compared to the results of the single split for all models. One possible explanation for this result is that the models where trained in bootstrap samples which contains more information than the training sample used for the single split. Another possible explanation for the differences is that the single split analysis is highly dependent on how the data are split. 7 Discussion We have presented a formal definition of the concordance probability for prognostic models in the presence of competing risks. Like the concordance probability for survival or binary data, it provides a simple numeric measure of discrimination. To deal with right censored data we derived a consistent IPCW estimator of the truncated concordance probability. The fact that we estimate a truncated version of concordance rather than the unconstrained concordance probability should not be seen as a limitation of our approach. Indeed it is impossible to assess the performance of prognostic models beyond the maximum follow-up duration without strong and untestable assumptions. This is not a particularity of the competing risks situation and also known for survival data without competing risks [23, 6]. Consistency of the proposed estimator relies on the assumption that the censoring distribution is correctly specified. In many applications it will be reasonable to assume that the censoring mechanism is independent of the predictor variables and then the marginal Kaplan-Meier estimate (ignoring the predictor variables) of the censoring distribution can be used to derive a consistent IPCW estimate. The work presented here is related to the work of [19] and, especially, [27] on time-dependent sensitivity, specificity and ROC curves for competing risks. For their definition of sensitivity and specificity, these authors defined (cumulative) cases and (dynamic) controls at time t as observations with T t and D = 1 (cases) and those with T > t (controls). In addition and similar to the present work, [27] also consider alternative controls defined as observations with T > t or D = 2. With this alternative definition of controls, the area under the ROC curve at time t from [27] is similar to our definition of concordance. However, the estimators in [27] rely on cause-specific (smoothed) Cox regression models and are different from the IPCW estimator proposed here. The advantage of the IPCW estimator is that it does not require a model for the competing risks process but only an estimate of the cenroring distribution. We assumed that the prognostic model was derived on an independent train- 11

13 ing data set. While we have not analytically derived formulas for the asymptotic variance of the estimator, confidence intervals based on bootstrapping the test data set are readily available in this situation. Clearly, independent training data are not always available, and even if they are, a joint analysis of all data will be more efficient. However, some form of internal cross-validation is needed to develop and assess a prognostic model with a single data set [2, 9, 11, 5, 22]. It is important to emphasize that our definition of concordance assesses a prognostic model for the absolute risk of the event of interest in the presence of competing risks. In line with earlier work [4, 25] we regard this risk as crucial for medical decision-making in the competing risks setting. However, in many instances explicit consideration of competing events will also be important and modeling the entire competing risks multi-state process will provide further insights [1]. As an example, we saw in Section 3.1 that discrimination of prognostic models for the event of interest is hampered if covariates affect both cause-specific hazards with regression coefficients of the same sign, especially if there is a strong association with the competing risk or if the baseline competing hazard is high. This indicates that to achieve high discrimination ability one needs predictors which are only weakly or, even better, reversely associated with the cause-specific hazard of the competing event. Moreover, in settings where all competing events are of similar importance, joint accuracy criteria for the entire competing risks multi-state process are needed and their development is an important area for future research. Acknowledgments Marcel Wolbers was supported by the Wellcome Trust and the Li Ka Shing Foundation University of Oxford Global Health Programme. Conflict of Interest: None declared. References [1] J. Beyersmann, M. Dettenkofer, H. Bertz, and M. Schumacher. A competing risks analysis of bloodstream infection after stem-cell transplantation using subdistribution hazards and cause-specific hazards. Statistics in Medicine, 26: , 27. [2] Bradley Efron and Robert Tibshirani. Improvements on cross-validation: The.632+ bootstrap method. Journal of the American Statistical Association, 92:548 56, [3] J. P. Fine and R. J. Gray. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 446:496 59, [4] M. H. Gail and R. M. Pfeiffer. On criteria for evaluating models of absolute risk. Biostatistics, 6: ,

14 [5] T. A. Gerds, T. Cai, and M. Schumacher. The performance of risk prediction models. Biometrical Journal, 5: , 28. [6] T. A. Gerds, M. W. Kattan, M. Schumacher, and C. Yu. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Statistics in Medicine, page Early view: DOI: 1.12/sim.5681, 212. [7] T.A. Gerds, T.H. Scheike, and P.K. Andersen. Absolute risk regression for competing risks: interpretation, link functions, and prediction. Statistics in Medicine, 31: , 212. [8] G. L. Grunkemeier, R. Jin, M. J. Eijkemans, and J. J. Takkenberg. Actual and actuarial probabilities of competing risks: apples and lemons. Annals of Thoracic Surgery, 83: , 27. [9] F. E. Harrell. Regression Modeling Strategies. Springer Series in Statistics, New York, 21. [1] F.E. Harrell, R. M. Califf, D. B. Pryor, K. L. Lee, and R. A. Rosati. Evaluating the yield of medical tests. Journal of the American Medical Association, 247: , [11] T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning (Second Edition). Springer Series in Statistics, New York, 29. [12] A. Hofman, C. M. van Duijn, O. H. Franco, M. A. Ikram, H. L. Janssen, C. C. Klaver, E. J. Kuipers, T. E. Nijsten, B. H. Stricker, H. Tiemeier, A. G. Uitterlinden, M. W. Vernooij, and J. C. Witteman. The Rotterdam Study: 212 objectives and design update. European Journal of Epidemiology, 26: , Aug 211. [13] M. T. Koller, M. J. Leening, M. Wolbers, E. W. Steyerberg, M. G. Hunink, R. Schoop, A. Hofman, H. C. Bucher, B. M. Psaty, D. M. Lloyd-Jones, and J. C. Witteman. Development and validation of a coronary risk prediction model for older U.S. and European persons in the cardiovascular health study and the Rotterdam Study. Annals of Internal Medicine, 157: , Sep 212. [14] M. T. Koller, H. Raatz, E. W. Steyerberg, and M. Wolbers. Competing risks and the clinical community: irrelevance or ignorance? Stat Med, 31: , May 212. [15] M. T. Koller, B. Schaer, M. Wolbers, C. Sticherling, H. C. Bucher, and S. Osswald. Death without prior appropriate implantable cardioverterdefibrillator therapy: a competing risk study. Circulation, 117: ,

15 [16] Ulla B. Mogensen, Hemant Ishwaran, and Thomas A. Gerds. Evaluating random forests for survival analysis using prediction error curves. Journal of Statistical Software, 5:1 23, 212. [17] NCEP. Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III) final report. Circulation, 16: , 22. [18] H. Putter, M. Fiocco, and R. B. Geskus. Tutorial in biostatistics: competing risks and multi-state models. Stat Med, 26: , 27. [19] P. Saha and P. J. Heagerty. Time-dependent predictive accuracy in the presence of competing risks. Biometrics, 66: , 21. [2] Thomas Scheike, M.J. Zhang, and Thomas Alexander Gerds. Predicting cumulative incidence probability by direct binomial regression. Biometrika, 95:25 22, 28. [21] R. Schoop, J. Beyersmann, M. Schumacher, and H. Binder. Quantifying the predictive accuracy of time-to-event models in the presence of competing risks. Biometric Journal, 53:88 112, 211. [22] E. W. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Springer, New York, 29. [23] H. Uno, T. Cai, M. J. Pencina, R. B. D Agostino, and L. J. Wei. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statistics in Medicine, 3: , 211. [24] H. van Houwelingen and H. Putter. Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, 211. [25] M. Wolbers, M. T. Koller, J. C. Witteman, and E. W. Steyerberg. Prognostic models with competing risks: methods and application to coronary risk prediction. Epidemiology, 2: , 29. [26] G. Yan and T. Greene. Investigating the effects of ties on measures of concordance. Statistics in medicine, 27: , 28. [27] Y. Zheng, T. Cai, Y. Jin, and Z. Feng. Evaluating prognostic accuracy of biomarkers under competing risk. Biometrics, 68: ,

16 Supplementary Materials Section A of the supplementary material illustrates the software implementation of the proposed concordance estimator in the function cindex of the R package pec [16]. Section B provides a proof of Lemma 1. A Software implementation The presented concordance estimators for competing risks have been implemented in the function cindex of the R package pec [16]. The following box shows R code for evaluating Ĉ1(t) at t = 1, 5, and1 simultaneously for a Fine- Gray regression model and the combination of cause-specific Cox regression with age as the only covariate based on a marginal Kaplan-Meier model for the censoring distribution. 1 library(pec) 2 library(riskregression) 3 library(cmprsk) 4 5 # Fit Fine-Gray and cause-specific models for failure cause 1 to the training data 6 7 fg.train <- FGR(Hist(ttevent,evtype)~age,data=chd.train,cause=1) 8 cox.train <- CSC(Hist(ttevent,evtype)~age,data=chd.train,cause=1) 9 1 # Calculate truncated concordance at t=1,5,1 years 11 # for CHD in the test data cindex(list(fg.train,cox.train), 14 formula=hist(ttevent,evtype)~1, 15 cens.model="marginal", 16 data=chd.test,eval.times=c(1,5,1),cause=1) # Fit Fine-Gray and cause-specific models for failure cause 1 in all data 19 2 fg <- FGR(Hist(ttevent,evtype)~age,data=chd,cause=1) 21 cox <- CSC(Hist(ttevent,evtype)~age,data=chd,cause=1) # Calculate truncated concordance at t=1,5,1 years 24 # for CHD using bootstrap cross-validation cindex(list(fg.train,cox.train), 27 formula=hist(ttevent,evtype)~1, 28 cens.model="marginal", 29 splitmethod="bootcv", 3 B=1, 31 data=chd,eval.times=c(1,5,1),cause=1) 15

17 B Proof of Lemma 4.1 It is assumed that Ĝ is consistent for G. Thus, Slutsky s lemma shows that the weights converge in probability to W ij,1 = G( T i X i )G( T i X j ) and W ij,2 = G( T i X i )G( T j X j ) as n. By the law of large numbers and Slutsky s Lemma it follows that Ĉ 1 (t) converges in probability to ( E Xi,X j [Q ij (t) E{ÃijÑ i 1 1 (t)wij,1 X i, X j } + E{ B )] ij Ñi 1 1 (t)wij,2 X i, X j } [( E Xi,X j E{ÃijÑ i 1 1 (t)wij,1 X i, X j } + E{ B )]. (8) ij Ñi 1 1 (t)wij,2 X i, X j } By applying equations (4.4) and (4.5) we have and hence E(ÃijÑ 1 i (t) X i, X j ) = = E(ÃijÑ 1 i (t)w 1 ij,1 X i, X j ) = t t t Similarly by equation (4.5) we have E(I{ T j > s X j }) E(dÑ 1 i (s) X i ) G(s X j ) S(s X j ) G(s X i )df 1 (s X i ) S(s X j )df 1 (s X i ). E( B ij Ñ 1 i (t) X i, X j ) = E( i j B ij N 1 i (t) X i, X j ) which yields = E( B ij Ñ i (t)w ij,2 X i, X j ) = t s t G(v X j ) G(s X i )df 2 (v X j ) df 1 (s X i ) F 2 (s X j )df 1 (s X i ). Inserting shows that expression (8) equals the expression for C 1 (t) given in (2.3): [ E Xi,X j Q ij (t) ] [ t {S(s X j) + F 2 (s X j )}df 1 (s X i ) E Xi,X j Q ij (t) ] t [ ] {1 F 1(s X j )}df 1 (s X i ) t = [ ] E Xi,X j {S(s X t. j) + F 2 (s X j )}df 1 (s X i ) E Xi,X j {1 F 1(s X j )}df 1 (s X i ) 16

18 Tables and Figures Figure 1: Concordance for cause-specific hazards models depending on the effect size β 1 as well as on the baseline hazard λ 2 (t) and the regression coefficient for the competing risk. Black solid lines refer to β 2 =, black dashed lines to β 2 = β 1, gray solid lines to β 2 = β 1, and gray dashed lines to β 2 = 2β 1. λ 2 (t) = (no competing risks) λ 2 (t) =.5 λ 2 (t) = 2 Concordance C ~ Concordance C ~ Concordance C ~ β 1 β 1 β 1 17

19 Figure 2: Concordance for Fine-Gray models depending on the effect size β and the baseline probability p of an event of interest in [, ). Lines correspond to p =.1 (gray dashed line),.5 (black line), and.9 (gray dash-dotted line). Concordance C ~ β 18

20 Table 1: Simulation settings, bias and root mean square error (RMSE) for the 16 scenarios of the simulation study. Bias and RMSE for each scenario are based on results from 1 simulated datasets. CR=competing risks process. Simulation setting Bias ( 1) RMSE ( 1) CR t censoring N C1(t) Ĉ1,naive(t) Ĉ1,KM(t) Ĉ1,Cox(t) Ĉ1,naive(t) Ĉ1,KM(t) Ĉ1,Cox(t) 1 median indep median indep median dep median dep q 3 indep q 3 indep q 3 dep q 3 dep median indep median indep median dep median dep q 3 indep q 3 indep q 3 dep q 3 dep

21 Table 2: Estimated concordance in the coronary heart disease study truncated after 1, 5 and 1 years. Single split into training/validation data CHD non-chd death 1y 5y 1y 1y 5y 1y Fine-Gray (age only) Fine-Gray Cause-specific Cox Average of 1 splits into bootstrap training data and corresponding validation data CHD non-chd death 1y 5y 1y 1y 5y 1y Fine-Gray (age only) Fine-Gray Cause-specific Cox Figure 3: Left panel: IPCW estimates of C 1 (t) for the multivariable Fine and Gray model (solid line) and the model with age as the only covariate (dashed line) for a follow-up duration of 1 1 years in the validation data. Error bars at 2.5, 5, 7.5 and 1 years of follow-up correspond to boostrap standard errors. Right panel: Prediction error curves for the multivariable Fine and Gray model (black solid curve) and the model with age as the only covariate (black dashed line). The gray line shows the prediction error of the covariate-free nonparametric estimate of the cumulative incidence function as a benchmark. Concordance C^ 1(t) Prediction error (Brier score) Time t (years) Time t (years) 2

22 Research Reports available from Department of Biostatistics Department of Biostatistics University of Copenhagen Øster Farimagsgade 5 P.O. Box Copenhagen K Denmark 11/1 Kreiner, S. Is the foundation under PISA solid? A critical look at the scaling model underlying international comparisons of student attainment. 11/2 Andersen, P.K. Competing risks in epidemiology: Possibilities and pitfalls. 11/3 Holst, K.K. Model diagnostics based on cumulative residuals: The R-package gof. 11/4 Holst, K.K. & Budtz-Jørgensen, E. Linear latent variable models: The lava-package. 11/5 Holst, K.K., Budtz-Jørgensen, E. & Knudsen, G.M. A latent variable model with mixed binary and continuous response variables. 11/6 Cortese, G., Gerds, T.A., &Andersen, P.K. Comparison of prediction models for competing risks with time-dependent covariates. 11/7 Scheike, T.H. & Sun, Y. On Cross-Odds Ratio for Multivariate Competing Risks Data. 11/8 Gerds, T.A., Scheike T.H. & Andersen P.K. Absolute risk regression for competing risks: interpretation, link functions and prediction. 11/9 Scheike, T.H., Maiers, M.J., Rocha, V. & Zhang, M. Competing risks with missing covariates: Effect of haplotypematch on hematopoietic cell transplant patients. 12/1 Keiding, N., Hansen, O.K.H., Sørensen, D.N. & Slama, R. The current duration approach to estimating time to pregnancy. 12/2 Andersen, P.K. A note on the decomposition of number of life years lost according to causes of death.

23 12/3 Martinussen, T. & Vansteelandt, S. A note on collapsibility and confounding bias in Cox and Aalen regression models. 12/4 Martinussen, T. & Pipper, C.B. Estimation of Odds of Concordance based on the Aalen additive model. 12/5 Martinussen, T. & Pipper, C.B. Estimation of Causal Odds of Concordance based on the Aalen additive model. 12/6 Binder, N., Gerds, T.A. & Andersen, P.K. Pseudo-observations for competing risks with covariate dependent censoring. 12/7 Keiding, N. & Clayton, D. Standardization and control for confounding in observational studies: a historical perspective. 12/8 Hilden, J. & Gerds, T.A. Evaluating the impact of novel biomarkers: Do not rely on IDI and NRI. 12/9 Gerster, M., Madsen, M. & Andersen, P.K. Matched Survival Data in a Co-twin Control Design. 12/1 Scheike, T.H., Holst, K.K. & Hjelmborg, J.B. Estimating heritability for cause specific mortality based on twin studies. 12/11 Jensen, A.K.G., Gerds, T.A., Weeke, P., Torp-Pedersen C. & Andersen, P.K. A note on the validity of the case-time-control design for autocorrelated exposure histories. 13/1 Forman, J.L. & Sørensen M. A new approach to multi-modal diffusions with applications to protein folding. 13/2 Mogensen, U.B., Hansen, M., Bjerrum, J.T., Coskun, M., Nielsen, O.H., Olsen, J. & Gerds, T.A. Microarray based classification of inflammatory bowel disease: A comparison of modelling tools and classification scales. 13/3 Wolbers, M., Koller, M.T., Witteman, J.C.M. & Gerds, T.A. Concordance for prognostic models with competing risks.

A note on the decomposition of number of life years lost according to causes of death

A note on the decomposition of number of life years lost according to causes of death A note on the decomposition of number of life years lost according to causes of death Per Kragh Andersen Research Report 12/2 Department of Biostatistics University of Copenhagen A note on the decomposition

More information

Data splitting. INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+TITLE:

Data splitting. INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+TITLE: #+TITLE: Data splitting INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+AUTHOR: Thomas Alexander Gerds #+INSTITUTE: Department of Biostatistics, University of Copenhagen

More information

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Part III Measures of Classification Accuracy for the Prediction of Survival Times Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples

More information

Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring

Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring Noname manuscript No. (will be inserted by the editor) Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring Thomas A. Gerds 1, Michael W Kattan

More information

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,

More information

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Patrick J. Heagerty PhD Department of Biostatistics University of Washington 166 ISCB 2010 Session Four Outline Examples

More information

A multi-state model for the prognosis of non-mild acute pancreatitis

A multi-state model for the prognosis of non-mild acute pancreatitis A multi-state model for the prognosis of non-mild acute pancreatitis Lore Zumeta Olaskoaga 1, Felix Zubia Olaskoaga 2, Guadalupe Gómez Melis 1 1 Universitat Politècnica de Catalunya 2 Intensive Care Unit,

More information

Multi-state models: prediction

Multi-state models: prediction Department of Medical Statistics and Bioinformatics Leiden University Medical Center Course on advanced survival analysis, Copenhagen Outline Prediction Theory Aalen-Johansen Computational aspects Applications

More information

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements [Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers

More information

Quantifying the Predictive Accuracy of Time to Event Models in the Presence of Competing Risks

Quantifying the Predictive Accuracy of Time to Event Models in the Presence of Competing Risks Quantifying the Predictive Accuracy of Time to Event Models in the Presence of Competing Risks Rotraut Schoop, Jan Beyersmann, Martin Schumacher & Harald Binder Universität Freiburg i. Br. FDM-Preprint

More information

Department of Biostatistics University of Copenhagen

Department of Biostatistics University of Copenhagen Cox regression with missing covariate data using a modified partial likelihood method Torben Martinussen Klaus K. Holst Thomas Scheike Research Report 14/7 Department of Biostatistics University of Copenhagen

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Journal of Statistical Software

Journal of Statistical Software JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen Mei-Jie

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

PhD course: Statistical evaluation of diagnostic and predictive models

PhD course: Statistical evaluation of diagnostic and predictive models PhD course: Statistical evaluation of diagnostic and predictive models Tianxi Cai (Harvard University, Boston) Paul Blanche (University of Copenhagen) Thomas Alexander Gerds (University of Copenhagen)

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Comparison of Predictive Accuracy of Neural Network Methods and Cox Regression for Censored Survival Data

Comparison of Predictive Accuracy of Neural Network Methods and Cox Regression for Censored Survival Data Comparison of Predictive Accuracy of Neural Network Methods and Cox Regression for Censored Survival Data Stanley Azen Ph.D. 1, Annie Xiang Ph.D. 1, Pablo Lapuerta, M.D. 1, Alex Ryutov MS 2, Jonathan Buckley

More information

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about

More information

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Sarwar I Mozumder 1, Mark J Rutherford 1, Paul C Lambert 1,2 1 Biostatistics

More information

Overview of statistical methods used in analyses with your group between 2000 and 2013

Overview of statistical methods used in analyses with your group between 2000 and 2013 Department of Epidemiology and Public Health Unit of Biostatistics Overview of statistical methods used in analyses with your group between 2000 and 2013 January 21st, 2014 PD Dr C Schindler Swiss Tropical

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Dynamic Predictions with Time-Dependent Covariates. in Survival Analysis using Joint Modeling and. Landmarking

Dynamic Predictions with Time-Dependent Covariates. in Survival Analysis using Joint Modeling and. Landmarking Dynamic Predictions with Time-Dependent Covariates in Survival Analysis using Joint Modeling and Landmarking Dimitris Rizopoulos 1,, Geert Molenberghs 2 and Emmanuel M.E.H. Lesaffre 2,1 1 Department of

More information

Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models

Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models 26 March 2014 Overview Continuously observed data Three-state illness-death General robust estimator Interval

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

Ph.D. course: Regression models. Introduction. 19 April 2012

Ph.D. course: Regression models. Introduction. 19 April 2012 Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

arxiv: v1 [stat.ap] 6 Apr 2018

arxiv: v1 [stat.ap] 6 Apr 2018 Individualized Dynamic Prediction of Survival under Time-Varying Treatment Strategies Grigorios Papageorgiou 1, 2, Mostafa M. Mokhles 2, Johanna J. M. Takkenberg 2, arxiv:1804.02334v1 [stat.ap] 6 Apr 2018

More information

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 25 April 2013 www.biostat.ku.dk/~pka/regrmodels13 Per Kragh Andersen Regression models The distribution of one outcome variable

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Estimating Causal Effects of Organ Transplantation Treatment Regimes

Estimating Causal Effects of Organ Transplantation Treatment Regimes Estimating Causal Effects of Organ Transplantation Treatment Regimes David M. Vock, Jeffrey A. Verdoliva Boatman Division of Biostatistics University of Minnesota July 31, 2018 1 / 27 Hot off the Press

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Diagnostics. Gad Kimmel

Diagnostics. Gad Kimmel Diagnostics Gad Kimmel Outline Introduction. Bootstrap method. Cross validation. ROC plot. Introduction Motivation Estimating properties of an estimator. Given data samples say the average. x 1, x 2,...,

More information

Philosophy and Features of the mstate package

Philosophy and Features of the mstate package Introduction Mathematical theory Practice Discussion Philosophy and Features of the mstate package Liesbeth de Wreede, Hein Putter Department of Medical Statistics and Bioinformatics Leiden University

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Progress, Updates, Problems William Jen Hoe Koh May 9, 2013 Overview Marginal vs Conditional What is TMLE? Key Estimation

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

A multi-state model for the prognosis of non-mild acute pancreatitis

A multi-state model for the prognosis of non-mild acute pancreatitis A multi-state model for the prognosis of non-mild acute pancreatitis Lore Zumeta Olaskoaga 1, Felix Zubia Olaskoaga 2, Marta Bofill Roig 1, Guadalupe Gómez Melis 1 1 Universitat Politècnica de Catalunya

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen The Annals of Statistics 21, Vol. 29, No. 5, 1344 136 A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1 By Thomas H. Scheike University of Copenhagen We present a non-parametric survival model

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information

Package comparec. R topics documented: February 19, Type Package

Package comparec. R topics documented: February 19, Type Package Package comparec February 19, 2015 Type Package Title Compare Two Correlated C Indices with Right-censored Survival Outcome Version 1.3.1 Date 2014-12-18 Author Le Kang, Weijie Chen Maintainer Le Kang

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities

More information

Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on -APPENDIX-

Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on -APPENDIX- Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on survival probabilities using generalised pseudo-values Ulrike Pötschger,2, Harald Heinzl 2, Maria Grazia Valsecchi

More information

Probabilistic Index Models

Probabilistic Index Models Probabilistic Index Models Jan De Neve Department of Data Analysis Ghent University M3 Storrs, Conneticut, USA May 23, 2017 Jan.DeNeve@UGent.be 1 / 37 Introduction 2 / 37 Introduction to Probabilistic

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Censoring and Truncation - Highlighting the Differences

Censoring and Truncation - Highlighting the Differences Censoring and Truncation - Highlighting the Differences Micha Mandel The Hebrew University of Jerusalem, Jerusalem, Israel, 91905 July 9, 2007 Micha Mandel is a Lecturer, Department of Statistics, The

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model

Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model Quentin Guibert Univ Lyon, Université Claude Bernard Lyon 1, ISFA, Laboratoire SAF EA2429, F-69366, Lyon,

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Lehmann Family of ROC Curves

Lehmann Family of ROC Curves Memorial Sloan-Kettering Cancer Center From the SelectedWorks of Mithat Gönen May, 2007 Lehmann Family of ROC Curves Mithat Gonen, Memorial Sloan-Kettering Cancer Center Glenn Heller, Memorial Sloan-Kettering

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

A multistate additive relative survival semi-markov model

A multistate additive relative survival semi-markov model 46 e Journées de Statistique de la SFdS A multistate additive semi-markov Florence Gillaizeau,, & Etienne Dantan & Magali Giral, & Yohann Foucher, E-mail: florence.gillaizeau@univ-nantes.fr EA 4275 - SPHERE

More information

A Bias Correction for the Minimum Error Rate in Cross-validation

A Bias Correction for the Minimum Error Rate in Cross-validation A Bias Correction for the Minimum Error Rate in Cross-validation Ryan J. Tibshirani Robert Tibshirani Abstract Tuning parameters in supervised learning problems are often estimated by cross-validation.

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Instrumental variables estimation in the Cox Proportional Hazard regression model

Instrumental variables estimation in the Cox Proportional Hazard regression model Instrumental variables estimation in the Cox Proportional Hazard regression model James O Malley, Ph.D. Department of Biomedical Data Science The Dartmouth Institute for Health Policy and Clinical Practice

More information

Statistical Methods for Alzheimer s Disease Studies

Statistical Methods for Alzheimer s Disease Studies Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

More information

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing

More information

Survival Model Predictive Accuracy and ROC Curves

Survival Model Predictive Accuracy and ROC Curves UW Biostatistics Working Paper Series 12-19-2003 Survival Model Predictive Accuracy and ROC Curves Patrick Heagerty University of Washington, heagerty@u.washington.edu Yingye Zheng Fred Hutchinson Cancer

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL

A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL Christopher H. Morrell, Loyola College in Maryland, and Larry J. Brant, NIA Christopher H. Morrell,

More information

Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

More information

The Net Reclassification Index (NRI): a Misleading Measure of Prediction Improvement with Miscalibrated or Overfit Models

The Net Reclassification Index (NRI): a Misleading Measure of Prediction Improvement with Miscalibrated or Overfit Models UW Biostatistics Working Paper Series 3-19-2013 The Net Reclassification Index (NRI): a Misleading Measure of Prediction Improvement with Miscalibrated or Overfit Models Margaret Pepe University of Washington,

More information