PhD course: Statistical evaluation of diagnostic and predictive models

Tianxi Cai (Harvard University, Boston), Paul Blanche (University of Copenhagen), Thomas Alexander Gerds (University of Copenhagen)
March 18-22, 2014

Day 4: Survival Prediction

Prediction of Survival Outcomes

Survival Prediction with a Single Marker
- evaluating the accuracy
- estimating the accuracy

Survival Prediction with Multiple Markers
- constructing composite scores through survival regression models
- evaluating the accuracy

Survival Prediction with a Single Marker

In many clinical studies, the outcome of interest is the time to the occurrence of a clinical condition. Examples: time to disease diagnosis, recurrence, or death.

[Figure: estimated survival curves $\hat S(t)$ against time (0 to 15 years) for (a) survival and (b) metastasis-free survival.]

Standard Survival Analysis

- Kaplan-Meier plots
- Log-rank test for two-group comparisons (e.g. assessing a treatment effect)
- Association analysis: Cox proportional hazards model, hazard ratio estimates

The PEACE Trial

[Figure: survival probability (0.85 to 1.00) against months (0 to 80) for the placebo and ACEi arms.]

Placebo
Covariate        Est      SE      p-value
egfr            -0.006    0.003    0.05
Age              0.072    0.008   <0.01
Gender          -0.179    0.155    0.25
lveejf          -0.026    0.007   <0.01
Hypertension     0.330    0.117   <0.01
Diabetes         0.515    0.135   <0.01
MI               0.016    0.119    0.89

Questions beyond association:
- How well can we predict survival?
- How do we evaluate the prediction performance with survival outcomes?
- How do we combine information from multiple markers?

Survival Prediction Accuracy Measures

To assess the accuracy of a marker $X$ in predicting the event time $T$, various accuracy measures have been suggested:
- Time-dependent TPR, FPR, PPV, and NPV: Heagerty & Pepe (2000); Heagerty & Zheng (2005); Cai et al (2005); Zheng et al (2007).
- Proportion of explained variation: Korn & Simon (1990); Henderson (1995); Schemper & Stare (1996).
- (Integrated) Brier score: Graf et al (1999); Gerds & Schumacher (2006).
- Overall concordance measures (C-index): Harrell et al (1982); Begg et al (2000); Uno et al (2011).

Time-Dependent TPR, FPR and ROC

When interest lies in the prediction of $t$-year survival, one may assess the accuracy of $X$ in classifying the binary outcome $D_t = I(T \le t)$ by constructing binary prediction rules $I(X \ge c)$.

The classification accuracy of $I(X \ge c)$ in predicting $D_t$ may be summarized by
$$\mathrm{TPR}_t(c) = P(X \ge c \mid D_t = 1), \qquad \mathrm{FPR}_t(c) = P(X \ge c \mid D_t = 0).$$

This corresponds to a time-dependent ROC curve
$$\mathrm{ROC}_t(u) = \mathrm{TPR}_t\{\mathrm{FPR}_t^{-1}(u)\}.$$
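As an illustration of these definitions, the following is a minimal Python sketch that traces the empirical time-dependent ROC curve when every event time is observed (no censoring). The function name `roc_points` and the toy data are hypothetical and are not part of the course material.

```python
def roc_points(times, markers, t):
    """Empirical (FPR_t(c), TPR_t(c)) pairs over all observed thresholds c.

    Assumes no censoring: every entry of `times` is a true event time.
    """
    cases = [x for T, x in zip(times, markers) if T <= t]     # D_t = 1
    controls = [x for T, x in zip(times, markers) if T > t]   # D_t = 0
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    points = [(0.0, 0.0)]  # a threshold above every marker value
    for c in sorted(set(markers), reverse=True):
        tpr = sum(x >= c for x in cases) / len(cases)
        fpr = sum(x >= c for x in controls) / len(controls)
        points.append((fpr, tpr))
    return points


# Hypothetical toy data: event times in years, larger marker = higher risk.
times = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
markers = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
curve = roc_points(times, markers, t=5.0)
```

Lowering the threshold one observed marker value at a time moves the curve from (0, 0) to (1, 1), which is the usual way an empirical ROC curve is drawn.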

Defining "Cases" and "Controls" for a given t

In general, several types of time-dependent ROC curves have been proposed by defining $D_t$ and the population of interest differently.

- Entire population: $D_t = 1$ if $T \le t$, $D_t = 0$ if $T > t$
- $\{T \le t\} \cup \{T > \tau\}$: $D_t = 1$ if $T \le t$, $D_t = 0$ if $T > \tau$
- $\{T \ge t\}$: $D_t = 1$ if $T = t$, $D_t = 0$ if $T > t$
- $\{T = t\} \cup \{T > \tau\}$: $D_t = 1$ if $T = t$, $D_t = 0$ if $T > \tau$

Here $\tau$ is a pre-defined time point such that subjects with $T > \tau$ are considered controls. Classification accuracy measures can be defined accordingly.

Overall Prediction Performance Measure

- Area under the ROC curve for classifying $D_t = I(T \le t)$:
$$\mathrm{AUC}_t = \int_0^1 \mathrm{ROC}_t(u)\,du = P(X_1 \ge X_2 \mid T_1 \le t, T_2 > t)$$
- Concordance statistic (Harrell's C-statistic):
$$C_\tau = P(X_1 \ge X_2 \mid T_1 \le T_2, T_1 \le \tau)$$
- Integrated Brier score:
$$\mathrm{IBS}_\tau = \int_0^\tau E\big[\{I(T > t) - P(T > t \mid X)\}^2\big]\,dw(t)$$
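The two concordance-type measures can be read as proportions of concordant pairs. The sketch below is a minimal Python illustration that counts pairs directly, assuming no censoring and no tied marker values; the function names `auc_t` and `c_tau` and the toy data are hypothetical.

```python
def auc_t(times, markers, t):
    """P(X1 >= X2 | T1 <= t, T2 > t) by counting case-control pairs."""
    cases = [x for T, x in zip(times, markers) if T <= t]
    controls = [x for T, x in zip(times, markers) if T > t]
    concordant = sum(x1 >= x2 for x1 in cases for x2 in controls)
    return concordant / (len(cases) * len(controls))


def c_tau(times, markers, tau):
    """P(X1 >= X2 | T1 <= T2, T1 <= tau) over ordered pairs of distinct subjects."""
    usable = 0
    concordant = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if i != j and times[i] <= times[j] and times[i] <= tau:
                usable += 1
                concordant += markers[i] >= markers[j]
    return concordant / usable


# Hypothetical toy data: event times in years, larger marker = higher risk.
times = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
markers = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
auc_5 = auc_t(times, markers, t=5.0)
c_5 = c_tau(times, markers, tau=5.0)
```

The pair-counting form costs $O(n^2)$ operations, which is fine for illustration; with censored data the pairs would additionally need weights, as discussed in the estimation slides.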

Estimation of the Time-Dependent Accuracy Measures

In most studies with event time outcomes, the event time is subject to censoring due to loss to follow-up or end of study. Consequently, for the event time $T$, we observe $(\tilde T, \delta)$, where
$$\tilde T = \min(T, C), \qquad \delta = I(T \le C),$$
and $C$ is the follow-up (censoring) time.

Estimation of the accuracy measures requires assumptions about the censoring variable:
- A stronger assumption requires $C$ to be independent of both $T$ and $X$, with a common survival function $G(t) = P(C \ge t)$.
- A weaker assumption requires $C$ to be independent of the event time $T$ conditional on the marker value $X$, but allows $C$ to depend on $X$.

Estimation of the Time-Dependent Accuracy

Suppose we are interested in estimating
$$\mathrm{TPR}_t(c) = P(X \ge c \mid T \le t) = \frac{P(T \le t \mid X \ge c)\,P(X \ge c)}{P(T \le t)}.$$

Without censoring, we may estimate $\mathrm{TPR}_t(c)$ empirically:
$$\frac{\sum_{i=1}^n I(X_i \ge c, T_i \le t)}{\sum_{i=1}^n I(T_i \le t)}.$$

Due to censoring, $D_t = I(T \le t)$ is not always observed. Various approaches may be taken to account for censoring.

Estimation of the Time-Dependent Accuracy

If $C$ is independent of $T$ and $X$, a consistent estimator of $\mathrm{TPR}_t(c)$ may be obtained in either of two ways:
- Based on Kaplan-Meier estimates of $P(T \le t)$ and $P(T \le t \mid X \ge c)$. For any $c$, $P(T \le t \mid X \ge c)$ may be estimated using observations from the subset of patients with $\{X \ge c\}$.
- By Inverse Probability Weighting (IPW) with weights
$$W_i(t) = \frac{I(\tilde T_i \le t)\,\delta_i}{G(\tilde T_i)} + \frac{I(\tilde T_i > t)}{G(t)}.$$

Note that $I(T_i \le t)$ is observable if $I(\tilde T_i \le t)\,\delta_i = 1$ or $I(\tilde T_i > t) = 1$.

Estimation of the Time-Dependent Accuracy

For the IPW approach, one may show that
$$E\{W_i(t)\,I(T_i \le t, X_i \ge c) \mid T_i, X_i\} = I(T_i \le t, X_i \ge c).$$

Thus, $\mathrm{TPR}_t(c)$ may be estimated by
$$\widehat{\mathrm{TPR}}_t(c) = \frac{\sum_{i=1}^n \hat W_i(t)\,I(X_i \ge c, T_i \le t)}{\sum_{i=1}^n \hat W_i(t)\,I(T_i \le t)},$$
where $\hat W_i(t)$ is obtained by replacing $G(\cdot)$ in $W_i(t)$ by $\hat G(\cdot)$, the Kaplan-Meier estimator of $G(\cdot)$.
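The IPW estimator can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' implementation: censoring is taken to be independent of the event time and the marker, there are no tied observation times, and the names `censoring_survival`, `ipw_weights` and `ipw_tpr` as well as the toy data are hypothetical.

```python
def censoring_survival(obs_times, deltas, u):
    """Kaplan-Meier estimate of G(u) = P(C >= u), treating censoring (delta = 0) as the event."""
    g = 1.0
    censor_times = sorted({T for T, d in zip(obs_times, deltas) if d == 0 and T < u})
    for s in censor_times:
        at_risk = sum(T >= s for T in obs_times)
        censored = sum(T == s and d == 0 for T, d in zip(obs_times, deltas))
        g *= 1.0 - censored / at_risk
    return g


def ipw_weights(obs_times, deltas, t):
    """W_i(t) = delta_i I(T~_i <= t) / G(T~_i) + I(T~_i > t) / G(t), with G estimated."""
    g_t = censoring_survival(obs_times, deltas, t)
    weights = []
    for T, d in zip(obs_times, deltas):
        if T > t:
            weights.append(1.0 / g_t)       # known to be event-free at t
        elif d == 1:
            weights.append(1.0 / censoring_survival(obs_times, deltas, T))
        else:
            weights.append(0.0)             # censored before t: status at t unknown
    return weights


def ipw_tpr(obs_times, deltas, markers, t, c):
    """IPW estimate of TPR_t(c) = P(X >= c | T <= t)."""
    w = ipw_weights(obs_times, deltas, t)
    den = sum(wi for wi, T in zip(w, obs_times) if T <= t)
    if den == 0:
        raise ValueError("no observed events by time t")
    num = sum(wi for wi, T, x in zip(w, obs_times, markers) if T <= t and x >= c)
    return num / den


# Hypothetical toy data: observed times, event indicators (1 = event), marker values.
obs_times = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0]
deltas = [1, 0, 1, 1, 0, 1]
markers = [0.9, 0.4, 0.8, 0.3, 0.5, 0.2]
weights = ipw_weights(obs_times, deltas, t=5.0)
tpr = ipw_tpr(obs_times, deltas, markers, t=5.0, c=0.5)
```

Subjects censored before $t$ receive weight zero, and the remaining subjects are up-weighted to compensate, so the weights in this example still sum to the sample size.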

Estimation of the Time-Dependent Accuracy

If $C$ depends on $X$ but is independent of $T$ conditional on $X$, one may estimate $\mathrm{TPR}_t(c)$ by first estimating $S_y(t) = P(T \le t \mid X = y)$:
- Non-parametrically, via methods such as a kernel-smoothed conditional Nelson-Aalen or Kaplan-Meier estimator.
- Semi-parametrically, by assuming a regression model for $T \mid X$, e.g. fitting a Cox proportional hazards model.

Subsequently, one may obtain a plug-in estimate of $\mathrm{TPR}_t(c)$ based on
$$P(T \le t \mid X \ge c) = \frac{\int_c^\infty S_y(t)\,dF(y)}{1 - F(c)}, \qquad \text{where } F(y) = P(X \le y).$$

Framingham Offspring Study for CVD Prediction

Framingham Heart Study:
- Goal: identifying risk factors for CVD
- Framingham Risk Score for CHD/stroke prediction
- 3 generations:
  - Original cohort (1948)
  - Offspring cohort (1971), Omni cohort (1994)
  - 3rd generation cohort (2002), 2nd generation Omni cohort (2003)

Framingham Offspring Study, female participants:
- 1687 females out of a total of 5124 participants
- 261 events (death/CVD), with a 10-year event rate of 6%
- Framingham risk score (Wilson et al, 1998)
- Risk score with C-reactive protein (CRP) (Cook et al, 2006; Ridker et al, 2007)

Framingham Offspring Study for CVD Prediction

Table: Estimated accuracy measures (×100) for 5-year survival based on non-parametric kernel smoothing (NP), IPW and the Cox model. Here $c_p$ is the $100p$th percentile of the risk score.

                        NP             IPW            Cox
Measure              Est    SE      Est    SE      Est    SE
FPR_5(c_.2)          79.7   1.0     79.7   1.0     79.6   1.0
FPR_5(c_.8)          19.1   1.0     18.8   0.9     19.3   1.0
TPR_5(c_.2)          92.8   4.5     91.9   4.3     96.2   0.6
TPR_5(c_.8)          61.2   7.9     62.2   7.7     54.9   3.0
NPV_5(c_.2)          99.2   0.5     99.1   0.5     99.2   0.1
NPV_5(c_.8)          99.0   0.3     99.0   0.3     98.8   0.2
PPV_5(c_.2)           2.5   0.4      2.5   0.4      2.6   0.4
PPV_5(c_.8)           6.5   1.3      6.8   1.4      5.9   1.0
AUC                  75.2   4.1     75.8   3.9     75.7   1.5
FPR at TPR = .9      65.0  13.9     58.7   8.4     61.8   3.0
NPV at TPR = .9      99.4   0.3     99.4   0.7     99.4   0.5
PPV at TPR = .9       2.9   0.8      3.2   0.2      3.1   0.5

Framingham Offspring Study for CVD Prediction

Figure: Time-dependent ROC curve (a) and PPV curve (b) of the risk score for predicting 5-year CVD events.

[Figure: panel (a) plots TPR_5yrs against FPR_5yrs and panel (b) plots PPV_5yrs against v, each with curves for the Semi-Cox, CNA and IPW estimators.]

Survival Prediction with Multiple Markers

When multiple markers are available to assist in prediction, one may construct composite scores as for binary outcomes:
1. Fit a survival regression model to combine the markers into a risk score $S(\hat\beta)$.
2. Evaluate the performance of $S(\hat\beta)$ in predicting survival as in the univariate case.

Survival Prediction with Multiple Markers

A wide range of survival regression models have been proposed in the literature:
- Cox proportional hazards model
- Proportional odds model
- Time-specific generalized linear model

Survival Regression Models

Cox Proportional Hazards Model (Cox, 1972):
$$\lambda_X(t) = \lambda_0(t)\exp(\beta_0^T X),$$
where $\lambda_X(t)$ is the hazard function for a subject with marker value $X$, and $\lambda_0(t)$ is the baseline hazard function.

An equivalent form of the model is
$$P(T \le t \mid X) = g\{h_0(t) + \beta_0^T X\},$$
where $g(x) = 1 - e^{-e^{x}}$ and $h_0(\cdot)$ is an unknown increasing function.

$\beta_0$ may be estimated by maximizing the partial likelihood.
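The equivalence between the hazard form and the transformation form of the Cox model can be checked directly. The short derivation below introduces $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$ for the cumulative baseline hazard, a symbol that does not appear on the slide, and assumes that $T$ is continuous.

```latex
\begin{align*}
P(T \le t \mid X)
  &= 1 - \exp\{-\Lambda_0(t)\, e^{\beta_0^T X}\} \\
  &= 1 - \exp\big[-\exp\{\log \Lambda_0(t) + \beta_0^T X\}\big] \\
  &= g\{h_0(t) + \beta_0^T X\},
  \qquad g(x) = 1 - e^{-e^{x}}, \quad h_0(t) = \log \Lambda_0(t).
\end{align*}
```

Because $\Lambda_0(t)$ is non-decreasing in $t$, so is $h_0(t) = \log \Lambda_0(t)$, which matches the requirement that $h_0(\cdot)$ be an increasing function.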

Survival Regression Models

Proportional Odds Model:
$$\mathrm{logit}\,P(T \le t \mid X) = h_0(t) + \beta_0^T X.$$

- For any fixed $t$, this is a logistic regression with response $I(T \le t)$.
- A rank-based estimator (Pettitt, 1984) and a non-parametric maximum likelihood estimator (Murphy et al, 1997) have been proposed for $\beta_0$.

Under either the proportional hazards or the proportional odds model, the risk score $\beta_0^T X$ is the optimal score for classifying $D_t = I(T \le t)$ for any $t$.

Time-specific Generalized Linear Model

Markers useful for identifying short-term survivors may not be useful for identifying long-term survivors. To construct a time-dependent optimal score, one may consider a time-specific generalized linear model (GLM):
$$P(T \le t \mid X) = g\{h_0(t) + \beta_{0t}^T X\}.$$

- Without censoring, for any given time $t$, one may fit a usual GLM to the synthetic data $\{D_t = I(T \le t), X\}$ to obtain an estimate of $\beta_{0t}$.
- Zheng et al (2006) considered inverse-probability-weighting-based estimators for the time-specific logistic regression model.
- $\beta_{0t}^T X$ is the optimal score for distinguishing $\{T \le t\}$ from $\{T > t\}$ and achieves the highest $\mathrm{ROC}_t(\cdot)$.

Estimating the Accuracy of the Composite Score

By fitting the survival models, one may obtain an estimate of the regression coefficient.
- Cox proportional hazards model: one may estimate $\beta_0$ as the maximizer of the partial likelihood function.
- Time-specific GLM: one may estimate $\beta_{0t}$ as the solution to the weighted estimating equation
$$\sum_{i=1}^n \hat W_i(t) \begin{pmatrix} 1 \\ X_i \end{pmatrix} \{I(T_i \le t) - g(\alpha + \beta^T X_i)\} = 0,$$
where $\hat W_i(t)$ is the weight that accounts for censoring, as defined earlier. For example, with the logistic link this is equivalent to fitting a logistic regression with $I(\tilde T \le t)$ as the outcome, $X$ as the predictor, and weights $\hat W_i(t)$.
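With the logistic link, the weighted estimating equation can be solved by Newton-Raphson. The following is a minimal pure-Python sketch for small problems; the helper names `solve` and `weighted_logistic`, the fixed iteration count, and the toy outcomes and weights are all hypothetical, and in practice the weights would be the censoring weights at the chosen time $t$.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def weighted_logistic(y, x_rows, weights, n_iter=25):
    """Root of sum_i w_i (1, x_i)^T {y_i - expit(alpha + beta^T x_i)} = 0.

    Returns [alpha, beta_1, ..., beta_p]. Assumes the data are not separable.
    """
    design = [[1.0] + list(row) for row in x_rows]
    p = len(design[0])
    theta = [0.0] * p
    for _ in range(n_iter):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for yi, zi, wi in zip(y, design, weights):
            eta = sum(th * z for th, z in zip(theta, zi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for a in range(p):
                score[a] += wi * zi[a] * (yi - mu)
                for b in range(p):
                    info[a][b] += wi * mu * (1.0 - mu) * zi[a] * zi[b]
        step = solve(info, score)
        theta = [th + s for th, s in zip(theta, step)]
    return theta


# Hypothetical toy data: y_i plays the role of I(T~_i <= t), one binary marker,
# and made-up weights standing in for the censoring weights.
y = [0, 0, 1, 1, 1, 0]
x_rows = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
weights = [1.0, 1.0, 1.0, 1.0, 2.0, 1.0]
theta = weighted_logistic(y, x_rows, weights)
```

With a single binary marker the solution has a closed form (the weighted event proportions in each group), which gives a simple check that the iteration has converged.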

Estimating the Accuracy of the Composite Risk Score

Suppose $\hat\beta_t$ is the estimator of $\beta_{0t}$ ($\hat\beta_t = \hat\beta$ if $\beta_{0t} = \beta_0$). We may estimate the accuracy of the risk score $\beta_{0t}^T X$ by
- replacing $\beta_{0t}^T X$ with $\hat\beta_t^T X$; and
- using the tools for the single marker setting.

For example, assuming that the censoring is independent of $T$ and $X$,
$$\mathrm{TPR}_t(c; \beta_{0t}) = P(\beta_{0t}^T X \ge c \mid T \le t)$$
may be estimated by
$$\widehat{\mathrm{TPR}}_t(c; \hat\beta_t) = \frac{\sum_{i=1}^n \hat W_i(t)\,I(\hat\beta_t^T X_i \ge c, T_i \le t)}{\sum_{i=1}^n \hat W_i(t)\,I(T_i \le t)}.$$

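Estimating the Accuracy of the Composite Score

Similarly, assuming that the censoring is independent of $T$ and $X$,
$$\mathrm{FPR}_t(c; \beta_{0t}) = P(\beta_{0t}^T X \ge c \mid T > t)$$
may be estimated by
$$\widehat{\mathrm{FPR}}_t(c; \hat\beta_t) = \frac{\sum_{i=1}^n \hat W_i(t)\,I(\hat\beta_t^T X_i \ge c, T_i > t)}{\sum_{i=1}^n \hat W_i(t)\,I(T_i > t)}.$$

Consequently, we may estimate
$$\mathrm{ROC}_t(u; \beta_{0t}) = \mathrm{TPR}_t\{\mathrm{FPR}_t^{-1}(u; \beta_{0t}); \beta_{0t}\}$$
by plugging in $\widehat{\mathrm{TPR}}_t(c; \hat\beta_t)$ and $\widehat{\mathrm{FPR}}_t(c; \hat\beta_t)$.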

Example: Breast Cancer Gene Expression Study

"A Gene-Expression Signature as a Predictor of Survival in Breast Cancer." The New England Journal of Medicine, Volume 347, Number 25, December 19, 2002. Copyright 2002 by the Massachusetts Medical Society.

Marc J. van de Vijver, Yudong D. He, Laura J. van 't Veer, Hongyue Dai, Augustinus A.M. Hart, Dorien W. Voskuil, George J. Schreiber, Johannes L. Peterse, Chris Roberts, Matthew J. Marton, Mark Parrish, Douwe Atsma, Anke Witteveen, Annuska Glas, Leonie Delahaye, Tony van der Velde, Harry Bartelink, Sjoerd Rodenhuis, Emiel T. Rutgers, Stephen H. Friend, and René Bernards.

Example: Breast Cancer Gene Expression Study

- 295 patients who were diagnosed with breast cancer between 1984 and 1995. The median survival time is 3.8 years for these patients.
- Outcome: time to death.
- Markers: gene expression markers. The gene expression measurement is the logarithm of the intensity ratio between the red and the green fluorescent dyes, where the green dye is used for the reference pool and the red dye for the experimental tissue.
- The prognosis rule developed by van 't Veer et al (2002) and van de Vijver et al (2002) was derived from 70 gene expression markers.
- For illustration, we selected 6 of the 70 gene expression markers for prediction.

Example: Breast Cancer Gene Expression Study

Obtain a linear score $\hat\beta_t^T X$ for classifying $I(T \le t)$ by fitting various regression models:
- proportional hazards model: $\lambda_X(t) = \lambda_0(t)\,e^{\beta_0^T X}$
- proportional odds model: $\mathrm{logit}\,P(T \le t \mid X) = h_0(t) + \beta_0^T X$
- time-specific logistic regression model: $\mathrm{logit}\,P(T \le t \mid X) = h_0(t) + \beta_{0t}^T X$

Example: Breast Cancer Gene Expression Study

- Estimate the ROC curve, $\mathrm{ROC}_t(\cdot)$, for distinguishing $\{T \le t\}$ from $\{T > t\}$ by estimating $\mathrm{TPR}_t(c)$ and $\mathrm{FPR}_t(c)$ non-parametrically using inverse probability weighting.
- Summarize the overall accuracy of $\hat\beta_t^T X$ by estimating
$$\mathrm{AUC}_t = \int_0^1 \mathrm{ROC}_t(u)\,du.$$
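Given a composite score, an observed case indicator and censoring weights for each subject, the weighted ROC curve and its area can be sketched as follows. This is a minimal Python illustration with the integral approximated by the trapezoidal rule; the function name `weighted_auc` and the toy inputs are hypothetical, and the weights are assumed to have been computed beforehand.

```python
def weighted_auc(scores, is_case, weights):
    """Weighted ROC points and trapezoidal AUC.

    scores  : composite score for each subject (larger = higher risk)
    is_case : 1 if the observed time is <= t, else 0
    weights : censoring weights at time t (0 for subjects censored before t)
    """
    case_w = sum(w for w, d in zip(weights, is_case) if d)
    ctrl_w = sum(w for w, d in zip(weights, is_case) if not d)
    if case_w == 0 or ctrl_w == 0:
        raise ValueError("need positive total weight among cases and controls")
    points = [(0.0, 0.0)]
    for c in sorted(set(scores), reverse=True):
        tpr = sum(w for s, d, w in zip(scores, is_case, weights) if d and s >= c) / case_w
        fpr = sum(w for s, d, w in zip(scores, is_case, weights) if not d and s >= c) / ctrl_w
        points.append((fpr, tpr))
    auc = sum((f1 - f0) * (t0 + t1) / 2.0
              for (f0, t0), (f1, t1) in zip(points, points[1:]))
    return points, auc


# Hypothetical toy inputs: six subjects, the second censored before t (weight 0).
scores = [0.9, 0.4, 0.8, 0.3, 0.5, 0.2]
is_case = [1, 1, 1, 1, 0, 0]
weights = [1.0, 0.0, 1.25, 1.25, 1.25, 1.25]
points, auc = weighted_auc(scores, is_case, weights)
```

The trapezoidal rule gives half credit to tied scores between a case and a control, which differs slightly from a definition based on $P(X_1 \ge X_2)$ when ties are present.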

Example: Breast Cancer Gene Expression Study

Table: Estimated $\mathrm{AUC}_t$ (95% CI) at t = 2, 5 and 8 years after diagnosis using a 6-gene classifier with linear composite scores derived from different regression models.

Model                    t = 2 years       t = 5 years       t = 8 years
Cox                      .78 (.62, .87)    .84 (.78, .88)    .77 (.71, .84)
Proportional Odds        .78 (.59, .87)    .83 (.68, .88)    .77 (.65, .84)
Time-specific Logistic   .85 (.80, .91)    .84 (.80, .89)    .77 (.71, .84)

Example: Breast Cancer Gene Expression Study

[Figure: ROC curves (sensitivity against 1 - specificity) at t = 2, 5 and 8 years for composite scores from (a) the logistic model and (b) the Cox model.]

Survival Prediction with Multiple Markers
Estimating the Accuracy of the Composite Score: Bias Correction

When the sample size $n$ is not large relative to the number of markers, one may use cross-validation to obtain less biased accuracy estimators:
- Randomly split the data into $K$ disjoint sets of about equal size and label them $I_k$, $k = 1, \ldots, K$.
- For each $k$, obtain an estimate $\hat\beta^{(-k)}(t)$ of $\beta_0(t)$ based on all observations that are not in $I_k$, and then estimate the accuracy based on the data in $I_k$.
- Obtain a bias-corrected estimator of the accuracy measure by averaging over the $K$ accuracy estimates.
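The cross-validation steps can be sketched generically in Python. The function `cross_validated_accuracy` and its `fit` and `evaluate` arguments are hypothetical placeholders: `fit` stands for any model-fitting routine applied to the training part, and `evaluate` for any accuracy estimator applied to the held-out part. The sketch assumes the number of observations is at least $K$, so that no fold is empty.

```python
import random


def cross_validated_accuracy(data, fit, evaluate, k=5, seed=0):
    """Average of K held-out accuracy estimates."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    folds = [indices[j::k] for j in range(k)]   # K disjoint sets of about equal size
    estimates = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in indices if i not in held_out]
        test = [data[i] for i in fold]
        model = fit(train)                       # fitted without the held-out fold
        estimates.append(evaluate(model, test))  # accuracy on the held-out fold only
    return sum(estimates) / k


# Hypothetical toy use: "fit" picks a cutoff from the training data and
# "evaluate" reports the proportion of held-out labels the cutoff reproduces.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (0.5, 1),
        (0.6, 0), (0.7, 1), (0.8, 1), (0.9, 1), (1.0, 1)]


def fit(train):
    return sum(x for x, _ in train) / len(train)


def evaluate(cutoff, test):
    return sum((x >= cutoff) == bool(y) for x, y in test) / len(test)


cv_accuracy = cross_validated_accuracy(data, fit, evaluate, k=5)
```

Because the split is random, repeating the procedure with several seeds and averaging the results is a common way to reduce the dependence on one particular partition.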

Survival Prediction with Multiple Markers
Estimating the Accuracy of the Composite Score: Interval Estimation

In addition to obtaining a point estimator for the accuracy, it is crucial to assess the variability in the estimated accuracy measure. The variability may be assessed via procedures such as the bootstrap:
- Treat the observed data from $n$ subjects as $n$ units $\{D_1, \ldots, D_n\}$.
- Randomly sample $n$ units from $\{D_1, \ldots, D_n\}$ with replacement to obtain $\{D_1^*, \ldots, D_n^*\}$.
- Construct accuracy estimators based on each set of resampled data.
- Repeat the procedure $M_0$ times to obtain $M_0$ perturbed estimates of the accuracy, and construct interval estimates based on the empirical percentiles of the $M_0$ perturbed replications.

Other types of resampling methods, such as the wild bootstrap, have also been considered in the literature: Parzen et al (1994); Jin et al (2003); Cai et al (2005); Tian et al (2007).
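The resampling steps translate into a short percentile-bootstrap routine. The sketch below is a minimal Python illustration: the function name `bootstrap_percentile_interval`, the simple index-based percentile rule, and the toy data are hypothetical, and `estimator` stands for any function that maps a data set to an accuracy estimate.

```python
import random


def bootstrap_percentile_interval(data, estimator, n_boot=1000, level=0.95, seed=0):
    """Percentile interval from n_boot resamples of the subjects, drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # n units, with replacement
        replicates.append(estimator(resample))
    replicates.sort()
    lower = replicates[int((1.0 - level) / 2.0 * (n_boot - 1))]
    upper = replicates[int(round((1.0 + level) / 2.0 * (n_boot - 1)))]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy data; in the survival setting each unit would be one subject's
# full record (observed time, event indicator, markers) and the estimator would
# be an accuracy measure such as the estimated AUC at time t.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
lower, upper = bootstrap_percentile_interval(data, mean)
```

Resampling whole subjects keeps each subject's time, event indicator and markers together. An estimator that can fail on a degenerate resample (for example, one with no events before $t$) would need to handle that case explicitly.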

Summary

- Classification accuracy measures such as the TPR, FPR and ROC can be extended to the setting with survival outcomes.
- Different types of time-dependent accuracy measures may be defined by defining the "diseased" and "non-diseased" populations at any given time $t$.
- To obtain estimators for the classification accuracy measures with survival outcomes, one needs to incorporate censoring appropriately.
- When there are multiple markers available, various survival regression models may be used to construct composite scores for prediction. Such scores may be optimal with respect to certain accuracy measures when the imposed model holds.
- Bias correction and variance estimation should be considered when assessing the accuracy.