PhD course: Statistical evaluation of diagnostic and predictive models
Tianxi Cai (Harvard University, Boston), Paul Blanche (University of Copenhagen), Thomas Alexander Gerds (University of Copenhagen)
March 18-22
Day 4: Survival Prediction
Prediction of Survival Outcomes
- Survival prediction with a single marker
  - evaluating the accuracy
  - estimating the accuracy
- Survival prediction with multiple markers
  - constructing composite scores through survival regression models
  - evaluating the accuracy
Survival Prediction with a Single Marker
In many clinical studies, the outcome of interest is the time to the occurrence of a clinical condition. Examples: time to disease diagnosis, recurrence, or death.
[Figure: estimated survival function $\hat{S}(t)$ against time (years) for (a) survival and (b) metastasis-free survival.]
Standard Survival Analysis
- Kaplan-Meier plots
- Log-rank test for two-group comparisons (e.g. assessing treatment effect)
- Association analysis: Cox proportional hazards model, hazard ratio estimates
The PEACE Trial
[Figure: survival probability against time (months) for the Placebo and ACEi arms.]
[Table: covariates egfr, Age, Gender, lveejf, Hypertension, Diabetes and MI, with columns Est, SE and p-value. The numeric entries did not survive transcription, apart from several p-values reported as <0.01.]
Questions beyond association:
- How well can we predict survival?
- How do we evaluate the prediction performance with survival outcomes?
- How do we combine information from multiple markers?
Survival Prediction Accuracy Measures
To assess the accuracy of a marker $X$ in predicting the event time $T$, various accuracy measures have been suggested:
- Time-dependent TPR, FPR, PPV, and NPV: Heagerty & Pepe (2000); Heagerty & Zheng (2005); Cai et al (2005); Zheng et al (2007).
- Proportion of explained variation: Korn & Simon (1990); Henderson (1995); Schemper & Stare (1996).
- (Integrated) Brier score: Graf et al (1999); Gerds and Schumacher (2006).
- Overall concordance measures (C-index): Harrell et al (1982); Begg et al (2000); Uno et al (2011).
Time-Dependent TPR, FPR and ROC
When interest lies in the prediction of $t$-year survival, one may assess the accuracy of $X$ in classifying the binary outcome $D_t = I(T \le t)$ by constructing binary prediction rules $I(X \ge c)$.
The classification accuracy of $I(X \ge c)$ in predicting $D_t$ may be summarized by
$$\mathrm{TPR}_t(c) = P(X \ge c \mid D_t = 1), \qquad \mathrm{FPR}_t(c) = P(X \ge c \mid D_t = 0).$$
This corresponds to a time-dependent ROC curve
$$\mathrm{ROC}_t(u) = \mathrm{TPR}_t\{\mathrm{FPR}_t^{-1}(u)\}.$$
Defining "Cases" and "Controls" for a Given t
In general, several types of time-dependent ROC curves have been proposed by defining $D_t$ and the populations of interest differently.
- Entire population: $D_t = 1$ if $T \le t$, $D_t = 0$ if $T > t$
- $\{T \le t\} \cup \{T > \tau\}$: $D_t = 1$ if $T \le t$, $D_t = 0$ if $T > \tau$
- $\{T \ge t\}$: $D_t = 1$ if $T = t$, $D_t = 0$ if $T > t$
- $\{T = t\} \cup \{T > \tau\}$: $D_t = 1$ if $T = t$, $D_t = 0$ if $T > \tau$
Here $\tau$ is a pre-defined time point such that subjects with $T > \tau$ are considered controls. Classification accuracy measures can be defined accordingly.
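The four case/control definitions above can be made concrete with a short sketch for fully observed event times. The function name `case_control_labels`, the `incident` flag and the code -1 for subjects excluded from the comparison are illustrative assumptions, not notation from the course.

```python
import numpy as np

def case_control_labels(time, t, tau=None, incident=False):
    """Label subjects as 1 (case), 0 (control) or -1 (excluded).

    incident=False: cases are {T <= t}; incident=True: cases are {T = t}.
    tau=None: controls are {T > t}; otherwise controls are {T > tau}.
    Assumes tau >= t when tau is given.
    """
    time = np.asarray(time, dtype=float)
    case = (time == t) if incident else (time <= t)
    control = time > (t if tau is None else tau)
    labels = np.full(time.shape, -1)
    labels[control] = 0
    labels[case] = 1
    return labels

# Hypothetical event times for five subjects.
time = [1, 2, 3, 4, 5]
entire_population = case_control_labels(time, t=2)
late_controls = case_control_labels(time, t=2, tau=4)
incident_cases = case_control_labels(time, t=2, incident=True)
incident_late = case_control_labels(time, t=2, tau=4, incident=True)
```

Only the first definition assigns a label to every subject; the other three drop part of the sample, which is why the resulting accuracy measures refer to different populations.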
Overall Prediction Performance Measures
- Area under the ROC curve for classifying $D_t = I(T \le t)$:
  $$\mathrm{AUC}_t = \int_0^1 \mathrm{ROC}_t(u)\,du = P(X_1 \ge X_2 \mid T_1 \le t,\ T_2 > t)$$
- Concordance statistic (Harrell's C-statistic):
  $$C_\tau = P(X_1 \ge X_2 \mid T_1 \le T_2,\ T_1 \le \tau)$$
- Integrated Brier score:
  $$\mathrm{IBS}_\tau = \int_0^\tau \{I(T > t) - P(T > t \mid X)\}^2\,dw(t)$$
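The probability forms of $\mathrm{AUC}_t$ and $C_\tau$ can be checked by direct pair counting when no observation is censored. The sketch below assumes fully observed event times and uses strict ordering $T_1 < T_2$ for the C-statistic, which excludes self-pairs and tied times; the function names are illustrative.

```python
import numpy as np

def auc_t(x, time, t):
    """Empirical P(X1 >= X2 | T1 <= t, T2 > t) over all case-control pairs."""
    x, time = np.asarray(x, dtype=float), np.asarray(time, dtype=float)
    cases, controls = x[time <= t], x[time > t]
    return float(np.mean(cases[:, None] >= controls[None, :]))

def c_stat(x, time, tau):
    """Empirical P(X1 >= X2 | T1 < T2, T1 <= tau) over ordered pairs."""
    x, time = np.asarray(x, dtype=float), np.asarray(time, dtype=float)
    t1, t2 = time[:, None], time[None, :]
    usable = (t1 < t2) & (t1 <= tau)          # subject 1 fails first, by tau
    concordant = x[:, None] >= x[None, :]     # subject 1 has the higher marker
    return float(concordant[usable].mean())

# Hypothetical data: a higher marker always means an earlier event.
x = [5, 4, 3, 2, 1]
time = [1, 2, 3, 4, 5]
auc_perfect = auc_t(x, time, t=2.5)
c_perfect = c_stat(x, time, tau=3)
auc_reversed = auc_t(x[::-1], time, t=2.5)
```

With censoring these simple averages are no longer computable for every pair, which motivates the weighting schemes discussed later in the lecture.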
Estimation of the Time-Dependent Accuracy Measures
In most studies with event time outcomes, the event time is subject to censoring due to loss to follow-up or end of study.
Consequently, for the event time $T$, we observe $(\tilde T, \delta)$, where
$$\tilde T = \min(T, C), \qquad \delta = I(T \le C),$$
and $C$ is the follow-up (censoring) time.
Estimation of the accuracy measures requires assumptions about the censoring variable:
- A stronger assumption requires $C$ to be independent of both $T$ and $X$, with a common survival function $G(t) = P(C \ge t)$.
- A weaker assumption requires $C$ to be independent of the event time $T$ conditional on the marker value $X$; $C$ may depend on $X$.
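As a small illustration of the observed-data structure, the following sketch builds the observed time and event indicator from latent event and censoring times. In real data only the output is available; the latent times and the function name `observe` are hypothetical.

```python
import numpy as np

def observe(event_time, censor_time):
    """Return (obs_time, delta) = (min(T, C), I(T <= C))."""
    event_time = np.asarray(event_time, dtype=float)
    censor_time = np.asarray(censor_time, dtype=float)
    obs_time = np.minimum(event_time, censor_time)
    delta = (event_time <= censor_time).astype(int)
    return obs_time, delta

# Hypothetical latent times for five subjects.
T = [2.0, 5.0, 1.5, 7.0, 3.0]
C = [4.0, 3.0, 6.0, 7.0, 2.5]
obs_time, delta = observe(T, C)
```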
Estimation of the Time-Dependent Accuracy
Suppose we are interested in estimating
$$\mathrm{TPR}_t(c) = P(X \ge c \mid T \le t) = \frac{P(T \le t \mid X \ge c)\, P(X \ge c)}{P(T \le t)}.$$
Without censoring, we may estimate $\mathrm{TPR}_t(c)$ empirically by
$$\frac{\sum_{i=1}^n I(X_i \ge c,\ T_i \le t)}{\sum_{i=1}^n I(T_i \le t)}.$$
Due to censoring, $D_t = I(T \le t)$ is not always observed. Various approaches may be taken to account for censoring.
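The empirical estimator for the uncensored case is a ratio of counts. A minimal sketch, assuming fully observed event times; the function name and the companion FPR computation are added for illustration.

```python
import numpy as np

def tpr_fpr_uncensored(x, time, c, t):
    """Empirical TPR_t(c) and FPR_t(c) when every event time is observed."""
    x, time = np.asarray(x, dtype=float), np.asarray(time, dtype=float)
    case = time <= t          # D_t = 1
    positive = x >= c         # rule I(X >= c)
    tpr = np.sum(positive & case) / np.sum(case)
    fpr = np.sum(positive & ~case) / np.sum(~case)
    return float(tpr), float(fpr)

# Hypothetical data for five subjects.
x = [5, 4, 3, 2, 1]
time = [1, 2, 3, 4, 5]
tpr, fpr = tpr_fpr_uncensored(x, time, c=4, t=3)
```

Here three subjects are cases by $t = 3$, two of them exceed the cut-off $c = 4$, and neither control does.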
Estimation of the Time-Dependent Accuracy
If $C$ is independent of $T$ and $X$, a consistent estimator of $\mathrm{TPR}_t(c)$ may be obtained in two ways.
- Based on Kaplan-Meier estimates of $P(T \le t)$ and $P(T \le t \mid X \ge c)$. For any $c$, $P(T \le t \mid X \ge c)$ may be estimated using observations from the subset of patients with $\{X \ge c\}$.
- By inverse probability weighting (IPW) with weights
  $$W_i(t) = \frac{I(\tilde T_i \le t)\,\delta_i}{G(\tilde T_i)} + \frac{I(\tilde T_i > t)}{G(t)}.$$
  Note that $I(T_i \le t)$ is observable if $I(\tilde T_i \le t)\,\delta_i = 1$ or $I(\tilde T_i > t) = 1$.
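The IPW weights can be computed once the censoring survival function has been estimated. The sketch below uses a hand-written Kaplan-Meier estimator of $G(s) = P(C \ge s)$ that treats censorings as the "events"; it assumes no ties between event and censoring times, and the function names are illustrative rather than taken from any package.

```python
import numpy as np

def km_censoring(obs_time, delta):
    """Kaplan-Meier estimate of G(s) = P(C >= s), censorings as 'events'."""
    obs_time = np.asarray(obs_time, dtype=float)
    delta = np.asarray(delta, dtype=int)
    cens_times = np.unique(obs_time[delta == 0])
    at_risk = np.array([np.sum(obs_time >= u) for u in cens_times])
    n_cens = np.array([np.sum((obs_time == u) & (delta == 0)) for u in cens_times])
    factors = 1.0 - n_cens / at_risk

    def G(s):
        s = np.atleast_1d(np.asarray(s, dtype=float))
        # Product over censoring times strictly before s, i.e. P(C >= s).
        return np.array([np.prod(factors[cens_times < v]) for v in s])

    return G

def ipw_weights(obs_time, delta, t):
    """W_i(t) = I(obs_i <= t) delta_i / G(obs_i) + I(obs_i > t) / G(t)."""
    obs_time = np.asarray(obs_time, dtype=float)
    delta = np.asarray(delta, dtype=int)
    G = km_censoring(obs_time, delta)
    w = np.zeros(len(obs_time))
    early_event = (obs_time <= t) & (delta == 1)
    w[early_event] = 1.0 / G(obs_time[early_event])
    w[obs_time > t] = 1.0 / G(t)[0]
    return w

# Hypothetical data: subjects 2 and 5 are censored.
obs_time = [1, 2, 3, 4, 5, 6]
delta = [1, 0, 1, 1, 0, 1]
w = ipw_weights(obs_time, delta, t=3.5)
```

The subject censored before $t$ receives weight zero, and the remaining subjects are up-weighted so that the weights sum to the sample size in this example.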
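Estimation of the Time-Dependent Accuracy
For the IPW approach, one may show that
$$E\{W_i(t)\, I(T_i \le t,\ X_i \ge c) \mid T_i, X_i\} = I(T_i \le t,\ X_i \ge c).$$
Thus, $\mathrm{TPR}_t(c)$ may be estimated by
$$\widehat{\mathrm{TPR}}_t(c) = \frac{\sum_{i=1}^n \hat W_i(t)\, I(X_i \ge c,\ T_i \le t)}{\sum_{i=1}^n \hat W_i(t)\, I(T_i \le t)},$$
where $\hat W_i(t)$ is obtained by replacing $G(\cdot)$ in $W_i(t)$ by $\hat G(\cdot)$, the Kaplan-Meier estimator of $G(\cdot)$. Each term is computable from the observed data because $\hat W_i(t) = 0$ whenever $I(T_i \le t)$ is not observed.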
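Given a vector of weights, the IPW estimator is a weighted ratio of counts. The sketch below takes the weights as an input; the weight values in the example are hypothetical numbers chosen for illustration, and the observed time is used in place of $T_i$, which gives the same result wherever the weight is nonzero.

```python
import numpy as np

def ipw_tpr(x, obs_time, weights, c, t):
    """Weighted estimate of TPR_t(c) = P(X >= c | T <= t)."""
    x = np.asarray(x, dtype=float)
    obs_time = np.asarray(obs_time, dtype=float)
    w = np.asarray(weights, dtype=float)
    case = obs_time <= t      # only counts when the weight is nonzero
    return float(np.sum(w * (x >= c) * case) / np.sum(w * case))

# Hypothetical markers, observed times and IPW weights for six subjects.
x = [6, 5, 2, 4, 3, 1]
obs_time = [1, 2, 3, 4, 5, 6]
weights = [1.0, 0.0, 1.25, 1.25, 1.25, 1.25]   # second subject censored before t
tpr_hat = ipw_tpr(x, obs_time, weights, c=4, t=3.5)
```

In this example the weighted number of cases is 2.25, of which a weighted count of 1 exceeds the cut-off.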
Estimation of the Time-Dependent Accuracy
If $C$ depends on $X$ but is independent of $T$ conditional on $X$, one may estimate $\mathrm{TPR}_t(c)$ by first estimating $S_y(t) = P(T \le t \mid X = y)$
- non-parametrically, via methods such as the kernel-smoothed conditional Nelson-Aalen or Kaplan-Meier estimator; or
- semi-parametrically, by assuming a regression model for $T \mid X$, e.g. fitting a Cox proportional hazards model.
Subsequently, one may obtain a plug-in estimate of $\mathrm{TPR}_t(c)$ based on
$$P(T \le t \mid X \ge c) = \frac{\int_c^\infty S_y(t)\, dF(y)}{1 - F(c)}, \qquad \text{where } F(y) = P(X \le y).$$
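To make the plug-in step explicit, here is a short worked derivation. It is a sketch under the assumption that $X$ is continuous, that $\hat S_y(t)$ denotes whichever estimate of $S_y(t)$ was obtained in the first step, and that $F$ is replaced by the empirical distribution of the observed markers. Combining the display above with Bayes' rule gives

$$\mathrm{TPR}_t(c) = \frac{P(T \le t,\ X \ge c)}{P(T \le t)} = \frac{\int_c^\infty S_y(t)\, dF(y)}{\int_{-\infty}^\infty S_y(t)\, dF(y)},$$

so that the plug-in estimator takes the form

$$\widehat{\mathrm{TPR}}_t(c) = \frac{\sum_{i=1}^n \hat S_{X_i}(t)\, I(X_i \ge c)}{\sum_{i=1}^n \hat S_{X_i}(t)}.$$

The factor $1 - F(c)$ cancels against $P(X \ge c)$, which is why it does not appear in the final ratio.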
Framingham Offspring Study for CVD Prediction
Framingham Heart Study:
- Goal: identifying risk factors for CVD
- Framingham Risk Score for CHD/stroke prediction
- 3 generations:
  - original cohort (1948)
  - Offspring cohort (1971), Omni cohort (1994)
  - 3rd generation cohort (2002), 2nd generation Omni cohort (2003)
Framingham Offspring Study, female participants:
- 1687 females out of a total of 5124 participants
- 261 events (death/CVD), with a 10-year event rate of 6%
- Framingham risk score (Wilson et al, 1998)
- Risk score with C-reactive protein (CRP) (Cook et al, 2006; Ridker et al, 2007)
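Framingham Offspring Study for CVD Prediction
Table: Estimated accuracy measures ($\times 100$) for 5-year survival based on non-parametric kernel smoothing (NP), IPW and the Cox model. Here $c_p$ is the $100p$th percentile of the risk score.
[Columns: Est and SE under each of NP, IPW and Cox. Rows: $\mathrm{FPR}_5(c_{.2})$, $\mathrm{FPR}_5(c_{.8})$, $\mathrm{TPR}_5(c_{.2})$, $\mathrm{TPR}_5(c_{.8})$, $\mathrm{NPV}_5(c_{.2})$, $\mathrm{NPV}_5(c_{.8})$, $\mathrm{PPV}_5(c_{.2})$, $\mathrm{PPV}_5(c_{.8})$, AUC, and FPR, NPV and PPV at fixed TPR levels. The numeric entries did not survive transcription.]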
Framingham Offspring Study for CVD Prediction
Figure: Time-dependent ROC curve (a) and PPV curve (b) of the risk score for predicting 5-year CVD events. Panel (a) plots TPR at 5 years against FPR at 5 years; panel (b) plots PPV at 5 years against v. Each panel shows the Semi-Cox, CNA and IPW estimates.
Survival Prediction with Multiple Markers
When multiple markers are available to assist in prediction, one may construct composite scores as for binary outcomes.
1. Fit a survival regression model to combine the markers into a risk score $S(\hat\beta)$.
2. Evaluate the performance of $S(\hat\beta)$ in predicting survival, as in the single-marker case.
Survival Prediction with Multiple Markers
A wide range of survival regression models has been proposed in the literature:
- Cox proportional hazards model;
- Proportional odds model;
- Time-specific generalized linear model.
Survival Regression Models
Cox Proportional Hazards Model (Cox, 1972)
$$\lambda_X(t) = \lambda_0(t) \exp(\beta_0^T X)$$
Here $\lambda_X(t)$ is the hazard function for a subject with marker value $X$, and $\lambda_0(t)$ is the baseline hazard function.
An equivalent form of the model is
$$P(T \le t \mid X) = g\{h_0(t) + \beta_0^T X\},$$
where $g(x) = 1 - e^{-e^x}$ and $h_0(\cdot)$ is an unknown increasing function.
$\beta_0$ may be estimated by maximizing the partial likelihood.
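The equivalence of the two forms follows in one line from the relation between the hazard and the survival function. As a worked step, writing $\Lambda_0(t) = \int_0^t \lambda_0(s)\, ds$ for the cumulative baseline hazard (a symbol introduced here for the derivation),

$$P(T \le t \mid X) = 1 - \exp\{-\Lambda_0(t)\, e^{\beta_0^T X}\} = 1 - \exp\left[-\exp\{\log \Lambda_0(t) + \beta_0^T X\}\right] = g\{h_0(t) + \beta_0^T X\},$$

with $h_0(t) = \log \Lambda_0(t)$, which is increasing in $t$ because $\Lambda_0$ is.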
Survival Regression Models
Proportional Odds Model
$$\mathrm{logit}\, P(T \le t \mid X) = h_0(t) + \beta_0^T X$$
For any fixed $t$, this is a logistic regression with response $I(T \le t)$.
A rank-based estimator (Pettitt, 1984) and a non-parametric maximum likelihood estimator (Murphy et al, 1997) have been proposed for $\beta_0$.
Under either the proportional hazards or the proportional odds model, the risk score $\beta_0^T X$ is the optimal score for classifying $D_t = I(T \le t)$ for any $t$.
Time-specific Generalized Linear Model
Markers useful for identifying short-term survivors may not be useful for identifying long-term survivors.
To construct a time-dependent optimal score, one may consider a time-specific generalized linear model (GLM):
$$P(T \le t \mid X) = g\{h_0(t) + \beta_{0t}^T X\}$$
Without censoring, for any given time $t$, one may fit a usual GLM to the synthetic data $\{D_t = I(T \le t), X\}$ to obtain an estimate of $\beta_{0t}$.
Zheng et al (2006) considered estimators based on inverse probability weighting for the time-specific logistic regression model.
$\beta_{0t}^T X$ is the optimal score for distinguishing $\{T \le t\}$ from $\{T > t\}$ and achieves the highest $\mathrm{ROC}_t(\cdot)$.
Estimating the Accuracy of the Composite Score
By fitting the survival models, one may obtain an estimate of the regression coefficient.
- Cox proportional hazards model: one may estimate $\beta_0$ as the maximizer of the partial likelihood function.
- Time-specific GLM: one may estimate $\beta_{0t}$ as the solution to the weighted estimating equation
  $$\sum_{i=1}^n \hat W_i(t) \begin{pmatrix} 1 \\ X_i \end{pmatrix} \{I(T_i \le t) - g(\alpha + \beta^T X_i)\} = 0,$$
  where $\hat W_i(t)$ is the weight to account for censoring as defined earlier. With the logistic link, for example, this is equivalent to fitting a logistic regression with $I(\tilde T \le t)$ as the outcome, $X$ as the predictor, and weights $\hat W_i(t)$.
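With the logistic link, the weighted estimating equation can be solved by Newton-Raphson. The following is a minimal sketch with a fixed number of iterations and no convergence check or safeguards against separation; the function name and the toy data are hypothetical. In the survival setting, `y` would be $I(\tilde T_i \le t)$ and `w` the censoring weights.

```python
import numpy as np

def weighted_logistic(X, y, w, n_iter=25):
    """Solve sum_i w_i (1, X_i)^T {y_i - expit(alpha + beta^T X_i)} = 0."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    Z = np.column_stack([np.ones(len(X)), X])    # intercept plus markers
    y, w = np.asarray(y, dtype=float), np.asarray(w, dtype=float)
    theta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ theta))
        score = Z.T @ (w * (y - p))                      # estimating function
        info = (Z * (w * p * (1.0 - p))[:, None]).T @ Z  # minus its derivative
        theta = theta + np.linalg.solve(info, score)
    return theta    # (alpha, beta)

# Hypothetical toy data with one binary marker, so the solution is explicit:
# weighted case proportions are 1/4 when x = 0 and 2/3 when x = 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 1, 1, 1, 0]
w = [1, 1, 1, 1, 1.5, 1.5, 1, 2]
alpha_hat, beta_hat = weighted_logistic(x, y, w)
```

For this toy data the solution is $\alpha = -\log 3$ and $\beta = \log 6$, which gives a quick way to check the solver.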
Estimating the Accuracy of the Composite Risk Score
Suppose $\hat\beta_t$ is the estimator of $\beta_{0t}$ ($\hat\beta_t = \hat\beta$ if $\beta_{0t} = \beta_0$). We may estimate the accuracy of the risk score $\beta_{0t}^T X$ by
- replacing $\beta_{0t}^T X$ with $\hat\beta_t^T X$; and
- using the tools for the single-marker setting.
For example, assuming that the censoring is independent of $T$ and $X$,
$$\mathrm{TPR}_t(c; \beta_{0t}) = P(\beta_{0t}^T X \ge c \mid T \le t)$$
may be estimated by
$$\widehat{\mathrm{TPR}}_t(c; \hat\beta_t) = \frac{\sum_{i=1}^n \hat W_i(t)\, I(\hat\beta_t^T X_i \ge c,\ T_i \le t)}{\sum_{i=1}^n \hat W_i(t)\, I(T_i \le t)}.$$
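Estimating the Accuracy of the Composite Score
Similarly, assuming that the censoring is independent of $T$ and $X$,
$$\mathrm{FPR}_t(c; \beta_{0t}) = P(\beta_{0t}^T X \ge c \mid T > t)$$
may be estimated by
$$\widehat{\mathrm{FPR}}_t(c; \hat\beta_t) = \frac{\sum_{i=1}^n \hat W_i(t)\, I(\hat\beta_t^T X_i \ge c,\ T_i > t)}{\sum_{i=1}^n \hat W_i(t)\, I(T_i > t)}.$$
Consequently, we may estimate
$$\mathrm{ROC}_t(u; \beta_{0t}) = \mathrm{TPR}_t\{\mathrm{FPR}_t^{-1}(u; \beta_{0t}); \beta_{0t}\}$$
by plugging in $\widehat{\mathrm{TPR}}_t(c; \hat\beta_t)$ and $\widehat{\mathrm{FPR}}_t(c; \hat\beta_t)$.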
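The plug-in ROC curve can be traced by evaluating the weighted TPR and FPR at every observed score value. The sketch below takes the fitted scores and censoring weights as inputs and integrates the curve with the trapezoidal rule; the function name and the numbers in the example are hypothetical, and the observed time stands in for $T_i$, which is harmless wherever the weight is nonzero.

```python
import numpy as np

def ipw_roc(score, obs_time, weights, t):
    """Weighted ROC_t points (fpr, tpr) and the trapezoidal AUC_t."""
    score = np.asarray(score, dtype=float)
    obs_time = np.asarray(obs_time, dtype=float)
    w = np.asarray(weights, dtype=float)
    case = obs_time <= t
    fpr, tpr = [0.0], [0.0]
    for c in np.unique(score)[::-1]:          # thresholds from high to low
        positive = score >= c
        tpr.append(np.sum(w * positive * case) / np.sum(w * case))
        fpr.append(np.sum(w * positive * ~case) / np.sum(w * ~case))
    fpr, tpr = np.array(fpr), np.array(tpr)
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc

# Hypothetical scores, observed times and IPW weights for six subjects.
score = [6, 5, 2, 4, 3, 1]
obs_time = [1, 2, 3, 4, 5, 6]
weights = [1.0, 0.0, 1.25, 1.25, 1.25, 1.25]
fpr, tpr, auc = ipw_roc(score, obs_time, weights, t=3.5)
```

In this example the area equals 17/27, which matches the weighted proportion of case-control pairs in which the case has the higher score.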
Example: Breast Cancer Gene Expression Study
Marc J. van de Vijver, Yudong D. He, Laura J. van 't Veer, Hongyue Dai, Augustinus A.M. Hart, Dorien W. Voskuil, George J. Schreiber, Johannes L. Peterse, Chris Roberts, Matthew J. Marton, Mark Parrish, Douwe Atsma, Anke Witteveen, Annuska Glas, Leonie Delahaye, Tony van der Velde, Harry Bartelink, Sjoerd Rodenhuis, Emiel T. Rutgers, Stephen H. Friend, and René Bernards. A gene-expression signature as a predictor of survival in breast cancer. The New England Journal of Medicine, Volume 347, Number 25, December 19, 2002.
Example: Breast Cancer Gene Expression Study
- 295 patients who were diagnosed with breast cancer between 1984 and [end year lost in transcription]. The median survival time is 3.8 years for these patients.
- Outcome: time to death
- Markers: gene expression markers
- The gene expression measurement is the logarithm of the intensity ratio between the red and the green fluorescent dyes, where the green dye is used for the reference pool and the red dye for the experimental tissue.
- The prognosis rule developed by van 't Veer et al (2002) and van de Vijver et al (2002) was derived from 70 gene expression markers.
- For illustration, we selected 6 of the 70 gene expression markers for prediction.
Example: Breast Cancer Gene Expression Study
Obtain a linear score $\hat\beta_t^T X$ for classifying $I(T \le t)$ by fitting various regression models:
- proportional hazards model: $\lambda_X(t) = \lambda_0(t)\, e^{\beta_0^T X}$
- proportional odds model: $\mathrm{logit}\, P(T \le t \mid X) = h_0(t) + \beta_0^T X$
- time-specific logistic regression model: $\mathrm{logit}\, P(T \le t \mid X) = h_0(t) + \beta_{0t}^T X$
Example: Breast Cancer Gene Expression Study
- Estimate the ROC curve, $\mathrm{ROC}_t(\cdot)$, for distinguishing $\{T \le t\}$ from $\{T > t\}$ by estimating $\mathrm{TPR}_t(c)$ and $\mathrm{FPR}_t(c)$ non-parametrically using inverse probability weighting.
- Summarize the overall accuracy of $\hat\beta_t^T X$ by estimating $\mathrm{AUC}_t = \int_0^1 \mathrm{ROC}_t(u)\, du$.
Example: Breast Cancer Gene Expression Study
Table: Estimated $\mathrm{AUC}_t$ (95% CI) at t = 2, 5 and 8 years after diagnosis using a 6-gene classifier with linear composite scores derived from different regression models.

Model                    t = 2 years       t = 5 years       t = 8 years
Cox                      .78 (.62, .87)    .84 (.78, .88)    .77 (.71, .84)
Proportional odds        .78 (.59, .87)    .83 (.68, .88)    .77 (.65, .84)
Time-specific logistic   .85 (.80, .91)    .84 (.80, .89)    .77 (.71, .84)
Example: Breast Cancer Gene Expression Study
[Figure: estimated ROC curves (sensitivity against 1 - specificity) at t = 2, 5 and 8 years for the composite scores from (a) the logistic model and (b) the Cox model.]
Survival Prediction with Multiple Markers
Estimating the Accuracy of the Composite Score: Bias Correction
When the sample size $n$ is not large relative to the number of markers, one may use cross-validation to obtain less biased accuracy estimators.
- Randomly split the data into $K$ disjoint sets of about equal size and label them $I_k$, $k = 1, \ldots, K$.
- For each $k$, an estimate $\hat\beta^{(-k)}(t)$ of $\beta_0(t)$ is obtained from all observations that are not in $I_k$, and an estimate of the accuracy is then computed from the data in $I_k$.
- A bias-corrected estimator of the accuracy measure is obtained by averaging over the $K$ accuracy estimates.
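The cross-validation scheme can be written as a generic loop over folds. In the sketch below the `fit` and `evaluate` callables are placeholders for the survival regression fit and the accuracy estimator; the toy versions used in the example (a midpoint cut-off on one marker and the proportion correctly classified) are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import numpy as np

def cross_validate(n, fit, evaluate, K=5, seed=0):
    """Average of K held-out accuracy estimates.

    fit(train_idx) returns a fitted model; evaluate(model, test_idx) returns
    an accuracy estimate computed only from the held-out subjects.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)    # disjoint sets I_1..I_K
    estimates = []
    for k in range(K):
        train_idx = np.concatenate([folds[j] for j in range(K) if j != k])
        model = fit(train_idx)                       # data outside I_k
        estimates.append(evaluate(model, folds[k]))  # data inside I_k
    return float(np.mean(estimates)), folds

# Hypothetical toy data: one marker x and the binary status d = I(T <= t).
d = np.tile([0, 1], 20)
x = 10.0 * d + 0.01 * np.arange(40)

def fit(train_idx):
    # Toy "model": a cut-off halfway between the two class means.
    xt, dt = x[train_idx], d[train_idx]
    return 0.5 * (xt[dt == 1].mean() + xt[dt == 0].mean())

def evaluate(cutoff, test_idx):
    # Proportion of held-out subjects classified correctly by I(x >= cutoff).
    return float(np.mean((x[test_idx] >= cutoff) == (d[test_idx] == 1)))

cv_accuracy, folds = cross_validate(len(x), fit, evaluate, K=5)
```

Because the two classes are perfectly separated in this toy data, every fold is classified correctly; with realistic data the held-out estimates vary across folds and their average is typically lower than the accuracy computed on the training data.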
Survival Prediction with Multiple Markers
Estimating the Accuracy of the Composite Score: Interval Estimation
In addition to obtaining a point estimator of the accuracy, it is crucial to assess the variability of the estimated accuracy measure.
The variability may be assessed via procedures such as the bootstrap:
- Treat the observed data from $n$ subjects as $n$ units $\{D_1, \ldots, D_n\}$;
- Randomly sample $n$ units from $\{D_1, \ldots, D_n\}$ with replacement to obtain $\{D_1^*, \ldots, D_n^*\}$;
- Construct accuracy estimators based on each set of resampled data;
- Repeat the procedure $M_0$ times to obtain $M_0$ perturbed estimates of the accuracy;
- Construct interval estimates based on the empirical percentiles of the $M_0$ perturbed replications.
Other types of resampling methods, such as the wild bootstrap, have also been considered in the literature: Parzen et al (1994); Jin et al (2003); Cai et al (2005); Tian et al (2007).
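The bootstrap steps above translate into a few lines of code. This is a sketch of the ordinary nonparametric bootstrap with percentile intervals (not the wild bootstrap); the function names, the simulated uncensored data and the choice of the empirical TPR as the accuracy estimator are assumptions made for illustration.

```python
import numpy as np

def bootstrap_percentile_ci(data, estimator, M0=1000, level=0.95, seed=0):
    """Percentile interval from M0 resamples of the rows (subjects) of data."""
    data = np.asarray(data)
    n = data.shape[0]
    rng = np.random.default_rng(seed)
    reps = np.empty(M0)
    for m in range(M0):
        idx = rng.integers(0, n, size=n)   # sample n units with replacement
        reps[m] = estimator(data[idx])
    alpha = 1.0 - level
    lo, hi = np.quantile(reps, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)

# Hypothetical uncensored data: columns are the marker x and the event time.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
time = np.exp(-x + rng.normal(size=200))
data = np.column_stack([x, time])

def tpr_at(d, c=0.0, t=1.0):
    # Empirical TPR_t(c) on a (resampled) data set.
    case = d[:, 1] <= t
    return np.sum((d[:, 0] >= c) & case) / np.sum(case)

lo, hi = bootstrap_percentile_ci(data, tpr_at)
```

When the accuracy estimator involves a fitted composite score, the model fit belongs inside `estimator` so that each resample repeats the whole procedure.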
Summary
- Classification accuracy measures such as the TPR, FPR and ROC can be extended to the setting with survival outcomes.
- Different types of time-dependent accuracy measures may be obtained by defining the "diseased" and "non-diseased" populations at any given time $t$ differently.
- To obtain estimators of the classification accuracy measures with survival outcomes, one needs to incorporate censoring appropriately.
- When multiple markers are available, various survival regression models may be used to construct composite scores for prediction. Such scores may be optimal with respect to certain accuracy measures when the imposed model holds.
- Bias correction and variance estimation should be considered when assessing the accuracy.
Part III Measures of Classification Accuracy for the Prediction of Survival Times
Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples
More informationApplication of the Time-Dependent ROC Curves for Prognostic Accuracy with Multiple Biomarkers
UW Biostatistics Working Paper Series 4-8-2005 Application of the Time-Dependent ROC Curves for Prognostic Accuracy with Multiple Biomarkers Yingye Zheng Fred Hutchinson Cancer Research Center, yzheng@fhcrc.org
More informationPart IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation
Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Patrick J. Heagerty PhD Department of Biostatistics University of Washington 166 ISCB 2010 Session Four Outline Examples
More informationSurvival Prediction Under Dependent Censoring: A Copula-based Approach
Survival Prediction Under Dependent Censoring: A Copula-based Approach Yi-Hau Chen Institute of Statistical Science, Academia Sinica 2013 AMMS, National Sun Yat-Sen University December 7 2013 Joint work
More informationStatistical Methods for Alzheimer s Disease Studies
Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations
More informationDistribution-free ROC Analysis Using Binary Regression Techniques
Distribution-free Analysis Using Binary Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Introductory Talk No, not that!
More information[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements
[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers
More informationTime-dependent Predictive Values of Prognostic Biomarkers with Failure Time Outcome
Time-dependent Predictive Values of Prognostic Biomarkers with Failure Time Outcome Yingye Zheng, Tianxi Cai, Margaret S. Pepe and Wayne C. Levy Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationUnivariate shrinkage in the Cox model for high dimensional data
Univariate shrinkage in the Cox model for high dimensional data Robert Tibshirani January 6, 2009 Abstract We propose a method for prediction in Cox s proportional model, when the number of features (regressors)
More informationClassification. Classification is similar to regression in that the goal is to use covariates to predict on outcome.
Classification Classification is similar to regression in that the goal is to use covariates to predict on outcome. We still have a vector of covariates X. However, the response is binary (or a few classes),
More informationSurvival Model Predictive Accuracy and ROC Curves
UW Biostatistics Working Paper Series 12-19-2003 Survival Model Predictive Accuracy and ROC Curves Patrick Heagerty University of Washington, heagerty@u.washington.edu Yingye Zheng Fred Hutchinson Cancer
More informationPackage penalized. February 21, 2018
Version 0.9-50 Date 2017-02-01 Package penalized February 21, 2018 Title L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model Author Jelle Goeman, Rosa Meijer, Nimisha
More informationLecture 3. Truncation, length-bias and prevalence sampling
Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More informationNONPARAMETRIC ADJUSTMENT FOR MEASUREMENT ERROR IN TIME TO EVENT DATA: APPLICATION TO RISK PREDICTION MODELS
BIRS 2016 1 NONPARAMETRIC ADJUSTMENT FOR MEASUREMENT ERROR IN TIME TO EVENT DATA: APPLICATION TO RISK PREDICTION MODELS Malka Gorfine Tel Aviv University, Israel Joint work with Danielle Braun and Giovanni
More informationVersion of record first published: 01 Jan 2012.
This article was downloaded by: [North Carolina State University] On: 15 October 212, At: 8:45 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered
More informationSupport Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina
Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,
More informationEvaluation of the predictive capacity of a biomarker
Evaluation of the predictive capacity of a biomarker Bassirou Mboup (ISUP Université Paris VI) Paul Blanche (Université Bretagne Sud) Aurélien Latouche (Institut Curie & Cnam) GDR STATISTIQUE ET SANTE,
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationData splitting. INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+TITLE:
#+TITLE: Data splitting INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+AUTHOR: Thomas Alexander Gerds #+INSTITUTE: Department of Biostatistics, University of Copenhagen
More informationarxiv: v1 [stat.me] 25 Oct 2012
Time-dependent AUC with right-censored data: a survey study Paul Blanche, Aurélien Latouche, Vivian Viallon arxiv:1210.6805v1 [stat.me] 25 Oct 2012 Abstract The ROC curve and the corresponding AUC are
More informationLecture 5 Models and methods for recurrent event data
Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.
More informationMulti-state Models: An Overview
Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed
More informationBuilding a Prognostic Biomarker
Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,
More informationMulti-state models: prediction
Department of Medical Statistics and Bioinformatics Leiden University Medical Center Course on advanced survival analysis, Copenhagen Outline Prediction Theory Aalen-Johansen Computational aspects Applications
More informationEstimating a time-dependent concordance index for survival prediction models with covariate dependent censoring
Noname manuscript No. (will be inserted by the editor) Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring Thomas A. Gerds 1, Michael W Kattan
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationTEST-DEPENDENT SAMPLING DESIGN AND SEMI-PARAMETRIC INFERENCE FOR THE ROC CURVE. Bethany Jablonski Horton
TEST-DEPENDENT SAMPLING DESIGN AND SEMI-PARAMETRIC INFERENCE FOR THE ROC CURVE Bethany Jablonski Horton A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial
More informationSurvival Analysis. Stat 526. April 13, 2018
Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined
More informationExtensions of Cox Model for Non-Proportional Hazards Purpose
PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan
More informationOther Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model
Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);
More informationEVALUATING THE REPEATABILITY OF TWO STUDIES OF A LARGE NUMBER OF OBJECTS: MODIFIED KENDALL RANK-ORDER ASSOCIATION TEST
EVALUATING THE REPEATABILITY OF TWO STUDIES OF A LARGE NUMBER OF OBJECTS: MODIFIED KENDALL RANK-ORDER ASSOCIATION TEST TIAN ZHENG, SHAW-HWA LO DEPARTMENT OF STATISTICS, COLUMBIA UNIVERSITY Abstract. In
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationBayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects
Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University September 28,
More informationLecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL
Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates
More informationECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
More informationPENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA
PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University
More informationPackage Rsurrogate. October 20, 2016
Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast
More informationPh.D. course: Regression models. Introduction. 19 April 2012
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable
More informationMAS3301 / MAS8311 Biostatistics Part II: Survival
MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationLecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016
Statistics 255 - Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of
More informationPh.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 25 April 2013 www.biostat.ku.dk/~pka/regrmodels13 Per Kragh Andersen Regression models The distribution of one outcome variable
More informationYou know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?
You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David
More informationUNIVERSITY OF CALIFORNIA, SAN DIEGO
UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department
More informationIncorporating published univariable associations in diagnostic and prognostic modeling
Incorporating published univariable associations in diagnostic and prognostic modeling Thomas Debray Julius Center for Health Sciences and Primary Care University Medical Center Utrecht The Netherlands
More informationSTATISTICAL METHODS FOR EVALUATING BIOMARKERS SUBJECT TO DETECTION LIMIT
STATISTICAL METHODS FOR EVALUATING BIOMARKERS SUBJECT TO DETECTION LIMIT by Yeonhee Kim B.S. and B.B.A., Ewha Womans University, Korea, 2001 M.S., North Carolina State University, 2005 Submitted to the
More informationLecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016
Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family
More informationLecture 1. Introduction Statistics Statistical Methods II. Presented January 8, 2018
Introduction Statistics 211 - Statistical Methods II Presented January 8, 2018 linear models Dan Gillen Department of Statistics University of California, Irvine 1.1 Logistics and Contact Information Lectures:
More informationc Copyright 2015 Chao-Kang Jason Liang
c Copyright 2015 Chao-Kang Jason Liang Methods for describing the time-varying predictive performance of survival models Chao-Kang Jason Liang A dissertation submitted in partial fulfillment of the requirements
More informationRobust estimates of state occupancy and transition probabilities for Non-Markov multi-state models
Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models 26 March 2014 Overview Continuously observed data Three-state illness-death General robust estimator Interval
More informationRobustifying Trial-Derived Treatment Rules to a Target Population
1/ 39 Robustifying Trial-Derived Treatment Rules to a Target Population Yingqi Zhao Public Health Sciences Division Fred Hutchinson Cancer Research Center Workshop on Perspectives and Analysis for Personalized
More informationA Parametric ROC Model Based Approach for Evaluating the Predictiveness of Continuous Markers in Case-control Studies
UW Biostatistics Working Paper Series 11-14-2007 A Parametric ROC Model Based Approach for Evaluating the Predictiveness of Continuous Markers in Case-control Studies Ying Huang University of Washington,
More informationAuxiliary-variable-enriched Biomarker Stratified Design
Auxiliary-variable-enriched Biomarker Stratified Design Ting Wang University of North Carolina at Chapel Hill tingwang@live.unc.edu 8th May, 2017 A joint work with Xiaofei Wang, Haibo Zhou, Jianwen Cai
More informationAnalysis of MALDI-TOF Data: from Data Preprocessing to Model Validation for Survival Outcome
Analysis of MALDI-TOF Data: from Data Preprocessing to Model Validation for Survival Outcome Heidi Chen, Ph.D. Cancer Biostatistics Center Vanderbilt University School of Medicine March 20, 2009 Outline
More informationSTAT 526 Spring Final Exam. Thursday May 5, 2011
STAT 526 Spring 2011 Final Exam Thursday May 5, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will
MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS
MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS by Meredith JoAnne Lotz B.A., St. Olaf College, 2004 Submitted to the Graduate Faculty
Variable Selection in Competing Risks Using the L1-Penalized Cox Model
Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2008 Variable Selection in Competing Risks Using the L1-Penalized Cox Model XiangRong Kong Virginia Commonwealth
Prediction Performance of Survival Models
Prediction Performance of Survival Models by Yan Yuan A thesis presented to the University of Waterloo in fulfilment of the thesis requirement for the degree of Doctor of Philosophy in Statistics Waterloo,
REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520
REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU
Machine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture: Regression: $X \in \mathbb{R}^p$, $Y \in \mathbb{R}$ (continuous output). Classification: $X \in \mathbb{R}^p$, $Y \in \{\Omega_0, \Omega_1, \dots, \Omega_K\}$ (discrete output). X R p Y (X)
STAT 5500/6500 Conditional Logistic Regression for Matched Pairs
STAT 5500/6500 Conditional Logistic Regression for Matched Pairs Motivating Example: The data we will be using comes from a subset of data taken from the Los Angeles Study of the Endometrial Cancer Data
Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials
Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Progress, Updates, Problems William Jen Hoe Koh May 9, 2013 Overview Marginal vs Conditional What is TMLE? Key Estimation
Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data
Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor
Survival Regression Models
Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant
ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
Multimodal Deep Learning for Predicting Survival from Breast Cancer
Multimodal Deep Learning for Predicting Survival from Breast Cancer Heather Couture Deep Learning Journal Club Nov. 16, 2016 Outline Background on tumor histology & genetic data Background on survival
ST745: Survival Analysis: Nonparametric methods
ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any clinical research is to draw inferences;
Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics
Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the
6.873/HST.951 Medical Decision Support Spring 2004 Evaluation
Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support, Fall 2005 Instructors: Professor Lucila Ohno-Machado and Professor Staal Vinterbo 6.873/HST.951 Medical Decision
Logistic regression model for survival time analysis using time-varying coefficients
Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research
Longitudinal + Reliability = Joint Modeling
Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,
Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models
Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018
STAT Section 2.1: Basic Inference. Basic Definitions
STAT 518 --- Section 2.1: Basic Inference Basic Definitions Population: The collection of all the individuals of interest. This collection may be or even. Sample: A collection of elements of the population.
Accommodating covariates in receiver operating characteristic analysis
The Stata Journal (2009) 9, Number 1, pp. 17–39 Accommodating covariates in receiver operating characteristic analysis Holly Janes Fred Hutchinson Cancer Research Center Seattle, WA hjanes@fhcrc.org Gary
Towards stratified medicine instead of dichotomization, estimate a treatment effect function for a continuous covariate
Towards stratified medicine instead of dichotomization, estimate a treatment effect function for a continuous covariate Willi Sauerbrei 1, Patrick Royston 2 1 IMBI, University Medical Center Freiburg 2
Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources
Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan
Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation
Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 5: Review Purpose of Statistics Statistics is about science (Science in the broadest
Quantile Regression for Residual Life and Empirical Likelihood
Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu
Power and Sample Size Calculations with the Additive Hazards Model
Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine
Measurement Error in Spatial Modeling of Environmental Exposures
Measurement Error in Spatial Modeling of Environmental Exposures Chris Paciorek, Alexandros Gryparis, and Brent Coull August 9, 2005 Department of Biostatistics Harvard School of Public Health www.biostat.harvard.edu/~paciorek
Modelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
The Design and Analysis of Benchmark Experiments Part II: Analysis
The Design and Analysis of Benchmark Experiments Part II: Analysis Torsten Hothorn Achim Zeileis Friedrich Leisch Kurt Hornik Friedrich Alexander Universität Erlangen Nürnberg http://www.imbe.med.uni-erlangen.de/~hothorn/
Time-varying proportional odds model for mega-analysis of clustered event times
Biostatistics (2017) 00, 00, pp. 1–18 doi:10.1093/biostatistics/kxx065 Time-varying proportional odds model for mega-analysis of clustered event times TANYA P. GARCIA Texas A&M University, Department of
Lecture 7 Time-dependent Covariates in Cox Regression
Lecture 7 Time-dependent Covariates in Cox Regression So far, we've been considering the following Cox PH model: $\lambda(t \mid Z) = \lambda_0(t) \exp(\beta^\top Z) = \lambda_0(t) \exp\left(\sum_j \beta_j Z_j\right)$ where $\beta_j$ is the parameter for the
Classification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
Chapter 4 Regression Models
23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let $x = (x^{(1)}, \dots, x^{(m)})$ represent a vector of available covariates. Also called regression variables, regressors,
STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis
STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive
Müller: Goodness-of-fit criteria for survival data
Müller: Goodness-of-fit criteria for survival data Sonderforschungsbereich 386, Paper 382 (2004) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner Goodness of fit criteria for survival data
Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen
Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation
3003 Cure. F. P. Treasure
3003 Cure F. P. Treasure November 8, 2000 Peter Treasure / November 8, 2000/ Cure / 3003 1 Cure A Simple Cure Model The Concept of Cure A cure model is a survival model where a fraction of the population
Estimation of Conditional Kendall's Tau for Bivariate Interval Censored Data
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599–604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional
Meta-Analysis for Diagnostic Test Data: a Bayesian Approach
Meta-Analysis for Diagnostic Test Data: a Bayesian Approach Pablo E. Verde Coordination Centre for Clinical Trials Heinrich Heine Universität Düsseldorf Preliminaries: motivations for systematic reviews
Lecture 01: Introduction
Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction
Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression
Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about
DYNAMIC PREDICTION MODELS FOR DATA WITH COMPETING RISKS. by Qing Liu B.S. Biological Sciences, Shanghai Jiao Tong University, China, 2007
DYNAMIC PREDICTION MODELS FOR DATA WITH COMPETING RISKS by Qing Liu B.S. Biological Sciences, Shanghai Jiao Tong University, China, 2007 Submitted to the Graduate Faculty of the Graduate School of Public
Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:
Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation
Validation of Predictive Classifiers
Validation of Predictive Classifiers 1 Predictive Biomarker Classifiers In most positive clinical trials, only a small proportion of the eligible population benefits from the new rx Many chronic diseases are