You know I'm not goin' diss you on the internet
'Cause my mama taught me better than that
I'm a survivor (What?) I'm not goin' give up (What?) I'm not goin' stop (What?) I'm goin' work harder (What?)
Sir David Cox

Statistics 745: Lecture 12 Eric B. Laber Department of Statistics, North Carolina State University February 28, 2012

Where we've been
- Unconditional modeling of survival functions
- Parametric regression models
- Accelerated failure time models
- Cox proportional hazards models

Where we're going
- Application of these methods (a few more data examples and projects)
- Extending these ideas
- Personalized treatment and machine learning methods...

So, you've got some data
- You observe a training set $\mathcal{D} = \{(Y_i, X_i)\}_{i=1}^n$ where $Y \in \mathbb{R}$ and $X \in \mathbb{R}^p$. Your goal is to model the mean of $Y$ given $X$. How do you get started?
- You observe the training set $\mathcal{D} = \{(X_i, \Delta_i, Z_i)\}_{i=1}^n$ where $X_i = \min(T_i, C_i) \in \mathbb{R}_+$, $\Delta_i = 1_{T_i \le C_i}$, and $Z_i \in \mathbb{R}^p$. Your goal is to model the conditional survival function of $T$ given $Z$. How do you get started?
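As a small illustration of the right-censored data structure above, the following sketch builds $X_i$ and $\Delta_i$ from failure and censoring times. The function name `observe` and the numbers are made up for illustration; in real data only $X_i$ and $\Delta_i$ are available, never both $T_i$ and $C_i$.

```python
import numpy as np

def observe(failure_times, censoring_times):
    """Return (X, Delta) with X = min(T, C) and Delta = 1{T <= C}."""
    t = np.asarray(failure_times, dtype=float)
    c = np.asarray(censoring_times, dtype=float)
    x = np.minimum(t, c)            # observed follow-up time
    delta = (t <= c).astype(int)    # 1 = failure observed, 0 = censored
    return x, delta

# Hypothetical failure and censoring times for three subjects.
x, delta = observe([2.0, 5.0, 3.0], [4.0, 1.0, 3.0])
```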

Basics of data analysis
- TALK TO THE SCIENTISTS!
- Exploratory data analysis (EDA; this is where we will focus our attention today)
- Univariate plots
- Model diagnostics
- Survival is more complex than regression (Why?)

BMT Example
- Bone marrow transplant data (Example 1.3 in K&M)
- Response: patient survival time, right censored
- Many baseline predictors: age, sex, donor demographics, transplant waiting time, degree of need, center, etc.
- Time-dependent covariates: time to graft-versus-host disease, time to return of platelets to normal levels, etc.

Looking for relationships...

Binary predictors
- Case 1: $Z$ is binary
- Ex. sex
[Figure: estimated survival curves by sex (Male, Female); vertical axis from 0.0 to 1.0, horizontal axis from 0 to 2500]

Categorical predictors
- Case 2: $Z$ is categorical
- Ex. center
[Figure: estimated survival curves by center (OSU, Alfred, St. Vincent, Hahnemann); vertical axis from 0.0 to 1.0, horizontal axis from 0 to 2500]

Continuous predictors
- Case 3: $Z$ is continuous
- Ex. patient age
- What to do? Categorize and apply case 1 or 2
[Figure: estimated survival curves for two age groups (Age < 35, Age >= 35); vertical axis from 0.0 to 1.0, horizontal axis from 0 to 2500]
[Figure: estimated survival curves for four age groups (Age <= 21, 21 < Age <= 28, 28 < Age <= 35, 35 < Age); vertical axis from 0.0 to 1.0, horizontal axis from 0 to 2500]
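The categorization step can be sketched as follows, using the four-group cutpoints from the slide (21, 28, 35). The helper name `age_group` and the example ages are hypothetical; the resulting group index can then be treated as a categorical predictor (case 2).

```python
import numpy as np

def age_group(age, cutpoints=(21, 28, 35)):
    """Map ages to group indices 0..3 using right-closed intervals."""
    # right=True gives intervals of the form (lower, upper],
    # matching "Age <= 21", "21 < Age <= 28", "28 < Age <= 35", "35 < Age".
    return np.digitize(age, cutpoints, right=True)

labels = ["Age <= 21", "21 < Age <= 28", "28 < Age <= 35", "35 < Age"]
ages = np.array([18, 21, 22, 28, 35, 36])  # made-up ages, not BMT data
groups = age_group(ages)
```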

Model assumptions...

Checking proportional hazards
- Case 1: Binary covariate $Z$
- If the proportional hazards model holds, then $\Lambda(t \mid Z = 1) = \exp\{\beta\}\,\Lambda(t \mid Z = 0)$, where $\Lambda(t \mid Z = z)$ denotes the cumulative hazard function.
- Convince your neighbor that this is true (1 minute).
- Using the above relationship we have $\log \Lambda(t \mid Z = 1) - \log \Lambda(t \mid Z = 0) = \beta$. Thus, we can estimate the cumulative hazards and plot the left-hand side of this equation against $t$. A constant trend indicates that the proportional hazards model may be appropriate.
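One way to see the cumulative hazard relationship, assuming the usual proportional hazards form $\lambda(t \mid Z = z) = \lambda_0(t)\exp\{\beta z\}$ and writing $\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$ (a symbol introduced here for the baseline cumulative hazard):

```latex
\begin{align*}
\Lambda(t \mid Z = z)
  &= \int_0^t \lambda(u \mid Z = z)\,du
   = \int_0^t \lambda_0(u)\exp\{\beta z\}\,du
   = \exp\{\beta z\}\,\Lambda_0(t), \\
\Lambda(t \mid Z = 1)
  &= \exp\{\beta\}\,\Lambda_0(t)
   = \exp\{\beta\}\,\Lambda(t \mid Z = 0), \\
\log \Lambda(t \mid Z = 1) - \log \Lambda(t \mid Z = 0)
  &= \beta \quad \text{for all } t \text{ with } \Lambda_0(t) > 0.
\end{align*}
```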

Checking proportional hazards cont'd
- Recall the Nelson-Aalen estimator is given by $\hat{\Lambda}(t) = \sum_{u \le t} \frac{dN(u)}{Y(u)}$.
- Estimate $\Lambda(t \mid Z = 1)$ and $\Lambda(t \mid Z = 0)$ using the Nelson-Aalen estimator on the disjoint subsets of the data corresponding to $Z = 1$ and $Z = 0$.
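A minimal sketch of this check, assuming NumPy and right-censored data stored as arrays of observed times, event indicators, and a binary covariate. The function names and the toy data are illustrative only, and each group is assumed to contain at least one observed failure.

```python
import numpy as np

def nelson_aalen(times, events):
    """Return distinct failure times and the Nelson-Aalen estimate at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    # d_j: failures at t_j; y_j: subjects at risk at t_j
    d = np.array([np.sum((times == t) & (events == 1)) for t in event_times])
    y = np.array([np.sum(times >= t) for t in event_times])
    return event_times, np.cumsum(d / y)

def log_cumhaz_difference(times, events, z, grid):
    """log Lambda-hat(t | Z=1) - log Lambda-hat(t | Z=0) on a time grid."""
    times, events, z = map(np.asarray, (times, events, z))
    grid = np.asarray(grid, dtype=float)
    values = []
    for group in (1, 0):
        mask = z == group
        t_j, cumhaz = nelson_aalen(times[mask], events[mask])
        # Step function: take the value at the last failure time <= t;
        # before the first failure the log cumulative hazard is undefined.
        idx = np.searchsorted(t_j, grid, side="right") - 1
        values.append(np.where(idx >= 0, cumhaz[np.maximum(idx, 0)], np.nan))
    return np.log(values[0]) - np.log(values[1])

# Toy data (made up): a roughly constant difference over the grid
# would be consistent with proportional hazards.
t_toy = np.array([1.0, 3.0, 2.0, 4.0])
e_toy = np.array([1, 1, 1, 1])
z_toy = np.array([1, 1, 0, 0])
diff = log_cumhaz_difference(t_toy, e_toy, z_toy, [2.5, 3.5])
```

Plotting `diff` against the grid gives the diagnostic described above; with real data the grid would typically be the pooled failure times.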

Binary predictors
- Look for parallel log cumulative hazards
[Figure: estimated cumulative hazards by sex (Male, Female) on a log scale; vertical axis from 0.01 to 1.00, horizontal axis from 0 to 2500]

To be continued...
- To test proportional hazards for continuous predictors $Z$ we'll need some more tools...

Time-dependent covariates
- So far we have considered models built on baseline information
- Often, data are collected during the course of a study
- Using evolving patient information can lead to a better understanding of the survival function
- Does this matter?
- Yes, we can obtain better estimates of the survival function and better understand the interplay between evolving patient health characteristics and survival.
- No, we have to make treatment decisions before such evolving patient information is observed.
- Maybe, we can update patient treatment according to their changing health status.

Time-dependent covariates cont'd
- Suppose that for each patient, in addition to observing $X_i = \min(T_i, C_i)$ and $\Delta_i = 1_{T_i \le C_i}$, we observe $Z(t) = \{Z_1(t), Z_2(t), \ldots, Z_p(t),\ t \le T_i\}$
- How can we incorporate time-dependent covariates into the model?
- Basic Cox model: $\lambda(t \mid Z(t)) = \lambda_0(t) \exp\{\beta^\top Z(t)\}$

Basic Cox model
- Observed data $\mathcal{D} = \{(X_i, \Delta_i, \{Z_i(t), 0 \le t \le T_i\})\}_{i=1}^n$
- Under the assumption of conditional independence of censoring time and failure time, the partial likelihood is given by
$$L(\beta) = \prod_{i=1}^{D} \frac{\exp\left\{\sum_{j=1}^{p} \beta_j Z_{(i)j}(t_i)\right\}}{\sum_{k \in Y(t_i)} \exp\left\{\sum_{j=1}^{p} \beta_j Z_{kj}(t_i)\right\}},$$
where $t_1 < t_2 < \cdots < t_D$ denote the distinct failure times, $Z_{(i)}(t_i)$ denotes the covariates associated with the failure time $t_i$, and $Y(t_i)$ denotes the risk set at $t_i$.
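The partial likelihood above can be evaluated directly. The sketch below computes its logarithm with NumPy, assuming no tied failure times. The callback `covariate_at(k, t)`, which returns the covariate vector $Z_k(t)$ of subject $k$ at time $t$, is a hypothetical interface chosen for illustration, and the three-subject data set is made up.

```python
import numpy as np

def log_partial_likelihood(beta, x, delta, covariate_at):
    """Cox log partial likelihood with time-dependent covariates.

    beta         : coefficient vector of length p
    x, delta     : observed times and failure indicators
    covariate_at : hypothetical callback (k, t) -> Z_k(t), length-p array
    Assumes no tied failure times.
    """
    beta = np.asarray(beta, dtype=float)
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=int)
    total = 0.0
    for i in np.flatnonzero(delta == 1):
        t_i = x[i]
        # Risk set Y(t_i): subjects still under observation at t_i.
        risk_set = np.flatnonzero(x >= t_i)
        scores = np.array([beta @ covariate_at(k, t_i) for k in risk_set])
        total += beta @ covariate_at(i, t_i) - np.log(np.sum(np.exp(scores)))
    return total

# Made-up example: subject 1's covariate switches from 0 to 1 at time 2.
def switch_at_two(k, t):
    return np.array([1.0 if (k == 1 and t >= 2) else 0.0])

value = log_partial_likelihood([np.log(3.0)], [1.0, 2.0, 3.0], [1, 1, 0],
                               switch_at_two)
```

Note that every covariate in the risk set is evaluated at the current failure time $t_i$, which is what distinguishes this from the baseline-covariate case. Maximizing this function over `beta` with a generic optimizer would give the partial likelihood estimator.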

Basic Cox model cont'd
- Estimation and inference for $\beta$ can proceed as usual
- Maximize the partial likelihood to find an estimator
- Use the observed Fisher information and asymptotic normality to construct a confidence set, etc.

Basic Cox model cont'd
- Common examples of time-varying covariates: BMI, blood pressure, number of hospitalizations, presence of a comorbid condition, depression inventory (e.g. HAMD), treatment adherence...
- Of course, defining $Z_j(t) = Z_j$ for all $t$ makes any covariate time-dependent
- It is assumed in the derivation of the partial likelihood that $Z(t)$ is predictable, i.e., it is known conditional on all available information just prior to time $t$

Back to testing model adequacy
- Suppose that $Z$ is continuous (not time-dependent)
- We would like to test the proportional hazards assumption $\lambda(t \mid z) = \lambda_0(t) \exp\{\beta_1 z\}$
- A common approach is to use a time-dependent proportional hazards model to do this
- For a fixed function $g(t)$, define $W = g(t) Z$ and consider the alternative proportional hazards model $\lambda(t \mid Z, W) = \lambda_0(t) \exp\{\beta_1 Z + \beta_2 W\}$
- If the posited proportional hazards model (i.e., the one that depends only on $Z$) is correct, then $\beta_2 = 0$.
- Idea! Test $H_0: \beta_2 = 0$ to test the validity of proportional hazards.
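To see why $\beta_2 = 0$ corresponds to proportional hazards, compare two subjects whose values of $Z$ differ by one unit under the alternative model. The choice $g(t) = \log t$ below is only an illustrative assumption, and the Wald statistic is one standard way to carry out the test using the asymptotic normality mentioned earlier.

```latex
\begin{align*}
\frac{\lambda_0(t)\exp\{\beta_1 (z+1) + \beta_2\, g(t)(z+1)\}}
     {\lambda_0(t)\exp\{\beta_1 z + \beta_2\, g(t) z\}}
  &= \exp\{\beta_1 + \beta_2\, g(t)\}, \\
\text{with } g(t) = \log t: \quad
\exp\{\beta_1 + \beta_2 \log t\} &= e^{\beta_1}\, t^{\beta_2}, \\
\text{Wald statistic: } \quad
\frac{\hat{\beta}_2^{\,2}}{\widehat{\mathrm{Var}}(\hat{\beta}_2)}
  &\overset{\text{approx.}}{\sim} \chi^2_1 \quad \text{under } H_0 : \beta_2 = 0.
\end{align*}
```

The hazard ratio is free of $t$ exactly when $\beta_2 = 0$ (for non-constant $g$), so rejecting $H_0$ is evidence against proportional hazards in $Z$.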