Survival Analysis Math 434 Fall 2011


Part IV: Chap. 8, 9.2, 9.3, 11: Semiparametric Proportional Hazards Regression
Jimin Ding, Math Dept.
www.math.wustl.edu/ jmding/math434/fall09/index.html

Basic Model Setup
  Introduction
  Proportional Hazards Model
  Properties and Interpretation
  Coding Covariates

Estimation and Inference
  Partial Likelihood
  Breslow's Estimator
  Partial Likelihood More
  Solving Partial Likelihood
  Tests
  Local Tests
  More on Wald's Test
  When Ties Are Present
  Time-Dependent Covariates
  Stratified Proportional Hazards Models

Model Diagnostic
  Cox-Snell Residuals
  Martingale Residuals
  Deviance Residuals
  Check Proportional Hazard Assumption

Introduction

Nonparametric models: Kaplan-Meier estimation of the survival function, Nelson-Aalen estimation of the cumulative hazard.
Parametric models: exponential, Weibull, log-normal, log-logistic, and gamma distributions w/o covariates.
Semiparametric models: AFT model, proportional odds model, proportional hazards model.

Jimin Ding, November 1, 2011 Survival Analysis, Fall 2011 slide #2

Proportional Hazards Model

The proportional hazards model specifies the hazard of an individual with covariate vector $Z = (Z_1, \ldots, Z_p)^T$ as $h(t \mid Z) = h_0(t) \exp(\beta^T Z)$, where $h_0(t)$ is an unspecified baseline hazard. It was first proposed by Cox in 1972 and studied further in 1975, and is therefore often called the Cox model.

Properties and Interpretation

If we look at two individuals with covariate values $Z$ and $Z^*$, the ratio of their hazards is a constant that does not depend on time:
$$\frac{h(t \mid Z)}{h(t \mid Z^*)} = \frac{h_0(t)\exp(\beta^T Z)}{h_0(t)\exp(\beta^T Z^*)} = \exp[\beta^T (Z - Z^*)],$$
which is called the relative risk of an individual with risk factor $Z$ experiencing the event, compared with an individual with risk factor $Z^*$.

The logarithm of the ratio of the hazard rate to the baseline hazard rate is
$$\log \frac{h(t \mid Z)}{h_0(t)} = \beta^T Z = \sum_{k=1}^{p} \beta_k Z_k.$$
So the coefficients $\{\beta_1, \ldots, \beta_p\}$ can be interpreted as covariate effects, much as in the usual linear model. Testing for a covariate effect is equivalent to testing whether the corresponding coefficient is 0.

Coding Covariates

Usually the independent variables (covariates) are known at the start of the study. They are called fixed-time covariates or baseline covariates. Occasionally covariates vary with time and are called time-dependent covariates. The two kinds require different methods. Here we first discuss fixed-time covariates.

Quantitative: BMI, age, blood pressure, ...
Qualitative: gender, race, treatment, disease type, ... These are coded with dummy variables (indicators). The coefficient $\beta_k$ represents the difference in log hazard between two categories associated with covariate $Z_k$. When there are $a$ categories (levels), only $a - 1$ dummy variables (indicators) are needed.
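As a rough illustration of the two slides above, the sketch below computes the relative risk between two covariate vectors under the proportional hazards model and codes a three-level factor with two indicators. The coefficient values, the level names, and the helper names `relative_risk` and `dummy_code` are hypothetical and chosen only for illustration.

```python
import math

def relative_risk(beta, z, z_star):
    """Hazard ratio h(t|z) / h(t|z_star) = exp(beta . (z - z_star)) under the PH model."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, z, z_star)))

def dummy_code(level, levels):
    """Code one categorical value with a - 1 indicators; levels[0] is the reference."""
    return [1.0 if level == other else 0.0 for other in levels[1:]]

# Hypothetical coefficients: age (per year), then treatment B vs A and C vs A.
beta = [0.03, -0.5, 0.2]
levels = ["A", "B", "C"]

z = [60.0] + dummy_code("B", levels)       # 60-year-old on treatment B
z_star = [50.0] + dummy_code("A", levels)  # 50-year-old on reference treatment A

rr = relative_risk(beta, z, z_star)
print(f"relative risk = {rr:.4f}")
```

The baseline hazard cancels in the ratio, which is why the function never needs it.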

Partial Likelihood

Assume that censoring is noninformative and that there are no ties between the event times. We first derive the partial likelihood through a profile likelihood, which is discussed in Johansen (1983).

Denote the observed data as $(T_1, \delta_1, Z_1), (T_2, \delta_2, Z_2), \ldots, (T_n, \delta_n, Z_n)$. Let $t_1 < t_2 < \cdots < t_n$ denote the ordered event times and let $Z_{(i)k}$ be the $k$th covariate associated with the individual whose failure time is $t_i$.

Partial Likelihood (continued)
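A minimal sketch of the log partial likelihood in the no-ties setting assumed above, using the standard form $\log PL(\beta) = \sum_{i: \delta_i = 1} [\beta^T Z_i - \log \sum_{j: T_j \ge T_i} \exp(\beta^T Z_j)]$. The function name and the toy data are hypothetical; this is an illustration, not the derivation given in the lecture.

```python
import math

def log_partial_likelihood(beta, times, events, covariates):
    """Cox log partial likelihood for data without tied event times."""
    def linear_predictor(z):
        return sum(b * x for b, x in zip(beta, z))

    total = 0.0
    for t_i, d_i, z_i in zip(times, events, covariates):
        if d_i != 1:
            continue  # censored observations only enter through the risk sets
        # Sum of exp(beta . Z_j) over subjects still at risk at time t_i.
        risk_sum = sum(
            math.exp(linear_predictor(z_j))
            for t_j, z_j in zip(times, covariates)
            if t_j >= t_i
        )
        total += linear_predictor(z_i) - math.log(risk_sum)
    return total

# Hypothetical toy data: (time, event indicator, one covariate).
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
covariates = [[1.0], [0.0], [1.0], [0.0]]

lpl_at_zero = log_partial_likelihood([0.0], times, events, covariates)
lpl_at_half = log_partial_likelihood([0.5], times, events, covariates)
```

At $\beta = 0$ every subject has the same weight, so each event contributes minus the log of its risk-set size.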

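Breslow's Estimator

Breslow's estimator of the baseline cumulative hazard is
$$\hat{H}_0(t) = \sum_{t_i \le t} \frac{1}{\sum_{j \in R(t_i)} \exp(\hat\beta^T Z_j)},$$
where $R(t_i)$ is the set of individuals still at risk just before $t_i$.

Partial Likelihood More

From conditional probabilities: given that exactly one event occurs at $t_i$ among the individuals in $R(t_i)$, the probability that it is the individual actually observed to fail is
$$\frac{h(t_i \mid Z_{(i)})}{\sum_{j \in R(t_i)} h(t_i \mid Z_j)} = \frac{\exp(\beta^T Z_{(i)})}{\sum_{j \in R(t_i)} \exp(\beta^T Z_j)},$$
and the partial likelihood is the product of these terms over the event times.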
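Breslow's estimator of the baseline cumulative hazard can be sketched as a running sum over the distinct event times. The version below allows tied events by putting the number of events at each time in the numerator, which is an assumption beyond the no-ties setting of the slides; the function name and data are hypothetical.

```python
import math

def breslow_cumulative_hazard(beta, times, events, covariates):
    """Return [(t, H0_hat(t))] at each distinct event time t, in increasing order."""
    # Relative risk exp(beta . Z_j) for every subject.
    risk = [math.exp(sum(b * x for b, x in zip(beta, z_j))) for z_j in covariates]
    cumulative, steps = 0.0, []
    for t in sorted({t for t, d in zip(times, events) if d == 1}):
        deaths = sum(1 for t_j, d_j in zip(times, events) if t_j == t and d_j == 1)
        at_risk_sum = sum(r for t_j, r in zip(times, risk) if t_j >= t)
        cumulative += deaths / at_risk_sum
        steps.append((t, cumulative))
    return steps

# Hypothetical toy data with one covariate.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
covariates = [[1.0], [0.0], [1.0], [0.0]]

unadjusted = breslow_cumulative_hazard([0.0], times, events, covariates)
adjusted = breslow_cumulative_hazard([0.5], times, events, covariates)
```

With $\beta = 0$ the estimator reduces to the Nelson-Aalen estimator, which gives a quick sanity check.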

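Solving Partial Likelihood

The estimate $\hat\beta$ maximizes the log partial likelihood $\log PL(\beta)$. The score equations $U(\beta) = \partial \log PL(\beta) / \partial \beta = 0$ generally have no closed-form solution and are solved numerically, for example by Newton-Raphson.

Tests

Let $\hat\beta$ and $I(\beta)$ denote the MLE of $\beta$ and the $p \times p$ information matrix evaluated at $\beta$. For large samples, $\hat\beta$ approximately follows a $p$-dimensional normal distribution with mean $\beta$ and variance-covariance matrix $I^{-1}(\beta)$.

Three tests of $H_0: \beta = \beta_0$ are available, each approximately $\chi^2_p$ under $H_0$ in large samples:
Wald's test: $\chi^2_W = (\hat\beta - \beta_0)^T I(\hat\beta) (\hat\beta - \beta_0)$.
Likelihood ratio test: $\chi^2_{LR} = 2[\log PL(\hat\beta) - \log PL(\beta_0)]$.
Score test: $\chi^2_{SC} = U(\beta_0)^T I^{-1}(\beta_0) U(\beta_0)$.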
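A minimal sketch, assuming a single covariate and no tied event times, of how the partial likelihood might be maximized by Newton-Raphson and how the Wald, likelihood ratio, and score statistics for $H_0: \beta = 0$ follow from the score $U(\beta)$ and the information $I(\beta)$. All function names and the toy data are hypothetical.

```python
import math

def score_and_information(beta, times, events, z):
    """U(beta) and I(beta): first derivative and negative second derivative of log PL."""
    u, info = 0.0, 0.0
    for t_i, d_i, z_i in zip(times, events, z):
        if d_i != 1:
            continue
        s0 = s1 = s2 = 0.0
        for t_j, z_j in zip(times, z):
            if t_j >= t_i:  # subject j is still at risk at t_i
                w = math.exp(beta * z_j)
                s0 += w
                s1 += w * z_j
                s2 += w * z_j * z_j
        mean = s1 / s0                 # weighted mean of z over the risk set
        u += z_i - mean
        info += s2 / s0 - mean * mean  # weighted variance of z over the risk set
    return u, info

def log_pl(beta, times, events, z):
    total = 0.0
    for t_i, d_i, z_i in zip(times, events, z):
        if d_i == 1:
            s0 = sum(math.exp(beta * z_j) for t_j, z_j in zip(times, z) if t_j >= t_i)
            total += beta * z_i - math.log(s0)
    return total

def fit_newton(times, events, z, beta=0.0, tol=1e-10, max_iter=50):
    """Iterate beta <- beta + U(beta) / I(beta) until the step is negligible."""
    for _ in range(max_iter):
        u, info = score_and_information(beta, times, events, z)
        step = u / info
        beta += step
        if abs(step) < tol:
            return beta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical toy data: one binary covariate, no ties.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 0, 1]
z = [1, 0, 1, 1, 0, 0]

beta_0 = 0.0
beta_hat = fit_newton(times, events, z)
u_0, info_0 = score_and_information(beta_0, times, events, z)
u_hat, info_hat = score_and_information(beta_hat, times, events, z)

wald = (beta_hat - beta_0) ** 2 * info_hat
lr = 2.0 * (log_pl(beta_hat, times, events, z) - log_pl(beta_0, times, events, z))
score_stat = u_0 ** 2 / info_0
print(f"beta_hat={beta_hat:.4f} wald={wald:.4f} lr={lr:.4f} score={score_stat:.4f}")
```

Each statistic would be compared with a chi-squared distribution with one degree of freedom. The score test needs no model fit at all, since it only evaluates $U$ and $I$ at $\beta_0$.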

Local Tests

Consider a hypothesis about a subset of the $\beta$'s. Let $\beta = (\beta_1^T, \beta_2^T)^T$, where $\beta_1$ is a $q \times 1$ vector and is the part of $\beta$ of interest. The remaining $(p - q) \times 1$ vector is denoted by $\beta_2$.

For $H_0: \beta_1 = \beta_{10}$, the same three tests apply, with $\beta_2$ treated as a nuisance parameter: Wald's test, the likelihood ratio test, and the score test. Each statistic is approximately $\chi^2_q$ under $H_0$ in large samples.

More on Wald's Test
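As a small worked case of a local Wald test, the sketch below takes $p = 2$ and $q = 1$: the statistic is $(\hat\beta_1 - \beta_{10})^2$ divided by the estimated variance of $\hat\beta_1$, which is the upper-left entry of the inverse information matrix. The estimates and the information matrix are made-up numbers, and the function name is hypothetical.

```python
import math

def local_wald_first_coefficient(beta_hat, info, beta_10=0.0):
    """Wald test of H0: beta_1 = beta_10 in a two-parameter model (q = 1, p = 2)."""
    (i11, i12), (i21, i22) = info
    det = i11 * i22 - i12 * i21
    var_beta1 = i22 / det  # upper-left entry of the inverse of the 2x2 information
    stat = (beta_hat[0] - beta_10) ** 2 / var_beta1
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical estimates and observed information matrix.
beta_hat = [0.9, -0.3]
info = [[4.0, 1.0], [1.0, 2.0]]

stat, p_value = local_wald_first_coefficient(beta_hat, info)
print(f"Wald statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Note that the variance comes from the inverse of the full information matrix, not from $1 / I_{11}$; the two differ whenever the estimates are correlated.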

When Ties Are Present

Revised partial likelihoods: Breslow (1974), Efron (1977), Cox (1972).

When the number of ties is small relative to the sample size, all methods give similar results, and Breslow's and Efron's methods in particular are close to each other. When the number of ties is large, a more sophisticated and computationally intensive method, the exact method, is usually preferred.
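To make the difference between the Breslow and Efron approximations concrete, here is a minimal single-covariate sketch. With $d$ tied events at a time point, Breslow uses the full risk-set sum $d$ times in the denominator, while Efron subtracts the fraction $(j - 1)/d$ of the tied subjects' sum from the $j$th factor. The function name and data are hypothetical, and the exact method is not covered.

```python
import math

def log_pl_ties(beta, times, events, z, method="breslow"):
    """Log partial likelihood with tied event times (one covariate)."""
    if method not in ("breslow", "efron"):
        raise ValueError("method must be 'breslow' or 'efron'")
    total = 0.0
    for t in sorted({t for t, d in zip(times, events) if d == 1}):
        tied = [z_j for t_j, d_j, z_j in zip(times, events, z) if t_j == t and d_j == 1]
        d = len(tied)
        risk_sum = sum(math.exp(beta * z_j) for t_j, z_j in zip(times, z) if t_j >= t)
        tied_sum = sum(math.exp(beta * z_j) for z_j in tied)
        total += beta * sum(tied)
        for j in range(d):
            # Efron removes a growing share of the tied subjects from the risk set.
            fraction = j / d if method == "efron" else 0.0
            total -= math.log(risk_sum - fraction * tied_sum)
    return total

# Hypothetical toy data with two events tied at time 1.
times = [1, 1, 2, 3]
events = [1, 1, 1, 0]
z = [1, 0, 1, 0]

breslow_at_zero = log_pl_ties(0.0, times, events, z, method="breslow")
efron_at_zero = log_pl_ties(0.0, times, events, z, method="efron")
```

When every event time is distinct, `d` is 1 throughout and the two methods return the same value.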

Time-Dependent Covariates

Partial likelihood:
$$L(\beta) = \prod_{i} \frac{\exp[\beta^T Z_{(i)}(t_i)]}{\sum_{j \in R(t_i)} \exp[\beta^T Z_j(t_i)]},$$
where the product is over the ordered event times $t_i$, $Z_{(i)}(t_i)$ is the covariate vector at time $t_i$ of the individual who fails at $t_i$, and $R(t_i)$ is the risk set at $t_i$.

Examples 9.1 & 9.2 on pages 297-307.

Stratified Proportional Hazards Models

Model: $h_j(t \mid Z(t)) = h_{0j}(t) \exp[\beta^T Z(t)]$, $j = 1, \ldots, s$.

The regression coefficients are assumed to be the same in each stratum, although the baseline hazard functions may differ. The log partial likelihood function is
$$\log PL(\beta) = \log PL_1(\beta) + \log PL_2(\beta) + \cdots + \log PL_s(\beta),$$
where $\log PL_j(\beta)$ is the log partial likelihood of the $j$th stratum.

Example 9.1 on pages 309-311.
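The stratified log partial likelihood above is a sum of per-stratum terms, so risk sets never mix subjects from different strata. A minimal sketch with one fixed covariate and no ties within a stratum follows; the function names, stratum labels, and data are hypothetical.

```python
import math
from collections import defaultdict

def log_pl(beta, rows):
    """Log partial likelihood of one stratum; rows are (time, event, z) tuples."""
    total = 0.0
    for t_i, d_i, z_i in rows:
        if d_i == 1:
            risk_sum = sum(math.exp(beta * z_j) for t_j, _, z_j in rows if t_j >= t_i)
            total += beta * z_i - math.log(risk_sum)
    return total

def stratified_log_pl(beta, rows_with_stratum):
    """Sum of the per-stratum log partial likelihoods with a common beta."""
    strata = defaultdict(list)
    for time, event, z, stratum in rows_with_stratum:
        strata[stratum].append((time, event, z))
    return sum(log_pl(beta, rows) for rows in strata.values())

# Hypothetical data: (time, event, covariate, stratum).
data = [
    (1.0, 1, 1, "A"), (2.0, 1, 0, "A"), (3.0, 1, 1, "A"),
    (1.5, 1, 0, "B"), (2.5, 0, 1, "B"), (4.0, 1, 1, "B"),
]

stratified = stratified_log_pl(0.0, data)
pooled = log_pl(0.0, [(t, d, z) for t, d, z, _ in data])
```

Comparing `stratified` with `pooled` shows that ignoring the strata changes the risk sets and therefore the likelihood, even at $\beta = 0$.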

Cox-Snell Residuals

Facts: If a continuous random variable $X$ has distribution function $F(x)$ and cumulative hazard function $H(x)$, then $Y = F(X) \sim U[0, 1]$ and $W = H(X) \sim \mathrm{Exp}(1)$.

Cox-Snell residuals are defined as $r = \hat{H}(T)$, where $\hat{H}$ is the estimated cumulative hazard function (based on the model) and $T$ is the observed survival time.

Under the proportional hazards model $h(t \mid Z) = h_0(t) \exp(\beta^T Z)$, for observations $i = 1, 2, \ldots, n$,
$$r_i = \hat{H}_0(T_i) \exp(\hat\beta^T Z_i).$$
Under the Weibull model $S(t \mid Z) = \exp\{-\exp[-(\mu + \beta^T Z)/\sigma]\, t^{1/\sigma}\}$, for observations $i = 1, 2, \ldots, n$,
$$r_i = \exp[-(\hat\mu + \hat\beta^T Z_i)/\hat\sigma]\, T_i^{1/\hat\sigma}.$$

If the model is exactly right, the Cox-Snell residuals should approximately follow a unit exponential distribution, so the cumulative hazard function of the residuals should be $H_r(t) = t$. Therefore, if the model is correct, a plot of the Nelson-Aalen cumulative hazard estimate of the residuals against the residuals should be a straight line through the origin with slope 1.
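The Cox-Snell check described above can be sketched in two steps: compute $r_i = \hat{H}_0(T_i) \exp(\hat\beta^T Z_i)$, then compute the Nelson-Aalen estimate of the residuals, keeping the original censoring indicators. The baseline cumulative hazard $0.1\,t$, the coefficient, and the data below are hypothetical stand-ins for fitted quantities.

```python
import math

def cox_snell_ph(times, covariates, beta, baseline_cum_hazard):
    """Cox-Snell residuals r_i = H0_hat(T_i) * exp(beta . Z_i) under a PH model."""
    return [
        baseline_cum_hazard(t) * math.exp(sum(b * x for b, x in zip(beta, z_i)))
        for t, z_i in zip(times, covariates)
    ]

def nelson_aalen(values, events):
    """Nelson-Aalen cumulative hazard at each distinct uncensored value."""
    cumulative, steps = 0.0, []
    for v in sorted({v for v, d in zip(values, events) if d == 1}):
        deaths = sum(1 for v_j, d_j in zip(values, events) if v_j == v and d_j == 1)
        at_risk = sum(1 for v_j in values if v_j >= v)
        cumulative += deaths / at_risk
        steps.append((v, cumulative))
    return steps

def hypothetical_baseline(t):
    """Stand-in for an estimated baseline cumulative hazard (assumption)."""
    return 0.1 * t

# Hypothetical fitted coefficient and data.
beta_hat = [math.log(2.0)]
times = [10.0, 4.0, 7.0, 12.0]
events = [1, 1, 0, 1]
covariates = [[1.0], [0.0], [1.0], [0.0]]

residuals = cox_snell_ph(times, covariates, beta_hat, hypothetical_baseline)
check = nelson_aalen(residuals, events)
# Plotting the pairs in `check` should give points near the 45-degree line
# when the model fits; with four observations this only shows the mechanics.
```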

Martingale Residuals

The martingale residual is a slight modification of the Cox-Snell residual. In general, the martingale residual is defined, for $i = 1, 2, \ldots, n$, as
$$\hat{M}_i = N_i(\infty) - \int_0^\infty Y_i(t) \exp\{\hat\beta^T Z_i(t)\}\, d\hat{H}_0(t),$$
where $N_i(t)$ is the counting process recording whether the $i$th subject has had the event by time $t$, and $Y_i(t)$ is the indicator that individual $i$ is under study at a time just prior to $t$ (the indicator of being in the risk set). Here $\hat\beta$ and $\hat{H}_0(t)$ are the regression parameter estimate and Breslow's estimate of the baseline cumulative hazard. If the model is exactly right and $\hat\beta$ and $\hat{H}_0(t)$ are replaced by the true $\beta$ and $H_0(t)$, the martingale residuals would be martingales.

Unlike Cox-Snell residuals, a plot of martingale residuals not only checks the model assumption but also suggests the functional form of a covariate in the model. Suppose some covariates, whose proper functional form is known, are already in the proportional hazards model. To see how an additional covariate should enter, plot the martingale residuals from that model against the new covariate and fit the points with a smoother. The smoothed curve suggests the proper functional form of the new covariate.

Example 11.2 on pages 360-362.
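For time-fixed covariates the integral in the definition above reduces to $\hat{H}_0(T_i) \exp(\hat\beta^T Z_i)$, so $\hat{M}_i = \delta_i - \hat{H}_0(T_i) \exp(\hat\beta^T Z_i)$. The sketch below assumes one fixed covariate and Breslow's baseline estimate; the function name, the coefficient value, and the data are hypothetical.

```python
import math

def martingale_residuals(beta, times, events, z):
    """M_i = delta_i - H0_hat(T_i) * exp(beta * z_i), with Breslow's H0_hat."""
    weights = [math.exp(beta * z_i) for z_i in z]
    # Jumps of the Breslow estimator at the distinct event times.
    jumps = []
    for t in sorted({t for t, d in zip(times, events) if d == 1}):
        deaths = sum(1 for t_j, d_j in zip(times, events) if t_j == t and d_j == 1)
        risk_sum = sum(w for t_j, w in zip(times, weights) if t_j >= t)
        jumps.append((t, deaths / risk_sum))
    residuals = []
    for t_i, d_i, w_i in zip(times, events, weights):
        baseline = sum(jump for t, jump in jumps if t <= t_i)  # H0_hat(T_i)
        residuals.append(d_i - baseline * w_i)
    return residuals

# Hypothetical data and coefficient.
times = [2.0, 3.0, 5.0, 5.0, 7.0]
events = [1, 0, 1, 1, 0]
z = [1.0, 0.0, 1.0, 0.0, 1.0]

m_hat = martingale_residuals(0.7, times, events, z)
print([round(m, 3) for m in m_hat])
```

Because the Breslow jumps are built from the same risk-set sums, the residuals add up to zero, which is a convenient check on an implementation.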

Deviance Residuals

The problem with martingale residuals is that they are skewed, with maximum value 1 but minimum value $-\infty$. The deviance residual is used to obtain a more normally shaped residual. It is defined as
$$D_i = \mathrm{sign}[\hat{M}_i] \{-2[\hat{M}_i + \delta_i \log(\delta_i - \hat{M}_i)]\}^{1/2}.$$
One may plot deviance residuals against the risk scores $\hat\beta^T Z_i = \sum_{k=1}^{p} \hat\beta_k Z_{ik}$. With light to moderate censoring, the residuals should look like a sample of normal noise. With heavy censoring, many values close to 0 may distort the normal approximation. In either case, potential outliers have deviance residuals with large absolute values.

Example 11.2 on pages 382-384.

Deviance Residuals in Parametric Models

Deviance residuals are defined in the same way in parametric models and can be used to check for outliers. The martingale residuals in parametric models, however, are simply $\hat{M}_i = \delta_i - r_i$ for $i = 1, 2, \ldots, n$, where $r_i$ is the Cox-Snell residual. These residuals are not martingales under the true model, but they have similar properties and hence share the name.

Example 12.2 on pages 416 and 420.
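The deviance transformation is a one-line function of the martingale residual and the event indicator. A minimal sketch follows; the function name and the input values are hypothetical, and the guard against tiny negative arguments to the square root is an implementation choice, not part of the definition.

```python
import math

def deviance_residual(martingale, delta):
    """D = sign(M) * sqrt(-2 * [M + delta * log(delta - M)])."""
    inner = martingale
    if delta == 1:
        # delta - M equals the Cox-Snell residual, which is positive for an event.
        inner += math.log(1.0 - martingale)
    # inner is never positive in exact arithmetic; max() guards against rounding.
    return math.copysign(math.sqrt(max(0.0, -2.0 * inner)), martingale)

# Hypothetical martingale residuals with their event indicators.
martingale = [0.5, -0.5, -2.0, 0.9]
delta = [1, 0, 1, 1]

deviance = [deviance_residual(m, d) for m, d in zip(martingale, delta)]
print([round(v, 3) for v in deviance])
```

The large negative martingale residual of -2.0 is pulled in toward zero while the residual of 0.9 is stretched outward, which is the symmetrizing effect described above.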

Check Proportional Hazard Assumption

Besides Cox-Snell residuals, other plots, such as the Andersen plot, the Arjas plot, and the standardized score residual plot, can also be used to check the proportional hazards assumption. For more details, read Section 11.4 of the textbook (pages 363-379).