Multivariate Survival Analysis

Similar documents
ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL

Frailty Modeling for clustered survival data: a simulation study

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

Models for Multivariate Panel Count Data

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Modelling geoadditive survival data

Lecture 5 Models and methods for recurrent event data

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work

CTDL-Positive Stable Frailty Model

Regularization in Cox Frailty Models

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Stat 5101 Lecture Notes

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Longitudinal + Reliability = Joint Modeling

A general mixed model approach for spatio-temporal regression data

Tied survival times; estimation of survival probabilities

Proportional hazards regression

Lecture 7 Time-dependent Covariates in Cox Regression

Analysing geoadditive regression data: a mixed model approach

Frailty Models and Copulas: Similarities and Differences

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Lecture 3. Truncation, length-bias and prevalence sampling

The Proportional Hazard Model and the Modelling of Recurrent Failure Data: Analysis of a Disconnector Population in Sweden. Sweden

Understanding the Cox Regression Models with Time-Change Covariates

Modelling and Analysis of Recurrent Event Data

FRAILTY MODELS FOR MODELLING HETEROGENEITY

Applied Survival Analysis Lab 10: Analysis of multiple failures

Statistics 203: Introduction to Regression and Analysis of Variance Penalized models

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

MAS3301 / MAS8311 Biostatistics Part II: Survival

Extensions of Cox Model for Non-Proportional Hazards Purpose

On the Breslow estimator

Reduced-rank hazard regression

Collinearity/Confounding in richly-parameterized models

ST745: Survival Analysis: Cox-PH!

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros

Philosophy and Features of the mstate package

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

Survival Regression Models

Part 8: GLMs and Hierarchical LMs and GLMs

Graag wil ik iedereen bedanken die gedurende de voorbije zes jaren heeft bijgedragen tot de realisatie van deze thesis.

Hierarchical Modeling for Univariate Spatial Data

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

4. Comparison of Two (K) Samples

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

WU Weiterbildung. Linear Mixed Models

Instrumental variables estimation in the Cox Proportional Hazard regression model

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

Subject CS1 Actuarial Statistics 1 Core Principles

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

Extensions of Cox Model for Non-Proportional Hazards Purpose

Multivariate spatial modeling

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Model Selection in Cox regression

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Bayesian linear regression

Full likelihood inferences in the Cox model: an empirical likelihood approach

Time-dependent covariates

Residuals and model diagnostics

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Two-level lognormal frailty model and competing risks model with missing cause of failure

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

Part 7: Hierarchical Modeling

Semiparametric Regression

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Multistate models and recurrent event models

Statistical Methods For Biomarker Threshold models in clinical trials

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Survival Analysis for Case-Cohort Studies

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.

Proportional hazards model for matched failure time data

Chapter 2. Data Analysis

Package JointModel. R topics documented: June 9, Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1.

On a connection between the Bradley-Terry model and the Cox proportional hazards model

11 Survival Analysis and Empirical Likelihood

STAT331. Cox s Proportional Hazards Model

Bayesian Gaussian / Linear Models. Read Sections and 3.3 in the text by Bishop

Simulation-based robust IV inference for lifetime data

Classification. Chapter Introduction. 6.2 The Bayes classifier

Maximum likelihood estimation of a log-concave density based on censored data

Estimation with clustered censored survival data with missing covariates in the marginal Cox model

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Bayesian Nonparametric Inference Methods for Mean Residual Life Functions

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

β j = coefficient of x j in the model; β = ( β1, β2,

TGDR: An Introduction

frailtyem: An R Package for Estimating Semiparametric Shared Frailty Models

Transcription:

Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in practice in difference ways: Clustered survival data. This happens when failure times (often of the same type, eg. death) from the same cluster are dependent, such as in multicenter clinical trials: cluster = clinical center familial data: cluster = family matched pairs: cluster = pair Recurrent event data. This happens when an individual may experience events of the same type more than once, eg, infections. It is actually a special case of clustered data, where the failure times are ordered. What are the failure time random variables in each case? 1

Work has been done (and still being done) to estimate nonparametrically the joint survival distribution, without any predictors, eg. bivariate survival function. This turns out to be much harder than the one dimensional case (KM), mainly due to censoring. In practice, we are often interested in relating certain covariates to the survival time (regression setting), while taking into consideration the dependence among the survival times. Here we will introduce extensions of the Cox model to handle correlated survival data. There are two types of approaches: the conditional approach (using random effects) the marginal (WLW) approach 2

The random effects approach The dependence among failure times may be viewed as induced by unmeasured factors that are common to subjects within a cluster but may vary from cluster to cluster. For example, EST 1582 is a phase III multi-institutional lung cancer trial to compare two treatments, CAV vs. CAV-HEM. It was shown (Gray 1995) that there is significant institutional variation in the treatment effects (see plot); for familial data, common genetic or environmental factors might have induce the dependence of survival times among the family members; in recurrent events, multiple events of the same individual are likely correlated due to some factor unique to that individual but different from individual to individual. 3

Estimated treatment effects (log HR) -0.5-0.4-0.3-0.2-0.1 0.0 0.1 16 18 19 28 29 0 5 10 15 20 25 30 Institution Figure 1: Estimated institutional treatment effects (E1582). Institutions are sorted in increasing order of number of patients. One may model these cluster-to-cluster variation as fixed effects. For example, in the E1582 data, if Z = 1 for CAV- HEM, and Z = 0 for CAV, instead of using a single β for the treatment difference, we may use β 1 for the treatment effect in institution 1, β 2 for the treatment effect in institution 2,... there are 31 institutions (579 observations). How do we write down such a fixed effects Cox model? Obviously the above fixed effects model will introduce many parameters. In the above figure, that is NOT how we obtained the estimated effects. 4

Alternatively one may try to use a stratified Cox model. But this will still have problems with small strata. In fact, the number of patients per institution varied from 1 to 56. One way to solve this problem is to turn the many (31) fixed treatment effects into random effects, i.e. put a (prior) distribution on them (in a Bayesian sense). This effectively shrinks the magnitude of the estimated effects towards zero. Shrinkage estimators are known to be biased, but nonetheless have smaller MSE (Morris, 1983). 5

The univariate frailty model The simplest random effects model for clustered survival data is the univariate frailty model λ ij (t) = λ 0 (t) exp{β Z ij + b i }. Here i indicates the cluster, and j the individual from cluster i (i = 1,..., n, j = 1,...n i ), and b i is the random cluster effect. This is sometimes called a hierarchical model. The assumption of the random effects model is that b 1,..., b n are i.i.d. random variables, from a distribution known up to a finite number of parameters. For example, b i N(0, σ 2 ). This can substantially reduce the number of parameters compared to a fixed effects model. Such modeling is most suitable when there are many clusters and each of relative small size. In general b i induces correlation among the individuals in the same cluster i. 6

We can equivalently write the above model as λ ij (t) = λ 0 (t) exp{β Z ij }ω i. so that b i = log ω i ; ω i is often called the frailty term. Early use of frailty models was mainly to count for unobserved heterogeneity, or unmeasured covariates, for i.i.d. data (n i = 1). In the past, the commonly used distribution for the ω i s is the gamma distribution with mean of one and unknown variance. This is mainly due to mathematical convenience (conjugate prior). The mean of b i (or ω i ) needs to be fixed to avoid problems of identifiability (why?). The univariate frailty model can be fitted using coxph() with option frailty. The distributions allowed for ω i are gamma and log-normal. 7

In terms of the data generating mechanism, we can think of the clusters as being sampled from a population. b i here is sometimes called the random intercept; it s a random effect on the baseline hazard. This model counts for the correlation among the failure times within a cluster, but not the covariate effects (like in E1582) that vary from cluster to cluster. 8

General random effects Cox model A more general random effects Cox model is λ ij (t) = λ 0 (t) exp{β Z ij + b iw ij }, Here W ij is a vector of covariates that have random effects, such as treatment in E1582 data; b i is the covariate by cluster interaction (compare to the fixed effects model we might have written down earlier). When W ij = 1, the above is the univariate frailty model. Often the covariates in W are also included in Z, so that they have fixed effects (can be thought of as main effects ), and we impose that E(b i ) = 0. The above model is also called the proportional hazards mixed-effects model (PHMM). It parallels the linear mixedeffects model (LMM) and generalized mixed-effects model (GLMM). The b i s are often assumed to be normal. Gamma distribution is no longer suitable, as it is not scale invariant. This model was considered in Vaida and Xu (2000), and Ripatti and Palmgren (2000). 9

log hazard -1.0-0.5 0.0 0.5 1.0 Cluster Treatment 0 2 4 6 8 10 time log hazard -1.0-0.5 0.0 0.5 1.0 Trt, Cluster 2 Trt, Cluster 1 0 2 4 6 8 10 time Figure 2: Top: cluster effects (βz + b i0 ); bottom: cluster x treatment interaction (βz + b i0 + b i1 Z). 10

Inference Inference under the random effects Cox model is carried out using the full likelihood (as opposed to the partial likelihood). The assumptions are: data from different clusters are independent; within the same cluster i, the dependence is induced by b i ; in other words, conditional on b i, the individuals in cluster i are independent. Conditional on b i, we may write down the full likelihood for cluster i: L i = n i j=1 λ ij (X ij ) δ ij S ij (X ij ) = n i j=1 {λ 0 (X ij )e β Z ij +b i W ij } δ ij exp{ Λ 0 (X ij )e β Z ij +b i W ij } The whole likelihood condition on the b i s is n L i. i=1 11

The likelihood for observed data is then n L(β, λ 0, σ) = L i p(b i ; σ)db i. i=1 Here σ denotes the unknown parameter(s) in the distribution of the random effects, and p(b i ; σ) is the density of b i. The parameters to be estimated are (β, λ 0, σ). In order to solve for MLE, one needs to evaluate the above integrals. Often these integrals are not of closed forms. Iterative algorithms such as EM are often used. An R package phmm is available for the above NPMLE. Table 1: Estimates (standard errors) from E1582 lung cancer trial d = 0 d = 1 d = 2 β β Var(b) β Var(b) treatment -0.254 (0.085) -0.250 (0.104) 0.071 (0.069) -0.247 (0.119) 0.046 (0.184) bone 0.223 (0.093) 0.212 (0.095) - 0.230 (0.144) 0.129 (0.083) liver 0.429 (0.090) 0.423 (0.091) - 0.393 (0.094) - ps -0.602 (0.104) -0.641 (0.109) - -0.649 (0.131) - wt loss 0.200 (0.087) 0.218 (0.089) - 0.208 (0.092) - 12

Notes In a previous figure of estimated treatment effects by institution, those are the estimated b i s under the d = 1 model. They are obtained as empirical Bayes estimate following the MLE; they are the posterior means (can also be mode) of the b i s given the observed data. In addition, R package phmm gives the NPMLE, as well as the empirical Bayes estimate of the random effects, and various testing, AIC etc.; Asymptotic properties for the NPMLE has been established, with also stable numerical properties (Gamst et al, 2009). coxph() with frailty option gives a penalized partial likelihood (PPL) estimator; there is also an R package coxme that gives the PPL under the general PHMM. No theoretical properties for the PPL has been established (inconsistent in general). It has numerical problems sometimes, and the estimated variances (hence CI s) can be inaccurate. PPL is faster to compute. 13

(a) Random effect -3-2 -1 0 1 2 3 1 event/patient 2 events/patient 3 4 0 20 40 60 80 100 120 Patient (b) Random effect -3-2 -1 0 1 2 3 1 event/patient 2 events/patient 3 4 0 20 40 60 80 100 120 Patient Figure 3: Estimated random effects from univariate lognormal frailty model and 95% confidence intervals (CGD data). (a) without adjustment for covariates other than treatment; (b) with adjustment for covariates. Notes on recurrent events Both the marginal model and the random effects model can be used for recurrent event data. When the random effects model is used, the assumption is a common baseline hazard function for the events of the same individual, like in the CGD example. 14

Other methods have been studied that are aimed specifically for recurrent event data. One popular such method is PWP (Prentice, Williams and Peterson 1981), which is a conditional approach that takes into account the ordering of the events, and was the recommended method by Oakes (1992, in Survival Analysis, State of the Art, ed. Klein and Goel). Another simple approach that is sometimes used is the Anderson-Gill model, or the counting process approach. This has the favor of the simple Cox model, and requires the relatively strong assumption that the times between successive events be conditionally independent given the history. The covariate prior number of events is sometimes included in such a model. Therneau and Hamilton (1997, Stat in Med, p.2029-2047) compared the above methods using example datasets, showing how they are implemented using popular statistical software. Similar material can also be found in Therneau and Grambsch s book. 15