Semiparametric Regression


Semiparametric Regression
Patrick Breheny
October 22
Survival Data Analysis (BIOS 7210)

Introduction
Over the past few weeks, we've introduced a variety of regression models under the proportional hazards (PH) and accelerated failure time (AFT) frameworks. I say "variety" in the sense that under each modeling framework, we can assume various parametric distributions for the failure time and arrive at different models. In this course, we considered only a few examples, concentrating mainly on exponential and Weibull regression, but Chapter 2 of Kalbfleisch and Prentice introduces a number of additional possibilities (of ever-increasing complexity).

Parametric vs. nonparametric
One approach to modeling, then, is to try out and compare various parametric models in an attempt to determine which parametric form best fits the observed distribution. An alternative approach, however, is to avoid parametric assumptions concerning the distribution of survival times altogether and address the problem in a nonparametric manner. Broadly speaking, these nonparametric approaches are something of a last resort in the industrial testing and reliability field, but the models of choice in medical research.

Semiparametric modeling
The models we will consider, however, are not entirely nonparametric, in that we would still like parameters to describe the way in which the covariates affect survival. Thus, we would like our models to be semiparametric:
Nonparametric portion: the underlying survival distribution
Parametric portion: the way in which covariates affect that underlying distribution
The development of semiparametric models can be pursued under both the PH and AFT frameworks; our goal for today is to introduce the main idea behind both approaches.

Introduction
First, let us consider the AFT model:
$Y_i = x_i^T \beta + W_i$, where $Y_i = \log T_i$.
Here, the parametric aspect of the model is the $x_i^T \beta$ portion, while the nonparametric aspect involves assuming that $W_i \overset{\text{iid}}{\sim} F$, where $F$ is some generic, unspecified distribution (note that for the sake of identifiability, our model cannot contain an intercept unless we introduce a restriction on the location of $W$). In other words, we want to carry out inference concerning $\beta$ without deciding on a specific distribution $F$.

Rank regression
The regular linear regression version of this problem is well studied in nonparametric statistics. A widely used method, rank regression, generalizes the basic idea of the Wilcoxon rank sum test to the regression setting. Consider the case of a single covariate, and let $x_{(i)}$ denote the covariate value associated with the $i$th largest response. A nonparametric rank-based test for the association between $x$ and the outcome can then be based on the test statistic
$\sum_i (i - \bar{i})(x_{(i)} - \bar{x})$, where $\bar{i} = (n + 1)/2$.

Extension to censored data
To extend this idea to censored data, consider modifying the test statistic to
$U = \sum_j (x_{(j)} - \bar{x}_{(j)})$,
where $j$ indexes the unique failure times, $x_{(j)}$ denotes the covariate associated with the $j$th failure time, and $\bar{x}_{(j)}$ denotes the average of the covariate values of all subjects at risk at time $t_j$. Furthermore, it can be shown that under the null, the variance of the above sum is
$V = \sum_j (x_{(j)} - \bar{x}_{(j)})^2$.
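The two sums above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_test_stats` and the toy data are made up for this example, NumPy is assumed to be available, and failure times are assumed to have no ties, so each observed failure is its own unique failure time.

```python
import numpy as np

def rank_test_stats(time, event, x):
    """Return (U, V) for the censored-data rank statistic.

    time  : observed times (failure or censoring)
    event : 1 if the time is an observed failure, 0 if censored
    x     : a single covariate
    Assumes no tied failure times (an assumption of this sketch).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    x = np.asarray(x, dtype=float)
    U = 0.0
    V = 0.0
    for j in np.flatnonzero(event):       # loop over observed failures only
        at_risk = time >= time[j]         # subjects still at risk at t_j
        diff = x[j] - x[at_risk].mean()   # x_(j) minus the risk-set average
        U += diff
        V += diff ** 2
    return U, V

# Hypothetical toy data: five subjects, the third one censored.
time = [2, 3, 5, 7, 11]
event = [1, 1, 0, 1, 1]
x = [1, 0, 1, 0, 1]
U, V = rank_test_stats(time, event, x)
print(U, V, U ** 2 / V)
```

Censored subjects never contribute a term of their own, but they do enter the risk-set averages for every failure that occurs before their censoring time.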

Rank-based estimation
Thus, we can base our test of $H_0: \beta = 0$ upon
$U^2 / V \overset{\cdot}{\sim} \chi^2_1$.
In order to use this idea for estimation and confidence interval construction, however, we need to be able to test the general null hypothesis $H_0: \beta = \beta_0$. Consider, therefore, testing whether the residuals of the AFT model are associated with the covariate: i.e., we apply the same procedure as before, but instead of ranking the outcomes, we rank $W_i = Y_i - x_i^T \beta_0$.

Pike rat: Setup
To get a sense of how this works, let's apply the idea to the Pike rat data and use it to estimate the effect of pretreatment regimen (Group). Over a grid of values for $\beta_0$, we will compute $W_i = Y_i - x_i \beta_0$, then calculate $U$ and $V$ along with the test statistic $U^2/V$. We will retain inside our confidence interval all values of $\beta_0$ such that $U^2/V \le \chi^2_{1, 1-\alpha}$. Furthermore, we will select as a point estimate $\hat{\beta}$ the value that minimizes $U^2/V$.
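The grid procedure described above might look roughly like the following sketch. It does not use the Pike rat data: the data here are simulated purely for illustration (Weibull errors, a binary covariate, and a true coefficient of 1 are all assumptions of this example), the function names are hypothetical, and the critical value 3.841459 is the 0.95 quantile of the chi-square distribution with one degree of freedom.

```python
import numpy as np

CHI2_1_95 = 3.841459  # 0.95 quantile of the chi-square distribution, 1 df

def rank_stat(resid, event, x):
    """U^2 / V computed from the ranks of the (possibly censored) residuals."""
    U = V = 0.0
    for j in np.flatnonzero(event):
        at_risk = resid >= resid[j]        # risk set on the residual scale
        diff = x[j] - x[at_risk].mean()
        U += diff
        V += diff ** 2
    return U ** 2 / V if V > 0 else 0.0    # V == 0 means no information

def rank_aft(y, event, x, grid):
    """Point estimate and 95% interval from a grid search over beta_0."""
    stats = np.array([rank_stat(y - x * b0, event, x) for b0 in grid])
    inside = grid[stats <= CHI2_1_95]      # values of beta_0 not rejected
    return grid[np.argmin(stats)], inside.min(), inside.max(), stats

# Simulated data (assumption): log T = x * 1 + W, with W = log of a Weibull.
rng = np.random.default_rng(1)
n = 200
x = rng.integers(0, 2, size=n).astype(float)
log_t = 1.0 * x + np.log(rng.weibull(1.5, size=n))
log_c = np.log(rng.exponential(5.0, size=n))   # independent censoring times
event = log_t <= log_c
y = np.minimum(log_t, log_c)

grid = np.linspace(0.0, 2.0, 201)
est, lo, hi, stats = rank_aft(y, event, x, grid)
print(est, lo, hi)
```

The grid has to bracket the region where the statistic falls below the critical value; otherwise the reported interval endpoints are simply the edges of the grid.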

Pike rat: Results
[Figure: the test statistic ($\chi^2$, vertical axis) plotted against $\beta$ (horizontal axis).]

Pike rat: Comparison
Method         $\hat{\beta}$    95% CI lower    95% CI upper
Rank-based     0.11             -0.03           0.26
Weibull        0.13              0.01           0.25
Exponential    0.09             -0.56           0.75
Lognormal      0.09             -0.04           0.23
We get wildly different results depending on the assumed distribution; it is reassuring, however, that the rank-based results resemble the Weibull, which previous diagnostics suggested was a reasonable choice.

Simulated example
True distribution: Weibull with $\gamma = 1.5$, $\beta = 1$
[Figure: the test statistic ($\chi^2$, vertical axis) plotted against $\beta$ (horizontal axis).]
Method         $\hat{\beta}$    95% CI lower    95% CI upper
Rank-based     1.04             0.89            1.21
Weibull        1.05             0.89            1.21
Exponential    1.31             1.07            1.54
Lognormal      1.08             0.90            1.27

Advantages
The obvious advantage of the rank-based approach is that often, all one wants is to estimate the effects of various covariates; specifying the distribution of $W$ is a nuisance that one only cares about to the extent that it affects the estimation of $\beta$. In those cases, the rank-based AFT is a very robust method: regardless of the underlying distribution, it produces reasonable estimates (assuming that the AFT assumptions hold, of course). Furthermore, the loss of efficiency (compared to choosing the correct parametric form, if one exists) is typically rather mild.

Disadvantages
The overwhelming disadvantage, however, is that the test statistic is not a continuous function of $\beta$. Typically, changing $\beta$ by a small amount will not change the ranking of the residuals at all (thereby leaving the test statistic unchanged as well). Only for a countable number of values does changing $\beta$ produce changes in the test statistic, and when it does, those changes occur in discrete jumps as observations get reordered.

Disadvantages (cont'd)
This lack of differentiability has some important consequences for using these models. First, solving for $\hat{\beta}$ is computationally intensive in multiple dimensions (no Newton-Raphson algorithm). Second, inference is also problematic, as we can't apply the usual Wald-type approach (which requires derivatives) in order to get confidence intervals. For these reasons, rank-based estimation in the AFT model is not widely used in practice.

Semiparametric PH modeling
It turns out, however, that the proportional hazards model is much more amenable to semiparametric modeling. Recall the PH model:
$\lambda_i(t) = \lambda_0(t) \exp(x_i^T \beta)$.
The parametric part of the model is the $\exp(x_i^T \beta)$ portion, while the nonparametric aspect is the lack of any assumptions concerning the baseline hazard $\lambda_0(t)$. In this approach, we will assume that our model does not contain an intercept (if we introduced an intercept, we would have to introduce some restrictions on $\lambda_0$ in order to maintain identifiability).

Special case: Two observations
To understand how estimation works in this model, let's start by considering the special case of two observations, $T_1$ and $T_2$, with hazard functions $\lambda_1(t)$ and $\lambda_2(t)$. Suppose that the first failure occurs at time $t$; how likely is it that the subject who failed was subject 1?
Proposition: $P(T_1 = t \mid T_{(1)} = t) = \dfrac{\lambda_1(t)}{\lambda_1(t) + \lambda_2(t)}$
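A heuristic sketch of why the proposition holds is given below. It assumes that $T_1$ and $T_2$ are independent and continuous (assumptions not stated on the slide) and writes $S_i(t)$ for the survival function of $T_i$, a symbol introduced here only for this argument.

```latex
\begin{align*}
P\{T_1 \in [t, t + dt),\ T_2 > t\}
  &\approx \lambda_1(t)\, S_1(t)\, S_2(t)\, dt, \\
P\{T_{(1)} \in [t, t + dt)\}
  &\approx \{\lambda_1(t) + \lambda_2(t)\}\, S_1(t)\, S_2(t)\, dt, \\
P(T_1 = t \mid T_{(1)} = t)
  &= \frac{\lambda_1(t)\, S_1(t)\, S_2(t)}
          {\{\lambda_1(t) + \lambda_2(t)\}\, S_1(t)\, S_2(t)}
   = \frac{\lambda_1(t)}{\lambda_1(t) + \lambda_2(t)}.
\end{align*}
```

The survival terms are common to the numerator and the denominator, so only the two hazards evaluated at $t$ remain.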

Elimination of the baseline hazard
Under the proportional hazards assumption, therefore, we have
$P(T_1 < T_2) = \dfrac{\exp(x_1^T \beta)}{\exp(x_1^T \beta) + \exp(x_2^T \beta)}$.
The remarkable result here is that the baseline hazard, $\lambda_0(t)$, has canceled out of the expression. Extending this logic to the somewhat more general case in which we have multiple subjects, but no censoring, we have
$P(T_1 < T_2 < \cdots < T_J) = \prod_{j=1}^{J} \dfrac{\exp(x_j^T \beta)}{\sum_{k=j}^{J} \exp(x_k^T \beta)}$,
where $x_j$ denotes the vector of covariates for the subject with the $j$th failure time.
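As a quick numerical illustration of the product formula, the sketch below evaluates the probability of a given failure ordering and checks that the probabilities of all possible orderings add up to one. The function name `rank_likelihood`, the covariate values, and the coefficient are all made up for this example; NumPy is assumed to be available.

```python
import itertools
import numpy as np

def rank_likelihood(X, beta, order):
    """P(T_{order[0]} < T_{order[1]} < ...) under PH with no censoring.

    X     : (n, p) covariate matrix
    beta  : (p,) coefficient vector
    order : subject indices sorted from first failure to last
    """
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    w = np.exp(eta[list(order)])           # exp(x_j' beta) in failure order
    denom = np.cumsum(w[::-1])[::-1]       # sum over subjects k = j, ..., J
    return float(np.prod(w / denom))

# Hypothetical covariates for three subjects and a hypothetical beta.
X = np.array([[0.5], [-1.0], [2.0]])
beta = np.array([0.7])

total = sum(rank_likelihood(X, beta, perm)
            for perm in itertools.permutations(range(3)))
print(total)

# Two-subject special case: matches exp(a) / (exp(a) + exp(b)).
p12 = rank_likelihood(X[:2], beta, (0, 1))
print(p12)
```

Note that no baseline hazard appears anywhere in the computation; only the linear predictors and the ordering of the failures are needed.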

Rank-based likelihood
We therefore have an expression for the likelihood of $\beta$ given the ranks of the failure times, and have found that this likelihood is free of $\lambda_0(t)$. One would expect the ranks to contain most of the information about $\beta$ and that, say, the duration of time between $t_j$ and $t_{j+1}$ probably wouldn't add a great deal of information to our knowledge of $\beta$. At any rate, drawing further conclusions about $\beta$ based on the gaps between failure times is going to be highly dependent on making distributional assumptions concerning $\lambda_0$. Focusing on the ranks, therefore, is likely to be both efficient and robust.

Censoring
Extending these results to account for censoring is mathematically straightforward, but perhaps a bit challenging philosophically. The probability that subject $j$ fails at time $t$ given that one of the subjects from the risk set $R(t)$ failed at time $t$ is
$\dfrac{\exp(x_j^T \beta)}{\sum_{k \in R(t)} \exp(x_k^T \beta)}$.
Thus, the full likelihood is
$L(\beta) = \prod_j \dfrac{\exp(x_j^T \beta)}{\sum_{k \in R(t_j)} \exp(x_k^T \beta)}$,
where the product is taken over the observed failure times.
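The logarithm of this product is straightforward to evaluate directly. The following is a minimal sketch, assuming no tied failure times and using NumPy; the function name `cox_partial_loglik` and the toy data are hypothetical.

```python
import numpy as np

def cox_partial_loglik(beta, time, event, X):
    """Log of L(beta): a sum over observed failures of
    x_j' beta - log(sum of exp(x_k' beta) over the risk set R(t_j)).
    Assumes no tied failure times (an assumption of this sketch).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    loglik = 0.0
    for j in np.flatnonzero(event):        # censored subjects add no term
        at_risk = time >= time[j]          # R(t_j)
        loglik += eta[j] - np.log(np.exp(eta[at_risk]).sum())
    return loglik

# Hypothetical toy data: five subjects, the third one censored.
time = [2, 3, 5, 7, 11]
event = [1, 1, 0, 1, 1]
X = np.array([[1.0], [0.0], [1.0], [0.0], [1.0]])

ll0 = cox_partial_loglik([0.0], time, event, X)
ll1 = cox_partial_loglik([0.5], time, event, X)
print(ll0, ll1)
```

At $\beta = 0$ every subject in a risk set is equally likely to be the one who fails, so the value reduces to minus the sum of the log risk-set sizes. Adding a constant column to `X` leaves the result unchanged, which is the identifiability issue with an intercept noted earlier.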

A likelihood?
But is this a likelihood? Not exactly: it does not specify the probability of observing $t$ and $d$ given $X$ and $\beta$. It doesn't even specify the probability of observing a given ranking of failure times, like the likelihood in the uncensored case. Thus, it is not entirely appropriate to call this product of conditional probability statements a likelihood, in that it does not specify the probability of observing a given set of data.

Partial likelihood
To get around this difficulty, Sir David Cox proposed the name partial likelihood in 1975 for the expression on slide 20 (the Censoring slide), providing a general definition for what constitutes a partial likelihood and a justification for why it can be treated like a regular likelihood. This material is summarized in Section 4.2.1 of Kalbfleisch & Prentice. The take-home message is that the partial likelihood still yields a score with mean zero and a variance given by the negative Hessian matrix; thus, we can use all the likelihood-based techniques we have developed thus far in the course to study it.
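To make the "score and negative Hessian" remark concrete, here is a rough sketch of Newton-Raphson applied to the partial likelihood. It is an illustration under stated assumptions rather than a reference implementation: there are no tied failure times, no step-halving or other safeguards are included, the function names are hypothetical, and the data are simulated (exponential baseline, two standard normal covariates, true coefficients 0.5 and -0.5).

```python
import numpy as np

def score_and_info(beta, time, event, X):
    """Score vector and negative Hessian of the log partial likelihood."""
    w = np.exp(X @ beta)
    p = X.shape[1]
    score = np.zeros(p)
    info = np.zeros((p, p))
    for j in np.flatnonzero(event):
        r = time >= time[j]                    # risk set R(t_j)
        wr = w[r] / w[r].sum()                 # failure probabilities in R(t_j)
        xbar = wr @ X[r]                       # weighted covariate mean
        score += X[j] - xbar
        xc = X[r] - xbar
        info += (xc * wr[:, None]).T @ xc      # weighted covariance
    return score, info

def fit_cox(time, event, X, n_iter=25, tol=1e-8):
    """Plain Newton-Raphson starting from beta = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        score, info = score_and_info(beta, time, event, X)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    _, info = score_and_info(beta, time, event, X)
    return beta, np.linalg.inv(info)           # estimate, approximate covariance

# Simulated data (assumption): constant baseline hazard, random censoring.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
beta_true = np.array([0.5, -0.5])
t = rng.exponential(1.0 / np.exp(X @ beta_true))
c = rng.exponential(2.0, size=n)
time = np.minimum(t, c)
event = t <= c

beta_hat, cov = fit_cox(time, event, X)
se = np.sqrt(np.diag(cov))
print(beta_hat, se)
```

Wald-type intervals then follow in the usual way as the estimate plus or minus a normal quantile times `se`, which is exactly the kind of inference that was unavailable for the rank-based AFT approach.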

Final remarks
Although both the AFT and PH frameworks can be extended to allow nonparametric specification of the underlying survival distribution, the PH framework is much more convenient in that we end up with a regular, differentiable (partial) likelihood. For that reason, regression based on the partial likelihood from earlier, originally proposed by Cox in 1972, is far more widely used than rank-based AFT regression models, and is indeed by far the most common regression modeling approach in survival data analysis. Most of the remainder of the course will be spent covering Cox regression in greater depth.