STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

Explanatory Variables for Failure Times

Explanatory variables usually explain or predict why some units fail quickly while others survive a long time. Examples:

- Continuous variables such as stress, temperature, voltage, and pressure.
- Discrete variables such as the number of hardening treatments or the number of simultaneous users of a system.
- Categorical variables such as manufacturer, design, and location.

A regression model relates the failure-time distribution to the explanatory variables $x = (x_1, x_2, \ldots, x_k)$:

$$\Pr(T \le t) = F(t) = F(t; x).$$

Failure-Time Regression Analysis: Parameters as Functions of Explanatory Variables

The material in this chapter extends standard regression analysis for normally distributed data with

$$\text{mean} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,$$

where the $x_i$ are explanatory variables. The ideas presented here are more general:

- The data are not necessarily from a normal distribution.
- The data may be censored.
- Nonstandard regression models may be used to relate life to the explanatory variables.

Failure-Time Regression Analysis: Scale-Accelerated Failure-Time Model

The scale-accelerated failure-time (SAFT) model is commonly used to describe the effect that explanatory variables $x$ have on time. The model has a simple time-scaling acceleration factor that is a function of $x$:

$$T(x) = \frac{T(x_0)}{AF(x)}, \qquad AF(x) > 0, \qquad AF(x_0) = 1,$$

where $T(x)$ is the time at condition $x$ and $T(x_0)$ is the corresponding time at a baseline condition $x_0$.

Scale-Accelerated Failure-Time Model

Commonly used forms for the time-scale factor $AF(x)$ are log-linear relationships, for example:

- $\log[AF(x)] = \beta_1 x$ with $x_0 = 0$ for a scalar $x$.
- $\log[AF(x)] = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$ with $x_0 = 0$ for a vector $x$.

When $AF(x) > 1$, $T(x) < T(x_0)$: time at $x$ is accelerated relative to time at $x_0$.
When $0 < AF(x) < 1$, $T(x) > T(x_0)$: time at $x$ is decelerated relative to time at $x_0$.

Scale-Accelerated Failure-Time Model

In terms of cdfs, with baseline cdf $F(t; x_0)$,

$$F(t; x) = F[AF(x)\, t;\ x_0].$$

In terms of quantiles,

$$t_p(x) = \frac{t_p(x_0)}{AF(x)}, \quad \text{i.e.} \quad \ln[t_p(x)] - \ln[t_p(x_0)] = -\ln[AF(x)].$$

SAFT models are often suggested by the physical theory of some simple failure mechanisms, but they do not hold universally.
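
The cdf and quantile relations of the SAFT model can be checked numerically. The sketch below assumes a Weibull baseline and a log-linear acceleration factor with a scalar $x$; the parameter values (`beta1`, `eta0`, `shape`) and all function names are hypothetical and chosen only for illustration.

```python
import math

def accel_factor(x, beta1):
    """Log-linear acceleration factor: log AF(x) = beta1 * x, so AF(0) = 1."""
    return math.exp(beta1 * x)

def weibull_cdf(t, eta, shape):
    """Baseline Weibull cdf with scale eta and shape parameter shape."""
    return 1.0 - math.exp(-((t / eta) ** shape))

def weibull_quantile(p, eta, shape):
    """Inverse of weibull_cdf."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / shape)

def saft_cdf(t, x, beta1, eta0, shape):
    """F(t; x) = F[AF(x) * t; x0]."""
    return weibull_cdf(accel_factor(x, beta1) * t, eta0, shape)

def saft_quantile(p, x, beta1, eta0, shape):
    """t_p(x) = t_p(x0) / AF(x)."""
    return weibull_quantile(p, eta0, shape) / accel_factor(x, beta1)

# Hypothetical values, for illustration only.
beta1, eta0, shape = 0.5, 100.0, 2.0
x = 2.0

af = accel_factor(x, beta1)                      # AF(x) > 1: time is accelerated
t50_base = weibull_quantile(0.5, eta0, shape)    # median at baseline x0 = 0
t50_x = saft_quantile(0.5, x, beta1, eta0, shape)

print("AF(x) =", af)
print("median at x0:", t50_base, " median at x:", t50_x)
print("F(t50_x; x) =", saft_cdf(t50_x, x, beta1, eta0, shape))
print("log-quantile shift:", math.log(t50_x) - math.log(t50_base))
```

With $AF(x) > 1$ the median at $x$ is smaller than the baseline median, and the shift in log-quantiles equals $-\ln[AF(x)]$ for every $p$, which is why SAFT models appear as parallel lines on a log-time probability plot.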

Location-Scale Regression Model

With one explanatory variable (the model extends to more complicated situations), the location-scale simple regression model is

$$\Pr(Y \le y) = F(y; \mu, \sigma) = F(y; \beta_0, \beta_1, \sigma) = \Phi\!\left(\frac{y - \mu}{\sigma}\right),$$

where $\mu = \beta_0 + \beta_1 x$ and $\sigma$ does not depend on the explanatory variable $x$. The quantile function

$$y_p = \mu + \Phi^{-1}(p)\,\sigma = \beta_0 + \beta_1 x + \Phi^{-1}(p)\,\sigma$$

is linear in $x$. Choosing $\Phi$ determines the shape of the distribution for a particular value of $x$:

Distribution                $\Phi(z)$
Normal                      $\int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp(-w^2/2)\, dw$
Extreme value (smallest)    $1 - \exp[-\exp(z)]$
Logistic                    $\exp(z) / [1 + \exp(z)]$
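
The quantile function can be evaluated directly once $\Phi^{-1}$ is available. The sketch below assumes the three standardized distributions listed above, taking the extreme value case to be the smallest extreme value cdf $1 - \exp[-\exp(z)]$; the coefficient values and dictionary names are hypothetical.

```python
import math
from statistics import NormalDist

# Standardized cdfs Phi(z) and their inverses Phi^{-1}(p).
STANDARD_CDF = {
    "normal": lambda z: NormalDist().cdf(z),
    "sev": lambda z: 1.0 - math.exp(-math.exp(z)),
    "logistic": lambda z: math.exp(z) / (1.0 + math.exp(z)),
}
STANDARD_QUANTILE = {
    "normal": lambda p: NormalDist().inv_cdf(p),
    "sev": lambda p: math.log(-math.log(1.0 - p)),
    "logistic": lambda p: math.log(p / (1.0 - p)),
}

def quantile(p, x, beta0, beta1, sigma, dist="normal"):
    """y_p = beta0 + beta1 * x + Phi^{-1}(p) * sigma, linear in x."""
    return beta0 + beta1 * x + STANDARD_QUANTILE[dist](p) * sigma

# Hypothetical coefficients, for illustration only.
beta0, beta1, sigma = 4.0, -0.8, 0.5

for dist in ("normal", "sev", "logistic"):
    y10_at_2 = quantile(0.10, 2.0, beta0, beta1, sigma, dist)
    y10_at_3 = quantile(0.10, 3.0, beta0, beta1, sigma, dist)
    # The difference between quantiles one unit of x apart is beta1 for every p.
    print(dist, y10_at_2, y10_at_3, y10_at_3 - y10_at_2)
```

Because $\sigma$ does not depend on $x$, the quantile lines for different $p$ are parallel with common slope $\beta_1$; only the intercept changes with $p$ and with the choice of $\Phi$.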

A simple form of the proportional hazards model is

$$h(t; x) = h_0(t)\, \psi(x, \beta),$$

in which $h_0(t) = h(t; x_0)$ is the baseline hazard and the explanatory vector $x$ does not change over time for any individual.

The proportional hazards model can also be written as

$$S(t; x) = [S(t; x_0)]^{\psi(x, \beta)},$$

or

$$F(t; x) = 1 - [1 - F(t; x_0)]^{\psi(x, \beta)},$$

or

$$\ln\{-\ln[1 - F(t; x)]\} - \ln\{-\ln[1 - F(t; x_0)]\} = \ln[\psi(x, \beta)].$$

When $\psi(x, \beta) > 1$, the model accelerates time, i.e. $F(t; x) > F(t; x_0)$ for all $t$.
When $\psi(x, \beta) < 1$, the model decelerates time, i.e. $F(t; x) < F(t; x_0)$ for all $t$.
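
The equivalence of the hazard, survival, and log-log forms follows in a few lines. The cumulative hazard $H(t; x) = \int_0^t h(u; x)\,du$ is notation introduced here only for this derivation.

```latex
\begin{align*}
H(t; x) &= \int_0^t h_0(u)\,\psi(x,\beta)\,du = \psi(x,\beta)\, H(t; x_0), \\
S(t; x) &= \exp[-H(t; x)] = \{\exp[-H(t; x_0)]\}^{\psi(x,\beta)} = [S(t; x_0)]^{\psi(x,\beta)}, \\
-\ln[1 - F(t; x)] &= \psi(x,\beta)\,\{-\ln[1 - F(t; x_0)]\}, \\
\ln\{-\ln[1 - F(t; x)]\} &= \ln\{-\ln[1 - F(t; x_0)]\} + \ln[\psi(x,\beta)].
\end{align*}
```

The last line shows that, on a complementary log-log scale, the cdfs at different $x$ differ by a vertical shift that does not depend on $t$.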

For the Weibull distribution (and only the Weibull distribution), a proportional hazards regression model is also a SAFT regression model.

Three commonly used parameterizations of $\psi$ are:

- Log-linear form: $\psi(x; \beta) = \exp(\beta^T x)$
- Linear form: $\psi(x; \beta) = 1 + \beta^T x$
- Logistic form: $\psi(x; \beta) = \log(1 + e^{\beta^T x})$

These forms can be discriminated by fitting an augmented family. For example,

$$\psi(x; \beta, \kappa) = (1 + \kappa\, \beta^T x)^{1/\kappa}$$

includes the linear and log-linear models as special cases, $\kappa = 1$ and $\kappa \to 0$, respectively.
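
Both claims on this slide can be illustrated numerically. For a Weibull baseline with scale $\eta$ and shape $k$ (notation assumed here), $S(t; x) = \exp[-\psi\,(t/\eta)^k] = \exp[-(\psi^{1/k}\, t/\eta)^k]$, so the proportional hazards model is a SAFT model with $AF(x) = \psi^{1/k}$. The sketch below checks this and the two special cases of the augmented family; all names and values are hypothetical.

```python
import math

def psi_augmented(lin_pred, kappa):
    """(1 + kappa * b'x)^(1/kappa); kappa == 0 is treated as the limit exp(b'x)."""
    if kappa == 0.0:
        return math.exp(lin_pred)
    return (1.0 + kappa * lin_pred) ** (1.0 / kappa)

def weibull_surv(t, scale, shape):
    """Baseline Weibull survival function."""
    return math.exp(-((t / scale) ** shape))

def ph_surv(t, psi, scale, shape):
    """Proportional hazards: S(t; x) = S(t; x0) ** psi."""
    return weibull_surv(t, scale, shape) ** psi

def saft_surv(t, af, scale, shape):
    """SAFT: S(t; x) = S(AF * t; x0)."""
    return weibull_surv(af * t, scale, shape)

# Hypothetical values, for illustration only.
scale, shape, psi = 50.0, 1.7, 2.5
af = psi ** (1.0 / shape)            # acceleration factor implied by psi

for t in (5.0, 20.0, 60.0):
    print(t, ph_surv(t, psi, scale, shape), saft_surv(t, af, scale, shape))

lin_pred = 0.3                        # a value of beta' x
print("kappa = 1   :", psi_augmented(lin_pred, 1.0), "vs", 1.0 + lin_pred)
print("kappa -> 0  :", psi_augmented(lin_pred, 1e-6), "vs", math.exp(lin_pred))
```

The two survival columns agree at every $t$, which is the Weibull-specific coincidence of the two model classes; for other baselines no single time-scaling factor reproduces $[S(t; x_0)]^{\psi}$.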

First, assume that the survival times have continuous distributions and are recorded exactly (no ties). Let $t_1 < t_2 < \cdots < t_n$ denote the failure times of the $n$ units. We consider inference about $\beta$ when the baseline hazard function $h_0(t)$ is completely unknown.

The Cox proportional hazards model:

- is a semiparametric model;
- makes no assumptions about the form of $h_0(t)$ (the nonparametric part of the model);
- assumes a parametric form for the effect of the predictors on the hazard.

The risk set at time $t_i$ is denoted by $R(t_i) = \{j : t_j \ge t_i\}$.

In the absence of knowledge of $h_0(t)$, the $t_i$ (the actual failure times) can provide little or no information about $\beta$, because their distribution depends heavily on $h_0(t)$.

The Cox partial likelihood estimation method for $\beta$ corresponds to the method of maximum likelihood in which only the ranks of the failure times (and the positions of the censored observations among those ranks) are considered.

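Partial likelihood: the conditional probability that unit $i$ fails at $t_i$, given that one individual from the risk set $R(t_i)$ fails at $t_i$, is simply

$$\frac{h(t_i; x_i)}{\sum_{j \in R(t_i)} h(t_i; x_j)} = \frac{\psi(x_i, \beta)}{\sum_{j \in R(t_i)} \psi(x_j, \beta)},$$

because the baseline hazard $h_0(t_i)$ cancels. The overall partial likelihood is

$$L(\beta) = \prod_{i=1}^{n} \left[ \frac{\psi(x_i, \beta)}{\sum_{j \in R(t_i)} \psi(x_j, \beta)} \right]^{\delta_i},$$

where $\delta_i$ is the censoring indicator ($\delta_i = 1$ if $t_i$ is an observed failure time and $\delta_i = 0$ if it is censored).

Let $\hat\beta$ denote the maximum (partial) likelihood estimate of $\beta$, obtained by maximizing the partial log-likelihood function $\ln L(\beta)$.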
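
A small worked example may make the construction concrete. The sketch below uses a hypothetical data set of four units (one censored) with $\psi(x, \beta) = \exp(\beta x)$ and $\beta = \ln 2$; the function names are illustrative only.

```python
import math

def risk_set(times, i):
    """Indices j with t_j >= t_i: units still under observation at t_i."""
    return [j for j, tj in enumerate(times) if tj >= times[i]]

def partial_likelihood(times, delta, psi_values):
    """Product over observed failures of psi_i / (sum of psi over the risk set)."""
    value = 1.0
    for i, d in enumerate(delta):
        if d == 1:  # censored units contribute only through risk sets
            denom = sum(psi_values[j] for j in risk_set(times, i))
            value *= psi_values[i] / denom
    return value

# Hypothetical data: the third unit is censored at time 5.
times = [2.0, 3.0, 5.0, 7.0]
delta = [1, 1, 0, 1]
x = [1.0, 0.0, 1.0, 0.0]

beta = math.log(2.0)
psi_values = [math.exp(beta * xi) for xi in x]   # approximately [2, 1, 2, 1]

# Terms: (2/6) at t=2, (1/4) at t=3, (1/1) at t=7, so L = 1/12.
L = partial_likelihood(times, delta, psi_values)
print("L(beta) =", L)

# Any increasing transformation of the times leaves the risk sets, and hence L, unchanged.
L_cubed = partial_likelihood([t ** 3 for t in times], delta, psi_values)
print("L(beta) with cubed times =", L_cubed)
```

The second call illustrates the rank property: only the ordering of the failure and censoring times enters the partial likelihood, not the times themselves.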

If $\psi(x_i, \beta) = \exp(\beta^T x_i)$, we have

$$\ln L(\beta) = \sum_{i=1}^{n} \delta_i\, \beta^T x_i - \sum_{i=1}^{n} \delta_i \ln\!\left[ \sum_{j \in R(t_i)} \exp(\beta^T x_j) \right].$$

The first derivative of $\ln L(\beta)$ with respect to $\beta$ is called the vector of efficient scores and is given by

$$U(\beta) = \frac{d \ln L(\beta)}{d\beta} = \delta^T X - \sum_{i=1}^{n} \delta_i\, \frac{\sum_{j \in R(t_i)} \exp(\beta^T x_j)\, X_{(j,\cdot)}}{\sum_{j \in R(t_i)} \exp(\beta^T x_j)},$$

where $\delta = (\delta_1, \ldots, \delta_n)^T$ denotes the vector of censoring indicators, and $X$ is the $n \times p$ matrix of covariate values, with the $j$-th row containing the covariate values of the $j$-th individual, $X_{(j,\cdot)} = x_j^T$.
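
The log partial likelihood and the efficient score vector translate almost line by line into code. The following is a minimal sketch in plain Python for the log-linear form, assuming no tied failure times; the data and function names are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def log_partial_likelihood(beta, times, delta, X):
    """ln L(beta) = sum_i delta_i * (beta'x_i - ln sum_{j in R(t_i)} exp(beta'x_j))."""
    n = len(times)
    ll = 0.0
    for i in range(n):
        if delta[i] == 1:
            denom = sum(math.exp(dot(beta, X[j])) for j in range(n) if times[j] >= times[i])
            ll += dot(beta, X[i]) - math.log(denom)
    return ll

def score(beta, times, delta, X):
    """U(beta): for each failure, x_i minus the exp(beta'x)-weighted mean of x over R(t_i)."""
    n, p = len(times), len(beta)
    U = [0.0] * p
    for i in range(n):
        if delta[i] != 1:
            continue
        risk = [j for j in range(n) if times[j] >= times[i]]
        w = [math.exp(dot(beta, X[j])) for j in risk]
        total = sum(w)
        for k in range(p):
            weighted_mean = sum(wj * X[j][k] for wj, j in zip(w, risk)) / total
            U[k] += X[i][k] - weighted_mean
    return U

# Hypothetical data: five units, two covariates, the third unit censored.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
delta = [1, 1, 0, 1, 1]
X = [[1.0, 0.5], [0.0, 1.5], [1.0, 1.0], [0.0, 0.2], [1.0, 2.0]]
beta = [0.3, -0.2]

print("ln L(beta) =", log_partial_likelihood(beta, times, delta, X))
U = score(beta, times, delta, X)
print("U(beta)    =", U)
```

Written this way, each failure contributes "observed covariate minus risk-set weighted average" to the score, which is the same quantity as the matrix expression above.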

The information matrix $I(\beta)$ is the negative of the second derivative of $\ln L(\beta)$. Let $1_{R(t_i)}$ denote the indicator vector of the risk set $R(t_i)$; that is, the $j$-th element of $1_{R(t_i)}$ is 1 when $t_j \ge t_i$ and 0 otherwise. Then the information matrix takes the form

$$I(\beta) = -\frac{d^2 \ln L(\beta)}{d\beta^T\, d\beta} = \sum_{i=1}^{n} \frac{\delta_i}{[w_i(\beta)]^2}\, X_{(i)}^T \left\{ w_i(\beta)\, \mathrm{diag}(e^{X\beta}) - [e^{X\beta}][e^{X\beta}]^T \right\} X_{(i)},$$

where $w_i(\beta) = 1_{R(t_i)}^T \exp(X\beta)$ are scalars; for a vector $v$, $\mathrm{diag}(v)$ is the diagonal matrix with main diagonal $v$, and $\exp(v)$ is defined elementwise; and $X_{(i)} = \mathrm{diag}(1_{R(t_i)})\, X$.

The inverse of the information matrix, $I^{-1}(\hat\beta)$, is a consistent estimate of the variance-covariance matrix of $\hat\beta$.
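
Expanding the matrix expression shows that each failure contributes the covariance matrix of $x$ over the risk set, weighted by $\exp(\beta^T x_j)/w_i(\beta)$. The sketch below computes $I(\beta)$ in that equivalent form, in plain Python and assuming no ties; the data and names are hypothetical.

```python
import math

def information(beta, times, delta, X):
    """I(beta): sum over failures of the risk-set weighted covariance matrix of x."""
    n, p = len(times), len(beta)
    info = [[0.0] * p for _ in range(p)]
    for i in range(n):
        if delta[i] != 1:
            continue
        risk = [j for j in range(n) if times[j] >= times[i]]
        w = [math.exp(sum(coef * xk for coef, xk in zip(beta, X[j]))) for j in risk]
        w_i = sum(w)  # the scalar w_i(beta) = 1' exp(X beta) restricted to the risk set
        mean = [sum(wj * X[j][a] for wj, j in zip(w, risk)) / w_i for a in range(p)]
        for a in range(p):
            for b in range(p):
                second = sum(wj * X[j][a] * X[j][b] for wj, j in zip(w, risk)) / w_i
                info[a][b] += second - mean[a] * mean[b]
    return info

# Hypothetical data: five units, two covariates, the third unit censored.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
delta = [1, 1, 0, 1, 1]
X = [[1.0, 0.5], [0.0, 1.5], [1.0, 1.0], [0.0, 0.2], [1.0, 2.0]]
beta = [0.3, -0.2]

info = information(beta, times, delta, X)
for row in info:
    print(row)
```

Because every term is a covariance matrix, $I(\beta)$ is symmetric and positive semidefinite, so the partial log-likelihood is concave in $\beta$; this is what makes Newton-Raphson iteration a natural way to compute $\hat\beta$.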

Test hypotheses about the regression parameters $\beta$, for example $H_0: \beta = \beta_0$.

- Wald test: $\chi^2_W = (\hat\beta - \beta_0)^T I(\hat\beta)\, (\hat\beta - \beta_0)$
- Likelihood ratio test: $\chi^2_{LR} = 2[\ln L(\hat\beta) - \ln L(\beta_0)]$
- Score test: $\chi^2_S = U(\beta_0)\, I^{-1}(\beta_0)\, U(\beta_0)^T$, with $U(\beta_0)$ the $1 \times p$ row vector of efficient scores

Under $H_0$, each statistic has a chi-squared distribution with $p$ degrees of freedom for large samples.
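
The three statistics can be compared on a small example. The sketch below assumes a single covariate ($p = 1$), the log-linear form, no ties, and $H_0: \beta = 0$; it fits $\hat\beta$ by Newton-Raphson, which is one common choice and not necessarily the method used in the course software. The data set is hypothetical, and the p-value uses the identity $\Pr(\chi^2_1 > c) = \mathrm{erfc}(\sqrt{c/2})$, valid for 1 degree of freedom only.

```python
import math

def cox_terms(beta, times, delta, x):
    """Return (ln L, U, I) at beta for a scalar covariate and no tied failure times."""
    ll = U = info = 0.0
    n = len(times)
    for i in range(n):
        if delta[i] != 1:
            continue
        risk = [j for j in range(n) if times[j] >= times[i]]
        w = [math.exp(beta * x[j]) for j in risk]
        total = sum(w)
        m1 = sum(wj * x[j] for wj, j in zip(w, risk)) / total       # weighted mean of x
        m2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk)) / total  # weighted mean of x^2
        ll += beta * x[i] - math.log(total)
        U += x[i] - m1
        info += m2 - m1 ** 2
    return ll, U, info

def fit(times, delta, x, beta=0.0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration beta <- beta + U / I on the partial log-likelihood."""
    for _ in range(max_iter):
        _, U, info = cox_terms(beta, times, delta, x)
        step = U / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def p_value(stat):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical data: eight units, two censored, binary covariate.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
delta = [1, 1, 1, 0, 1, 1, 0, 1]
x = [1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
beta_null = 0.0

beta_hat = fit(times, delta, x)
ll_hat, _, info_hat = cox_terms(beta_hat, times, delta, x)
ll_0, U_0, info_0 = cox_terms(beta_null, times, delta, x)

wald = (beta_hat - beta_null) ** 2 * info_hat
lr = 2.0 * (ll_hat - ll_0)
score_stat = U_0 ** 2 / info_0

print("beta_hat =", beta_hat, " se =", 1.0 / math.sqrt(info_hat))
for name, stat in (("Wald", wald), ("LR", lr), ("Score", score_stat)):
    print(name, stat, p_value(stat))
```

The score test needs only quantities evaluated at $\beta_0$, so it requires no fitting; the Wald and likelihood ratio tests both need $\hat\beta$. The three statistics are asymptotically equivalent but generally differ in small samples such as this one.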

Partial Likelihood for Discrete Failure Times: Ties

Let

- $t_1 < t_2 < \cdots < t_D$ be the $D$ distinct, ordered failure times;
- $d_i$ be the number of deaths at $t_i$;
- $D_i$ be the set of all individuals who die at time $t_i$;
- $r_i$ be the number of individuals in $R(t_i)$.

In the absence of knowledge of the true order (the usual case in practice), we have to consider all possible orders of the $d_i$ observed tied survival times. For each $t_i$, the $d_i$ tied survival times can be ordered in $d_i!$ different ways.

For each of these possible orders we obtain a product, as in the continuous case, over the corresponding $d_i$ survival times. For large $r_i$, constructing and computing the exact partial likelihood function is a very tedious task. To approximate the exact partial likelihood function, the following two methods can be used when each $d_i$ is small compared to $r_i$.

Breslow (1974, International Statistical Review):

$$L(\beta) = \prod_{i=1}^{D} \frac{\prod_{j \in D_i} \psi(x_j; \beta)}{\left[ \sum_{j \in R(t_i)} \psi(x_j; \beta) \right]^{d_i}}$$

Efron (1977, JASA):

$$L(\beta) = \prod_{i=1}^{D} \frac{\prod_{j \in D_i} \psi(x_j; \beta)}{\prod_{j=1}^{d_i} \left\{ \sum_{k \in R(t_i)} \psi(x_k; \beta) - \frac{j-1}{d_i} \sum_{k \in D_i} \psi(x_k; \beta) \right\}}$$
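
The two approximations differ only in the denominator, which a short computation makes visible. The sketch below evaluates both log partial likelihoods for a scalar covariate with $\psi(x; \beta) = \exp(\beta x)$; the data set, with ties at times 2 and 5, and the function name are hypothetical.

```python
import math

def tied_log_partial_likelihoods(beta, times, delta, x):
    """Return (Breslow, Efron) log partial likelihoods for a scalar covariate."""
    n = len(times)
    psi = [math.exp(beta * xi) for xi in x]
    failure_times = sorted({t for t, d in zip(times, delta) if d == 1})
    breslow = efron = 0.0
    for t in failure_times:
        deaths = [j for j in range(n) if times[j] == t and delta[j] == 1]   # D_i
        risk_sum = sum(psi[k] for k in range(n) if times[k] >= t)           # sum over R(t_i)
        death_sum = sum(psi[k] for k in deaths)                             # sum over D_i
        d = len(deaths)
        numer = sum(math.log(psi[j]) for j in deaths)
        # Breslow: the full risk-set sum is used for each of the d tied deaths.
        breslow += numer - d * math.log(risk_sum)
        # Efron: the j-th tied death removes a fraction (j - 1)/d of the deaths' total weight.
        efron += numer - sum(math.log(risk_sum - (j - 1) / d * death_sum)
                             for j in range(1, d + 1))
    return breslow, efron

# Hypothetical data with tied failure times at t = 2 and t = 5.
times = [2.0, 2.0, 3.0, 5.0, 5.0, 5.0, 7.0]
delta = [1, 1, 0, 1, 1, 0, 1]
x = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]

for beta in (0.0, 0.5):
    b, e = tied_log_partial_likelihoods(beta, times, delta, x)
    print("beta =", beta, " Breslow:", b, " Efron:", e)
```

At $\beta = 0$ every $\psi$ equals 1, so the Breslow value is $-\ln(7^2 \cdot 4^2) = -\ln 784$ and the Efron value is $-\ln(7 \cdot 6 \cdot 4 \cdot 3) = -\ln 504$. When there are no ties ($d_i = 1$ for all $i$) the two expressions coincide with the partial likelihood of the continuous case.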