PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

Similar documents
LOGISTIC REGRESSION Joseph M. Hilbe

Regularization in Cox Frailty Models

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Survival Analysis Math 434 Fall 2011

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Modelling geoadditive survival data

Frailty Models and Copulas: Similarities and Differences

Analysing geoadditive regression data: a mixed model approach

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

Power and Sample Size Calculations with the Additive Hazards Model

( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan

Lecture 5 Models and methods for recurrent event data

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

Chapter 2 Inference on Mean Residual Life-Overview

A general mixed model approach for spatio-temporal regression data

1 Introduction. 2 Residuals in PH model

Residuals and model diagnostics

Multi-state Models: An Overview

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Logistic regression model for survival time analysis using time-varying coefficients

General Regression Model

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

Multivariate Survival Analysis

PQL Estimation Biases in Generalized Linear Mixed Models

Efficiency of Profile/Partial Likelihood in the Cox Model

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

11 Survival Analysis and Empirical Likelihood

Multistate Modeling and Applications

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

From semi- to non-parametric inference in general time scale models

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

Statistical Inference and Methods

Cox regression: Estimation

SEMIPARAMETRIC REGRESSION WITH TIME-DEPENDENT COEFFICIENTS FOR FAILURE TIME DATA ANALYSIS

Variable Selection and Model Choice in Survival Models with Time-Varying Effects

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Fahrmeir: Discrete failure time models

Continuous Time Survival in Latent Variable Models

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

University of California, Berkeley

Semiparametric Generalized Linear Models

3003 Cure. F. P. Treasure

Maximum Likelihood Estimation. only training data is available to design a classifier

On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data

Variable Selection for Generalized Additive Mixed Models by Likelihood-based Boosting

Likelihood Construction, Inference for Parametric Survival Distributions

Package JointModel. R topics documented: June 9, Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1.

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Accelerated Failure Time Models: A Review

Generalized Linear Models (GLZ)

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Estimation of the Mean Function with Panel Count Data Using Monotone Polynomial Splines

A brief note on the simulation of survival data with a desired percentage of right-censored datas

Longitudinal + Reliability = Joint Modeling

The Accelerated Failure Time Model Under Biased. Sampling

Outline of GLMs. Definitions

On the Breslow estimator

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori

Chapter 17. Failure-Time Regression Analysis. William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University

Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"

Tests of independence for censored bivariate failure time data

TGDR: An Introduction

STAT 6350 Analysis of Lifetime Data. Probability Plotting

Introduction to BGPhazard

University of California, Berkeley

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Integrated likelihoods in survival models for highlystratified

Survival Regression Models

Introduction to Reliability Theory (part 2)

Professors Lin and Ying are to be congratulated for an interesting paper on a challenging topic and for introducing survival analysis techniques to th

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Philosophy and Features of the mstate package

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

Survival Analysis. Stat 526. April 13, 2018

Semi-parametric estimation of non-stationary Pickands functions

Quantile Regression for Residual Life and Empirical Likelihood

Published online: 10 Apr 2012.

Lecture 22 Survival Analysis: An Introduction

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Survival Analysis I (CHL5209H)

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

Semiparametric Regression

Missing covariate data in matched case-control studies: Do the usual paradigms apply?

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Stat 710: Mathematical Statistics Lecture 12

PENALIZING YOUR MODELS

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Transcription:

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

Outline Introduction Survival models Types of censoring Research problem Methodology Examples Conclusion/ Discussion References

Introduction Survival models Proportional Hazards model h t = h 0 t. exp(x T β) Proposed by Cox (1972) Covariates act multiplicatively on some unknown baseline hazard rate Additive Hazards model h t = h 0 t + (X t ) T β(t) Proposed by Aalen (1989) Covariates act additively on some unknown baseline hazard rate β t is depend on time t Additive Hazards model Special case h t = h 0 t + (X(t)) T β Proposed by Lin & Ying (1994) Coefficient β is constant Lin and Ying considered this model with the baseline h 0 (t) be any nonnegative function

Introduction Types of censoring T i - failure time C i - censoring time t start t end Left censoring; t i = max(t i, C i ) - event of interest has already occured before the study starts Interval censoring; t i (L i, R i ) - exact event time is unknown, but interval bounding is known Right censoring; t i = min(t i, C i ) - study ends before the event has occured Fully observed; t i = T i - events occured within study period

Introduction Research problem Interested on parameter estimation of the additive hazard model: h i t, X i = h 0 t + X i β For this model the two non-negativity constraints as follows : Constraint 1 : h 0 t 0 Constraint 2 : h 0 t + X i β 0

Introduction Research problem Main focus: Estimating the regression coefficients, β and the baseline hazard,h 0 t alternately by considering the two constraints Aalen (1980) Used least square approach to estimate cumulative estimates Without considering constraints Ghosh (2001) & Zeng et al.(2006) Used maximum likelihood approach to estimate β and H 0 (t)/ S 0 (t) = exp(h 0 t Imposed constraints on H 0 (t) and H(t) Farrington (1996) Used GLM approach to estimate β and h 0 (t) Non-negativity of h 0 (t) cannot be guaranteed

Methodology Used Maximum Penalized Likelihood (MPL) approach To estimate h 0 (t) by considering the full likelihood To smooth baseline hazard, a penalty function, J h 0 can be added on h 0 t Then, maximum penalized likelihood objective function with respect to h 0 t and β, h 0 t, β = argmax { Φ h 0 t, β = l h 0 t, β λ. J h 0 t } here λ is the smoothing parameter

Methodology continued General form of Additive hazard model for observation i : h i t i, X i = h 0 t i + X i β Constraint 1 : h 0 t i 0 h 0 t i = m u=1 θ u ψ u (t i ) Select a basis function such that ψ u t i 0 for all u i. e need to restrict θ u 0 to enforce h 0 t i 0 Constraint 2 : h 0 t i + X i β 0 X i β h 0 (t i ) Possible to re-write above constraint in a simplified form as follows X i β θ u ; t i B u

Methodology continued Used Augmented Lagrangian (AL) method to treat constraints Let η i = X i β to transfer the part of the constraints to η i Augmented Lagrangian with respect to θ, β, η and γ, θ, β, η, γ = argmax θ,β,η,γ L α θ, β, η, γ where L α θ, β, η, γ = l θ, β λ. J θ n i=1 γ i (X T i β η i ) α 2 i=1 n (X T i β η i ) 2 These four parameters are updated alternately in each iteration.

Methodology Alternately updates Iteration (k + 1) consists of following steps: STEP 1:With η (k), β (k) and γ (k) obtained θ (k+1) ; by running one iteration of the Multiplicative Iterative (MI) algorithm (Ma et al. (2014) followed by a line search this guarantees that each updated θ value respects the non-negativity constraint θ (k+1) = argmax θ L α θ, β (k), η (k), γ (k) ; θ 0 for all u

Methodology Alternately updates STEP 2: With θ (k+1), β (k), γ (k) computed η (k+1) ; by running one iteration of the MI algorithm followed by a line search this step is standard and ensures that L α θ (k+1), β (k), η, γ (k) increases as a function of η η (k+1) = argmax η L α θ (k+1), β (k), η, γ (k) ; η i θ u ; t i B u

Methodology Alternately updates STEP 3: With θ (k+1), η (k+1) and γ (k) computed β (k+1) ; by running one iteration of the Newton algorithm use Armijo s rule to perform line search β (k+1) = argmax β L α θ (k+1), β, η (k+1), γ (k) β (k+1) = β (k) ω k. 2 L α β j β k 1 Lα β j STEP 4: With θ (k+1), η (k+1) and β (k+1) updated γ (k+1) as follows; γ (k+1) = γ (k) + α. (X T β (k+1) η (k+1) )

Examples In the examples, the main aims are to demonstrate this method of estimating β and h 0 t works well study the behavior of the results under different censoring proportions and number of events compare the results with the existing methods Indicator function is used as the basis function in h 0 (t) estimation Arbitrary selected smoothing parameter, λ (= 0. 05) value is used Simulate survival times (t) from Weibull distribution with hazard; h i t = 3t 2 + (x i1 + 0. 6x i2 0. 8x i3 )

Results Regression coefficient estimation Considered n = 100 & n = 1000 with approximate censoring proportions; π c of 20%, 50% and 80% for each value of n Compared results (MPL) with two existing estimation procedures; i. Aalen s additive hazard models using aareg function of Survival R package ii. Lin & Ying s additive hazards model using ahaz function of Ahaz R package

Results : study ONE Right censored data MSE comparison Aalen s method poorly performed with higher censoring proportions MPL method performs slightly better than Lin & Ying s method

Results : study TWO Interval censored data MSE comparison Aalen s method poorly performed with interval censored data Lin & Ying s method not stable with higher censoring proportions

Results : study TWO Interval censored data Bias & Variance comparison Bias and variance increases with the censoring proportion, but decreases with the sample size

Results - Baseline hazard estimation (Interval censored data) n = 1000, λ = 0.05 and p = 0.2 n = 1000, λ = 0.05 and p = 0.8

Conclusion MPL produces better results than Aalen s method (Right & Interval cens) MPL produces slightly better results than Lin & Ying s method (Right cens) & Lin & Ying s methods produces roughly same results as MPL for lower censoring proportions, but under performed for higher censoring proportions (Interval cens) MPL produces estimates that are less biased than other methods, leading to a substantial MSE reduction Overall, MPL provides a gain in efficiency over Aalen s and Lin & Ying s method Baseline hazard estimation performs well even with the random λ value, could be improved by selecting an optimal smoothing parameter

Future work Extend the algorithm for different basis functions, including spline function Implement a procedure to obtain the optimal smoothing parameter λ Develop asymptotic properties of the estimates of θ and β

References Aalen, O.O. (1980), A model for nonparametric regression analysis of counting processes, Lecture notes in Statistics, 2, Springer-Verlag, 1-25 Cai, T. and Betensky, R.A. (2003), Hazard regression for interval censored data with penalized spline, Biometrics, 59,570-579 J. Ma, S. Heritier and S. Lo, (2014),"On the maximum penalized likelihood approach for proportional hazard models with right censored survival data", Computational Statistics and Data Analysis, vol 74, pp 142-156, 2014. Lin, D. Y. and Ying, Z. (1994), Semiparametric analysis of the additive risk model, Biometrika, 81:61 71. Lin, D. Y. and Ying, Z. (2007), Maximum likelihood estimation in semi-parametric regression models with censored data, J. Roy. Statist Soc.B, 69:507 564. Luenberger, D. (1984). Linear and Nonlinear Programming (2nd edition). J. Wiley. McCullagh, P. and Nelder, J. A. (1989), Generalized linear models (2 nd edition). Chapman and Hall, London.

THANK YOU!!!

Literature Review Here we review three approaches on fitting the Additive hazards model Least squares approach Aalen (1980) used formal least squares principle for nonparametric additive hazard model Firstly obtained cumulative versions of the estimates using continuous data Then, the coefficients can be estimated from the slope of the cumulative estimates Leads to well-known Nelson-Aalen estimate for cumulative hazard estimation *Could not extend this model for interval censored data

Literature Review Maximum Likelihood (ML) approach Ghosh (2001) and Zeng et al.(2006) developed ML approach for additive hazards model. Ghosh fits the additive hazards model by estimating β and a cumulative baseline hazard function H 0 (. ) Used primal-dual interior point algorithm Algorithm imposes contraints of positivity & monotonic increasing on H 0 (. ) and the cumulative hazard H i (. ) Zeng et al. fit the additive hazards model with interval censored data Using the log likelihood function which is expressed in terms of S 0 (. ) and β This constraints positivity and monotonic decreasing on S 0 (. ) by using a logarithm transformation * Here the way the constraints are imposed can make the estimation procedure unstable when the baseline survival estimate approaches zero.

Literature Review Generalized Linear Model (GLM) approach Farrington (1996) fits the additive hazards model for interval censored data using a generalized linear model (GLM) approach The occurrences left, right and interval censored observations are assumed to be from indepedent Bernoulli trials Occurrence probability is related to a linear predictor by a negative log link function Then β and h 0 (. ) can be estimated by fitting the generalized linear model The baseline hazard h 0 (. ) is assumed to be piecewise constant over some intervals *Neither non-negativity nor smoothness of the h 0 (. ) can be guaranteed

Simulation results : study ONE Right censored data MSE comparison

Simulation results : study TWO Interval censored data MSE comparison

Simulation results : study TWO Interval censored data Bias & Variance comparison

Simulation results : study TWO Interval censored data Bias & Variance comparison