Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Similar documents
Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

Efficiency of Profile/Partial Likelihood in the Cox Model

Survival Analysis Math 434 Fall 2011

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 22 Survival Analysis: An Introduction

Lecture 5 Models and methods for recurrent event data

11 Survival Analysis and Empirical Likelihood

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

Multivariate Survival Data With Censoring.

AFT Models and Empirical Likelihood

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

1 Glivenko-Cantelli type theorems

On the Breslow estimator

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

University of California, Berkeley

Tests of independence for censored bivariate failure time data

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

STAT Sample Problem: General Asymptotic Results

Empirical Likelihood in Survival Analysis

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

Statistical Analysis of Competing Risks With Missing Causes of Failure

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Lecture 3. Truncation, length-bias and prevalence sampling

An augmented inverse probability weighted survival function estimator

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Key Words: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Survival Analysis for Case-Cohort Studies

Cox s proportional hazards model and Cox s partial likelihood

and Comparison with NPMLE

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

1 Local Asymptotic Normality of Ranks and Covariates in Transformation Models

Statistical Inference and Methods

Bayesian estimation of the discrepancy with misspecified parametric models

ST745: Survival Analysis: Nonparametric methods

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

New mixture models and algorithms in the mixtools package

Estimation for two-phase designs: semiparametric models and Z theorems

Analysis of transformation models with censored data

Modeling and Analysis of Recurrent Event Data

Variable Selection and Model Choice in Survival Models with Time-Varying Effects

Multistate Modeling and Applications

Additive Isotonic Regression

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

9 Estimating the Underlying Survival Distribution for a

Power and Sample Size Calculations with the Additive Hazards Model

Harvard University. Harvard University Biostatistics Working Paper Series

Estimation of the Bivariate and Marginal Distributions with Censored Data

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

On Estimation of Partially Linear Transformation. Models

STAT331. Cox s Proportional Hazards Model

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

ST745: Survival Analysis: Cox-PH!

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

Survival Analysis. STAT 526 Professor Olga Vitek

Quantile Regression for Residual Life and Empirical Likelihood

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling

β j = coefficient of x j in the model; β = ( β1, β2,

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =

Full likelihood inferences in the Cox model: an empirical likelihood approach

From semi- to non-parametric inference in general time scale models

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency

Multivariate Survival Analysis

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Efficient Estimation of Censored Linear Regression Model

Department of Statistics, School of Mathematical Sciences, Ferdowsi University of Mashhad, Iran.

Frailty Models and Copulas: Similarities and Differences

Inconsistency of the MLE for the joint distribution. of interval censored survival times. and continuous marks

Likelihood Construction, Inference for Parametric Survival Distributions

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Stat Seminar 10/26/06. General position results on uniqueness of nonrandomized Group-sequential decision procedures in Clinical Trials Eric Slud, UMCP

Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data

Longitudinal + Reliability = Joint Modeling

Goodness-of-fit test for the Cox Proportional Hazard Model

Maximum likelihood estimation for Cox s regression model under nested case-control sampling

A Very Brief Summary of Statistical Inference, and Examples

Empirical Processes & Survival Analysis. The Functional Delta Method

Session 9: Introduction to Sieve Analysis of Pathogen Sequences, for Assessing How VE Depends on Pathogen Genomics Part I

Analysing geoadditive regression data: a mixed model approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Attributable Risk Function in the Proportional Hazards Model

Regularization in Cox Frailty Models

Survival Analysis I (CHL5209H)

Asymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri

Tied survival times; estimation of survival probabilities

Models for Multivariate Panel Count Data

Median regression using inverse censoring weights

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

Modelling and Analysis of Recurrent Event Data

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.

Effects of a Misattributed Cause of Death on Cancer Mortality

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. *

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques

Transcription:

NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia Vonta, Univ. of Cyprus. The talk is based on joint papers: (i) about NPMLE Scan. Jour. Stat. to appear 2003. (ii) about Modified Profile Likelihood, JSPI 2004? TALK OUTLINE I. MOTIVATION Transformation Model Problems II. APPROACH Finite-Dimensional Version III. BACKGROUND LITERATURE Profile Likelihood, Frailty & Accelerated Failure Models IV. FRAILTY CASE (info bounds) V. ACCELERATED-FAILURE CASE (more details) 1

Transformation Survival Models Variables: T i Survival times, Discrete Covariates Z i C i Censoring Times, cond surv fcn R z (c) given Z i = z DATA: iid triples (min(t i, C i ), I [ T i C i ], Z i ) Observable processes: N i z(t) = I [Zi =z, T i min(c i,t)], Y i z (t) = I [min(ti, C i ) t] TRANSF. MODEL: S T Z (t z) = exp( G(e β z Λ(t))) G known, β R m, Λ cumulative-hazard fcn PROBLEM: efficient estimation of β. Special Cases: (1) Cox 1972: G(x) x (2) Frailty: unobserved random intercept β 0 = ξ i, G x = G(x) = log 0 e sx df (s) (3) Clayton-Cuzick 1986: G(x) 1 b log(1 + bx) 2

Cox-Model Case S T Z (t z) = exp( e β z Λ(t)),, G(x) = x h T Z (t z) = e β z λ(t) is the Proportional or Multiplicative Hazards model. Frailty If β Z covariate has added to it an unobservable random-effect intercept log ξ called frailty, P (T > t Z = z) = E ξ ( exp( ξ e β z Λ(t)) exp( G(e β z Λ(t))) In famous Clayton-Cuzick (1986) frailty model example, take ξ Gamma(b 1, b 1 ), leading to S T Z (t z) = (1+be β z Λ(t)) 1/b, h T Z (t z) = e β z λ(t) 1 + be β z Λ(t) Why are these Transformation Models? (Recall cum. haz. at failure is Expon(1) variate V.) G( e β Z Λ(T ) ) = V or for G known, but β, Λ unknown, log Λ(T ) = log G 1 (V ) β Z 3

Accelerated-Failure Models These are also Transformation Models. Covariates now have an additive effect on transformed time-variable: T 0 is neutral individual s failure-time g a (known) measurement scale survival fcn of g(t 0 ) unknown and equal to K(e t ). Now suppose g(t ) = β Z + g(t 0 ) Then S T Z (t z) = P (g(t ) > g(t) Z = z) = P (g(t ) > g(t) + β z) = K(e β z+g(t) ) has transformation-model form, for K unknown, g known (often equal to log). Note: this is the same as the transformation model log Λ(T ) = log G 1 (V ) β Z for frailty if Λ were known and G unknown! 4

Finite-dimensional Case X i, i = 1,..., n iid f(x, β, λ), β R m, λ R d unknown, with true values (β 0, λ 0 ) loglik(β, λ) = n log f(x i, β, λ) Profile Likelihood = loglik(β, ˆλ β ) with restricted MLE ˆλβ = arg max λ loglik(β, λ) Min Kullback-Leibler Modified Profile Approach (Severini and Wong 1992) K(β, λ) E β0,λ 0 ( log f(x 1, β, λ)) = {log f(x, β, λ)} f(x, β 0, λ 0 ) dx Define: λ β = arg max λ K(β, λ) Then: λβ estimates curve λ β Candidate Estimator β arg max β loglik(β, λ β ) 5

Information Matrix : I(β, λ) = = 2 A β,λ Bβ,λ β log f(x, β, λ) 2 βλ log f(x, β, λ) 2 λβ log f(x, β, λ) 2 λ log f(x, β, λ) B β,λ C β,λ f 0 (x)dx Note that by definition of K and implicit (total) differentiation T β [ λ K(β, λ β ))] = B + C β λ β = 0 The usual Information about β for this model, defined (as in the Cramer-Rao Inequality) as inverse of the minimum variance matrix for unbiased estimators of β, is A β0,λ 0 B β 0,λ 0 C 1 β 0,λ 0 B β0,λ 0 Equivalently, to test β = β 0, denoting restricted MLE ˆλ r as maximizer of loglik(β 0, λ), efficient test-statistic is 1 [ β loglik(β 0, ˆλ r ) B n β 0,ˆλ r (C β 0,ˆλ r ) 1 λ loglik(β 0, ˆλ r ) ] Neyman (1959) indicated that the same efficiency for test-statistic can be obtained much more generally, with ˆλ r replaced by preliminary estimator consistent for λ 0 at rate o P (n 1/4 ). 6

Key mathematical features of the modified profile approach via β, λ β are: the technical convenience of restricting attention to nuisance parameters such as hazards or density functions which satisfy smoothness restrictions; replacement of operator-inversion within (blocks of) the generalized information operator by differentiation of the Kullback-Leibler minimizer, since β λ β = C 1 B and the semiparametric Information about β is J (β 0, λ 0 ) = A β0,λ 0 ( β λ β0 ) C β0,λ 0 ( β λ β0 ) and there is no need for high-order consistency of estimation of λ, when consistent estimators of K-L minimizers λ β and their derivatives with respect to structural parameters are available. Whether dimension of nuisance parameter is finite or infinite, under regularity conditions: n ( β β0 ) D N (0, (J (β 0, λ 0 )) 1 ) 7

Semiparametric Case (λ -dim) Kullback-Leibler Functional K(β, λ) = ( log f(x, β, λ)) f(x, β 0, λ 0 ) dx Define λ β = arg max λ K(β, λ) to satisfy: λ K(β, λ β ) = 0 Under minimal regularity conditions: (β, λ β ) is a least-favorable nuisance-parameterization Substitute preliminary estimators β0, λ 0 (usually involves density-estimator for λ 0 ), into λ β formula to get estimator λ β. Then maximize over β within loglik(β, λ β ) = n log f X (X i, β, λ β ) 8

LITERATURE Profile Likelihood Cox, D. R. & Reid, N. (1987) JRSSB McCullagh, P. & Tibshirani, R. (1990) JRSSB Severini, T. and Wong, W. (1992) Ann. Stat. Semiparametrics Bickel, Klaassen, Ritov, & Wellner: 1993 Book Cox, D. R. (1972) Cox-Model paper JRSSB Murphy & van der Vaart (2000) JASA Transformation/Frailty Models Cheng, Wei, & Ying (1995) Biometrika Clayton & Cuzick (1986) ISI Centenary Session Slud, E. & Vonta, I. (2003) Scand. Jour. Stat Parner (1998) Ann. Stat. Accelerated Failure Time Models Koul, Susarla & van Ryzin (1981) Ann. Stat. Tsiatis (1990) Ann. Stat. Ritov (1990) Ann. Stat. 9

-dim Examples (I). Cox model. For q z (t) = p Z (z)r z (t) exp( e z β 0 Λ 0 (t)), can solve uniquely for: Λ β (t) t 0 λ β(s) ds = z q z (x) e z β 0 λ 0 (x) z e z β q z (x) Let λ 0 be a consistent density estimate of λ 0 (x) (eg by smoothing and differentiating the Nelson-Aalen estimator on data in a z = 0 data-stratum.) Estimate q z (t) by kernel-smoothing the at-risk process Y z (t)/n, λ β (t) = n e Z i β 0 A( t T i ) b λ β0 (T i ) / n n e Z i β 0 A( t T i b n ) where A is a smooth cdf (kernel) and b n a bandwidth parameter decreasing slowly to 0 as n. NB. In this example, any β estimator produced in this way collapses to the usual Cox Max Partial Likelihood Estimator! 10

(II). Transformation/Frailty Models In the general G transformation model case, must assume for some finite time τ 0 with Λ 0 (τ 0 ) < that all data are censored at τ 0. In this model, Slud & Vonta (2003) characterize the K-optimizing hazard intensity λ β in its integrated form L = Λ β = 0 λ β (x) dx, through the second order ODE system: dl dλ 0 (s) = z e z β 0 q z (s) G (e z β 0 Λ 0 (s)) z e z β q z (s) G (e z β L(s)) + Q(s) dq dλ 0 (s) = z ez β q z (s) G G e z β L(s) (e z β 0 G (e z β 0 Λ 0 (s)) (e z β G ((e z β L(s)) dl dλ 0 (s)) subject to the initial/terminal conditions L(0) = 0, Q(τ 0 ) = 0 Slud and Vonta (2002) show that these ODE s have unique solutions, smooth with respect to β and differentiable in t, which (with λ β L ) minimize the functional K(β, λ β ) as desired. 11

Consistent preliminary estimators λ β can be developed by substituting for β 0, Λ 0 in those equations (smoothed with respect to t) consistent preliminary estimators. Punchline: new estimator β = arg max β loglik(β, λ β ) is efficient! Software for these estimators so far not ready for prime time because of need for general-purpose twopoint boundary value problem for ODE, but has been used to generate formulas for Semiparametric Information. That is technically easier because it only involves the adjoint ODE system obtained by differentiating the one above at the true values (β 0, λ 0 ) with respect to a parameter Q(0). 12

Censored Linear Regression Censored linear regression model (usually, for log-survival times) assumes ɛ i independent of (Z i, C i ) in X i = β tr Z i + ɛ i, Z i and ɛ i independent Data: iid triples (T i, Z i, i ), T i = min(x i, C i ), i = I [Xi C i ] Unknown parameters: β, λ(u) F ɛ(u)/(1 F ɛ (u)) Step 1. Preliminary estimator β0 as in Koul-Susarlavan Ryzin (1981) by regression i T i / ŜC Z(T i Z i ) on Z i Generally, Ŝ C Z a kernel-based nonparametric regression estimator (Cheng, 1989). If Z i, C i independent, use Kaplan-Meier ŜC KM (T i ). Step 2. λ0 estimated by kernel-density variant of Nelson-Aalen estimator (Ramlau-Hansen 1983), with kernel cdf A( ), bandwidth b n slowly enough: 13

λ 0 (w) = 1 b n A ( w u i dn i (u + Z ) β i 0 ) b n i I [Ti u+z i β 0 ] = n i A ( w T i + Z i β 0 ) ) / n b n j=1 I [Tj T i +(Z j Z i ) β 0 ] Step 3. Next use K-L functional to find: λ β (t) z q z(t+z β) λ 0 (t+z (β β 0 )) / z q z(t+z β) Step 4. Then define λβ by substituting λ 0 into n A( T i t Z iβ ) b λ 0 (t+z i(β β 0 )) / n n A( T i t Z iβ b n ) and Λ β by numerical integral of λβ over [0, t]. Step 5. Finally substitute these expressions into loglik = n { j log λ β (T j Z jβ) Λ β (T j Z jβ)} and maximize numerically. 14