Time-to-Event Modeling and Statistical Analysis

Similar documents
Modeling and Non- and Semi-Parametric Inference with Recurrent Event Data

Recurrent Event Data: Models, Analysis, Efficiency

Modelling and Analysis of Recurrent Event Data

Modeling and Analysis of Recurrent Event Data

Dynamic Models in Reliability and Survival Analysis

Semiparametric Estimation for a Generalized KG Model with R. Model with Recurrent Event Data

Lecture 5 Models and methods for recurrent event data

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Introduction to repairable systems STK4400 Spring 2011

Lecture 3. Truncation, length-bias and prevalence sampling

Frailty Models and Copulas: Similarities and Differences

Survival Analysis Math 434 Fall 2011

e 4β e 4β + e β ˆβ =0.765

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Multistate Modeling and Applications


Lecture 2: Martingale theory for univariate survival analysis

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Lecture 22 Survival Analysis: An Introduction

Estimating Load-Sharing Properties in a Dynamic Reliability System. Paul Kvam, Georgia Tech Edsel A. Peña, University of South Carolina

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Non- and Semi-parametric Bayesian Inference with Recurrent Events and Coherent Systems Data

Exercises. (a) Prove that m(t) =

Chapter 2 Inference on Mean Residual Life-Overview

MAS3301 / MAS8311 Biostatistics Part II: Survival

PhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)

Survival Regression Models

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency

Survival Analysis. Stat 526. April 13, 2018

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

STAT Sample Problem: General Asymptotic Results

11 Survival Analysis and Empirical Likelihood

1 Glivenko-Cantelli type theorems

CHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

ST495: Survival Analysis: Maximum likelihood

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Multivariate Survival Analysis

Statistical Inference and Methods

Efficiency of Profile/Partial Likelihood in the Cox Model

DAGStat Event History Analysis.

Models for Multivariate Panel Count Data

Multistate models and recurrent event models

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

Cox s proportional hazards model and Cox s partial likelihood

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

University of California, Berkeley

Frailty Modeling for clustered survival data: a simulation study

Empirical Processes & Survival Analysis. The Functional Delta Method

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

Power and Sample Size Calculations with the Additive Hazards Model

Multistate models and recurrent event models

Quantile Regression for Residual Life and Empirical Likelihood

8. Parametric models in survival analysis General accelerated failure time models for parametric regression

A generalized Brown-Proschan model for preventive and corrective maintenance. Laurent DOYEN.

Introduction to Reliability Theory (part 2)

Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

ST745: Survival Analysis: Nonparametric methods

Empirical Likelihood in Survival Analysis

Likelihood Construction, Inference for Parametric Survival Distributions

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Reliability Engineering I

Survival Analysis I (CHL5209H)

Follow this and additional works at: Part of the Applied Mathematics Commons

β j = coefficient of x j in the model; β = ( β1, β2,

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

STAT331. Cox s Proportional Hazards Model

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

MAS3301 / MAS8311 Biostatistics Part II: Survival

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Review and continuation from last week Properties of MLEs

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Statistical Analysis of Competing Risks With Missing Causes of Failure

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

Statistical Estimation

Survival Analysis for Case-Cohort Studies

AFT Models and Empirical Likelihood

Nonparametric two-sample tests of longitudinal data in the presence of a terminal event

Survival Analysis. STAT 526 Professor Olga Vitek

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Step-Stress Models and Associated Inference

Multivariate Survival Data With Censoring.

TMA 4275 Lifetime Analysis June 2004 Solution

COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

9 Estimating the Underlying Survival Distribution for a

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data

Regularization in Cox Frailty Models

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Transcription:

PeñaTalkGT) p. An Appetizer The important thing is not to stop questioning. Curiosity has its own reason for existing. One cannot help but be in awe when he contemplates the mysteries of eternity, of life, of the marvelous structure of reality. It is enough if one tries merely to comprehend a little of this mystery every day. Never lose a holy curiosity. Albert Einstein

Time-to-Event Modeling and Statistical Analysis Edsel A. Peña (Portions joint work with J. Gonzalez, M. Hollander, E. Slate, R. Strawderman) Department of Statistics University of South Carolina, Columbia, SC Research support from NIH Grants (RO1, COBRE)

PeñaTalkGT) p. Some Events of Interest Death. First publication after PhD graduation. Occurrence of tumor. Onset of depression. Machine/system failure. Occurrence of a natural disaster. Hospitalization. Non-life insurance claim. Accident or terrorist attack. Onset of economic recession. Divorce.

PeñaTalkGT) p. Event Times and Distributions T : the time to the occurrence of an event of interest. F (t) = Pr{T t} : the distribution function of T. S(t) = F (t) = 1 F (t) : survivor/reliability function. Hazard rate/ probability and Cumulative Hazards: Cont: λ(t)dt Pr{T t + dt T t} = f(t) S(t ) dt Disc: λ(t j ) = Pr{T = t j T t j } = f(t j) S(t j ) Cumulative: Λ(t) = t 0 λ(w)dw or Λ(t) = t j t λ(t j )

Representation/Relationships 0 < t 1 <... < t M = t, M(t) = max t i t i 1 = o(1), S(t) = Pr{T > t} = M i=1 M i=1 Pr{T > t i T t i 1 } [1 {Λ(t i ) Λ(t i 1 )}]. S as a product-integral of Λ: When M(t) 0, S(t) = w t [1 Λ(dw)] In general, Λ in terms of F : Λ(t) = t 0 df (w) 1 F (w ). PeñaTalkGT) p.

PeñaTalkGT) p. Estimation of F and Why? Most Basic Problem: Given a sample T 1, T 2,..., T n from an unknown distribution F, to obtain an estimator ˆF of F. Why is it important to know how to estimate F? Functionals/parameters θ(f ) of F (e.g., mean, median, variance) can be estimated via ˆθ = θ( ˆF ). Prediction of time-to-event for new units. Knowledge of population of units or event times. For comparing groups, e.g., thru a statistic Q = [ W (t)d ˆF1 (t) ˆF ] 2 (t) where W (t) is some weight function.

PeñaTalkGT) p. Gastroenterology Data: Aalen and Husebye ( 91) Migratory Motor Complex (MMC) Times for 19 Subjects MMC Data Set Unit Number 0 5 10 15 20 0 100 200 300 400 500 600 700 Calendar Time Question: How to estimate the MMC period dist, F?

PeñaTalkGT) p. Parametric Approach Unknown df F is assumed to belong to some parametric family (e.g., exponential, gamma, Weibull) F = {F (t; θ) : θ Θ R p } with functional form of F ( ; ) known; θ is unknown. Based on data t 1, t 2,..., t n, θ is estimated by ˆθ, say, via maximum likelihood (ML). ˆθ maximizes likelihood L(θ) = n f(t i ; θ) = n λ(t i ; θ) exp{ Λ(t i ; θ)}. i=1 i=1 The distribution function F is estimated by ˆF pa (t) = F (t; ˆθ).

PeñaTalkGT) p. Parametric Estimation: Asymptotics When F holds, MLE of θ satisfies ˆθ AN (θ, 1n ) I(θ) 1 ; I(θ) = Var{ θ log f(t 1; θ)} = Fisher information. Therefore, when F holds, by δ-method, with F (t; θ) = F (t; θ) θ then ˆF pa (t) AN ( F (t; θ), 1 n ) F (t; θ) I(θ) 1 F (t; θ).

PeñaTalkGT) p. Nonparametric Approach No assumptions are made regarding the family of distributions to which the unknown df F belongs. Empirical Distribution Function (EDF): ˆF np (t) = 1 n n i=1 I{T i t} ˆF np ( ) is a nonparametric MLE of F. Since I{T i t}, i = 1, 2,..., n, are IID Ber(F (t)), by Central Limit Theorem, ˆF np (t) AN (F (t), 1n ) F (t)[1 F (t)].

PeñaTalkGT) p.1 An Efficiency Comparison Assume that family F = {F (t; θ) : θ Θ} holds. Both ˆF pa and ˆF np are asymptotically unbiased. To compare under F, we take ratio of asymptotic variances to give the efficiency of parametric estimator over nonparametric estimator. Eff( ˆF pa (t) : ˆF np (t)) = F (t; θ)[1 F (t; θ)] F (t; θ) I(θ) 1 F. (t; θ) When F = {F (t; θ) = 1 exp{ θt} : θ > 0}, then Eff( ˆF pa (t) : ˆF np (t)) = exp{θt} 1 (θt) 2

Efficiency: Parametric/Nonparametric Asymptotic efficiency of parametric versus nonparametric estimators under a correct negative exponential family model. Effi of Para Relative to NonPara in Expo Case Efficiency (in percent) 0 200 400 600 800 1000 0 1 2 3 4 5 Value of (Theta)(t) PeñaTalkGT) p.1

PeñaTalkGT) p.1 Whither Nonparametrics? Consider however the case where the negative exponential family is fitted, but it is actually not the correct model. Let us suppose that the gamma family of distributions is the correct model. Under wrong model, with T = 1 n n i=1 T i the sample mean, the parametric estimator of F is ˆF pa (t) = 1 exp{ t/ T }. Under gamma with shape α and scale θ, and since T AN(α/θ, α/(nθ 2 )), by δ-method ˆF pa (t) AN ( { 1 exp θt }, 1 α n (θt) 2 α 3 exp { 2(θt) }). α

Efficiency: Under a Mis-specified Model Simulated Effi: MSE(NonParametric)/MSE(Parametric) under a mis-specified exponential family model. True Family of Model: Gamma Family Effi of Para vs NonPara under Misspecified Model Effi(%) (Para/Nonpara) 0 50 100 150 200 250 300 n=10 n=30 n=100 n=500 0 1 2 3 4 5 6 Value of t PeñaTalkGT) p.1

PeñaTalkGT) p.1 MMC Data: Censoring Aspect For each unit, red mark is the potential termination time. MMC Data Set Unit Number 0 5 10 15 20 0 100 200 300 400 500 600 700 Calendar Time Remark: All 19 MMC times completely observed.

PeñaTalkGT) p.1 Estimation of F : With Censoring For ith unit, a right-censoring variable C i with C 1, C 2,..., C n IID df G. Observables are (Z i, δ i ), i = 1, 2,..., n with Z i = min{t i, C i } and δ i = I{T i C i }. Problem: For observed (Z i, δ i )s, to estimate df F or hazard function Λ of the T i s. Nonparametric Approaches: Naive (product-limit)! Nonparametric MLE (Kaplan-Meier). Martingale and method-of-moments. Pioneers: Kaplan & Meier; Efron; Nelson; Breslow; Breslow & Crowley; Aalen; Gill.

PeñaTalkGT) p.1 Product-Limit Estimator Counting and At-Risk Processes: N(t) = n i=1 I{Z i t; δ i = 1}; Y (t) = n i=1 I{Z i t} Hazard probability estimate at t: ˆΛ(dt) = N(t) Y (t) = # of Observed Failures at t # at-risk at t Product-Limit Estimator (PLE): 1 ˆF (t) = Ŝ(t) = w t [ 1 N(t) ] Y (t)

PeñaTalkGT) p.1 Some Properties of PLE Nonparametric MLE of F (Kaplan-Meier, 58). PLE is a step-function which jumps only at observed failure times. With censored data, unequal jumps. Efron ( 67): Possesses self-consistency property. Biased for finite n. When no censoring and no tied values: N(t (i) ) = 1 and Y (t (i) ) = n i + 1, so Ŝ(t (i) ) = i j=1 [ 1 ] 1 n j + 1 = 1 i n.

PeñaTalkGT) p.1 Stochastic Process Approach A martingale M is a zero-mean process which models a fair game. With H t = history up to t: E{M(s + t) H t } = M(t). M(t) = N(t) t 0 Y (w)λ(dw) is a martingale, so with J(t) = I{Y (w) > 0} and stochastic integration, E { t 0 } J(w) Y (w) dn(w) = E { t Nelson-Aalen estimator of Λ, and PLE: 0 } J(w)Λ(dw). ˆΛ(t) = t 0 dn(w) Y (w), so Ŝ(t) = [1 ˆΛ(dw)]. w t

PeñaTalkGT) p.1 Likelihood Process: Hazard-Based J. Jacod s likelihood: L t (Λ( )) = w t [Y (w)λ(dw)] N(dw) [1 Y (w)λ(dw)] 1 N(dw). When Λ( ) is continuous, L t (Λ( )) = [Y (w)λ(dw)] N(dw) w t R t e Y (w)λ(dw) 0. With T (t) = t 0 Y (w)dw = TTOT(t), for λ(t) = θ, L t (θ) = θ N(t) exp{ θt (t)}.

PeñaTalkGT) p.2 Asymptotic Properties Proofs uses martingale central limit theorem. NAE: n[ˆλ(t) Λ(t)] Z 1 (t) with {Z 1 (t) : t 0} a zero-mean Gaussian process with d 1 (t) = Var(Z 1 (t)) = t 0 Λ(dw) S(w)Ḡ(w ). PLE: n[ ˆF (t) F (t)] Z 2 (t) st = S(t)Z 1 (t) so d 2 (t) = Var(Z 2 (t)) = S(t) 2 t 0 Λ(dw) S(w)Ḡ(w ). If Ḡ(w) 1 (no censoring), d 2(t) = F (t)s(t)!

PeñaTalkGT) p.2 Gaussian Process: Sample Paths Brownian Motion Paths Value of Process 3 2 1 0 1 2 3 0.0 0.2 0.4 0.6 0.8 1.0 Time

PeñaTalkGT) p.2 Regression Models In many situations we observe covariates: temperature, degree of usage, stress level, age, blood pressure, race, etc. How to account of them to improve knowledge of time-to-event. Modelling approaches: Log-linear models: log(t ) = β x + σɛ. The accelerated failure-time model. Error distribution to use? Normal errors not appropriate. Hazard-based models: Cox proportional hazards (PH) model; Aalen s additive hazards model.

PeñaTalkGT) p.2 Cox ( 72) PH Model: Single Event Conditional on x, hazard rate of T is: λ(t x) = λ 0 (t) exp{β x}. ˆβ maximizes partial likelihood function of β: L P (β) n [ exp(β x i ) n j=1 Y j(t) exp(β x j ) ] Ni (t). i=1 t< Aalen-Breslow semiparametric estimator of Λ 0 ( ): ˆΛ 0 (t) = t 0 n i=1 dn i(w) n i=1 Y i(w) exp( ˆβ x i ).

PeñaTalkGT) p.2 MMC Data: Recurrent Aspect Aalen and Husebye ( 91) Full Data MMC Data Set Unit Number 0 5 10 15 20 0 100 200 300 400 500 600 700 Calendar Time Problem: Estimate inter-event time distribution.

Representation: One Subject PeñaTalkGT) p.2

PeñaTalkGT) p.2 Observables: One Subject X(s) = covariate vector, possibly time-dependent T 1, T 2, T 3,... = inter-event or gap times S 1, S 2, S 3,... = calendar times of event occurrences τ = end of observation period: Assume τ G K = max{k : S k τ} = number of events in [0, τ] Z = unobserved frailty variable N (s) = number of events in [0, s] Y (s) = I{τ s} = at-risk indicator at time s F = {F s : s 0} = filtration: information that includes interventions, covariates, etc.

PeñaTalkGT) p.2 Aspect of Sum-Quota Accrual Remark: A unique feature of recurrent event modeling is the sum-quota constraint that arises due to a fixed or random observation window. Failure to recognize this in the statistical analysis leads to erroneous conclusions. K = max k : k T j τ j=1 (T 1, T 2,..., T K ) satisfies K T j τ < K+1 T j. j=1 j=1

PeñaTalkGT) p.2 Recurrent Event Models: IID Case Parametric Models: HPP: T i1, T i2, T i3,... IID EXP(λ). IID Renewal Model: T i1, T i2, T i3,... IID F where F F = {F ( ; θ) : θ Θ R p }; e.g., Weibull family; gamma family; etc. Non-Parametric Model: T i1, T i2, T i3,... IID F which is some df. With Frailty: For each unit i, there is an unobservable Z i from some distribution H( ; ξ) and (T i1, T i2, T i3,...), given Z i, are IID with survivor function [1 F (t)] Z i.

PeñaTalkGT) p.2 A General Class of Models Peña and Hollander (2004) model. Intensity: N (s) = A (s Z) + M (s Z) M (s Z) M 2 0 = sq-int martingales A (s Z) = s 0 Y (w)λ(w Z)dw λ(s Z) = Z λ 0 [E(s)] ρ[n (s ); α] ψ[β t X(s)] This class of models includes as special cases many models in reliability and survival analysis.

Effective Age Process PeñaTalkGT) p.3

Effective Age Process, E(s) Predictable, observable, nonnegative, dynamically specified, monotone, and differentiable on [S k 1, S k ), E(s) with E (s) 0. Perfect Intervention: E(s) = s S N (s ). Imperfect Intervention: E(s) = s. Minimal Intervention (Brown & Proschan, 83; Block, Borges & Savits, 85): E(s) = s S Γη(s ) where, with I 1, I 2,... IID BER(p), η(s) = N (s) i=1 I i and Γ k = min{j > Γ k 1 : I j = 1}. PeñaTalkGT) p.3

PeñaTalkGT) p.3 Semi-Parametric Estimation: No Frailty Observed Data for n Subjects: {(X i (s), N i (s), Y i (s), E i(s)) : 0 s s }, i = 1,..., n N i (s) = # of events in [0, s] for ith unit Y i (s) = at-risk indicator at s for ith unit with the model for the signal being A i (s) = s 0 Y i (v) ρ[n i (v ); α] ψ[βt X i (v)] λ 0 [E i (v)]dv where λ 0 ( ) is an unspecified baseline hazard rate function.

Processes and Notations Calendar/Gap Time Processes: N i (s, t) = s 0 I{E i (v) t}n i (dv) A i (s, t) = Notational Reductions: s 0 I{E i (v) t}a i (dv) E ij 1 (v) E i (v)i (Sij 1,S ij ](v)i{y i (v) > 0} ϕ ij 1 (w α, β) ρ(j 1; α)ψ{βt X i [E 1 ij 1 (w)]} E ij 1 [E 1 ij 1 (w)] PeñaTalkGT) p.3

PeñaTalkGT) p.3 Generalized At-Risk Process Y i (s, w α, β) N i (s ) j=1 I (Eij 1 (S ij 1 ), E ij 1 (S ij )](w) ϕ ij 1 (w α, β)+ I (EiN (s )(S in (s )), E i))](w) ϕ in (s )((s τ in i (s )(w α, β) i i i For IID Renewal Model (PSH, 01) this simplifies to: Y i (s, w) = N i (s ) j=1 I{T ij w} + I{(s τ i ) S in i (s ) w}

PeñaTalkGT) p.3 Estimation of Λ 0 A i (s, t α, β) = t 0 Y i (s, w α, β)λ 0 (dw) S 0 (s, t α, β) = n i=1 Y i (s, t α, β) J(s, t α, β) = I{S 0 (s, t α, β) > 0} Generalized Nelson-Aalen Estimator : ˆΛ 0 (s, t α, β) = t 0 { } { } J(s, w α, β) n N i (s, dw) S 0 (s, w α, β) i=1

PeñaTalkGT) p.3 Estimation of α and β Partial Likelihood (PL) Process: L P (s α, β) = n i=1 N i (s ) j=1 [ ρ(j 1; α)ψ[β t X i (S ij )] S 0 [s, E i (S ij ) α, β] ] N i (S ij) PL-MLE: ˆα and ˆβ are maximizers of the mapping (α, β) L P (s α, β) Iterative procedures. Implemented in an R package called gcmrec (Gonzaléz, Slate, Peña 04).

PeñaTalkGT) p.3 Estimation of F 0 G-NAE of Λ 0 ( ): ˆΛ 0 (s, t) ˆΛ 0 (s, t ˆα, ˆβ) G-PLE of F 0 (t): t [ ] n i=1 N i(s, dw) ˆ F 0 (s, t) = w=0 1 S 0 (s, w ˆα, ˆβ) For IID renewal model with E i (s) = s S in i (s ), ρ(k; α) = 1, and ψ(w) = 1, the estimator in PSH (2001) obtains.

PeñaTalkGT) p.3 Sum-Quota Effect: IID Renewal Generalized product-limit estimator ˆF 0 of common gap-time df F 0 presented in PSH (2001, JASA). n( ˆ F 0 ( ) F 0 ( )) = GP(0, σ 2 ( )) σ 2 (t) = F 0 (t) 2 t ν(w) = ρ ( ) = 0 1 Ḡ(w ) j=1 dλ 0 (w) F 0 (w)ḡ(w ) [1 + ν(w)] w ρ (v w)dg(v) F j 0 ( ) = renewal function

PeñaTalkGT) p.3 Semi-Parametric Estimation: With Frailty Recall the intensity rate: λ i (s Z i, X i ) = Z i λ 0 [E i (s)] ρ[n i (s ); α] ψ(βt X i (s)) Frailties Z 1, Z 2,..., Z n are unobserved and assumed to be IID Gamma(ξ, ξ) Unknown parameters: (ξ, α, β, λ 0 ( )) Use of the EM algorithm (Dempster, et al; Nielsen, et al), with frailties as missing observations. Estimator of baseline hazard function under no-frailty model plays an important role. Details are in Peña, Slate & Gonzalez (To appear in JSPI, 2006).

PeñaTalkGT) p.4 First Application: MMC Data Set Aalen and Husebye (1991) Data Estimates of distribution of MMC period Survivor Probability Estimate 0.0 0.2 0.4 0.6 0.8 1.0 IIDPLE WCPLE FRMLE 50 100 150 200 250 Migrating Moto Complex (MMC) Time, in minutes

PeñaTalkGT) p.4 Second Application: Bladder Data Set Bladder cancer data pertaining to times to recurrence for n = 85 subjects studied in Wei, Lin and Weissfeld ( 89). Subject 0 20 40 60 80 0 10 20 30 40 50 60 Calendar Time

PeñaTalkGT) p.4 Results and Comparisons Estimates from Different Methods for Bladder Data Cova Para AG WLW PWP General Model Marginal Cond*nal Perfect a Minimal b log N(t ) α - - -.98 (.07).79 Frailty ξ - - -.97 rx β 1 -.47 (.20) -.58 (.20) -.33 (.21) -.32 (.21) -.57 Size β 2 -.04 (.07) -.05 (.07) -.01 (.07) -.02 (.07) -.03 Number β 3.18 (.05).21 (.05).12 (.05).14 (.05).22 a Effective Age is backward recurrence time (E(s) = s S N (s ) ). b Effective Age is calendar time (E(s) = s).

PeñaTalkGT) p.4 On Asymptotic Properties Asymptotics under the no-frailty models. Difficulty: λ 0 ( ) has E(s) as argument in the model; whereas, interest is usually on Λ 0 (t). No martingale structure in gap-time axis. MCLT not directly applicable. Under regularity conditions: consistency and joint weak convergence to Gaussian processes of standardized (ˆα, ˆβ) and ˆΛ 0 (s, ). Results extend those in Andersen and Gill (AoS 82) regarding Cox PHM, though proofs different. Research on the asymptotics for the model with frailty in progress.

PeñaTalkGT) p.4 Asymptotics: Master Theorem {H i } a sequence defined on [0, s ] [0, t ]. M i (s, t) = s 0 I{E i(v) t}m i (dv). Y i (s, t) - generalized at-risk process. Under some regularity conditions, and if 1 n n i=1 H 2 i (s, )Y i (s, ) upr v(s, ), then, with Σ(s, t) = t 0 v(s, w)λ 0 (dw), 1 n n i=1 0 H i (s, w)m i (s, dw) = GP(0, Σ(s, )).

PeñaTalkGT) p.4 Relevant Empirical Measures Simplified model (one unit): Pr{dN i (v) = 1 F s } = Y i (v)λ 0[E i (v)]ξ i (v; η)dv. Conditional PM Q(s, w; η) on {1, 2,..., N (s ) + 1}: Q({j}; s, w; η) = ϕ j 1(w; η)i{e(s j 1 ) < w E(S j )} Y (s, w) with S N (s )+1 = min(s, τ). Conditional PM P (s, w; η) on {1, 2,..., n}: P ({i}; s, w; η) = Y i(s, w; η) npy (s, w; η).

PeñaTalkGT) p.4 Empirical Means & Variances Pf(D) = 1 n n i=1 f(d i ) E Q(s,w;η)g(J) = N (s )+1 j=1 g(j)q({j}; s, w; η) V Q(s,w;η)g(J) = E Q(s,w;η)[g 2 (J)] (E Q(s,w;η)g(J)) 2 n E P (s,w;η)g(i) = g(i)p ({i}; s, w; η) i=1 V P (s,w;η)g(i) = E Q(s,w;η)[g 2 (I)] (E Q(s,w;η)g(I)) 2

PeñaTalkGT) p.4 Relevant Limit Functions s 0 (s, w; η, Λ 0 ) = plim PY (s, w; η). Partial Likelihood Information Limit: I p (s, t; η, Λ 0 ) = plim t {[ ( EP (s,w;η)v Q(s,w;η) η log Ξ I (EIJ 1 1 (w); η)) + 0 ( V P (s,w;η)e Q(s,w;η) η log Ξ I (EIJ 1 1 (w); η))]} s 0 (s, w; η, Λ 0 ) Λ 0 (dw). With e(s, w; η, Λ 0 ) = plim P ηy (s,w;η) PY (s,w;η), let A(s, t; η, Λ 0 ) = t 0 e(s, w; η, Λ 0 )Λ 0 (dw).

PeñaTalkGT) p.4 Weak Convergence Results As n and under certain regularity conditions: n(ˆη(s, t ) η) N(0, I p (s, t ; η, Λ 0 ) 1 ) n(ˆλ0 (s, ) Λ 0 ( )) GP (0, Γ(s, ; η, Λ 0 )) where the limiting variance function is given by Γ(s, t; η, Λ 0 ) = t 0 Λ 0 (dw) s 0 (s, w; η) + A(s, t; η, Λ 0 )I p (s, t ; η, Λ 0 ) 1 A(s, t; η, Λ 0 ) t.

PeñaTalkGT) p.4 Concluding Remarks Many aspects of the general dynamic recurrent event model still under investigation. Asymptotics for the model with frailty. Testing hypothesis procedures. Goodness-of-fit and residual analysis. Its practical relevance still needs exploring, e.g., could the effective age process be determined appropriately in practice. Comparisons with marginal-based models (PWP, WLW). Dynamic recurrent event modeling remains a challenge and is a fertile area for research.