Empirical Processes & Survival Analysis. The Functional Delta Method

STAT/BMI 741, University of Wisconsin-Madison
Empirical Processes & Survival Analysis
Lecture 3: The Functional Delta Method
Lu Mao, lmao@biostat.wisc.edu

Objectives
By the end of this lecture, you will
 - learn the intuitive idea of the functional delta method;
 - see various examples of functional derivatives;
 - be able to apply the functional delta method to survival analysis problems such as estimation of the cumulative incidence function of competing risks (Gray's estimator and tests).

Contents
1.1 von Mises Calculus
1.2 Hadamard Differentiable Functions
1.3 Application: The Cumulative Incidence of Competing Risks

Smooth Functionals
Consider a parameter defined as a functional of the underlying distribution $P$: $\theta(P)$.
Examples:
 - Mean: $\theta(P) = PX$
 - Variance: $\theta(P) = P(X - PX)^2$
 - Quantiles: $\theta(F) = \inf\{\xi : F(\xi) \ge p\}$

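Smooth Functionals
The natural estimator is $\theta(\mathbb{P}_n)$, where $\mathbb{P}_n$ is the empirical measure.
Examples:
 - Sample mean: $\theta(\mathbb{P}_n) = \mathbb{P}_n X$
 - Sample variance: $\theta(\mathbb{P}_n) = \mathbb{P}_n (X - \mathbb{P}_n X)^2$
 - Sample quantiles: $\theta(\hat F_n) = \inf\{\xi : \hat F_n(\xi) \ge p\}$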

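Smooth Functionals
How do we derive the asymptotic distribution of $\theta(\mathbb{P}_n)$? Note that
$$\sqrt{n}\,\bigl(\theta(\mathbb{P}_n) - \theta(P)\bigr) = \sqrt{n}\,\bigl(\theta(P + n^{-1/2}\mathbb{G}_n) - \theta(P)\bigr), \qquad (3.1)$$
where $\mathbb{G}_n = \sqrt{n}(\mathbb{P}_n - P)$ is the empirical process. Now view $\mathbb{G}_n$ as if it were a fixed quantity $H$ (not an entirely strange thing to do, since $\mathbb{G}_n$ is stabilized over a Donsker class). Then
$$\theta(P + n^{-1/2}H) - \theta(P) \approx n^{-1/2}\,\dot\theta[H]$$
for some linear operator $\dot\theta$, so the right-hand side of (3.1) is approximately $\dot\theta[H]$.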

Smooth Functionals
The linear operator $\dot\theta$ can be computed by
$$\dot\theta[H] = \frac{\partial}{\partial\epsilon}\,\theta(P + \epsilon H)\Big|_{\epsilon=0}.$$
A linear functional $\dot\theta$ on the signed measures $H$ can always be represented by
$$\dot\theta[H] = H\Phi$$
for some function $\Phi$.

Smooth Functionals
So
$$\sqrt{n}\,\bigl(\theta(\mathbb{P}_n) - \theta(P)\bigr) = \dot\theta[\mathbb{G}_n] + o_P(1) = \mathbb{G}_n\Phi + o_P(1).$$
The influence function $\Phi$ usually presents itself during the calculation of $\dot\theta$.

Smooth Functionals
Example 1: $\theta(P) = PX$.
$$\dot\theta[H] = \frac{\partial}{\partial\epsilon}\,(P + \epsilon H)X\Big|_{\epsilon=0} = HX.$$
Hence
$$\sqrt{n}\,(\mathbb{P}_n X - PX) = \mathbb{G}_n X + o_P(1).$$
This is a rather trivial example: the remainder is exactly zero.

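Smooth Functionals
Example 2: $\theta(P) = P(X - PX)^2 =: \sigma^2$.
$$\begin{aligned}
\dot\theta[H] &= \frac{\partial}{\partial\epsilon}\,(P + \epsilon H)\bigl(X - (P + \epsilon H)X\bigr)^2\Big|_{\epsilon=0} \\
&= H(X - PX)^2 - 2\,(HX)\,P(X - PX) \\
&= H(X - PX)^2,
\end{aligned}$$
since $P(X - PX) = 0$. Hence
$$\sqrt{n}\,\bigl(\mathbb{P}_n(X - \mathbb{P}_n X)^2 - \sigma^2\bigr) = \mathbb{G}_n (X - PX)^2 + o_P(1).$$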
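The expansion in Example 2 can be checked numerically. The sketch below is only an illustration under assumptions that are not in the lecture: $X \sim N(0, 1)$, so that the variance of the influence function $(X - PX)^2 - \sigma^2$ equals $E X^4 - \sigma^4 = 2$, and arbitrary choices of sample size, replication count, and seed. The function names are hypothetical.

```python
import random
import statistics

def variance_functional(xs):
    """theta(P_n) = P_n (X - P_n X)^2, i.e. the sample variance with divisor n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def simulate_scaled_errors(n=400, reps=2000, seed=0):
    """Monte Carlo draws of sqrt(n) * (theta(P_n) - sigma^2) for X ~ N(0, 1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws.append(n ** 0.5 * (variance_functional(xs) - 1.0))
    return draws

# Variance of the influence function (X - PX)^2 - sigma^2 under N(0, 1):
# E(X^4) - sigma^4 = 3 - 1 = 2 (an assumption of this illustration).
influence_variance = 2.0

# Monte Carlo variance of the scaled estimation error; it should be close to 2.
mc_variance = statistics.pvariance(simulate_scaled_errors())
```

If the first-order expansion holds, `mc_variance` should be close to `influence_variance`, up to Monte Carlo noise.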

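Smooth Functionals
Example 3: $\theta(F) = F^{-1}(p) =: \xi_p$.
Re-define $\theta(P)$ such that $P\,1(X \le \theta(P)) = p$. Hence
$$\frac{\partial}{\partial\epsilon}\,(P + \epsilon H)\,1\bigl(X \le \theta(P + \epsilon H)\bigr)\Big|_{\epsilon=0} = 0.$$
By the chain rule,
$$H\,1(X \le \theta(P)) + f(\theta(P))\,\dot\theta[H] = 0,$$
where $f$ is the density of $X$. So
$$\dot\theta[H] = -\frac{H\,1(X \le \xi_p)}{f(\xi_p)}.$$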

Smooth Functionals
So
$$\sqrt{n}\,\bigl(\theta(\hat F_n) - \xi_p\bigr) = -\mathbb{G}_n\,\frac{1(X \le \xi_p)}{f(\xi_p)} + o_P(1).$$
This is the same result as derived in Example 1.1.
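The quantile expansion implies an asymptotic variance of $p(1-p)/f(\xi_p)^2$, because the indicator $1(X \le \xi_p)$ has variance $p(1-p)$. The following sketch compares this with a simulation for the median of a standard normal sample. The distribution, sample size, replication count, and seed are assumptions made only for illustration, and the function names are hypothetical.

```python
import math
import random
import statistics

def sample_quantile(xs, p):
    """theta(F_n) = inf{xi : F_n(xi) >= p}, i.e. the ceil(n p)-th order statistic."""
    ordered = sorted(xs)
    # The small tolerance guards against floating-point overshoot in n * p.
    k = max(1, math.ceil(len(ordered) * p - 1e-12))
    return ordered[k - 1]

def simulate_median_errors(n=401, reps=2000, seed=1):
    """Monte Carlo draws of sqrt(n) * (sample median - 0) for X ~ N(0, 1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws.append(n ** 0.5 * sample_quantile(xs, 0.5))
    return draws

# p (1 - p) / f(xi_p)^2 at p = 1/2 for the standard normal density: pi / 2.
density_at_median = 1.0 / math.sqrt(2.0 * math.pi)
asymptotic_variance = 0.5 * 0.5 / density_at_median ** 2

# Monte Carlo variance of the scaled error of the sample median.
mc_variance = statistics.pvariance(simulate_median_errors())
```

The two quantities `asymptotic_variance` and `mc_variance` should agree up to Monte Carlo noise.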

Hadamard Differentiable Functions
In the previous section, we treated $\mathbb{G}_n$ as if it were fixed. If $\mathbb{G}_n$ is indexed by a Donsker class, it is asymptotically tight, so it eventually ranges (with high probability) over a compact set. So for a first-order approximation, we need the stronger condition that for every compact set $K$,
$$\sup_{H \in K}\left|\frac{\theta(P + \epsilon H) - \theta(P)}{\epsilon} - \dot\theta[H]\right| \to 0 \quad \text{as } \epsilon \to 0.$$
If there exists such a continuous linear map $\dot\theta$, then $\theta(P)$ is said to be Hadamard differentiable at $P$ with derivative $\dot\theta$.

Hadamard Differentiable Functions
More generally, let $\phi(\eta)$ be a Hadamard differentiable function of a (functional) parameter $\eta$, and let $\hat\eta_n$ be an estimator of $\eta$ such that $\sqrt{n}(\hat\eta_n - \eta)$ is asymptotically tight with
$$\sqrt{n}\,(\hat\eta_n - \eta) = \mathbb{G}_n\Psi + o_P(1).$$
Let $\dot\phi$ be the derivative (a linear operator) of $\phi$ at $\eta$. Then
$$\sqrt{n}\,\bigl(\phi(\hat\eta_n) - \phi(\eta)\bigr) = \mathbb{G}_n\,\dot\phi[\Psi] + o_P(1).$$

Hadamard Differentiable Functions
Many functions are Hadamard differentiable. We omit the technical proofs.
Example: $\phi(F, G) = \int F\,dG$.
$$\dot\phi_{F,G}[h_1, h_2] = \frac{\partial}{\partial\epsilon}\int (F + \epsilon h_1)\,d(G + \epsilon h_2)\Big|_{\epsilon=0} = \int h_1\,dG + \int F\,dh_2.$$
Application: the Mann-Whitney statistic.
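As a small illustration of the Mann-Whitney application, the plug-in value of $\phi(F, G) = \int F\,dG$ at two empirical distribution functions is the proportion of pairs with $X_i \le Y_j$. The sketch below assumes two independent samples given as plain lists; the function name is hypothetical.

```python
def integral_F_dG(xs, ys):
    """Plug-in phi(F_m, G_n) = int F_m dG_n = #{(i, j): x_i <= y_j} / (m n).

    xs : sample with empirical distribution function F_m
    ys : sample with empirical distribution function G_n
    """
    m, n = len(xs), len(ys)
    return sum(1 for x in xs for y in ys if x <= y) / (m * n)

# Pairs with x <= y: (1, 1.5), (1, 3), (2, 3), i.e. 3 of the 4 pairs.
example_value = integral_F_dG([1, 2], [1.5, 3])
```

This quantity estimates $\Pr(X \le Y)$, which is what the Mann-Whitney statistic targets.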

Hadamard Differentiable Functions
Example 1.3 (Nelson-Aalen Estimator, cont'd)
We have shown that the Nelson-Aalen estimator takes the form
$$\hat\Lambda(t) = \int_0^t \frac{\mathbb{P}_n\,dN(s)}{\mathbb{P}_n Y(s)}.$$

Hadamard Differentiable Functions
Example 1.3 (Nelson-Aalen Estimator, cont'd)
Now, view $\hat\Lambda$ as a function $\phi$ of $\mathbb{P}_n$, where
$$\phi(P) = \int_0^{\cdot} \frac{P\,dN(s)}{P\,Y(s)}.$$
Obviously,
$$\hat\Lambda(\cdot) = \phi(\mathbb{P}_n) \quad\text{and}\quad \Lambda(\cdot) = \phi(P).$$

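Hadamard Differentiable Functions
Example 1.3 (Nelson-Aalen Estimator, cont'd)
The functional derivative can be calculated by
$$\begin{aligned}
\dot\phi[H] &= \frac{\partial}{\partial\epsilon}\,\phi(P + \epsilon H)\Big|_{\epsilon=0}
= \frac{\partial}{\partial\epsilon}\int_0^{\cdot} \frac{(P + \epsilon H)\,dN(s)}{(P + \epsilon H)\,Y(s)}\Big|_{\epsilon=0} \\
&= H\left\{\int_0^{\cdot} \frac{dN(s)}{\pi(s)} - \int_0^{\cdot} \frac{Y(s)\,P\,dN(s)}{\pi(s)^2}\right\} \\
&= H\int_0^{\cdot} \frac{dN(s) - Y(s)\,d\Lambda(s)}{\pi(s)} \\
&= H\int_0^{\cdot} \frac{dM_{\Lambda}(s)}{\pi(s)},
\end{aligned}$$
where $\pi(s) = P\,Y(s)$, $d\Lambda(s) = P\,dN(s)/\pi(s)$, and $M_\Lambda(t) = N(t) - \int_0^t Y(s)\,d\Lambda(s)$. This is the same as derived in Example 1.3 of §1.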
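For concreteness, the Nelson-Aalen estimator $\hat\Lambda(t) = \int_0^t \mathbb{P}_n\,dN(s) / \mathbb{P}_n Y(s)$ reduces to a sum of (number of events) / (number at risk) over the observed event times. The sketch below assumes right-censored data given as two parallel lists; the function name and the data layout are hypothetical.

```python
def nelson_aalen(times, events):
    """Return [(t, Lambda_hat(t))] at the distinct observed event times.

    times  : observed follow-up times (event or censoring, whichever came first)
    events : 1 if the event was observed, 0 if the observation was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    cumulative, path = 0.0, []
    for s in event_times:
        d = sum(1 for t, e in zip(times, events) if t == s and e == 1)  # n * P_n dN(s)
        y = sum(1 for t in times if t >= s)                              # n * P_n Y(s)
        cumulative += d / y
        path.append((s, cumulative))
    return path

# Four subjects; the second observation at time 2 is censored.
example_path = nelson_aalen([1, 2, 2, 3], [1, 1, 0, 1])
```

In the example, the increments are 1/4, 1/3, and 1/1 at times 1, 2, and 3, because the at-risk counts are 4, 3, and 1.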

Kaplan-Meier as Product Limit Functional
We consider the product limit functional mapping the space of cadlag functions on $[0, \tau]$ into itself:
$$\phi(A)(t) = \prod_{0 < s \le t}\bigl(1 + dA(s)\bigr) = \lim_{\max_i |s_i - s_{i-1}| \to 0}\ \prod_i \bigl\{1 + A(s_i) - A(s_{i-1})\bigr\},$$
where the second equality is the definition and the limit is over partitions $0 = s_0 < s_1 < \cdots < s_m = t$ with maximum separation decreasing to zero.

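Kaplan-Meier as Product Limit Functional
To derive the functional derivative of $\phi(A)$, observe
$$\begin{aligned}
\dot\phi_A(H)(t) &= \frac{\partial}{\partial\epsilon}\,\phi(A + \epsilon H)(t)\Big|_{\epsilon=0} \\
&= \lim_{\max_i |s_i - s_{i-1}| \to 0}\ \frac{\partial}{\partial\epsilon}\prod_i \bigl\{1 + A(s_i) - A(s_{i-1}) + \epsilon\,(H(s_i) - H(s_{i-1}))\bigr\}\Big|_{\epsilon=0} \\
&= \lim_{\max_i |s_i - s_{i-1}| \to 0}\ \sum_i \bigl(H(s_i) - H(s_{i-1})\bigr)\prod_{j \ne i}\bigl\{1 + A(s_j) - A(s_{j-1})\bigr\} \\
&= \lim_{\max_i |s_i - s_{i-1}| \to 0}\ \prod_i \bigl\{1 + A(s_i) - A(s_{i-1})\bigr\}\ \sum_i \frac{H(s_i) - H(s_{i-1})}{1 + A(s_i) - A(s_{i-1})} \\
&= \phi(A)(t)\int_0^t \{1 + \Delta A(s)\}^{-1}\,dH(s),
\end{aligned}$$
where $\Delta A(s) = A(s) - A(s-)$.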
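The derivative formula can be sanity-checked in the simplest setting, where $A$ and $H$ are pure-jump functions with finitely many jumps, so that the product limit is a finite product and the integral is a finite sum. The sketch below compares the formula with a finite-difference quotient. The jump values are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import math

def product_limit(jumps_a):
    """phi(A)(t) for a pure-jump A: the product of (1 + Delta A(s)) over its jumps."""
    return math.prod(1.0 + a for a in jumps_a)

def derivative_formula(jumps_a, jumps_h):
    """phi(A)(t) * sum over jumps of Delta H(s) / (1 + Delta A(s))."""
    total = sum(h / (1.0 + a) for a, h in zip(jumps_a, jumps_h))
    return product_limit(jumps_a) * total

def finite_difference(jumps_a, jumps_h, eps=1e-6):
    """Numerical derivative of eps -> phi(A + eps H)(t) at eps = 0."""
    perturbed = [a + eps * h for a, h in zip(jumps_a, jumps_h)]
    return (product_limit(perturbed) - product_limit(jumps_a)) / eps

# Illustrative jumps of A (think of -Delta Lambda) and of the direction H.
jumps_a = [-0.25, -1.0 / 3.0, -0.1]
jumps_h = [0.5, -0.2, 1.0]
analytic = derivative_formula(jumps_a, jumps_h)
numeric = finite_difference(jumps_a, jumps_h)
```

The two values agree up to the discretization error of the finite difference, which is of order `eps`.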

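Kaplan-Meier as Product Limit Functional
Note that the Kaplan-Meier estimator can be expressed as a product limit functional of the Nelson-Aalen estimator:
$$\hat S_n(t) = \phi(-\hat\Lambda_n)(t) = \prod_{0 < s \le t}\bigl(1 - d\hat\Lambda_n(s)\bigr).$$
Thus,
$$\begin{aligned}
\sqrt{n}\,(\hat S_n - S)(t) &= -\mathbb{G}_n\,S(t)\int_0^t \frac{1}{(1 - \Delta\Lambda(s))\,\pi(s)}\,dM_{\Lambda}(s) + o_P(1) \\
&= -\mathbb{G}_n\,S(t)\int_0^t \frac{1}{\pi(s)}\,dM_{\Lambda}(s) + o_P(1) \quad (\text{when } \Lambda \text{ is continuous}).
\end{aligned}$$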
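The product-limit representation translates directly into code: the Kaplan-Meier estimate is the running product of one minus the Nelson-Aalen increments. The sketch below assumes right-censored data in two parallel lists; the function name and data layout are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S_hat(t))] as the product over event times of (1 - d / Y).

    times  : observed follow-up times (event or censoring, whichever came first)
    events : 1 if the event was observed, 0 if the observation was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival, path = 1.0, []
    for s in event_times:
        d = sum(1 for t, e in zip(times, events) if t == s and e == 1)  # events at s
        y = sum(1 for t in times if t >= s)                              # at risk at s
        survival *= 1.0 - d / y        # 1 - Delta Lambda_hat(s)
        path.append((s, survival))
    return path

# Same toy data as a four-subject sample with one censored observation at time 2.
example_path = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
```

In the example the factors are 3/4, 2/3, and 0, giving survival estimates 0.75, 0.5, and 0 at times 1, 2, and 3.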

Competing Risks
Competing risks data arise when each subject can experience one and only one of several competing causes of failure.
Example: death from cancer vs. death related to treatment (e.g., chemotherapy).

Competing Risks
Competing risks data take the form $(T, D)$, where $T$ is the failure time and $D \in \{1, \ldots, J\}$ is the cause of failure.
We can imagine $(T, D)$ as arising from a vector of $J$ latent competing failure times $\tilde T_1, \ldots, \tilde T_J$ such that $T = \min\{\tilde T_1, \ldots, \tilde T_J\}$, and $D$ is the subscript of the first event time among the $\tilde T_j$.

Competing Risks
However, the joint distribution of $(\tilde T_1, \ldots, \tilde T_J)$ cannot be identified from the observed data $(T, D)$. Even independence is not identifiable.
To make inference on the latent event times, one has to make strong and usually unrealistic assumptions such as independence.
The alternative is to stick with identifiable quantities based on $(T, D)$.

Competing Risks
A popular quantity that is identifiable is the cause-specific hazard:
$$d\Lambda_j(t) = \Pr(t \le T < t + dt,\ D = j \mid T \ge t),$$
i.e., the instantaneous rate for the $j$th cause of failure given survival to that point.
The cause-specific hazard can be estimated by the Nelson-Aalen estimator, treating other causes as censoring.
The cause-specific hazard $\Lambda_j$ reduces to the net hazard of $\tilde T_j$ if all other $\tilde T_k$, $k \ne j$, are independent of $\tilde T_j$.

Competing Risks
Another identifiable quantity that is often of interest is the sub-distribution:
$$F_j(t) = \Pr(T \le t,\ D = j),$$
i.e., the cumulative incidence of the $j$th cause of failure in the presence of other causes.
The sub-distribution $F_j(t)$ is not a functional of the cause-specific hazard $\Lambda_j(t)$ alone; in particular, one cannot use the naive Kaplan-Meier curve (product limit of the Nelson-Aalen estimator for the cause-specific hazard) to estimate the sub-distribution.

Competing Risks
Observe that
$$\begin{aligned}
dF_j(t) &= \Pr(t \le T < t + dt,\ D = j) \\
&= \Pr(T \ge t)\,\Pr(t \le T < t + dt,\ D = j \mid T \ge t) \\
&= S(t-)\,d\Lambda_j(t),
\end{aligned}$$
where $S(t) = \Pr(T > t)$. So
$$F_j(t) = \int_0^t S(s-)\,d\Lambda_j(s).$$

Competing Risks
Hence we can estimate the sub-distribution by
$$\hat F_{jn}(t) = \int_0^t \hat S_n(s-)\,d\hat\Lambda_{jn}(s),$$
where $\hat S_n$ is the Kaplan-Meier estimator for the overall survival function and $\hat\Lambda_{jn}$ is the Nelson-Aalen estimator for the cause-specific hazard function of the $j$th cause of failure.
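The estimator $\hat F_{jn}(t) = \int_0^t \hat S_n(s-)\,d\hat\Lambda_{jn}(s)$ is a finite sum over the observed failure times. The sketch below computes it and, for comparison, the naive one-minus-Kaplan-Meier curve that treats other causes as censoring. The data coding (cause 0 meaning censored) and the function names are assumptions made for this illustration.

```python
def cumulative_incidence(times, causes, cause):
    """F_hat_j(t) = sum over failure times s <= t of S_hat(s-) * d_j(s) / Y(s).

    times  : observed follow-up times
    causes : 0 if censored, otherwise the observed cause of failure (1, ..., J)
    cause  : the cause j of interest
    """
    failure_times = sorted({t for t, c in zip(times, causes) if c != 0})
    surv_left, cif, path = 1.0, 0.0, []   # surv_left holds S_hat(s-)
    for s in failure_times:
        y = sum(1 for t in times if t >= s)
        d_all = sum(1 for t, c in zip(times, causes) if t == s and c != 0)
        d_j = sum(1 for t, c in zip(times, causes) if t == s and c == cause)
        cif += surv_left * d_j / y        # S_hat(s-) * Delta Lambda_hat_j(s)
        surv_left *= 1.0 - d_all / y      # update the overall Kaplan-Meier
        path.append((s, cif))
    return path

def naive_one_minus_km(times, causes, cause):
    """1 - Kaplan-Meier for cause j, treating all other causes as censoring."""
    cause_times = sorted({t for t, c in zip(times, causes) if c == cause})
    surv, path = 1.0, []
    for s in cause_times:
        y = sum(1 for t in times if t >= s)
        d_j = sum(1 for t, c in zip(times, causes) if t == s and c == cause)
        surv *= 1.0 - d_j / y
        path.append((s, 1.0 - surv))
    return path

# Toy data: failures from cause 1 at times 1 and 4, cause 2 at time 2, censoring at 3.
toy_times, toy_causes = [1, 2, 3, 4], [1, 2, 0, 1]
cif_1 = cumulative_incidence(toy_times, toy_causes, 1)
cif_2 = cumulative_incidence(toy_times, toy_causes, 2)
naive_1 = naive_one_minus_km(toy_times, toy_causes, 1)
```

In this toy data set the cumulative incidences of the two causes end at 0.75 and 0.25, which sum to one minus the overall Kaplan-Meier estimate, whereas the naive curve for cause 1 reaches 1.0 and so overstates the cumulative incidence.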

Competing Risks
Specifically, let $C$ be the independent censoring time. Then the observed data are
$$\{T \wedge C,\ D\,I(T \le C),\ I(T \le C)\}.$$
The observed data can also be represented by $N(t) = I(T \le t \wedge C)$, $Y(t) = I(T \wedge C \ge t)$, and $N_j(t) = I(T \le t \wedge C,\ D = j)$, $j = 1, \ldots, J$.
Denote $\pi(s) = P\,Y(s)$,
$$M_{\Lambda}(t) = N(t) - \int_0^t Y(s)\,d\Lambda(s), \qquad M_{\Lambda_j}(t) = N_j(t) - \int_0^t Y(s)\,d\Lambda_j(s),$$
where $\Lambda$ is the cumulative hazard function for $T$.

Competing Risks
We focus on the estimation of the cumulative incidence of the first cause of failure. We know that
$$\sqrt{n}\,(\hat S_n - S)(t) = -\mathbb{G}_n\,S(t)\int_0^t \frac{1}{\pi(s)}\,dM_{\Lambda}(s) + o_P(1)$$
and
$$\sqrt{n}\,(\hat\Lambda_{1n} - \Lambda_1)(t) = \mathbb{G}_n\int_0^t \frac{1}{\pi(s)}\,dM_{\Lambda_1}(s) + o_P(1).$$

Competing Risks
We have $\hat F_{1n} = \phi(\hat S_n, \hat\Lambda_{1n})$ and $F_1 = \phi(S, \Lambda_1)$, where
$$\phi(S, \Lambda_1)(t) = \int_0^t S(s-)\,d\Lambda_1(s).$$
We know that
$$\dot\phi_{S,\Lambda_1}[H_1, H_2](t) = \int_0^t H_1(s-)\,d\Lambda_1(s) + \int_0^t S(s-)\,dH_2(s).$$

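Competing Risks
So,
$$\begin{aligned}
\sqrt{n}\,(\hat F_{1n} - F_1)(t) &= \mathbb{G}_n\,\dot\phi_{S,\Lambda_1}\!\left[-S(\cdot)\int_0^{\cdot}\frac{1}{\pi(s)}\,dM_{\Lambda}(s),\ \int_0^{\cdot}\frac{1}{\pi(s)}\,dM_{\Lambda_1}(s)\right](t) + o_P(1) \\
&= -\mathbb{G}_n\int_0^t S(u-)\int_0^{u-}\frac{1}{\pi(s)}\,dM_{\Lambda}(s)\,d\Lambda_1(u) \\
&\quad + \mathbb{G}_n\int_0^t S(s-)\,\frac{1}{\pi(s)}\,dM_{\Lambda_1}(s) + o_P(1).
\end{aligned}$$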

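Competing Risks
To simplify the first term on the right, use integration by parts:
$$\begin{aligned}
\int_0^t S(u-)\int_0^{u-}\frac{1}{\pi(s)}\,dM_{\Lambda}(s)\,d\Lambda_1(u)
&= \int_0^t \int_0^{u-}\frac{1}{\pi(s)}\,dM_{\Lambda}(s)\,dF_1(u) \\
&= \left[F_1(u)\int_0^{u}\frac{1}{\pi(s)}\,dM_{\Lambda}(s)\right]_{u=0}^{t} - \int_0^t \frac{F_1(s)}{\pi(s)}\,dM_{\Lambda}(s) \\
&= \int_0^t \frac{F_1(t) - F_1(s)}{\pi(s)}\,dM_{\Lambda}(s).
\end{aligned}$$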

Competing Risks
Therefore, to conclude,
$$\sqrt{n}\,(\hat F_{1n} - F_1)(t) = -\mathbb{G}_n\int_0^t \frac{F_1(t) - F_1(s)}{\pi(s)}\,dM_{\Lambda}(s) + \mathbb{G}_n\int_0^t S(s-)\,\frac{1}{\pi(s)}\,dM_{\Lambda_1}(s) + o_P(1).$$
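One possible use of this expansion is a plug-in variance estimate for the estimated cumulative incidence at a fixed time $t$: replace $F_1$, $S$, $\pi$, and the martingale increments by their empirical counterparts, evaluate the resulting influence value for each subject, and average the squares. The sketch below is only an illustration under assumptions not in the lecture: data coded with cause 0 for censoring, no special handling of ties, and hypothetical function names. It is not presented as the estimator used in any particular software.

```python
import random

def cif_influence_values(times, causes, t, cause=1):
    """Return (F_hat_1(t), [psi_i]) with plug-in influence values psi_i.

    psi_i = sum over failure times s <= t of
            -(F_hat(t) - F_hat(s)) / pi_hat(s) * dM_hat_i(s)
            + S_hat(s-) / pi_hat(s) * dM_hat_{1,i}(s).
    """
    n = len(times)
    grid = sorted({x for x, c in zip(times, causes) if c != 0 and x <= t})
    surv_left, cif, rows = 1.0, 0.0, []
    for s in grid:
        y = sum(1 for x in times if x >= s)
        d_all = sum(1 for x, c in zip(times, causes) if x == s and c != 0)
        d_j = sum(1 for x, c in zip(times, causes) if x == s and c == cause)
        cif += surv_left * d_j / y
        # Store s, pi_hat(s), dLambda_hat(s), dLambda_hat_1(s), S_hat(s-), F_hat(s).
        rows.append((s, y / n, d_all / y, d_j / y, surv_left, cif))
        surv_left *= 1.0 - d_all / y
    cif_t, psi = cif, []
    for x, c in zip(times, causes):
        total = 0.0
        for s, pi, dlam, dlam_j, s_left, f_s in rows:
            at_risk = 1.0 if x >= s else 0.0
            dm = (1.0 if (x == s and c != 0) else 0.0) - at_risk * dlam
            dm_j = (1.0 if (x == s and c == cause) else 0.0) - at_risk * dlam_j
            total += -(cif_t - f_s) / pi * dm + s_left / pi * dm_j
        psi.append(total)
    return cif_t, psi

# Illustration: 50 uncensored subjects with a single cause, evaluated at the
# 25th order statistic, where F_hat equals 0.5.
rng = random.Random(2)
toy_times = [rng.expovariate(1.0) for _ in range(50)]
toy_causes = [1] * 50
t_eval = sorted(toy_times)[24]
cif_t, psi = cif_influence_values(toy_times, toy_causes, t_eval)
variance_estimate = sum(v * v for v in psi) / len(psi) ** 2
```

In this uncensored single-cause illustration the estimator is the empirical distribution function, so `variance_estimate` should be close to the binomial value $\hat F(1 - \hat F)/n$, and the influence values sum to zero because the estimated martingale increments do.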

Concluding Remarks
The functional delta method is a powerful tool in semiparametric inference, particularly survival analysis.
We have provided a few simple examples in this lecture.
For a more formal treatment of the functional delta method and more examples, refer to van der Vaart (1998, chap. 20) and Andersen et al. (1993, II.8).

References
 - Andersen, P. K., Borgan, O., Gill, R. D., & Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer Science & Business Media.
 - van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press.