Smoothing the Nelson-Aalen Estimator. Biostat 277 presentation. Chi-hong Tseng


References:
1. Andersen, Borgan, Gill, and Keiding (1993). Statistical Models Based on Counting Processes, Springer-Verlag, pp. 229-255.
2. Müller and Wang (1994). Hazard Rate Estimation Under Random Censoring with Varying Kernels and Bandwidths, Biometrics 50, 61-76.

1 Survival Data and Nelson-Aalen Estimator

1. Survival time $T$
2. Censoring time $C$
3. Observation time $X = \min(T, C)$
4. Censoring indicator $D = I(T \le C)$
5. Hazard function: $\alpha(t) = \lim_{\Delta t \to 0} \Pr(t \le T < t + \Delta t \mid T \ge t)/\Delta t$
6. Cumulative hazard function: $A(t) = \int_0^t \alpha(u)\,du$
7. Nelson-Aalen estimator $\hat{A}(t) = \sum_{T_i \le t} D_i / n_i$
8. $n_i$ = number of subjects at risk at time $T_i$
9. At-risk process $Y_i(t) = I(X_i \ge t)$, $Y(t) = \sum_i Y_i(t)$
10. Counting process $N_i(t) = I(X_i \le t, D_i = 1)$, $N(t) = \sum_i N_i(t)$
11. Nelson-Aalen estimator $\hat{A}(t) = \int_0^t \frac{J(u)}{Y(u)}\,dN(u)$
12. $J(s) = I(Y(s) > 0)$

2 Kernel Function Estimator

The kernel function estimator is given by
$$\hat{\alpha}(t) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\,d\hat{A}(s) = \frac{1}{b}\sum_j K\Big(\frac{t-T_j}{b}\Big)\big(Y(T_j)\big)^{-1},$$
where the sum runs over the observed event times $T_j$.

1. $K$: kernel function with $\int_{-1}^{1} K(s)\,ds = 1$, e.g. the Epanechnikov kernel $K(x) = 0.75(1 - x^2)$, $|x| \le 1$.
2. $b$: bandwidth.
3. At each $t$, only the event times in $\{T_j : t - b \le T_j \le t + b\}$ contribute to the sum.
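The Nelson-Aalen estimator above is easy to compute directly. The following is a minimal Python sketch (not part of the original handout; the function name is illustrative, and for simplicity tied event times are handled one at a time rather than grouped):

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen estimator of the cumulative hazard A(t).

    times  : observed times X_i = min(T_i, C_i)
    events : censoring indicators D_i = I(T_i <= C_i)
    Returns (event_times, A_hat) where A_hat accumulates dN(s)/Y(s)
    over the observed event times (ties handled one at a time).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times)
    times, events = times[order], events[order]
    n = len(times)
    event_times, increments = [], []
    for j, (t, d) in enumerate(zip(times, events)):
        if d == 1:
            at_risk = n - j               # Y(t) = #{i : X_i >= t} after sorting
            event_times.append(t)
            increments.append(1.0 / at_risk)
    return np.array(event_times), np.cumsum(increments)
```

For example, with observed times (2, 3, 5+, 7), where "+" marks a censored observation, the estimator jumps by 1/4, 1/3, and 1 at the three event times.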

4. $E\hat{\alpha}(t) = \frac{1}{b}\int K\big(\frac{t-s}{b}\big)\Pr(Y(s) > 0)\,dA(s)$.

Consider
$$\alpha^*(t) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\,dA^*(s) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)J(s)\,dA(s) \quad (\text{almost} = E\hat{\alpha}(t))$$
and
$$\tilde{\alpha}(t) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\,dA(s).$$
We can see that $\hat{\alpha}(t) - \alpha^*(t)$ is a martingale integral:
$$\hat{\alpha}(t) - \alpha^*(t) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\,d(\hat{A} - A^*)(s) = \frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\frac{J(s)}{Y(s)}\,dM(s),$$
and its expected predictable variation process is
$$\tau^2(t) = E(\hat{\alpha}(t) - \alpha^*(t))^2 = \frac{1}{b^2}\int K^2\Big(\frac{t-s}{b}\Big)E\Big(\frac{J(s)}{Y(s)}\Big)\alpha(s)\,ds = \frac{1}{b}\int_{-1}^{1} K^2(u)\,E\Big(\frac{J(t-bu)}{Y(t-bu)}\Big)\alpha(t-bu)\,du.$$
Therefore an estimator of the variance of $\hat{\alpha}(t)$ is
$$\hat{\tau}^2(t) = \frac{1}{b^2}\int K^2\Big(\frac{t-s}{b}\Big)\frac{J(s)}{Y^2(s)}\,dN(s).$$

3 Mean Integrated Squared Error and Optimal Bandwidth

To obtain the optimal global bandwidth, we choose the one minimizing the mean integrated squared error (MISE), defined as
$$\begin{aligned}
\mathrm{MISE}(\hat{\alpha}) &= E\int_{t_1}^{t_2} (\hat{\alpha}(t) - \alpha(t))^2\,dt = E\int_{t_1}^{t_2} [(\hat{\alpha}(t) - \tilde{\alpha}(t)) + (\tilde{\alpha}(t) - \alpha(t))]^2\,dt\\
&= \int_{t_1}^{t_2} E(\hat{\alpha}(t) - \tilde{\alpha}(t))^2\,dt + \int_{t_1}^{t_2} (\tilde{\alpha}(t) - \alpha(t))^2\,dt + 2\int_{t_1}^{t_2} (E\hat{\alpha}(t) - \tilde{\alpha}(t))(\tilde{\alpha}(t) - \alpha(t))\,dt\\
&= (\text{variance term}) + (\text{squared bias term}) + R_2.
\end{aligned}$$
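The kernel hazard estimate and its variance estimator can be sketched as follows (not part of the original handout; function names are illustrative, the Epanechnikov kernel is used, and ties are handled one at a time):

```python
import numpy as np

def epanechnikov(x):
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= 1, 0.75 * (1 - x**2), 0.0)

def kernel_hazard(t_grid, times, events, b):
    """Kernel hazard estimate  alpha_hat(t) = b^-1 sum_j K((t-T_j)/b) / Y(T_j)
    and variance estimate     tau2_hat(t)  = b^-2 sum_j K^2((t-T_j)/b) / Y(T_j)^2,
    summing over the observed event times T_j."""
    t_grid = np.asarray(t_grid, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times)
    times, events = times[order], events[order]
    n = len(times)
    alpha = np.zeros_like(t_grid)
    tau2 = np.zeros_like(t_grid)
    for j, (tj, d) in enumerate(zip(times, events)):
        if d == 1:
            y = n - j                     # Y(T_j), ties handled one at a time
            w = epanechnikov((t_grid - tj) / b)
            alpha += w / (b * y)
            tau2 += (w / (b * y))**2
    return alpha, tau2
```

With a single event at $T_j = 1$ among two subjects at risk and $b = 1$, the estimate at $t = 1$ is $K(0)/(b\,Y) = 0.75/2 = 0.375$.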

[Figure: BM data. Top panel: Nelson-Aalen cumulative hazard estimate (range 0.0 to 1.5) over time 0 to 700. Bottom panel: smoothed hazard estimate (range 0.000 to 0.005) with bandwidth = 100.]

[Figure: smoothed hazard estimates (range 0.000 to 0.005) over time 0 to 700, with bandwidth = 50 (top) and bandwidth = 150 (bottom).]

Let
$$R_2 = 2\int_{t_1}^{t_2} (E\hat{\alpha}(t) - \tilde{\alpha}(t))(\tilde{\alpha}(t) - \alpha(t))\,dt = 2\int_{t_1}^{t_2} R_1(t)(\tilde{\alpha}(t) - \alpha(t))\,dt$$
with
$$R_1(t) = E\hat{\alpha}(t) - \tilde{\alpha}(t) = -\frac{1}{b}\int K\Big(\frac{t-s}{b}\Big)\Pr(Y(s) = 0)\,dA(s) = -\int_{-1}^{1} K(u)\Pr(Y(t - b_n u) = 0)\,\alpha(t - b_n u)\,du,$$
so that
$$|R_1(t)| \le C_1 \sup_{t \in [t_1 - c,\, t_2 + c]} \Pr(Y(t) = 0), \qquad |R_2| \le C_2 \sup_{t \in [t_1 - c,\, t_2 + c]} \Pr(Y(t) = 0).$$

Assume
1. $\int_{-1}^{1} K(u)\,du = 1$, $\int_{-1}^{1} uK(u)\,du = 0$, $\int_{-1}^{1} u^2 K(u)\,du = k_2 > 0$;
2. there exists a sequence $\{a_n\}$ with $a_n \to \infty$ as $n \to \infty$ (in most cases $a_n = \sqrt{n}$);
3. there exists a continuous function $y$ such that $E\big(a_n^2 J(t)/Y(t)\big) \to 1/y(t)$ uniformly on $[t_1 - c, t_2 + c]$ as $n \to \infty$;
4. $\sup_{t \in [t_1 - c,\, t_2 + c]} \Pr(Y(t) = 0) = o(a_n^{-2})$.

Since
$$\tilde{\alpha}(t) - \alpha(t) = \int_{-1}^{1} K(u)(\alpha(t - b_n u) - \alpha(t))\,du = -b_n\alpha'(t)\int_{-1}^{1} uK(u)\,du + \tfrac12 b_n^2\alpha''(t)\int_{-1}^{1} u^2K(u)\,du + o(b_n^2) = \tfrac12 b_n^2\alpha''(t)k_2 + o(b_n^2),$$
the squared bias term is
$$\int_{t_1}^{t_2} (\tilde{\alpha}(t) - \alpha(t))^2\,dt = \tfrac14 b_n^4 k_2^2 \int_{t_1}^{t_2} (\alpha''(t))^2\,dt + o(b_n^4).$$

For the variance term:
$$E(\hat{\alpha}(t) - \tilde{\alpha}(t))^2 = E(\hat{\alpha}(t) - \alpha^*(t))^2 + E(\alpha^*(t) - \tilde{\alpha}(t))^2 + 2E(\hat{\alpha}(t) - \alpha^*(t))(\alpha^*(t) - \tilde{\alpha}(t)).$$
The first term on the right-hand side (see $\hat{\tau}^2(t)$) can be expressed as
$$(a_n^2 b_n)^{-1}\,\frac{\alpha(t)}{y(t)}\int_{-1}^{1} K^2(u)\,du + o((a_n^2 b_n)^{-1}),$$
and for the second term on the right-hand side,
$$\alpha^*(t) - \tilde{\alpha}(t) = -\int_{-1}^{1} K(u)\,I(Y(t - b_n u) = 0)\,\alpha(t - b_n u)\,du = o(a_n^{-2}).$$

By the Cauchy-Schwarz inequality, the last term on the right-hand side is of order $o((a_n^2 b_n^{1/2})^{-1})$, so the variance term can be written as
$$\int_{t_1}^{t_2} E(\hat{\alpha}(t) - \tilde{\alpha}(t))^2\,dt = (a_n^2 b_n)^{-1}\int_{-1}^{1} K^2(u)\,du \int_{t_1}^{t_2} \frac{\alpha(t)}{y(t)}\,dt + o((a_n^2 b_n)^{-1}).$$

All together,
$$\mathrm{MISE}(\hat{\alpha}) = \tfrac14 b_n^4 k_2^2 \int_{t_1}^{t_2} (\alpha''(t))^2\,dt + (a_n^2 b_n)^{-1}\int_{-1}^{1} K^2(u)\,du\int_{t_1}^{t_2} \frac{\alpha(t)}{y(t)}\,dt + o(b_n^4) + o((a_n^2 b_n)^{-1}).$$
To minimize the sum of the first two terms, the optimal bandwidth is
$$b_n = a_n^{-2/5}\, k_2^{-2/5} \left[\int_{-1}^{1} K^2(u)\,du \int_{t_1}^{t_2}\frac{\alpha(t)}{y(t)}\,dt\right]^{1/5} \left[\int_{t_1}^{t_2} (\alpha''(t))^2\,dt\right]^{-1/5}.$$

Two quantities remain to be estimated: $\alpha''(t)$ and $\int_{t_1}^{t_2} \alpha(t)/y(t)\,dt$. For the former,
$$\hat{\alpha}''(t) = \frac{1}{b^3}\int K''\Big(\frac{t-s}{b}\Big)\,d\hat{A}(s),$$
and an estimator of $\int_{t_1}^{t_2} \alpha(t)/y(t)\,dt$ is
$$a_n^2 \int_{t_1}^{t_2} \frac{d\hat{A}(t)}{Y(t)}.$$
Because the bias is
$$E(\hat{\alpha}(t)) - \alpha(t) \approx \tfrac12 b^2\,\alpha''(t)\,k_2,$$
the bias-corrected estimate is $\hat{\alpha}(t) - \tfrac12 b_n^2\,\hat{\alpha}''(t)\,k_2$.

3.1 Cross Validation Method

We can express the MISE as
$$\mathrm{MISE}(b) = E\int_{t_1}^{t_2} (\hat{\alpha}(t))^2\,dt - 2E\int_{t_1}^{t_2} \hat{\alpha}(t)\alpha(t)\,dt + \int_{t_1}^{t_2} (\alpha(t))^2\,dt,$$
a function of $b$. The first term is easy to calculate, the third term does not depend on $b$, and a consistent estimator of the second term is the cross-validation estimator
$$\frac{2}{b}\sum_{i \ne j} K\Big(\frac{T_i - T_j}{b}\Big)\frac{\Delta N(T_i)\,\Delta N(T_j)}{Y(T_i)\,Y(T_j)}.$$
Plotting the resulting estimate of $\mathrm{MISE}(b)$ against $b$ locates the optimal $b$.
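The cross-validation criterion can be sketched as follows (not part of the original handout; the function name and grid sizes are illustrative, and ties are handled one at a time). It drops the constant third term of the MISE and estimates the first two:

```python
import numpy as np

def epanechnikov(x):
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= 1, 0.75 * (1 - x**2), 0.0)

def cv_score(b, times, events, t1, t2, n_grid=200):
    """Cross-validation proxy for MISE(b): the integral of alpha_hat(t)^2 over
    [t1, t2] minus twice the pairwise estimate of the cross term."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times)
    times, events = times[order], events[order]
    n = len(times)
    ev = np.nonzero(events == 1)[0]
    T = times[ev]
    Y = (n - ev).astype(float)            # Y(T_j), ties handled one at a time
    # term 1: trapezoidal-rule integral of alpha_hat(t)^2 over [t1, t2]
    grid = np.linspace(t1, t2, n_grid)
    alpha = np.zeros_like(grid)
    for tj, y in zip(T, Y):
        alpha += epanechnikov((grid - tj) / b) / (b * y)
    a2 = alpha**2
    term1 = float(np.sum((a2[:-1] + a2[1:]) / 2 * np.diff(grid)))
    # term 2: (2/b) * sum_{i != j} K((T_i - T_j)/b) / (Y(T_i) Y(T_j))
    pair = epanechnikov((T[:, None] - T[None, :]) / b) / (Y[:, None] * Y[None, :])
    np.fill_diagonal(pair, 0.0)
    term2 = 2.0 * float(pair.sum()) / b
    return term1 - term2
```

In practice one evaluates `cv_score` over a grid of candidate bandwidths and takes the minimizer.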

3.2 Boundary Correction and Local Bandwidth

Now we allow both the bandwidth $b = b(t)$ and the kernel $K = K_t$ to depend on $t$:
$$\hat{\alpha}(t, b(t)) = \frac{1}{b(t)}\sum_j K_t\Big(\frac{t - T_j}{b(t)}\Big)\big(Y(T_j)\big)^{-1}.$$
Assume the support of the survival time distribution is $[0, R]$:
- interior points $I = \{t : b(t) \le t \le R - b(t)\}$
- left boundary region $B_L = \{t : 0 \le t < b(t)\}$
- right boundary region $B_R = \{t : R - b(t) < t \le R\}$

The kernels $K_t$ are polynomials
$$K_t(z) = \begin{cases} K_+(1, z) & t \in I,\\ K_+(t/b(t), z) & t \in B_L,\\ K_-((R-t)/b(t), z) & t \in B_R,\end{cases}$$
where $K_\pm : [0,1] \times [-1,1] \to \mathbb{R}$ are bounded kernel functions with
$$\int K_\pm(q,z)\,dz = 1,\quad \int K_\pm(q,z)\,z\,dz = 0,\quad \int K_\pm(q,z)\,z^2\,dz \ne 0,\quad K_-(q,z) = K_+(q,-z),$$
and some differentiability conditions. For example,
$$K_+(q,z) = \frac{12}{(1+q)^4}\,(z+1)\Big[z(1-2q) + \frac{3q^2 - 2q + 1}{2}\Big],$$
and $K_+(1,z) = \frac34(1 - z^2)$ is the Epanechnikov kernel.

For a local bandwidth, we minimize the local mean squared error to obtain $b(t)$:
$$\mathrm{MSE}(t, b(t)) = \hat{v}(t, b(t)) + \hat{\beta}^2(t, b(t)),$$
where $\hat{v}$ and $\hat{\beta}$ are variance and bias estimators,
$$\hat{v}(t, b(t)) = \frac{1}{n\,b(t)}\int K_t^2(u)\,\frac{\hat{\alpha}(t - b(t)u)}{\hat{y}(t - b(t)u)}\,du,$$
$$\hat{\beta}(t, b(t)) = \int \hat{\alpha}(t - b(t)u)\,K_t(u)\,du - \hat{\alpha}(t).$$

The following is the recommended algorithm to obtain $\hat{\alpha}(t)$ with $b(t)$.
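The moment conditions on the boundary kernel $K_+$ can be checked numerically. A minimal sketch (not part of the original handout; function names are illustrative) implements the polynomial kernel above and verifies $\int K_+(q,z)\,dz = 1$ and $\int z\,K_+(q,z)\,dz = 0$ on the support $[-1, q]$:

```python
import numpy as np

def k_plus(q, z):
    """Boundary kernel from the text, supported on [-1, q];
    k_plus(1, z) reduces to the Epanechnikov kernel 0.75*(1 - z^2)."""
    z = np.asarray(z, dtype=float)
    val = 12.0 / (1 + q)**4 * (z + 1) * (z * (1 - 2*q) + (3*q**2 - 2*q + 1) / 2)
    return np.where((z >= -1) & (z <= q), val, 0.0)

def moments(q, m=20001):
    """Trapezoidal-rule check of the zeroth and first moments on [-1, q]."""
    z = np.linspace(-1.0, q, m)
    k = k_plus(q, z)
    dz = np.diff(z)
    m0 = np.sum((k[:-1] + k[1:]) / 2 * dz)        # should be 1
    zk = z * k
    m1 = np.sum((zk[:-1] + zk[1:]) / 2 * dz)      # should be 0
    return m0, m1
```

Running `moments(q)` for several boundary positions $q$ confirms the conditions to within quadrature error.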

1. Choose $K_+(q,z)$ and an initial bandwidth $b_0$ to obtain $\hat{\alpha}(t)$.
1a. Alternatively, choose a parametric model and obtain the MLE $\hat{\alpha}(t)$.
2. Choose an equidistant grid of $m_l$ points $t_i$, $i = 1, \dots, m_l$, between 0 and $R$. Choose a grid of bandwidths $b_j$, $j = 1, \dots, l$, between some $b_1$ and $b_2$. Compute $\hat{v}(t_i)$ and $\hat{\beta}(t_i)$ for all bandwidths $b_j$ and obtain the minimizer $b(t_i)$ of $\mathrm{MSE}(t_i)$.
3. Obtain the final bandwidth using the boundary-modified smoother with bandwidth $\bar{b}_0 = b_0$ or $\frac32 b_0$:
$$\hat{b}(t) = \sum_i K_t\Big(\frac{t - t_i}{\bar{b}_0}\Big)\, b(t_i) \Big/ \sum_i K_t\Big(\frac{t - t_i}{\bar{b}_0}\Big).$$
4. Obtain the final estimate $\hat{\alpha}(t)$ with $b(t) = \hat{b}(t)$.

4 Large Sample Properties

Now we discuss the large sample properties of the kernel function estimator with global bandwidth.

Theorem IV.2.1 (pointwise consistency). Assume
1. $t$ is an interior point of the support,
2. $\alpha$ is continuous at $t$,
3. the bandwidth $b_n \to 0$ as $n \to \infty$,
4. there exists an $\epsilon > 0$ such that $\inf_{s \in [t-\epsilon,\, t+\epsilon]} b_n Y(s) \to^p \infty$ as $n \to \infty$.
Then $\hat{\alpha}(t) \to^p \alpha(t)$ as $n \to \infty$.

Proof. Since
$$|\hat{\alpha}(t) - \alpha(t)| \le |\hat{\alpha}(t) - \alpha^*(t)| + |\alpha^*(t) - \tilde{\alpha}(t)| + |\tilde{\alpha}(t) - \alpha(t)|,$$
we have to show
$$|\hat{\alpha}(t) - \alpha^*(t)| \to^p 0,\qquad |\alpha^*(t) - \tilde{\alpha}(t)| \to^p 0,\qquad |\tilde{\alpha}(t) - \alpha(t)| \to 0.$$
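Step 3 of the algorithm above, smoothing the gridwise bandwidths into a bandwidth function, is a simple kernel-weighted average. A minimal sketch (not part of the original handout; the function name is illustrative, and the interior Epanechnikov kernel is used in place of the boundary-modified $K_t$):

```python
import numpy as np

def epanechnikov(x):
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= 1, 0.75 * (1 - x**2), 0.0)

def smooth_bandwidths(t, grid_points, local_b, b_bar):
    """Step 3: smooth the gridwise MSE-minimizing bandwidths b(t_i) into a
    bandwidth function b_hat(t) by kernel-weighted averaging."""
    w = epanechnikov((t - np.asarray(grid_points, dtype=float)) / b_bar)
    return float(np.sum(w * np.asarray(local_b, dtype=float)) / np.sum(w))
```

If the gridwise bandwidths are constant, the smoother returns that constant; more generally it interpolates them smoothly in $t$.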

By Lenglart's inequality,
$$\Pr(|\hat{\alpha}(t) - \alpha^*(t)| > \eta) \le \frac{\delta}{\eta^2} + \Pr\Big(\frac{1}{b_n}\int_{-1}^{1} K^2(u)\,\frac{J(t - b_n u)}{Y(t - b_n u)}\,\alpha(t - b_n u)\,du > \delta\Big).$$
Since $\alpha$ and $K$ are bounded and $b_n Y(t) \to^p \infty$ in a neighborhood of $t$, the last term on the right-hand side can be made arbitrarily small, so $\hat{\alpha}(t) - \alpha^*(t) \to^p 0$. Moreover,
$$|\alpha^*(t) - \tilde{\alpha}(t)| \le \int_{-1}^{1} K(u)\{1 - J(t - b_n u)\}\,\alpha(t - b_n u)\,du \to^p 0$$
and
$$|\tilde{\alpha}(t) - \alpha(t)| \le \int_{-1}^{1} K(u)\,|\alpha(t - b_n u) - \alpha(t)|\,du \to 0.$$

Theorem IV.2.2 (uniform consistency). Assume
1. $[t_1, t_2]$ lies in the interior of the support,
2. $0 < t_1 < t_2 < t$ are fixed numbers,
3. $\alpha$ is continuous on $[0, t]$,
4. the kernel $K$ is of bounded variation,
5. the bandwidth $b_n \to 0$ as $n \to \infty$,
6. $b_n^{-2}\int_0^t \frac{J(s)}{Y(s)}\,\alpha(s)\,ds \to^p 0$,
7. $b_n^{-1}\int_0^t (1 - J(s))\,\alpha(s)\,ds \to^p 0$.
Then, as $n \to \infty$,
$$\sup_{s \in [t_1, t_2]} |\hat{\alpha}(s) - \alpha(s)| \to^p 0.$$

Proof. By integration by parts,
$$|\hat{\alpha}(t) - \alpha^*(t)| = \Big|\frac{1}{b_n}\int K\Big(\frac{t-s}{b_n}\Big)\,d(\hat{A} - A^*)(s)\Big| \le \frac{2}{b_n}\,V(K)\,\sup_{s \in [0,t]} |\hat{A}(s) - A^*(s)|,$$
where $V(K)$ is the total variation of $K$. Similarly to the proof of consistency of the Nelson-Aalen estimator, with assumption (6) the last term converges to zero.

Also, $\sup_{t \in [t_1, t_2]} |\alpha^*(t) - \tilde{\alpha}(t)| \to^p 0$ by assumption (7), and $\sup_{t \in [t_1, t_2]} |\tilde{\alpha}(t) - \alpha(t)| \to 0$ follows from the boundedness of $K$ and the continuity of $\alpha$.

Remark. Sufficient conditions for consistency:
- uniform consistency of the Nelson-Aalen estimator: $\inf_{s \in [0,t]} Y(s) \to^p \infty$;
- pointwise consistency of the kernel function estimator: $\inf_{s \in [0,t]} b_n Y(s) \to^p \infty$;
- uniform consistency of the kernel function estimator: $\inf_{s \in [0,t]} b_n^2 Y(s) \to^p \infty$.

Theorem IV.2.4 (asymptotic normality). Assume
1. $t$ is an interior point of the support,
2. $\alpha$ is continuous at $t$,
3. there exist positive constants $\{a_n\}$ increasing to $\infty$ as $n \to \infty$,
4. $b_n \to 0$ and $a_n^2 b_n \to \infty$ as $n \to \infty$,
5. there exists a function $y$, positive and continuous at $t$, such that for some $\epsilon > 0$
$$\sup_{s \in [t-\epsilon,\, t+\epsilon]} |a_n^{-2} Y(s) - y(s)| \to^p 0.$$
Then, as $n \to \infty$,
1. $a_n b_n^{1/2}(\hat{\alpha}(t) - \alpha(t)) \to^d N(0, \tau^2(t))$ with $\tau^2(t) = \frac{\alpha(t)}{y(t)}\int_{-1}^{1} K^2(u)\,du$;
2. $a_n^2 b_n\,\hat{\tau}^2(t) \to^p \tau^2(t)$;
3. for $t_1 \ne t_2$, $\hat{\alpha}(t_1)$ and $\hat{\alpha}(t_2)$ are asymptotically independent.

Proof. A. We have
$$a_n b_n^{1/2}(\hat{\alpha}(t) - \alpha^*(t)) = \int H(s)\,dM(s) \quad\text{with}\quad H(s) = a_n b_n^{-1/2}\, K\Big(\frac{t-s}{b_n}\Big)\frac{J(s)}{Y(s)}.$$
Using Rebolledo's martingale CLT,
$$\int H^2(s)\,Y(s)\,\alpha(s)\,ds = \frac{a_n^2}{b_n}\int K^2\Big(\frac{t-s}{b_n}\Big)\frac{J(s)}{Y(s)}\,\alpha(s)\,ds = \int_{-1}^{1} K^2(u)\,\frac{a_n^2\, J(t - b_n u)}{Y(t - b_n u)}\,\alpha(t - b_n u)\,du \to^p \frac{\alpha(t)}{y(t)}\int_{-1}^{1} K^2(u)\,du$$
as $n \to \infty$.

For any $\epsilon > 0$,
$$I(|H(s)| > \epsilon) = I\Big(K\Big(\frac{t-s}{b_n}\Big)\frac{J(s)\,a_n^2}{Y(s)} > \epsilon\, a_n b_n^{1/2}\Big)$$
converges to zero uniformly because $a_n b_n^{1/2} \to \infty$; therefore
$$\int H^2(s)\,Y(s)\,\alpha(s)\,I(|H(s)| > \epsilon)\,ds \to^p 0.$$
Also,
$$a_n b_n^{1/2}(\alpha^*(t) - \alpha(t)) \to^p 0.$$

B. Since
$$a_n b_n^{1/2}(\hat{\alpha}(t_i) - \alpha^*(t_i)) = \int H_i(s)\,dM(s) \quad\text{with}\quad H_i(s) = a_n b_n^{-1/2}\, K\Big(\frac{t_i - s}{b_n}\Big)\frac{J(s)}{Y(s)},$$
we have
$$\int H_1(s)\,H_2(s)\,Y(s)\,\alpha(s)\,ds = \frac{a_n^2}{b_n}\int K\Big(\frac{t_1 - s}{b_n}\Big)K\Big(\frac{t_2 - s}{b_n}\Big)\frac{J(s)}{Y(s)}\,\alpha(s)\,ds \to 0,$$
since for $b_n$ small enough the two kernels have disjoint supports.

Theorem IV.2.5. Assume all conditions of Theorem IV.2.4, and
1. $\alpha$ is twice continuously differentiable in a neighbourhood of $t$,
2. $\int_{-1}^{1} K(u)\,du = 1$, $\int_{-1}^{1} uK(u)\,du = 0$, $\int_{-1}^{1} u^2 K(u)\,du = k_2 > 0$,
3. $\limsup_n a_n^{2/5} b_n < \infty$.
Then
$$a_n b_n^{1/2}\Big(\hat{\alpha}(t) - \alpha(t) - \tfrac12 b_n^2\,\alpha''(t)\,k_2\Big) \to^d N(0, \tau^2(t)).$$

Proof. By a Taylor expansion (with $t^*$ between $t - b_n u$ and $t$),
$$a_n b_n^{1/2}\Big(\tilde{\alpha}(t) - \alpha(t) - \tfrac12 b_n^2\,\alpha''(t)\,k_2\Big) = a_n b_n^{1/2}\Big[\int_{-1}^{1} K(u)\,\alpha(t - b_n u)\,du - \alpha(t) - \tfrac12 b_n^2\,\alpha''(t)\,k_2\Big] = \tfrac12\, a_n b_n^{5/2}\Big[\int_{-1}^{1} u^2 K(u)\,\alpha''(t^*)\,du - \alpha''(t)\,k_2\Big] \to 0.$$