ESTIMATION IN A SEMI-PARAMETRIC TWO-STAGE RENEWAL REGRESSION MODEL

Similar documents
Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Efficiency of Profile/Partial Likelihood in the Cox Model

A note on L convergence of Neumann series approximation in missing data problems

NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA

POINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO

University of California, Berkeley

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

APPLICATIONS OF DIFFERENTIABILITY IN R n.

Measuring Ellipsoids 1

Statistical Properties of Numerical Derivatives

INTRODUCTION TO REAL ANALYTIC GEOMETRY

Lecture 3. Truncation, length-bias and prevalence sampling

Singular Integrals. 1 Calderon-Zygmund decomposition

STAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model

3. Some tools for the analysis of sequential strategies based on a Gaussian process prior

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

Supremum of simple stochastic processes

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true

arxiv: v1 [stat.ml] 5 Jan 2015

STAT 331. Martingale Central Limit Theorem and Related Results

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the functional Weibull-tail coefficient

Course 212: Academic Year Section 1: Metric Spaces

FUNCTIONAL ANALYSIS LECTURE NOTES: COMPACT SETS AND FINITE-DIMENSIONAL SPACES. 1. Compact Sets

A class of domains with fractal boundaries: Functions spaces and numerical methods

NOTES ON FRAMES. Damir Bakić University of Zagreb. June 6, 2017

Sobolev Spaces. Chapter 10

14 Higher order forms; divergence theorem

Analysing geoadditive regression data: a mixed model approach

Tools from Lebesgue integration

Multivariate Survival Data With Censoring.

UNIVERSITY OF CALIFORNIA, SAN DIEGO

STAT331. Cox s Proportional Hazards Model

Integral Jensen inequality

Goodness-of-fit tests for the cure rate in a mixture cure model

arxiv: v1 [math.pr] 22 May 2008

Sensitivity analysis of the expected utility maximization problem with respect to model perturbations

and BV loc R N ; R d)

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling

Survival Analysis Math 434 Fall 2011

Supplementary Materials for Tensor Envelope Partial Least Squares Regression

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model.

A LOCALIZATION PROPERTY AT THE BOUNDARY FOR MONGE-AMPERE EQUATION

MORE NOTES FOR MATH 823, FALL 2007

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Model reduction for linear systems by balancing

Iterative methods for positive definite linear systems with a complex shift

Mathematics for Economists

Asymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics.

Product-limit estimators of the survival function with left or right censored data

The speed of propagation for KPP type problems. II - General domains

Optimization and Optimal Control in Banach Spaces

Elliptic PDEs of 2nd Order, Gilbarg and Trudinger

1 Fourier transform as unitary equivalence

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results

Chapter 3 : Likelihood function and inference

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

B553 Lecture 3: Multivariate Calculus and Linear Algebra Review

STAT Sample Problem: General Asymptotic Results

Stochastic Design Criteria in Linear Models

Boundedly complete weak-cauchy basic sequences in Banach spaces with the PCP

Laplace s Equation. Chapter Mean Value Formulas

ON A CLASS OF COVARIANCE OPERATORS U =

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Lecture 2: Linear Algebra Review

Definable Extension Theorems in O-minimal Structures. Matthias Aschenbrenner University of California, Los Angeles

Superconvergence of discontinuous Galerkin methods for 1-D linear hyperbolic equations with degenerate variable coefficients

Modelling geoadditive survival data

The Canonical Gaussian Measure on R

Convergence rates of moment-sum-of-squares hierarchies for volume approximation of semialgebraic sets

Chapter 7. Hypothesis Testing

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Semiparametric Regression

PERTURBATION THEORY FOR NONLINEAR DIRICHLET PROBLEMS

MODERATE DEVIATIONS IN POISSON APPROXIMATION: A FIRST ATTEMPT

Translation Invariant Experiments with Independent Increments

Fuchsian groups. 2.1 Definitions and discreteness

11 COMPLEX ANALYSIS IN C. 1.1 Holomorphic Functions

Numerical Solutions to Partial Differential Equations

Chair Susan Pilkington called the meeting to order.

Lecture 5: Hodge theorem

HASSE-MINKOWSKI THEOREM

Finite difference method for elliptic problems: I

Additional proofs and details

Approximation of complex algebraic numbers by algebraic numbers of bounded degree

Stochastic Comparisons of Order Statistics from Generalized Normal Distributions

Finite difference method for heat equation

9 Sequences of Functions

Regularity of the density for the stochastic heat equation

Convergence rates in weighted L 1 spaces of kernel density estimators for linear processes

Selected Exercises on Expectations and Some Probability Inequalities

Abstract. 1. Introduction

Supplementary Notes March 23, The subgroup Ω for orthogonal groups

Math113: Linear Algebra. Beifang Chen

Multistate models in survival and event history analysis

Root statistics of random polynomials with bounded Mahler measure

f(x) f(z) c x z > 0 1

arxiv: v3 [math.st] 29 Nov 2018

An Inverse Problem for Gibbs Fields with Hard Core Potential

Transcription:

Statistica Sinica 19(29): Supplement S1-S1 ESTIMATION IN A SEMI-PARAMETRIC TWO-STAGE RENEWAL REGRESSION MODEL Dorota M. Dabrowska University of California, Los Angeles Supplementary Material This supplement has two sections. In Section S1, we give a derivation of the information bound for estimation of the Euclidean component of the model. In Section S2, we provide the proof of Lemma A.1 as well as some details of the Hoeffding expansion used in the proof of Propositions 3.1 and 3.2. S1.Information bound In the following the Euclidean parameter of interest is denoted by ξ = (β,θ), and we let η = (A,A c,a 1c,A 2c,µ) be the nuisance parameter. We let ξ and η = (A (),A () c,a() 1c,A() 2c,µ() ) be the true parameter values. Without loss of generality, we assume that the cumulative hazards of the censoring distributions are absolutely continuous with respect to (A () ic,i =,1,2) and denote the corresponding derivatives by (α ic,i =,1,2). Likewise, we assume that the distribution µ of the covariates has density m with respect to the true distribution µ (). The condition 2.1 (ii) implies that for h H c we have E [Y h(u) Z = z] = S (u z)g (u z), where G (x z) is the conditional survival function of C 1 given Z = z. Its cumulative hazard function is A () c and we have Q h (x 1,z) = E N h (x 1 )1(Z z) = x1 E [Y h (u) Z = z]a () h (du z)µ() (dz) for h H c. The condition 2 (iii) implies also that for h Hc i,i = 1,2 we have E [Y h (u) Z = z,x 1 = x 1,J = i,δ 1 = 1] = S i (u x 1,z)G i (u x 1,z), where G i (u x 1,z) = P(C 2 X 1 > u Z = z,j = i,x 1 = x,x 1 C 1 ) is the survival function corresponding to A () ic. It follows that

S2 DOROTA M. DABROWSKA Q h (x 2,x 1,z) = E N h (x 2 )N i (x 1 )1(Z z) = x2 x1 for h H c i. E Y h (u) X 1 = u 1,J = i,δ 1 = 1,Z = z)a () h (du u 1,z)Q i (d(u 1,z)) We shall need the following moment identities. Lemma S1.1 Let ϕ(w) = {ϕ h (X 1,Z h ),ϕ h (X 2,Z h ) : h H c,h Hi c,i = 1,2} be a vector of measurable functions. Define I h (ϕ) = ϕ h (u,z h )M h (du) if h H c. Then I h (ϕ) L q (P),q = 1,2 if and only if E ϕ h (u,z h ) q N h (du) = E Y h (u) ϕ h (u,z h ) q A () h (du,z h) < for all h H c. In addition, E I h (ϕ) =, and E I h (ϕ) 2 = E Y h (u)ϕ h (u,z h ) 2 A () h (du Z h) if h H, = E Y h (u)ϕ h (u,z h ) 2 [1 A () h ( u Z h)]a () h (du Z h) if h = (i,c),i =,1,2. For any h,h H such that h h, we also have E I h (ϕ)i h (ϕ) =. This lemma follows easily by noting that under conditions 2.1 the processes x ϕ h (u,z)m h (du),x,h H c form orthogonal martingales with respect to the filtration F x = σ(n h (u),z,y h (u+) : u x,h H c ), and similarly, the processes x ϕ h (u,z,x 1 )M h (du),x,h Hi c,i = 1,2 form orthogonal martingales with respect to the filtration F ix = σ(n h (u),y h (u+),z1(δ 1 = 1,J 1 = i),x 1 1(δ 1 = 1,J 1 = i) : u x,h Hi c ). Using direct calculation, it is also easy to verify that the processes are orthogonal. As in Nan, Edmond and Wellner (24), in the following we assume that the censoring distributions are continuous. The condition 2.1 implies that each

SEMI-PARAMETRIC RENEWAL MODEL S3 subect contributes to the complete data log-likelihood the sum L(ξ,α) + L c (α ic,i =,1,2) + L Z (m), where the three terms are given by L(ξ,α) = h H + h H h H [log q h (Z h,u,θ) + r(z,u,β)]n h (du) + log α(u)n h (du) Y h (u)q h (Z h,u,θ)e r(z,u,β) α(u)du, L c (α ic,i =,1,2) = i= i= L Z (m) = m(z). log(α ic (du Z ic ))N ic (du) Y ic (u)α ic (u Z ic )A () ic (du Z ic), The score function for the parameter of interest ξ and the nuisance parameter η are l 1[ξ](W) = l 2 [a](w) = l 3[b c ](W) = l 4[b 1c ](W) = l 5 [b 2c](W) = h H h H ϕ h (u,z h,ξ )M h (du), a(u)m h (du), a L 2 ( h H(Q h ), b c (u,z)m ic (du), b c L 2 (Q c ), b 1c (u,z 1c )M 1c (du), b 1c L 2 (Q 1c ), b 2c (u,z 2c )M 2c (du), b 2c L 2 (Q 2c ), l 6[c](W) = c(z), c L 2(µ () ). Here l 1 [ξ] is the derivative of the log-likelihood with respect to the Euclidean parameter ξ = (θ,β), From Section 2, ϕ h (Z h,u,ξ) = [ϕ 1h (Z h,u,ξ),ϕ 2h (Z h,u,ξ)] T where ϕ 1h (Z h,u,ξ) = ṙ(z,u,β),

S4 DOROTA M. DABROWSKA ϕ 2h (Z h,u,ξ) = For h H c, Z h = Z, and we have q h q h (Z h,u,θ), h H. Further, ϕ 2h (Z h,u,ξ) = q h q h (Z,u,θ) for h = (,i),i = 1,2, = q 1 + q 2 1 q 1 q 2 (Z,u,θ) for h = (,3). a(u) = ε log α ε(u) ε=, b ic (u,z ic ) = c(z) = log α ic,ηi (du Z ic ) η ηi =,i =,1,2, i κ log m κ(z) κ= for regular parametric submodels {α ε }, {α ic,ηi,i =,1,2} and {m κ } passing through the parameters α, α ic and m when ε = κ = η i =,i =,1,2. The tangent spaces are Ṗi = [ l i ],i = 1,...,6 and [α] is the closed linear span generated by the score α. Using Lemma S1.1, it is easy to see that the spaces Ṗ i,i = 2,...,6 are mutually orthogonal, and Ṗ1 is orthogonal to Ṗi,i = 3,...,6. The following proposition generalizes Propositions 3.1 and 3.2 in Nan, Edmond and Wellner (24). Proposition S1.2 (i) Define Ṗ = { h H c Then Ṗ = L 2 (P). g h (u,z h )M h (du)g (Z) : g h L 2 (Q h ),h H c,g L 2 (m)}. (ii) Let c(w) = {c h (X 1,Z h ),c h (X 2,Z h ) : h H,h H i,i = 1,2} be a vector of functions such that c h L 2 (Q h ) for each h H. Define B(c) = [c h e c ](u)m h (du) h H by setting e c (u) = s (1) c (u)/s (u), where s (1) c (u) = h H E Y h (u)c h (u,z h )q h (u,z h,θ )e r(z,u,β ),

SEMI-PARAMETRIC RENEWAL MODEL S5 s (u) = h E Y h (u)q h (u,z h,θ )e r(z,u,β ). Then B(c) Ṗη L 2 (P) and Ṗ η = ( 6 Ṗ) = {B(c) : c h L 2 (Q h ),h H}. When specialized to the renewal model considered in the paper, this Proposition implies that the efficient score function of estimation of the parameter ξ = (θ,β) corresponds to the choice of c(w) = {ϕ h (Z h,u,ξ) : h H}, which is quite similar to the standard Cox regression. Note, however, that the model assumes that the functions r(z,u,β) and q h (Z h,u,θ),h H have support not depending on the unknown parameters. In the absence of covariates, this assumption fails for instance in the parametric Marshall- Olkin model (1967), where n rate of convergence of the asymptotically efficient estimators applies only to the interior of the parameter set. Proof. Recall that Z h = Z, for h H c and Z h = (X 1,Z) for h Hi c,i = 1,2. Any function f(w) L 2 (P) can be represented as a sum For = 1,2 define f(w) = δ 1 δ 2 1(J 1 = )f 3 (X 2,X 1,Z) + δ 1 (1 δ 2 ) 1(J 1 = )f c (X 2,X 1,Z) + δ 1 1(J 1 = 3)f 3 (X 1,Z) + (1 δ 1 )f c (X 1,Z). f (X 1,Z) = E [δ 2 f 3 (X 2,X 1,Z) + (1 δ 2 )f c (X 2,X 1,Z) X 1,Z,δ 1 = 1,J = ]. Then f(w) = I 1 (W) + I 2 (W), 3 I 1 (W) = δ 1 1(J = )f (X 1,Z) + (1 δ 1 )f c (X 1,Z), I 2 (W) = δ 1 δ 2 1(J = )f 3 (X 1,X 2,Z) + (1 δ 2 )δ 1 1(J = )f c (X 1,X 2,Z) δ 1 1(J = )f (X 1,Z).

S6 DOROTA M. DABROWSKA Set 3 R 1 [f](x 1,z) = E [δ 1 f (X 1,Z) X 1 > x 1,Z = z] + E [(1 δ 1 )f c (X 1,Z) X 1 > x 1,Z = z] and R 2 [f](x 2,x 1,z) = E [δ 2 f 3 (X 2,X 1,Z) X 2 > x 2,X 1 = x 1,Z = z,j 1 =,δ 1 = 1] + E [(1 δ 2 )f 1 (X 2,X 1,Z) X 2 > x 2,X 1 = x 1,Z = z,j 1 =,δ 1 = 1] for = 1,2. Finally, let g (Z) = E f(w) Z, g h (X 1,Z) = f (X 1,Z) R 1 [f](x 1,Z), h H c, g h (X 2,X 1,Z) = f h (X 2,X 1,Z) R 2 [f](x 2,X 1,Z), h H c, = 1,2. We have h H R 1 [f](x 1,Z) g (Z) = c x 1 f h (u,z)q h (du,z) 1 h H c Q g (Z) h(du,z) = x1 [f h (u,z) R 1 [f](u,z)]a () h (du Z). Therefore I 1 (W) = h H c 3 δ 1 1(J = )f (X 1,Z) + (1 δ 1 )f c (X 1,Z) R 1 [f](x 1,Z) ( ) + R 1 [f](x 1,Z) g (Z) + g (Z) = g h (u,z)m h (du) + g (Z) = g h (u,z h )M h (du) + g (Z) h H c h H c Similarly, for = 1,2, R 2 [f](x 2,X 1,Z) f (X 1,Z) = x2 [f h (u,x 1,Z) R 2 [f](u,x 1,Z)]A () h (du X 1,Z) h H c

SEMI-PARAMETRIC RENEWAL MODEL S7 and δ 1 δ 2 1(J = )f 1 (X 2,X 1,Z) + (1 δ 2 )δ 1 f c (X 1,X 2,Z) = = δ 1 δ 2 1(J = )f 1 (X 2,X 1,Z) + (1 δ 2 )δ 1 f c (X 1,X 2,Z) R 2 [f](x 2,X 1,Z) ( ) + R 2 [f](x 2,X 1,Z) f (X 1,Z) + f (X 1,Z) = g h (u,x 1,Z)M h (du) + f (X 1,Z)δ 1 1(J = ). h H c Summing over we obtain I 2 (W) = g h (u,x 1,Z)M h (du) = h H c h H c g h (u,z h )M h (du). It follows that f(w) Ṗ. To show part (ii) of the proposition, let b(w) = c h dm h. h H Then Π(b Ṗ2) = h a (u)dm h (u) for a square integrable function a such that E ( adm h )( [c h a ]dm h ) = h H h H for any a L 2 ( h H Q h). Orthogonality implies that the left side is equal to a s )A(du) (as (1) c so that a = s (1) c /s. The proof can be completed as in Nan et al. (24) and using that B(c) is orthogonal to 6 =3 Ṗ. S2. Proof of Lemma A.1 Before showing this lemma, we recall that a class of functions G defined on some measure space (Ω, A) is Euclidean for envelope G if g G for all g G, and there exist constants A and V such that N(ε G L2 (P), G, L2 (P)) (A/ε) V for all ε (,1) and all probability measures P such that G L2 (P) <. Here L2 (P) is the L 2 (P) norm and N(η, G, L2 (P)) is the minimal number of L 2 (P) balls of radius η covering the class G. In the case of classes G n changing with n, the Euclidean constants A and V are taken to be independent of n.

S8 DOROTA M. DABROWSKA It is easy to verify that the condition E G(W m ) p < implies E G p (W m )1(G(W m ) n m/p ), n m(2 p)/p E G 2 (W m )1(G(W m ) < n m/p ). Let g 1n (W m ) = g(w m )1(G(W m ) < n m/p ), g 2n (W m ) = g(w m )1(G(W m ) n m/p ). Since g is a canonical kernel, we have U n,m (g) = U n,m (π m [g 1n ]) + U n,m (π m [g 2n ]). Define G 1n (W m ) = A G 2n (W m ) = A E A G(W m )1(G(W m ) < n m/p ), E A G(W m )1(G(W m ) n m/p ). Then G 1n (W m ) and G 2n (W m ) form envelopes for the classes of truncated canonical kernels G 1n = {π m [g 1n ] : g G} and G 2n = {π m [g 2n ] : g G}, respectively. We have E n m(p 1)/p) sup U n,m (π m [g 2n ]) n m(p 1)/p E G 2n (W m ) g 2n G 2n Cn m(p 1)/p E G(W m )1(G(W m ) n m/p ) CE G p (W m )1(G(W m ) n m/p ) and the right-hand side tends to. We also have E n m(p 1)/p sup U n,m (π m [g 1n ]) = E n m/2 sup U n,m (π m [g 1n /n m(2 p)/2p ]). g 1n G 1n g 1n G 1n The class G 1n is Euclidean for the envelope G 1n (W m ). By the randomization Theorem 3.5.3 and Corollary 5.1.8 of de la Peña and Giné (1999) or using similar developments as on page 254-255 of their text, the right-hand side of the above display is of order KE U n,m (G 2 1n (W m)/n m(2 p)/p ) K[E [G 2 1n (W m)/n m(2 p)/p ] 1/2 = KC[n m(2 p)/p E [G 2 (W m )1(G(W m )] 1/2.

SEMI-PARAMETRIC RENEWAL MODEL S9 Here K is a constant depending only on m and the VC-characteristics of the class G 1n, but not n, while C is bounded by 2 m. This completes the proof. Finally, we give some more details of the Hoeffding decomposition used in the proofs of Lemmas A.2 and A.3. In calculations given below, we use A = A, the true baseline cumulative hazard function. In Lemma A.2, we consider first the statistic U n,2 (g (2) ). For i, we have E g (2) (W i,w ) =, E {1} g (2) (W i,w ) = E [g (2) (W i,w ) W i ] = h τ [ ] ϕ ih s () s (1) (u,ξ )N hi (du), s () E {2} g (2) (W i,w ) = E [g (2) (W i,w ) W ] = = [ τ (s (1) S () ) S (1) ] s () h s () (u,ξ )A(du) Thus U n,2 (g (2) ) = U n,1 (E {1} g (2) ) + U n,1 (E {2} g (2) ) + U 2,n (π 2 [g (2) ]) = 1 τ ϕ ih s () s (1) n s () (u,ξ )M ih (du) + U 2,n (π 2 [g (2) ]). i h Next, we consider the statistics U n,3 (g (3) ). For triplets (i,,k) of distinct indices, we have E g (3) (W i,w,w k ) =. In addition, E {13} g (3) (W i,w,w k ) = E g (3) (W i,w,w k ) W i,w k = [ ] τ ϕ ih s () s (1) S () k s () h s () s () (u,ξ )N ih (du) E {23} g (3) (W i,w,w k ) = E g (3) (W i,w,w k ) W,W k [ τ s (1) S () S (1) ] s () = s () ( S() k s () s () ) (u,ξ )A(du) and E A g (3) (W i,w,w k ) = for all other proper subsets of the index set {1,2,3}. Hence U n,3 (g (3) ) = U n,2 (E {13} g (3) ) + U n,2 (E {23} g (3) ) + U n,3 (π 3 [g (3) ]). Finally, for any quadruplet (i,,k,l) of distinct indices, we have E g (4) (W i,w,w k,w l ) =, and E 134 g (4) (W i,w,w k,w l ) = E [g (4) (W i,w,w k,w l ) W i,w k,w l ]

S1 DOROTA M. DABROWSKA = h τ [ ϕ h s () + s (1) ] s () k s () ( S() s () E 234 g (4) (W i,w,w k,w l ) = E [g (4) (W i,w,w k,w l ) W,W k,w l ] = τ [s (1) S () + s () S (1) ] s () k s () ( S() s () E 34 g (4) (W i,w,w k,w l ) = E [g (4) (W i,w,w k,w l ) W k,w l ] = 2 τ [s (1) ( S() k s () s () )( S() s () l s () )N hi (du), )( S() s () l s () )A(du) )( S l s s () )A(du), and E A g (4) = for all other proper subsets of the index set {1,2,3,4}. Hence U n,4 (g (4) ) = U n,2 (E {34} g (4) ) + A={134},{234} U n,3 (π 3 [E A g (4) ]) + U n,4 (π 4 [g (4) ]). In Lemma A.3, the Hoeffding decomposition of the statistics U n,p (f (p) ξ ),p = 2,3,4 is quite analogous. On the other hand the remainder term is given by rem(ξ) = 3 p=1 rem p(ξ), where rem 1 (ξ) is given by (4.3), and rem 2 (ξ) = 1 n τ [ h ϕ h(u,ξ)(s () h (u,ξ) S() h (u,ξ )) n i=1 S () (u,ξ ) ] (ξ ξ ) T S(2) (u,ξ ) S () N i (du), (u,ξ ) rem 3 (ξ) = 1 n τ [ ( ) S (1) S () (u,ξ) n i=1 S ()(u,ξ) S () (u,ξ ) 1 ( ) 2 (ξ ξ ) T S (1) (u,ξ )] N i (du). S () These two terms are easily verified to be order o p ( ξ ξ ).