Estimation and Inference of Quantile Regression for Survival Data under Biased Sampling


Supplementary Materials: Proofs of the Main Results

S1 Verification of the weight function $v_i(t)$ for the length-biased sampling scheme

Here we provide the justification of (11), which is introduced in Section 3. Observe that
\begin{align*}
f_{\tilde T,\Delta}(t, \delta = 1 \mid A, Z)
&= P\{\tilde T \in (t, t + dt), \Delta = 1 \mid A, Z\}/dt \\
&= I(t \ge A)\, f_{A,V}(A, t - A \mid Z)\, S_c(t - A \mid Z) \\
&= I(t \ge A)\, f(t \mid Z)\, S_c(t - A \mid Z)\, \frac{1}{\mu(Z)} \\
&= I(t \ge A)\, P\{T \in (t, t + dt), C > t \mid Z\}/dt \cdot \frac{1}{\mu(Z)} \\
&= I(t \ge A)\, f_{T,\Delta}(t, \delta = 1 \mid Z)\, \frac{1}{\mu(Z)},
\end{align*}
and
\begin{align*}
f_{\tilde T,\Delta}(t, \delta = 0 \mid A, Z)
&= P\{\tilde T \in (t, t + dt), \Delta = 0 \mid A, Z\}/dt \\
&= I(t \ge A) \int_t^\infty f_{A,V}(A, s - A \mid Z)\, ds \; g_c(t - A \mid Z) \\
&= I(t \ge A) \int_t^\infty f(s \mid Z)\, ds \; g_c(t - A \mid Z)\, \frac{1}{\mu(Z)} \\
&= I(t \ge A)\, P\{T > t, C \in (t, t + dt) \mid Z\}/dt \cdot \frac{1}{\mu(Z)} \\
&= I(t \ge A)\, f_{T,\Delta}(t, \delta = 0 \mid Z)\, \frac{1}{\mu(Z)}.
\end{align*}
It follows that
\[
v_i(t) = \frac{f_{T,\Delta}(\tilde T_i, \Delta_i \mid Z_i)}{f_{\tilde T,\Delta}(\tilde T_i, \Delta_i \mid A_i, Z_i)} \cdot
\frac{f_{\tilde T,\Delta}(t, 1 \mid A_i, Z_i)}{f_{T,\Delta}(t, 1 \mid Z_i)} = I(A_i \le t).
\]
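The second equality in each display above uses the familiar representation of the joint density of the truncation time $A$ and the residual survival time $V = T - A$ under stationary length-biased sampling; for completeness, we record that standard fact here (stated under the stationarity assumption that, as we read it, Section 3 imposes):
\[
f_{A,V}(a, v \mid Z) = \frac{f(a + v \mid Z)}{\mu(Z)}, \qquad a, v > 0, \qquad
\mu(Z) = \int_0^\infty t\, f(t \mid Z)\, dt,
\]
so that $f_{A,V}(A, t - A \mid Z) = f(t \mid Z)/\mu(Z)$ and $\int_t^\infty f_{A,V}(A, s - A \mid Z)\, ds = \int_t^\infty f(s \mid Z)\, ds / \mu(Z)$, which are exactly the substitutions made above.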

S2 Proof of Theorems 1 and 2

For the biased sampling schemes introduced in Sections 2.2 and 3, let $f_T(t \mid Z)$ denote the conditional density function of $T$, i.e., $f_T(t \mid Z) = \sum_{\delta = 0,1} f_{T,\Delta}(t, \delta \mid Z)$, where $f_{T,\Delta}(t, \delta \mid Z)$ is defined in (8) of Section 2.2. For a vector $a$, let $a^{\otimes 2}$ denote $a a^\top$ and $\|a\|$ denote the Euclidean norm of $a$. For $b \in \mathbb{R}^p$, define
\begin{align*}
m(b) &= E\{Z N(e^{Z^\top b})\}, &
\tilde m(b) &= E\{Z v(e^{Z^\top b}) Y(e^{Z^\top b})\}, \\
m_n(b) &= \frac{1}{n}\sum_{i=1}^n Z_i N_i(e^{Z_i^\top b}), &
\tilde m_n(b) &= \frac{1}{n}\sum_{i=1}^n Z_i v_i(e^{Z_i^\top b}) Y_i(e^{Z_i^\top b}), \\
B(b) &= E\{Z^{\otimes 2} f_{T,\Delta}(e^{Z^\top b}, 1 \mid Z) \exp(Z^\top b)\}, &
J(b) &= E\{Z^{\otimes 2} v(e^{Z^\top b}) f_T(e^{Z^\top b} \mid Z) \exp(Z^\top b)\}.
\end{align*}
For $d > 0$, define the set $\mathcal{B}(d)$ as
\[
\mathcal{B}(d) = \Big\{ b \in \mathbb{R}^p : \inf_{\tau \in (0, \tau_u]} \| m(b) - m\{\beta_0(\tau)\} \| \le d \Big\},
\]
where $\beta_0(\tau)$ is the true parameter value, $\tau \in (0, \tau_u]$, and $\tau_u$ satisfies Condition C4 below.
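For concreteness, here is a minimal numerical sketch of the empirical quantities $m_n(b)$ and $\tilde m_n(b)$. It assumes the usual counting-process notation of the main text, $N_i(t) = I(\tilde T_i \le t, \Delta_i = 1)$ and $Y_i(t) = I(\tilde T_i \ge t)$, and a generic user-supplied weight $v_i(\cdot)$; the function names and interfaces are ours, for illustration only.

\begin{verbatim}
import numpy as np

def m_n(b, Z, T_obs, delta):
    """Empirical m_n(b) = n^{-1} sum_i Z_i N_i(exp(Z_i'b)).

    Assumes N_i(t) = I(T_i <= t, Delta_i = 1).
    Z: (n, p) covariates; T_obs: (n,) observed times; delta: (n,) indicators.
    """
    t_b = np.exp(Z @ b)                      # exp(Z_i' b)
    N = (T_obs <= t_b) & (delta == 1)        # N_i evaluated at exp(Z_i' b)
    return (Z * N[:, None]).mean(axis=0)

def m_tilde_n(b, Z, T_obs, v):
    """Empirical tilde m_n(b) = n^{-1} sum_i Z_i v_i(exp(Z_i'b)) Y_i(exp(Z_i'b)).

    v is a user-supplied weight function v(t, i) returning v_i(t); for the
    length-biased scheme of Section S1 it would be v(t, i) = 1{A_i <= t}.
    """
    t_b = np.exp(Z @ b)
    Y = (T_obs >= t_b).astype(float)         # at-risk indicator Y_i
    w = np.array([v(t_b[i], i) for i in range(len(T_obs))])
    return (Z * (w * Y)[:, None]).mean(axis=0)
\end{verbatim}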

Furthermore, consider the setting in Section 3.2. For $b \in \mathbb{R}^p$, define
\begin{align*}
m^*(b) &= E\{Z \psi(\tilde T) N(e^{Z^\top b})\}, &
\tilde m^*(b) &= E\{Z \psi(e^{Z^\top b}) v(e^{Z^\top b}) Y(e^{Z^\top b})\}, \\
m_n^*(b) &= \frac{1}{n}\sum_{i=1}^n Z_i \psi(\tilde T_i) N_i(e^{Z_i^\top b}), &
\tilde m_n^*(b) &= \frac{1}{n}\sum_{i=1}^n Z_i \psi(e^{Z_i^\top b}) v_i(e^{Z_i^\top b}) Y_i(e^{Z_i^\top b}), \\
B^*(b) &= E\big[ \{Z \psi(e^{Z^\top b})\} Z^\top f_{T,\Delta}(e^{Z^\top b}, 1 \mid Z) \exp(Z^\top b) \big], &
J^*(b) &= E\big[ \{Z \psi(e^{Z^\top b})\} Z^\top v(e^{Z^\top b}) f_T(e^{Z^\top b} \mid Z) \exp(Z^\top b) \big].
\end{align*}

We assume the following conditions for the theoretical derivations.

C1: $Z$ is uniformly bounded, i.e., $\sup_i \|Z_i\| < \infty$, and the matrix $E(Z^{\otimes 2})$ is positive definite.

C2: $m\{\beta_0(\tau)\}$ is a Lipschitz continuous function of $\tau \in (0, \tau_u]$, and $f_T(t \mid z)$ and $f_{T,\Delta}(t, 1 \mid z)$ are bounded above and continuous uniformly in $t$ and $z$.

C3: There exists $d_0 > 0$ such that for $b \in \mathcal{B}(d_0)$ and any $Z$, $f_{T,\Delta}(e^{Z^\top b}, 1 \mid Z) > 0$ and $\|J(b) B(b)^{-1}\|$ is uniformly bounded.

C4: $\inf_{\tau \in [\tau_l, \tau_u]} \mathrm{eigmin}\, B\{\beta_0(\tau)\} > 0$ for any $\tau_l \in (0, \tau_u]$, where $\mathrm{eigmin}(\cdot)$ denotes the minimum eigenvalue of a matrix.

C5: The class of weight functions $\{(Z, A, \tilde T, \Delta) \mapsto v(e^{Z^\top b}, A, \tilde T, \Delta, Z);\; b \in \mathbb{R}^p\}$ is Glivenko-Cantelli and uniformly bounded, and $\{(Z, A, \tilde T, \Delta) \mapsto v(e^{Z^\top b}, A, \tilde T, \Delta, Z);\; b \in \mathcal{B}(d_0)\}$ belongs to a Donsker class.

C6: The weight function $\psi(t)$ is positive, bounded and differentiable; $\{(Z, A, \tilde T, \Delta) \mapsto \psi(e^{Z^\top b}, A, \tilde T, \Delta, Z);\; b \in \mathbb{R}^p\}$ is Glivenko-Cantelli and $\{(Z, A, \tilde T, \Delta) \mapsto \psi(e^{Z^\top b}, A, \tilde T, \Delta, Z);\; b \in \mathcal{B}(d_0)\}$ belongs to a Donsker class. The matrix $W(\beta_0, \tau)$ is nonsingular, $\inf_{t \in [\tau_l, \tau]} \mathrm{eigmin}\big[ B^*\{\beta_0(t)\}^\top W(\beta_0, \tau)^{-1} B^*\{\beta_0(t)\} \big] > 0$ for any $\tau_l \in (0, \tau]$, and $\big\| J^*(b) \big[ B^*(b)^\top W(\beta_0, \tau)^{-1} B^*(b) \big]^{-1} \big\|$ is uniformly bounded for $b \in \mathcal{B}(d_0)$.

Conditions C1-C4 are mild assumptions on the covariates $Z$, the underlying regression quantile process $\beta_0(\tau)$, and the density functions associated with the observed data $(\tilde T, \Delta)$. The boundedness of the covariates assumed in C1 is commonly imposed in survival models, although it can be relaxed at the cost of additional technical complexity. For Conditions C2 and C3, the boundedness and uniform continuity of $f_T(t \mid z)$ and $f_{T,\Delta}(t, 1 \mid z)$ are reasonable in many biased sampling problems. For instance, they hold for a case-cohort study if the density functions of the survival and censoring times are bounded and uniformly continuous. Similarly, for the length-biased sampling introduced in Section 3, they hold if the density functions of $T$ and $C$ are bounded and uniformly continuous. Under these conditions, it can be verified that $m(b)$ and $\tilde m(b)$ are differentiable, and $B(b) = \partial m(b)/\partial b$ and $J(b) = \partial \tilde m(b)/\partial b$ are well defined. As in Peng and Huang (2008), Condition C4 is assumed to ensure the identifiability of $\beta_0(\tau)$. The regularity of the weight function assumed in Condition C5 is satisfied for all the sampling schemes introduced in Section 3. In particular, for the case-cohort type designs (Examples 3 and 4), the weight function $v$ does not depend on $\beta$ and satisfies C5. For length-biased sampling with a weight function of the form (14), $\{v(e^{Z^\top b});\; b \in \mathcal{B}(d_0)\}$ is a VC class, which implies C5 (Theorems 2.6.7 and 2.5.2 in van der Vaart and Wellner, 1996). Condition C6 is similar to Conditions C3-C5 and is satisfied for many weight functions; it is assumed to ensure the asymptotic normality of the efficient estimator.

Proof of Theorem 1. The proof closely follows that of Peng and Huang (2008), and we present only the key steps. We note, however, that their results cannot be applied directly because of the random weight functions involved in the estimating equation. Let $\alpha_0(\tau) = m\{\beta_0(\tau)\}$, $\hat\alpha(\tau) = m\{\hat\beta(\tau)\}$, and $\mathcal{A}(d) = \{m(b) : b \in \mathcal{B}(d)\}$. Following the argument in Peng and Huang (2008), $m$ is a one-to-one mapping from $\mathcal{B}(d_0)$ to $\mathcal{A}(d_0)$; hence there exists an inverse function $\kappa : \mathcal{A}(d_0) \to \mathcal{B}(d_0)$ such that $\kappa\{m(b)\} = b$ for any $b \in \mathcal{B}(d_0)$.

Uniform Convergence. Let $\xi_{n,k} = m_n\{\hat\beta(\tau_k)\} - \int_0^{\tau_k} \tilde m_n\{\hat\beta(u)\}\, dH(u)$. Because the estimating equation takes the general form of (10), we have $\sup_k \|\xi_{n,k}\| = O(1) \sup_i \|Z_i\| / n = O(n^{-1})$. From $m\{\beta_0(\tau_k)\} - \int_0^{\tau_k} \tilde m\{\beta_0(s)\}\, dH(s) = 0$, we have the following decomposition:
\begin{align}
m\{\hat\beta(\tau_k)\} - m\{\beta_0(\tau_k)\}
= {} & -\big[ m_n\{\hat\beta(\tau_k)\} - m\{\hat\beta(\tau_k)\} \big]
+ \sum_{j=0}^{k-1} \int_{\tau_j}^{\tau_{j+1}} \big[ \tilde m\{\hat\beta(s)\} - \tilde m\{\beta_0(s)\} \big]\, dH(s) \notag \\
& + \int_0^{\tau_k} \big[ \tilde m_n\{\hat\beta(s)\} - \tilde m\{\hat\beta(s)\} \big]\, dH(s) + \xi_{n,k}.
\tag{S1}
\end{align}
Under the condition that $Z$ is uniformly bounded, $\{Z_i N_i(e^{Z_i^\top b});\; b \in \mathbb{R}^p\}$ and $\{Z_i Y_i(e^{Z_i^\top b});\; b \in \mathbb{R}^p\}$ are Glivenko-Cantelli. By Condition C5 and Theorem 3 in van der Vaart and Wellner (2000), $\{Z_i v_i(e^{Z_i^\top b}) Y_i(e^{Z_i^\top b});\; b \in \mathbb{R}^p\}$ is also Glivenko-Cantelli. Then we have $\sup_{b \in \mathbb{R}^p} \|m(b) - m_n(b)\| \to 0$ and $\sup_{b \in \mathbb{R}^p} \|\tilde m(b) - \tilde m_n(b)\| \to 0$ almost surely. This implies that the first and third terms in (S1) are negligible. Denote
\[
c_{n,0} := \sup_k \Big\| -\big[ m_n\{\hat\beta(\tau_k)\} - m\{\hat\beta(\tau_k)\} \big] + \int_0^{\tau_k} \big[ \tilde m_n\{\hat\beta(s)\} - \tilde m\{\hat\beta(s)\} \big]\, dH(s) \Big\|;
\]
it follows that $c_{n,0} = o_p(1)$.

Under Condition C2, there exists $c_1$ such that $\| m\{\beta_0(\tau)\} - m\{\beta_0(\tau')\} \| < c_1 |\tau - \tau'|$ for any $\tau, \tau' \in (0, \tau_u]$. For $0 = \tau_0 \le \tau < \tau_1$, since $m\{\hat\beta(0)\} = 0$, we have $\sup_{\tau_0 \le \tau < \tau_1} \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \| = \sup_{\tau_0 \le \tau < \tau_1} \| m\{\beta_0(\tau)\} \| \le b_{n,0} := c_1 \|\mathcal{S}_{L(n)}\|$. Then, for large enough $n$, we have $b_{n,0} < d_0$ and $\hat\beta(\tau) \in \mathcal{B}(d_0)$ for $\tau \in [\tau_0, \tau_1)$. Thus we can write $\sup_{\tau_0 \le \tau < \tau_1} \| \tilde m\{\hat\beta(\tau)\} - \tilde m\{\beta_0(\tau)\} \| = \sup_{\tau_0 \le \tau < \tau_1} \| \tilde m[\kappa\{\hat\alpha(\tau)\}] - \tilde m[\kappa\{\alpha_0(\tau)\}] \|$. Under Condition C3, there exists $\tilde\alpha(\tau)$ between $\alpha_0(\tau)$ and $\hat\alpha(\tau)$ such that
\[
\sup_{\tau_0 \le \tau < \tau_1} \| \tilde m\{\hat\beta(\tau)\} - \tilde m\{\beta_0(\tau)\} \|
= \sup_{\tau_0 \le \tau < \tau_1} \big\| J[\kappa\{\tilde\alpha(\tau)\}] B[\kappa\{\tilde\alpha(\tau)\}]^{-1} \{\hat\alpha(\tau) - \alpha_0(\tau)\} \big\|
\le c_2\, b_{n,0},
\]

where $c_2$ is some constant. Next consider $\tau_1 \le \tau < \tau_2$. We have
\[
\sup_{\tau_1 \le \tau < \tau_2} \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \|
\le \| m\{\hat\beta(\tau_1)\} - m\{\beta_0(\tau_1)\} \| + \sup_{\tau_1 \le \tau < \tau_2} \| m\{\beta_0(\tau_1)\} - m\{\beta_0(\tau)\} \|.
\]
We know that $\sup_{\tau_1 \le \tau < \tau_2} \| m\{\beta_0(\tau_1)\} - m\{\beta_0(\tau)\} \| \le c_1 \|\mathcal{S}_{L(n)}\|$. Further, using the decomposition (S1), which gives an upper bound for $\| m\{\hat\beta(\tau_1)\} - m\{\beta_0(\tau_1)\} \|$, we have
\[
\sup_{\tau_1 \le \tau < \tau_2} \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \|
\le b_{n,1} := c_{n,0} + c_3 n^{-1} + c_2\, b_{n,0} (1 - \tau_u)^{-1} \|\mathcal{S}_{L(n)}\| + c_1 \|\mathcal{S}_{L(n)}\|,
\]
where $c_3$ is some large constant. Note that $b_{n,1} \to 0$ and is smaller than $d_0$ for large enough $n$; this implies $\hat\beta(\tau) \in \mathcal{B}(d_0)$ for $\tau \in [\tau_1, \tau_2)$. An induction argument similar to that in the proof of Theorem 1 of Peng and Huang (2008) then gives
\[
\sup_{\tau_k \le \tau \le \tau_{k+1}} \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \|
\le b_{n,k} := c_{n,0} + c_3 n^{-1} + c_2 \Big( \sum_{i=0}^{k-1} b_{n,i} \Big) (1 - \tau_u)^{-1} \|\mathcal{S}_{L(n)}\| + c_1 \|\mathcal{S}_{L(n)}\|.
\]
This implies that $\sup_{\tau_0 \le \tau \le \tau_u} \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \| \to_p 0$ and that, for $\tau_0 \le \tau \le \tau_u$ and large enough $n$, $\hat\beta(\tau) \in \mathcal{B}(d_0)$. Thus $\sup_\tau \| \hat\alpha(\tau) - \alpha_0(\tau) \| = \sup_\tau \| m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \| \to_p 0$. Taking a Taylor expansion of $m\{\hat\beta(\tau)\}$ around $\alpha_0(\tau)$, we have
\[
\sup_{\tau \in [\tau_l, \tau_u]} \| \hat\beta(\tau) - \beta_0(\tau) \|
\le \sup_{\tau \in [\tau_l, \tau_u]} \big\| B\{\beta_0(\tau)\}^{-1} \{\hat\alpha(\tau) - \alpha_0(\tau)\} \big\| + \sup_{\tau \in [\tau_l, \tau_u]} \| \epsilon_n(\tau) \|,
\]
where $\epsilon_n(\tau)$ is the remainder term of the Taylor expansion and $\sup_{\tau \in [\tau_l, \tau_u]} \| \epsilon_n(\tau) \| \to 0$. By Condition C4, $\sup_{\tau \in [\tau_l, \tau_u]} \| B\{\beta_0(\tau)\}^{-1} \{\hat\alpha(\tau) - \alpha_0(\tau)\} \| = O(\| \hat\alpha(\tau) - \alpha_0(\tau) \|)$, and this implies the uniform consistency.
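As a purely numerical illustration of why this recursive bound forces uniform consistency (the constants, sample sizes, and grids below are arbitrary choices of ours, not quantities from the paper), iterating the bound exhibits the discrete-Gronwall behaviour: the accumulated factor stays bounded because $L(n)\|\mathcal{S}_{L(n)}\|$ is bounded, while the leading terms vanish, so $\max_k b_{n,k} \to 0$ as $n \to \infty$ and $\|\mathcal{S}_{L(n)}\| \to 0$.

\begin{verbatim}
# Illustrative only: evaluate the recursive bound b_{n,k} from the proof for
# arbitrary constants, showing max_k b_{n,k} -> 0 as n grows and the grid
# gets finer (discrete Gronwall behaviour).
def max_bound(n, L, c1=1.0, c2=1.0, c3=1.0, c_n0=None, tau_u=0.5):
    grid = 1.0 / L                     # stand-in for ||S_{L(n)}|| (equal spacing)
    if c_n0 is None:
        c_n0 = n ** (-0.5)             # stand-in for the o_p(1) term c_{n,0}
    b = [c1 * grid]                    # b_{n,0}
    for k in range(1, L):
        b.append(c_n0 + c3 / n
                 + c2 * sum(b) * grid / (1.0 - tau_u)
                 + c1 * grid)
    return max(b)

for n, L in [(100, 20), (1000, 60), (10000, 200)]:
    print(n, L, max_bound(n, L))       # the printed bound decreases across settings
\end{verbatim}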

Weak Convergence. Following Lemma B.1 of Peng and Huang (2008), we have
\[
\sup_{\tau \in (0, \tau_u]} \Big\| n^{-1/2} \sum_{i=1}^n Z_i \big[ N_i(e^{Z_i^\top \hat\beta(\tau)}) - N_i(e^{Z_i^\top \beta_0(\tau)}) \big]
- n^{1/2} \big[ m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \big] \Big\| = o_p(1)
\tag{S2}
\]
and
\[
\sup_{\tau \in (0, \tau_u]} \Big\| n^{-1/2} \sum_{i=1}^n Z_i \big[ I(\tilde T_i \ge e^{Z_i^\top \hat\beta(\tau)}) - I(\tilde T_i \ge e^{Z_i^\top \beta_0(\tau)}) \big]
- n^{1/2} \big[ \tilde m\{\hat\beta(\tau)\} - \tilde m\{\beta_0(\tau)\} \big] \Big\| = o_p(1).
\tag{S3}
\]
In addition, for the grid size chosen in Theorem 1, $S_n(\hat\beta, \tau) = o_p(1)$ uniformly in $\tau \in (0, \tau_u]$. Then the following equations hold uniformly in $\tau \in (0, \tau_u]$:
\begin{align*}
S_n(\beta_0, \tau)
&= n^{1/2} \big[ m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \big]
- \int_0^\tau n^{1/2} \big[ \tilde m\{\hat\beta(u)\} - \tilde m\{\beta_0(u)\} \big]\, dH(u) + o_p(1) \\
&= n^{1/2} \big[ m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \big]
- \int_0^\tau \big[ J\{\beta_0(u)\} B\{\beta_0(u)\}^{-1} + o_p(1) \big]\, n^{1/2} \big[ m\{\hat\beta(u)\} - m\{\beta_0(u)\} \big]\, dH(u) + o_p(1).
\end{align*}
A discrete version of the above equation,
\[
S_n(\beta_0, \tau) = n^{1/2} \big[ m_n\{\hat\beta(\tau)\} - m_n\{\beta_0(\tau)\} \big]
- \int_0^\tau n^{1/2} \big[ \tilde m_n\{\hat\beta(u)\} - \tilde m_n\{\beta_0(u)\} \big]\, dH(u) + o_p(1),
\]

leads to the following recursive formula:
\begin{align*}
S_n(\beta_0, \tau_k) - S_n(\beta_0, \tau_{k-1})
&= n^{1/2} \big[ m_n\{\hat\beta(\tau_k)\} - m_n\{\beta_0(\tau_k)\} \big]
- n^{1/2} \big[ m_n\{\hat\beta(\tau_{k-1})\} - m_n\{\beta_0(\tau_{k-1})\} \big] \\
&\quad - n^{1/2} \big[ \tilde m_n\{\hat\beta(\tau_{k-1})\} - \tilde m_n\{\beta_0(\tau_{k-1})\} \big] \{H(\tau_k) - H(\tau_{k-1})\} \\
&= B\{\beta_0(\tau_k)\}\, n^{1/2} \{\hat\beta(\tau_k) - \beta_0(\tau_k)\} \\
&\quad - \big[ B\{\beta_0(\tau_{k-1})\} + J\{\beta_0(\tau_{k-1})\} \{H(\tau_k) - H(\tau_{k-1})\} \big] n^{1/2} \{\hat\beta(\tau_{k-1}) - \beta_0(\tau_{k-1})\} + o_p(1).
\end{align*}
The above recursive equation gives the following approximation result:
\begin{align}
B\{\beta_0(\tau_k)\}\, n^{1/2} \{\hat\beta(\tau_k) - \beta_0(\tau_k)\}
&= S_n(\beta_0, \tau_k) - S_n(\beta_0, \tau_{k-1}) \notag \\
&\quad + \big[ I + J\{\beta_0(\tau_{k-1})\} B\{\beta_0(\tau_{k-1})\}^{-1} \{H(\tau_k) - H(\tau_{k-1})\} \big]
\{ S_n(\beta_0, \tau_{k-1}) - S_n(\beta_0, \tau_{k-2}) \} \notag \\
&\quad + \cdots + \prod_{h=2}^{k} \big[ I + J\{\beta_0(\tau_{h-1})\} B\{\beta_0(\tau_{h-1})\}^{-1} \{H(\tau_h) - H(\tau_{h-1})\} \big]
\{ S_n(\beta_0, \tau_1) - S_n(\beta_0, \tau_0) \} + o_p(1).
\tag{S4}
\end{align}
Using product integration theory (Gill and Johansen, 1990; Andersen et al., 1993, Section II.6), we can write the above decomposition as
\[
n^{1/2} \big[ m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\} \big] = \phi\{ S_n(\beta_0, \tau) \} + o_p(1),
\tag{S5}
\]
where $\phi$ is a functional defined as follows. For any $g \in \mathcal{G} = \{ g : [0, \tau_u] \to \mathbb{R}^p :\; g \text{ is left-continuous with right limits},\; g(0) = 0 \}$, with the product integral
\[
\mathcal{I}(s, t) = \pi_{u \in (s, t]} \big[ I_p + J\{\beta_0(u)\} B\{\beta_0(u)\}^{-1}\, dH(u) \big],
\]
where $\pi$ denotes product integration, $\phi(g)(\tau)$ is defined as
\[
\phi(g)(\tau) = \int_0^\tau \mathcal{I}(s, \tau)\, dg(s).
\tag{S6}
\]
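In practice, the functional $\phi$ applied to a process observed on the grid can be evaluated by the same recursion that produces (S4). A minimal sketch, assuming plug-in matrices $\hat J_k \approx J\{\beta_0(\tau_k)\}$ and $\hat B_k \approx B\{\beta_0(\tau_k)\}$ and the increments of $H$ over the grid are available (all names below are ours, for illustration only):

\begin{verbatim}
import numpy as np

def phi_on_grid(dS, J_hat, B_hat, dH):
    """Discrete analogue of phi(g)(tau_k) in (S6), via the recursion behind (S4).

    dS[k]              : increment S(tau_k) - S(tau_{k-1}), a p-vector, k = 1..L
    J_hat[k], B_hat[k] : p x p plug-in matrices at tau_k (index 0 is tau_0)
    dH[k]              : increment H(tau_k) - H(tau_{k-1})
    Index 0 of dS and dH is unused (e.g., pass zeros).
    Returns the values of phi at tau_1, ..., tau_L.
    """
    p = dS[1].shape[0]
    out, acc = [], np.zeros(p)
    for k in range(1, len(dS)):
        # propagate the previous value through (I + J B^{-1} dH), then add the
        # new increment, exactly as in the unrolled sum (S4)
        factor = np.eye(p) + J_hat[k - 1] @ np.linalg.inv(B_hat[k - 1]) * dH[k]
        acc = factor @ acc + dS[k]
        out.append(acc.copy())
    return np.array(out)
\end{verbatim}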

Note that $\{ Z_i N_i(e^{Z_i^\top \beta_0(\tau)}),\; \tau \in [\tau_l, \tau_u] \}$ is a Donsker class and uniformly bounded. From Condition C5, the class $\{ Z_i v_i(e^{Z_i^\top \beta_0(s)}) Y_i(e^{Z_i^\top \beta_0(s)}),\; s \in [\tau_l, \tau_u] \}$ is also Donsker (see Example 2.10.8 in Section 2.10 of van der Vaart and Wellner, 1996). Since $\int_0^\tau v_i(e^{Z_i^\top \beta_0(s)}) Y_i(e^{Z_i^\top \beta_0(s)})\, dH(s)$ is Lipschitz in $\tau$, the permanence properties of Lipschitz transformations of Donsker classes (Theorem 2.10.6, van der Vaart and Wellner, 1996) imply that
\[
\Big\{ Z_i N_i(e^{Z_i^\top \beta_0(\tau)}) - \int_0^\tau Z_i v_i(e^{Z_i^\top \beta_0(s)}) Y_i(e^{Z_i^\top \beta_0(s)})\, dH(s),\; \tau \in [\tau_l, \tau_u] \Big\}
\]
is a Donsker class. The Donsker theorem then implies that $S_n(\beta_0, \tau)$ converges weakly to a Gaussian process $U(\tau)$, $\tau \in [\tau_l, \tau_u]$, with mean zero and covariance matrix $\Sigma(s, t) = E\{ u_i(s) u_i(t)^\top \}$, where
\[
u_i(\tau) = Z_i N_i(e^{Z_i^\top \beta_0(\tau)}) - \int_0^\tau Z_i v_i(e^{Z_i^\top \beta_0(s)}) Y_i(e^{Z_i^\top \beta_0(s)})\, dH(s).
\]
Furthermore, since $\phi$ is a linear operator, $\phi\{S_n(\beta_0, \tau)\}$ converges weakly to $\phi\{U(\tau)\}$, and $\phi\{U(\tau)\}$ is also a Gaussian process (Römisch, 2005). From (S5), applying a Taylor expansion, we have
\[
B\{\beta_0(\tau)\}\, n^{1/2} \{\hat\beta(\tau) - \beta_0(\tau)\} = \phi\{S_n(\beta_0, \tau)\} + o_p(1).
\]
Then $n^{1/2}\{\hat\beta(\tau) - \beta_0(\tau)\}$ converges weakly to the Gaussian process $B\{\beta_0(\tau)\}^{-1} \phi\{U(\tau)\}$ with covariance matrix $B\{\beta_0(\tau)\}^{-1} \Sigma_0(\tau) [B\{\beta_0(\tau)\}^{-1}]^\top$, where $\Sigma_0(\tau)$ denotes the limiting covariance matrix of $\phi\{S_n(\beta_0, \tau)\}$.

Proof of Theorem 2. Next we prove the weak convergence of $n^{1/2}\{\hat\beta_{\mathrm{eff}}(\tau) - \beta_0(\tau)\}$. From the above definitions, we can write $\eta(\beta, \tau)$ and $\eta^*(\beta, \tau_k)$ in (18) and (20) as
\[
\eta(\beta, \tau) = n^{1/2} m_n^*\{\beta(\tau)\} - n^{1/2} \int_0^\tau \tilde m_n^*\{\beta(u)\}\, dH(u)
\]
and
\[
\eta^*(\beta, \tau_k) = n^{1/2} m_n^*\{\beta(\tau_k)\} - n^{1/2} \sum_{j=0}^{k-1} \tilde m_n^*\{\hat\beta_{\mathrm{eff}}(\tau_j)\} \{H(\tau_{j+1}) - H(\tau_j)\}.
\]
By the proposed estimation method for $\hat\beta_{\mathrm{eff}}(\tau)$, the sequential estimator $\hat\beta_{\mathrm{eff}}(\tau_k)$, $1 \le k \le L$, minimizes $n^{-1} \eta^*(\beta, \tau_k)^\top W(\hat\beta_{\mathrm{int}}, \tau)^{-1} \eta^*(\beta, \tau_k)$.
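As a schematic illustration of one step of this sequential minimization (not the authors' implementation: the callables for $m_n^*$ and $\tilde m_n^*$, the inverse weight matrix, and the optimizer choice below are placeholders of ours), one grid step could be carried out as follows.

\begin{verbatim}
from scipy.optimize import minimize

def next_beta_eff(k, beta_eff_prev, m_star_n, m_tilde_star_n, W_inv, dH, b_init):
    """One step of the sequential estimator: minimize the quadratic form
    eta*(b, tau_k)' W^{-1} eta*(b, tau_k), holding earlier solutions fixed.

    beta_eff_prev              : previously fitted beta_eff(tau_j), j = 0..k-1
    m_star_n, m_tilde_star_n   : callables returning p-vectors (empirical m*_n, tilde m*_n)
    W_inv                      : p x p inverse weight matrix at the initial estimator
    dH                         : array with dH[j] = H(tau_{j+1}) - H(tau_j)
    """
    # the lagged sum from the definition of eta*(b, tau_k)
    lagged = sum(m_tilde_star_n(beta_eff_prev[j]) * dH[j] for j in range(k))

    def objective(b):
        # constant factors n^{1/2} and 1/n do not change the minimizer
        eta = m_star_n(b) - lagged
        return float(eta @ W_inv @ eta)

    res = minimize(objective, x0=b_init, method="Nelder-Mead")
    return res.x
\end{verbatim}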

Since $\hat\beta_{\mathrm{int}}$ is taken to be a consistent estimator of $\beta_0$, it can be shown that $W(\hat\beta_{\mathrm{int}}, \tau)^{-1} \to W(\beta_0, \tau)^{-1}$ in probability. Further, note that $B^*(b) = \partial m^*(b)/\partial b$ and $J^*(b) = \partial \tilde m^*(b)/\partial b$. As in the proof of Theorem 1, $m_n^*(b)$ and $\tilde m_n^*(b)$ converge to $m^*(b)$ and $\tilde m^*(b)$ uniformly in probability; the sequential estimators $\hat\beta_{\mathrm{eff}}(\tau_k)$ therefore satisfy
\[
n^{-1/2} B^*\{\hat\beta_{\mathrm{eff}}(\tau_k)\}^\top W(\beta_0, \tau)^{-1} \eta^*(\hat\beta_{\mathrm{eff}}, \tau_k) = o_p(1).
\]
Then, by an argument similar to that used for the consistency of $\hat\beta(\cdot)$ in the proof of Theorem 1, $\hat\beta_{\mathrm{eff}}(t)$ is uniformly consistent in $t \le \tau$. As in Lemma B.1 of Peng and Huang (2008), for $\bar\beta(t)$ such that $\sup_t \|\bar\beta(t) - \beta_0(t)\| \to 0$ in probability, we have
\[
\sup_{t \in (0, \tau]} \big\| \big[ m_n^*\{\bar\beta(t)\} - m_n^*\{\beta_0(t)\} \big] - \big[ m^*\{\bar\beta(t)\} - m^*\{\beta_0(t)\} \big] \big\| = o_p(n^{-1/2})
\]
and
\[
\sup_{t \in (0, \tau]} \big\| \big[ \tilde m_n^*\{\bar\beta(t)\} - \tilde m_n^*\{\beta_0(t)\} \big] - \big[ \tilde m^*\{\bar\beta(t)\} - \tilde m^*\{\beta_0(t)\} \big] \big\| = o_p(n^{-1/2}).
\]
Then we can write
\[
\eta^*(\bar\beta, \tau_k) = n^{1/2} B^*\{\beta_0(\tau_k)\} \{\bar\beta(\tau_k) - \beta_0(\tau_k)\} + o_p(1)
+ n^{1/2} m_n^*\{\beta_0(\tau_k)\} - n^{1/2} \sum_{j=0}^{k-1} \tilde m_n^*\{\hat\beta_{\mathrm{eff}}(\tau_j)\} \{H(\tau_{j+1}) - H(\tau_j)\}.
\]

This implies that the sequential estimator $\hat\beta_{\mathrm{eff}}(\tau_k)$ satisfies $B^*\{\beta_0(\tau_k)\}^\top W(\beta_0, \tau)^{-1} \eta^*(\hat\beta_{\mathrm{eff}}, \tau_k) = o_p(1)$ uniformly, and furthermore, uniformly in $t$,
\[
B^*\{\beta_0(t)\}^\top W(\beta_0, \tau)^{-1} \eta(\hat\beta_{\mathrm{eff}}, t) = o_p(1).
\]
This equation plays the same role in the current proof as the unweighted estimating equation $S_n(\hat\beta, t) = o_p(1)$ played in the proof of the weak convergence of $\hat\beta$ in Theorem 1. The weak convergence of $\hat\beta_{\mathrm{eff}}$ then follows from a similar argument.

S3 Resampling method

An alternative resampling method. A commonly used resampling method is the perturbation-based approach proposed by Jin et al. (2003) for the AFT model; its modified version for quantile regression was adopted by Peng and Huang (2008) under unbiased sampling. This approach can easily be extended to the biased sampling case. In particular, consider the following stochastic perturbation of the observed estimating equations:
\[
S_n^*(\beta, \tau) = n^{-1/2} \sum_{i=1}^n \zeta_i \Big[ Z_i N_i(e^{Z_i^\top \beta(\tau)}) - \int_0^\tau Z_i v_i(e^{Z_i^\top \beta(s)}) Y_i(e^{Z_i^\top \beta(s)})\, dH(s) \Big] = 0,
\tag{S7}
\]
where the $\zeta_i$'s are i.i.d. variates from a known nonnegative distribution with mean $1$ and variance $1$, such as the exponential distribution with rate $1$. The above estimation can be implemented in the same fashion as discussed in Section 2.3: conditional on the observed data, we independently generate the variates $(\zeta_1, \ldots, \zeta_n)$ $M_b$ times, with $M_b$ a large number, and obtain the corresponding $M_b$ estimates $\hat\beta^*(\tau)$ by solving (S7).
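A schematic of this resampling loop follows (a minimal sketch: the estimating-equation solver solve_perturbed_eq, its arguments, and the data container are hypothetical placeholders of ours, not functions from the paper or any package).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def perturbation_resampling(data, tau_grid, solve_perturbed_eq, M_b=500):
    """Perturbation resampling for the quantile process, as in (S7).

    solve_perturbed_eq(data, tau_grid, zeta) is a user-supplied routine that
    runs the sequential algorithm with observation weights zeta_i and returns
    an array of shape (len(tau_grid), p).
    """
    n = data["n"]
    draws = []
    for _ in range(M_b):
        zeta = rng.exponential(scale=1.0, size=n)   # mean 1, variance 1
        draws.append(solve_perturbed_eq(data, tau_grid, zeta))
    draws = np.stack(draws)                         # shape (M_b, L, p)
    se = draws.std(axis=0, ddof=1)                  # resampling standard errors
    return draws, se
\end{verbatim}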

The resampling method described above is consistent. Following the proof of Theorem 3 below, conditional on $\{Z_i, \Delta_i, \tilde T_i\}_{i=1}^n$, $S_n^*(\hat\beta, \tau)$ has the same asymptotic distribution as $S_n(\beta_0, \tau)$ in probability; moreover, uniformly in $\tau \in (0, \tau_u]$, we have
\[
n^{1/2} \{\hat\beta^*(\tau) - \hat\beta(\tau)\} = B\{\beta_0(\tau)\}^{-1} \phi\{S_n^*(\hat\beta, \tau)\} + o_p(1),
\tag{S8}
\]
where the functional $\phi$ is defined as in (S6). Therefore, conditional on the observed data, $n^{1/2}\{\hat\beta^*(\tau) - \hat\beta(\tau)\}$ converges weakly to the same limiting process as $n^{1/2}\{\hat\beta(\tau) - \beta_0(\tau)\}$ for $\tau \in [\tau_l, \tau_u]$, where $\tau_l \in (0, \tau_u)$.

Proof of Theorem 3. We first show that, conditional on the observed data, $S_n^*(\hat\beta, \tau)$ has the same asymptotic distribution as $S_n(\beta_0, \tau)$ in probability. Consider the weighted summation defined in (S7),
\[
S_n^*(\beta, \tau) = n^{-1/2} \sum_{i=1}^n \zeta_i \Big[ Z_i N_i(e^{Z_i^\top \beta(\tau)}) - \int_0^\tau Z_i v_i(e^{Z_i^\top \beta(s)}) Y_i(e^{Z_i^\top \beta(s)})\, dH(s) \Big].
\]
Since $E(\zeta_i) = 1$ and $\mathrm{Var}(\zeta_i) = 1$, an argument similar to that in the proof of Theorem 1 gives
\[
\sup_{\tau \in (0, \tau_u]} \| S_n^*(\hat\beta^*, \tau) \| = o_p(1),
\]
where $\hat\beta^*$ is the solution of $S_n^*(\beta, \tau) = 0$ obtained with the proposed sequential algorithm. Because $\sup_{\tau \in (0, \tau_u]} \| S_n(\hat\beta, \tau) \| = o_p(1)$, as shown in the proof of Theorem 1, uniformly in $\tau \in (0, \tau_u]$ we can express $S_n^*(\hat\beta, \tau)$ as
\[
S_n^*(\hat\beta, \tau) = S_n^*(\hat\beta, \tau) - S_n(\hat\beta, \tau) + o_p(1)
= n^{-1/2} \sum_{i=1}^n (1 - \zeta_i)\, \hat u_i(\tau) + o_p(1),
\]
where
\[
\hat u_i(\tau) = Z_i N_i(e^{Z_i^\top \hat\beta(\tau)}) - \int_0^\tau Z_i v_i(e^{Z_i^\top \hat\beta(s)}, \tilde T_i, \Delta_i)\, Y_i(e^{Z_i^\top \hat\beta(s)})\, dH(s).
\]

Then, for any $s, t \in [\tau_l, \tau_u]$,
\[
E\Big[ \Big\{ n^{-1/2} \sum_{i=1}^n (1 - \zeta_i)\, \hat u_i(s) \Big\} \Big\{ n^{-1/2} \sum_{i=1}^n (1 - \zeta_i)\, \hat u_i(t) \Big\}^\top \;\Big|\; \{Z_i, \Delta_i, \tilde T_i\}_{i=1}^n \Big]
= n^{-1} \sum_{i=1}^n \hat u_i(s)\, \hat u_i(t)^\top \to \Sigma(s, t)
\]
in probability as $n \to \infty$, where $\Sigma(\cdot, \cdot)$ is the limiting covariance function of $S_n(\beta_0, \cdot)$ defined in the proof of Theorem 1. Therefore, by an argument similar to that of Lin, Wei, and Ying (1993), conditional on $\{Z_i, \Delta_i, \tilde T_i\}_{i=1}^n$, $S_n^*(\hat\beta, \tau)$ has the same limiting covariance matrix and asymptotic distribution as $S_n(\beta_0, \tau)$ in probability. From the decomposition in (S4), we can then show that, conditional on the data, $\phi_n\{S_n^*(\hat\beta, \tau)\}$ has the same asymptotic distribution as $\phi_n\{S_n(\beta_0, \tau)\}$ in probability. Furthermore, since $n^{1/2}[m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\}] = \phi_n\{S_n(\beta_0, \tau)\} + o_p(1)$, we obtain that, conditional on the data, $\phi_n\{S_n^*(\hat\beta, \tau)\}$ converges weakly to the limiting distribution of $n^{1/2}[m\{\hat\beta(\tau)\} - m\{\beta_0(\tau)\}]$ in probability.

Following Zeng and Lin (2008), we use the perturbation estimators $\hat B$ and $\hat J$ to estimate $B$ and $J$. Their consistency can be established as follows. Following Lemma B.1 of Peng and Huang (2008), for $\bar\beta(t)$ such that $\sup_t \|\bar\beta(t) - \beta_0(t)\| \to 0$ in probability, the following result holds uniformly in $\tau$:
\[
n^{1/2} \big[ m_n\{\bar\beta(\tau)\} - m_n\{\beta_0(\tau)\} \big] - n^{1/2} \big[ m\{\bar\beta(\tau)\} - m\{\beta_0(\tau)\} \big] = o_p(1).
\]
This implies that
\[
n^{1/2} \big[ m_n\{\bar\beta(\tau)\} - m_n\{\beta_0(\tau)\} \big] = B\{\beta_0(\tau)\}\, n^{1/2} \{\bar\beta(\tau) - \beta_0(\tau)\} + o_p(1).
\]
Therefore, the above approximation holds for $\bar\beta(\tau) = \hat\beta(\tau) + n^{-1/2}\gamma$, where $\gamma$ follows a $p$-dimensional multivariate normal distribution with mean zero and identity covariance matrix.

For $M$ generated $\gamma$'s, let $\bar\gamma$ be the $M \times p$ matrix $(\gamma_1, \ldots, \gamma_M)^\top$ and $\bar m$ be the $M \times p$ matrix whose $i$th row is $n^{1/2} \big[ m_n\{\hat\beta(\tau) + n^{-1/2}\gamma_i\} - m_n\{\hat\beta(\tau)\} \big]^\top$, $i = 1, \ldots, M$. Then we have
\[
B\{\beta_0(\tau)\} = (\bar\gamma^\top \bar\gamma)^{-1} \bar\gamma^\top \bar m + o_p(1).
\]
Note that $(\bar\gamma^\top \bar\gamma)^{-1}$ exists with probability $1$ for $M > p$. Therefore, the slope matrix estimator obtained by regressing the perturbed values $n^{1/2} m_n\{\hat\beta(\tau) + n^{-1/2}\gamma_i\}$ on $\gamma_i$, i.e., $\hat B\{\hat\beta(\tau)\} = (\bar\gamma^\top \bar\gamma)^{-1} \bar\gamma^\top \bar m$, is consistent. Similarly, we have the consistency of $\hat J$. This implies that $\hat B\{\hat\beta(\tau)\}^{-1} \phi_n\{S_n^*(\hat\beta, \tau)\}$ has the same asymptotic distribution as $n^{1/2}\{\hat\beta(\tau) - \beta_0(\tau)\}$, which validates the proposed resampling method.
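A compact numerical sketch of this slope-matrix regression follows; the callable m_n_hat is a placeholder of ours for the empirical $m_n(\cdot)$, and everything else mirrors the least-squares formula above.

\begin{verbatim}
import numpy as np

def estimate_B_hat(m_n_hat, beta_hat_tau, n, M=200, seed=0):
    """Estimate B{beta_0(tau)} by regressing perturbed values of m_n on gamma.

    m_n_hat(b)   : callable returning the p-vector m_n(b)
    beta_hat_tau : fitted beta_hat(tau), shape (p,)
    """
    rng = np.random.default_rng(seed)
    p = beta_hat_tau.shape[0]
    gamma = rng.standard_normal((M, p))                 # gamma_i ~ N(0, I_p)
    base = m_n_hat(beta_hat_tau)
    # rows: n^{1/2} [ m_n(beta_hat + n^{-1/2} gamma_i) - m_n(beta_hat) ]
    m_bar = np.vstack([
        np.sqrt(n) * (m_n_hat(beta_hat_tau + g / np.sqrt(n)) - base)
        for g in gamma
    ])                                                  # shape (M, p)
    # least-squares slope: (gamma' gamma)^{-1} gamma' m_bar
    B_hat, *_ = np.linalg.lstsq(gamma, m_bar, rcond=None)
    return B_hat                                        # shape (p, p)
\end{verbatim}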

References

Andersen, P. K., Borgan, Ø., Gill, R. D., and Keiding, N. (1993), Statistical Models Based on Counting Processes, Springer, New York.

Gill, R. D. and Johansen, S. (1990), "A Survey of Product-Integration with a View Toward Application in Survival Analysis," The Annals of Statistics, 18, 1501-1555.

Jin, Z., Lin, D. Y., Wei, L. J., and Ying, Z. (2003), "Rank-Based Inference for the Accelerated Failure Time Model," Biometrika, 90, 341-353.

Lin, D. Y., Wei, L.-J., and Ying, Z. (1993), "Checking the Cox Model with Cumulative Sums of Martingale-Based Residuals," Biometrika, 80, 557-572.

Peng, L. and Huang, Y. (2008), "Survival Analysis with Quantile Regression Models," Journal of the American Statistical Association, 103, 637-649.

Römisch, W. (2005), "Delta Method, Infinite Dimensional," in Encyclopedia of Statistical Sciences.

van der Vaart, A. and Wellner, J. A. (2000), "Preservation Theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli Classes," in High Dimensional Probability II, Springer, pp. 115-133.

van der Vaart, A. W. and Wellner, J. A. (1996), Weak Convergence and Empirical Processes: With Applications to Statistics, Springer, New York.