Regression Analysis of Clustered Failure Time Data under the Additive Hazards Model


Chinese Journal of Applied Probability and Statistics
Oct., 2017, Vol. 33, No. 5, pp. 517-528
doi: 10.3969/j.issn.1001-4268.2017.05.008

DI RongRong
(School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China)

WANG ChengYong
(School of Mathematics and Computer Science, Hubei University of Arts and Science, Xiangyang, 441053, China)

The project was supported by the National Natural Science Foundation of China (Grant No. 7137166). Corresponding author, E-mail: wchyxf@163.com. Received June 2, 2016. Revised November 17, 2016.

Abstract: Clustered interval-censored failure time data often arise in medical studies when study subjects come from the same cluster. Furthermore, the failure time may be related to the cluster size. A simple and common approach is to simplify interval-censored data because proper inference procedures for direct analysis are lacking. For this reason, we propose a within-cluster resampling-based method for case II interval-censored data under the additive hazards model. Within-cluster resampling is simple but computationally intensive. A major advantage of the proposed approach is that the estimator can be easily implemented when the cluster size is informative. Asymptotic properties and some simulation results are provided and indicate that the proposed approach works well.

Keywords: additive hazards model; interval-censored; within-cluster resampling; semiparametric regression

2010 Mathematics Subject Classification: 62N02

Citation: Di R R, Wang C Y. Regression analysis of clustered failure time data under the additive hazards model [J]. Chinese J. Appl. Probab. Statist., 2017, 33(5): 517-528.

1. Introduction

Case II interval-censored data are commonly encountered in biomedicine, where the event time of interest is not observed directly but is known only to lie between two monitoring times. A few methods have been proposed for regression analysis of interval-censored data. Zeng et al. [1] discussed regression analysis of case II interval-censored data using the additive hazards model. Wang et al. [2] developed an approach for case II interval-censored data that is easy to implement and allows the monitoring times to be random and continuous. It assumes that the monitoring times follow Cox-type models [3]. However, these methods do not take clustered data into account.

In some cases, failure time data come from the same cluster. For example, the failure time can be the time to disease occurrence for patients in the same family or the same clinic. Li et al. [4] proposed an estimating equation-based approach for regression analysis of clustered interval-censored failure time data generated from the additive hazards model, which does not involve the estimation of any baseline hazard function. Another commonly used statistical method for analysing clustered failure time data is the gamma-frailty model, which incorporates an unobserved random effect, known as frailty, into the Cox proportional hazards model. Li et al. [5] proposed a sieve estimation procedure for fitting a Cox frailty model to clustered interval-censored failure time data. A two-step algorithm for parameter estimation was developed and the asymptotic properties of the resulting sieve maximum likelihood estimators were established. Kor et al. [6] gave a method for analyzing clustered interval-censored data based on Cox's model.

As pointed out in [7], the additive hazards model describes a different aspect of the association between the failure time and covariates than the Cox model, and it could be more plausible than the Cox model in many applications. This is especially so when one is interested in the risk difference, as is often the case in epidemiology and public health [8].

In this paper, we consider case II interval-censored data under the additive hazards model in the situation where the correlated failure time of interest may be related to the cluster size. We assume that there exist only two monitoring times, which are independent of the failure time of interest given the covariate process. We then use the within-cluster resampling (WCR) procedure under the additive hazards model. WCR is a method for analyzing clustered data in the presence of informative cluster size when estimation of marginal effects weighted at the cluster level is of interest. Parameter estimation with WCR is based on resampling replicate data sets, each containing one observation from each cluster. In the following, we present the approach under the additive hazards model.

The rest of the paper is organized as follows. Section 2 introduces the model and some notation used in this paper. Section 3 gives a WCR-based method that uses the inference procedure proposed by [2] under the additive hazards model for case II interval-censored failure time data, and Section 4 presents simulation studies that assess the performance of the proposed approach.

2. Notation and Model

Suppose that there are $n$ independent clusters and that cluster $i$ has $n_i$ exchangeable subjects, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n_i$. Let $U_{ij}$ and $V_{ij}$ denote the two monitoring times for the $j$-th subject in the $i$-th cluster. Let $Z_{ij}(t)$ be the corresponding $p$-dimensional vector of covariates that may depend on time $t$, and let $T_{ij}$ denote the failure time of interest for subject $j$ in cluster $i$, which is independent of the monitoring times $U_{ij}$ and $V_{ij}$ given the covariate $Z_{ij}(t)$. For each $(i, j)$, define $\delta_{1ij} = I(T_{ij} < U_{ij})$, $\delta_{2ij} = I(U_{ij} \le T_{ij} < V_{ij})$ and $\delta_{3ij} = 1 - \delta_{1ij} - \delta_{2ij}$. The observed data are $(U_{ij}, V_{ij}, \delta_{1ij}, \delta_{2ij}, \delta_{3ij}, Z_{ij}(\cdot))$.

As pointed out in [9], the cause of cluster sizes being informative can be complicated and is usually unknown, and some latent variables may implicitly affect the baseline hazard for each cluster and/or the covariates. For example, the marginal hazard function may be associated with the cluster size through the following frailty model

$$\lambda_{ij}(t \mid Z_{ij}) = \lambda_0(t) + \omega_i \beta_0^\top Z_{ij}(t),$$

where $\beta_0$ is the unknown $p$-dimensional vector of regression coefficients, $\omega_i$ is the cluster-specific random effect that accounts for within-cluster correlation in cluster $i$, and $\lambda_0(t)$ is the unknown baseline hazard function. If cluster sizes are ignorable (noninformative for survival), the usual marginal additive hazards model [10] is applicable, given by

$$\lambda_{ij}(t \mid Z_{ij}) = \lambda_0(t) + \beta_0^\top Z_{ij}(t). \tag{1}$$

Motivated by the work of [2], we model the monitoring variables using Cox-type hazard functions

$$\lambda^U_{ij}(t \mid Z_{ij}) = \lambda_1(t)\, e^{\gamma_0^\top Z_{ij}(t)}, \tag{2}$$

$$\lambda^V_{ij}(t \mid U_{ij}, Z_{ij}) = I(t > U_{ij})\, \lambda_2(t)\, e^{\gamma_0^\top Z_{ij}(t)}, \tag{3}$$

where $\lambda_1(t)$ and $\lambda_2(t)$ denote unspecified baseline functions and $\gamma_0$ is the unknown vector of regression parameters. For each $i$ and $j$, define $N^{(1)}_{ij}(t) = (1 - \delta_{1ij}) I(U_{ij} \le t)$ and, conditional on $U_{ij}$, define $N^{(2)}_{ij}(t) = \delta_{3ij} I(V_{ij} \le t)$ if $t \ge U_{ij}$ and $0$ if $t < U_{ij}$. We also define

$$\lambda^{(1)}_{ij}(t \mid Z_{ij}) = \lambda_1(t)\, e^{-\Lambda_0(t)}\, e^{-\beta_0^\top Z^*_{ij}(t) + \gamma_0^\top Z_{ij}(t)} := \lambda^*_1(t)\, e^{-\beta_0^\top Z^*_{ij}(t) + \gamma_0^\top Z_{ij}(t)} \tag{4}$$

and

$$\lambda^{(2)}_{ij}(t \mid U_{ij}, Z_{ij}) = I(t > U_{ij})\, \lambda_2(t)\, e^{-\Lambda_0(t)}\, e^{-\beta_0^\top Z^*_{ij}(t) + \gamma_0^\top Z_{ij}(t)} := I(t > U_{ij})\, \lambda^*_2(t)\, e^{-\beta_0^\top Z^*_{ij}(t) + \gamma_0^\top Z_{ij}(t)}, \tag{5}$$

where $Z^*_{ij}(t) = \int_0^t Z_{ij}(s)\,ds$, $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$, $\lambda^*_1(t) = \lambda_1(t) e^{-\Lambda_0(t)}$ and $\lambda^*_2(t) = \lambda_2(t) e^{-\Lambda_0(t)}$.

Clearly, models (4) and (5) have the form of the Cox proportional hazards model.
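The short calculation below is one way to see where the Cox-type form of model (4) comes from. It is only a sketch: it assumes that the covariate path $Z_{ij}(\cdot)$ is external to the failure process, and it uses the conditional independence of $T_{ij}$ and the monitoring times stated in this section. The same argument with $\lambda_2(t)$ and $I(t > U_{ij})$ in place of $\lambda_1(t)$ gives model (5).

```latex
\begin{align*}
P(T_{ij} \ge t \mid Z_{ij})
  &= \exp\Big\{-\int_0^t \big[\lambda_0(s) + \beta_0^\top Z_{ij}(s)\big]\,ds\Big\}
   = e^{-\Lambda_0(t)}\, e^{-\beta_0^\top Z^*_{ij}(t)}, \\
E\big\{dN^{(1)}_{ij}(t) \mid U_{ij} \ge t,\ Z_{ij}\big\}
  &= \lambda^U_{ij}(t \mid Z_{ij})\, P(T_{ij} \ge t \mid Z_{ij})\,dt \\
  &= \lambda_1(t)\, e^{-\Lambda_0(t)}\,
     e^{-\beta_0^\top Z^*_{ij}(t) + \gamma_0^\top Z_{ij}(t)}\,dt .
\end{align*}
```

The first line is the survival function implied by the additive hazards model (1); the second uses the fact that $N^{(1)}_{ij}$ jumps at $U_{ij}$ only when the subject has not yet failed.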

3. The WCR-Based Procedure

When cluster sizes are informative, estimates and inference for the unknown parameter vectors $\beta_0$ and $\gamma_0$ based on equation (2) may be incorrect. To account for informative cluster sizes, this section proposes a method based on the within-cluster resampling (WCR) technique (refer to [11]). The basic idea is that one subject is randomly sampled, with replacement, from each of the $n$ clusters. Let $Q$ be a positive integer and suppose that this resampling process is repeated $Q$ times. Let $\tau$ denote a known time for the length of the study period, and define $\delta^q_{1i} = I(T^q_i < U^q_i)$, $\delta^q_{2i} = I(U^q_i \le T^q_i < V^q_i)$ and $\delta^q_{3i} = 1 - \delta^q_{1i} - \delta^q_{2i}$. The $q$-th resampled data set, denoted by $\{U^q_i, V^q_i, \delta^q_{1i}, \delta^q_{2i}, \delta^q_{3i}, Z^q_i(t);\ i = 1, 2, \ldots, n,\ 0 \le t \le \tau\}$, consists of $n$ independent observations, which can be analyzed using models (4) and (5) for independent data. The within-cluster resampling estimate is constructed as the average of the $Q$ resample-based estimates.

For the $q$-th resampled data set, motivated by [2], we first estimate $\gamma_0$. For $i = 1, 2, \ldots, n$ and $q = 1, 2, \ldots, Q$, define $\tilde N^{(1)q}_i(t) = I(U^q_i \le t)$ and, given the observed $U^q_i$, $\tilde N^{(2)q}_i(t) = I(V^q_i \le t)$ if $t \ge U^q_i$ and $0$ if $t < U^q_i$. For $j = 0$ and $1$, also define

$$S^{(j)}_{1,\gamma,q}(t, \gamma) = \frac{1}{n} \sum_{i=1}^n I(t \le U^q_i)\, e^{\gamma^\top Z^q_i(t)}\, \big(Z^q_i(t)\big)^{\otimes j},$$

$$S^{(j)}_{2,\gamma,q}(t, \gamma) = \frac{1}{n} \sum_{i=1}^n I(U^q_i < t \le V^q_i)\, e^{\gamma^\top Z^q_i(t)}\, \big(Z^q_i(t)\big)^{\otimes j},$$

where $a^{\otimes 0} = 1$ and $a^{\otimes 1} = a$. We construct an estimating function $U^q_\gamma(\gamma)$ for $\gamma$ as

$$U^q_\gamma(\gamma) = \sum_{i=1}^n \int_0^\tau \bigg[ \Big( Z^q_i(t) - \frac{S^{(1)}_{1,\gamma,q}(t, \gamma)}{S^{(0)}_{1,\gamma,q}(t, \gamma)} \Big)\, d\tilde N^{(1)q}_i(t) + \Big( Z^q_i(t) - \frac{S^{(1)}_{2,\gamma,q}(t, \gamma)}{S^{(0)}_{2,\gamma,q}(t, \gamma)} \Big)\, d\tilde N^{(2)q}_i(t) \bigg]$$

$$= \sum_{i=1}^n \Big\{ Z^q_i(U^q_i) - \frac{S^{(1)}_{1,\gamma,q}(U^q_i, \gamma)}{S^{(0)}_{1,\gamma,q}(U^q_i, \gamma)} \Big\} + \sum_{i=1}^n \Big\{ Z^q_i(V^q_i) - \frac{S^{(1)}_{2,\gamma,q}(V^q_i, \gamma)}{S^{(0)}_{2,\gamma,q}(V^q_i, \gamma)} \Big\}.$$

Let $\hat\gamma_q$ be the solution to $U^q_\gamma(\gamma) = 0$.

Next we estimate $\beta_0$ given $\hat\gamma_q$. For this, we also define $N^{(1)q}_i(t) = (1 - \delta^q_{1i}) I(U^q_i \le t)$ and $N^{(2)q}_i(t) = \delta^q_{3i} I(V^q_i \le t)$ for $i = 1, 2, \ldots, n$, $q = 1, 2, \ldots, Q$, and for $j = 0, 1$, let

$$S^{(j)}_{1,\beta,q}(t, \beta, \gamma) = \frac{1}{n} \sum_{i=1}^n I(t \le U^q_i)\, e^{-\beta^\top Z^{q*}_i(t) + \gamma^\top Z^q_i(t)}\, \big(Z^{q*}_i(t)\big)^{\otimes j},$$

$$S^{(j)}_{2,\beta,q}(t, \beta, \gamma) = \frac{1}{n} \sum_{i=1}^n I(U^q_i < t \le V^q_i)\, e^{-\beta^\top Z^{q*}_i(t) + \gamma^\top Z^q_i(t)}\, \big(Z^{q*}_i(t)\big)^{\otimes j},$$

where $Z^{q*}_i(t) = \int_0^t Z^q_i(s)\,ds$. We propose the estimating equation $U^q_\beta(\beta, \hat\gamma_q) = 0$, where $U^q_\beta(\beta, \gamma)$ is defined as

$$U^q_\beta(\beta, \gamma) = \sum_{i=1}^n (1 - \delta^q_{1i}) \Big( Z^{q*}_i(U^q_i) - \frac{S^{(1)}_{1,\beta,q}(U^q_i, \beta, \gamma)}{S^{(0)}_{1,\beta,q}(U^q_i, \beta, \gamma)} \Big) + \sum_{i=1}^n \delta^q_{3i} \Big( Z^{q*}_i(V^q_i) - \frac{S^{(1)}_{2,\beta,q}(V^q_i, \beta, \gamma)}{S^{(0)}_{2,\beta,q}(V^q_i, \beta, \gamma)} \Big).$$

Then we can estimate $\beta_0$ by $\hat\beta_q$, defined as the root of $U^q_\beta(\beta, \hat\gamma_q) = 0$. Furthermore, Wang et al. [2] showed that $\sqrt{n}(\hat\beta_q - \beta_0)$ is asymptotically normal with mean zero, and that the covariance matrix of $\hat\beta_q$ can be consistently estimated by $\hat\Sigma_q := (\hat A^q_\beta)^{-1} \hat\Gamma^q (\hat A^q_\beta)^{-1} / n$, where $\hat A^q_\beta$ and $\hat\Gamma^q$ are defined in the Appendix; in particular, $\hat\beta_q$ is consistent.

Since averaging reduces the variability introduced by the random choice of one subject per cluster, after repeating this procedure $Q$ times the WCR estimator for $\beta_0$ is constructed as the average of the $Q$ resample-based estimators, that is,

$$\hat\beta_{wcr} = \frac{1}{Q} \sum_{q=1}^Q \hat\beta_q.$$

Under some regularity conditions, it can be shown that $\sqrt{n}(\hat\beta_{wcr} - \beta_0)$ converges in distribution to a zero-mean normal random vector, and the variance-covariance matrix of $\hat\beta_{wcr}$ can be consistently estimated by

$$\hat\Sigma_{wcr} = \frac{1}{Q} \sum_{q=1}^Q \hat\Sigma_q - \frac{1}{Q} \sum_{q=1}^Q (\hat\beta_q - \hat\beta_{wcr})(\hat\beta_q - \hat\beta_{wcr})^\top.$$

The proof of this result is sketched in the Appendix.

4. Simulation

An extensive simulation study was conducted to assess the finite sample performance of the estimates proposed in the previous sections. For simplicity, we consider only the noninformative case here. In the simulation study, the covariates $Z_{ij}$ were generated from the Bernoulli distribution $B(1, .5)$. Given the $Z_{ij}$'s, the failure times of interest were assumed to follow model (1) with $\lambda_0(t) = 2$ or $\lambda_0(t) = 4$, and the observation times $U_{ij}$'s and $V_{ij}$'s were generated from (2) and (3) with $\lambda_1(t) = 4$, $\lambda_2(t) = 2$ or $\lambda_1(t) = 8$, $\lambda_2(t) = 4$. The cluster size $n_i$ was randomly generated from the uniform distribution $U\{1, 2, 3, 4, 5, 6, 7\}$. The results given below are based on 4 replications with $Q = 4$ resamples and the number of clusters $n = 2$ or $4$.
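The data-generating design described above, together with one within-cluster resampling draw, can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names `simulate_clusters` and `draw_wcr_resample` are hypothetical, the covariate is a single time-independent Bernoulli variable, and the baselines are constants, so the failure time and the gap times are exponential. The number of clusters is left as a required argument instead of being fixed.

```python
import math
import random
from collections import defaultdict


def simulate_clusters(n_clusters, beta, gamma, lam0=2.0, lam1=4.0, lam2=2.0,
                      max_size=7, seed=None):
    """Simulate clustered case II interval-censored data (illustrative sketch).

    Assumes constant baselines and a time-independent binary covariate, so
    model (1) gives an exponential failure time with rate lam0 + beta * z.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # Cluster size drawn uniformly from {1, ..., max_size}.
        for _ in range(rng.randint(1, max_size)):
            z = float(rng.random() < 0.5)           # Bernoulli(0.5) covariate
            hazard_t = lam0 + beta * z              # additive hazard, model (1)
            if hazard_t <= 0:
                raise ValueError("additive hazard must stay positive")
            t = rng.expovariate(hazard_t)           # unobserved failure time
            rel = math.exp(gamma * z)               # Cox-type relative risk
            u = rng.expovariate(lam1 * rel)         # first monitoring time, model (2)
            v = u + rng.expovariate(lam2 * rel)     # second one starts after u, model (3)
            d1 = int(t < u)
            d2 = int(u <= t < v)
            rows.append({"cluster": cluster, "z": z, "u": u, "v": v,
                         "d1": d1, "d2": d2, "d3": 1 - d1 - d2})
    return rows


def draw_wcr_resample(rows, rng):
    """Pick one subject uniformly at random from every cluster."""
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row)
    return [rng.choice(members) for _, members in sorted(by_cluster.items())]
```

Calling `draw_wcr_resample` $Q$ times on the same simulated data gives the $Q$ resampled data sets; each has exactly one row per cluster and can be analyzed as independent data. The failure time `t` is used only to build the indicators and is not stored, mirroring the fact that it is never observed.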
Table 1 and Table 2 present the results on estimation of $(\gamma_0, \beta_0)$ with true values $(\gamma_0, \beta_0) = (0, 0)$, $(0, .2)$, $(0, -.2)$, $(.2, 0)$, $(.2, .2)$ or $(.2, -.2)$. The results include the estimated biases (BIAS), given by the averages of the point estimates minus the true values, the averages of the standard error estimates (SEE), the sampling standard errors of the point estimates (SSE) and the 95% empirical coverage probabilities (CP). The results indicate that the proposed estimate seems to be approximately unbiased, that the proposed variance estimate seems to be reasonable, and that all estimates improve as the sample size increases.

Table 1  Simulation results for estimation of $\beta_0$ and $\gamma_0$ with $\lambda_0 = 2$, $\lambda_1 = 4$, $\lambda_2 = 2$

(γ0, β0) | Param. | n = 2: BIAS | SEE | SSE | CP | n = 4: BIAS | SEE | SSE | CP
(0, 0) | γ | .1 | .63 | .61 | .9575 | .1 | .436 | .427 | .9475
(0, 0) | β | -.15 | .236 | .2381 | .9475 | .2 | .174 | .178 | .9475
(0, .2) | γ | -.29 | .58 | .62 | .955 | .37 | .424 | .43 | .9475
(0, .2) | β | .245 | .2485 | .2442 | .9375 | -.2 | .1767 | .178 | .945
(0, -.2) | γ | .5 | .619 | .64 | .9475 | .1 | .436 | .427 | .9475
(0, -.2) | β | .34 | .2463 | .2283 | .9375 | .4 | .1642 | .1639 | .9425
(.2, 0) | γ | -.3 | .612 | .66 | .95 | -.18 | .418 | .429 | .9475
(.2, 0) | β | .61 | .2447 | .245 | .945 | .37 | .176 | .1734 | .9475
(.2, .2) | γ | .26 | .64 | .611 | .94 | -.8 | .431 | .43 | .945
(.2, .2) | β | .5 | .2654 | .2556 | .94 | .27 | .188 | .1819 | .945
(.2, -.2) | γ | -.3 | .612 | .66 | .95 | -.8 | .431 | .43 | .945
(.2, -.2) | β | .12 | .2282 | .2316 | .95 | -.48 | .1681 | .1667 | .95

Table 2  Simulation results for estimation of $\beta_0$ and $\gamma_0$ with $\lambda_0 = 4$, $\lambda_1 = 8$, $\lambda_2 = 4$

(γ0, β0) | Param. | n = 2: BIAS | SEE | SSE | CP | n = 4: BIAS | SEE | SSE | CP
(0, 0) | γ | -.17 | .635 | .62 | .94 | .1 | .436 | .427 | .9475
(0, 0) | β | -.21 | .5167 | .4735 | .935 | .41 | .341 | .3417 | .9475
(0, .2) | γ | -.17 | .63 | .62 | .94 | .1 | .436 | .427 | .9475
(0, .2) | β | -.17 | .5323 | .4837 | .9375 | .81 | .3491 | .3492 | .9475
(0, -.2) | γ | -.47 | .612 | .63 | .955 | .37 | .424 | .43 | .9475
(0, -.2) | β | .135 | .487 | .4626 | .9425 | -.163 | .334 | .3317 | .9475
(.2, 0) | γ | .26 | .64 | .611 | .94 | -.8 | .436 | .431 | .945
(.2, 0) | β | -.12 | .5143 | .4891 | .945 | .27 | .3459 | .348 | .945
(.2, .2) | γ | -.3 | .612 | .66 | .95 | -.18 | .418 | .429 | .9475
(.2, .2) | β | .172 | .536 | .4913 | .9425 | -.22 | .361 | .3549 | .9425
(.2, -.2) | γ | -.3 | .612 | .66 | .95 | -.8 | .431 | .432 | .945
(.2, -.2) | β | .66 | .4712 | .4721 | .94 | .11 | .3396 | .344 | .9475

For comparison, we also consider the correlated failure times model used in [4], that is,

$$\lambda_{ij}(t \mid Z_{ij}, b_i) = \lambda_0(t) + \beta_0^\top Z_{ij} + b_i \tag{6}$$

with $\lambda_0(t) = 2$. The latent variables $b_i$'s were assumed to follow a normal distribution with zero mean and variance equal to 1/4. The covariates $Z_{ij}$'s were generated from the Bernoulli distribution with success probability $p = .5$. The monitoring variables $U_{ij}$'s and $V_{ij}$'s were generated from (2) and (3) with $\lambda_1(t) = 4$ and $\lambda_2(t) = 2$. The cluster size $n_i$ was generated from the uniform distribution $U\{2, 3, 4\}$ and the number of clusters was $n = 2$. The results are based on 1 replications, and the WCR method uses $Q = 4$ resamples for each step. The true regression parameter $\gamma_0$ was taken to be $0$ or $.25$, and $\beta_0$ was $0$, $.25$ or $-.25$. The simulated results are listed in Table 3; all the results listed under "Li, Wang and Sun" are taken directly from [4]. These results indicate that the proposed procedure performs better than the method given by [4]: it seems to give smaller biases and standard errors. A possible explanation is that the individuals within a cluster are related and the WCR method handles this correlation differently from the method given by [4], which makes the WCR method more effective in this setting.

Table 3  Comparison of simulation results for estimation with $\lambda_0 = 2$, $\lambda_1 = 4$, $\lambda_2 = 2$, $n = 2$

(γ0, β0) | Param. | Li, Wang and Sun: BIAS | SEE | SSE | CP | WCR: BIAS | SEE | SSE | CP
(0, 0) | γ | .19 | .116 | .181 | .948 | .2 | .62 | .597 | .951
(0, 0) | β | .315 | .884 | .8273 | .975 | -.47 | .246 | .2298 | .937
(0, .25) | γ | .73 | .112 | .165 | .946 | .2 | .62 | .597 | .951
(0, .25) | β | -.225 | .8835 | .833 | .973 | .3 | .2618 | .243 | .936
(0, -.25) | γ | -.51 | .1126 | .11 | .952 | -.5 | .59 | .595 | .95
(0, -.25) | β | .172 | .8776 | .8255 | .938 | -.55 | .2257 | .2191 | .941
(.25, 0) | γ | .35 | .1154 | .1126 | .941 | .7 | .594 | .63 | .946
(.25, 0) | β | .85 | .961 | .915 | .94 | .44 | .2454 | .2393 | .943
(.25, .25) | γ | .122 | .1138 | .117 | .948 | .7 | .594 | .63 | .946
(.25, .25) | β | .731 | .9634 | .9133 | .971 | .17 | .269 | .2526 | .942
(.25, -.25) | γ | -.36 | .1177 | .1151 | .941 | .7 | .594 | .63 | .946
(.25, -.25) | β | .992 | .9643 | .9112 | .963 | -.13 | .239 | .2271 | .952

Finally, it can be seen from Tables 1-3 that the estimate of $\gamma_0$ has a smaller standard error than that of $\beta_0$ in all settings. This is because completely observed data (the monitoring times) can be used to estimate $\gamma_0$, while only incompletely observed data are available for estimating $\beta_0$.
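The WCR combination rule of Section 3 and the summary measures reported in Tables 1-3 can be sketched as below for a scalar parameter. This is a hedged illustration, not the authors' implementation: `wcr_combine` and `summarize` are hypothetical names, the coverage probability assumes a Wald interval (estimate ± 1.96 × standard error), and the sampling standard error uses the sample standard deviation with divisor R − 1; the paper does not state either choice.

```python
import statistics


def wcr_combine(estimates, variances):
    """Combine Q resample-based estimates into the WCR estimate and its variance.

    estimates: the Q values of beta_hat_q (scalar case).
    variances: the Q values of Sigma_hat_q (scalar case).
    Returns (beta_wcr, sigma_wcr) where sigma_wcr is the mean of the
    per-resample variances minus the between-resample variance.
    """
    q = len(estimates)
    beta_wcr = sum(estimates) / q
    mean_variance = sum(variances) / q
    between = sum((b - beta_wcr) ** 2 for b in estimates) / q
    return beta_wcr, mean_variance - between


def summarize(point_estimates, standard_errors, true_value, z=1.96):
    """Bias, SEE, SSE and empirical coverage over simulation replications."""
    bias = statistics.mean(point_estimates) - true_value
    see = statistics.mean(standard_errors)      # average estimated standard error
    sse = statistics.stdev(point_estimates)     # sampling standard error (assumed divisor R - 1)
    covered = [abs(est - true_value) <= z * se  # assumed Wald-type 95% interval
               for est, se in zip(point_estimates, standard_errors)]
    return {"bias": bias, "see": see, "sse": sse,
            "cp": sum(covered) / len(covered)}
```

In finite samples the difference returned by `wcr_combine` can be negative, since it subtracts one estimated quantity from another; the sketch returns it unchanged so that this is visible to the caller.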

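Appendix: Proofs of the Asymptotic Normality of $\hat\beta_{wcr}$

Proof. For $i = 1, 2, \ldots, n$, we first define

$$M^{(1)q}_i(t) = N^{(1)q}_i(t) - \int_0^t I(s \le U^q_i)\, \lambda^*_1(s)\, e^{-\beta_0^\top Z^{q*}_i(s) + \gamma_0^\top Z^q_i(s)}\,ds,$$

$$M^{(2)q}_i(t) = N^{(2)q}_i(t) - \int_0^t I(U^q_i < s \le V^q_i)\, \lambda^*_2(s)\, e^{-\beta_0^\top Z^{q*}_i(s) + \gamma_0^\top Z^q_i(s)}\,ds,$$

$$\tilde M^{(1)q}_i(t) = \tilde N^{(1)q}_i(t) - \int_0^t I(s \le U^q_i)\, \lambda_1(s)\, e^{\gamma_0^\top Z^q_i(s)}\,ds,$$

$$\tilde M^{(2)q}_i(t) = \tilde N^{(2)q}_i(t) - \int_0^t I(U^q_i < s \le V^q_i)\, \lambda_2(s)\, e^{\gamma_0^\top Z^q_i(s)}\,ds,$$

which are martingales.

Since $\hat\beta_q$ is the solution of the estimating equation $U^q_\beta(\beta, \hat\gamma_q) = 0$, a Taylor expansion gives

$$-U^q_\beta(\beta_0, \hat\gamma_q) = U^q_\beta(\hat\beta_q, \hat\gamma_q) - U^q_\beta(\beta_0, \hat\gamma_q) = \frac{\partial U^q_\beta(\beta^*, \hat\gamma_q)}{\partial \beta}\, (\hat\beta_q - \beta_0),$$

where $\beta^*$ is on the line segment between $\hat\beta_q$ and $\beta_0$. Rewriting the above equation yields

$$\sqrt{n}(\hat\beta_q - \beta_0) = -\Big( \frac{1}{n} \frac{\partial U^q_\beta(\beta^*, \hat\gamma_q)}{\partial \beta} \Big)^{-1} \Big( \frac{1}{\sqrt{n}}\, U^q_\beta(\beta_0, \hat\gamma_q) \Big).$$

Write $S^{(2)}_{l,\beta,q}$ for the quantity defined like $S^{(j)}_{l,\beta,q}(t, \beta, \gamma)$ with $j = 2$ and $a^{\otimes 2} = a a^\top$, and suppress the arguments $(t, \beta, \gamma)$. Note that $n^{-1}\, \partial U^q_\beta(\beta, \gamma) / \partial \beta$ is equal to

$$\frac{1}{n} \sum_{i=1}^n \int_0^\tau \Big( \frac{S^{(2)}_{1,\beta,q}}{S^{(0)}_{1,\beta,q}} - \frac{(S^{(1)}_{1,\beta,q})^{\otimes 2}}{(S^{(0)}_{1,\beta,q})^2} \Big)\, dN^{(1)q}_i(t) + \frac{1}{n} \sum_{i=1}^n \int_0^\tau \Big( \frac{S^{(2)}_{2,\beta,q}}{S^{(0)}_{2,\beta,q}} - \frac{(S^{(1)}_{2,\beta,q})^{\otimes 2}}{(S^{(0)}_{2,\beta,q})^2} \Big)\, dN^{(2)q}_i(t)$$

$$= \frac{1}{n} \sum_{i=1}^n \int_0^\tau \frac{\big( Z^{q*}_i(t) - \bar Z^q_{(1)}(\beta, \gamma, t) \big)^{\otimes 2}\, I(U^q_i \ge t)\, e^{-\beta^\top Z^{q*}_i(t) + \gamma^\top Z^q_i(t)}}{S^{(0)}_{1,\beta,q}}\, d\bar N^{(1)q}(t)$$

$$\quad + \frac{1}{n} \sum_{i=1}^n \int_0^\tau \frac{\big( Z^{q*}_i(t) - \bar Z^q_{(2)}(\beta, \gamma, t) \big)^{\otimes 2}\, I(U^q_i < t \le V^q_i)\, e^{-\beta^\top Z^{q*}_i(t) + \gamma^\top Z^q_i(t)}}{S^{(0)}_{2,\beta,q}}\, d\bar N^{(2)q}(t),$$

where

$$\bar Z^q_{(1)}(\beta, \gamma, t) = \frac{S^{(1)}_{1,\beta,q}}{S^{(0)}_{1,\beta,q}}, \quad \bar Z^q_{(2)}(\beta, \gamma, t) = \frac{S^{(1)}_{2,\beta,q}}{S^{(0)}_{2,\beta,q}}, \quad \bar N^{(1)q}(t) = \frac{1}{n} \sum_{i=1}^n N^{(1)q}_i(t), \quad \bar N^{(2)q}(t) = \frac{1}{n} \sum_{i=1}^n N^{(2)q}_i(t).$$

It can be easily seen that $n^{-1}\, \partial U^q_\beta(\beta^*, \hat\gamma_q) / \partial \beta$ is positive definite and that $n^{-1}\, \partial U^q_\beta(\beta_0, \gamma_0) / \partial \beta$ converges in probability to a deterministic and positive definite matrix denoted by $A_\beta$, which can be consistently estimated by $\hat A^q_\beta := n^{-1}\, \partial U^q_\beta(\hat\beta_q, \hat\gamma_q) / \partial \beta$. Averaging over the $q = 1, 2, \ldots, Q$ resamples yields

$$\sqrt{n}(\hat\beta_{wcr} - \beta_0) = \frac{1}{Q} \sum_{q=1}^Q \sqrt{n}(\hat\beta_q - \beta_0) = -\frac{1}{Q} \sum_{q=1}^Q \Big( \frac{1}{n} \frac{\partial U^q_\beta(\beta^*, \hat\gamma_q)}{\partial \beta} \Big)^{-1} \Big( \frac{1}{\sqrt{n}}\, U^q_\beta(\beta_0, \hat\gamma_q) \Big) = -A_\beta^{-1}\, \frac{1}{\sqrt{n}\, Q} \sum_{q=1}^Q U^q_\beta(\beta_0, \hat\gamma_q) + o_p(1).$$

Using Taylor series expansions of $U^q_\beta(\beta_0, \hat\gamma_q)$ and $U^q_\gamma(\hat\gamma_q)$ around $\gamma_0$, we have

$$U^q_\beta(\beta_0, \hat\gamma_q) - U^q_\beta(\beta_0, \gamma_0) = \frac{\partial U^q_\beta(\beta_0, \gamma_t)}{\partial \gamma_t}\, (\hat\gamma_q - \gamma_0),$$

$$-U^q_\gamma(\gamma_0) = U^q_\gamma(\hat\gamma_q) - U^q_\gamma(\gamma_0) = \frac{\partial U^q_\gamma(\gamma_s)}{\partial \gamma_s}\, (\hat\gamma_q - \gamma_0),$$

where both $\gamma_t$ and $\gamma_s$ are on the line segment between $\hat\gamma_q$ and $\gamma_0$. By the consistency of $\hat\gamma_q$, rewriting the above equations shows that $n^{-1/2}\, U^q_\beta(\beta_0, \hat\gamma_q)$ is equal to

$$\frac{1}{\sqrt{n}} \Big\{ U^q_\beta(\beta_0, \gamma_0) + \frac{\partial U^q_\beta(\beta_0, \gamma_t)}{\partial \gamma_t}\, (\hat\gamma_q - \gamma_0) \Big\}$$

$$= \frac{1}{\sqrt{n}} \Big\{ U^q_\beta(\beta_0, \gamma_0) + \Big( \frac{1}{n} \frac{\partial U^q_\beta(\beta_0, \gamma_t)}{\partial \gamma_t} \Big) \Big( -\frac{1}{n} \frac{\partial U^q_\gamma(\gamma_s)}{\partial \gamma_s} \Big)^{-1} U^q_\gamma(\gamma_0) \Big\}$$

$$= \frac{1}{\sqrt{n}} \big\{ U^q_\beta(\beta_0, \gamma_0) + A^q_\gamma (B^q_\gamma)^{-1} U^q_\gamma(\gamma_0) \big\} + o_p(1)$$

$$:= \frac{1}{\sqrt{n}} \sum_{i=1}^n \big\{ a^q_{1,i}(\beta_0, \gamma_0) + a^q_{2,i}(\beta_0, \gamma_0) + A^q_\gamma (B^q_\gamma)^{-1} \big( b^q_{1,i}(\gamma_0) + b^q_{2,i}(\gamma_0) \big) \big\} + o_p(1),$$

where

$$a^q_{1,i}(\beta, \gamma) = \int_0^\tau \Big( Z^{q*}_i(t) - \frac{s^{(1)}_{1,\beta,q}}{s^{(0)}_{1,\beta,q}} \Big)\, dM^{(1)q}_i(t), \qquad a^q_{2,i}(\beta, \gamma) = \int_0^\tau \Big( Z^{q*}_i(t) - \frac{s^{(1)}_{2,\beta,q}}{s^{(0)}_{2,\beta,q}} \Big)\, dM^{(2)q}_i(t),$$

$$b^q_{1,i}(\gamma) = \int_0^\tau \Big( Z^q_i(t) - \frac{s^{(1)}_{1,\gamma,q}(t, \gamma)}{s^{(0)}_{1,\gamma,q}(t, \gamma)} \Big)\, d\tilde M^{(1)q}_i(t), \qquad b^q_{2,i}(\gamma) = \int_0^\tau \Big( Z^q_i(t) - \frac{s^{(1)}_{2,\gamma,q}(t, \gamma)}{s^{(0)}_{2,\gamma,q}(t, \gamma)} \Big)\, d\tilde M^{(2)q}_i(t),$$

$A^q_\gamma$ and $B^q_\gamma$ are the limits of $\hat A^q_\gamma(\beta, \gamma) = n^{-1}\, \partial U^q_\beta(\beta, \gamma) / \partial \gamma$ and $\hat B^q_\gamma(\gamma) = -n^{-1}\, \partial U^q_\gamma(\gamma) / \partial \gamma$ at $(\beta_0, \gamma_0)$, and $s^{(j)}_{l,\beta,q}$ and $s^{(j)}_{l,\gamma,q}(t, \gamma)$ denote the limits of $S^{(j)}_{l,\beta,q}$ and $S^{(j)}_{l,\gamma,q}(t, \gamma)$, respectively, for $l = 1, 2$ and $j = 0, 1$. Note that

$$U^q_\beta(\beta_0, \gamma_0) = \sum_{i=1}^n \big\{ a^q_{1,i}(\beta_0, \gamma_0) + a^q_{2,i}(\beta_0, \gamma_0) \big\} + o_p(1),$$

$$U^q_\gamma(\gamma_0) = \sum_{i=1}^n \big\{ b^q_{1,i}(\gamma_0) + b^q_{2,i}(\gamma_0) \big\} + o_p(1).$$

It is easy to show that $(\sqrt{n}\, Q)^{-1} \sum_{q=1}^Q U^q_\beta(\beta_0, \hat\gamma_q)$ converges to a normal distribution as $n \to \infty$ by changing the order of summation:

$$\frac{1}{\sqrt{n}\, Q} \sum_{q=1}^Q U^q_\beta(\beta_0, \hat\gamma_q) = \frac{1}{\sqrt{n}\, Q} \sum_{q=1}^Q \big\{ U^q_\beta(\beta_0, \gamma_0) + A^q_\gamma (B^q_\gamma)^{-1} U^q_\gamma(\gamma_0) \big\} + o_p(1)$$

$$= \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{1}{Q} \sum_{q=1}^Q \big\{ a^q_{1,i}(\beta_0, \gamma_0) + a^q_{2,i}(\beta_0, \gamma_0) + A^q_\gamma (B^q_\gamma)^{-1} \big( b^q_{1,i}(\gamma_0) + b^q_{2,i}(\gamma_0) \big) \big\} + o_p(1)$$

$$:= \frac{1}{\sqrt{n}} \sum_{i=1}^n U_i(\beta_0, \gamma_0) + o_p(1),$$

where $U_i(\beta_0, \gamma_0)$, $i = 1, 2, \ldots, n$, are independent with mean zero and finite variance. It thus follows from the multivariate Central Limit Theorem that $(\sqrt{n}\, Q)^{-1} \sum_{q=1}^Q U^q_\beta(\beta_0, \hat\gamma_q)$ is asymptotically normal with zero mean. Combining this with Slutsky's theorem, $\sqrt{n}(\hat\beta_{wcr} - \beta_0)$ converges in distribution to a zero-mean normal random vector whose covariance matrix can be consistently estimated by $n \hat\Sigma_{wcr}$.

Wang et al. [2] showed that $\sqrt{n}(\hat\beta_q - \beta_0)$ can be asymptotically approximated by a normal vector with mean zero and a covariance matrix that can be consistently estimated by $n \hat\Sigma_q = (\hat A^q_\beta)^{-1} \hat\Gamma^q (\hat A^q_\beta)^{-1}$, where

$$\hat\Gamma^q = \frac{1}{n} \sum_{i=1}^n \hat\alpha^q_i(\hat\beta_q, \hat\gamma_q)\, \big( \hat\alpha^q_i(\hat\beta_q, \hat\gamma_q) \big)^\top$$

with

$$\hat\alpha^q_i(\hat\beta_q, \hat\gamma_q) = \hat a^q_{1,i}(\hat\beta_q, \hat\gamma_q) + \hat a^q_{2,i}(\hat\beta_q, \hat\gamma_q) + \hat A^q_\gamma(\hat\beta_q, \hat\gamma_q)\, \big( \hat B^q_\gamma(\hat\gamma_q) \big)^{-1} \big\{ \hat b^q_{1,i}(\hat\gamma_q) + \hat b^q_{2,i}(\hat\gamma_q) \big\}.$$

For each resampled data set, $\mathrm{Var}(\hat\beta_q)$ can be consistently estimated by $\hat\Sigma_q$. Averaging over the $Q$ resamples, the resulting estimator $Q^{-1} \sum_{q=1}^Q \hat\Sigma_q$ is also consistent. To obtain a consistent estimator of the covariance matrix of $\hat\beta_{wcr}$, similarly to [11], we first write

$$\mathrm{Var}(\hat\beta_q) = E\big( \mathrm{Var}(\hat\beta_q \mid \text{data}) \big) + \mathrm{Var}\big( E(\hat\beta_q \mid \text{data}) \big).$$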

Since $E(\hat\beta_q \mid \text{data}) = \hat\beta_{wcr}$, it follows that

$$\mathrm{Var}(\hat\beta_{wcr}) = \mathrm{Var}(\hat\beta_q) - E\big( \mathrm{Var}(\hat\beta_q \mid \text{data}) \big).$$

Since

$$E\big( \mathrm{Var}(\hat\beta_q \mid \text{data}) \big) = E\Big( \frac{1}{Q} \sum_{q=1}^Q (\hat\beta_q - \hat\beta_{wcr})(\hat\beta_q - \hat\beta_{wcr})^\top \Big),$$

it can be estimated by the covariance matrix based on the $Q$ resampling estimators $\hat\beta_q$, $q = 1, 2, \ldots, Q$, that is,

$$\hat\Omega = \frac{1}{Q} \sum_{q=1}^Q (\hat\beta_q - \hat\beta_{wcr})(\hat\beta_q - \hat\beta_{wcr})^\top.$$

Thus the estimated variance-covariance matrix of $\hat\beta_{wcr}$ is

$$\hat\Sigma_{wcr} = \frac{1}{Q} \sum_{q=1}^Q \hat\Sigma_q - \frac{1}{Q} \sum_{q=1}^Q (\hat\beta_q - \hat\beta_{wcr})(\hat\beta_q - \hat\beta_{wcr})^\top.$$

To show the consistency of $\hat\Sigma_{wcr}$, it suffices to see that $\hat\Omega - E(\hat\Omega) \to 0$ in probability as $n \to \infty$. This can be shown by applying the same arguments as those in the proof of [9]. This completes the proof.

Acknowledgements

The authors gratefully acknowledge the recommendations of the associate editor and the reviewers, which led to an improved revision of an earlier manuscript.

References

[1] Zeng D L, Cai J W, Shen Y. Semiparametric additive risks model for interval-censored data [J]. Statist. Sinica, 2006, 16(1): 287-302.
[2] Wang L M, Sun J G, Tong X W. Regression analysis of case II interval-censored failure time data with the additive hazards model [J]. Statist. Sinica, 2010, 20(4): 1709-1723.
[3] Cox D R. Regression models and life-tables (with discussion) [J]. J. Roy. Statist. Soc. Ser. B, 1972, 34(2): 187-220.
[4] Li J L, Wang C J, Sun J G. Regression analysis of clustered interval-censored failure time data with the additive hazards model [J]. J. Nonparametr. Stat., 2012, 24(4): 1041-1050.
[5] Li J L, Tong X W, Sun J G. Sieve estimation for the Cox model with clustered interval-censored failure time data [J]. Statist. Biosci., 2014, 6(1): 55-72.
[6] Kor C T, Cheng K F, Chen Y H. A method for analyzing clustered interval-censored data based on Cox's model [J]. Stat. Med., 2013, 32(5): 822-832.
[7] Lin D Y, Oakes D, Ying Z L. Additive hazards regression with current status data [J]. Biometrika, 1998, 85(2): 289-298.
[8] Kulich M, Lin D Y. Additive hazards regression for case-cohort studies [J]. Biometrika, 2000, 87(1): 73-87.
[9] Cong X J, Yin G S, Shen Y. Marginal analysis of correlated failure time data with informative cluster sizes [J]. Biometrics, 2007, 63(3): 663-672.
[10] Lin D Y, Ying Z L. Semiparametric analysis of the additive risk model [J]. Biometrika, 1994, 81(1): 61-71.
[11] Hoffman E B, Sen P K, Weinberg C R. Within-cluster resampling [J]. Biometrika, 2001, 88(4): 1121-1134.