A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?

Size: px

Start display at page:

Download "A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?"

Eunice Dean
5 years ago
Views:

1 A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood stimators Become Badly Inefficient? Jon A. Wellner 1, Ying Zhang 2, and Hao Liu Department of Statistics, University of Washington, Department of Statistics, University of Central Florida, and Department of Biostatistics, University of Washington October 2, 2001 Abstract We consider estimation in a particular semiparametric regression model for the mean of a counting process under the assumption of panel count data. The basic model assumption is that the conditional mean function of the counting process is of the form Nt Z} expθ ZΛt where Z is a vector of covariates and Λ is the baseline mean function. The panel count observation scheme involves observation of the counting process N for an individual at a random number K of random time points; both the number and the locations of these time points may differ across individuals. We study maximum pseudo-likelihood and maximum likelihood estimators and θ n of the regression parameter θ. The pseudo-likelihood estimators are fairly easy to compute, while the full maximum likelihood estimators pose more challenges from the computational perspective. We derive expressions for the asymptotic variances of both estimators under the proportional mean model. Our primary aim is to understand when the pseudo-likelihood estimators have very low efficiency relative to the full maximum likelihood estimators. The upshot is that the pseudo-likelihood estimators can have arbitrarily small efficiency relative to the full maximum likelihood estimators when the distribution of K, the number of observation time points per individual, is very heavy-tailed. θ ps n 1 Research supported in part by National Science Foundation grant DMS , NIAID grant 2R01 AI , and the Stieltjes Institute 2 Research supported in part by National Science Foundation grant DMS AMS 2000 subject classifications. Primary: 60F05, 60F17; secondary 60J65, 60J70. Key words and phrases. asymptotic distributions, asymptotic efficiency, consistency, counting process, empirical processes, information bounds, maximum likelihood, missing data, poisson process, pseudo-likelihood estimators, monotone function,. 1

2 Outline Section 1: Introduction: panel count data with covariates Section 2: Two Methods of stimation. Section 3: Information bounds for θ under the Poisson model. Section 4: Asymptotic normality of the two estimators of θ. Section 5: Comparisons: maximum pseudo-likelihood versus maximum likelihood. Section 6: Conclusions and Further Problems. Bibliography: 2

3 1 Introduction Suppose that N Nt :t 0}is a univariate counting process. In many applications, it is important to estimate the expected number of events Nt Z} which will occur by the time t, conditionally on a covariate vector Z. In this paper we consider the proportional mean regression model given by Λt Z Nt Z} e θ Z Λt, 1.1 where the monotone increasing function Λ is the baseline mean function. The parameters of primary interest are θ and Λ. The observation scheme we want to study is as follows: suppose that we observe the counting process N at a random number K of random times 0 T K,0 <T K,1 < <T K,K. We write T K T K,1,...,T K,K, and we assume that K, T K Z G Z is conditionally independent of the counting process N given the covariate vector Z. We further assume that Z H on R d, but we will make no further assumptions about G or H modulo mild integrability and boundedness requirements. The data for each individual will consist of X Z, K, T K, NT K,1,...,NT K,K Z, K, T K, N K. 1.2 We will assume that the data consist of X 1,...,X n i.i.d. as X. Panel count data arise in many fields including demographic studies, industrial reliability, and clinical trials; see for example Kalbfleisch and Lawless 1985, Gaver and O Muircheartaigh 1987, Thall and Lachin 1988, Thall 1988, Sun and Kalbfleisch 1995, andwellner and Zhang 2000 where the estimation of either the intensity of event recurrence or the mean function of a counting process with panel count data was studied. Many applications involve covariates whose effects on the underlying counting process are of interest. While there is considerable work on regression modeling for recurrent events based on continuous observations see, for example Lawless and Nadeau 1995, Cook, Lawless, and Nadeau 1996, and Lin, Wei, Yang, and Ying 2000, regression analysis with panel count data for counting processes has just started recently. Sun and Wei 2000 proposed estimating equation methods, while Zhang 1998 and Zhang 2001 proposed a pseudo-likelihood method for studying the multiplicative mean model 1.1 with panel count data. To derive useful estimators for this model we will often assume, in addition to 1.1, that the counting process N, conditionally on Z, is a non-homogeneous Poisson process. But our general perspective will be to study the estimators and other procedures when the Poisson assumption fails to hold and we assume only that the proportional mean assumption 1.1 holds. Such a program was carried out for estimation of Λ without any covariates for this panel count observation model by Wellner and Zhang The outline of the rest of paper is as follows: In section 2, we describe two methods of estimation, namely maximum pseudo-likelihood estimators and maximum likelihood estimators of 3

4 θ and Λ. The basic picture is that the pseudo-likelihood estimators are computationally relatively straightforward and easy to implement, while the full, semiparametric maximum likelihood estimators are considerably more difficult, requiring an iterative algorithm in the computation of the profile likelihood. In section 3 we present information calculations for the semiparametric model described by the proportional mean function assumption 1.1 together with the non-homogeneous Poisson process assumption on N. This provides a baseline for comparisons of variances with the best possible asymptotic variance under the Poisson and proportional mean model assumptions. In section 4 we describe asymptotic normality results for the pseudo-likelihood and full maximum ps likelihood estimators θ n and θ n of θ assuming only the proportional mean structure 1.1, but not assuming that N is a Poisson process. Proofs of these results will be presented in detail in Liu, Wellner, and Zhang Finally, in section 4 we compare the pseudo-likelihood and full likelihood estimators of θ under three different scenarios with the goal of determining situations under which the pseudo-likelihood estimators will lose considerable efficiency relative to the full maximum likelihood estimators. As will be seen, the rough upshot of the calculations here is that the efficiency of the pseudolikelihood estimators relative to the full maximum likelihood estimators can be low when the distribution of K, the number of observation times per subject, is heavy-tailed. 2 Two Methods of stimation Maximum Pseudo-likelihood stimation: The natural pseudo-likelihood estimators for this model use the marginal distributions of N, conditional on Z, P Nt k Z Λt Zk exp Λt Z k! and ignore dependence between Nt 1, Nt 2 to obtain the pseudo-likelihood: l ps n θ, Λ n K i i1 } N i T i i K i,j log ΛT K i,j + Ni T i K i,j θ Z i e θ Z i ΛT i K i,j. Then the maximum pseudo-likelihood estimator θ n ps ps, Λ n ofθ, Λ is given by θ n ps ps, Λ n argmax θ,λ ln ps θ, Λ. This can be implemented in two steps via the usual pseudo- profile likelihood. For each fixed value of θ we set Λ ps n,θ argmax Λ ln ps θ, Λ, 2.3 and define Then θ ps n ln ps,profile θ ln ps ps θ, Λ n,θ. argmax θ ln ps,profile θ, and Λps ps ps n Λ n, θ n. 4

5 In fact, the optimization problem in 2.3 is easily solved as follows: Let t 1 <...<t m denote the ordered distinct observation time points in the collection of all observations times, T i K i,j,j 1,...,K i,i1,...,n},letn i K i,j Ni T i K i,j, and set w l n K i i1 A l θ 1 w l Then it is easily shown that 1 [T i K i,j t l], n K i i1 N l 1 w l n K i i1 expθ Z i 1 [T i K i,j t l]. N i K i,j 1 [T i K i,j t l], Λ ps n,θ left-derivative of Greatest Convex Minorant of l i w l A l θ, l i w l N l } m i1 i p max min w pn p i l j l i p w pa p θ at t l, which is straightforward to compute. Maximum Likelihood stimation: Under the assumption that N is conditionally, given Z a non-homogeneous Poisson process, the likelihood can be calculated using the conditional independence of the increments of N, Ns, t] Nt Ns, and the Poisson distribution of these increments: to obtain the log-likelihood: P Ns, t] k Z [ Λs, t] Z]k k! exp Λs, t] Z l n θ, Λ n K i i1 } N i K i,j log Λ K i,j + N i K i,j θ Z i e θ Z i Λ Ki,j where Then N Kj NT K,j NT K,j 1, j 1,...,K, Λ Kj ΛT K,j ΛT K,j 1, j 1,...,K. θ n, Λ n argmax θ,λ l n θ, Λ. This maximization can also be carried out in two steps via profile likelihood. For each fixed value of θ we set Λ n,θ argmax Λ l n θ, Λ, 5

6 and define l profile n θ l n θ, Λ n,θ. Then θ n argmax θ ln profile θ, and Λn Λ n, θ n. Computation of the profile estimator Λ n,θ is computationally involved, but possible via the iterative convex minorant algorithm; see e.g. Jongbloed For more on computation without covariates see Wellner and Zhang Information bounds for θ under the Poisson model. We first compute information bounds for estimation of θ under the proportional mean nonhomogeneous Poisson process model. Suppose that N Z PoissonΛ Z, and K, T K Z G Z are conditionally independent given Z. We will assume here that N is conditionally a nonhomogeneous Poisson process with conditional mean function [Nt Z] Λt Z e θ Z Λ 0 t. 3.1 The second equality expresses the proportional mean regression model assumption. The likelihood for one observation is, using the same notation introduced in Section 2, px; θ, Λ 0 K exp Λ Kj Λ Kj N Kj N Kj!. Thus the log-likelihood for θ, Λ 0 for one observation is given by log px; θ, Λ 0 N Kj log Λ Kj Λ Kj log N Kj!}. Differentiating this with respect to θ and Λ 0 respectively, the scores for θ and Λ are easily seen to be l θ x Z N Kj e θ Z Λ 0Kj, 3.2 while l Λ ax } NKj e θ Z a Kj Λ 0Kj N Kj e θ Z Λ 0Kj } akj Λ 0Kj, 6

7 where TK,j a Kj adλ 0, a L 2 Λ 0. T K,j 1 To compute the information bound for estimation of θ it follows from the results of Begun, Hall, Huang, and Wellner 1983 and Bickel, Klaassen, Ritov, and Wellner 1993 that we want to find a so that l θ l Λ a l Λ a for all a L 2 Λ 0 ; i.e. } 0 l θ l Λ a l Λ a N Kj e θ Z Λ 0Kj Z a Kj l Λ a Λ 0Kj N Kj e θ Z Λ 0Kj 2 Z a Kj akj Λ 0Kj Λ 0Kj e θ Z Λ 0Kj Z a Kj akj Λ 0Kj Λ 0Kj by conditioning on K, T K,Z K e θ Z Λ 0Kj Z a Kj Λ 0Kj K K akj } K Λ 0Kj e θ Z Λ 0Kj Z a Kj akj } K } K, TK,j 1,T K,j Λ 0Kj Λ 0Kj a } Kj Λ 0Kj Ze θ Z K,j 1,T K,j Λ 0Kj a } Kj e }} θ Z K K,j 1,T K,j. Λ 0Kj Thus we see that the desired orthogonality holds with Ze θ Z K,j 1,T K,j } a Kj Λ 0Kj e θ Z K,j 1,T K,j }. 3.3 Hence the efficient score function for θ is given by l θ x l θ x l Λ a x } Ze θ Z K,j 1,T K,j N Kj e θ Z Λ 0Kj Z e θ Z, K,j 1,T K,j } 7

8 and the information for θ is, by computing conditionally on Z, K, T K, Iθ 0 l θ X 2} } Ze θ Z 2 K,j 1,T K,j 0 e θ Z Λ 0Kj Z e θ Z K,j 1,T K,j }. In particular, we have the following corollary: Corollary 1. Current status data. If P K 1 1 so that the only T Kj of relevance is T 1,1 T while T 1,0 0, then the efficient score function is } Ze θ Z T l θ x l θ x l Λ a xnt e θ Z Λ 0 T Z e θ Z. T} and the information for θ is given by Iθ 0 l θ X 2} 0 eθ Z Λ 0 T Z } Ze θ Z 2 T e θ Z T}. This should be compared with the information for θ for the Cox proportional hazards model with current status data given by Huang 1996, page 547. Corollary 2. Case 2 Interval-censored data. If P K 2 1 so that the only T Kj s of relevance are T 2,1 T 1 and T 2,2 T 2, while T 2,0 0, then the efficient score function is l θ x l θ x l Λ a x } Ze θ Z T 1 NT 1 e θ Z Λ 0 T 1 Z e θ Z T 1 } } Ze θ Z T 1,T 2 +NT 2 NT 1 e θ Z Λ 0 T 2 Λ 0 T 1 Z e θ Z, T 1,T 2 } and the information for θ is given by Iθ 0 l θ X 2} } Ze θ Z 2 T 1 0 eθ Z Λ 0 T 1 Z e θ Z T 1 } } Ze θ Z 2 T 1,T eθ Z Λ 0 T 2 Λ 0 T 2 Z e θ Z T 1,T 2 }. 8

9 This should be compared with the information for θ for the Cox proportional hazards model with interval censored case II data given by Huang and Wellner Note that those calculations resulted in an integral equation to be solved, analogously to the results for the mean functional considered by Geskus and Groeneboom 1996, Geskus and Groeneboom 1997 and Geskus and Groeneboom Asymptotic normality of the two estimators of θ. Here is the crucial theorem concerning the asymptotic behavior of the maximum pseudo-likelihood and maximum likelihood estimators of θ when the proportional mean model holds, but the Poisson assumption concerning N may fail. Theorem 1. Under suitable regularity and integrability conditions, the estimators and θ n are asymptotically normal: n θn θ 0 d Z N d 0,A 1 B A 1, 4.1 θ ps n and where n θps n θ 0 d Z ps N d 0, A ps 1 B ps A ps Ze θ 0 Z 2 K,j,T K,j B m θ 0,Λ 0 ;X 2 C j,j Z Z e θ j,j 0 Z 1 K,j,T K,j, Ze θ 0 Z 2 K,j 1,T K,j A Λ 0Kj e θ 0 Z Z e θ 0 Z K,j 1,T K,j, C j,j Z Cov [ ] N Kj, N Kj Z, K, T K, B ps m ps θ 0, Λ 0 ; X 2 Ze θ C ps 0 Z K,j j,j Z Z Z e θ j,j 0 Z 1 K,j Ze θ 0 Z 2 K,j A ps Λ 0Kj e θ 0 Z Z e θ 0 Z K,j, C ps j,j Z Cov [ N Kj,N Kj Z, K, T K,j,T K,j ]. Ze θ 0 Z K,j e θ 0 Z K,j, 9

10 Our proof of this theorem is based on the results of Zhang While we will not give the proof in detail, we will present here as sketch of the computation of the asymptotic variances given in 4.1 and 4.2. Based on the Poisson model, the log-likelihood for θ 0, Λ 0 with one observation is given by mθ, Λ; X logpx;θ, Λ N Kj log Λ Kj Λ Kj log N Kj!} } N Kj log Λ 0Kj + N Kj θ Z e θ Z Λ 0Kj log N Kj!. 4.3 Thus the log-likelihood l n θ, Λ for n i.i.d. observations is given by l n θ, Λ np n mθ, Λ;. 4.4 The maximum likelihood estimators θ, Λ are obtained by maximizing 4.4. A natural pseudo-likelihood is obtained by simply taking the product of the marginal distributions of the observed counts at the successive observation times. Thus a log-pseudolikelihood for one observation is given by m ps θ, Λ; X N Kj log Λ Kj Λ Kj logn Kj!} } N Kj log Λ 0Kj + N Kj θ Z e θ Z Λ 0Kj logn Kj!, 4.5 and the log-pseudo-likelihood l ps n θ, Λ for n i.i.d. observations is given by l ps n θ, Λ np n m ps θ, Λ;, 4.6 and the corresponding pseudo-ml s θ ps, Λ ps are obtained by maximizing Asymptotic variance of the ML Based on the Poisson model, the log-likelihood for θ 0, Λ 0 with one observation is given by 4.3. Using the notation of Zhang 1998, page 29, we have m 1 θ, Λ; X m 2 θ, Λ; X[h] Z m 11 θ, Λ; X [ N Kj Λ 0Kj e θ Z ], [ ] NKj e θ Z h Kj, Λ 0Kj Λ 0Kj ZZ e θ Z, 10

11 m 12 θ, Λ; X[h] m T 21θ, Λ; X[h] Ze θ Z h Kj, m 22 θ, Λ; X[h,h] N Kj Λ 0Kj 2 h Kj h Kj, where h Kj T K,j T K,j 1 hdλ 0 for h L 2 Λ 0. By A2 of Zhang 1998, page 30, we need to find a h such that Ṡ 12 θ 0, Λ 0 [h] Ṡ22θ 0, Λ 0 [h,h]pm 12 θ 0, Λ 0 ; X[h] m 22 θ 0, Λ 0 ; X[h,h]}0, for all h L 2 Λ 0. Note that P m 12 θ 0, Λ 0 ; X[h] m 22 θ 0, Λ 0 ; X[h,h]} [ ] Ze θ 0 Z N Kj Λ 0Kj 2 h Kj K,TK,Z [ Ze θ 0 Z eθ 0 Z h Kj Λ 0Kj h Kj Therefore, an obvious choice of h is Ze θ 0 Z K,j 1,T K,j h Kj Λ 0Kj e θ 0 Z. K,j 1,T K,j Hence m θ 0, Λ 0 ; X m 1 θ 0, Λ 0 ; X m 2 θ 0, Λ 0 ; X[h ] Z N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Λ 0Kj Z NKj ] h Kj. Ze e θ 0 Λ θ 0 Z K,j 1,T K,j Z 0Kj Λ 0Kj e θ 0 Z K,j 1,T K,j Ze θ 0 Z K,j 1,T K,j e θ 0 Z K,j 1,T K,j By Theorem of Zhang 1998, page 32, the asymptotic variance will be A 1 B A 1, where B m θ 0,Λ 0 ;X 2 Ze θ 0 Z K,j 1,T K,j K,TK,Z CT K,j,T K,j,T K,j 1,T K,j 1; Z Z e θ 0 Z K,j 1,T K,j j,j 1 11.

12 Ze θ 0 Z K,j Z 1,T K,j e θ 0 Z K,j 1,T K,j, A Ṡ11θ 0, Λ 0 +Ṡ21θ 0, Λ 0 [h ] Ze θ 0 Z K,j Λ 0Kj e θ 0 Z ZZ e θ 0 Z 1,T K,j Λ 0Kj e θ 0 Z Z K,j 1,T K,j Ze θ 0 Z K,j 1,T K,j K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z Z K,j 1,T K,j Ze θ 0 Z 2 K,j 1,T K,j K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j 1,T K,j, and CT K,j,T K,j,T K,j 1,T K,j 1; Z ] [ N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Λ 0Kj Z, K, T K,j 1,T K,j,T K,j 1,T K,j. Note that if the counting process is indeed a conditional Poisson process with the mean function given as specified, Λ0Kj e CT K,j,T K,j,T K,j 1,T K,j 1; Z θ 0 Z, if j j 0, if j j. This yields B A Iθ 0 and thus A 1 B A 1 I 1 θ Asymptotic Variance of the Pseudo-ML Based on the Poisson model, the pseudo log-likelihood for θ 0, Λ 0 with one observation is given by 4.5. Using the notation of Zhang 1998, page 29, we have K 1 θ, Λ; X Z m ps [ N Kj Λ 0Kj e θ Z ], K m ps 2 θ, Λ; X[h] [ ] NKj e θ Z h Kj, Λ 0Kj m ps K 11 θ, Λ; X Λ 0Kj ZZ e θ Z, 12

13 m ps 12 θ, Λ; X[h] mt 21θ, Λ; X[h] Ze θ Z h Kj, K m ps 22 θ, Λ; X[h,h] N Kj Λ 0Kj 2h Kjh Kj, where h Kj T K,j 0 hdλ 0 for h L 2 Λ 0. By A2 of Zhang 1998, page 30, we need to find a h such that Ṡ ps 12 θ 0, Λ 0 [h] Ṡps 22 θ 0, Λ 0 [h,h]pm ps 12 θ 0, Λ 0 ; X[h] m ps 22 θ 0, Λ 0 ; X[h,h]}0, for all h L 2 Λ 0. Note that P m ps 12 θ 0, Λ 0 ; X[h] m ps 22 θ 0, Λ 0 ; X[h,h]} [ Ze θ 0 Z K,TK,Z [ N Kj Λ 0Kj 2h Kj ] h Kj Ze θ 0 Z eθ 0 Z h Kj Λ 0Kj Therefore, an obvious choice of h is Ze θ 0 Z K,j h Kj Λ 0Kj e θ 0 Z. K,j Hence m ps θ 0, Λ 0 ; X m ps 1 θ 0, Λ 0 ; X m ps 2 θ 0, Λ 0 ; X[h ] Z N Kj e θ 0 Z NKj Ze Λ 0Kj e θ 0 Λ θ 0 Z K,j Z 0Kj Λ 0Kj e θ 0 Z K,j Ze N θ 0 Z K,j Kj e θ 0 Z Λ 0Kj Z e θ 0 Z. K,j By Theorem of Zhang 1998, page 32, the asymptotic variance will be A ps 1 B ps A ps 1, where B ps m ps θ 0, Λ 0 ; X 2 Ze θ 0 Z K,j K,TK,Z C ps T K,j,T K,j ; Z Z e θ j,j 0 Z 1 K,j Ze θ 0 Z K,j Z e θ 0 Z K,j, ] h Kj. 13

14 A ps Ṡps 11 θ 0, Λ 0 +Ṡps 21 θ 0, Λ 0 [h ] Ze θ 0 Z K,j Λ 0Kj e θ 0 Z ZZ e θ 0 Z Λ 0Kj e θ 0 Z Z K,j Ze θ 0 Z K,j K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z Z K,j Ze θ 0 Z 2 K,j K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j, and ] C ps T K,j,T K,j ; Z [N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Λ 0Kj Z, K, T K,j,T K,j. Note that if the counting process is indeed a conditional Poisson process with the mean function given as specified, C ps T K,j,T K,j ; Z e θ 0 Z Λ 0Kj j. This yields Ze θ 0 Z K,j B ps A ps +2 K,TK,Z e θ0z Λ 0Kj Z e θ j<j 0 Z K,j Ze θ 0 Z K,j Z e θ 0 Z K,j A ps. 5 Comparisons: ML versus pseudo-ml. Scenario 1. We first suppose that the underlying counting process is in fact a standard homogeneous Poisson process conditionally given Z, with baseline mean function Λ 0 t λt, We will also assume that the distribution of K, T K is independent of Z. As a consequence, Z is independent of K, T K, and the formulas in the preceding section simplify considerably. Because of the Poisson process assumption, A B Iθ, and this matrix is given by Ze θ 0 Z 2 K,j 1,T K,j Iθ K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j 1,T K,j 14

15 Ze θ 0 Z 2 K,TK Λ 0 T K,K } Z eθ 0 Z Z e θ Z 0 K,TK Λ 0 T K,K } C, so that if C is nonsingular, Iθ 1 C 1 1 K,TK Λ 0 T K,K }. On the other hand, Ze θ 0 Z 2 K,j A ps K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j Ze θ 0 Z 2 K,TK Λ 0 T Kj eθ 0 Z Z e θ Z 0, while so that B ps K,TK,Z j,j 1 Ze θ 0 Z 2 Λ 0 T K,j j Z eθ 0 Z Z e θ Z 0, K j,j 1 Λ 0T K,j j } A ps 1 B ps A ps 1 K,T C 1 K,Z K } 2. K,TK Λ 0T Kj Thus it follows that the AR of the pseudo-ml of θ relative to the ML of θ under the above scenario is given by ARpseudo, mle A 1 BA 1 T A ps 1 B ps A ps 1 T K }} 2 Λ 0T K,j K }. Λ 0 T K,K } j,j 1 Λ 0T K,j j Note that this equals 1 if P K 1 1. Actually, we have not yet used the assumption about Λ 0. If we assume that Λ 0 t λt, then K }} 2 T K,j ARpseudo, mle K } T K,K } j,j 1 T K,j j 15

16 If we assume, further, that P K k 1 for a fixed integer k 2, and T K T K,1,...,T K,K are the order statistics of a sample of k uniformly distributed random variables on an interval [0,M], then k T K,j j k +1 M k 2 M, and Hence in this case as k. T K,j j j,j 1 T K,K } ARpseudo, mle k j,j 1 k k +1 M, j j k2k +1 M M. k +1 6 k/2 2 k k2k+1 k+1 6 3k +1 22k k Iθ 1 kmλ 2 3/2 4/3 5/4 6/5 7/6 8/7 9/8 10/9 11/10 V ps kmλ 2 5/3 14/9 3/2 22/15 13/9 10/7 17/12 38/27 7/5 ARk Scenario 2. A variant on these calculations is to repeat all the assumptions about Z and K, T K, assume, conditionally on K, that T K T K,1,...,T K,K are the order statistics of a sample of K uniformly distributed random variables on an interval [0,M], but now allow a distribution for K. Then K K T K,j j K +1 M K 2 M, and T K,j j j,j 1 } K T K,K K K j,j 1 K K +1 M, j j K2K +1 M M. K +1 6 If we let K be distributed according to 1+Poissonµ, then the AR will be asymptotically 3/4 again as µ. A more interesting choice of the distribution of K is the Zetaα distribution given as follows: for α>1, P K k 1/kα ζα, k 1,2,..., 16

17 where ζα j α is the Riemann Zeta function. Then we can compute T K,j j K +1 M αk M M ζα 1, 2 2 ζα and T K,j j j,j 1 K T K,K } α M, K +1 α j,j 1 j j M K +1 M 6 αk2k +1 M 6 2α K 2 + α K } } 2ζα 2 + ζα 1 M 6 ζα Hence in this case and Iθ 1 α C 1 1, Mλ K α K+1 } V ps α C 1 2ζα 2+ζα 1 ζα 6Mλ ζα 1 2ζα 2, ARpseudo, mleα ARα α K/2 2 α K K+1 } α K2K ζα 1/ζα 2 α K K+1 } 2ζα 2+ζα 1 ζα 3 ζα 1 2 2ζα 2 + ζα 1} α K K+1 }, and this varies between 0 and 1 as α varies from 3 to ; notethatforα3,k 2 ζ1, while K ζ2/ζ See Figure 1. 17

18 Figure 1: ARα If we change the distribution of K to K Unif1,...,k 0 }, then T K,j j K +1 M K M k 0+1 M, 2 4 T K,j j j,j 1 α j,j 1 j j M K +1 M 6 M 6 M 6 K2K +1 2K 2 +K } 2 k 0+ 12k k } while Hence in this case K T K,K } M, K +1 Iθ 1 α C 1 1 Mλ K K+1, k 0 +12k 0 +1/3+k 0 +1/2 V ps k 0 C 1 6 Mλk /16, 18

19 and ARpseudo, mlek 0 ARk 0 K/2 2 K K2K+1 K+1 } 6 k /16 K K+1 } k 0+12k 0 +1/3+k 0 +1/2 6, and this varies between 1 and 9/16 as k 0 varies from 1 to. Scenario 3. We now suppose that the underlying counting process is not a Poisson process, conditionally given Z, but is, instead, defined as in terms of the negative-binomialization of an empirical counting process, as follows: suppose that X 1,X 2,... are i.i.d. with distribution function F on R, and define n N n t 1 [Xi t] for t 0. i1 Suppose that N Z Negative BinomialrZ, γ, θ,p where rz, γ, θ γe θ Z, and define N by Nt N N t N 1 [Xi t]. i1 Then, since N Z rz, γ, θ q p, VarN ZrZ, γ, θ q p 2, it follows that Nt Z} [Nt Z, N] Z} NFt Z} γe θ Z Ft q p eθ Z Λ 0 t, with Λ 0 t γftq/p. Alternatively, Nt Z Negative Binomialr, straightforward computation, and hence it follows that p p+qft by qft/p + qft Nt Z} r rft q p/p + qft p γeθ 0 Z q p Fteθ 0 Z Λ 0 t, in agreement with the above calculation. Moreover qft/p + qft VarNt Z} r p/p + qft 2 r q p F t1 + q p F t r q p F t+r q p 2 Ft 2. 19

20 Now we want to calculate CT K,j,T K,j,T K,j 1,T K,j 1; Z ] [ N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Λ 0Kj Z, K, T K,j 1,T K,j,T K,j 1,T K,j. By computing conditionally on N, and using the fact that conditionally on Z, K, T K,j 1,T K,j,T K,j 1,T K,j and N, N has increments with a joint multinomial distribution, we find that CT K,j,T K,j,T K,j 1,T K,j 1; Z ] [ N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Z, Λ 0Kj K, TK,j 1,T K,j,T K,j 1,T K,j [ N Kj N Kj N+ N Kj N e θ 0 Z Λ 0Kj N Kj N Kj N+ N Kj N e θ 0 Z Λ 0Kj } Z, N,Z,K,T K,j 1,T K,j,T K,j 1,T K,j ] K, TK,j 1,T K,j,T K,j 1,T K,j N FKj 1 F Kj Z, K, T K } + N rq/p 2 F Kj 2 } Z, K, T K, if j j } N F Kj F Kj Z, K, T K + N rq/p 2 } F Kj F Kj Z, K, T K, if j j r q p FK,j F K,j 2 + r q F 2 p 2 K,j, if j j } r q r q p 2 p F Kj F Kj, if j j r q p F K,j + r q2 F p 2 K,j 2, if j j r q2 F p 2 Kj F Kj, if j<j e θ 0 Z Λ 0,K,j + γ 1 e θ 0 Z Λ 0,K,j 2, if j j, γe θ 0 Z q2 F p 2 Kj F Kj, if j j, e θ 0 Z Λ 0,K,j + γ 1 e θ 0 Z Λ 0,K,j 2, if j j, γ 1 e θ 0 Z Λ 0Kj Λ 0Kj, if j j. Remark. NotethatifN Z PoissonrZ, γ, θ, then the process N N N given Z, a non-homogeneous Poisson process with conditional mean function is conditionally, and conditional variance function Nt Z} γe θ Z Fte θ Z Λ 0 t, VarNt Z}γe θ Z Fte θ Z Λ 0 t where Λ 0 t γft. We will also assume, as in Scenarios 1 and 2, that the distribution of K, T K is independent of Z. As a consequence, Z is independent of K, T K, and the formulas in the preceding section 20

21 again simplify. By taking F uniform on [0,M], Λ 0 t λt where λ γ/mq/p, and we compute, using moments and covariances of uniform spacings, as found on page 721 of Shorack and Wellner 1986, B m θ 0,Λ 0 ;X 2 Ze θ 0 Z K,j 1,T K,j K,TK,Z CT K,j,T K,j,T K,j 1,T K,j 1; Z Z e θ j,j 0 Z 1 K,j 1,T K,j Ze θ 0 Z K,j Z 1,T K,j e θ 0 Z K,j 1,T K,j Ze θ 0 Z Ze θ 0 Z K,TK,Z e θ 0 Z Λ 0Kj + γ 1 e θ 0 Z Λ 0Kj 2 Z e θ Z Z 0 e θ Z 0 Ze θ 0 Z Ze θ 0 Z + K,TK,Z γ 1 e θ 0 Z Λ 0Kj Λ 0Kj Z e θ Z Z j j 0 e θ Z 0 C K,T K Λ 0Kj + γ 1 Λ 0Kj 2 + γ 1 K,TK Λ 0Kj Λ 0Kj j j } K K C λm +γ 1 λ 2 M 2 K +1 K+2 } K K λmc +γ 1 λm K +1 K +2 where, as in scenario 1, Ze θ 0 Z 2 C Z eθ 0 Z Z e θ Z 0. On the other hand, we find that Ze θ 0 Z 2 K,j 1,T K,j A K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j 1,T K,j } K CλM. K +1 21

22 Thus the asymptotic variance of the ML for this scenario is } } K A 1 BA 1 λmc 1 K+1 + λm γ K K+2 } 2. K K+1 Now for the asymptotic variance of the pseudo-ml under scenario 3. To calculate B ps we first need to calculate C ps T K,j,T K,j ; Z ] [N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Λ 0Kj Z, K, T K,j,T K,j } [N Kj e θ 0 Z Λ 0Kj N Kj e θ 0 Z Z, Λ 0Kj N,Z,K,T K,j,T K,j ] K, TK,j,T K,j } NF T K,j T K,j F T K,j F T K,j +N rq/p 2 FT K,j F T K,j Z, K, T K,j,T K,j r q p F T K,j T K,j F T K,j F T K,j } + r q p 2 F T K,jF T K,j e θ 0 Z Λ 0 T K,j T K,j +γ 1 e θ 0 Z Λ 0 T K,j Λ 0 T K,j e θ 0 Z Λ 0 T K,j 1+γ 1 Λ 0 T K,j if j j. We can then calculate B ps m ps θ 0, Λ 0 ; X 2 Ze θ 0 Z K,j K,TK,Z C ps T K,j,T K,j ; Z Z e θ j,j 0 Z 1 K,j Ze θ 0 Z K,j Z e θ 0 Z K,j K,TK,Z Λ0 T K,j 1+γ 1 Λ 0 T K,j Ze θ 0 Z e θ 0 Z Z e θ Z j,j 0 1 Ze θ 0 Z Z e θ Z 0 C λm j j + λ2 M 2 U K,j U K,j k +1 γ j,j 1 j,j K2K +1 C + λm λ2 M 2 U K,j U K,j 6 γ j,j 22

23 K2K +1 C λm + λ2 M 2 } K3K +1 6 γ 12 K2K +1 λmc + λm } K3K +1 6 γ 12 Ze θ 0 Z 2 K,j A ps K,TK,Z Λ 0Kj e θ 0 Z Z e θ 0 Z K,j K λmc. 2 Thus we find that the asymptotic variance of the pseudo-ml is given by K2K+1 A ps 1 B ps A ps 1 λmc λm γ K3K+1 12 K } 2, 2 θ ps n and the asymptotic relative efficiency of the pseudo-mle to the mle is, under scenario 3, ARpseudo, mlenegbin K K+1}+ λm γ K K K+2} K+1} 2 K2K+1 + λm 6 γ K3K+1 12 K 2 } 2 } } K K+1 + λm γ K 2 K+2 K/2 K2K λm γ K3K+1 12 K K+1 } } K K+1 + λm γ K K+2 K2K+1 6 K K+1 K2K λm γ K3K+1 12 ARpseudo, mlep oisson 1+ λm K+2 K γ K+1 K ARpseudo, mlep oisson. 1+ λm γ K3K+1 12 K2K+1 6 Note that when factor λm/γ q/p 0, is zero then we recover our earlier result for the Poisson case. This is to be expected since P oissonλ 0 t expθ 0 Z becomes the limiting distribution of Negative Binomialr, p/p + qft as p 1. 23

24 6 Conclusions and Further Problems Conclusions As in the case of panel count data without covariates studied in Sun and Kalbfleisch 1995 and Wellner and Zhang 2000, the pseudo likelihood estimation method for the semiparametric proportional mean model with panel count data proposed and studied in Zhang 1998 and Zhang 2001 also has advantages in terms of computational simplicity. The results of section 5 above show that the maximum pseudo-likelihood estimator of the regression parameters can be very inefficient relative to the maximum likelihood estimator, especially when the distribution of K is heavy - tailed. In such cases it is clear that we will want to avoid the pseudo-likelihood estimator, and the computational effort required by the full maximum likelihood estimators can be justified by the consequent gain in efficiency. Our derivation of the asymptotic normality of the maximum likelihood estimator of the regression parameters results in a relatively complicated expression for the asymptotic variance which may be difficult to estimate directly. Hence it becomes important to develop efficient algorithms for computation of the maximum likelihood estimator in order to allow implementation, for example, of bootstrap inference procedures. Alternatively, profile likelihood inference may be quite feasible in this model; see e.g. Murphy and Van der Vaart 1997, Murphy and Van der Vaart 1999, Murphy and Van der Vaart 2000 for likelihood ratio procedures in some related interval censoring models. Further problems The asymptotic normality results stated in section 4 will be developed and given in detail in Liu, Wellner, and Zhang There are quite a large number of interesting problems still open concerning the semiparametric model for panel count data which has been studied here. Here is a short list: Find a fast and reliable algorithm for computation of the ML θ of θ. Although reasonable algorithms for computation of the maximum pseudo-likelihood estimators have been proposed in Zhang 1998 and Zhang 2001 based on the earlier work of Sun and Kalbfleisch 1995, good algorithms for computation of the maximum likelihood estimators have yet to be developed and implemented. Show that the natural semiparametric profile likelihood ratio procedures are valid for inference about the regression parameter θ via the theorems of Murphy and Van der Vaart 1997, Murphy and Van der Vaart 1999, andmurphy and Van der Vaart Do the non-standard likelihood ratio procdures and methods of Banerjee and Wellner 2001 extend to the present model to give tests and confidence intervals for Λ 0 t? Are there compromise or hybrid estimators between the maximum pseudo-likelihood estimators and the full maximum likelihood estimators which have the computational advantages of the former and the efficiency advantages of the latter? Do similar results continue to hold for panel count data with covariates, but with other models for the mean function replacing the proportional mean model given by 1.1? Are there computational or efficiency advantages to using the ML s for one of the class of Mixed Poisson Process N Z, for example the Negative-Binomial model? Further comparisons with the 24

25 work of Dean and Balshaw 1997, Hougaard, Lee, and Whitmore 1997, and Lawless 1987a, Lawless 1987b would be useful. References Banerjee, M. and Wellner, J. A Likelihood ratio tests for monotone functions. Ann. Statist. 29, to appear. Begun, J. M., Hall, W. J., Huang, W.M., and Wellner, J. A Information and asymptotic efficiency in parametric-nonparametric models. Ann. Statist. 11, Bickel, P. J., Klaassen, C.A.J., Ritov, Y., and Wellner, J. A fficient and Adaptive stimation for Semiparametric Models. Johns Hopkins University Press, Baltimore. Cook, R. J., Lawless, J. F., and Nadeau, C Robust tests for treatment comparisons based on recurrent event responses. Biometrics 52, Dean, C. B. and Balshaw, R fficiency lost by analyzing counts rather than event times in Poisson and overdispersed Poisson regression models. J. Amer. Statist. Assoc. 92, Gaver, D. P., and O Muircheartaigh, I. G Robust mpirical Bayes analysis of event rates, Technometrics, 29, Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring, part 1. Statist. Neerlandica 50, Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring, part 2. Statist. Neerlandica 51, Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring case 2. Ann. Statist. 27, Groeneboom, P Nonparametric maximum likelihood estimators for interval censoring and deconvolution. Technical Report 378, Department of Statistics, Stanford University. Groeneboom, P Inverse problems in statistics. Proceedings of the St. Flour Summer School in Probability, Lecture Notes in Math. 1648, Springer Verlag, Berlin. Groeneboom, P. and Wellner, J Information Bounds and Nonparametric Maximum Likelihood stimation. Birkhäuser, Basel. Hougaard, P., Lee, M.T., and Whitmore, G. A Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes. Biometrics 53, Huang, J fficient estimation for the Cox model with interval censoring, Annals of Statistics, 24, Huang, J., and Wellner, J. A fficient stimation for the Cox Model with Case 2 Interval Censoring, Preprint. 25

26 Jongbloed, G The iterative convex minorant algorithm for nonparametric estimation. Journal of Computation and Graphical Statistics 7, Kalbfleisch, J. D. and Lawless, J. F Statistical inference for observational plans arising in the study of life history processes. In Symposium on Statistical Inference and Applications In Honour of George Barnard s 65th Birthday. University of Waterloo, August 5-8, Kalbfleisch, J. D. and Lawless, J. F The analysis of panel count data under a Markov assumption. Journal of the American Statistical Association 80, Lawless, J. F. 1987a. Regression methods for Poisson process data. J. Amer. Statist. Assoc. 82, Lawless, J. F. 1987b. Negative binomial and mixed Poisson regression. Canad. J. Statist. 15, Lawless, J. F. and Nadeau, C Some simple robust methods for the analysis of recurrent events. Technometrics 37, Lin, D. Y., Wei, L. J., Yang, I., and Ying, Z Semiparametric regression for the mean and rate functions of recurrent events. J. Roy. Statist. Soc. B, Liu, H., Wellner, J. A., and Zhang, Y Large sample theory for two estimators in a semiparametric model for panel count data. Manuscript in progress. Murphy, S. and Van der Vaart, A. W Semiparametric likelihood ratio inference. Ann. Statist. 25, Murphy, S. and Van der Vaart, A. W Observed information in semi-parametric models. Bernoulli 5, Murphy, S. and Van der Vaart, A. W On profile likelihood. J. Amer. Statist. Assoc. 95, Shorack, G. R. and Wellner, J. A mpirical Processes with Applications to Statistics. Wiley, New York. Sun, J. and Kalbfleisch, J. D stimation of the mean function of point processes based on panel count data. Statistica Sinica 5, Sun, J. and Wei, L. J Regression analysis of panel count data with covariate-dependent observation and censoring times. J. R. Stat. Soc. Ser. B 62, Thall, P. F., and Lachin, J. M Analysis of Recurrent vents: Nonparametric Methods for Random-Interval Count Data. J. Amer. Statist. Assoc. 83, Thall, P. F Mixed Poisson likelihood regression models for longitudinal interval count data. Biometrics 44, Wellner, J. A. and Zhang, Y Two estimators of the mean of a counting process with panel count data. Ann. Statist. 28, Zhang, Y stimation for Counting Processes Based on Incomplete Data. Unpublished Ph.D. dissertation, University of Washington. 26

27 Zhang, Y A semiparametric pseudo likelihood estimation method for panel count data. Biometrika 88, to appear. University of Washington Statistics Box Seattle, Washington U.S.A. jaw@stat.washington.edu Department of Statistics University of Central Florida P.O. Box Orlando, Florida zhang@pegasus.cc.ucf.edu University of Washington Biostatistics Box Seattle, Washington U.S.A. hliu@u.washington.edu 27

A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?

A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood stimators Become Badly Inefficient? Jon A. Wellner 1, Ying Zhang 2, and Hao Liu 3 1 Statistics, University of Washington