A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?

Size: px
Start display at page:

Download "A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?"

Transcription

1 A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood stimators Become Badly Inefficient? Jon A. Wellner 1, Ying Zhang 2, and Hao Liu 3 1 Statistics, University of Washington jaw@stat.washington.edu 2 Statistics, University of Central Florida zhang@pegasus.cc.ucf.edu 3 Biostatistics, University of Washington hliu@u.washington.edu ABSTRACT. We consider estimation in a particular semiparametric regression model for the mean of a counting process under the assumption of panel count data. The basic model assumption is that the conditional mean function of the counting process is of the form Nt Z} expθ ZΛt where Z is a vector of covariates and Λ is the baseline mean function. The panel count observation scheme involves observation of the counting process N for an individual at a random number of random time points; both the number and the locations of these time points may differ across individuals. We study maximum pseudo-likelihood and maximum likelihood estimators and θ n of the regression parameter θ. The pseudo-likelihood estimators are fairly easy to compute, while the full maximum likelihood estimators pose more challenges from the computational perspective. We derive expressions for the asymptotic variances of both estimators under the proportional mean model. Our primary aim is to understand when the pseudo-likelihood estimators have very low efficiency relative to the full maximum likelihood estimators. The upshot is that the pseudo-likelihood estimators can have arbitrarily small efficiency relative to the full maximum likelihood estimators when the distribution of, the number of observation time points per individual, is very heavy-tailed. θ ps n

2 2 Jon A. Wellner, Ying Zhang, and Hao Liu 1Introduction Our goal in this paper is to study efficiency aspects of two types of estimators for a particular semiparametric model for panel count data. Panel count data arise in many fields including demographic studies, industrial reliability, and clinical trials; see for example [19], [7], [31], [32], [29], and [33] where the estimation of either the intensity of event recurrence or the mean function of a counting process with panel count data was studied. Many applications involve covariates whose effects on the underlying counting process are of interest. While there is considerable work on regression modeling for recurrent events based on continuous observations see, for example [22], [5], and [23], regression analysis with panel count data for counting processes has just started recently. [30] proposed estimating equation methods, while [35] and [36] proposed a pseudo-likelihood method for studying the multiplicative mean model 1 with panel count data. Here is a description of the model and the observation scheme. Suppose that N Nt :t 0} is a univariate counting process. In many applications, it is important to estimate the expected number of events Nt Z} which will occur by the time t, conditionally on a covariate vector Z. In this paper we consider the proportional mean regression model given by Λt Z Nt Z} e θ Z Λt, 1 where the monotone increasing function Λ is the baseline mean function. The parameters of primary interest are θ and Λ. The observation scheme we want to study is as follows: suppose that we observe the counting process N at a random number of random times 0 T,0 <T,1 < <T,. We write T T,1,...,T,, and we assume that, T Z G Z is conditionally independent of the counting process N given the covariate vector Z. Wefurther assume that Z H on R d, but we will make no further assumptions about G or H modulo mild integrability and boundedness requirements. The data for each individual will consist of X Z,, T, NT,1,...,NT, Z,, T, N. 2 We will assume that the data consist of X 1,...,X n i.i.d. as X. To derive useful estimators for this model we will often assume, in addition to 1, that the counting process N, conditionally on Z, is a non-homogeneous Poisson process. But our general perspective will be to study the estimators and other procedures when the Poisson assumption fails to hold and we assume only that the proportional mean assumption 1 holds. Such a program was carried out for estimation of Λ without any covariates for this panel count observation model by [33]. Briefly, [33] studied both the

3 Semiparametric Regression Model for Panel Count Data 3 maximum likelihood estimator Λ n and the pseudo-maximum likelihood estimator Λ n of Λ. They showed that both estimators are consistent in L 2 µ ps where µ is the measure defined in terms of the observation process by µb P k k1 k P T k,j B k. [33] also succeeded in showing under additional smoothness and boundedness assumptions that n 1/3 ps σ Λ 2 t 0 Λ } 1/3 n t 0 Λ 0 t 0 0t 0 d 2G 2Z t 0 where σ 2 t Var[Nt], G t k1 P k k G k,j t, and Z argmaxw t t 2 } for a two-sided Brownian motion process W starting at 0. They also proved a corresponding result for a toy estimator version of the maximum likelihood estimator Λ n under the Poisson process assumption, and made efficiency comparisons between the two estimators based on Monte- Carlo studies. The outline of the rest of the present paper is as follows: In section 2, we describe two methods of estimation, namely maximum pseudo-likelihood estimators and maximum likelihood estimators of θ and Λ. The basic picture is that the pseudo-likelihood estimators are computationally relatively straightforward and easy to implement, while the full, semiparametric maximum likelihood estimators are considerably more difficult, requiring an iterative algorithm in the computation of the profile likelihood. For other examples of the use of pseudo-likelihood to obtain computationally simple methods, see e.g. [3] and [27]. In section 3 we present information calculations for the semiparametric model described by the proportional mean function assumption 1 together with the non-homogeneous Poisson process assumption on N. This provides a baseline for comparisons of variances with the best possible asymptotic variance under the Poisson and proportional mean model assumptions. In section 4 we describe asymptotic normality results for the pseudo-likelihood and full maximum likelihood estimators and θ n of θ assuming only the proportional mean structure 1, but not assuming that N is a Poisson process. Proofs of these results will be presented in detail in [34]. Finally, in section 4wecompare the pseudo-likelihood and full likelihood estimators of θ under three different scenarios with the goal of determining situations under which the pseudo-likelihood estimators will lose considerable efficiency relative to the full maximum likelihood estimators. As will be seen, the rough upshot of the calculations here is that the efficiency of the pseudo-likelihood estimators relative to the full maximum likelihood estimators can be low when the distribution of, the number of observation times per subject, is heavy-tailed. θ ps n

4 4 Jon A. Wellner, Ying Zhang, and Hao Liu 2 Two Methods of stimation Maximum Pseudo-likelihood stimation: The natural pseudo-likelihood estimators for this model use the marginal distributions of N, conditional on Z, P Nt k Z Λt Zk exp Λt Z k! and ignore dependence between Nt 1, Nt 2 toobtain the pseudo-likelihood: n i ln ps θ, Λ N i T i i i,j log ΛT i,j i1 } + N i T i i,j θ Z i e θ Z i ΛT i i,j. Then the maximum pseudo-likelihood estimator θ n ps, by θ n ps ps, Λ n argmax θ,λ ln ps θ, Λ. Λ ps n ofθ, Λ isgiven This can be implemented in two steps via the usual pseudo- profile likelihood. For each fixed value of θ we set Λ ps n,θ argmax Λ l ps n θ, Λ, 3 and define Then ln ps,profile θ ln ps ps θ, Λ n,θ. θ ps n argmax θ ln ps,profile θ, and Λps n ps ps Λ n, θ n. In fact, the optimization problem in 3 is easily solved as follows: Let t 1 <...<t m denote the ordered distinct observation time points in the collection of all observations times, T i, j 1,..., i,j i,i1,...,n}, let N i i,j N i T i i,j, and set w l n i i1 A l θ 1 w l 1 [T i i,j t l], n i i1 Then it is easily shown that N l 1 w l n i i1 expθ Z i 1 [T i i,j t l]. N i i,j 1 [T i i,j t l],

5 Semiparametric Regression Model for Panel Count Data 5 Λ ps n,θ left-derivative of Greatest Convex Minorant of w l A l θ, w l N l } m i1 l i l i max min i l j l i p w pn p i p w pa p θ at t l, which is straightforward to compute. Maximum Likelihood stimation: Under the assumption that N is conditionally, given Z a non-homogeneous Poisson process, the likelihood can be calculated using the conditional independence of the increments of N, Ns, t] Nt Ns, and the Poisson distribution of these increments: P Ns, t] k Z to obtain the log-likelihood: l n θ, Λ where i1 [ Λs, t] Z]k k! exp Λs, t] Z n i } N i log Λ i,j i,j + N i i,j θ Z i e θ Z i Λ i,j N j NT,j NT,j 1, Λ j ΛT,j ΛT,j 1, j 1,...,, j 1,...,, with NT,0 0and ΛT,0 0.Then θ n, Λ n argmax θ,λ l n θ, Λ. This maximization can also be carried out in two steps via profile likelihood. For each fixed value of θ we set and define Then Λ n,θ argmax Λ l n θ, Λ, l profile n θ l n θ, Λ n,θ. θ n argmax θ l profile n θ, and Λn Λ n, θ n. Computation of the profile estimator Λ n,θiscomputationally involved, but possible via the iterative convex minorant algorithm; see e.g. [17]. For more on computation without covariates see [33].

6 6 Jon A. Wellner, Ying Zhang, and Hao Liu 3 Information bounds for θ under the Poisson model. We first compute information bounds for estimation of θ under the proportional mean non-homogeneous Poisson process model. Suppose that N Z PoissonΛ Z, and, T Z G Z are conditionally independent given Z.We will assume here that N is conditionally a nonhomogeneous Poisson process with conditional mean function [Nt Z] Λt Z e θ 0 Z Λ 0 t. 4 The second equality expresses the proportional mean regression model assumption. The likelihood for one observation is, using the same notation introduced in Section 2, px; θ, Λ exp Λ j Λ j N j N j!. Thus the log-likelihood for θ, Λ for one observation is given by log px; θ, Λ N j log Λ j Λ j log N j!}. Differentiating this with respect to θ and Λ respectively, the scores for θ and Λ are easily seen to be while where l θ x l Λ ax Z N j e θ 0 Z Λ 0j, 5 Nj } e θ 0 Z a j Λ 0j N j e θ 0 Z Λ 0j } aj Λ 0j, T,j a j adλ 0, a L 2 Λ 0. T,j 1 To compute the information bound for estimation of θ it follows from the results of [2] and [4] that we want to find a so that l θ l Λ a l Λ a

7 Semiparametric Regression Model for Panel Count Data 7 for all a L 2 Λ 0 ; i.e. 0 l θ l } Λ a l Λ a N j e θ 0 Z Λ 0j Z a j l Λ a Λ 0j N j e θ 0 Z Λ 0j 2 Z a j Λ 0j e θ 0 Z Λ 0j Z a j aj Λ 0j Λ 0j aj Λ 0j by conditioning on, T,Z e θ 0 Z Λ 0j Z a j aj } Λ 0j Λ 0j e θ 0 Z Λ 0j Z a j Λ 0j aj } }}, T,j 1,T,j Λ 0j a } j Λ 0j Ze θ 0 Z, T,j 1,T,j Λ 0j a } j e }} θ 0 Z, T,j 1,T,j. Λ 0j Thus we see that the desired orthogonality holds with } a Ze θ 0 Z, T,j 1,T,j j Λ 0j }. 6 e θ 0 Z, T,j 1,T,j Hence the efficient score function for θ is given by l θx l θ x l Λ a x } Ze θ Z, T,j 1,T,j N j e θ Z Λ 0j Z e θ Z,, T,j 1,T,j } and the information for θ is, by computing conditionally on Z,, T, Iθ 0 l θ X 2}

8 8 Jon A. Wellner, Ying Zhang, and Hao Liu } Ze θ 0 Z 2, T,j 1,T,j 0 e θ 0 Z Λ 0j Z } e θ 0 Z, T,j 1,T,j. In particular, we have the following corollary: Corollary 1. Current status data. If P 11so that the only T j of relevance is T 1,1 T while T 1,0 0, then the efficient score function is } l θx l θ x l Ze θ 0 Z T Λ a x NT e θ 0 Z Λ 0 T Z e θ 0 Z T }. and the information for θ is given by Iθ 0 l θ X 2} 0 eθ 0 Z Λ 0 T Z Ze θ 0 Z T e θ 0 Z T } } 2. This can be compared with the information for θ for the Cox proportional hazards model with current status data given by [15], page 547 with Huang s Y replaced by our present T for comparison: Iθ RT,Z Z ZRT,Z T } } 2 RT,Z T where RT,ZΛ 2 T,ZOT Z and 0 z Ot z F t z 1 F t z 1 F 0t expθ 1 1 F 0 t. expθ 0 z Corollary 2. Case 2 Interval-censored data. If P 21so that the only T j s of relevance are T 2,1 T 1 and T 2,2 T 2, while T 2,0 0, then the efficient score function is l θx l θ x l Λ a x } Ze θ 0 Z T 1 NT 1 e θ 0 Z Λ 0 T 1 Z } e θ 0 Z T 1 + NT 2 NT 1 } Ze θ 0 Z T 1,T 2 e θ 0 Z Λ 0 T 2 Λ 0 T 1 Z }, e θ 0 Z T 1,T 2 and the information for θ is given by

9 Semiparametric Regression Model for Panel Count Data 9 Iθ 0 l θ X 2} } Ze θ 0 Z 2 T 1 0 eθ 0 Z Λ 0 T 1 Z } e θ 0 Z T 1 } Ze θ 0 Z 2 T 1,T eθ 0 Z Λ 0 T 2 Λ 0 T 2 Z } e θ 0 Z T 1,T 2. This is much simpler than the information for θ for the Cox proportional hazards model with interval censored case II data given by [16]. The calculations of Huang and Wellner resulted in an integral equation to be solved, analogously to the results for the mean functional considered by [8], [9], and [10]. 4 Asymptotic normality of the two estimators of θ. Here is the crucial theorem concerning the asymptotic behavior of the maximum pseudo-likelihood and maximum likelihood estimators of θ when the proportional mean model holds, but the Poisson assumption concerning N may fail. Theorem 1. Under suitable regularity and integrability conditions, the estimators θ n and θ n are asymptotically normal: ps n θn θ 0 d Z N d 0,A 1 B A 1, 7 and where n θps n θ 0 d Z ps N d 0, A ps 1 B ps A ps 1 8 B m θ 0,Λ 0 ; X 2 Ze θ 0 Z 2, T,j,T,j C j,j Z Z e θ j,j 0 Z, T 1,j,T,j, Ze θ 0 Z 2, T,j 1,T,j A Λ 0j e θ 0 Z Z e θ 0 Z, T,j 1,T,j, C j,j Z Cov[ N j, N j Z,, T ], B ps m ps θ 0,Λ 0 ; X 2

10 10 Jon A. Wellner, Ying Zhang, and Hao Liu Ze θ 0 Z, T,j C ps j,j Z Z e θ j,j 0 Z, T 1,j Ze θ 0 Z, T,j Z e θ 0 Z, T,j, Ze θ 0 Z 2, T,j A ps Λ 0j e θ 0 Z Z e θ 0 Z, T,j, C ps j,j Z Cov[N j,n j Z,, T,j,T,j ]. Our proof of this theorem is based on the results of [35]. While we will not give the proof in detail, we will present here a sketch of the computation of the asymptotic variances given in 7 and 8. Based on the Poisson model, the log-likelihood for θ, Λ with one observation is given by mθ, Λ; X log px; θ, Λ N j log Λ j Λ j log N j!} N j log Λ j + N j θ Z } e θ Z Λ 0j log N j!. 9 Thus the log-likelihood l n θ, Λ for n i.i.d. observations is given by l n θ, Λ np n mθ, Λ;. 10 The maximum likelihood estimators θ, Λ are obtained by maximizing 10. A natural pseudo-likelihood is obtained by simply taking the product of the marginal distributions of the observed counts at the successive observation times. Thus a log-pseudo-likelihood for one observation is given by m ps θ, Λ; X } N j log Λ j + N j θ Z e θ Z Λ j logn j! 11 with Λ,j ΛT,j, and the log-pseudo-likelihood ln ps θ, Λ for n i.i.d. observations is given by ln ps θ, Λ np n m ps θ, Λ;, 12 and the corresponding pseudo-ml s θ ps, Λ ps are obtained by maximizing 12.

11 Semiparametric Regression Model for Panel Count Data Asymptotic variance of the ML Based on the Poisson model, the log-likelihood for θ, Λ with one observation is given by 9. Using the notation of [35], page 29, we have m 1 θ, Λ; X m 2 θ, Λ; X[h] m 11 θ, Λ; X [ ] Z N j Λ j e θ Z, [ ] Nj e θ Z h j, Λ j Λ j ZZ e θ Z, m 12 θ, Λ; X[h] m T 21θ, Λ; X[h] m 22 θ, Λ; X[h,h] Ze θ Z h j, N j Λ j 2 h j h j, where h j T,j T,j 1 hdλ 0 for h L 2 Λ. By A2 of [35], page 30, we need to find a h such that Ṡ 12 θ 0,Λ 0 [h] Ṡ22θ 0,Λ 0 [h,h] P m 12 θ 0,Λ 0 ; X[h] m 22 θ 0,Λ 0 ; X[h,h]} 0, for all h L 2 Λ 0. Note that P m 12 θ 0,Λ 0 ; X[h] m 22 θ 0,Λ 0 ; X[h,h]} [ ] Ze θ 0 Z N j Λ 0j 2 h j [,T,Z Ze θ 0 Z eθ ] 0 Z h j Λ 0j h j. h j Therefore, an obvious choice of h is Ze θ 0 Z, T,j 1,T,j h j Λ 0j. e θ 0 Z, T,j 1,T,j Hence m θ 0,Λ 0 ; X

12 12 Jon A. Wellner, Ying Zhang, and Hao Liu m 1 θ 0,Λ 0 ; X m 2 θ 0,Λ 0 ; X[h ] Z N j e θ 0 Z Λ 0j Nj Ze θ 0 Z, T e θ,j 1,T,j 0 Z Λ 0j Λ 0j e θ 0 Z, T,j 1,T,j Ze N θ 0 Z, T,j 1,T,j j e θ 0 Z Λ 0j Z. e θ 0 Z, T,j 1,T,j By Theorem of [35], page 32, the asymptotic variance will be A 1 B A 1, where B m θ 0,Λ 0 ; X 2,T,Z CT,j,T,j,T,j 1,T,j 1; Z j,j 1 Ze θ 0 Z, T,j 1,T,j Z e θ 0 Z, T,j 1,T,j Ze θ 0 Z, T,j Z 1,T,j e θ 0 Z, T,j 1,T,j, A Ṡ11θ 0,Λ 0 +Ṡ21θ 0,Λ 0 [h ] [ Λ 0j e θ 0 Z ZZ Ze θ 0 Z, T,j e θ 0 Z 1,T,j Λ 0j Z e θ 0 Z, T,j 1,T,j Ze θ 0 Z, T,j 1,T,j,T,Z Λ 0j e θ 0 Z Z Z e θ 0 Z, T,j 1,T,j Ze θ 0 Z 2, T,j 1,T,j,T,Z Λ 0j e θ 0 Z Z e θ 0 Z, T,j 1,T,j, and

13 Semiparametric Regression Model for Panel Count Data 13 CT,j,T,j,T,j 1,T,j 1; Z [ N j e θ 0 Z Λ 0j ] N j e θ 0 Z Λ 0j Z,, T,j 1,T,j,T,j 1,T,j. Note that if the counting process is indeed a conditional Poisson process with the mean function given as specified, Λ0j e CT,j,T,j,T,j 1,T,j 1; Z θ 0 Z, if j j 0, if j j. This yields B A Iθ 0 and thus A 1 B A 1 I 1 θ Asymptotic Variance of the Pseudo-ML Based on the Poisson model, the pseudo log-likelihood for θ, Λ with one observation is given by 11. Using the notation of Zhang 1998, page 29, we have m ps 1 θ, Λ; X [ ] Z N j Λ j e θ Z, m ps 2 θ, Λ; X[h] [ ] Nj e θ Z h j, Λ j m ps 11 θ, Λ; X Λ j ZZ e θ Z, m ps 12 θ, Λ; X[h] mt 21θ, Λ; X[h] Ze θ Z h j, m ps 22 θ, Λ; X[h,h] N j Λ j 2 h jh j, where h j T,j 0 hdλ for h L 2 Λ. By A2 of [35], page 30, we need to find a h such that Ṡ ps 12 θ 0,Λ 0 [h] Ṡps 22 θ 0,Λ 0 [h,h] P m ps 12 θ 0,Λ 0 ; X[h] m ps 22 θ 0,Λ 0 ; X[h,h]} 0, for all h L 2 Λ 0. Note that

14 14 Jon A. Wellner, Ying Zhang, and Hao Liu P m ps 12 θ 0,Λ 0 ; X[h] m ps 22 θ 0,Λ 0 ; X[h,h]} [ ] Ze θ 0 Z N j Λ 0j 2 h j h j,t,z [ Ze θ 0 Z eθ 0 Z h j Λ 0j Therefore, an obvious choice of h is Ze θ 0 Z, T,j h j Λ 0j. e θ 0 Z, T,j Hence ] h j. m ps θ 0,Λ 0 ; X m ps 1 θ 0,Λ 0 ; X m ps 2 θ 0,Λ 0 ; X[h ] Z N j e θ 0 Z Nj Ze θ 0 Z, T Λ 0j e θ,j 0 Z Λ 0j Λ 0j e θ 0 Z, T,j Ze N θ 0 Z, T,j j e θ 0 Z Λ 0j Z. e θ 0 Z, T,j By Theorem of [35], page 32, the asymptotic variance will be A ps 1 B ps A ps 1, where B ps m ps θ 0,Λ 0 ; X 2 Ze θ 0 Z, T,j,T,Z C ps T,j,T,j ; Z Z e θ j,j 0 Z, T 1,j Ze θ 0 Z, T,j Z e θ 0 Z, T,j, A ps Ṡps 11 θ 0,Λ 0 +Ṡps 21 θ 0,Λ 0 [h ] Ze θ 0 Z, T,j Λ 0j e θ 0 Z ZZ e θ 0 Z Λ 0j Z e θ 0 Z, T,j Ze θ 0 Z, T,j,T,Z Λ 0j e θ 0 Z Z Z e θ 0 Z, T,j

15 Semiparametric Regression Model for Panel Count Data 15 Ze θ 0 Z 2, T,j,T,Z Λ 0j e θ 0 Z Z e θ 0 Z, T,j, and C ps T,j,T,j ; Z ] [N j e θ 0 Z Λ 0j N j e θ 0 Z Λ 0j Z,, T,j,T,j. Note that if the counting process is indeed a conditional Poisson process with the mean function given as specified, C ps T,j,T,j ; Z e θ 0 Z Λ 0j j. This yields Ze θ 0 Z, T,j B ps A ps +2,T,Z e θ0z Λ 0j Z e θ j<j 0 Z, T,j Ze θ 0 Z, T,j Z e θ 0 Z, T,j A ps. 5 Comparisons: ML versus pseudo-ml. Scenario 1. We first suppose that the underlying counting process is in fact a standard homogeneous Poisson process conditionally given Z, with baseline mean function Λ 0 t λt, Wewill also assume that the distribution of, T isindependent of Z. Asaconsequence, Z is independent of, T, and the formulas in the preceding section simplify considerably. Because of the Poisson process assumption, A B Iθ 0, and this matrix is given by Ze θ 0 Z 2, T,j 1,T,j Iθ 0,T,Z Λ 0j e θ 0 Z Z e θ 0 Z, T,j 1,T,j Ze θ 0 Z 2,T Λ 0 T, } Z eθ 0 Z Z e θ Z 0,T Λ 0 T, } C, so that if C is nonsingular,

16 16 Jon A. Wellner, Ying Zhang, and Hao Liu Iθ 0 1 C 1 1,T Λ 0 T, }. On the other hand, Ze θ 0 Z 2, T,j A ps,t,z Λ 0j e θ 0 Z Z e θ 0 Z, T,j Ze θ 0 Z 2,T Λ 0 T j eθ 0 Z Z e θ Z 0, while B ps,t,z so that j,j 1 A ps 1 B ps A ps 1,T C 1,Z Ze θ 0 Z 2 Λ 0 T,j j Z eθ 0 Z Z e θ Z 0,,T } j,j 1 Λ 0T,j j } 2. Λ 0T j Thus it follows that the AR of the pseudo-ml of θ 0 relative to the ML of θ 0 under the above scenario is given by ARpseudo, mle A 1 BA 1 T A ps 1 B ps A ps 1 T }} 2 Λ 0T,j }. Λ 0 T, } j,j 1 Λ 0T,j j Note that this equals 1 if P 11.Actually, we have not yet used the assumption about Λ 0.Ifweassume that Λ 0 t λt, then }} 2 T,j ARpseudo, mle } T, } j,j 1 T,j j Scenario 1A. If we assume, further, that P k 1for a fixed integer k 2, and T T,1,...,T, are the order statistics of a sample of k uniformly distributed random variables on an interval [0,M], then T,j k j k +1 M k 2 M,

17 and Hence in this case as k. Semiparametric Regression Model for Panel Count Data 17 T,j j j,j 1 T, } ARpseudo, mle k j,j 1 k k +1 M, j j k2k +1 M M. k +1 6 k/22 k k2k+1 k+1 6 3k +1 22k k Iθ 1 kmλ 2 3/2 4/3 5/4 6/5 7/6 8/7 9/8 10/9 11/10 V ps kmλ 2 5/3 14/9 3/2 22/15 13/9 10/7 17/12 38/27 7/5 ARk Scenario 1B. If we assume instead that is random and for some 0 c< 1/2 T,j Uniform[j c, j + c], j 1,...,, and are independent conditionally on, then we calculate T,, +1 T,j, 2 j,j T,j j, 6 and hence ARpseudo, mle } Note that this depends only on the first three moments of, with the third moment appearing in the denominator. It does not depend on c, but only on T,j j, j 1,...,. When P k 1we find that under this scenario ARpseudo, mle 3k +1 22k It is clear that for random the AR in this scenario can be arbitrarily small if the distribution of is heavy-tailed. This will be shown in more detail in Scenario 2.

18 18 Jon A. Wellner, Ying Zhang, and Hao Liu Scenario 2. Avariant on these calculations is to repeat all the assumptions about Z and, T, assume, conditionally on, that T T,1,...,T, are the order statistics of a sample of uniformly distributed random variables on an interval [0,M], but now allow a distribution for. Then T,j j +1 M 2 M, and T,j j j,j 1 } T, j,j 1 +1 M, j j 2 +1 M M If we let be distributed according to 1+Poissonµ, then the AR will be asymptotically 3/4 again as µ.a more interesting choice of the distribution of is the Zetaα distribution given as follows: for α > 1, P k 1/kα, k 1, 2,..., ζα where ζα j α is the Riemann Zeta function. Then we can compute T,j j +1 M α M M ζα 1, 2 2 ζα and T, } α M, +1 T,j j j,j 1 α j,j 1 j j M +1 M 6 α2 + 1 M 6 2α 2 + α } } 2ζα 2 + ζα 1 M 6 ζα Hence in this case Iθ 1 α C 1 1 Mλ α +1,

19 and Semiparametric Regression Model for Panel Count Data 19 } 2ζα 2+ζα 1 V ps α C 1 6Mλ ζα ζα 1 2ζα 2, ARpseudo, mleα ARα α /2 2 α +1 } α ζα 1/ζα 2 α +1 } 2ζα 2+ζα 1 ζα 3 ζα 1 2 2ζα 2 + ζα 1} α +1 }, and this varies between 0 and 1 as α varies from 3 to ; note that for α 3, 2 ζ1, while ζ2/ζ See Figure 1. AR alpha Fig. 1. ARα If we change the distribution of to Unif1,...,k 0 }, then T,j j +1 M M k 0 +1 M, 2 4

20 20 Jon A. Wellner, Ying Zhang, and Hao Liu T,j j j j α M +1 j,j 1 j,j 1 M 6 M 6 M } 2 k k k } while Hence in this case and T, } M, +1 Iθ 1 α C 1 1 Mλ +1, k 0+12k 0+1/3+k 0+1/2 V ps k 0 C 1 6 Mλk /16, ARpseudo, mlek 0 ARk 0 / } 6 k /16 +1 } k0+12k0+1/3+k0+1/2 6 and this varies between 1 and 9/16 as k 0 varies from 1 to. Scenario 3. We now suppose that the underlying counting process is not a Poisson process, conditionally given Z, but is, instead, defined as in terms of the negative-binomialization of an empirical counting process, as follows: suppose that X 1,X 2,... are i.i.d. with distribution function F on R, and define n N n t 1 [Xi t] for t 0. i1 Suppose that N Z Negative BinomialrZ, γ, θ 0,p where rz, γ, θ 0 γe θ 0 Z, and define N by Nt N N t N 1 [Xi t]. i1,

21 Semiparametric Regression Model for Panel Count Data 21 Then, since it follows that N Z rz, γ, θ 0 q p, VarN Z rz, γ, θ 0 q p 2, Nt Z} [Nt Z, N] Z} NFt Z} γe θ 0 Z F t q p eθ 0 Z Λ 0 t, with Λ 0 t γftq/p. Alternatively, Nt Z Negative Binomialr, p p + qft by straightforward computation, and hence it follows that qft/p + qft Nt Z} r rft q p/p + qft p γeθ 0 Z q p F t eθ 0 Z Λ 0 t, in agreement with the above calculation. Moreover qft/p + qft VarNt Z} r p/p + qft 2 r q p F t1 + q p F t r q p F t+r q p 2 F t 2. Remark. It should be noted that this is somewhat different than the model obtained by supposing that the underlying counting process is a nonhomogeneous Poisson process conditionally given Z and an unobserved frailty Y Gammaγ,γ and conditional on Y and Z mean function In this model we have but Nt Z, Y } Ye θ 0 Z Λ 0 t. Nt Z} e θ 0 Z Λ 0 t, VarNt Z} Var[Nt Y,Z] Z} + Var[Nt Y,Z] Z} e θ 0 Z Λ 0 t+γ 1 e 2θ 0 Z Λ 2 0t. Now we want to calculate

22 22 Jon A. Wellner, Ying Zhang, and Hao Liu CT,j,T,j,T,j 1,T,j 1; Z [ N j e θ 0 Z Λ 0j ] N j e θ 0 Z Λ 0j Z,, T,j 1,T,j,T,j 1,T,j. By computing conditionally on N, and using the fact that conditionally on Z,, T,j 1,T,j,T,j 1,T,j and N, N has increments with a joint multinomial distribution, we find that CT,j,T,j,T,j 1,T,j 1; Z [ N j e θ 0 Z Λ 0j ] N j e θ 0 Z Z, Λ 0j, T,j 1,T,j,T,j 1,T,j [ N j N j N+ N j N e θ 0 Z Λ 0j N j N j N+ N j N e θ 0 Z Λ 0j ] N,Z,,T,j 1,T,j,T,j 1,T,j } Z,, T,j 1,T,j,T,j 1,T,j N F j 1 F j Z,, T } + } N rq/p 2 F j 2 Z,, T, if j j N F j F j Z,, T } + } N rq/p 2 F j F j Z,, T, if j j r q p F,j F,j 2 + r q p F 2 2,j, if j j } r q p r q 2 p F j F j, if j j r q p F,j + r q2 p F 2,j 2, if j j r q2 p F 2 j F j, if j<j e θ 0 Z Λ 0,,j + γ 1 e θ 0 Z Λ 0,,j 2, if j j, γe θ 0 Z q2 p F 2 j F j, if j j, e θ 0 Z Λ 0,,j + γ 1 e θ 0 Z Λ 0,,j 2, if j j, γ 1 e θ 0 Z Λ 0j Λ 0j, if j j. Remark. Note that if N Z PoissonrZ, γ, θ 0, then the process N N N is conditionally, given Z, anon-homogeneous Poisson process with conditional mean function Nt Z} γe θ 0 Z F t e θ 0 Z Λ 0 t, and conditional variance function VarNt Z} γe θ 0 Z F t e θ 0 Z Λ 0 t

23 Semiparametric Regression Model for Panel Count Data 23 where Λ 0 t γft. We will also assume, as in Scenarios 1 and 2, that the distribution of, T isindependent of Z. Asaconsequence, Z is independent of, T, and the formulas in the preceding section again simplify. By taking F uniform on [0,M], Λ 0 t λt where λ γ/mq/p, and we compute, using moments and covariances of uniform spacings, as found on page 721 of [28], B m θ 0,Λ 0 ; X 2,T,Z CT,j,T,j,T,j 1,T,j 1; Z j,j 1 Ze θ 0 Z, T,j 1,T,j Z e θ 0 Z, T,j 1,T,j Ze θ 0 Z, T,j Z 1,T,j e θ 0 Z, T,j 1,T,j,T,Z e θ 0 Z Λ 0j Ze θ 0 Z 2 + γ 1 e θ 0 Z Λ 0j 2 Z e Z θ 0 Ze θ 0 Z 2 +,T,Z γ 1 e θ 0 Z Λ 0j Λ 0j Z e Z θ j j 0 C,T Λ 0j + γ 1 Λ 0j 2 + γ 1,T Λ 0j Λ 0j j j } C λm + γ 1 λ 2 M } λmc + γ 1 λm where, as in scenario 1,

24 24 Jon A. Wellner, Ying Zhang, and Hao Liu Ze θ 0 Z 2 C Z eθ 0 Z Z e θ Z 0. On the other hand, we find that Ze θ 0 Z 2, T,j 1,T,j A,T,Z Λ 0j e θ 0 Z Z e θ 0 Z, T,j 1,T,j } CλM. +1 Thus the asymptotic variance of the ML for this scenario is } } A 1 BA 1 λmc λmγ +2 } Now for the asymptotic variance of the pseudo-ml under scenario 3. To calculate B ps we first need to calculate C ps T,j,T,j ; Z ] [N j e θ 0 Z Λ 0j N j e θ 0 Z Λ 0j Z,, T,j,T,j [N j e θ 0 Z Λ 0j N j e θ 0 Z Λ 0j } N,Z,, T,j,T,j ] Z,, T,j,T,j NF T,j T,j F T,j F T,j } +N rq/p 2 F T,j F T,j Z,, T,j,T,j r q p F T,j T,j F T,j F T,j } + r q p 2 F T,jF T,j e θ 0 Z Λ 0 T,j T,j +γ 1 e θ 0 Z Λ 0 T,j Λ 0 T,j e θ 0 Z Λ 0 T,j 1+γ 1 Λ 0 T,j if j j. We can then calculate B ps m ps θ 0,Λ 0 ; X 2 Ze θ 0 Z, T,j,T,Z C ps T,j,T,j ; Z Z e θ 0 Z, T,j j,j 1

25 Semiparametric Regression Model for Panel Count Data 25 Ze θ 0 Z, T,j Z e θ 0 Z, T,j,T,Z Λ0 T,j 1+γ 1 Λ 0 T,j e θ 0 Z j,j 1 Ze θ 0 Z 2 Z e θ Z 0 C λm j j + λ2 M 2 U,j U,j k +1 γ j,j 1 j,j 2 +1 C λm + λ2 M 2 U,j U,j 6 γ j,j 2 +1 C λm + λ2 M 2 } γ λmc + λm } γ 12 Ze θ 0 Z 2, T,j A ps,t,z Λ 0j e θ 0 Z Z e θ 0 Z, T,j λmc. 2 Thus we find that the asymptotic variance of the pseudo-ml is given by 2+1 A ps 1 B ps A ps 1 λmc λm γ } 2, 2 θ ps n and the asymptotic relative efficiency of the pseudo-mle to the mle is, under scenario 3, ARpseudo, mlenegbin +1}+ λm γ +1} λm γ 2 } 2 +2}

26 26 Jon A. Wellner, Ying Zhang, and Hao Liu } } +1 + λmγ 2 +2 / λm γ } } +1 + λmγ λm γ ARpseudo, mlep oisson 1+ λm +2 γ +1 ARpseudo, mlep oisson. 1+ λm γ Note that when factor λm/γ q/p 0, is zero then we recover our earlier result for the Poisson case. This is to be expected since P oissonλ 0 t expθ 0Z becomes the limiting distribution of Negative Binomialr, p/p + qft as p 1.

27 Semiparametric Regression Model for Panel Count Data 27 6 Conclusions and Further Problems Conclusions As in the case of panel count data without covariates studied in [29] and [33], the pseudo likelihood estimation method for the semiparametric proportional mean model with panel count data proposed and studied in [35] and [36] also has advantages in terms of computational simplicity. The results of section 5 above show that the maximum pseudo-likelihood estimator of the regression parameters can be very inefficient relative to the maximum likelihood estimator, especially when the distribution of is heavy - tailed. In such cases it is clear that we will want to avoid the pseudo-likelihood estimator, and the computational effort required by the full maximum likelihood estimators can be justified by the consequent gain in efficiency. Our derivation of the asymptotic normality of the maximum likelihood estimator of the regression parameters results in a relatively complicated expression for the asymptotic variance which may be difficult to estimate directly. Hence it becomes important to develop efficient algorithms for computation of the maximum likelihood estimator in order to allow implementation, for example, of bootstrap inference procedures. Alternatively, profile likelihood inference may be quite feasible in this model; see e.g. [24], [25], [26] for likelihood ratio procedures in some related interval censoring models. Further problems The asymptotic normality results stated in section 4 will be developed and given in detail in [34]. There are quite a large number of interesting problems still open concerning the semiparametric model for panel count data which has been studied here. Here is a short list: Find a fast and reliable algorithm for computation of the ML θ of θ. Although reasonable algorithms for computation of the maximum pseudolikelihood estimators have been proposed in [35] and [36] based on the earlier work of [29], good algorithms for computation of the maximum likelihood estimators have yet to be developed and implemented. Show that the natural semiparametric profile likelihood ratio procedures are valid for inference about the regression parameter θ via the theorems of [24], [25], and [26]. Do the non-standard likelihood ratio procdures and methods of [1] extend to the present model to give tests and confidence intervals for Λ 0 t? Are there compromise or hybrid estimators between the maximum pseudolikelihood estimators and the full maximum likelihood estimators which have the computational advantages of the former and the efficiency advantages of the latter? Do similar results continue to hold for panel count data with covariates, but with other models for the mean function replacing the proportional mean model given by 1?

28 28 Jon A. Wellner, Ying Zhang, and Hao Liu Are there computational or efficiency advantages to using the ML s for one of the class of Mixed Poisson Process N Z, for example the Negative- Binomial model? Further comparisons with the work of [6], [14], and [20], [21] would be useful. References [1] Banerjee, M. and Wellner, J. A Likelihood ratio tests for monotone functions. Ann. Statist. 29, [2] Begun, J. M., Hall, W. J., Huang, W.M., and Wellner, J. A Information and asymptotic efficiency in parametric-nonparametric models. Ann. Statist. 11, [3] Betensky, R.A., Rabinowitz, D., and Tsiatis, A. A Computationally simple accelerated failure time regression for interval censored data. Biometrika 88, [4] Bickel, P. J., laassen, C.A.J., Ritov, Y., and Wellner, J. A fficient and Adaptive stimation for Semiparametric Models. Johns Hopkins University Press, Baltimore. [5] Cook, R. J., Lawless, J. F., and Nadeau, C Robust tests for treatment comparisons based on recurrent event responses. Biometrics 52, [6] Dean, C. B. and Balshaw, R fficiency lost by analyzing counts rather than event times in Poisson and overdispersed Poisson regression models. J. Amer. Statist. Assoc. 92, [7] Gaver, D. P., and O Muircheartaigh, I. G Robust mpirical Bayes analysis of event rates, Technometrics, 29, [8] Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring, part 1. Statist. Neerlandica 50, [9] Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring, part 2. Statist. Neerlandica 51, [10] Geskus, R. and Groeneboom, P Asymptotically optimal estimation of smooth functionals for interval censoring case 2. Ann. Statist. 27, [11] Groeneboom, P Nonparametric maximum likelihood estimators for interval censoring and deconvolution. Technical Report 378, Department of Statistics, Stanford University. [12] Groeneboom, P Inverse problems in statistics. Proceedings of the St. Flour Summer School in Probability, Lecture Notes in Math. 1648, Springer Verlag, Berlin. [13] Groeneboom, P. and Wellner, J Information Bounds and Nonparametric Maximum Likelihood stimation. Birkhäuser, Basel. [14] Hougaard, P., Lee, M.T., and Whitmore, G. A Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes. Biometrics 53, [15] Huang, J fficient estimation for the Cox model with interval censoring, Annals of Statistics, 24,

29 Semiparametric Regression Model for Panel Count Data 29 [16] Huang, J., and Wellner, J. A fficient estimation for the Cox model with case 2 interval censoring, Technical Report 290, University of Washington Department of Statistics, 35 pages. [17] Jongbloed, G The iterative convex minorant algorithm for nonparametric estimation. Journal of Computation and Graphical Statistics 7, [18] albfleisch, J. D. and Lawless, J. F Statistical inference for observational plans arising in the study of life history processes. In Symposium on Statistical Inference and Applications In Honour of George Barnard s 65th Birthday. University of Waterloo, August 5-8, [19] albfleisch, J. D. and Lawless, J. F The analysis of panel count data under a Markov assumption. Journal of the American Statistical Association 80, [20] Lawless, J. F. 1987a. Regression methods for Poisson process data. J. Amer. Statist. Assoc. 82, [21] Lawless, J. F. 1987b. Negative binomial and mixed Poisson regression. Canad. J. Statist. 15, [22] Lawless, J. F. and Nadeau, C Some simple robust methods for the analysis of recurrent events. Technometrics 37, [23] Lin, D. Y., Wei, L. J., Yang, I., and Ying, Z Semiparametric regression for the mean and rate functions of recurrent events. J. Roy. Statist. Soc. B, [24] Murphy, S. and Van der Vaart, A. W Semiparametric likelihood ratio inference. Ann. Statist. 25, [25] Murphy, S. and Van der Vaart, A. W Observed information in semiparametric models. Bernoulli 5, [26] Murphy, S. and Van der Vaart, A. W On profile likelihood. J. Amer. Statist. Assoc. 95, [27] Rabinowitz, D., Betensky, R. A., and Tsiatis, A. A Using conditional logistic regression to fit proportional odds models to interval censored data. Biometrics 56, [28] Shorack, G. R. and Wellner, J. A mpirical Processes with Applications to Statistics. Wiley, New York. [29] Sun, J. and albfleisch, J. D stimation of the mean function of point processes based on panel count data. Statistica Sinica 5, [30] Sun, J. and Wei, L. J Regression analysis of panel count data with covariate-dependent observation and censoring times. J. R. Stat. Soc. Ser. B 62, [31] Thall, P. F., and Lachin, J. M Analysis of Recurrent vents: Nonparametric Methods for Random-Interval Count Data. J. Amer. Statist. Assoc. 83, [32] Thall, P. F Mixed Poisson likelihood regression models for longitudinal interval count data. Biometrics 44, [33] Wellner, J. A. and Zhang, Y Two estimators of the mean of a counting process with panel count data. Ann. Statist. 28, [34] Wellner, J. A., Zhang, Y., and Liu Large sample theory for two estimators in a semiparametric model for panel count data. Manuscript in progress. [35] Zhang, Y stimation for Counting Processes Based on Incomplete Data. Unpublished Ph.D. dissertation, University of Washington. [36] Zhang, Y A semiparametric pseudo likelihood estimation method for panel count data. Biometrika 89,

30 30 Jon A. Wellner, Ying Zhang, and Hao Liu University of Washington Statistics Box Seattle, Washington U.S.A. Department of Statistics University of Central Florida P.O. Box Orlando, Florida University of Washington Biostatistics Box Seattle, Washington U.S.A.

A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient?

A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood Estimators Become Badly Inefficient? A Semiparametric Regression Model for Panel Count Data: When Do Pseudo-likelihood stimators Become Badly Inefficient? Jon A. Wellner 1, Ying Zhang 2, and Hao Liu Department of Statistics, University of

More information

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Two Likelihood-Based Semiparametric Estimation Methods for Panel Count Data with Covariates

Two Likelihood-Based Semiparametric Estimation Methods for Panel Count Data with Covariates Two Likelihood-Based Semiparametric Estimation Methods for Panel Count Data with Covariates Jon A. Wellner 1 and Ying Zhang 2 December 1, 2006 Abstract We consider estimation in a particular semiparametric

More information

Models for Multivariate Panel Count Data

Models for Multivariate Panel Count Data Semiparametric Models for Multivariate Panel Count Data KyungMann Kim University of Wisconsin-Madison kmkim@biostat.wisc.edu 2 April 2015 Outline 1 Introduction 2 3 4 Panel Count Data Motivation Previous

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Estimation for two-phase designs: semiparametric models and Z theorems

Estimation for two-phase designs: semiparametric models and Z theorems Estimation for two-phase designs:semiparametric models and Z theorems p. 1/27 Estimation for two-phase designs: semiparametric models and Z theorems Jon A. Wellner University of Washington Estimation for

More information

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p. Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington Maximum likelihood: p. 1/7 Talk at University of Idaho, Department of Mathematics, September 15,

More information

Estimation of the Mean Function with Panel Count Data Using Monotone Polynomial Splines

Estimation of the Mean Function with Panel Count Data Using Monotone Polynomial Splines Estimation of the Mean Function with Panel Count Data Using Monotone Polynomial Splines By MINGGEN LU, YING ZHANG Department of Biostatistics, The University of Iowa, 200 Hawkins Drive, C22 GH Iowa City,

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models

A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models Jian Huang Department of Statistics University of Iowa Ying Zhang and Lei Hua Department of Biostatistics University

More information

Maximum likelihood: counterexamples, examples, and open problems

Maximum likelihood: counterexamples, examples, and open problems Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington visiting Vrije Universiteit, Amsterdam Talk at BeNeLuxFra Mathematics Meeting 21 May, 2005 Email:

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

and Comparison with NPMLE

and Comparison with NPMLE NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle 2015 IMS China, Kunming July 1-4, 2015 2015 IMS China Meeting, Kunming Based on joint work

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle European Meeting of Statisticians, Amsterdam July 6-10, 2015 EMS Meeting, Amsterdam Based on

More information

A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis

A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis Jon A. Wellner and Vladimir Koltchinskii Abstract. Proofs are given of the limiting null distributions of the

More information

arxiv:math/ v2 [math.st] 5 Dec 2007

arxiv:math/ v2 [math.st] 5 Dec 2007 The Annals of Statistics 2007, Vol. 35, No. 5, 2106 2142 DOI: 10.1214/009053607000000181 c Institute of Mathematical Statistics, 2007 arxiv:math/0509132v2 [math.st] 5 Dec 2007 TWO LIKELIHOOD-BASED SEMIPARAMETRIC

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Spline-based sieve semiparametric generalized estimating equation for panel count data

Spline-based sieve semiparametric generalized estimating equation for panel count data University of Iowa Iowa Research Online Theses and Dissertations Spring 2010 Spline-based sieve semiparametric generalized estimating equation for panel count data Lei Hua University of Iowa Copyright

More information

1. Introduction In many biomedical studies, the random survival time of interest is never observed and is only known to lie before an inspection time

1. Introduction In many biomedical studies, the random survival time of interest is never observed and is only known to lie before an inspection time ASYMPTOTIC PROPERTIES OF THE GMLE WITH CASE 2 INTERVAL-CENSORED DATA By Qiqing Yu a;1 Anton Schick a, Linxiong Li b;2 and George Y. C. Wong c;3 a Dept. of Mathematical Sciences, Binghamton University,

More information

A Measure of Association for Bivariate Frailty Distributions

A Measure of Association for Bivariate Frailty Distributions journal of multivariate analysis 56, 6074 (996) article no. 0004 A Measure of Association for Bivariate Frailty Distributions Amita K. Manatunga Emory University and David Oakes University of Rochester

More information

Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Jian Huang Department of Statistics and Actuarial Science University of Iowa, Iowa City, IA 52242 Email: jian@stat.uiowa.edu

More information

A note on L convergence of Neumann series approximation in missing data problems

A note on L convergence of Neumann series approximation in missing data problems A note on L convergence of Neumann series approximation in missing data problems Hua Yun Chen Division of Epidemiology & Biostatistics School of Public Health University of Illinois at Chicago 1603 West

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

Issues on quantile autoregression

Issues on quantile autoregression Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Nonparametric estimation under shape constraints, part 2

Nonparametric estimation under shape constraints, part 2 Nonparametric estimation under shape constraints, part 2 Piet Groeneboom, Delft University August 7, 2013 What to expect? Theory and open problems for interval censoring, case 2. Same for the bivariate

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

Statistica Sinica Preprint No: SS R2

Statistica Sinica Preprint No: SS R2 Statistica Sinica Preprint No: SS-2016-0534.R2 Title A NONPARAMETRIC REGRESSION MODEL FOR PANEL COUNT DATA ANALYSIS Manuscript ID SS-2016-0534.R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202016.0534

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

CURE MODEL WITH CURRENT STATUS DATA

CURE MODEL WITH CURRENT STATUS DATA Statistica Sinica 19 (2009), 233-249 CURE MODEL WITH CURRENT STATUS DATA Shuangge Ma Yale University Abstract: Current status data arise when only random censoring time and event status at censoring are

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Fiducial Inference and Generalizations

Fiducial Inference and Generalizations Fiducial Inference and Generalizations Jan Hannig Department of Statistics and Operations Research The University of North Carolina at Chapel Hill Hari Iyer Department of Statistics, Colorado State University

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Ensemble estimation and variable selection with semiparametric regression models

Ensemble estimation and variable selection with semiparametric regression models Ensemble estimation and variable selection with semiparametric regression models Sunyoung Shin Department of Mathematical Sciences University of Texas at Dallas Joint work with Jason Fine, Yufeng Liu,

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

Semiparametric Regression Analysis of Panel Count Data and Interval-Censored Failure Time Data

Semiparametric Regression Analysis of Panel Count Data and Interval-Censored Failure Time Data University of South Carolina Scholar Commons Theses and Dissertations 2016 Semiparametric Regression Analysis of Panel Count Data and Interval-Censored Failure Time Data Bin Yao University of South Carolina

More information

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Statistics Preprints Statistics -00 A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Jianying Zuo Iowa State University, jiyizu@iastate.edu William Q. Meeker

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

On the Breslow estimator

On the Breslow estimator Lifetime Data Anal (27) 13:471 48 DOI 1.17/s1985-7-948-y On the Breslow estimator D. Y. Lin Received: 5 April 27 / Accepted: 16 July 27 / Published online: 2 September 27 Springer Science+Business Media,

More information

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo) Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: 1304.0282 Victor MIT, Economics + Center for Statistics Co-authors: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics

Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee 1 and Jon A. Wellner 2 1 Department of Statistics, Department of Statistics, 439, West

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

An Information Criteria for Order-restricted Inference

An Information Criteria for Order-restricted Inference An Information Criteria for Order-restricted Inference Nan Lin a, Tianqing Liu 2,b, and Baoxue Zhang,2,b a Department of Mathematics, Washington University in Saint Louis, Saint Louis, MO 633, U.S.A. b

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

Additive Isotonic Regression

Additive Isotonic Regression Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive

More information

A Conditional Approach to Modeling Multivariate Extremes

A Conditional Approach to Modeling Multivariate Extremes A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying

More information

Poisson Processes. Stochastic Processes. Feb UC3M

Poisson Processes. Stochastic Processes. Feb UC3M Poisson Processes Stochastic Processes UC3M Feb. 2012 Exponential random variables A random variable T has exponential distribution with rate λ > 0 if its probability density function can been written

More information

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St.

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St. Regression Graphics R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Abstract This article, which is based on an Interface tutorial, presents an overview of regression

More information

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725±732, 2001 Non-parametric Tests for the Comparison of Point Processes Based

More information

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Statistica Sinica 19 (2009), 997-1011 ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Anthony Gamst, Michael Donohue and Ronghui Xu University

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

Nonparametric estimation of log-concave densities

Nonparametric estimation of log-concave densities Nonparametric estimation of log-concave densities Jon A. Wellner University of Washington, Seattle Seminaire, Institut de Mathématiques de Toulouse 5 March 2012 Seminaire, Toulouse Based on joint work

More information

Panel Count Data Regression with Informative Observation Times

Panel Count Data Regression with Informative Observation Times UW Biostatistics Working Paper Series 3-16-2010 Panel Count Data Regression with Informative Observation Times Petra Buzkova University of Washington, buzkova@u.washington.edu Suggested Citation Buzkova,

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach By Shiqing Ling Department of Mathematics Hong Kong University of Science and Technology Let {y t : t = 0, ±1, ±2,

More information

Step-Stress Models and Associated Inference

Step-Stress Models and Associated Inference Department of Mathematics & Statistics Indian Institute of Technology Kanpur August 19, 2014 Outline Accelerated Life Test 1 Accelerated Life Test 2 3 4 5 6 7 Outline Accelerated Life Test 1 Accelerated

More information

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Statistica Sinica 23 (2013), 383-408 doi:http://dx.doi.org/10.5705/ss.2011.151 A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Chi-Chung Wen and Yi-Hau Chen

More information

Probabilistic Inference for Multiple Testing

Probabilistic Inference for Multiple Testing This is the title page! This is the title page! Probabilistic Inference for Multiple Testing Chuanhai Liu and Jun Xie Department of Statistics, Purdue University, West Lafayette, IN 47907. E-mail: chuanhai,

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information