Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data


Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA. Scand J Statist, Vol 28: 725-732, 2001

JIANGUO SUN
University of Missouri

SHESH N. RAI
St. Jude Children's Research Hospital

ABSTRACT. We consider the comparison of point processes in a discrete observation situation in which each subject is observed only at discrete time points and no history information between observation times is available. A class of non-parametric test statistics for the comparison of point processes based on this kind of data is presented and their asymptotic distributions are derived. The proposed tests are generalizations of the corresponding tests for continuous observations. Some results from a simulation study for evaluating the proposed tests are presented and an illustrative example from a clinical trial is discussed.

Key words: clinical trial, non-parametric test, panel count data, point process

1. Introduction

We discuss non-parametric statistical methods for the comparison of continuous point processes. The case in which each subject under study is observed over a time interval (continuous observation) has been investigated by a number of authors and is thoroughly discussed in the book by Andersen et al. (1993). This paper discusses a discrete observation situation that has received relatively little attention in the literature. Specifically, we consider situations in which, for each subject, observations are taken only at discrete time points and no information about histories between observation times is available. In other words, only the cumulative numbers of events of interest are observed at successive time points, but the specific times of the events are not available. This kind of data is often referred to as panel count data (Kalbfleisch & Lawless, 1985; Sun & Kalbfleisch, 1995).
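As a concrete illustration of this data structure, the following is a minimal Python sketch of one subject's panel count record. The class name `PanelCountRecord` and its fields are hypothetical and are not taken from the paper; the sketch only assumes that a record holds the subject's population, its observation times and the cumulative event counts seen at those times.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PanelCountRecord:
    """One subject's panel count data (hypothetical layout for illustration)."""
    group: int             # population the subject belongs to
    times: List[float]     # increasing observation times
    cum_counts: List[int]  # cumulative number of events at each time

    def increments(self) -> List[int]:
        """Numbers of events between successive observation times."""
        previous, result = 0, []
        for count in self.cum_counts:
            result.append(count - previous)
            previous = count
        return result


# A subject seen three times; the event times themselves are unknown.
record = PanelCountRecord(group=0, times=[0.1, 0.4, 0.9], cum_counts=[2, 2, 5])
print(record.increments())  # [2, 0, 3]
```

Only the increments between visits are recoverable from such a record, which is exactly the information the tests below work with.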
Panel count data arise naturally when a recurrent event is under investigation and it is difficult to keep subjects under observation over the entire study period. For example, in medical follow-up studies of chronic diseases (e.g. bladder cancer and gallstones), patients are often required by protocol to visit a clinical centre for evaluation at prespecified times. At each visit, the patient reports the number of events of a certain type which are related to the disease under investigation and have occurred since the previous visit. It is often not practical or even possible for the patient to remember and record the time of the occurrence of each event. Thus, the data include only the numbers of events that have occurred between observation times. One such example is discussed in section 4. Panel count data also occur, for example, in AIDS clinical trials, animal tumorigenicity experiments, reliability studies and sociological studies (Gaver & O'Muircheartaigh, 1987; Sun & Kalbfleisch, 1993; Rai & Matthews, 1997). If each subject is observed only once, the data are usually referred to as current status data (Sun & Kalbfleisch, 1993).

For the comparison of point processes based on panel count data, the main difficulty is that observation times usually differ among subjects in the study. Consider, for example, a clinical trial in which patients are required to visit the clinic at the same prespecified successive times. If all the patients follow the scheduled visit times, we have a standard inference problem which is relatively easy to address. However, this rarely happens in practice for various reasons. Among others, Thall & Lachin (1988) considered this problem in the context of repeated measurements and Sun & Kalbfleisch (1993) discussed it for current status data. More recently, Staniswalis et al. (1997) and Zhang (1999) approached the problem by fitting the data to regression models with treatment indicators as covariates.

In this paper, we use the basic idea from the analysis of continuous observation data to construct test statistics for the discrete case. In section 2, we begin with a brief review of the method used in the continuous observation case. Section 3 discusses the construction of statistics for testing the equality of the intensities of several point processes based on panel count data. The proposed test statistics are generalizations of the corresponding test statistics for continuous observations. Section 4 reports some results from a simulation study for evaluating the performance of the proposed tests, which suggest that the method works well under the situations considered. An illustrative example from a clinical trial is also presented in section 4. Section 5 concludes the paper with some remarks.

The general notation and assumptions required are as follows. Suppose that there are $n$ independent subjects who come from $k$ populations and give rise to continuous point processes. Let $N_{ij}(t)$ denote the process generated by the $j$th subject of the $i$th population, i.e. the number of occurrences of the event of interest by time $t$ from the subject, $j = 1, \ldots, n_i$, $i = 1, \ldots, k$, $0 \le t \le 1$, where $\sum_{i=1}^{k} n_i = n$ (for convenience, we assume that time $t$ belongs to the interval $[0, 1]$).
Suppose that $\{N_{ij}(t)\}$ are multivariate counting processes with multiplicative intensity models $\{\alpha_{ij}(t) = \lambda_i(t) Y_{ij}(t);\ t \in [0, 1]\}$, where $\lambda_i(t)$ is a non-negative deterministic function and $Y_{ij}(t)$ is a process indicating (by value 1) whether subject $(ij)$ is in the study at time $t$. For a thorough discussion of multivariate counting processes, see Andersen et al. (1993). It will be assumed that observation times are non-informative about the processes of interest, which means that, in medical follow-up studies, the visit times of each patient are independent of the disease status.

2. Test for continuous observation data

Suppose that each subject is observed over a time interval starting at $t = 0$ and consider the hypothesis $H_0: \lambda_1(t) = \cdots = \lambda_k(t)$ ($t \in [0, 1]$). For this problem, Andersen et al. (1982) proposed the following method, which begins with estimating the cumulative intensity functions. Let $\lambda(t)$ denote the common value of $\lambda_1(t), \ldots, \lambda_k(t)$ under $H_0$. Define an estimator of the cumulative intensity function $\Lambda_i(t) = \int_0^t \lambda_i \, ds$ as

$$\hat\Lambda_i^c(t) = \int_0^t \sum_{j=1}^{n_i} \{J_{ij}/Y_i\} \, dN_{ij},$$

where $Y_i = \sum_{j=1}^{n_i} Y_{ij}$ and $J_{ij} = I\{Y_{ij} > 0\}$. Define an estimator of the common cumulative intensity function $\Lambda(t) = \int_0^t \lambda \, ds$ as

$$\hat\Lambda^c(t) = \int_0^t \sum_{i=1}^{k} \sum_{j=1}^{n_i} \{J_{ij}/Y\} \, dN_{ij}$$

under $H_0$, where $Y = \sum_{i=1}^{k} Y_i$. For $H_0$, define a vector of statistics $Z^c = \{Z_1^c(1), \ldots, Z_k^c(1)\}'$, where

$$Z_i^c(t) = \int_0^t K Y_i \{d\hat\Lambda_i^c - J_i \, d\hat\Lambda^c\} = \int_0^t \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} Y_i \left( \frac{\delta_{ih}}{Y_i} - \frac{1}{Y} \right) dN_{hl},$$

$K$ is a weighting process, $\delta_{ih}$ is a Kronecker delta and $J_i = I\{Y_i > 0\}$. By an application of the martingale central limit theorem to $\{Z_i^c(t),\ t \in [0, 1],\ i = 1, \ldots, k\}$, Andersen et al. (1982) showed that $Z^c$ is asymptotically multinormally distributed with mean 0. Thus, a test of $H_0$ can be based on an approximate $\chi^2(k - 1)$ distribution for $Z^{c\prime} V^{c-} Z^c$, where $V^{c-}$ is a generalized inverse of the covariance matrix estimate of $Z^c$ (see Andersen et al., 1982).

When only panel count data are available, the processes $N_{ij}(t)$ and $Y_i(t)$ are not completely known. Thus, the above method is not valid in this case and some adjustments have to be made to apply a similar approach.

3. Non-parametric test for panel count data

In this section we consider panel count data. That is, each subject is assumed to be observed only at discrete time points and thus only the cumulative numbers of events of interest are observed at observation times. The specific occurrence times of the events are unknown. For subject $(ij)$, let $0 < t_{ij1} < \cdots < t_{ijm_{ij}}$ denote the observation times. Then the observed data consist of $\{N_{ij}(t_{ijl});\ l = 1, \ldots, m_{ij},\ j = 1, \ldots, n_i,\ i = 1, \ldots, k\}$. Define $Y_{ij}(t) = I(t \le t_{ijm_{ij}})$, $j = 1, \ldots, n_i$, $i = 1, \ldots, k$. Our aim is to test the hypothesis $H_0$.

Let $\lambda(t)$, $\Lambda(t)$ and the $\Lambda_i(t)$ be defined as in the previous section. To construct the test statistic for $H_0$, we follow the basic idea used in the previous section and start by considering how to estimate the $\Lambda_i(t)$ and $\Lambda(t)$. For this purpose, for each $u$ ($= 1, \ldots, k$) and subject $(ij)$, define $Y_u^{(ij)}(t) = Y_u(t_{ijl})$ if $t \in (t_{ij,l-1}, t_{ijl}]$ (with $t_{ij0} = 0$) and $Y_u^{(ij)}(t) = Y_u(t)$ if $t > t_{ijm_{ij}}$, $l = 1, \ldots, m_{ij}$, where $Y_u(t) = \sum_{l=1}^{n_u} Y_{ul}(t)$ as defined before. Note that here it is assumed that a patient known to be under observation at $t_{ijl}$ must have been at risk of events during the preceding interval $(t_{ij,l-1}, t_{ijl}]$. It is seen that $Y_u^{(ij)}$ has a similar meaning to $Y_u$, the number of individuals from the $u$th population under study, and is also a step function. Corresponding to $\hat\Lambda_i^c$, define

$$\hat\Lambda_i(t) = \int_0^t \sum_{j=1}^{n_i} \{J_{ij}/Y_i^{(ij)}\} \, dN_{ij}.$$

With respect to $\Lambda(t)$, for subject $(ij)$, define $Y^{(ij)}(t) = \sum_{u=1}^{k} Y_u^{(ij)}(t)$ and

$$\hat\Lambda(t) = \int_0^t \sum_{i=1}^{k} \sum_{j=1}^{n_i} \{J_{ij}/Y^{(ij)}\} \, dN_{ij}.$$
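The definitions above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the tuple layout `(group, times, cum_counts)` and the function names are assumptions, and $J_{ij}$ is taken to be 1 on every interval up to the subject's last observation time. Because $Y_i^{(ij)}$ is constant on each interval $(t_{ij,l-1}, t_{ijl}]$, the integral defining $\hat\Lambda_i(1)$ reduces to a sum of count increments divided by the number at risk at the right endpoint.

```python
def at_risk(subjects, group, t):
    """Y_u(t): subjects of `group` whose last observation time is >= t."""
    return sum(1 for g, times, _ in subjects if g == group and times[-1] >= t)


def cumulative_intensity(subjects, group):
    """Estimate Lambda_i(1) from panel counts, taking J_ij = 1 while observed."""
    total = 0.0
    for g, times, cum_counts in subjects:
        if g != group:
            continue
        previous = 0
        for t, count in zip(times, cum_counts):
            # Y_i^{(ij)} is constant on (t_{l-1}, t_l] and equals Y_i(t_l).
            total += (count - previous) / at_risk(subjects, group, t)
            previous = count
    return total


# Hypothetical toy data: (group, observation times, cumulative counts).
subjects = [
    (0, [0.5, 1.0], [1, 3]),
    (0, [0.5], [2]),
    (1, [1.0], [4]),
]
lambda_hat_0 = cumulative_intensity(subjects, 0)  # 1/2 + 2/1 + 2/2 = 3.5
lambda_hat_1 = cumulative_intensity(subjects, 1)  # 4/1 = 4.0
```

Recomputing the at-risk count at every visit keeps the sketch short; for large studies one would precompute it over the distinct observation times.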
Corresponding to the statistic $Z^c$, define the statistic $Z = \{Z_1(1), \ldots, Z_k(1)\}'$, where

$$Z_i(t) = \int_0^t \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} Y_i^{(hl)} \left( \frac{\delta_{ih}}{Y_i^{(hl)}} - \frac{1}{Y^{(hl)}} \right) dN_{hl}, \qquad (1)$$

and $K = K(s)$ ($0 \le s \le 1$) is a step-function process with possible jumps at the common observation times of subjects under study. For the selection of the weight process $K$, the comments and suggestions given for continuous observation data (Andersen et al., 1993) apply here. It should be noted that, compared to the continuous observation case, the choice of $K$ here may be limited by the structure of the panel count data. Of course, the choice $K = 1$ is always available, irrespective of common observation times.

It is easy to see that if continuous observations are obtained, $Y_u^{(ij)}(t) = Y_u(t)$ and $Y^{(ij)}(t) = Y(t)$, $j = 1, \ldots, n_i$, $i, u = 1, \ldots, k$. Therefore, the $\hat\Lambda_i$ and $\hat\Lambda$ are just the estimators $\hat\Lambda_i^c$ and $\hat\Lambda^c$, respectively, and the statistic $Z = \{Z_1(1), \ldots, Z_k(1)\}'$ is exactly the test statistic $Z^c$ described in the previous section. Taking $K = 1$ in (1), we have

$$Z_i(1) = N_i(1) - \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} J_{hl} \frac{Y_i^{(hl)}}{Y^{(hl)}} \, dN_{hl}, \quad i = 1, \ldots, k,$$

where $N_i = \sum_{j=1}^{n_i} N_{ij}$.

For the asymptotic normality of the statistic $Z$, suppose that the regularity conditions in theorem 3.1 of Andersen et al. (1982) for the asymptotic normality of the statistic $Z^c$ hold. Also suppose that $n_i/n \to p_i$ as $n \to \infty$, $i = 1, \ldots, k$, where the $p_i$ are constants between 0 and 1 and $\sum_{i=1}^{k} p_i = 1$. Then it can be shown that under $H_0$, if

$$Y_i / Y \to p_i, \quad i = 1, \ldots, k \qquad (2)$$

in probability uniformly on $s$ as $n \to \infty$, $Z = \{Z_1(1), \ldots, Z_k(1)\}'$ is asymptotically multinormally distributed with mean 0 and a covariance matrix that can be estimated by $V = \{a_{i_1 i_2}\}$, $i_1, i_2 = 1, \ldots, k$, where

$$a_{i_1 i_2} = \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K^2 J_{hl} \left( \delta_{i_1 h} - \frac{Y_{i_1}^{(hl)}}{Y^{(hl)}} \right) \left( \delta_{i_2 h} - \frac{Y_{i_2}^{(hl)}}{Y^{(hl)}} \right) dN_{hl}. \qquad (3)$$

The proof of the above result is sketched in the appendix. We remark that condition (2) means that the ratio of the number of subjects from each population under study to the total number of subjects under study is roughly constant over time. Note that this is required only for the situation where $H_0$ is true and is usually the case for longitudinal follow-up studies that prespecify observation times, which are thus non-informative. In particular, in many clinical trials, subjects are randomly assigned to treatment groups. In this case, it is usually reasonable to assume that the processes for patient withdrawal from the study are the same among treatment groups if the observation processes are independent of the underlying point process of interest.

For the calculation of the test statistic $Z$ and its covariance matrix estimate, note that they are summations of integrals over all subjects. For each integral, the integrand is a step function with jumps only at the observation times corresponding to that subject, and thus the integral is the weighted summation of the increments of the individual point process between these observation times.

Using the above result, we can test the hypothesis $H_0$ based on the statistic $\chi^2 = Z' V^- Z$, which has asymptotically a chi-squared distribution with $k - 1$ degrees of freedom, where $V^-$ is a generalized inverse of $V$. The test statistic can also be formed using any $k - 1$ elements of $Z$ and the inverse of the corresponding $(k - 1) \times (k - 1)$ submatrix of $V$.

4. Simulation study and an example

A simulation study was conducted to evaluate the performance of the test proposed in the previous section. Two aspects of the test were considered. One was the asymptotic normal approximation to the null distribution of the statistic $Z$ in finite sample situations and the other was the power of the test. In the simulation study, we considered a two-sample comparison problem ($k = 2$) and assumed that $N_{ij}(t)$ is a Poisson process with cumulative intensity function $\Lambda(t) = at$ for subjects in one population and $\Lambda(t) = at \exp(\beta)$ for subjects in the other population, where $a$ is a constant and $\beta$ represents the difference between the two populations. For observation times, we mimicked the situation that is usually the case for long-term medical follow-up studies or clinical trials and assumed that each subject is scheduled to be observed at times $t = 1, \ldots, r$, with each observation missed with probability $p$. Here $r$ is an integer and $p$ can be chosen to give a certain percentage of missing observations. An example of such a situation is the periodic follow-up study in which all possible observation times are weeks, months, or years 1, 2, ... The results reported below are based on 1000 replications and for the situation where $n_1 = n_2 = 25$, $a = 1$ and $r = 10$ unless specified otherwise.

To investigate the normal distribution approximation, we obtained quantile plots against the standard normal distribution of the standardized test statistic $Z^* = Z / V^{1/2}$ and present in Fig. 1 one such plot with $K = 1$ and 50% missing observations. A diagonal straight line is drawn in the figure to help visual assessment of the plot. Similar plots were obtained for other situations. The plots suggest that the normal distribution approximation to the null distribution of $Z$ is quite good under the situations considered.

Fig. 1. Quantile plot of the test statistic with 50% censoring.

Table 1 presents the sizes and powers of the same test statistic for three different missing percentages of observations and a range of $\beta$ values with the significance level of 0.05. As expected, the power increases when the missing percentage of observations decreases.

Table 1. Sizes and powers of the non-parametric test

                      True β
Missing percentage    -0.5     -0.4     -0.2     0        0.2      0.4      0.5
70%                   0.933    0.910    0.283    0.057    0.451    0.973    1.000
50%                   0.996    0.954    0.493    0.047    0.566    0.997    1.000
20%                   0.999    0.970    0.562    0.049    0.657    0.998    1.000

Similar simulation results to those presented above were obtained for other settings. In particular, as a referee suggested, we considered a situation which is similar to that in the example considered below and in which $n_1 = 50$, $n_2 = 65$, and $r = 12$ (months). For this case, based on 10 000 replications with 50% missing observations and the significance level of 0.05, we obtained a size of 0.0498 at $\beta = 0$, and powers of 0.5030 and 0.5044 at $\beta = \pm 0.1$, 0.9742 and 0.8741 at $\beta = \pm 0.2$, and 1 and 1 at $\beta = \pm 0.4$.

Now we illustrate the proposed methodology by applying it to the data presented in table 1 of Thall & Lachin (1988), which gives the successive visit times in weeks and the associated counts of episodes of nausea for 113 patients with floating gallstones in a follow-up study. These data comprise the first year follow-up of patients in two study groups, placebo (48) and high-dose chenodiol (65), from the National Cooperative Gallstone Study. The whole study consists of 916 patients who were randomized to the placebo, low-dose, or high-dose group and followed for up to two years. One of the objectives of the study was to test the difference between the two treatments with respect to the incidence rate of nausea. During the study, patients were scheduled to return for clinical visits at 1, 2, 3, 6, 9, and 12 months, and asked to report the total number of each type of symptom relating to the disease (e.g. nausea) that had occurred since the last clinical visit. Most patients visited about six times within the first year, but actual visit times differed from patient to patient. As pointed out by Thall & Lachin (1988), there is no evidence that the number of observations and actual observation times are related to the incidence of nausea, and so it seems reasonable to assume that the real visit times are non-informative.

To test the difference between the two groups, the test procedure described in section 3 was used. The statistic $\chi^2$ with $K = 1$ gives an observed value of 0.0206 with the corresponding P-value of 0.8859. This agrees with the original analysis given by Schoenfield et al. (1981) and suggests that the incidence rates of nausea do not differ significantly between the patients in the placebo and high-dose chenodiol groups. Thall & Lachin (1988) and Sun & Kalbfleisch (1995) graphically compared the empirical incidence rates and the cumulative incidence rates of nausea for the patients in the two groups, respectively. Both studies suggested that the incidence rate for the placebo group was higher initially and then became lower than that for the high-dose chenodiol group during the first year follow-up.

5. Concluding remarks

A class of non-parametric tests for the comparison of point processes based on panel count data is presented in this paper. The tests are appropriate if observation processes are non-informative about the processes of interest. On the other hand, if observation times depend on the underlying point process of interest, then methods which take this into account need to be used. The appropriateness of the proposed test also requires that condition (2) holds under the null hypothesis. As pointed out before, this is usually true in many follow-up studies, especially clinical trials, as long as observation processes are non-informative.
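For readers who want to experiment, the two-sample test of section 3 with $K = 1$ and the simulation design of section 4 can be sketched in Python as follows. This is a hedged sketch, not the authors' code: the function names, the tuple layout `(group, times, cum_counts)` and the choice to drop subjects who miss every visit are assumptions made for illustration, and the rate of the second group is taken as $a \exp(\beta)$. For $k = 2$ the statistic is formed from the single element $Z_1(1)$ and $a_{11}$, and the P-value uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math
import random


def poisson(mean, rng):
    """Draw one Poisson(mean) variate (Knuth's multiplication method)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(n_per_group, a=1.0, beta=0.0, r=10, p_miss=0.5, seed=0):
    """Panel count data for two groups as (group, times, cum_counts) tuples."""
    rng = random.Random(seed)
    subjects = []
    for group, n in enumerate(n_per_group):
        rate = a * math.exp(beta) if group == 1 else a
        for _ in range(n):
            times, cum_counts, total = [], [], 0
            for t in range(1, r + 1):
                total += poisson(rate, rng)  # events in (t - 1, t]
                if rng.random() >= p_miss:   # the visit at t is not missed
                    times.append(t)
                    cum_counts.append(total)
            if times:                        # assumption: drop never-seen subjects
                subjects.append((group, times, cum_counts))
    return subjects


def p_value_1df(chi2):
    """Upper tail of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def two_sample_test(subjects):
    """Chi-squared statistic Z_1(1)^2 / a_11 with K = 1, and its P-value."""
    def at_risk(group, t):
        return sum(1 for g, times, _ in subjects if g == group and times[-1] >= t)

    z, v = 0.0, 0.0
    for group, times, cum_counts in subjects:
        previous = 0
        for t, count in zip(times, cum_counts):
            y0 = at_risk(0, t)
            y = y0 + at_risk(1, t)
            weight = (1.0 if group == 0 else 0.0) - y0 / y
            z += weight * (count - previous)       # contribution to Z_1(1)
            v += weight ** 2 * (count - previous)  # contribution to a_11
            previous = count
    if v == 0.0:
        raise ValueError("variance estimate is zero; the test is undefined")
    chi2 = z * z / v
    return chi2, p_value_1df(chi2)


toy = [(0, [1, 2], [1, 3]), (1, [1, 2], [0, 1])]
chi2, p = two_sample_test(toy)  # chi2 = 1.0 for this toy data set
sim_chi2, sim_p = two_sample_test(simulate([25, 25], beta=0.0))
```

As a consistency check, `p_value_1df(0.0206)` is approximately 0.886, in line with the P-value reported for the gallstone example. Repeating the last line over many seeds and counting how often `sim_p` falls below 0.05 gives a rough estimate of the size of the test under this sketch's assumptions.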
As mentioned before, an alternative to the proposed non-parametric method is to fit the data to a parametric or semiparametric regression model with treatment indicators as covariates. For example, Thall (1988) considered the parametric Poisson regression model and Staniswalis et al. (1997) discussed fitting a semiparametric regression model. As usual, both parametric and semiparametric methods need model checking and this is often difficult. In addition, for panel count data, more computational effort could be required for the parametric and semiparametric methods than for the non-parametric approach. For example, to test if regression parameters are equal to zero under the semiparametric model, the baseline rate or mean function has to be estimated, which is not straightforward due to the structure of the data (Staniswalis et al., 1997; Zhang, 1999).

Two hypotheses that are sometimes of interest and can be assessed in a similar manner to that discussed in section 3 are $H_{01}: \lambda_i(t) = \lambda_i^0(t)$ ($t \in [0, 1]$) for fixed $i$ and $H_{02}: \lambda_1(t) = \lambda_1^0(t), \ldots, \lambda_k(t) = \lambda_k^0(t)$ ($t \in [0, 1]$), where the $\lambda_i^0$ are known functions. Let $\hat\Lambda_i^0(t) = \int_0^t J_i \lambda_i^0 \, ds$, the estimator of $\Lambda_i$ under the hypothesis $H_{01}$. The test statistic for $H_{01}$ can be constructed as

$$Z_i^1(1) = \int_0^1 K_i \{d\hat\Lambda_i - d\hat\Lambda_i^0\},$$

where $K_i$ is a step-function process with possible jumps at common observation times of subjects from the $i$th population. It can be shown by similar methods to those used for the statistic $Z$ that $Z_i^1(1)$ has an asymptotic normal distribution with mean 0 and a variance which can be estimated by

$$V_i^1(1) = \int_0^1 \sum_{j=1}^{n_i} K_i^2(t) \{J_{ij}(t) / Y_i^{(ij)2}(t)\} \lambda_i^0(t) \, dt.$$

Hence, the test of the hypothesis $H_{01}$ can be based on the statistic $W_i = Z_i^1(1) / \{V_i^1(1)\}^{1/2}$, which asymptotically has a standard normal distribution. The test of the hypothesis $H_{02}$ can be based on the statistic $W = \sum_{i=1}^{k} W_i^2$, which, under $H_{02}$, is asymptotically distributed as chi-squared with $k$ degrees of freedom.

Acknowledgements

The authors are very grateful to Professor Jack Kalbfleisch for his discussion and comments on the early version of the paper, and to Dr Hongbin Fang for his help on the appendix. They also wish to thank two anonymous referees for their many helpful comments and suggestions. The first author's research was supported by a grant from the National Institutes of Health and the second author's research was supported in part by a Cancer Center Support grant (CA 21765) and the American Lebanese Syrian Associated Charities (ALSAC).

References

Andersen, P. K., Borgan, Ø., Gill, R. D. & Keiding, N. (1982). Linear nonparametric tests for comparison of counting processes, with applications to censored survival data (with discussion). Intern. Statist. Rev. 50, 219-258.
Andersen, P. K., Borgan, Ø., Gill, R. D. & Keiding, N. (1993). Statistical models based on counting processes. Springer-Verlag, New York.
Gaver, D. P. & O'Muircheartaigh, I. G. (1987). Robust empirical Bayes analyses of event rates. Technometrics 29, 1-15.
Kalbfleisch, J. D. & Lawless, J. F. (1985). The analysis of panel data under a Markov assumption. J. Amer. Statist. Assoc. 80, 863-871.
Rai, S. N. & Matthews, D. E. (1997). Discrete scale models for survival/sacrifice experiments. Appl. Statist. 46, 93-109.
Schoenfield, L. J., Lachin, J. M., et al. (1981). National cooperative gallstone study: a controlled trial of the efficiency and safety of chenodeoxycholic acid for dissolution of gallstones. Ann. Intern. Med. 95, 257-282.
Staniswalis, J. G., Thall, P. F. & Salch, J. (1997). Semiparametric regression analysis for recurrent event interval counts. Biometrics 53, 1334-1353.
Sun, J. & Kalbfleisch, J. D. (1993). The analysis of current status data on point processes. J. Amer. Statist. Assoc. 88, 1449-1454.
Sun, J. & Kalbfleisch, J. D. (1995). Estimation of the mean function of point processes based on panel count data. Statist. Sinica 5, 279-289.
Thall, P. F. (1988). Mixed Poisson likelihood regression models for longitudinal interval count data. Biometrics 44, 197-209.
Thall, P. F. & Lachin, J. M. (1988). Analysis of recurrent events: nonparametric methods for random-interval count data. J. Amer. Statist. Assoc. 83, 339-347.
Zhang, Y. (1999). A semiparametric pseudolikelihood estimation for panel count data. Technical Report, Department of Statistics, University of Central Florida.

Received March 2000, in final form January 2001

Jianguo Sun, Department of Statistics, University of Missouri, 222 Math Science Building, Columbia, MO 65211, USA.

Appendix: proof of the asymptotic distribution of Z

Let $a_n$ denote the sequence of positive constants defined in theorem 3.1 of Andersen et al. (1982). To derive the asymptotic normality of the test statistic $Z$ given in section 3, note that

$$Z_i(1) = \int_0^1 K \, dN_i - \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} \frac{Y_i^{(hl)}}{Y^{(hl)}} \, dN_{hl}$$

and

$$Z_i^c(1) = \int_0^1 K \, dN_i - \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} \frac{Y_i}{Y} \, dN_{hl},$$

$i = 1, \ldots, k$. Thus, we have

$$a_n Z_i(1) = a_n Z_i^c(1) + a_n U_i, \qquad (4)$$

where

$$U_i = \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} \left( \frac{Y_i}{Y} - \frac{Y_i^{(hl)}}{Y^{(hl)}} \right) dN_{hl}.$$

Therefore, it is sufficient to show that $a_n U = a_n (U_1, \ldots, U_k)'$ converges to zero in probability and that $a_n Z^c$ is asymptotically normally distributed.

For $a_n U$, note that for any time $s$, according to the construction of $Y_i^{(hl)}$ and $Y^{(hl)}$, there exists a time $s^*$ such that

$$\frac{Y_i^{(hl)}(s)}{Y^{(hl)}(s)} = \frac{Y_i(s^*)}{Y(s^*)}.$$

Thus, we have

$$\inf_{s_1, s_2 \in [0, 1]} \left\{ \frac{Y_i(s_1)}{Y(s_1)} - \frac{Y_i(s_2)}{Y(s_2)} \right\} \le \frac{Y_i}{Y} - \frac{Y_i^{(hl)}}{Y^{(hl)}} \le \sup_{s_1, s_2 \in [0, 1]} \left\{ \frac{Y_i(s_1)}{Y(s_1)} - \frac{Y_i(s_2)}{Y(s_2)} \right\}, \qquad (5)$$

where both bounds converge to zero in probability according to (2). It then follows that $a_n U$ converges to zero in probability since

$$a_n U_i^1 = a_n \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K J_{hl} \, dN_{hl}$$

is bounded in probability assuming that $|K| \le M$ and $N_{hl}(1) \le M$, where $M$ is a positive constant.

The asymptotic normality with mean zero of $a_n Z^c$ follows from theorem 3.1 of Andersen et al. (1982). They also showed that the covariance of $a_n Z_{i_1}^c(1)$ and $a_n Z_{i_2}^c(1)$ can be estimated by

$$r_{i_1 i_2} = a_n^2 \int_0^1 \sum_{h=1}^{k} \sum_{l=1}^{n_h} K^2 J_{hl} \left( \delta_{i_1 h} - \frac{Y_{i_1}}{Y} \right) \left( \delta_{i_2 h} - \frac{Y_{i_2}}{Y} \right) dN_{hl}.$$

By (5), it is easily seen that $r_{i_1 i_2}$ and $a_n^2 a_{i_1 i_2}$ are asymptotically equivalent, where $a_{i_1 i_2}$ is given in (3). That is, the distribution of $Z^c$ can be asymptotically approximated by the multivariate normal distribution with mean zero and a covariance matrix that can be estimated by $V = \{a_{i_1 i_2}\}$. This with (4) completes the proof.