Truncated Poisson Regression for Time Series of Counts


Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA. Vol 28: 645–659, 2001

KONSTANTINOS FOKIANOS
University of Cyprus

ABSTRACT. We consider partial likelihood analysis of a truncated Poisson regression model for time series of counts. We focus our attention on the study of asymptotic theory for the maximum partial likelihood estimator of a vector of regression parameters. Simulations and data analysis integrate the presentation.

Key words: martingale, non-stationary, partial likelihood, random time dependent covariates, score

1. Introduction

Regression models for non-Gaussian time series data have attracted a fair amount of attention recently, since such data have become rather common in many applications. For example, the polio incidence data (discussed in section 4) list the monthly number of poliomyelitis cases in the USA during 1970 to 1983. Those data, which form a time series of counts, are not amenable to the usual ARMA methodology. There is, however, a growing literature on regression models for non-Gaussian time series. We refer in particular to the recent texts by MacDonald & Zucchini (1997, ch. 1), Fahrmeir & Tutz (1994, ch. 6–8), Diggle et al. (1994, ch. 10) and Kedem (1994, ch. 9), featuring work on the so-called transition models for time series. A transition model specifies a generalized linear model (McCullagh & Nelder, 1989) for the conditional distribution of the response given past information. In this work we focus on a specific regression model for the analysis of time series of counts.

Regression models for time series of counts in the context of generalized linear models have been discussed by many authors. Important contributions include Wong (1986), Holden (1987), Zeger & Qaqish (1988), Li (1991, 1994), Albert et al. (1994) and, more recently, Léon & Tsai (1998).
These models are usually called observation driven (Cox, 1981). We contribute to the choice of the distribution by introducing the doubly truncated Poisson model for time series of counts, a notion discussed in section 2. The main feature of the model is that inference can still be drawn within the framework of time series following generalized linear models. Estimation and testing are both carried out by partial likelihood inference regarding a time invariant vector of regression parameters. Our approach accommodates random time dependent covariates and drops any stationarity or Markov assumptions.

There is also important work on parameter driven models for non-Gaussian time series; see, for example, the articles by Zeger (1988) and Albert (1991). The books by Spall (1988), West & Harrison (1997) and Fahrmeir & Tutz (1994) elaborate on the state space modelling of non-linear and non-Gaussian time series.

Econometricians have already noticed the usefulness of the truncated Poisson distribution for regression modelling of independent counts (see Grogger & Carson, 1991; Gurmu & Trivedi, 1992). The paper by Grogger & Carson (1991) examines maximum likelihood regression estimators for the positive Poisson distribution, a concept that we introduce in the next section. The work by Gurmu & Trivedi (1992) features score tests of extra-Poisson variation in left or right truncated Poisson regression models. These models are introduced in section 2 as special cases of the doubly truncated Poisson distribution. We continue by considering inference and asymptotics, in the framework of maximum partial likelihood estimation, and the paper concludes with applications.

K. Fokianos, Scand J Statist 28

2. Truncated Poisson regression for time series

2.1. On truncated Poisson distribution

The doubly truncated Poisson distribution has been studied extensively by many authors (see Johnson et al., 1992, for details). Let $X$ be a random variable distributed according to the doubly truncated Poisson distribution (Cohen, 1954). That is, $X$ is a Poisson random variable for which the first $c_1$ values and the values exceeding a specified value $c_2$ are omitted, with $c_1 < c_2$. A straightforward calculation leads to the probability mass function of $X$,

\[ P(X = x; c_1, c_2, \lambda) = \frac{\lambda^x}{x!\,\psi(c_1, c_2, \lambda)}, \quad x = c_1, c_1 + 1, \dots, c_2, \quad c_1 < c_2, \tag{1} \]

with

\[ \psi(c_1, c_2, \lambda) = \sum_{x = c_1}^{c_2} \frac{\lambda^x}{x!}, \]

defined only when both constants $c_1$ and $c_2$ are non-negative and such that $c_1 < c_2$. If $c_1$ is negative then we put $\psi(c_1, c_2, \lambda) = \psi(0, c_2, \lambda)$. Clearly $\psi(0, \infty, \lambda) = \exp(\lambda)$.

Setting $c_2 = \infty$ in (1) leads to the so-called left truncated Poisson distribution. In particular, if $c_1 = 1$ and $c_2 = \infty$, (1) becomes

\[ P(X = x; \lambda) = \frac{\lambda^x}{x!\,(\exp(\lambda) - 1)}, \quad x = 1, 2, \dots, \]

which is the probability mass function of the positive Poisson distribution. References on probabilistic and statistical properties of this distribution include David & Johnson (1952), Grab & Savage (1954) and Kemp & Kemp (1988), for example. Similarly, substituting $c_1 = 0$ in (1) yields the right truncated Poisson distribution. The book by Johnson et al. (1992) is a useful source for additional information on the properties of these distributions. The next step is to utilize the doubly truncated Poisson distribution for modelling a time series of counts observed jointly with random time dependent covariates.

2.2. Modelling

Suppose that $\{y_t, t = 1, 2, \dots, N\}$ is a univariate time series of counts truncated at levels $c_1$ and $c_2$. That is, values below $c_1$ or above $c_2$ have been omitted.
Furthermore, let $\{x_t, t = 1, \dots, N\}$ be a vector of covariates that influence the evolution of $\{y_t, t = 1, \dots, N\}$ and assume that $\mathcal{F}_{t-1}$ stands for the past of the processes up to and including time $t$. Then the doubly truncated Poisson regression model for time series of counts is completely specified by the following assumptions.

Assumption 1
Let $f(y_t; \lambda_t \mid \mathcal{F}_{t-1})$ denote the conditional distribution of $y_t$ given the history of the processes. We assume that $f$ has the following form:

\[ f(y_t; \lambda_t \mid \mathcal{F}_{t-1}) = \frac{\lambda_t^{y_t}}{y_t!\,\psi(c_1, c_2, \lambda_t)}, \quad y_t = c_1, \dots, c_2. \tag{2} \]
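As a small numerical companion to (1) and (2), the following Python sketch evaluates the normalising sum $\psi$ and the doubly truncated Poisson probability mass function. The function names `psi` and `dtpois_pmf` are illustrative choices that do not appear in the paper, and `math.inf` is assumed here to stand for $c_2 = \infty$.

```python
import math


def psi(c1, c2, lam):
    """psi(c1, c2, lam) = sum_{x=c1}^{c2} lam**x / x!  (hypothetical helper).

    A negative c1 is treated as 0, as in the text; c2 = math.inf is allowed.
    """
    lo = max(c1, 0)
    if c2 == math.inf:
        # psi(0, inf, lam) = exp(lam); subtract the omitted terms 0..lo-1.
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))


def dtpois_pmf(x, c1, c2, lam):
    """Probability mass function (1) of the doubly truncated Poisson law."""
    if x < max(c1, 0) or x > c2:
        return 0.0  # outside the truncated support
    return lam**x / (math.factorial(x) * psi(c1, c2, lam))
```

With `c1 = 1` and `c2 = math.inf` the second function reduces to the positive Poisson case, and with `c1 = 0` to the right truncated case.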

An interesting observation is that (2) introduces a member of the exponential family of distributions. Indeed, by letting $\log \lambda_t = \theta_t$ we obtain

\[ f(y_t; \lambda_t \mid \mathcal{F}_{t-1}) = \exp\bigl( y_t \theta_t - \log \psi(c_1, c_2, \exp(\theta_t)) - \log(y_t!) \bigr). \]

Standard properties of the exponential family of distributions (see, for example, Cox & Hinkley, 1974) yield the following relations regarding the conditional mean and variance of the process:

\[ \mu_t = E[Y_t \mid \mathcal{F}_{t-1}] = \lambda_t \frac{\psi(c_1 - 1, c_2 - 1, \lambda_t)}{\psi(c_1, c_2, \lambda_t)} \tag{3} \]

and

\[ v_t = \mathrm{var}[Y_t \mid \mathcal{F}_{t-1}] = \frac{1}{\psi^2(c_1, c_2, \lambda_t)} \Bigl\{ \lambda_t^2\, \psi(c_1 - 2, c_2 - 2, \lambda_t)\, \psi(c_1, c_2, \lambda_t) + \lambda_t\, \psi(c_1 - 1, c_2 - 1, \lambda_t) \bigl[ \psi(c_1, c_2, \lambda_t) - \lambda_t\, \psi(c_1 - 1, c_2 - 1, \lambda_t) \bigr] \Bigr\}. \tag{4} \]

We point out that if $c_1$ takes the values 0 or 1, then (3) and (4) should be used according to our earlier convention for $\psi$ with a negative first argument.

Assumption 2
The processes $y_t$ and $x_t$ are related by means of the linear predictor

\[ \gamma_t = \beta' z_{t-1}. \]

The coefficient $\beta$ denotes a $p$-dimensional vector of unknown regression parameters. The vector of random time dependent covariates $z_{t-1}$ (which might include past values of the response, past values of $x_t$, or any interactions between them) is $\mathcal{F}_{t-1}$ measurable. This specification covers the case of a non-homogeneous Markov chain modelled through $z_{t-1} = (1, y_{t-1}, \dots, y_{t-l}, x_t)$, and the example of a purely autoregressive process, $z_{t-1} = (1, y_{t-1}, \dots, y_{t-l})$. However, we do not postulate any Markovian assumption on the process. In addition, the process may be non-stationary.

Assumption 3
The linear predictor is linked to the mean $\mu_t = \mu(\theta_t)$ by the expression

\[ \theta_t = \log \lambda_t = \gamma_t. \]

We observe that the above assumption is the so-called canonical link. The canonical link facilitates further calculations and guarantees existence and uniqueness of the maximum likelihood estimator (Wedderburn, 1976).
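The closed forms (3) and (4) can be checked numerically against moments computed by direct enumeration over the truncated support. The sketch below does this in Python; the helper names (`psi`, `cond_mean`, `cond_var`, `brute_force_moments`) are assumptions made for illustration, and the brute-force check is restricted to finite $c_2$.

```python
import math


def psi(c1, c2, lam):
    """psi(c1, c2, lam); negative c1 is treated as 0, c2 may be math.inf."""
    lo = max(c1, 0)
    if c2 == math.inf:
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))


def cond_mean(c1, c2, lam):
    """Conditional mean, equation (3)."""
    return lam * psi(c1 - 1, c2 - 1, lam) / psi(c1, c2, lam)


def cond_var(c1, c2, lam):
    """Conditional variance, equation (4)."""
    p0 = psi(c1, c2, lam)
    p1 = psi(c1 - 1, c2 - 1, lam)
    p2 = psi(c1 - 2, c2 - 2, lam)
    return (lam**2 * p2 * p0 + lam * p1 * (p0 - lam * p1)) / p0**2


def brute_force_moments(c1, c2, lam):
    """Mean and variance by summing over the support c1..c2 (finite c2 only)."""
    support = range(c1, c2 + 1)
    weights = [lam**x / math.factorial(x) for x in support]
    total = sum(weights)
    mean = sum(x * w for x, w in zip(support, weights)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(support, weights)) / total
    return mean, var
```

For example, `cond_mean(1, 10, 3.0)` and `brute_force_moments(1, 10, 3.0)[0]` should agree up to floating point error, which also exercises the convention $\psi(c_1, c_2, \lambda) = \psi(0, c_2, \lambda)$ for negative $c_1$.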
We do not discuss the case of a non-canonical link, that is, the case where there exists a twice differentiable function, say $g$, such that $g(\mu_t) = \gamma_t$. This topic can be treated along the lines of Fokianos & Kedem (1998).

Figures 1 and 3 display typical realizations of a truncated time series of counts of length 250. For these examples, we choose the simple model

\[ \log \lambda_t(\beta) = \beta_0 + \beta_1 \cos\left(\frac{2\pi t}{12}\right) + \beta_2 \log(y_{t-1}). \]
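A minimal simulation sketch for this model is given below, assuming sampling by inversion over the truncated support. The function names, the starting value `y0 = 1`, the seed handling and the use of `math.inf` for $c_2 = \infty$ are illustrative assumptions and are not taken from the paper. The sketch requires $c_1 \ge 1$ so that $\log(y_{t-1})$ is defined.

```python
import math
import random


def sample_dtpois(lam, c1, c2, rng):
    """Draw one doubly truncated Poisson value by inversion (hypothetical helper)."""
    lo = max(c1, 0)
    if c2 == math.inf:
        norm = math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    else:
        norm = sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))
    u = rng.random() * norm          # uniform on [0, psi(c1, c2, lam))
    x = lo
    term = lam**x / math.factorial(x)
    acc = term
    # Accumulate unnormalised probabilities until they exceed u.
    while acc < u and x < c2 and term > 0.0:
        x += 1
        term *= lam / x
        acc += term
    return x


def simulate(beta, n, c1=1, c2=math.inf, y0=1, seed=0):
    """Simulate y_1..y_n from log lam_t = b0 + b1 cos(2 pi t / 12) + b2 log(y_{t-1})."""
    rng = random.Random(seed)
    b0, b1, b2 = beta
    y_prev, ys = y0, []              # y0 is an assumed starting value
    for t in range(1, n + 1):
        lam = math.exp(b0 + b1 * math.cos(2 * math.pi * t / 12) + b2 * math.log(y_prev))
        y_prev = sample_dtpois(lam, c1, c2, rng)
        ys.append(y_prev)
    return ys
```

A call such as `simulate((0.30, 0.75, -1.0), 250, c1=1, c2=10)` produces a series of length 250 with all values between 1 and 10.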

In this model, $\beta = (\beta_0, \beta_1, \beta_2)'$ and $z_{t-1} = (1, \cos(2\pi t/12), \log(y_{t-1}))'$. Figure 1 illustrates typical realizations when the truncation points are $c_1 = 1$ and $c_2 = \infty$, i.e. the case of the positive Poisson distribution. Figure 3 shows time series plots of the same model, but now the truncation points have been fixed to $c_1 = 1$ and $c_2 = 10$. For both of these figures the top part illustrates realizations of the process when $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$, while the bottom part illustrates typical realizations when $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

It is rather interesting to observe that the behaviour of the process depends on the sign of the coefficient of $\log(y_{t-1})$. This is true for both of the models at hand. Indeed, negative values of the parameter $\beta_2$ lead to an alternating series, a fact which is manifested by plotting the autocorrelation function of both time series shown in Figs 1(a) and 3(a). The respective autocorrelation plots are displayed in Figs 2(a) and 4(a). We can deduce that successive observations tend to be located on different sides of the mean, thus leading to larger oscillation. The sinusoidal pattern of the autocorrelation function is clearly evident. In contrast, Figs 1(b) and 3(b) exhibit time series with less oscillation. This fact is also confirmed by the plots of their autocorrelation functions, Figs 2(b) and 4(b) respectively, which exhibit a short term correlation, indicating that values above (or below) the mean tend to be followed by observations that also fall above (or below) the mean. In addition, we notice the sinusoidal pattern of the autocorrelation function.

Fig. 1. Typical realizations of a time series $\{y_t, t = 1, \dots, 250\}$ of truncated counts. The data have been generated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = \infty$. (a) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (b) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

Fig. 2. (a) Autocorrelation function of the simulated time series which corresponds to the top panel of Fig. 1. (b) Autocorrelation function of the simulated time series which corresponds to the bottom panel of Fig. 1.

Fig. 3. Typical realizations of a time series $\{y_t, t = 1, \dots, 250\}$ of truncated counts. The data have been generated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = 10$. (a) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (b) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

Fig. 4. (a) Autocorrelation function of the simulated time series which corresponds to the top panel of Fig. 3. (b) Autocorrelation function of the simulated time series which corresponds to the bottom panel of Fig. 3.

3. Partial likelihood inference

Interest is focused on the estimation of the vector parameter $\beta$. This is a challenging problem since the data are dependent. Dependence is modelled through the covariate vector $z_{t-1}$, which might include past values of the process $\{y_t, t = 1, \dots, N\}$. The partial likelihood methodology, suggested by Cox (1975), approaches the problem successfully via martingale theory. In the context of time series following generalized linear models, the idea has been further developed in Wong (1986), Slud & Kedem (1994) and Fokianos & Kedem (1998), among others. In the context of survival analysis and counting processes see Andersen & Gill (1982) or Arjas & Haara (1987), for example. Partial likelihood allows for sequential conditional processing of the information. Lengthy discussions of these ideas have been given in Slud & Kedem (1994) and Fokianos & Kedem (1998).

We follow Fokianos & Kedem (1998) in the development of the theory. Hence, suppose that $\{y_t, t = 1, \dots, N\}$ is a doubly truncated time series of counts and put, as in subsection 2.2, $\log \lambda_t(\beta) = \beta' z_{t-1}$. Equivalently, we have that $\lambda_t(\beta) = \exp(\beta' z_{t-1})$.
Then, the partial log-likelihood (pl) function relative to $\beta$, $\mathcal{F}_t$, and the data $\{y_t, t = 1, \dots, N\}$, is given by

\[ pl(\beta) = \sum_{t=1}^{N} \log f(y_t; \lambda_t(\beta) \mid \mathcal{F}_{t-1}) = \sum_{t=1}^{N} \bigl\{ y_t \log \lambda_t(\beta) - \log \psi(c_1, c_2, \lambda_t(\beta)) - \log(y_t!) \bigr\}. \tag{5} \]

Differentiation of (5) with respect to $\beta$ yields the partial score

\[ S_N(\beta) = \sum_{t=1}^{N} z_{t-1} \bigl( y_t - \mu_t(\beta) \bigr). \tag{6} \]

The solution of the equation $S_N(\beta) = 0$ is called the maximum partial likelihood estimator (MPLE) and it is denoted by $\hat\beta$. The most widely used method for the solution of $S_N(\beta) = 0$ is Fisher scoring (details are given in the second chapter of McCullagh & Nelder (1989), for example). Notice that $\{S_t(\beta), t = 1, \dots, N\}$ coupled with the increasing sequence of $\sigma$-fields $\{\mathcal{F}_t\}$ forms a zero mean square integrable martingale sequence. This is a crucial fact for an application of a central limit theorem for studying the asymptotic properties of the maximum partial likelihood estimator $\hat\beta$.

The conditional information matrix is

\[ G_N(\beta) = \sum_{t=1}^{N} \mathrm{var}\bigl[ z_{t-1} ( y_t - \mu_t(\beta) ) \mid \mathcal{F}_{t-1} \bigr] = \sum_{t=1}^{N} z_{t-1} z_{t-1}'\, v_t(\beta), \tag{7} \]

with $v_t(\beta)$ defined by (4). The unconditional information matrix is given by

\[ F_N(\beta) = E[G_N(\beta)] \tag{8} \]

and the second derivative of the partial log-likelihood, multiplied by $-1$, is

\[ H_N(\beta) = G_N(\beta). \tag{9} \]

It is crucial to observe that the last equation implies that the partial log-likelihood is a concave function of $\beta$. Thus, if the maximum partial likelihood estimator $\hat\beta$ exists, then it is unique. We discuss its asymptotic properties in the next subsection.

3.1. Asymptotic theory

The asymptotic properties of the maximum partial likelihood estimator $\hat\beta$ are examined with the help of the score function and the conditional information matrix (Arjas & Haara, 1987; Andersen & Gill, 1982; Wong, 1986). Our approach follows Slud & Kedem (1994), Kedem (1994, ch. 9) and Fokianos & Kedem (1998). The following assumptions help to establish asymptotic properties.

Assumption 4
(1) The regression coefficients $\beta$ belong to an open subset of $R^p$.
(2) The regressors $z_{t-1}$ are almost surely bounded.
(3) There is a probability measure $\nu$ such that $\int_{R^p} z z'\, \nu(dz)$ is positive definite and such that, for Borel sets $A$, we have

\[ \frac{1}{N} \sum_{t=1}^{N} I_{[z_{t-1} \in A]} \to^{p} \nu(A), \quad \text{as } N \to \infty. \]

A detailed discussion of these assumptions can be found in Fokianos & Kedem (1998). We briefly mention that assumption 4 (3) implies that the empirical measure of the set $\{z_t : t = 1, \dots, N\}$ converges weakly almost surely to a non-random measure $\nu$. This suggests that for every continuous function $g$ which is bounded on the compact support of $z_t$ we have

\[ \frac{1}{N} \sum_{t=1}^{N} g(z_{t-1}) \to^{p} \int_{R^p} g(z)\, \nu(dz), \quad \text{as } N \to \infty. \]

Assumption 4 (3) cannot be verified directly. Its main implication is that the conditional information matrix $G_N(\beta)$ has a non-random limit,

\[ \frac{G_N(\beta)}{N} \to^{p} \int_{R^p} z z'\, v(\beta)\, \nu(dz) = G(\beta), \tag{10} \]

which is positive definite by assumption, and therefore its inverse exists, a useful fact for the asymptotic theory. Notice that

\[ v(\beta) = \frac{1}{\psi^2(c_1, c_2, \lambda(\beta))} \Bigl\{ \lambda^2(\beta)\, \psi(c_1 - 2, c_2 - 2, \lambda(\beta))\, \psi(c_1, c_2, \lambda(\beta)) + \lambda(\beta)\, \psi(c_1 - 1, c_2 - 1, \lambda(\beta)) \bigl[ \psi(c_1, c_2, \lambda(\beta)) - \lambda(\beta)\, \psi(c_1 - 1, c_2 - 1, \lambda(\beta)) \bigr] \Bigr\}. \]

Recall that the partial score process is a zero mean square integrable martingale.
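The estimating equation $S_N(\beta) = 0$ from (6), together with the conditional information (7), can be solved by Fisher scoring, $\beta \leftarrow \beta + G_N^{-1}(\beta) S_N(\beta)$. The following Python sketch is one possible implementation under stated assumptions: $c_2$ is finite, the starting value is $\beta = 0$, and a step-halving safeguard (not discussed in the paper) is added so that the partial log-likelihood (5) never decreases. All function names are hypothetical.

```python
import math


def psi(c1, c2, lam):
    """psi(c1, c2, lam) for finite c2; a negative c1 is treated as 0."""
    return sum(lam**x / math.factorial(x) for x in range(max(c1, 0), c2 + 1))


def mean_var(lam, c1, c2):
    """Conditional mean (3) and variance (4); the variance is written as
    E[Y(Y-1)] + mu - mu^2, which is algebraically equal to (4)."""
    p0, p1, p2 = psi(c1, c2, lam), psi(c1 - 1, c2 - 1, lam), psi(c1 - 2, c2 - 2, lam)
    mu = lam * p1 / p0
    return mu, lam**2 * p2 / p0 + mu - mu**2


def partial_loglik(beta, y, Z, c1, c2):
    """Partial log-likelihood (5)."""
    total = 0.0
    for yt, z in zip(y, Z):
        eta = sum(b * zi for b, zi in zip(beta, z))
        total += yt * eta - math.log(psi(c1, c2, math.exp(eta))) - math.lgamma(yt + 1)
    return total


def score_and_info(beta, y, Z, c1, c2):
    """Partial score (6) and conditional information matrix (7)."""
    p = len(beta)
    S = [0.0] * p
    G = [[0.0] * p for _ in range(p)]
    for yt, z in zip(y, Z):
        lam = math.exp(sum(b * zi for b, zi in zip(beta, z)))
        mu, v = mean_var(lam, c1, c2)
        for i in range(p):
            S[i] += z[i] * (yt - mu)
            for j in range(p):
                G[i][j] += z[i] * z[j] * v
    return S, G


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def mple(y, Z, c1, c2, tol=1e-10, max_iter=200):
    """Fisher scoring for the MPLE; Z[t] plays the role of z_{t-1}."""
    beta = [0.0] * len(Z[0])
    for _ in range(max_iter):
        S, G = score_and_info(beta, y, Z, c1, c2)
        step = solve(G, S)
        old, scale = partial_loglik(beta, y, Z, c1, c2), 1.0
        # Step halving: an added safeguard, not part of the paper.
        while scale > 1e-8 and partial_loglik(
                [b + scale * d for b, d in zip(beta, step)], y, Z, c1, c2) < old:
            scale /= 2.0
        beta = [b + scale * d for b, d in zip(beta, step)]
        if max(abs(scale * d) for d in step) < tol:
            break
    return beta
```

Because the link is canonical, $H_N(\beta) = G_N(\beta)$ by (9), so Fisher scoring coincides with Newton's method here. The rows of `Z` must be built from information available before $y_t$ is observed, for instance `[1.0, math.cos(2 * math.pi * t / 12)]`.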
Convergence in distribution of the partial score process is established by invoking the Cramér–Wold device coupled with (10). Indeed,

\[ \frac{u' G_N u}{u' F_N u} = \frac{u' G_N u / N}{u' F_N u / N} \to^{p} \frac{u' G u}{u' G u} = 1 \tag{11} \]

for every $u \in R^p$. In addition, if $I_{Nt}(\epsilon)$ denotes the indicator of the set $\{ |u' a_t| > (u' F_N u)^{1/2} \epsilon \}$, with $a_t = S_t - S_{t-1}$, then a simple calculation shows that

\[ \frac{1}{u' F_N u} \sum_{t=1}^{N} E\bigl[ |u' a_t|^2 I_{Nt}(\epsilon) \mid \mathcal{F}_{t-1} \bigr] \le \frac{1}{(u' F_N u)^{3/2} \epsilon} \sum_{t=1}^{N} E\bigl[ |u' a_t|^3 \mid \mathcal{F}_{t-1} \bigr] \le \frac{N M_1}{(u' F_N u)^{3/2} \epsilon}, \]

where $M_1$ is a bound. That bound exists from assumption 4 (2). Since $u' F_N u$ grows at rate $N$, the right-hand side tends to zero. We summarize our discussion in the form of a lemma (Hall & Heyde, 1980, coroll. 3.1).

Lemma 1
The partial score process $\{S_t, \mathcal{F}_t\}$, defined by (6), is a zero mean square integrable martingale such that

\[ F_N^{-1/2} S_N(\beta) \to^{D} \mathcal{N}, \quad \text{as } N \to \infty, \]

where $\mathcal{N}$ stands for a standard normal random vector.

By expanding the partial score in a Taylor series around the true value, say $\beta_0$, and using the fact that $S_N(\hat\beta) = 0$, we obtain the following approximation:

\[ \sqrt{N} (\hat\beta - \beta_0) \approx N\, G_N^{-1}(\beta_0)\, \frac{1}{\sqrt{N}} S_N(\beta_0) \approx G^{-1}(\beta_0)\, \frac{1}{\sqrt{N}} S_N(\beta_0). \]

Thus, lemma 1 and (11) lead to the following theorem.

Theorem 1
Under assumption 4, the maximum partial likelihood estimator $\hat\beta$ is almost surely unique. Additionally, the estimator is consistent and asymptotically normally distributed,

\[ \sqrt{N} (\hat\beta - \beta_0) \to^{D} \mathcal{N}(0, G^{-1}(\beta_0)). \]

This establishes asymptotic properties of the maximum partial likelihood estimator.

3.2. Testing hypotheses

We complete this section by discussing how to test hypotheses for the truncated Poisson regression model. We focus on testing the general linear hypotheses

\[ H_0 : C\beta = \gamma \quad \text{against} \quad H_1 : C\beta \ne \gamma, \tag{12} \]

where $C$ is an appropriate known matrix with full rank, say $r < p$. Denote by $\tilde\beta$ the restricted maximum partial likelihood estimator under the hypothesis (12). Then the most commonly used statistics for testing these hypotheses are: the partial likelihood ratio statistic

\[ \lambda = -2 \{ pl(\tilde\beta) - pl(\hat\beta) \}, \tag{13} \]

the Wald statistic

\[ w = \{ C\hat\beta - \gamma \}' \{ C\, G^{-1}(\hat\beta)\, C' \}^{-1} \{ C\hat\beta - \gamma \}, \tag{14} \]

and the partial score statistic

\[ c = S_N'(\tilde\beta)\, G^{-1}(\tilde\beta)\, S_N(\tilde\beta). \tag{15} \]

The following theorem states the asymptotic distribution of the aforementioned statistics. Its proof can be derived along the lines of Fahrmeir (1987) and is omitted.

Theorem 2
Under assumption 4 the test statistics $\lambda$, $w$ and $c$ are asymptotically equivalent. Furthermore, their asymptotic distribution is chi-square with $r$ degrees of freedom, under hypotheses (12).

In summary, we presented the doubly truncated Poisson regression model for time series of counts. Estimation was carried out by the method of partial likelihood, which allows for sequential processing of the information. The fact that the gradient of the partial log-likelihood is a zero mean square integrable martingale was useful in establishing asymptotic properties of the estimator. In addition, classical statistical tests can be used for testing linear hypotheses about the vector of regression parameters.

4. Simulations and data analysis

We conclude this work with the presentation of some limited simulation results. These are put together with an application of the theory to a real data set.

4.1. Simulations

We implement a limited simulation study, with 1000 runs, to illustrate empirically the adequacy of the theoretical results. We generate truncated time series of counts from the model

\[ \log \lambda_t(\beta) = \beta_0 + \beta_1 \cos\left(\frac{2\pi t}{12}\right) + \beta_2 \log(y_{t-1}) \]

for various choices of $\beta_0$, $\beta_1$ and $\beta_2$. Notice that the truncation points are $c_1 = 1$ and $c_2 = \infty$ for these data, which is the case of the positive Poisson distribution. The length of each simulated time series is 250. The first three columns of Table 1 list the true values of the parameters. The next three columns list the means of the estimated parameters, while the last three columns report their standard deviations. It seems that the estimated parameters are in close agreement with the true parameters. In addition, Figs 5 and 6 illustrate Q–Q plots of the maximum partial likelihood estimators using data that correspond to the first and fifth rows of Table 1. There are no gross departures from the asserted normality.

Table 1. True parameters, estimated parameters and Monte Carlo standard errors from 1000 simulations of the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = \infty$ and the length of the observed time series is 250.

True parameters: $\beta_0$, $\beta_1$, $\beta_2$ | Estimated parameters: $\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$ | Estimated standard errors: S.E.($\hat\beta_0$), S.E.($\hat\beta_1$), S.E.($\hat\beta_2$)

Fig. 5. Q–Q plots of 1000 maximum partial likelihood estimators. The data have been simulated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. Here, the length of each time series is 250, the truncation points are $c_1 = 1$ and $c_2 = \infty$, and $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (a) Q–Q plot of $\hat\beta_0$. (b) Q–Q plot of $\hat\beta_1$. (c) Q–Q plot of $\hat\beta_2$.

Fig. 6. Q–Q plots of 1000 maximum partial likelihood estimators. The data have been simulated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. Here, the length of each time series is 250, the truncation points are $c_1 = 1$ and $c_2 = \infty$, and $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$. (a) Q–Q plot of $\hat\beta_0$. (b) Q–Q plot of $\hat\beta_1$. (c) Q–Q plot of $\hat\beta_2$.

4.2. Polio incidence in the USA

We apply the new methodology to an already published data set, namely the polio incidence rates in the USA. These data have been published by the US Centers for Disease Control and list the monthly number of poliomyelitis cases during the years 1970 to 1983, that is $t = 1, \dots, 168$. Previous analyses have been given by Zeger (1988), Fahrmeir & Tutz (1994, p. 197), Li (1994) and Eilers & Marx (1999), for example.

A close look at the data (see Fig. 7) reveals that there is a long term decrease of the incidence rate. Furthermore, we notice that the values of this time series fall between 0 and 14. Thus, we are led to apply a truncated Poisson regression model with $c_1 = 0$ and $c_2 = 14$. Similar to Fahrmeir & Tutz (1994), we consider models with trend, sine and cosine pairs of the annual frequencies and past values of the response. That is,

\[ \log \lambda_t(\beta) = \beta_0 + \beta_1 t + \beta_2 \cos(2\pi t/12) + \beta_3 \sin(2\pi t/12) + \beta_4 y_{t-1} + \beta_5 y_{t-2}. \]

We point out that the analysis given by Fahrmeir & Tutz (1994) takes into account sine and cosine pairs of semiannual frequencies as well as lagged values of the response of order 5 by using a log-linear model.

Fig. 7. Monthly number of poliomyelitis cases in the USA from 1970 to 1983.

Table 2 summarizes our findings for the different models we applied to the polio incidence rates. The first column lists all the models considered, while the second column gives the corresponding values of the negative partial log-likelihood.

Table 2. Monthly number of poliomyelitis cases in the USA from 1970 to 1983: fitted models using the truncated Poisson regression model with $c_1 = 0$ and $c_2 = 14$.

Time dependent covariates | Negative partial log-likelihood
(1, t) |
(1, t, sinusoidal terms) |
(1, t, sinusoidal terms, $y_{t-1}$) |
(1, t, sinusoidal terms, $y_{t-1}$, $y_{t-2}$) |
(1, sinusoidal terms, $y_{t-1}$) |
(1, t, $y_{t-1}$) |
(1, $y_{t-1}$) |

The results in Table 2 show that a reasonable model for these data includes trend, sinusoidal terms and an autoregressive part of order 1. The addition of the autoregressive term of order 2 does not improve the fit: a formal application of the partial log-likelihood ratio test shows no significance. The model fitted to these data has the form

\[ \log \lambda_t = \hat\beta_0 + \hat\beta_1 t + \hat\beta_2 \cos(2\pi t/12) + \hat\beta_3 \sin(2\pi t/12) + \hat\beta_4 y_{t-1}, \]

with estimated coefficients of magnitude 0.632, 5.306, .149, 0.471 and 0.437, respectively. The negative sign of the trend term indicates that there is a long term decrease of the number of poliomyelitis cases during the observation period. Figure 8(a) illustrates the time series plot of the predicted and observed data. In contrast to Fahrmeir & Tutz (1994, fig. 6.2), we notice that the truncated Poisson model can predict large values of the response, a feature not shared with the log-linear model.

The quality of the fit depends upon the choice of truncation points. Initially we set $c_1 = 0$ and $c_2 = 14$, motivated by the range of those data. However, keeping the lower truncation point $c_1$ fixed at 0 and considering the model which includes constant term, trend, yearly sinusoidal components and lagged values of order 1, we notice that the negative partial log-likelihood is an increasing function of $c_2$ (see Table 3). Indeed, the value $c_2 = 14 = \max_t y_t$ maximizes the partial log-likelihood and thus minimizes the negative partial log-likelihood. Thus, any likelihood based criterion leads to $c_2 = 14$ for the choice of the upper truncation point. Furthermore, calculation of Pearson's goodness of fit statistic

\[ \chi^2 = \sum_t \frac{(y_t - \hat\mu_t)^2}{\hat v_t}, \]

using expressions (3) and (4), leads to identical results which we do not report. In particular, setting $c_2 = 30$ gives a fitted model of the same form, with estimated coefficients of magnitude 0.395, 3.836, .112, 0.328 and 0.110, respectively. A direct comparison with the previous fit shows that the sign of all the parameters remains the same, leading to similar conclusions. Figure 8(b) illustrates that this model, like the log-linear model of Fahrmeir & Tutz (1994, fig. 6.2), does not predict the large values of the response, since as $c_2 \to \infty$ the likelihood of the truncated Poisson model approaches the standard likelihood of a Poisson regression model. In fact, the estimators from a log-linear model that includes constant term, trend, yearly sinusoidal components and lagged values of order 1 are in close agreement with those obtained from the truncated Poisson model with $c_2 = 30$. Similar results are obtained for all the models reported in Table 2.

This discussion shows that the choice of truncation points is crucial, since any likelihood based criterion entails the minimum and maximum value of the data at hand as the optimal solution. However, future observations might lie outside this range, complicating the problem of prediction. We hope our work will stimulate further research in this area.

Table 3. Values of the negative partial log-likelihood function obtained by varying the upper truncation point $c_2$. The fitted model for the monthly numbers of poliomyelitis cases includes constant term, trend, yearly sinusoidal components and lagged values of order 1.

$c_2$ | Negative partial log-likelihood
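The limiting behaviour as $c_2 \to \infty$ can be illustrated numerically. With $c_1 = 0$, each term of the partial log-likelihood (5) differs from the corresponding Poisson log-likelihood term only through $\log \psi(0, c_2, \lambda_t)$ in place of $\lambda_t$. The Python sketch below evaluates that difference for a few values of $c_2$. The value $\lambda = 5$ and the function names are assumptions chosen only for illustration.

```python
import math


def log_psi(c2, lam):
    """log psi(0, c2, lam) = log sum_{x=0}^{c2} lam**x / x!."""
    return math.log(sum(lam**x / math.factorial(x) for x in range(c2 + 1)))


def poisson_gap(c2, lam):
    """Poisson normaliser log(exp(lam)) = lam minus the truncated one."""
    return lam - log_psi(c2, lam)


# Gap between the two log-normalisers for an illustrative lam = 5.0.
gaps = {c2: poisson_gap(c2, 5.0) for c2 in (14, 30, 60)}
```

The gap is visibly positive at $c_2 = 14$ and numerically negligible by $c_2 = 30$ for this choice of $\lambda$, which is in line with the close agreement between the $c_2 = 30$ fit and the log-linear fit reported above.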

Fig. 8. Predicted ($\hat\mu_t$) and observed monthly number of poliomyelitis cases in the USA from 1970 to 1983, shown as solid line and circles respectively. Both fitted models have the form $\log \lambda_t = \hat\beta_0 + \hat\beta_1 t + \hat\beta_2 \cos(2\pi t/12) + \hat\beta_3 \sin(2\pi t/12) + \hat\beta_4 y_{t-1}$. (a) Truncation points $c_1 = 0$ and $c_2 = 14$, with estimated coefficients of magnitude 0.632, 5.306, .149, 0.471 and 0.437. (b) Truncation points $c_1 = 0$ and $c_2 = 30$, with estimated coefficients of magnitude 0.395, 3.836, .112, 0.328 and 0.110.

5. Concluding remarks

We studied the truncated Poisson model for time series of counts within the framework of time series following generalized linear models. The partial likelihood theory is useful for both estimation and testing. The discussion at the end of the last section reveals that the choice of truncation points is of great importance in applications, and further study is still needed on the choice of those parameters.

There are several possible extensions of the model at hand. For instance, consider the dynamic doubly truncated Poisson model with first order random walk,

\[ \log \lambda_t = \beta_t' z_{t-1}, \qquad \beta_t = \beta_{t-1} + \epsilon_t, \]

where $\epsilon_t$ are independent multivariate normal random variables with zero mean and covariance matrix, say, $Q_t$. Another extension is to allow state space priors for the covariates in a semiparametric Bayesian framework. Recent advances in the area of Kalman filtering (for example, Fahrmeir, 1992; Frühwirth-Schnatter, 1994; Durbin & Koopman, 1997; Shephard & Pitt, 1997; Gössl et al., 2000, among others) can be applied for exploring the properties of the dynamic doubly truncated Poisson model.

Acknowledgements

We would like to thank the Editor, the Associate Editor and the reviewers for their useful and constructive remarks.

References

Albert, P. S. (1991). A two-state Markov mixture model for a time series of epileptic seizure counts. Biometrics 47, 1371–1381.
Albert, P. S., McFarland, H., Smith, M. & Frank, J. (1994). Time series for modeling counts from a relapsing-remitting disease: application to modelling disease activity in multiple sclerosis. Statist. Med. 13, 453–466.
Andersen, P. K. & Gill, R. D. (1982). Cox's regression models for counting process: a large sample approach. Ann. Statist. 10, 1100–1120.
Arjas, E. & Haara, P. (1987). A logistic regression model for hazard: asymptotic results. Scand. J. Statist. 14, 1–18.
Cohen, A. C. (1954). Estimation of the Poisson parameter from truncated samples and from censored samples. J. Amer. Statist. Assoc. 49, 158–168.
Cox, D. R. (1975). Partial likelihood. Biometrika 62, 69–76.
Cox, D. R. (1981). Statistical analysis of time series: some recent developments. Scand. J. Statist. 8, 93–115.
Cox, D. R. & Hinkley, D. V. (1974). Theoretical statistics. Chapman & Hall, London.
David, F. N. & Johnson, N. L. (1952). The truncated Poisson distribution. Biometrics 8, 275–285.
Diggle, P. J., Liang, K.-Y. & Zeger, S. L. (1994). Analysis of longitudinal data. Oxford University Press, New York.
Durbin, J. & Koopman, S. J. (1997). Monte Carlo maximum likelihood estimation for non-Gaussian state space models. Biometrika 84, 669–684.
Eilers, P. H. C. & Marx, B. D. (1999). Generalized linear additive smooth structures. Technical report, Louisiana State University, Department of Experimental Statistics.
Fahrmeir, L. (1987). Asymptotic testing theory for generalized linear models. Statistics 18, 65–76.
Fahrmeir, L. (1992). Posterior mode estimation by extended Kalman filtering for multivariate dynamic generalized linear models. J. Amer. Statist. Assoc. 87, 501–509.
Fahrmeir, L. & Tutz, G. (1994). Multivariate statistical modeling based on generalized linear models. Springer-Verlag, New York.
Fokianos, K. & Kedem, B. (1998). Prediction and classification of non-stationary categorical time series. J. Multivariate Anal. 67, 277–296.
Frühwirth-Schnatter, S. (1994). Applied state space modeling of non-Gaussian time series using integration based Kalman filtering. Statist. Comput. 4, 259–269.
Gössl, C., Auer, D. P. & Fahrmeir, L. (2000). Dynamic models in fMRI. Magnet Resonance Med. 43, 72–81.
Grab, E. L. & Savage, I. R. (1954). Tables of the expected value of 1/x for positive Bernoulli and Poisson variables. J. Amer. Statist. Assoc. 49, 169–177.
Grogger, J. T. & Carson, R. T. (1991). Models for truncated counts. J. Appl. Econometrics 6, 225–238.
Gurmu, S. & Trivedi, P. (1992). Overdispersion tests for truncated Poisson regression models. J. Econometrics 54, 347–370.
Hall, P. & Heyde, C. C. (1980). Martingale limit theory and its applications. Academic Press, New York.
Holden, R. T. (1987). Time series analysis of contagious process. J. Amer. Statist. Assoc. 82, 1019–1026.
Johnson, N. L., Kotz, S. & Kemp, A. W. (1992). Univariate discrete distributions, 2nd edn. Wiley, New York.
Kedem, B. (1994). Time series analysis by higher order crossings. IEEE Press, New York.
Kemp, C. D. & Kemp, A. W. (1988). Rapid estimation for discrete distributions. The Statistician 37, 243–255.
Léon, L. F. & Tsai, C. (1998). Assessment of model adequacy for Markov regression time series models. Biometrics 54, 1165–1175.
Li, W. K. (1991). Testing model adequacy for some Markov regression models for time series. Biometrika 78, 83–89.
Li, W. K. (1994). Time series models based on generalized linear models: some further results. Biometrics 50, 506–511.
MacDonald, I. L. & Zucchini, W. (1997). Hidden Markov and other models for discrete-valued time series. Chapman & Hall, London.
McCullagh, P. & Nelder, J. A. (1989). Generalized linear models, 2nd edn. Chapman & Hall, London.
Shephard, N. & Pitt, M. K. (1997). Likelihood analysis of non-Gaussian measurement time series. Biometrika 84, 653–657.
Slud, E. & Kedem, B. (1994). Partial likelihood analysis of logistic regression and autoregression. Statist. Sinica 4, 89–106.
Spall, J. C. (1988). Bayesian analysis of time series and dynamic models. Marcel Dekker, New York.
West, M. & Harrison, P. (1997). Bayesian forecasting and dynamic models, 2nd edn. Springer-Verlag, New York.
Wong, W. H. (1986). Theory of partial likelihood. Ann. Statist. 14, 88–123.
Zeger, S. L. (1988). A regression model for time series of counts. Biometrika 75, 621–629.
Zeger, S. L. & Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach. Biometrics 44, 1019–1031.

Received August 1999, in final form December 2000

K. Fokianos, Department of Mathematics and Statistics, University of Cyprus, P.O. Box 20537, CY 1678 Nicosia, Cyprus.
