Truncated Poisson Regression for Time Series of Counts


Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA. Scand J Statist, Vol 28: 645–659, 2001

KONSTANTINOS FOKIANOS
University of Cyprus

ABSTRACT. We consider partial likelihood analysis of a truncated Poisson regression model for time series of counts. We focus our attention on the study of asymptotic theory for the maximum partial likelihood estimator of a vector of regression parameters. Simulations and data analysis integrate the presentation.

Key words: martingale, non-stationary, partial likelihood, random time dependent covariates, score

1. Introduction

Regression models for non-Gaussian time series data have attracted a fair amount of attention recently, since such data have become rather common in many applications. For example, the polio incidence data (discussed in section 4) list the monthly number of poliomyelitis cases in the USA during 1970 to 1983. Those data, which form a time series of counts, are not amenable to the usual ARMA methodology. There is, however, a growing literature on regression models for non-Gaussian time series. We refer in particular to the recent texts authored by MacDonald & Zucchini (1997, ch. 1), Fahrmeir & Tutz (1994, ch. 6–8), Diggle et al. (1994, ch. 10) and Kedem (1994, ch. 9), featuring work on the so-called transition models for time series. Accordingly, a transition model specifies a generalized linear model (McCullagh & Nelder, 1989) for the conditional distribution of the response given past information. We focus on a specific regression model for the analysis of time series of counts in this work.

Regression models for time series of counts in the context of generalized linear models have been discussed by many authors. Important contributions include Wong (1986), Holden (1987), Zeger & Qaqish (1988), Li (1991, 1994), Albert et al. (1994), and more recently León & Tsai (1998). These models are usually called observation driven (Cox, 1981). We contribute to the choice of the distribution by introducing the doubly truncated Poisson model for time series of counts, a notion discussed in section 2. The main feature of the model is that inference can still be drawn within the framework of time series models following generalized linear models. Estimation and testing are both carried out by partial likelihood parametric inference regarding a time invariant vector of regression parameters. Our approach accommodates random time dependent covariates and drops any stationarity or Markov assumptions.

There is also important work on parameter driven models for non-Gaussian time series; see the articles by Zeger (1988) and Albert (1991), for example. The books by Spall (1988), West & Harrison (1997) and Fahrmeir & Tutz (1994) elaborate on the state space modelling of non-linear and non-Gaussian time series.

Econometricians have already noticed the usefulness of the truncated Poisson distribution for regression modelling of independent counts (see Grogger & Carson, 1991; Gurmu & Trivedi, 1992). The paper by Grogger & Carson (1991) examines maximum likelihood regression estimators from the positive Poisson distribution, a concept that we will introduce in the next section. The work by Gurmu & Trivedi (1992) features score tests of extra-Poisson variation in left or right truncated Poisson regression models. These models are introduced in section 2 as special cases of the doubly-truncated Poisson distribution. We continue by considering inference and asymptotics, in the framework of maximum partial likelihood estimation, and the paper concludes with applications.

2. Truncated Poisson regression for time series

2.1. On truncated Poisson distribution

The doubly truncated Poisson distribution has been studied extensively by many authors (see Johnson et al., 1992, for details). Let $X$ be a random variable distributed according to the doubly-truncated Poisson distribution (Cohen, 1954). That is, $X$ is a Poisson random variable but the first $c_1$ values and the values exceeding a specified value $c_2$ are omitted, with $c_1 < c_2$. A straightforward calculation leads to the probability mass function of $X$,

$$P(X = x; c_1, c_2, \lambda) = \frac{\lambda^x}{x!\,\psi(c_1, c_2, \lambda)}, \qquad x = c_1, c_1 + 1, \ldots, c_2, \quad c_1 < c_2, \tag{1}$$

with

$$\psi(c_1, c_2, \lambda) = \sum_{x=c_1}^{c_2} \frac{\lambda^x}{x!},$$

defined only when both constants $c_1$ and $c_2$ are non-negative and such that $c_1 < c_2$. If $c_1$ is negative then we put $\psi(c_1, c_2, \lambda) = \psi(0, c_2, \lambda)$. Clearly $\psi(0, \infty, \lambda) = \exp(\lambda)$. Setting $c_2 = \infty$ in (1) leads to the so-called left truncated Poisson distribution. In particular, if $c_1 = 1$ and $c_2 = \infty$, (1) becomes

$$P(X = x; \lambda) = \frac{\lambda^x}{x!\,(\exp(\lambda) - 1)}, \qquad x = 1, 2, \ldots,$$

which is the probability mass function of the positive Poisson distribution. References on probabilistic and statistical properties of this distribution include David & Johnson (1952), Grab & Savage (1954) and Kemp & Kemp (1988), for example. Similarly, substituting $c_1 = 0$ in (1) yields the right truncated Poisson distribution. The book by Johnson et al. (1992) is a useful source for additional information on the properties of these distributions. The next step is to utilize the doubly-truncated Poisson distribution for modelling a time series of counts observed jointly with random time dependent covariates.

2.2. Modelling

Suppose that $\{y_t, t = 1, 2, \ldots, N\}$ is a univariate time series of counts truncated at levels $c_1$ and $c_2$. That is, values below $c_1$ or above $c_2$ have been omitted. Furthermore, let $\{x_t, t = 1, \ldots, N\}$ be a vector of covariates that influence the evolution of $\{y_t, t = 1, \ldots, N\}$ and assume that $\mathcal{F}_{t-1}$ stands for the past of the processes up to and including time $t$. Then the doubly-truncated Poisson regression model for time series of counts is completely specified by the following assumptions.

Assumption 1
Let $f(y_t; \lambda_t \mid \mathcal{F}_{t-1})$ denote the conditional distribution of $y_t$ given the history of the processes. We assume that $f$ has the following form:

$$f(y_t; \lambda_t \mid \mathcal{F}_{t-1}) = \frac{\lambda_t^{y_t}}{y_t!\,\psi(c_1, c_2, \lambda_t)}, \qquad y_t = c_1, \ldots, c_2. \tag{2}$$
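As a concrete reading of (1) and (2), the following minimal sketch evaluates the normalizing sum $\psi$ and the doubly-truncated Poisson probability mass function. The function names and the use of `None` to stand for $c_2 = \infty$ are illustrative choices made here, not notation from the paper.

```python
import math

def psi(c1, c2, lam):
    """Normalizing sum psi(c1, c2, lam) = sum_{x=c1}^{c2} lam**x / x!.

    A negative c1 is treated as 0, as stated in the text.
    c2=None stands for c2 = infinity (our own convention).
    """
    lo = max(c1, 0)
    if c2 is None:
        # exp(lam) is the full series; subtract the omitted head x = 0..lo-1.
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))

def truncated_poisson_pmf(x, c1, c2, lam):
    """P(X = x) under (1); zero outside the support {c1, ..., c2}."""
    if x < max(c1, 0) or (c2 is not None and x > c2):
        return 0.0
    return lam**x / (math.factorial(x) * psi(c1, c2, lam))

# Positive Poisson case (c1 = 1, c2 = infinity) and a doubly truncated case.
p_positive = truncated_poisson_pmf(1, 1, None, 2.0)
total = sum(truncated_poisson_pmf(x, 1, 10, 2.5) for x in range(0, 20))
```

With $c_1 = 1$ and $c_2 = \infty$ the function reduces to the positive Poisson formula given above, and for finite truncation points the probabilities over the support sum to one.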

An interesting observation is that (2) introduces a member of the exponential family of distributions. Indeed, by letting $\log \lambda_t = \theta_t$ we obtain

$$f(y_t; \lambda_t \mid \mathcal{F}_{t-1}) = \exp\{ y_t \theta_t - \log \psi(c_1, c_2, \exp(\theta_t)) - \log(y_t!) \}.$$

Standard properties of the exponential family of distributions (see, for example, Cox & Hinkley, 1974) yield the following relations regarding the conditional mean and variance of the process:

$$\mu_t = E[Y_t \mid \mathcal{F}_{t-1}] = \lambda_t \frac{\psi(c_1 - 1, c_2 - 1, \lambda_t)}{\psi(c_1, c_2, \lambda_t)} \tag{3}$$

and

$$v_t = \mathrm{var}[Y_t \mid \mathcal{F}_{t-1}] = \frac{1}{\psi^2(c_1, c_2, \lambda_t)} \Big\{ \lambda_t^2\, \psi(c_1 - 2, c_2 - 2, \lambda_t)\, \psi(c_1, c_2, \lambda_t) + \lambda_t\, \psi(c_1 - 1, c_2 - 1, \lambda_t) \big[ \psi(c_1, c_2, \lambda_t) - \lambda_t\, \psi(c_1 - 1, c_2 - 1, \lambda_t) \big] \Big\}. \tag{4}$$

We point out that if $c_1$ takes the values 0 or 1, then (3) and (4) should be used according to our earlier model assumptions for $\psi$.

Assumption 2
The processes $y_t$ and $x_t$ are related by means of the linear predictor

$$\gamma_t = \beta' z_{t-1}.$$

The coefficient $\beta$ denotes a $p$-dimensional vector of unknown regression parameters. The vector of random time dependent covariates $z_{t-1}$, which might include past values of the response, past values of $x_t$, or any interactions between them, is $\mathcal{F}_{t-1}$ measurable. This specification covers the case of a non-homogeneous Markov chain modelled through $z_{t-1} = (1, y_{t-1}, \ldots, y_{t-l}, x_t)$, and the example of a purely autoregressive process $z_{t-1} = (1, y_{t-1}, \ldots, y_{t-l})$. However, we do not postulate any Markovian assumption on the process. In addition, the process may be non-stationary.

Assumption 3
The linear predictor is linked to the mean $\mu_t = \mu(\theta_t)$ by the expression

$$\theta_t = \log \lambda_t = \gamma_t.$$

We observe that the above assumption is the so-called canonical link. The condition of the canonical link facilitates further calculations and guarantees existence and uniqueness of the maximum likelihood estimator (Wedderburn, 1976).
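The closed forms (3) and (4) can be checked numerically. The sketch below is an illustration under our own naming (the helper `_shift` and the `None` convention for $c_2 = \infty$ are assumptions, not part of the paper); it evaluates both expressions, including the rule that a negative lower index in $\psi$ is replaced by 0.

```python
import math

def psi(c1, c2, lam):
    """sum_{x=max(c1,0)}^{c2} lam**x / x!; c2=None means c2 = infinity."""
    lo = max(c1, 0)
    if c2 is None:
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))

def _shift(c, k):
    """c - k, leaving an infinite upper truncation point (None) unchanged."""
    return None if c is None else c - k

def conditional_mean(c1, c2, lam):
    """Equation (3): lam * psi(c1-1, c2-1, lam) / psi(c1, c2, lam)."""
    return lam * psi(c1 - 1, _shift(c2, 1), lam) / psi(c1, c2, lam)

def conditional_variance(c1, c2, lam):
    """Equation (4), written term by term."""
    p0 = psi(c1, c2, lam)
    p1 = psi(c1 - 1, _shift(c2, 1), lam)
    p2 = psi(c1 - 2, _shift(c2, 2), lam)
    return (lam**2 * p2 * p0 + lam * p1 * (p0 - lam * p1)) / p0**2

# Example: truncation points used later for the polio data (c1 = 0, c2 = 14).
mu = conditional_mean(0, 14, 1.3)
v = conditional_variance(0, 14, 1.3)
```

For the positive Poisson case the mean reduces to $\lambda / (1 - e^{-\lambda})$, which gives a quick sanity check of the implementation.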
We do not discuss the case of a non-canonical link, that is, the case where there exists a twice differentiable function, say $g$, such that

$$g(\mu_t) = \gamma_t.$$

This topic can be treated along the lines of Fokianos & Kedem (1998).

Figures 1 and 3 display typical realizations of a truncated time series of counts of length 250. For these examples, we choose the simple model

$$\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos\left(\frac{2\pi t}{12}\right) + \beta_2 \log(y_{t-1}).$$
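A series from this simple model can be generated by drawing each $y_t$ from the doubly-truncated Poisson distribution with rate $\lambda_t$. The sketch below uses inverse-CDF sampling over the support; the sampler, the starting value `y0 = 1`, the seed and the parameter values are all assumptions chosen for illustration and are not taken from the paper. It requires $c_1 \ge 1$ so that $\log(y_{t-1})$ is defined.

```python
import math
import random

def psi(c1, c2, lam):
    """sum_{x=max(c1,0)}^{c2} lam**x / x!; c2=None means c2 = infinity."""
    lo = max(c1, 0)
    if c2 is None:
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))

def sample_truncated_poisson(rng, c1, c2, lam):
    """Inverse-CDF draw from pmf (1): walk up the support until the
    cumulative unnormalized mass reaches u * psi(c1, c2, lam)."""
    x = max(c1, 0)
    target = rng.random() * psi(c1, c2, lam)
    term = lam**x / math.factorial(x)
    cum = term
    # Stop at c2 if it is finite; the term > 0 guard ends the loop if the
    # terms underflow before rounding lets cum reach the target.
    while cum < target and (c2 is None or x < c2) and term > 0.0:
        x += 1
        term *= lam / x
        cum += term
    return x

def simulate(beta, n, c1=1, c2=None, y0=1, seed=0):
    """Simulate y_1..y_n with log lam_t = b0 + b1*cos(2*pi*t/12) + b2*log(y_{t-1})."""
    rng = random.Random(seed)
    b0, b1, b2 = beta
    y_prev, ys = y0, []
    for t in range(1, n + 1):
        lam = math.exp(b0 + b1 * math.cos(2 * math.pi * t / 12) + b2 * math.log(y_prev))
        y_prev = sample_truncated_poisson(rng, c1, c2, lam)
        ys.append(y_prev)
    return ys

# Illustrative parameter values only (hypothetical, not the paper's table).
series_positive = simulate((0.30, 0.50, 0.50), 250, c1=1, c2=None, seed=1)
series_capped = simulate((0.30, 0.50, 0.50), 250, c1=1, c2=10, seed=1)
```

Inverse-CDF sampling is chosen here because the support is small for the truncation points considered; rejection from an untruncated Poisson sampler would be an equally reasonable alternative.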

Here $\beta = (\beta_0, \beta_1, \beta_2)'$ and $z_{t-1} = (1, \cos(2\pi t/12), \log(y_{t-1}))'$. Figure 1 illustrates typical realizations when the truncation points are $c_1 = 1$ and $c_2 = \infty$, i.e. the case of the positive Poisson distribution. Figure 3 shows time series plots of the same model but now the truncation points have been fixed to $c_1 = 1$ and $c_2 = 10$. For both of these figures the top part illustrates realizations of the process when $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$, while the bottom part illustrates typical realizations when $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

It is rather interesting to observe that the behaviour of the process depends on the sign of the coefficient of $\log(y_{t-1})$. This is true for both of the models at hand. Indeed, negative values of the parameter $\beta_2$ lead to an alternating series, a fact which is manifested by plotting the autocorrelation function of both time series shown in Figs 1(a) and 3(a). The respective autocorrelation plots are displayed in Figs 2(a) and 4(a). We can deduce that successive observations tend to be located on different sides of the mean, thus leading to larger oscillation. The sinusoidal pattern of the autocorrelation function is clearly evident. In contrast, Figs 1(b) and 3(b) exhibit time series with less oscillation. This fact is also confirmed by the plots of their autocorrelation functions, Figs 2(b) and 4(b) respectively, which exhibit a short term correlation indicating that values above (below) the mean tend to be followed by observations that also fall above (below) the mean. In addition, we notice the sinusoidal pattern of the autocorrelation function.

Fig. 1. Typical realizations of a time series $\{y_t, t = 1, \ldots, 250\}$ for truncated counts. The data have been generated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = \infty$. (a) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (b) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

Fig. 2. (a) Autocorrelation function of the simulated time series which corresponds to the top panel of Fig. 1. (b) Autocorrelation function of the simulated time series which corresponds to the bottom panel of Fig. 1.

Fig. 3. Typical realizations of a time series $\{y_t, t = 1, \ldots, 250\}$ for truncated counts. The data have been generated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = 10$. (a) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (b) Realizations with $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$.

Fig. 4. (a) Autocorrelation function of the simulated time series which corresponds to the top panel of Fig. 3. (b) Autocorrelation function of the simulated time series which corresponds to the bottom panel of Fig. 3.

3. Partial likelihood inference

Interest is certainly focused on the estimation of the vector parameter $\beta$. This is a challenging problem since the data are dependent. Dependence is modelled through the covariate vector $z_{t-1}$, which might include past values of the process $\{y_t, t = 1, \ldots, N\}$. The partial likelihood methodology, suggested by Cox (1975), approaches the problem successfully via martingale theory. In the context of time series following generalized linear models, the idea has been further developed in Wong (1986), Slud & Kedem (1994) and Fokianos & Kedem (1998), among others. In the context of survival analysis and counting processes see Andersen & Gill (1982) or Arjas & Haara (1987), for example.

Partial likelihood allows for sequential conditional processing of the information. Lengthy discussions of these ideas have been given in Slud & Kedem (1994) and Fokianos & Kedem (1998). We follow Fokianos & Kedem (1998) in the development of the theory. Hence, suppose that $\{y_t, t = 1, \ldots, N\}$ is a doubly-truncated time series of counts and put, as in subsection 2.2, $\log \lambda_t(\beta) = \beta' z_{t-1}$. Equivalently, we have that $\lambda_t(\beta) = \exp(\beta' z_{t-1})$.
Then, the partial log-likelihood (pl) function relative to $\beta$, $\mathcal{F}_t$, and the data $\{y_t, t = 1, \ldots, N\}$, is given by

$$\mathrm{pl}(\beta) = \sum_{t=1}^{N} \log f(y_t; \lambda_t(\beta) \mid \mathcal{F}_{t-1}) = \sum_{t=1}^{N} \big\{ y_t \log \lambda_t(\beta) - \log \psi(c_1, c_2, \lambda_t(\beta)) - \log(y_t!) \big\}. \tag{5}$$

Differentiation of (5) with respect to $\beta$ yields the partial score

$$S_N(\beta) = \sum_{t=1}^{N} z_{t-1} (y_t - \mu_t(\beta)). \tag{6}$$

The solution of the equation $S_N(\beta) = 0$ is called the maximum partial likelihood estimator (MPLE) and it is denoted by $\hat\beta$. The most widely used method for the solution of $S_N(\beta) = 0$ is Fisher scoring (details are given in the second chapter of McCullagh & Nelder (1989), for example). Notice that $\{S_t(\beta), t = 1, \ldots, N\}$ coupled with the increasing sequence of $\sigma$-fields $\{\mathcal{F}_t\}$ forms a zero mean square integrable martingale sequence. This is a crucial fact for an application of a central limit theorem for studying the asymptotic properties of the maximum partial likelihood estimator $\hat\beta$. The conditional information matrix is

$$G_N(\beta) = \sum_{t=1}^{N} \mathrm{var}[z_{t-1}(y_t - \mu_t(\beta)) \mid \mathcal{F}_{t-1}] = \sum_{t=1}^{N} z_{t-1} z_{t-1}' v_t(\beta), \tag{7}$$

with $v_t(\beta)$ defined by (4). The unconditional information matrix is given by

$$F_N(\beta) = E[G_N(\beta)] \tag{8}$$

and the second derivative of the partial log-likelihood, multiplied by $-1$, is

$$H_N(\beta) = G_N(\beta). \tag{9}$$

It is crucial to observe that the last equation implies that the partial likelihood surface is a concave function of $\beta$. Thus, if the maximum partial likelihood estimator $\hat\beta$ exists, then it is unique. We discuss its asymptotic properties in the next subsection.

3.1. Asymptotic theory

The asymptotic properties of the maximum partial likelihood estimator $\hat\beta$ are examined with the help of the score function and the conditional information matrix (Arjas & Haara, 1987; Andersen & Gill, 1982; Wong, 1986). Our approach follows Slud & Kedem (1994), Kedem (1994, ch. 9) and Fokianos & Kedem (1998). The following assumptions help to establish asymptotic properties.

Assumption 4
(1) The regression coefficients $\beta$ belong to an open subset of $R^p$.
(2) The regressors $z_{t-1}$ are almost surely bounded.
(3) There is a probability measure $\nu$ such that $\int_{R^p} z z' \nu(dz)$ is positive definite and such that for Borel sets $A$ we have

$$\frac{1}{N} \sum_{t=1}^{N} I_{[z_{t-1} \in A]} \xrightarrow{p} \nu(A), \qquad \text{as } N \to \infty.$$

A detailed discussion of these assumptions can be found in Fokianos & Kedem (1998). We briefly mention that assumption 4 (3) implies that the empirical measure of the set $\{z_t : t = 1, \ldots, N\}$ converges weakly almost surely to a non-random measure $\nu$. This hints at the fact that for every continuous function $g$ which is bounded on the compact support of $z_t$ we have

$$\frac{1}{N} \sum_{t=1}^{N} g(z_{t-1}) \xrightarrow{p} \int_{R^p} g(z)\, \nu(dz), \qquad \text{as } N \to \infty.$$

Assumption 4 (3) cannot be verified directly. Its main implication is that the conditional information matrix $G_N(\beta)$ has a non-random limit

$$\frac{G_N(\beta)}{N} \xrightarrow{p} \int_{R^p} z z' v(\beta)\, \nu(dz) = G(\beta), \tag{10}$$

which is positive definite by assumption, and therefore its inverse exists, a useful fact for the asymptotic theory. Notice that

$$v(\beta) = \frac{1}{\psi^2(c_1, c_2, \lambda(\beta))} \Big\{ \lambda^2(\beta)\, \psi(c_1 - 2, c_2 - 2, \lambda(\beta))\, \psi(c_1, c_2, \lambda(\beta)) + \lambda(\beta)\, \psi(c_1 - 1, c_2 - 1, \lambda(\beta)) \big[ \psi(c_1, c_2, \lambda(\beta)) - \lambda(\beta)\, \psi(c_1 - 1, c_2 - 1, \lambda(\beta)) \big] \Big\}.$$

Recall that the partial score process is a zero mean square integrable martingale.
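The partial score (6) and the conditional information (7) are all that is needed to sketch the Fisher scoring iteration $\beta \leftarrow \beta + G_N^{-1}(\beta) S_N(\beta)$ for the MPLE. The code below is a minimal pure-Python illustration under stated assumptions: the function names, the zero starting value, the stopping rule and the small Gaussian-elimination solver are our own choices, and no step-halving safeguard is included. It expects a finite or infinite (`None`) upper truncation point and rows `Z[t]` holding the covariate vector $z_{t-1}$ paired with `y[t]`.

```python
import math

def psi(c1, c2, lam):
    """sum_{x=max(c1,0)}^{c2} lam**x / x!; c2=None means c2 = infinity."""
    lo = max(c1, 0)
    if c2 is None:
        return math.exp(lam) - sum(lam**x / math.factorial(x) for x in range(lo))
    return sum(lam**x / math.factorial(x) for x in range(lo, c2 + 1))

def mean_var(c1, c2, lam):
    """Conditional mean (3) and variance (4)."""
    sh = lambda c, k: None if c is None else c - k
    p0 = psi(c1, c2, lam)
    p1 = psi(c1 - 1, sh(c2, 1), lam)
    p2 = psi(c1 - 2, sh(c2, 2), lam)
    return lam * p1 / p0, (lam**2 * p2 * p0 + lam * p1 * (p0 - lam * p1)) / p0**2

def score_and_info(beta, y, Z, c1, c2):
    """Partial score S_N (6) and conditional information G_N (7)."""
    p = len(beta)
    S = [0.0] * p
    G = [[0.0] * p for _ in range(p)]
    for yt, z in zip(y, Z):
        lam = math.exp(sum(b * zj for b, zj in zip(beta, z)))
        mu, v = mean_var(c1, c2, lam)
        for i in range(p):
            S[i] += z[i] * (yt - mu)
            for j in range(p):
                G[i][j] += z[i] * z[j] * v
    return S, G

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def fisher_scoring(y, Z, c1, c2, tol=1e-10, max_iter=100):
    """Iterate beta <- beta + G_N(beta)^{-1} S_N(beta), starting from zero."""
    beta = [0.0] * len(Z[0])
    for _ in range(max_iter):
        S, G = score_and_info(beta, y, Z, c1, c2)
        step = solve(G, S)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data (made up for illustration): intercept plus an annual cosine term.
y = [1, 3, 2, 4, 2, 1, 3, 5, 2, 2, 1, 4] * 3
Z = [(1.0, math.cos(2 * math.pi * t / 12)) for t in range(1, len(y) + 1)]
beta_hat = fisher_scoring(y, Z, c1=1, c2=10)
```

Because (9) gives $H_N = G_N$ under the canonical link, this iteration coincides with Newton-Raphson on a concave surface; a production implementation would normally add a line search and a check that the estimate exists.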
The convergence in distribution of the partial score process is established by invoking the Cramér–Wold device coupled with (10). Indeed,

$$\frac{u' G_N u}{u' F_N u} = \frac{u' G_N u / N}{u' F_N u / N} \xrightarrow{p} \frac{u' G u}{u' G u} = 1 \tag{11}$$

for every $u \in R^p$. In addition, if $I_{Nt}(\epsilon)$ denotes the indicator of the set $\{ |u' a_t| > (u' F_N u)^{1/2} \epsilon \}$, with $a_t = S_t - S_{t-1}$, then a simple calculation shows that

$$\frac{1}{u' F_N u} \sum_{t=1}^{N} E\big[ |u' a_t|^2 I_{Nt}(\epsilon) \mid \mathcal{F}_{t-1} \big] \le \frac{1}{(u' F_N u)^{3/2} \epsilon} \sum_{t=1}^{N} E\big[ |u' a_t|^3 \mid \mathcal{F}_{t-1} \big] \le \frac{N M_1}{(u' F_N u)^{3/2} \epsilon},$$

where $M_1$ is a bound. That bound exists from assumption 4 (2). Since $u' F_N u$ grows at the rate $N$, the right-hand side tends to zero as $N \to \infty$. We summarize our discussion in the form of a lemma (Hall & Heyde, 1980, coroll. 3.1).

Lemma 1
The partial score process $\{S_t, \mathcal{F}_t\}$, defined by (6), is a zero mean square integrable martingale such that

$$F_N^{-1/2} S_N(\beta) \xrightarrow{D} \mathcal{N} \qquad \text{as } N \to \infty,$$

where $\mathcal{N}$ stands for a standard normal random vector.

By expanding the partial score in a Taylor series around the true value, say $\beta_0$, and using the fact that $S_N(\hat\beta) = 0$, we obtain the following approximation:

$$\sqrt{N} (\hat\beta - \beta_0) \approx \left( \frac{G_N(\beta_0)}{N} \right)^{-1} \frac{1}{\sqrt{N}} S_N(\beta_0) \approx G^{-1}(\beta_0) \frac{1}{\sqrt{N}} S_N(\beta_0).$$

Thus, lemma 1 and (11) lead to the following theorem.

Theorem 1
Under assumption 4, the maximum partial likelihood estimator $\hat\beta$ is almost surely unique. Additionally, the estimator is consistent and asymptotically normally distributed,

$$\sqrt{N} (\hat\beta - \beta_0) \xrightarrow{D} N(0, G^{-1}(\beta_0)).$$

This establishes asymptotic properties of the maximum partial likelihood estimator.

3.2. Testing hypotheses

We complete this section by discussing how to test hypotheses for the truncated Poisson regression model. We focus on testing the general linear hypotheses

$$H_0: C\beta = \gamma \quad \text{against} \quad H_1: C\beta \ne \gamma, \tag{12}$$

where $C$ is an appropriate known matrix with full rank, say $r < p$. Denote by $\tilde\beta$ the restricted maximum partial likelihood estimator under the hypothesis (12). Then the most commonly used statistics for testing these hypotheses are the partial likelihood ratio statistic

$$\lambda = -2 \{ \mathrm{pl}(\tilde\beta) - \mathrm{pl}(\hat\beta) \}, \tag{13}$$

the Wald statistic

$$w = \{ C\hat\beta - \gamma \}' \{ C G^{-1}(\hat\beta) C' \}^{-1} \{ C\hat\beta - \gamma \}, \tag{14}$$

and the partial score statistic

$$c = S_N'(\tilde\beta)\, G^{-1}(\tilde\beta)\, S_N(\tilde\beta). \tag{15}$$

The following theorem states the asymptotic distribution of the aforementioned statistics. Its proof can be derived along the lines of Fahrmeir (1987) and is omitted.

Theorem 2
Under assumption 4 the test statistics $\lambda$, $w$ and $c$ are asymptotically equivalent. Furthermore, their asymptotic distribution is chi-square with $r$ degrees of freedom, under hypothesis (12).

In summary, we presented the doubly-truncated Poisson regression model for time series of counts. Estimation was carried out by the method of partial likelihood, which allows for sequential processing of the information. The fact that the gradient of the partial log-likelihood is a zero mean square integrable martingale was useful in establishing asymptotic properties of the estimator. In addition, classical statistical tests can be used for testing linear hypotheses about the vector of regression parameters.

4. Simulations and data analysis

We conclude this work with the presentation of some limited simulation results. These are put together with an application of the theory to a real data set.

4.1. Simulations

We implement a limited simulation study, with 1000 runs, to illustrate empirically the adequacy of the theoretical results. We generate truncated time series of counts from the following model

$$\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos\left(\frac{2\pi t}{12}\right) + \beta_2 \log(y_{t-1})$$

for various choices of $\beta_0$, $\beta_1$ and $\beta_2$. Notice that the truncation points are $c_1 = 1$ and $c_2 = \infty$ for these data. This implies the case of the positive Poisson distribution. The length of each simulated time series is 250. The first three columns of Table 1 list the true values of the parameters. The next three columns list the means of the estimated parameters while the last three columns report their standard deviations. It seems that the estimated parameters are in close agreement with the true parameters. In addition, Figs 5 and 6 illustrate Q–Q plots of the maximum partial likelihood estimators using data that correspond to the first and fifth rows of Table 1. There are no gross departures from the asserted normality.

Table 1. True parameters, estimated parameters and Monte Carlo standard errors from 1000 simulations of the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. The truncation points are $c_1 = 1$ and $c_2 = \infty$ and the length of the observed time series is 250.
Columns: True parameters ($\beta_0$, $\beta_1$, $\beta_2$); Estimated parameters ($\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$); Estimated standard errors (S.E.($\hat\beta_0$), S.E.($\hat\beta_1$), S.E.($\hat\beta_2$)).

Fig. 5. Q–Q plots of 1000 maximum partial likelihood estimators. The data have been simulated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. Here, the length of each time series is 250, the truncation points are $c_1 = 1$ and $c_2 = \infty$, and $\beta_0 = 0.30$, $\beta_1 = 0.75$ and $\beta_2 = -1$. (a) Q–Q plot of $\hat\beta_0$. (b) Q–Q plot of $\hat\beta_1$. (c) Q–Q plot of $\hat\beta_2$.

Fig. 6. Q–Q plots of 1000 maximum partial likelihood estimators. The data have been simulated according to the model $\log \lambda_t(\beta) = \beta_0 + \beta_1 \cos(2\pi t/12) + \beta_2 \log(y_{t-1})$. Here, the length of each time series is 250, the truncation points are $c_1 = 1$ and $c_2 = \infty$, and $\beta_0 = 0.30$, $\beta_1 = 0.50$ and $\beta_2 = 0.50$. (a) Q–Q plot of $\hat\beta_0$. (b) Q–Q plot of $\hat\beta_1$. (c) Q–Q plot of $\hat\beta_2$.

4.2. Polio incidence in the USA

We apply the new methodology to an already published data set, namely the polio incidence rates in the USA. These data have been published by the US Centers for Disease Control and list the monthly number of poliomyelitis cases during the years 1970 to 1983, that is $t = 1, \ldots, 168$. Previous analyses have been given by Zeger (1988), Fahrmeir & Tutz (1994, p. 197), Li (1994) and Eilers & Marx (1999), for example.

A close look at the data (see Fig. 7) reveals that there is a long term decrease of the incidence rate. Furthermore, we notice that the values of this time series fall between 0 and 14. Thus, we are led to apply a truncated Poisson regression model with $c_1 = 0$ and $c_2 = 14$. Similar to Fahrmeir & Tutz (1994), we consider models with trend, sine and cosine pairs of the annual frequencies and past values of the response. That is,

$$\log \lambda_t(\beta) = \beta_0 + \beta_1 t + \beta_2 \cos(2\pi t/12) + \beta_3 \sin(2\pi t/12) + \beta_4 y_{t-1} + \beta_5 y_{t-2}.$$

We point out that the analysis given by Fahrmeir & Tutz (1994) takes into account sine and cosine pairs of semiannual frequencies as well as lagged values of the response of order 5 by using a log-linear model.

Fig. 7. Monthly number of poliomyelitis cases in the USA from 1970 to 1983.

Table 2 summarizes our findings for the different models we applied to the polio incidence rates. The first column lists all the models considered while the second column gives the corresponding values of the negative partial log-likelihood. The results in Table 2 show that a reasonable model for these data includes trend, sinusoidal terms and an autoregressive part of order 1. The addition of the autoregressive term of order 2 does not improve the fit. A formal application of the partial log-likelihood ratio test shows no significance, its p-value being […]. The fitted model to these data is

log λ_t = 0.632 − 5.306 t […] .149 cos(2πt/12) […] 0.471 sin(2πt/12) […] 0.437 y_{t−1}.

The negative sign of the trend term indicates that there is a long term decrease of the number of poliomyelitis cases during the observation period. Figure 8(a) illustrates the time series plot of the predicted and observed data. In contrast to Fahrmeir & Tutz (1994, fig. 6.2), we notice that the truncated Poisson model can predict large values of the response, a feature not shared with the log-linear model.

Table 2. Monthly number of poliomyelitis cases in the USA from 1970 to 1983: fitted models using the truncated Poisson regression model with $c_1 = 0$ and $c_2 = 14$.
  Time dependent covariates                          Negative partial log-likelihood
  (1, t)
  (1, t, sinusoidal terms)
  (1, t, sinusoidal terms, y_{t−1})
  (1, t, sinusoidal terms, y_{t−1}, y_{t−2})
  (1, sinusoidal terms, y_{t−1})
  (1, t, y_{t−1})
  (1, y_{t−1})

The quality of the fit depends upon the choice of truncation points. Initially we set $c_1 = 0$ and $c_2 = 14$, motivated by the range of those data. However, keeping the lower truncation point $c_1$ fixed at 0 and considering the model which includes constant term, trend, yearly sinusoidal components and lagged values of order 1, we notice that the negative partial log-likelihood is an increasing function of $c_2$ (see Table 3). Indeed, the value $c_2 = 14 = \max_t y_t$ maximizes the partial log-likelihood and thus minimizes the negative partial log-likelihood. Thus, any likelihood based criterion leads to $c_2 = 14$ for the choice of the upper truncation point. Furthermore, calculation of Pearson's goodness of fit statistic

$$\chi^2 = \sum_{t} \frac{(y_t - \hat\mu_t)^2}{\hat v_t},$$

using expressions (3) and (4), leads to identical results which we do not report. In particular, set $c_2 = 30$ to obtain

log λ_t = 0.395 − 3.836 t […] .112 cos(2πt/12) […] 0.328 sin(2πt/12) […] 0.110 y_{t−1}.

A direct comparison with the previous fit shows that the sign of all the parameters remains the same, leading to similar conclusions. Figure 8(b) illustrates that this model, like the log-linear model of Fahrmeir & Tutz (1994, fig. 6.2), does not predict the large values of the response, since as $c_2 \to \infty$ the likelihood of the truncated Poisson model approaches the standard likelihood of a Poisson regression model. In fact, the estimators from a log-linear model that includes constant term, trend, yearly sinusoidal components and lagged values of order 1 are in close agreement with those obtained from the truncated Poisson model with $c_2 = 30$. Similar results are obtained for all the models reported in Table 2.

This discussion shows that the choice of truncation points is crucial, since any likelihood based criterion entails the minimum and maximum value of the data at hand as the optimal solution. However, future observations might lie outside this range, complicating the problem of prediction. We hope our work will stimulate further research in this area.

Table 3. Values of the negative partial log-likelihood function obtained by varying the upper truncation point $c_2$. The fitted model for the monthly numbers of poliomyelitis cases includes constant term, trend, yearly sinusoidal components and lagged values of order 1.
  c_2                                                Negative partial log-likelihood
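The monotone behaviour in $c_2$ noted above can be traced to the single term of (5) that depends on the truncation points. The short derivation below is a sketch under the assumptions that $c_2 \ge \max_t y_t$ and that the maximum partial likelihood estimator exists for each upper truncation point; the notation $\mathrm{pl}(\beta; c_2)$ and $\hat\beta_{c_2+1}$ is introduced here for illustration.

```latex
% Assumption: c_2 >= max_t y_t, so every observation stays in the support.
\begin{align*}
\mathrm{pl}(\beta; c_2) - \mathrm{pl}(\beta; c_2 + 1)
  &= \sum_{t=1}^{N} \log
     \frac{\psi(c_1, c_2 + 1, \lambda_t(\beta))}{\psi(c_1, c_2, \lambda_t(\beta))} \\
  &= \sum_{t=1}^{N} \log\left\{ 1 +
     \frac{\lambda_t(\beta)^{c_2 + 1}}{(c_2 + 1)!\, \psi(c_1, c_2, \lambda_t(\beta))}
     \right\} > 0, \\
\max_{\beta} \mathrm{pl}(\beta; c_2)
  &\ge \mathrm{pl}(\hat\beta_{c_2 + 1}; c_2)
   > \mathrm{pl}(\hat\beta_{c_2 + 1}; c_2 + 1)
   = \max_{\beta} \mathrm{pl}(\beta; c_2 + 1).
\end{align*}
```

So the maximized partial log-likelihood strictly decreases each time the upper truncation point is raised beyond the largest observation, which is consistent with the likelihood-based choice $c_2 = \max_t y_t$ reported in the text.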

Fig. 8. Predicted ($\hat\mu_t$) and observed monthly number of poliomyelitis cases in the USA from 1970 to 1983: solid line and circles respectively. (a) The fitted model is log λ_t = 0.632 − 5.306 t […] .149 cos(2πt/12) […] 0.471 sin(2πt/12) […] 0.437 y_{t−1}. The truncation points are $c_1 = 0$ and $c_2 = 14$. (b) The fitted model is log λ_t = 0.395 − 3.836 t […] .112 cos(2πt/12) […] 0.328 sin(2πt/12) […] 0.110 y_{t−1}. The truncation points are $c_1 = 0$ and $c_2 = 30$.

5. Concluding remarks

We studied the truncated Poisson model for time series of counts within the framework of time series following generalized linear models. The partial likelihood theory is useful for both estimation and testing. The discussion at the end of the last section reveals that the choice of truncation points is of great importance in applications, and further study is still needed on the choice of those parameters.

There are several possible extensions of the model at hand. For instance, consider the dynamic doubly truncated Poisson model with first order random walk,

$$\log \lambda_t = \beta_t' z_{t-1} \quad \text{and} \quad \beta_t = \beta_{t-1} + \epsilon_t,$$

where $\epsilon_t$ are independent multivariate Normal random variables with zero mean and covariance matrix, say $Q_t$. Another extension is to allow state space priors for the covariates in a semiparametric Bayesian framework. Recent advances in the area of Kalman filtering (for example, Fahrmeir, 1992; Frühwirth-Schnatter, 1994; Durbin & Koopman, 1997; Shephard & Pitt, 1997; Gössl et al., 2000, among others) can be applied for exploring the properties of the dynamic doubly truncated Poisson model.

Acknowledgements

We would like to thank the Editor, the Associate Editor and the reviewers for their useful and constructive remarks.

References

Albert, P. S. (1991). A two-state Markov mixture model for a time series of epileptic seizure counts. Biometrics 47, 1371–1381.
Albert, P. S., McFarland, H., Smith, M. & Frank, J. (1994). Time series for modeling counts from a relapsing-remitting disease: application to modelling disease activity in multiple sclerosis. Statist. Med. 13, 453–466.
Andersen, P. K. & Gill, R. D. (1982). Cox's regression model for counting processes: a large sample study. Ann. Statist. 10, 1100–1120.
Arjas, E. & Haara, P. (1987). A logistic regression model for hazard: asymptotic results. Scand. J. Statist. 14, 1–18.
Cohen, A. C. (1954). Estimation of the Poisson parameter from truncated samples and from censored samples. J. Amer. Statist. Assoc. 49, 158–168.
Cox, D. R. (1975). Partial likelihood. Biometrika 62, 69–76.
Cox, D. R. (1981). Statistical analysis of time series: some recent developments. Scand. J. Statist. 8, 93–115.
Cox, D. R. & Hinkley, D. V. (1974). Theoretical statistics. Chapman & Hall, London.
David, F. N. & Johnson, N. L. (1952). The truncated Poisson distribution. Biometrics 8, 275–285.
Diggle, P. J., Liang, K.-Y. & Zeger, S. L. (1994). Analysis of longitudinal data. Oxford University Press, New York.
Durbin, J. & Koopman, S. J. (1997). Monte Carlo maximum likelihood estimation for non-Gaussian state space models. Biometrika 84, 669–684.
Eilers, P. H. C. & Marx, B. D. (1999). Generalized linear additive smooth structures. Technical report, Louisiana State University, Department of Experimental Statistics.
Fahrmeir, L. (1987). Asymptotic testing theory for generalized linear models. Statistics 18, 65–76.
Fahrmeir, L. (1992). Posterior mode estimation by extended Kalman filtering for multivariate dynamic generalized linear models. J. Amer. Statist. Assoc. 87, 501–509.
Fahrmeir, L. & Tutz, G. (1994). Multivariate statistical modeling based on generalized linear models. Springer-Verlag, New York.
Fokianos, K. & Kedem, B. (1998). Prediction and classification of non-stationary categorical time series. J. Multivariate Anal. 67, 277–296.
Frühwirth-Schnatter, S. (1994). Applied state space modeling of non-Gaussian time series using integration based Kalman filtering. Statist. Comput. 4, 259–269.
Gössl, C., Auer, D. P. & Fahrmeir, L. (2000). Dynamic models in fMRI. Magnet. Resonance Med. 43, 72–81.
Grab, E. L. & Savage, I. R. (1954). Tables of the expected value of 1/x for positive Bernoulli and Poisson variables. J. Amer. Statist. Assoc. 49, 169–177.
Grogger, J. T. & Carson, R. T. (1991). Models for truncated counts. J. Appl. Econometrics 6, 225–238.
Gurmu, S. & Trivedi, P. (1992). Overdispersion tests for truncated Poisson regression models. J. Econometrics 54, 347–370.
Hall, P. & Heyde, C. C. (1980). Martingale limit theory and its applications. Academic Press, New York.
Holden, R. T. (1987). Time series analysis of contagious process. J. Amer. Statist. Assoc. 82, 1019–1026.
Johnson, N. L., Kotz, S. & Kemp, A. W. (1992). Univariate discrete distributions, 2nd edn. Wiley, New York.
Kedem, B. (1994). Time series analysis by higher order crossings. IEEE Press, New York.
Kemp, C. D. & Kemp, A. W. (1988). Rapid estimation for discrete distributions. The Statistician 37, 243–255.
León, L. F. & Tsai, C. (1998). Assessment of model adequacy for Markov regression time series models. Biometrics 54, 1165–1175.
Li, W. K. (1991). Testing model adequacy for some Markov regression models for time series. Biometrika 78, 83–89.
Li, W. K. (1994). Time series models based on generalized linear models: some further results. Biometrics 50, 506–511.
MacDonald, I. L. & Zucchini, W. (1997). Hidden Markov and other models for discrete-valued time series. Chapman & Hall, London.
McCullagh, P. & Nelder, J. A. (1989). Generalized linear models, 2nd edn. Chapman & Hall, London.
Shephard, N. & Pitt, M. K. (1997). Likelihood analysis of non-Gaussian measurement time series. Biometrika 84, 653–657.
Slud, E. & Kedem, B. (1994). Partial likelihood analysis of logistic regression and autoregression. Statist. Sinica 4, 89–106.
Spall, J. C. (1988). Bayesian analysis of time series and dynamic models. Marcel Dekker, New York.
West, M. & Harrison, P. (1997). Bayesian forecasting and dynamic models, 2nd edn. Springer-Verlag, New York.
Wong, W. H. (1986). Theory of partial likelihood. Ann. Statist. 14, 88–123.
Zeger, S. L. (1988). A regression model for time series of counts. Biometrika 75, 621–629.
Zeger, S. L. & Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach. Biometrics 44, 1019–1031.

Received August 1999, in final form December 2000

K. Fokianos, Department of Mathematics and Statistics, University of Cyprus, P.O. Box 20537, CY 1678 Nicosia, Cyprus.

Index. Regression Models for Time Series Analysis. Benjamin Kedem, Konstantinos Fokianos Copyright John Wiley & Sons, Inc. ISBN.

Index. Regression Models for Time Series Analysis. Benjamin Kedem, Konstantinos Fokianos Copyright John Wiley & Sons, Inc. ISBN. Regression Models for Time Series Analysis. Benjamin Kedem, Konstantinos Fokianos Copyright 0 2002 John Wiley & Sons, Inc. ISBN. 0-471-36355-3 Index Adaptive rejection sampling, 233 Adjacent categories

More information

Sample size calculations for logistic and Poisson regression models

Sample size calculations for logistic and Poisson regression models Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National

More information

1. Introduction Over the last three decades a number of model selection criteria have been proposed, including AIC (Akaike, 1973), AICC (Hurvich & Tsa

1. Introduction Over the last three decades a number of model selection criteria have been proposed, including AIC (Akaike, 1973), AICC (Hurvich & Tsa On the Use of Marginal Likelihood in Model Selection Peide Shi Department of Probability and Statistics Peking University, Beijing 100871 P. R. China Chih-Ling Tsai Graduate School of Management University

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

Testing for Regime Switching: A Comment

Testing for Regime Switching: A Comment Testing for Regime Switching: A Comment Andrew V. Carter Department of Statistics University of California, Santa Barbara Douglas G. Steigerwald Department of Economics University of California Santa Barbara

More information

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and

More information

Gaussian processes. Basic Properties VAG002-

Gaussian processes. Basic Properties VAG002- Gaussian processes The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity

More information

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725±732, 2001 Non-parametric Tests for the Comparison of Point Processes Based

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Births at Edendale Hospital

Births at Edendale Hospital CHAPTER 14 Births at Edendale Hospital 14.1 Introduction Haines, Munoz and van Gelderen (1989) have described the fitting of Gaussian ARIMA models to various discrete-valued time series related to births

More information

On power and sample size calculations for Wald tests in generalized linear models

On power and sample size calculations for Wald tests in generalized linear models Journal of tatistical lanning and Inference 128 (2005) 43 59 www.elsevier.com/locate/jspi On power and sample size calculations for Wald tests in generalized linear models Gwowen hieh epartment of Management

More information

LARGE SAMPLE PROPERTIES OF PARAMETER ESTIMATES FOR PERIODIC ARMA MODELS

LARGE SAMPLE PROPERTIES OF PARAMETER ESTIMATES FOR PERIODIC ARMA MODELS LARGE SAMPLE PROPERIES OF PARAMEER ESIMAES FOR PERIODIC ARMA MODELS BY I. V. BASAWA and ROBER LUND he University of Georgia First Version received November 1999 Abstract. his paper studies the asymptotic

More information

Problem set 1 - Solutions

Problem set 1 - Solutions EMPIRICAL FINANCE AND FINANCIAL ECONOMETRICS - MODULE (8448) Problem set 1 - Solutions Exercise 1 -Solutions 1. The correct answer is (a). In fact, the process generating daily prices is usually assumed

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Theory and Methods of Statistical Inference. PART I Frequentist theory and methods

Theory and Methods of Statistical Inference. PART I Frequentist theory and methods PhD School in Statistics cycle XXVI, 2011 Theory and Methods of Statistical Inference PART I Frequentist theory and methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Theory and Methods of Statistical Inference

Theory and Methods of Statistical Inference PhD School in Statistics cycle XXIX, 2014 Theory and Methods of Statistical Inference Instructors: B. Liseo, L. Pace, A. Salvan (course coordinator), N. Sartori, A. Tancredi, L. Ventura Syllabus Some prerequisites:

More information

Theory and Methods of Statistical Inference. PART I Frequentist likelihood methods

Theory and Methods of Statistical Inference. PART I Frequentist likelihood methods PhD School in Statistics XXV cycle, 2010 Theory and Methods of Statistical Inference PART I Frequentist likelihood methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle AUTOREGRESSIVE LINEAR MODELS AR(1) MODELS The zero-mean AR(1) model x t = x t,1 + t is a linear regression of the current value of the time series on the previous value. For > 0 it generates positively

More information

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach By Shiqing Ling Department of Mathematics Hong Kong University of Science and Technology Let {y t : t = 0, ±1, ±2,

More information

LESLIE GODFREY LIST OF PUBLICATIONS

LESLIE GODFREY LIST OF PUBLICATIONS LESLIE GODFREY LIST OF PUBLICATIONS This list is in two parts. First, there is a set of selected publications for the period 1971-1996. Second, there are details of more recent outputs. SELECTED PUBLICATIONS,

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

Negative binomial quasi-likelihood inference for general integer-valued time series models

Negative binomial quasi-likelihood inference for general integer-valued time series models MPRA Munich Personal RePEc Archive Negative binomial quasi-likelihood inference for general integer-valued time series models Abdelhakim Aknouche and Sara Bendjeddou and Nassim Touche Faculty of Mathematics,

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

Elements of Multivariate Time Series Analysis

Elements of Multivariate Time Series Analysis Gregory C. Reinsel Elements of Multivariate Time Series Analysis Second Edition With 14 Figures Springer Contents Preface to the Second Edition Preface to the First Edition vii ix 1. Vector Time Series

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond January 18, 2017 Contents 1 Batch and Recursive Estimation 2 Towards Bayesian Filtering 3 Kalman Filter and Bayesian Filtering and Smoothing

More information

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

Pruscha: Semiparametric Estimation in Regression Models for Point Processes based on One Realization

Pruscha: Semiparametric Estimation in Regression Models for Point Processes based on One Realization Pruscha: Semiparametric Estimation in Regression Models for Point Processes based on One Realization Sonderforschungsbereich 386, Paper 66 (1997) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

Diagnostic Test for GARCH Models Based on Absolute Residual Autocorrelations

Diagnostic Test for GARCH Models Based on Absolute Residual Autocorrelations Diagnostic Test for GARCH Models Based on Absolute Residual Autocorrelations Farhat Iqbal Department of Statistics, University of Balochistan Quetta-Pakistan farhatiqb@gmail.com Abstract In this paper

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983),

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983), Mohsen Pourahmadi PUBLICATIONS Books and Editorial Activities: 1. Foundations of Time Series Analysis and Prediction Theory, John Wiley, 2001. 2. Computing Science and Statistics, 31, 2000, the Proceedings

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY Time Series Analysis James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY PREFACE xiii 1 Difference Equations 1.1. First-Order Difference Equations 1 1.2. pth-order Difference Equations 7

More information

Forecasting 1 to h steps ahead using partial least squares

Forecasting 1 to h steps ahead using partial least squares Forecasting 1 to h steps ahead using partial least squares Philip Hans Franses Econometric Institute, Erasmus University Rotterdam November 10, 2006 Econometric Institute Report 2006-47 I thank Dick van

More information

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University

More information

KALMAN-TYPE RECURSIONS FOR TIME-VARYING ARMA MODELS AND THEIR IMPLICATION FOR LEAST SQUARES PROCEDURE ANTONY G AU T I E R (LILLE)

KALMAN-TYPE RECURSIONS FOR TIME-VARYING ARMA MODELS AND THEIR IMPLICATION FOR LEAST SQUARES PROCEDURE ANTONY G AU T I E R (LILLE) PROBABILITY AND MATHEMATICAL STATISTICS Vol 29, Fasc 1 (29), pp 169 18 KALMAN-TYPE RECURSIONS FOR TIME-VARYING ARMA MODELS AND THEIR IMPLICATION FOR LEAST SQUARES PROCEDURE BY ANTONY G AU T I E R (LILLE)

More information

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications Yongmiao Hong Department of Economics & Department of Statistical Sciences Cornell University Spring 2019 Time and uncertainty

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University

Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University A SURVEY OF VARIANCE COMPONENTS ESTIMATION FROM BINARY DATA by Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University BU-1211-M May 1993 ABSTRACT The basic problem of variance components

More information

Exact Non-parametric Con dence Intervals for Quantiles with Progressive Type-II Censoring

Exact Non-parametric Con dence Intervals for Quantiles with Progressive Type-II Censoring Published by Blackwell Publishers Ltd, 08 Cowley Road, Oxford OX4 JF, UK and 350 Main Street, Malden, MA 0248, USA Vol 28: 699±73, 200 Exact Non-parametric Con dence Intervals for Quantiles with Progressive

More information

Fisher information for generalised linear mixed models

Fisher information for generalised linear mixed models Journal of Multivariate Analysis 98 2007 1412 1416 www.elsevier.com/locate/jmva Fisher information for generalised linear mixed models M.P. Wand Department of Statistics, School of Mathematics and Statistics,

More information

Asymptotic inference for a nonstationary double ar(1) model

Asymptotic inference for a nonstationary double ar(1) model Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk

More information

Stochastic Processes

Stochastic Processes Stochastic Processes Stochastic Process Non Formal Definition: Non formal: A stochastic process (random process) is the opposite of a deterministic process such as one defined by a differential equation.

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated

More information

Bayesian Analysis of Vector ARMA Models using Gibbs Sampling. Department of Mathematics and. June 12, 1996

Bayesian Analysis of Vector ARMA Models using Gibbs Sampling. Department of Mathematics and. June 12, 1996 Bayesian Analysis of Vector ARMA Models using Gibbs Sampling Nalini Ravishanker Department of Statistics University of Connecticut Storrs, CT 06269 ravishan@uconnvm.uconn.edu Bonnie K. Ray Department of

More information

Part I State space models

Part I State space models Part I State space models 1 Introduction to state space time series analysis James Durbin Department of Statistics, London School of Economics and Political Science Abstract The paper presents a broad

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori On Properties of QIC in Generalized Estimating Equations Shinpei Imori Graduate School of Engineering Science, Osaka University 1-3 Machikaneyama-cho, Toyonaka, Osaka 560-8531, Japan E-mail: imori.stat@gmail.com

More information

A general mixed model approach for spatio-temporal regression data

A general mixed model approach for spatio-temporal regression data A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Extreme Value Analysis and Spatial Extremes

Extreme Value Analysis and Spatial Extremes Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models

More information

Generalized Estimating Equations

Generalized Estimating Equations Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson

More information

Strati cation in Multivariate Modeling

Strati cation in Multivariate Modeling Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen

More information

Testing for a unit root in an ar(1) model using three and four moment approximations: symmetric distributions

Testing for a unit root in an ar(1) model using three and four moment approximations: symmetric distributions Hong Kong Baptist University HKBU Institutional Repository Department of Economics Journal Articles Department of Economics 1998 Testing for a unit root in an ar(1) model using three and four moment approximations:

More information

Research Division Federal Reserve Bank of St. Louis Working Paper Series

Research Division Federal Reserve Bank of St. Louis Working Paper Series Research Division Federal Reserve Bank of St Louis Working Paper Series Kalman Filtering with Truncated Normal State Variables for Bayesian Estimation of Macroeconomic Models Michael Dueker Working Paper

More information

Expressions for the covariance matrix of covariance data

Expressions for the covariance matrix of covariance data Expressions for the covariance matrix of covariance data Torsten Söderström Division of Systems and Control, Department of Information Technology, Uppsala University, P O Box 337, SE-7505 Uppsala, Sweden

More information

Projected partial likelihood and its application to longitudinal data SUSAN MURPHY AND BING LI Department of Statistics, Pennsylvania State University

Projected partial likelihood and its application to longitudinal data SUSAN MURPHY AND BING LI Department of Statistics, Pennsylvania State University Projected partial likelihood and its application to longitudinal data SUSAN MURPHY AND BING LI Department of Statistics, Pennsylvania State University, 326 Classroom Building, University Park, PA 16802,

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

consistency is faster than the usual T 1=2 consistency rate. In both cases more general error distributions were considered as well. Consistency resul

consistency is faster than the usual T 1=2 consistency rate. In both cases more general error distributions were considered as well. Consistency resul LIKELIHOOD ANALYSIS OF A FIRST ORDER AUTOREGRESSIVE MODEL WITH EPONENTIAL INNOVATIONS By B. Nielsen & N. Shephard Nuæeld College, Oxford O1 1NF, UK bent.nielsen@nuf.ox.ac.uk neil.shephard@nuf.ox.ac.uk

More information

Modelling AR(1) Stationary Time series of Com-Poisson Counts

Modelling AR(1) Stationary Time series of Com-Poisson Counts Modelling AR(1) Stationary Time series of Com-Poisson Counts Naushad Mamode Khan University of Mauritius Department of Economics Statistics Reduit Mauritius nmamodekhan@uomacmu Yuvraj Suneechur University

More information

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Prequential Analysis

Prequential Analysis Prequential Analysis Philip Dawid University of Cambridge NIPS 2008 Tutorial Forecasting 2 Context and purpose...................................................... 3 One-step Forecasts.......................................................

More information

RESEARCH REPORT. Estimation of sample spacing in stochastic processes. Anders Rønn-Nielsen, Jon Sporring and Eva B.

RESEARCH REPORT. Estimation of sample spacing in stochastic processes.   Anders Rønn-Nielsen, Jon Sporring and Eva B. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING www.csgb.dk RESEARCH REPORT 6 Anders Rønn-Nielsen, Jon Sporring and Eva B. Vedel Jensen Estimation of sample spacing in stochastic processes No. 7,

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Generalized Method of Moments Estimation

Generalized Method of Moments Estimation Generalized Method of Moments Estimation Lars Peter Hansen March 0, 2007 Introduction Generalized methods of moments (GMM) refers to a class of estimators which are constructed from exploiting the sample

More information
