Modeling Long-Range Dependence, Nonlinearity, and Periodic Phenomena in Sea Surface Temperatures Using TSMARS

Peter A. W. Lewis; Bonnie K. Ray. Journal of the American Statistical Association, Vol. 92, No. 439 (Sep., 1997). Journal of the American Statistical Association is published by the American Statistical Association.

Peter A. W. LEWIS and Bonnie K. RAY

We analyze a time series of 20 years of daily sea surface temperatures measured off the California coast. The temperatures exhibit quite complicated features, such as effects on many different time scales, nonlinear effects, and long-range dependence. We show how a time series version of the multivariate adaptive regression splines (MARS) algorithm, TSMARS, can be used to obtain univariate adaptive spline threshold autoregressive models that capture many of the physical characteristics of the temperatures and are useful for short- and long-term prediction. We also discuss practical modeling issues, such as handling cycles, long-range dependence, and concurrent predictor time series using TSMARS. Models for the temperatures are evaluated using out-of-sample forecast comparisons, residual diagnostics, model skeletons, and sample functions of simulated series. We show that a categorical seasonal indicator variable can be used to model nonlinear structure in the data that is changing with time of year, but find that none of the models captures all of the cycles apparent in the data.

KEY WORDS: Adaptive spline threshold autoregressive models; Forecasting; Long memory; Multivariate adaptive regression splines; Nonlinear time series; Threshold autoregression.

1. INTRODUCTION AND DATA DESCRIPTION

Physical systems, such as sea surface temperatures (SSTs), cloud cover, wind velocities, wind speeds, and water salinity, exhibit complex and interconnected behaviors that are difficult to capture using standard time series models.
For example, Figure 1 shows the natural logarithm of daily SSTs (in °C) at Granite Canyon, on the coast of California approximately 30 km south of Monterey, over the period March 1, 1971-November 9, 1993. Oceanographers are interested in modeling SSTs to understand what drives changes in temperatures and to obtain accurate predictions of SSTs. Short-term predictions (approximately 1-20 days) are used in large-scale weather models (Haltiner and Williams 1980), whereas long-term predictions (2-3 years or more) are used to explore issues such as global warming and El Niño effects. Our goals are to model SSTs as a function of past SST values and other climatic variables and to develop good short- and long-term predictions.

Peter A. W. Lewis is Distinguished Professor, Department of Operations Research, Naval Postgraduate School, Monterey, CA. Bonnie K. Ray is Associate Professor, Department of Mathematics and Center for Applied Math and Statistics, New Jersey Institute of Technology, Newark, NJ. The research of Bonnie Ray was supported in part by a separately budgeted research grant from the New Jersey Institute of Technology and National Science Foundation grant DMS […]. The authors thank Larry Breaker for his input concerning the substantive issues of interest to oceanographers. They also thank the editor, associate editor, and the referees for comments that helped to substantially improve the presentation.

The series of logged SSTs has cycles on several different time scales. The cyclic time-of-year effect is the dominant feature in the data. Temperature drops every spring, corresponding to the coastal upwelling, although the drop does not occur at precisely the same time every year. Following the spring transition, the temperatures remain low and rise after the summer. It is well known (Hidaka 1954; see also Breaker and Lewis 1988) that this effect is due mainly to the wind direction. There is also a longer cyclic effect due to the El Niño phenomenon.
This phenomenon is reflected in the higher temperatures seen in Figure 1 in 1972, 1976, 1983, and 1987; it has a quasi-period of 4-5 years. An 18.6-year cycle in the SSTs has also been investigated (Royer 1991). A sinusoid with period 1 year fit to the first 20 years of the data using least squares approximates the yearly effect (see Fig. 1). The futility of trying to capture the cyclic effects completely with a global deterministic term can be seen from Figure 1a; the later data (March 1, 1991 and beyond) show a massive and broad El Niño warming in progress, which extended well into […]. Therefore, fitting a 1-year cycle to the entire series would produce a sinusoid with a higher level than the sinusoid fitted using only the first 20 years of data, giving a worse fit to the data in the earlier years than that shown.

The SST data also exhibit long-range dependence, which shows up as very long cycles of different lengths. The clear downward slope of the log periodogram relative to the log frequency shown in Figure 2 is characteristic of long-range dependent processes (Mandelbrot and Van Ness 1968). Because the logged periodogram ordinates are distributed as log exponential random variables, which are strongly negatively skewed, a robust straight line fit is shown. Similar spectral behavior has been seen in 60 years of daily SSTs recorded at Pacific Grove, CA, as well as other sets of SSTs (Mendelson 1990 and personal communication).

A more subtle cyclic pattern is an approximately 47-day oscillation (Breaker and Lewis 1988). Previous attempts to model this oscillation by complex demodulation, for example, have proved fruitless, but it is believed that this oscillation may be related to the effect of the wind on SSTs (Hidaka 1954). We examine this nonlinear hypothesis in more detail in Section […].

© 1997 American Statistical Association. Journal of the American Statistical Association, September 1997, Vol. 92, No. 439, Applications and Case Studies.
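The least squares sinusoid fit and the log periodogram described above can be sketched as follows. This is a minimal illustration on synthetic data, not the Granite Canyon series; the function names `fit_yearly_sinusoid` and `log_periodogram` are hypothetical, a 365-day year is assumed, and an ordinary least squares line stands in for the robust line fit used in Figure 2.

```python
import numpy as np

def fit_yearly_sinusoid(x, period=365.0):
    """Least squares fit of mu + alpha*cos(2*pi*t/period) + beta*sin(2*pi*t/period)."""
    t = np.arange(len(x))
    design = np.column_stack([
        np.ones(len(x)),
        np.cos(2 * np.pi * t / period),
        np.sin(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef, x - design @ coef  # (mu, alpha, beta) and the residual series

def log_periodogram(x, n_freq):
    """Log frequency (cycles per year) and log periodogram at the first n_freq Fourier frequencies."""
    n = len(x)
    j = np.arange(1, n_freq + 1)
    dft = np.fft.rfft(x - np.mean(x))
    pgram = np.abs(dft[j]) ** 2 / (2 * np.pi * n)
    cycles_per_year = j * 365.0 / n
    return np.log(cycles_per_year), np.log(pgram)

# Synthetic stand-in for 20 years of logged daily temperatures (assumed values).
rng = np.random.default_rng(0)
t = np.arange(20 * 365)
series = 2.5 + 0.10 * np.cos(2 * np.pi * t / 365) + 0.05 * np.sin(2 * np.pi * t / 365)
series = series + 0.02 * rng.standard_normal(t.size)

coef, resid = fit_yearly_sinusoid(series)
log_f, log_p = log_periodogram(series, n_freq=50)
# Ordinary (non-robust) straight line through the log-log points.
slope, intercept = np.polyfit(log_f, log_p, 1)
```

For a long-range dependent series the fitted slope would be clearly negative; for this synthetic white-noise-plus-cycle series it carries no such meaning and is computed only to show the mechanics.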

[Figure 1: logged daily sea surface temperatures at Granite Canyon by year (panel a) and residuals against time in days from 3/1/1971 (panel b).]

Figure 1. (a) The Logarithm of the SSTs From March 1, 1971-November 9, 1993. A fitted sinusoid with a 1-year period is also shown for this period. (b) The residuals from this fit.

[Figure 2: initial part of the periodogram of daily sea surface temperatures at Granite Canyon, plotted against the log of frequency in cycles per year.]

Figure 2. Plot of the Logarithm of the Periodogram of 20 Years of Logged SSTs at Granite Canyon Versus the Logarithm of an Initial Set of Fourier Frequencies. The circles depict logarithms of each periodogram ordinate. The x's are logarithms of averages of adjacent periodogram points; the straight line is a fit to the x points.

It has been suggested that a linear trend component be included in a model for SSTs to describe the apparent long-term increase in temperatures. In fact, investigation of the SST data began in the early 1980s when oceanographers at the Naval Postgraduate School inquired about the existence of a long-term trend. Changes observed in flora and fauna in the region suggested that an increase in temperatures had occurred; oceanographers wanted to know whether such a trend was likely to continue. Fitting a linear trend plus 1-year cycle to the first 20 years of logged data gives a trend coefficient […] with estimated standard error $.0588 \times 10^{-5}$ (neglecting the effect of the dependence structure of $X_t$), which translates into a roughly .43% yearly increase in SST. Studies of global temperatures (see, e.g., Smith 1993) give conflicting results about long-term trends, especially when long-range dependence is taken into consideration. Additionally, although there may be a very long-term cycle in the data, continued warming at a rate predicted by a straight line fit is unlikely. Thus a linear trend model is not satisfactory to oceanographers as a descriptive or predictive device.
Because global warming is debatable, other models for the SSTs that do not assume a linear trend, such as spline-type functions, are preferable.

We analyze the daily SSTs at Granite Canyon using both a linear long-range dependent model and nonlinear threshold-type autoregressive (AR) models obtained using TSMARS, a modification of the multivariate adaptive regression splines (MARS) algorithm (Friedman 1991, 1993a, 1993b) appropriate for time series. The period March 1, 1971-February 28, 1991 is used for model estimation; the remainder of the data up to November 9, 1993 (620 days) is used for model validation. We also investigate models that include covariate series of wind velocities and wind directions at Granite Canyon and models that include a covariate categorical time series representing seasons of the year.

Lewis and Ray: Time Series Multivariate Adaptive Regression Splines 883

The article is organized as follows. Section 2 describes TSMARS and discusses some computational issues. Section 3.1 discusses methods of handling the yearly cyclic effects in the data when using TSMARS. Sections 3.2 and 3.3 analyze the extent of long-range dependence in the SSTs and model it using both a linear long-memory model and an approximating nonlinear AR model with lags of up to 5 years. Section 3.4 evaluates the fitted models using out-of-sample forecast root mean squared errors (RMSEs), residual diagnostics, model skeletons, and stability of simulated series. Section 4 illustrates how wind speeds, wind directions, and seasonal effects can be incorporated into our model using TSMARS. Section 5 summarizes our findings and gives directions for future research.

Previous attempts at modeling SSTs using TSMARS have been reported by Stevens (1991), Lewis, Ray, and Stevens (1993), and Lewis and Ray (1993). Unlike that previous work, here we explore the inclusion of a categorical seasonal predictor variable that affects the entire nonlinear structure of the SSTs rather than only their mean value. We find that explicitly subtracting a fitted sinusoidal cycle from the data before using TSMARS gives the best predictive models. We also incorporate lagged values of up to 5 years in the nonlinear models in an attempt to capture long-range dependent behavior. The models of Lewis and Ray (1993) included lags only up to 365 and 50 days, which were not sufficient to account for long-range dependence or the El Niño warming.

2. USING TSMARS FOR NONLINEAR TIME SERIES MODELING

2.1 Threshold Autoregressive Models

Threshold models, or models with partition points, are nonlinear time series models that emerge naturally when different models for the response variable $\{X_t\}$ hold over different regions of the predictor space.
Tong (1983, 1990) introduced threshold autoregression (TAR) models, which characterize this structure using different linear AR models in disjoint subregions of the predictor space. TAR models are powerful and flexible, but their identification is not systematic. Additionally, they can contain subregion AR models that are discontinuous at subregion boundaries.

Systematic implementation of the TAR modeling methodology based on Friedman's (1991) multivariate adaptive regression splines (MARS) algorithm was introduced by Lewis and Stevens (1991). For a time series of responses $\{X_t\}$ with predictors $\{X_{t-1}\}, \{X_{t-2}\}, \ldots, \{X_{t-p}\}$, the MARS algorithm can be used to obtain continuous nonlinear threshold-type AR models with interactions between predictors. We call the MARS algorithm modified for time series TSMARS, and we call the resulting models adaptive spline threshold autoregressive (ASTAR) models. (The form and analysis of univariate ASTAR models have been discussed in detail in Lewis and Stevens 1991.) TSMARS provides a systematic procedure for fitting nonlinear threshold models that are continuous in the domain of the predictor variables, allow interactions among lagged predictor variables, and have multiple lagged predictor variable thresholds, thus overcoming the limitations of Tong's approach and admitting more general nonlinear threshold models.

Models containing covariate predictor series are called semi-multivariate ASTAR (SMASTAR) models (Lewis and Stevens, unpublished manuscript). For $t = 1, \ldots, n$, let $\{Y_t\}$ denote the input time series and let $\{X_t\}$ denote the output time series. A simple SMASTAR model for $\{X_t\}$ with threshold value $\tau_x$ on $X_{t-d_1}$ and $\tau_y$ on $Y_{t-d_2}$ and an interaction between $\{X_{t-d_1}\}$ and $\{Y_{t-d_2}\}$ is

[…] (1)

where $c$ is an intercept and $(x - a)_+$ denotes a truncated spline function with $(x - a)_+ = (x - a)$ if $x > a$ and 0 otherwise. The lags $d_1$ and $d_2$ are not necessarily equal. Also, $Y_t$, the current value of the covariate time series, may or may not be included in (1).
The SMASTAR model can have a physical interpretation in the sense that the effects of lagged variables, covariates, and threshold levels on the response variable can be seen from the form of the fitted model. This is not always true of other nonlinear models, such as neural networks.

2.2 The TSMARS Algorithm

TSMARS is a computer-intensive methodology that uses an exhaustive search algorithm to evaluate a very large number of potential models for the data, selects one of them, and estimates parameters for that model. The exact form of the selected model for $X_t$ depends on the nature of the data, as well as on the parameters that control the implementation of the TSMARS algorithm, as specified by the user.

A SPAN parameter is used to specify the minimum number of observations allowed between threshold placements for each predictor variable, thus acting to control the level of fine structure captured by the model. The SPAN parameter may be specified a priori or selected to minimize a model selection criterion, such as out-of-sample RMSE. The maximum number of linear truncated spline functions allowed in an interaction term is specified by the maximum interaction (MI) parameter. For ease of interpretation, MI is usually set to 2 or 3. An MI value of 1 precludes interactions between predictors. The maximum number of basis functions, MB, allowed in the forward step recursive partitioning algorithm can also be specified by the user. Additionally, a SPEED factor can be set to control the exhaustiveness of the search algorithm, with 1 indicating complete exhaustive search of the predictor space and 5 indicating a modified, less thorough, search. We use SPEED = 1 to obtain all models discussed in this article. Different depths of search do not seem to affect the accuracy of the approximating models in the regression framework (see Friedman 1993b), but may affect out-of-sample predictive accuracy.
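The truncated spline functions of Section 2.1 and the role of a span-like control on threshold placement can be illustrated with a small sketch. It is not the TSMARS implementation: `hinge`, `interaction`, and `best_threshold` are hypothetical names, only one predictor and one threshold are searched, and `span` here simply takes every `span`-th ordered predictor value as a candidate threshold, which is an assumed simplification of the SPAN parameter.

```python
import numpy as np

def hinge(x, knot, sign=+1):
    """Truncated linear spline: (x - knot)_+ for sign=+1, (knot - x)_+ for sign=-1."""
    return np.maximum(sign * (x - knot), 0.0)

def interaction(x_lag, y_lag, tau_x, tau_y):
    """Two-way interaction term: product of two truncated splines."""
    return hinge(x_lag, tau_x) * hinge(y_lag, tau_y)

def best_threshold(response, predictor, span=1):
    """Pick the candidate knot whose pair of hinge functions minimizes the residual sum of squares."""
    candidates = np.sort(predictor)[::span]
    best_rss, best_knot = np.inf, None
    for knot in candidates:
        design = np.column_stack([
            np.ones_like(predictor),
            hinge(predictor, knot, +1),
            hinge(predictor, knot, -1),
        ])
        coef, *_ = np.linalg.lstsq(design, response, rcond=None)
        rss = float(np.sum((response - design @ coef) ** 2))
        if rss < best_rss:
            best_rss, best_knot = rss, knot
    return best_knot, best_rss

# Simulated threshold AR(1) series with a kink at zero (assumed coefficients).
rng = np.random.default_rng(1)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * max(x[t - 1], 0.0) + 0.5 * max(-x[t - 1], 0.0) + rng.standard_normal()

# Lag-one predictor, with candidate thresholds at every 10th order statistic.
knot, rss = best_threshold(x[1:], x[:-1], span=10)
```

A larger `span` examines fewer candidate thresholds, which mirrors the smoothing effect attributed to larger SPAN values; the full algorithm additionally searches over many lags, builds interactions up to order MI, and trims the model backward.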

Given the large number of algorithm parameters and the difficulty of predicting even simple nonlinear models, we have found it best to allow TSMARS as much flexibility in fitted models as is possible in the available computing environment.

To evaluate model fit, TSMARS uses the residual sum of squared errors in the forward steps of the algorithm. A final model is selected by backward trimming using one of four selection criteria that penalize for overparameterization: the generalized cross-validation (GCV) criterion, the Akaike information criterion (AIC), Amemiya's prediction criterion, or the Schwarz-Rissanen criterion (SC). When using MARS in a time series setting, the SC (Schwarz 1978) was found by Stevens (1991) and Lewis and Ray (1993) to give more parsimonious models than the GCV criterion used by Friedman. This result is consistent with the behavior of the SC when selecting among linear ARMA(p, q) models. The SC is used to select all models investigated in this article. Of course, common sense and physical interpretation should dictate the form of the final model.

Estimating the precision of TSMARS parameter estimates is difficult, because a TSMARS-selected model rarely corresponds to the true model. Estimates of precision that are not conditional on the fitted model have not been fully addressed even for simple cases; for example, estimating a linear AR(p) model by first selecting p and then estimating the coefficients (see Chatfield 1995 for a discussion). For TAR models, Tong (1990, thms. 5.7 and 5.8) showed that, assuming the delay variable, threshold values, and orders of the subregion autoregressions are known, the least squares estimates of the AR coefficients are asymptotically normally distributed, with standard errors equal to those given by linear regression theory under certain conditions. The estimated thresholds do not have a limiting normal distribution, however.
Tong obtained standard errors for estimated threshold values in a TAR model using a bootstrap method. For the more complex ASTAR models, bootstrapping is often impractical; a single model may have several predictor variables, with each variable having several threshold points. Even if bootstrapping is feasible, models for the bootstrapped data are not guaranteed to contain the same terms as the original model. Consequently, we give only regression standard errors for the estimated coefficients conditioned on the selected ASTAR and SMASTAR models, although these provide only a rough lower bound on precision.

3. MODELING SSTs USING ASTAR AND ARFIMA MODELS

It is well known that SSTs exhibit high, positive-valued autocorrelations at lags between 1 and approximately 40, and that the autocorrelation function (ACF) is not well modeled by the usual linear ARMA(p, q) models (Breaker, Lewis, and Orav 1983). In this section we model possible long-range dependence with a linear fractionally integrated ARMA (ARFIMA) model and with approximating ASTAR models obtained using TSMARS.

3.1 Handling Cyclic Effects Using TSMARS

The dominant effect in the SSTs (see Fig. 1) is the 1-year cycle. This cycle is a fixed effect. That it is a true cycle can be shown by computing the amplitude of the yearly spectral component for the first year, the next 2 years, the next 3 years, and so on. A regression analysis shows that the amplitude increases linearly in n, the number of days. We consider three ways of handling this cyclic effect:

1. Use the general form of a univariate ASTAR model with an extensive range of lagged $X_{t-j}$'s as predictors and allow thresholds and interactions between lagged values. This can generate cyclic processes and limit cycles (see Lewis and Stevens 1991). Of course even a linear, first-order AR equation with parameter close to 1 generates long "cycles," but the cycles vary in length.

2. 
Subtract a fitted sinusoidal function $S_t = \mu + \alpha\cos(2\pi t/365) + \beta\sin(2\pi t/365)$ from $X_t$ and let $X_t' = X_t - S_t$. TSMARS is then used to model the transformed process, $X_t'$, which is assumed to be stationary. Although our results in Section […] indicate that this method gives possibly the best predictive model for the Granite Canyon SSTs, it is awkward and restrictive as an explanatory model from the viewpoint of oceanographers. To see this, consider $X_t' = X_t - S_t$ and transform it back to the variable $X_t$. Then the threshold on $X_t$ will be cyclic! In practical terms, this implies that the nonlinearity occurs when the variable gets higher than a "time-of-year level" rather than when the variable reaches an absolute level. The nonlinear dynamical structure is not affected by the linear filter $X_t' = X_t - S_t$ (Broomhead, Huke, and Muldoon 1992); however, it is not clear whether inference is affected by substituting residuals from the cyclic fit in the TSMARS algorithm. Some initial results regarding this issue are discussed later in this section.

3. Use both lagged values of the SSTs and the 1-year cosine and sine terms as covariate predictor time series. This is a more general model than that described in method 1. However, for the SST data, the sine and cosine terms seldom came into the final model, even with 20 years of data, and the resulting predictions were extremely poor. In fact, TSMARS seems very poor at detecting a fixed cyclic effect with a long period. As a simple test, we generated series from a Gaussian AR process of order one with an added sinusoid and used TSMARS with a sinusoidal covariate and one lagged predictor to obtain strictly linear models for the series. Other TSMARS parameters were set commensurate with those used to model the Granite Canyon data. In 50 replications, the resulting TSMARS model only once included a sinusoid.
The estimates of the AR coefficient, $\phi$, were better than those obtained by the usual device of fitting a cycle by least squares, subtracting the cycle, and estimating $\phi$ from the residuals, however.

Consequently, for the univariate TSMARS models discussed in the following sections, a yearly cycle was removed from the logged data prior to modeling and predicting. As noted earlier, this may not be the best procedure from an explanatory viewpoint, but it does give the best predictors, as we discuss in Section […]. The same variance-stabilizing transformation and cyclic detrending were applied before fitting a linear long-range dependent model

to the data, as discussed in Section 3.2. Because the long-memory model is linear, however, the (linear) detrending does not affect the interpretability.

3.2 Handling Long-Memory Using ARFIMA Models

The existence of long-range dependence in physical systems in the form of very long cycles and apparently shifting mean levels has been the subject of much study in the last 20 years (see, e.g., Haslett and Raftery 1989, Hosking 1984, Mendelson 1990, and McLeod and Hipel 1978). By long-range dependence, we mean that the spectrum of the process at frequencies $\omega$ close to 0 is proportional to $\omega^{-\alpha}$, $0 < \alpha < 1$, and is unbounded at zero frequency. Existence of such behavior might help to explain the anomalous effects of apparent global warming (Smith 1993). Figure 2 shows that the Granite Canyon SSTs have a sample spectrum that takes very large values at small frequencies, which is characteristic of series having long-range dependence.

Several authors have proposed linear time series models that describe the slowly decaying structure of such series (Hosking 1981; Jonas 1983; Mandelbrot and Van Ness 1968). For example, the ARFIMA model of Granger and Joyeux (1980) and Hosking (1981), a generalization of the Box-Jenkins ARIMA models in which the degree of integration, $d$, is allowed to take fractional values, has long memory. The fractional integration operation $(1 - B)^d$ is defined by the binomial expansion $(1 - B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k$, where $B$ denotes the backward shift operator. The exponent $d$ is related to the long-range dependence parameter $\alpha$ by $\alpha = 1 - 2d$.

We use an ARFIMA model to describe the low-frequency components of the 20 years of daily SSTs at Granite Canyon after first removing the 1-year cycle. Removing the cycle does not change the characteristics of the residual data (Yajima 1988). Because of the large sample size, we estimate the model in two steps.
The long-memory parameter, $d$, is estimated using the periodogram spectral regression method of Geweke and Porter-Hudak (GPH; 1983), with number of regressors equal to $[\sqrt{n}] = 85$. For the 20 years of Granite Canyon SSTs, we obtain $\hat{d} = .37$, with standard error .076. This is consistent with the slope of the estimated log-spectral density for the SSTs shown in Figure 2. Hurvich, Deo, and Brodsky (1996) showed that the GPH estimate of $d$ is consistent assuming that the process is Gaussian. A normal probability plot for the logarithm of the SSTs indicated that a Gaussian assumption is plausible.

The estimate of $d$ is used to approximate the fractional "difference" of the SST data, $(1 - B)^{\hat{d}} X_t$, using the infinite AR representation of the fractional differencing operator truncated at lag 500. The sample ACF and partial autocorrelation function (PACF) of the resulting fractionally differenced data indicate that the series contains short-memory ARMA components. In fact, the fractionally differenced data exhibit behavior consistent with that of an AR(1) process, with estimated coefficient $\hat{\phi} = .606$ (standard error .009). Thus there is strong correlation between SSTs from one day to the next and small, but nonnegligible, correlation between SSTs many days apart. The result of the two-step procedure is an ARFIMA(1, d, 0) model for the logged and detrended SSTs.

The ACF of the residuals from the fitted ARFIMA(1, d, 0) model (Fig. 3a) shows little correlation. However, given the nonlinear nature of the data, it seems highly unlikely that the linear ARFIMA model describes the data adequately. Examining the ACF for the squared residuals, as suggested by Granger and Anderson (1978), shows that the squared residuals retain some correlation (Fig. 3b). The correlation of squared residuals may indicate nonlinearity in the data, possibly in the direction of bilinearity, or a more general model misspecification, perhaps reflecting omission of a covariate regressor. A more general model is discussed in Section […].
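The two-step procedure of Section 3.2 can be sketched as follows, under stated assumptions. The function names are hypothetical, the series is simulated fractional noise rather than the SST data, and the standard error uses the usual asymptotic GPH form with error variance $\pi^2/6$. The weights follow the binomial expansion of $(1 - B)^d$ via the recursion $\pi_0 = 1$, $\pi_k = \pi_{k-1}(k - 1 - d)/k$.

```python
import numpy as np

def frac_diff_weights(d, n_lags):
    """Coefficients pi_k of (1 - B)^d = sum_k pi_k B^k for k = 0, ..., n_lags."""
    w = np.empty(n_lags + 1)
    w[0] = 1.0
    for k in range(1, n_lags + 1):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d, n_lags=500):
    """Apply (1 - B)^d truncated at n_lags; values before the start of x are treated as zero."""
    w = frac_diff_weights(d, min(n_lags, len(x) - 1))
    return np.convolve(x, w)[: len(x)]

def gph_estimate(x, m):
    """Log-periodogram regression estimate of d using the first m Fourier frequencies."""
    n = len(x)
    j = np.arange(1, m + 1)
    lam = 2 * np.pi * j / n
    dft = np.fft.fft(x - np.mean(x))
    log_pgram = np.log(np.abs(dft[j]) ** 2 / (2 * np.pi * n))
    regressor = -np.log(4 * np.sin(lam / 2) ** 2)
    d_hat, _ = np.polyfit(regressor, log_pgram, 1)
    se = np.pi / np.sqrt(6 * np.sum((regressor - regressor.mean()) ** 2))
    return d_hat, se

# Simulated fractional noise with d = 0.3 (assumed value), same length as 20 years of days.
rng = np.random.default_rng(2)
n = 7305
noise = rng.standard_normal(n)
long_memory = frac_diff(noise, -0.3, n_lags=n - 1)  # negative d integrates fractionally

m = int(np.sqrt(n))  # number of regressors, [sqrt(n)]
d_hat, se = gph_estimate(long_memory, m)

# Step two would fit a short-memory ARMA model to this fractionally differenced series.
filtered = frac_diff(long_memory, d_hat, n_lags=500)
```

With `n = 7305` the rule $[\sqrt{n}]$ gives 85 regressors, the number quoted in the text.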
A more general model is discussed in Section Handling Long-Memory Using TSMARS An invertible long-memory ARFIMA(p. d, q) process has an infinite AR representation with AR that decay at w j-d-l as j + O. Thus it is reasonable to try to approximate long-range effects using an AR model with many lags. For a linear fractional noise process (an ARFIMA(0. d. 0) process), Ray (1993) found that an approximating AR(p) model accurately forecasted a fractional noise process at long lead times. Thus we attempt to approximate long-range dependence in ASTAR models by in- cluding many lagged predictor variables. We do not require the AR coefficients to decay as earlier; if the process has long-range dependence, then fitting a long AR model to the data should result in significant AR coefficients at long lags of approximately the correct form. We use TSMARS to model 20 years of the Granite Canyon data (with the 1-year cycle removed) using all lagged SSTs up to lag 100, and every fifth lag thereafter up to lag 1,925, representing lags up to approximately 5 years and 3 months. (Using all lags to 1,925 is computationally infeasible; the limited grid here represents a tenfold increase over the computation times in Lewis and Ray 1993.) We allow the SPAN and MI parameters to vary, to investigate the effect of these parameters on the selected model. Let the log-transformed and detrended process be where St is the SST on day t and the values in parentheses are the standard errors of the estimated coefficients (neglecting the effect of the dependence structure of xt). The sample variance of Xt is 62 = The differences that result from prohibiting as opposed to allowing interactions between predictors in the TSMARS algorithm can be seen from examining the following two models. Using SPAN = 1 and no allowable interactions (MI = 1) when applying the TSMARS algorithm to Xt, the resulting model is as follows:

Threshold TSMARS model for 20 years of SSTs at Granite Canyon, SPAN 1; no interactions allowed (MI = 1):

[…]

with $\hat{\sigma}_{\epsilon}^{2} = \mathrm{var}(X_t - \hat{X}_t) =$ […].

The threshold in the lag-one term in the model indicates that SST is strongly correlated with the previous day's SST and that the dependence structure is nonlinear, with temperature increasing with coefficient […] if the temperature the previous day was more than -.116 and increasing with coefficient .838 if it was less than […]. Given that the minimum value of $X_t$ over the 20 years is -.340, we see that lags 2 and 17 enter the model linearly (no interior threshold). Terms with lags of 8 and 17 probably reflect the fact that the average time between storm fronts in the vicinity of Granite Canyon in the winter is about 8 days. The model does not contain terms with lags greater than 17, as would be expected for a long-range dependent process. This suggests that the long-memory behavior may only be observed in conjunction with other variables; that is, interactively.

[Figure 3: ARFIMA model residuals.]

Figure 3. (a) The Sample ACF for the Residuals From an ARFIMA(1, d, 0) Model Fitted to Granite Canyon SSTs. (b) The Sample ACF for the Squared Residuals. The dotted lines indicate upper and lower bounds on the estimated ACF assuming the residuals are white noise.

When three interaction levels are allowed (MI = 3), the resulting model for $X_t$ has the following form:

Nonlinear long-memory model for 20 years of SSTs at Granite Canyon, SPAN 1; order 3 interactions allowed (MI = 3):

[…] (4)

with $\hat{\sigma}_{\epsilon}^{2} = \mathrm{var}(X_t - \hat{X}_t) =$ […].

The highly nonlinear first-order terms are similar to those of the previous model, but lags 8 and 17, as well as higher-order terms, now enter as interaction terms. (The previous model did not allow interactions between variables.)
The existence of long lag terms in the model (up to 1,425, or slightly less than 4 years) is consistent with long-range dependence in SSTs, although the variables $X_{t-435}$ or $X_{t-1415}$, for example, may not be physically interpretable. In fact, monthly averages corresponding to 14 and 48 months previous may provide more stable and interpretable models in the final analysis. We do not explore that avenue here, but note that TSMARS currently allows

aggregated variables as predictors. The last three-way interaction term is nonzero if $X_{t-2} >$ […] (which always occurs), $X_{t-435} > .071$ (which occurs about half of the time), and $X_{t-1415} <$ […] (which rarely occurs). If the interaction is nonzero, then the contribution to $X_t$ is large and negative. Consequently, the following $X_t$ values take very large positive values, quickly causing the model to become unstable (see Sec. […]).

[Figure 4: k-step-ahead forecast comparisons. Legend: last value; mean value; ARFIMA(1, d, 0); threshold, SPAN 1; interaction, SPAN 25; interaction, SPAN 1.]

Figure 4. The RMSEs of k-step-ahead Forecasts for the Granite Canyon SSTs. The forecast period begins on March 1, 1991, and extends for 620 days.

With SPAN = 25 and allowing up to three-way interactions, the resulting model for $X_t$ has the following form:

Nonlinear long-memory model for 20 years of SSTs at Granite Canyon, SPAN 25; order 3 interactions allowed (MI = 3):

[…] (5)

with $\hat{\sigma}_{\epsilon}^{2} = \mathrm{var}(X_t - \hat{X}_t) =$ […].

The first-order terms are similar to those of the previous models, but there are now no level-2 interactions. SPAN = 50 gives a similar model, but contains only three terms of interaction level 3. As would be expected, using larger SPAN values results in a "smoother" model. The model of (5) has no three-way interactions with very large coefficients, as in the model of (4), and the resulting process appears to be stable.

Equations (4) and (5) are approximations to a model with long-range dependence. We do not know of a specific ASTAR model that is also truly long-range dependent, as opposed to an approximating model. Other types of nonlinear models having long-range dependence, such as fractionally integrated models with ARCH errors or random solutions to the Burgers' equation, have been discussed by Robinson (1994), Rosenblatt (1987), and Taqqu (1987).
In the following sections we compare the estimated linear ARFIMA model and the ASTAR models on the basis of their forecasts, residuals, and model skeletons.

3.4 Comparison of Linear and Nonlinear Models

Out-of-Sample Forecasts. Predicted SSTs are used as input to large-scale weather models and for analysis of long-term environmental trends, so the predictive capability of the models at both short and long forecast horizons is important. Figure 4 shows the RMSEs of k-step-ahead forecasts beginning on March 1, 1991, for the linear and nonlinear models discussed in Sections 3.2 and 3.3. The RMSEs are calculated in the original scale of the data and are shown in °C. We also include the RMSE obtained if the last observed value in the time series is used for prediction. The last observed value is a reasonable predictor for highly correlated data and is typically the standard by which physicists judge forecast ability. The RMSE of the naive forecast (i.e., the sample mean) is also shown. The mean is the typical basis of forecast comparison for statisticians.

The RMSEs were obtained using a moving forecast origin, forecasting k steps ahead, k = 1, ..., 620, and taking the square root of the average squared difference between forecast and actual value. We see that the RMSEs do not increase monotonically in k, probably because the data have nonlinear structure (Casdagli 1992). Accuracy of nonlinear forecasts depends on the region over which the forecasts are made. We extend our forecasts out to 620 days to include more than one year in the forecast region.

All models (except the sample mean) are competitive for small forecast steps (i.e., k = 1, ..., 20). This is not surprising, given the high correlation between adjacent SSTs. However, there are differences at larger forecast steps. For example, at k = 365, the RMSEs of the forecasts range from about 1.35°C using (4) to about 1.74°C using the naive mean forecast. Thus the TSMARS model improves on the naive mean forecast by about 22%. The TSMARS model with SPAN = 1 and MI = 3, the model of (4), appears to be the best predictive ASTAR model. The SPAN 25 model is competitive at steps > 200, differing from the SPAN 1 model by at most about 5%. The linear ARFIMA model performs worse than any of the ASTAR models.

The RMSEs of the ASTAR model forecasts shown in Figure 4 are larger than the measurement errors for the SSTs, which are known to be approximately ±0.1°C, but are smaller than the RMSEs for the forecasts based on the last observed value, the method currently used by oceanographers (Haltiner and Williams 1980). Of course, RMSE measurements are subject to sampling variability, which has not been investigated here.
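The moving-origin RMSE calculation described above can be sketched in Python for the two benchmark forecasts, the last observed value and the sample mean. This is a minimal illustration on a toy series, not the authors' code; the function names are hypothetical, and the sample mean is taken here over all data up to each forecast origin, which is an assumption. The last line reproduces the roughly 22% improvement from the two RMSE values quoted in the text.

```python
import numpy as np

def last_value_forecast(history, k):
    """Forecast every horizon k with the last observed value."""
    return history[-1]

def mean_forecast(history, k):
    """Forecast every horizon k with the sample mean of the history (assumed form)."""
    return float(np.mean(history))

def moving_origin_rmse(series, first_origin, max_k, forecaster):
    """RMSE of k-step-ahead forecasts, k = 1..max_k, averaged over forecast origins.

    An origin of `o` means that observations series[:o] are known; the
    k-step-ahead target is series[o + k - 1].
    """
    series = np.asarray(series, dtype=float)
    n = len(series)
    rmse = np.full(max_k, np.nan)
    for k in range(1, max_k + 1):
        sq_errors = []
        for origin in range(first_origin, n - k + 1):
            history = series[:origin]
            target = series[origin + k - 1]
            sq_errors.append((forecaster(history, k) - target) ** 2)
        if sq_errors:
            rmse[k - 1] = np.sqrt(np.mean(sq_errors))
    return rmse

# Toy series for illustration only.
toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rmse_last = moving_origin_rmse(toy, first_origin=3, max_k=2, forecaster=last_value_forecast)
rmse_mean = moving_origin_rmse(toy, first_origin=3, max_k=2, forecaster=mean_forecast)

# Relative improvement at k = 365 from the RMSEs quoted in the text.
improvement = 1.0 - 1.35 / 1.74
print(rmse_last, rmse_mean, round(improvement, 2))
```

A model-based forecaster would be passed in the same way as the two benchmarks, as a function of the history and the horizon.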
Estimates of the precision of the RMSEs for the models discussed could be obtained by considering the variability of the RMSEs over different segments of the data. From Figure 1 there appears to be a change (at least in overall level) of the SSTs at about the time the forecast period begins (February 28, 1991). There is no known reason for a permanent change in the structure of the SST process around late 1990 or early 1991, but there is evidence of a massive El Niño warming in progress. Alternatively, the apparent changes in behavior around this period could be attributed to the long-range dependence property of the SST data.

What is the effect of the change in SSTs on the forecasts? The forecasts may be less accurate than they could be, because the forecasting model is based on the behavior of the SST process up until February 28, 1991, and there does not appear to have been a comparably broad warming episode in the SST process before that time. Model updating (i.e., reestimation of the model using additional data including the El Niño episode) can be used to investigate changes in process behavior, but this is not computationally feasible for the SST data. But because we are using an adaptive spline-type model, an updated model would accommodate experience in the SST process of a broader kind than that modeled in, for example, (5).

Residual Diagnostics. Figure 5 shows the ACF for the residuals and squared residuals from the TSMARS model of Eq. (4) fitted to the 20 years of SSTs. The dotted lines indicate upper and lower bounds on the estimated ACF assuming that the residuals are white noise. The residuals are uncorrelated, but the ACF for the squared residuals indicates some remaining correlation, although less than that observed for the squared residuals from the ARFIMA model. The ACFs of Figure 5 are similar to those of Figure 3 for the fitted ARFIMA(1, d, 0) process, and both suggest the kind of pattern seen in ARCH processes (see Tong 1991, p. 115); that is, processes having variances that are a function of lagged values of the squared process.

[Figure 5. Best TSMARS Model Residuals. (a) The Sample ACF for the Residuals From Equation (4), the Fitted Interactive TSMARS Model With SPAN 1, for the Granite Canyon SSTs. (b) The Sample ACF for the Squared Residuals.]

The linear ARFIMA model and the TSMARS models of Sections 3.2 and 3.3 assume that the dominant yearly cycle manifests itself both in the mean and the standard deviation of the process. No fitted ASTAR model showed a limit cycle of 365 days, and the cyclic effect on the mean value was removed (approximately) with an additive sinusoidal component for the logarithm of the data. We can also ask whether the structure of the process is changing with time of year. (Note that in an ASTAR model, the structure changes with the level of the process; no cyclic time thresholds have been postulated.) Figure 6 shows the mean, the standard deviation, and the lag 1 serial correlation, $\rho_1$, for each of the 365 days in the year. The plots are formed by averaging across the 20 years of logged data and then smoothing along the time axis. The mean value of the logged data (Fig. 6a) is time-of-year dependent; the standard deviation (Fig. 6b) is not. The value of $\rho_1$ (Fig. 6c) is large and changes with time of year, but not always in the same way that the mean changes. Given the multiplier of $X_{t-1}$ in Equation (5) when $X_{t-1} > -.116$, the size of $\rho_1$ is not surprising. But $\rho_1$ does not remain high whenever $X_{t-1}$ is high. To cope with this nonstationarity in the data, a TSMARS model that introduces time of year as a categorical predictor variable is considered in Section 4.2.

[Figure 6. Averaged and Smoothed (Along the Time Axis) Sample Statistics for the 20 Years of Logged SSTs. Time axis: March 1 to February 28.]

Model Skeletons and Simulated Periodograms. The right side of an equation such as (4) is called the model skeleton. If it is used with the first 1,415 observed $X_t$'s to generate $x_{1416}$, and subsequently $x_{1416}$ and $X_{1415}, \ldots, X_2$ are used to generate $x_{1417}$, and so on, then the resulting output sequence is a trajectory of the skeleton of the estimated ASTAR model. If the initial 1,415 values in the SSTs eventually reappear in this sequence, then the model is said to have a limit cycle. Although Lewis and Stevens (1991) found such a limit cycle for the Wolfer sunspot series using TSMARS, no limit cycles were found for any of the models for SSTs discussed in this article.

If trajectories are generated as just described, but with white noise having variance $\hat{\sigma}_{\varepsilon}^{2}$ added (i.e., $x_{1416}$ is now $x_{1416} + \varepsilon_{1416}$, etc.), then we have a simulated sample path of the ASTAR model. In generating trajectories having different lengths and/or initial values with and without noise, we may find empirically that the process is unstable, meaning that $|X_t|$ tends to infinity with $t$ (see Tong 1990, p. 7 and sec. 4.1). Thus in generating data from the SPAN 1 model of (4) with added noise, we found that the fitted model was in fact stochastically unstable. One can see the cause of this in (4). The multiplier of the last three-way interaction term is so large that it ultimately builds up an unstable oscillation with period 2, 435, or 1,415 days. Clearly, the interaction is complex. Although examination of simulated data from (4) suggests a stochastically unstable model, the process does appear to be deterministically stable. Other studies of dynamical systems have also found a nonsystematic relationship between stability of a skeleton generated with and without noise. Intuitively, it seems that models with small SPAN (little smoothing) may be ultimately unstable due to overfitting of the data, although these models generally seem to provide the best long-term predictions. Nonetheless, the difference in predictive power is not great, as can be seen from Figure 4.

If the sample path of the series remains stable throughout the simulation, as was the case when using (5), then one can look at sample properties of the simulated series, such as the log periodogram versus log frequency, to see whether the model captures the behavior of the series, such as the long-range dependence. By examining a few very long simulated time series, we conclude that the series simulated from the fitted ASTAR models for SSTs appear to behave like long-range dependent processes; that is, they have long cycles, changing mean levels, and large sample spectrum values at low frequencies, even though the estimated models only approximate true long-range dependence. Other model properties can also be investigated via simulation; for example, the sample ACF for series simulated from a fitted ASTAR model can be used to investigate stationarity of the model.

4. SMASTAR AND CASTAR MODELS

A recent implementation of the MARS algorithm allows for categorical predictor variables in MARS (Friedman 1993a). Adapting this implementation to time series implies that if one had m categorical predictor variables and each was constrained in the TSMARS algorithm to enter linearly and without interactions, then each value of each of the categorical predictor variables would contribute a possibly different additive constant to the model for $X_t$ at time t.
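The additive-constant interpretation can be illustrated with a small Python sketch. It simulates a first-order autoregression whose level shifts with a three-level categorical variable and then recovers one additive constant per level by ordinary least squares on indicator columns. This is only an illustration of the idea under stated assumptions: ordinary least squares stands in for the TSMARS fitting procedure, and the offsets, the AR coefficient, the noise level, and all names are hypothetical.

```python
import numpy as np

def one_hot(codes, levels):
    """Indicator (0/1) columns, one per category level."""
    codes = np.asarray(codes)
    return (codes[:, None] == np.asarray(levels)[None, :]).astype(float)

# Hypothetical simulation settings, chosen only for illustration.
levels = [1, 2, 3]
offsets = {1: 0.5, 2: -0.2, 3: 0.0}   # additive constant for each category
phi = 0.8                             # AR(1) coefficient
rng = np.random.default_rng(0)
n = 500

cat = rng.integers(1, 4, size=n)      # category observed at each time t (values 1..3)
x = np.zeros(n)
for t in range(1, n):
    x[t] = offsets[int(cat[t])] + phi * x[t - 1] + 0.01 * rng.standard_normal()

# Regress X_t on the category indicators and X_{t-1}; there is no separate
# intercept, so each indicator coefficient is that category's additive constant.
design = np.column_stack([one_hot(cat[1:], levels), x[:-1]])
coef, *_ = np.linalg.lstsq(design, x[1:], rcond=None)
print(np.round(coef, 2))
```

The first three fitted coefficients play the role of the category-specific constants and the fourth is the common autoregressive coefficient; allowing interactions between the indicators and the lagged value would instead give each category its own autoregressive structure.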
If interactions among the categorical variables are allowed, then more complex patterns of dependence of $X_t$ can occur. If interactions between categorical and ordinal variables are allowed, then a different AR threshold model can be obtained for each different combination of the categorical variables. We include categorical predictor variables in TSMARS to obtain semimultivariate models of SSTs. For example, the SSTs may be a function of past values of the SSTs, present or past wind speeds, and present or past wind directions, the last being categorical predictors.

4.1 Modeling SSTs Using Wind Speeds and Directions

Wind speeds and wind directions are two of the most important meteorological variables affecting SSTs, but they are difficult and expensive to measure reliably. For Granite Canyon, wind speed and direction data are available from January 1, 1986 to March 31, 1991. The direction from which the wind blows at Granite Canyon is a circular variable measured from 0° to 360°. We code it as a categorical variable (WD) using the categories

1 = East: 0° ≤ WD ≤ 45° or 315° < WD ≤ 360°;
2 = North: 45° < WD ≤ 135°;
3 = West: 135° < WD ≤ 225°;
4 = South: 225° < WD ≤ 315°.

Days in which no wind or only light airs are reported receive a code of 5. The wind categorical variable is not ordered, so that categories East and South are not adjacent, but TSMARS allows these categories to be combined if the SSTs are found to have similar behavior when the wind blows from either direction. Categorizing the wind direction gives information concerning the broad directional effect and leads to interpretable models. We call models that include categorical predictor variables categorical ASTAR (CASTAR) models.

The effect of wind direction on SSTs is shown in Figure 7. Using the 5 years and 3 months of data, the figure depicts the range of temperatures reported for each wind category when temperature lags wind direction by 1, 6, 11, or 16 days. Clearly, the SSTs depend on the wind direction; they tend to be lower when the wind blows from the northwest (Categories 2 and 3). The dependence is strongest at lag 1. In fact, oceanographers agree (Hidaka 1954) that the spring transition is driven by the wind shifting to blow from the northwest in the spring. A good descriptive model should show this nonlinear effect on SSTs.

[Figure 7. The Effect of Wind Direction on the SSTs at Granite Canyon From January 1, 1986 to March 31, 1991. Four panels show the logarithm of sea surface temperature by wind direction category when temperature lags wind direction by 1, 6, 11, and 16 days.]

We model logged SST ($X_t$) using 10 lags of wind direction ($WD_{t-i}$), 10 lags of the logarithm of (1 + wind speed), $WS_{t-i}$, and 50 lags of logged SSTs ($X_{t-j}$). Here we do not remove the 1-year cycle from the data, but include sine and cosine terms as potential predictor variables to investigate whether the cyclic effects are wind driven. The SPAN parameter in the TSMARS model is set equal to 10. The fitted CASTAR model, given next, makes explicit the relationship seen in Figure 7 between the wind direction and the SSTs:

Model for 5 years of logged SSTs using wind direction and log of (1 + wind speed):

[Equation (6)]

The fitted model has the following properties:

1. The logged SSTs range upward from a minimum of 2.13 over this 5-year period, and the first, second, and fifth terms in (6) (not counting the constant term) are functions of lagged SSTs. The first term is always present because the threshold of 2.13 is the minimum value of the logged SST series. The third term adds to $x_t$ only when $X_{t-34}$ is less than 2.22, which rarely happens because the minimum value of the logged SSTs is 2.13. The three-way interaction between logged SSTs in the fifth term occurs infrequently. Lags 1, 7, and 17 were also present in the models of (3), (4), and (5).

2. The change in $x_t$ when the wind blows from the northwest on the previous day, which is explicit in the third and fourth terms of (6), agrees with the relationship seen in Figure 7. Splitting the two terms into three by separating out directions 1, 2, and 3, we see that direction 1 causes a slight increase in the overall level (+.013), while directions 2 and 3 cause slight decreases (for example, .013 − .035 = −.022).

3. Both the third and fourth three-way interaction terms involve $X_{t-49}$. Only one of these terms is present for a given t, depending on whether $X_{t-49}$ is greater than or less than 2.51 and whether the wind direction is in category 2 or 3. When $X_{t-49}$ is larger than 2.51 and the wind speed is greater than 3.00 knots, $X_t$ and $X_{t-49}$ are coupled. This kind of oscillation was observed by Breaker and Lewis (1988) for several sets of SSTs, including the one at Granite Canyon. The last two terms of the CASTAR model show that the "oscillation" is present only when the wind direction is from the northwest ($WD_{t-1} \in \{2, 3\}$), and the oscillation increases as lag 1 wind speed increases. Attempts to understand the oscillation with complex demodulation models were fruitless.

4. The $WS_t$ = log(1 + wind speed) thresholds, which are selected automatically by the TSMARS algorithm, have a meteorological interpretation.
A transformed wind speed threshold of 1.10 knots translates into 1.03 m/sec, below which wind speeds have little effect on SSTs. Looking at the seventh term in the CASTAR model, a transformed wind speed greater than 3.00 knots, which translates into 9.82 m/sec, lowers SSTs when the wind blows from the northwest and the logged temperature 49 days ago exceeded 2.51.

Note that in deriving (6), we did not subtract a time-of-year cycle from either the transformed wind speeds or the transformed SSTs. As noted before, subtracting the cycle from the data destroys the relationship of the thresholds to the physical measurements. It may be better to first remove the cyclic trend if one were to make forecasts from the model. To forecast more than one step ahead requires forecasts for wind speeds and directions. One solution (Lewis and Ray 1993) is to use TSMARS to fit a semi-multivariate model for the wind speeds, as well as for the SSTs, and use these models for prediction.

4.2 Modeling Cyclic Nonstationarity in the SSTs Using Seasonal Categorical Variables

The residual plots in Section 3 (Figs. 3 and 5) indicated that the squares of the residuals for the fitted ARFIMA and ASTAR models are not uncorrelated sequences. This result can perhaps be explained by the fact that the 1-year cycle manifests itself in the SSTs not only in the mean of the time series, but also in its lag-1 serial correlation. A categorical variable $C_t$ taking value 1, 2, 3, 4, 5, or 6, depending on whether t corresponds to the first 2 months of the year, the second 2 months of the year, and so on, can be included as an additional predictor variable to capture more of the yearly cycle. The resulting model can be thought of as a nonlinear generalization of the periodically stationary models of Gladyev (1961). However, it has a much larger periodicity (365) than typically considered in the literature. (See McLeod 1994 for details concerning linear modeling of periodically correlated time series.) Preliminary results concerning the use of TSMARS for nonlinear modeling of periodically correlated monthly data having period 1 year have been reported in another work (Lewis and Ray, unpublished manuscript). Additional research into the properties of such models is ongoing.

5. DISCUSSION AND DIRECTIONS FOR FUTURE RESEARCH

We have demonstrated how the TSMARS algorithm can be used to obtain ASTAR, SMASTAR, and CASTAR models for a set of SSTs having complex nonlinear and cyclic patterns. The ASTAR models obtained using TSMARS are able to model nonlinear effects and long-range dependence. The ASTAR models with a fixed cycle give better long-term predictions than the linear ARFIMA model analyzed here. None of the skeletons produced from the ASTAR models for SSTs indicates a limit cycle, but the number of cycles present (20) may be too small to induce this behavior. A model containing a limit cycle would be better than one in which a 1-year harmonic must be subtracted from the data for adequate fitting.

The SMASTAR model that includes wind direction and wind speed is interpretable by oceanographers. Predictions for more than one step ahead are difficult with the SMASTAR model, however. Modeling several series with separate SMASTAR models is one solution, but to have a complete multivariate model, one needs to postulate or empirically derive a simple model for the resultant error sequences. Alternatively, an extension of TSMARS to the multivariate case, analogous to multivariate nonparametric regression modeling, could be developed for nonlinear multivariate time series modeling. This is an area for future research.

[Received June. Revised December.]

REFERENCES

Breaker, L. C., and Lewis, P. A. W.
(1988), "A Day Oscillation in Sea-Surface Temperature Along the Central California Coast," Estuarine, Coastal, and Shelf Science, 26.
Breaker, L. C., Lewis, P. A. W., and Orav, E. J. (1983), "Analysis of a 12-Year Record of Sea-Surface Temperatures Off Point Sur, California," Report NPS, Naval Postgraduate School, Dept. of Operations Research.
Broomhead, D. S., Huke, J. P., and Muldoon, M. R. (1992), "Linear Filters and Nonlinear Systems," Biometrika, 54.
Casdagli, M. (1992), "Nonlinear Forecasting, Chaos, and Statistics," in Modeling Complex Phenomena, eds. L. Lam and V. Naroditsky, New York: Springer-Verlag.
Chatfield, C. (1995), "Model Uncertainty, Data Mining, and Statistical Inference" (with discussion), Journal of the Royal Statistical Society, Ser. A, 158, 419-466.
Friedman, J. H. (1991), "Multivariate Adaptive Regression Splines," The Annals of Statistics, 19.
Friedman, J. H. (1993a), "Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines," in New Directions in Statistical Data Analysis and Robustness, eds. S. Morgenthaler, E. Ronchetti, and W. A. Stahel, Boston: Birkhäuser-Verlag.
Friedman, J. H. (1993), "Fast MARS," Technical Report 110, Stanford University, Dept. of Statistics.
Geweke, J., and Porter-Hudak, S. (1983), "The Estimation and Application of Long Memory Time Series Models," Journal of Time Series Analysis, 4.
Gladyev, E. G. (1961), "Periodically Correlated Random Sequences," Soviet Mathematics (Doklady Akademii Nauk), 2.
Granger, C. W., and Anderson, A. P. (1978), "An Introduction to Bilinear Time Series Models," Vandenhoeck and Ruprecht.
Granger, C. W. J., and Joyeux, R. (1980), "An Introduction to Long-Memory Time Series Models and Fractional Differencing," Journal of Time Series Analysis, 1.
Haltiner, G. J., and Williams, R. T. (1980), Numerical Prediction and Dynamic Meteorology, New York: Wiley.
Haslett, J., and Raftery, A. E. (1989), "Space-Time Modelling With Long-Memory Dependence: Assessing Ireland's Wind Power Resource," Applied Statistics, 38.
Hidaka, K. (1954), "A Contribution to the Theory of Upwelling and Coastal Currents," Transactions of the American Geophysical Union, 35.
Hosking, J. R. M. (1981), "Fractional Differencing," Biometrika, 68.
Hosking, J. R. M. (1984), "Modeling Persistence in Hydrological Time Series Using Fractional Differencing," Water Resources Research, 20.
Hurvich, C., Deo, R., and Brodsky, J. (in press), "The Mean-Squared Error of Geweke and Porter-Hudak's Estimator of the Memory Parameter of a Long Memory Time Series," submitted to Journal of Time Series Analysis.
Jonas, A. B. (1983), "Persistent Memory Random Processes," Ph.D. thesis, Harvard University.
Lewis, P. A. W., and Ray, B. K. (1993), "Nonlinear Modelling of Multivariate and Categorical Time Series Using Multivariate Adaptive Regression Splines," in Dimension Estimation and Models, ed. H. Tong, Singapore: World Scientific.
Lewis, P. A. W., Ray, B. K., and Stevens, J. (1993), "Modelling Time Series Using Multivariate Adaptive Regression Splines (MARS)," in Time Series Prediction: Forecasting the Future and Understanding the Past, eds. A. Weigend and N. Gershenfeld, Santa Fe Institute, Santa Fe, NM: Addison-Wesley.
Lewis, P. A. W., and Stevens, J. G. (1991), "Nonlinear Modeling of Time Series Using Multivariate Adaptive Regression Splines (MARS)," Journal of the American Statistical Association, 87.
Lewis, P. A. W., and Stevens, J. G. (1992), "Semi-Multivariate Modeling of Time Series Using Adaptive Spline Threshold Autoregressions (ASTAR)," preprint.
Mandelbrot, B. B., and Van Ness, J. W. (1968), "Fractional Brownian Motions, Fractional Noises, and Applications," SIAM Review, 10.
McLeod, A. I. (1994), "Diagnostic Checking of Periodic Autoregression Models With Application," Journal of Time Series Analysis, 15.
McLeod, A. I., and Hipel, K. W. (1978), "Preservation of the Rescaled Adjusted Range. 1. A Reassessment of the Hurst Phenomenon," Water Resources Research, 14.
Mendelson, R. (1990), "Relating Fisheries to the Environment in the Gulf of Guinea: Information, Causality and Long-Term Memory," in Les Pêcheries Ouest-Africaines: Variabilité, instabilité et changement, eds. P. Cury and C. L. Roy, Paris: ORSTOM.
Ray, B. K. (1993), "Modeling Long Memory Processes for Optimal Long-Range Prediction," Journal of Time Series Analysis, 14.
Robinson, P. M. (1994), "Time Series With Strong Dependence," in Advances in Econometrics: Sixth World Congress, Vol. 1, ed. C. Sims, Cambridge, U.K.: Cambridge University Press.
Rosenblatt, M. (1987), "Scale Renormalization and Random Solutions of the Burgers Equation," Journal of Applied Probability, 24.
Royer, T. C. (1991), "High-Latitude Oceanic Variability Associated With the 18.6-Year Nodal Tide," Journal of Geophysical Research, 98.
Schwarz, G. (1978), "Estimating the Dimension of a Model," The Annals of Statistics, 6.
Smith, R. L. (1993), "Long-Range Dependence and Global Warming," in Statistics for the Environment, eds. V. Barnett and K. F. Turkman, New York: Wiley.


More information

Revisiting linear and non-linear methodologies for time series prediction - application to ESTSP 08 competition data

Revisiting linear and non-linear methodologies for time series prediction - application to ESTSP 08 competition data Revisiting linear and non-linear methodologies for time series - application to ESTSP 08 competition data Madalina Olteanu Universite Paris 1 - SAMOS CES 90 Rue de Tolbiac, 75013 Paris - France Abstract.

More information

NOTES AND PROBLEMS IMPULSE RESPONSES OF FRACTIONALLY INTEGRATED PROCESSES WITH LONG MEMORY

NOTES AND PROBLEMS IMPULSE RESPONSES OF FRACTIONALLY INTEGRATED PROCESSES WITH LONG MEMORY Econometric Theory, 26, 2010, 1855 1861. doi:10.1017/s0266466610000216 NOTES AND PROBLEMS IMPULSE RESPONSES OF FRACTIONALLY INTEGRATED PROCESSES WITH LONG MEMORY UWE HASSLER Goethe-Universität Frankfurt

More information

An Evaluation of Errors in Energy Forecasts by the SARFIMA Model

An Evaluation of Errors in Energy Forecasts by the SARFIMA Model American Review of Mathematics and Statistics, Vol. 1 No. 1, December 13 17 An Evaluation of Errors in Energy Forecasts by the SARFIMA Model Leila Sakhabakhsh 1 Abstract Forecasting is tricky business.

More information

5 Autoregressive-Moving-Average Modeling

5 Autoregressive-Moving-Average Modeling 5 Autoregressive-Moving-Average Modeling 5. Purpose. Autoregressive-moving-average (ARMA models are mathematical models of the persistence, or autocorrelation, in a time series. ARMA models are widely

More information
