Seasonal Autoregressive Integrated Moving Average Model for Precipitation Time Series

Similar documents
TIME SERIES ANALYSIS AND FORECASTING USING THE STATISTICAL MODEL ARIMA

FORECASTING SUGARCANE PRODUCTION IN INDIA WITH ARIMA MODEL

Univariate ARIMA Models

Forecasting using R. Rob J Hyndman. 2.4 Non-seasonal ARIMA models. Forecasting using R 1

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis

Time Series Analysis Model for Rainfall Data in Jordan: Case Study for Using Time Series Analysis

Application of Time Sequence Model Based on Excluded Seasonality in Daily Runoff Prediction

Time Series I Time Domain Methods

{ } Stochastic processes. Models for time series. Specification of a process. Specification of a process. , X t3. ,...X tn }

Forecasting of meteorological drought using ARIMA model

Stat 565. (S)Arima & Forecasting. Charlotte Wickham. stat565.cwick.co.nz. Feb

Time Series Analysis of Monthly Rainfall data for the Gadaref rainfall station, Sudan, by Sarima Methods

Dynamic Time Series Regression: A Panacea for Spurious Correlations

Modelling Monthly Rainfall Data of Port Harcourt, Nigeria by Seasonal Box-Jenkins Methods

Empirical Approach to Modelling and Forecasting Inflation in Ghana

STAT 436 / Lecture 16: Key

Estimation of Parameters of Multiplicative Seasonal Autoregressive Integrated Moving Average Model Using Multiple Regression

at least 50 and preferably 100 observations should be available to build a proper model

Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting)

Application of ARIMA Models in Forecasting Monthly Total Rainfall of Rangamati, Bangladesh

TIME SERIES DATA PREDICTION OF NATURAL GAS CONSUMPTION USING ARIMA MODEL

Rainfall Drought Simulating Using Stochastic SARIMA Models for Gadaref Region, Sudan

Modeling climate variables using time series analysis in arid and semi arid regions

FE570 Financial Markets and Trading. Stevens Institute of Technology

Modeling and forecasting global mean temperature time series

Forecasting using R. Rob J Hyndman. 2.5 Seasonal ARIMA models. Forecasting using R 1

Lesson 13: Box-Jenkins Modeling Strategy for building ARMA models

Minitab Project Report - Assignment 6

Time Series Analysis -- An Introduction -- AMS 586

Time Series Analysis of United States of America Crude Oil and Petroleum Products Importations from Saudi Arabia

Chapter 3: Regression Methods for Trends

STAT 443 Final Exam Review. 1 Basic Definitions. 2 Statistical Tests. L A TEXer: W. Kong

EASTERN MEDITERRANEAN UNIVERSITY ECON 604, FALL 2007 DEPARTMENT OF ECONOMICS MEHMET BALCILAR ARIMA MODELS: IDENTIFICATION

Estimation and application of best ARIMA model for forecasting the uranium price.

FORECASTING THE INVENTORY LEVEL OF MAGNETIC CARDS IN TOLLING SYSTEM

Forecasting Area, Production and Yield of Cotton in India using ARIMA Model

A Comparison of the Forecast Performance of. Double Seasonal ARIMA and Double Seasonal. ARFIMA Models of Electricity Load Demand

Multiplicative Sarima Modelling Of Nigerian Monthly Crude Oil Domestic Production

Stochastic Generation and Forecasting Of Weekly Rainfall for Rahuri Region

STA 6857 ARIMA and SARIMA Models ( 3.8 and 3.9)

University of Oxford. Statistical Methods Autocorrelation. Identification and Estimation

STA 6857 ARIMA and SARIMA Models ( 3.8 and 3.9) Outline. Return Rate. US Gross National Product

Forecasting the Prices of Indian Natural Rubber using ARIMA Model

ARMA models with time-varying coefficients. Periodic case.

Development of Demand Forecasting Models for Improved Customer Service in Nigeria Soft Drink Industry_ Case of Coca-Cola Company Enugu

ARIMA Models. Jamie Monogan. January 25, University of Georgia. Jamie Monogan (UGA) ARIMA Models January 25, / 38

AR(p) + I(d) + MA(q) = ARIMA(p, d, q)

Marcel Dettling. Applied Time Series Analysis SS 2013 Week 05. ETH Zürich, March 18, Institute for Data Analysis and Process Design

MODELING INFLATION RATES IN NIGERIA: BOX-JENKINS APPROACH. I. U. Moffat and A. E. David Department of Mathematics & Statistics, University of Uyo, Uyo

Ch 6. Model Specification. Time Series Analysis

ARIMA Models. Jamie Monogan. January 16, University of Georgia. Jamie Monogan (UGA) ARIMA Models January 16, / 27

Sugarcane Productivity in Bihar- A Forecast through ARIMA Model

Lecture 5: Estimation of time series

Part 1. Multiple Choice (50 questions, 1 point each) Part 2. Problems/Short Answer (10 questions, 5 points each)

Box-Jenkins ARIMA Advanced Time Series

Time Series Analysis of Currency in Circulation in Nigeria

MCMC analysis of classical time series algorithms.

Univariate Time Series Analysis; ARIMA Models

MODELING MAXIMUM MONTHLY TEMPERATURE IN KATUNAYAKE REGION, SRI LANKA: A SARIMA APPROACH

Study of Time Series and Development of System Identification Model for Agarwada Raingauge Station

Suan Sunandha Rajabhat University

FORECASTING OF COTTON PRODUCTION IN INDIA USING ARIMA MODEL

0.1 ARIMA: ARIMA Models for Time Series Data

Modelling Monthly Precipitation in Faridpur Region of Bangladesh Using ARIMA

INTERNATIONAL JOURNAL OF ENVIRONMENTAL SCIENCES Volume 6, No 6, Copyright by the authors - Licensee IPA- Under Creative Commons license 3.

Available online at ScienceDirect. Procedia Computer Science 72 (2015 )

A SEASONAL TIME SERIES MODEL FOR NIGERIAN MONTHLY AIR TRAFFIC DATA

ARIMA modeling to forecast area and production of rice in West Bengal

Forecasting using R. Rob J Hyndman. 3.2 Dynamic regression. Forecasting using R 1

TIME SERIES MODELING OF MONTHLY RAINFALL IN ARID AREAS: CASE STUDY FOR SAUDI ARABIA

TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE

Circle a single answer for each multiple choice question. Your choice should be made clearly.

Automatic seasonal auto regressive moving average models and unit root test detection

Firstly, the dataset is cleaned and the years and months are separated to provide better distinction (sample below).

STAT Financial Time Series

5 Autoregressive-Moving-Average Modeling

ARIMA Modelling and Forecasting

Volume 11 Issue 6 Version 1.0 November 2011 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc.

Implementation of ARIMA Model for Ghee Production in Tamilnadu

A time series is called strictly stationary if the joint distribution of every collection (Y t

The log transformation produces a time series whose variance can be treated as constant over time.

A stochastic modeling for paddy production in Tamilnadu

arxiv: v1 [stat.co] 11 Dec 2012

Homework 4. 1 Data analysis problems

Estimating AR/MA models

UNIVARIATE TIME SERIES ANALYSIS BRIEFING 1970

Design of Time Series Model for Road Accident Fatal Death in Tamilnadu

Intervention Model of Sinabung Eruption to Occupancy of the Hotel in Brastagi, Indonesia

Time Series Forecasting: A Tool for Out - Sample Model Selection and Evaluation

Forecasting. Simon Shaw 2005/06 Semester II

The ARIMA Procedure: The ARIMA Procedure

Forecasting Bangladesh's Inflation through Econometric Models

The Fitting of a SARIMA model to Monthly Naira-Euro Exchange Rates

Some Time-Series Models

Trends and Multifractal Analyses of Precipitation Data from Shandong Peninsula, China

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION MAS451/MTH451 Time Series Analysis TIME ALLOWED: 2 HOURS

of seasonal data demonstrating the usefulness of the devised tests. We conclude in "Conclusion" section with a discussion.

1 Time Series Concepts and Challenges

Lecture 19 Box-Jenkins Seasonal Models

Transcription:

Journal of Mathematics and Statistics 8 (4): 500-505, 2012 ISSN 1549-3644 2012 doi:10.3844/jmssp.2012.500.505 Published Online 8 (4) 2012 (http://www.thescipub.com/jmss.toc) Seasonal Autoregressive Integrated Moving Average Model for Precipitation Time Series 1 Xinghua Chang, 2 Meng Gao, 1 Yan Wangand 2 Xiyong Hou 1 School of Mathematics and Information Sciences, Yantai University, Yantai, 264005, China 2 Key Laboratory of Coastal Zone Environmental Processes, Yantai Institute of Coastal Zone Research, CAS, Yantai, 264003, China Received 2012-09-07, Revised 2013-01-10; Accepted 2013-02-20 ABSTRACT Predicting the trend of precipitation is a difficult task in meteorology and environmental sciences. Statistical approaches from time series analysis provide an alternative way for precipitation prediction. The ARIMA model incorporating seasonal characteristics, which is referred to as seasonal ARIMA model was presented. The time series data is the monthly precipitation data in Yantai, China and the period is from 1961 to 2011. The model was denoted as SARIMA (1, 0, 1) (0, 1, 1) 12 in this study. We first analyzed the stability and correlation of the time series. Then we predicted the monthly precipitation for the coming three yesrs. The results showed that the model fitted the data well and the stochastic seasonal fluctuation was sucessfuly modeled. Seasonal ARIMA model was a proper method for modeling and predicting the time series of monthly percipitation. Keywords: Time Series Analysis, SARIMA Model, Forecasting 1. INTRODUCTION ARIMA(p, d, q) model, we should also consider some periodical time series. The periodicity of periodical time Autoregressive Integrated Moving Average Model series is usually due to seasonal changes (including (ARIMA), is a widely used time series analysis model in monthly, quarterly and degree of weeks change) or some statistics. ARIMA model was firstly proposed by Box other natural reasons. We can build pure seasona A and Jenkins in the early 1970s, which is often termed as ARIMA(P,D,Q) model (He, 2004) with the time series Box-Jenkins model or B-J model for simplicity (Stoffer date in different cycle and the same phase, the and Dhumway, 2010). ARIMA is a kind of short-term parameters P, D and Q are the relevant seasonal prediction model in time series analysis. Because this autoregressive parameter, seasonal integrated parameter method is relatively systematic, flexible and can grasp and seasonal moving average parameter. more original time series information, it is widely used in Considering the data relation, we can build a meteorology, engineering technology, Marine, economic multiplication seasonal SARIMA(p, d, q)(p, D, Q) s statistics and prediction technology, (Kantz and model, (Wang et al., 2008). The model has been Schreiber, 2004; Cryer and Chan, 2008). successfully applied in many subjects. In practical The general ARIMA model is also applicable for applications, the order of model SARIMA is usually not non-stationary time series that have some clearly too large (Guo, 2009). If the period of time sereis equals identifiable trends (Stoffer and Dhumway, 2010). We to 12, it can be denoted as SARIMA(p, d, q)(p, D, Q) 12. usually denote ARIMA model as ARIMA(p, d, q), where In the adjustment of the season, this is a very convenient, P and q are non-negative integers that correspond to the steady model. order of the autoregressive, integrated and moving In this study, we will take the monthly precipitation average parts of the model, respectively. In addition to time series as an example to build an seasonal ARIMA the general ARIMA model, namely non-seasonal model and then forecast the precipitation in the next few Corresponding Author: Xinghua Chang, School of Mathematics and the Information Sciences, Yantai University, Yantai, 264005, China 500

years. Specifically, in a seasonal ARIMA model, once we have smoothed the data and identified the parameters D and d, other parameters P, Q, P and q can be preliminarily identified from the ACF and PACF of the stationary processing series. Other related technologies were also used in the study. 2. MATERIALS AND METHODS 2.1. Seasonal ARIMA Model The general form of seasonal model SARIMA(p, d, q) (P, D, Q)s is given by: Φ (B ) ϕ(b) x =Θ (B ) θ (B)w (1) s D d s P s t Q t where, {w t } is the nonstationary time series, {w t } is the usual Gaussian white noise process. s is the period of the time series. The ordinary autoregressive and moving average components are represented by polynomials φ(b) and θ (B) of orders p and q. The seasonal autoregressive and moving average components are Φ p (B s ) and Θ Q (B s ), where P and Q are their orders. d and are ordinary and seasonal difference components. B is the backshift operator. The expressions are shown as follows: ϕ (B) = 1 ϕ B ϕ B L ϕ B 2 p 1 2 p Φ (B ) = 1 Φ B Φ B L Φ B s s 2s ps P 1 2 p θ (B) = 1+θ B+θ B + L+θ B 2 q 1 2 q Θ (B ) = 1+Θ B +Θ B + L +Θ B s s 2s Qs Q 1 2 Q d = D s k t (1 B) x t k d = (1 B ) B x = s D In this study, we concentrate on monthly precipitation time series. If the seasonal period of the series s = 12. It is clear that we may then rewrite Equation (1) as: Φ (B ) ϕ(b) x =Θ (B ) θ (B)w (2) 12 D d 12 P 12 t Q t 2.2. Model Identification In the tentative specification phase, namely model identification, the goal is to employ computationally simple techniques to narrow down the range of parsimonious models. The B-J method is only suitable for stationary time D s 501 series data. In such case, we should possibly observe time series graph and transform the data appropriately. First, we should construct a time plot of the data and inspect the graph for any anomalies (Cryer and Chan, 2008). If the variance grows with time, it will be necessary stabilize the variance. The next step is to identify preliminary values of autoregressive order P, the order of differencing d, the moving average order q and their corresponding seasonal parameters P, D and Q. Here, the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) are the most important elements (Stoffer and Dhumway, 2010). The ACF measures the amount of linear dependence between observations in a time series that are separated by a lag q. The PACF helps to determine how many autoregressive terms p is necessary. The parameter d is the order of difference frequency from non-stationary time series to stationary series. Furthermore, a time series plot and ACF of data will typically suggest whether any differencing is needed. If differencing is called for, the time plot will show some kind of linear trend. When preliminary values of D and d have been fixed, D d the next step is to check the ACF and PACF of 12 xt to determine the values of P, Q, P and q. We can further choose parameters using Akaike s Information Criterion (AIC) to determine the values of parameters (Stoffer and Dhumway, 2010). 2.3. Parameters Estimation Once the model is tentatively established, the parameters and the corresponding standard errors can be estimated using statistical techniques, such as Maximum Likelihood (ML), least square estimation method and Yule-Walker. 2.4. Diagnostic Checking Generally, this step includes the analysis of the residuals as well as model comparisons. If the model fits well, the standardized residuals should behave as an independent and identically distributed sequence with mean zero and variance one (Cryer and Chan, 2008). A standardized residuals plot or a Q-Q plot can help in identifying the normality (Stoffer and Dhumway, 2010). The model should pass the parametric test and diagnostic check. 2.5. Fitting and Prediction Once a model has been identified and all the parameters have been estimated, we can predict future values of a time series with this model.

Fig. 1. Time sereis of monthly precipitation data for Yantai station Table 1. Basic statistics for Yantai monthly precipitation data (in mm) No of observation Mean St Dev Median Min Max 612 15.292 10.419 13.119 0 69.427 2.6. Data In this study, the time series is the monthly percipitation data from Yantai, a coastal city in China. The annual mean temperature in Yantai is 12 C and the annual precipitation is 620 mm. The data processing tool is the free statistical software R. Time series plot is shown in Fig. 1. The descriptive statistics for our data are summarized in Table 1. 3. RESULTS The ACF and PACF of the original data {x t }, t = 1,2,,612, are shown in Fig. 2. The ACF and Fig. 1 show a seasonal fluctuation occur every 12 month, resulting in s = 12 (Wang, 2008; Momani and Naill, 2009). Concentrating on the ACF of original data, we note a slow decreasing trend in the ACF peaks at seasonal lags, h = 1s, 2s, 3s, 4s, where s = 12. It indicates a nonstationary behavior and suggests a seasonal difference. Figure 3 shows the ACF and PACF of the deseasonalized precipitation data. The ACF decreases to zero exponentially indicating a stationary behavior 502 (Stoffer and Dhumway, 2010; Han et al., 2008). Then the SARIMA (p, 0, q)(p, 1, Q) 12 model could be fitted to the de-seasonalized data. From ACF of the stationary series, we can see the ACF peak at h =1s; while for PACF, it peaks at h =1s, 2s,,6s. This phenomenon means that the ACF is cutting off after lag 1s and the PACF is tailing off in the seasonal lags. So we can build two models: (i) an SAR model of order Q = 1, or (ii) an SARMA of orders P = 1,2,,6 and Q = 1. The characteristic of graph turns out model (i) is much better. Inspecting the ACF and PACF at lags h = 1, 2,,11, it appears that either: (a) ACF and PACF are both tailing off; (b) PACF cuts off at lag 1, ACF tails off; (c) ACF cuts off at lag 1, PACF tails off. The result indicates that we should consider the following models and choose a better model based on AIC, AICc and BIC criteria. The optional models and the correlation values are shown in Table 2. Obviously, model SARIMA (1,0,1) (0,1,1) 12 has the smallest value of AIC, AICc and BIC and then we temporarily have a model SARIMA (1,0,1) (0,1,1) 12. As a rule of thumb in SARIMA modeling, we need to minimize the sum squared of residuals (RSS) and the number of model parameters. We had considered this message when calculating the related values (Stoffer and Dhumway, 2010).

(a) Fig. 2. (a) Autocorrelation (ACF) and (b) Partial Autocorrelation (PACF) for original time series of monthly precipitation data (b) (a) Fig. 3. (a) Autocorrelation (ACF) and (b) Partial Autocorrelation (PACF) for first order seasonal differencing and de-seasonal original precipitation data (b) The model parameters are estimated using Maximum Likelihood Estimation. The related parameters are shown in Table 3, where s.e stands for the standard deviation. It can be observed the parameters of model SARIMA(1, 0, 1)(0, 1, 1) 12 are all significant. Then we plug the related parameter into the Equation 2 and 3 the fitted model in this case is: ϕ(b) x =Θ (B ) θ (B)w (3) 1 12 12 t 1 t The diagnostics for the model SARIMA(1, 0, 1)(0, 1, 1) 12 is displayed in Fig. 4 and 5. The standardized residual shows no obvious patterns, although there are a 503 few suspicious values and unusual values (Kantz and Schreiber, 2004). The model fits well although a small amount of autocorrelation still remains. Moreover, we use the Ljung-Box test to examine the independence of the residuals. The p-values of Q-statistic for the first 12 lags of the model are shown in Fig. 5. Finally, predictions based on the fitted model for the next three years are shown in Fig. 5. The model SARIMA (1, 0, 1) (0, 1, 1) 12 could be written as Equation 4: (1 ϕ B) x = (1+Θ B )(1+θ B)w (4) 12 1 12 t 1 1 t

Fig. 4. Diagnostic for the SARIMA (1, 0, 1) (0,1,1) 12 model Fig. 5. Predicted and real values of monthly precipitation 504

Table 2. Optional models and the related standards values Models AIC AICc BIC SARIMA (0,0,1) (0,1,1) 12 5.114070 5.117445 4.135720 SARIMA (1,0,1) (0,1,1) 12 5.098611 5.102041 4.127479 SARIMA (0,0,0) (0,1,1) 12 5.123903 5.127235 4.138337 SARIMA (1,0,1) (0,1,1) 12 5.112628 5.116003 4.134278 Table 3. Estimates of the model parameters Model parameter ----------------------------- Model φ1 θ1 Φ1 RSS K SARIMA (1,0,1) (0,1,1) 12 0.9709-0.9219-0.8223 36398.08 4 s.c. 0.0245 0.0376 0.0257 Or: (1 ϕ B)(1 B )x = (1+Θ B )(1+θ B)w 12 12 1 t 1 1 t The equation can be multiplied and written in the following form that is used in forecasting, the values of the correlation coefficient as shown in Table 3, Equation 5: x =ϕ x + x -ϕ x t 1 t 1 t 12 1 t 13 + w +θ w +θ w +Θθ w t 1 t 1 1 t 12 1 1 t 13 (5) Finally, The comparison between the real values and the fitted value is shown in Fig. 5. The vertical dotted line separates the data from the predictions. 4. DISSCUSSION Because of many stochastic environmental factors, such as temperature, geographic location and climate, the model state of precipitation is a complicated dynamical system. The time series model in study does not model the extreme values well. Further extensions of study may be undertaken by considering an intervention time series analysis such as Autoregressive conditional heteroskedasticity model to model the pheneonemon of extremums. 5. CONCLUSION In this study, an ARIMA model that incorporates the seasonality of time series was presented. Using the time series of monthly precipitation in Yantai, we build a seasonal SARIMA (1, 0, 1) (0, 1, 1) 12. It was found that the model fitted the data well and the stochastic seasonal fluctuation was sucessfuly modeled except some extreme values. The predictions based on this model indicate that the percipitation in the next three years will decrease. The decreasing trend is consistent with that obtained in our previous study (Gao and Hou, 2012). This changing trend reminds us to make proper strategies of water resource management in response to water shortage. 6. ACKNOWLEDGEMENT This study was partly supported by National Natural Sciences Foundation of China (10971011; 31000197) and Knowledge innovation project of CAS (KZCX2- EW-QN209). 7. REFERENCES Cryer, J. D. and K.S. Chan, 2008. Time Series Analysis with Application in R. 2nd Edn., Springer, New York, ISBN-10: 0387759581, pp: 491. Kantz, H. and T. Schreiber, 2004. Nonlinear Time Series Analysis. 2nd Edn., Cambridge University Press, Cambridge, ISBN-10: 0521529026, pp: 369. Gao, M. and X.Y. Hou, 2012. Trends and multifractal analyses of precipitation data from shandong peninsula, China. Am. J. Environ. Sci., 8: 271-279. DOI: 10.3844/ajessp.2012.271.279 Guo, Z.W., 2009. The adjustment method and research progress based on the ARIMA model. Chinese J. Hosp. Stat., 161: 65-69. Han, P., P.X. Wang and Y.J. Wang, 2008. Drought forecasting based on the standardized precipitation index at different temporal scales using ARIMA models. Agric. Res. Arid Areas, 26: 212-218. He, S.Y., 2004. Applied Time Series Analysis. 1st Edn., Peking University Press, Beijing. Momani, M. and P.E. Naill, 2009. Time series analysis model for rainfall data in Jordan: Case study for using time series analysis. Am. J. Environ. Sci., 5: 599-604. DOI: 10.3844/ajessp.2009.599.604 Stoffer, D.S. and R.H. Dhumway, 2010. Time Series Analysis and its Application. 3rd Edn., Springer, New York, ISBN-10: 1441978658, pp: 596. Wang, J., Y.H. Du and X.T. Zhang, 2008. Theory and Application with Seasonal Time Series. 1st Edn., Nankai University Press, Chinese. Wang, Y., 2008. Applied Time Series Analysis. 1st Edn., China Renmin University Press, Beijing. 505