PREDICTION OF DAILY RUNOFF USING TIME SERIES FORECASTING AND ANN MODELS

Santosh K Patil, Dr. Shrinivas S. Valunjkar
Research scholar, Dept. of Civil Engineering, Government College of Engineering, Aurangabad, India
Professor, Dept. of Civil Engineering, Government College of Engineering, Karad, Satara, India
santosh68.patil@gmail.com, ssvalunjkar@gmail.com

ABSTRACT

The demand for higher accuracy in time series forecasting motivates researchers to develop innovative models. This paper proposes a new time series neural network model that utilizes the strengths of traditional time series approaches and artificial neural networks (ANNs). The overall modeling framework is a combination of the conventional and ANN techniques: the time series is first processed using conventional time series analysis, and the modified data are then presented to the ANN. In this paper, the daily stream flow data of the Gunjwani River near Pune, Maharashtra, India are used to test time series forecasting. The results from ARIMA-type time series models and ANN models are presented. The results confirm that combining the strengths of the conventional and ANN techniques provides a robust modeling framework capable of capturing the non-linear nature of complex time series and hence of producing more accurate forecasts. In this study, the proposed neural network models are applied in hydrology, but they also have tremendous scope for application in a wide range of areas where increased accuracy in time series forecasting is needed.

Keywords: Time series, ANN techniques, daily stream flow, hydrology

Introduction

In the last few decades, time series forecasting has received tremendous attention from researchers. Planning, design, management and other important activities in all branches of engineering and in other fields rely on time series forecasting methods.
Conventionally, researchers have employed traditional methods of time series analysis, modeling, and forecasting, e.g. the Box-Jenkins methods of autoregressive (AR), auto-regressive moving average (ARMA), auto-regressive integrated moving average (ARIMA), auto-regressive moving average with exogenous inputs (ARMAX), etc. The conventional time series modeling methods have served the scientific community for a long time; however, they provide only reasonable accuracy and rest on assumptions of stationarity and linearity. About two decades ago, artificial neural networks (ANNs) were introduced as efficient tools for modeling and forecasting. One can find numerous ANN applications in a wide range of areas for time series forecasting. A great deal of time and effort has been spent by researchers on both conventional and soft computing techniques for time series forecasting. However, the need to produce ever more accurate time series forecasts has forced researchers to develop innovative methods to model time series. This paper presents a study aimed at achieving accurate forecasts of a hydrologic time series using a combination of traditional time series and neural network approaches. The paper begins with a brief review of time series forecasting using neural networks in a wide range of fields.

Hydrologic time series modeling

The stream flow at a location in a river in a catchment is one of the key hydrologic variables. The availability of accurate stream flow forecasts at such a location is important in many water resources management and design activities, such as flood control and management and the design of hydraulic structures such as dams and bridges. Two types of mathematical models are used to generate stream flow forecasts: rainfall-runoff models, which use both climatic and hydrologic data, and stream flow models, which use only the hydrologic data.
Researchers have usually relied on conventional modeling techniques: either deterministic or conceptual models that consider the physics of the underlying process, or systems-theoretic (black-box) models that do not. Deterministic and black-box models of varying degrees of complexity have been employed in the past for modeling the rainfall-runoff process, with varying degrees of success. The stream flow process in a catchment is a complex and non-linear process affected by many, often inter-related, physical factors. The factors affecting the stream flow response of a catchment subjected to rainfall input include: (a) storm characteristics, i.e. intensity and duration of the rainfall event, (b) catchment characteristics, i.e. size, shape, slope and storage characteristics of the catchment, and the percentage of the catchment contributing stream flow at the outlet at various time steps during a rainfall event, (c) geomorphological characteristics of a catchment, i.e. topography,
land use pattern, soil type and vegetation, which affect infiltration, and (d) climatic characteristics such as temperature, humidity and wind characteristics. The influence of these factors and of their many combinations in generating stream flow is an extremely complex physical process and is not clearly understood [1]. Moreover, many of the deterministic or conceptual rainfall-runoff models need a large amount of data for calibration and validation and are computationally intensive. As a result, the use of deterministic/conceptual models of the rainfall-runoff process has been viewed rather skeptically by researchers and has not become very popular []. ANNs have been proposed as efficient tools for modeling and prediction and are supposed to possess the capability to reproduce the unknown relationship existing between a set of input explanatory variables and output variables [3]. Many studies have demonstrated that ANNs are adequate to model the runoff process and can even perform better than the conventional modeling techniques [1, -1].

Considerable effort has also been spent on using traditional time series analysis techniques for stream flow forecasting. Many models of varying degrees of complexity and sophistication have been proposed by various researchers. Some of the earliest examples of the AR type of stream flow forecast models include Thomas and Fiering [11]. Carlson et al. [12] proposed significant developments in the form of ARMA models of hydrologic time series. McKerchar and Delleur [13] used ARIMA modeling to model the monthly stream flow of 16 watersheds in Indiana, Illinois and Kentucky. An important contribution to time series modeling was due to Kalman [14], in the form of a model capability to operate in an adaptive sense. He provided a mechanism that minimizes the forecast error variance. Some other notable examples of the use of time series modeling for stream flow forecasting include Bolzern et al.
[15], Lettenmaier [16], Burn and McBean [17], Bender and Simonovic [18] and Awwad et al. [19].

ANN modeling of stream flow using only flow data, and comparison of ANN models with time series models, have been limited in hydrology. Atiya et al. [20] compared time series and ANN models for making single-step and multiple-step ahead forecasts of river flow. Jain et al. [21] compared ANN models with regression and time series models in making short-term water demand predictions at the Indian Institute of Technology, Kanpur. Jain and Ormsbee [22] used ANNs to model the short-term water demand process in the USA and found their performance to be better than that of the regression and time series models of AR type. Jain and Indurthy [3] used past flow information to model the complex rainfall-runoff process and compared the results with regression models. Coulibaly et al. [] presented a multilayer perceptron (MLP), an input delayed neural network and a recurrent neural network, with and without input delays, for reservoir inflow prediction in the Chute-du-Diable catchment in Canada. Apart from these studies, the efforts in the area of using ANNs for time series modeling and prediction in hydrology have been limited.

While developing ANN models of hydrologic time series, most researchers have presented raw data to the ANN. The raw data contain various trends in the form of long-term memory and seasonal variations. For these reasons, the hydrologic time series may be non-stationary, which affects the performance of the ANN models. It may be possible to improve the performance of ANN models by first carefully removing the long-term variations and then presenting the modified data to the ANN. The conventional time series modeling approaches of ARMA type suffer from being based on linear systems theory.
The non-linear and massively parallel structure of ANNs, coupled with traditional time series methods, may provide a robust modeling framework capable of producing more accurate forecasts; however, this needs to be investigated. The objectives of the study presented in this paper are to: (a) investigate the use of non-linear ANNs for modeling complex hydrologic time series, (b) evaluate the impact on prediction and modeling of removing the long-term and seasonal variations in a time series before presenting the filtered data to an ANN, and (c) compare the performance of the proposed approach with that of the traditional time series models. All the models and the proposed methodologies are tested using daily stream flow data. The development of the various models is presented next.

Model Development

Two approaches have been investigated in this study for the purpose of stream flow forecasting. The first involves conventional time series modeling of AR type and the second uses the ANN approach. In addition, each of the two approaches is tested on two categories of data: raw data consisting of daily flow, and de-trended de-seasonalised data obtained after removing long-term trends and seasonal variations. The purpose of using de-trended de-seasonalised data was to evaluate the impact of de-trending and de-seasonalising the time series on the performance of the ANN models. The ANN models developed on data in the second category represent hybrid models. Daily stream flow data for the season June-Oct over a period of 23 years (1985-2007), derived from the Gunjwani River at Velhe near Pune, were employed for the model development in this study. The stream flow data for the first 16 years (1985-2000) were employed for training and the data for the remaining 7 years were employed for testing.
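The construction of the second data category and the year-based training/testing split can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the long-term trend is removed by subtracting each year's mean flow, the seasonal component is the arithmetic mean over years for each day of the season, and the statistics are computed over the whole series for simplicity. The function names `detrend_deseasonalise` and `split_by_year` and the array layout are hypothetical.

```python
import numpy as np


def detrend_deseasonalise(flow, years, day_of_season):
    """Return a de-trended, de-seasonalised and normalised series.

    flow          : 1-D array of daily stream flow
    years         : 1-D integer array, year of each observation
    day_of_season : 1-D integer array, day index within the season
    """
    flow = np.asarray(flow, dtype=float)
    years = np.asarray(years)
    day_of_season = np.asarray(day_of_season)

    # Long-term trend (assumption): subtract each year's mean flow.
    detrended = flow.copy()
    for y in np.unique(years):
        mask = years == y
        detrended[mask] -= flow[mask].mean()

    # Seasonal component (assumption): arithmetic mean over all years
    # for each day of the season.
    residual = detrended.copy()
    for d in np.unique(day_of_season):
        mask = day_of_season == d
        residual[mask] -= detrended[mask].mean()

    # Normalise to zero mean and unit variance.
    return (residual - residual.mean()) / residual.std()


def split_by_year(series, years, last_training_year):
    """Split a series into training and testing parts by year."""
    series = np.asarray(series)
    train_mask = np.asarray(years) <= last_training_year
    return series[train_mask], series[~train_mask]
```

With the data described above, `last_training_year` would be the final year of the 16-year training period, and the raw series (data category 1) would be passed to `split_by_year` directly without filtering.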
Auto-regressive models

The steps involved in developing a time series model of AR type include modeling of long-term trends, modeling of seasonal variations and modeling of the auto-correlation structure of the time series. The long-term trends were removed by subtracting the annual average stream flow from the original time series to obtain the de-trended time series. The seasonal variations were modeled using the arithmetic mean approach, chosen for its smoothness and superiority in modeling seasonal effects, to obtain the de-trended de-seasonalised time series. The data were normalized to have a mean of 0.0 and a variance of 1.0 before exploring the auto-correlation structure. A simple normalization expression was employed for this purpose:

\[ z_t = \frac{x_t - \bar{x}}{s_x} \]

where $z_t$ is the normalized time series variable, $x_t$ the original time series variable, $\bar{x}$ the mean of the original time series data and $s_x$ the standard deviation of the original time series data. Based on the auto-correlation and partial auto-correlation analysis, auto-regressive models of different orders were tried by trial and error and are presented in this study. The structure of the AR models can be represented by the following equation:

\[ Q(t) = \sum_{i=1}^{p} \phi_i \, Q(t-i) + \varepsilon_t \]

where $Q(t)$ is the daily stream flow being modeled, $Q(t-i)$ the past stream flow, $\phi_i$ the auto-regressive parameters to be determined, $p$ the order of the auto-regressive process, $i$ an index representing the order of the AR process, $\varepsilon_t$ a random variable and $t$ an index representing time. In developing the AR models for category 1, the auto-correlation step was carried out on the raw data; for the AR models in data category 2, the auto-correlation model was developed using the de-trended and de-seasonalised data. This was done in an attempt to compare the performance of the AR models with the corresponding ANN models. Once the estimates of the AR coefficients have been obtained using the training data set, the model can be validated by computing the performance statistics during both training and testing.

ANN models

The feed-forward multilayer perceptron type of ANN architecture was considered in this study to develop time series models of the non-linear type in an attempt to improve the performance of stream flow forecasting. The ANN models developed in this study consisted of three layers: an input layer consisting of the input explanatory variables, one hidden layer, and an output layer consisting of a single neuron representing the flow to be modeled at time t. Four different ANN models were developed for each category of data set. The first ANN model (ANN M1) used the past stream flow Q(t-1) as the input, the second (ANN M2) used the past two days' stream flows Q(t-1) and Q(t-2), and the third (ANN M3) used the past three days' stream flows Q(t-1), Q(t-2) and Q(t-3). The number of neurons in the hidden layer was determined by trial and error for each model. The data were scaled to the range of 0 to 1. The TanhAxon activation function and the Levenberg-Marquardt learning rule were employed in all the ANN models developed in this study.

Performance evaluation criteria

Two different types of standard statistical performance evaluation criteria were employed to evaluate the performance of the various models developed in this study. Mean square error (MSE) was selected as a measure of goodness-of-fit at high output values; it tends to give higher weight to high-magnitude runoff because it squares the difference between the observed and predicted stream flows. The correlation coefficient (R) is a popular global statistic for measuring the goodness-of-fit of the models.

Results and Discussion

The results in terms of the performance evaluation measures are presented in Table 1 and Table 2 for the two categories of data, respectively. It can be noted from Table 1 that the performance of the three AR models in terms of all the statistics is poor.
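As an illustration of the AR structure and the two evaluation statistics described in the model development sections, the following is a minimal sketch rather than the procedure actually used in the study. It assumes the AR coefficients are estimated by ordinary least squares (the paper does not state the estimation method), that the series is already normalised so no intercept is needed, and that the series is treated as continuous across season boundaries. All function names are hypothetical.

```python
import numpy as np


def lagged_matrix(q, p):
    """Build inputs [Q(t-1), ..., Q(t-p)] and targets Q(t)."""
    q = np.asarray(q, dtype=float)
    n = len(q)
    # Column i-1 holds Q(t-i) for t = p, ..., n-1.
    X = np.column_stack([q[p - i:n - i] for i in range(1, p + 1)])
    y = q[p:]
    return X, y


def fit_ar(q_train, p):
    """Estimate AR(p) coefficients by least squares (assumed method)."""
    X, y = lagged_matrix(q_train, p)
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    return phi


def predict_ar(q, phi):
    """One-step-ahead predictions for t = p, ..., len(q) - 1."""
    X, _ = lagged_matrix(q, len(phi))
    return X @ phi


def mse(observed, predicted):
    """Mean square error between observed and predicted flows."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(diff ** 2))


def correlation(observed, predicted):
    """Pearson correlation coefficient R."""
    return float(np.corrcoef(observed, predicted)[0, 1])
```

The same `lagged_matrix` helper with p = 1, 2 or 3 produces input vectors of the kind described for ANN M1 to ANN M3, so both model families can be evaluated on identical targets.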
The ANN models (ANN M1-ANN M3), by contrast, performed significantly better than the corresponding AR models. This is highlighted by the best R value of .91 from the ANN M2 model. The AR(3) model was the best among the AR models, and the ANN M2 model was the best among all the models of category 1, i.e. those for which the original data were used for model development. Also, all ANN models consistently outperformed the AR models. Analysing the results from Table 2, when the de-trended
de-seasonalised data are employed for model development, it can be observed that the trends in the model performances are similar, i.e. the performance improves with an increase in the order of the AR process. The performance of the AR models improved slightly but not considerably, because the data employed in the present study do not require trend analysis and only seasonal data were employed. However, the ANN models still consistently outperformed the AR models. The AR(3) model performed the best among the AR models, with MSE and R values of 7 and .797. The ANN M2 model obtained the best values of MSE and R, 1 and .9. For use in daily stream flow forecasting in important water resources management applications, it is desirable to have a model that is robust (measured by the R value).

Table 1: Performance evaluation statistics for data category 1 models

Model     MSE    R
AR (1)    111    .715
AR (2)    15     .77
AR (3)    99     .738
ANN M1    7      .89
ANN M2    3      .91
ANN M3    5      .899

Table 2: Performance evaluation statistics for data category 2 models

Model     MSE    R
AR (1)    97     .71
AR (2)    83     .763
AR (3)    7      .797
ANN M1           .9
ANN M2    1      .9
ANN M3    3      .98

Figure 1: Scatter Plot of AR(3) (Qt observed vs. Qt predicted)
Figure 2: Time Series Plot of AR(3) (predicted stream flow vs. time in days)
Figure 3: Scatter Plot of ANN M2 (Qt observed vs. Qt predicted)
Figure 4: Time Series Plot of ANN M2 (predicted stream flow vs. time in days)

Conclusions

This study presents the findings of an investigation into the use of ANNs and traditional time series approaches for achieving improved accuracy in time series forecasting. The proposed new approach to modeling complex time series is capable of exploiting the advantages of both the conventional techniques and ANNs.
For the proposed model development, the data were filtered using the conventional method and then used as input to the ANNs. The daily stream flow data were employed to develop all the models and to test the proposed methodology. The results obtained in this study indicate that ANNs are powerful tools for modeling complex time series and need to be exploited further. The ANNs were able to capture the hidden relationship between the historical stream flows and the future flows much better than the conventional time series models, and the ANN models were able to produce more accurate forecasts. The findings of this study have revealed
that using mathematical filters to remove the long-term and seasonal variations in the data before presenting the data to the ANNs can be extremely useful in producing more accurate time series forecasts.

References

1. B. Zhang, S. Govindaraju, Prediction of watershed runoff using Bayesian concepts and modular neural networks, Water Resour. Res. 36 (3) (), pp. 753-76.
2. R. B. Grayson, I. D. Moore, T. A. McMahon, Physically based hydrologic modeling, Water Resour. Res. 8 (1) (199), pp. 659-666.
3. K. Chakraborty, K. Mehrotra, C. K. Mohan, S. Ranka, Forecasting the behavior of the multivariate time series using neural networks, Neural Networks 5 (199), pp. 961-97.
4. M. I. Zhu, M. Fujita, N. Hashimoto, Application of neural networks to runoff predictions, Stochastic and Statistical Methods in Hydrology and Environmental Engineering, Kluwer Academic, Norwell, MA, 199, pp. 5-16.
5. J. Smith, R. N. Eli, Neural network models of the rainfall runoff process, J. Water Resour. Plan. Manage. ASCE 1 (1995), pp. 99-58.
6. A. W. Minns, M. J. Hall, Artificial neural networks as rainfall runoff models, Hydrol. Sci. J. 1 (3) (1996), pp. 399-17.
7. A. Y. Shamseldin, Application of neural network technique to rainfall-runoff modeling, J. Hydrol. 199 (1997), pp. 7-9.
8. D. W. Dawson, R. Wilby, An artificial neural network approach to rainfall-runoff modeling, Hydrol. Sci. J. 3 (1) (1998), pp. 7-65.
9. M. Campolo, A. Soldati, P. Andreussi, Forecasting river flow rate during low-flow periods using neural networks, Water Resour. Res. 35 (11) (1999), pp. 357-355.
10. A. Jain, S. Srinivasulu, Development of effective and efficient rainfall-runoff models using integration of deterministic, real-coded genetic algorithms, and artificial neural network techniques, Water Resour. Res. () ().
11. H. A. Thomas, M. B. Fiering, Mathematical synthesis of stream flow sequences for the analysis of river basin by simulation, Design of Water Resources System, Harvard University Press, Cambridge, (196), pp. 59-93.
12. R. F. Carlson, A. I. A. MacCormick, D. G. Watts, Application of linear models to four annual stream flow series, Water Resour. Res. 6 () (197), pp. 17-178.
13. A. I. McKerchar, J. W. Delleur, Application of seasonal parametric stochastic models for monthly flow data, Water Resour. Res. (197), pp. 6-55.
14. R. E. Kalman, A new approach to linear filtering and prediction problems, ASME Trans. Basic Eng. 8 () (196), pp. 35-5.
15. P. M. Bolzern, G. Ferrario, Adaptive real time forecast of river flow rates from rainfall data, J. Hydrol. 7 (198), pp. 51-67.
16. D. P. Lettenmaier, Synthetic stream flow forecast generation, J. Hydraul. Eng. ASCE 11 (3) (198), pp. 77-89.
17. D. H. Burn, E. A. McBean, River flow forecasting model for Sturgeon River, J. Hydraul. Eng. ASCE 118 (6) (199), pp. 316-333.
18. M. Bender, S. Simonovic, Time series modeling for long-range stream flow forecasting, J. Water Resour. Plan. Manage. ASCE 118 (6) (199), pp. 857-869.
19. H. Awwad, J. Valdes, P. Restrepo, Stream flow forecasting for Han River basin Korea, J. Water Resour. Plan. Manage. ASCE (5) (199), pp. 651-673.
20. A. F. Atiya, S. M. El-Shoura, S. I. Shaheen, M. S. El-Sherif, A comparison between neural network forecasting techniques-case study: river flow forecasting, IEEE Trans. Neural Networks 1 () (1999), pp. -9.
21. A. Jain, A. K. Varshney, U. C. Joshi, Short-term water demand forecast modeling at IIT Kanpur using artificial neural networks, Water Resour. Manage. 15 (5) (1), pp. 99-31.
22. A. Jain, L. E. Ormsbee, Evaluation of short-term water demand forecast modeling techniques: conventional v/s artificial intelligence, J. Am. Water Works Assoc. 9 (7) (), pp. 6-7.