A Hybrid Wavelet Analysis and Adaptive Neuro-Fuzzy Inference System for Drought Forecasting

Applied Mathematical Sciences, Vol. 8, 4, no. 39, 699-698
HIKARI Ltd, www.m-hikari.com
http://dx.doi.org/.988/ams.4.4863

Ani Shabri
Mathematical Science Department, Faculty of Science
Universiti Teknologi Malaysia, 83 Johor, Malaysia

Copyright 4 Ani Shabri. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Drought forecasting plays an important role in the planning and management of water resources systems. In this paper, a hybrid wavelet and adaptive neuro-fuzzy inference system (WANFIS) is proposed for drought forecasting. The WANFIS model was developed by combining two methods, namely a discrete wavelet transform and an adaptive neuro-fuzzy inference system (ANFIS) model. To assess the effectiveness of this model, the standardised precipitation index (SPI) was applied for meteorological drought analysis at five rain gauging stations located around the Klang River basin, Malaysia. The SPI forecasting performance of the WANFIS model is compared with that of the autoregressive integrated moving average (ARIMA) and ANFIS models using various statistical measures. Comparison of the results reveals that the WANFIS model performs better than the traditional ANFIS and ARIMA models.

Keywords: ANFIS, ARIMA, Wavelet, drought, SPI, forecasting.

1. Introduction

Drought is part of the natural variability of climate and ranks first among the world's natural disasters, affecting large regions and causing significant damage to both human lives and the natural environment. Drought recurrence is inevitable

and random, and a drought becomes apparent only after a long period of precipitation deficit. It affects nearly all climate regions, though its features differ from region to region. As a result, the onset and end of a drought are difficult to ascertain precisely. Drought forecasting plays an important role in taking contingency actions in advance of drought to mitigate its risk and impacts. The success of drought preparedness and mitigation depends on timely information about drought onset and forecasting. This information may be obtained through continuous drought monitoring, which is normally performed using drought indices []. Drought indices are variables that identify drought characteristics, i.e. the magnitude, duration, severity and spatial extent. One of the well-known meteorological drought indices is the Standardized Precipitation Index (SPI), originally suggested by McKee et al. []. The SPI was chosen for its simplicity and versatility [3]. Furthermore, SPI requires only rainfall data for the study area.

A variety of forecasting methods have been established to predict drought occurrence. Traditionally, autoregressive integrated moving average (ARIMA) models have been widely applied for drought forecasting using SPI series [4-6]. However, these are basically linear models that assume the data are stationary, and they have a limited ability to capture non-stationarity and non-linearity in drought data [5]. The Adaptive Neuro-Fuzzy Inference System (ANFIS) model, which integrates Artificial Neural Networks (ANN) and Fuzzy Logic (FL) methods, has the potential to capture the advantages of both methods in a single framework. ANFIS models have been accepted as an efficient alternative tool for drought forecasting in recent years [7-9].
When building forecasting models directly on the original series, it is difficult to obtain satisfactory forecasts because the time series are non-stationary and non-linear. An alternative approach couples the wavelet transform with another forecasting model; such hybrids have proven quite effective for designing forecasting models in various applications [-3]. The aim of this paper is to investigate the performance of a coupled wavelet-ANFIS (WANFIS) model for drought forecasting using the SPI time series as the drought indicator. Drought is investigated for 3, 6, 9 and 12 month periods. To illustrate the applicability of the WANFIS model in drought forecasting, rainfall data from the Klang River basin, Malaysia, are used.

2. Method

2.1 ANFIS Model

The Adaptive Neuro-Fuzzy Inference System (ANFIS) uses a feed-forward network to optimize the parameters of a given fuzzy inference system (FIS) so that it performs well on a given task. The ANFIS

proposed here is the Takagi-Sugeno FIS embedded within the structure of the ANN. ANFIS uses the neural network training process to adjust the membership functions and their associated parameters so that the model output approaches the desired data. The learning algorithm for ANFIS is a hybrid algorithm that combines the back-propagation learning algorithm with the least-squares method. All parameters are updated using this hybrid learning algorithm until an acceptable error is reached. Detailed descriptions of the ANFIS algorithm are published elsewhere [7-9] and are not repeated here.

2.2 Wavelet Analysis

Wavelet analysis is one of the most powerful tools in the study of time series. Wavelet transforms fall into two categories: the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). The CWT is not often used for forecasting because of its computational complexity and time requirements []. Instead, the DWT, which requires less computation time and is simpler to apply, is commonly used in forecasting applications. The DWT is given by

$$\psi_{m,n}(t) = \frac{1}{\sqrt{s_0^{m}}}\,\psi\!\left(\frac{t - n\tau_0 s_0^{m}}{s_0^{m}}\right) \tag{1}$$

where $\psi(t)$ is the mother wavelet, and $m$ and $n$ are integers that control the scale and time, respectively. The most common choices for the parameters are $s_0 = 2$ and $\tau_0 = 1$. According to Mallat's theory, the original discrete time series $x(t)$ can be decomposed into a series of linearly independent approximation and detail signals by using the inverse DWT. The inverse DWT is given by Mallat [4]:

$$x(t) = T + \sum_{m=1}^{M} \sum_{n=0}^{2^{M-m}-1} W_{m,n}\, 2^{-m/2}\, \psi\!\left(2^{-m} t - n\right) \tag{2}$$

where

$$W_{m,n} = 2^{-m/2} \sum_{t=0}^{N-1} \psi\!\left(2^{-m} t - n\right) x(t)$$

is the wavelet coefficient for the discrete wavelet at scale $s = 2^{m}$ and $\tau = 2^{m} n$.

2.3 WANFIS Model

The WANFIS model is obtained by combining two methods, a DWT and an ANFIS model: it is an ANFIS model whose inputs are sub-series obtained by applying the DWT to the original data. The WANFIS model structure developed in the present study is shown in Figure 1.
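The additive decomposition into detail and approximation signals described above can be illustrated with a minimal sketch. It assumes the Haar wavelet, chosen here only because it can be written in a few lines of plain Python; it is not the Daubechies wavelet used in this study, and the function name `haar_mra` is hypothetical. Each level's approximation is a block mean over 2^j samples, and each detail is the difference between successive approximations, so the sub-series add back to the original signal.

```python
def haar_mra(x, levels):
    """Additive Haar multiresolution of x (illustrative stand-in, not db wavelets).

    Returns ([D1, ..., DL], AL), where every sub-series has the same length
    as x and x[t] == AL[t] + D1[t] + ... + DL[t] for every t.
    """
    n = len(x)
    if n % (2 ** levels) != 0:
        raise ValueError("len(x) must be a multiple of 2**levels")
    details = []
    approx = list(x)  # level-0 approximation is the signal itself
    for j in range(1, levels + 1):
        block = 2 ** j
        coarser = []
        for start in range(0, n, block):
            # Haar approximation at level j: mean of each block of 2**j samples
            mean = sum(x[start:start + block]) / block
            coarser.extend([mean] * block)
        # Detail at level j: what is lost when moving to the coarser level
        details.append([a - c for a, c in zip(approx, coarser)])
        approx = coarser
    return details, approx


if __name__ == "__main__":
    signal = [4, 6, 10, 12, 8, 6, 5, 5]
    details, approx = haar_mra(signal, 2)
    rebuilt = [approx[t] + sum(d[t] for d in details) for t in range(len(signal))]
    print(details, approx, rebuilt)
```

A library routine such as a Daubechies multilevel decomposition would replace `haar_mra` in practice; the additive structure of the sub-series is the point of the sketch.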
As can be seen from Figure 1, the WANFIS forecasting procedure consists of the following steps:

i. Decompose the original time series for each input into L sub-series components by DWT.
ii. Select the most effective sub-series components for each input using the correlation coefficient.
iii. Construct the WANFIS model: the new series obtained by summing the significant sub-series components for each input is used as the new input to the ANFIS, and the original output time series is the output of the ANFIS.

Figure 1. The WANFIS model structure. (Flowchart: data collection; input time series y_t = f(y_{t-1}, y_{t-2}, ..., y_{t-p}); decomposition of each input using DWT into details D1, ..., DL and approximation AL; addition of the effective components into DW_{t-1}, ..., DW_{t-p}; ANFIS model y_t = f(DW_{t-1}, DW_{t-2}, ..., DW_{t-p}).)

3. STUDY AREA

In this study, the region of the Klang River basin is considered. The Klang River flows through Kuala Lumpur and Selangor in Peninsular Malaysia and eventually into the Straits of Malacca. The region is approximately km in length and drains a basin of about 88 square kilometres. The Klang River has major tributaries, and as the river flows through the capital city of Kuala Lumpur, a heavily populated area of more than four million people, it is considerably polluted. The locations of the rain gauge stations used are shown in Figure 2. Monthly rainfall data were procured for the period 975 to 7. The average monthly rainfall over the five rain gauging stations was used, following Mishra and Desai [4-5] and Mishra et al. [6], who likewise used the average monthly rainfall over the Kansabati basin.

Figure 2. Location map of the study area.

4. STANDARDISED PRECIPITATION INDEX

SPI was developed by McKee et al. [] for the purpose of defining and monitoring droughts in the United States.
The SPI is computed by fitting a gamma probability density function to the given frequency distribution of precipitation summed over the time scale of interest. SPI is a dimensionless index that takes

negative values in drought periods and positive values in wet periods. Drought intensity based on SPI is classified according to the categories given in Table.

Table. SPI drought categories

Value of SPI      Drought category      Value of SPI     Drought category
0 to -0.99        Mild drought          0 to 0.99        Near normal
-1.00 to -1.49    Moderate drought      1.00 to 1.49     Moderate wet
-1.50 to -1.99    Severe drought        1.50 to 1.99     Severe wet
≤ -2.00           Extreme drought       ≥ 2.00           Extreme wet

In the present study, running series of total precipitation corresponding to 3, 6, 9 and 12 months were used and the corresponding SPIs were calculated: SPI3, SPI6, SPI9 and SPI12. The resulting time series are shown in Fig. 3. These running periods were chosen because drought is classified as short-term for SPI3, medium-term for SPI6 and SPI9, and long-term for SPI12. For model development, the data set was split into two parts: the data from 975 to is used to estimate the model parameters, and the data from to 7 is used to check the forecast accuracy.

Figure 3. SPI series over different time scales based on the average rainfall over the Klang basin. (Four panels, SPI3, SPI6, SPI9 and SPI12, each plotted against time.)

The performance of the different models was evaluated using the root mean square error (RMSE) and the mean absolute error (MAE), defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t^{o} - y_t^{f}\right)^{2}} \quad \text{and} \quad \mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t^{o} - y_t^{f}\right|$$

where $y_t^{o}$ and $y_t^{f}$ are the observed and forecasted values at time $t$, respectively, and $n$ is the number of data points. The RMSE and MAE measure how closely the predictions match the observations. The best model is the one with the smallest MAE and RMSE in forecasting the data.

5. RESULTS AND DISCUSSION

One of the most important steps in developing a satisfactory ANFIS forecasting model is the selection of the input variables. For this study, five input combinations based on the SPI series of previous periods are evaluated to estimate the current SPI series. In each model, every input variable must be clustered into several class values in layer to develop fuzzy rules, and each fuzzy rule is constructed through several parameters of the membership function (MF) in layer. As the number of parameters increases with the number of fuzzy rules, the model's structure becomes more complicated. In this study, the subtractive fuzzy clustering function was used to establish the fuzzy rules based on the relationship between the input and output variables.

The hybrid algorithm was used to determine the nonlinear input parameters and the linear output parameters; it provides both the learning procedure and the construction of the rules. The hybrid learning algorithm of ANFIS combines the back-propagation gradient descent method with the least-squares method to update the parameters in an adaptive network. This algorithm converges much faster because it reduces the dimension of the search space of the original back-propagation. Each epoch of this hybrid learning process is composed of a forward pass and a backward pass. In ANFIS modelling, two points require particular attention: first, the ANFIS architecture (e.g. the type and number of MFs), and second, the number of training iterations (epochs). Selecting both appropriately can improve the model's efficiency.
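To make the role of the two parameter groups concrete, the following is a minimal sketch of the forward pass of a first-order Sugeno system, under stated assumptions: generalized bell membership functions, one membership function per input in each rule (as a clustering-based rule set would give), and a product for the rule firing strength. The function names and all parameter values are hypothetical, and the hybrid training itself is not shown. The triples (a, b, c) are the nonlinear premise parameters adjusted by back-propagation, and the coefficients of each rule are the linear consequent parameters estimated by least squares.

```python
def gbell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a| ** (2 * b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def sugeno_predict(inputs, rules):
    """Output of a first-order Sugeno system for one sample.

    rules is a list of (mfs, coeffs) pairs (hypothetical layout):
      mfs    - one (a, b, c) triple per input (premise parameters)
      coeffs - (p_1, ..., p_k, r): linear consequent, f = sum(p_i * x_i) + r
    """
    strengths, outputs = [], []
    for mfs, coeffs in rules:
        w = 1.0
        for x, (a, b, c) in zip(inputs, mfs):
            w *= gbell(x, a, b, c)  # product of memberships = firing strength
        strengths.append(w)
        # zip stops after the k inputs, so the bias coeffs[-1] is added separately
        outputs.append(sum(p * x for p, x in zip(coeffs, inputs)) + coeffs[-1])
    total = sum(strengths)
    # Weighted average of rule outputs using normalized firing strengths
    return sum(w * f for w, f in zip(strengths, outputs)) / total


if __name__ == "__main__":
    # Two rules on a single lagged input; all numbers are illustrative only.
    rules = [
        ([(1.0, 2.0, -1.0)], (0.0, 1.0)),
        ([(1.0, 2.0, 1.0)], (0.0, 3.0)),
    ]
    print(sugeno_predict([0.0], rules))
```

With the premise parameters held fixed, the output is linear in the consequent coefficients, which is why the least-squares step of the hybrid algorithm can solve for them directly.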
In this study, the bell function was used as the membership function (MF), and two MFs (gbellmf-) proved satisfactory for the modelling. In addition, by training different ANFIS structures with different numbers of training epochs, we found that epochs are sufficient to train the network.

The hybrid WANFIS model is obtained by combining two methods, a DWT and an ANFIS model, so the ANFIS structures described above were also used to develop the WANFIS model for all SPI series. In the WANFIS model, the sub-time series components obtained by applying the DWT to the original data are the input of the ANFIS, and the original output time series is the output of the ANFIS. For the WANFIS model inputs, the original time series was decomposed by DWT to a given number of decomposition levels, yielding detail sub-series (DWs), using Mallat's algorithm. The Daubechies wavelet of order (db), one of the most widely used wavelet families, is chosen as the wavelet function. This wavelet offers an appropriate balance between wave length and smoothness [5-6]. In this study, three wavelet decomposition levels (-4-8) were employed, as in similar studies by Kisi [35] and Nourani et al. [3].
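Both the ANFIS and the WANFIS models map values from previous periods to the current SPI value, y_t = f(y_{t-1}, ..., y_{t-p}). A minimal sketch of how such input-output pairs could be built from a series is given below; the function name `make_lagged_pairs` is hypothetical, and for WANFIS the same construction would be applied to the summed wavelet series rather than to the raw SPI series.

```python
def make_lagged_pairs(series, n_lags):
    """Build one-step-ahead training pairs from a time series.

    inputs[k]  = [y_{t-1}, y_{t-2}, ..., y_{t-n_lags}]
    targets[k] = y_t
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - lag] for lag in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


if __name__ == "__main__":
    x, y = make_lagged_pairs([1, 2, 3, 4, 5], 2)
    print(x, y)
```

Varying `n_lags` gives input combinations of different sizes, which is one way to realise the input combinations evaluated in this section.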

The effectiveness of the wavelet components is determined using the correlation between the SPI series and the wavelet coefficients at different decomposition levels, as was done by Partal and Kisi [8, 7]. The correlation values given in Table 3 indicate the effective wavelet components for all SPI series.

Table 3. Correlation coefficients between original monthly drought data and each sub-time series (Dt).

                        Correlations with original data
Data    DWT component   Dt-1    Dt-2    Dt-3    Dt-4    Dt-5    Mean R
SPI3    D1              -.3     .39     -.4     .       -.4     .3
        D2              .       -.75    -.94    -.8     .       .7
        D3              -.49    -.39    -.6     .7      .4      .87
        A3              .49     .56     .575    .63     .664    .56
SPI6    D1              .3      .375    .3      -.4     -.8     .5
        D2              -.8     -.9     -.7     -.99    .6      .74
        D3              -.39    -.8     .3      .6      .356    .3
        A3              .58     .664    .739    .799    .838    .74
SPI9    D1              -.38    .4      .       -.9     -.99    .4
        D2              .39     -.63    -.36    -.      .8      .84
        D3              -.64    -.74    .54     .6      .8      .36
        A3              .7      .775    .84     .89     .93     .86
SPI12   D1              -.      .4      .       -.5     -.79    .34
        D2              .9      -.8     -.83    -.46    .75     .46
        D3              -.      -.      -.49    .43     .4      .89
        A3              .77     .837    .89     .93     .956    .877

Table 3 shows that the D1 component has low correlations for all SPI series. The D2 and D3 components show significantly higher correlations with the observed SPI series than the D1 component. According to the correlation analysis, D2 and D3 are therefore the most effective components for forecasting and were selected as the dominant wavelet components. The significant wavelet components D2 and D3 and the approximation component (A3) were then added together to constitute the new series. The ANFIS models were developed with program code written in MATLAB using the wavelet toolbox. The one-step-ahead forecasting performance of the ANFIS and WANFIS models in the testing period is presented in Table 4 in terms of RMSE and MAE.
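The component selection and summation step described above can be sketched as follows. This is a simplified illustration under stated assumptions: the correlation is computed at a single lag, and components are kept when the absolute correlation reaches a fixed threshold. The threshold value, the single-lag criterion and the function names are hypothetical; the study judges effectiveness from the correlations over several lags reported in Table 3.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)  # assumes neither sequence is constant


def select_and_sum(components, target, lag=1, threshold=0.3):
    """Keep components whose lagged correlation with target is strong, then add them.

    components: dict mapping a name (e.g. "D1", "A3") to a sub-series
    lag:        correlate component[t - lag] with target[t]  (lag >= 1)
    threshold:  assumed cut-off on |r|; not a value taken from the study
    """
    kept = []
    for name, comp in components.items():
        r = pearson(comp[:len(comp) - lag], target[lag:])
        if abs(r) >= threshold:
            kept.append(name)
    summed = [sum(components[name][t] for name in kept) for t in range(len(target))]
    return kept, summed


if __name__ == "__main__":
    target = [1, 2, 3, 4, 5, 6]
    parts = {"D1": [1, -1, 1, -1, 1, -1], "A": [0, 1, 2, 3, 4, 5]}
    print(select_and_sum(parts, target))
```

The summed series returned here plays the role of the new input series that is passed to the ANFIS in place of the original data.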
To evaluate the proposed model's ability in comparison with classical models, the autoregressive integrated moving average (ARIMA) time series model was also used to model the SPI3, SPI6, SPI9 and SPI12 time series. The ARIMA models were trained and tested using the same data sets. For illustration, an example from SPI3 is briefly described. The sample autocorrelation function (ACF) and sample partial autocorrelation function (PACF) for the SPI3 series are plotted in Figure 4. The ACF and PACF show that the series is stationary. The ACF damps out in exponential waves with significant spikes near lags,, and 4. The PACF has significant values at lags, 3, 4 and 7. This indicates a possible ARIMA (p,, q) model with p = to 7 and q = to 4. All combinations were evaluated to determine the best of these candidate models. The model finally selected was ARIMA (,,4). The residual ACF (RACF) and residual PACF (RPACF) of the best model are shown in Figure 5. The RACF and RPACF lie within the confidence limits, which supports the conclusion that the residuals of the best model are white noise.

Figure 4. ACF and PACF plots for SPI3 (with 5% significance limits).

Figure 5. RACF and RPACF for the ARIMA (,,4) model (with 5% significance limits).

Table 4 shows the performance of the selected ARIMA models for the SPI3, SPI6, SPI9 and SPI12 series. It also shows that WANFIS performs well during the testing phase and outperforms ARIMA and ANFIS on all statistical measures for all series. This result shows that the new series obtained by DWT has a significant positive effect on the ANFIS model results. The results suggest that WANFIS is superior to the ARIMA and ANFIS models in drought forecasting.
The ARIMA model seems to be much better than the ANFIS model.
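The RMSE and MAE values used in this comparison follow the definitions given in Section 4. A minimal sketch of both measures is shown below; the function names and the sample values are illustrative and are not taken from the study.

```python
import math


def rmse(observed, forecast):
    """Root mean square error between observed and forecasted values."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecast)) / n)


def mae(observed, forecast):
    """Mean absolute error between observed and forecasted values."""
    n = len(observed)
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / n


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    fc = [1.0, 2.0, 3.0, 8.0]
    print(rmse(obs, fc), mae(obs, fc))
```

Because the errors are squared before averaging, RMSE penalises a single large miss more heavily than MAE does, which is why the two measures are reported side by side.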

Table 4. Performance results of ARIMA, ANFIS and WANFIS models during the testing period.

Data    Model     RMSE    MAE       Data     Model     RMSE    MAE
SPI3    ARIMA     .573    .445      SPI9     ARIMA     .95     .34
        ANFIS     .6      .5                 ANFIS     .347    .73
        WANFIS    .46     .33                WANFIS    .8      .55
SPI6    ARIMA     .387    .3        SPI12    ARIMA     .46     .9
        ANFIS     .475    .364               ANFIS     .7      .8
        WANFIS    .3      .7                 WANFIS    .9      .45

6. CONCLUSIONS

This study focused on SPI as a drought indicator for drought analysis at five stations located around the Klang River basin. Since SPI is one of the most widely used indices in drought forecasting, accurate and reliable estimation of SPI is very important. This paper proposes the application of the WANFIS technique for modelling SPI series. The WANFIS models are obtained by combining two methods, an ANFIS model and the discrete wavelet transform. The WANFIS models were trained and tested by applying different input combinations to the different SPI series. The forecasting performance of the proposed WANFIS model was compared with that of regular ARIMA and ANFIS models. The comparison indicated that the WANFIS model was substantially more accurate than the ARIMA and ANFIS models. The study concludes that the forecasting ability of the ANFIS model for short- and long-term monthly drought time series improves when the wavelet transformation is adopted for data pre-processing.

Acknowledgements. The financial support provided by Universiti Teknologi Malaysia and MOSTI under FRGS Grant (R.J3.788.4F399) is gratefully acknowledged.

References

[] F.L. S. Morid, V. Smaktin and K. Bagherzadeh, Drought forecasting using artificial neural networks and time series of drought indices, International Journal of Climatology, 7(7), 3-.
[] T. B. McKee, N. J. Doesken and J. Kleist, The relationship of drought frequency and duration to time scales, in Proceedings of the 8th Conference on Applied Climatology, American Meteorological Society, Anaheim, Calif, USA, 993.

[3] M.J. Hayes, M.D. Svoboda, D.A. Wilhite and O.V. Vanyarkho, Monitoring the 996 drought using the standardized precipitation index, Bull Am Meteorol Soc., 8(999), 49-438.
[4] A.K. Mishra and V.R. Desai, Drought forecasting using stochastic models, Stoch Environ Res Risk Assess, 9(5), 36-339.
[5] A.K. Mishra and V.R. Desai, Drought forecasting using feed-forward recursive neural network, Ecological Modeling, 98 (6), 7-38.
[6] A.K. Mishra, V.R. Desai and V.P. Singh and F. ASCE, Drought forecasting using a hybrid stochastic and neural network model, Journal of Hydrologic Engineering, (7), 66-638.
[7] U.G. Bacanli, M. Firat and F. Dikbas, Adaptive Neuro-Fuzzy Inference System for drought forecasting, Stoch Environ Res Risk Assess, 3(9), 43-54.
[8] T. Partal and O. Kisi, Wavelet and neuro-fuzzy conjunction model for precipitation forecasting, Journal of Hydrology, 34(7), 99-.
[9] M. Firat and M. Gungor, River flow estimation using adaptive neuro fuzzy inference system, Mathematics and Computers in Simulation, 75(7), 87-96.
[] R. Noori, M.A. Abdoli, A. Farokhnia and M. Abbasi, Results uncertainty of solid waste generation forecasting by hybrid of wavelet transform-ANFIS and wavelet transform-neural network, Expert System with Applications, 36(9), 999-9999.
[] O. Kisi, River flow forecasting and estimation using different artificial neural network techniques, Hydrol Res., 39(8), 7 4.
[] O. Kisi, Wavelet regression model as an alternative to neural networks for river stage forecasting, Water Resour Manage, 5(), 579-6.
[3] V. Nourani, M.T. Alami and M.H. Aminfar, A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation, Engineering Applications of Artificial Intelligence, (9), 466-47.
[4] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell., (989), 674 693.
[5] O. Kisi, Evapotranspiration modeling using wavelet regression model, Irrig Sci., 9(), 4-5.
[6] F. Giacometto, J.J. Cardenas, K.K. Kampouropoulos and J.L. Romeral, Load forecasting in the user side using wavelet-ANFIS, in Proceedings of the 38th Annual Conference on IEEE Industrial Electronics Society, 49-54,.
[7] O. Kisi, Wavelet regression model for short-term stream flow forecasting, Journal of Hydrology, 389(), 344-353.

Received: September, 4