Seasonality and Rainfall Prediction

Arpita Sharma¹ and Mahua Bose²
¹ Deen Dayal Upadhyay College, Delhi University, New Delhi, India.
e-mail: ¹ asharma@ddu.du.ac.in; ² e cithi@yahoo.com

Abstract. Time series analysis helps in understanding past and current patterns of change and provides important clues for predicting future patterns. Weather forecasting is one of the challenging tasks in time series analysis. The purpose of this paper is to evaluate seasonal factors, to predict the quarterly rainfall in Kolkata from past data using regression models, and to compare the estimated values with the actual observations.

Keywords: forecast, rainfall, regression, seasonal index, time series.

1. Introduction

Measurement and prediction of change using statistical methods is an important tool for decision making. Seasonal variation is the component of a time series defined as the repetitive and predictable movement around the trend line within one year or less. The term seasonality [1,2] refers to factors that recur regularly at the same time of the year (day, week, month, quarter or six-month period) and exert a marked influence on economic and social activities. Seasonality is common in economic time series as well as in geophysical and ecological time series. It has applications in retail sales, employment, agriculture, construction, transportation, tourism, etc.

There are three main reasons for studying seasonal variation: (1) understanding the seasonal pattern, (2) seasonal adjustment of data, and (3) prediction of future trends. Seasonal adjustment of a time series is the application of mathematical techniques to remove periodic infra-annual variations from the series.

In recent years many methodologies have evolved for analyzing time series [3–10]. These models have been studied extensively in economics for the last thirty-five years.
Over the past decade notable research has also been done in climatology and hydrology, especially on rainfall prediction and distribution [11–18], using techniques such as ARIMA, ANN and fuzzy logic [19], and the calculation of seasonality [20]. In this paper we use different regression-based techniques for forecasting rainfall with seasonal adjustment.

This paper is organized as follows. Section 2 defines the objective of our work. Section 3 specifies the data. Section 4 presents our procedural details. Experimental results are discussed in Section 5. Section 6 summarizes our conclusions and gives some directions for future work.

K. R. Venugopal, P. Deepa Shenoy and L. M. Patnaik (Eds.) ICDMW 2013, pp. 145–150. Elsevier Publications 2013.

2. Objective: Need for Predicting Rainfall

Rainfall in West Bengal is primarily caused by the South-West monsoon. The amount of rainfall varies in time and space, and is concentrated in a few months of the year. Any fluctuation in the monsoon leads to floods or drought. Prediction of the spatial and temporal variability of rainfall is a major problem in an agricultural country like India, and particularly in the state of West Bengal. It affects not only the economic development of an area but also its vegetation and soil characteristics. Knowing the predictable, recurring seasonal fluctuations, policy makers can set up policies to attenuate their impact.

Our objective is to predict the quarterly rainfall of Kolkata, the capital of West Bengal, in a particular year from previous years' data using four distinct but related methods. We also perform error estimation for each of these methods and compare our results with the output of the Multiple Linear Regression technique described in [21].

3. Data

In our experiment we have used rainfall data of Kolkata from 2004–07 [22]. Data for the year 2008 are compared with the forecast for error estimation. We have arranged the rainfall data into four quarters starting from the first month of the English calendar, and have not taken into consideration the actual seasons recognized by the India Meteorological Department, namely Summer, Autumn, Spring, Winter, etc.

4. Forecast Models

In our experiment we have used a regression model [23] to forecast quarterly rainfall. For our first three methods, we have applied the following steps:

(1) Calculation of the seasonal index. It is an average that can be used to compare an actual observation with what it would be if there were no seasonal variation. An index value is attached to each period of the time series within a year.
(2) Deseasonalization, or seasonal adjustment of data, i.e., elimination of the seasonal component from the time series.
(3) Projecting the existing patterns into the future, to predict future trends, using linear regression.
(4) Error estimation using the observed values and the predicted values.

In our fourth experiment we use Multiple Linear Regression to forecast rainfall.

4.1 Method 1: Computing seasonal index by linear regression and forecasting

Steps
1. Linear regression through the data points to get predicted values.
2. Calculation of an index for each data point by dividing the actual value by the linear regression prediction for that period.
3. Calculation of the seasonal index for each quarter. For example, the average of all the first-quarter indexes gives the index for the first quarter, and so on.
4. Deseasonalization: division of the actual data by the seasonal index for that quarter.
5. Linear regression through these deseasonalised data points to get a forecast.
6. Multiplication of the forecast values by the seasonal index of that quarter to get a seasonalised forecast.
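
The six steps of Method 1 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used for the results in this paper: the function names `fit_line` and `seasonal_forecast` and the sample series are hypothetical, the series is assumed to start in the first quarter of a year, and every regression prediction and seasonal index is assumed to be non-zero.

```python
def fit_line(ys):
    """Least-squares line through the points (1, ys[0]), (2, ys[1]), ..."""
    n = len(ys)
    ts = list(range(1, n + 1))
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def seasonal_forecast(ys, periods=4):
    """Return (seasonal indexes, forecast for the next `periods` periods)."""
    n = len(ys)
    # Step 1: regression through the raw data.
    a, b = fit_line(ys)
    # Step 2: ratio of each actual value to its regression prediction.
    ratios = [y / (a + b * t) for t, y in enumerate(ys, start=1)]
    # Step 3: seasonal index = average ratio of each quarter across years.
    index = []
    for q in range(periods):
        same_quarter = ratios[q::periods]
        index.append(sum(same_quarter) / len(same_quarter))
    # Step 4: deseasonalise by dividing by the index of the quarter.
    deseason = [y / index[i % periods] for i, y in enumerate(ys)]
    # Step 5: regression through the deseasonalised data.
    a2, b2 = fit_line(deseason)
    # Step 6: extrapolate the trend and re-apply the seasonal index.
    forecast = [
        (a2 + b2 * (n + k + 1)) * index[(n + k) % periods]
        for k in range(periods)
    ]
    return index, forecast


# Hypothetical quarterly totals for four years (NOT the Kolkata data).
example = [10, 80, 240, 60] * 4
index, forecast = seasonal_forecast(example)
print("seasonal index:", [round(v, 3) for v in index])
print("forecast:", [round(v, 1) for v in forecast])
```

For a series with no seasonal pattern and no trend (for instance, sixteen equal values), the sketch returns an index of 1 for every quarter and a flat forecast, which is a convenient sanity check.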

4.2 Method 2: Calculation of seasonal index by Link Relative method

This method is also called Pearson's method. Steps 1, 2 and 3 of Method 1 are replaced by the following steps. The remaining steps are the same.

Steps
1. Link relative (LR) for any season: $LR_i = (\text{current period's value} \times 100) / \text{previous period's value}$. The link relative for the first season cannot be defined.
2. Computation of the average link relative for each month/quarter over the years.
3. Computation of chain relatives (CR) on the base of the first season. Taking the chain relative of the first season as 100, $CR_i = LR_i \times CR_{i-1} / 100$.
4. Taking the last period as base, a new chain relative is found for the first period. The adjusted chain relative of the $i$-th season is $CR_i - (i-1)c$, where $c = (\text{new chain relative for first season} - 100)/n$, with $n = 12$ for monthly and $n = 4$ for quarterly data.
5. Computation of the average of all adjusted CR.
6. Seasonal index for the $i$-th period $= (\text{adjusted } CR_i \times 100) / \text{average of adjusted CR}$.

4.3 Method 3: Calculation of seasonal index by yearly average

In this method the seasonal index is calculated by the following steps:
1. Calculation of the yearly average of each year's data.
2. Division of each observation by the corresponding yearly mean.
3. The seasonal index for each quarter is formed by averaging the quarterly values obtained in step 2.

So, steps 1, 2 and 3 of the first method are replaced by the above steps. The remaining steps are the same.

4.4 Method 4: Multiple linear regression

The Multiple Regression Model adopted for the prediction of summer monsoon rainfall [21] is applied here to forecast quarterly rainfall. Here, we have analyzed monthly rainfall data (in each quarter) of 2004–07. For each quarter, the Multiple Linear Regression technique is used separately to predict the rainfall for the corresponding period in the next year.
The MLR equation is defined as $y = a + b x_1 + c x_2$, where $a$, $b$, $c$ are regression coefficients. $x_1$ and $x_2$ represent the rainfall of any two months (from previous years' data) in a particular quarter and year, and $y$ is the quarterly rainfall for the corresponding period. The coefficients are estimated by Gaussian elimination. The average of the yearly estimates for that particular period is taken as the quarterly forecast for the next year.

5. Experimental Results and Discussion

The rainfall graph in Figure 1 shows that most of the rainfall occurs from July to September. Seasonal peaks generally occur in this period, but vary from year to year. From Table 2, it is clear that the seasonal index values for each of the four quarters are nearly equal in Methods 1 and 3, whereas results of
Pearson's method show higher values. The main reason for this difference is that these values are obtained by taking the base value as 100. If each of these index values is divided by 100, broadly similar results are obtained.

We have calculated the Root Mean Square Error (RMSE) for each method. The error estimates are summarized in Table 3. From Table 3 it is observed that the forecast by Method 1 is superior to the other methods, followed by Method 3. The error of Method 2 is almost double that of Methods 1 and 3. Results using the different forecasting methods are shown in Table 1 and Figure 2. Though the result obtained by Method 4 is an improvement over Method 2, Methods 1 and 3 are more accurate.

Figure 1. Rainfall Pattern in Kolkata.

Table 1. Forecast of Methods 1, 2, 3 and 4.

                   Observed        Forecast
Next Quarter       Value (2008)    Method 1    Method 2    Method 3    Method 4
1 (Jan.-Mar.)      117             49.484      8.354       43.848      45
2 (Apr.-June)      743             415.339     549.924     392.04      332.854
3 (July-Sept.)     949             1135.468    1661.365    1161.208    1228.5
4 (Oct.-Dec.)      140             285.099     372.161     290.553     285.75

Table 2. Seasonal Index of Methods 1, 2 and 3.

                   Seasonal Index
Next Quarter       Method 1    Method 2    Method 3
1                  0.105       1.051       0.093
2                  0.88        76.362      0.829
3                  2.404       257.387     2.461
4                  0.603       65.2        0.617
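
As a consistency check, the RMSE of each method can be recomputed from the observed and forecast values listed in Table 1. The sketch below is illustrative only; the helper name `rmse` and the variable names are assumptions and not part of the original experiment. Recomputed this way, the values for Methods 1, 3 and 4 agree with Table 3 to about two decimal places, while the value for Method 2 comes out slightly lower (about 390.7) than the reported 390.889, which may reflect rounding in the tabulated forecasts.

```python
import math

# Observed 2008 quarterly rainfall and the forecasts, copied from Table 1.
observed = [117, 743, 949, 140]
forecasts = {
    "Method 1": [49.484, 415.339, 1135.468, 285.099],
    "Method 2": [8.354, 549.924, 1661.365, 372.161],
    "Method 3": [43.848, 392.04, 1161.208, 290.553],
    "Method 4": [45, 332.854, 1228.5, 285.75],
}


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


results = {name: rmse(observed, values) for name, values in forecasts.items()}
for name, value in results.items():
    print(f"{name}: RMSE = {value:.3f}")

# Ranking from most to least accurate under this measure.
ranking = sorted(results, key=results.get)
print("ranking:", ranking)
```

Because RMSE squares each error, the large third-quarter miss of Method 2 dominates its score, which is why it ranks last even though its second-quarter forecast is the closest of the four.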

Figure 2. Rainfall graphs.

Table 3. Error estimation.

Method    RMSE = SQRT(MSE)
1         204.783
2         390.889
3         221.485
4         261.135

6. Conclusion and Future Work

Though accurate forecasting is a very difficult task, the purpose of our investigation is to analyse different regression-based models and find out which one is the most accurate. We have also compared our results with the output obtained by applying Multiple Linear Regression [21] to the same dataset. Our study has revealed that Method 1 gives the best result in comparison to the others. A problem may arise in calculating the seasonal index using Method 2 if the amount of rainfall in any quarter/month is nil, because the link relative of any period is computed by dividing by the previous period's value. This problem is absent in the other methods.

The above methods can also be used to calculate the seasonal index for each month and to compute monthly forecasts. These models can be extended to observe the variation of rainfall in different places in a particular year and to detect changes in seasonality over time.

References

[1] Hylleberg, S.: Modelling Seasonality. Oxford University Press, Oxford (1992).
[2] Bell, W. R. and Hillmer, S. C.: Issues Involved with the Seasonal Adjustment of Economic Time Series (With Comments). Journal of Business and Economic Statistics, 2, 291–320 (1984).
[3] Box, G. E. P. and Jenkins, G. M.: Time Series Analysis: Forecasting and Control. Prentice-Hall Inc., London (1976).
[4] Gooijer, J. G. D. and Hyndman, R. J.: 25 Years of Time Series Forecasting. International Journal of Forecasting, 22, 443–473 (2006).
[5] Planas, C.: The Analysis of Seasonality in Economic Statistics: A Survey of Recent Developments. Questiio, 22(1), 157–171 (1998).
[6] David, A. P.: A Survey on Recent Developments in Seasonal Adjustment. The American Statistician, 34(3), 125–134 (1980).
[7] Franses, P. H.: Recent Advances in Modelling Seasonality. Journal of Economic Surveys, 10(3), 299–345 (1996).
[8] Pole, A., West, M. and Harrison, J.: Applied Bayesian Forecasting and Time Series Analysis. Business and Economics, 1 (1994).
[9] Newaz, M. K.: Comparing the Performance of Time Series Models for Forecasting Exchange Rate. BRAC University Journal, 5(2), 55–65 (2008).
[10] Gooijer, J. G. D. and Kumar, K.: Some Recent Development in Non-Linear Time Series Modeling, Testing and Forecasting. International Journal of Forecasting, 8, 135–156 (1992).
[11] Khadar Babu, S. K., Karthikeyan, K., Ramanaiah, M. V. and Ramanah, D.: Prediction of Rainfall-Flow Time Series Using Auto-Regressive Models. Advances in Applied Science Research, 2(2), 128–133 (2011).
[12] Olaiya, F. and Adeyemo, A. B.: Application of Data Mining Techniques in Weather Prediction and Climate Change Studies. I. J. Information Engineering and Electronic Business, 1, 51–59 (2012).
[13] Valipour, M.: Number of Required Observation Data for Rainfall Forecasting According to Climate Conditions. American Journal of Scientific Research, 74, 79–86 (2012).
[14] Verbist, K., Robertson, A. W., Cornelis, W. M. and Gabriels, D.: Seasonal Predictability in Daily Rainfall Characteristics in Central Northern Chile for Dry-Land Management. Journal of Applied Meteorology and Climatology, 49, 1938–1955 (2010).
[15] Jain, S. K. and Kumar, V.: Trend Analysis of Rainfall and Temperature Data for India. Current Science, 102, 37–49 (2012).
[16] Guhathakurta, P.
and Saji, E.: Trends and Variability of Monthly, Seasonal and Annual Rainfall for the Districts of Maharashtra and Spatial Analysis of Seasonality Index in Identifying the Changes in Rainfall Regime. Report, National Climate Centre Research, IMD, Pune (2012).
[17] Henley, B. J., Thyer, M. A. and Kuczera, G.: Seasonal Stochastic Rainfall Modelling Using Climate Indices: A Bayesian Hierarchical Model. In International Congress on Modelling and Simulation, 575–1581 (2007).
[18] Sivaramakrishnan, T. R. and Meganathan, S.: Association Rule Mining and Classifier Approach for Quantitative Spot Rainfall Prediction. Journal of Theoretical and Applied Information Technology, 34(2), 173–177 (2011).
[19] Yu, P., Chen, S., Wu, C. and Lin, S.: Comparison of Grey and Phase-Space Rainfall Forecasting Models Using a Fuzzy Decision Method. Hydrological Sciences, 49(4) (2004).
[20] Walsh, P. D. and Lawler, D. M.: Rainfall Seasonality: Description, Spatial Patterns and Change Through Time. Weather, 36, 201–208 (1981).
[21] Kannan, M., Prabhakaran, S. and Ramachandran, P.: Rainfall Forecasting With Data Mining Technique. International Journal of Engineering and Technology, 2(6), 397–401 (2010).
[22] West Bengal State Marketing Board, http://www.wbagrimarketingboard.gov.in
[23] Rajaraman, V.: Computer Oriented Numerical Methods. PHI Pvt. Ltd., New Delhi (2000).