PATTERN VISUALIZATION ON METEOROLOGICAL DATA FOR RAINFALL PREDICTION MODEL


¹S. MEGANATHAN, ²T.R. SIVARAMAKRISHNAN
¹Department of CSE, SASTRA University, Kumbakonam, 612 001
²School of EEE, SASTRA University, Thanjavur, 613 401
E-mail: ¹meganathan@src.sastra.edu, ²trsivaraman@yahoo.co.in

ABSTRACT

Regional rainfall forecasting is an important task for meteorologists. While statistical methods are usually used for this purpose, this work applies data mining to assess its efficacy. The objective is to develop a forecast method for the rainfall of the Northeast (NE) monsoon season over the Cauvery delta region of South India. The work has three main parts. First, the meteorological data are loaded into a centralized server so that they can be extracted and filtered as needed. Second, the data are visualized graphically using gnuplot. Finally, association rules are mined from the resulting plots. From these results the monthly and seasonal rainfall can be assessed for future years. Results show that the overall accuracy of this approach is satisfactory.

Keywords: Association Rule, Data Mining, Rainfall Prediction, Gnuplot.

1. INTRODUCTION

The Cauvery river flows from west to east across peninsular India and is vital for agricultural production in the Tamilnadu state of South India. This region receives its main seasonal rainfall during the northeast monsoon months of October to December. Thunderstorm rains also occur here in the summer months from May to August, as well as in September. In India the period from mid-June to September is called the Southwest (SW) monsoon or summer monsoon. With the Cauvery waters depleted in the downstream regions by the construction of many dams, the Cauvery delta farmers have had to depend on the seasonal rainfall to a great extent since the 1990s. In the late 1980s many townships also came up in the delta region.
Hence an estimate of the seasonal rainfall over the delta, and its prediction, will go a long way towards helping agriculturists, hydrologists and water managers. Several studies have been made of the southwest monsoon, as recorded by Rao Y.P., 1976 [19]. For Tamil Nadu and the east coast of India, the Northeast monsoon is more relevant, as it is the main rainy season. Some works on rain assessment and forecasting for the Northeast monsoon are also available: Balachandran et al., 2006 [5], Dhar and Rakhecha 1983 [7], Duraiswamy 1946 [8], Kripalani and Pankaj Kumar 2004 [10], Pankaj Kumar et al. 2007 [16], Lau [11], Raman [17], Rao K.V. 1963 [18], Sivaramakrishnan [24], Sivaramakrishnan and Sridharan [26], Sridharan and Muthuchami 1990 [30] and Zubair and Ropelewski 2006 [31] discuss the nature and prediction of rainfall for India based on conventional statistics. A few isolated attempts in India by Sivaramakrishnan [25], Mohanty [13] and Seetharam [21] study the rainfall over single stations. However, all of them use conventional synoptic correlations, and they are for the country as a whole.

With the vastly varying terrain in India, no clear correlation between the Southwest (June to September) and Northeast (October to December) monsoon rainfall could be established. Hence, at the sub-regional level, where the terrain is fairly uniform, a study has to be conducted with the latest tools and methodologies to predict the Northeast (NE) monsoon rainfall from the earlier rains. Such a study has been attempted for the Cauvery delta basin in the Tamilnadu state of South India. There is no earlier study specifically for this region.

The sowing operations usually start during July to August. The SW monsoon contributes about 30% of the annual rainfall over this region, while the NE monsoon (70% of the annual rainfall) is found to show large variations. Hence the water managers have to reconcile the dependable rainfall realized by August with the operation schedule, allowing for a certain threshold rainfall by the end of November. This needs a model for predicting the NE monsoon rain with good chances of success. Graphical mining and data mining techniques [1], [2], [9] are computer-based tools that have been used successfully for predicting parameters in certain fields in the recent past. They are tried here for rainfall prediction for the first time.

2. DATA USED

Monthly rainfall data for the available stations of the composite Thanjavur district of Tamilnadu, India, in the downstream Cauvery basin, have been used for the analysis, covering 25 years from 1985. The area covered is shown in Fig. 1. The sample stations are Grand Anaicut (1), Thanjavur (2), Papanasam (3), Kumbakonam (4), Valangaiman (5), Nannillam (6) and Nagapattinam (7).

Figure 1: Rain gauge stations in the Cauvery delta of the composite Thanjavur district in South India.

3. METHODOLOGY

3.1 Association Rules

Association rule mining [2] discovers relationships between items in a set of transactions [3]. These relationships can be expressed by association rules such as [$i_1 \Rightarrow i_2, i_3$; support $s = 2\%$, confidence $c = 60\%$]. This association rule implies that in 2% of all the transactions under analysis the items $i_1$, $i_2$ and $i_3$ appear jointly. A confidence of 60% indicates that 60% of the transactions containing $i_1$ also contain $i_2$ and $i_3$. Associations may include any number of items on either side of the rule.

The problem of mining association rules was introduced by Agrawal et al. in 1993 [1] and studied further by Agrawal and Srikant in 1994 [2], Agrawal et al. in 1995 [3], and Sarawagi et al. in 2000 [20]. Let $I = \{i_1, i_2, \ldots, i_m\}$ denote a set of literals, called items. Let $D$ represent a set of transactions, where each transaction $T$ is a set of items such that $T \subseteq I$. A unique identifier, TID, is associated with each transaction. A transaction $T$ is said to contain $X$, a set of some items in $I$, if $X \subseteq T$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$ and $X \cap Y = \emptyset$.
The rule $X \Rightarrow Y$ holds in the transaction set $D$ with confidence $c$ if $c\%$ of the transactions in $D$ that contain $X$ also contain $Y$. The rule has support $s$ in the transaction set $D$ if $s\%$ of the transactions in $D$ contain $X \cup Y$. An efficient algorithm is required that restricts the search space and checks only a subset of all association rules, yet does not miss important rules. The Apriori algorithm [2], developed by Agrawal and Srikant (1994), is such an algorithm. However, the interestingness (validity) of a rule is based only on support and confidence.

Association rule mining is a popular technique for market basket analysis, which typically aims at discovering the buying patterns of customers in supermarkets, mail order and other types of stores. By mining association rules, marketing analysts try to find sets of products that are frequently bought together, so that certain other items can be inferred from a shopping cart containing particular items. Association rules [3] can often be used to design marketing promotions, for example by appropriately arranging products on supermarket shelves and by directly suggesting to customers items that may interest them. With the constant collection and storage of considerable quantities of business data, association rules are discovered from domain databases and applied in many areas, such as marketing, logistics and manufacturing. In marketing, advertising and sales, corporations have found that they can benefit enormously if implicit and previously unknown buying and calling patterns of customers can be discovered from large volumes of business data.

Generally, support and confidence are taken as the two measurable factors for evaluating the 'interestingness' of association rules, as discussed by Agrawal et al., 1993 [1] and Agrawal and Srikant, 1994 [2].
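The support and confidence definitions used in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation. The item labels (`SW_down`, `NE_up`, and so on) and the toy transactions are made up for illustration, with one transaction standing for one hypothetical station-year.

```python
from typing import FrozenSet, List, Set

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[Transaction], x: Set[str], y: Set[str]) -> float:
    """Among transactions containing X, the fraction that also contain Y."""
    support_x = support(transactions, x)
    if support_x == 0.0:
        return 0.0  # rule never applies; report zero rather than divide by zero
    return support(transactions, x | y) / support_x


# Hypothetical transactions (illustrative labels, not data from the paper).
TOY_TRANSACTIONS: List[Transaction] = [
    frozenset({"SW_down", "NE_up"}),
    frozenset({"SW_down", "NE_up"}),
    frozenset({"SW_down", "NE_down"}),
    frozenset({"SW_up", "NE_down"}),
    frozenset({"SW_up", "NE_up"}),
]

if __name__ == "__main__":
    x, y = {"SW_down"}, {"NE_up"}
    print("support:", support(TOY_TRANSACTIONS, x | y))
    print("confidence:", confidence(TOY_TRANSACTIONS, x, y))
```

In this toy set, two of the five transactions contain both items and three contain `SW_down`, so the rule `SW_down => NE_up` has a support of 2/5 and a confidence of 2/3.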

Association rules are regarded as interesting if their support and confidence are greater than the user-specified minimum support and minimum confidence, respectively. In data mining, it is important but difficult to determine these two thresholds of 'interestingness' (rule validity) appropriately. Data miners usually specify these thresholds in an arbitrary manner. This paper attempts to employ the technique on meteorological data.

3.2 Proposed Approach

To apply association rule techniques in graphical mining, the raw data must be filtered and converted into a format accepted by the visualization tool, gnuplot. The proposed approach is described in Figure 2, which shows the steps involved in knowledge discovery using graphical mining.

Step 1: Loading climatic data to SQL Database.
Step 2: Filtering the noise and unwanted data from the dataset.
Step 3: Transforming the data as required for the visualization software.
Step 4: Generating graphs using the visualization tool.
Step 5: Studying the patterns in the generated results.
Step 6: Mining rules from those patterns.

Figure 2. Steps involved in the knowledge discovery using graphical mining.

Using the graphical mining and data mining technique, the rainfall realizable in the NE monsoon was assessed from the rainfall realized during the months of June, July and August with association rules. The running 5-year mean rainfall is shown for the different stations in Figures 3(a) to 3(g).

Figure 3(a): Five-year mean rainfall of Grand Anaicut
Figure 3(b): Five-year mean rainfall of Kumbakonam
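The running five-year mean plotted in the Figure 3 panels can be sketched in Python as below. The paper does not say whether the window is trailing or centered, so this sketch assumes a trailing window and labels each mean with the last year of its window. The function names and the sample values are hypothetical. The output is formatted as whitespace-separated columns, which gnuplot can read as a data file.

```python
from typing import List


def running_mean(values: List[float], window: int = 5) -> List[float]:
    """Trailing running mean; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def gnuplot_rows(start_year: int, means: List[float], window: int = 5) -> List[str]:
    """Format 'year mean' rows; each mean is labeled with its window's last year
    (an assumption made for this sketch)."""
    first_label = start_year + window - 1
    return [f"{first_label + k} {m:.1f}" for k, m in enumerate(means)]


if __name__ == "__main__":
    # Made-up annual totals in mm, starting in 1985 (not data from the paper).
    sample = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
    for row in gnuplot_rows(1985, running_mean(sample)):
        print(row)
```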

Figure 3(c): Five-year mean rainfall of Nagapattinam
Figure 3(d): Five-year mean rainfall of Nannillam
Figure 3(e): Five-year mean rainfall of Papanasam
Figure 3(f): Five-year mean rainfall of Thanjavur
Figure 3(g): Five-year mean rainfall of Valangaiman

After the analysis, the following rules are mined from the above graphs:

$$\mathrm{RF}(X,\ SW_C < SW_P) \Rightarrow NE_C(X,\ > NE_P) \tag{3.1}$$

$$\mathrm{RF}(X,\ \mathrm{OCT} < 200\,\mathrm{mm}) \Rightarrow \mathrm{NOV}(X,\ \mathrm{OCT\ R/F} + 50\,\mathrm{mm}) \tag{3.2}$$

$$\mathrm{RF}(X,\ \mathrm{OCT} > 200\,\mathrm{mm}) \Rightarrow \mathrm{NOV}(X,\ > 400\,\mathrm{mm}) \tag{3.3}$$
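Rules (3.1) to (3.3) can be read as predicates over one station's seasonal and monthly totals. The Python sketch below is one possible way to check them against records and is not the authors' code. The record layout `StationYear`, the function names and the sample values are hypothetical. Rule (3.2) predicts a November total of roughly the October total plus 50 mm, and the paper gives no tolerance, so the `tolerance_mm` parameter is an assumption. An October total of exactly 200 mm is covered by neither rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StationYear:
    """One station-year of rainfall totals in mm (hypothetical layout)."""
    sw_current: float
    sw_previous: float
    ne_current: float
    ne_previous: float
    october: float
    november: float


def rule_3_1(r: StationYear) -> Optional[bool]:
    """SW_C < SW_P => NE_C > NE_P. None means the antecedent does not apply."""
    if not r.sw_current < r.sw_previous:
        return None
    return r.ne_current > r.ne_previous


def rule_3_2(r: StationYear, tolerance_mm: float = 25.0) -> Optional[bool]:
    """OCT < 200 mm => NOV is about OCT + 50 mm (tolerance is an assumption)."""
    if not r.october < 200.0:
        return None
    return abs(r.november - (r.october + 50.0)) <= tolerance_mm


def rule_3_3(r: StationYear) -> Optional[bool]:
    """OCT > 200 mm => NOV > 400 mm."""
    if not r.october > 200.0:
        return None
    return r.november > 400.0


def rule_confidence(
    records: List[StationYear],
    rule: Callable[[StationYear], Optional[bool]],
) -> Optional[float]:
    """Share of applicable records for which the rule's consequent holds."""
    applicable = [o for o in (rule(r) for r in records) if o is not None]
    if not applicable:
        return None
    return sum(applicable) / len(applicable)


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = [
        StationYear(250.0, 300.0, 700.0, 600.0, 150.0, 210.0),
        StationYear(320.0, 300.0, 650.0, 700.0, 260.0, 380.0),
    ]
    for rule in (rule_3_1, rule_3_2, rule_3_3):
        print(rule.__name__, rule_confidence(records, rule))
```

Returning `None` when the antecedent does not hold keeps such records out of the confidence calculation, which matches the definition of confidence as a fraction of the transactions that contain the antecedent.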

Equation 3.1 states that if the total rainfall (RF) of the current year's Southwest monsoon ($SW_C$) is less than that of the previous year's Southwest monsoon ($SW_P$), then the total rainfall of the current year's Northeast monsoon ($NE_C$) is greater than that of the previous year's Northeast monsoon ($NE_P$). Equation 3.2 states that if the total rainfall of October is less than 200 mm, then the total rainfall of the following November is most probably the October rainfall plus an excess of 50 mm. Equation 3.3 states that if the total rainfall of October is greater than 200 mm, then the total rainfall of the following November will be more than 400 mm.

4. CONCLUSION

The study reveals that in the Cauvery delta basin the rainfall pattern of many stations obeys the association rule, Nagapattinam being an exception. The coastal location of Nagapattinam may perhaps be the reason. The rule suggests a correlation between the Southwest and Northeast monsoon rains over this area when considered in relation to the previous year's performance. It is also possible to assess the November rain from the October rain using the same kind of rule. Frequent pattern mining leads to the discovery of associations and correlations among rainfall occurrences in a large climate data set. This methodology can help to forecast rainfall during the months of the Northeast monsoon for the Cauvery delta region of the Thanjavur district of South India.

ACKNOWLEDGMENT

The authors thank the office of the Superintending Engineer, Lower Cauvery Division, Public Works Department, Thanjavur, Tamilnadu, India for supplying the rainfall data.

REFERENCES

[1] Agrawal, R., Imielinski, T., Swami, A., 1993, "Mining Associations between Sets of Items in Massive Databases", Proc. of the ACM-SIGMOD 1993 Int. Conference on Management of Data, Washington D.C.
[2] Agrawal, R., Srikant, R., 1994, "Fast Algorithms for Mining Association Rules", Proc. of the 20th Int. Conference on Very Large Databases, Santiago, Chile, Sept. 1994. Expanded version available as IBM Research Report RJ9839.
[3] Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., and Verkamo, A. I., 1995, "Fast Discovery of Association Rules", Advances in Knowledge Discovery and Data Mining, Chapter 12, AAAI/MIT Press.
[4] Bayardo Jr., R. J., and Agrawal, R., 1999, "Mining the Most Interesting Rules", In Proc. of the 5th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining.
[5] Balachandran S, Asokan R, Sridaran S., 2006, Global surface temperature in relation to northeast monsoon rainfall over Tamil Nadu, Journal of Earth System Science, 115, 3, pp 349-362.
[6] Chiew F.H.S, Piechota T.C, Dracup J.A, and McMahon T.A., El Nino/Southern Oscillation and Rainfall Variations, Journal of Hydrology (1998), p 138.
[7] Dhar O.N., Rakhecha P.R., 1983, Foreshadowing Northeast monsoon rainfall over Tamil Nadu, India, Mon. Wea. Rev., 111, pp 109-112.
[8] Duraiswamy I.V., 1946, Scientific note no. 98, India Meteorological Department, p 147.
[9] Jiawei Han and Micheline Kamber, 2008, Data Mining Concepts and Techniques, Morgan Kaufmann Publications.
[10] Kripalani R.H, Pankaj Kumar., 2004, International Journal of Climatology, pp 1267-1282.
[11] K. M. Lau, 1992, East Asian Summer Monsoon Rainfall Variability and Climate teleconnections, Journal of Meteorological Society of Japan, 70, p 211.
[12] Meganathan, S., Sivaramakrishnan, T.R. and Chandrasekhara Rao, K., 2009, OLAP operations on the multidimensional climate data model: A theoretical approach, Acta Ciencia Indica, XXXVM, 4, p 1233.
[13] Mohanty V.C., Forecast of precipitation over Delhi during SW Monsoon, Mausam, 45 (1994), p 87.
[14] Nacheketa Acharya, Kar S.C, Kulkarni M.A, Mohanty U.C, Sahoo L.N., 2011, Journal of Earth System Science, 120, No. 5, p 795.
[15] Nayagam Lorna R., 2009, Variability and teleconnectivity of northeast monsoon rainfall over India, Journal of Global and Planetary Change, 69, pp 225-231.
[16] Pankaj Kumar, Rupa Kumar K, Rajeevan M, Sahai A.K., 2007, Climate Dynamics, 28, p 649.
[17] Raman Kzyszie, 2001, The case for probabilistic forecast in hydrology, Journal of Hydrology, 249, p 2.
[18] Rao K.V., 1963, A Study of the Indian northeast monsoon season, Indian Journal of Meteorology, Hydrology and Geophysics, 14, pp 143-155.
[19] Rao Y.P., 1976, Southwest Monsoon, Met. Monograph, India Meteorological Department, pp 354-364.
[20] Sarawagi, S., Thomas, S., Agrawal, R., 2000, "Integrating Association Rule Mining with Databases: Alternatives and Implications", Data Mining and Knowledge Discovery Journal, 4.
[21] Seetharam K., Arima Model of Rainfall prediction over Gangtok, Mausam, 60 (2009), p 361.
[22] Shukla, J. and Mooley, D.A., Empirical Prediction of Summer monsoon rainfall in India, Monthly Weather Rev. (1987), p 695.
[23] Shukla, J. and Paolino, D.A., Southern Oscillation and long range forecast of summer monsoon rainfall in India, Monthly Weather Rev., 111 (1983), p 1830.
[24] Sivaramakrishnan, T.R., 1989, Annual Rainfall over Tamil Nadu, Hydrology Journal, IAH, p 20.
[25] Sivaramakrishnan, T.R., et al., 1983, A study of rainfall over Madras, Vayumandal, p 69.
[26] Sivaramakrishnan, T.R., and Sridharan, S., 1987, Occurrence of heavy rain episodes over Madras, Proceedings of National Symposium on Hydrology, NIH, Roorkee, p VI 54.
[27] Sivaramakrishnan, T.R., Meganathan, S., and Sibi, P., An analysis of northeast monsoon rainfall for the Cauvery delta of Tamil Nadu, Proceedings of INEMREC 2011, pp 105-107.
[28] Sivaramakrishnan, T.R., Meganathan, S., August 2011, Point Rainfall Estimation using Association Rule Mining: A Case Study, Proceedings of the National Conference on Environmental Science & Technologies for Sustainable Development, Chennai, pp 81-85.
[29] Sivaramakrishnan, T.R., Meganathan, S., December 2011, Association Rule Mining and Classifier Approach for Quantitative Spot Rainfall Prediction, Journal of Theoretical and Applied Information Technology.
[30] Sridharan S, Muthuchami A., 1990, Northeast monsoon rainfall in relation to El Niño, QBO and Atlantic hurricane frequency, Vayumandal, 20, 3-4, 1.08-1.11.
[31] Zubair L, Ropelewski, 2006, J. Climate, 19, pp 1567-1575.