Rainfall data analysis and storm prediction system

Similar documents
USGS ATLAS. BACKGROUND

MapReduce in Spark. Krzysztof Dembczyński. Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

LUIZ FERNANDO F. G. DE ASSIS, TÉSSIO NOVACK, KARINE R. FERREIRA, LUBIA VINHAS AND ALEXANDER ZIPF

MapReduce in Spark. Krzysztof Dembczyński. Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland

Spatial Data Science. Soumya K Ghosh

The Challenge of Geospatial Big Data Analysis

Query Analyzer for Apache Pig

CS425: Algorithms for Web Scale Data

CONTEMPORARY ANALYTICAL ECOSYSTEM PATRICK HALL, SAS INSTITUTE

STATISTICAL PERFORMANCE

International Journal of Scientific & Engineering Research, Volume 7, Issue 2, February ISSN

CS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce

WDCloud: An End to End System for Large- Scale Watershed Delineation on Cloud

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51

Spatial Analytics Workshop

Computational Frameworks. MapReduce

A Cloud-Based Flood Warning System For Forecasting Impacts to Transportation Infrastructure Systems

Hunting for Anomalies in PMU Data

Scalable 3D Spatial Queries for Analytical Pathology Imaging with MapReduce

Experiences on Processing Spatial Data with MapReduce *

A Cloud Computing Workflow for Scalable Integration of Remote Sensing and Social Media Data in Urban Studies

SUMMARY OF DIMENSIONLESS TEXAS HYETOGRAPHS AND DISTRIBUTION OF STORM DEPTH DEVELOPED FOR TEXAS DEPARTMENT OF TRANSPORTATION RESEARCH PROJECT

Visualizing Big Data on Maps: Emerging Tools and Techniques. Ilir Bejleri, Sanjay Ranka

TXHYETO.XLS: A Tool To Facilitate Use of Texas- Specific Hyetographs for Design Storm Modeling. Caroline M. Neale Texas Tech University

Large-Scale Behavioral Targeting

Computational Frameworks. MapReduce

4. GIS Implementation of the TxDOT Hydrology Extensions

DUG User Guide. Version 2.1. Aneta J Florczyk Luca Maffenini Martino Pesaresi Thomas Kemper

Distributed Architectures

Synergie PC Updates and Plans. A. Lasserre-Bigorry, B. Benech M-F Voidrot, & R. Giraud

Real-Time Big Data Analytical Architecture for Remoting Sensing Application

Basics of GIS reviewed

RESEARCH ON THE DISTRIBUTED PARALLEL SPATIAL INDEXING SCHEMA BASED ON R-TREE

Spatial Data Management of Bio Regional Assessments Phase 1 for Coal Seam Gas Challenges and Opportunities

Project 2: Hadoop PageRank Cloud Computing Spring 2017

Proposal to Include a Grid Referencing System in S-100

Predictive Analytics on Accident Data Using Rule Based and Discriminative Classifiers

State of GIS at the High Performance Computing Cluster

V.4 MapReduce. 1. System Architecture 2. Programming Model 3. Hadoop. Based on MRS Chapter 4 and RU Chapter 2 IR&DM 13/ 14 !74

Technical Report Documentation Page

Scalable and Power-Efficient Data Mining Kernels

CWMS Modeling for Real-Time Water Management

Tall-and-skinny! QRs and SVDs in MapReduce

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson

Real-Time Meteorological Gridded Data: What s New With HEC-RAS

DP Project Development Pvt. Ltd.

Marla Meehl Manager of NCAR/UCAR Networking and Front Range GigaPoP (FRGP)

A Spatial Data Infrastructure for Landslides and Floods in Italy

Large Scale Evaluation of Chemical Structure Recognition 4 th Text Mining Symposium in Life Sciences October 10, Dr.

Unsupervised Anomaly Detection for High Dimensional Data

Discovering Spatial and Temporal Links among RDF Data Panayiotis Smeros and Manolis Koubarakis

Leon Creek Watershed October 17-18, 1998 Rainfall Analysis Examination of USGS Gauge Helotes Creek at Helotes, Texas

Automated Storm-based Scheduling on the National Weather Radar Testbed Phased Array Radar

Harvard Center for Geographic Analysis Geospatial on the MOC

Environment Change Prediction to Adapt Climate- Smart Agriculture Using Big Data Analytics

Electrical thunderstorm nowcasting using lightning data mining

INTRODUCTION TO GEOGRAPHIC INFORMATION SYSTEM By Reshma H. Patil

Big Data Analytics. Lucas Rego Drumond

Clustering algorithms distributed over a Cloud Computing Platform.

Orbital Insight Energy: Oil Storage v5.1 Methodologies & Data Documentation

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

ZFL, Center of Remote Sensing of Land Surfaces, University of Bonn, Germany. Geographical Institute, University of Bonn, Germany

How to make R, PostGIS and QGis cooperate for statistical modelling duties: a case study on hedonic regressions

An Optimized Interestingness Hotspot Discovery Framework for Large Gridded Spatio-temporal Datasets

Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization

Imagery and the Location-enabled Platform in State and Local Government

Extending the use of Flood Modeller Pro towards operational forecasting with Delft-FEWS

GIS (GEOGRAPHIC INFORMATION SYSTEMS)

Development of a GIS Interface for WEPP Model Application to Great Lakes Forested Watersheds

CS224W: Methods of Parallelized Kronecker Graph Generation

AN INTERNATIONAL SOLAR IRRADIANCE DATA INGEST SYSTEM FOR FORECASTING SOLAR POWER AND AGRICULTURAL CROP YIELDS

Research Collection. JSONiq on Spark. Master Thesis. ETH Library. Author(s): Irimescu, Stefan. Publication Date: 2018

A REVIEW ON SPATIAL DATA AND SPATIAL HADOOP

Advanced Computing Systems for Scientific Research

Internet Engineering Jacek Mazurkiewicz, PhD

Anomaly Detection for the CERN Large Hadron Collider injection magnets

Handling Uncertainty in Clustering Art-exhibition Visiting Styles

Rick Ebert & Joseph Mazzarella For the NED Team. Big Data Task Force NASA, Ames Research Center 2016 September 28-30

INTRODUCTION TO GIS. Dr. Ori Gudes

The GeoCLIM software for gridding & analyzing precipitation & temperature. Tamuka Magadzire, FEWS NET Regional Scientist for Southern Africa

RADAR Rainfall Calibration of Flood Models The Future for Catchment Hydrology? A Case Study of the Stanley River catchment in Moreton Bay, Qld

GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research

Exploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales. ACM MobiCom 2014, Maui, HI

Innovation. The Push and Pull at ESRI. September Kevin Daugherty Cadastral/Land Records Industry Solutions Manager

BCN: Expansible Network Structures for Data Centers Using Hierarchical Compound Graphs

Near Real-Time Runoff Estimation Using Spatially Distributed Radar Rainfall Data. Jennifer Hadley 22 April 2003

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence

ENV208/ENV508 Applied GIS. Week 1: What is GIS?

Deep Learning. Convolutional Neural Networks Applications

A Comparative Study of the National Water Model Forecast to Observed Streamflow Data

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde

Using R for Iterative and Incremental Processing

DATA SOURCES AND INPUT IN GIS. By Prof. A. Balasubramanian Centre for Advanced Studies in Earth Science, University of Mysore, Mysore

Hierarchical models for the rainfall forecast DATA MINING APPROACH

Large Alberta Storms. March Introduction

gvsig: Open Source Solutions in spatial technologies

Climpact2 and PRECIS

Uncertainty Quantification in Performance Evaluation of Manufacturing Processes

5A.10 A GEOSPATIAL DATABASE AND CLIMATOLOGY OF SEVERE WEATHER DATA

Transcription:

Rainfall data analysis and storm prediction system SHABARIRAM, M. E. Available from Sheffield Hallam University Research Archive (SHURA) at: http://shura.shu.ac.uk/15778/ This document is the author deposited version. You are advised to consult the publisher's version if you wish to cite from it. Published version SHABARIRAM, M. E. (2017). Rainfall data analysis and storm prediction system. In: Computational Intelligence for Societal Development in Developing Countries (CISDIDC), Sheffield Hallam University, 17 February 2017. (Unpublished) Copyright and re-use policy See http://shura.shu.ac.uk/information.html Sheffield Hallam University Research Archive http://shura.shu.ac.uk

Rainfall data analysis and Storm Prediction System Presented by C.P.Shabariram M.E., Assistant Professor

Problem Analysis The main problem is that big rainfall data stored in relational database is as input. System is implemented on Graph search which involves multiple scan of same data. Finally the system is run a single server without applying any distributed technology. The main objective is that preprocessing technique is used to filter the unnecessary rainfall data and analyzing only the meaningfull data.

Abstract Rainfall fall is collected to predict the storm warning from the hydrological data. This is considered as research idea as it consumes large number of records from the distributed systems. In this work, we proposed a novel solution to manage the data based on spatial temporal characteristic and Map Reduce Framework. The Work load is classified using Support Vector Machine to initialize the Map and Reduce function. It uses the feature selection and reduction algorithm to extract feature entity attribute. Various Rainstorm concepts prediction achieved using big raw rainfall data. Three concepts are defined local, hourly and overall storms. The proposed system serves as a tool for predict rain storm from large amount of rainfall data in effective manner. This system improves the performance in terms of accuracy and efficiency.

SYSTEM REQUIREMENTS Hardware Requirements: Processor RAM Hard Disk : Intel Pentium i3 : 2GB : Minimum 2GB free space Software Requirements: Operating System : Windows 8.1 Tool : Hadoop IDE : Eclipse Language : Java

Literature Survey Paper Tile Author Description Advantages Disadvantage s 1. Using Mapreduce to Speed Up Storm Identification from Big Raw Rainfall Data 2. Simplified Data Processing Large Clusters on 3. Rainfall Depth- Duration- Frequency Curves and Their Uncertainties K. Jitkajornwanich, U. Gupta, R. Elmasri, L. Fegaras, J.McEnery and J. Dean and S. Ghemawat A. Overeem, T. A. Buishand, and I. Holleman Relevant characteristics storm is identified from raw rainfall data. original raw rainfall data text files instead of using the data in the relational database. Implementation of Mapreduce runs on a large clusters of commodity machine effects of dependence between the maximum rainfalls for different durations on the estimation of DDF curves The performance of the new storm identification system is significantly improved, based on previous one The runtime system takes care of execution, handling errors program (ie) It is used as Statistical literature for the estimating maximum region the rainfall parallelization of computation in storm identification based on area and centre. Network bandwidth is a scare source Large samples are needed to estimate this shape parameter accurately or data from several sites in a region

continued 4. Experiences on A. Cary, Z. Sun, V. It describes the Computation is It is not applied to Processing Spatial Hristidis, and N. Rishe problems of r-tree, improved, that leads to high complex spatial Data Mapreduce with which is used as spatial access methods high linear scalability. problem 5. Statistical W. H. Asquith, M. C. The analysis is based on It helps to find the storm It is not suitable for Characteristics of Roussel, T. G. hourly rainfall data inter event time and specifying all sites in a Storm Inter event Cleveland, X. Fang, and recorded by NWS duration particular location Time, Depth and D. B.Thompson, Duration for Eastern New Mexico, Oklahoma and Texas

SYSTEM ARCHITECTURE

PROPOSED SYSTEM The reduction in number of records allows faster querying and mining of storm data. The framework is compatible with the original location-specific analysis of storms It helps the hydrologists by helping them to analyze data easier more efficiently.

Modules Modelling the Mapreduce framework for Task Processing. Classification of the data to mapper phase process based on the Spatial Temporal Characteristics using SVM.

Modeling the MapReduce for Taskprocessing MapReduce is a programming Model that is associated with rainfall data for task processing and generating data. The computation takes a set input key/value pairs and produces a set of output key/value pairs. Map method includes Resource Utility function Metadata Management function Fault tolerance function Providers Time slot utility function Reduce Method includes Concession Making algorithm

Classification of the data to mapper phase process based on the Spatial Temporal Characteristics using SVM Map Process takes caries of the partitioning the spatial data of the rainfall data. The support vector machine is used as a data mining technique to extract informative hydrologic Various percentages (from 50% to 10%) of hydrologic data, including those for flood stage and rainfall data, were mined and used as informative data to characterize a flood indicated attributes.

PERFORMANCE EVALUVATION The evaluation is based on storms stored in different clusters. It describes the cluster size of the each storm class during the SVM classification with class boundaries containing the threshold limits and state values.

Hadoop Installation

MainPage

DataPreprocessing

Analysing of Preprocessed Data as TextFiles

Moving data to HDFS server

LOCAL STORMS

CLUSTER CLASSIFICATION USING SVM

HOURLY STORMS

OVERALL STORMS

ANALYSING OF OVERALL STORMS

CONCLUTION The design and implementation a storm classification mechanism using SVM classification The data classification is carried in the map reduce paradigm using Hadoop framework. As the dataset is available in large scale and hence to improve the performance of the cluster scalability, it has been utilized and classify the rainfall data into cluster using the mapper and reduce functions.

FUTURE WORK The challenge to proposed system is to guarantee the quality of discovered relevance features in rainfall dataset for describing storm prediction large scale terms and data patterns. Most popular classification methods have adopted term-based approaches suffered from the problems of feature evolution. It discovers rainfall conditions as higher level features and deploys them over low-level features.

References K. Jitkajornwanich, R. Elmasri, C. Li, and J. McEnery, Extracting Storm- Centric Characteristics from Raw Rainfall Data for Storm Analysis and Mining, Proceedings of the 1st ACM SIGSPATIALInternational Workshop on Analytics for Big Geospatial Data (ACM SIGSPATIAL BIGSPATIAL 12), 2012, pp. 91-99. K. Jitkajornwanich, U. Gupta, R. Elmasri, L. Fegaras, and J.McEnery, Using MapReduce to Speed Up Storm Identification from Big Raw Rainfall Data, Proceedings of the 4th International Conference on Cloud Computing, GRIDs, and Virtualization (CLOUD COMPUTING 13), 2013, pp. 49-55. J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI 04), 2004

Continued.. A. Overeem, T. A. Buishand, and I. Holleman, Rainfall Depth-Duration- Frequency Curves and Their Uncertainties, Journal of Hydrology, vol. 348, 2008, pp. 124-134. W. H. Asquith, M. C. Roussel, T. G. Cleveland, X. Fang, and D. B.Thompson, Statistical Characteristics of Storm Interevent Time,Depth, and Duration for Eastern New Mexico, Oklahoma, and Texas, Professional Paper 1725. U.S. Geological Survey, 2006. W. H. Asquith, Depth-Duration Frequency of Precipitation for Texas, Water-Resources Investigations Report 98-4044. U.S.Geological Survey (USGS), 1998.

THANK YOU