ST-DBSCAN: An Algorithm for Clustering Spatial-Temporal Data


Di Qin, Carolina Department of Statistics and Operations Research. October 9, 2010

Introduction. Clustering is the process of grouping large data sets according to their similarity. The main families of methods are partitional, hierarchical, grid-based, model-based, and density-based. Density-based clustering: objects that form a dense region should be grouped together, so the method searches for regions of high density in a feature space that are separated by regions of lower density.

Spatial-Temporal Data. The data are stored as temporal slices of the spatial dataset. A clustering algorithm has to consider both the spatial and the temporal neighbors of objects. The algorithm can be used in geographic information systems, medical imaging, and weather forecasting.

DBSCAN. Designed to discover arbitrary-shaped clusters. Eps: a radius value based on a user-defined distance measure. MinPts: the minimum number of points that should occur within the Eps radius. Core object: a point whose Eps-neighborhood contains at least MinPts other points.
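The core-object test above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function names `eps_neighborhood` and `is_core_object` are hypothetical, Euclidean distance is assumed as the user-defined measure, and the point itself is not counted among its neighbors, following the slide's wording ("other points").

```python
import math

def eps_neighborhood(points, index, eps):
    """Return indices of all other points within distance eps of points[index]."""
    px, py = points[index]
    return [
        j for j, (qx, qy) in enumerate(points)
        if j != index and math.hypot(px - qx, py - qy) <= eps
    ]

def is_core_object(points, index, eps, min_pts):
    """A point is a core object if its Eps-neighborhood holds at least min_pts other points."""
    return len(eps_neighborhood(points, index, eps)) >= min_pts

# Four points packed together and one isolated point (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
dense_is_core = is_core_object(points, 0, eps=0.2, min_pts=3)
isolated_is_core = is_core_object(points, 4, eps=0.2, min_pts=3)
```

With these made-up points, the first point has three neighbors within 0.2 and qualifies as a core object, while the isolated point does not.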

Problems of Existing Approaches: Clustering spatial-temporal data. A distance measure is used to determine similarity, but existing approaches have only one distance parameter, Eps.

Problems of Existing Approaches: Clustering spatial-temporal data. Two distance metrics are used. Consider two objects $A(x_1, y_1, t_1, t_2)$ and $B(x_2, y_2, t_3, t_4)$, each with a spatial value and a non-spatial value. $Eps1 = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ measures the closeness of two points geographically. $Eps2 = \sqrt{(t_1 - t_3)^2 + (t_2 - t_4)^2}$ measures the similarity of temperature values. The temporal neighbors are retained first, together with their corresponding spatial values.
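As a hedged illustration of the two metrics, the sketch below computes the spatial and the non-spatial distance separately and checks both against their own thresholds. The function names and the sample values are assumptions made for this example; objects are represented as `(x, y, t1, t2)` tuples, as in the slide.

```python
import math

def spatial_distance(a, b):
    """Euclidean distance between the (x, y) coordinates of two objects."""
    (x1, y1, _, _), (x2, y2, _, _) = a, b
    return math.hypot(x1 - x2, y1 - y2)

def non_spatial_distance(a, b):
    """Euclidean distance between the two non-spatial values of two objects."""
    (_, _, t1, t2), (_, _, t3, t4) = a, b
    return math.hypot(t1 - t3, t2 - t4)

def are_neighbors(a, b, eps1, eps2):
    """Two objects are neighbors only if both distances are within their thresholds."""
    return spatial_distance(a, b) <= eps1 and non_spatial_distance(a, b) <= eps2

# Made-up objects: coordinates plus two temperature readings.
a = (0.0, 0.0, 20.0, 15.0)
b = (3.0, 4.0, 20.3, 15.4)
close_in_both = are_neighbors(a, b, eps1=6.0, eps2=0.6)
too_far_apart = are_neighbors(a, b, eps1=4.0, eps2=0.6)
```

Here the spatial distance is 5.0 and the non-spatial distance is about 0.5, so the pair passes with `eps1=6.0` but fails with `eps1=4.0`, even though the temperature values are similar in both cases.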

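Problems of Existing Approaches: Identifying noise objects. $\text{noise} = \{\, p \in D \mid \forall i : p \notin C_i \,\}$, where $C_i$ is a cluster of database $D$. O1 and O2 are noise objects. When the densities of the clusters are different, a single appropriate Eps cannot be determined.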

Problems of Existing Approaches: Identifying noise objects. The density distance of p is defined as density_distance(p) = density_distance_max(p) / density_distance_min(p).
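The slide gives only the ratio, so the following sketch rests on an assumption: density_distance_max(p) and density_distance_min(p) are taken to be the largest and smallest distance from p to its neighbors. The function name, the `dist` argument, and the handling of edge cases are choices made for this example only.

```python
import math

def density_distance(p, neighbors, dist):
    """Ratio of the largest to the smallest distance from p to its neighbors.

    Assumption: max and min are taken over the neighbors of p only.
    """
    distances = [dist(p, q) for q in neighbors]
    if not distances:
        raise ValueError("p has no neighbors, so the ratio is undefined")
    smallest = min(distances)
    if smallest == 0:
        # A neighbor coincides with p; treat the ratio as unbounded.
        return math.inf
    return max(distances) / smallest

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

ratio = density_distance((0.0, 0.0), [(1.0, 0.0), (0.0, 2.0), (4.0, 0.0)], euclidean)
```

A ratio close to 1 means the neighbors of p lie at similar distances, while a large ratio signals that the local density around p is uneven.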

Problems of Existing Approaches: Identifying adjacent clusters. Existing approaches are adequate if the clusters are distant from each other. In adjacent clusters, however, the values of the border objects on one side may be very different from the values of the border objects on the opposite side.

Problems of Existing Approaches: Identifying adjacent clusters. The solution is to compare the average value of a cluster with each newly arriving value. If the absolute difference between Cluster_Avg() and Object_Value is bigger than Δε, the object is not appended to the cluster.
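The check described above amounts to a single comparison. The sketch below is illustrative only: `can_join_cluster` is a hypothetical name, the cluster is represented as a plain list of its members' non-spatial values, and the temperatures are made up.

```python
def can_join_cluster(cluster_values, object_value, delta_eps):
    """Accept the object unless its value differs from the cluster average by more than delta_eps."""
    cluster_avg = sum(cluster_values) / len(cluster_values)
    return abs(cluster_avg - object_value) <= delta_eps

cluster_values = [20.0, 20.5, 21.0]  # average is 20.5
accepted = can_join_cluster(cluster_values, 21.0, delta_eps=1.0)
rejected = can_join_cluster(cluster_values, 23.0, delta_eps=1.0)
```

The value 21.0 differs from the average by 0.5 and is accepted; 23.0 differs by 2.5 and is left out, which keeps a neighboring cluster with a different typical value from being absorbed.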

The Description of the Algorithm. Inputs: $D = \{o_1, o_2, \dots, o_n\}$: set of objects. Eps1: maximum geographical coordinate (spatial) distance value. Eps2: maximum non-spatial distance value. MinPts: minimum number of points within Eps1 and Eps2 distance (may be ln(n), where n is the size of the dataset). Δε: threshold value for an object to be included in a cluster. Outputs: $C = \{C_1, C_2, \dots, C_k\}$: set of clusters.

The Description of the Algorithm. Start from the first point and check whether it is a core object, then select the next point. If the selected point does not belong to any cluster, call Retrieve_Neighbors(object, Eps1, Eps2), which returns the objects whose distance from the selected object is less than Eps1 (spatial) and Eps2 (non-spatial). The point is marked as noise if the total number of returned objects is less than MinPts.

The Description of the Algorithm. If the selected point has enough neighbors within Eps1 and Eps2, a new cluster is constructed. An object that is not already in a cluster is placed into the current cluster if the difference between the average value of the cluster and the newly arriving value is smaller than Δε. The Retrieve_Neighbors function considers both the spatial and the non-spatial values.
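The steps on these slides can be put together in a compact sketch. It is a simplified reading, not the presenter's or the original authors' code, and it makes several assumptions: each object is an `(x, y, value)` tuple with a single non-spatial value, the non-spatial distance is an absolute difference, the direct neighbors of the first core object join the cluster without the average check, and objects earlier marked as noise may later be absorbed as border objects, as in plain DBSCAN. All names are hypothetical.

```python
import math

NOISE = -1
UNCLASSIFIED = 0

def retrieve_neighbors(objects, i, eps1, eps2):
    """Indices of other objects within eps1 spatially and eps2 in the non-spatial value."""
    x, y, v = objects[i]
    return [
        j for j, (qx, qy, qv) in enumerate(objects)
        if j != i and math.hypot(x - qx, y - qy) <= eps1 and abs(v - qv) <= eps2
    ]

def st_dbscan(objects, eps1, eps2, min_pts, delta_eps):
    """Return one label per object: a cluster id (1, 2, ...) or NOISE."""
    labels = [UNCLASSIFIED] * len(objects)
    cluster_id = 0
    for i in range(len(objects)):
        if labels[i] != UNCLASSIFIED:
            continue
        neighbors = retrieve_neighbors(objects, i, eps1, eps2)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        # The selected object is a core object: construct a new cluster.
        cluster_id += 1
        labels[i] = cluster_id
        members = [i]
        stack = []
        for j in neighbors:
            if labels[j] in (UNCLASSIFIED, NOISE):
                labels[j] = cluster_id
                members.append(j)
                stack.append(j)
        # Expand the cluster through further core objects.
        while stack:
            current = stack.pop()
            current_neighbors = retrieve_neighbors(objects, current, eps1, eps2)
            if len(current_neighbors) < min_pts:
                continue
            for j in current_neighbors:
                if labels[j] not in (UNCLASSIFIED, NOISE):
                    continue
                cluster_avg = sum(objects[m][2] for m in members) / len(members)
                if abs(cluster_avg - objects[j][2]) <= delta_eps:
                    labels[j] = cluster_id
                    members.append(j)
                    stack.append(j)
    return labels

# Two compact groups with different values and one isolated object (made-up data).
objects = [
    (0.0, 0.0, 10.0), (0.1, 0.0, 10.1), (0.0, 0.1, 10.2), (0.1, 0.1, 10.1),
    (5.0, 5.0, 20.0), (5.1, 5.0, 20.1), (5.0, 5.1, 20.2), (5.1, 5.1, 20.1),
    (10.0, 10.0, 30.0),
]
labels = st_dbscan(objects, eps1=0.5, eps2=0.5, min_pts=2, delta_eps=1.0)
```

On this toy input the two groups come out as separate clusters and the isolated object is labeled as noise. The neighbor search here is a linear scan, so the sketch is quadratic in the number of objects; a spatial index would be the natural replacement for larger datasets.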

Application. The task is to discover the regions that have similar seawater characteristics. The data contain information about four seas: the Black Sea, the Marmara Sea, the Aegean Sea, and the eastern Mediterranean Sea.

Application. The database contains weekly daytime and nighttime temperature records measured at 5340 stations between 2001 and 2004.

Application. Parameters: Eps1 = 3, Eps2 = 0.5, MinPts = 15. Results: C1 is the coldest area, bordered by Ukraine and Russia. C2 is the second coldest area. C4 covers the north of the Aegean Sea. C5 and C7 can be one cluster in winter. C6 becomes a little smaller in summer. C7 is the hottest region.

Thanks