Application of Clustering to Earth Science Data: Progress and Challenges

Similar documents
Finding Climate Indices and Dipoles Using Data Mining

Discovery of Climate Indices using Clustering

Data Mining for the Discovery of Ocean Climate Indices *

Data Mining for the Discovery of Ocean Climate Indices *

Mining Climate Data. Michael Steinbach Vipin Kumar University of Minnesota /AHPCRC

Discovering Dynamic Dipoles in Climate Data

Stefan Liess University of Minnesota Saurabh Agrawal, Snigdhansu Chatterjee, Vipin Kumar University of Minnesota

NEW GENERATION OF DATA MINING APPLICA- TIONS

Global Climate Patterns and Their Impacts on North American Weather

North Pacific Climate Overview N. Bond (UW/JISAO), J. Overland (NOAA/PMEL) Contact: Last updated: September 2008

Discovery of Patterns in the Global Climate System using Data Mining

El Niño / Southern Oscillation

THE PACIFIC DECADAL OSCILLATION (PDO)

MPACT OF EL-NINO ON SUMMER MONSOON RAINFALL OF PAKISTAN

Construction and Analysis of Climate Networks

TROPICAL-EXTRATROPICAL INTERACTIONS

lecture 10 El Niño and the Southern Oscillation (ENSO) Part I sea surface height anomalies as measured by satellite altimetry

Name: Date: Hour: Comparing the Effects of El Nino & La Nina on the Midwest (E4.2c)

The Climate System and Climate Models. Gerald A. Meehl National Center for Atmospheric Research Boulder, Colorado

The North Atlantic Oscillation: Climatic Significance and Environmental Impact

Climate Outlook for March August 2018

Forced and internal variability of tropical cyclone track density in the western North Pacific

NSF Expeditions in Computing. Understanding Climate Change: A Data Driven Approach. Vipin Kumar University of Minnesota

Lecture 8: Natural Climate Variability

Chapter outline. Reference 12/13/2016

Climate Outlook for October 2017 March 2018

Climate Outlook for December 2015 May 2016

Mining Relationships in Spatio-temporal Datasets

Climate Outlook for March August 2017

ATMOSPHERIC MODELLING. GEOG/ENST 3331 Lecture 9 Ahrens: Chapter 13; A&B: Chapters 12 and 13

Global Circulation. Local weather doesn t come from all directions equally Everyone s weather is part of the global circulation pattern

Ocean in Motion 7: El Nino and Hurricanes!

Different impacts of Northern, Tropical and Southern volcanic eruptions on the tropical Pacific SST in the last millennium

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 5 August 2013

Global Temperature Report: December 2018

THE INFLUENCE OF CLIMATE TELECONNECTIONS ON WINTER TEMPERATURES IN WESTERN NEW YORK INTRODUCTION

CHAPTER 1: INTRODUCTION

El Niño: How it works, how we observe it. William Kessler and the TAO group NOAA / Pacific Marine Environmental Laboratory

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 25 February 2013

New Zealand Climate Update No 223, January 2018 Current climate December 2017

Name: Climate Date: EI Niño Conditions

The Effect of the North Atlantic Oscillation On Atlantic Hurricanes Michael Barak-NYAS-Mentors: Dr. Yochanan Kushnir, Jennifer Miller

Interannual Variability of the South Atlantic High and rainfall in Southeastern South America during summer months

Introduction of climate monitoring and analysis products for one-month forecast

The Forcing of the Pacific Decadal Oscillation

Impacts of extreme climatic conditions on sugar cane production in northeastern Australia

Warming after a cold winter will disappear quickly as it did in 2007 By Joseph D Aleo

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 23 April 2012

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 15 July 2013

ENSO Outlook by JMA. Hiroyuki Sugimoto. El Niño Monitoring and Prediction Group Climate Prediction Division Japan Meteorological Agency

11/24/09 OCN/ATM/ESS The Pacific Decadal Oscillation. What is the PDO? Causes of PDO Skepticism Other variability associated with PDO

Ocean-Atmosphere Interactions and El Niño Lisa Goddard

March was 3rd warmest month in satellite record

Name the surface winds that blow between 0 and 30. GEO 101, February 25, 2014 Monsoon Global circulation aloft El Niño Atmospheric water

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 24 September 2012

An El Niño Primer René Gommes Andy Bakun Graham Farmer El Niño-Southern Oscillation defined

The Formation of Precipitation Anomaly Patterns during the Developing and Decaying Phases of ENSO

WINTER FORECAST NY Metro

ENSO: Recent Evolution, Current Status and Predictions. Update prepared by: Climate Prediction Center / NCEP 30 October 2017

Lab 12: El Nino Southern Oscillation

Semiblind Source Separation of Climate Data Detects El Niño as the Component with the Highest Interannual Variability

2013 ATLANTIC HURRICANE SEASON OUTLOOK. June RMS Cat Response

Weather Outlook 2016: Cycles and Patterns Influencing Our Growing Season

North Pacific Climate Overview N. Bond (UW/JISAO), J. Overland (NOAA/PMEL) Contact: Last updated: August 2009

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP 11 November 2013

Teleconnections and Climate predictability

Name Date Class. growth rings of trees, fossilized pollen, and ocean. in the northern hemisphere.

Global Ocean Monitoring: Recent Evolution, Current Status, and Predictions

Climate Variability. Andy Hoell - Earth and Environmental Systems II 13 April 2011

Global Atmospheric Circulation

Homework. Oceanography and Climate Review due Friday Feb 12 th (test day!!)

Climate Forecast Applications Network (CFAN)

Global Weather Trade Winds etc.notebook February 17, 2017

ENSO Cycle: Recent Evolution, Current Status and Predictions. Update prepared by Climate Prediction Center / NCEP July 26, 2004

MAURITIUS METEOROLOGICAL SERVICES

NIWA Outlook: March-May 2015

Appearance of solar activity signals in Indian Ocean Dipole (IOD) phenomena and monsoon climate pattern over Indonesia

Introduction. H/L = Higher/ Lower atmospheric pressure. WMO has produced this El Nino Update in order to ensure that the most effective

June 1989 T. Nitta and S. Yamada 375. Recent Warming of Tropical Sea Surface Temperature and Its. Relationship to the Northern Hemisphere Circulation

ENSO, AO, and climate in Japan. 15 November 2016 Yoshinori Oikawa, Tokyo Climate Center, Japan Meteorological Agency

ALASKA REGION CLIMATE OUTLOOK BRIEFING. December 22, 2017 Rick Thoman National Weather Service Alaska Region

KUALA LUMPUR MONSOON ACTIVITY CENT

Climate Variability and El Niño

SIO 210 Final Exam December 10, :30 2:30 NTV 330 No books, no notes. Calculators can be used.

Percentage of normal rainfall for August 2017 Departure from average air temperature for August 2017

Simple models for trading climate and weather risk

Introduction of products for Climate System Monitoring

ENSO: Recent Evolution, Current Status and Predictions. Update prepared by: Climate Prediction Center / NCEP 9 November 2015

Winter. Here s what a weak La Nina usually brings to the nation with tempseraures:

THE WINTER OF JUST HOW WEIRD WAS IT EXACTLY?

Earth s Climate Patterns

NOTES AND CORRESPONDENCE. El Niño Southern Oscillation and North Atlantic Oscillation Control of Climate in Puerto Rico

Pacific Decadal Oscillation ( PDO ):

Thai Meteorological Department, Ministry of Digital Economy and Society

Percentage of normal rainfall for April 2018 Departure from average air temperature for April 2018

Mid-season Storm Surge Update: December, 2013

Exploring Climate Patterns Embedded in Global Climate Change Datasets

Here s what a weak El Nino usually brings to the nation with temperatures:

Long term properties of monthly atmospheric pressure fields

Website Lecture 3 The Physical Environment Part 1

Transcription:

Application of Clustering to Earth Science Data: Progress and Challenges Michael Steinbach Shyam Boriah Vipin Kumar University of Minnesota Pang-Ning Tan Michigan State University Christopher Potter NASA Ames Research Center Steven Klooster California State University, Monterey Bay NASA funded project: Discovery of Changes from the Global Carbon Cycle and Climate System Using Data Mining Additional support from Army High Performance Computing Research Center Shyam Boriah Application of Clustering to Earth Science Data

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 2

Research Goal Average Monthly Temperature Research Goal: Find global climate patterns of interest to Earth Scientists A key interest is finding connections between the ocean / atmosphere and the land. grid cell NPP Longitude Pressure Precipitation SST Latitude... Time NPP Pressure Precipitation Satellite Data: Global snapshots of values for a number of variables on land surfaces or water Span a range of 10 to 50 years Gridded Shyam Boriah Application of Clustering to Earth Science Data 3 SST zone

The El Niño Climate Phenomenon El Niño is the anomalous warming of the eastern tropical region of the Pacific (off the coast of Peru) Normal Year: Trade winds push warm ocean water west, cool water rises in its place El Niño Year: Trade winds ease, switch direction, warmest water moves east. http://www.usatoday.com/weather/tg/wetnino/wetnino.htm Shyam Boriah Application of Clustering to Earth Science Data 4

The El Niño Climate Phenomenon Shyam Boriah Application of Clustering to Earth Science Data 5

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 6

Climate Indices: Connecting the Ocean/Atmosphere and the Land A climate index is a time series of temperature or pressure Similar to business or economic indices Based on Sea Surface Temperature (SST) or Sea Level Pressure (SLP) Climate indices are important because Dow Jones Index (from Yahoo!) They distill climate variability at a regional or global scale into a single time series They are well-accepted by Earth scientists They are related to well-known climate phenomena such as El Niño M. Steinbach Discovery of Climate Indices Using Clustering 7

A Temperature Based Climate Index: NINO1+2 Correlation Between ANOM 1+2 and Land Temp (>0.2) Correlation Between Niño 1+2 and Land Temperature (>0.2) 90 90 1 0.8 El Niño Events 60 60 30 0.9 0.6 0.8 0.4 0.7 0.2 0.6 latitude latitude 0 0.5 0 Niño 1+2 Index -30-60 -60-90 -90-180 -150-120 -90-60 -30 30 60 90 120 150 180-180 -150-120 -90-60 -30 0 30 60 90 120 150 180 longitude longitude 0.4-0.2 0.3-0.4 0.2-0.6 0.1-0.8 0 Shyam Boriah Application of Clustering to Earth Science Data 8

A Pressure Based Climate Index: SOI The Southern Oscillation Index (SOI) is also associated with El Niño. 90 1 Defined as the normalized pressure differences between Tahiti and Darwin Australia. latitude 60 Tahiti 30 0 Darwin 0.9 0.8 0.7 0.6 0.5 0.4-30 0.3 Both temperature and pressure based indices capture the same El Niño climate phenomenon. -60-90 -180-150 -120-90 -60-30 0 30 60 90 120 150 180 longitude 0.2 0.1 0 Shyam Boriah Application of Clustering to Earth Science Data 9

List of Well Known Climate Indices Index Description SOI Southern Oscillation Index: Measures the SLP anomalies between Darwin and Tahiti NAO North Atlantic Oscillation: Normalized SLP differences between Ponta Delgada, Azores and Stykkisholmur, Iceland AO Arctic Oscillation: Defined as the _first principal component of SLP poleward of 20 N PDO Pacific Decadel Oscillation: Derived as the leading principal component of monthly SST anomalies in the North Pacific Ocean, poleward of 20 N QBO Quasi-Biennial Oscillation Index: Measures the regular variation of zonal (i.e. east-west) strato-spheric winds above the equator CTI Cold Tongue Index: Captures SST variations in the cold tongue region of the equatorial Pacific Ocean (6 N-6 S, 180-90 W) WP Western Pacific: Represents a low-frequency temporal function of the zonal dipole' SLP spatial pattern involving the Kamchatka Peninsula, southeastern Asia and far western tropical and subtropical North Pacific NINO1+2 Sea surface temperature anomalies in the region bounded by 80 W-90 W and 0-10 S NINO3 Sea surface temperature anomalies in the region bounded by 90 W-150 W and 5 S-5 N NINO3.4 Sea surface temperature anomalies in the region bounded by 120 W-170 W and 5 S-5 N NINO4 Sea surface temperature anomalies in the region bounded by 150 W-160 W and 5 S-5 N Shyam Boriah Application of Clustering to Earth Science Data 10

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 11

Discovering Climate Indices: Traditional Approaches Earth scientists have discovered currently known climate indices Observation The El Niño phenomenon was first noticed by Peruvian fishermen centuries ago They observed that in some years the warm southward current, which appeared around Christmas, would persist for an unusually long time, with a disastrous impact on fishing Eigenvalue techniques such as Singular Value Decomposition (SVD) Shyam Boriah Application of Clustering to Earth Science Data 12

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 13

Discovering Climate Indices via Data Mining Clustering provides an alternative approach for finding candidate indices Clusters represent ocean regions with relatively homogeneous behavior The centroids of these clusters are time series that summarize the behavior of these ocean areas, and thus, represent potential climate indices Need to evaluate the influence of potential indices on land points Shyam Boriah Application of Clustering to Earth Science Data 14

Shared Nearest Neighbor (SNN) Clustering Density based clustering approach Determine the density of each point (time series) Density is high if most of your neighbors have you as a neighbor Perform the clustering using the density Identify and eliminate noise and outliers, which are points with low density Identify core points, which are time series with high density Build clusters around the core points Shyam Boriah Application of Clustering to Earth Science Data 15

SNN Density of SLP Time Series 90 60 30 latitude 0-30 -60-90 -180-150 -120-90 -60-30 0 30 60 90 120 150 180 longitude Redder areas are high density, i.e., high homogeneity. Shyam Boriah Application of Clustering to Earth Science Data 16

SLP Clusters 25 SLP Clusters Shyam Boriah Application of Clustering to Earth Science Data 17

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results: SST-based clusters Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 18

SST Clusters 90 107 SST Clusters 60 30 latitude 0-30 -60-90 -180-150 -120-90 -60-30 0 30 60 90 120 150 180 longitude Shyam Boriah Application of Clustering to Earth Science Data 19

Correlation of Known Indices with SST Cluster Centroids and SVD Components Known Indices Best Matching SST Cluster Centroid Best Matching SVD Component SOI 0.7006 0.5427 NAO 0.2973 0.1774 AO 0.2383 0.2301 PDO 0.5172 0.4684 QBO 0.2675 0.3187 CTI 0.9147 0.6316 WP 0.2590 0.1904 NINO1+2 0.9225 0.5419 NINO3 0.9462 0.6449 NINO3.4 0.9196 0.6844 NINO4 0.9165 0.6894 SST based cluster centroids have better correlation to known indices than SVD based indices in all but one case. Red indicates higher magnitude of correlation. 20

Area-weighted Correlation of Known Indices with SST Cluster Centroids and SVD Components Known Area Weighted Correlation for Indices Index Best Centroid Best SVD Component SOI 0.1550 0.1768 0.1240 NAO 0.1328 0.1387 0.0929 AO 0.1682 0.1912 0.0929 PDO 0.1378 0.1377 0.0891 QBO 0.0671 0.1377 0.0850 CTI 0.1702 0.1708 0.1240 WP 0.1117 0.1714 0.1240 NINO1+2 0.1558 0.1608 0.2091 NINO3 0.1774 0.1708 0.2091 NINO 3.4 0.1800 0.1714 0.2091 NINO 4 0.1696 0.1768 0.2091 SST based cluster centroids have higher area-weighted correlation than SVD based indices and known indices in most cases. Red indicates higher correlation. 21

Overview Background Climate indices Techniques for Discovering Climate Indices Traditional approaches Clustering Results Challenges Conclusion Shyam Boriah Application of Clustering to Earth Science Data 22

Dynamic Clusters Motivation: In Earth Science, most phenomena of interest evolve in space and time Most well-known indices based on data collected at fixed land stations However, underlying phenomenon may not occur at exact location of the land station. e.g. NAO Shyam Boriah Application of Clustering to Earth Science Data 23

An example NAO NAO refers to swings in the atmospheric sea level pressure difference between the Arctic and the subtropical Atlantic Computed as the normalized difference between SLP at a pair of land stations in these two regions of the North Atlantic Ocean Exact location of centers of the phenomena do not necessarily correspond to fixed land stations Source: Portis et al, Seasonality of the NAO, AGU Chapman Conference, 2000. Shyam Boriah Application of Clustering to Earth Science Data 24

Dynamic Clusters Need for novel clustering approaches that can find dynamic or moving clusters Climate phenomenon can be captured much more accurately by having a notion of a dynamic cluster This would represent a dynamic phenomenon based on satellite data for the entire region Such clusters might move in space and/or expand or contract in extent over time Successful development of dynamic clustering techniques will provide better insight as to how climate indices and their impact on land climate change over time Shyam Boriah Application of Clustering to Earth Science Data 25

Previous work in this area Two common approaches: 1. View data as collection of snapshots taken at different time periods 1 Spatial clustering is performed for each snapshot independently Correspondence between clusters in different time periods established by measuring fraction of objects the clusters share in common 2. Spatial clustering initially applied to data in first snapshot 2 Clusters updated incrementally taking into account changes due to moving objects that leave or join the cluster as time progresses Limitations: Clustering performed along one of the dimensions (either spatial or temporal, not both) loses information about contiguity of objects along other dimension 1. P. Laknis, N. Mamoulis, and S. Bakiras. On discovering moving clusters in spatio-temporal data. In Proc. of the 9th International Symposium on Spatial and Temporal Databases (SSTD 2005), Angra dos Reis, Brazil, 2005. 2. Y. Li, J. Han, and J. Yang. Clustering moving objects. In Proc. of the 2004 ACM SIGKDD Int'l Conf on Know ledge Discovery and Data Mining, 2004. Shyam Boriah Application of Clustering to Earth Science Data 26

Possible approaches Directly optimize a spatio-temporal objective function Partition space-time dimensions into spatio-temporal cells, and aggregate cells that have similar values while preserving spatial and temporal proximities Challenges: Feature extraction: granularity of spatio-temporal cells may be too fine. Need to determine (spatial and temporal) neighborhood size Spatio-temporal clustering: Formulate objective function. Spatial constraints: MBR? Visualization Shyam Boriah Application of Clustering to Earth Science Data 27

Conclusion Briefly described current progress in applying clustering to find climate indices One of the remaining challenges is to model dynamic clusters Future work: Investigate several approaches to modeling dynamic clusters Take into account domain specific factors such as seasonality, land cover and geographical boundaries Other challenges include those applicable to most clustering algorithms, i.e. handling outliers, parameter initialization and scalability Shyam Boriah Application of Clustering to Earth Science Data 28

Questions? Project web page: http://www.cs.umn.edu/~kumar/nasa-umn