Applying Visual Analytics Methods to Spatial Time Series Data: Forest Fires, Phone Calls,
Dr. & Dr. Natalia Andrienko
In collaboration with TU Darmstadt and Univ. Constance, DFG Priority Research Program on Visual Analytics, SPP 1335
http://visualanalytics.info
Joint research group of Fraunhofer Institute IAIS and University of Bonn
Major R&D direction: Visual Analytics for spatial and temporal data (data mining, optimization, geovisualization)
Key research projects:
- GeoPKDD: Geographic Privacy-aware Knowledge Discovery (FET-Open, 2005-2009)
- ViAMoD: Visual Spatiotemporal Pattern Analysis of Movement and Event Data (DFG, 2008-2011)
- OASIS (2004-2008), ESS (2009-2013): EU-funded IPs on crisis management
- VisMaster, MODAP (FET-Open CAs)
Spatial Time Series: Data Structure
- Spatial references: states of the USA
- Temporal references: years from 1960 till 2000 (41 years)
- Attributes: population + various crime rates
OECD Explorer (2009), EuroFigures (1999)
Spatial visualizations: animated maps, diagram maps
Temporal visualizations
Exploring trends by interactive operations
Scalability problem
What if we have
- Multiple attributes
- Many places
- Long time series
Interactive visualization is not sufficient. We need grouping in space and time => Clustering
Data sets
Forest fires in Italy (courtesy of the European Forest Fires database, accessed within the Forest Fires project, tender to JRC)
- Monthly counts of forest fires for 107 regions, 24 years; 24 x 12 = 264 time intervals
- Annual counts of forest fires by different causes, accidental and deliberate
Phone calls in Milan (courtesy of WIND, provided within ESS project)
- Hourly counts of mobile phone calls, 9 days, 238 regions; 9 x 24 = 216 time intervals
Approach
1) Group places by similarity of temporal dynamics: clustering Time in Space
2) Group time intervals by similarity of spatial situations: clustering Space in Time
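The two groupings can be read as two ways of slicing the same place-by-time table. A minimal sketch in Python, using made-up region names and counts (not the forest fire or phone call data) and hypothetical variable names:

```python
# Hypothetical toy data: counts[place] is the time series of one place.
counts = {
    "region_A": [2, 5, 9, 4],
    "region_B": [1, 1, 2, 1],
    "region_C": [3, 6, 8, 5],
}
places = sorted(counts)
n_steps = len(counts[places[0]])

# Time in Space: one feature vector per place (its temporal dynamics).
time_in_space = [counts[p] for p in places]

# Space in Time: one feature vector per time interval (the spatial situation
# at that interval, i.e. the values of all places at the same time step).
space_in_time = [[counts[p][t] for p in places] for t in range(n_steps)]
```

Either list of feature vectors can then be passed to the same clustering method; only the objects being grouped (places or time intervals) change.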
SOM
The Self-Organizing Map (Kohonen 2001) is a neural-network-type vector projection and quantization algorithm. By means of a competitive, iterative training process, a network of prototype vectors (also called neurons or cells) is adjusted to the input vector data. The output of the algorithm is a network of vectors that approximately preserves the topology of the input data. The network can be interpreted as a set of clusters and simultaneously as a map on which to lay out the input data elements (e.g., by assigning each element to its nearest prototype). Typically, two-dimensional rectangular or hexagonal prototype vector networks are assumed. The capability of SOM to arrange input data in a regular network structure provides good opportunities for visualization.
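The training process described above can be illustrated with a small, textbook-style online SOM on a rectangular grid. This is only a sketch under assumptions: the function names, the linear decay of learning rate and neighborhood radius, and the toy data are illustrative choices, not the implementation used in the presented work.

```python
import math
import random


def best_matching_unit(protos, x):
    """Return the grid cell whose prototype is nearest to x (squared Euclidean)."""
    return min(protos, key=lambda cell: sum((a - b) ** 2 for a, b in zip(protos[cell], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM; returns {(row, col): prototype vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Initialise each prototype with a copy of a randomly chosen input vector.
    protos = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
            bmu = best_matching_unit(protos, x)      # competitive step
            for (r, c), w in protos.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return protos


# Toy input with two obvious groups, mapped onto a 1 x 2 grid.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(data, rows=1, cols=2)
clusters = [best_matching_unit(som, x) for x in data]
```

Each input vector is assigned to the cell of its nearest prototype, so the grid serves both as a set of clusters and as a layout for the data.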
Space-in-Time and Time-in-Space SOMs: visualization
1. Bars on top of a cell show the number of objects inside
2. Shading of borders between cells reflects similarity of features
3. Similarity of cell colors also reflects similarity of features
4. Colors are projected on other displays: maps (if grouping places) and time graphs (if grouping time intervals)
Natalia Andrienko, Sebastian Bremm, Tobias Schreck, Tatiana von Landesberger, Peter Bak, Daniel Keim. Space-in-Time and Time-in-Space Self-Organizing Maps for Exploring Spatiotemporal Patterns. Computer Graphics Forum, 2010, v.29 (3)
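One simple way to obtain colors whose similarity follows cell proximity is to interpolate between four corner colors over the grid. The sketch below assumes this bilinear scheme and arbitrary corner colors; the actual color scheme of the presented tool is not specified here, and `cell_color` and `place_to_cell` are hypothetical names.

```python
def cell_color(r, c, rows, cols,
               corners=((255, 0, 0), (255, 255, 0), (0, 0, 255), (0, 255, 0))):
    """RGB color of grid cell (r, c) by bilinear interpolation.

    corners = (top-left, top-right, bottom-left, bottom-right); assumed values.
    """
    u = c / (cols - 1) if cols > 1 else 0.0  # horizontal position in [0, 1]
    v = r / (rows - 1) if rows > 1 else 0.0  # vertical position in [0, 1]
    tl, tr, bl, br = corners
    rgb = []
    for k in range(3):
        top = tl[k] * (1 - u) + tr[k] * u
        bottom = bl[k] * (1 - u) + br[k] * u
        rgb.append(round(top * (1 - v) + bottom * v))
    return tuple(rgb)


# Projecting colors onto another display: every place inherits the color of
# the SOM cell it was assigned to (hypothetical assignment on a 3 x 3 grid).
place_to_cell = {"region_A": (0, 0), "region_B": (0, 1), "region_C": (2, 2)}
place_colors = {p: cell_color(r, c, 3, 3) for p, (r, c) in place_to_cell.items()}
```

Places assigned to neighboring cells then appear in similar colors on the map, while places from distant cells are clearly distinguishable.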
Time-in-Space SOM of forest fires
Inside cells:
- index images (what is grouped)
- feature images (what are the features)
[Figure panels: Feature images, Index images]
Time-in-Space SOM: details
Space-in-Time SOM (index images; axes: years, months 1 to 12)
Space-in-Time SOM (feature images)
Comparison of spatial distributions of temporal dynamics (Cause 3: Accident; Cause 4: Deliberate)
Time-in-Space SOM of mobile phone calls
Space-in-Time SOM of mobile phone calls
Next step: predictive modeling
Applying Holt-Winters smoothing to detected clusters, separately for working days and weekends (joint work with Christian Pölitz, Univ. Bonn)
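For orientation, the additive form of Holt-Winters smoothing can be sketched as follows. This is a textbook variant with a simple initialization; the smoothing parameters, the function name, and the toy series are assumptions, and the slides do not state which variant or settings were actually used.

```python
def holt_winters_additive(series, season_len, alpha=0.3, beta=0.05, gamma=0.2, horizon=24):
    """Additive Holt-Winters smoothing; returns forecasts for the next `horizon` steps."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    first = series[:season_len]
    second = series[season_len:2 * season_len]
    # Simple initialization: level = mean of the first season, trend = average
    # per-step change between the first two seasons, seasonal = deviations.
    level = sum(first) / season_len
    trend = (sum(second) - sum(first)) / season_len ** 2
    seasonal = [x - level for x in first]
    for t, y in enumerate(series):
        s = seasonal[t % season_len]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s
    n = len(series)
    return [level + h * trend + seasonal[(n - 1 + h) % season_len]
            for h in range(1, horizon + 1)]


# Toy series with a period of 4 (for hourly counts with a daily cycle,
# season_len=24 would be the natural assumption).
hourly = [10, 20, 30, 20] * 6
forecast = holt_winters_additive(hourly, season_len=4, horizon=4)
```

To follow the split mentioned above, one such model would be fitted to the working-day part of each cluster's aggregated series and another to the weekend part.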
Further work: search for outliers, event detection etc.
Conclusions
Interactive and animated maps and graphs are not sufficient for analyzing large and complex data. Visual methods need to be augmented by computations.
With space-in-time and time-in-space SOMs we consider data from two different perspectives:
- places grouped by similar attribute dynamics
- time intervals grouped by similar spatial distributions of attribute values
Cluster colors reflect the similarity of the clusters.
Two case studies demonstrate the value of the approach.
A step towards predictive modeling.
Live demo is possible.