Time Series. CSE 6242 / CX 4242 Mar 27, Duen Horng (Polo) Chau Georgia Tech. Nonlinear Forecasting; Visualization; Applications


CSE 6242 / CX 4242, Mar 27, 2014. Time Series: Nonlinear Forecasting; Visualization; Applications. Duen Horng (Polo) Chau, Georgia Tech. Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos, Le Song.

Last Time. Similarity search: Euclidean distance; time-warping. Linear forecasting: AR (Auto Regression) methodology; RLS (Recursive Least Squares) = fast, incremental least squares.

This Time. Linear forecasting: co-evolving time sequences. Non-linear forecasting: lag-plots + k-NN. Visualization and applications.

Co-Evolving Time Sequences. Given: a set of correlated time sequences. Forecast: Repeated(t). [Figure: number of packets (sent, lost, repeated) vs. time tick; the latest value of "repeated" is marked "??"]

Q: What should we do?

Solution: Least Squares, with dependent variable Repeated(t) and independent variables Sent(t-1), ..., Sent(t-w); Lost(t-1), ..., Lost(t-w); Repeated(t-1), ..., Repeated(t-w) (named: MUSCLES [Yi+00]). [Figure: number of packets (sent, lost, repeated) vs. time tick]
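The regression setup on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the MUSCLES implementation from [Yi+00]: the function names `lagged_design` and `fit_and_forecast`, the synthetic data, and the window w = 2 are all assumptions made for the example.

```python
import numpy as np

def lagged_design(series, target, w):
    """Stack the last w values of every sequence as regressors for target(t).

    series: dict mapping a sequence name to a 1-D array (all the same length n).
    Returns X with one row per time tick t = w..n-1, and y = target(t).
    """
    n = len(series[target])
    cols = []
    for name in sorted(series):                  # fixed column order
        s = np.asarray(series[name], dtype=float)
        for lag in range(1, w + 1):
            cols.append(s[w - lag : n - lag])    # value at time t - lag
    X = np.column_stack(cols)
    y = np.asarray(series[target], dtype=float)[w:]
    return X, y

def fit_and_forecast(series, target, w):
    """Fit by ordinary least squares and forecast the next, unseen tick."""
    X, y = lagged_design(series, target, w)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Regressors for tick n: values at n-1, ..., n-w, in the same column order.
    x_next = np.concatenate(
        [np.asarray(series[name], dtype=float)[::-1][:w] for name in sorted(series)]
    )
    return coef, float(x_next @ coef)

# Synthetic example (hypothetical data): repeated(t) depends on sent(t-1) and lost(t-2).
rng = np.random.default_rng(0)
n = 200
sent = rng.uniform(20, 90, n)
lost = rng.uniform(0, 20, n)
repeated = np.zeros(n)
repeated[2:] = 0.5 * sent[1:-1] + 0.3 * lost[:-2]

coef, forecast = fit_and_forecast(
    {"sent": sent, "lost": lost, "repeated": repeated}, "repeated", w=2
)
print("forecast for the next tick:", forecast)
```

Because the synthetic target is an exact linear function of the lagged values, the fit recovers it; on real traffic data the forecast would carry error.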

Forecasting - Outline: Auto-regression; Least Squares and recursive least squares; Co-evolving time sequences; Examples; Conclusions.

Examples - Experiments. Datasets: modem pool traffic (14 modems, 1500 time-ticks; #packets per time unit); AT&T WorldNet internet usage (several data streams; 980 time-ticks). Measure of success: accuracy, as Root Mean Square Error (RMSE).
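As a small illustration of the accuracy measure, the sketch below computes RMSE for a "yesterday" baseline. It assumes "yesterday" means using the previous value as the forecast, which the slides do not spell out; the function names and the toy numbers are hypothetical.

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def yesterday_forecast(x):
    """Assumed baseline: predict x[t] = x[t-1]. Returns forecasts for t = 1..n-1."""
    return np.asarray(x, dtype=float)[:-1]

# Toy series (made up for the example).
packets = [3.0, 5.0, 4.0, 6.0]
baseline_error = rmse(packets[1:], yesterday_forecast(packets))
print("RMSE of the 'yesterday' baseline:", baseline_error)
```

Any other forecaster can be scored the same way by passing its predictions for t = 1..n-1 to `rmse`.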

Accuracy - Modem. [Figure: RMSE per modem (1-14) for AR, yesterday, and MUSCLES.] MUSCLES outperforms AR & yesterday.

Accuracy - Internet. [Figure: RMSE per stream (1-15) for AR, yesterday, and MUSCLES.] MUSCLES consistently outperforms AR & yesterday.

Linear forecasting - Outline: Auto-regression; Least Squares and recursive least squares; Co-evolving time sequences; Examples; Conclusions.

Conclusions - Practitioner's guide. AR(IMA) methodology: the prevailing method for linear forecasting. Recursive Least Squares: a brilliant method for fast, incremental estimation.
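To make "fast, incremental estimation" concrete, here is a minimal sketch of the textbook Recursive Least Squares update, which revises the coefficients one sample at a time instead of re-solving the full least-squares problem. The class name, the initial scale `delta`, and the optional forgetting factor are assumptions for illustration; the slides do not specify them.

```python
import numpy as np

class RecursiveLeastSquares:
    """Textbook RLS: keeps weights w and a matrix P that tracks (X^T X)^-1."""

    def __init__(self, n_features, delta=1e6, forgetting=1.0):
        self.w = np.zeros(n_features)
        self.P = delta * np.eye(n_features)   # large P = weak prior on w
        self.lam = forgetting                  # 1.0 means no forgetting

    def update(self, x, y):
        """Incorporate one sample (x, y) in O(d^2) time."""
        x = np.asarray(x, dtype=float)
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)
        self.w = self.w + gain * (y - x @ self.w)          # correct by prediction error
        self.P = (self.P - np.outer(gain, Px)) / self.lam  # rank-one downdate of P
        return self.w

# Hypothetical data: y is an exact linear function of three regressors.
rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0, 0.5])
samples = rng.normal(size=(200, 3))
rls = RecursiveLeastSquares(n_features=3)
for row in samples:
    rls.update(row, row @ true_w)
print("estimated weights:", rls.w)
```

Each update costs O(d^2) for d regressors, independent of how many samples have been seen, which is what makes the method suitable for streams.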

Resources: software and URLs. MUSCLES: Prof. Byoung-Kee Yi, http://www.postech.ac.kr/~bkyi/ or christos@cs.cmu.edu
R: http://cran.r-project.org/

Books. George E. P. Box, Gwilym M. Jenkins and Gregory C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice Hall, 1994 (the classic book on ARIMA, 3rd ed.). P. J. Brockwell and R. A. Davis, Time Series: Theory and Methods, Springer Verlag, New York, 1987.

Additional Reading. [Papadimitriou+ vldb2003] Spiros Papadimitriou, Anthony Brockwell and Christos Faloutsos, Adaptive, Hands-Off Stream Mining, VLDB 2003, Berlin, Germany, Sept. 2003. [Yi+00] Byoung-Kee Yi et al., Online Data Mining for Co-Evolving Time Sequences, ICDE 2000 (describes MUSCLES and Recursive Least Squares).

Outline: Motivation; ...; Linear Forecasting; Non-linear forecasting; Conclusions.

Chaos & non-linear forecasting

Reference: Deepay Chakrabarti and Christos Faloutsos, F4: Large-Scale Automated Forecasting using Fractals, CIKM 2002, Washington DC, Nov. 2002.

Detailed Outline. Non-linear forecasting: Problem; Idea; How-to; Experiments; Conclusions.

Recall: Problem #1. Given a time series {x_t}, predict its future course, that is, x_{t+1}, x_{t+2}, ... [Figure: value vs. time]

Datasets. Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]. [Figures: x(t) vs. time; lag-plot.] ARIMA: fails.

How to forecast? ARIMA - but: linearity assumption. [Figure: lag-plot.] ARIMA: fails.
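A quick way to see why the linearity assumption hurts on the Logistic Parabola is to fit x_t from x_{t-1} with a straight line and with a parabola. The sketch below uses a plain AR(1)-style least-squares line as a stand-in for ARIMA (it is not a full ARIMA fit), and it assumes a = 3.9, x_0 = 0.3, and no noise; none of these values come from the slides.

```python
import numpy as np

def logistic_parabola(n, a=3.9, x0=0.3):
    """Noise-free logistic parabola x_t = a * x_{t-1} * (1 - x_{t-1})."""
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = a * x[t - 1] * (1.0 - x[t - 1])
    return x

x = logistic_parabola(1000)
prev, curr = x[:-1], x[1:]          # the two axes of the lag plot

def fit_rmse(degree):
    """Least-squares polynomial fit of x_t on x_{t-1}; returns (coefficients, RMSE)."""
    coef = np.polyfit(prev, curr, degree)
    resid = curr - np.polyval(coef, prev)
    return coef, float(np.sqrt(np.mean(resid ** 2)))

lin_coef, lin_rmse = fit_rmse(1)     # linear model in x_{t-1}
quad_coef, quad_rmse = fit_rmse(2)   # the true relation is quadratic
print("linear fit RMSE:   ", lin_rmse)
print("quadratic fit RMSE:", quad_rmse)
```

The quadratic fit recovers the map almost exactly, while the line cannot follow the parabola in the lag plot; that gap is what motivates lag-plot methods that make no linearity assumption.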

How to forecast? ARIMA - but: linearity assumption. ANSWER: Delayed Coordinate Embedding = Lag Plots [Sauer92] ~ nearest-neighbor search, for past incidents.

General Intuition (Lag Plot), with Lag = 1 and k = 4 NN. Plot x_t against x_{t-1}. For a new point, find its 4 nearest neighbors (4-NN), then interpolate these to get the final prediction. [Figure: lag plot, x_t vs. x_{t-1}, with the new point and its 4-NN marked]
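The lag-plot intuition can be written as a short nearest-neighbor forecaster. This is a minimal sketch under stated assumptions: Euclidean distance between delay vectors, a plain average as the interpolation step, and a noise-free logistic parabola (a = 3.9) as test data. The function names are hypothetical.

```python
import numpy as np

def delay_vectors(x, lag):
    """Rows are [x[t-lag], ..., x[t-1]]; targets are the values x[t] that followed."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[i : len(x) - lag + i] for i in range(lag)])
    y = x[lag:]
    return X, y

def knn_forecast(x, lag=1, k=4):
    """Predict the next value by averaging what followed the k most similar past windows."""
    x = np.asarray(x, dtype=float)
    X, y = delay_vectors(x, lag)
    query = x[-lag:]                               # the "new point" on the lag plot
    dist = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dist)[:k]                 # indices of the k nearest neighbors
    return float(y[nearest].mean())                # interpolate: plain average

def logistic_parabola(n, a=3.9, x0=0.3):
    """Noise-free logistic parabola, used here only as example data."""
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = a * x[t - 1] * (1.0 - x[t - 1])
    return x

series = logistic_parabola(2001)
history, truth = series[:-1], series[-1]
prediction = knn_forecast(history, lag=1, k=4)
print("predicted:", prediction, "actual:", truth)
```

The linear scan over all past windows keeps the sketch short; for long histories a spatial index over the delay vectors would serve the same role.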

Questions: Q1: How to choose lag L? Q2: How to choose k (the # of NN)? Q3: How to interpolate? Q4: why should this work at all?

Q1: Choosing lag L. Manually (16, in the award-winning system by [Sauer94]).

Q2: Choosing number of neighbors k. Manually (typically ~ 1-10).

Q3: How to interpolate? How do we interpolate between the k nearest neighbors? A3.1: Average. A3.2: Weighted average (weights drop with distance - how?).
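One possible answer to the "how?" in A3.2 is inverse-distance weighting, sketched below. The slide leaves the weighting scheme open, so the 1 / (distance + eps) weights, the function name, and the toy series are assumptions, not the method used in the lecture.

```python
import numpy as np

def weighted_knn_forecast(x, lag=1, k=4, eps=1e-12):
    """k-NN forecast where closer neighbors count more (inverse-distance weights)."""
    x = np.asarray(x, dtype=float)
    # Delay vectors [x[t-lag], ..., x[t-1]] and the values x[t] that followed them.
    X = np.column_stack([x[i : len(x) - lag + i] for i in range(lag)])
    y = x[lag:]
    query = x[-lag:]
    dist = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / (dist[nearest] + eps)          # assumed scheme: weight ~ 1 / distance
    return float(np.sum(weights * y[nearest]) / np.sum(weights))

# Toy periodic series (made up): after 0.2 the series always continues with 0.3.
toy = [0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2]
print("weighted forecast:", weighted_knn_forecast(toy, lag=1, k=4))
```

With exact matches in the history, the weights are dominated by those matches, so the forecast is essentially the value that followed them; with A3.1 (plain average) the two less similar neighbors would pull the forecast away.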

Q3: How to interpolate? A3.3: Using SVD - seems to perform best ([Sauer94] - first place in the Santa Fe forecasting competition). [Figure: lag plot, x_t vs. x_{t-1}]
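The slide does not say how SVD enters the interpolation. One common reading, assumed here, is a local linear model fitted to the k nearest neighbors with an SVD-based least-squares solve. The sketch below uses `numpy.linalg.pinv` (computed via SVD) for that solve; it is an illustration of the idea, not a reproduction of the [Sauer94] system, and the function names and test data are hypothetical.

```python
import numpy as np

def local_linear_forecast(x, lag=1, k=6):
    """Fit x_t ~ c . [x_{t-lag..t-1}] + c0 on the k nearest neighbors, then evaluate it."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[i : len(x) - lag + i] for i in range(lag)])
    y = x[lag:]
    query = x[-lag:]
    nearest = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
    # Local design matrix with an intercept column.
    A = np.column_stack([X[nearest], np.ones(len(nearest))])
    coef = np.linalg.pinv(A) @ y[nearest]          # pseudo-inverse via SVD
    return float(np.append(query, 1.0) @ coef)

def logistic_parabola(n, a=3.9, x0=0.3):
    """Noise-free logistic parabola, used here only as example data."""
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = a * x[t - 1] * (1.0 - x[t - 1])
    return x

series = logistic_parabola(2001)
prediction = local_linear_forecast(series[:-1], lag=1, k=6)
print("predicted:", prediction, "actual:", series[-1])
```

Compared with averaging, the local line follows the slope of the lag plot around the new point, which matters most where the curve is steep.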


Q4: Any theory behind it? A4: YES!

Theoretical foundation. Based on Takens' theorem [Takens81], which says that long enough delay vectors suffice for prediction, even if there are unobserved variables in the dynamical system (= differential equations).

Detailed Outline. Non-linear forecasting: Problem; Idea; How-to; Experiments; Conclusions.

Datasets. Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]. [Figures: x(t) vs. time; lag-plot.] ARIMA: fails.

Logistic Parabola. [Figure: value vs. timesteps, with the point marked where our prediction starts]

Logistic Parabola: comparison of prediction to correct values. [Figure: value vs. timesteps]

Datasets. LORENZ: models convection currents in the air. dx/dt = a (y - x); dy/dt = x (b - z) - y; dz/dt = x y - c z. [Figure: value vs. time]
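To generate a LORENZ series like the one used in the experiments, the three equations can be integrated numerically. The sketch below uses a classical fourth-order Runge-Kutta step; the parameter values a = 10, b = 28, c = 8/3, the step size, and the starting point are the usual textbook choices and are assumptions here, since the slides do not give them.

```python
import numpy as np

def lorenz_rhs(state, a=10.0, b=28.0, c=8.0 / 3.0):
    """Right-hand side of the Lorenz equations as written on the slide."""
    x, y, z = state
    return np.array([a * (y - x), x * (b - z) - y, x * y - c * z])

def rk4_step(f, state, dt):
    """One classical Runge-Kutta (RK4) step for d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def simulate_lorenz(n_steps, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Return an array of shape (n_steps + 1, 3) holding x, y, z over time."""
    trajectory = np.empty((n_steps + 1, 3))
    trajectory[0] = start
    for i in range(n_steps):
        trajectory[i + 1] = rk4_step(lorenz_rhs, trajectory[i], dt)
    return trajectory

trajectory = simulate_lorenz(5000)
x_series = trajectory[:, 0]      # a single observed variable, e.g. for lag-plot forecasting
print("first values of x(t):", x_series[:5])
```

Keeping only `x_series` mimics the setting of Takens' theorem: y and z go unobserved, and delay vectors of x stand in for the full state.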

LORENZ: comparison of prediction to correct values. [Figure: value vs. timesteps]

Datasets. LASER: fluctuations in a laser over time (used in the Santa Fe competition). [Figure: value vs. time]

Laser: comparison of prediction to correct values. [Figure: value vs. timesteps]

Conclusions. Lag plots for non-linear forecasting (Takens' theorem); suitable for chaotic signals.

References. Deepay Chakrabarti and Christos Faloutsos, F4: Large-Scale Automated Forecasting using Fractals, CIKM 2002, Washington DC, Nov. 2002. Sauer, T. (1994). Time series prediction using delay coordinate embedding (in the book by Weigend and Gershenfeld, below). Addison-Wesley. Takens, F. (1981). Detecting strange attractors in fluid turbulence. Dynamical Systems and Turbulence. Berlin: Springer-Verlag.

References. Weigend, A. S. and N. A. Gershenfeld (1994). Time Series Prediction: Forecasting the Future and Understanding the Past, Addison Wesley. (Excellent collection of papers on chaotic/non-linear forecasting, describing the algorithms behind the winners of the Santa Fe competition.)

Overall conclusions. Similarity search: Euclidean/time-warping; feature extraction and SAMs. Linear forecasting: AR (Box-Jenkins) methodology. Non-linear forecasting: lag-plots (Takens).

Must-Read Material. Byoung-Kee Yi, Nikolaos D. Sidiropoulos, Theodore Johnson, H.V. Jagadish, Christos Faloutsos and Alex Biliris, Online Data Mining for Co-Evolving Time Sequences, ICDE, Feb 2000. Chungmin Melvin Chen and Nick Roussopoulos, Adaptive Selectivity Estimation Using Query Feedbacks, SIGMOD 1994.

Thanks: Deepay Chakrabarti (CMU); Spiros Papadimitriou (CMU); Prof. Dimitris Gunopulos (UCR); Mengzhi Wang (CMU); Prof. Byoung-Kee Yi (Pohang U.).

Time Series Visualization + Applications

Why Time Series Visualization? Time series is the most common data type. But why is time series so common?

How to build time series visualization? Easy way: use existing tools, libraries.
Google Public Data Explorer (Gapminder) http://goo.gl/hmrh
Google acquired Gapminder http://goo.gl/43avy (Hans Rosling's TED talk http://goo.gl/tkv7)
Google Annotated Time Line http://goo.gl/upm5w
Timeline, from MIT's SIMILE project http://simile-widgets.org/timeline/
Timeplot, also from SIMILE http://simile-widgets.org/timeplot/
Excel, of course

How to build time series visualization? The harder way: R (ggplot2); Matlab; gnuplot; ... The even harder way: D3, for web; JFreeChart (Java); ...

Time Series Visualization. Why is it useful? When is visualization useful? (Why not automate everything? Like using the forecasting techniques you learned last time.)

Time Series User Tasks. When was something greatest/least? Is there a pattern? Are two series similar? Do any of the series match a pattern? Provide simpler, faster access to the series. Does data element exist at time t? When does a data element exist? How long does a data element exist? How often does a data element occur? How fast are data elements changing? In what order do data elements appear? Do data elements exist together? (Müller & Schumann 03, citing MacEachren 95)

horizontal axis is time

http://www.patspapers.com/blog/item/what_if_everybody_flushed_at_once_edmonton_water_gold_medal_hockey_game/


Gantt Chart. Useful for projects. How to create in Excel: http://www.youtube.com/watch?v=sa67g6zakoe

ThemeRiver / Stacked graph / Streamgraph
http://www.nytimes.com/interactive/2008/02/23/movies/20080223_revenue_graphic.html
http://bl.ocks.org/mbostock/3943967

TimeSearcher: supports queries. http://hcil2.cs.umd.edu/video/2005/2005_timesearcher2.mpg

GeoTime http://www.youtube.com/watch?v=inkf86qjbda