The Caterpillar-SSA approach to time series analysis and its automatization


The Caterpillar-SSA approach to time series analysis and its automatization. Th. Alexandrov, N. Golyandina (theo@pdmi.ras.ru, nina@ng1174.spb.edu), St. Petersburg State University. Caterpillar-SSA and its automatization, p. 1/15

Outline
1. History
2. Possibilities and advantages
3. Caterpillar-SSA: basic algorithm
4. Decomposition feasibility
5. Identification of SVD components
6. Example: trend and seasonality extraction
7. Forecast
8. Example: signal forecast
9. AutoSSA: 1) extraction of trend, 2) extraction of cyclic components, 3) model example, 4) conclusions
10. Future plans

History
Origins:
- Singular system approach to the method of delays. Dynamical systems, analysis of attractors [mid-1980s] (Broomhead)
- Singular Spectrum Analysis. Geophysics/meteorology: signal/noise enhancing, distinguishing a time series from a red-noise realization (Monte Carlo SSA) [1990s] (Vautard, Ghil, Fraedrich)
- Caterpillar. An evolution of Principal Component Analysis [end of the 1990s] (Danilov, Zhigljavskij, Solntsev, Nekrutkin, Goljandina)
Books:
- Elsner, Tsonis. Singular Spectrum Analysis. A New Tool in Time Series Analysis, 1996.
- Golyandina, Nekrutkin, and Zhigljavsky. Analysis of Time Series Structure: SSA and Related Techniques, 2001.
Internet links and software:
- http://www.atmos.ucla.edu/tcd/ssa/
- http://www.gistatgroup.com/cat/
- http://www.pdmi.ras.ru/~theo/autossa/

Possibilities and advantages
Basic possibilities of the Caterpillar-SSA technique:
- Finding trends of different resolution
- Smoothing
- Extraction of seasonality components
- Simultaneous extraction of cycles with small and large periods
- Extraction of periodicities with varying amplitudes
- Simultaneous extraction of complex trends and periodicities
- Forecasting
- Change-point detection
Advantages:
- Does not require a parametric model of the time series
- Works with a wide spectrum of real-life time series
- Suitable for non-stationary time series
- Allows finding structure in short time series

Caterpillar-SSA: basic algorithm
Decomposes a time series into a sum of additive components, F = F^(1) + ... + F^(m), and provides information about each component.
Algorithm:
1. Trajectory matrix construction: F = (f_0, ..., f_{N-1}), X ∈ R^{L×K} (L: window length, a parameter; K = N − L + 1):
X =
[ f_0      f_1   ...  f_{K−1} ]
[ f_1      f_2   ...  f_K     ]
[ ...      ...   ...  ...     ]
[ f_{L−1}  f_L   ...  f_{N−1} ]
2. Singular Value Decomposition (SVD): X = Σ_j X_j, X_j = √λ_j U_j V_j^T, where λ_j is an eigenvalue of X X^T, U_j is an eigenvector of X X^T, V_j is an eigenvector of X^T X, and V_j = X^T U_j / √λ_j.
3. Grouping of SVD components: {1, ..., d} = ∪_k I_k, X^(k) = Σ_{j ∈ I_k} X_j.
4. Reconstruction by diagonal averaging: X^(k) → F^(k), so that F = F^(1) + ... + F^(m).
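The four steps above fit in a few lines of numpy. This is a minimal illustrative sketch, not the authors' implementation; the function name `ssa_decompose` is mine.

```python
import numpy as np

def ssa_decompose(f, L):
    """Basic Caterpillar-SSA: embed, SVD, and per-component reconstruction."""
    N = len(f)
    K = N - L + 1
    # Step 1: trajectory (Hankel) matrix; columns are lagged windows of f.
    X = np.column_stack([f[i:i + L] for i in range(K)])
    # Step 2: SVD of the trajectory matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Steps 3-4: form each elementary matrix X_j = sqrt(lambda_j) U_j V_j^T
    # and hankelize it by averaging over the anti-diagonals i + k = n.
    components = []
    for j in range(len(s)):
        Xj = s[j] * np.outer(U[:, j], Vt[j])
        comp = np.zeros(N)
        cnt = np.zeros(N)
        for i in range(L):
            for k in range(K):
                comp[i + k] += Xj[i, k]
                cnt[i + k] += 1
        components.append(comp / cnt)
    return np.array(components)
```

Since the SVD is exact and diagonal averaging is linear, summing all elementary components recovers the original series; grouping (step 3) amounts to summing a subset of the rows returned here.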

Decomposition feasibility
Separability: for F = F^(1) + F^(2), X = X^(1) + X^(2), we can form a group I_1 of SVD components such that Σ_{j ∈ I_1} X_j = X^(1).
Separability is a necessary condition for applying Caterpillar-SSA. Exact separability imposes strong constraints on the class of time series that can be processed. In real life one relies on asymptotic separability: for finite N, Σ_{j ∈ I_1} X_j ≈ X^(1), yielding an approximation of F^(1).
Examples of asymptotic separability:
- a deterministic signal is asymptotically separable from white noise;
- a periodicity is asymptotically separable from a trend.
Fulfilment of the (asymptotic) separability conditions places limitations on the value of the window length L.
Only time series that generate a finite number of SVD components — a hard constraint? Any linear combination of products of exponentials, harmonics, and polynomials generates a finite number of SVD components.

Identification of SVD components
Identification: choosing the SVD components at the grouping stage. Important examples:
Exponential trend: f_n = A e^{αn}. It generates one SVD component, with eigenvector U = (u_1, ..., u_L)^T, u_k = C e^{αk} (exponential form with the same α).
Exponentially-modulated harmonic: f_n = A e^{αn} cos(2πωn). It generates two SVD components, with eigenvectors U_1 = (u^(1)_1, ..., u^(1)_L)^T, u^(1)_k = C_1 e^{αk} cos(2πωk), and U_2 = (u^(2)_1, ..., u^(2)_L)^T, u^(2)_k = C_2 e^{αk} sin(2πωk) (exponentially-modulated form with the same α and ω).
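These rank statements are easy to check numerically. The sketch below is my own verification (the function name and the relative tolerance are assumptions): it counts the non-negligible singular values of the trajectory matrix for the two example signals.

```python
import numpy as np

def traj_matrix_rank(f, L, rel_tol=1e-8):
    """Number of non-negligible SVD components of the trajectory matrix."""
    X = np.column_stack([f[i:i + L] for i in range(len(f) - L + 1)])
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

n = np.arange(100)
exp_trend = 2.0 * np.exp(0.02 * n)                       # A e^{alpha n}
em_harmonic = np.exp(0.01 * n) * np.cos(2 * np.pi * n / 12)  # e-m harmonic
```

As the slide states, the exponential contributes one SVD component and the exponentially-modulated harmonic contributes two, for any window length 2 ≤ L ≤ N − 1.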

Example: trend and seasonality extraction
Traffic fatalities, Ontario, monthly, 1960–1974 (Abraham, Ledolter. Statistical Methods for Forecasting, 1983). N=180, L=60.
Trend: SVD components 1, 4, 5.
SVD components with period estimates: 2–3 (12), 4–5, 6–7 (6), 8 (2), 9–10 (10), 11–12 (4), 13–14 (2.4).

Forecast
Recurrent forecast in the framework of Caterpillar-SSA.
Linear Recurrent Formula (LRF): F = (f_0, ..., f_{N−1}): f_n = Σ_{k=1}^{d} a_k f_{n−k}.
Theory:
- Every time series with a finite number of SVD components is governed by an LRF of finite order.
- If F = F^(1) + F^(2) and {U_j}_{j ∈ I_1} spans the trajectory space of F^(1), then we can find an LRF governing F^(1).
Algorithm of a one-step forecast of F^(1):
1. Choose the window length L.
2. Identify the SVD components corresponding to F^(1): {U_j}_{j ∈ I_1}.
3. Reconstruct F^(1) via diagonal averaging.
4. Find an LRF governing F^(1), using the theorem.
5. Apply the LRF to the last values of the reconstructed F^(1) to obtain the forecast f_N^(1).
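A compact numpy sketch of the recurrent forecast, under the standard min-norm LRF construction from the eigenvectors (the vector R built from their last coordinates π_j, as in Golyandina et al., 2001); the function name and parameters here are my own choices.

```python
import numpy as np

def ssa_recurrent_forecast(f, L, r, steps):
    """Reconstruct the signal from the r leading SVD components,
    build the LRF from their eigenvectors, and iterate it forward."""
    N = len(f)
    K = N - L + 1
    X = np.column_stack([f[i:i + L] for i in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Ur = U[:, :r]
    # Diagonal averaging of the rank-r approximation of X.
    Xr = (Ur * s[:r]) @ Vt[:r]
    rec, cnt = np.zeros(N), np.zeros(N)
    for i in range(L):
        for k in range(K):
            rec[i + k] += Xr[i, k]
            cnt[i + k] += 1
    rec /= cnt
    # LRF coefficients: pi holds the last coordinates of the eigenvectors;
    # the forecast applies R to the last L-1 reconstructed values.
    pi = Ur[-1]
    R = (Ur[:-1] @ pi) / (1.0 - pi @ pi)
    out = list(rec)
    for _ in range(steps):
        out.append(float(R @ np.asarray(out[-(L - 1):])))
    return np.asarray(out[N:])
```

For a noise-free series of finite rank (e.g. a pure exponential with r = 1) this reproduces the exact continuation, matching the theory quoted above.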

Example: signal forecast
The first 119 points were given as the base for signal reconstruction and forecast; the remaining part of the time series is used to estimate the forecast quality. N=119, L=60, forecast of points 120–180.
SVD components: 1 (trend); 2–3, 5–6, 9–10 (harmonics with periods 12, 4, 2.4); 4 (harmonic with period 2).

AutoSSA: trend extraction
We examine every eigenvector U_j; write U = (u_1, ..., u_L)^T.
Low Frequencies method. Fourier expansion of U:
u_n = c_0 + Σ_{1 ≤ k ≤ (L−1)/2} ( c_k cos(2πnk/L) + s_k sin(2πnk/L) ) + (−1)^n c_{L/2},
where the last term is present only for even L. Periodogram:
Π^L_U(k/L) = (L/2) · { 2c_0², k = 0; c_k² + s_k², 1 ≤ k ≤ (L−1)/2; 2c²_{L/2}, L even and k = L/2 }.
Π^L_U(ω), ω ∈ {k/L}, reflects the contribution of the harmonic with frequency ω to the Fourier decomposition of U.
Parameter: ω_0, the upper boundary of the low-frequency interval.
C(U) = Σ_{0 ≤ k ≤ Lω_0} Π^L_U(k/L) / Σ_{0 ≤ k ≤ L/2} Π^L_U(k/L) — the contribution of the low frequencies.
C(U) ≥ C_0 ⇒ the eigenvector U corresponds to a trend, where C_0 ∈ (0, 1) is a threshold. Usually C_0 = 0.1.
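The low-frequencies criterion can be computed directly from an FFT. Below is a sketch under the slide's definitions; the common L/2 factor cancels in the ratio C(U), and the function names are mine.

```python
import numpy as np

def periodogram(U):
    """Pi^L_U(k/L) from the slide, up to the common L/2 factor."""
    L = len(U)
    F = np.fft.rfft(U) / L        # F[k] = (c_k - i s_k)/2 for interior k
    p = 4.0 * np.abs(F) ** 2      # interior bins: c_k^2 + s_k^2
    p[0] /= 2.0                   # k = 0: 2 c_0^2
    if L % 2 == 0:
        p[-1] /= 2.0              # k = L/2: 2 c_{L/2}^2
    return p

def lf_contribution(U, omega0):
    """C(U): share of periodogram power in frequencies [0, omega0]."""
    p = periodogram(U)
    freqs = np.arange(len(p)) / len(U)
    return p[freqs <= omega0].sum() / p.sum()

def is_trend_vector(U, omega0, C0=0.1):
    return lf_contribution(U, omega0) >= C0
```

A slowly varying eigenvector (e.g. sampled from an exponential) concentrates its power near frequency zero and passes the test, while a fast harmonic fails it.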

AutoSSA: periodicity extraction
We consider every pair of neighboring eigenvectors U_j, U_{j+1}.
Fourier method.
Stage 1. Check the maximal frequencies: θ_j = arg max_k Π^L_{U_j}(k/L); if |θ_j − θ_{j+1}| ≤ s_0, the pair (j, j+1) is a candidate harmonic pair.
Stage 2. Check the form of the periodograms: ρ_(j,j+1) = (1/2) ( max_k Π^L_{U_j}(k/L) + max_k Π^L_{U_{j+1}}(k/L) ); for an exact harmonic pair ρ_(j,j+1) = 1.
ρ_(j,j+1) ≥ ρ_0 ⇒ the pair (j, j+1) corresponds to a harmonic, where ρ_0 ∈ (0, 1) is a threshold. Usually ρ_0 = 0.7.
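A matching sketch of the two-stage pair test. Two assumptions of mine: the periodogram maxima are normalized by the total power (for unit-norm eigenvectors this is the identity, by Parseval), and s_0 is measured in frequency-grid units.

```python
import numpy as np

def periodogram(U):
    """Pi^L_U(k/L) up to a common constant factor."""
    L = len(U)
    F = np.fft.rfft(U) / L
    p = 4.0 * np.abs(F) ** 2
    p[0] /= 2.0
    if L % 2 == 0:
        p[-1] /= 2.0
    return p

def harmonic_pair_test(U1, U2, s0=1, rho0=0.7):
    """Stage 1: the peak frequencies of the pair must (almost) coincide.
    Stage 2: rho, the mean peak share of the two periodograms, must
    exceed rho0 (rho = 1 for an exact sine/cosine pair on the grid)."""
    p1, p2 = periodogram(U1), periodogram(U2)
    t1, t2 = int(np.argmax(p1)), int(np.argmax(p2))
    rho = 0.5 * (p1.max() / p1.sum() + p2.max() / p2.sum())
    return abs(t1 - t2) <= s0 and rho >= rho0, rho
```

A cos/sin pair at one grid frequency passes both stages with ρ ≈ 1; two harmonics at different frequencies already fail stage 1.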

AutoSSA: model example
Signal forecast. F: f_n = s_n + ε_n, s_n = e^{0.015n} + 2 e^{0.005n} sin(2πn/24) + e^{0.01n} sin(2πn/6 + 2), ε_n ~ N(0, 1). N=95, L=48, forecast of points 96–144.
Trend: 1st SVD component (C(U_1) = 0.997 > 0.1).
Periodicity, T=6: components 2–3 (ρ_(2,3) = 0.95 > 0.7); T=24: components 4–5 (ρ_(4,5) = 0.96 > 0.7).

AutoSSA: conclusions
Main questions: does automatization make sense, and how should the thresholds be chosen? There is no general answer.
One application of AutoSSA: processing time series from a given class (e.g. trend extraction when we know that the trend has exponential form with α < α_0). This is reasonable for batch processing of time series data.
We investigated the applicability of AutoSSA to extraction and forecasting of an exponential trend / exponentially-modulated harmonic. For this task AutoSSA provides good results, and its quality is comparable with interactive identification. We found that the thresholds for extraction and for forecasting are the same. It is possible to work out instructions for threshold setting for a given class of time series.

Future plans
Open problems:
- SSA: complex time series models, for instance with parameters (e.g. the harmonic frequency) modulated in time.
Future plans:
- AutoSSA: methodology.
- SSA: comparison of Caterpillar-SSA with other techniques: standard techniques, wavelets, neural networks.
- SSA: application of Caterpillar-SSA in other areas.