Streaming multiscale anomaly detection

Similar documents
Anomaly Detection via Over-sampling Principal Component Analysis

Anomaly Detection via Online Oversampling Principal Component Analysis

L 2,1 Norm and its Applications

Mejbah Alam. Justin Gottschlich. Tae Jun Lee. Stan Zdonik. Nesime Tatbul

CS281 Section 4: Factor Analysis and PCA

Change Detection in Multivariate Data

Index Terms: Wavelets, Outliers, resolution, Residuals, Distributions, Gaussian, Discrete, Analysis IJSER

Reducing The Data Transmission in Wireless Sensor Networks Using The Principal Component Analysis

Comparative Performance Analysis of Three Algorithms for Principal Component Analysis

SPECTRAL ANOMALY DETECTION USING GRAPH-BASED FILTERING FOR WIRELESS SENSOR NETWORKS. Hilmi E. Egilmez and Antonio Ortega

A CUSUM approach for online change-point detection on curve sequences

A New Unsupervised Event Detector for Non-Intrusive Load Monitoring

Short Term Memory Quantifications in Input-Driven Linear Dynamical Systems

Research Overview. Kristjan Greenewald. February 2, University of Michigan - Ann Arbor

Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Comparison of statistical process monitoring methods: application to the Eastman challenge problem

Unsupervised Anomaly Detection for High Dimensional Data

Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection

Available online at ScienceDirect. Procedia Computer Science 96 (2016 )

Anomaly Detection via Online Over-Sampling Principal Component Analysis

ECE 661: Homework 10 Fall 2014

Machine learning for pervasive systems Classification in high-dimensional spaces

Anomaly Detection for the CERN Large Hadron Collider injection magnets

Tennis player segmentation for semantic behavior analysis

Learning the Linear Dynamical System with ASOS ( Approximated Second-Order Statistics )

Detection of signal transitions by order statistics filtering

A Framework for Adaptive Anomaly Detection Based on Support Vector Data Description

Salt Dome Detection and Tracking Using Texture Analysis and Tensor-based Subspace Learning

Artificial Intelligence Module 2. Feature Selection. Andrea Torsello

Outlier Detection Using Rough Set Theory

STA 4273H: Statistical Machine Learning

Multiple Similarities Based Kernel Subspace Learning for Image Classification

Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts

Hierarchical Boosting and Filter Generation

Iterative Laplacian Score for Feature Selection

Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning

From Last Meeting. Studied Fisher Linear Discrimination. - Mathematics. - Point Cloud view. - Likelihood view. - Toy examples

Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams

Online Estimation of Discrete Densities using Classifier Chains

Tracking the Intrinsic Dimension of Evolving Data Streams to Update Association Rules

Describing and Forecasting Video Access Patterns

Online Appearance Model Learning for Video-Based Face Recognition

Mining Big Data Using Parsimonious Factor and Shrinkage Methods

Reconnaissance d objetsd et vision artificielle

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Rare Event Discovery And Event Change Point In Biological Data Stream

Finding Robust Solutions to Dynamic Optimization Problems

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization

Particle Filtered Modified-CS (PaFiMoCS) for tracking signal sequences

WMO Aeronautical Meteorology Scientific Conference 2017

Digital Image Processing Lectures 13 & 14

Recipes for the Linear Analysis of EEG and applications

Iterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 13: SEQUENTIAL DATA

System 1 (last lecture) : limited to rigidly structured shapes. System 2 : recognition of a class of varying shapes. Need to:

Believe it Today or Tomorrow? Detecting Untrustworthy Information from Dynamic Multi-Source Data

Classification of Hand-Written Digits Using Scattering Convolutional Network

Real-time Sentiment-Based Anomaly Detection in Twitter Data Streams

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Dimensionality Reduction Using PCA/LDA. Hongyu Li School of Software Engineering TongJi University Fall, 2014

Learning from Time-Changing Data with Adaptive Windowing

Using HDDT to avoid instances propagation in unbalanced and evolving data streams

Using wavelet tools to estimate and assess trends in atmospheric data

Wavelet Transform And Principal Component Analysis Based Feature Extraction

Jun Zhang Department of Computer Science University of Kentucky

Process Monitoring Based on Recursive Probabilistic PCA for Multi-mode Process

Grassmann Averages for Scalable Robust PCA Supplementary Material

Modelling Multivariate Data by Neuro-Fuzzy Systems

ISSN: (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies

MATH 829: Introduction to Data Mining and Analysis Principal component analysis

Principal Component Analysis CS498

Aruna Bhat Research Scholar, Department of Electrical Engineering, IIT Delhi, India

An Overview of Outlier Detection Techniques and Applications

A randomized block sampling approach to the canonical polyadic decomposition of large-scale tensors

The Kalman Filter ImPr Talk

Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model

TUTORIAL PART 1 Unsupervised Learning

An Enhanced Seasonal-Hybrid ESD Technique for Robust Anomaly Detection on Time Series

Machine Learning: Basis and Wavelet 김화평 (CSE ) Medical Image computing lab 서진근교수연구실 Haar DWT in 2 levels

Intelligent Fault Classification of Rolling Bearing at Variable Speed Based on Reconstructed Phase Space

Minor probability events detection in big data: An integrated approach with Bayesian testing and MIM

ANOMALY (or outlier) detection aims to identify a small

Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection

PHONEME CLASSIFICATION OVER THE RECONSTRUCTED PHASE SPACE USING PRINCIPAL COMPONENT ANALYSIS

Multi-scale Geometric Summaries for Similarity-based Upstream S

LARGE SCALE ANOMALY DETECTION AND CLUSTERING USING RANDOM WALKS

Unsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent

Correlation Preserving Unsupervised Discretization. Outline

Machine Learning Approaches to Network Anomaly Detection

Feature Engineering, Model Evaluations

Eigen Co-occurrence Matrix Method for Masquerade Detection

Principal Component Analysis

ESANN'2001 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), April 2001, D-Facto public., ISBN ,

Clustering VS Classification

Multi-Layer Boosting for Pattern Recognition

Feature Extraction with Weighted Samples Based on Independent Component Analysis

RaRE: Social Rank Regulated Large-scale Network Embedding

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu

CS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.

Transcription:

Streaming multiscale anomaly detection DATA-ENS Paris and ThalesAlenia Space B Ravi Kiran, Université Lille 3, CRISTaL Joint work with Mathieu Andreux beedotkiran@gmail.com June 20, 2017 (CRISTaL) Streaming anomaly detection June 20, 2017 1 / 22

Overview 1 Anomaly detection 2 Time series representation 3 Streaming Subspace Tracking 4 Results (CRISTaL) Streaming anomaly detection June 20, 2017 2 / 22

Motivation : Anomaly detection Areas : Industrial processes, medical and satellite telemetry, Finance Anomaly Detection : x(t), t where signal deviates from the local mean value. x(t) are observations over time t where new data arrives over time t. High volumes of data are generated per day (in the GBs). Requirements : Time series representation is robust to variation in scale of pseudo-periodicities (window size). Streaming time series anomaly detection to handle large abouts of data. Require an online multiscale anomaly detection algorithm. (CRISTaL) Streaming anomaly detection June 20, 2017 3 / 22

Yahoo! and Numenta Datasets Yahoo! unsupervised anomaly detection Benchmark [8] provides datasets with annotated anomalies and changepoints. Numenta Anomaly Benchmark (NAB) [9] provides an evaluation of streaming time series anomaly detection algorithms. The datasets contain various types of anomalies : level shifts/change-points, point anomalies, change in periodicities, value drifts, change in envelopes, linear trends. (CRISTaL) Streaming anomaly detection June 20, 2017 4 / 22

Anomaly detection problem Formulation : Track the principal direction given a scale/lag p for the design matrix of time series. Evaluate the reconstruction error to measure deviation from the rest of the windows. Evaluate across multiple lags (p) Characterize anomalies by their variation in reconstruction error across scale of lag-window size. Related work : Streaming anomaly detection by subspace tracking [5] Tracking correlations over multi-scale windows for frequent motif extraction [12] Multi-scale anomaly detection offline [3] (CRISTaL) Streaming anomaly detection June 20, 2017 5 / 22

Time series Embedding We build a lag matrix over a window of size p X p t = [x t, x t 1,..., x t p+1 ] T R p (CRISTaL) Streaming anomaly detection June 20, 2017 6 / 22

Multiscale Lagmatrix Xtp = [xt, xt 1,..., xt p+1 ]T Rp (CRISTaL) Streaming anomaly detection June 20, 2017 7 / 22

Principal Subspace Tracking Streaming PCA : Dimensionality reduction for time series lag embedding Recursive update for principal subspace Linear Principal Component Analysis criterion : ] J(w t ) = E [ X t w t w Tt X t 2 w t R p r At the global minimum for w t shall contain the r dominant eigen-vectors. Online principal subspace tracking of the lagmatrix to track correlations : SPIRIT algorithm [11] Given X p R T p, w p is defined as the 1-D projection capturing most of the energy of the data samples : w p = arg min w =1 T t=1 X p t (w p w T p )X p t 2 (CRISTaL) Streaming anomaly detection June 20, 2017 8 / 22

Streaming PCA Algorithm 1 Streaming PCA Initialization: w j 0, σ 2 j ɛ with ɛ 1 for t = 1,..., T do for j = 1,..., J do Z j t H T 2 j X j t y j t w T j Z j t σ 2 j σ 2 j + (y j t ) 2 e j t Zt j yt j w j w j w j + σ 2 j yt j e j t πt j wj T Zt j Z t j πtw j j αt j Z t j Zt j 2 end for end for return α R T J Given x(t) R for t = 1 : T We evaluate the lag-matrix X R T p where p = 2 j. For each vector X t R p we perform a change of basis Z t := Φ T X t We require a unitary transform to localize a deviation from the local mean and variations. Preserve the variance. Haar transform [ Φ = H :] H 2N = 1 HN [1, 1] 2 I N [1, 1] (CRISTaL) Streaming anomaly detection June 20, 2017 9 / 22

Streaming PCA point cloud (2d embedding only) (CRISTaL) Streaming anomaly detection June 20, 2017 10 / 22

Error Spectrogram Reconstruction error of the lag-matrix calculated in logarithmic scales : (CRISTaL) Streaming anomaly detection June 20, 2017 11 / 22

Hierarchical Approximation for multi-scale windows [12] When passing from one scale p j to the next p j+1 = 2p j, instead of rebuilding a lag matrix Xt j+1 whose size doubles, it builds a reduced lag matrix Zt j+1 by considering the projection of each component of size p j on the principal direction obtained at this scale, i.e. = [wj T Zt j, wj T Z j ] T with t 2 j Zt 1 = Xt 1. Z j+1 t The principal direction at scale p j+1 is then obtained by applying the streaming PCA algorithm on this reduced representation. w T j+1 w T j w T j w T j 1 wt j 1 wt j 1 wt j 1 2 j 1 2 j 1 2 j 1 2 j 1 Future Figure : Hierarchical PCA. (CRISTaL) Streaming anomaly detection June 20, 2017 12 / 22

Aggregating Multi-scale Anomaly score At time t, we denote by X t p the projection of Xt p upon w p (at this time p step), i.e. X t = wp T Xt p. We obtain α t R T J, we propose the following ways to aggregated the J scales : 1 α t 2 : Norm of multiscale anomaly score 2 α t α t 2 : Streaming reconstruction error on anomaly score, obtained via a 2nd iteration of the streaming PCA algorithm on the multiscale anomaly score instead of the lag-matrix. where j = arg min j corresponding to the scale which is least correlated with others. 3 α j t Performance Evaluation : i (αt α) ji : the anomaly score Area under the receiver operators characteristics curve (AUC) integrating the curve of the False positive rate(fpr) vs the True positive rate (TPR) obtained for all possible thresholds. 0 (worst value) and 1 (perfect detector) (CRISTaL) Streaming anomaly detection June 20, 2017 13 / 22

Global-flow Input x t Lagmatrix X 2j=1 t Haar Tx. H 1 Error α j=1 t Norm α t 2 Lagmatrix X 2j=2 t Haar Tx. H 2 Error α j=2 t 2nd iteration α t α t 2.... Lagmatrix X 2j=J t Haar Tx. H J Error α j=j t Least correlated scale j = arg min j i (αt α) ji Representation Φ T X t : Localize the anomaly in a basis Multiscale Anomaly Score : Compose anomaly scores (CRISTaL) Streaming anomaly detection June 20, 2017 14 / 22

Results Multi-scale score-norm α t 2 (PC=1) Method / AUCs Bench 1 Bench 2 Bench 3 Bench 4 NAB fixed-scale 0.828 ± 0.240 0.835 ± 0.180 0.614 ± 0.108 0.568 ± 0.160 0.815 ± 0.238 fixed-scale-haar 0.826 ± 0.238 0.878 ± 0.143 0.617 ± 0.115 0.576 ± 0.157 0.812 ± 0.232 multiscale-lagmatrix 0.884 ± 0.232 0.978 ± 0.057 0.816 ± 0.092 0.696 ± 0.157 0.879 ± 0.199 hierarchical-approx 0.871 ± 0.236 0.997 ± 0.002 0.980 ± 0.025 0.897 ± 0.104 0.900 ± 0.189 multiscale-haar 0.906 ± 0.231 0.989 ± 0.019 0.992 ± 0.019 0.892 ± 0.126 0.892 ± 0.198 PCA on multi-scale score α t α t 2 (PC=1) Method / AUCs Bench 1 Bench 2 Bench 3 Bench 4 NAB fixed-scale 0.632 ± 0.264 0.754 ± 0.206 0.533 ± 0.124 0.525 ± 0.133 0.700 ± 0.247 fixed-scale-haar 0.649 ± 0.251 0.723 ± 0.194 0.514 ± 0.110 0.522 ± 0.129 0.699 ± 0.244 multiscale-lagmatrix 0.895 ± 0.218 0.997 ± 0.006 0.993 ± 0.017 0.959 ± 0.063 0.891 ± 0.194 hierarchical-approx 0.859 ± 0.233 0.997 ± 0.002 0.961 ± 0.071 0.895 ± 0.108 0.884 ± 0.204 multiscale-haar 0.888 ± 0.219 0.988 ± 0.031 0.956 ± 0.059 0.898 ± 0.106 0.886 ± 0.178 Least correlated scale α j t where j = arg min i (αt α) ji (PC=1) j Method / AUCs Bench 1 Bench 2 Bench 3 Bench 4 NAB fixed-scale 0.828 ± 0.240 0.835 ± 0.180 0.614 ± 0.108 0.568 ± 0.160 0.815 ± 0.238 fixed-scale-haar 0.826 ± 0.238 0.878 ± 0.143 0.617 ± 0.115 0.576 ± 0.157 0.812 ± 0.232 multiscale-lagmatrix 0.816 ± 0.238 0.773 ± 0.236 0.993 ± 0.017 0.964 ± 0.055 0.885 ± 0.196 hierarchical-approx 0.816 ± 0.238 0.773 ± 0.236 0.993 ± 0.017 0.964 ± 0.055 0.885 ± 0.196 multiscale-haar 0.832 ± 0.238 0.997 ± 0.007 0.799 ± 0.120 0.817 ± 0.123 0.886 ± 0.183 (CRISTaL) Streaming anomaly detection June 20, 2017 15 / 22

Effect of the Iterated Streaming PCA The error here should decorrelate the scores at different scales. Plotting Mean Recon. Error (Approximation) Vs. AUC (Detection) α t 2 Vs. α t α t 2 : (CRISTaL) Streaming anomaly detection June 20, 2017 16 / 22

Effect of the Iterated Streaming PCA The error here should decorrelate the scores at different scales. Plotting Mean Recon. Error (Approximation) Vs. AUC (Detection) α t 2 Vs. α t α t 2 : (CRISTaL) Streaming anomaly detection June 20, 2017 16 / 22

Effect of the Iterated Streaming PCA The error here should decorrelate the scores at different scales. Plotting Mean Recon. Error (Approximation) Vs. AUC (Detection) α t 2 Vs. α t α t 2 : (CRISTaL) Streaming anomaly detection June 20, 2017 16 / 22

Effect of the Iterated Streaming PCA The error here should decorrelate the scores at different scales. Plotting Mean Recon. Error (Approximation) Vs. AUC (Detection) α t 2 Vs. α t α t 2 : (CRISTaL) Streaming anomaly detection June 20, 2017 16 / 22

Effect of the Iterated Streaming PCA The error here should decorrelate the scores at different scales. Plotting Mean Recon. Error (Approximation) Vs. AUC (Detection) α t 2 Vs. α t α t 2 : (CRISTaL) Streaming anomaly detection June 20, 2017 16 / 22

Effect of Iterated Streaming PCA (CRISTaL) Streaming anomaly detection June 20, 2017 17 / 22

Failure Cases When the errors of reconstruction across scales remain correlated : Figure : Scale correlation. A larger scale of lag-window provides a least correlated scale. (CRISTaL) Streaming anomaly detection June 20, 2017 18 / 22

Failure Cases Figure : Near zero AUC score. (CRISTaL) Streaming anomaly detection June 20, 2017 19 / 22

Future work Improvements on current model Understand the bounds on the reconstruction error α(t) for Streaming PCA. Better base-line with the multivariate zscore by calculating covariance matrix online. Add anomaly-score likelihood to filter the anomaly score by using a moving window gaussian neg-log score. Use a streaming recursively calculable multi-scale time series representation Φ T X t : This should make use of coefficients that are calculated in the past. For now the Haar transformation HX t operates on a single vector. [4] (CRISTaL) Streaming anomaly detection June 20, 2017 20 / 22

Future work Other Tasks Anomolous time series ranking [8] Online Change-point evaluation [8] Other applications : Unsupervised unsual action recognition in videos Change detection in areal/remote sensing data : hyperspectral video. (CRISTaL) Streaming anomaly detection June 20, 2017 21 / 22

The End. (CRISTaL) Streaming anomaly detection June 20, 2017 22 / 22

Hierarchical PCA Initialization: w j 0, σj 2 ɛ with ɛ 1 for t = 1,..., T do for j = 2,..., J do if j = 1 then Zt j Xt j else Zt j [πt j 1, (Xt j ) T ] end if yt j wj T Zt j σj 2 σj 2 + (yt j ) 2 e j t Zt j yt j w j w j w j + σ 2 j yt j e j t πt j wj T Zt j Z t j πtw j j αt j Z t j Zt j 2 end for end for return α R T J (CRISTaL) Streaming anomaly detection June 20, 2017 23 / 22

Results PC=2 Multi-scale score-norm α t 2 (PC=2) fixed-scale 0.783 ± 0.269 0.918 ± 0.065 0.616 ± 0.142 0.569 ± 0.154 0.815 ± 0.231 fixed-scale-haar 0.808 ± 0.259 0.925 ± 0.074 0.627 ± 0.146 0.586 ± 0.144 0.811 ± 0.232 multiscale-lagmatrix 0.850 ± 0.242 0.969 ± 0.031 0.803 ± 0.116 0.686 ± 0.163 0.862 ± 0.210 hierarchical-approx 0.848 ± 0.240 0.985 ± 0.056 0.982 ± 0.021 0.941 ± 0.079 0.876 ± 0.213 multiscale-haar 0.862 ± 0.245 0.976 ± 0.021 0.805 ± 0.150 0.710 ± 0.166 0.873 ± 0.195 PCA on multi-scale score α t α t 2 (PC=2) fixed-scale 0.778 ± 0.270 0.908 ± 0.091 0.609 ± 0.133 0.573 ± 0.154 0.813 ± 0.232 fixed-scale-haar 0.804 ± 0.261 0.922 ± 0.079 0.625 ± 0.148 0.584 ± 0.143 0.811 ± 0.232 multiscale-lagmatrix 0.828 ± 0.237 0.872 ± 0.134 0.834 ± 0.172 0.793 ± 0.181 0.829 ± 0.207 hierarchical-approx 0.831 ± 0.248 0.978 ± 0.084 0.976 ± 0.031 0.935 ± 0.084 0.841 ± 0.231 multiscale-haar 0.816 ± 0.239 0.933 ± 0.088 0.859 ± 0.161 0.799 ± 0.171 0.807 ± 0.226 Least correlated scale α j t where j = arg min i (αt α) ji (PC=2) j fixed-scale 0.783 ± 0.269 0.918 ± 0.065 0.616 ± 0.142 0.569 ± 0.154 0.815 ± 0.231 fixed-scale-haar 0.808 ± 0.259 0.925 ± 0.074 0.627 ± 0.146 0.586 ± 0.144 0.811 ± 0.232 multiscale-lagmatrix 0.685 ± 0.332 0.757 ± 0.225 0.555 ± 0.140 0.597 ± 0.168 0.736 ± 0.327 hierarchical-approx 0.689 ± 0.333 0.757 ± 0.225 0.555 ± 0.140 0.596 ± 0.167 0.736 ± 0.327 multiscale-haar 0.739 ± 0.318 0.765 ± 0.241 0.533 ± 0.200 0.512 ± 0.200 0.736 ± 0.336 (CRISTaL) Streaming anomaly detection June 20, 2017 24 / 22

References I [1] V. Alarcon-Aquino and J. A. Barria. Anomaly detection in communication networks using wavelets. IEE Proceedings - Communications, 148(6):355 362, dec 2001. [2] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander. Lof: Identifying density-based local outliers. SIGMOD Rec., 29(2):93 104, may 2000. [3] X.-y. Chen and Y.-y. Zhan. Multi-scale anomaly detection algorithm based on infrequent pattern of time series. J. Comput. Appl. Math., 214(1):227 237, Apr. 2008. [4] G. Cormode, M. Garofalakis, and D. Sacharidis. Fast approximate wavelet tracking on streams. In International Conference on Extending Database Technology, pages 4 22. Springer, 2006. [5] P. H. dos Santos Teixeira and R. L. Milidiú. Data stream anomaly detection through principal subspace tracking. In Proceedings of the 2010 ACM Symposium on Applied Computing, SAC 10, pages 1609 1616, New York, NY, USA, 2010. ACM. [6] R. Ganesan, T. K. Das, and V. Venkataraman. Wavelet-based multiscale statistical process monitoring: A literature review. IIE transactions, 36(9):787 806, 2004. [7] C. T. Huang, S. Thareja, and Y. J. Shin. Wavelet-based Real Time Detection of Network Traffic Anomalies. In Securecomm and Workshops, 2006, pages 1 7, aug 2006. [8] N. Laptev, S. Amizadeh, and I. Flint. Generic and scalable framework for automated time-series anomaly detection. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 15, pages 1939 1947. ACM, 2015. [9] A. Lavin and S. Ahmad. Evaluating real-time anomaly detection algorithms - the numenta anomaly benchmark. In 14th IEEE International Conference on Machine Learning and Applications, ICMLA 2015, Miami, FL, USA, December 9-11, 2015, pages 38 44. IEEE, 2015. [10] W. Lu, M. Tavallaee, and A. A. Ghorbani. Detecting Network Anomalies Using Different Wavelet Basis Functions. In Communication Networks and Services Research Conference, 2008. CNSR 2008. 6th Annual, pages 149 156, may 2008. [11] S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming pattern discovery in multiple time-series. In Proceedings of the 31st International Conference on Very Large Data Bases, VLDB 05, pages 697 708. VLDB Endowment, 2005. (CRISTaL) Streaming anomaly detection June 20, 2017 25 / 22

References II [12] S. Papadimitriou and P. Yu. Optimal multi-scale patterns in time series streams. In Proceedings of the 2006 ACM SIGMOD international conference on Management of data, pages 647 658. ACM, 2006. [13] S. Pukkawanna and K. Fukuda. Combining sketch and wavelet models for anomaly detection. In 2010 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), pages 313 319, aug 2010. (CRISTaL) Streaming anomaly detection June 20, 2017 26 / 22

Previous work Standard offline multiscale anomaly detection using wavelet transform [1], [10], [13], [7]. Wavelet methods introduce a time delay in the computation of the coefficients at non-dyadic locations which worsens geometrically for coarser scales. Furthermore, they suffer from non-causality, i.e. they need to see some part of the future to assess the presence of an anomaly at present time [6]. [8] proposed several linear predictive models (Autoregressive, Kalman filter) followed by an anomaly score filtering (by kσ rule, or local outlier factor scores introduced by [2]) to detect anomalies. (CRISTaL) Streaming anomaly detection June 20, 2017 27 / 22