Time Series Outlier Detection
Tingyi Zhu
July 28, 2016
Outline
- Time Series Basics
- Outlier Detection in Single Time Series
- Outlier Series Detection from Multiple Time Series
- Demos
Time Series Basics
First-order Autoregression
A model denoted AR(1), in which the value of X at time t is a linear function of the value of X at time t-1:
$$X_t = \phi X_{t-1} + \varepsilon_t \qquad (1)$$
Assumptions:
- $\varepsilon_t \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$ is the stochastic term.
- $\varepsilon_t$ is independent of the past values $X_{t-1}, X_{t-2}, \dots$
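A minimal simulation sketch of model (1), assuming Python with NumPy (neither appears in the slides). The function name simulate_ar1, the starting value X_0 = 0 and the parameter values are illustrative assumptions. Regressing X_t on X_{t-1} without an intercept should roughly recover φ.

```python
import numpy as np

def simulate_ar1(phi, sigma, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t, eps_t ~ N(0, sigma^2), starting from X_0 = 0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

x = simulate_ar1(phi=0.6, sigma=1.0, n=5000)

# Least-squares estimate of phi: regress X_t on X_{t-1} (no intercept).
phi_hat = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
print(round(phi_hat, 3))
```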
General Autoregressive Model
AR(p):
$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \varepsilon_t = \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t = \sum_{i=1}^{p} \phi_i B^i X_t + \varepsilon_t,$$
where we use the backshift operator $B$ ($B X_t = X_{t-1}$, $B^k X_t = X_{t-k}$).
Alternative notation: $\phi(B) X_t = \varepsilon_t$, where $\phi(B)$ is a polynomial in $B$:
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p = 1 - \sum_{i=1}^{p} \phi_i B^i.$$
Moving Average
Another approach for modeling univariate time series: $X_t$ depends linearly on its own current and previous stochastic terms.
MA(1): $X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$
MA(q): $X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$
- $\theta_1, \dots, \theta_q$: parameters of the MA model
- $\varepsilon_t, \dots, \varepsilon_{t-q}$: stochastic terms
Using the backshift operator $B$, the model simplifies to
$$X_t = (1 + \theta_1 B + \cdots + \theta_q B^q)\,\varepsilon_t = \Big(1 + \sum_{i=1}^{q} \theta_i B^i\Big)\varepsilon_t = \theta(B)\,\varepsilon_t.$$
ARMA Model
A model consisting of both an autoregressive (AR) part and a moving average (MA) part,
$$X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}, \qquad (2)$$
is referred to as the ARMA(p,q) model.
- p: the order of the autoregressive part
- q: the order of the moving average part
More concisely, using the backshift operator $B$, (2) becomes
$$\phi(B) X_t = \theta(B)\,\varepsilon_t.$$
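Recursion (2) can be sketched directly in code. This is an illustrative Python/NumPy version, not part of the slides; the name arma_filter and the choice to treat all pre-sample values as 0 are assumptions. Feeding in a unit impulse shows how a single shock propagates through an ARMA(1,1) model.

```python
import numpy as np

def arma_filter(eps, phi, theta):
    """Apply X_t = sum_i phi_i X_{t-i} + eps_t + sum_i theta_i eps_{t-i}.

    phi = [phi_1, ..., phi_p], theta = [theta_1, ..., theta_q].
    Values before the start of the sample are treated as 0.
    """
    eps = np.asarray(eps, dtype=float)
    x = np.zeros(len(eps))
    for t in range(len(eps)):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[i] * eps[t - 1 - i] for i in range(len(theta)) if t - 1 - i >= 0)
        x[t] = ar + eps[t] + ma
    return x

# Response of an ARMA(1,1) model to a single unit shock at t = 0.
impulse = np.zeros(5)
impulse[0] = 1.0
psi = arma_filter(impulse, phi=[0.5], theta=[0.4])
print(psi)
```

With φ_1 = 0.5 and θ_1 = 0.4 the response starts at 1, jumps to φ_1 + θ_1 = 0.9, and then decays by a factor of φ_1 at each step.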
Stationarity of Time Series
In short, a time series is stationary if its statistical properties are all constant over time. To mention some properties:
- Mean: $E[X_t] = E[X_s]$ for any $t, s \in \mathbb{Z}$,
- Variance: $\mathrm{Var}[X_t] = \mathrm{Var}[X_s]$ for any $t, s \in \mathbb{Z}$,
- Joint distribution: $\mathrm{Cov}(X_t, X_{t+1}) = \mathrm{Cov}(X_s, X_{s+1})$ for any $t, s \in \mathbb{Z}$.
Requirements for a Stationary Time Series
- AR(1), $X_t = \phi X_{t-1} + \varepsilon_t$: $|\phi| < 1$.
- AR(p), $\phi(B) X_t = \varepsilon_t$: all the roots of $\phi(z) = 0$ are outside the unit circle.
- MA models are always stationary.
- ARMA(p,q), $\phi(B) X_t = \theta(B)\,\varepsilon_t$: all the roots of $\phi(z) = 0$ are outside the unit circle.
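The root condition can be checked numerically. A small sketch, assuming Python with NumPy; the helper name ar_is_stationary is hypothetical. It builds φ(z) = 1 - φ_1 z - ... - φ_p z^p and tests whether every root has modulus greater than 1.

```python
import numpy as np

def ar_is_stationary(phi):
    """phi = [phi_1, ..., phi_p] for phi(z) = 1 - phi_1 z - ... - phi_p z^p."""
    # numpy.roots expects coefficients ordered from the highest power down to the constant.
    coeffs = np.concatenate((-np.asarray(phi, dtype=float)[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.5]))        # AR(1) with |phi| < 1
print(ar_is_stationary([1.2]))        # AR(1) with |phi| > 1
print(ar_is_stationary([0.5, 0.3]))   # AR(2), both roots outside the unit circle
print(ar_is_stationary([0.5, 0.6]))   # AR(2), one root inside the unit circle
```

For AR(1) the single root is z = 1/φ, so the check reduces to |φ| < 1, matching the first requirement.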
Non-stationary time series
- Trend effect
- Seasonal effect
[Figure: AirPassengers. Monthly totals of international airline passengers, 1949 to 1960.]
Time Series Decomposition
Think of a more general time series formulation including both trend and seasonal effects:
$$X_t = T_t + S_t + E_t \qquad (3)$$
- $X_t$ is the data point at time t
- $T_t$ is the trend component at time t
- $S_t$ is the seasonal component at time t
- $E_t$ is the remainder component at time t (containing AR and MA terms)
Series with Trend, examples:
Assuming no seasonal effect, i.e. $S_t = 0$:
- Linear trend: $X_t = 2t + 0.5 X_{t-1} + \varepsilon_t$
- Quadratic trend: $X_t = 2t + t^2 + 0.5 X_{t-1} + \varepsilon_t$
Goal: remove the trend, to transform the series to be stationary
Solution: lag-1 differencing
Differencing and Trend
Define the lag-1 difference operator $\nabla$,
$$\nabla X_t = X_t - X_{t-1} = (1 - B) X_t,$$
where $B$ is the backshift operator.
If $X_t = \beta_0 + \beta_1 t + E_t$, then
$$\nabla X_t = \beta_1 + \nabla E_t.$$
If $X_t = \sum_{i=0}^{k} \beta_i t^i + E_t$, then
$$\nabla^k X_t = (1 - B)^k X_t = k!\,\beta_k + \nabla^k E_t.$$
We call $\nabla^k$ the $k$th-order lag-1 difference operator.
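A quick numerical check of the k!β_k result, assuming Python with NumPy and illustrative coefficient values. The remainder E_t is left out so that only the effect on the trend is visible; numpy.diff with n=k applies (1 - B) k times.

```python
import numpy as np
from math import factorial

t = np.arange(20, dtype=float)
beta = [3.0, 2.0, 1.0]                 # beta_0, beta_1, beta_2 (illustrative values)
trend = beta[0] + beta[1] * t + beta[2] * t**2

k = 2
diffed = np.diff(trend, n=k)           # (1 - B)^k applied to the quadratic trend

# Every entry equals k! * beta_k, so the polynomial trend has been removed.
print(diffed[:5], factorial(k) * beta[k])
```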
Lag-1 Differencing
[Figure: two panels, "S&P 500 Quote Year To Date" and "S&P 500 YTD Lag 1 Differencing", January to July 2016.]
Series with Seasonal Effect, example:
For quarterly data, with possible seasonal (quarterly) effects, we can define indicator functions $S_j$. For $j = 1, 2, 3, 4$,
$$S_j = \begin{cases} 1 & \text{if the observation is in quarter } j \text{ of a year}, \\ 0 & \text{otherwise}. \end{cases}$$
A model with seasonal effects could be written as
$$X_t = \alpha_1 S_1 + \alpha_2 S_2 + \alpha_3 S_3 + \alpha_4 S_4 + \varepsilon_t.$$
Goal: remove the seasonal effects
Solution: lag-s differencing, where s is the number of seasons
Differencing and Seasonal Effects
Define the lag-s difference operator $\nabla_s$,
$$\nabla_s X_t = X_t - X_{t-s} = (1 - B^s) X_t,$$
where $B$ is the backshift operator.
If $X_t = T_t + S_t + E_t$, and $S_t$ has period $s$ (i.e. $S_t = S_{t-s}$ for all $t$), then
$$\nabla_s X_t = (1 - B^s) X_t = T_t - T_{t-s} + \nabla_s E_t.$$
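The same kind of check works for the seasonal operator. A sketch assuming Python with NumPy; the helper seasonal_diff and the quarterly values are made up for illustration, and E_t is again omitted. With a linear trend T_t = 0.5t and a period-4 seasonal pattern, lag-4 differencing should leave only T_t - T_{t-4} = 2.

```python
import numpy as np

def seasonal_diff(x, s):
    """Lag-s difference: X_t - X_{t-s}, defined for t >= s."""
    x = np.asarray(x, dtype=float)
    return x[s:] - x[:-s]

s = 4
alpha = np.array([10.0, -3.0, 5.0, 1.0])   # illustrative quarterly effects
t = np.arange(24)
seasonal = alpha[t % s]                     # S_t with period s
trend = 0.5 * t                             # T_t
x = trend + seasonal

d = seasonal_diff(x, s)
print(d[:6])                                # the seasonal pattern cancels exactly
```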
Non-seasonal ARIMA ($S_t = 0$)
ARIMA stands for Auto-Regressive Integrated Moving Average: ARMA integrated with differencing.
A nonseasonal ARIMA model is classified as ARIMA(p,d,q), where
- p is the order of AR terms,
- d is the number of nonseasonal differences needed for stationarity,
- q is the order of MA terms.
Non-seasonal ARIMA, Cont.
Recall ARMA(p,q):
$$\phi(B) X_t = \theta(B)\,\varepsilon_t,$$
where $\phi(B)$ and $\theta(B)$ are polynomials in $B$ of order p and q.
Stationarity requirement: all roots of $\phi(z) = 0$ are outside the unit circle.
ARIMA(p,d,q):
$$\phi(B)(1 - B)^d X_t = \theta(B)\,\varepsilon_t.$$
- $X_t$ is not stationary for $d \ge 1$. Why? The combined autoregressive polynomial $\phi(z)(1 - z)^d$ has a root at $z = 1$, which lies on the unit circle.
- $Z_t = (1 - B)^d X_t$ is ARMA(p,q), and is stationary.
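The relation between X_t and Z_t = (1 - B)^d X_t can be illustrated with a round trip, assuming Python with NumPy. The numbers in z are an arbitrary stand-in for a stationary series: integrating d times with cumulative sums and then differencing d times recovers z, apart from the first d values.

```python
import numpy as np

z = np.array([0.3, -1.2, 0.8, 0.1, -0.5, 1.4, 0.0, -0.7])  # stand-in for Z_t
d = 2

x = z.copy()
for _ in range(d):
    x = np.cumsum(x)            # each cumulative sum inverts one factor (1 - B)

recovered = np.diff(x, n=d)     # (1 - B)^d X_t, available from index d onward
print(recovered)
```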
Seasonal ARIMA
A seasonal ARIMA model is classified as ARIMA$(p, d, q)(P, D, Q)_m$, where
- p is the order of AR terms,
- d is the number of nonseasonal differences,
- q is the order of MA terms,
- P is the order of seasonal AR terms,
- D is the number of seasonal differences,
- Q is the order of seasonal MA terms,
- m is the number of seasons.
Example: ARIMA$(1, 1, 1)(1, 1, 1)_4$
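One way to write this example out, assuming the usual multiplicative seasonal form and the sign conventions used earlier for φ(B) and θ(B). The seasonal coefficients Φ_1 and Θ_1 are notation introduced here for illustration and do not appear elsewhere in the slides.

```latex
\[
(1 - \phi_1 B)\,(1 - \Phi_1 B^{4})\,(1 - B)\,(1 - B^{4})\,X_t
  = (1 + \theta_1 B)\,(1 + \Theta_1 B^{4})\,\varepsilon_t
\]
```

Reading the left side from left to right: the nonseasonal AR(1) factor, the seasonal AR(1) factor at lag 4, one nonseasonal difference (d = 1) and one seasonal difference (D = 1, m = 4). The right side holds the nonseasonal and seasonal MA(1) factors.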
General ARIMA
The ARIMA model can be generalized as follows:
$$\phi(B)\,\alpha(B)\,X_t = \theta(B)\,\varepsilon_t,$$
- $\phi(B)$: autoregressive polynomial, all roots outside the unit circle
- $\alpha(B)$: differencing filter that renders the data stationary, all roots on the unit circle
- $\theta(B)$: moving average polynomial, all roots outside the unit circle (to ensure $\theta(B)$ is invertible)
Alternatively,
$$X_t = \frac{\theta(B)}{\phi(B)\,\alpha(B)}\,\varepsilon_t.$$
Outlier Detection in Single Time Series
Automatic Detection Procedure
- Described in Chung Chen and Lon-Mu Liu, "Joint Estimation of Model Parameters and Outlier Effects in Time Series", JASA, 1993
- Based on the framework of ARIMA models
- R package tsoutliers (Javier López-de-Lacalle, 2014)
Types of Outliers
General representation: $L(B)\,I_t(t_j)$
- $L(B)$: a polynomial of the lag operator $B$
- $I_t(t_j) = 1$ if there is an outlier at time $t = t_j$, and 0 otherwise.
Types of outliers:
- Additive Outliers (AO): $L(B) = 1$;
- Level Shift (LS): $L(B) = \dfrac{1}{1 - B}$;
- Temporary Change (TC): $L(B) = \dfrac{1}{1 - \delta B}$;
- Seasonal Level Shift (SLS): $L(B) = \dfrac{1}{1 - B^s}$;
- Innovational Outliers (IO): $L(B) = \dfrac{\theta(B)}{\phi(B)\,\alpha(B)}$.
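The shapes of the first three patterns can be generated directly from their L(B). A sketch assuming Python with NumPy; the function name outlier_effect and the default δ = 0.7 are illustrative assumptions. An AO is a single pulse, an LS is a permanent step, and a TC is a pulse that decays geometrically at rate δ.

```python
import numpy as np

def outlier_effect(kind, n, t_j, delta=0.7):
    """Return L(B) I_t(t_j) for t = 0, ..., n-1 and a unit-size outlier at t_j."""
    pulse = np.zeros(n)
    pulse[t_j] = 1.0
    if kind == "AO":                      # L(B) = 1: only time t_j is affected
        return pulse
    if kind == "LS":                      # L(B) = 1 / (1 - B): running sum of the pulse
        return np.cumsum(pulse)
    if kind == "TC":                      # L(B) = 1 / (1 - delta B): geometric decay
        effect = np.zeros(n)
        effect[t_j:] = delta ** np.arange(n - t_j)
        return effect
    raise ValueError("kind must be 'AO', 'LS' or 'TC'")

for kind in ("AO", "LS", "TC"):
    print(kind, outlier_effect(kind, n=6, t_j=2, delta=0.5))
```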
Formulation
ARIMA model:
$$X_t = \frac{\theta(B)}{\phi(B)\,\alpha(B)}\,\varepsilon_t.$$
Model with outliers at times $t_1, t_2, \dots, t_m$:
$$X_t = \sum_{j=1}^{m} \omega_j L_j(B)\,I_t(t_j) + \frac{\theta(B)}{\phi(B)\,\alpha(B)}\,\varepsilon_t.$$
- $L_j(B)$ depends on the pattern of the $j$th outlier
- $I_t(t_j) = 1$ if there is an outlier at time $t = t_j$, and 0 otherwise
- $\omega_j$ denotes the magnitude of the $j$th outlier effect
Effect of One Outlier
Assuming the time series parameters are known, we examine the effect of one outlier:
$$X_t = \omega L(B)\,I_t(t_1) + \frac{\theta(B)}{\phi(B)\,\alpha(B)}\,\varepsilon_t.$$
Define the polynomial $\pi(B)$ as
$$\pi(B) = \frac{\phi(B)\,\alpha(B)}{\theta(B)} = 1 - \pi_1 B - \pi_2 B^2 - \cdots.$$
Contaminated by the outlier, the estimated residual $\hat{e}_t$ becomes
$$\hat{e}_t = \pi(B) X_t.$$
(Without an outlier, $\pi(B) X_t = \varepsilon_t$.)
For the four types of outliers,
- IO: $\hat{e}_t = \omega I_t(t_1) + \varepsilon_t$,
- AO: $\hat{e}_t = \omega\,\pi(B)\,I_t(t_1) + \varepsilon_t$,
- LS: $\hat{e}_t = \omega\,\dfrac{\pi(B)}{1 - B}\,I_t(t_1) + \varepsilon_t$,
- TC: $\hat{e}_t = \omega\,\dfrac{\pi(B)}{1 - \delta B}\,I_t(t_1) + \varepsilon_t$.
Alternatively,
$$\hat{e}_t = \omega x_{i,t} + \varepsilon_t, \qquad t = t_1, t_1 + 1, \dots \quad \text{and} \quad i = 1, 2, 3, 4,$$
where
- $x_{i,t} = 0$ for all $i$ and $t < t_1$,
- $x_{i,t_1} = 1$ for all $i$,
- $x_{1,t_1+k} = 0$,
- $x_{2,t_1+k} = -\pi_k$,
- $x_{3,t_1+k} = 1 - \sum_{j=1}^{k} \pi_j$,
- $x_{4,t_1+k} = \delta^k - \sum_{j=1}^{k-1} \delta^{k-j} \pi_j - \pi_k$.
A simple linear regression!
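The four regressors can be computed by multiplying out the power series for each type. A sketch assuming Python with NumPy; the name outlier_regressors and the truncation to m terms are illustrative choices. Row order is IO, AO, LS, TC, and column k corresponds to time t_1 + k.

```python
import numpy as np

def outlier_regressors(pi, m, delta=0.7):
    """x_{i, t_1 + k} for k = 0..m-1, given pi = [pi_1, pi_2, ...]
    from pi(B) = 1 - pi_1 B - pi_2 B^2 - ...
    """
    pi_poly = np.zeros(m)
    pi_poly[0] = 1.0
    w = np.asarray(pi, dtype=float)[: m - 1]
    pi_poly[1 : 1 + len(w)] = -w                     # coefficients of pi(B), truncated

    io = np.zeros(m)
    io[0] = 1.0                                      # IO: a single pulse in the residuals
    ao = pi_poly.copy()                              # AO: pi(B)
    ls = np.cumsum(pi_poly)                          # LS: pi(B) / (1 - B)
    tc = np.convolve(pi_poly, delta ** np.arange(m))[:m]   # TC: pi(B) / (1 - delta B)
    return np.vstack([io, ao, ls, tc])

# AR(1) with phi = 0.5, so pi(B) = 1 - 0.5 B and pi_1 = 0.5.
print(outlier_regressors([0.5], m=4, delta=0.7))
```

For this AR(1) example the TC row is 1, δ - π_1 = 0.2, δ² - δπ_1 = 0.14, and so on, which agrees with the closed form for x_{4,t_1+k}.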
Estimate of ω
The least squares estimate of the effect of a single outlier at $t = t_1$ can be expressed as
$$\hat{\omega}_i(t_1) = \frac{\sum_{t=t_1}^{n} \hat{e}_t\, x_{i,t}}{\sum_{t=t_1}^{n} x_{i,t}^2}, \qquad i = 1, 2, 3, 4.$$
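Test Statistics τ
From regression analysis, we have
$$\frac{\hat{\omega} - \omega}{\hat{\sigma}_a} \Big( \sum_{t=t_1}^{n} x_{i,t}^2 \Big)^{1/2} \sim N(0, 1),$$
where $\hat{\sigma}_a$ is the estimate of the residual standard deviation.
We want to test whether $\omega = 0$. Under this hypothesis, the following statistics are approximately $N(0, 1)$:
$$\tau_i(t_1) = \frac{\hat{\omega}_i(t_1)}{\hat{\sigma}_a} \Big( \sum_{t=t_1}^{n} x_{i,t}^2 \Big)^{1/2}, \qquad i = 1, 2, 3, 4.$$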
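Because the model is a regression through the origin, both the estimate and its test statistic take only a few lines. A sketch assuming Python with NumPy; the function name omega_and_tau and the toy residuals are illustrative, and sigma_a is passed in rather than estimated.

```python
import numpy as np

def omega_and_tau(resid, x, t1, sigma_a):
    """Least squares estimate of omega and its standardized statistic tau.

    resid   : estimated residuals e_hat for the whole sample
    x       : regressor values x_{i,t} for t = t1, t1 + 1, ... (same length as resid[t1:])
    sigma_a : estimate of the residual standard deviation
    """
    e = np.asarray(resid, dtype=float)[t1:]
    x = np.asarray(x, dtype=float)
    sxx = np.dot(x, x)
    omega_hat = np.dot(e, x) / sxx
    tau = omega_hat * np.sqrt(sxx) / sigma_a
    return omega_hat, tau

# Toy residuals: a pulse of size 3 at t = 5 and nothing else (an IO-type pattern).
resid = np.zeros(10)
resid[5] = 3.0
x_io = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
print(omega_and_tau(resid, x_io, t1=5, sigma_a=1.5))
```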
Procedure in the Presence of Multiple Outliers
In the presence of multiple outliers, recall the model
$$X_t = \sum_{j=1}^{m} \omega_j L_j(B)\,I_t(t_j) + \frac{\theta(B)}{\phi(B)\,\alpha(B)}\,\varepsilon_t.$$
The estimated residual becomes
$$\hat{e}_t = \sum_{j=1}^{m} \omega_j\,\pi(B)\,L_j(B)\,I_t(t_j) + \varepsilon_t.$$
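A heavily simplified sketch of a one-at-a-time search, assuming Python with NumPy. It is not the full procedure from the paper: the π weights are taken as known and never re-estimated, only IO, AO, LS and TC are scanned, the residual scale is a median-absolute-deviation estimate chosen here for robustness, and the critical value crit = 3.5 and δ = 0.7 are illustrative. All function names are hypothetical.

```python
import numpy as np

def regressors(pi_poly, m, delta):
    """x_{i, t_1 + k}, k = 0..m-1, from the coefficients [1, -pi_1, -pi_2, ...] of pi(B)."""
    p = np.zeros(m)
    k = min(m, len(pi_poly))
    p[:k] = pi_poly[:k]
    io = np.zeros(m)
    io[0] = 1.0
    return {"IO": io, "AO": p, "LS": np.cumsum(p),
            "TC": np.convolve(p, delta ** np.arange(m))[:m]}

def detect_outliers(x, pi_poly, crit=3.5, delta=0.7, max_iter=10):
    x = np.asarray(x, dtype=float)
    n = len(x)
    e = np.convolve(x, pi_poly)[:n]        # e_hat = pi(B) x, pre-sample values taken as 0
    found = []
    for _ in range(max_iter):
        sigma = 1.483 * np.median(np.abs(e - np.median(e)))   # robust scale (assumption)
        best = None
        for t1 in range(n):
            for kind, xr in regressors(pi_poly, n - t1, delta).items():
                sxx = xr @ xr
                omega = (e[t1:] @ xr) / sxx
                tau = omega * np.sqrt(sxx) / sigma
                if best is None or abs(tau) > abs(best[3]):
                    best = (t1, kind, omega, tau)
        if abs(best[3]) <= crit:
            break                          # nothing significant is left
        t1, kind, omega, _ = best
        e[t1:] -= omega * regressors(pi_poly, n - t1, delta)[kind]   # remove its effect
        found.append((t1, kind, omega))
    return found

# AR(1) series with phi = 0.5 and one additive outlier of size 20 at t = 100.
rng = np.random.default_rng(1)
phi = 0.5
eps = rng.normal(size=200)
clean = np.zeros(200)
for t in range(1, 200):
    clean[t] = phi * clean[t - 1] + eps[t]
series = clean.copy()
series[100] += 20.0

found = detect_outliers(series, pi_poly=[1.0, -phi])
print(found[0])
```

The first detection should be the additive outlier at t = 100. Later iterations may flag further points whose |τ| happens to exceed the threshold, which is one reason a full procedure re-estimates the model parameters and re-checks the detected outliers.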
Stage 1: Initial Parameter Estimation and Outlier Detection
- Fit the series with an ARIMA model (forecast package in R) to obtain initial estimates of the model parameters ($\phi(B)$, $\theta(B)$, $\alpha(B)$).
- Detect outliers one by one, sequentially.
Stage 2: Joint Estimation of Outlier Effects and Model Parameters
Outlier Series Detection from Multiple Time Series
Detect Anomalous Series
Goal: efficiently find the least similar time series in a large set
Motivation: Internet companies monitoring their servers (CPU, memory) to find unusual behaviors
Detect Anomalous Series
- Described in Rob J. Hyndman et al., "Large-Scale Unusual Time Series Detection", ICDM, 2015
- Approach: extract features from the time series, then apply PCA
- R package anomalous
- Tested on real data from the Yahoo email server: 80% accuracy, compared to 40% from previous methods
Step 1: Extract Features from Time Series
15 features are selected; each captures global information about the time series
Step 2: PCA to reduce dimension
- dim = 15 initially, with correlation existing between features
- The first 2 PCs are sufficient, capturing most of the variance
Step 3: Implement a multi-dimensional outlier detection algorithm to find outlier series
- Density based
- α-hull
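The three steps can be sketched end to end on toy data, assuming Python with NumPy. This is not the anomalous package: only three stand-in features (mean, variance, lag-1 autocorrelation) replace the 15 features, and the final step uses plain distance from the median in PC space instead of the density-based or α-hull methods. The function names are hypothetical.

```python
import numpy as np

def features(x):
    """Three illustrative features: mean, variance, lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    acf1 = (xc[1:] @ xc[:-1]) / (xc @ xc)
    return np.array([x.mean(), x.var(), acf1])

def most_unusual(series_list):
    f = np.vstack([features(s) for s in series_list])      # Step 1: feature matrix
    z = (f - f.mean(axis=0)) / f.std(axis=0)                # standardize each feature
    _, _, vt = np.linalg.svd(z, full_matrices=False)        # Step 2: PCA via SVD
    pcs = z @ vt[:2].T                                      # scores on the first 2 PCs
    # Step 3 (swapped-in rule): distance from the median point in PC space.
    dist = np.linalg.norm(pcs - np.median(pcs, axis=0), axis=1)
    return int(np.argmax(dist)), pcs

# Thirty white-noise series plus one random walk inserted at index 7.
rng = np.random.default_rng(0)
series_list = [rng.normal(size=200) for _ in range(30)]
series_list.insert(7, np.cumsum(rng.normal(size=200)))

idx, pcs = most_unusual(series_list)
print(idx)
```

The random walk differs from the white-noise series in both variance and autocorrelation, so it should sit far from the rest in the first two principal components.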
Demo