Quickest Changepoint Detection: Optimality Properties of the Shiryaev-Roberts-Type Procedures


1 Quickest Changepoint Detection: Optimality Properties of the Shiryaev-Roberts-Type Procedures
Alexander Tartakovsky, Department of Statistics
Inference for Change-Point and Related Processes
Isaac Newton Institute for Mathematical Sciences, Cambridge, UK
January 17, 2014
Alexander Tartakovsky (UConn), Optimality Properties of the SR-type Procedures, January 17, 2014

7 Outline
A general changepoint detection scenario
A simple changepoint problem (iid case, known pre- and post-change distributions)
Recent contributions, including:
  SR procedure: strict optimality properties
  Novel SR-r procedure: minimax properties
Composite post-change hypothesis: nearly minimax properties of the SR mixtures
Applications to information systems: rapid detection of attacks/intrusions in computer networks
Acknowledgements

8 Outline
1 Changepoint Detection
2 Generalized Bayesian Problem
3 Detecting Changes in a Stationary Regime
4 Minimax Criteria
  Optimality of Page's CUSUM wrt Lorden's Criterion
  Optimality of the Shiryaev-Roberts-Pollak Procedure wrt Pollak's Criterion
  Novel SR-r Procedure and its Optimality
5 Changepoint Problems for Composite Hypotheses
6 Applications to Information Systems and Cyber Security
7 Acknowledgements

9 A General Changepoint Detection Scenario
[Figure: Typical general changepoint scenario. The observations $X_1, X_2, \dots, X_\nu$ follow the pre-change conditional densities $f(X_n \mid X_1, \dots, X_{n-1})$, and $X_{\nu+1}, X_{\nu+2}, \dots$ follow the post-change conditional densities $g(X_n \mid X_1, \dots, X_{n-1})$. Timeline labels: Surveillance Begins, Change-Point, Surveillance Continues.]
A change occurs at an unknown point in time $\nu \ge 0$. In the general non-iid case the joint distributions change, which can be described as a change from the conditional pre-change densities $f(X_i \mid X_1, \dots, X_{i-1})$ to the conditional post-change densities $g(X_i \mid X_1, \dots, X_{i-1})$ for $i = \nu+1, \nu+2, \dots$
The simplest iid case is where $f(X_i \mid X_1, \dots, X_{i-1}) = f(X_i)$ and $g(X_i \mid X_1, \dots, X_{i-1}) = g(X_i)$, where $f(x)$ is a common pre-change density and $g(x)$ is a common post-change density.
As long as the behavior of the observations is consistent with the normal state, one is content to let the process continue; if the state changes, one wants to detect the change as quickly as possible. The design of a detection procedure, which is a stopping time $T$ wrt the observed sequence $\{X_n\}_{n \ge 1}$, therefore reduces to optimizing the tradeoff between the delay to detection and the false alarm rate (FAR).
Numerous applications: anomaly detection, failure detection, surveillance, process control, intrusion detection in information systems, target detection, and finance, to name a few.

10 Illustration of Change Detection
[Figure: three panels versus time $n$. Top: observed data $X_n$, with the pre-change regime (each $X_n \sim f(x)$) from the start of surveillance up to the change-point $\nu$ and the post-change regime (each $X_n \sim g(x)$) afterwards. Middle: the detection statistic crosses the detection threshold $A$ at a time $T$ before the change-point $\nu$ (point of false alarm); the run length to false alarm $T$ is random. Bottom: the detection statistic crosses the detection threshold $A$ at the detection point $T > \nu$; the detection delay $T - \nu$ is random.]

11 Change Detection as a Hypothesis Testing Problem
Change detection is regarded as testing two hypotheses given the sample $X^n = (X_1, \dots, X_n)$:
$H_\nu$: change at $0 \le \nu < n$, and $H_\infty = H_\nu$ for $\nu \ge n$: no change.
Likelihood Ratio (LR) for these hypotheses (based on the data $X^n$):
\[ \Lambda_n^\nu := \frac{dP(X^n \mid H_\nu)}{dP(X^n \mid H_\infty)} = \frac{\prod_{j=1}^{\nu} f(X_j \mid X^{j-1}) \prod_{j=\nu+1}^{n} g(X_j \mid X^{j-1})}{\prod_{j=1}^{n} f(X_j \mid X^{j-1})} = \prod_{j=\nu+1}^{n} \frac{g(X_j \mid X^{j-1})}{f(X_j \mid X^{j-1})} \]
Special iid case:
\[ \Lambda_n^\nu = \frac{\prod_{j=1}^{\nu} f(X_j) \prod_{j=\nu+1}^{n} g(X_j)}{\prod_{j=1}^{n} f(X_j)} = \prod_{j=\nu+1}^{n} \frac{g(X_j)}{f(X_j)} \]
Since the changepoint $\nu$ is unknown, reasonable decision statistics may be built based either on the average LR or on the maximal LR (over $\nu$), which will be discussed below.
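
As a minimal sketch of the iid likelihood ratio on this slide, the following Python function evaluates $\Lambda_n^\nu = \prod_{j=\nu+1}^{n} g(X_j)/f(X_j)$ in the log domain. The Gaussian pair $f = N(0,1)$, $g = N(1,1)$ and the helper names are illustrative assumptions and do not come from the talk.

```python
import math

def log_lr_gauss(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """log g(x)/f(x) for f = N(mu0, sigma^2), g = N(mu1, sigma^2) (assumed model)."""
    return ((mu1 - mu0) * x - 0.5 * (mu1 ** 2 - mu0 ** 2)) / sigma ** 2

def likelihood_ratio(xs, nu, log_lr=log_lr_gauss):
    """Lambda_n^nu for the sample xs = (X_1, ..., X_n) and changepoint 0 <= nu < n.

    Only X_{nu+1}, ..., X_n contribute, i.e. xs[nu:] with 0-based indexing.
    """
    if not 0 <= nu < len(xs):
        raise ValueError("nu must satisfy 0 <= nu < n")
    return math.exp(sum(log_lr(x) for x in xs[nu:]))

# One LR value per candidate changepoint nu = 0, ..., n-1.
xs = [0.1, -0.3, 1.2, 0.9]
lr_all = [likelihood_ratio(xs, nu) for nu in range(len(xs))]
```

Averaging `lr_all` over `nu` or taking its maximum gives the two kinds of decision statistics mentioned on the slide.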

12 Optimality Criteria Summary
Four approaches: Bayesian (the changepoint is random with a known prior), Generalized Bayesian (improper uniform prior), Detection of a Change in a Stationary Regime (the change occurs at a far time horizon and a repeated multicyclic procedure is applied), and Minimax.
[Diagram: the Bayesian approach (random change point) leads, under a uniform improper prior, to the Generalized Bayesian approach, which has an equivalent re-interpretation as multi-cyclic detection of changes in a stationary regime; the Minimax approach treats an unknown non-random change point.]

14 Generalized Bayesian Setting
Assume an improper uniform prior on $\mathbb{Z}_+$: $P(\nu = k) = 1$, $k = 0, 1, \dots$
The risk associated with the detection delay can be measured by the Integral (relative) Average Delay to Detection IADD(T):
\[ \mathrm{IADD}(T) = \frac{\sum_{k=0}^{\infty} E_k (T-k)^+}{E_\infty T} = \frac{\sum_{k=0}^{\infty} E_k(T - k \mid T > k)\, P_\infty(T > k)}{E_\infty T} \]
The risk associated with false alarms is typically measured by the Mean Time to False Alarm, which is referred to as the Average Run Length to False Alarm (ARL2FA):
\[ \mathrm{ARL2FA}(T) = E_\infty[T] \]
Generalized Bayesian Optimality Criterion: Minimize the integral ADD, IADD(T), subject to the ARL to false alarm constraint $\mathrm{ARL2FA}(T) \ge \gamma$, $\gamma > 1$; i.e., find a procedure $T_o$ that attains $\inf_{T \in C_\gamma} \mathrm{IADD}(T)$ for every $\gamma > 1$, where $C_\gamma = \{T : E_\infty T \ge \gamma\}$ is the class of detection procedures whose ARL2FA is at least the given tolerable level $\gamma$.

15 The Shiryaev-Roberts (SR) Procedure and its Optimality
The SR statistic (average LR over the uniform prior):
\[ R_n = \sum_{k=1}^{n} \prod_{i=k}^{n} \frac{g(X_i)}{f(X_i)}, \]
or recursively
\[ R_n = (1 + R_{n-1}) L_n, \quad R_0 = 0, \quad L_i = \frac{g(X_i)}{f(X_i)}. \]
The SR procedure: raise an alarm at
\[ T_{SR}(A) = \inf\{n \ge 1 : R_n \ge A\}, \quad A > 0. \]
SR Strict Optimality [Pollak & Tartakovsky 09]: Let $A = A_\gamma$ be such that $\mathrm{ARL2FA}(T_{SR}(A_\gamma)) = \gamma$. Then the SR procedure $T_{SR}(A_\gamma)$ is optimal in the generalized Bayesian setting:
\[ \inf_{T \in C_\gamma} \mathrm{IADD}(T) = \mathrm{IADD}(T_{SR}(A_\gamma)) \quad \text{for every } \gamma > 1. \]
Proof: Either use Shiryaev's Bayes optimality result [Shiryaev 63] and notice that SR is the limiting case, or argue directly from the fact that
\[ \mathrm{IADD}(T) = \frac{E_\infty\big[\sum_{n=0}^{T-1} R_n\big]}{E_\infty T} + 1, \]
so that the problem reduces to Markov optimal stopping, since the SR statistic is Markov.
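
The recursion $R_n = (1 + R_{n-1}) L_n$, $R_0 = 0$, and the stopping rule $T_{SR}(A) = \inf\{n \ge 1 : R_n \ge A\}$ can be sketched as follows. This is an illustration only: the likelihood ratio $L(x) = e^{x - 1/2}$ (a change from $N(0,1)$ to $N(1,1)$), the data, and the function names are assumptions, not material from the talk.

```python
import math

def sr_stopping_time(xs, lr, threshold):
    """Run R_n = (1 + R_{n-1}) * L_n with R_0 = 0; return (T, path).

    T is the first n >= 1 with R_n >= threshold, or None if no alarm
    is raised within the supplied sample.
    """
    r = 0.0
    path = []
    for n, x in enumerate(xs, start=1):
        r = (1.0 + r) * lr(x)
        path.append(r)
        if r >= threshold:
            return n, path
    return None, path

def sr_direct(xs, lr):
    """Non-recursive form R_n = sum_{k=1}^n prod_{i=k}^n L_i (used as a check)."""
    return sum(math.prod(lr(x) for x in xs[k:]) for k in range(len(xs)))

# Assumed example: f = N(0,1), g = N(1,1), so L(x) = exp(x - 1/2).
lr = lambda x: math.exp(x - 0.5)
xs = [0.2, -0.1, 1.5, 2.0, 1.1]
T, path = sr_stopping_time(xs, lr, threshold=10.0)
```

The recursive form needs only the previous value of the statistic, which is the Markov property the proof sketch relies on; `sr_direct` recomputes the same quantity from the definition.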

17 Detection of Distant Changes with Repeated Multicyclic Procedures
[Figure: two panels versus time $n$. Top: observed data $X_n$, with the pre-change regime (each $X_n \sim f(x)$) from the start of surveillance up to the change-point $\nu$ and the post-change regime (each $X_n \sim g(x)$) afterwards. Bottom: the detection statistic and the detection threshold $A$; the procedure is restarted after every alarm, giving random run lengths to false alarm $T^{(1)}, T^{(2)}, \dots$, alarm times $T_{(1)} = T^{(1)}$, $T_{(2)} = T^{(1)} + T^{(2)}$, and the detection point $T_{(I_\nu)} = \sum_{j=1}^{I_\nu} T^{(j)}$; the detection delay $T_{(I_\nu)} - \nu$, where $T_{(I_\nu)} > \nu$, is random.]

18 Optimality of the Multicyclic SR for Detecting Distant Changes Appearing After Many Reruns
Low cost of false alarms: it is important to detect a change as quickly as possible, even at the price of raising many false alarms (using a repeated application of the same stopping rule) before the change occurs.
Multicyclic detection schemes with times between consecutive alarms $T^{(1)}, T^{(2)}, \dots$ There are $I_\nu - 1$ false detections before the change occurs and the $I_\nu$-th detection is a true detection, so that the time of the true detection is $T_{(I_\nu)} = T^{(1)} + \dots + T^{(I_\nu)}$.
Stationary ADD:
\[ \mathrm{STADD}(T) = \lim_{\nu \to \infty} E_\nu[T_{(I_\nu)} - \nu] = \lim_{\nu \to \infty} E_\nu[T^{(1)} + \dots + T^{(I_\nu)} - \nu] \]
SR Strict Optimality in the iid Case [Pollak & Tartakovsky 09]: If $A = A_\gamma$ is chosen so that $E_\infty[T_{SR}(A)] = \gamma$, then the multicyclic SR procedure minimizes STADD(T) for every $\gamma > 1$:
\[ \mathrm{STADD}(T_{SR}(A)) = \inf_{T \in C_\gamma} \mathrm{STADD}(T) \]
Proof: Using renewal theory, it can be shown that
\[ \mathrm{STADD}(T) = \frac{\sum_{\nu=0}^{\infty} E_\nu[(T - \nu)^+]}{E_\infty[T]} = \mathrm{IADD}(T). \]
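
A minimal simulation sketch of the multicyclic scheme: the SR rule is applied repeatedly, restarting from zero after every alarm, and the first alarm after the changepoint is taken as the true detection. The Gaussian model, the threshold, the changepoint, and the seed are all assumptions chosen for illustration.

```python
import math
import random

def multicyclic_sr(xs, lr, threshold):
    """Apply the SR rule repeatedly, restarting from R = 0 after every alarm.

    Returns the alarm times T_(1) < T_(2) < ... as 1-based sample indices.
    """
    alarms, r = [], 0.0
    for n, x in enumerate(xs, start=1):
        r = (1.0 + r) * lr(x)
        if r >= threshold:
            alarms.append(n)
            r = 0.0  # renewal: a new cycle starts after each alarm
    return alarms

def first_true_detection(alarms, nu):
    """First alarm strictly after the changepoint nu, or None if there is none."""
    return next((t for t in alarms if t > nu), None)

# Assumed model: N(0,1) before the change, N(1,1) after it, so L(x) = exp(x - 1/2).
rng = random.Random(0)
nu = 500
xs = [rng.gauss(0.0, 1.0) for _ in range(nu)]
xs += [rng.gauss(1.0, 1.0) for _ in range(200)]
lr = lambda x: math.exp(x - 0.5)

alarms = multicyclic_sr(xs, lr, threshold=50.0)
t_detect = first_true_detection(alarms, nu)
delay = None if t_detect is None else t_detect - nu
```

Averaging `delay` over many independent runs with a large `nu` would give a Monte Carlo estimate of the stationary delay; a single run only illustrates the mechanics.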

20 Lorden's Minimax Criterion and the CUSUM Procedure
Essential Supremum Average Detection Delay (ESADD) [Lorden 71]:
\[ \mathrm{ESADD}(T) = \sup_{0 \le \nu < \infty} \Big\{ \operatorname{ess\,sup} E_\nu\big[(T - \nu)^+ \mid X_1, \dots, X_\nu\big] \Big\} \]
Lorden's Minimax Criterion: Minimize ESADD(T) subject to the ARL to false alarm constraint, i.e., find a procedure $T_o$ that attains $\inf_{T \in C_\gamma} \mathrm{ESADD}(T)$ for every $\gamma > 1$ ($C_\gamma = \{T : \mathrm{ARL2FA}(T) \ge \gamma\}$).
CUSUM Maximum Likelihood Ratio Test [Page 54]: Maximize the LR over $\nu \in [0, n)$, i.e., compute the generalized LR statistic
\[ V_n = \max_{0 \le \nu < n} \Lambda_n^\nu = \max\{1, V_{n-1}\}\, L_n, \quad V_0 = 1, \quad L_n = g(X_n)/f(X_n), \]
and stop as soon as it exceeds a threshold $A > 0$:
\[ T_{CS}(A) = \inf\{n \ge 1 : V_n > A\}. \]
Exact Optimality of CUSUM in the iid Case [Moustakides 86]: Let $A = A_\gamma$ be selected so that $\mathrm{ARL2FA}(T_{CS}(A_\gamma)) = \gamma$. Then CUSUM is optimal wrt ESADD(T) in the class $C_\gamma = \{T : \mathrm{ARL2FA}(T) \ge \gamma\}$ for every $\gamma > 1$.
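
The CUSUM recursion $V_n = \max\{1, V_{n-1}\} L_n$, $V_0 = 1$, can be sketched in Python and checked against the brute-force maximum of $\Lambda_n^\nu$ over $0 \le \nu < n$. The likelihood ratio $L(x) = e^{x - 1/2}$, the data, and the helper names are illustrative assumptions.

```python
import math

def cusum_stopping_time(xs, lr, threshold):
    """V_n = max(1, V_{n-1}) * L_n with V_0 = 1; stop at the first n with V_n > threshold."""
    v = 1.0
    path = []
    for n, x in enumerate(xs, start=1):
        v = max(1.0, v) * lr(x)
        path.append(v)
        if v > threshold:
            return n, path
    return None, path

def max_lr(xs, lr):
    """Brute-force max over 0 <= nu < n of prod_{j=nu+1}^n L_j (used as a check)."""
    return max(math.prod(lr(x) for x in xs[nu:]) for nu in range(len(xs)))

# Assumed example: f = N(0,1), g = N(1,1), so L(x) = exp(x - 1/2).
lr = lambda x: math.exp(x - 0.5)
xs = [0.2, -0.1, 1.5, 2.0, 1.1]
T, path = cusum_stopping_time(xs, lr, threshold=10.0)
```

The `max(1.0, v)` step is what restarts the statistic whenever the accumulated evidence for a change falls below that of a change at the current sample.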

21 Pollak's Minimax Criterion and a Modified Shiryaev-Roberts Procedure
Minimize the Supremum Average Detection Delay (SADD) [Pollak 85]
\[ \mathrm{SADD}(T) = \sup_{\nu \ge 0} E_\nu[T - \nu \mid T > \nu] \quad \text{subject to } \mathrm{ARL2FA}(T) \ge \gamma. \]
Shiryaev-Roberts-Pollak (SRP) Procedure [Pollak 85]: Start off the SR statistic at a random point $R_0$ distributed according to the quasi-stationary distribution of the SR statistic, $Q_A(x) = \lim_{n \to \infty} P_\infty(R_n \le x \mid T_{SR}(A) > n)$:
\[ R_n^Q = (1 + R_{n-1}^Q) L_n, \quad R_0^Q \sim Q_A, \]
and stop as soon as it exceeds a threshold $A > 0$:
\[ T_A^Q = \inf\{n \ge 1 : R_n^Q \ge A\}. \]
SRP Almost Optimality [Pollak 85]: If $A = A_\gamma$ is selected so that $E_\infty[T_A^Q] = \gamma$, then SRP minimizes SADD(T) to within $o(1) \to 0$ asymptotically as $\gamma \to \infty$, i.e.,
\[ \mathrm{SADD}(T_A^Q) - \inf_{T \in C_\gamma} \mathrm{SADD}(T) = o(1) \quad \text{as } \gamma \to \infty. \]

22 Shiryaev-Roberts-Pollak Procedure
Why quasi-stationary? And why is SRP expected to be optimal? Because starting off at the quasi-stationary distribution makes SRP an equalizer (see Figure):
\[ E_\nu(T_A^Q - \nu \mid T_A^Q > \nu) = E_0 T_A^Q \quad \text{for all } \nu \ge 0, \]
so that by general decision theory it may be minimax (but is not necessarily so!).
[Figure: Conditional ADD $E_\nu(T - \nu \mid T > \nu)$ vs. changepoint $\nu$ for SR ($R_0 = 0$) and SRP ($R_0 \sim Q_A$).]
Optimality of SRP was an open problem for 2 decades. Polunchenko & Tartakovsky [Ann. Stat. 10] showed by way of a counterexample that SRP is not optimal, but that another procedure, which will be discussed later, is.

23 A Novel SR-r Detection Procedure
Idea: Initialize the SR statistic not from zero, but from a specially designed deterministic point $r \in [0, A)$:
\[ R_n^r = (1 + R_{n-1}^r) L_n, \quad n \ge 1, \quad R_0^r = r \in [0, A); \qquad T_A^r = \inf\{n \ge 1 : R_n^r \ge A\}. \]
The question now is: how does one choose the starting point $R_0 = r$ so as to outperform the SRP procedure?
To be able to analyze detection procedures, Moustakides, Polunchenko & Tartakovsky developed a framework for performance evaluation that allows for computing (almost) precisely any performance index of interest (CADD, SADD, ARL2FA, PFA, IADD, quasi-stationary distribution, etc.) by numerically solving a system of Fredholm integral equations of the 2nd kind.
[Figure: CADD $E_k[T - k \mid T > k]$ vs. $\nu = k$ for different values of $R_0 = r$ (numerics): SR ($r = 0$), SRP ($r$ random), SR-r ($r = r_A$).]
There are starting points $r$ for which SR-r uniformly outperforms SRP.
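
The only difference between SR-r and SR is the deterministic headstart $R_0^r = r$. A minimal sketch follows; the likelihood ratio $L(x) = e^{x - 1/2}$, the data, the threshold, and the value of $r$ are assumptions chosen for illustration and are not the optimized headstart discussed in the talk.

```python
import math

def srr_stopping_time(xs, lr, threshold, r0=0.0):
    """SR-r rule: R_n = (1 + R_{n-1}) * L_n started from R_0 = r0 in [0, threshold).

    Returns (T, path), with T = None if no alarm occurs within the sample.
    """
    if not 0.0 <= r0 < threshold:
        raise ValueError("the headstart must satisfy 0 <= r0 < threshold")
    r = r0
    path = []
    for n, x in enumerate(xs, start=1):
        r = (1.0 + r) * lr(x)
        path.append(r)
        if r >= threshold:
            return n, path
    return None, path

# Assumed example: f = N(0,1), g = N(1,1), so L(x) = exp(x - 1/2).
lr = lambda x: math.exp(x - 0.5)
xs = [0.2, -0.1, 1.5, 2.0, 1.1]
t_sr, _ = srr_stopping_time(xs, lr, threshold=8.0, r0=0.0)   # plain SR
t_srr, _ = srr_stopping_time(xs, lr, threshold=8.0, r0=3.0)  # SR-r with headstart 3
```

For fixed data and threshold, the statistic is increasing in the headstart, so a larger $r$ can only make the alarm earlier; in practice the threshold is re-tuned for each $r$ so that the ARL to false alarm stays at $\gamma$.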

24 The Lower Bound and Optimality Issues
Let
\[ \mathrm{IADD}_r(T) = \frac{r E_0 T + \sum_{\nu=0}^{\infty} E_\nu(T - \nu \mid T > \nu)\, P_\infty(T > \nu)}{r + E_\infty T} \]
denote the Integral (relative) ADD, which has been considered in the Generalized Bayesian Problem for $r = 0$.
Theorem (Lower Bound for Maximal ADD): Let the threshold $A = A_\gamma$ be selected so that $E_\infty[T^r_{A_\gamma}] = \gamma$.
(i) For any $r \ge 0$, the SR-r procedure minimizes $\mathrm{IADD}_r(T)$ over all procedures with $E_\infty T \ge \gamma$, i.e., $\inf_{T \in C_\gamma} \mathrm{IADD}_r(T) = \mathrm{IADD}_r(T^r_{A_\gamma})$.
(ii) For every $r \ge 0$,
\[ \inf_{T \in C_\gamma} \mathrm{SADD}(T) \ge \mathrm{IADD}_r(T^r_{A_\gamma}) = \frac{r E_0[T^r_{A_\gamma}] + \sum_{\nu=0}^{\infty} E_\nu[T^r_{A_\gamma} - \nu \mid T^r_{A_\gamma} > \nu]\, P_\infty(T^r_{A_\gamma} > \nu)}{r + E_\infty[T^r_{A_\gamma}]}. \]
(iii) Assume that $r = r(\gamma)$ can be chosen so that the SR-r procedure becomes an equalizer, i.e., $E_0[T^r_{A_\gamma}] = E_\nu(T^r_{A_\gamma} - \nu \mid T^r_{A_\gamma} > \nu)$ for all $\nu \ge 0$. Then it is strictly minimax:
\[ \inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T^r_{A_\gamma}). \]

25 Counterexample Disproving Pollak's Conjecture
Example: Exp(1)-to-Exp(1/θ), i.e., pre-change pdf $f(x) = e^{-x} 1_{\{x \ge 0\}}$ and post-change pdf $g(x) = \theta e^{-\theta x} 1_{\{x \ge 0\}}$. Set $\theta = 2$.
Optimality of SR-r [Polunchenko & Tartakovsky 10]: Let the initializing value be chosen as $r_A = \sqrt{1 + A} - 1$ and the threshold $A = A_\gamma$ be selected from the transcendental equation
\[ A + (\gamma - 1)\sqrt{1 + A}\, \log(1 + A) - 2(\gamma - 1)\sqrt{1 + A} = 0. \]
Then for every $\gamma < \gamma_0 = (1 - 0.5 \log 3)^{-1}$, the ARL to false alarm is $E_\infty[T^{r_{A_\gamma}}_{A_\gamma}] = \gamma$ and the SR-r procedure is minimax, while SRP is not.
[Figure: Operating characteristics of SRP vs. SR-r: $\sup_k E_k[T - k \mid T > k]$ versus $E_\infty[T]$ for Shiryaev-Roberts-$r_A$ and Shiryaev-Roberts-Pollak.]
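
As a numerical sketch of this example, the threshold $A_\gamma$ can be obtained by bisection from the transcendental equation, read here as $A + (\gamma - 1)\sqrt{1 + A}\,\log(1 + A) - 2(\gamma - 1)\sqrt{1 + A} = 0$, with headstart $r_A = \sqrt{1 + A} - 1$. The square roots and minus signs are a reconstruction of the flattened slide text, and the bracket $(0, 2)$, the tolerance, and the choice $\gamma = 2$ are assumptions made for illustration.

```python
import math

def srr_threshold_equation(a, gamma):
    """Left-hand side of A + (g-1) sqrt(1+A) log(1+A) - 2 (g-1) sqrt(1+A) = 0."""
    s = math.sqrt(1.0 + a)
    return a + (gamma - 1.0) * s * math.log(1.0 + a) - 2.0 * (gamma - 1.0) * s

def solve_threshold(gamma, lo=1e-9, hi=2.0, tol=1e-12):
    """Bisection for a root A_gamma in (lo, hi); the bracket is an assumption."""
    f_lo = srr_threshold_equation(lo, gamma)
    f_hi = srr_threshold_equation(hi, gamma)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = srr_threshold_equation(mid, gamma)
        if f_mid * f_lo <= 0:
            hi = mid            # the root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # the root lies in [mid, hi]
    return 0.5 * (lo + hi)

gamma = 2.0                      # an assumed ARL level below gamma_0
A = solve_threshold(gamma)
r_A = math.sqrt(1.0 + A) - 1.0   # headstart from the slide
```

Bisection is used instead of a Newton step because the left-hand side changes sign exactly once on the assumed bracket for this value of $\gamma$, which makes the method robust without needing a derivative.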

26 Near Minimaxity of SR-r
Notation: $Z_i = \log L_i$ is the log-likelihood ratio for the $i$-th observation; $S_n = Z_1 + \dots + Z_n$; $\tau_a = \inf\{n : S_n \ge a\}$; $\kappa_a = S_{\tau_a} - a$ (overshoot); $\zeta = \lim_{a \to \infty} E_0[e^{-\kappa_a}]$, $\kappa = \lim_{a \to \infty} E_0 \kappa_a$; $I = E_0 Z_1$ is the Kullback-Leibler information number; $V_\infty = \sum_{j=1}^{\infty} e^{-S_j}$; $R_\infty$ is the limiting value of the SR statistic $R_n$ (distributed according to the stationary distribution); $C_\infty = E[\log(1 + R_\infty + V_\infty)]$; $C_r = E[\log(1 + r + V_\infty)]$; $\mathrm{ADD}_\infty(T) = \lim_{\nu \to \infty} E_\nu(T - \nu \mid T > \nu)$.
Tartakovsky, Pollak, & Polunchenko [Probability Theory & its Applications 11]
Theorem (Near Optimality and Asymptotic Approximations): Let $E_0 |Z_1|^2 < \infty$ and let $Z_1$ be non-arithmetic.
(i) If in the SRP procedure $A = A_\gamma = \gamma\zeta$, then $E_\infty T_A^{Q_A} = \gamma(1 + o(1))$ and, as $\gamma \to \infty$,
\[ \mathrm{SADD}(T_A^{Q_A}) = I^{-1}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1). \]
(ii) If in the SR-r procedure $A = A_\gamma = \gamma\zeta$, and the initialization point $r$ is either fixed or tends to infinity with the rate $o(\gamma)$ and is selected so that $\mathrm{SADD}(T_A^r) = \mathrm{ADD}_\infty(T_A^r)$, then $E_\infty T_A^r = \gamma(1 + o(1))$ and, as $\gamma \to \infty$,
\[ \mathrm{SADD}(T_A^r) = I^{-1}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1). \]
Therefore, both procedures are asymptotically third-order optimal:
\[ \inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T_A^{Q_A}) + o(1), \qquad \inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T_A^r) + o(1). \]

27 Ideas Behind the Proof and Efficient Choice of the Headstart
Lemma
(i) If $E_0 |Z_1|^2 < \infty$ and $Z_1$ is non-arithmetic, then as $A, \gamma \to \infty$
\[ \mathrm{IADD}_0(T_A) = \frac{1}{I}(\log A + \kappa - C_\infty) + o(1), \qquad \inf_{T \in C_\gamma} \mathrm{SADD}(T) \ge \frac{1}{I}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1). \]
(ii) If $E_0 |Z_1|^2 < \infty$ and $Z_1$ is non-arithmetic, then for any $r \ge 0$, as $A \to \infty$,
\[ \mathrm{ADD}_\infty(T_A^r) = E_0 T_A^{Q_A} = \frac{1}{I}(\log A + \kappa - C_\infty) + o(1), \]
\[ E_0 T_A^r = \frac{1}{I}(\log A + \kappa - C_r) + o(1). \]
How to select $r$? Equate the ADD at the beginning and at infinity, making the procedure look almost like an equalizer: $E_0 T_A^r = \mathrm{ADD}_\infty(T_A^r)$. This gives the equation for an optimal $r$: $C_\infty = C_r$.

28 Comparison of Detection Procedures: Beta-to-Beta Model
Beta-to-Beta Model:
\[ f(x) = \frac{x^{\delta - 1}(1 - x)^{\delta}}{B(\delta, \delta + 1)}; \qquad g(x) = \frac{x^{\delta}(1 - x)^{\delta - 1}}{B(\delta + 1, \delta)}, \quad 0 < x < 1, \]
where $\delta > 0$ is a given constant and $B(\cdot, \cdot)$ is the Beta function.
If $\delta = 1$, then $C_\infty = \pi^2/6$, and for $r = 1.98$ we have $C(r = 1.98) = C_\infty = 1.64$, so that $\mathrm{ADD}_\infty(T_A^r) \approx \mathrm{ADD}_0(T_A^r)$ (almost equalizer).
[Figure: CADD $E_\nu(T_A^r - \nu \mid T_A^r > \nu)$ versus changepoint $\nu$ for SRP ($r$ random) and SR-r ($r = r^* = 1.98$).]
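
Because $B(\delta, \delta + 1) = B(\delta + 1, \delta)$, the likelihood ratio in the Beta-to-Beta model reduces to $L(x) = g(x)/f(x) = x/(1 - x)$ for every $\delta > 0$. The short check below illustrates this numerically; the function names are illustrative.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) computed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def f_pre(x, delta):
    """Pre-change density x^(delta-1) (1-x)^delta / B(delta, delta+1) on (0, 1)."""
    return x ** (delta - 1) * (1 - x) ** delta / beta_fn(delta, delta + 1)

def g_post(x, delta):
    """Post-change density x^delta (1-x)^(delta-1) / B(delta+1, delta) on (0, 1)."""
    return x ** delta * (1 - x) ** (delta - 1) / beta_fn(delta + 1, delta)

def lr(x, delta):
    """Likelihood ratio g(x)/f(x); analytically equal to x / (1 - x)."""
    return g_post(x, delta) / f_pre(x, delta)

# Compare the computed ratio with the closed form for delta = 1.
ratios = [(x, lr(x, 1.0), x / (1 - x)) for x in (0.1, 0.5, 0.9)]
```

Since the ratio does not depend on $\delta$, the same function `lambda x: x / (1 - x)` can be passed to any SR, SR-r, or CUSUM recursion for this model.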

29 Comparison of Detection Procedures: Gaussian Model
Gaussian Model: the $N(\mu, a\mu)$-to-$N(\theta, a\theta)$ model, i.e., a change in the mean from $\mu$ to $\theta$ in which the variance $\sigma^2$ also changes (the ratio of variance to mean is the constant $a$). When $a = 1$ this is an approximation to a Poisson model.
[Figure: Conditional average detection delay $\mathrm{ADD}_\nu(T) = E_\nu[T - \nu \mid T > \nu]$ versus the changepoint $\nu$ for the four detection procedures SR ($R_0 = 0$), CUSUM, SRP ($R_0^Q \sim Q_A$), and SR-r ($R_0^r = r^*$), for $\mu = 1000$, $\theta = 1001$, $a = 0.01$, $\gamma = 10^4$.]
Conclusions: (1) CUSUM is superior to SR for changes occurring either immediately or in the near to mid-term future, and SR is superior for changes occurring at a far horizon; (2) SRP and SR-r have almost the same performance, while CUSUM is inferior.
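
For this model the log-likelihood ratio of a single observation has the closed form $\log\frac{g(x)}{f(x)} = \tfrac12 \log\frac{\mu}{\theta} - \frac{(x - \theta)^2}{2a\theta} + \frac{(x - \mu)^2}{2a\mu}$, which follows directly from the two normal densities. A minimal sketch, with illustrative function names and the parameter values from the figure caption used only as an example:

```python
import math

def log_lr_mean_var(x, mu, theta, a):
    """log g(x)/f(x) for f = N(mu, a*mu) and g = N(theta, a*theta)."""
    var0, var1 = a * mu, a * theta
    return (0.5 * math.log(var0 / var1)
            - (x - theta) ** 2 / (2.0 * var1)
            + (x - mu) ** 2 / (2.0 * var0))

def normal_logpdf(x, mean, var):
    """Log-density of N(mean, var), used only to cross-check the closed form."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

mu, theta, a = 1000.0, 1001.0, 0.01
x = 1003.0
z = log_lr_mean_var(x, mu, theta, a)
z_check = normal_logpdf(x, theta, a * theta) - normal_logpdf(x, mu, a * mu)
```

Exponentiating `z` gives the per-observation likelihood ratio $L_n$ that drives the SR, SR-r, SRP, and CUSUM recursions compared in the figure.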

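31 Unknown Post-change Parameter: $g(X_n) = g_\theta(X_n)$, $\theta \in \Theta$
Two conventional approaches:
1 GLR: maximize over the unknown parameter (usually applied to CUSUM), i.e.,
\[ T_{GCS}(h) = \inf\Big\{n : \max_{1 \le k \le n} \sup_{\theta} w(\theta) \Lambda_n^k(\theta) \ge h\Big\} \]
2 Mixtures (weighting): average over some prior (usually applied to SR)
\[ T_{WSR}(A) = \inf\Big\{n : \sum_{k=1}^{n} \int_\Theta \Lambda_n^k(\theta) w(\theta)\, d\theta \ge A\Big\} \]
Both procedures are first-order asymptotically minimax wrt Pollak's maximal ADD, since (by the above theorem)
\[ \inf_{T \in C_\gamma} \mathrm{SADD}_\theta(T) = I_\theta^{-1} \log\gamma + O(1), \quad \gamma \to \infty, \qquad I_\theta = \int \log\Big(\frac{g_\theta(x)}{f(x)}\Big) g_\theta(x)\, \lambda(dx), \]
and (for the $l$-dimensional exponential family, $\log\big(\frac{f_\theta(x)}{f_0(x)}\big) = \theta^\top x - b(\theta)$) if $A = \gamma \int_\Theta \zeta_\theta w(\theta)\, d\theta$, then $E_\infty T_{WSR}(A) \sim \gamma$ and
\[ \mathrm{SADD}_\theta(T_{WSR}) = \frac{1}{I_\theta}\Big\{ \log\gamma + \frac{l}{2}\log\log\gamma - \frac{l}{2}[1 + \log(2\pi)] + \log\Big(\int_\Theta \zeta_t w(t)\, dt\Big) - \log\Big( w(\theta)\, e^{-\kappa_\theta + \mu_\theta} \sqrt{I_\theta^l / \det[\nabla^2 b(\theta)]} \Big) \Big\} + o(1), \quad \text{as } \gamma \to \infty. \]
Thus, $\mathrm{SADD}_\theta(T_{WSR}) - \inf_{T \in C_\gamma} \mathrm{SADD}_\theta(T) = O\big(\frac{l}{2 I_\theta} \log\log\gamma\big)$ for all $\theta$.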
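
A minimal sketch of the mixture (weighted) SR rule, under the simplifying assumption that the weight $w$ is a discrete distribution on a finite grid of parameter values, so the integral over $\Theta$ becomes a finite sum. Each component statistic is assumed to follow the usual SR recursion, and the Gaussian exponential family $\log(f_\theta(x)/f_0(x)) = \theta x - \theta^2/2$, the grid, and the weights are illustrative choices.

```python
import math

def weighted_sr(xs, thetas, weights, log_lr, threshold):
    """Mixture SR: R_n(w) = sum_i w_i * R_n(theta_i), each component run recursively.

    Returns (T, value of the mixture statistic at the last processed sample);
    T is None if the threshold is not reached within the sample.
    """
    if len(thetas) != len(weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match thetas and sum to one")
    r = [0.0] * len(thetas)
    mix = 0.0
    for n, x in enumerate(xs, start=1):
        # SR recursion for every candidate post-change parameter.
        r = [(1.0 + ri) * math.exp(log_lr(x, th)) for ri, th in zip(r, thetas)]
        mix = sum(w * ri for w, ri in zip(weights, r))
        if mix >= threshold:
            return n, mix
    return None, mix

# Assumed model: N(0,1) to N(theta,1), i.e. log LR = theta * x - theta^2 / 2.
log_lr = lambda x, theta: theta * x - 0.5 * theta ** 2
xs = [0.2, -0.1, 1.5, 2.0, 1.1]
T_mix, R_mix = weighted_sr(xs, [0.5, 1.0, 2.0], [0.25, 0.5, 0.25], log_lr, 10.0)
T_one, R_one = weighted_sr(xs, [1.0], [1.0], log_lr, 10.0)
```

With a single grid point the rule reduces to the ordinary SR procedure for that parameter value, which gives a simple sanity check of the implementation.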

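32 Unknown Post-change Parameter (Cont'd)
Question: Is it possible to further optimize by choosing a specific weight $w(\theta)$ to obtain higher-order optimality?
Expected Kullback-Leibler (K-L) Information:
\[ \mathrm{KL}_{\nu,\theta}(T) := E_{\nu,\theta}(\lambda_T^\theta - \lambda_\nu^\theta \mid T > \nu) = I_\theta\, E_{\nu,\theta}(T - \nu \mid T > \nu), \]
where $\lambda_n^\theta = \sum_{k=1}^{n} \log\frac{p_\theta(X_k)}{p_{\theta_0}(X_k)}$ is the log-likelihood ratio and $I_\theta = E_{0,\theta} \lambda_1^\theta = \theta^\top \nabla b(\theta) - b(\theta)$ is the K-L information number.
Expected Maximal K-L Information (over both $\nu$ and $\theta$):
\[ \sup_{\theta \in \Theta} \sup_{\nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) = \sup_{\theta \in \Theta}\Big[ I_\theta \sup_{\nu \ge 0} E_{\nu,\theta}(T - \nu \mid T > \nu) \Big] = \sup_{\theta \in \Theta}[I_\theta\, \mathrm{SADD}_\theta(T)]. \]
Find a nearly minimax procedure:
\[ \sup_{\theta \in \Theta,\, \nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) = \inf_{T \in C_\gamma} \sup_{\theta \in \Theta,\, \nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) + o(1) \quad \text{as } \gamma \to \infty. \]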

33 Unknown Post-change Parameter (Cont'd)
Asymptotic Lower Bound [Siegmund & Yakir 08]:
\[ \inf_{T \in C_\gamma} \sup_{\theta, \nu} \mathrm{KL}_{\nu,\theta}(T) \ge \log\gamma + \frac{l}{2}\log\log\gamma - \frac{l}{2}[1 + \log(2\pi)] + C_{opt} + o(1), \]
\[ C_{opt} = \log\Big( \int_\Theta \zeta_t\, e^{\kappa_t - \beta_t} \sqrt{\det[\nabla^2 b(t)] / I_t^l}\; dt \Big). \]
Weight Selection for WSR: $w(\theta) \propto e^{\kappa_\theta - \mu_\theta} \sqrt{\det[\nabla^2 b(\theta)] / I_\theta^l}$, which turns WSR into an equalizer with constant
\[ C_{WSR} = \log\Big( \int_\Theta \zeta_t\, e^{\kappa_t - \mu_t} \sqrt{\det[\nabla^2 b(t)] / I_t^l}\; dt \Big). \]
Asymptotic Risk Regret:
\[ \sup_{\theta, \nu} \mathrm{KL}_{\nu,\theta}(T_{WSR}) - \inf_{T} \sup_{\theta, \nu} \mathrm{KL}_{\nu,\theta}(T) = O(1) \quad (\approx C_{WSR} - C_{opt}). \]
Open Problem: Show that the SR-r mixture with a specially designed mixing distribution and headstart is almost (third-order) optimal, i.e., attains the above lower bound with the same constant to within $o(1)$.

35 Rapid Detection of Intrusions in Computer Networks
As a rule, computer intrusions lead to (relatively) abrupt changes in network traffic, so that changepoint detection methods can be used for rapid detection of attacks with a given false alarm rate (FAR).
Examples:
  Distributed Denial-of-Service (DDoS) attacks
  Unauthorized break-ins
  Spam campaigns
However, the behavior of both pre- and post-attack traffic is poorly understood, and typically neither the pre- nor the post-change distribution is known. As a result, alternative score-based (not strictly optimal) methods should be used:
\[ R_n^{sc} = (1 + R_{n-1}^{sc})\, e^{S_n}, \qquad W_n^{sc} = \max(0, W_{n-1}^{sc} + S_n), \]
where $S_n$ is a score sensitive to a change.
The linear-quadratic memoryless score $S_n(X_n) = C_1 X_n + C_2 X_n^2 - C_3$ is efficient for detecting changes in the mean and variance.
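
The score-based recursions can be sketched as follows, with the linear-quadratic score $S_n = C_1 X_n + C_2 X_n^2 - C_3$ taking the place of the log-likelihood ratio. The coefficient values, the data, the thresholds, and the convention of stopping at the first crossing of the threshold are assumptions made for illustration; the slide does not specify them.

```python
import math

def lq_score(x, c1, c2, c3):
    """Linear-quadratic memoryless score S(x) = c1*x + c2*x^2 - c3."""
    return c1 * x + c2 * x * x - c3

def score_based_detectors(xs, score, sr_threshold, cusum_threshold):
    """Run the score-based SR and CUSUM statistics side by side.

    R_n = (1 + R_{n-1}) * exp(S_n),  W_n = max(0, W_{n-1} + S_n).
    Returns the first crossing times (t_sr, t_cusum); None means no alarm.
    """
    r, w = 0.0, 0.0
    t_sr = t_cusum = None
    for n, x in enumerate(xs, start=1):
        s = score(x)
        if t_sr is None:
            r = (1.0 + r) * math.exp(s)
            if r >= sr_threshold:
                t_sr = n
        if t_cusum is None:
            w = max(0.0, w + s)
            if w >= cusum_threshold:
                t_cusum = n
        if t_sr is not None and t_cusum is not None:
            break
    return t_sr, t_cusum

# Hypothetical coefficients and data (observations assumed already centered and scaled).
score = lambda x: lq_score(x, c1=1.0, c2=0.5, c3=1.0)
xs = [0.1, -0.2, 0.0, 2.0, 2.5, 3.0]
t_sr, t_cusum = score_based_detectors(xs, score, sr_threshold=math.exp(5.0), cusum_threshold=5.0)
```

The constant $C_3$ acts as a negative drift before the change, which keeps both statistics small under normal traffic; it would normally be tuned together with the thresholds to meet a target false alarm rate.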

36 Example 1: Detection of Spammers
Implementation of linear score-based CUSUM and SR-type change detection algorithms for the detection of spammers:
[Figure: SPAMMER detection with CUSUM (mid) and SR (bottom) procedures. Top panel: packet rate (packets/sec) versus time (sec), with the moment the spam message is sent marked. Middle panel: CUSUM statistic $W_n$ versus $n$, sec (sample), with the point "spammer detected" marked. Bottom panel: Shiryaev-Roberts statistic $R_n$ versus $n$, sec (sample), with the point "spammer detected" marked. Green line: EWMA estimate of the mean; red: detection thresholds.]
Detection is fast with both algorithms, but CUSUM triggers several false alarms prior to the attack, while SR does not.

37 Example 2: A TCP SYN Flood DDoS Attack
TCP SYN flood attack on a University of Michigan IRC server: it starts at 550 seconds into the trace and lasts for 10 minutes. While the attack can be seen with the naked eye, it is not completely clear when it starts; there is a fluctuation (a spike) in the data before the attack.
[Figure: The connections birth rate (the number of attempted connections) versus time (seconds), with the change point marked.]
The estimated values of the mean and standard deviation of the connections birth rate are $\mu \approx 1669$, $\sigma \approx 114$ for legitimate traffic and $\mu \approx 1888$, $\sigma \approx 218$ for attack traffic (connections per 20 msec). Therefore, this attack leads to a considerable increase in both the mean and the standard deviation of the connections birth rate.

38 Example 2: A TCP SYN Flood DDoS Attack (Cont'd)
From left to right: a long run of the SR statistic with several false alarms and then the true detection of the attack with a very small detection delay (at the expense of raising many false alarms prior to the correct detection); the SR and CUSUM score-based statistics shortly prior to the attack and right after the attack starts until detection.
[Figure: Detection of the SYN flood attack by the multi-cyclic SR and CUSUM procedures. (a) Long run of the multi-cyclic SR statistic: log Shiryaev-Roberts statistic, log(R), versus time (seconds), with the log SR threshold and the false alarms marked. (b) Multi-cyclic SR procedure: log(R) versus time (seconds), with the log SR threshold, the change point, and the SR detection marked. (c) Multi-cyclic CUSUM procedure: CUSUM statistic W versus time (seconds), with the CUSUM threshold, the change point, and the CUSUM detection marked.]
The detection delay is approximately 0.14 seconds (7 samples) for the repeated SR procedure and about 0.21 seconds (10 samples) for the CUSUM procedure. As expected, the SR procedure is slightly better.

39 Example 3: A UDP DDoS Attack
The picture shows the efficiency of the linear-quadratic score-based change detection algorithm with false alarm filtering by a spectral analyzer (hybrid anomaly spectral IDS) for detection of the UDP DDoS attack.
[Figure: hybrid anomaly spectral IDS applied to the UDP DDoS attack.]

40 Movie: Detection of the UDP DDoS Attacks
The movie shows the Hybrid Anomaly Spectral IDS in action.
[Movie: Hybrid Anomaly Spectral IDS in action.]

42 Acknowledgements
The research topics discussed in my talk have been supported by various US agencies (e.g., NSF, ONR, ARO, AFOSR, DARPA, DOE) over the last several years, and are currently being supported by:
1 The U.S. National Science Foundation under grant DMS
2 The U.S. Army Research Office under grant W911NF
3 The U.S. Air Force Office of Scientific Research under MURI grant FA
4 The U.S. Defense Threat Reduction Agency under grant HDTRA
5 The U.S. Defense Advanced Research Projects Agency under grant W911NF

43 THE END THANK YOU! Questions?


More information

Optimum CUSUM Tests for Detecting Changes in Continuous Time Processes

Optimum CUSUM Tests for Detecting Changes in Continuous Time Processes Optimum CUSUM Tests for Detecting Changes in Continuous Time Processes George V. Moustakides INRIA, Rennes, France Outline The change detection problem Overview of existing results Lorden s criterion and

More information

CHANGE DETECTION WITH UNKNOWN POST-CHANGE PARAMETER USING KIEFER-WOLFOWITZ METHOD

CHANGE DETECTION WITH UNKNOWN POST-CHANGE PARAMETER USING KIEFER-WOLFOWITZ METHOD CHANGE DETECTION WITH UNKNOWN POST-CHANGE PARAMETER USING KIEFER-WOLFOWITZ METHOD Vijay Singamasetty, Navneeth Nair, Srikrishna Bhashyam and Arun Pachai Kannu Department of Electrical Engineering Indian

More information

Data-Efficient Quickest Change Detection

Data-Efficient Quickest Change Detection Data-Efficient Quickest Change Detection Venu Veeravalli ECE Department & Coordinated Science Lab University of Illinois at Urbana-Champaign http://www.ifp.illinois.edu/~vvv (joint work with Taposh Banerjee)

More information

arxiv: v1 [math.st] 11 Oct 2013

arxiv: v1 [math.st] 11 Oct 2013 Proceedings of the 2013 Joint Statistical Meetings (JSM-2013) Montréal, Québec, Canada, 3 8 August 2013 Quickest Change-Point Detection: A Bird s Eye View Aleksey S. Polunchenko Grigory Sokolov Wenyu Du

More information

Change-point models and performance measures for sequential change detection

Change-point models and performance measures for sequential change detection Change-point models and performance measures for sequential change detection Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Greece moustaki@upatras.gr George V. Moustakides

More information

COMPARISON OF STATISTICAL ALGORITHMS FOR POWER SYSTEM LINE OUTAGE DETECTION

COMPARISON OF STATISTICAL ALGORITHMS FOR POWER SYSTEM LINE OUTAGE DETECTION COMPARISON OF STATISTICAL ALGORITHMS FOR POWER SYSTEM LINE OUTAGE DETECTION Georgios Rovatsos*, Xichen Jiang*, Alejandro D. Domínguez-García, and Venugopal V. Veeravalli Department of Electrical and Computer

More information

arxiv:math/ v2 [math.st] 15 May 2006

arxiv:math/ v2 [math.st] 15 May 2006 The Annals of Statistics 2006, Vol. 34, No. 1, 92 122 DOI: 10.1214/009053605000000859 c Institute of Mathematical Statistics, 2006 arxiv:math/0605322v2 [math.st] 15 May 2006 SEQUENTIAL CHANGE-POINT DETECTION

More information

X 1,n. X L, n S L S 1. Fusion Center. Final Decision. Information Bounds and Quickest Change Detection in Decentralized Decision Systems

X 1,n. X L, n S L S 1. Fusion Center. Final Decision. Information Bounds and Quickest Change Detection in Decentralized Decision Systems 1 Information Bounds and Quickest Change Detection in Decentralized Decision Systems X 1,n X L, n Yajun Mei Abstract The quickest change detection problem is studied in decentralized decision systems,

More information

Sequential Change-Point Approach for Online Community Detection

Sequential Change-Point Approach for Online Community Detection Sequential Change-Point Approach for Online Community Detection Yao Xie Joint work with David Marangoni-Simonsen H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology

More information

SCALABLE ROBUST MONITORING OF LARGE-SCALE DATA STREAMS. By Ruizhi Zhang and Yajun Mei Georgia Institute of Technology

SCALABLE ROBUST MONITORING OF LARGE-SCALE DATA STREAMS. By Ruizhi Zhang and Yajun Mei Georgia Institute of Technology Submitted to the Annals of Statistics SCALABLE ROBUST MONITORING OF LARGE-SCALE DATA STREAMS By Ruizhi Zhang and Yajun Mei Georgia Institute of Technology Online monitoring large-scale data streams has

More information

EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES

EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES CHAO AN-KUO NATIONAL UNIVERSITY OF SINGAPORE 2016 EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES CHAO AN-KUO (MS,

More information

Prompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014

Prompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014 Prompt Network Anomaly Detection using SSA-Based Change-Point Detection Hao Chen 3/7/2014 Network Anomaly Detection Network Intrusion Detection (NID) Signature-based detection Detect known attacks Use

More information

Quickest Detection With Post-Change Distribution Uncertainty

Quickest Detection With Post-Change Distribution Uncertainty Quickest Detection With Post-Change Distribution Uncertainty Heng Yang City University of New York, Graduate Center Olympia Hadjiliadis City University of New York, Brooklyn College and Graduate Center

More information

arxiv: v2 [eess.sp] 20 Nov 2017

arxiv: v2 [eess.sp] 20 Nov 2017 Distributed Change Detection Based on Average Consensus Qinghua Liu and Yao Xie November, 2017 arxiv:1710.10378v2 [eess.sp] 20 Nov 2017 Abstract Distributed change-point detection has been a fundamental

More information

Simultaneous and sequential detection of multiple interacting change points

Simultaneous and sequential detection of multiple interacting change points Simultaneous and sequential detection of multiple interacting change points Long Nguyen Department of Statistics University of Michigan Joint work with Ram Rajagopal (Stanford University) 1 Introduction

More information

Statistical Inference

Statistical Inference Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Week 12. Testing and Kullback-Leibler Divergence 1. Likelihood Ratios Let 1, 2, 2,...

More information

Quickest Anomaly Detection: A Case of Active Hypothesis Testing

Quickest Anomaly Detection: A Case of Active Hypothesis Testing Quickest Anomaly Detection: A Case of Active Hypothesis Testing Kobi Cohen, Qing Zhao Department of Electrical Computer Engineering, University of California, Davis, CA 95616 {yscohen, qzhao}@ucdavis.edu

More information

REPORT DOCUMENTATION PAGE

REPORT DOCUMENTATION PAGE REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-088 The public reporting burden for this collection of information is estimated to average hour per response, including the time for reviewing instructions,

More information

Surveillance of BiometricsAssumptions

Surveillance of BiometricsAssumptions Surveillance of BiometricsAssumptions in Insured Populations Journée des Chaires, ILB 2017 N. El Karoui, S. Loisel, Y. Sahli UPMC-Paris 6/LPMA/ISFA-Lyon 1 with the financial support of ANR LoLitA, and

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Efficient scalable schemes for monitoring a large number of data streams

Efficient scalable schemes for monitoring a large number of data streams Biometrika (2010, 97, 2,pp. 419 433 C 2010 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asq010 Advance Access publication 16 April 2010 Efficient scalable schemes for monitoring a large

More information

Monitoring actuarial assumptions in life insurance

Monitoring actuarial assumptions in life insurance Monitoring actuarial assumptions in life insurance Stéphane Loisel ISFA, Univ. Lyon 1 Joint work with N. El Karoui & Y. Salhi IAALS Colloquium, Barcelona, 17 LoLitA Typical paths with change of regime

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Surveillance of Infectious Disease Data using Cumulative Sum Methods

Surveillance of Infectious Disease Data using Cumulative Sum Methods Surveillance of Infectious Disease Data using Cumulative Sum Methods 1 Michael Höhle 2 Leonhard Held 1 1 Institute of Social and Preventive Medicine University of Zurich 2 Department of Statistics University

More information

An Effective Approach to Nonparametric Quickest Detection and Its Decentralized Realization

An Effective Approach to Nonparametric Quickest Detection and Its Decentralized Realization University of Tennessee, Knoxville Trace: Tennessee Research and Creative Exchange Doctoral Dissertations Graduate School 5-2010 An Effective Approach to Nonparametric Quickest Detection and Its Decentralized

More information

10-704: Information Processing and Learning Fall Lecture 24: Dec 7

10-704: Information Processing and Learning Fall Lecture 24: Dec 7 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 24: Dec 7 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of

More information

Applications of Information Geometry to Hypothesis Testing and Signal Detection

Applications of Information Geometry to Hypothesis Testing and Signal Detection CMCAA 2016 Applications of Information Geometry to Hypothesis Testing and Signal Detection Yongqiang Cheng National University of Defense Technology July 2016 Outline 1. Principles of Information Geometry

More information

Introduction to Bayesian Statistics

Introduction to Bayesian Statistics Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California

More information

IMPORTANCE SAMPLING FOR GENERALIZED LIKELIHOOD RATIO PROCEDURES IN SEQUENTIAL ANALYSIS. Hock Peng Chan

IMPORTANCE SAMPLING FOR GENERALIZED LIKELIHOOD RATIO PROCEDURES IN SEQUENTIAL ANALYSIS. Hock Peng Chan IMPORTANCE SAMPLING FOR GENERALIZED LIKELIHOOD RATIO PROCEDURES IN SEQUENTIAL ANALYSIS Hock Peng Chan Department of Statistics and Applied Probability National University of Singapore, Singapore 119260

More information

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

The information complexity of sequential resource allocation

The information complexity of sequential resource allocation The information complexity of sequential resource allocation Emilie Kaufmann, joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishan SMILE Seminar, ENS, June 8th, 205 Sequential allocation

More information

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and Estimation I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 22, 2015

More information

Space-Time CUSUM for Distributed Quickest Detection Using Randomly Spaced Sensors Along a Path

Space-Time CUSUM for Distributed Quickest Detection Using Randomly Spaced Sensors Along a Path Space-Time CUSUM for Distributed Quickest Detection Using Randomly Spaced Sensors Along a Path Daniel Egea-Roca, Gonzalo Seco-Granados, José A López-Salcedo, Sunwoo Kim Dpt of Telecommunications and Systems

More information

On robust stopping times for detecting changes in distribution

On robust stopping times for detecting changes in distribution On robust stopping times for detecting changes in distribution by Yuri Golubev, Mher Safarian No. 116 MAY 2018 WORKING PAPER SERIES IN ECONOMICS KIT Die Forschungsuniversität in der Helmholtz-Gemeinschaft

More information

SEQUENTIAL CHANGE DETECTION REVISITED. BY GEORGE V. MOUSTAKIDES University of Patras

SEQUENTIAL CHANGE DETECTION REVISITED. BY GEORGE V. MOUSTAKIDES University of Patras The Annals of Statistics 28, Vol. 36, No. 2, 787 87 DOI: 1.1214/95367938 Institute of Mathematical Statistics, 28 SEQUENTIAL CHANGE DETECTION REVISITED BY GEORGE V. MOUSTAKIDES University of Patras In

More information

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your answer

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

Practical Aspects of False Alarm Control for Change Point Detection: Beyond Average Run Length

Practical Aspects of False Alarm Control for Change Point Detection: Beyond Average Run Length https://doi.org/10.1007/s11009-018-9636-1 Practical Aspects of False Alarm Control for Change Point Detection: Beyond Average Run Length J. Kuhn 1,2 M. Mandjes 1 T. Taimre 2 Received: 29 August 2016 /

More information

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018 Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

Decentralized decision making with spatially distributed data

Decentralized decision making with spatially distributed data Decentralized decision making with spatially distributed data XuanLong Nguyen Department of Statistics University of Michigan Acknowledgement: Michael Jordan, Martin Wainwright, Ram Rajagopal, Pravin Varaiya

More information

Lecture 17: Likelihood ratio and asymptotic tests

Lecture 17: Likelihood ratio and asymptotic tests Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f

More information

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007

More information

Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger Yanhong Wu Inference for Change-Point and Post-Change Means After a CUSUM Test Yanhong Wu Department

More information

DETECTION theory deals primarily with techniques for

DETECTION theory deals primarily with techniques for ADVANCED SIGNAL PROCESSING SE Optimum Detection of Deterministic and Random Signals Stefan Tertinek Graz University of Technology turtle@sbox.tugraz.at Abstract This paper introduces various methods for

More information

STA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources

STA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources STA 732: Inference Notes 10. Parameter Estimation from a Decision Theoretic Angle Other resources 1 Statistical rules, loss and risk We saw that a major focus of classical statistics is comparing various

More information

Real-Time Detection of Hybrid and Stealthy Cyber-Attacks in Smart Grid

Real-Time Detection of Hybrid and Stealthy Cyber-Attacks in Smart Grid 1 Real-Time Detection of Hybrid and Stealthy Cyber-Attacks in Smart Grid Mehmet Necip Kurt, Yasin Yılmaz, Member, IEEE, and Xiaodong Wang, Fellow, IEEE Abstract For a safe and reliable operation of the

More information

Detection theory 101 ELEC-E5410 Signal Processing for Communications

Detection theory 101 ELEC-E5410 Signal Processing for Communications Detection theory 101 ELEC-E5410 Signal Processing for Communications Binary hypothesis testing Null hypothesis H 0 : e.g. noise only Alternative hypothesis H 1 : signal + noise p(x;h 0 ) γ p(x;h 1 ) Trade-off

More information

Decentralized Detection In Wireless Sensor Networks

Decentralized Detection In Wireless Sensor Networks Decentralized Detection In Wireless Sensor Networks Milad Kharratzadeh Department of Electrical & Computer Engineering McGill University Montreal, Canada April 2011 Statistical Detection and Estimation

More information

Context tree models for source coding

Context tree models for source coding Context tree models for source coding Toward Non-parametric Information Theory Licence de droits d usage Outline Lossless Source Coding = density estimation with log-loss Source Coding and Universal Coding

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Lecture 26: Likelihood ratio tests

Lecture 26: Likelihood ratio tests Lecture 26: Likelihood ratio tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f θ0 (X) > c 0 for

More information

A CUSUM approach for online change-point detection on curve sequences

A CUSUM approach for online change-point detection on curve sequences ESANN 22 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges Belgium, 25-27 April 22, i6doc.com publ., ISBN 978-2-8749-49-. Available

More information

Quantization Effect on the Log-Likelihood Ratio and Its Application to Decentralized Sequential Detection

Quantization Effect on the Log-Likelihood Ratio and Its Application to Decentralized Sequential Detection 1536 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 61, NO. 6, MARCH 15, 2013 Quantization Effect on the Log-Likelihood Ratio Its Application to Decentralized Sequential Detection Yan Wang Yajun Mei Abstract

More information

Multi-armed bandit models: a tutorial

Multi-armed bandit models: a tutorial Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)

More information

The information complexity of best-arm identification

The information complexity of best-arm identification The information complexity of best-arm identification Emilie Kaufmann, joint work with Olivier Cappé and Aurélien Garivier MAB workshop, Lancaster, January th, 206 Context: the multi-armed bandit model

More information

Decentralized Sequential Hypothesis Testing. Change Detection

Decentralized Sequential Hypothesis Testing. Change Detection Decentralized Sequential Hypothesis Testing & Change Detection Giorgos Fellouris, Columbia University, NY, USA George V. Moustakides, University of Patras, Greece Outline Sequential hypothesis testing

More information

Bayesian Quickest Change Detection Under Energy Constraints

Bayesian Quickest Change Detection Under Energy Constraints Bayesian Quickest Change Detection Under Energy Constraints Taposh Banerjee and Venugopal V. Veeravalli ECE Department and Coordinated Science Laboratory University of Illinois at Urbana-Champaign, Urbana,

More information

Optimal and asymptotically optimal CUSUM rules for change point detection in the Brownian Motion model with multiple alternatives

Optimal and asymptotically optimal CUSUM rules for change point detection in the Brownian Motion model with multiple alternatives INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE Optimal and asymptotically optimal CUSUM rules for change point detection in the Brownian Motion model with multiple alternatives Olympia

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Structural Health Monitoring Using Statistical Pattern Recognition Introduction to Statistical Inference Presented by Charles R. Farrar, Ph.D., P.E. Outline Introduce statistical decision making for Structural

More information

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood

More information

Control Charts Based on Alternative Hypotheses

Control Charts Based on Alternative Hypotheses Control Charts Based on Alternative Hypotheses A. Di Bucchianico, M. Hušková (Prague), P. Klášterecky (Prague), W.R. van Zwet (Leiden) Dortmund, January 11, 2005 1/48 Goals of this talk introduce hypothesis

More information

Stratégies bayésiennes et fréquentistes dans un modèle de bandit

Stratégies bayésiennes et fréquentistes dans un modèle de bandit Stratégies bayésiennes et fréquentistes dans un modèle de bandit thèse effectuée à Telecom ParisTech, co-dirigée par Olivier Cappé, Aurélien Garivier et Rémi Munos Journées MAS, Grenoble, 30 août 2016

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Anomaly Detection and Attribution in Networks with Temporally Correlated Traffic

Anomaly Detection and Attribution in Networks with Temporally Correlated Traffic Anomaly Detection and Attribution in Networks with Temporally Correlated Traffic Ido Nevat, Dinil Mon Divakaran 2, Sai Ganesh Nagarajan 2, Pengfei Zhang 3, Le Su 2, Li Ling Ko 4, Vrizlynn L. L. Thing 2

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

Bandit models: a tutorial

Bandit models: a tutorial Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses

More information

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models Revisiting the Exploration-Exploitation Tradeoff in Bandit Models joint work with Aurélien Garivier (IMT, Toulouse) and Tor Lattimore (University of Alberta) Workshop on Optimization and Decision-Making

More information

9 Bayesian inference. 9.1 Subjective probability

9 Bayesian inference. 9.1 Subjective probability 9 Bayesian inference 1702-1761 9.1 Subjective probability This is probability regarded as degree of belief. A subjective probability of an event A is assessed as p if you are prepared to stake pm to win

More information

On Optimal Stopping Problems with Power Function of Lévy Processes

On Optimal Stopping Problems with Power Function of Lévy Processes On Optimal Stopping Problems with Power Function of Lévy Processes Budhi Arta Surya Department of Mathematics University of Utrecht 31 August 2006 This talk is based on the joint paper with A.E. Kyprianou:

More information

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models On the Complexity of Best Arm Identification in Multi-Armed Bandit Models Aurélien Garivier Institut de Mathématiques de Toulouse Information Theory, Learning and Big Data Simons Institute, Berkeley, March

More information

Anomaly detection and. in time series

Anomaly detection and. in time series Anomaly detection and sequential statistics in time series Alex Shyr CS 294 Practical Machine Learning 11/12/2009 (many slides from XuanLong Nguyen and Charles Sutton) Two topics Anomaly detection Sequential

More information

Detection and Diagnosis of Unknown Abrupt Changes Using CUSUM Multi-Chart Schemes

Detection and Diagnosis of Unknown Abrupt Changes Using CUSUM Multi-Chart Schemes Sequential Analysis, 26: 225 249, 2007 Copyright Taylor & Francis Group, LLC ISSN: 0747-4946 print/532-476 online DOI: 0.080/0747494070404765 Detection Diagnosis of Unknown Abrupt Changes Using CUSUM Multi-Chart

More information

Detection Theory. Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010

Detection Theory. Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010 Detection Theory Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010 Outline Neyman-Pearson Theorem Detector Performance Irrelevant Data Minimum Probability of Error Bayes Risk Multiple

More information

Statistics & Data Sciences: First Year Prelim Exam May 2018

Statistics & Data Sciences: First Year Prelim Exam May 2018 Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book

More information

Change Detection Algorithms

Change Detection Algorithms 5 Change Detection Algorithms In this chapter, we describe the simplest change detection algorithms. We consider a sequence of independent random variables (y k ) k with a probability density p (y) depending

More information

Consistency of the maximum likelihood estimator for general hidden Markov models

Consistency of the maximum likelihood estimator for general hidden Markov models Consistency of the maximum likelihood estimator for general hidden Markov models Jimmy Olsson Centre for Mathematical Sciences Lund University Nordstat 2012 Umeå, Sweden Collaborators Hidden Markov models

More information

Asymptotic results for empirical measures of weighted sums of independent random variables

Asymptotic results for empirical measures of weighted sums of independent random variables Asymptotic results for empirical measures of weighted sums of independent random variables B. Bercu and W. Bryc University Bordeaux 1, France Workshop on Limit Theorems, University Paris 1 Paris, January

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)? ECE 830 / CS 76 Spring 06 Instructors: R. Willett & R. Nowak Lecture 3: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information