Quickest Changepoint Detection: Optimality Properties of the Shiryaev-Roberts-Type Procedures
- Stephanie Townsend
Transcription
1 Quickest Changepoint Detection: Optimality Properties of the Shiryaev-Roberts-Type Procedures
Alexander Tartakovsky (UConn), Department of Statistics
Inference for Change-Point and Related Processes, Isaac Newton Institute for Mathematical Sciences, Cambridge, UK
January 17, 2014
7 Outline
- A general changepoint detection scenario
- A simple changepoint problem (iid case, known pre- and post-change distributions)
- Recent contributions, including
  - SR Procedure: Strict optimality properties
  - Novel SR-r Procedure: Minimax properties
  - Composite Post-change Hypothesis: Nearly minimax properties of the SR-mixtures
- Applications to Information Systems: Rapid detection of attacks/intrusions in computer networks
- Acknowledgements
8 Outline
1 Changepoint Detection
2 Generalized Bayesian Problem
3 Detecting Changes in a Stationary Regime
4 Minimax Criteria
  - Optimality of Page's CUSUM wrt Lorden's Criterion
  - Optimality of the Shiryaev-Roberts-Pollak Procedure wrt Pollak's Criterion
  - Novel SR-r Procedure and its Optimality
5 Changepoint Problems for Composite Hypotheses
6 Applications to Information Systems and Cyber Security
7 Acknowledgements
9 A General Changepoint Detection Scenario
[Figure: Typical general changepoint scenario. Surveillance begins at $X_1$; up to the changepoint the observations follow the pre-change conditional densities $f(X_n \mid X_1, \dots, X_{n-1})$, and surveillance continues with observations $X_{\nu+1}, X_{\nu+2}, \dots$ that follow the post-change conditional densities $g(X_n \mid X_1, \dots, X_{n-1})$.]
- A change occurs at an unknown point in time $\nu \ge 0$.
- In the general non-iid case the joint distributions change: the conditional pre-change densities $f(X_i \mid X_1, \dots, X_{i-1})$ are replaced by the conditional post-change densities $g(X_i \mid X_1, \dots, X_{i-1})$ for $i = \nu+1, \nu+2, \dots$
- The simplest iid case is where $f(X_i \mid X_1, \dots, X_{i-1}) = f(X_i)$ and $g(X_i \mid X_1, \dots, X_{i-1}) = g(X_i)$, where $f(x)$ is a common pre-change density and $g(x)$ is a common post-change density.
- As long as the behavior of the observations is consistent with the normal state, one is content to let the process continue; if the state changes, one wants to detect the change as quickly as possible. The design of a detection procedure, which is a stopping time $T$ wrt the observed sequence $\{X_n\}_{n \ge 1}$, therefore reduces to optimizing the tradeoff between the delay to detection and the false alarm rate (FAR).
- Numerous applications: anomaly detection, failure detection, surveillance, process control, intrusion detection in information systems, target detection, and finance, to name a few.
10 Illustration of Change Detection
[Figure: three panels versus time $n$. Top: observed data $X_n$, with the pre-change regime (each $X_n \sim f(x)$), the post-change regime (each $X_n \sim g(x)$), the start of surveillance and the changepoint $\nu$. Middle: the detection statistic crosses the detection threshold $A$ before $\nu$ (point of false alarm); the run length to false alarm $T$ is random. Bottom: the detection statistic crosses the threshold $A$ after $\nu$ (detection point); the detection delay $T - \nu$, where $T > \nu$, is random.]
11 Change Detection as a Hypothesis Testing Problem
Change detection is regarded as testing two hypotheses based on the sample $X^n = (X_1, \dots, X_n)$:
$H_\nu$: change at $0 \le \nu < n$, and $H_\infty = H_\nu$ for $\nu \ge n$: no change.
Likelihood ratio (LR) for these hypotheses (based on the data $X^n$):
$\Lambda_n^\nu := \frac{dP(X^n \mid H_\nu)}{dP(X^n \mid H_\infty)} = \frac{\prod_{j=1}^{\nu} f(X_j \mid X^{j-1}) \prod_{j=\nu+1}^{n} g(X_j \mid X^{j-1})}{\prod_{j=1}^{n} f(X_j \mid X^{j-1})} = \prod_{j=\nu+1}^{n} \frac{g(X_j \mid X^{j-1})}{f(X_j \mid X^{j-1})}$
Special iid case:
$\Lambda_n^\nu = \frac{\prod_{j=1}^{\nu} f(X_j) \prod_{j=\nu+1}^{n} g(X_j)}{\prod_{j=1}^{n} f(X_j)} = \prod_{j=\nu+1}^{n} \frac{g(X_j)}{f(X_j)}$
Since the changepoint $\nu$ is unknown, reasonable decision statistics may be built either on the average LR or on the maximal LR (over $\nu$), both of which are discussed below.
12 Optimality Criteria Summary
Four approaches:
- Bayesian (the changepoint is random with a known prior),
- Generalized Bayesian (improper uniform prior),
- Detection of a Change in a Stationary Regime (the change occurs at a far time horizon; a repeated multicyclic procedure is applied), and
- Minimax (the changepoint is unknown and non-random).
[Diagram: Bayesian Approach (random change point) -> uniform improper prior -> Generalized Bayesian Approach <-> (equivalent re-interpretation) Multi-cyclic Detection of Changes in a Stationary Regime; Minimax Approach (unknown non-random change point).]
14 Generalized Bayesian Setting
Assume an improper uniform prior on $\mathbb{Z}_+$: $P(\nu = k) = 1$, $k = 0, 1, \dots$
The risk associated with the detection delay can be measured by the Integral (relative) Average Delay to Detection IADD(T):
$\mathrm{IADD}(T) = \frac{\sum_{k=0}^{\infty} E_k (T-k)^+}{E_\infty T} = \frac{\sum_{k=0}^{\infty} E_k(T - k \mid T > k)\, P_\infty(T > k)}{E_\infty T}$
The risk associated with false alarms is typically measured by the mean time to false alarm, which is referred to as the Average Run Length to False Alarm (ARL2FA):
$\mathrm{ARL2FA}(T) = E_\infty[T]$
Generalized Bayesian Optimality Criterion: minimize the integral ADD $\mathrm{IADD}(T)$ subject to the ARL to false alarm constraint $\mathrm{ARL2FA}(T) \ge \gamma$, $\gamma > 1$, i.e., find the procedure $T_o$ that attains
$\inf_{T \in C_\gamma} \mathrm{IADD}(T)$ for every $\gamma > 1$,
where $C_\gamma = \{T : E_\infty T \ge \gamma\}$ is the class of detection procedures whose ARL2FA is at least the given tolerable level $\gamma$.
15 The Shiryaev-Roberts (SR) Procedure and its Optimality
The SR statistic (average LR over the uniform prior):
$R_n = \sum_{k=1}^{n} \prod_{i=k}^{n} \frac{g(X_i)}{f(X_i)}$, or recursively $R_n = (1 + R_{n-1}) L_n$, $R_0 = 0$, $L_i = \frac{g(X_i)}{f(X_i)}$.
The SR procedure: raise an alarm at $T_{SR}(A) = \inf\{n \ge 1 : R_n \ge A\}$, $A > 0$.
SR Strict Optimality [Pollak & Tartakovsky 09]: Let $A = A_\gamma$ be such that $\mathrm{ARL2FA}(T_{SR}(A_\gamma)) = \gamma$. Then the SR procedure $T_{SR}(A_\gamma)$ is optimal in the generalized Bayesian setting:
$\inf_{T \in C_\gamma} \mathrm{IADD}(T) = \mathrm{IADD}(T_{SR}(A_\gamma))$ for every $\gamma > 1$.
Proof: Either use Shiryaev's Bayes optimality result [Shiryaev 63] and notice that SR is the limiting case, or argue directly from the identity $\mathrm{IADD}(T) = \frac{E_\infty\left[\sum_{n=0}^{T-1} R_n\right]}{E_\infty T} + 1$, which reduces the problem to Markov optimal stopping since the SR statistic is Markov.
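A minimal Python sketch of the SR recursion and stopping rule described on this slide. The function names, and the Gaussian mean-shift model used to produce likelihood ratios, are illustrative assumptions and are not taken from the talk.

```python
import math


def sr_stopping_time(likelihood_ratios, threshold):
    """Run R_n = (1 + R_{n-1}) * L_n with R_0 = 0 and stop when R_n >= threshold.

    Returns (alarm_time, path); alarm_time is None if the data run out
    before the threshold is reached.
    """
    r = 0.0
    path = []
    for n, lr in enumerate(likelihood_ratios, start=1):
        r = (1.0 + r) * lr
        path.append(r)
        if r >= threshold:
            return n, path
    return None, path


def sr_direct(likelihood_ratios):
    """Non-recursive form: R_n = sum over k of prod_{i=k}^{n} L_i."""
    n = len(likelihood_ratios)
    return sum(math.prod(likelihood_ratios[k:n]) for k in range(n))


def gaussian_lr(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """Hypothetical example model: L = g(x)/f(x) for N(mu0, sigma^2) -> N(mu1, sigma^2)."""
    return math.exp((mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2)
```

For the likelihood ratios 2, 0.5, 3 the recursion gives R = 2, 1.5, 7.5, which agrees with the sum-of-products form, so a threshold of 5 is first reached at n = 3.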
17 Detection of Distant Changes with Repeated Multicyclic Procedures
[Figure: two panels versus time $n$. Top: observed data $X_n$, with the pre-change regime (each $X_n \sim f(x)$), the post-change regime (each $X_n \sim g(x)$), the start of surveillance and the changepoint $\nu$. Bottom: the detection statistic is restarted after each crossing of the detection threshold $A$. The run lengths to false alarm $T^{(1)}, T^{(2)}, \dots$ are random, the alarm times are $T_{(1)} = T^{(1)}$, $T_{(2)} = T^{(1)} + T^{(2)}$, ..., $T_{(I_\nu)} = \sum_{j=1}^{I_\nu} T^{(j)}$, and the detection delay is $T_{(I_\nu)} - \nu$, where $T_{(I_\nu)} > \nu$ (random).]
18 Optimality of the Multicyclic SR for Detecting Distant Changes Appearing After Many Reruns
- Low cost of false alarms: it is important to detect a change as quickly as possible, even at the price of raising many false alarms (using a repeated application of the same stopping rule) before the change occurs.
- Multicyclic detection schemes with times between consecutive alarms $T^{(1)}, T^{(2)}, \dots$
- There are $I_\nu - 1$ false detections before the change occurs and the $I_\nu$-th detection is a true detection, so that the time of the true detection is $T_{(I_\nu)} = T^{(1)} + \dots + T^{(I_\nu)}$.
- Stationary ADD: $\mathrm{STADD}(T) = \lim_{\nu \to \infty} E_\nu[T_{(I_\nu)} - \nu] = \lim_{\nu \to \infty} E_\nu[T^{(1)} + \dots + T^{(I_\nu)} - \nu]$
SR Strict Optimality in the iid Case [Pollak & Tartakovsky 09]: If $A = A_\gamma$ is chosen so that $E_\infty[T_{SR}(A)] = \gamma$, then the multicyclic SR procedure minimizes $\mathrm{STADD}(T)$ for every $\gamma > 1$:
$\mathrm{STADD}(T_{SR}(A)) = \inf_{T \in C_\gamma} \mathrm{STADD}(T)$
Proof: Using renewal theory, it can be shown that
$\mathrm{STADD}(T) = \frac{\sum_{\nu=0}^{\infty} E_\nu[(T - \nu)^+]}{E_\infty[T]} = \mathrm{IADD}(T)$
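The multicyclic scheme can be sketched in Python as follows. This is only an illustration under the assumption that the SR statistic is restarted from zero after every alarm; the function names are hypothetical.

```python
def multicyclic_sr_alarms(likelihood_ratios, threshold):
    """Apply the SR stopping rule repeatedly, restarting from R = 0 after each alarm.

    Returns the list of alarm times T_(1) < T_(2) < ... (1-based sample indices).
    """
    alarms = []
    r = 0.0
    for n, lr in enumerate(likelihood_ratios, start=1):
        r = (1.0 + r) * lr
        if r >= threshold:
            alarms.append(n)
            r = 0.0  # renew the statistic and start the next cycle
    return alarms


def detection_delay(alarms, nu):
    """Delay T_(I_nu) - nu of the first alarm strictly after the changepoint nu.

    Returns None when no alarm follows the changepoint.
    """
    for t in alarms:
        if t > nu:
            return t - nu
    return None
```

With likelihood ratios 3, 0.5, 2, 2, 4 and threshold 3, the alarms occur at samples 1, 3 and 5; for a change at nu = 3 the first two alarms are false and the detection delay is 2.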
20 Lorden's Minimax Criterion and the CUSUM Procedure
Essential Supremum Average Detection Delay (ESADD) [Lorden 71]:
$\mathrm{ESADD}(T) = \sup_{0 \le \nu < \infty} \left\{ \operatorname{ess\,sup}\, E_\nu\left[(T - \nu)^+ \mid X_1, \dots, X_\nu\right] \right\}$
Lorden's Minimax Criterion: minimize $\mathrm{ESADD}(T)$ subject to the ARL to false alarm constraint, i.e., find the procedure $T_o$ that attains $\inf_{T \in C_\gamma} \mathrm{ESADD}(T)$ for every $\gamma > 1$ ($C_\gamma = \{T : \mathrm{ARL2FA}(T) \ge \gamma\}$).
CUSUM, the Maximum Likelihood Ratio Test [Page 54]: maximize the LR over $\nu \in [0, n)$, i.e., compute the generalized LR statistic
$V_n = \max_{0 \le \nu < n} \Lambda_n^\nu = \max\{1, V_{n-1}\}\, L_n$, $V_0 = 1$, $L_n = g(X_n)/f(X_n)$,
and stop as soon as it exceeds a threshold $A > 0$: $T_{CS}(A) = \inf\{n \ge 1 : V_n > A\}$.
Exact Optimality of CUSUM in the iid Case [Moustakides 86]: Let $A = A_\gamma$ be selected so that $\mathrm{ARL2FA}(T_{CS}(A_\gamma)) = \gamma$. Then CUSUM is optimal wrt $\mathrm{ESADD}(T)$ in the class $C_\gamma = \{T : \mathrm{ARL2FA}(T) \ge \gamma\}$ for every $\gamma > 1$.
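A short Python sketch of the CUSUM recursion above, together with a brute-force check that it equals the maximal LR over the changepoint. The function names are illustrative assumptions.

```python
import math


def cusum_stopping_time(likelihood_ratios, threshold):
    """Run V_n = max(1, V_{n-1}) * L_n with V_0 = 1 and stop when V_n > threshold.

    Returns (alarm_time, path); alarm_time is None if no alarm is raised.
    """
    v = 1.0
    path = []
    for n, lr in enumerate(likelihood_ratios, start=1):
        v = max(1.0, v) * lr
        path.append(v)
        if v > threshold:
            return n, path
    return None, path


def max_lr_direct(likelihood_ratios):
    """Brute force: max over 0 <= nu < n of prod_{j=nu+1}^{n} L_j."""
    n = len(likelihood_ratios)
    return max(math.prod(likelihood_ratios[nu:n]) for nu in range(n))
```

For the likelihood ratios 0.5, 2, 0.25, 4, 2 the recursion gives V = 0.5, 2, 0.5, 4, 8, and the brute-force maximum over the changepoint is also 8, so a threshold of 5 is exceeded at n = 5.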
21 Pollak's Minimax Criterion and a Modified Shiryaev-Roberts Procedure
Minimize the Supremum Average Detection Delay (SADD) [Pollak 85]:
$\mathrm{SADD}(T) = \sup_{\nu \ge 0} E_\nu[T - \nu \mid T > \nu]$ subject to $\mathrm{ARL2FA}(T) \ge \gamma$.
Shiryaev-Roberts-Pollak (SRP) Procedure [Pollak 85]: start the SR statistic off at a random point $R_0^Q$ distributed according to the quasi-stationary distribution of the SR statistic, $Q_A(x) = \lim_{n \to \infty} P_\infty(R_n \le x \mid T_{SR}(A) > n)$:
$R_n^Q = (1 + R_{n-1}^Q) L_n$, $R_0^Q \sim Q_A$,
and stop as soon as it reaches a threshold $A > 0$: $T_A^Q = \inf\{n \ge 1 : R_n^Q \ge A\}$.
SRP Almost Optimality [Pollak 85]: If $A = A_\gamma$ is selected so that $E_\infty[T_A^Q] = \gamma$, then SRP minimizes $\mathrm{SADD}(T)$ to within $o(1) \to 0$ asymptotically as $\gamma \to \infty$, i.e.,
$\mathrm{SADD}(T_A^Q) - \inf_{T \in C_\gamma} \mathrm{SADD}(T) = o(1)$ as $\gamma \to \infty$.
22 Shiryaev-Roberts-Pollak Procedure
Why quasi-stationary? And why is SRP expected to be optimal? Because starting off at the quasi-stationary distribution makes SRP an equalizer (see Figure):
$E_\nu(T_A^Q - \nu \mid T_A^Q > \nu) = E_0 T_A^Q$ for all $\nu \ge 0$,
so that by general decision theory it may be minimax (but is not necessarily so!).
[Figure: Conditional ADD $E_\nu(T - \nu \mid T > \nu)$ vs. changepoint $\nu$ for SR ($R_0 = 0$) and SRP ($R_0 \sim Q_A(x)$).]
Optimality of SRP was an open problem for two decades. Polunchenko & Tartakovsky [Ann. Stat. 10] showed by way of a counterexample that SRP is not optimal, but that another procedure, discussed later, is.
23 A Novel SR-r Detection Procedure
Idea: Initialize the SR statistic not from zero, but from a specially designed deterministic point $r \in [0, A)$:
$R_n^r = (1 + R_{n-1}^r) L_n$, $n \ge 1$, $R_0^r = r \in [0, A)$; $T_A^r = \inf\{n \ge 1 : R_n^r \ge A\}$
The question now is: how does one choose the starting point $R_0 = r$ so as to outperform the SRP procedure?
To be able to analyze detection procedures, Moustakides, Polunchenko & Tartakovsky developed a framework for performance evaluation that allows one to compute (almost) precisely any performance index of interest (CADD, SADD, ARL2FA, PFA, IADD, quasi-stationary distribution, etc.) by numerically solving a system of Fredholm integral equations of the second kind.
[Figure: CADD $E_k[T - k \mid T > k]$ vs. changepoint $\nu = k$ for different values of $R_0 = r$ (numerics): SR ($r = 0$), SRP ($r$ random), SR-r ($r = r_A$).]
There are starting points $r$ for which SR-r uniformly outperforms SRP.
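As a rough illustration of the headstart idea, the SR-r rule can be sketched in Python as below. The function name and the argument check are assumptions made for this example; how to pick r well is the subject of the following slides and is not addressed by the code.

```python
def sr_r_stopping_time(likelihood_ratios, threshold, headstart=0.0):
    """SR-r rule: R_n = (1 + R_{n-1}) * L_n started from R_0 = headstart in [0, threshold).

    Returns the first n >= 1 with R_n >= threshold, or None if no alarm is raised.
    With headstart = 0 this is the classical SR procedure.
    """
    if not 0.0 <= headstart < threshold:
        raise ValueError("headstart must satisfy 0 <= r < A")
    r = headstart
    for n, lr in enumerate(likelihood_ratios, start=1):
        r = (1.0 + r) * lr
        if r >= threshold:
            return n
    return None
```

Because the recursion is increasing in its starting value, a larger headstart can only make the alarm earlier on the same data: with all likelihood ratios equal to 1 and threshold 3, r = 0 stops at n = 3 while r = 1.5 stops at n = 2.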
24 The Lower Bound and Optimality Issues
Let
$\mathrm{IADD}_r(T) = \frac{r E_0 T + \sum_{\nu=0}^{\infty} E_\nu(T - \nu \mid T > \nu)\, P_\infty(T > \nu)}{r + E_\infty T}$
denote the Integral (relative) ADD, which was considered in the Generalized Bayesian Problem for $r = 0$.
Theorem (Lower Bound for Maximal ADD). Let the threshold $A = A_\gamma$ be selected so that $E_\infty[T_{A_\gamma}^r] = \gamma$.
(i) For any $r \ge 0$, the SR-r procedure minimizes $\mathrm{IADD}_r(T)$ over all procedures with $E_\infty T \ge \gamma$, i.e., $\inf_{T \in C_\gamma} \mathrm{IADD}_r(T) = \mathrm{IADD}_r(T_{A_\gamma}^r)$.
(ii) For every $r \ge 0$,
$\inf_{T \in C_\gamma} \mathrm{SADD}(T) \ge \mathrm{IADD}_r(T_{A_\gamma}^r) = \frac{r E_0[T_{A_\gamma}^r] + \sum_{\nu=0}^{\infty} E_\nu[T_{A_\gamma}^r - \nu \mid T_{A_\gamma}^r > \nu]\, P_\infty(T_{A_\gamma}^r > \nu)}{r + E_\infty[T_{A_\gamma}^r]}$
(iii) Assume that $r = r(\gamma)$ can be chosen so that the SR-r procedure becomes an equalizer, i.e., $E_0[T_{A_\gamma}^r] = E_\nu(T_{A_\gamma}^r - \nu \mid T_{A_\gamma}^r > \nu)$ for all $\nu \ge 0$. Then it is strictly minimax:
$\inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T_{A_\gamma}^r)$.
25 Counterexample Disproving Pollak's Conjecture
Example: Exp(1)-to-Exp(1/θ), i.e., pre-change pdf $f(x) = e^{-x} 1\!\mathrm{l}_{\{x \ge 0\}}$ and post-change pdf $g(x) = \theta e^{-\theta x} 1\!\mathrm{l}_{\{x \ge 0\}}$. Set $\theta = 2$.
Optimality of SR-r [Polunchenko & Tartakovsky 10]: Let the initializing value be chosen as $r_A = \sqrt{1 + A} - 1$ and the threshold $A = A_\gamma$ be selected from the transcendental equation
$A + (\gamma - 1)\sqrt{1 + A}\, \log(1 + A) - 2(\gamma - 1)\sqrt{1 + A} = 0$.
Then for every $\gamma < \gamma_0 = (1 - 0.5 \log 3)^{-1}$, the ARL to false alarm is $E_\infty[T_{A_\gamma}^{r_{A_\gamma}}] = \gamma$ and the SR-r procedure is minimax, while SRP is not.
[Figure: Operating characteristics of SRP vs. SR-r: $\sup_k E_k[T - k \mid T > k]$ vs. $E_\infty[T]$ for Shiryaev-Roberts-$r_A$ and Shiryaev-Roberts-Pollak.]
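The threshold in this example has no closed form, so it has to be found numerically. The sketch below solves the equation by bisection, assuming it reads A + (γ - 1)·sqrt(1 + A)·(log(1 + A) - 2) = 0 with headstart r_A = sqrt(1 + A) - 1 (our reading of the slide's formulas). The search bracket [0, 2] and the function names are assumptions made for this illustration.

```python
import math


def threshold_equation(a, gamma):
    """Left-hand side A + (gamma - 1) * sqrt(1 + A) * (log(1 + A) - 2)."""
    return a + (gamma - 1.0) * math.sqrt(1.0 + a) * (math.log(1.0 + a) - 2.0)


def solve_threshold(gamma, lo=0.0, hi=2.0, tol=1e-12):
    """Bisection for the root A_gamma in [lo, hi] (the bracket is an assumption)."""
    f_lo = threshold_equation(lo, gamma)
    f_hi = threshold_equation(hi, gamma)
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if threshold_equation(mid, gamma) * f_lo > 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of (or at) mid
    return 0.5 * (lo + hi)


def headstart(a):
    """Initializing value r_A = sqrt(1 + A) - 1."""
    return math.sqrt(1.0 + a) - 1.0
```

For example, `solve_threshold(2.0)` returns a threshold strictly between 0 and 2, and `headstart` of that threshold is a valid starting point in [0, A).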
26 Near Minimaxity of SR-r
Notation:
- $Z_i = \log L_i$: log-likelihood ratio for the $i$-th observation; $S_n = Z_1 + \dots + Z_n$;
- $\tau_a = \inf\{n : S_n \ge a\}$; $\kappa_a = S_{\tau_a} - a$ (overshoot); $\zeta = \lim_{a \to \infty} E_0[e^{-\kappa_a}]$, $\kappa = \lim_{a \to \infty} E_0 \kappa_a$;
- $I = E_0 Z_1$: Kullback-Leibler information number; $V = \sum_{j=1}^{\infty} e^{-S_j}$;
- $R_\infty$: limiting value of the SR statistic $R_n$ (distributed according to the stationary distribution);
- $C_\infty = E[\log(1 + R_\infty + V)]$; $C_r = E[\log(1 + r + V)]$;
- $\mathrm{ADD}_\infty(T) = \lim_{\nu \to \infty} E_\nu(T - \nu \mid T > \nu)$.
Tartakovsky, Pollak, & Polunchenko [Probability Theory & its Applications 11]
Theorem (Near Optimality and Asymptotic Approximations). Let $E_0 |Z_1|^2 < \infty$ and let $Z_1$ be non-arithmetic.
(i) If in the SRP procedure $A = A_\gamma = \gamma\zeta$, then $E_\infty T_A^Q = \gamma(1 + o(1))$ and, as $\gamma \to \infty$,
$\mathrm{SADD}(T_A^Q) = I^{-1}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1)$.
(ii) If in the SR-r procedure $A = A_\gamma = \gamma\zeta$, and the initialization point $r$ is either fixed or tends to infinity at the rate $o(\gamma)$ and is selected so that $\mathrm{SADD}(T_A^r) = \mathrm{ADD}_\infty(T_A^r)$, then $E_\infty T_A^r = \gamma(1 + o(1))$ and, as $\gamma \to \infty$,
$\mathrm{SADD}(T_A^r) = I^{-1}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1)$.
Therefore, both procedures are asymptotically third-order optimal:
$\inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T_A^Q) + o(1)$, $\inf_{T \in C_\gamma} \mathrm{SADD}(T) = \mathrm{SADD}(T_A^r) + o(1)$.
27 Ideas Behind the Proof and Efficient Choice of the Headstart
Lemma.
(i) If $E_0 |Z_1|^2 < \infty$ and $Z_1$ is non-arithmetic, then as $A, \gamma \to \infty$
$\mathrm{IADD}_0(T_A) = \frac{1}{I}(\log A + \kappa - C_\infty) + o(1)$, $\inf_{T \in C_\gamma} \mathrm{SADD}(T) \ge \frac{1}{I}[\log(\gamma\zeta) + \kappa - C_\infty] + o(1)$.
(ii) If $E_0 |Z_1|^2 < \infty$ and $Z_1$ is non-arithmetic, then for any $r \ge 0$ as $A \to \infty$
$\mathrm{ADD}_\infty(T_A^r) = E_0 T_A^Q = \frac{1}{I}(\log A + \kappa - C_\infty) + o(1)$,
$E_0 T_A^r = \frac{1}{I}(\log A + \kappa - C_r) + o(1)$.
How to select $r$? Equate the ADD at the beginning and at infinity, so that the procedure is almost an equalizer: $E_0 T_A^r = \mathrm{ADD}_\infty(T_A^r)$. This gives the equation for an optimal $r$: $C_\infty = C_r$.
28 Comparison of Detection Procedures: Beta-to-Beta Model
Beta-to-Beta Model: $f(x) = \frac{x^{\delta-1}(1-x)^{\delta}}{B(\delta, \delta+1)}$; $g(x) = \frac{x^{\delta}(1-x)^{\delta-1}}{B(\delta+1, \delta)}$, $0 < x < 1$, where $\delta > 0$ is a given constant and $B(\cdot, \cdot)$ is the Beta function.
If $\delta = 1$, then $C_\infty = \pi^2/6$, and for $r = 1.98$ we have $C(r = 1.98) = C_\infty = 1.64$, so that $\mathrm{ADD}_\infty(T_A^r) \approx \mathrm{ADD}_0(T_A^r)$ (almost equalizer).
[Figure: CADD $E_\nu(T_A^r - \nu \mid T_A^r > \nu)$ versus changepoint $\nu$ for SRP ($r$ random) and SR-r ($r = r^* = 1.98$).]
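As a small worked step (not shown on the slide), the likelihood ratio for the Beta-to-Beta model follows directly from the two densities, using the symmetry $B(\delta, \delta+1) = B(\delta+1, \delta)$ of the Beta function:

```latex
\[
L(x) = \frac{g(x)}{f(x)}
     = \frac{x^{\delta}(1-x)^{\delta-1} / B(\delta+1,\delta)}
            {x^{\delta-1}(1-x)^{\delta} / B(\delta,\delta+1)}
     = \frac{x}{1-x}, \qquad 0 < x < 1 .
\]
```

So the per-observation likelihood ratio fed into the SR, SRP and SR-r recursions does not depend on $\delta$ in this model; $\delta$ enters only through the distribution of the observations.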
29 Comparison of Detection Procedures: Gaussian Model
Gaussian Model: $N(\mu, a\mu)$-to-$N(\theta, a\theta)$ model, i.e., a change in the mean from $\mu$ to $\theta$ in which the variance $\sigma^2$ also changes (the ratio of variance to mean is a constant $a$). When $a = 1$ this is an approximation to a Poisson model.
[Figure: Conditional average detection delay $\mathrm{ADD}_\nu(T) = E_\nu[T - \nu \mid T > \nu]$ versus the changepoint $\nu$ for the four detection procedures SR ($R_0 = 0$), CUSUM, SRP ($R_0^Q \sim Q_A$) and SR-r ($R_0^r = r^*$), for $\mu = 1000$, $\theta = 1001$, $a = 0.01$, $\gamma = 10^4$.]
Conclusions:
(1) CUSUM is superior to SR for changes occurring either immediately or in the near to mid-term future, and SR is superior for changes occurring at a far horizon;
(2) SRP and SR-r have almost the same performance, while CUSUM is inferior.
31 Unknown Post-change Parameter: $g(X_n) = g_\theta(X_n)$, $\theta \in \Theta$
Two conventional approaches:
1 GLR: maximize over the unknown parameter (usually applied to CUSUM), i.e.,
$T_{GCS}(h) = \inf\{n : \max_{1 \le k \le n} \sup_{\theta} w(\theta)\lambda_n^k(\theta) \ge h\}$
2 Mixtures (Weighting): average over some prior (usually applied to SR),
$T_{WSR}(A) = \inf\left\{n : \sum_{k=1}^{n} \int_\Theta \Lambda_n^k(\theta) w(\theta)\, d\theta \ge A\right\}$
Both procedures are first-order asymptotically minimax wrt Pollak's maximal ADD, since (by the above theorem)
$\inf_{T \in C_\gamma} \mathrm{SADD}_\theta(T) = I_\theta^{-1} \log\gamma + O(1)$, $\gamma \to \infty$, $I_\theta = \int \log\left(\frac{g_\theta(x)}{f(x)}\right) g_\theta(x)\, \lambda(dx)$,
and (for the $l$-dimensional exponential family, $\log\frac{f_\theta(x)}{f_0(x)} = \theta^\top x - b(\theta)$) if $A = \gamma \int_\Theta \zeta_\theta w(\theta)\, d\theta$, then $E_\infty T_{WSR}(A) \sim \gamma$ and
$\mathrm{SADD}_\theta(T_{WSR}) = \frac{1}{I_\theta}\left\{\log\gamma + \frac{l}{2}\log\log\gamma - \frac{l}{2}[1 + \log(2\pi)] + \log\left(\int_\Theta \zeta_t w(t)\, dt\right) - \log\left(w(\theta) e^{-\kappa_\theta + \mu_\theta}\sqrt{I_\theta^{\,l} / \det[\nabla^2 b(\theta)]}\right)\right\} + o(1)$, as $\gamma \to \infty$.
Thus, $\mathrm{SADD}_\theta(T_{WSR}) - \inf_{T \in C_\gamma} \mathrm{SADD}_\theta(T) = O\left(\frac{l}{2 I_\theta} \log\log\gamma\right)$ for all $\theta$.
32 Unknown Post-change Parameter (Cont'd)
Question: Is it possible to optimize further by choosing a specific weight $w(\theta)$ to obtain higher-order optimality?
Expected Kullback-Leibler (K-L) Information:
$\mathrm{KL}_{\nu,\theta}(T) := E_{\nu,\theta}(\lambda_T^\theta - \lambda_\nu^\theta \mid T > \nu) = I_\theta\, E_{\nu,\theta}(T - \nu \mid T > \nu)$,
where $\lambda_n^\theta = \sum_{k=1}^{n} \log\frac{p_\theta(X_k)}{p_{\theta_0}(X_k)}$ is the log-likelihood ratio and $I_\theta = E_{0,\theta}\lambda_1^\theta = \theta^\top \nabla b(\theta) - b(\theta)$ is the K-L information number.
Expected Maximal K-L Information (over both $\nu$ and $\theta$):
$\sup_{\theta \in \Theta} \sup_{\nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) = \sup_{\theta \in \Theta}\left[I_\theta \sup_{\nu \ge 0} E_{\nu,\theta}(T - \nu \mid T > \nu)\right] = \sup_{\theta \in \Theta}[I_\theta\, \mathrm{SADD}_\theta(T)]$.
Find a nearly minimax procedure:
$\sup_{\theta \in \Theta, \nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) = \inf_{T \in C_\gamma} \sup_{\theta \in \Theta, \nu \ge 0} \mathrm{KL}_{\nu,\theta}(T) + o(1)$ as $\gamma \to \infty$.
33 Unknown Post-change Parameter (Cont'd)
Asymptotic Lower Bound [Siegmund & Yakir 08]:
$\inf_{T \in C_\gamma} \sup_{\theta,\nu} \mathrm{KL}_{\nu,\theta}(T) \ge \log\gamma + \frac{l}{2}\log\log\gamma - \frac{l}{2}[1 + \log(2\pi)] + C_{opt} + o(1)$,
$C_{opt} = \log\left(\int_\Theta \zeta_t e^{\kappa_t - \beta_t}\sqrt{\det[\nabla^2 b(t)] / I_t^{\,l}}\, dt\right)$
Weight Selection for WSR: $w(\theta) \propto e^{\kappa_\theta - \mu_\theta}\sqrt{\det[\nabla^2 b(\theta)] / I_\theta^{\,l}}$, which turns WSR into an equalizer with constant
$C_{WSR} = \log\left(\int_\Theta \zeta_t e^{\kappa_t - \mu_t}\sqrt{\det[\nabla^2 b(t)] / I_t^{\,l}}\, dt\right)$
Asymptotic Risk Regret:
$\sup_{\theta,\nu} \mathrm{KL}_{\nu,\theta}(T_{WSR}) - \inf_{T} \sup_{\theta,\nu} \mathrm{KL}_{\nu,\theta}(T) = O(1)$ $(\approx C_{WSR} - C_{opt})$
Open Problem: Show that the SR-r mixture with a specially designed mixing distribution and headstart is almost (third-order) optimal, i.e., attains the above lower bound with the same constant to within $o(1)$.
35 Rapid Detection of Intrusions in Computer Networks
As a rule, computer intrusions lead to (relatively) abrupt changes in network traffic, so changepoint detection methods can be used for rapid detection of attacks with a given false alarm rate (FAR). Examples:
- Distributed Denial-of-Service (DDoS) attacks
- Unauthorized break-ins
- Spam campaigns
However, the behavior of both pre- and post-attack traffic is poorly understood, and typically neither the pre-change nor the post-change distribution is known. As a result, alternative score-based (not strictly optimal) methods should be used:
$R_n^{sc} = (1 + R_{n-1}^{sc}) e^{S_n}$, $W_n^{sc} = \max(0, W_{n-1}^{sc} + S_n)$, where $S_n$ is a score sensitive to a change.
The linear-quadratic memoryless score $S_n(X_n) = C_1 X_n + C_2 X_n^2 - C_3$ is efficient for detecting changes in the mean and variance.
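A minimal Python sketch of the two score-based recursions with the linear-quadratic score. The function names are hypothetical, and the coefficients C1, C2, C3 are free design parameters whose choice is not specified here; the values used below are purely illustrative.

```python
import math


def linear_quadratic_score(x, c1, c2, c3):
    """Memoryless score S_n(X_n) = C1 * X_n + C2 * X_n^2 - C3."""
    return c1 * x + c2 * x * x - c3


def score_based_statistics(observations, c1, c2, c3):
    """Return the paths of the score-based SR and CUSUM statistics.

    SR:    R_n = (1 + R_{n-1}) * exp(S_n),  R_0 = 0
    CUSUM: W_n = max(0, W_{n-1} + S_n),     W_0 = 0
    """
    r, w = 0.0, 0.0
    sr_path, cusum_path = [], []
    for x in observations:
        s = linear_quadratic_score(x, c1, c2, c3)
        r = (1.0 + r) * math.exp(s)
        w = max(0.0, w + s)
        sr_path.append(r)
        cusum_path.append(w)
    return sr_path, cusum_path
```

With the illustrative choice C1 = 1, C2 = 0, C3 = 1 and observations 0, 1, 2, the scores are -1, 0, 1, so the CUSUM path is 0, 0, 1 while the SR path starts at exp(-1) and grows from there. An alarm would be raised when either path crosses its own threshold.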
36 Example 1: Detection of Spammers
Linear score-based CUSUM and SR-type change detection algorithms applied to the detection of spammers:
[Figure: Spammer detection with the CUSUM (middle) and SR (bottom) procedures. Top panel: packet rate (packets/sec) vs. time (sec); middle panel: CUSUM statistic $W_n$ vs. $n$, sec (sample); bottom panel: Shiryaev-Roberts statistic $R_n$ vs. $n$, sec (sample). Green line: EWMA estimate of the mean; red: detection thresholds. Marked events: spam message sent, spammer detected.]
Detection is fast with both algorithms, but CUSUM triggers several false alarms prior to the attack, while SR does not.
37 Example 2: A TCP SYN Flood DDoS Attack
TCP SYN flood attack on a University of Michigan IRC server: it starts at 550 seconds into the trace and lasts for 10 minutes. While the attack is visible to the naked eye, it is not completely clear when it starts; there is a fluctuation (a spike) in the data before the attack.
[Figure: The connections birth rate (the number of attempted connections) vs. time (seconds), with the change point marked.]
The estimated values of the mean and standard deviation of the connections birth rate are $\mu \approx 1669$, $\sigma \approx 114$ for legitimate traffic and $\mu \approx 1888$, $\sigma \approx 218$ for attack traffic (connections per 20 msec). Therefore, this attack leads to a considerable increase in both the mean and the standard deviation of the connections birth rate.
38 Example 2: A TCP SYN Flood DDoS Attack (Cont'd)
From left to right: a long run of the SR statistic with several false alarms and then the true detection of the attack with a very small detection delay (at the expense of raising many false alarms prior to the correct detection); the SR and CUSUM score-based statistics shortly prior to the attack and right after the attack starts, until detection.
[Figure: Detection of the SYN flood attack by the multi-cyclic SR and CUSUM procedures, vs. time (seconds). (a) Long run of the multi-cyclic SR statistic: log Shiryaev-Roberts statistic $\log(R)$, log SR threshold, false alarms. (b) Multi-cyclic SR procedure: $\log(R)$, log SR threshold, change point, SR detection. (c) Multi-cyclic CUSUM procedure: CUSUM statistic $W$, CUSUM threshold, change point, CUSUM detection.]
The detection delay is approximately 0.14 seconds (7 samples) for the repeated SR procedure and about 0.21 seconds (10 samples) for the CUSUM procedure. As expected, the SR procedure is slightly better.
39 Example 3: A UDP DDoS Attack
The picture shows the efficiency of the linear-quadratic score-based change detection algorithm with false alarm filtering by a spectral analyzer (hybrid anomaly-spectral IDS) in detecting the UDP DDoS attack:
[Figure]
40 Movie: Detection of the UDP DDoS Attacks
The movie shows the hybrid anomaly-spectral IDS in action:
[Movie]
42 Acknowledgements
The research topics discussed in my talk have been supported by various US agencies (e.g., NSF, ONR, ARO, AFOSR, DARPA, DOE) over the last several years, and are currently being supported by:
1 The U.S. National Science Foundation under grant DMS
2 The U.S. Army Research Office under grant W911NF
3 The U.S. Air Force Office of Scientific Research under MURI grant FA
4 The U.S. Defense Threat Reduction Agency under grant HDTRA
5 The U.S. Defense Advanced Research Projects Agency under grant W911NF
43 THE END
THANK YOU! Questions?
Proceedings of the 2013 Joint Statistical Meetings (JSM-2013) Montréal, Québec, Canada, 3 8 August 2013 Quickest Change-Point Detection: A Bird s Eye View Aleksey S. Polunchenko Grigory Sokolov Wenyu Du
More informationChange-point models and performance measures for sequential change detection
Change-point models and performance measures for sequential change detection Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Greece moustaki@upatras.gr George V. Moustakides
More informationCOMPARISON OF STATISTICAL ALGORITHMS FOR POWER SYSTEM LINE OUTAGE DETECTION
COMPARISON OF STATISTICAL ALGORITHMS FOR POWER SYSTEM LINE OUTAGE DETECTION Georgios Rovatsos*, Xichen Jiang*, Alejandro D. Domínguez-García, and Venugopal V. Veeravalli Department of Electrical and Computer
More informationarxiv:math/ v2 [math.st] 15 May 2006
The Annals of Statistics 2006, Vol. 34, No. 1, 92 122 DOI: 10.1214/009053605000000859 c Institute of Mathematical Statistics, 2006 arxiv:math/0605322v2 [math.st] 15 May 2006 SEQUENTIAL CHANGE-POINT DETECTION
More informationX 1,n. X L, n S L S 1. Fusion Center. Final Decision. Information Bounds and Quickest Change Detection in Decentralized Decision Systems
1 Information Bounds and Quickest Change Detection in Decentralized Decision Systems X 1,n X L, n Yajun Mei Abstract The quickest change detection problem is studied in decentralized decision systems,
More informationSequential Change-Point Approach for Online Community Detection
Sequential Change-Point Approach for Online Community Detection Yao Xie Joint work with David Marangoni-Simonsen H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology
More informationSCALABLE ROBUST MONITORING OF LARGE-SCALE DATA STREAMS. By Ruizhi Zhang and Yajun Mei Georgia Institute of Technology
Submitted to the Annals of Statistics SCALABLE ROBUST MONITORING OF LARGE-SCALE DATA STREAMS By Ruizhi Zhang and Yajun Mei Georgia Institute of Technology Online monitoring large-scale data streams has
More informationEXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES
EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES CHAO AN-KUO NATIONAL UNIVERSITY OF SINGAPORE 2016 EXTENSIONS TO THE EXPONENTIALLY WEIGHTED MOVING AVERAGE PROCEDURES CHAO AN-KUO (MS,
More informationPrompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014
Prompt Network Anomaly Detection using SSA-Based Change-Point Detection Hao Chen 3/7/2014 Network Anomaly Detection Network Intrusion Detection (NID) Signature-based detection Detect known attacks Use
More informationQuickest Detection With Post-Change Distribution Uncertainty
Quickest Detection With Post-Change Distribution Uncertainty Heng Yang City University of New York, Graduate Center Olympia Hadjiliadis City University of New York, Brooklyn College and Graduate Center
More informationarxiv: v2 [eess.sp] 20 Nov 2017
Distributed Change Detection Based on Average Consensus Qinghua Liu and Yao Xie November, 2017 arxiv:1710.10378v2 [eess.sp] 20 Nov 2017 Abstract Distributed change-point detection has been a fundamental
More informationSimultaneous and sequential detection of multiple interacting change points
Simultaneous and sequential detection of multiple interacting change points Long Nguyen Department of Statistics University of Michigan Joint work with Ram Rajagopal (Stanford University) 1 Introduction