Universal Outlier Hypothesis Testing

Size: px

Start display at page:

Download "Universal Outlier Hypothesis Testing"

Sharon Gallagher
5 years ago
Views:

1 Universal Outlier Hypothesis Testing Venu Veeravalli ECE Dept & CSL & ITI University of Illinois at Urbana-Champaign (with Sirin Nitinawarat and Yun Li) NSF Workshop on Signal Processing for Big Data March 22, 2013

2 Statistical Outlier Detection Single sequence of observations Normal observations follow some fixed (possibly unknown) distribution or generating mechanism Outliers follow different generating mechanism Goal: To find outliers efficiently Applications: fraud detection, public health monitoring, cleaning up sensor data, etc. 2

3 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Amount 30$ 35$ 75$ Normal behavior 3

4 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Spa Women apparels Cosmetics Amount 30$ 35$ 75$ 250$ 1500$ 500$ Normal behavior Abnormal behavior 4

5 Fraud Detection: Group Monitoring Male graduate students Student 1 Student 2 Student 3 Student M Grocery Dining Grocery Gas Dining Grocery Gas Books Books Movie Books Grocery Movie Books Spa Dining Grocery Gas Women apparel Movie Gas Books Cosmetics Grocery 5

6 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions 6

7 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions Special case: o Exactly one sequence is generated from outlier distribution o Goal: to detect outlier sequence efficiently o Universal setting: neither typical nor outlier distributions known; no training data provided 7

8 Universal Outlier Hypothesis Testing Typical distribution Outlier distribution π µ H M µ π π π π H 2 π µ π π π π π µ π π π π π µ π H M π π π π µ 8

9 Mathematical Model y 1 (1) y 2 (1) y 1 (2) y 2 (2) y 1 (M ) y 2 (M ) y n (1) y n (2) y n (M ) ( ) = µ ( (i ) y ) k π( ( j ) y ) k H i : p i y (1),, y (M ) y (1) y (2) y (M ) n k = 1 j i 9

10 Hypothesis Testing Problem H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i Nothing is known about (µ,π) except that they are distinct Universal Detector: δ : Y Mn { 1,, M} Independent of (µ,π ) 10

11 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) 11

12 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) Consistency : e 0 as n Exponential Consistency : α > 0 12

13 Background: Binary Hypothesis Testing H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If (µ,π) known, δ ML (y) = argmax logp i (y) i has α(δ,(µ,π)) = C(µ,π) > 0 exponential consistency Chernoff Info :C(µ,π) = max log µ (y ) s π(y ) 1 s 0 s 1 y 13

14 Outlier Hypothesis testing: known (µ,π) H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i ML Rule : δ ML (y Mn ) = argmax i logp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ ML,(µ,π)) = 2B(µ,π) Bhattacharya Distance : B(µ,π) = log µ (y ) 1/2 π(y ) 1/2 y 14

15 Binary Hypothesis Testing: Unknown µ H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If µ unknown for any given δ there existsµ s.t. α = 0 No exponential consistency! 15

16 Outlier Hypothesis Testing: unknown µ µ uknown : ˆµ i = γ i empirical Distribution H i : ˆp i y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k π( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ GL,(µ,π)) = 2B(µ,π) Same as known µ,π 16

17 Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is 17

18 Key Tool: Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is lim n 1 n logp { n Empirical( Y ) E} = min q E D ( q p) E q * D ( q * p) p 18

19 Proposed Universal Test ( µ,π) not known : ˆµ i = γ i ˆπ i = 1 M 1 H i : ˆpi y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k ˆπ ( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆpi (y Mn ) H 1 H 2 H M j i γ j Empirical distributions 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ 19

20 Performance of Universal Test α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 20

21 Performance of Universal Test Universally exponential consistency! α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M > 0, ( µ,π) 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 21

22 Asymptotic Optimality Motivation: When only π is known, optimal error exponent is 2B µ, π ( ) Estimate of π satisfies lim n 1 M M γ i = 1 i =1 M µ + M 1 M π, lim M 1 M µ + M 1 M π = π 22

23 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) 23

24 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) This lower bound is non-decreasing in and converges to 2B(µ, π) as M M 3, 24

25 Numerical Results Lower bound of the error exponent 2 B (µ, π), µ=(0.36, 0.64) π=(0.64, 0.36) log (M), where M is the number M of the coordinates 25

26 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? 26

27 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent 27

28 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent o If number of outliers is not known a priori, then universally exponentially does not exist! 28

Universal Outlier Hypothesis Testing

1 Universal Outlier Hypothesis Testing Yun Li, Student Member, IEEE, Sirin Nitinawarat, Member, IEEE, and Venugopal V. Veeravalli, Fellow, IEEE Abstract arxiv:submit/0844243 [cs.it] 11 Nov 2013 The following