Universal Outlier Hypothesis Testing

Size: px
Start display at page:

Download "Universal Outlier Hypothesis Testing"

Transcription

1 Universal Outlier Hypothesis Testing Venu Veeravalli ECE Dept & CSL & ITI University of Illinois at Urbana-Champaign (with Sirin Nitinawarat and Yun Li) NSF Workshop on Signal Processing for Big Data March 22, 2013

2 Statistical Outlier Detection Single sequence of observations Normal observations follow some fixed (possibly unknown) distribution or generating mechanism Outliers follow different generating mechanism Goal: To find outliers efficiently Applications: fraud detection, public health monitoring, cleaning up sensor data, etc. 2

3 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Amount 30$ 35$ 75$ Normal behavior 3

4 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Spa Women apparels Cosmetics Amount 30$ 35$ 75$ 250$ 1500$ 500$ Normal behavior Abnormal behavior 4

5 Fraud Detection: Group Monitoring Male graduate students Student 1 Student 2 Student 3 Student M Grocery Dining Grocery Gas Dining Grocery Gas Books Books Movie Books Grocery Movie Books Spa Dining Grocery Gas Women apparel Movie Gas Books Cosmetics Grocery 5

6 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions 6

7 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions Special case: o Exactly one sequence is generated from outlier distribution o Goal: to detect outlier sequence efficiently o Universal setting: neither typical nor outlier distributions known; no training data provided 7

8 Universal Outlier Hypothesis Testing Typical distribution Outlier distribution π µ H M µ π π π π H 2 π µ π π π π π µ π π π π π µ π H M π π π π µ 8

9 Mathematical Model y 1 (1) y 2 (1) y 1 (2) y 2 (2) y 1 (M ) y 2 (M ) y n (1) y n (2) y n (M ) ( ) = µ ( (i ) y ) k π( ( j ) y ) k H i : p i y (1),, y (M ) y (1) y (2) y (M ) n k = 1 j i 9

10 Hypothesis Testing Problem H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i Nothing is known about (µ,π) except that they are distinct Universal Detector: δ : Y Mn { 1,, M} Independent of (µ,π ) 10

11 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) 11

12 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) Consistency : e 0 as n Exponential Consistency : α > 0 12

13 Background: Binary Hypothesis Testing H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If (µ,π) known, δ ML (y) = argmax logp i (y) i has α(δ,(µ,π)) = C(µ,π) > 0 exponential consistency Chernoff Info :C(µ,π) = max log µ (y ) s π(y ) 1 s 0 s 1 y 13

14 Outlier Hypothesis testing: known (µ,π) H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i ML Rule : δ ML (y Mn ) = argmax i logp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ ML,(µ,π)) = 2B(µ,π) Bhattacharya Distance : B(µ,π) = log µ (y ) 1/2 π(y ) 1/2 y 14

15 Binary Hypothesis Testing: Unknown µ H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If µ unknown for any given δ there existsµ s.t. α = 0 No exponential consistency! 15

16 Outlier Hypothesis Testing: unknown µ µ uknown : ˆµ i = γ i empirical Distribution H i : ˆp i y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k π( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ GL,(µ,π)) = 2B(µ,π) Same as known µ,π 16

17 Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is 17

18 Key Tool: Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is lim n 1 n logp { n Empirical( Y ) E} = min q E D ( q p) E q * D ( q * p) p 18

19 Proposed Universal Test ( µ,π) not known : ˆµ i = γ i ˆπ i = 1 M 1 H i : ˆpi y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k ˆπ ( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆpi (y Mn ) H 1 H 2 H M j i γ j Empirical distributions 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ 19

20 Performance of Universal Test α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 20

21 Performance of Universal Test Universally exponential consistency! α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M > 0, ( µ,π) 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 21

22 Asymptotic Optimality Motivation: When only π is known, optimal error exponent is 2B µ, π ( ) Estimate of π satisfies lim n 1 M M γ i = 1 i =1 M µ + M 1 M π, lim M 1 M µ + M 1 M π = π 22

23 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) 23

24 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) This lower bound is non-decreasing in and converges to 2B(µ, π) as M M 3, 24

25 Numerical Results Lower bound of the error exponent 2 B (µ, π), µ=(0.36, 0.64) π=(0.64, 0.36) log (M), where M is the number M of the coordinates 25

26 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? 26

27 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent 27

28 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent o If number of outliers is not known a priori, then universally exponentially does not exist! 28

Universal Outlier Hypothesis Testing

Universal Outlier Hypothesis Testing 1 Universal Outlier Hypothesis Testing Yun Li, Student Member, IEEE, Sirin Nitinawarat, Member, IEEE, and Venugopal V. Veeravalli, Fellow, IEEE Abstract arxiv:submit/0844243 [cs.it] 11 Nov 2013 The following

More information

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department Decision-Making under Statistical Uncertainty Jayakrishnan Unnikrishnan PhD Defense ECE Department University of Illinois at Urbana-Champaign CSL 141 12 June 2010 Statistical Decision-Making Relevant in

More information

Machine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang

Machine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang Machine Learning Basics Lecture 2: Linear Classification Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d.

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 7: Information Theory Cosma Shalizi 3 February 2009 Entropy and Information Measuring randomness and dependence in bits The connection to statistics Long-run

More information

Finding the best mismatched detector for channel coding and hypothesis testing

Finding the best mismatched detector for channel coding and hypothesis testing Finding the best mismatched detector for channel coding and hypothesis testing Sean Meyn Department of Electrical and Computer Engineering University of Illinois and the Coordinated Science Laboratory

More information

Quickest Anomaly Detection: A Case of Active Hypothesis Testing

Quickest Anomaly Detection: A Case of Active Hypothesis Testing Quickest Anomaly Detection: A Case of Active Hypothesis Testing Kobi Cohen, Qing Zhao Department of Electrical Computer Engineering, University of California, Davis, CA 95616 {yscohen, qzhao}@ucdavis.edu

More information

Data-Efficient Quickest Change Detection

Data-Efficient Quickest Change Detection Data-Efficient Quickest Change Detection Venu Veeravalli ECE Department & Coordinated Science Lab University of Illinois at Urbana-Champaign http://www.ifp.illinois.edu/~vvv (joint work with Taposh Banerjee)

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

The information complexity of sequential resource allocation

The information complexity of sequential resource allocation The information complexity of sequential resource allocation Emilie Kaufmann, joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishan SMILE Seminar, ENS, June 8th, 205 Sequential allocation

More information

Finding a Needle in a Haystack: Conditions for Reliable. Detection in the Presence of Clutter

Finding a Needle in a Haystack: Conditions for Reliable. Detection in the Presence of Clutter Finding a eedle in a Haystack: Conditions for Reliable Detection in the Presence of Clutter Bruno Jedynak and Damianos Karakos October 23, 2006 Abstract We study conditions for the detection of an -length

More information

Necessary and Sufficient Conditions for High-Dimensional Salient Feature Subset Recovery

Necessary and Sufficient Conditions for High-Dimensional Salient Feature Subset Recovery Necessary and Sufficient Conditions for High-Dimensional Salient Feature Subset Recovery Vincent Tan, Matt Johnson, Alan S. Willsky Stochastic Systems Group, Laboratory for Information and Decision Systems,

More information

The Method of Types and Its Application to Information Hiding

The Method of Types and Its Application to Information Hiding The Method of Types and Its Application to Information Hiding Pierre Moulin University of Illinois at Urbana-Champaign www.ifp.uiuc.edu/ moulin/talks/eusipco05-slides.pdf EUSIPCO Antalya, September 7,

More information

Decentralized Detection in Sensor Networks

Decentralized Detection in Sensor Networks IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL 51, NO 2, FEBRUARY 2003 407 Decentralized Detection in Sensor Networks Jean-François Chamberland, Student Member, IEEE, and Venugopal V Veeravalli, Senior Member,

More information

Entropy, Inference, and Channel Coding

Entropy, Inference, and Channel Coding Entropy, Inference, and Channel Coding Sean Meyn Department of Electrical and Computer Engineering University of Illinois and the Coordinated Science Laboratory NSF support: ECS 02-17836, ITR 00-85929

More information

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 57, NO 3, MARCH 2011 1587 Universal and Composite Hypothesis Testing via Mismatched Divergence Jayakrishnan Unnikrishnan, Member, IEEE, Dayu Huang, Student

More information

Density Estimation: ML, MAP, Bayesian estimation

Density Estimation: ML, MAP, Bayesian estimation Density Estimation: ML, MAP, Bayesian estimation CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Maximum-Likelihood Estimation Maximum

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

Applications of Information Geometry to Hypothesis Testing and Signal Detection

Applications of Information Geometry to Hypothesis Testing and Signal Detection CMCAA 2016 Applications of Information Geometry to Hypothesis Testing and Signal Detection Yongqiang Cheng National University of Defense Technology July 2016 Outline 1. Principles of Information Geometry

More information

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/40 Acknowledgement Praneeth Boda Himanshu Tyagi Shun Watanabe 3/40 Outline Two-terminal model: Mutual

More information

INFORMATION THEORY AND STATISTICS

INFORMATION THEORY AND STATISTICS CHAPTER INFORMATION THEORY AND STATISTICS We now explore the relationship between information theory and statistics. We begin by describing the method of types, which is a powerful technique in large deviation

More information

Bandit models: a tutorial

Bandit models: a tutorial Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses

More information

Censoring for Type-Based Multiple Access Scheme in Wireless Sensor Networks

Censoring for Type-Based Multiple Access Scheme in Wireless Sensor Networks Censoring for Type-Based Multiple Access Scheme in Wireless Sensor Networks Mohammed Karmoose Electrical Engineering Department Alexandria University Alexandria 1544, Egypt Email: mhkarmoose@ieeeorg Karim

More information

ECE 695C, Midterm #1 6:30-7:30pm Tuesday, September 29, 2009, ME 312,

ECE 695C, Midterm #1 6:30-7:30pm Tuesday, September 29, 2009, ME 312, ECE 695C, Midterm #1 6:30-7:30pm Tuesday, September 29, 2009, ME 312, 1. Enter your name, student ID number, e-mail address, and signature in the space provided on this page, NOW! 2. This is a closed book

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails

More information

Two optimization problems in a stochastic bandit model

Two optimization problems in a stochastic bandit model Two optimization problems in a stochastic bandit model Emilie Kaufmann joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishnan Journées MAS 204, Toulouse Outline From stochastic optimization

More information

On the Complexity of Best Arm Identification with Fixed Confidence

On the Complexity of Best Arm Identification with Fixed Confidence On the Complexity of Best Arm Identification with Fixed Confidence Discrete Optimization with Noise Aurélien Garivier, Emilie Kaufmann COLT, June 23 th 2016, New York Institut de Mathématiques de Toulouse

More information

10-704: Information Processing and Learning Fall Lecture 24: Dec 7

10-704: Information Processing and Learning Fall Lecture 24: Dec 7 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 24: Dec 7 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of

More information

Lecture 22: Error exponents in hypothesis testing, GLRT

Lecture 22: Error exponents in hypothesis testing, GLRT 10-704: Information Processing and Learning Spring 2012 Lecture 22: Error exponents in hypothesis testing, GLRT Lecturer: Aarti Singh Scribe: Aarti Singh Disclaimer: These notes have not been subjected

More information

Ferromagnets and superconductors. Kay Kirkpatrick, UIUC

Ferromagnets and superconductors. Kay Kirkpatrick, UIUC Ferromagnets and superconductors Kay Kirkpatrick, UIUC Ferromagnet and superconductor models: Phase transitions and asymptotics Kay Kirkpatrick, Urbana-Champaign October 2012 Ferromagnet and superconductor

More information

arxiv: v2 [math.st] 20 Jul 2016

arxiv: v2 [math.st] 20 Jul 2016 arxiv:1603.02791v2 [math.st] 20 Jul 2016 Asymptotically optimal, sequential, multiple testing procedures with prior information on the number of signals Y. Song and G. Fellouris Department of Statistics,

More information

Rate Maps and Smoothing

Rate Maps and Smoothing Rate Maps and Smoothing Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign http://sal.agecon.uiuc.edu Outline Mapping Rates Risk

More information

STA141C: Big Data & High Performance Statistical Computing

STA141C: Big Data & High Performance Statistical Computing STA141C: Big Data & High Performance Statistical Computing Lecture 4: ML Models (Overview) Cho-Jui Hsieh UC Davis April 17, 2017 Outline Linear regression Ridge regression Logistic regression Other finite-sum

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Dynamic spectrum access with learning for cognitive radio

Dynamic spectrum access with learning for cognitive radio 1 Dynamic spectrum access with learning for cognitive radio Jayakrishnan Unnikrishnan and Venugopal V. Veeravalli Department of Electrical and Computer Engineering, and Coordinated Science Laboratory University

More information

arxiv: v4 [cs.it] 17 Oct 2015

arxiv: v4 [cs.it] 17 Oct 2015 Upper Bounds on the Relative Entropy and Rényi Divergence as a Function of Total Variation Distance for Finite Alphabets Igal Sason Department of Electrical Engineering Technion Israel Institute of Technology

More information

Decentralized Detection In Wireless Sensor Networks

Decentralized Detection In Wireless Sensor Networks Decentralized Detection In Wireless Sensor Networks Milad Kharratzadeh Department of Electrical & Computer Engineering McGill University Montreal, Canada April 2011 Statistical Detection and Estimation

More information

The information complexity of best-arm identification

The information complexity of best-arm identification The information complexity of best-arm identification Emilie Kaufmann, joint work with Olivier Cappé and Aurélien Garivier MAB workshop, Lancaster, January th, 206 Context: the multi-armed bandit model

More information

Stratégies bayésiennes et fréquentistes dans un modèle de bandit

Stratégies bayésiennes et fréquentistes dans un modèle de bandit Stratégies bayésiennes et fréquentistes dans un modèle de bandit thèse effectuée à Telecom ParisTech, co-dirigée par Olivier Cappé, Aurélien Garivier et Rémi Munos Journées MAS, Grenoble, 30 août 2016

More information

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/41 Outline Two-terminal model: Mutual information Operational meaning in: Channel coding: channel

More information

Sequential Detection. Changes: an overview. George V. Moustakides

Sequential Detection. Changes: an overview. George V. Moustakides Sequential Detection of Changes: an overview George V. Moustakides Outline Sequential hypothesis testing and Sequential detection of changes The Sequential Probability Ratio Test (SPRT) for optimum hypothesis

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and Estimation I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 22, 2015

More information

Solutions for Examination Categorical Data Analysis, March 21, 2013

Solutions for Examination Categorical Data Analysis, March 21, 2013 STOCKHOLMS UNIVERSITET MATEMATISKA INSTITUTIONEN Avd. Matematisk statistik, Frank Miller MT 5006 LÖSNINGAR 21 mars 2013 Solutions for Examination Categorical Data Analysis, March 21, 2013 Problem 1 a.

More information

Minimum Message Length Analysis of the Behrens Fisher Problem

Minimum Message Length Analysis of the Behrens Fisher Problem Analysis of the Behrens Fisher Problem Enes Makalic and Daniel F Schmidt Centre for MEGA Epidemiology The University of Melbourne Solomonoff 85th Memorial Conference, 2011 Outline Introduction 1 Introduction

More information

Multi-armed bandit models: a tutorial

Multi-armed bandit models: a tutorial Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)

More information

Mining Approximate Top-K Subspace Anomalies in Multi-Dimensional Time-Series Data

Mining Approximate Top-K Subspace Anomalies in Multi-Dimensional Time-Series Data Mining Approximate Top-K Subspace Anomalies in Multi-Dimensional -Series Data Xiaolei Li, Jiawei Han University of Illinois at Urbana-Champaign VLDB 2007 1 Series Data Many applications produce time series

More information

Active Sequential Hypothesis Testing with Application to a Visual Search Problem

Active Sequential Hypothesis Testing with Application to a Visual Search Problem 202 IEEE International Symposium on Information Theory Proceedings Active Sequential Hypothesis Testing with Application to a Visual Search Problem Nidhin Koshy Vaidhiyan nidhinkv@ece.iisc.ernet.in Department

More information

Detection Games Under Fully Active Adversaries

Detection Games Under Fully Active Adversaries IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. X, NO. X, XXXXXXX XXXX 1 Detection Games Under Fully Active Adversaries Benedetta Tondi, Member, IEEE, Neri Merhav, Fellow, IEEE, Mauro Barni Fellow, IEEE

More information

Threshold Considerations in Distributed Detection in a Network of Sensors.

Threshold Considerations in Distributed Detection in a Network of Sensors. Threshold Considerations in Distributed Detection in a Network of Sensors. Gene T. Whipps 1,2, Emre Ertin 2, and Randolph L. Moses 2 1 US Army Research Laboratory, Adelphi, MD 20783 2 Department of Electrical

More information

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models On the Complexity of Best Arm Identification in Multi-Armed Bandit Models Aurélien Garivier Institut de Mathématiques de Toulouse Information Theory, Learning and Big Data Simons Institute, Berkeley, March

More information

The Timing Capacity of Single-Server Queues with Multiple Flows

The Timing Capacity of Single-Server Queues with Multiple Flows The Timing Capacity of Single-Server Queues with Multiple Flows Xin Liu and R. Srikant Coordinated Science Laboratory University of Illinois at Urbana Champaign March 14, 2003 Timing Channel Information

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

CSE 312 Final Review: Section AA

CSE 312 Final Review: Section AA CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material

More information

Unsupervised Nonparametric Anomaly Detection: A Kernel Method

Unsupervised Nonparametric Anomaly Detection: A Kernel Method Fifty-second Annual Allerton Conference Allerton House, UIUC, Illinois, USA October - 3, 24 Unsupervised Nonparametric Anomaly Detection: A Kernel Method Shaofeng Zou Yingbin Liang H. Vincent Poor 2 Xinghua

More information

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative

More information

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,

More information

Information Theory and Hypothesis Testing

Information Theory and Hypothesis Testing Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking

More information

On the Complexity of Best Arm Identification with Fixed Confidence

On the Complexity of Best Arm Identification with Fixed Confidence On the Complexity of Best Arm Identification with Fixed Confidence Discrete Optimization with Noise Aurélien Garivier, joint work with Emilie Kaufmann CNRS, CRIStAL) to be presented at COLT 16, New York

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Optimal Distributed Detection Strategies for Wireless Sensor Networks

Optimal Distributed Detection Strategies for Wireless Sensor Networks Optimal Distributed Detection Strategies for Wireless Sensor Networks Ke Liu and Akbar M. Sayeed University of Wisconsin-Madison kliu@cae.wisc.edu, akbar@engr.wisc.edu Abstract We study optimal distributed

More information

Number Theory and Divisibility

Number Theory and Divisibility Number Theory and Divisibility Recall the Natural Numbers: N = {1, 2, 3, 4, 5, 6, } Any Natural Number can be expressed as the product of two or more Natural Numbers: 2 x 12 = 24 3 x 8 = 24 6 x 4 = 24

More information

Statistical Learning Theory and the C-Loss cost function

Statistical Learning Theory and the C-Loss cost function Statistical Learning Theory and the C-Loss cost function Jose Principe, Ph.D. Distinguished Professor ECE, BME Computational NeuroEngineering Laboratory and principe@cnel.ufl.edu Statistical Learning Theory

More information

Plug-in Measure-Transformed Quasi Likelihood Ratio Test for Random Signal Detection

Plug-in Measure-Transformed Quasi Likelihood Ratio Test for Random Signal Detection Plug-in Measure-Transformed Quasi Likelihood Ratio Test for Random Signal Detection Nir Halay and Koby Todros Dept. of ECE, Ben-Gurion University of the Negev, Beer-Sheva, Israel February 13, 2017 1 /

More information

The PAC Learning Framework -II

The PAC Learning Framework -II The PAC Learning Framework -II Prof. Dan A. Simovici UMB 1 / 1 Outline 1 Finite Hypothesis Space - The Inconsistent Case 2 Deterministic versus stochastic scenario 3 Bayes Error and Noise 2 / 1 Outline

More information

A New Algorithm for Nonparametric Sequential Detection

A New Algorithm for Nonparametric Sequential Detection A New Algorithm for Nonparametric Sequential Detection Shouvik Ganguly, K. R. Sahasranand and Vinod Sharma Department of Electrical Communication Engineering Indian Institute of Science, Bangalore, India

More information

Machine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang

Machine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang Machine Learning Basics Lecture 7: Multiclass Classification Princeton University COS 495 Instructor: Yingyu Liang Example: image classification indoor Indoor outdoor Example: image classification (multiclass)

More information

Ferromagnets and the classical Heisenberg model. Kay Kirkpatrick, UIUC

Ferromagnets and the classical Heisenberg model. Kay Kirkpatrick, UIUC Ferromagnets and the classical Heisenberg model Kay Kirkpatrick, UIUC Ferromagnets and the classical Heisenberg model: asymptotics for a mean-field phase transition Kay Kirkpatrick, Urbana-Champaign June

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models Revisiting the Exploration-Exploitation Tradeoff in Bandit Models joint work with Aurélien Garivier (IMT, Toulouse) and Tor Lattimore (University of Alberta) Workshop on Optimization and Decision-Making

More information

The Calibrated Bayes Factor for Model Comparison

The Calibrated Bayes Factor for Model Comparison The Calibrated Bayes Factor for Model Comparison Steve MacEachern The Ohio State University Joint work with Xinyi Xu, Pingbo Lu and Ruoxi Xu Supported by the NSF and NSA Bayesian Nonparametrics Workshop

More information

Measure-Transformed Quasi Maximum Likelihood Estimation

Measure-Transformed Quasi Maximum Likelihood Estimation Measure-Transformed Quasi Maximum Likelihood Estimation 1 Koby Todros and Alfred O. Hero Abstract In this paper, we consider the problem of estimating a deterministic vector parameter when the likelihood

More information

Distributed Estimation and Detection for Smart Grid

Distributed Estimation and Detection for Smart Grid Distributed Estimation and Detection for Smart Grid Texas A&M University Joint Wor with: S. Kar (CMU), R. Tandon (Princeton), H. V. Poor (Princeton), and J. M. F. Moura (CMU) 1 Distributed Estimation/Detection

More information

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,

More information

Lecture 35: December The fundamental statistical distances

Lecture 35: December The fundamental statistical distances 36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose

More information

Does Better Inference mean Better Learning?

Does Better Inference mean Better Learning? Does Better Inference mean Better Learning? Andrew E. Gelfand, Rina Dechter & Alexander Ihler Department of Computer Science University of California, Irvine {agelfand,dechter,ihler}@ics.uci.edu Abstract

More information

Pattern Recognition. Parameter Estimation of Probability Density Functions

Pattern Recognition. Parameter Estimation of Probability Density Functions Pattern Recognition Parameter Estimation of Probability Density Functions Classification Problem (Review) The classification problem is to assign an arbitrary feature vector x F to one of c classes. The

More information

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti Lecture 2: Categorical Variable A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti 1 Categorical Variable Categorical variable is qualitative

More information

Økonomisk Kandidateksamen 2004 (II) Econometrics 2 June 14, 2004

Økonomisk Kandidateksamen 2004 (II) Econometrics 2 June 14, 2004 Økonomisk Kandidateksamen 2004 (II) Econometrics 2 June 14, 2004 This is a four hours closed-book exam (uden hjælpemidler). Answer all questions! The questions 1 to 4 have equal weight. Within each question,

More information

Machine Learning 2017

Machine Learning 2017 Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section

More information

Logistic Regression. Machine Learning Fall 2018

Logistic Regression. Machine Learning Fall 2018 Logistic Regression Machine Learning Fall 2018 1 Where are e? We have seen the folloing ideas Linear models Learning as loss minimization Bayesian learning criteria (MAP and MLE estimation) The Naïve Bayes

More information

Machine Learning Basics: Maximum Likelihood Estimation

Machine Learning Basics: Maximum Likelihood Estimation Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning

More information

Detection Games under Fully Active Adversaries. Received: 23 October 2018; Accepted: 25 December 2018; Published: 29 December 2018

Detection Games under Fully Active Adversaries. Received: 23 October 2018; Accepted: 25 December 2018; Published: 29 December 2018 entropy Article Detection Games under Fully Active Adversaries Benedetta Tondi 1,, Neri Merhav 2 and Mauro Barni 1 1 Department of Information Engineering and Mathematical Sciences, University of Siena,

More information

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis

More information

PAC-learning, VC Dimension and Margin-based Bounds

PAC-learning, VC Dimension and Margin-based Bounds More details: General: http://www.learning-with-kernels.org/ Example of more complex bounds: http://www.research.ibm.com/people/t/tzhang/papers/jmlr02_cover.ps.gz PAC-learning, VC Dimension and Margin-based

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks

Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Ying Lin and Biao Chen Dept. of EECS Syracuse University Syracuse, NY 13244, U.S.A. ylin20 {bichen}@ecs.syr.edu Peter

More information

Maximum likelihood estimation

Maximum likelihood estimation Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization

More information

Large-Margin Thresholded Ensembles for Ordinal Regression

Large-Margin Thresholded Ensembles for Ordinal Regression Large-Margin Thresholded Ensembles for Ordinal Regression Hsuan-Tien Lin and Ling Li Learning Systems Group, California Institute of Technology, U.S.A. Conf. on Algorithmic Learning Theory, October 9,

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample

More information

Information Theory in Intelligent Decision Making

Information Theory in Intelligent Decision Making Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

Distributed and Recursive Parameter Estimation in Parametrized Linear State-Space Models

Distributed and Recursive Parameter Estimation in Parametrized Linear State-Space Models 1 Distributed and Recursive Parameter Estimation in Parametrized Linear State-Space Models S. Sundhar Ram, V. V. Veeravalli, and A. Nedić Abstract We consider a network of sensors deployed to sense a spatio-temporal

More information

Statistics 3858 : Contingency Tables

Statistics 3858 : Contingency Tables Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood

More information

Lecture 9: Classification, LDA

Lecture 9: Classification, LDA Lecture 9: Classification, LDA Reading: Chapter 4 STATS 202: Data mining and analysis October 13, 2017 1 / 21 Review: Main strategy in Chapter 4 Find an estimate ˆP (Y X). Then, given an input x 0, we

More information

STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method

STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method Rebecca Barter February 2, 2015 Confidence Intervals Confidence intervals What is a confidence interval? A confidence interval is calculated

More information

A Topologically Constrained Ising Model

A Topologically Constrained Ising Model Constrained Ising Model July 29, 2016 1 / 50 A Topologically Constrained Ising Model Derek Kielty UIUC Katherine Koch UIUC William Linz UIUC Colleen Robichaux UIUC Fernando Roman-Garcia UIUC Elizabeth

More information