Universal Outlier Hypothesis Testing
|
|
- Sharon Gallagher
- 5 years ago
- Views:
Transcription
1 Universal Outlier Hypothesis Testing Venu Veeravalli ECE Dept & CSL & ITI University of Illinois at Urbana-Champaign (with Sirin Nitinawarat and Yun Li) NSF Workshop on Signal Processing for Big Data March 22, 2013
2 Statistical Outlier Detection Single sequence of observations Normal observations follow some fixed (possibly unknown) distribution or generating mechanism Outliers follow different generating mechanism Goal: To find outliers efficiently Applications: fraud detection, public health monitoring, cleaning up sensor data, etc. 2
3 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Amount 30$ 35$ 75$ Normal behavior 3
4 Fraud Detection Example: spending records for a male graduate student Transactions Grocery Gas Books --- Spa Women apparels Cosmetics Amount 30$ 35$ 75$ 250$ 1500$ 500$ Normal behavior Abnormal behavior 4
5 Fraud Detection: Group Monitoring Male graduate students Student 1 Student 2 Student 3 Student M Grocery Dining Grocery Gas Dining Grocery Gas Books Books Movie Books Grocery Movie Books Spa Dining Grocery Gas Women apparel Movie Gas Books Cosmetics Grocery 5
6 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions 6
7 Outlier Hypothesis Testing M sequences of observations, with M large Almost all sequences are generated from common typical distribution Small subset of sequences generated from different (outlier) distributions Special case: o Exactly one sequence is generated from outlier distribution o Goal: to detect outlier sequence efficiently o Universal setting: neither typical nor outlier distributions known; no training data provided 7
8 Universal Outlier Hypothesis Testing Typical distribution Outlier distribution π µ H M µ π π π π H 2 π µ π π π π π µ π π π π π µ π H M π π π π µ 8
9 Mathematical Model y 1 (1) y 2 (1) y 1 (2) y 2 (2) y 1 (M ) y 2 (M ) y n (1) y n (2) y n (M ) ( ) = µ ( (i ) y ) k π( ( j ) y ) k H i : p i y (1),, y (M ) y (1) y (2) y (M ) n k = 1 j i 9
10 Hypothesis Testing Problem H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i Nothing is known about (µ,π) except that they are distinct Universal Detector: δ : Y Mn { 1,, M} Independent of (µ,π ) 10
11 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) 11
12 Performance Metrics Maximal error probability: e( δ, ( µ, π) ) = max P i i { δ(y Mn ) i} Exponent for maximal error probability: α( δ, ( µ, π) ) = lim n 1 n loge( δ, ( µ, π) ) Consistency : e 0 as n Exponential Consistency : α > 0 12
13 Background: Binary Hypothesis Testing H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If (µ,π) known, δ ML (y) = argmax logp i (y) i has α(δ,(µ,π)) = C(µ,π) > 0 exponential consistency Chernoff Info :C(µ,π) = max log µ (y ) s π(y ) 1 s 0 s 1 y 13
14 Outlier Hypothesis testing: known (µ,π) H i : p i y Mn n k = 1 ( ) = µ ( (i ) y ) k π( ( j ) y ) k j i ML Rule : δ ML (y Mn ) = argmax i logp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ ML,(µ,π)) = 2B(µ,π) Bhattacharya Distance : B(µ,π) = log µ (y ) 1/2 π(y ) 1/2 y 14
15 Binary Hypothesis Testing: Unknown µ H 1 : p 1 (y) = n n π(y ) H : p (y) = µ(y ) k 2 2 k k=1 k=1 If µ unknown for any given δ there existsµ s.t. α = 0 No exponential consistency! 15
16 Outlier Hypothesis Testing: unknown µ µ uknown : ˆµ i = γ i empirical Distribution H i : ˆp i y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k π( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆp i (y Mn ) H 1 H 2 H M 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ Exponential Consistency : α(δ GL,(µ,π)) = 2B(µ,π) Same as known µ,π 16
17 Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is 17
18 Key Tool: Sanov s Theorem Sanov s Theorem: For i.i.d. rvs Y n p, exponent of probability that random empirical distribution falls in closed set E is lim n 1 n logp { n Empirical( Y ) E} = min q E D ( q p) E q * D ( q * p) p 18
19 Proposed Universal Test ( µ,π) not known : ˆµ i = γ i ˆπ i = 1 M 1 H i : ˆpi y Mn n k = 1 ( ) = ˆµ ( (i ) i y ) k ˆπ ( ( j ) y ) k j i Generalized Likelihood (GL) Rule : δ GL (y Mn ) = argmax i log ˆpi (y Mn ) H 1 H 2 H M j i γ j Empirical distributions 1 2 M µ π π π π π µ π π π π π µ π π π π π µ π π π π π µ 19
20 Performance of Universal Test α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 20
21 Performance of Universal Test Universally exponential consistency! α( δ, ( µ, π) ) = min D(p p 1,., p 1 µ) + D(p 2 π) + + D(p M π) M > 0, ( µ,π) 1 D(p j p M 1 k ) D(p j j 1 k 1 j 2 1 p M 1 k ) k 2 21
22 Asymptotic Optimality Motivation: When only π is known, optimal error exponent is 2B µ, π ( ) Estimate of π satisfies lim n 1 M M γ i = 1 i =1 M µ + M 1 M π, lim M 1 M µ + M 1 M π = π 22
23 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) 23
24 Asymptotic Optimality Our universal outlier detector achieves error exponent lower bounded by q: D(q π) min 2B(µ, q) 1 M 1 ( 2B(µ, q)+c π ) This lower bound is non-decreasing in and converges to 2B(µ, π) as M M 3, 24
25 Numerical Results Lower bound of the error exponent 2 B (µ, π), µ=(0.36, 0.64) π=(0.64, 0.36) log (M), where M is the number M of the coordinates 25
26 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? 26
27 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent 27
28 Discussion Generalized likelihood (GL) test is universally exponentially consistent for single outlier hypothesis testing GL test is asymptotically optimum in error exponent for large M What if we have more than one outlier? o If it is known that we have exactly K << M outliers, then GL test is still universally exponentially consistent o If number of outliers is not known a priori, then universally exponentially does not exist! 28
Universal Outlier Hypothesis Testing
1 Universal Outlier Hypothesis Testing Yun Li, Student Member, IEEE, Sirin Nitinawarat, Member, IEEE, and Venugopal V. Veeravalli, Fellow, IEEE Abstract arxiv:submit/0844243 [cs.it] 11 Nov 2013 The following
More informationUncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department
Decision-Making under Statistical Uncertainty Jayakrishnan Unnikrishnan PhD Defense ECE Department University of Illinois at Urbana-Champaign CSL 141 12 June 2010 Statistical Decision-Making Relevant in
More informationMachine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang
Machine Learning Basics Lecture 2: Linear Classification Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d.
More informationChaos, Complexity, and Inference (36-462)
Chaos, Complexity, and Inference (36-462) Lecture 7: Information Theory Cosma Shalizi 3 February 2009 Entropy and Information Measuring randomness and dependence in bits The connection to statistics Long-run
More informationFinding the best mismatched detector for channel coding and hypothesis testing
Finding the best mismatched detector for channel coding and hypothesis testing Sean Meyn Department of Electrical and Computer Engineering University of Illinois and the Coordinated Science Laboratory
More informationQuickest Anomaly Detection: A Case of Active Hypothesis Testing
Quickest Anomaly Detection: A Case of Active Hypothesis Testing Kobi Cohen, Qing Zhao Department of Electrical Computer Engineering, University of California, Davis, CA 95616 {yscohen, qzhao}@ucdavis.edu
More informationData-Efficient Quickest Change Detection
Data-Efficient Quickest Change Detection Venu Veeravalli ECE Department & Coordinated Science Lab University of Illinois at Urbana-Champaign http://www.ifp.illinois.edu/~vvv (joint work with Taposh Banerjee)
More informationGeneralized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses
Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:
More informationThe information complexity of sequential resource allocation
The information complexity of sequential resource allocation Emilie Kaufmann, joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishan SMILE Seminar, ENS, June 8th, 205 Sequential allocation
More informationFinding a Needle in a Haystack: Conditions for Reliable. Detection in the Presence of Clutter
Finding a eedle in a Haystack: Conditions for Reliable Detection in the Presence of Clutter Bruno Jedynak and Damianos Karakos October 23, 2006 Abstract We study conditions for the detection of an -length
More informationNecessary and Sufficient Conditions for High-Dimensional Salient Feature Subset Recovery
Necessary and Sufficient Conditions for High-Dimensional Salient Feature Subset Recovery Vincent Tan, Matt Johnson, Alan S. Willsky Stochastic Systems Group, Laboratory for Information and Decision Systems,
More informationThe Method of Types and Its Application to Information Hiding
The Method of Types and Its Application to Information Hiding Pierre Moulin University of Illinois at Urbana-Champaign www.ifp.uiuc.edu/ moulin/talks/eusipco05-slides.pdf EUSIPCO Antalya, September 7,
More informationDecentralized Detection in Sensor Networks
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL 51, NO 2, FEBRUARY 2003 407 Decentralized Detection in Sensor Networks Jean-François Chamberland, Student Member, IEEE, and Venugopal V Veeravalli, Senior Member,
More informationEntropy, Inference, and Channel Coding
Entropy, Inference, and Channel Coding Sean Meyn Department of Electrical and Computer Engineering University of Illinois and the Coordinated Science Laboratory NSF support: ECS 02-17836, ITR 00-85929
More informationIEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 57, NO 3, MARCH 2011 1587 Universal and Composite Hypothesis Testing via Mismatched Divergence Jayakrishnan Unnikrishnan, Member, IEEE, Dayu Huang, Student
More informationDensity Estimation: ML, MAP, Bayesian estimation
Density Estimation: ML, MAP, Bayesian estimation CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Maximum-Likelihood Estimation Maximum
More informationSpatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood
Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October
More informationApplications of Information Geometry to Hypothesis Testing and Signal Detection
CMCAA 2016 Applications of Information Geometry to Hypothesis Testing and Signal Detection Yongqiang Cheng National University of Defense Technology July 2016 Outline 1. Principles of Information Geometry
More informationSHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe
SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/40 Acknowledgement Praneeth Boda Himanshu Tyagi Shun Watanabe 3/40 Outline Two-terminal model: Mutual
More informationINFORMATION THEORY AND STATISTICS
CHAPTER INFORMATION THEORY AND STATISTICS We now explore the relationship between information theory and statistics. We begin by describing the method of types, which is a powerful technique in large deviation
More informationBandit models: a tutorial
Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses
More informationCensoring for Type-Based Multiple Access Scheme in Wireless Sensor Networks
Censoring for Type-Based Multiple Access Scheme in Wireless Sensor Networks Mohammed Karmoose Electrical Engineering Department Alexandria University Alexandria 1544, Egypt Email: mhkarmoose@ieeeorg Karim
More informationECE 695C, Midterm #1 6:30-7:30pm Tuesday, September 29, 2009, ME 312,
ECE 695C, Midterm #1 6:30-7:30pm Tuesday, September 29, 2009, ME 312, 1. Enter your name, student ID number, e-mail address, and signature in the space provided on this page, NOW! 2. This is a closed book
More informationStat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS
Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails
More informationTwo optimization problems in a stochastic bandit model
Two optimization problems in a stochastic bandit model Emilie Kaufmann joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishnan Journées MAS 204, Toulouse Outline From stochastic optimization
More informationOn the Complexity of Best Arm Identification with Fixed Confidence
On the Complexity of Best Arm Identification with Fixed Confidence Discrete Optimization with Noise Aurélien Garivier, Emilie Kaufmann COLT, June 23 th 2016, New York Institut de Mathématiques de Toulouse
More information10-704: Information Processing and Learning Fall Lecture 24: Dec 7
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 24: Dec 7 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of
More informationLecture 22: Error exponents in hypothesis testing, GLRT
10-704: Information Processing and Learning Spring 2012 Lecture 22: Error exponents in hypothesis testing, GLRT Lecturer: Aarti Singh Scribe: Aarti Singh Disclaimer: These notes have not been subjected
More informationFerromagnets and superconductors. Kay Kirkpatrick, UIUC
Ferromagnets and superconductors Kay Kirkpatrick, UIUC Ferromagnet and superconductor models: Phase transitions and asymptotics Kay Kirkpatrick, Urbana-Champaign October 2012 Ferromagnet and superconductor
More informationarxiv: v2 [math.st] 20 Jul 2016
arxiv:1603.02791v2 [math.st] 20 Jul 2016 Asymptotically optimal, sequential, multiple testing procedures with prior information on the number of signals Y. Song and G. Fellouris Department of Statistics,
More informationRate Maps and Smoothing
Rate Maps and Smoothing Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign http://sal.agecon.uiuc.edu Outline Mapping Rates Risk
More informationSTA141C: Big Data & High Performance Statistical Computing
STA141C: Big Data & High Performance Statistical Computing Lecture 4: ML Models (Overview) Cho-Jui Hsieh UC Davis April 17, 2017 Outline Linear regression Ridge regression Logistic regression Other finite-sum
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang
More informationDynamic spectrum access with learning for cognitive radio
1 Dynamic spectrum access with learning for cognitive radio Jayakrishnan Unnikrishnan and Venugopal V. Veeravalli Department of Electrical and Computer Engineering, and Coordinated Science Laboratory University
More informationarxiv: v4 [cs.it] 17 Oct 2015
Upper Bounds on the Relative Entropy and Rényi Divergence as a Function of Total Variation Distance for Finite Alphabets Igal Sason Department of Electrical Engineering Technion Israel Institute of Technology
More informationDecentralized Detection In Wireless Sensor Networks
Decentralized Detection In Wireless Sensor Networks Milad Kharratzadeh Department of Electrical & Computer Engineering McGill University Montreal, Canada April 2011 Statistical Detection and Estimation
More informationThe information complexity of best-arm identification
The information complexity of best-arm identification Emilie Kaufmann, joint work with Olivier Cappé and Aurélien Garivier MAB workshop, Lancaster, January th, 206 Context: the multi-armed bandit model
More informationStratégies bayésiennes et fréquentistes dans un modèle de bandit
Stratégies bayésiennes et fréquentistes dans un modèle de bandit thèse effectuée à Telecom ParisTech, co-dirigée par Olivier Cappé, Aurélien Garivier et Rémi Munos Journées MAS, Grenoble, 30 août 2016
More informationSHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe
SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/41 Outline Two-terminal model: Mutual information Operational meaning in: Channel coding: channel
More informationSequential Detection. Changes: an overview. George V. Moustakides
Sequential Detection of Changes: an overview George V. Moustakides Outline Sequential hypothesis testing and Sequential detection of changes The Sequential Probability Ratio Test (SPRT) for optimum hypothesis
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and Estimation I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 22, 2015
More informationSolutions for Examination Categorical Data Analysis, March 21, 2013
STOCKHOLMS UNIVERSITET MATEMATISKA INSTITUTIONEN Avd. Matematisk statistik, Frank Miller MT 5006 LÖSNINGAR 21 mars 2013 Solutions for Examination Categorical Data Analysis, March 21, 2013 Problem 1 a.
More informationMinimum Message Length Analysis of the Behrens Fisher Problem
Analysis of the Behrens Fisher Problem Enes Makalic and Daniel F Schmidt Centre for MEGA Epidemiology The University of Melbourne Solomonoff 85th Memorial Conference, 2011 Outline Introduction 1 Introduction
More informationMulti-armed bandit models: a tutorial
Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)
More informationMining Approximate Top-K Subspace Anomalies in Multi-Dimensional Time-Series Data
Mining Approximate Top-K Subspace Anomalies in Multi-Dimensional -Series Data Xiaolei Li, Jiawei Han University of Illinois at Urbana-Champaign VLDB 2007 1 Series Data Many applications produce time series
More informationActive Sequential Hypothesis Testing with Application to a Visual Search Problem
202 IEEE International Symposium on Information Theory Proceedings Active Sequential Hypothesis Testing with Application to a Visual Search Problem Nidhin Koshy Vaidhiyan nidhinkv@ece.iisc.ernet.in Department
More informationDetection Games Under Fully Active Adversaries
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. X, NO. X, XXXXXXX XXXX 1 Detection Games Under Fully Active Adversaries Benedetta Tondi, Member, IEEE, Neri Merhav, Fellow, IEEE, Mauro Barni Fellow, IEEE
More informationThreshold Considerations in Distributed Detection in a Network of Sensors.
Threshold Considerations in Distributed Detection in a Network of Sensors. Gene T. Whipps 1,2, Emre Ertin 2, and Randolph L. Moses 2 1 US Army Research Laboratory, Adelphi, MD 20783 2 Department of Electrical
More informationOn the Complexity of Best Arm Identification in Multi-Armed Bandit Models
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models Aurélien Garivier Institut de Mathématiques de Toulouse Information Theory, Learning and Big Data Simons Institute, Berkeley, March
More informationThe Timing Capacity of Single-Server Queues with Multiple Flows
The Timing Capacity of Single-Server Queues with Multiple Flows Xin Liu and R. Srikant Coordinated Science Laboratory University of Illinois at Urbana Champaign March 14, 2003 Timing Channel Information
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationCSE 312 Final Review: Section AA
CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material
More informationUnsupervised Nonparametric Anomaly Detection: A Kernel Method
Fifty-second Annual Allerton Conference Allerton House, UIUC, Illinois, USA October - 3, 24 Unsupervised Nonparametric Anomaly Detection: A Kernel Method Shaofeng Zou Yingbin Liang H. Vincent Poor 2 Xinghua
More informationClassification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012
Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative
More informationLecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016
Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,
More informationInformation Theory and Hypothesis Testing
Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking
More informationOn the Complexity of Best Arm Identification with Fixed Confidence
On the Complexity of Best Arm Identification with Fixed Confidence Discrete Optimization with Noise Aurélien Garivier, joint work with Emilie Kaufmann CNRS, CRIStAL) to be presented at COLT 16, New York
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationOptimal Distributed Detection Strategies for Wireless Sensor Networks
Optimal Distributed Detection Strategies for Wireless Sensor Networks Ke Liu and Akbar M. Sayeed University of Wisconsin-Madison kliu@cae.wisc.edu, akbar@engr.wisc.edu Abstract We study optimal distributed
More informationNumber Theory and Divisibility
Number Theory and Divisibility Recall the Natural Numbers: N = {1, 2, 3, 4, 5, 6, } Any Natural Number can be expressed as the product of two or more Natural Numbers: 2 x 12 = 24 3 x 8 = 24 6 x 4 = 24
More informationStatistical Learning Theory and the C-Loss cost function
Statistical Learning Theory and the C-Loss cost function Jose Principe, Ph.D. Distinguished Professor ECE, BME Computational NeuroEngineering Laboratory and principe@cnel.ufl.edu Statistical Learning Theory
More informationPlug-in Measure-Transformed Quasi Likelihood Ratio Test for Random Signal Detection
Plug-in Measure-Transformed Quasi Likelihood Ratio Test for Random Signal Detection Nir Halay and Koby Todros Dept. of ECE, Ben-Gurion University of the Negev, Beer-Sheva, Israel February 13, 2017 1 /
More informationThe PAC Learning Framework -II
The PAC Learning Framework -II Prof. Dan A. Simovici UMB 1 / 1 Outline 1 Finite Hypothesis Space - The Inconsistent Case 2 Deterministic versus stochastic scenario 3 Bayes Error and Noise 2 / 1 Outline
More informationA New Algorithm for Nonparametric Sequential Detection
A New Algorithm for Nonparametric Sequential Detection Shouvik Ganguly, K. R. Sahasranand and Vinod Sharma Department of Electrical Communication Engineering Indian Institute of Science, Bangalore, India
More informationMachine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang
Machine Learning Basics Lecture 7: Multiclass Classification Princeton University COS 495 Instructor: Yingyu Liang Example: image classification indoor Indoor outdoor Example: image classification (multiclass)
More informationFerromagnets and the classical Heisenberg model. Kay Kirkpatrick, UIUC
Ferromagnets and the classical Heisenberg model Kay Kirkpatrick, UIUC Ferromagnets and the classical Heisenberg model: asymptotics for a mean-field phase transition Kay Kirkpatrick, Urbana-Champaign June
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationRevisiting the Exploration-Exploitation Tradeoff in Bandit Models
Revisiting the Exploration-Exploitation Tradeoff in Bandit Models joint work with Aurélien Garivier (IMT, Toulouse) and Tor Lattimore (University of Alberta) Workshop on Optimization and Decision-Making
More informationThe Calibrated Bayes Factor for Model Comparison
The Calibrated Bayes Factor for Model Comparison Steve MacEachern The Ohio State University Joint work with Xinyi Xu, Pingbo Lu and Ruoxi Xu Supported by the NSF and NSA Bayesian Nonparametrics Workshop
More informationMeasure-Transformed Quasi Maximum Likelihood Estimation
Measure-Transformed Quasi Maximum Likelihood Estimation 1 Koby Todros and Alfred O. Hero Abstract In this paper, we consider the problem of estimating a deterministic vector parameter when the likelihood
More informationDistributed Estimation and Detection for Smart Grid
Distributed Estimation and Detection for Smart Grid Texas A&M University Joint Wor with: S. Kar (CMU), R. Tandon (Princeton), H. V. Poor (Princeton), and J. M. F. Moura (CMU) 1 Distributed Estimation/Detection
More information1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).
Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,
More informationLecture 35: December The fundamental statistical distances
36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose
More informationDoes Better Inference mean Better Learning?
Does Better Inference mean Better Learning? Andrew E. Gelfand, Rina Dechter & Alexander Ihler Department of Computer Science University of California, Irvine {agelfand,dechter,ihler}@ics.uci.edu Abstract
More informationPattern Recognition. Parameter Estimation of Probability Density Functions
Pattern Recognition Parameter Estimation of Probability Density Functions Classification Problem (Review) The classification problem is to assign an arbitrary feature vector x F to one of c classes. The
More informationLecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti
Lecture 2: Categorical Variable A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti 1 Categorical Variable Categorical variable is qualitative
More informationØkonomisk Kandidateksamen 2004 (II) Econometrics 2 June 14, 2004
Økonomisk Kandidateksamen 2004 (II) Econometrics 2 June 14, 2004 This is a four hours closed-book exam (uden hjælpemidler). Answer all questions! The questions 1 to 4 have equal weight. Within each question,
More informationMachine Learning 2017
Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section
More informationLogistic Regression. Machine Learning Fall 2018
Logistic Regression Machine Learning Fall 2018 1 Where are e? We have seen the folloing ideas Linear models Learning as loss minimization Bayesian learning criteria (MAP and MLE estimation) The Naïve Bayes
More informationMachine Learning Basics: Maximum Likelihood Estimation
Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning
More informationDetection Games under Fully Active Adversaries. Received: 23 October 2018; Accepted: 25 December 2018; Published: 29 December 2018
entropy Article Detection Games under Fully Active Adversaries Benedetta Tondi 1,, Neri Merhav 2 and Mauro Barni 1 1 Department of Information Engineering and Mathematical Sciences, University of Siena,
More informationChapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)
Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis
More informationPAC-learning, VC Dimension and Margin-based Bounds
More details: General: http://www.learning-with-kernels.org/ Example of more complex bounds: http://www.research.ibm.com/people/t/tzhang/papers/jmlr02_cover.ps.gz PAC-learning, VC Dimension and Margin-based
More informationMinimum Hellinger Distance Estimation in a. Semiparametric Mixture Model
Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.
More informationDistributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks
Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Ying Lin and Biao Chen Dept. of EECS Syracuse University Syracuse, NY 13244, U.S.A. ylin20 {bichen}@ecs.syr.edu Peter
More informationMaximum likelihood estimation
Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization
More informationLarge-Margin Thresholded Ensembles for Ordinal Regression
Large-Margin Thresholded Ensembles for Ordinal Regression Hsuan-Tien Lin and Ling Li Learning Systems Group, California Institute of Technology, U.S.A. Conf. on Algorithmic Learning Theory, October 9,
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample
More informationInformation Theory in Intelligent Decision Making
Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationDistributed and Recursive Parameter Estimation in Parametrized Linear State-Space Models
1 Distributed and Recursive Parameter Estimation in Parametrized Linear State-Space Models S. Sundhar Ram, V. V. Veeravalli, and A. Nedić Abstract We consider a network of sensors deployed to sense a spatio-temporal
More informationStatistics 3858 : Contingency Tables
Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson
More informationA Course on Advanced Econometrics
A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.
More informationLecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary
ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood
More informationLecture 9: Classification, LDA
Lecture 9: Classification, LDA Reading: Chapter 4 STATS 202: Data mining and analysis October 13, 2017 1 / 21 Review: Main strategy in Chapter 4 Find an estimate ˆP (Y X). Then, given an input x 0, we
More informationSTAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method
STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method Rebecca Barter February 2, 2015 Confidence Intervals Confidence intervals What is a confidence interval? A confidence interval is calculated
More informationA Topologically Constrained Ising Model
Constrained Ising Model July 29, 2016 1 / 50 A Topologically Constrained Ising Model Derek Kielty UIUC Katherine Koch UIUC William Linz UIUC Colleen Robichaux UIUC Fernando Roman-Garcia UIUC Elizabeth
More information