Biometrics: A Pattern Recognition System. Error Rates, Error vs. Threshold, ROC Curve. Biometrics CSE 190, Lecture 3

Biometrics: A Pattern Recognition System (Biometrics CSE 190, Lecture 3)

Pattern classification: authentication reduces to a Yes/No decision. A biometric system has two stages, enrollment and authentication.

Error rates:
- False accept rate (FAR): proportion of impostors accepted
- False reject rate (FRR): proportion of genuine users rejected
- Failure to enroll rate (FTE): proportion of the population that cannot be enrolled
- Failure to acquire rate (FTA): proportion of the population that cannot be verified

Error vs. threshold:
- False match (false accept): mistaking biometric measurements from two different persons to be from the same person
- False non-match (false reject): mistaking two biometric measurements from the same person to be from two different persons
(Figure: FAR and FRR as functions of the decision threshold; (c) Jain 2004)

ROC curve: the accuracy requirements of a biometric system are application dependent. (Figure: ROC curve; (c) Jain 2004)

Announcements: readings; project assignment on the web page.
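
A minimal sketch, assuming NumPy and synthetic genuine/impostor score distributions (all values here are made up for illustration), of how FAR and FRR trade off as the threshold moves and how the ROC curve is traced out:

```python
import numpy as np

# Hypothetical similarity scores: higher means "more likely the same person".
rng = np.random.default_rng(0)
genuine_scores = rng.normal(loc=0.7, scale=0.1, size=1000)   # same-person comparisons
impostor_scores = rng.normal(loc=0.4, scale=0.1, size=1000)  # different-person comparisons

thresholds = np.linspace(0.0, 1.0, 101)
far = [(impostor_scores >= t).mean() for t in thresholds]  # impostors accepted
frr = [(genuine_scores < t).mean() for t in thresholds]    # genuine users rejected

# ROC curve: genuine accept rate (1 - FRR) against FAR as the threshold sweeps.
roc = [(f, 1.0 - r) for f, r in zip(far, frr)]
for t, (f, gar) in zip(thresholds[::25], roc[::25]):
    print(f"threshold={t:.2f}  FAR={f:.3f}  GAR={gar:.3f}")
```

Raising the threshold lowers FAR and raises FRR; the application decides where on that curve to operate.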

An Example: Pattern Classification

Sorting incoming ducks on a conveyor into two categories: healthy vs. diseased ducks.

Bayesian Decision Theory: Continuous Features (Sections 2.1-2.2)

Introduction: the sea bass/salmon example

State of nature, prior:
- The state of nature is a random variable.
- In the textbook example, the catch of salmon and sea bass is equiprobable; the health of ducks is probably not equiprobable.
- P(ω1), P(ω2): prior probabilities, with P(ω1) = P(ω2) (uniform priors)
- P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

Decision rule with only the prior information: decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.

Use of the class-conditional information: P(x | ω1) and P(x | ω2) describe the difference in lightness between the populations of healthy and diseased ducks.

Posterior, likelihood, evidence (Bayes rule):

P(ωj | x) = P(x | ωj) P(ωj) / P(x)

In words: Posterior = (Likelihood x Prior) / Evidence, where, in the case of two categories,

P(x) = P(x | ω1) P(ω1) + P(x | ω2) P(ω2)

Intuitive decision rule given the posterior probabilities. Given x:
- if P(ω1 | x) > P(ω2 | x), decide that the true state of nature is ω1
- if P(ω1 | x) < P(ω2 | x), decide that the true state of nature is ω2

Why do this? Whenever we observe a particular x, the probability of error is
- P(error | x) = P(ω1 | x) if we decide ω2
- P(error | x) = P(ω2 | x) if we decide ω1
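
A small sketch of this minimum-error decision rule, assuming one-dimensional Gaussian class-conditional densities for the lightness feature x (the means, spreads, and priors below are hypothetical, not taken from the slides):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, std):
    """Class-conditional density p(x | ω) modeled as a 1-D Gaussian."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

# Hypothetical lightness models for healthy (ω1) and diseased (ω2) ducks.
prior = {"w1": 0.5, "w2": 0.5}                           # equiprobable priors
likelihood = {"w1": lambda x: gaussian_pdf(x, 6.0, 1.0),
              "w2": lambda x: gaussian_pdf(x, 4.0, 1.0)}

def posterior(x):
    """Bayes rule: P(ωj | x) = p(x | ωj) P(ωj) / p(x)."""
    evidence = sum(likelihood[w](x) * prior[w] for w in prior)
    return {w: likelihood[w](x) * prior[w] / evidence for w in prior}

x = 5.2                                                  # observed lightness
post = posterior(x)
decision = max(post, key=post.get)                       # larger posterior wins
print(post, "-> decide", decision, " P(error|x) =", 1 - post[decision])
```

Deciding for the larger posterior makes P(error | x) the smaller of the two posteriors, which is why no other rule can do better for that x.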

Bayesian Decision Theory: Continuous Features

Since the decision rule is optimal for each feature value x, there is no better rule for all x.

Generalization of the preceding ideas:
- Use of more than one feature
- Use of more than two states of nature
- Allowing actions, not only decisions on the state of nature
- Introducing a loss function (more general than the probability of error)

Allowing actions other than classification primarily allows the possibility of rejection: refusing to make a decision in close or bad cases. The loss function states how costly each action taken is.

What is the expected loss for action αi?
Let x be a vector of features, let {ω1, ω2, ..., ωc} be the set of c states of nature (or "classes"), let {α1, α2, ..., αa} be the set of possible actions, and let λ(αi | ωj) be the loss for action αi when the state of nature is ωj. For any given x, the expected loss is

R(αi | x) = Σj λ(αi | ωj) P(ωj | x),   j = 1, ..., c

R(αi | x) is called the conditional risk (or expected loss). The overall risk R is the expected loss incurred by the decision rule α(x) over all x: R = ∫ R(α(x) | x) p(x) dx.

Minimizing R means minimizing R(αi | x) for each x: given a measured feature vector x, select the action αi for which R(αi | x) is minimum, for i = 1, ..., a. Then R is minimum, and R in this case is called the Bayes risk, the best performance that can be achieved.
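
A minimal sketch of minimum-conditional-risk action selection; the loss matrix, the extra rejection action, and the posterior values are illustrative assumptions:

```python
# Rows: actions α1..α3 (decide ω1, decide ω2, reject); columns: states ω1, ω2.
loss = [
    [0.0, 1.0],   # λ(α1 | ω1), λ(α1 | ω2)
    [1.0, 0.0],   # λ(α2 | ω1), λ(α2 | ω2)
    [0.3, 0.3],   # rejection costs the same whatever the true state
]
posteriors = [0.55, 0.45]  # P(ω1 | x), P(ω2 | x) for the observed x

# Conditional risk R(αi | x) = Σj λ(αi | ωj) P(ωj | x)
risks = [sum(l * p for l, p in zip(row, posteriors)) for row in loss]
best_action = min(range(len(risks)), key=lambda i: risks[i])
print("conditional risks:", risks, "-> take action", best_action + 1)
```

With these numbers the rejection action has the lowest risk, illustrating how a loss function can make "refuse to decide" the best choice in close cases.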

Two-Category Classification

- α1: deciding ω1
- α2: deciding ω2
- λij = λ(αi | ωj): loss incurred for deciding ωi when the true state of nature is ωj

Conditional risks:
R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)

Our rule is the following: if R(α1 | x) < R(α2 | x), i.e. if
λ11 P(ω1 | x) + λ12 P(ω2 | x) < λ21 P(ω1 | x) + λ22 P(ω2 | x),
take action α1 (decide ω1). This results in the equivalent rule: decide ω1 if

(λ21 - λ11) P(x | ω1) P(ω1) > (λ12 - λ22) P(x | ω2) P(ω2)

and decide ω2 otherwise. Written as a likelihood-ratio test: decide ω1 if

P(x | ω1) / P(x | ω2) > [(λ12 - λ22) P(ω2)] / [(λ21 - λ11) P(ω1)]
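
A sketch of the same rule written as a likelihood-ratio test; the Gaussian class-conditionals, priors, and losses are illustrative assumptions rather than values from the lecture:

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, std):
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

# Hypothetical class-conditional densities, priors, and losses.
p_x_given_w1 = lambda x: gaussian_pdf(x, 6.0, 1.0)
p_x_given_w2 = lambda x: gaussian_pdf(x, 4.0, 1.0)
prior_w1, prior_w2 = 0.5, 0.5
lam11, lam12, lam21, lam22 = 0.0, 2.0, 1.0, 0.0  # deciding ω1 when ω2 is true is twice as costly

# Decide ω1 when the likelihood ratio exceeds the loss- and prior-dependent threshold.
threshold = ((lam12 - lam22) * prior_w2) / ((lam21 - lam11) * prior_w1)

def decide(x):
    ratio = p_x_given_w1(x) / p_x_given_w2(x)
    return "w1" if ratio > threshold else "w2"

print(decide(4.8), decide(5.5))  # the larger λ12 demands stronger evidence before deciding ω1
```

With equal losses the threshold is simply the prior ratio; unequal losses shift the decision boundary, just as the inequality above predicts.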