EEL 851: Biometrics. An Overview of Statistical Pattern Recognition
- Damian Pierce
Slide 2: Outline
- Introduction: pattern, feature, noise
- Example problem: analysis, segmentation, feature extraction, classification
- Design cycle
- Bayesian decision theory: discriminant functions, decision regions
- Minimum distance classification
- Nearest neighbor classification
- Parameter estimation
- Decision boundaries
Slide 3: Introduction
- What is a pattern? An object, process, or event consisting of both deterministic and stochastic components; a record of dynamic occurrences influenced by both deterministic and stochastic factors. Examples: voice, images, characters.
- Kinds of patterns: visual, temporal, logical.
- What is a pattern class? A set of patterns sharing a set of common attributes (or features), usually originating from the same source.
Slide 4: Introduction
- Feature: a relevant characteristic that sets patterns apart from each other; data extractable through measurement or processing. Example patterns: speech waveforms, crystals, textures, weather patterns. Example features: age, color, height, width.
- Classification: assigning patterns to classes based on their features.
- Noise: distortions in pattern processing and/or in the training samples that affect the classification performance of the system.
Slide 5: Example
- Automation in a fish-packing plant: sort incoming fish on a conveyor according to species using optical sensing.
- Species to sort: sea bass and salmon.
Slide 6: Problem Analysis
- Set up a camera and take some sample images.
- Suggested features: length, lightness, width, number and shape of fins, position of the mouth, etc.
- Explore the suggested features!
Slide 7: Preprocessing
- Segmentation: isolate the fish from the background and from one another.
- Feature extraction: reduce the data by measuring certain features.
- Classifier: evaluates the evidence presented and makes the final decision.
Slide 8: Preprocessing (figure)
Slide 9: Classification
- Selecting the length of the fish as a possible feature for classification.
Slide 10: Classification
- Length turns out to be a poor feature! Select lightness as a possible feature instead.
Slide 11: Threshold Decision Boundary
- Cost: are the consequences of our decisions equal? Deciding this is the task of decision theory.
- Moving the decision boundary toward smaller values of lightness reduces the number of sea bass classified as salmon, at some cost.
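The threshold rule above can be sketched in a few lines. This is a minimal illustration, not the slides' actual classifier: the threshold value and the sample measurements below are made up.

```python
# Minimal sketch of the lightness-threshold decision rule.
# The default threshold and the measurements are illustrative values.

def classify_by_lightness(x, threshold=6.0):
    """Decide 'salmon' if lightness x is below the threshold, else 'sea bass'."""
    return "salmon" if x < threshold else "sea bass"

measurements = [4.2, 5.8, 6.5, 7.9]
print([classify_by_lightness(x) for x in measurements])

# Lowering the threshold makes the rule more reluctant to call a fish 'salmon',
# reducing the costly error of labelling sea bass as salmon.
print([classify_by_lightness(x, threshold=5.0) for x in measurements])
```

Shifting the threshold trades one error type for the other, which is exactly the cost question decision theory formalizes.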
Slide 12: Classification
- Adopt both the lightness and the width of the fish: x = [x1, x2], where x1 is lightness and x2 is width.
Slide 13: Classification (figure)
Slide 14: Classification
- Our satisfaction is premature: the central aim is to correctly classify a novel input. This is the issue of generalization!
Slide 15: Classification
- Syntactic pattern recognition: recursive description using the methods of formal languages.
- Structural pattern recognition: derivation of descriptions using formal rules.
Slide 16: Pattern Recognition Systems
- Invariant features: translation, rotation, scale.
Slide 17: Design Cycle
- Data collection
- Feature choice
- Model choice
- Training
- Evaluation
- Computational complexity
Slide 18: Design Cycle (figure)
Slide 19: Types of Learning
- Supervised learning: a teacher supplies a category label or cost for each pattern in the training set (desired input and desired output). Goal: produce a correct output given a new input.
- Goals of supervised learning: classification (desired output is a discrete class label) and regression (desired output is continuous-valued).
- Unsupervised learning: the system forms clusters or natural groupings of the input patterns. Goal: build a model or find useful representations of the data. Uses: reasoning, finding clusters, dimensionality reduction, decision making, prediction, data compression, outlier detection, and aiding classification.
Slide 20: Bayesian Decision Theory
- Assumptions: the problem is posed in probabilistic terms, and all relevant probabilities are known.
- Sea bass/salmon example: the state of nature is a random variable ω, with ω = ω1 (sea bass) or ω = ω2 (salmon).
- Prior probabilities: P(ω1) for sea bass and P(ω2) for salmon, with P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity).
Slide 21: Bayesian Decision Theory
- Decision rule with only the prior information (we cannot see the fish): decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.
- In most cases we have more information: the class-conditional pdf p(x | ω), i.e., p(x | ω1) and p(x | ω2), describing the difference in lightness between the populations of sea bass and salmon.
Slide 22: Bayesian Decision Theory (figure)
- Class-conditional pdfs over x, the lightness of the fish (normalized).
Slide 23: Bayes Formula
- Given the lightness x of a fish, which class is it? P(ωj | x) = p(x | ωj) P(ωj) / p(x), where in the two-category case p(x) = Σ_{j=1..2} p(x | ωj) P(ωj).
- In words: Posterior = (Likelihood × Prior) / Evidence, where p(x | ωj) is the likelihood and p(x) is the evidence (a scale factor).
Slide 24: Posterior Probabilities (figure)
- Prior probabilities: P(ω1) = 2/3, P(ω2) = 1/3.
Slide 25: Bayes Decision Rule
- Decision given the posterior probabilities for observation x: if P(ω1 | x) > P(ω2 | x), decide that the true state of nature is ω1; if P(ω1 | x) < P(ω2 | x), decide ω2.
- Probability of error: P(error | x) = P(ω1 | x) if we decide ω2, and P(error | x) = P(ω2 | x) if we decide ω1.
Slide 26: Bayes Decision Rule
- To minimize the probability of error, decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2.
- Under the Bayes decision, P(error | x) = min [P(ω1 | x), P(ω2 | x)].
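The Bayes formula and decision rule above can be sketched numerically. The priors P(ω1) = 2/3 and P(ω2) = 1/3 come from the slides; the Gaussian class-conditional densities and their parameters are illustrative assumptions, not the slides' actual data.

```python
# Sketch of the two-class Bayes decision rule with assumed Gaussian
# class-conditional densities over lightness and the priors 2/3 and 1/3.
import math

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def posteriors(x, priors=(2/3, 1/3), params=((8.0, 1.5), (4.0, 1.0))):
    """Return (P(w1|x), P(w2|x)) via Bayes formula; params are assumed (mu, sigma)."""
    likelihoods = [gaussian(x, mu, s) for mu, s in params]
    evidence = sum(l * p for l, p in zip(likelihoods, priors))  # p(x), the scale factor
    return tuple(l * p / evidence for l, p in zip(likelihoods, priors))

def decide(x):
    p1, p2 = posteriors(x)
    label = "w1 (sea bass)" if p1 > p2 else "w2 (salmon)"
    return label, min(p1, p2)  # P(error|x) = min of the two posteriors

print(decide(7.5))
print(decide(4.5))
```

The posteriors always sum to 1 because the evidence p(x) normalizes the products of likelihood and prior.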
Slide 27: Bayesian Classifier
- Generalization of the preceding ideas: more than one feature, more than two states of nature, actions beyond merely deciding the state of nature, and a loss function more general than the probability of error.
- Bayes decision rule: an input pattern with feature vector x is classified into the class ωk that maximizes the posterior probability, i.e., P(ωk | x) ≥ P(ωj | x) for all j ≠ k.
- The likelihoods p(x | ωk) come from training-data measurements; the prior probabilities P(ωk) are assumed known for the given population of sample patterns; p(x) is the same for all class alternatives and can be ignored as a normalization factor.
Slide 28: Loss Function
- Certain classification errors may be more costly than others: a false negative may be much more costly than a false positive. Examples: fire alarms, medical diagnosis.
- Ω = {ω1, ω2, …, ωn} is the set of possible states of nature (classes); {α1, α2, …, αn} is the set of possible decisions (actions).
- Loss function λ(αi | ωj): the loss incurred by taking action αi when the true state of nature is ωj.
- Conditional risk R(αi | x): the loss expected from taking action αi when the observed evidence is x.
Slide 29: Minimum Error Rate Classification
- Action αi is associated with class ωi, and all errors are equally costly: the decision αi is correct only if the state of nature is ωi.
- Zero-one loss (a symmetrical function): λ(αi | ωj) = 0 for i = j, and 1 for i ≠ j.
- Conditional risk: R(αi | x) = Σj λ(αi | ωj) P(ωj | x) = 1 − P(ωi | x).
- Minimizing the risk therefore means selecting the class that maximizes the posterior probability, as suggested by the Bayes decision rule: minimum error rate classification.
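The conditional-risk sum above is easy to evaluate directly. The posterior values and the asymmetric loss matrix below are illustrative assumptions used only to contrast the two regimes.

```python
# Sketch of conditional risk R(a_i|x) = sum_j lambda(a_i|w_j) * P(w_j|x),
# contrasting the zero-one loss with an assumed asymmetric loss.

def conditional_risk(loss_row, posteriors):
    """Expected loss of one action given the posteriors P(w_j|x)."""
    return sum(l * p for l, p in zip(loss_row, posteriors))

post = [0.7, 0.3]                 # illustrative P(w1|x), P(w2|x)
zero_one = [[0, 1], [1, 0]]       # loss[i][j] = lambda(a_i | w_j)
print([conditional_risk(row, post) for row in zero_one])  # [0.3, 0.7]

# With an asymmetric loss (mistaking w2 costs 10x), the minimum-risk
# action can differ from the minimum-error action.
asymmetric = [[0, 10], [1, 0]]
print([conditional_risk(row, post) for row in asymmetric])  # [3.0, 0.7]
```

Under zero-one loss, picking the action with minimum risk is exactly picking the class with maximum posterior; under the asymmetric loss the decision flips even though the posteriors are unchanged.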
Slide 30: Neyman-Pearson Criterion
- Minimize the overall risk subject to a constraint, e.g. ∫ R(αi | x) dx < constant, so that misclassification is limited to a given frequency.
- Fish example: do not misclassify more than 1% of salmon as sea bass, and subject to that, minimize the chance of classifying sea bass as salmon.
Slide 31: Discriminant Functions
- Classifier representation: functions employed for discriminating among classes, gi(x), i = 1, …, k.
- Classifier: assign x to class ωi if gi(x) ≥ gj(x) for all j ≠ i.
- Possible choices: gi(x) = P(ωi | x); gi(x) = p(x | ωi) P(ωi); gi(x) = ln p(x | ωi) + ln P(ωi).
Slide 32: Decision Regions
- Effect of the decision rule: the feature space is partitioned into decision regions.
- Decision boundaries: surfaces in feature space where ties occur among the largest discriminant functions.
Slide 33: Normal Density
- Univariate density: analytically tractable, maximum entropy, continuous-valued.
- Many processes are asymptotically Gaussian (central limit theorem): the aggregate effect of many independent random disturbances is Gaussian. Many patterns can be viewed as a prototype corrupted by a large number of random processes.
- p(x) = (1 / (√(2π) σ)) exp[ −(1/2) ((x − µ)/σ)² ], where µ is the mean (or expected value) of x and σ² is the expected squared deviation, or variance.
Slide 34: Normal Density
- Multivariate normal density in d dimensions: p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp[ −(1/2) (x − µ)^t Σ⁻¹ (x − µ) ], where x = (x1, x2, …, xd)^t is the feature vector, µ = (µ1, µ2, …, µd)^t is the mean vector, and Σ is the d×d covariance matrix, with |Σ| and Σ⁻¹ its determinant and inverse.
- Σ governs the shape of the Gaussian; (x − µ)^t Σ⁻¹ (x − µ) is the squared Mahalanobis distance.
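The multivariate density and the Mahalanobis distance can be written out explicitly for d = 2, where the inverse and determinant have closed forms. The evaluation point, mean, and covariance below are illustrative values.

```python
# Sketch of the multivariate normal density for d=2, with an explicit
# 2x2 inverse and determinant; all numeric values are illustrative.
import math

def mvn_pdf_2d(x, mu, cov):
    """Return (p(x), Mahalanobis distance) for a 2-D normal density."""
    (a, b), (c, e) = cov
    det = a * e - b * c
    inv = [[e / det, -b / det], [-c / det, a / det]]       # Sigma^{-1}
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # squared Mahalanobis distance: (x - mu)^t Sigma^{-1} (x - mu)
    maha2 = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    # (2*pi)^(d/2) = 2*pi for d=2; |Sigma|^(1/2) = sqrt(det)
    return math.exp(-0.5 * maha2) / (2 * math.pi * math.sqrt(det)), math.sqrt(maha2)

density, maha = mvn_pdf_2d([1.0, 1.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(density, maha)  # identity covariance: Mahalanobis reduces to Euclidean distance
```

With the identity covariance the Mahalanobis distance equals the Euclidean distance; a non-spherical Σ stretches the distance along the directions of small variance.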
Slide 35: Normal Density (figure)
- p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp[ −(1/2) (x − µ)^t Σ⁻¹ (x − µ) ]
Slide 36: Discriminant Functions
- Minimum error rate classification: gi(x) = ln p(x | ωi) + ln P(ωi). For normal densities, gi(x) = −(1/2)(x − µi)^t Σi⁻¹ (x − µi) − (d/2) ln 2π − (1/2) ln |Σi| + ln P(ωi).
- Case Σi = σ²I: the features are statistically independent with the same variance σ² (diagonal covariance matrix). Then gi(x) = wi^t x + wi0 with wi = µi / σ² and wi0 = −µi^t µi / (2σ²) + ln P(ωi).
- These are linear machines: the decision surfaces gi(x) = gj(x) are pieces of hyperplanes.
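The Σi = σ²I case above yields a linear machine that can be sketched directly. The class means, priors, and test point below are illustrative assumptions.

```python
# Sketch of the linear discriminant for the case Sigma_i = sigma^2 I:
# g_i(x) = w_i^t x + w_i0 with w_i = mu_i / sigma^2 and
# w_i0 = -mu_i^t mu_i / (2 sigma^2) + ln P(w_i). Values are illustrative.
import math

def linear_discriminant(x, mu, prior, sigma2=1.0):
    w = [m / sigma2 for m in mu]
    w0 = -sum(m * m for m in mu) / (2 * sigma2) + math.log(prior)
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

classes = [([0.0, 0.0], 0.5), ([4.0, 4.0], 0.5)]   # (mean, prior) per class
x = [1.0, 1.0]
scores = [linear_discriminant(x, mu, p) for mu, p in classes]
print(scores.index(max(scores)))  # with equal priors this is the nearest-mean rule
```

With equal priors the linear machine reduces to assigning x to the class with the nearest mean; unequal priors shift the separating hyperplane toward the less probable class.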
Slides 37 and 38: Discriminant Functions (figures)
Slide 39: Discriminant Functions
- Case Σi = Σ: the covariance matrices of all classes are identical but arbitrary. Then gi(x) = −(1/2)(x − µi)^t Σ⁻¹ (x − µi) + ln P(ωi).
- The hyperplane separating Ri and Rj passes through x0 = (1/2)(µi + µj) − [ln(P(ωi)/P(ωj)) / ((µi − µj)^t Σ⁻¹ (µi − µj))] (µi − µj), and in general it is not orthogonal to the line between the means.
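The point x0 in the shared-covariance case above can be computed directly. To keep the sketch dependency-free it assumes a diagonal Σ (so Σ⁻¹ is trivial); all numeric values are illustrative.

```python
# Sketch of the boundary point x_0 for the shared-covariance case, in 2D with
# a diagonal Sigma; the means, priors, and variances are illustrative.
import math

def boundary_point(mu_i, mu_j, prior_i, prior_j, sigma_diag):
    """x_0 = (mu_i+mu_j)/2 - [ln(P_i/P_j)/((mu_i-mu_j)^t Sigma^-1 (mu_i-mu_j))](mu_i-mu_j)"""
    diff = [a - b for a, b in zip(mu_i, mu_j)]
    maha2 = sum(d * d / s for d, s in zip(diff, sigma_diag))  # diagonal Sigma^{-1}
    shift = math.log(prior_i / prior_j) / maha2
    return [(a + b) / 2 - shift * d for a, b, d in zip(mu_i, mu_j, diff)]

# Equal priors: the boundary passes through the midpoint of the means.
print(boundary_point([0.0, 0.0], [4.0, 0.0], 0.5, 0.5, [1.0, 1.0]))  # [2.0, 0.0]
# Unequal priors shift x_0 away from the more probable class's mean.
print(boundary_point([0.0, 0.0], [4.0, 0.0], 0.9, 0.1, [1.0, 1.0]))
```

The log-prior ratio only translates the hyperplane along the line joining the means; the orientation is set entirely by Σ⁻¹(µi − µj).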
Slides 40 and 41: Discriminant Functions (figures)
Slide 42: Discriminant Functions
- Case Σi arbitrary: the covariance matrices differ for each category. The decision surfaces are hyperquadrics: hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, or hyperhyperboloids.
- gi(x) = x^t Wi x + wi^t x + wi0, where Wi = −(1/2) Σi⁻¹, wi = Σi⁻¹ µi, and wi0 = −(1/2) µi^t Σi⁻¹ µi − (1/2) ln |Σi| + ln P(ωi).
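The quadratic discriminant above can be sketched for two classes with diagonal (but different) 2-D covariances; the parameters and test point below are illustrative values, not taken from the slides.

```python
# Sketch of the quadratic discriminant g_i(x) = x^t W_i x + w_i^t x + w_i0
# for arbitrary covariances, restricted here to diagonal Sigma_i in 2D.
import math

def quadratic_discriminant(x, mu, sigma_diag, prior):
    inv = [1.0 / s for s in sigma_diag]                            # diagonal Sigma_i^{-1}
    quad = -0.5 * sum(iv * xi * xi for iv, xi in zip(inv, x))      # x^t W_i x
    lin = sum(iv * m * xi for iv, m, xi in zip(inv, mu, x))        # w_i^t x
    w0 = (-0.5 * sum(iv * m * m for iv, m in zip(inv, mu))
          - 0.5 * math.log(sigma_diag[0] * sigma_diag[1])          # (1/2) ln |Sigma_i|
          + math.log(prior))
    return quad + lin + w0

# Illustrative (mean, diagonal covariance, prior) per class.
params = [([3.0, 6.0], [0.5, 2.0], 0.5), ([3.0, -2.0], [2.0, 2.0], 0.5)]
x = [3.0, 2.0]
print(max(range(2), key=lambda i: quadratic_discriminant(x, *params[i])))
```

Because the quadratic terms differ between classes, the set gi(x) = gj(x) is generally a curved hyperquadric rather than a hyperplane.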
Slide 43: Discriminant Functions (figure)
Slide 44: Decision Boundary Example
- Compute the Bayes decision boundary for two Gaussian distributions with means µ1 = (3, 6)^t, µ2 = (3, −2)^t and covariances Σ1 = diag(1/2, 2), Σ2 = diag(2, 2).
- The resulting boundary is the parabola x2 = 3.514 − 1.125 x1 + 0.1875 x1².
Slide 45: Supervised Learning
- Parametric approaches: Bayesian parameter estimation and maximum-likelihood estimation.
- Estimation problem: estimate P(ωj) and p(x | ωj). This is tough in high-dimensional feature spaces with small numbers of training samples.
- Simplifying assumptions: feature independence; independently drawn samples (the i.i.d. model); assume p(x | ωj) is Gaussian, reducing the estimation problem to the parameters of normality.
Slide 46: Estimation Methods
- Bayesian estimation (MAP): the distribution parameters are random values that follow a known (e.g., Gaussian) distribution; the behavior of the training data helps revise the parameter values. Large training samples give better chances of refining the posterior probabilities (parameter peaking).
- Maximum-likelihood (ML) estimation: the parameters of the probabilistic distributions are fixed but unknown values, i.e., unknown constants identified using the training data. The best parameter estimates are those that maximize the class-conditional probabilities over the available samples.
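For the Gaussian assumption discussed above, ML estimation has a closed form: the sample mean and the (biased) sample variance maximize the likelihood. The training samples below are illustrative.

```python
# Sketch of maximum-likelihood estimation for a univariate Gaussian:
# with i.i.d. samples, the ML estimates are the sample mean and the
# sample variance with divisor n (not n-1). Data is illustrative.
samples = [4.0, 5.0, 6.0, 5.0, 5.0]

n = len(samples)
mu_hat = sum(samples) / n
var_hat = sum((x - mu_hat) ** 2 for x in samples) / n   # ML divides by n
print(mu_hat, var_hat)  # 5.0 0.4
```

The divisor n makes the variance estimate slightly biased; the Bayesian (MAP) alternative would instead combine these sample statistics with a prior over the parameters.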
Slide 47: Parametric Approaches
- Curse of dimensionality: estimate the parameters of a known distribution from a small number of samples, aided by some a priori information.
Slide 48: Non-parametric Approaches
- Parzen-window pdf estimation (KDE): estimate p(x | ωj) directly from the sample patterns.
- kn-nearest-neighbor: directly construct the decision boundary from the training data.
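A Parzen-window estimate with a Gaussian kernel can be sketched in a few lines. The samples and bandwidth below are illustrative assumptions.

```python
# Sketch of Parzen-window density estimation with a Gaussian kernel:
# p_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h). Data and bandwidth are illustrative.
import math

def parzen_estimate(x, samples, h=0.5):
    n = len(samples)
    k = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)  # Gaussian kernel
    return sum(k((x - xi) / h) for xi in samples) / (n * h)

samples = [1.0, 1.2, 0.8, 3.0, 3.1]
print(parzen_estimate(1.0, samples))  # high: near a cluster of samples
print(parzen_estimate(2.0, samples))  # low: in the gap between clusters
```

The bandwidth h plays the role of the window width: a small h gives a spiky estimate that follows individual samples, a large h oversmooths the density.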
Slide 49: Statistical Approaches (figure)
Slide 50: Nearest-Neighbor Methods
- Statistical, nearest-neighbor, or memory-based methods.
- k-nearest-neighbor: a new pattern is assigned the category held by the plurality of its k closest neighbors. A large k decreases the chance of undue influence by a noisy training pattern; a small k reduces the acuity of the method.
- The distance metric is usually Euclidean; in practice the features are scaled.
- Advantage: does not require training.
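The k-nearest-neighbor rule above can be sketched directly; note there is no training phase, only a lookup at decision time. The training points below are illustrative, reusing the sea bass/salmon labels from the running example.

```python
# Sketch of the k-nearest-neighbor rule: take the plurality label among the
# k closest training patterns under Euclidean distance. Data is illustrative.
import math
from collections import Counter

def knn_classify(x, training, k=3):
    """training is a list of (feature_vector, label) pairs; no training phase."""
    neighbors = sorted(training, key=lambda tl: math.dist(tl[0], x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

training = [([1.0, 1.0], "salmon"), ([1.2, 0.9], "salmon"),
            ([4.0, 4.2], "sea bass"), ([4.1, 3.9], "sea bass"), ([3.8, 4.0], "sea bass")]
print(knn_classify([1.1, 1.0], training))  # salmon
print(knn_classify([4.0, 4.0], training))  # sea bass
```

Because every decision scans the stored patterns, the cost of classification grows with the training set, which is the usual price paid for skipping training.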
Slide 51: Nearest-Neighbor Decision Example (figure)
Slide 52: References
- Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, 2001 (most contents)
- Nils J. Nilsson, Introduction to Machine Learning
- A. K. Jain, R. P. W. Duin, and J. Mao, "Statistical Pattern Recognition: A Review," IEEE Trans. PAMI, pp. 4-37, Jan.

Other useful references:
- Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, New York, NY
- Therrien, Decision Estimation and Classification, John Wiley & Sons, New York, NY
- Hertz, Krogh, and Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA
- Bose and Liang, Neural Network Fundamentals with Graphs, Algorithms, and Applications, McGraw-Hill, New York, NY, 1996
More informationCS 195-5: Machine Learning Problem Set 1
CS 95-5: Machine Learning Problem Set Douglas Lanman dlanman@brown.edu 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of
More informationContents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)
Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture
More informationNearest Neighbor Pattern Classification
Nearest Neighbor Pattern Classification T. M. Cover and P. E. Hart May 15, 2018 1 The Intro The nearest neighbor algorithm/rule (NN) is the simplest nonparametric decisions procedure, that assigns to unclassified
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More informationUniversity of Cambridge Engineering Part IIB Module 4F10: Statistical Pattern Processing Handout 2: Multivariate Gaussians
Engineering Part IIB: Module F Statistical Pattern Processing University of Cambridge Engineering Part IIB Module F: Statistical Pattern Processing Handout : Multivariate Gaussians. Generative Model Decision
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More informationThe Bayes classifier
The Bayes classifier Consider where is a random vector in is a random variable (depending on ) Let be a classifier with probability of error/risk given by The Bayes classifier (denoted ) is the optimal
More informationECE521 Lecture7. Logistic Regression
ECE521 Lecture7 Logistic Regression Outline Review of decision theory Logistic regression A single neuron Multi-class classification 2 Outline Decision theory is conceptually easy and computationally hard
More informationPattern Classification
Pattern Classification Introduction Parametric classifiers Semi-parametric classifiers Dimensionality reduction Significance testing 6345 Automatic Speech Recognition Semi-Parametric Classifiers 1 Semi-Parametric
More informationAdvanced statistical methods for data analysis Lecture 1
Advanced statistical methods for data analysis Lecture 1 RHUL Physics www.pp.rhul.ac.uk/~cowan Universität Mainz Klausurtagung des GK Eichtheorien exp. Tests... Bullay/Mosel 15 17 September, 2008 1 Outline
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationMachine detection of emotions: Feature Selection
Machine detection of emotions: Feature Selection Final Degree Dissertation Degree in Mathematics Leire Santos Moreno Supervisor: Raquel Justo Blanco María Inés Torres Barañano Leioa, 31 August 2016 Contents
More informationSGN-2506: Introduction to Pattern Recognition. Jussi Tohka Tampere University of Technology Institute of Signal Processing 2006
SGN-2506: Introduction to Pattern Recognition Jussi Tohka Tampere University of Technology Institute of Signal Processing 2006 September 1, 2006 ii Preface This is an English translation of the lecture
More informationComputer Vision. Pa0ern Recogni4on Concepts Part I. Luis F. Teixeira MAP- i 2012/13
Computer Vision Pa0ern Recogni4on Concepts Part I Luis F. Teixeira MAP- i 2012/13 What is it? Pa0ern Recogni4on Many defini4ons in the literature The assignment of a physical object or event to one of
More informationClustering VS Classification
MCQ Clustering VS Classification 1. What is the relation between the distance between clusters and the corresponding class discriminability? a. proportional b. inversely-proportional c. no-relation Ans:
More informationMaximum Likelihood Estimation. only training data is available to design a classifier
Introduction to Pattern Recognition [ Part 5 ] Mahdi Vasighi Introduction Bayesian Decision Theory shows that we could design an optimal classifier if we knew: P( i ) : priors p(x i ) : class-conditional
More informationMachine Learning Lecture 5
Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory
More informationTHE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2012, Mr. Ruey S. Tsay
THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2012, Mr Ruey S Tsay Lecture 9: Discrimination and Classification 1 Basic concept Discrimination is concerned with separating
More informationProbabilistic Methods in Bioinformatics. Pabitra Mitra
Probabilistic Methods in Bioinformatics Pabitra Mitra pabitra@cse.iitkgp.ernet.in Probability in Bioinformatics Classification Categorize a new object into a known class Supervised learning/predictive
More information