Efficient Approach to Pattern Recognition Based on Minimization of Misclassification Probability
American Journal of Theoretical and Applied Statistics 2015; 5(2-1). Published online November 9, 2015 (doi: 10.11648/j.ajtas.s)

Efficient Approach to Pattern Recognition Based on Minimization of Misclassification Probability

Nicholas A. Nechval (Department of Mathematics, Baltic International Academy, Riga, Latvia) and Konstantin N. Nechval (Department of Applied Mathematics, Transport and Telecommunication Institute, Riga, Latvia)

To cite this article: Nicholas A. Nechval, Konstantin N. Nechval. Efficient Approach to Pattern Recognition Based on Minimization of Misclassification Probability. American Journal of Theoretical and Applied Statistics. Special Issue: Novel Ideas for Efficient Optimization of Statistical Decisions and Predictive Inferences under Parametric Uncertainty of Underlying Models with Applications. Vol. 5, No. 2-1, 2016.

Abstract: In this paper, an efficient approach to pattern recognition (classification) is suggested. It is based on minimization of the misclassification probability and uses a transition from the high-dimensional problem (dimension p ≥ 2) to the one-dimensional problem (dimension p = 1), both in the case of two classes and in the case of several classes, with the classes separated as much as possible. The misclassification probability, which is known as the error rate, is also used to judge the ability of various pattern recognition (classification) procedures to predict group membership. The approach does not require the arbitrary selection of priors as in the Bayesian classifier, and it represents a novel pattern recognition (classification) procedure that allows one to take into account cases which are not adequate for Fisher's classification rule (i.e., the distributions of the classes are not multivariate normal, or their covariance matrices are different, or there are strong multi-nonlinearities).
Moreover, it also allows one to classify a set of multivariate observations in which every observation belongs to the same unknown class. For cases which are adequate for Fisher's classification rule, the proposed approach gives results similar to those of Fisher's rule. For illustration, practical examples are given.

Keywords: Pattern Recognition, Classification, Misclassification Probability, Minimization

1. Introduction

The aim of pattern recognition is to classify data (patterns) based on either a priori knowledge or on statistical information extracted from the patterns [1, 2]. It provides solutions to problems ranging from speech recognition, face recognition and classification of handwritten characters to medical diagnosis. Application areas of pattern recognition include bioinformatics, document classification, image analysis, data mining, industrial automation, biometric recognition, remote sensing, handwritten text analysis, medical diagnosis, speech recognition, statistics, diagnostics, computer science, biology and many more.

Fisher's linear discriminant rule (FLDR) is the most widely used classification rule [3-9 and references therein]. Among the reasons for this are its simplicity and the absence of strict assumptions. In its original form, as proposed by Fisher, the method assumes equality of the population covariance matrices but does not explicitly require multivariate normality. However, optimal classification performance of Fisher's discriminant function can only be expected when multivariate normality is present as well, since only good discrimination can ensure good allocation.
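The optimality caveat above can be made concrete. For two univariate normal class densities, the error-rate-minimizing cutoff coincides with Fisher's midpoint (μ1 + μ2)/2 only when the variances are equal; in general it is the root of f1(h) = f2(h) that lies between the two means. A minimal sketch in plain Python (the function names are ours, assuming fitted parameters with μ1 < μ2):

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of the normal(mu, sigma) distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def misclassification_probability(h, mu1, sigma1, mu2, sigma2):
    """Error rate of cutoff h: P(X >= h | class 1) + P(X < h | class 2),
    for class densities normal(mu1, sigma1) and normal(mu2, sigma2), mu1 < mu2."""
    return (1.0 - norm_cdf(h, mu1, sigma1)) + norm_cdf(h, mu2, sigma2)

def optimal_threshold(mu1, sigma1, mu2, sigma2):
    """Cutoff minimizing the misclassification probability.

    Setting the derivative to zero gives f1(h) = f2(h); taking logs turns
    this into a quadratic in h.  With equal variances it degenerates to the
    midpoint (mu1 + mu2) / 2."""
    if math.isclose(sigma1, sigma2):
        return 0.5 * (mu1 + mu2)
    a = 1.0 / sigma1**2 - 1.0 / sigma2**2
    b = -2.0 * (mu1 / sigma1**2 - mu2 / sigma2**2)
    c = mu1**2 / sigma1**2 - mu2**2 / sigma2**2 - 2.0 * math.log(sigma2 / sigma1)
    disc = math.sqrt(b * b - 4.0 * a * c)
    # Of the two crossing points of f1 and f2, the minimizer is the one
    # between the means (assumes the densities cross there, i.e. the
    # classes are not pathologically overlapped).
    return next(r for r in ((-b - disc) / (2 * a), (-b + disc) / (2 * a))
                if mu1 < r < mu2)
```

For example, with class densities N(0, 1) and N(4, 2²) the midpoint 2.0 is not optimal: the minimizing cutoff is pulled toward the narrower class, and its error rate is strictly smaller than the midpoint's.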
In practice, we often need to analyze input data samples which are not adequate for Fisher's classification rule: the distributions of the groups (classes, populations) are not multivariate normal, or their covariance matrices are different, or there are strong multi-nonlinearities. In this paper, an efficient approach to pattern recognition (classification) is proposed. It is based on minimization of the misclassification probability. The approach does not require the arbitrary selection of priors as in the Bayesian classifier, and it represents a novel procedure that allows one to analyze input data samples which are not adequate for Fisher's pattern classification rule. For the
cases which are adequate for Fisher's classification rule, the proposed approach gives results similar to those of Fisher's rule. Moreover, it also allows one to classify a set of multivariate observations in which every observation belongs to the same class. The approach uses a transition from the high-dimensional problem (dimension p ≥ 2) to the one-dimensional problem (dimension p = 1), both in the case of two classes and in the case of several classes, with the classes separated as much as possible. The misclassification probability, which is known as the error rate, is also used to judge the ability of various pattern recognition (classification) procedures to predict group membership.

2. Approach to Pattern Recognition

2.1. Approach to Pattern Recognition (Classification) in the Case of Two Classes

Let

Y_1 = (y_1(1), ..., y_1(n_1)),  Y_2 = (y_2(1), ..., y_2(n_2))   (1)

be samples of observed vectors of attributes of objects from two different classes C_1 and C_2, respectively. In this case, the proposed approach to pattern recognition (classification) is as follows.

Step 1 (Transition from the high-dimensional problem (dimension p ≥ 2) to the one-dimensional problem (dimension p = 1)). At this step, the transition from the high-dimensional problem to the one-dimensional problem is carried out by a suitable transformation of the multivariate observations y = [y_1, ..., y_p]' to univariate observations X, with the classes separated as much as possible, in order to obtain an input object allocation that is optimal in the sense of minimizing, on average, the number of incorrect assignments. Then the separation threshold h, which minimizes the misclassification probability of the input observation,

P_mis = ∫_{R_2} f_1(x) dx + ∫_{R_1} f_2(x) dx,   (2)

is determined (see Fig. 1), where f_j(x) represents the probability density function (pdf) of a transformed observation X = X(y) of y from class C_j, j ∈ {1, 2}, and R_1 = {x: x < h}, R_2 = {x: x ≥ h} are the allocation regions of classes C_1 and C_2.

Figure 1. Misclassification probability.

Step 2 (Pattern recognition (classification) via the separation threshold h). At this step, pattern recognition (classification) of the observation y is carried out as follows:

assign y to C_1 if X(y) < h;  assign y to C_2 if X(y) > h.   (3)

Remark 1. The recognition (classification) rule (3) can be rewritten as follows: assign y to the class C_j, j ∈ {1, 2}, for which f_j(x) is largest.

2.2. Approach to Pattern Recognition (Classification) in the Case of Several Classes

Let

Y_1 = (y_1(1), ..., y_1(n_1)), ..., Y_m = (y_m(1), ..., y_m(n_m))   (4)

be samples of observed vectors of objects from several different classes C_1, ..., C_m, respectively. In this case, the proposed approach to pattern recognition (classification) is as follows.

Step 1 (Transition from the high-dimensional problem (dimension p ≥ 2) to the one-dimensional problem (dimension p = 1)). At this step, a transition from the p-dimensional problem to a g-dimensional problem is first carried out by a suitable transformation of the multivariate observations y = (y_1, ..., y_p)' to the multivariate observations Z = (Z_1, ..., Z_g)', where g must be no bigger [2] than

g = min(m − 1, p).   (5)

If g > 1, a transition from the g-dimensional problem to the one-dimensional problem is carried out by a suitable transformation of the multivariate observations Z = (Z_1, ..., Z_g)' to univariate observations X, with the classes separated as much as possible, in order to obtain an input object allocation that is optimal in the sense of minimizing, on average, the number of incorrect assignments. Then the separation threshold h_kl, which minimizes the misclassification probability of the input observation for classes C_k and C_l,

P_mis = ∫_{R_l} f_k(x) dx + ∫_{R_k} f_l(x) dx,  k, l ∈ {1, ..., m}, k ≠ l,   (6)

is determined (pairwise), where f_k(x) represents the pdf of a transformed observation X(y) from class C_k.

Step 2 (Pattern recognition (classification) via the separation thresholds h_kl; k, l ∈ {1, ..., m}, k ≠ l). At this step, pattern recognition (classification) of the observation y is
carried out as follows:

assign y to class C_k if X(y) < h_kl for all l ≠ k.   (7)

Remark 2. The recognition (classification) rule (7) can be rewritten as follows:

assign y to class C_k if f_k(x) > f_l(x) for all l ≠ k.   (8)

3. Practical Examples

3.1. Example 1

Suppose we wish to classify some product (input vector y = [y_1, y_2]') into one of two classes of quality (C_1 and C_2) of this product. The data samples of observed vectors of attributes of product quality from the two different classes C_1 and C_2, respectively, are given in Table 1.

Table 1. Product quality attributes data (for each class: index i and quality attributes y_1(i), y_2(i)).

A linear discriminant procedure will not successfully separate the two classes.

Step 1. For the transition from the high-dimensional problem (p = 2) to the one-dimensional problem (p = 1), the transformations y = [y_1, y_2]' → Z = [Z_1, Z_2, Z_3]' → X are used, where

Z_1 = (y_1 − a)², Z_2 = (y_2 − b)², Z_3 = (y_1 − a)(y_2 − b), with a = Σ_i y_1(i)/n = 6.98, b = Σ_i y_2(i)/n = 5.3,   (9)

X = U'Z = [5.74, 8.99, .089] Z.   (10)

Using the Anderson-Darling goodness-of-fit test for normality (significance level α = 0.05), it was found that the transformed samples are adequately described by the normal densities

f_1(x) = (1/(√(2π) σ_1)) exp(−(x − μ_1)²/(2σ_1²)), with μ_1 = 4.48, σ_1 = 9.85,   (11)-(12)

f_2(x) = (1/(√(2π) σ_2)) exp(−(x − μ_2)²/(2σ_2²)), with μ_2 = 09.39.   (13)-(14)

It follows from (2) that the minimizing separation threshold is

h = 38.49,   (15)

while Fisher's threshold is

h_Fisher = (μ_1 + μ_2)/2 = 6.739.   (16)

Indexes. Thus, the index of relative efficiency of the Fisher approach as compared with the proposed approach is

I_rel.eff(h_Fisher, h) = P_mis(h) / P_mis(h_Fisher),   (18)

and the index of reduction percentage in the misclassification probability for the proposed approach as compared with the Fisher approach is given by

I_red.per(h, h_Fisher) = (1 − I_rel.eff(h_Fisher, h)) × 100% = 6.8%.   (19)

Figure 2. Pictorial representation of the data of Table 1, which are not adequate for Fisher's classification rule.
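The transformation used in Step 1 of this example is easy to reproduce in outline. In the sketch below (Python/NumPy), synthetic ring-shaped data stand in for Table 1, whose values are garbled in this copy, and a Fisher direction computed in Z-space stands in for the printed vector U; the point is that the quadratic features (y_1 − a)², (y_2 − b)², (y_1 − a)(y_2 − b) turn two classes with almost total linear overlap into two cleanly separated univariate samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80

# Stand-in for Table 1: class 1 is an inner cluster, class 2 a surrounding
# ring, so every linear projection of (y1, y2) overlaps almost totally.
r1, r2 = rng.normal(1.0, 0.3, n), rng.normal(4.0, 0.3, n)
t1, t2 = rng.uniform(0.0, 2.0 * np.pi, (2, n))
Y1 = np.c_[r1 * np.cos(t1), r1 * np.sin(t1)]
Y2 = np.c_[r2 * np.cos(t2), r2 * np.sin(t2)]

# Quadratic feature map of Example 1, centred at the pooled means (a, b):
# Z = [(y1 - a)^2, (y2 - b)^2, (y1 - a)(y2 - b)].
a, b = np.vstack([Y1, Y2]).mean(axis=0)

def to_z(Y):
    u, v = Y[:, 0] - a, Y[:, 1] - b
    return np.c_[u * u, v * v, u * v]

Z1, Z2 = to_z(Y1), to_z(Y2)

# One projecting direction U in Z-space (here the Fisher direction from the
# pooled covariance; the paper's numeric U is not legible in this copy).
S = ((Z1 - Z1.mean(0)).T @ (Z1 - Z1.mean(0))
     + (Z2 - Z2.mean(0)).T @ (Z2 - Z2.mean(0))) / (2 * n - 2)
U = np.linalg.solve(S, Z1.mean(0) - Z2.mean(0))

# Univariate observations X = U'Z: the two projected samples barely overlap.
X1, X2 = Z1 @ U, Z2 @ U
```

In the plane no straight line separates the two samples, yet a single threshold on X now classifies almost every point correctly, which is exactly the effect the transformation y → Z → X is designed to achieve.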
A pictorial representation of the above data, which are not adequate for Fisher's classification rule, is given in Fig. 2. If the points are projected in any direction onto a straight line, there will be almost total overlap.

3.2. Example 2

This example is adapted from a study [10] concerned with the detection of hemophilia A carriers. To construct a procedure for detecting potential hemophilia A carriers, blood samples were assayed for two groups of women, with measurements on the two variables y_1 = log10(AHF activity)
and y_2 = log10(AHF antigen) recorded. ("AHF" denotes antihemophilic factor.) The first group of n_1 women was selected from known hemophilia A carriers; this group was called the obligatory carriers. The second group of n_2 women was selected from a population of women who did not carry the hemophilia gene; this group was called the normal group. The pairs of observations (y_1, y_2) for the two groups are given in Table 2 and plotted in Fig. 3, together with estimated contours containing 50% and 95% of the probability for bivariate normal distributions centered at the two group means.

Table 2. Hemophilia data (for each group: index i and observations y_1(i) = log10(AHF activity), y_2(i) = log10(AHF antigen)).

Step 1. For the transition from the high-dimensional problem (p = 2) to the one-dimensional problem (p = 1), the following transformation is used: y = [y_1, y_2]' → X = U'y, where

U = S⁻¹(ȳ_1 − ȳ_2),   (20)

S = [Σ_{j=1,2} Σ_{i=1}^{n_j} (y_j(i) − ȳ_j)(y_j(i) − ȳ_j)'] / (n_1 + n_2 − 2),   (21)

ȳ_j = Σ_{i=1}^{n_j} y_j(i) / n_j,  j = 1, 2.   (22)

Using the Anderson-Darling goodness-of-fit test for normality (significance level α = 0.05), it was found that the projected observations in each group are adequately described by the normal densities

f_1(x) = (1/(√(2π) σ_1)) exp(−(x − μ_1)²/(2σ_1²)),   (23)-(24)

f_2(x) = (1/(√(2π) σ_2)) exp(−(x − μ_2)²/(2σ_2²)),   (25)-(26)

with parameters (μ_1, σ_1) and (μ_2, σ_2), respectively, estimated from the projected samples.   (27)

Theorem 1. If

f_j(x), j = 1, 2,

are the probability density functions of normal distributions with the parameters (μ_1, σ_1) and (μ_2, σ_2), respectively, where μ_1 < μ_2, then the necessary and sufficient conditions for

P_mis(h) = ∫_h^∞ f_1(x) dx + ∫_{−∞}^h f_2(x) dx   (28)

to have a unique minimum are: (i) the necessary condition for h to be a minimum point of (28) is given by

f_1(h) = f_2(h);   (29)

(ii) the sufficient condition for h to be a minimum point of (28) is given by

μ_1 < h < μ_2.   (30)

Figure 3. Scatter plots of [log10(AHF activity), log10(AHF antigen)] for the normal group and obligatory hemophilia A carriers.
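The computations of this example can be sketched end to end: form the pooled covariance S and the discriminant vector U = S⁻¹(ȳ_1 − ȳ_2), project both samples, fit a normal density to each projected sample, and compare Fisher's midpoint threshold with the threshold minimizing the misclassification probability. In the sketch below, synthetic bivariate groups with unequal covariances stand in for the hemophilia measurements (which are not reproduced in this copy), a grid search stands in for solving f_1(h) = f_2(h) exactly, and all names are ours:

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_mis(h, mu1, s1, mu2, s2):
    """Misclassification probability of cutoff h (mu1 < mu2 assumed)."""
    return (1.0 - Phi((h - mu1) / s1)) + Phi((h - mu2) / s2)

rng = np.random.default_rng(1)
# Stand-in for Table 2: two bivariate normal groups with unequal scatter.
Y1 = rng.multivariate_normal([-0.30, 0.02], [[0.020, 0.010], [0.010, 0.025]], 23)
Y2 = rng.multivariate_normal([-0.13, -0.08], [[0.008, 0.004], [0.004, 0.008]], 29)

# Pooled covariance S and discriminant vector U = S^{-1}(mean1 - mean2).
m1, m2 = Y1.mean(0), Y2.mean(0)
S = ((Y1 - m1).T @ (Y1 - m1) + (Y2 - m2).T @ (Y2 - m2)) / (len(Y1) + len(Y2) - 2)
U = np.linalg.solve(S, m1 - m2)

X1, X2 = Y1 @ U, Y2 @ U                  # univariate projections X = U'y
if X1.mean() > X2.mean():                # orient so that mu1 < mu2
    U, X1, X2 = -U, -X1, -X2
mu1, s1 = X1.mean(), X1.std(ddof=1)
mu2, s2 = X2.mean(), X2.std(ddof=1)

# Fisher's midpoint threshold versus the minimizing threshold (grid search
# over the interval (mu1, mu2), where the theorem locates the minimum).
h_fisher = 0.5 * (mu1 + mu2)
grid = np.linspace(mu1, mu2, 20001)
h_opt = grid[np.argmin([p_mis(h, mu1, s1, mu2, s2) for h in grid])]

# Index of relative efficiency of the Fisher threshold (<= 1 by construction).
I_rel = p_mis(h_opt, mu1, s1, mu2, s2) / p_mis(h_fisher, mu1, s1, mu2, s2)
```

A new observation y is then assigned to the first group when U'y < h_opt and to the second otherwise; with equal projected variances h_opt collapses to h_fisher and I_rel to 1.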
Proof. The proof, being straightforward, is omitted here.

Corollary 1. If σ_1 = σ_2, then the separation threshold h is determined as

h = (μ_1 + μ_2)/2,   (31)

i.e., in this case we deal with Fisher's separation threshold.

It follows from (28) that the minimizing threshold h and its misclassification probability P_mis(h) are obtained for these data (32), while Fisher's threshold is

h_Fisher = (μ_1 + μ_2)/2 = 53.60.   (33)

Indexes. Thus, the index of relative efficiency of the Fisher approach as compared with the proposed approach is

I_rel.eff(h_Fisher, h) = P_mis(h) / P_mis(h_Fisher),   (34)

and the index of reduction percentage in the misclassification probability for the proposed approach as compared with the Fisher approach is given by

I_red.per(h, h_Fisher) = (1 − I_rel.eff(h_Fisher, h)) × 100% = 33.5%.   (35)

Step 2 (Pattern recognition (classification) via the separation threshold h). At this step, pattern recognition (classification) of an observation y is carried out as follows:

assign y to C_1 if X(y) = U'y < h;  assign y to C_2 if X(y) = U'y > h.   (36)

For instance, suppose measurements of AHF activity and AHF antigen are obtained for a woman who may be a hemophilia A carrier. Should this woman be classified as C_1 (obligatory carrier) or C_2 (normal)? Using Fisher's classification rule, we obtain

X(y) = U'y = 88.69 < h_Fisher.   (37)

Using the proposed approach based on minimization of the misclassification probability, we obtain

X(y) = U'y = 88.69 < h.   (38)

Applying either (37) or (38), we classify the woman as C_1, an obligatory carrier. Thus, Fisher's approach and the proposed one give the same result in this case. It will be noted, however, that for one of the observations of Table 2 one obtains X(y) > h_Fisher, so that Fisher's classification rule gives an incorrect classification in that case.

4. Conclusion

The approach proposed in this paper represents an innovative pattern recognition (classification) procedure based on minimization of the misclassification probability. It allows one to take into account cases which are not adequate for Fisher's classification rule.
Moreover, the procedure also allows one to classify a set of multivariate observations in which every observation belongs to the same unknown class. This approach has been motivated by classification problems that appear in various application areas of pattern recognition.

References

[1] R. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, 1936.

[2] K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis. Academic Press, 1979.

[3] N. A. Nechval, K. N. Nechval, and M. Purgailis, "Statistical pattern recognition principles," in International Encyclopedia of Statistical Science, Miodrag Lovric, Ed. Berlin, Heidelberg: Springer-Verlag, 2011.

[4] N. A. Nechval, K. N. Nechval, M. Purgailis, V. F. Strelchonok, G. Berzins, and M. Moldovan, "New approach to pattern recognition via comparison of maximum separations," Computer Modelling and New Technologies, vol. 15, 2011.

[5] N. A. Nechval, K. N. Nechval, V. Danovich, and G. Berzins, "Distance-based approaches to pattern recognition via embedding," in Lecture Notes in Engineering and Computer Science: Proceedings of The World Congress on Engineering 2014, 2-4 July 2014, London, U.K.

[6] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.

[7] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press, 2004.

[8] A. C. Rencher, Methods of Multivariate Analysis, 2nd ed. John Wiley & Sons, 2002.

[9] S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd ed. Singapore: Elsevier Ltd., 2006.

[10] B. N. Bouma et al., "Evaluation of the detection rate of hemophilia carriers," Statistical Methods for Clinical Decision Making, vol. 7, 1975.
More informationClustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation
Clustering by Mixture Models General bacground on clustering Example method: -means Mixture model based clustering Model estimation 1 Clustering A basic tool in data mining/pattern recognition: Divide
More informationA NOVEL OPTIMAL PROBABILITY DENSITY FUNCTION TRACKING FILTER DESIGN 1
A NOVEL OPTIMAL PROBABILITY DENSITY FUNCTION TRACKING FILTER DESIGN 1 Jinglin Zhou Hong Wang, Donghua Zhou Department of Automation, Tsinghua University, Beijing 100084, P. R. China Control Systems Centre,
More informationLecture 3: Pattern Classification
EE E6820: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 1 2 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mixtures
More informationParametric Techniques
Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure
More informationProbability Models for Bayesian Recognition
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIAG / osig Second Semester 06/07 Lesson 9 0 arch 07 Probability odels for Bayesian Recognition Notation... Supervised Learning for Bayesian
More informationCLASSICAL NORMAL-BASED DISCRIMINANT ANALYSIS
CLASSICAL NORMAL-BASED DISCRIMINANT ANALYSIS EECS 833, March 006 Geoff Bohling Assistant Scientist Kansas Geological Survey geoff@gs.u.edu 864-093 Overheads and resources available at http://people.u.edu/~gbohling/eecs833
More informationp(x ω i 0.4 ω 2 ω
p(x ω i ).4 ω.3.. 9 3 4 5 x FIGURE.. Hypothetical class-conditional probability density functions show the probability density of measuring a particular feature value x given the pattern is in category
More informationConsistent Bivariate Distribution
A Characterization of the Normal Conditional Distributions MATSUNO 79 Therefore, the function ( ) = G( : a/(1 b2)) = N(0, a/(1 b2)) is a solu- tion for the integral equation (10). The constant times of
More informationShort Note: Naive Bayes Classifiers and Permanence of Ratios
Short Note: Naive Bayes Classifiers and Permanence of Ratios Julián M. Ortiz (jmo1@ualberta.ca) Department of Civil & Environmental Engineering University of Alberta Abstract The assumption of permanence
More informationAlgorithms for Classification: The Basic Methods
Algorithms for Classification: The Basic Methods Outline Simplicity first: 1R Naïve Bayes 2 Classification Task: Given a set of pre-classified examples, build a model or classifier to classify new cases.
More informationMachine learning for pervasive systems Classification in high-dimensional spaces
Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version
More informationOverriding the Experts: A Stacking Method For Combining Marginal Classifiers
From: FLAIRS-00 Proceedings. Copyright 2000, AAAI (www.aaai.org). All rights reserved. Overriding the Experts: A Stacking ethod For Combining arginal Classifiers ark D. Happel and Peter ock Department
More informationMulti-Class Linear Dimension Reduction by. Weighted Pairwise Fisher Criteria
Multi-Class Linear Dimension Reduction by Weighted Pairwise Fisher Criteria M. Loog 1,R.P.W.Duin 2,andR.Haeb-Umbach 3 1 Image Sciences Institute University Medical Center Utrecht P.O. Box 85500 3508 GA
More informationThree-group ROC predictive analysis for ordinal outcomes
Three-group ROC predictive analysis for ordinal outcomes Tahani Coolen-Maturi Durham University Business School Durham University, UK tahani.maturi@durham.ac.uk June 26, 2016 Abstract Measuring the accuracy
More informationLecture 4 Discriminant Analysis, k-nearest Neighbors
Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se
More informationContents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)
Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture
More informationSupport Vector Machines (SVM) in bioinformatics. Day 1: Introduction to SVM
1 Support Vector Machines (SVM) in bioinformatics Day 1: Introduction to SVM Jean-Philippe Vert Bioinformatics Center, Kyoto University, Japan Jean-Philippe.Vert@mines.org Human Genome Center, University
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationA Novel PCA-Based Bayes Classifier and Face Analysis
A Novel PCA-Based Bayes Classifier and Face Analysis Zhong Jin 1,2, Franck Davoine 3,ZhenLou 2, and Jingyu Yang 2 1 Centre de Visió per Computador, Universitat Autònoma de Barcelona, Barcelona, Spain zhong.in@cvc.uab.es
More informationISSN X On misclassification probabilities of linear and quadratic classifiers
Afrika Statistika Vol. 111, 016, pages 943 953. DOI: http://dx.doi.org/10.1699/as/016.943.85 Afrika Statistika ISSN 316-090X On misclassification probabilities of linear and quadratic classifiers Olusola
More informationIntroduction. Chapter 1
Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics
More information12 Discriminant Analysis
12 Discriminant Analysis Discriminant analysis is used in situations where the clusters are known a priori. The aim of discriminant analysis is to classify an observation, or several observations, into
More informationLearning Kernel Parameters by using Class Separability Measure
Learning Kernel Parameters by using Class Separability Measure Lei Wang, Kap Luk Chan School of Electrical and Electronic Engineering Nanyang Technological University Singapore, 3979 E-mail: P 3733@ntu.edu.sg,eklchan@ntu.edu.sg
More informationMultivariate Bayesian Linear Regression MLAI Lecture 11
Multivariate Bayesian Linear Regression MLAI Lecture 11 Neil D. Lawrence Department of Computer Science Sheffield University 21st October 2012 Outline Univariate Bayesian Linear Regression Multivariate
More informationLinear Classification and SVM. Dr. Xin Zhang
Linear Classification and SVM Dr. Xin Zhang Email: eexinzhang@scut.edu.cn What is linear classification? Classification is intrinsically non-linear It puts non-identical things in the same class, so a
More informationECE662: Pattern Recognition and Decision Making Processes: HW TWO
ECE662: Pattern Recognition and Decision Making Processes: HW TWO Purdue University Department of Electrical and Computer Engineering West Lafayette, INDIANA, USA Abstract. In this report experiments are
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationThe Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space
The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space Robert Jenssen, Deniz Erdogmus 2, Jose Principe 2, Torbjørn Eltoft Department of Physics, University of Tromsø, Norway
More informationFrom dummy regression to prior probabilities in PLS-DA
JOURNAL OF CHEMOMETRICS J. Chemometrics (2007) Published online in Wiley InterScience (www.interscience.wiley.com).1061 From dummy regression to prior probabilities in PLS-DA Ulf G. Indahl 1,3, Harald
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationDiscrete Mathematics and Probability Theory Fall 2015 Lecture 21
CS 70 Discrete Mathematics and Probability Theory Fall 205 Lecture 2 Inference In this note we revisit the problem of inference: Given some data or observations from the world, what can we infer about
More informationLecture 3: Pattern Classification. Pattern classification
EE E68: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mitures and
More informationBayesian Inference for the Multivariate Normal
Bayesian Inference for the Multivariate Normal Will Penny Wellcome Trust Centre for Neuroimaging, University College, London WC1N 3BG, UK. November 28, 2014 Abstract Bayesian inference for the multivariate
More informationUniversity of Cambridge Engineering Part IIB Module 4F10: Statistical Pattern Processing Handout 2: Multivariate Gaussians
Engineering Part IIB: Module F Statistical Pattern Processing University of Cambridge Engineering Part IIB Module F: Statistical Pattern Processing Handout : Multivariate Gaussians. Generative Model Decision
More informationClassification via kernel regression based on univariate product density estimators
Classification via kernel regression based on univariate product density estimators Bezza Hafidi 1, Abdelkarim Merbouha 2, and Abdallah Mkhadri 1 1 Department of Mathematics, Cadi Ayyad University, BP
More informationStatistical and Learning Techniques in Computer Vision Lecture 1: Random Variables Jens Rittscher and Chuck Stewart
Statistical and Learning Techniques in Computer Vision Lecture 1: Random Variables Jens Rittscher and Chuck Stewart 1 Motivation Imaging is a stochastic process: If we take all the different sources of
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationOn Improving the k-means Algorithm to Classify Unclassified Patterns
On Improving the k-means Algorithm to Classify Unclassified Patterns Mohamed M. Rizk 1, Safar Mohamed Safar Alghamdi 2 1 Mathematics & Statistics Department, Faculty of Science, Taif University, Taif,
More informationUnsupervised Learning Methods
Structural Health Monitoring Using Statistical Pattern Recognition Unsupervised Learning Methods Keith Worden and Graeme Manson Presented by Keith Worden The Structural Health Monitoring Process 1. Operational
More information