Regularized Discriminant Analysis for Face Recognition

Itzik Pima, Mayer Aladjem

Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, P.O.Box 653, Beer-Sheva, 84105, Israel.

Abstract

This paper studies Regularized Discriminant Analysis (RDA) in the context of face recognition. We check RDA sensitivity to different photometric preprocessing methods and compare its performance to other classifiers. Our study shows that RDA is better able to extract the relevant discriminatory information from training data than the other classifiers tested, thus obtaining a lower error rate. Moreover, RDA is robust under various lighting conditions while the other classifiers perform badly when no photometric method is applied.

Keywords: face recognition, feature extraction, regularization, principal component analysis, discriminant analysis, photometric preprocessing.

1. Introduction

This study compares the performance of Regularized Discriminant Analysis (RDA) [2] with that of two classifiers usually used for face recognition: L2 (Euclidean distance) and angle (normalized correlation). By L2 and angle we mean a nearest center classifier using the corresponding distance metric. The potential of extracting the relevant discriminatory information from a small amount of training data using RDA motivated us to explore RDA in the context of face recognition. We would like to state here that this is the first application of RDA [2] to face recognition.

In order to study the efficacy of RDA for face recognition we have designed experiments in which a small number of faces are represented in both Principal Component Analysis (PCA) features using Eigenfaces [1] and Linear Discriminant Analysis (LDA) features using Fisherfaces [4]. Our goal was to study RDA's behavior with two completely different data types: one obtained by PCA, which is simply data compression, and the other obtained by LDA, which yields highly separated data. Moreover, we also studied the effect of image photometric preprocessing methods on the performance of the classifiers.

The paper is organized as follows. In the next section we explain the application of RDA to face classification. Section 3 introduces the face database used for this work and describes the experiments carried out and their objectives. Finally, in Section 4, conclusions are drawn from the results obtained.

2. RDA Face Classification

Assume that we have training images sliced into column vectors $z_j^{(i)}$ ($1 \le j \le n_i$) for $i = 1, 2, \dots, g$. Each $z_j^{(i)}$ belongs to one of the $g$ classes $C_1, C_2, \dots, C_g$, where $z_j^{(i)}$ is an image taken from class $C_i$. Here, $n_i$ is the number of images from class $C_i$. The dimension of $z_j^{(i)}$ is $P = N \times M$ (the number of pixels in the face image). In this work we use the linear mapping

$$x_j^{(i)} = V^T z_j^{(i)},$$

where $V$ is a $P \times S$ transformation matrix for $S \le P$ (usually $S \ll P$). The $S$-dimensional vector $x_j^{(i)}$ is named the feature vector. We compute $V$ using PCA (Eigenfaces [1]) and LDA (Fisherfaces [4]). Then we use the PCA and LDA features as input of the RDA classifier.

RDA [2] is a modification of Quadratic Discriminant Analysis (QDA) [5]. QDA assigns an arbitrary face represented by feature vector $x$ to class $i$ if $D_m(x) > D_i(x)$ for all $1 \le m \le g$, $m \ne i$, where

$$D_i(x) = (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) + \ln|\Sigma_i| - 2\ln P(i).$$

Here $P(i)$, $\mu_i$ and $\Sigma_i$ are ML estimators [5] of the a-priori probability, the mean and the covariance matrix of the feature vectors from class $i$.

A problem with the QDA classifier occurs when the class sample sizes $n_i$ are small compared with the dimension $S$ of the feature space. In this case, the covariance matrix estimates become highly variable. In order to solve this problem, in RDA [2] the class-conditional covariance matrices $\Sigma_i$ in $D_i(x)$ are replaced by a regularized estimate $\Sigma_i(\lambda, \gamma)$. Following Friedman [2], we first compute the pooled (within-class) sample covariance matrix

$$\Sigma = \sum_{i=1}^{g} P(i)\,\Sigma_i.$$

Then, using a regularization parameter $\lambda$, we get

$$\Sigma_i(\lambda) = \frac{(1-\lambda)\,P(i)\,\Sigma_i + \lambda\,\Sigma}{(1-\lambda)\,P(i) + \lambda}, \qquad 0 \le \lambda \le 1.$$

Finally, using another regularization parameter $\gamma$, we have

$$\Sigma_i(\lambda, \gamma) = (1-\gamma)\,\Sigma_i(\lambda) + \frac{\gamma}{S}\,\mathrm{trace}\left[\Sigma_i(\lambda)\right] I, \qquad 0 \le \gamma \le 1.$$

The parameter $\lambda$ converts the class covariance matrix $\Sigma_i(\lambda)$ to a linear combination of $\Sigma_i$ and $\Sigma$. The second parameter, $\gamma$, shrinks $\Sigma_i(\lambda)$ toward a multiple of the identity matrix.

The suitable values of $\lambda$ and $\gamma$ are determined by the model selection procedure [2]. This procedure sets a 2-dimensional grid of points on the $\lambda$, $\gamma$ plane ($0 \le \lambda \le 1$, $0 \le \gamma \le 1$), evaluates the cross-validated estimate of misclassification risk at each prescribed point on the grid, and then chooses the point with the smallest estimated risk as the suitable values of the regularization parameters $\lambda$, $\gamma$. In our experiments we set the values $\lambda, \gamma$ = .1, .25, .5, .75, 1 and applied the leave-one-out cross-validation procedure [5].

Note that for $\lambda = 1$ and $\gamma$ = .1 we get $\Sigma_i(\lambda, \gamma) \approx \Sigma$ and we carry out linear discriminant classification [5]. For $\lambda = 1$ and $\gamma = 1$, RDA corresponds to the L2 classifier. Holding $\gamma = 0$ and varying $\lambda$ produces classifiers between QDA and linear discriminant classification.

3. Experimental Study

All of our experiments are based on the Olivetti Research Laboratory (ORL) face database (retrieved from ftp://ftp.uk.research.att.com:pub/data/att_faces.tar.z ). Since we checked the performance of some photometric preprocessing methods, we changed the lighting in the database randomly.
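
As an illustration of the estimates defined in Section 2, the following is a minimal NumPy sketch of the regularized covariance $\Sigma_i(\lambda, \gamma)$, the discriminant $D_i(x)$ and the leave-one-out grid selection. It is not the authors' implementation: the function names `rda_fit`, `rda_predict` and `select_parameters`, the array layout (one feature vector per row) and the grid passed in by the caller are assumptions made for the example.

```python
import numpy as np

def rda_fit(X, y, lam, gamma):
    """Estimate P(i), mu_i and Sigma_i(lam, gamma) from features X (n x S)."""
    classes = np.unique(y)
    S = X.shape[1]
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    covs = []
    for c, mu in zip(classes, means):
        d = X[y == c] - mu
        covs.append(d.T @ d / len(d))          # ML (biased) class covariance
    covs = np.stack(covs)
    pooled = np.tensordot(priors, covs, axes=1)  # sum_i P(i) * Sigma_i
    reg = np.empty_like(covs)
    for i in range(len(classes)):
        w = (1.0 - lam) * priors[i]
        sigma_lam = (w * covs[i] + lam * pooled) / (w + lam)
        # Shrink toward a multiple of the identity matrix.
        reg[i] = (1.0 - gamma) * sigma_lam \
            + gamma * np.trace(sigma_lam) / S * np.eye(S)
    return classes, priors, means, reg

def rda_predict(X, model):
    """Assign each row of X to the class with the smallest D_i(x)."""
    classes, priors, means, reg = model
    scores = np.empty((X.shape[0], len(classes)))
    for i in range(len(classes)):
        d = X - means[i]
        # (x - mu_i)^T Sigma_i^{-1} (x - mu_i) for every row of X.
        maha = np.sum(d * np.linalg.solve(reg[i], d.T).T, axis=1)
        _, logdet = np.linalg.slogdet(reg[i])
        scores[:, i] = maha + logdet - 2.0 * np.log(priors[i])
    return classes[np.argmin(scores, axis=1)]

def select_parameters(X, y, grid):
    """Return (lam, gamma, error) with the lowest leave-one-out error."""
    idx = np.arange(len(y))
    best = None
    for lam in grid:
        for gamma in grid:
            errors = 0
            for k in idx:
                keep = idx != k
                model = rda_fit(X[keep], y[keep], lam, gamma)
                errors += int(rda_predict(X[k:k + 1], model)[0] != y[k])
            if best is None or errors < best[0]:
                best = (errors, lam, gamma)
    return best[1], best[2], best[0] / len(y)
```

The sketch assumes every $\Sigma_i(\lambda, \gamma)$ is invertible. With very small $\gamma$ and fewer samples per class than feature dimensions this may not hold, which is the situation the regularization is meant to avoid.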

The ORL database contains 400 different images of 40 distinct subjects (persons). As most researchers did, we used 5 images from every class for training and 5 images per class for testing, with a size of 48 × 48 pixels for each image. We performed a number of experiments employing different photometric normalizations, features (PCA and LDA) and decision rules (RDA, L2 and angle). Following [3] we focused on the photometric methods based on image normalization and histogram equalization. The number of LDA features runs from 3 to 39 with steps of 3 (the last step is 2), where 39 is the maximal number available for the LDA. The number of PCA features runs from [...] to 199 with steps of [...] (the last step is 9), where 199 is the maximal number available for the PCA. For every photometric preprocessing method and for every feature dimension (for PCA and LDA) we ran the RDA, L2 and angle classifiers. In Figs. 1 and 2 we show the test error rates obtained.

4. Discussion and Conclusions

It is clear from the results that RDA outperforms the L2 and angle classifiers when using the PCA feature extraction method (Fig. 1), but this phenomenon is not so obvious when using the LDA feature extraction method (Fig. 2). The best classification results are attained when using RDA with PCA (Fig. 1a) for histogram equalization when the feature dimension is between [...] and [...] (error rates are [...].5% to 11.5%). This is a remarkable feature of RDA because reports in the literature usually state that LDA is better than PCA, since LDA extracts the relevant information while PCA only compresses it.

Looking at Fig. 1, which presents the results for PCA, we found that the RDA classifier does not need any preprocessing to achieve good results. This can save precious time in demanding real-time applications.
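
For reference, the two baseline decision rules that RDA is compared against (nearest center with the L2 metric and with the angle metric) can be sketched as follows. This is a minimal illustration, not the authors' code: the names `class_centers` and `nearest_center` are hypothetical, and the angle metric is assumed here to be the cosine of the angle between a feature vector and a class center, without mean subtraction.

```python
import numpy as np

def class_centers(X, y):
    """Return the class labels and the mean feature vector of each class."""
    classes = np.unique(y)
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centers

def nearest_center(X, classes, centers, metric="l2"):
    """Assign each row of X to the closest class center."""
    if metric == "l2":
        # Squared Euclidean distance to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    elif metric == "angle":
        # Negative normalized correlation, so the smallest value is closest.
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        Cn = centers / np.linalg.norm(centers, axis=1, keepdims=True)
        dist = -(Xn @ Cn.T)
    else:
        raise ValueError("metric must be 'l2' or 'angle'")
    return classes[np.argmin(dist, axis=1)]
```

The two metrics can disagree: a vector that points in the direction of one center but is short may be closer in Euclidean distance to a different center, which is why both rules are reported separately.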

An interesting point is the non-monotonic behavior of the RDA errors using PCA features for image normalization in the dimensions [...] and for histogram equalization in the dimensions [...]. The reason for this is that the model selection procedure selected the values $\lambda$ = .1 and $\gamma$ = .1 as its suitable parameters. These values cause RDA to act as QDA (see end of Section 2 and [2]), which increases the risk of over-fitting, resulting in a large test error rate. For the higher dimensions the parameters $\lambda$, $\gamma$ are set to values other than .1, and RDA produces classifiers between QDA, LDA and L2, thus reducing the risk of over-fitting and decreasing the test error rate.

Finally, the major features of RDA are its ability to extract relevant discriminatory information and its robustness to lighting changes. Support Vector Machine (SVM) shares the same features [3]. It is interesting to compare RDA and SVM, which is an object of our future research.

References

1. M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, 3 (1), 1991, pp. 72-86.
2. J. H. Friedman, Regularized Discriminant Analysis, Journal of the American Statistical Association, 84 (405), 1989, pp. 165-175.
3. K. Jonsson, J. Kittler, Y.P. Li, J. Matas, Support Vector Machines for Face Authentication, Image and Vision Computing, 20 (5-6), 2002, pp. 369-375.
4. K. Etemad, R. Chellappa, Discriminant Analysis for Recognition of Human Face Images, Journal of Optical Society of America A, 1997, pp. 1724-1733.
5. R.O. Duda, P.E. Hart, D.J. Stork, Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 2001.

Figure 1. Classifiers' error rates with different preprocessing types (no preprocessing, image normalization, histogram equalization) versus PCA feature vector dimension: (a) RDA classifier, (b) L2 classifier, (c) angle classifier.

Figure 2. Classifiers' error rates with different preprocessing types (no preprocessing, image normalization, histogram equalization) versus LDA feature vector dimension: (a) RDA classifier, (b) L2 classifier, (c) angle classifier.
