Regularized Discriminant Analysis for Face Recognition

Itzik Pima, Mayer Aladjem
Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, P.O.Box 653, Beer-Sheva, 845, Israel.

Abstract

This paper studies Regularized Discriminant Analysis (RDA) in the context of face recognition. We check RDA sensitivity to different photometric preprocessing methods and compare its performance to other classifiers. Our study shows that RDA is better able to extract the relevant discriminatory information from training data than the other classifiers tested, thus obtaining a lower error rate. Moreover, RDA is robust under various lighting conditions while the other classifiers perform badly when no photometric method is applied.

Keywords: face recognition, feature extraction, regularization, principal component analysis, discriminant analysis, photometric preprocessing.

1. Introduction

This study compares the performance of Regularized Discriminant Analysis [2] (RDA) with that of two classifiers usually used for face recognition: L2 (Euclidean distance) and angle (Normalized Correlation). In saying L2 and angle we mean that we use a nearest center classifier using those distance metrics. The potential of extracting the relevant discriminatory information from a small amount of training data using RDA motivated us to explore RDA in the context of face recognition. We would like to state here that this is the first application of RDA [2] to face recognition.

In order to study the efficacy of the RDA for face recognition we have designed experiments in which a small number of faces are represented in both Principal Component Analysis (PCA) features using Eigenfaces [1] and Linear Discriminant Analysis (LDA) features using Fisherfaces [4]. Our goal was to study RDA's behavior with two completely different data types: one that is obtained by PCA, which is simply data compression, and the other, obtained by LDA, that yields highly separated data. Moreover, we also studied the effect of image photometric preprocessing methods on the performances of the classifiers.

The paper is organized as follows.
In the next section we explain the application of the RDA to face classification. Section 3 introduces the face database used for this work and describes the experiments carried out and their objectives. Finally, in Section 4, conclusions are drawn from the results obtained.

2. RDA Face Classification

Assume that we have training images sliced into column vectors $z_j^{(i)}$ ($1 \le j \le n_i$) for $i = 1, 2, \ldots, g$. Each $z_j^{(i)}$ belongs to one of the $g$ classes $C_1, C_2, \ldots, C_g$, where $z_j^{(i)}$ is an image taken from class $C_i$. Here, $n_i$ is the number of images from class $C_i$. The dimension of $z_j^{(i)}$ is $P = N \times M$ (the number of pixels in the face image). In this work we use the linear mapping $x_j^{(i)} = V^T z_j^{(i)}$, where $V$ is a $P \times S$ transformation matrix for $S \le P$ (usually $S \ll P$). The $S$-dimensional vector $x_j^{(i)}$ is named the feature vector. We compute $V$ using PCA (Eigenfaces [1]) and LDA (Fisherfaces [4]). Then we use the PCA and LDA features as input of the RDA classifier.

RDA [2] is a modification of Quadratic Discriminant Analysis (QDA) [5]. QDA assigns an arbitrary face represented by feature vector $x$ to class $i$ if $D_m(x) > D_i(x)$, $1 \le m, i \le g$, $m \ne i$, where

$$D_i(x) = (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) + \ln|\Sigma_i| - 2 \ln P(i).$$

Here $P(i)$, $\mu_i$ and $\Sigma_i$ are ML estimators [5] of the a-priori probability, mean and covariance matrix of the feature vectors from class $i$. A problem with the QDA classifier occurs when the class sample sizes $n_i$ are small compared with the dimension of the feature space $S$. In this case, the covariance matrix estimates become highly variable. In order to solve this problem, in RDA [2] the class-conditional covariance matrices $\Sigma_i$ in $D_i(x)$ are replaced by a regularized estimate $\Sigma_i(\lambda, \gamma)$. Following Friedman [2], we first compute the pooled (within-class) sample covariance matrix

$$\Sigma = \sum_{i=1}^{g} P(i)\, \Sigma_i.$$

Then using a regularization parameter $\lambda$ we get

$$\Sigma_i(\lambda) = \frac{(1 - \lambda) P(i)\, \Sigma_i + \lambda \Sigma}{(1 - \lambda) P(i) + \lambda}, \qquad 0 \le \lambda \le 1.$$

Finally, using another regularization parameter $\gamma$, we have

$$\Sigma_i(\lambda, \gamma) = (1 - \gamma)\, \Sigma_i(\lambda) + \frac{\gamma}{S}\, \mathrm{trace}[\Sigma_i(\lambda)]\, I, \qquad 0 \le \gamma \le 1.$$

The parameter $\lambda$ converts the class covariance matrix $\Sigma_i(\lambda)$ to a linear combination of $\Sigma_i$ and $\Sigma$. The second parameter, $\gamma$, shrinks $\Sigma_i(\lambda)$ toward a multiple of the identity matrix. The suitable values of $\lambda$ and $\gamma$ are determined by the model selection procedure [2]. This procedure sets a 2-dimensional grid of points on the $\lambda$ and $\gamma$ plane ($0 \le \lambda \le 1$, $0 \le \gamma \le 1$), evaluates the cross-validated estimate of misclassification risk at each prescribed point on the grid, and then chooses the point with the smallest estimated risk as our suitable values of the regularization parameters $\lambda$, $\gamma$. In our experiments we set the values $\lambda$, $\gamma$ =.1,.25,.,.75, 1., and applied the leave-one-out cross-validation procedure [5].

Note that for $\lambda = 1$ and $\gamma = .1$ we get $\Sigma_i(\lambda, \gamma) \approx \Sigma$ and we carry out linear discriminant classification [5]. For $\lambda = 1$ and $\gamma = 1$, RDA corresponds to the L2 classifier. Holding $\gamma = 0$ and varying $\lambda$ produces classifiers between QDA and linear discriminant classification.

3. Experimental Study

All of our experiments are based on the Olivetti Research Laboratory (ORL) face database (retrieved from ftp://ftp.uk.research.att.com:pub/data/att_faces.tar.z ). Since we checked the performance of several photometric preprocessing methods, we changed the lighting in the database randomly.
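As an illustration of the regularized estimate $\Sigma_i(\lambda, \gamma)$ defined in Section 2, the following is a minimal pure-Python sketch of the two shrinkage steps. It is not the authors' implementation: the function names `class_stats` and `regularized_covariances` are hypothetical, samples are assumed to be stored as rows of feature values, and the priors are assumed to be $P(i) = n_i / n$.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def class_stats(samples: Matrix) -> Tuple[List[float], Matrix]:
    """Return the ML mean and ML covariance (divide by n) of one class."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(row[d] for row in samples) / n for d in range(dim)]
    cov = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in samples) / n
            for b in range(dim)] for a in range(dim)]
    return mean, cov


def regularized_covariances(classes: List[Matrix], lam: float,
                            gamma: float) -> List[Matrix]:
    """Compute Sigma_i(lambda, gamma) for every class.

    Assumption: priors are estimated as P(i) = n_i / n.
    """
    total = sum(len(c) for c in classes)
    priors = [len(c) / total for c in classes]
    covs = [class_stats(c)[1] for c in classes]
    dim = len(covs[0])
    # Pooled (within-class) covariance: sum_i P(i) * Sigma_i
    pooled = [[sum(p * cov[a][b] for p, cov in zip(priors, covs))
               for b in range(dim)] for a in range(dim)]
    result = []
    for p, cov in zip(priors, covs):
        denom = (1 - lam) * p + lam
        # Step 1: blend the class covariance with the pooled covariance
        blended = [[((1 - lam) * p * cov[a][b] + lam * pooled[a][b]) / denom
                    for b in range(dim)] for a in range(dim)]
        # Step 2: shrink toward (trace / S) times the identity matrix
        scale = gamma * sum(blended[d][d] for d in range(dim)) / dim
        result.append([[(1 - gamma) * blended[a][b] + (scale if a == b else 0.0)
                        for b in range(dim)] for a in range(dim)])
    return result
```

With `lam = 0` and `gamma = 0` the function returns the unmodified class covariances (QDA); with `lam = 1` and `gamma = 0` every class receives the pooled covariance (linear discriminant classification); with `gamma = 1` every estimate is a multiple of the identity matrix.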

The ORL database structure contains different images of distinct subjects (persons). As most researchers did, we used 5 images from every class for training and 5 images for testing, and a size of 48 × 48 pixels for each image. We performed a number of experiments employing different photometric normalizations, features (PCA and LDA) and decision rules (RDA, L2 and angle). Following [3] we focused on the photometric methods based on image normalization and histogram equalization. The number of LDA features runs from 3 to 39 with steps of 3 (the last step is 2) where 39 is the maximal number available for the LDA. The number of PCA features was to 199 with steps of (the last step is 9), where 199 is the maximal number available for the PCA. For every photometric preprocessing method, and for every feature dimension (for PCA and LDA), we ran the RDA, L2 and angle classifiers. In Figs. 1 and 2 we show the test error rates obtained.

4. Discussion and Conclusions

It is clear from looking at the results that RDA outperforms the L2 and angle classifiers when using the PCA feature extraction method (Fig. 1), but this phenomenon is not so obvious when using the LDA feature extraction method (Fig. 2). The best classification results are attained when using RDA with PCA (Fig. 1a) for histogram equalization when the features dimension is between and (error rates are .5% to 11.5%). This is a remarkable feature of RDA because reports in the literature usually state that LDA is better than PCA since LDA extracts the relevant information while PCA only compresses it.

Looking at Fig. 1, presenting the results for PCA, we found that the RDA classifier does not need any preprocessing to achieve good results. This can save precious time in demanding real-time applications.
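For reference, the two baseline decision rules compared with RDA in this study, the nearest center classifier with the L2 distance and with the angle (normalized correlation), can be sketched as follows. This is a minimal sketch under stated assumptions rather than the code used in the experiments: all function names are hypothetical, the angle rule is written as the negative normalized correlation so that a smaller value means a closer match, and feature vectors are assumed to be non-zero.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def class_centers(features: List[Vector], labels: List[int]) -> Dict[int, List[float]]:
    """Average the training feature vectors of each class."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def angle_distance(a: Vector, b: Vector) -> float:
    """Negative normalized correlation (assumes neither vector is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return -dot / norm


def nearest_center(x: Vector, centers: Dict[int, List[float]],
                   distance: Callable[[Vector, Vector], float]) -> int:
    """Assign x to the class whose center is closest under the given distance."""
    return min(centers, key=lambda label: distance(x, centers[label]))
```

The two rules can disagree on the same input, because the L2 rule is sensitive to the length of the feature vector while the angle rule depends only on its direction.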

An interesting point is the non-monotonic behavior of the RDA errors using PCA features for image normalization in the dimensions 1 and for histogram equalization in the dimensions - 1. The reason for this is that the model selection procedure selected the values λ = .1 and γ = .1 as its suitable parameters. These values cause RDA to act as QDA (see end of Section 2 and [2]), which increases the risk of over-fitting, resulting in a large test error rate. For the higher dimensions, where the parameters λ, γ are set to values other than .1, RDA produces classifiers between QDA, LDA and L2, thus reducing the risk of over-fitting and decreasing the test error rate.

Finally, the major features of RDA are its ability to extract relevant discriminatory information and its robustness to lighting changes. The Support Vector Machine (SVM) shares the same features [3]. It is interesting to compare RDA and SVM, which is an object of our future research.

References

1. M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, 3 (1), 1991, pp. 72-86.
2. J. H. Friedman, Regularized Discriminant Analysis, Journal of the American Statistical Association, 84 (5), 1989, pp. 165-175.
3. K. Jonsson, J. Kittler, Y. P. Li, J. Matas, Support Vector Machines for Face Authentication, Image and Vision Computing, (5-6), 2, pp. 369-375.
4. K. Etemad, R. Chellappa, Discriminant Analysis for Recognition of Human Face Images, Journal of Optical Society of America A, 1997, pp. 1724-1733.
5. R.O. Duda, P.E. Hart, D.J. Stork, Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 1.

Figure 1. Classifiers' error rates with different preprocessing types (No Preprocessing, Image Normalization, Histogram Equalization) versus different PCA features dimensions: (a) RDA classifier, (b) L2 classifier, (c) Angle classifier.

Figure 2. Classifiers' error rates with different preprocessing types (No Preprocessing, Image Normalization, Histogram Equalization) versus different LDA features dimensions: (a) RDA classifier, (b) L2 classifier, (c) Angle classifier.