
Report No. UIUCDCS-R UILU-ENG

Subspace Learning Based on Tensor Analysis

by Deng Cai, Xiaofei He, and Jiawei Han

May 2005

Subspace Learning Based on Tensor Analysis

Deng Cai    Xiaofei He    Jiawei Han
Department of Computer Science, University of Illinois at Urbana-Champaign
Department of Computer Science, University of Chicago

Abstract

Linear dimensionality reduction techniques have been widely used in pattern recognition and computer vision, for tasks such as face recognition and image retrieval. The typical methods include Principal Component Analysis (PCA), which is unsupervised, and Linear Discriminant Analysis (LDA), which is supervised. Both of them consider an m_1 × m_2 image as a high-dimensional vector in R^{m_1 m_2}. Such a vector representation fails to take into account the spatial locality of pixels in the image. An image is intrinsically a matrix. In this paper, we consider an image as a second-order tensor in R^{m_1} ⊗ R^{m_2} and propose two novel tensor subspace learning algorithms called TensorPCA and TensorLDA. Our algorithms explicitly take into account the relationship between the column vectors of the image matrix and the relationship between the row vectors of the image matrix. We compare our proposed approaches with PCA and LDA for face recognition on three standard face databases. Experimental results show that tensor analysis achieves better recognition accuracy, while being much more efficient.

The work was supported in part by the U.S. National Science Foundation NSF IIS /IIS . Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

1 Introduction

Real data of the natural and social sciences is often very high-dimensional. However, the underlying structure can in many cases be characterized by a small number of parameters. Reducing the dimensionality of such data is beneficial for visualizing the intrinsic structure, and it is also an important preprocessing step in many statistical pattern recognition problems, such as face recognition [8, 1] and image retrieval [7, 2].

One of the fundamental problems in data analysis is how to represent the data. In most previous work on face representation, the face is represented as a vector in a high-dimensional space. However, an image is intrinsically a matrix, or a second-order tensor. In the vector representation, the face image has to be converted to a vector. A typical way to do this so-called matrix-to-vector alignment is to concatenate all the rows of the matrix together to get a single vector. In this way, an image of size m_1 × m_2 is identified with a point in R^{m_1 m_2}. Thus, a linear transformation in R^{m_1 m_2} is characterized by y = A^T x, where A is an (m_1 m_2) × (m_1 m_2) matrix. To acquire such a linear transformation, traditional subspace learning methods (PCA and LDA) need to perform an eigendecomposition of matrices of size (m_1 m_2) × (m_1 m_2), which is computationally expensive. Moreover, the number of learning parameters in PCA and LDA (the dimension of the projection vectors) is very large. These methods might not achieve good performance when the number of training samples is small.

Recently, multilinear algebra, the algebra of higher-order tensors, was applied to analyzing the multifactor structure of image ensembles [10, 11, 12]. Vasilescu and Terzopoulos have proposed a novel face representation algorithm called Tensorface [10]. Tensorface represents the set of face images by a higher-order tensor and extends traditional PCA to higher-order tensor decomposition. In this way, the multiple factors related to expression, illumination and pose can be separated into different dimensions of the tensor. However, Tensorface still considers each face image as a vector instead of a 2-d tensor. Thus, Tensorface is computationally expensive. Moreover, it does not encode discriminating information, so it is not optimal for recognition.

In this paper, we systematically study tensor analysis for subspace learning and propose two novel algorithms called TensorPCA and TensorLDA for learning a tensor subspace. TensorPCA and TensorLDA are fundamentally based on the traditional Principal Component Analysis and Linear Discriminant Analysis algorithms. An image is represented as a matrix X ∈ R^{m_1} ⊗ R^{m_2}, and the linear transformation in the tensor space is naturally represented as Y = U^T X V, where U is an m_1 × m_1 matrix and V is an m_2 × m_2 matrix. The new algorithms are distinct from several perspectives:

1. Compared with traditional subspace learning algorithms such as PCA and LDA, our algorithm is much more computationally efficient. This can be simply seen from the fact that the matrices U and V are much smaller than the matrix A.

2. Compared with the recently proposed Tensorface approach [10], our algorithm explicitly models each face image as a 2-D tensor, which takes the spatial locality of pixels in the face image into account. Also, TensorLDA incorporates the label information, which yields better performance for face recognition.

3. Our algorithm is especially suitable for the small-sample-size problem, since the number of parameters in our algorithm is much smaller than that of traditional subspace learning algorithms.

The rest of this paper is organized as follows. We give a brief review of PCA and LDA in Section 2. The TensorPCA and TensorLDA algorithms are introduced in Section 3. Experimental results are shown in Section 4. Finally, we provide some concluding remarks and suggestions for future work in Section 5.

2 A Brief Review of PCA and LDA

PCA is a canonical linear dimensionality reduction algorithm. The basic idea of PCA is to project the data along the directions of maximal variance so that the reconstruction error can be minimized. Given a set of data points x_1, x_2, ..., x_n, let w be the transformation vector and y_i = w^T x_i. The objective function of PCA is as follows:

$$ w_{opt} = \arg\max_w \sum_{i=1}^{n} (y_i - \bar{y})^2 = \arg\max_w w^T C w $$

where $\bar{y} = \frac{1}{n} \sum_i y_i$ and C is the data covariance matrix. The basis functions of PCA are the eigenvectors of the data covariance matrix associated with the largest eigenvalues.

While PCA seeks directions that are efficient for representation, Linear Discriminant Analysis (LDA) seeks directions that are efficient for discrimination. Suppose we have a set of n samples x_1, x_2, ..., x_n, belonging to k classes. The objective function of LDA is as follows:

$$ w_{opt} = \arg\max_w \frac{w^T S_B w}{w^T S_W w} $$

$$ S_B = \sum_{i=1}^{k} n_i (m^{(i)} - m)(m^{(i)} - m)^T $$

$$ S_W = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_j^{(i)} - m^{(i)})(x_j^{(i)} - m^{(i)})^T $$

where m is the total sample mean vector, n_i is the number of samples in the i-th class, m^{(i)} is the mean vector of the i-th class, and x_j^{(i)} is the j-th sample in the i-th class. We call S_W the within-class scatter matrix and S_B the between-class scatter matrix.
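To make the two objectives concrete, here is a minimal NumPy/SciPy sketch (our own illustration, not code from the paper; the function and variable names are ours) that computes the PCA basis from the covariance matrix and the LDA basis from the generalized eigenproblem S_B w = lambda S_W w:

    import numpy as np
    from scipy.linalg import eigh

    def pca_basis(X, d):
        # X: n x D data matrix, one vectorized sample per row
        Xc = X - X.mean(axis=0)               # center the data
        C = Xc.T @ Xc / X.shape[0]            # D x D covariance matrix
        vals, vecs = np.linalg.eigh(C)        # eigenvalues in ascending order
        return vecs[:, ::-1][:, :d]           # keep the top-d directions

    def lda_basis(X, labels, d):
        # maximize w^T S_B w / w^T S_W w via S_B w = lambda S_W w
        D = X.shape[1]
        m = X.mean(axis=0)                    # total sample mean
        S_W = np.zeros((D, D))
        S_B = np.zeros((D, D))
        for c in np.unique(labels):
            Xc = X[labels == c]
            mc = Xc.mean(axis=0)              # class mean
            S_W += (Xc - mc).T @ (Xc - mc)    # within-class scatter
            diff = (mc - m)[:, None]
            S_B += len(Xc) * (diff @ diff.T)  # between-class scatter
        vals, vecs = eigh(S_B, S_W)           # generalized symmetric solver
        return vecs[:, ::-1][:, :d]

Note that the generalized solver assumes S_W is nonsingular; in small-sample settings such as face recognition this typically requires a preliminary PCA step, which is exactly the Fisherfaces (PCA+LDA) strategy compared in Section 4.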

3 Subspace Learning Based on Tensor Analysis

Let X ∈ R^{m_1 × m_2} denote an image of size m_1 × m_2. Mathematically, X can be thought of as a second-order tensor (or 2-tensor) in the tensor space R^{m_1} ⊗ R^{m_2}. Let (u_1, ..., u_{m_1}) be a set of orthonormal basis functions of R^{m_1}, and let (v_1, ..., v_{m_2}) be a set of orthonormal basis functions of R^{m_2}. Thus, a 2-tensor X can be uniquely written as:

$$ X = \sum_i \sum_j (u_i^T X v_j)\, u_i v_j^T $$

This indicates that {u_i v_j^T} forms a basis of the tensor space R^{m_1} ⊗ R^{m_2}. Define two matrices U = [u_1, ..., u_{l_1}] ∈ R^{m_1 × l_1} and V = [v_1, ..., v_{l_2}] ∈ R^{m_2 × l_2}. Let U be the subspace of R^{m_1} spanned by {u_i}_{i=1}^{l_1} and V be the subspace of R^{m_2} spanned by {v_j}_{j=1}^{l_2}. Thus, the tensor product U ⊗ V is a subspace of R^{m_1} ⊗ R^{m_2}. The projection of X ∈ R^{m_1 × m_2} onto the space U ⊗ V is

$$ Y = U^T X V \in R^{l_1 \times l_2} $$

Suppose we have n images X_1, ..., X_n ∈ R^{m_1 × m_2}. These images belong to k categories C_1, ..., C_k. For the i-th category, there are n_i images. The mean M_i^{(X)} of each category is computed by taking the average of X over category i, i.e.,

$$ M_i^{(X)} = \frac{1}{n_i} \sum_{j \in C_i} X_j $$

and the global mean M^{(X)} is defined as

$$ M^{(X)} = \frac{1}{n} \sum_{j=1}^{n} X_j $$

Let Y_i = U^T X_i V ∈ R^{l_1 × l_2}. Likewise, we can define M_i^{(Y)} = (1/n_i) Σ_{j∈C_i} Y_j and M^{(Y)} = (1/n) Σ_{j=1}^{n} Y_j. It is easy to check that M_i^{(Y)} = U^T M_i^{(X)} V and M^{(Y)} = U^T M^{(X)} V.

The tensor subspace learning problem aims at finding the (l_1 × l_2)-dimensional space U ⊗ V based on specific objective functions. In particular, we will introduce two novel algorithms called TensorPCA and TensorLDA in this section. It is important to note that the number of parameters in the tensor subspace learning problem is much smaller than that in the original subspace learning problems like PCA and LDA. Therefore, tensor analysis should be particularly applicable to the small-sample-size problem.
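Concretely, the projection and its reconstruction are just two matrix products. The following short sketch (our own illustration; the dimensions and the random QR-orthonormalized "bases" are arbitrary stand-ins for learned ones) shows the mapping X to Y = U^T X V and the corresponding reconstruction:

    import numpy as np

    def project(X, U, V):
        # Y = U^T X V: the l1 x l2 representation of the image X
        return U.T @ X @ V

    def reconstruct(Y, U, V):
        # sum_{i,j} Y_ij u_i v_j^T, the approximation of X in U (x) V
        return U @ Y @ V.T

    rng = np.random.default_rng(0)
    X = rng.standard_normal((32, 32))                    # stand-in 32 x 32 "image"
    U, _ = np.linalg.qr(rng.standard_normal((32, 4)))    # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((32, 4)))
    Y = project(X, U, V)                                 # 4 x 4 coefficient matrix
    X_hat = reconstruct(Y, U, V)                         # lies in the tensor subspace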

3.1 Tensor PCA

TensorPCA is fundamentally based on PCA. It tries to project the data onto the tensor subspace of maximal variance so that the reconstruction error can be minimized. The objective function of TensorPCA can be described as follows:

$$ \max_{U,V} \sum_{i=1}^{n} \| Y_i - M^{(Y)} \|^2 \quad (1) $$

Note that we use the tensor norm of the difference of two tensors to measure the distance between two images. Since a second-order tensor is essentially a matrix, we use the Frobenius norm of a matrix as our 2-d tensor norm. Since ||A||^2 = tr(A A^T), we have

$$ \sum_{i=1}^{n} \|Y_i - M^{(Y)}\|^2 = \sum_{i=1}^{n} \mathrm{tr}\big((Y_i - M^{(Y)})(Y_i - M^{(Y)})^T\big) $$
$$ = \sum_{i=1}^{n} \mathrm{tr}\big(U^T (X_i - M^{(X)}) V V^T (X_i - M^{(X)})^T U\big) $$
$$ = \mathrm{tr}\Big(U^T \Big(\sum_{i=1}^{n} (X_i - M^{(X)}) V V^T (X_i - M^{(X)})^T\Big) U\Big) \doteq \mathrm{tr}(U^T M_V U) $$

where $M_V = \sum_{i=1}^{n} (X_i - M^{(X)}) V V^T (X_i - M^{(X)})^T$. Similarly, ||A||^2 = tr(A^T A), so we also have

$$ \sum_{i=1}^{n} \|Y_i - M^{(Y)}\|^2 = \mathrm{tr}\Big(V^T \Big(\sum_{i=1}^{n} (X_i - M^{(X)})^T U U^T (X_i - M^{(X)})\Big) V\Big) \doteq \mathrm{tr}(V^T M_U V) $$

where $M_U = \sum_{i=1}^{n} (X_i - M^{(X)})^T U U^T (X_i - M^{(X)})$.

Thus, the optimal projection U should be the eigenvectors of M_V and the optimal projection V should be the eigenvectors of M_U. One might notice that U and V cannot be computed independently. In our algorithm, we try to find an optimal coordinate system of R^{m_1} ⊗ R^{m_2}. That is, we assume that both U and V are orthonormal, i.e. U^T U = U U^T = I and V^T V = V V^T = I. In such a case,

$$ M_V = \sum_{i=1}^{n} (X_i - M^{(X)})(X_i - M^{(X)})^T $$

and

$$ M_U = \sum_{i=1}^{n} (X_i - M^{(X)})^T (X_i - M^{(X)}) $$

It is clear that M_V no longer depends on V, and M_U no longer depends on U. Therefore, the matrix U can be simply computed as the eigenvectors of M_V and the matrix V can be computed as the eigenvectors of M_U. Note that both M_U and M_V are symmetric, hence their eigenvectors are orthonormal. This is consistent with our assumptions. If we try to reduce the original tensor space to an l_1 × l_2 tensor subspace, we choose the first l_1 column vectors in U and the first l_2 column vectors in V.
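Under the orthonormality assumption, the whole procedure reduces to two symmetric eigendecompositions. A minimal sketch (our own; tensor_pca and the variable names are ours), assuming the images are stacked in an n x m1 x m2 array:

    import numpy as np

    def tensor_pca(images, l1, l2):
        # images: n x m1 x m2 array; returns U (m1 x l1) and V (m2 x l2)
        M = images.mean(axis=0)                          # global mean image
        M_V = sum((X - M) @ (X - M).T for X in images)   # m1 x m1 scatter
        M_U = sum((X - M).T @ (X - M) for X in images)   # m2 x m2 scatter
        _, U = np.linalg.eigh(M_V)                       # ascending eigenvalues
        _, V = np.linalg.eigh(M_U)
        return U[:, ::-1][:, :l1], V[:, ::-1][:, :l2]    # top eigenvectors

Each image X is then represented by the l_1 × l_2 coefficient matrix U.T @ X @ V.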

3.2 Tensor LDA

TensorPCA is performed in an unsupervised mode. Its goal is to minimize the reconstruction error. In this subsection, we introduce the TensorLDA algorithm, which is performed in a supervised mode. It tries to find the most discriminative tensor subspace. By projecting the data points into the tensor subspace, the data points of different classes are far from each other, while data points of the same class are close to each other. Similar to traditional Linear Discriminant Analysis, we try to maximize the following objective function:

$$ \max_{U,V} \frac{\sum_{i=1}^{k} n_i \|M_i^{(Y)} - M^{(Y)}\|^2}{\sum_{i=1}^{k} \sum_{j \in C_i} \|Y_j - M_i^{(Y)}\|^2} \quad (2) $$

Since ||A||^2 = tr(A A^T), the denominator can be computed as

$$ \sum_{i=1}^{k} \sum_{j \in C_i} \|Y_j - M_i^{(Y)}\|^2 = \mathrm{tr}\Big(\sum_{i=1}^{k} \sum_{j \in C_i} (Y_j - M_i^{(Y)})(Y_j - M_i^{(Y)})^T\Big) $$
$$ = \mathrm{tr}\Big(U^T \Big(\sum_{i=1}^{k} \sum_{j \in C_i} (X_j - M_i^{(X)}) V V^T (X_j - M_i^{(X)})^T\Big) U\Big) \doteq \mathrm{tr}(U^T S_w^V U) $$

where $S_w^V = \sum_{i=1}^{k} \sum_{j \in C_i} (X_j - M_i^{(X)}) V V^T (X_j - M_i^{(X)})^T$. The numerator can be computed as

$$ \sum_{i=1}^{k} n_i \|M_i^{(Y)} - M^{(Y)}\|^2 = \mathrm{tr}\Big(\sum_{i=1}^{k} n_i (M_i^{(Y)} - M^{(Y)})(M_i^{(Y)} - M^{(Y)})^T\Big) $$
$$ = \mathrm{tr}\Big(U^T \Big(\sum_{i=1}^{k} n_i (M_i^{(X)} - M^{(X)}) V V^T (M_i^{(X)} - M^{(X)})^T\Big) U\Big) \doteq \mathrm{tr}(U^T S_b^V U) $$

where $S_b^V = \sum_{i=1}^{k} n_i (M_i^{(X)} - M^{(X)}) V V^T (M_i^{(X)} - M^{(X)})^T$. Similarly, ||A||^2 = tr(A^T A), so we also have

$$ \sum_{i=1}^{k} \sum_{j \in C_i} \|Y_j - M_i^{(Y)}\|^2 = \mathrm{tr}(V^T S_w^U V), \quad S_w^U = \sum_{i=1}^{k} \sum_{j \in C_i} (X_j - M_i^{(X)})^T U U^T (X_j - M_i^{(X)}) $$

and

$$ \sum_{i=1}^{k} n_i \|M_i^{(Y)} - M^{(Y)}\|^2 = \mathrm{tr}(V^T S_b^U V), \quad S_b^U = \sum_{i=1}^{k} n_i (M_i^{(X)} - M^{(X)})^T U U^T (M_i^{(X)} - M^{(X)}) $$

The optimal projections that preserve the given class structure should maximize tr(U^T S_b^V U)/tr(U^T S_w^V U) and tr(V^T S_b^U V)/tr(V^T S_w^U V) simultaneously. Thus U and V can be obtained by solving the following two generalized eigenvector problems:

$$ S_b^V a = \lambda S_w^V a \quad (3) $$

$$ S_b^U a = \lambda S_w^U a \quad (4) $$

Unfortunately, the matrices S_b^V, S_b^U, S_w^V, S_w^U are not fixed but depend on U or V. Therefore, U and V cannot be computed independently. To relax this problem, we add the constraint that U and V be orthonormal. That is, U^T U = U U^T = I and V^T V = V V^T = I. Thus,

$$ S_w^U = \sum_{i=1}^{k} \sum_{j \in C_i} (X_j - M_i^{(X)})^T (X_j - M_i^{(X)}) $$
$$ S_b^U = \sum_{i=1}^{k} n_i (M_i^{(X)} - M^{(X)})^T (M_i^{(X)} - M^{(X)}) $$
$$ S_w^V = \sum_{i=1}^{k} \sum_{j \in C_i} (X_j - M_i^{(X)})(X_j - M_i^{(X)})^T $$
$$ S_b^V = \sum_{i=1}^{k} n_i (M_i^{(X)} - M^{(X)})(M_i^{(X)} - M^{(X)})^T \quad (5) $$

It is clear that, with these constraints, the above four matrices can be computed independently, depending only on the training data points. In the following, we describe how to compute U and V. It is important to note that the computations of U and V are essentially the same. Without loss of generality, we use S_w to represent S_w^U or S_w^V, and S_b to represent S_b^U or S_b^V. Now our problem becomes:

$$ a_1 = \arg\max_a \frac{a^T S_b a}{a^T S_w a} $$

$$ a_k = \arg\max_a \frac{a^T S_b a}{a^T S_w a} \quad \text{subject to} \quad a_k^T a_1 = a_k^T a_2 = \cdots = a_k^T a_{k-1} = 0 $$

Since S_w is in general nonsingular, for any a, we can always normalize it such that a^T S_w a = 1, and the ratio of a^T S_b a to a^T S_w a remains unchanged. Thus, the above maximization problem is equivalent to maximizing the value of a^T S_b a with an additional constraint:

$$ a^T S_w a = 1 $$

Note that the above normalization is only for simplifying the computation. Once we get the optimal solutions, we can re-normalize them to get orthonormal basis vectors.
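Because the four matrices in Eqn. (5) depend only on the training images, they can be computed up front. A sketch of that computation (our own code; the names are ours), again assuming the images arrive as an n x m1 x m2 array with integer class labels:

    import numpy as np

    def tensor_lda_scatter(images, labels):
        # images: n x m1 x m2 array; labels: length-n integer array
        M = images.mean(axis=0)                    # global mean image
        m1, m2 = M.shape
        Sw_V = np.zeros((m1, m1)); Sb_V = np.zeros((m1, m1))
        Sw_U = np.zeros((m2, m2)); Sb_U = np.zeros((m2, m2))
        for c in np.unique(labels):
            Xc = images[labels == c]               # images of class c
            Mc = Xc.mean(axis=0)                   # class mean image
            for X in Xc:
                Sw_V += (X - Mc) @ (X - Mc).T      # within-class, row side
                Sw_U += (X - Mc).T @ (X - Mc)      # within-class, column side
            Sb_V += len(Xc) * (Mc - M) @ (Mc - M).T
            Sb_U += len(Xc) * (Mc - M).T @ (Mc - M)
        return Sw_V, Sb_V, Sw_U, Sb_U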

It is easy to check that a_1 is the eigenvector of the generalized eigen-problem

$$ S_b a = \lambda S_w a $$

associated with the largest eigenvalue. Since S_w is in general nonsingular, a_1 is the eigenvector of the matrix (S_w)^{-1} S_b associated with the largest eigenvalue. In order to get the k-th basis vector, we maximize the following objective function:

$$ f(a_k) = \frac{a_k^T S_b a_k}{a_k^T S_w a_k} \quad (6) $$

with the constraints:

$$ a_k^T a_1 = a_k^T a_2 = \cdots = a_k^T a_{k-1} = 0, \quad a_k^T S_w a_k = 1 $$

We can use Lagrange multipliers to transform the above objective function so as to include all the constraints:

$$ C^{(k)} = a_k^T S_b a_k - \lambda (a_k^T S_w a_k - 1) - \mu_1 a_k^T a_1 - \cdots - \mu_{k-1} a_k^T a_{k-1} $$

The optimization is performed by setting the partial derivative of C^{(k)} with respect to a_k to zero:

$$ \frac{\partial C^{(k)}}{\partial a_k} = 0 \;\Rightarrow\; 2 S_b a_k - 2\lambda S_w a_k - \mu_1 a_1 - \cdots - \mu_{k-1} a_{k-1} = 0 \quad (7) $$

Multiplying the left side of (7) by a_k^T, we obtain

$$ 2 a_k^T S_b a_k - 2\lambda a_k^T S_w a_k = 0 \;\Rightarrow\; \lambda = \frac{a_k^T S_b a_k}{a_k^T S_w a_k} \quad (8) $$

Comparing with (6), λ exactly represents the expression to be maximized. Multiplying the left side of (7) successively by a_1^T S_w^{-1}, ..., a_{k-1}^T S_w^{-1}, we obtain a set of k - 1 equations:

$$ \mu_1 a_1^T S_w^{-1} a_1 + \cdots + \mu_{k-1} a_1^T S_w^{-1} a_{k-1} = 2 a_1^T S_w^{-1} S_b a_k $$
$$ \mu_1 a_2^T S_w^{-1} a_1 + \cdots + \mu_{k-1} a_2^T S_w^{-1} a_{k-1} = 2 a_2^T S_w^{-1} S_b a_k $$
$$ \cdots $$
$$ \mu_1 a_{k-1}^T S_w^{-1} a_1 + \cdots + \mu_{k-1} a_{k-1}^T S_w^{-1} a_{k-1} = 2 a_{k-1}^T S_w^{-1} S_b a_k $$

We define:

$$ \mu^{(k-1)} = [\mu_1, \ldots, \mu_{k-1}]^T, \quad A^{(k-1)} = [a_1, \ldots, a_{k-1}] $$
$$ B^{(k-1)} = \big[A^{(k-1)}\big]^T S_w^{-1} A^{(k-1)}, \quad B^{(k-1)}_{ij} = a_i^T S_w^{-1} a_j $$

Using this simplified notation, the previous set of k - 1 equations can be represented by a single matrix relationship:

$$ B^{(k-1)} \mu^{(k-1)} = 2 \big[A^{(k-1)}\big]^T S_w^{-1} S_b a_k $$

thus

$$ \mu^{(k-1)} = 2 \big[B^{(k-1)}\big]^{-1} \big[A^{(k-1)}\big]^T S_w^{-1} S_b a_k \quad (9) $$

Let us now multiply the left side of (7) by S_w^{-1}:

$$ 2 S_w^{-1} S_b a_k - 2\lambda a_k - \mu_1 S_w^{-1} a_1 - \cdots - \mu_{k-1} S_w^{-1} a_{k-1} = 0 $$

This can be expressed using matrix notation as

$$ 2 S_w^{-1} S_b a_k - 2\lambda a_k - S_w^{-1} A^{(k-1)} \mu^{(k-1)} = 0 $$

With equation (9), we obtain

$$ \Big\{ I - S_w^{-1} A^{(k-1)} \big[B^{(k-1)}\big]^{-1} \big[A^{(k-1)}\big]^T \Big\} S_w^{-1} S_b a_k = \lambda a_k $$

As shown in (8), λ is just the criterion to be maximized, thus a_k is the eigenvector of

$$ M^{(k)} = \Big\{ I - S_w^{-1} A^{(k-1)} \big[B^{(k-1)}\big]^{-1} \big[A^{(k-1)}\big]^T \Big\} S_w^{-1} S_b $$

associated with the largest eigenvalue of M^{(k)}.

In summary, our TensorLDA algorithm is performed as follows:

1. Compute S_w^U, S_b^U, S_w^V, S_b^V as described in Eqn. (5).

2. Calculate the orthonormal matrix U = [u_1, ..., u_{m_1}] as:
   (a) u_1 is the eigenvector of (S_w^V)^{-1} S_b^V corresponding to the largest eigenvalue.
   (b) u_k is the eigenvector of M_V^{(k)} corresponding to the largest eigenvalue, where
       $$ M_V^{(k)} = \Big\{ I - (S_w^V)^{-1} U^{(k-1)} \big[B_V^{(k-1)}\big]^{-1} \big[U^{(k-1)}\big]^T \Big\} (S_w^V)^{-1} S_b^V $$
       $$ U^{(k-1)} = [u_1, \ldots, u_{k-1}], \quad B_V^{(k-1)} = \big[U^{(k-1)}\big]^T (S_w^V)^{-1} U^{(k-1)} $$

3. Calculate the orthonormal matrix V = [v_1, ..., v_{m_2}] as:
   (a) v_1 is the eigenvector of (S_w^U)^{-1} S_b^U corresponding to the largest eigenvalue.
   (b) v_k is the eigenvector of M_U^{(k)} corresponding to the largest eigenvalue, where
       $$ M_U^{(k)} = \Big\{ I - (S_w^U)^{-1} V^{(k-1)} \big[B_U^{(k-1)}\big]^{-1} \big[V^{(k-1)}\big]^T \Big\} (S_w^U)^{-1} S_b^U $$
       $$ V^{(k-1)} = [v_1, \ldots, v_{k-1}], \quad B_U^{(k-1)} = \big[V^{(k-1)}\big]^T (S_w^U)^{-1} V^{(k-1)} $$

If we try to reduce the original tensor space to an l_1 × l_2 tensor subspace, we choose the first l_1 column vectors in U and the first l_2 column vectors in V.
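The basis-extraction loop in steps 2 and 3 above can be written compactly. The following sketch (our own implementation of the stated recurrence, assuming S_w is nonsingular, as the text does; the names top_eig and lda_directions are ours) computes the leading directions one at a time:

    import numpy as np

    def top_eig(M):
        # leading eigenvector of a (generally non-symmetric) matrix
        vals, vecs = np.linalg.eig(M)
        return np.real(vecs[:, np.argmax(np.real(vals))])

    def lda_directions(S_w, S_b, num):
        # extract num directions via the M^(k) recurrence
        m = S_w.shape[0]
        Swi = np.linalg.inv(S_w)
        vecs = []
        for k in range(num):
            if k == 0:
                M_k = Swi @ S_b                    # a_1: top eigvec of S_w^{-1} S_b
            else:
                A = np.column_stack(vecs)          # A^(k-1) = [a_1, ..., a_{k-1}]
                B = A.T @ Swi @ A                  # B^(k-1) = A^T S_w^{-1} A
                M_k = (np.eye(m) - Swi @ A @ np.linalg.inv(B) @ A.T) @ Swi @ S_b
            vecs.append(top_eig(M_k))
        Q, _ = np.linalg.qr(np.column_stack(vecs)) # re-normalize to orthonormal
        return Q

With the scatter matrices from the sketch after Eqn. (5), U would be lda_directions(Sw_V, Sb_V, l1) and V would be lda_directions(Sw_U, Sb_U, l2).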

4 Experimental Results

In this section, we investigate the performance of our proposed TensorPCA and TensorLDA methods for face recognition. The system performance is compared with the Eigenfaces method (PCA) [9] and the Fisherfaces method (PCA+LDA) [1], two of the most popular linear methods in face recognition.

In this study, three face databases were tested. The first one is the Yale database, the second one is the ORL (Olivetti Research Laboratory) database, and the third one is the PIE (pose, illumination, and expression) database from CMU [6]. In all the experiments, preprocessing to locate the faces was applied. Original images were normalized (in scale and orientation) such that the two eyes were aligned at the same position. Then, the facial areas were cropped into the final images for matching. The size of each cropped image in the Yale and PIE databases is 32 × 32 pixels, and 64 × 64 pixels for the ORL database, with 256 gray levels per pixel. For Fisherfaces and Eigenfaces, each image is therefore represented by a 1,024-dimensional (or 4,096-dimensional for the ORL database) vector in image space. For TensorPCA and TensorLDA, the images are represented as matrices.

Different pattern classifiers have been applied to face recognition, including nearest-neighbor [1], Bayesian [4], and Support Vector Machine [5]. In this paper, we apply the nearest-neighbor classifier for its simplicity. The Euclidean metric is used as our distance measure. In short, the recognition process has three steps. First, we calculate the face subspace from the training set of face images; then the new face image to be identified is projected into the d-dimensional subspace (PCA and LDA) or the (d × d)-dimensional tensor subspace (TensorPCA and TensorLDA); finally, the new face image is identified by a nearest-neighbor classifier.
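Once U and V are available, the three-step pipeline amounts to very little code. A sketch (ours; step one, learning U and V, is TensorPCA or TensorLDA from Section 3):

    import numpy as np

    def recognize(train_imgs, train_labels, test_img, U, V):
        # step 2: project every image into the tensor subspace
        feats = [U.T @ X @ V for X in train_imgs]
        y = U.T @ test_img @ V
        # step 3: nearest neighbor under the Euclidean (Frobenius) metric
        dists = [np.linalg.norm(y - f) for f in feats]
        return train_labels[int(np.argmin(dists))]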

4.1 Yale Database

The Yale face database was constructed at the Yale Center for Computational Vision and Control. It contains 165 grayscale images of 15 individuals. The images demonstrate variations in lighting condition (left-light, center-light, right-light), facial expression (normal, happy, sad, sleepy, surprised, and wink), and with/without glasses. Figure 1 shows the 11 images of one individual in the Yale database.

Figure 1: Sample face images from the Yale database. For each subject, there are 11 face images under different lighting conditions and facial expressions.

A random subset with 2 images per individual (hence, 30 images in total) was taken with labels to form the training set. The rest of the database was considered to be the testing set. Random selection of the training examples was repeated 20 times and the average recognition result was recorded. The training samples were used to learn the subspace. The testing samples were then projected into the low-dimensional representation subspace. Recognition was performed using a nearest-neighbor classifier. Note that, for LDA, there are at most c - 1 nonzero generalized eigenvalues and, so, an upper bound on the dimension of the reduced space is c - 1, where c is the number of individuals [1]. In general, the performance of all these methods varies with the number of dimensions. We show the best results obtained by Fisherfaces, Eigenfaces, TensorPCA, TensorLDA, and Baseline in Table 1.

Table 1: Performance comparisons on the Yale database
  Method                  Dims (d)   Error Rate
  Baseline                --         --
  Eigenfaces (PCA)        --         56.5%
  Fisherfaces (PCA+LDA)   --         54.3%
  TensorPCA               --         53.8%
  TensorLDA               --         47.9%

It is found that the TensorLDA method significantly outperforms both the Eigenfaces and Fisherfaces methods. The error rates of Eigenfaces, Fisherfaces, TensorPCA and TensorLDA are 56.5%, 54.3%, 53.8% and 47.9%, respectively. Figure 2 shows the plots of error rate versus dimensionality reduction. It is interesting to note that when c - 1 dimensions are kept, Fisherfaces is even worse than the baseline. This result is consistent with the observation in [3] that Eigenfaces can outperform Fisherfaces on small training samples.

Figure 2: Error rate vs. dimensionality reduction on the Yale database.

4.2 ORL Database

The ORL (Olivetti Research Laboratory) face database is a well-known dataset for face recognition. It contains 400 images of 40 individuals. Some images were captured at different times and have different variations including expression (open or closed eyes, smiling or non-smiling) and facial details (glasses or no glasses). The images were taken with a tolerance for some tilting and rotation of the face of up to 20 degrees. All images are grayscale and of size 64 × 64. Sample images of one individual in the ORL database are displayed in Figure 3.

Figure 3: Sample face images from the ORL database. For each subject, there are 10 face images with different facial expressions and details.

A random subset with 2 images per individual (hence, 80 images in total) was taken with labels to form the training set. The rest of the database was considered to be the testing set. Random selection of the training examples was repeated 20 times and the average recognition result was recorded. The experimental protocol is the same as before. The recognition results are shown in Table 2. It is found that the TensorLDA method significantly outperforms both the Eigenfaces and Fisherfaces methods. The error rates of Eigenfaces, Fisherfaces, TensorPCA and TensorLDA are 33.5%, 29.84%, 30.78% and 23.41%, respectively. Figure 4 shows the plots of error rate versus dimensionality reduction.

Table 2: Performance comparisons on the ORL database
  Method                  Dims (d)   Error Rate
  Baseline                --         --
  Eigenfaces (PCA)        --         33.5%
  Fisherfaces (PCA+LDA)   --         29.84%
  TensorPCA               --         30.78%
  TensorLDA               --         23.41%

Figure 4: Error rate vs. dimensionality reduction on the ORL database.

4.3 PIE Database

The CMU PIE face database contains 68 individuals with 41,368 face images as a whole. The face images were captured by 13 synchronized cameras and 21 flashes, under varying pose, illumination, and expression. We choose the five near-frontal poses (C05, C07, C09, C27, C29) and use all the images under different illumination, lighting and expression, which leaves us 170 near-frontal face images for each individual. Figure 5 shows several sample images of one individual with different poses, expressions and illuminations.

Figure 5: Sample face images from the CMU PIE database. For each subject, there are 170 near-frontal face images under varying pose, illumination, and expression.

We randomly choose 20 images for each individual as training data. We repeat this process 20 times and compute the average performance.

Table 3 shows the recognition results. As can be seen, Fisherfaces performs a bit worse than our algorithms on this database, while Eigenfaces performs poorly. The error rates for TensorLDA, TensorPCA, Fisherfaces, and Eigenfaces are 13.32%, 38.15%, 15.44%, and 38.14%, respectively.

Table 3: Performance comparisons on the PIE database
  Method                  Dims (d)   Error Rate
  Baseline                --         --
  Eigenfaces (PCA)        --         38.14%
  Fisherfaces (PCA+LDA)   67         15.44%
  TensorPCA               --         38.15%
  TensorLDA               13         13.32%

Figure 6 shows a plot of error rate versus dimensionality reduction. As can be seen, the error rate of our TensorLDA method decreases quickly as the dimensionality of the face tensor subspace increases, and it achieves the best result with d = 13 dimensions. There is no significant improvement if more dimensions are used. For Fisherfaces, the dimension of the face subspace is bounded by c - 1, and it achieves the best result with c - 1 = 67 dimensions.

Figure 6: Error rate vs. dimensionality reduction on the PIE database.

5 Conclusions

Tensor-based subspace learning is introduced in this paper. We propose two new linear dimensionality reduction algorithms called TensorPCA and TensorLDA. Different from traditional algorithms like PCA and LDA, our algorithms are performed in the tensor space rather than the vector space. As a result, they are especially suitable for face analysis, since a face image is intrinsically a second-order tensor.

Several questions remain to be investigated in our future work:

1. In this paper, we applied our algorithms to face recognition, in which the face images are naturally represented as second-order tensors. It is unclear whether there are other application domains in which the tensor representation is essential.

2. Our algorithms are linear. Therefore, they cannot capture the nonlinear structure of the data manifold. It remains unclear how to generalize our algorithms to the nonlinear case.

References

[1] P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(7).

[2] J. Dy, C. E. Brodley, A. Kak, L. S. Broderick, and A. M. Aisen. Hierarchical feature selection applied to content-based image retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25.

[3] A. M. Martinez and A. C. Kak. PCA versus LDA. IEEE Trans. on PAMI, 23(2).

[4] B. Moghaddam and A. Pentland. Probabilistic visual learning for object representation. IEEE Trans. on PAMI, 19(7).

[5] P. J. Phillips. Support vector machines applied to face recognition. Advances in Neural Information Processing Systems, 11.

[6] T. Sim, S. Baker, and M. Bsat. The CMU pose, illumination, and expression database. IEEE Trans. on PAMI, 25(12).

[7] D. L. Swets and J. Weng. Using discriminant eigenfeatures for image retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(8).

[8] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71-86.

[9] M. Turk and A. P. Pentland. Face recognition using eigenfaces. In IEEE Conference on Computer Vision and Pattern Recognition, Maui, Hawaii.

[10] M. A. O. Vasilescu and D. Terzopoulos. Multilinear subspace analysis for image ensembles. In IEEE Conference on Computer Vision and Pattern Recognition.

[11] J. Yang, D. Zhang, A. Frangi, and J. Yang. Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Trans. PAMI, 26(1).

[12] J. Ye, R. Janardan, and Q. Li. Two-dimensional linear discriminant analysis. In Advances in Neural Information Processing Systems 17.
