Feature Extraction by Maximizing the Average Neighborhood Margin


Fei Wang, Changshui Zhang
State Key Laboratory of Intelligent Technologies and Systems, Department of Automation, Tsinghua University, Beijing, China

Abstract

A novel algorithm called Average Neighborhood Margin Maximization (ANMM) is proposed for supervised linear feature extraction. For each data point, ANMM aims at pulling the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. We will show that the features extracted by ANMM can separate the data from different classes well, and that ANMM avoids the small sample size problem of traditional Linear Discriminant Analysis (LDA). The kernelized (nonlinear) counterpart of ANMM is also established in this paper. Moreover, since in many computer vision applications the data are more naturally represented by higher-order tensors (e.g. images and videos), we develop a tensorized (multilinear) form of ANMM, which can extract features from tensors directly. Experimental results on face recognition are presented to show the effectiveness of our method.

1. Introduction

Feature extraction (or dimensionality reduction) is an important research topic in computer vision and pattern recognition, since (1) the curse of high dimensionality is usually a major cause of limitations of many practical technologies, and (2) a large number of features may even degrade the performance of a classifier when the size of the training set is small compared to the number of features [1]. In the past several decades many feature extraction methods have been proposed, of which the most well-known are Principal Component Analysis (PCA) [10] and Linear Discriminant Analysis (LDA). However, there are still some limitations when applying them directly to vision problems.

First, although PCA is a popular unsupervised method which aims at extracting a subspace in which the variance of the projected data is maximized (or, equivalently, the reconstruction error is minimized), it does not take the class information into account and thus may not be reliable for classification tasks. On the contrary, LDA is a supervised technique which has been shown to be more effective than PCA in many applications. It aims to maximize the between-class scatter and simultaneously minimize the within-class scatter. Unfortunately, it has also been pointed out that LDA has several drawbacks [13]: (1) it usually suffers from the small sample size problem [18], which makes the within-class scatter matrix singular; (2) it is only optimal when the distribution of the data in each class is Gaussian with an identical covariance matrix; (3) LDA can extract at most c - 1 features (where c is the number of classes), which is suboptimal for many applications.

Another limitation of PCA and LDA is that they are both linear methods. However, many vision problems have been found to be nonlinear [7][20], which makes these linear approaches inefficient. Fortunately, kernel based methods [2] can handle such nonlinear cases very well. The basic idea behind kernel based techniques is to first map the data to a high-dimensional (usually infinite-dimensional) feature space, so that a problem that is nonlinear in the original space becomes linearly solvable in the feature space. It has been shown that kernelized PCA [3] and kernelized LDA [19] can improve the performance of the original PCA and LDA significantly in many computer vision and pattern recognition problems.

Finally, PCA and LDA take vectorial data as their inputs, but in many real-world vision problems the data are more naturally represented as higher-order tensors.
For example, a captured image is a 2nd-order tensor, i.e. a matrix, and sequential data, such as a video sequence for event analysis, take the form of a 3rd-order tensor. Thus it is necessary to derive multilinear forms of these traditional linear feature extraction methods so that they can handle such data as tensors directly. Recently this research topic has received considerable interest from the computer vision and pattern recognition community [5], and the proposed methods have been shown to be much more efficient than the traditional vectorial methods.

In this paper, we propose a novel supervised linear feature extraction method called Average Neighborhood Margin Maximization (ANMM).

For each data point, ANMM aims to pull the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. Compared with traditional LDA, our method has the following advantages:

1. ANMM avoids the small sample size problem [18], since it does not need to compute any matrix inverse;
2. ANMM can find the discriminant directions without assuming any particular form for the class densities;
3. Many more feature dimensions are available in ANMM, since it is not limited to c - 1 dimensions as LDA is.

Moreover, we also derive the nonlinear and multilinear forms of ANMM for handling nonlinear and tensor data. Finally, experimental results on face recognition are presented to show the effectiveness of our method.

The rest of this paper is organized as follows. In Section 2 we briefly review some methods that are closely related to ANMM. The details of the ANMM algorithm are introduced in Section 3. In Sections 4 and 5 we develop the kernelized and tensorized forms of ANMM. Experimental results on face recognition are presented in Section 6, followed by conclusions and discussions in Section 7.

2. Related Works

In this section we briefly review some linear feature extraction methods that are closely related to ANMM. First let us fix the notation and the problem definition. Let {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)} be the empirical dataset, where x_i \in R^d is the i-th datum, represented by a d-dimensional column vector, and y_i \in L is the label of x_i, with L = {1, 2, ..., c} the label set. The goal of linear feature extraction is to learn a d x l projection matrix W that projects x_i to y_i = W^T x_i, where y_i \in R^l is the projected datum with l << d, such that in the projected space the data from different classes can be effectively discriminated.

Traditional LDA learns W by maximizing the criterion

    J = |W^T S_b W| / |W^T S_w W|,

where S_b = \sum_{k=1}^{c} p_k (m_k - m)(m_k - m)^T is the between-class scatter matrix, with p_k and m_k the prior and the mean of class k and m the mean of the entire dataset, and S_w = \sum_{k=1}^{c} p_k S_k is the within-class scatter matrix, with S_k the covariance matrix of class k. It has been shown that J is maximized when W is constituted by the eigenvectors of S_w^{-1} S_b corresponding to its l largest eigenvalues [13]. However, when the size of the dataset is small, S_w becomes singular; then S_w^{-1} does not exist and the small sample size (SSS) problem occurs. Many approaches have been proposed to address this problem, such as PCA+LDA [18], null space LDA [14], and direct LDA [9].

Li et al. [6] further proposed an efficient and robust linear feature extraction method which aims to maximize the following criterion, called a margin in [6]:

    J = tr( W^T (S_b - S_w) W ),                                          (1)

where tr(.) denotes the matrix trace. There is no need to compute any matrix inverse when optimizing this criterion; however, such a margin lacks a geometric intuition. Qiu et al. [23] proposed a Nonparametric Margin Maximization Criterion for learning W, which tries to maximize

    J = \sum_{i=1}^{N} w_i ( ||\delta_i^E||^2 - ||\delta_i^I||^2 )        (2)

in the transformed space, where \delta_i^E is the distance between x_i and its nearest neighbor in a different class, and \delta_i^I is the distance between x_i and its furthest neighbor in the same class. The problem is that defining the margin from just the nearest (or furthest) neighbor may make the algorithm sensitive to outliers. Moreover, the stepwise procedure for maximizing J is time consuming.
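For concreteness, here is a minimal NumPy sketch (our illustration, not code from the paper or from [6]) of the two scatter-based criteria above: it builds S_b and S_w from a labeled sample and contrasts the classical LDA directions, which require inverting S_w, with the inverse-free MMC directions of Eq.(1). All function and variable names are our own.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter of the rows of X."""
    m = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        pc = len(Xc) / len(X)                            # class prior p_k
        mc = Xc.mean(axis=0)                             # class mean m_k
        Sb += pc * np.outer(mc - m, mc - m)
        Sw += pc * np.cov(Xc, rowvar=False, bias=True)   # p_k * S_k
    return Sb, Sw

def top_eigvecs(A, l, symmetric=True):
    """Eigenvectors of A corresponding to its l largest (real) eigenvalues."""
    vals, vecs = np.linalg.eigh(A) if symmetric else np.linalg.eig(A)
    idx = np.argsort(vals.real)[::-1][:l]
    return vecs[:, idx].real

X = np.random.randn(60, 10)
y = np.repeat([0, 1, 2], 20)
Sb, Sw = scatter_matrices(X, y)
W_lda = top_eigvecs(np.linalg.pinv(Sw) @ Sb, 2, symmetric=False)  # classical LDA directions
W_mmc = top_eigvecs(Sb - Sw, 2)                                   # MMC directions of Eq.(1)
```

Note that the MMC variant never forms S_w^{-1}, which is exactly why it does not suffer from the small sample size problem.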
From another point of view, linear feature extraction can also be treated as learning a proper Mahalanobis distance between pairwise points, since

    ||y_i - y_j||^2 = ||W^T (x_i - x_j)||^2 = (x_i - x_j)^T W W^T (x_i - x_j).

Let M = W W^T; then ||y_i - y_j||^2 = (x_i - x_j)^T M (x_i - x_j). Weinberger et al. [15] proposed a large margin criterion to learn a proper M for the k Nearest Neighbor classifier, and optimized it through a Semidefinite Programming (SDP) procedure. Unfortunately, the computational burden of SDP is high, which limits its potential application to high-dimensional datasets.

3. Feature Extraction by Average Neighborhood Margin Maximization (ANMM)

In this section we introduce our Average Neighborhood Margin Maximization (ANMM) algorithm in detail. Like other linear feature extraction methods, ANMM aims to learn a projection matrix W such that the data in the projected space have high within-class similarity and between-class separability. To achieve this goal, we first introduce two types of neighborhoods:

Definition 1 (Homogeneous Neighborhood). For a data point x_i, its ξ nearest homogeneous neighborhood N_i^o is the set of the ξ most similar data points which are in the same class as x_i.

Definition 2 (Heterogeneous Neighborhood). For a data point x_i, its ζ nearest heterogeneous neighborhood N_i^e is the set of the ζ most similar data points which are not in the same class as x_i.

(Throughout this paper, two data vectors are considered to be similar if the Euclidean distance between them is small, and two data tensors are considered to be similar if the Frobenius norm of their difference tensor is small.)

Then the average neighborhood margin γ_i of x_i is defined as

    γ_i = \sum_{k: x_k \in N_i^e} ||y_i - y_k||^2 / |N_i^e| - \sum_{j: x_j \in N_i^o} ||y_i - y_j||^2 / |N_i^o|,

where |·| represents the cardinality of a set. Literally, this margin measures the difference between the average distance from x_i to the data points in its heterogeneous neighborhood and the average distance from it to the data points in its homogeneous neighborhood. Maximizing such a margin pushes the data points whose labels differ from that of x_i away from x_i, while pulling the data points sharing the class label of x_i towards x_i. Fig. 1 gives an intuitive illustration of the ANMM criterion.

[Figure 1. An intuitive illustration of the ANMM criterion: (a) the neighborhood in the original space, (b) the neighborhood in the projected space. The yellow disk in the center represents x_i, the blue disks are the data points in its homogeneous neighborhood, and the red squares are the data points in its heterogeneous neighborhood.]

Therefore, the total average neighborhood margin can be defined as γ = \sum_i γ_i, and the ANMM criterion is to maximize γ. Since

    \sum_i \sum_{k: x_k \in N_i^e} ||y_i - y_k||^2 / |N_i^e|
      = \sum_i \sum_{k: x_k \in N_i^e} tr[ (y_i - y_k)(y_i - y_k)^T ] / |N_i^e|
      = \sum_i \sum_{k: x_k \in N_i^e} tr[ W^T (x_i - x_k)(x_i - x_k)^T W ] / |N_i^e|
      = tr( W^T S W ),                                                    (3)

where the matrix

    S = \sum_{i,k: x_k \in N_i^e} (x_i - x_k)(x_i - x_k)^T / |N_i^e|      (4)

is called the scatterness matrix. Similarly, if we define the compactness matrix as

    C = \sum_{i,j: x_j \in N_i^o} (x_i - x_j)(x_i - x_j)^T / |N_i^o|,     (5)

then \sum_i \sum_{j: x_j \in N_i^o} ||y_i - y_j||^2 / |N_i^o| = tr( W^T C W ). Therefore the total average neighborhood margin can be rewritten as

    γ = tr[ W^T (S - C) W ].                                              (6)

If we expand W as W = (w_1, w_2, ..., w_l), then γ = \sum_{k=1}^{l} w_k^T (S - C) w_k. To eliminate the freedom of scaling W by a nonzero scalar, we add the constraint w_k^T w_k = 1, i.e. we restrict W to consist of unit vectors. Thus our criterion becomes

    max \sum_{k=1}^{l} w_k^T (S - C) w_k    s.t.  w_k^T w_k = 1.          (7)
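To make Eqs.(4)-(7) concrete, the following NumPy sketch (our own illustration, not the authors' code) performs the ANMM training step as described above: for each point it finds the homogeneous and heterogeneous neighborhoods, accumulates the compactness and scatterness matrices, and keeps the top eigenvectors of S - C as the columns of W. Here `xi` and `zeta` play the roles of the neighborhood sizes ξ and ζ.

```python
import numpy as np

def anmm(X, y, l, xi=5, zeta=5):
    """Minimal ANMM sketch: X is N x d, y holds the class labels.
    Returns the d x l projection W (top eigenvectors of S - C)."""
    N, d = X.shape
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)     # pairwise Euclidean distances
    S, C = np.zeros((d, d)), np.zeros((d, d))
    for i in range(N):
        same = (y == y[i]); same[i] = False
        diff = (y != y[i])
        homo = np.where(same)[0][np.argsort(dist[i, same])[:xi]]     # xi nearest same-class points
        hetero = np.where(diff)[0][np.argsort(dist[i, diff])[:zeta]] # zeta nearest other-class points
        for k in hetero:                          # scatterness matrix, Eq.(4)
            v = X[i] - X[k]
            S += np.outer(v, v) / len(hetero)
        for j in homo:                            # compactness matrix, Eq.(5)
            v = X[i] - X[j]
            C += np.outer(v, v) / len(homo)
    vals, vecs = np.linalg.eigh(S - C)            # S - C is symmetric
    return vecs[:, np.argsort(vals)[::-1][:l]]

X = np.random.randn(100, 20)
y = np.repeat(np.arange(5), 20)
W = anmm(X, y, l=3)
features = X @ W                                  # projected data, y_i = W^T x_i
```

Because the eigendecomposition returns orthonormal eigenvectors, the unit-norm constraint of Eq.(7) is satisfied automatically and no further normalization is needed.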

Using the Lagrangian method, we can easily find that the optimal W is composed of the l eigenvectors corresponding to the l largest eigenvalues of S - C. To summarize, the main procedure of ANMM is shown in Table 1.

Table 1. Average Neighborhood Margin Maximization (ANMM)
Input: Training set D = {(x_i, y_i)}_{i=1}^N, testing set Z = {z_1, z_2, ..., z_M}, neighborhood sizes ξ and ζ, desired dimensionality l.
Output: l x M feature matrix F extracted from Z.
1. Construct the heterogeneous and homogeneous neighborhoods of each x_i;
2. Construct the scatterness matrix S and the compactness matrix C using Eq.(4) and Eq.(5) respectively;
3. Do an eigenvalue decomposition of S - C and form the d x l matrix W whose columns are the eigenvectors of S - C corresponding to its l largest eigenvalues;
4. Output F = W^T Z with Z = [z_1, z_2, ..., z_M].

4. Nonlinearization via Kernelization

In this section we extend the ANMM algorithm to the nonlinear case via the kernel method [2]. More formally, we first map the dataset from the original space R^d to a high-dimensional (usually infinite-dimensional) feature space F through a nonlinear mapping Φ: R^d -> F, and apply linear ANMM there. In the feature space F, the Euclidean distance between Φ(x_i) and Φ(x_j) can be computed as

    ||Φ(x_i) - Φ(x_j)|| = \sqrt{ (Φ(x_i) - Φ(x_j))^T (Φ(x_i) - Φ(x_j)) } = \sqrt{ K_{ii} + K_{jj} - 2K_{ij} },

where K_{ij} = Φ(x_i)^T Φ(x_j) is the (i, j)-th entry of the kernel matrix K. Thus we can use K to find the heterogeneous and homogeneous neighborhoods of each x_i in the feature space, and the total average neighborhood margin becomes

    γ^Φ = \sum_{k=1}^{l} w_k^T (S^Φ - C^Φ) w_k,                           (8)

where

    S^Φ = \sum_{i,k: Φ(x_k) \in N^e_{Φ(x_i)}} (Φ(x_i) - Φ(x_k))(Φ(x_i) - Φ(x_k))^T / |N^e_{Φ(x_i)}|,
    C^Φ = \sum_{i,j: Φ(x_j) \in N^o_{Φ(x_i)}} (Φ(x_i) - Φ(x_j))(Φ(x_i) - Φ(x_j))^T / |N^o_{Φ(x_i)}|,

and N^e_{Φ(x_i)} and N^o_{Φ(x_i)} are the heterogeneous and homogeneous neighborhoods of Φ(x_i). It is impossible to compute S^Φ and C^Φ directly, since we usually do not know the explicit form of Φ. To avoid this problem, we note that each w_k lies in the span of Φ(x_1), Φ(x_2), ..., Φ(x_N), i.e.

    w_k = \sum_{p=1}^{N} α_p^k Φ(x_p).

Therefore

    w_k^T Φ(x_i) = \sum_{p=1}^{N} α_p^k Φ(x_p)^T Φ(x_i) = (α^k)^T K_i,

where α^k is a column vector whose p-th entry equals α_p^k and K_i is the i-th column of K. Thus

    w_k^T (Φ(x_i) - Φ(x_j))(Φ(x_i) - Φ(x_j))^T w_k = (α^k)^T (K_i - K_j)(K_i - K_j)^T α^k.

Define the matrices

    S̃^Φ = \sum_{i,k: Φ(x_k) \in N^e_{Φ(x_i)}} (K_i - K_k)(K_i - K_k)^T / |N^e_{Φ(x_i)}|,       (9)
    C̃^Φ = \sum_{i,j: Φ(x_j) \in N^o_{Φ(x_i)}} (K_i - K_j)(K_i - K_j)^T / |N^o_{Φ(x_i)}|,      (10)

then

    γ^Φ = \sum_{k=1}^{l} w_k^T (S^Φ - C^Φ) w_k = \sum_{k=1}^{l} (α^k)^T (S̃^Φ - C̃^Φ) α^k.

Similar to Eq.(7), we also add the constraints (α^k)^T (α^k) = 1 for k = 1, 2, ..., l. Then the optimal α^k's are the eigenvectors of S̃^Φ - C̃^Φ corresponding to its l largest eigenvalues. For a new test point z, its k-th extracted feature can be computed as

    w_k^T Φ(z) = \sum_{p=1}^{N} α_p^k Φ(x_p)^T Φ(z) = (α^k)^T K^t_z,      (11)

where K^t denotes the kernel matrix between the training set and the testing set. The main procedure of the Kernel Average Neighborhood Margin Maximization (KANMM) algorithm is summarized in Table 2.
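Since the kernel version touches the data only through K, a sketch needs little more than replacing the raw vectors with columns of K. The following NumPy illustration is our own (the Gaussian kernel, the defaults and all names are our choices, not prescribed by the paper): it finds neighborhoods from the kernel-induced distances K_ii + K_jj - 2K_ij, builds the matrices of Eqs.(9)-(10), and extracts test features via Eq.(11).

```python
import numpy as np

def rbf_kernel(A, B, theta=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * theta ** 2))

def kanmm(K, y, l, xi=5, zeta=5):
    """K: N x N training kernel matrix. Returns the N x l matrix whose columns are the alpha^k."""
    N = K.shape[0]
    d2 = np.diag(K)[:, None] + np.diag(K)[None, :] - 2 * K   # ||phi(x_i) - phi(x_j)||^2
    S, C = np.zeros((N, N)), np.zeros((N, N))
    for i in range(N):
        same = (y == y[i]); same[i] = False
        diff = (y != y[i])
        homo = np.where(same)[0][np.argsort(d2[i, same])[:xi]]
        hetero = np.where(diff)[0][np.argsort(d2[i, diff])[:zeta]]
        for k in hetero:                          # Eq.(9), built from kernel columns K_i
            v = K[:, i] - K[:, k]
            S += np.outer(v, v) / len(hetero)
        for j in homo:                            # Eq.(10)
            v = K[:, i] - K[:, j]
            C += np.outer(v, v) / len(homo)
    vals, vecs = np.linalg.eigh(S - C)
    return vecs[:, np.argsort(vals)[::-1][:l]]

Xtr = np.random.randn(80, 10)
ytr = np.repeat(np.arange(4), 20)
Ztst = np.random.randn(5, 10)
alpha = kanmm(rbf_kernel(Xtr, Xtr), ytr, l=3)
F = alpha.T @ rbf_kernel(Xtr, Ztst)               # Eq.(11): l x M feature matrix for the test set
```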

Table 2. Kernel Average Neighborhood Margin Maximization (KANMM)
Input: Training set D = {(x_i, y_i)}_{i=1}^N, testing set Z = {z_1, z_2, ..., z_M}, neighborhood sizes ξ and ζ, kernel parameter θ, desired dimensionality l.
Output: l x M feature matrix F^Φ extracted from Z.
1. Construct the kernel matrix K on the training set;
2. Construct the heterogeneous and homogeneous neighborhoods of each Φ(x_i);
3. Compute S̃^Φ and C̃^Φ using Eq.(9) and Eq.(10) respectively;
4. Do an eigenvalue decomposition of S̃^Φ - C̃^Φ and store the eigenvectors {α^1, α^2, ..., α^l} corresponding to the l largest eigenvalues;
5. Construct the kernel matrix K^t between the training set and the testing set, with (i, j)-th entry K^t_{ij} = Φ(x_i)^T Φ(z_j);
6. Output F^Φ with F^Φ_{ij} = (α^i)^T K^t_j.

5. Multilinearization via Tensorization

Up to now, the method we have introduced assumes that the data are given in vectorized representations. However, as noted in the introduction, in many vision problems the data are more naturally represented as higher-order tensors, so it is necessary to derive the tensor form of our method. First let us introduce some notation and definitions. Let A be a tensor of size d_1 x d_2 x ... x d_K. The order of A is K, and the f-th dimension (or mode) of A has size d_f. A single entry of a tensor is denoted by A_{i_1 i_2 ... i_K}.

Definition 3 (Scalar Product). The scalar product <A, B> of two tensors A, B \in R^{d_1 x d_2 x ... x d_K} is defined as

    <A, B> = \sum_{i_1} \sum_{i_2} ... \sum_{i_K} A_{i_1 i_2 ... i_K} B*_{i_1 i_2 ... i_K},

where * denotes complex conjugation. Furthermore, the Frobenius norm of a tensor A is defined as ||A||_F = \sqrt{<A, A>}.

Definition 4 (f-mode Product). The f-mode product of a tensor A \in R^{d_1 x d_2 x ... x d_K} and a matrix U \in R^{d_f x g_f} is the d_1 x ... x d_{f-1} x g_f x d_{f+1} x ... x d_K tensor denoted A ×_f U, whose entries are given by

    (A ×_f U)_{i_1 ... i_{f-1} j_f i_{f+1} ... i_K} = \sum_{i_f} A_{i_1 ... i_{f-1} i_f i_{f+1} ... i_K} U_{i_f j_f}.

Definition 5 (f-mode Unfolding). Let A be a d_1 x ... x d_K tensor and let (π_1, ..., π_{K-1}) be any permutation of the entries of the set {1, ..., f-1, f+1, ..., K}. The f-mode unfolding of the tensor A into a d_f x \prod_{l=1}^{K-1} d_{π_l} matrix, denoted A^{(f)}, is defined by

    A \in R^{d_1 x ... x d_K}  =>_f  A^{(f)} \in R^{d_f x \prod_{l=1}^{K-1} d_{π_l}},  where  A^{(f)}_{i_f, j} = A_{i_1 ... i_K}  with  j = 1 + \sum_{l=1}^{K-1} (i_{π_l} - 1) \prod_{l'=1}^{l-1} d_{π_{l'}}.

The tensor-based criterion for ANMM is the following: given N data points X_1, ..., X_N embedded in a tensor space R^{d_1 x d_2 x ... x d_K}, we want to pursue K optimal interrelated projection matrices U_i \in R^{l_i x d_i} (l_i < d_i, i = 1, 2, ..., K) which maximize the average neighborhood margin measured in the tensor metric, that is,

    γ = \sum_i [ \sum_{k: X_k \in N_i^e} ||Y_i - Y_k||_F^2 / |N_i^e| - \sum_{j: X_j \in N_i^o} ||Y_i - Y_j||_F^2 / |N_i^o| ],

where Y_i = X_i ×_1 U_1 ×_2 U_2 ... ×_K U_K. Note that directly maximizing γ is almost infeasible, since it is a higher-order optimization problem. Such problems can generally be solved approximately by an iterative scheme which was originally proposed by [12] for low-rank approximation of second-order tensors and later extended to higher-order tensors by [8]. In the following we adopt this iterative scheme to solve our optimization problem. Given U_1, ..., U_{f-1}, U_{f+1}, ..., U_K, let

    Y_i^f = X_i ×_1 U_1 ... ×_{f-1} U_{f-1} ×_{f+1} U_{f+1} ... ×_K U_K.  (12)

Then, by the corresponding f-mode unfolding, we get Y_i^f =>_f Y_i^{(f)}. Moreover, we can easily derive that ||Y_i^f ×_f U_f||_F = ||(Y_i^{(f)})^T U_f||_F. Therefore we have

    ||Y_i - Y_j||_F^2 = ||X_i ×_1 U_1 ... ×_K U_K - X_j ×_1 U_1 ... ×_K U_K||_F^2
                      = ||Y_i^f ×_f U_f - Y_j^f ×_f U_f||_F^2
                      = ||(Y_i^{(f)})^T U_f - (Y_j^{(f)})^T U_f||_F^2
                      = tr[ U_f^T (Y_i^{(f)} - Y_j^{(f)})(Y_i^{(f)} - Y_j^{(f)})^T U_f ].

Then, knowing U_1, ..., U_{f-1}, U_{f+1}, ..., U_K, we can rewrite the scatterness and compactness matrices in the tensor case as

    S̃ = \sum_{i,k: X_k \in N_i^e} (Y_i^{(f)} - Y_k^{(f)})(Y_i^{(f)} - Y_k^{(f)})^T / |N_i^e|,   (13)
    C̃ = \sum_{i,j: X_j \in N_i^o} (Y_i^{(f)} - Y_j^{(f)})(Y_i^{(f)} - Y_j^{(f)})^T / |N_i^o|.   (14)
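As a brief aside before continuing the derivation, the f-mode unfolding and f-mode product of Definitions 4 and 5 can be written in a few lines of NumPy. The sketch below is our own illustration: it passes the projection matrix already transposed (size g_f x d_f, whereas Definition 4 uses d_f x g_f) and fixes one particular column ordering for the unfolding; neither choice affects the Frobenius-norm identities used in the derivation.

```python
import numpy as np

def unfold(A, f):
    """f-mode unfolding: move mode f to the front and flatten the remaining modes."""
    return np.moveaxis(A, f, 0).reshape(A.shape[f], -1)

def fold(M, f, shape):
    """Inverse of unfold for a tensor of the given target shape."""
    rest = list(shape)
    df = rest.pop(f)
    return np.moveaxis(M.reshape([df] + rest), 0, f)

def mode_product(A, U, f):
    """f-mode product A x_f U; here U has size (g_f, d_f), i.e. the transpose of the
    matrix in Definition 4, so it multiplies the unfolding from the left."""
    Y = U @ unfold(A, f)
    shape = list(A.shape)
    shape[f] = U.shape[0]
    return fold(Y, f, shape)

A = np.random.randn(4, 5, 6)                      # a 3rd-order tensor
U1, U2, U3 = np.random.randn(2, 4), np.random.randn(3, 5), np.random.randn(3, 6)
Y = mode_product(mode_product(mode_product(A, U1, 0), U2, 1), U3, 2)
print(Y.shape)                                    # (2, 3, 3): A x_1 U_1 x_2 U_2 x_3 U_3
# The Frobenius norm is unchanged by any unfolding, as used in the derivation:
assert np.isclose(np.linalg.norm(Y), np.linalg.norm(unfold(Y, 1)))
```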

The optimization problem with respect to U_f then becomes

    max_{U_f} tr[ U_f^T (S̃ - C̃) U_f ].                                   (15)

Let us expand U_f as U_f = (u_{f1}, u_{f2}, ..., u_{f l_f}), with u_{fi} the i-th column of U_f; then Eq.(15) can be rewritten as

    max \sum_{i=1}^{l_f} u_{fi}^T (S̃ - C̃) u_{fi}.                        (16)

We also add the constraint u_{fi}^T u_{fi} = 1 to restrict the scale of U_f. The main procedure of the Tensor Average Neighborhood Margin Maximization (TANMM) algorithm is summarized in Table 3.

Table 3. Tensor Average Neighborhood Margin Maximization (TANMM)
Input: Training set D = {(X_i, y_i)}_{i=1}^N, testing set Z = {Z_1, Z_2, ..., Z_M}, where X_i, Z_j \in R^{d_1 x d_2 x ... x d_K}, neighborhood sizes ξ and ζ, desired dimensionalities l_1, l_2, ..., l_K, iteration steps T_max, difference ε.
Output: Feature tensors {F_i}_{i=1}^M extracted from Z, where F_i \in R^{l_1 x l_2 x ... x l_K}.
1. Initialize U_1^0 = I_{d_1}, U_2^0 = I_{d_2}, ..., U_K^0 = I_{d_K}, where I_d represents the d x d identity matrix;
2. For t = 1, 2, ..., T_max do
   For f = 1, 2, ..., K do
   (a) Compute Y_i^f by Eq.(12);
   (b) Unfold Y_i^f =>_f Y_i^{(f)};
   (c) Compute S̃ and C̃ using Eq.(13) and Eq.(14);
   (d) Do an eigenvalue decomposition of S̃ - C̃: (S̃ - C̃) U_f^t = U_f^t Λ_f, with U_f^t \in R^{d_f x l_f};
   (e) If ||U_f^t - U_f^{t-1}|| < ε, break;
   End for
   End for
3. Output F_i = Z_i ×_1 U_1^t ... ×_K U_K^t.

6. Experiments

In this section we investigate the performance of our proposed ANMM, Kernel ANMM (KANMM), and Tensor ANMM (TANMM) methods for face recognition. We performed three groups of experiments:

1. Linear methods. The performance of the original ANMM is compared with the traditional PCA method [16], the LDA (PCA+LDA) method [18], and three margin based methods, namely the Maximum Margin Criterion (MMC) method [6], the Stepwise Nonparametric Maximum Margin Criterion (SNMMC) method [23], and the Marginal Fisher Analysis (MFA) method [21];
2. Kernel methods. The performance of the KANMM method is compared with Kernel PCA (KPCA) and the Kernel Discriminant Analysis (KDA) method [17];
3. Tensor methods. The performance of the Tensor ANMM (TANMM) method is compared with the Tensor PCA (TPCA) and Tensor LDA (TLDA) methods [4].

In this study, three face datasets are used:

1. The ORL face dataset. There are ten images for each of the 40 subjects, taken at different times, with varying lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). The images were taken with a tolerance for some tilting and rotation of the face of up to 20 degrees. The original images (with 256 gray levels) have size 92 x 112 and are resized to 32 x 32 for efficiency;
2. The Yale face dataset. It contains 11 grayscale images for each of the 15 individuals, demonstrating variations in lighting condition (left-light, center-light, right-light), facial expression (normal, happy, sad, sleepy, surprised, and wink), and with/without glasses. In our experiments these images were also resized to 32 x 32;
3. The CMU PIE face dataset [22]. It contains 68 individuals with 41,368 face images in total. The face images were captured by 13 synchronized cameras and 21 flashes under varying pose, illumination, and expression. In our experiments, five near-frontal poses (C05, C07, C09, C27, C29) are selected under different illuminations, lighting and expressions, which leaves 170 near-frontal face images per individual; all of these images were also resized to 32 x 32.

The free parameters of the tested methods were determined in the following ways:

1. For the ANMM-series methods (ANMM, KANMM, TANMM), the sizes of the homogeneous and heterogeneous neighborhoods of each data point are all set to 10;
2. For the kernel methods, we adopt the Gaussian kernel, and the variance of the Gaussian kernel is set by cross-validation;
3. For the tensor methods, we require that the projected images are also square, i.e. of dimension r x r for some r.
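Returning for a moment to the tensor algorithm of Section 5: to ground the alternating scheme of Table 3, here is a compact sketch (our own simplification, not the authors' code) of the 2nd-order case, where every sample is a matrix X_i and we adopt the convention Y_i = U_1^T X_i U_2 with U_f of size d_f x l_f. Neighborhoods are fixed once from Frobenius distances between the raw samples, and the two modes are updated in turn from the eigendecomposition of the corresponding S̃ - C̃; all names and default values are ours.

```python
import numpy as np

def neighborhoods(X, y, xi, zeta):
    """Homogeneous / heterogeneous neighborhoods from Frobenius distances."""
    N = len(X)
    D = np.array([[np.linalg.norm(X[i] - X[j]) for j in range(N)] for i in range(N)])
    homo, hetero = [], []
    for i in range(N):
        same = (y == y[i]); same[i] = False
        diff = (y != y[i])
        homo.append(np.where(same)[0][np.argsort(D[i, same])[:xi]])
        hetero.append(np.where(diff)[0][np.argsort(D[i, diff])[:zeta]])
    return homo, hetero

def top_eigvecs(A, l):
    vals, vecs = np.linalg.eigh(A)
    return vecs[:, np.argsort(vals)[::-1][:l]]

def tanmm_2d(X, y, l1, l2, xi=5, zeta=5, T=5):
    """Alternating TANMM sketch for matrix data, convention Y_i = U1^T X_i U2."""
    N, d1, d2 = X.shape
    U1, U2 = np.eye(d1)[:, :l1], np.eye(d2)[:, :l2]   # truncated-identity initialization
    homo, hetero = neighborhoods(X, y, xi, zeta)
    for _ in range(T):
        # mode-1 update: with U2 fixed, work with X_i U2 (d1 x l2)
        P = X @ U2
        S, C = np.zeros((d1, d1)), np.zeros((d1, d1))
        for i in range(N):
            for k in hetero[i]:
                Dk = P[i] - P[k]; S += Dk @ Dk.T / len(hetero[i])
            for j in homo[i]:
                Dj = P[i] - P[j]; C += Dj @ Dj.T / len(homo[i])
        U1 = top_eigvecs(S - C, l1)
        # mode-2 update: with U1 fixed, work with U1^T X_i (l1 x d2)
        Q = np.einsum('ab,nbc->nac', U1.T, X)
        S, C = np.zeros((d2, d2)), np.zeros((d2, d2))
        for i in range(N):
            for k in hetero[i]:
                Dk = Q[i] - Q[k]; S += Dk.T @ Dk / len(hetero[i])
            for j in homo[i]:
                Dj = Q[i] - Q[j]; C += Dj.T @ Dj / len(homo[i])
        U2 = top_eigvecs(S - C, l2)
    return U1, U2

X = np.random.randn(60, 16, 16)                   # 60 toy "images"
y = np.repeat(np.arange(3), 20)
U1, U2 = tanmm_2d(X, y, l1=4, l2=4)
Y = np.einsum('ab,nbc,cd->nad', U1.T, X, U2)      # projected 4 x 4 feature "images"
```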

[Figure 2. Face recognition accuracies on the ORL dataset with 2, 3, 4 images of each individual randomly selected for training.]
[Figure 3. Face recognition accuracies on the Yale dataset with 2, 3, 4 images per individual randomly selected for training.]
[Figure 4. Face recognition accuracies on the CMU PIE dataset with 5, 10, 20 images per individual randomly selected for training.]

The experimental results of the linear methods on the three datasets are shown in Fig. 2, Fig. 3 and Fig. 4 respectively. In all the figures, the abscissa represents the projected dimension and the ordinate is the average recognition accuracy over 50 independent runs. From the figures we can clearly see that the performance of ANMM is better than that of the other linear methods on all three datasets. Table 4 shows the experimental results of all the methods on the three datasets, where the value in each entry is the average recognition accuracy (in percent) over 50 independent trials and the number in brackets is the corresponding projected dimension. The table shows that the ANMM-series methods perform better than the traditional methods on the three datasets.

7. Conclusions and Discussions

In this paper we proposed a novel supervised linear feature extraction method named Average Neighborhood Margin Maximization (ANMM). For each data point, ANMM aims at pulling the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. Moreover, as many computer vision and pattern recognition problems are intrinsically nonlinear or multilinear, we also derived the kernelized and tensorized counterparts of ANMM. Finally, experimental results on face recognition were presented to show the effectiveness of our proposed approaches.

Table 4. Face recognition results on the three datasets (%). Each entry is the average recognition accuracy, with the corresponding projected dimension in brackets; the three column groups are ORL (2 / 3 / 4 Train), Yale (2 / 3 / 4 Train) and CMU PIE (5 / 10 / 20 Train).

PCA    | 54.35(56)  64.71(64)  71.54(36) | 45.19(37)  51.91(35)  56.3(4)   | 46.64(24)   54.72(213)  67.17(241)
LDA    | 77.36(28)  86.96(39)  91.71(39) | 46.4(9)    59.25(13)  68.9(12)  | 57.5(62)    76.75(62)   88.6(61)
MMC    | 77.73(54)  85.98(29)  91.26(52) | 46.64(54)  58.8(56)   71.67(39) | 57.5(21)    77.56(215)  85.54(195)
SNMMC  | 79.23(49)  87.68(54)  93.59(36) | 49.5(49)   66.31(49)  78.57(47) | 66.45(223)  88(213)     91.2(22)
MFA    | 77.34(41)  87.19(33)  92.19(33) | 49.56(38)  64.6(38)   76.5(39)  | 63.6(21)    89(232)     88.69(25)
ANMM   | 82.13(37)  89.13(41)  95.84(43) | 55(41)     67.87(38)  89(41)    | 7.5(222)    82.8(23)    93.46(25)
KPCA   | 64.23(5)   75.25(54)  79.26(6)  | 49.34(45)  55.78(47)  62(54)    | 52.35(341)  62(384)     72.25(256)
KDA    | 89(38)     89.13(36)  93.12(38) | 52.35(14)  64.89(13)  71.95(14) | 62.13(67)   81.27(66)   92.11(65)
KANMM  | 85.46(5)   92.21(39)  96.13(53) | 54.62(54)  69.25(66)  87(62)    | 72.1(32)    82.41(28)   93.67(218)
TPCA   | 59.22(1²)  71.25(12²) 79.86(1²) | 55(7²)     57.23(11²) 62.3(1²)  | 51.17(1²)   56.65(13²)  69.9(11²)
TLDA   | 88(9²)     89.28(11²) 93.37(8²) | 51.25(9²)  66.19(1²)  75.88(9²) | 61(12²)     85(14²)     92.75(8²)
TANMM  | 85.87(1²)  92.54(9²)  96.22(11²)| 55.31(11²) 73(8²)     81.56(1²) | 73.2(12²)   82.78(9²)   94.32(11²)

As we mentioned in Section 2, linear feature extraction methods can also be viewed as learning a proper Mahalanobis distance in the original data space, so ANMM can also be used for distance metric learning. From this viewpoint, our algorithm is more efficient in that it only needs to learn the transformation matrix, rather than the whole covariance matrix as in traditional metric learning algorithms [15].

References

[1] A. K. Jain, B. Chandrasekaran. Dimensionality and Sample Size Considerations in Pattern Recognition Practice. In Handbook of Statistics. North Holland, Amsterdam.
[2] B. Schölkopf, A. Smola. Learning with Kernels. The MIT Press, Cambridge, Massachusetts.
[3] B. Schölkopf, A. Smola, K.-R. Müller. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 10.
[4] D. Cai, X. He, J. Han. Subspace Learning Based on Tensor Analysis. Department of Computer Science Technical Report UIUCDCS-R-2005-2572, University of Illinois at Urbana-Champaign, 2005.
[5] F. De la Torre, M. A. O. Vasilescu. Linear and Multilinear (Tensor) Methods for Vision, Graphics, and Signal Processing. IEEE CVPR Tutorial.
[6] H. Li, T. Jiang, K. Zhang. Efficient and Robust Feature Extraction by Maximum Margin Criterion. In NIPS.
[7] H. S. Seung, D. D. Lee. The manifold ways of perception. Science.
[8] H. Wang, Q. Wu, L. Shi, Y. Yu, N. Ahuja. Out-of-Core Tensor Approximation of Multi-Dimensional Matrices of Visual Data. In Proceedings of ACM SIGGRAPH.
[9] H. Yu, J. Yang. A Direct LDA Algorithm for High Dimensional Data with Application to Face Recognition. Pattern Recognition.
[10] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York.
[11] J. Yang, D. Zhang, A. F. Frangi, J. Yang. Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition. IEEE TPAMI, 2004.
[12] J. Ye. Generalized Low Rank Approximations of Matrices. In Proceedings of ICML.
[13] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, New York, 2nd edition.
[14] K. Liu, Y. Cheng, J. Yang. A Generalized Optimal Set of Discriminant Vectors. Pattern Recognition.
[15] K. Q. Weinberger, J. Blitzer, L. K. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. In NIPS.
[16] M. A. Turk, A. P. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1): 71-96.
[17] M.-H. Yang. Kernel Eigenfaces vs.
Kernel Fisherfaces: Face Recognition Using Kernel Methods. In Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition.
[18] P. N. Belhumeur, J. Hespanha, D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. on PAMI.
[19] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, K.-R. Müller. Fisher Discriminant Analysis with Kernels. Neural Networks for Signal Processing IX, IEEE.
[20] S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science.
[21] S. Yan, D. Xu, B. Zhang, H. Zhang. Graph Embedding: A General Framework for Dimensionality Reduction. In Proceedings of IEEE CVPR.
[22] T. Sim, S. Baker, M. Bsat. The CMU pose, illumination, and expression database. IEEE Trans. on PAMI.
[23] X. Qiu, L. Wu. Face Recognition by Stepwise Nonparametric Margin Maximum Criterion. In Proc. ICCV.
