Feature Extraction by Maximizing the Average Neighborhood Margin
Fei Wang, Changshui Zhang
State Key Laboratory of Intelligent Technologies and Systems, Department of Automation, Tsinghua University, Beijing, China

Abstract

A novel algorithm called Average Neighborhood Margin Maximization (ANMM) is proposed for supervised linear feature extraction. For each data point, ANMM aims at pulling the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. We will show that the features extracted by ANMM can separate the data from different classes well, and that ANMM avoids the small sample size problem that exists in traditional Linear Discriminant Analysis (LDA). The kernelized (nonlinear) counterpart of ANMM is also established in this paper. Moreover, since in many computer vision applications the data are more naturally represented by higher-order tensors (e.g. images and videos), we develop a tensorized (multilinear) form of ANMM, which can extract features from tensors directly. Experimental results of applying ANMM to face recognition are presented to show the effectiveness of our method.

1. Introduction

Feature extraction (or dimensionality reduction) is an important research topic in the computer vision and pattern recognition fields, since (1) the curse of high dimensionality is usually a major cause of the limitations of many practical technologies; (2) large quantities of features may even degrade the performance of a classifier when the size of the training set is small compared to the number of features [1]. In the past several decades, many feature extraction methods have been proposed, among which the most well-known are Principal Component Analysis (PCA) [10] and Linear Discriminant Analysis (LDA). However, there are still some limitations when directly applying them to vision problems.
Firstly, although PCA is a popular unsupervised method which aims at extracting a subspace in which the variance of the projected data is maximized (or, equivalently, the reconstruction error is minimized), it does not take the class information into account and thus may not be reliable for classification tasks. On the contrary, LDA is a supervised technique which has been shown to be more effective than PCA in many applications. It aims to maximize the between-class scatter and simultaneously minimize the within-class scatter. Unfortunately, it has also been pointed out that LDA has some drawbacks [13], such as: (1) it usually suffers from the small sample size problem [18], which makes the within-class scatter matrix singular; (2) it is only optimal for the case where the distribution of the data in each class is a Gaussian with an identical covariance matrix; (3) LDA can extract at most c − 1 features (where c is the number of different classes), which is suboptimal for many applications. Another limitation of PCA and LDA is that they are both linear methods. However, it has been discovered that many vision problems may not be linear [7][20], which makes these linear approaches inefficient. Fortunately, kernel based methods [2] can handle such nonlinear cases very well. The basic idea behind kernel based techniques is to first map the data to a high-dimensional (usually infinite-dimensional) feature space, making the nonlinear problem in the original space linearly solvable in the feature space. It has been shown that kernelized PCA [3] and kernelized LDA [19] can significantly improve the performance of the original PCA and LDA in many computer vision and pattern recognition problems. Finally, PCA and LDA take their inputs as vectorial data, but in many real-world vision problems the data are more naturally represented as higher-order tensors. For example, a captured image is a 2nd-order tensor, i.e. a matrix, and sequential data, such as a video sequence for event analysis, is in the form of a 3rd-order tensor.
Thus it is necessary to derive multilinear forms of these traditional linear feature extraction methods to handle the data as tensors directly. Recently this research topic has received considerable interest from the computer vision and pattern recognition community [5], and the proposed methods have been shown to be much more efficient than the traditional vectorial methods.

In this paper, we propose a novel supervised linear feature extraction method called Average Neighborhood Margin Maximization (ANMM). For each data point, ANMM aims to pull the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. Compared with traditional LDA, our method has the following advantages:

1. ANMM avoids the small sample size problem [18] since it does not need to compute any matrix inverse;
2. ANMM can find the discriminant directions without assuming a particular form for the class densities;
3. Many more feature dimensions are available in ANMM, which is not limited to c − 1 as in LDA.

Moreover, we also derive the nonlinear and multilinear forms of ANMM for handling nonlinear and tensor data. Finally, experimental results on face recognition are presented to show the effectiveness of our method.

The rest of this paper is organized as follows. In section 2 we briefly review some methods that are closely related to ANMM. The details of the ANMM algorithm are introduced in section 3. In sections 4 and 5 we develop the kernelized and tensorized forms of ANMM. Experimental results on face recognition are presented in section 6, followed by conclusions and discussions in section 7.

2. Related Works

In this section we briefly review some linear feature extraction methods that are closely related to ANMM. First let us set up some notation and the problem definition. Let {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)} be the empirical dataset, where x_i ∈ R^d is the i-th datum represented by a d-dimensional column vector, and y_i ∈ L is the label of x_i, with L = {1, 2, ..., c} the label set. The goal of linear feature extraction is to learn a d × l projection matrix W, which projects x_i to y_i = W^T x_i, where y_i ∈ R^l is the projected datum with l ≪ d, such that in the projected space the data from different classes can be effectively discriminated. Traditional LDA learns W by maximizing the criterion

J = |W^T S_b W| / |W^T S_w W|,

where S_b = Σ_{k=1}^c p_k (m_k − m)(m_k − m)^T is the between-class scatter matrix, p_k and m_k are the prior and mean of class k, and m is the mean of the entire dataset.
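The LDA baseline above can be sketched in a few lines of numpy. This is not the authors' code but a minimal illustration of the criterion just stated: build S_b and S_w from class priors and means, then take the top eigenvectors of S_w^{-1} S_b. The function name `lda_directions` is ours, not from the paper, and the sketch assumes S_w is nonsingular (exactly the assumption the small sample size discussion below shows can fail).

```python
import numpy as np

def lda_directions(X, y, l):
    """Classical LDA: top-l eigenvectors of Sw^{-1} Sb.

    X: (N, d) data matrix, y: (N,) integer labels, l: output dimension.
    Fails when Sw is singular (the small sample size problem).
    """
    classes, counts = np.unique(y, return_counts=True)
    N, d = X.shape
    m = X.mean(axis=0)                       # global mean
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for k, n_k in zip(classes, counts):
        Xk = X[y == k]
        mk = Xk.mean(axis=0)
        p_k = n_k / N                        # class prior p_k
        Sb += p_k * np.outer(mk - m, mk - m)
        Sw += p_k * np.cov(Xk, rowvar=False, bias=True)
    # eigenvectors of Sw^{-1} Sb, sorted by decreasing eigenvalue
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:l]
    return evecs[:, order].real              # (d, l) projection matrix W
```

Note that with c classes S_b has rank at most c − 1, which is why at most c − 1 nonzero eigenvalues, and hence at most c − 1 useful features, are available.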
S_w = Σ_{k=1}^c p_k S_k is the within-class scatter matrix, with S_k the covariance matrix of class k. It has been shown that J is maximized when W is constituted by the eigenvectors of S_w^{-1} S_b corresponding to its l largest eigenvalues [13]. However, when the size of the dataset is small, S_w becomes singular. Then S_w^{-1} does not exist and the small sample size (SSS) problem occurs. Many approaches have been proposed to solve this problem, such as PCA+LDA [18], null space LDA [14], direct LDA [9], etc.

Li et al. [6] further proposed an efficient and robust linear feature extraction method which aims to maximize the following criterion, called a margin in [6]:

J = tr( W^T (S_b − S_w) W ),   (1)

where tr(·) denotes the matrix trace. We can see that no matrix inverse needs to be computed when optimizing this criterion. However, such a margin lacks geometric intuition. Qiu et al. [23] proposed a Nonparametric Margin Maximization Criterion for learning W, which tries to maximize

J = Σ_{i=1}^N w_i ( ||δ_i^E||^2 − ||δ_i^I||^2 )   (2)

in the transformed space, where δ_i^E is the distance between x_i and its nearest neighbor in a different class, and δ_i^I is the distance between x_i and its furthest neighbor in the same class. The problem is that using just the nearest (or furthest) neighbor to define the margin may make the algorithm sensitive to outliers. Moreover, the stepwise procedure for maximizing J is time consuming.

From another point of view, linear feature extraction can also be treated as learning a proper Mahalanobis distance between pairwise points, since

||y_i − y_j||^2 = ||W^T (x_i − x_j)||^2 = (x_i − x_j)^T W W^T (x_i − x_j).

Let M = W W^T; then ||y_i − y_j||^2 = (x_i − x_j)^T M (x_i − x_j). Weinberger et al. [15] proposed a large margin criterion to learn a proper M for the k Nearest Neighbor classifier, optimized through a Semidefinite Programming (SDP) procedure. Unfortunately, the computational burden of SDP is high, which limits its potential application to high-dimensional datasets.

3. Feature Extraction by Average Neighborhood Margin Maximization (ANMM)

In this section we introduce our Average Neighborhood Margin Maximization (ANMM) algorithm in detail. Like other linear feature extraction methods, ANMM aims to learn a projection matrix W such that the data in the projected space have high within-class similarity and between-class separability. To achieve this goal, we first introduce
two types of neighborhoods:

Definition 1 (Homogeneous Neighborhood). For a data point x_i, its ξ nearest homogeneous neighborhood N_i^o is the set of the ξ most similar (1) data points which are in the same class as x_i.

Definition 2 (Heterogeneous Neighborhood). For a data point x_i, its ζ nearest heterogeneous neighborhood N_i^e is the set of the ζ most similar data points which are not in the same class as x_i.

(1) In this paper two data vectors are considered to be similar if the Euclidean distance between them is small; two data tensors are considered to be similar if the Frobenius norm of their difference tensor is small.

Then the average neighborhood margin γ_i for x_i is defined as

γ_i = Σ_{k: x_k ∈ N_i^e} ||y_i − y_k||^2 / |N_i^e| − Σ_{j: x_j ∈ N_i^o} ||y_i − y_j||^2 / |N_i^o|,

where |·| represents the cardinality of a set. Literally, this margin measures the difference between the average distance from x_i to the data points in its heterogeneous neighborhood and the average distance from it to the data points in its homogeneous neighborhood. Maximizing such a margin pushes the data points whose labels differ from that of x_i away from x_i, while pulling the data points having the same class label as x_i towards x_i. Fig. 1 gives an intuitive illustration of the ANMM criterion.

Figure 1. An intuitive illustration of the ANMM criterion: (a) neighborhood in the original space; (b) neighborhood in the projected space. The yellow disk in the center represents x_i. The blue disks are the data points in the homogeneous neighborhood of x_i, and the red squares are the data points in the heterogeneous neighborhood of x_i.

The total average neighborhood margin can then be defined as γ = Σ_i γ_i, and the ANMM criterion is to maximize γ. Since

Σ_i Σ_{k: x_k ∈ N_i^e} ||y_i − y_k||^2 / |N_i^e|
= Σ_i Σ_{k: x_k ∈ N_i^e} tr[ (y_i − y_k)(y_i − y_k)^T ] / |N_i^e|
= tr[ W^T ( Σ_{i,k: x_k ∈ N_i^e} (x_i − x_k)(x_i − x_k)^T / |N_i^e| ) W ]
= tr( W^T S W ),   (3)

where the matrix

S = Σ_{i,k: x_k ∈ N_i^e} (x_i − x_k)(x_i − x_k)^T / |N_i^e|   (4)

is called the scatterness matrix. Similarly, if we define the compactness matrix as

C = Σ_{i,j: x_j ∈ N_i^o} (x_i − x_j)(x_i − x_j)^T / |N_i^o|,   (5)

then Σ_i Σ_{j: x_j ∈ N_i^o} ||y_i − y_j||^2 / |N_i^o| = tr( W^T C W ).
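As a concrete illustration of the two neighborhoods and of Eq.(4) and Eq.(5), the scatterness and compactness matrices can be assembled directly from pairwise distances. This is a numpy sketch, not the authors' code; the function name `anmm_matrices` and the O(N^2 d) brute-force distance computation are our own choices for clarity.

```python
import numpy as np

def anmm_matrices(X, y, xi=3, zeta=3):
    """Sketch of the scatterness matrix S and compactness matrix C.

    For each point x_i, the homogeneous neighborhood N_i^o holds its xi nearest
    same-class points and the heterogeneous neighborhood N_i^e its zeta nearest
    other-class points; each outer product is averaged over the neighborhood size.
    """
    N, d = X.shape
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    S = np.zeros((d, d))
    C = np.zeros((d, d))
    for i in range(N):
        same = np.where((y == y[i]) & (np.arange(N) != i))[0]
        diff = np.where(y != y[i])[0]
        homo = same[np.argsort(D[i, same])[:xi]]      # N_i^o
        hetero = diff[np.argsort(D[i, diff])[:zeta]]  # N_i^e
        for k in hetero:
            v = X[i] - X[k]
            S += np.outer(v, v) / len(hetero)
        for j in homo:
            v = X[i] - X[j]
            C += np.outer(v, v) / len(homo)
    return S, C
```

On well-separated classes, heterogeneous differences are large and homogeneous differences small, so tr(S) dominates tr(C), which is what makes the margin tr(W^T (S − C) W) positive for a good projection.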
Therefore the total average neighborhood margin can be rewritten as

γ = tr[ W^T (S − C) W ].   (6)

If we expand W as W = (w_1, w_2, ..., w_l), then γ = Σ_{k=1}^l w_k^T (S − C) w_k. To eliminate the freedom of multiplying W by an arbitrary nonzero scalar, we add the constraint w_k^T w_k = 1, i.e. we restrict W to be constituted of unit vectors. Thus our criterion becomes

max Σ_{k=1}^l w_k^T (S − C) w_k   s.t. w_k^T w_k = 1.   (7)
Table 1. Average Neighborhood Margin Maximization (ANMM)
Input: Training set D = {(x_i, y_i)}_{i=1}^N, testing set Z = {z_1, z_2, ..., z_M}, neighborhood sizes ξ, ζ, desired dimensionality l;
Output: l × M feature matrix F extracted from Z.
1. Construct the heterogeneous neighborhood and the homogeneous neighborhood of each x_i;
2. Construct the scatterness matrix S and the compactness matrix C using Eq.(4) and Eq.(5) respectively;
3. Do an eigenvalue decomposition of S − C, and construct the d × l matrix W whose columns are the eigenvectors of S − C corresponding to its l largest eigenvalues;
4. Output F = W^T Z with Z = [z_1, z_2, ..., z_M].

Using the Lagrangian method, we can easily find that the optimal W is composed of the l eigenvectors corresponding to the l largest eigenvalues of S − C. To summarize, the main procedure of ANMM is shown in Table 1.

4. Nonlinearization via Kernelization

In this section, we extend the ANMM algorithm to the nonlinear case via the kernel method [2]. More formally, we first map the dataset from the original space R^d to a high (usually infinite) dimensional feature space F through a nonlinear mapping Φ : R^d → F, and apply linear ANMM there. In the feature space F, the Euclidean distance between Φ(x_i) and Φ(x_j) can be computed as

||Φ(x_i) − Φ(x_j)|| = sqrt( (Φ(x_i) − Φ(x_j))^T (Φ(x_i) − Φ(x_j)) ) = sqrt( K_ii + K_jj − 2 K_ij ),

where K_ij = Φ(x_i)^T Φ(x_j) is the (i, j)-th entry of the kernel matrix K. Thus we can use K to find the heterogeneous neighborhood and homogeneous neighborhood of each Φ(x_i) in the feature space, and the total average neighborhood margin becomes

γ^Φ = Σ_{k=1}^l w_k^T (S^Φ − C^Φ) w_k,   (8)

where

S^Φ = Σ_{i,k: Φ(x_k) ∈ N_i^{eΦ}} (Φ(x_i) − Φ(x_k))(Φ(x_i) − Φ(x_k))^T / |N_i^{eΦ}|,
C^Φ = Σ_{i,j: Φ(x_j) ∈ N_i^{oΦ}} (Φ(x_i) − Φ(x_j))(Φ(x_i) − Φ(x_j))^T / |N_i^{oΦ}|,

and N_i^{eΦ} and N_i^{oΦ} are the heterogeneous and homogeneous neighborhoods of Φ(x_i). It is impossible to compute S^Φ and C^Φ directly since we usually do not know the explicit form of Φ. To avoid this problem, we note that each w_k lies in the span of Φ(x_1), Φ(x_2), ..., Φ(x_N), i.e.
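Step 3 of Table 1 is a plain symmetric eigenproblem. The snippet below is a sketch (our own helper name, numpy), assuming S and C have already been built as in Eq.(4) and Eq.(5): because S − C is symmetric, `eigh` returns unit-norm, mutually orthogonal eigenvectors, which automatically satisfy the constraint w_k^T w_k = 1 of Eq.(7).

```python
import numpy as np

def anmm_projection(S, C, l):
    """Top-l eigenvectors of the symmetric matrix S - C (step 3 of Table 1)."""
    evals, evecs = np.linalg.eigh(S - C)    # eigh: ascending eigenvalues
    W = evecs[:, np.argsort(-evals)[:l]]    # d x l, columns are unit vectors
    return W

# usage (step 4 of Table 1): project test vectors stacked as columns of Z
# F = W.T @ Z
```

Note that no matrix inverse appears anywhere, which is how ANMM sidesteps the small sample size problem of LDA.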
w_k = Σ_{p=1}^N α_p^k Φ(x_p).

Therefore

w_k^T Φ(x_i) = Σ_{p=1}^N α_p^k Φ(x_p)^T Φ(x_i) = (α^k)^T K_i,

where α^k is a column vector whose p-th entry is α_p^k, and K_i is the i-th column of K. Thus

w_k^T (Φ(x_i) − Φ(x_j))(Φ(x_i) − Φ(x_j))^T w_k = (α^k)^T (K_i − K_j)(K_i − K_j)^T α^k.

Define the matrices

S̃^Φ = Σ_{i,k: Φ(x_k) ∈ N_i^{eΦ}} (K_i − K_k)(K_i − K_k)^T / |N_i^{eΦ}|,   (9)
C̃^Φ = Σ_{i,j: Φ(x_j) ∈ N_i^{oΦ}} (K_i − K_j)(K_i − K_j)^T / |N_i^{oΦ}|;   (10)

then

γ^Φ = Σ_{k=1}^l w_k^T (S^Φ − C^Φ) w_k = Σ_{k=1}^l (α^k)^T ( S̃^Φ − C̃^Φ ) α^k.

Similar to Eq.(7), we also add the constraints (α^k)^T (α^k) = 1 (k = 1, 2, ..., l). Then the optimal α^k are the eigenvectors of S̃^Φ − C̃^Φ corresponding to its l largest eigenvalues. For a new test point z, its k-th extracted feature can be computed as

w_k^T Φ(z) = Σ_{p=1}^N α_p^k Φ(x_p)^T Φ(z) = (α^k)^T K_z^t,   (11)

where K^t denotes the kernel matrix between the training set and the testing set, and K_z^t its column for z. The main procedure of the Kernel Average Neighborhood Margin Maximization (KANMM) algorithm is summarized in Table 2.
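The key computational trick above is that all feature-space distances needed to build the neighborhoods come from the kernel matrix alone. A minimal sketch (our helper name, numpy) of the identity ||Φ(x_i) − Φ(x_j)||^2 = K_ii + K_jj − 2K_ij:

```python
import numpy as np

def kernel_sq_dists(K):
    """Squared feature-space distances from a kernel matrix K.

    ||Phi(x_i) - Phi(x_j)||^2 = K_ii + K_jj - 2 K_ij, so the neighborhoods of
    each Phi(x_i) can be found without ever forming Phi explicitly.
    """
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2.0 * K
```

With a linear kernel K = X X^T this reduces to ordinary squared Euclidean distances, which is a convenient sanity check.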
Table 2. Kernel Average Neighborhood Margin Maximization (KANMM)
Input: Training set D = {(x_i, y_i)}_{i=1}^N, testing set Z = {z_1, z_2, ..., z_M}, neighborhood sizes ξ, ζ, kernel parameter θ, desired dimensionality l;
Output: l × M feature matrix F^Φ extracted from Z.
1. Construct the kernel matrix K on the training set;
2. Construct the heterogeneous neighborhood and the homogeneous neighborhood of each Φ(x_i);
3. Compute S̃^Φ and C̃^Φ using Eq.(9) and Eq.(10) respectively;
4. Do an eigenvalue decomposition of S̃^Φ − C̃^Φ, and store the eigenvectors {α^1, α^2, ..., α^l} corresponding to the l largest eigenvalues;
5. Construct the kernel matrix K^t between the training set and the testing set, with (i, j)-th entry K^t_ij = Φ(x_i)^T Φ(z_j);
6. Output F^Φ with F^Φ_kj = (α^k)^T K^t_j.

5. Multilinearization via Tensorization

Until now, the ANMM method we have introduced is based on the assumption that the data are in vectorized representations; it is therefore necessary to derive the tensor form of our method. First let us introduce some notation and definitions. Let A be a tensor of size d_1 × d_2 × ... × d_K. The order of A is K, and the f-th dimension (or mode) of A is of size d_f. A single entry of a tensor is denoted by A_{i_1 i_2 ... i_K}.

Definition 3 (Scalar Product). The scalar product <A, B> of two tensors A, B ∈ R^{d_1 × d_2 × ... × d_K} is defined as

<A, B> = Σ_{i_1} Σ_{i_2} ... Σ_{i_K} A_{i_1 i_2 ... i_K} B*_{i_1 i_2 ... i_K},

where * denotes complex conjugation. Furthermore, the Frobenius norm of a tensor A is defined as ||A||_F = sqrt( <A, A> ).

Definition 4 (f-Mode Product). The f-mode product of a tensor A ∈ R^{d_1 × d_2 × ... × d_K} and a matrix U ∈ R^{d_f × g_f} is the d_1 × d_2 × ... × d_{f−1} × g_f × d_{f+1} × ... × d_K tensor denoted A ×_f U, whose entries are given by

(A ×_f U)_{i_1 ... i_{f−1} j_f i_{f+1} ... i_K} = Σ_{i_f} A_{i_1 ... i_{f−1} i_f i_{f+1} ... i_K} U_{i_f j_f}.

Definition 5 (f-Mode Unfolding). Let A be a d_1 × ... × d_K tensor and (π_1, ..., π_{K−1}) be any permutation of the entries of the set {1, ..., f−1, f+1, ..., K}. The f-mode unfolding of the tensor A into a d_f × Π_{l=1}^{K−1} d_{π_l} matrix, denoted A^(f), is defined as

A ∈ R^{d_1 × ... × d_K} →_f A^(f) ∈ R^{d_f × Π_{l=1}^{K−1} d_{π_l}}, where A^(f)_{i_f, j} = A_{i_1 ... i_K} with j = 1 + Σ_{l=1}^{K−1} (i_{π_l} − 1) Π_{l'=1}^{l−1} d_{π_{l'}}.
The tensor-based criterion for ANMM is that, given N data points X_1, ..., X_N embedded in a tensor space R^{d_1 × d_2 × ... × d_K}, we want to pursue K optimal interrelated projection matrices U_i ∈ R^{d_i × l_i} (l_i < d_i, i = 1, 2, ..., K) which maximize the average neighborhood margin measured in the tensor metric, i.e.

γ = Σ_i [ Σ_{k: X_k ∈ N_i^e} ||Y_i − Y_k||_F^2 / |N_i^e| − Σ_{j: X_j ∈ N_i^o} ||Y_i − Y_j||_F^2 / |N_i^o| ],

where Y_i = X_i ×_1 U_1 ×_2 U_2 ... ×_K U_K. Note that directly maximizing γ is almost infeasible since it is a higher-order optimization problem. Generally this type of problem can be solved approximately by employing an iterative scheme, originally proposed by [12] for low-rank approximation of second-order tensors and later extended by [8] to higher-order tensors. In the following we adopt such an iterative scheme to solve our optimization problem. Given U_1, U_2, ..., U_{f−1}, U_{f+1}, ..., U_K, let

Y_i^f = X_i ×_1 U_1 ... ×_{f−1} U_{f−1} ×_{f+1} U_{f+1} ... ×_K U_K.   (12)

Then, by the corresponding f-mode unfolding, we get Y_i^f →_f Y_i^(f). Moreover, we can easily derive that

Y_i^f ×_f U_f →_f (Y_i^(f))^T U_f.

Therefore we have

||Y_i − Y_j||_F^2 = ||X_i ×_1 U_1 ... ×_K U_K − X_j ×_1 U_1 ... ×_K U_K||_F^2
= ||Y_i^f ×_f U_f − Y_j^f ×_f U_f||_F^2
= ||(Y_i^(f))^T U_f − (Y_j^(f))^T U_f||_F^2
= tr[ U_f^T (Y_i^(f) − Y_j^(f))(Y_i^(f) − Y_j^(f))^T U_f ].

Then, knowing U_1, ..., U_{f−1}, U_{f+1}, ..., U_K, we can rewrite the scatterness matrix and the compactness matrix in the tensor case as

S = Σ_{i,k: X_k ∈ N_i^e} (Y_i^(f) − Y_k^(f))(Y_i^(f) − Y_k^(f))^T / |N_i^e|,   (13)
C = Σ_{i,j: X_j ∈ N_i^o} (Y_i^(f) − Y_j^(f))(Y_i^(f) − Y_j^(f))^T / |N_i^o|,   (14)
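The two tensor operations used above (Definitions 4 and 5) have compact numpy realizations. This is a sketch with our own helper names; note that the column ordering produced by `reshape` fixes one particular permutation (π_1, ..., π_{K−1}), which may differ from the convention in Definition 5, but any fixed permutation yields the same outer products Y^(f) (Y^(f))^T in Eq.(13) and Eq.(14).

```python
import numpy as np

def mode_product(A, U, f):
    """f-mode product A x_f U for U of shape (d_f, g_f): contract mode f of A
    with the first index of U (Definition 4), then move the new axis back to
    position f."""
    return np.moveaxis(np.tensordot(A, U, axes=([f], [0])), -1, f)

def mode_unfold(A, f):
    """f-mode unfolding: mode f becomes the rows of a
    d_f x (product of the remaining dims) matrix (Definition 5)."""
    return np.moveaxis(A, f, 0).reshape(A.shape[f], -1)
```

In the TANMM iteration below, `mode_product` computes Y_i^f from Eq.(12) one mode at a time, and `mode_unfold` produces the Y_i^(f) entering Eq.(13) and Eq.(14).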
and our optimization problem (with respect to U_f) becomes

max_{U_f} tr[ U_f^T (S − C) U_f ].   (15)

If we expand U_f as U_f = (u_{f1}, u_{f2}, ..., u_{f l_f}), with u_{fi} the i-th column of U_f, then Eq.(15) can be rewritten as

max Σ_{i=1}^{l_f} u_{fi}^T (S − C) u_{fi}.   (16)

We also add the constraint u_{fi}^T u_{fi} = 1 to restrict the scale of U_f. The main procedure of the Tensor Average Neighborhood Margin Maximization (TANMM) algorithm is summarized in Table 3.

Table 3. Tensor Average Neighborhood Margin Maximization (TANMM)
Input: Training set D = {(X_i, y_i)}_{i=1}^N, testing set Z = {Z_1, Z_2, ..., Z_M}, where X_i, Z_j ∈ R^{d_1 × d_2 × ... × d_K}, neighborhood sizes ξ, ζ, desired dimensionalities l_1, l_2, ..., l_K, iteration steps T_max, difference ε;
Output: Feature tensors {F_i}_{i=1}^M extracted from Z, where F_i ∈ R^{l_1 × l_2 × ... × l_K}.
1. Initialize U_1^0 = I_{d_1}, U_2^0 = I_{d_2}, ..., U_K^0 = I_{d_K}, where I_d represents the d × d identity matrix;
2. For t = 1, 2, ..., T_max do
   For f = 1, 2, ..., K do
   (a) Compute Y_i^f by Eq.(12);
   (b) Unfold Y_i^f →_f Y_i^(f);
   (c) Compute S and C using Eq.(13) and Eq.(14);
   (d) Do an eigenvalue decomposition of S − C: (S − C) U_f^t = U_f^t Λ_f with U_f^t ∈ R^{d_f × l_f};
   (e) If ||U_f^t − U_f^{t−1}|| < ε, break;
   End for
   End for
3. Output F_i = Z_i ×_1 U_1^t ... ×_K U_K^t.

6. Experiments

In this section, we investigate the performance of our proposed ANMM, Kernel ANMM (KANMM) and Tensor ANMM (TANMM) methods for face recognition. We have done three groups of experiments:

1. Linear methods. The performance of the original ANMM is compared with the traditional PCA [16] method, the LDA (PCA+LDA) method [18], and three margin based methods, namely the Maximum Margin Criterion (MMC) method [6], the Stepwise Nonparametric Maximum Margin Criterion (SNMMC) method [23] and the Marginal Fisher Analysis (MFA) method [21];
2. Kernel methods. The performance of the KANMM method is compared with the Kernel PCA (KPCA) and Kernel Discriminant Analysis (KDA) [17] methods;
3. Tensor methods. The performance of the Tensor ANMM (TANMM) method is compared with the Tensor PCA (TPCA) and Tensor LDA (TLDA) methods [4].

In this study, three face datasets are used:

1. The ORL face dataset.
There are ten images of each of the 40 human subjects, taken at different times, with varying lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). The images were taken with a tolerance for some tilting and rotation of the face of up to 20 degrees. The original images (with 256 gray levels) have size 92 × 112, and were resized to 32 × 32 for efficiency;
2. The Yale face dataset. It contains 11 grayscale images for each of the 15 individuals. The images demonstrate variations in lighting condition (left-light, center-light, right-light), facial expression (normal, happy, sad, sleepy, surprised, and wink), and with/without glasses. In our experiments, the images were also resized to 32 × 32;
3. The CMU PIE face dataset [22]. It contains 68 individuals with 41,368 face images as a whole. The face images were captured by 13 synchronized cameras and 21 flashes, under varying pose, illumination, and expression. In our experiments, five near-frontal poses (C05, C07, C09, C27, C29) were selected under different illuminations, lightings and expressions, which leaves 170 near-frontal face images for each individual, and all the images were also resized to 32 × 32.

The free parameters of the tested methods were determined in the following ways:
1. For the ANMM-series methods (ANMM, KANMM, TANMM), the sizes of the homogeneous and heterogeneous neighborhoods of each data point were all set to 10;
2. For the kernel methods, we adopt the Gaussian kernel throughout, and the variance of the Gaussian kernel was set by cross-validation;
3. For the tensor methods, we require that the projected images are also square, i.e. of dimension r × r for some r.
Figure 2. Face recognition accuracies on the ORL dataset with 2, 3, 4 images per individual randomly selected for training.

Figure 3. Face recognition accuracies on the Yale dataset with 2, 3, 4 images per individual randomly selected for training.

Figure 4. Face recognition accuracies on the CMU PIE dataset with 5, 10, 20 images per individual randomly selected for training.

The experimental results of the linear methods on the three datasets are shown in Fig. 2, Fig. 3 and Fig. 4 respectively. In all the figures, the abscissas represent the projected dimensions, and the ordinates are the average recognition accuracies over 50 independent runs. From the figures we can clearly see that the performance of ANMM is better than that of the other linear methods on all three datasets. Table 4 shows the experimental results of all the methods on the three datasets, where the value in each entry represents the average recognition accuracy (in percent) over 50 independent trials, and the number in brackets is the corresponding projected dimension. The table shows that the ANMM-series methods perform better than the traditional methods on all three datasets.

7. Conclusions and Discussions

In this paper we proposed a novel supervised linear feature extraction method named Average Neighborhood Margin Maximization (ANMM). For each data point, ANMM aims at pulling the neighboring points with the same class label towards it as near as possible, while simultaneously pushing the neighboring points with different labels away from it as far as possible. Moreover, as many computer vision and pattern recognition problems are intrinsically nonlinear or multilinear, we also derived the kernelized and tensorized counterparts of ANMM.
Finally, experimental results on face recognition were presented to show the effectiveness of our proposed approaches.
Table 4. Face recognition results on three datasets (%). The number in brackets is the corresponding projected dimension.

Method  | ORL 2 Train | ORL 3 Train | ORL 4 Train | Yale 2 Train | Yale 3 Train | Yale 4 Train | PIE 5 Train | PIE 10 Train | PIE 20 Train
PCA     | 54.35(56)   | 64.71(64)   | 71.54(36)   | 45.19(37)    | 51.91(35)    | 56.3(4)      | 46.64(24)   | 54.72(213)   | 67.17(241)
LDA     | 77.36(28)   | 86.96(39)   | 91.71(39)   | 46.4(9)      | 59.25(13)    | 68.9(12)     | 57.5(62)    | 76.75(62)    | 88.6(61)
MMC     | 77.73(54)   | 85.98(29)   | 91.26(52)   | 46.64(54)    | 58.8(56)     | 71.67(39)    | 57.5(21)    | 77.56(215)   | 85.54(195)
SNMMC   | 79.23(49)   | 87.68(54)   | 93.59(36)   | 49.5(49)     | 66.31(49)    | 78.57(47)    | 66.45(223)  | 88(213)      | 91.2(22)
MFA     | 77.34(41)   | 87.19(33)   | 92.19(33)   | 49.56(38)    | 64.6(38)     | 76.5(39)     | 63.6(21)    | 89(232)      | 88.69(25)
ANMM    | 82.13(37)   | 89.13(41)   | 95.84(43)   | 55(41)       | 67.87(38)    | 89(41)       | 7.5(222)    | 82.8(23)     | 93.46(25)
KPCA    | 64.23(5)    | 75.25(54)   | 79.26(6)    | 49.34(45)    | 55.78(47)    | 62(54)       | 52.35(341)  | 62(384)      | 72.25(256)
KDA     | 89(38)      | 89.13(36)   | 93.12(38)   | 52.35(14)    | 64.89(13)    | 71.95(14)    | 62.13(67)   | 81.27(66)    | 92.11(65)
KANMM   | 85.46(5)    | 92.21(39)   | 96.13(53)   | 54.62(54)    | 69.25(66)    | 87(62)       | 72.1(32)    | 82.41(28)    | 93.67(218)
TPCA    | 59.22(10^2) | 71.25(12^2) | 79.86(10^2) | 55(7^2)      | 57.23(11^2)  | 62.3(10^2)   | 51.17(10^2) | 56.65(13^2)  | 69.9(11^2)
TLDA    | 88(9^2)     | 89.28(11^2) | 93.37(8^2)  | 51.25(9^2)   | 66.19(10^2)  | 75.88(9^2)   | 61(12^2)    | 85(14^2)     | 92.75(8^2)
TANMM   | 85.87(10^2) | 92.54(9^2)  | 96.22(11^2) | 55.31(11^2)  | 73(8^2)      | 81.56(10^2)  | 73.2(12^2)  | 82.78(9^2)   | 94.32(11^2)

As we mentioned in section 2, linear feature extraction methods can also be viewed as learning a proper Mahalanobis distance in the original data space. Thus ANMM can also be used for distance metric learning. From such a viewpoint, our algorithm is more efficient in that it only needs to learn the transformation matrix, not the whole covariance matrix as in traditional metric learning algorithms [15].

References

[1] A. K. Jain, B. Chandrasekaran. Dimensionality and Sample Size Considerations in Pattern Recognition Practice. In Handbook of Statistics. North Holland, Amsterdam.
[2] B. Schölkopf, A. Smola. Learning with Kernels. The MIT Press, Cambridge, Massachusetts; London, England.
[3] B. Schölkopf, A. Smola, K.-R. Müller. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 10: 1299-1319, 1998.
[4] D. Cai, X. He, J. Han. Subspace Learning Based on Tensor Analysis. Department of Computer Science Technical Report No. 2572, University of Illinois at Urbana-Champaign (UIUCDCS-R-2005-2572), 2005.
[5] F. De la Torre, M. A. O. Vasilescu. Linear and Multilinear (Tensor) Methods for Vision, Graphics, and Signal Processing. IEEE CVPR Tutorial.
[6] H. Li, T. Jiang, K. Zhang. Efficient and Robust Feature Extraction by Maximum Margin Criterion. In NIPS.
[7] H. S. Seung, D. D. Lee. The manifold ways of perception. Science, 2000.
[8] H. Wang, Q. Wu, L. Shi, Y. Yu, N. Ahuja. Out-of-Core Tensor Approximation of Multi-Dimensional Matrices of Visual Data. In Proceedings of ACM SIGGRAPH.
[9] H. Yu, J. Yang. A Direct LDA Algorithm for High Dimensional Data with Application to Face Recognition. Pattern Recognition.
[10] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York.
[11] J. Yang, D. Zhang, A. F. Frangi, J. Yang. Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition. IEEE TPAMI, 2004.
[12] J. Ye. Generalized Low Rank Approximations of Matrices. In Proceedings of ICML.
[13] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, New York, 2nd edition.
[14] K. Liu, Y. Cheng, J. Yang. A Generalized Optimal Set of Discriminant Vectors. Pattern Recognition.
[15] K. Q. Weinberger, J. Blitzer, L. K. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. In NIPS.
[16] M. A. Turk and A. P. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1): 71-96, 1991.
[17] M.-H. Yang. Kernel Eigenfaces vs. Kernel Fisherfaces: Face Recognition Using Kernel Methods. In Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition.
[18] P. N. Belhumeur, J. Hespanha, D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. on PAMI.
[19] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, K.-R. Müller. Fisher Discriminant Analysis with Kernels. Neural Networks for Signal Processing IX, IEEE.
[20] S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 2000.
[21] S. Yan, D. Xu, B. Zhang and H. Zhang. Graph Embedding: A General Framework for Dimensionality Reduction. In Proceedings of IEEE CVPR.
[22] T. Sim, S. Baker, and M. Bsat. The CMU pose, illumination, and expression database. IEEE Trans. on PAMI.
[23] X. Qiu, L. Wu. Face Recognition by Stepwise Nonparametric Margin Maximum Criterion. In Proc. ICCV.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationHongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k)
ISSN 1749-3889 (prnt), 1749-3897 (onlne) Internatonal Journal of Nonlnear Scence Vol.17(2014) No.2,pp.188-192 Modfed Block Jacob-Davdson Method for Solvng Large Sparse Egenproblems Hongy Mao, College of
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationMulti-Task Learning in Heterogeneous Feature Spaces
Proceedngs of the Twenty-Ffth AAAI Conference on Artfcal Intellgence Mult-Task Learnng n Heterogeneous Feature Spaces Yu Zhang & Dt-Yan Yeung Department of Computer Scence and Engneerng Hong Kong Unversty
More informationInexact Newton Methods for Inverse Eigenvalue Problems
Inexact Newton Methods for Inverse Egenvalue Problems Zheng-jan Ba Abstract In ths paper, we survey some of the latest development n usng nexact Newton-lke methods for solvng nverse egenvalue problems.
More informationComposite Hypotheses testing
Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter
More informationCS 468 Lecture 16: Isometry Invariance and Spectral Techniques
CS 468 Lecture 16: Isometry Invarance and Spectral Technques Justn Solomon Scrbe: Evan Gawlk Introducton. In geometry processng, t s often desrable to characterze the shape of an object n a manner that
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationThe Order Relation and Trace Inequalities for. Hermitian Operators
Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence
More informationA New Facial Expression Recognition Method Based on * Local Gabor Filter Bank and PCA plus LDA
Hong-Bo Deng, Lan-Wen Jn, L-Xn Zhen, Jan-Cheng Huang A New Facal Expresson Recognton Method Based on Local Gabor Flter Bank and PCA plus LDA A New Facal Expresson Recognton Method Based on * Local Gabor
More informationSupport Vector Machines CS434
Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? + + + + + + + + + Intuton of Margn Consder ponts
More informationLearning with Tensor Representation
Report No. UIUCDCS-R-2006-276 UILU-ENG-2006-748 Learnng wth Tensor Representaton by Deng Ca, Xaofe He, and Jawe Han Aprl 2006 Learnng wth Tensor Representaton Deng Ca Xaofe He Jawe Han Department of Computer
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have
More informationSupport Vector Machines
CS 2750: Machne Learnng Support Vector Machnes Prof. Adrana Kovashka Unversty of Pttsburgh February 17, 2016 Announcement Homework 2 deadlne s now 2/29 We ll have covered everythng you need today or at
More informationEstimating the Fundamental Matrix by Transforming Image Points in Projective Space 1
Estmatng the Fundamental Matrx by Transformng Image Ponts n Projectve Space 1 Zhengyou Zhang and Charles Loop Mcrosoft Research, One Mcrosoft Way, Redmond, WA 98052, USA E-mal: fzhang,cloopg@mcrosoft.com
More informationNon-linear Canonical Correlation Analysis Using a RBF Network
ESANN' proceedngs - European Smposum on Artfcal Neural Networks Bruges (Belgum), 4-6 Aprl, d-sde publ., ISBN -97--, pp. 57-5 Non-lnear Canoncal Correlaton Analss Usng a RBF Network Sukhbnder Kumar, Elane
More informationManifold Learning for Complex Visual Analytics: Benefits from and to Neural Architectures
Manfold Learnng for Complex Vsual Analytcs: Benefts from and to Neural Archtectures Stephane Marchand-Mallet Vper group Unversty of Geneva Swtzerland Edgar Roman-Rangel, Ke Sun (Vper) A. Agocs, D. Dardans,
More informationPattern Recognition 42 (2009) Contents lists available at ScienceDirect. Pattern Recognition. journal homepage:
Pattern Recognton 4 (9) 764 -- 779 Contents lsts avalable at ScenceDrect Pattern Recognton ournal homepage: www.elsever.com/locate/pr Perturbaton LDA: Learnng the dfference between the class emprcal mean
More informationAdaptive Manifold Learning
Adaptve Manfold Learnng Jng Wang, Zhenyue Zhang Department of Mathematcs Zhejang Unversty, Yuquan Campus, Hangzhou, 327, P. R. Chna wroarng@sohu.com zyzhang@zju.edu.cn Hongyuan Zha Department of Computer
More informationLecture 4: Constant Time SVD Approximation
Spectral Algorthms and Representatons eb. 17, Mar. 3 and 8, 005 Lecture 4: Constant Tme SVD Approxmaton Lecturer: Santosh Vempala Scrbe: Jangzhuo Chen Ths topc conssts of three lectures 0/17, 03/03, 03/08),
More informationLecture 12: Discrete Laplacian
Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly
More informationA Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach
A Bayes Algorthm for the Multtask Pattern Recognton Problem Drect Approach Edward Puchala Wroclaw Unversty of Technology, Char of Systems and Computer etworks, Wybrzeze Wyspanskego 7, 50-370 Wroclaw, Poland
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationOnline Classification: Perceptron and Winnow
E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng
More informationChat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall, 1980
MT07: Multvarate Statstcal Methods Mke Tso: emal mke.tso@manchester.ac.uk Webpage for notes: http://www.maths.manchester.ac.uk/~mkt/new_teachng.htm. Introducton to multvarate data. Books Chat eld, C. and
More informationA kernel method for canonical correlation analysis
A kernel method for canoncal correlaton analyss Shotaro Akaho AIST Neuroscence Research Insttute, Central 2, - Umezono, Tsukuba, Ibarak 3058568, Japan s.akaho@ast.go.jp http://staff.ast.go.jp/s.akaho/
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationLecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.
prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove
More informationCS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015
CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationLecture 10: Dimensionality reduction
Lecture : Dmensonalt reducton g The curse of dmensonalt g Feature etracton s. feature selecton g Prncpal Components Analss g Lnear Dscrmnant Analss Intellgent Sensor Sstems Rcardo Guterrez-Osuna Wrght
More informationErrors for Linear Systems
Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch
More informationLearning Theory: Lecture Notes
Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be
More informationINF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018
INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationSupport Vector Machines
/14/018 Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x
More informationThe Prncpal Component Transform The Prncpal Component Transform s also called Karhunen-Loeve Transform (KLT, Hotellng Transform, oregenvector Transfor
Prncpal Component Transform Multvarate Random Sgnals A real tme sgnal x(t can be consdered as a random process and ts samples x m (m =0; ;N, 1 a random vector: The mean vector of X s X =[x0; ;x N,1] T
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationSupport Vector Machines
Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x n class
More informationWhy Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one)
Why Bayesan? 3. Bayes and Normal Models Alex M. Martnez alex@ece.osu.edu Handouts Handoutsfor forece ECE874 874Sp Sp007 If all our research (n PR was to dsappear and you could only save one theory, whch
More informationSome Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)
Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017
U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that
More informationx = , so that calculated
Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to
More informationISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013
ISSN: 2277-375 Constructon of Trend Free Run Orders for Orthogonal rrays Usng Codes bstract: Sometmes when the expermental runs are carred out n a tme order sequence, the response can depend on the run
More informationLecture 10: May 6, 2013
TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,
More informationPower law and dimension of the maximum value for belief distribution with the max Deng entropy
Power law and dmenson of the maxmum value for belef dstrbuton wth the max Deng entropy Bngy Kang a, a College of Informaton Engneerng, Northwest A&F Unversty, Yanglng, Shaanx, 712100, Chna. Abstract Deng
More informationOn a direct solver for linear least squares problems
ISSN 2066-6594 Ann. Acad. Rom. Sc. Ser. Math. Appl. Vol. 8, No. 2/2016 On a drect solver for lnear least squares problems Constantn Popa Abstract The Null Space (NS) algorthm s a drect solver for lnear
More informationAPPENDIX A Some Linear Algebra
APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,
More informationStructure from Motion. Forsyth&Ponce: Chap. 12 and 13 Szeliski: Chap. 7
Structure from Moton Forsyth&once: Chap. 2 and 3 Szelsk: Chap. 7 Introducton to Structure from Moton Forsyth&once: Chap. 2 Szelsk: Chap. 7 Structure from Moton Intro he Reconstructon roblem p 3?? p p 2
More informationFMA901F: Machine Learning Lecture 5: Support Vector Machines. Cristian Sminchisescu
FMA901F: Machne Learnng Lecture 5: Support Vector Machnes Crstan Smnchsescu Back to Bnary Classfcaton Setup We are gven a fnte, possbly nosy, set of tranng data:,, 1,..,. Each nput s pared wth a bnary
More informationCHALMERS, GÖTEBORGS UNIVERSITET. SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS. COURSE CODES: FFR 135, FIM 720 GU, PhD
CHALMERS, GÖTEBORGS UNIVERSITET SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS COURSE CODES: FFR 35, FIM 72 GU, PhD Tme: Place: Teachers: Allowed materal: Not allowed: January 2, 28, at 8 3 2 3 SB
More informationHidden Markov Models
Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,
More informationMULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN
MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN S. Chtwong, S. Wtthayapradt, S. Intajag, and F. Cheevasuvt Faculty of Engneerng, Kng Mongkut s Insttute of Technology
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationSTATS 306B: Unsupervised Learning Spring Lecture 10 April 30
STATS 306B: Unsupervsed Learnng Sprng 2014 Lecture 10 Aprl 30 Lecturer: Lester Mackey Scrbe: Joey Arthur, Rakesh Achanta 10.1 Factor Analyss 10.1.1 Recap Recall the factor analyss (FA) model for lnear
More informationThe Study of Teaching-learning-based Optimization Algorithm
Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute
More informationDiscriminative Dictionary Learning with Low-Rank Regularization for Face Recognition
Dscrmnatve Dctonary Learnng wth Low-Rank Regularzaton for Face Recognton Langyue L, Sheng L, and Yun Fu Department of Electrcal and Computer Engneerng Northeastern Unversty Boston, MA 02115, USA {l.langy,
More informationCOMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
More informationSpeeding up Computation of Scalar Multiplication in Elliptic Curve Cryptosystem
H.K. Pathak et. al. / (IJCSE) Internatonal Journal on Computer Scence and Engneerng Speedng up Computaton of Scalar Multplcaton n Ellptc Curve Cryptosystem H. K. Pathak Manju Sangh S.o.S n Computer scence
More informationLecture 3: Dual problems and Kernels
Lecture 3: Dual problems and Kernels C4B Machne Learnng Hlary 211 A. Zsserman Prmal and dual forms Lnear separablty revsted Feature mappng Kernels for SVMs Kernel trck requrements radal bass functons SVM
More informationNumerical Heat and Mass Transfer
Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and
More informationC4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )
C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z
More informationAutomatic Object Trajectory- Based Motion Recognition Using Gaussian Mixture Models
Automatc Object Trajectory- Based Moton Recognton Usng Gaussan Mxture Models Fasal I. Bashr, Ashfaq A. Khokhar, Dan Schonfeld Electrcal and Computer Engneerng, Unversty of Illnos at Chcago. Chcago, IL,
More informationImage classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them?
Image classfcaton Gven te bag-of-features representatons of mages from dfferent classes ow do we learn a model for dstngusng tem? Classfers Learn a decson rule assgnng bag-offeatures representatons of
More informationNatural Language Processing and Information Retrieval
Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support
More informationAn Iterative Modified Kernel for Support Vector Regression
An Iteratve Modfed Kernel for Support Vector Regresson Fengqng Han, Zhengxa Wang, Mng Le and Zhxang Zhou School of Scence Chongqng Jaotong Unversty Chongqng Cty, Chna Abstract In order to mprove the performance
More informationOn the Interval Zoro Symmetric Single-step Procedure for Simultaneous Finding of Polynomial Zeros
Appled Mathematcal Scences, Vol. 5, 2011, no. 75, 3693-3706 On the Interval Zoro Symmetrc Sngle-step Procedure for Smultaneous Fndng of Polynomal Zeros S. F. M. Rusl, M. Mons, M. A. Hassan and W. J. Leong
More information2.3 Nilpotent endomorphisms
s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms
More informationMatrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD
Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationTime-Varying Systems and Computations Lecture 6
Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy
More information