
Neurocomputing 154 (2015)

Feature extraction using adaptive slow feature discriminant analysis

Xingjian Gu a, Chuancai Liu a,*, Sheng Wang a, Cairong Zhao b

a School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
b Department of Computer Science and Technology, Tongji University, Shanghai, China
* Corresponding author: Chuancai Liu (chcailiu@163.com).

Article history: Received 9 June 2014; Received in revised form 6 November 2014; Accepted 2 December 2014; Communicated by Y. Yuan; Available online 16 December 2014.

Keywords: Feature extraction; Slow feature discriminant analysis; Time series; Adaptive parameter

Abstract: Slow feature discriminant analysis (SFDA) is an attractive, biologically inspired learning method for extracting discriminant features for classification. However, SFDA relies heavily on the constructed time series, and because the type of data distribution is unknown, it cannot make full use of its discriminant power for classification. To address these problems, we propose a new feature extraction method called adaptive slow feature discriminant analysis (ASFDA). First, we design a new adaptive criterion to generate within-class time series. The constructed time series have two properties: (1) the two points of a time series lie on the same sub-manifold, and (2) that sub-manifold is smooth. Second, ASFDA seeks projections that simultaneously minimize within-class temporal variation and maximize between-class temporal variation based on the maximum margin criterion, and it provides an adaptive parameter to balance the two temporal variations so as to obtain an optimal discriminant subspace. Experimental results on three benchmark face databases demonstrate that the proposed ASFDA is superior to several state-of-the-art methods. (c) 2014 Elsevier B.V. All rights reserved.

1. Introduction

Feature extraction is a fundamental and challenging problem in many research fields such as pattern recognition and machine learning. Principal component analysis (PCA) [1] and linear discriminant analysis (LDA) [2] are the two most well-known methods for linear feature extraction. To the best of our knowledge, linear feature extraction methods are unable to discover essential data structures that are nonlinear. Recent studies [3-6] have shown that large volumes of high-dimensional data possibly reside on a nonlinear manifold. To discover the nonlinear manifold structure of the data, many manifold learning methods have been put forward. Representative manifold learning methods include Isomap [4], LLE [3], LE [5] and LTSA [6]. Isomap preserves the pairwise geodesic distances of observations in the embedding space. LLE focuses on the local neighborhood of each data point and preserves the minimum-error linear reconstruction from neighbors in the embedding space. LE is built on the Laplace-Beltrami operator on the manifold. LTSA first encodes the local geometry of each local tangent space and then aligns all the local tangent spaces to obtain a global embedding. However, these manifold learning methods obtain a low-dimensional embedding without an explicit mapping, so they cannot extract features for samples beyond the training set. To overcome this problem, NPE [7] tries to find a linear subspace that preserves local structure based on the same principle as LLE, and LPP [8] seeks a linear subspace to approximate the nonlinear Laplacian Eigenmap.
LLTSA [9] seeks linear projections that approximate the affine transformation of LTSA. In order to extract discriminant features for classification, several nonlinear manifold learning methods have emerged [10-15]. Yu et al. [10] present discriminant locality preserving projections (DLPP) to improve the classification performance of LPP. To overcome the small sample size (SSS) problem in LPP, Lu et al. [11] propose discriminant locality preserving projections based on the maximum margin criterion (MMC) [16] rather than a ratio criterion. Yan et al. [12] propose marginal Fisher analysis (MFA) and Chen et al. [13] propose local discriminant embedding (LDE). MFA and LDE are very similar in formulation: both combine locality and class label information to represent within-class compactness and between-class separability. For these methods, it is difficult to determine the number of nearest neighbors of each sample and the number of shortest pairs from different classes. To address the problem of local neighborhood size, Zhang et al. [17] propose a method that selects the local size adaptively. In order to utilize nonlocal information, Zhao et al. [18] propose graph embedding discriminant analysis (GEDA), which not only compacts the within-class samples and maximizes the between-class margin, but also maximizes the nonlocal scatter at the same time. Recently, other interesting feature extraction models inspired by biological mechanisms have appeared, such as the sparse learning model [19], saliency-based visual attention [20] and temporal slowness learning [21].

The temporal slowness principle has been successfully applied to model the visual receptive fields of cortical neurons [22]. Based on the slowness principle, Wiskott and Sejnowski [21] propose a nonlinear unsupervised algorithm called slow feature analysis (SFA) to learn invariant and slowly varying features from quickly varying input signals. SFA has successfully extracted a rich set of complex-cell features when trained on quasi-natural image sequences [23], and it has found many applications in computational neuroscience [24,25] and time series analysis [26,27]. Several researchers have introduced the slowness principle into pattern recognition applications [28-32]. Zhang and Tao [28] successfully introduced the SFA framework to the problem of human action recognition. To the best of our knowledge, SFA performs well on data sets with a temporal structure. In real applications, however, there are numerous discrete data sets with no obvious temporal structure; in the discrete scenario, it is necessary to construct time series before applying SFA. The authors of [29] propose supervised slow feature analysis based on a consensus matrix (SSFACM) to construct time series for face recognition. In [30], the authors propose another supervised variant that seeks the shortest path through the samples of each class (SSFASP) to construct time series for dimensionality reduction. Huang et al. [31,32] use a KNN criterion to construct time series and introduce supervised slow feature analysis (SSFA) for nonlinear dimensionality reduction. In order to obtain discriminant slow features, they also propose slow feature discriminant analysis (SFDA) [31], which minimizes within-class temporal variation and maximizes between-class temporal variation simultaneously. From the viewpoint of manifold learning, SFDA aims to find a mapping that minimizes the distance between within-class points in the low-dimensional space while keeping between-class points as far apart as possible.

However, SFDA has two key issues. The first is the notion of which pairs of points can be considered within-class or between-class time series; this is crucial for characterizing sub-manifold and multi-manifold information, respectively. In the literature [3,4,31,32], there are two common strategies for selecting within-class time series: the k-nearest-neighborhood (k-nn) and the ε-neighborhood (ε-n). Both have distinct disadvantages. It is difficult to choose a suitable parameter k (or ε), because the distribution of each class does not always share the same scatter. According to the literature [33], if k (or ε) is set relatively large, the k-nn (or ε-n) criterion tends to include noisy time series; if it is set relatively small, some local information can be lost. Thus, how accurately the time series can be approximated is pivotal in the SFA framework. The second issue is how to balance within-class and between-class temporal variation to obtain an optimal discriminant subspace. According to the literature [34], when there is a conflict between within-class temporal variation and between-class temporal variation, it is difficult to know whether the projection given by the within-class term or that given by the between-class term is better for classification.
To deal with this problem, subclass discriminant analysis [35,36] was proposed, which divides each class into several subclasses; however, it is difficult to determine the number of subclasses. In order to address these issues, we propose in this paper a novel dimensionality reduction method called adaptive slow feature discriminant analysis (ASFDA) to improve classification performance. First, a new adaptive criterion is designed to generate time series before ASFDA is applied. It is well known that a pair of sample points that can be considered a time series lies on a smooth sub-manifold. Inspired by [17], we develop a new adaptive criterion to generate within-class time series. As Fig. 1 illustrates, the constructed time series satisfy two requirements: (1) the points of a time series are nearby in terms of Euclidean distance, and (2) the points of a time series lie in the same principal direction of the local neighborhood. We construct between-class time series using between-class neighboring information to characterize the margin information. Second, to enhance classification, ASFDA seeks projections that minimize the difference, rather than the ratio, between within-class and between-class temporal variation, based on the idea of the maximum margin criterion (MMC) [37]. Furthermore, ASFDA provides an adaptive parameter to balance the within-class and between-class temporal variation so as to maximize the discriminant power. Extensive experiments on three benchmark face databases show the effectiveness of the proposed ASFDA.

Fig. 1. Geometric analysis of two different time series construction criteria. Left: the KNN criterion; points (A, D) can be considered a short time series in terms of Euclidean distance, but they do not lie in the same principal direction, which may deform the manifold structure. Right: the proposed criterion; points (A, B) and (A, C) each form a time series because they are both close in Euclidean distance and lie in the same principal direction.

The rest of the paper is organized as follows. In Section 2, we briefly review MMC and SFDA. In Section 3, we introduce the motivations of adaptive slow feature discriminant analysis (ASFDA) and describe it in detail. In Section 4, experiments with face image databases are carried out to demonstrate the effectiveness of the proposed method. Finally, conclusions are drawn in Section 5.

2. A brief review of MMC and SFDA

Given a sample set $X = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{D \times N}$, each sample belongs to one of $c$ classes $\{X_1, X_2, \ldots, X_c\}$. Let $c$ denote the total number of classes and $N_i$ the number of training samples in the $i$th class. Let $x_i^j$ denote the $j$th sample in the $i$th class, $\bar{x}$ the mean of all training samples, and $\bar{x}_i$ the mean of the $i$th class.

2.1. MMC

MMC is a classical supervised learning algorithm for feature extraction and classification. The between-class and within-class scatter matrices are evaluated as follows:

$S_b = \sum_{i=1}^{c} N_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T$   (1)

$S_w = \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_i^j - \bar{x}_i)(x_i^j - \bar{x}_i)^T$   (2)

The MMC-based discriminant rule is defined as follows, based on the difference between the between-class and within-class scatter matrices:

$W^* = \arg\max_W \operatorname{tr}(W^T (S_b - S_w) W)$   (3)

The optimization of Eq. (3) can be solved by the eigenvalue problem $(S_b - S_w) w = \lambda w$, and the optimal projections are the eigenvectors $w_1, w_2, \ldots, w_d$ corresponding to the $d$ largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$.
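For illustration, the short sketch below (our own, not code from the paper) computes the MMC scatter matrices of Eqs. (1)-(2) with NumPy and takes the leading eigenvectors of $S_b - S_w$ as in Eq. (3). Function and variable names are our own choices.

```python
import numpy as np

def mmc(X, y, d):
    """X: (D, N) data matrix, y: (N,) integer labels, d: subspace dimension."""
    mean_all = X.mean(axis=1, keepdims=True)
    S_b = np.zeros((X.shape[0], X.shape[0]))
    S_w = np.zeros_like(S_b)
    for c in np.unique(y):
        Xc = X[:, y == c]
        mean_c = Xc.mean(axis=1, keepdims=True)
        # between-class scatter: class size times outer product of mean offsets
        S_b += Xc.shape[1] * (mean_c - mean_all) @ (mean_c - mean_all).T
        # within-class scatter: sum of outer products of centered samples
        diff = Xc - mean_c
        S_w += diff @ diff.T
    # S_b - S_w is symmetric, so eigh applies; the eigenvectors of the d largest
    # eigenvalues maximize tr(W^T (S_b - S_w) W) under orthonormality.
    vals, vecs = np.linalg.eigh(S_b - S_w)
    return vecs[:, np.argsort(vals)[::-1][:d]]   # D x d projection matrix W
```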

2.2. SFDA

In this section, we review slow feature discriminant analysis (SFDA) [31] as used for discrete data that do not have an obvious temporal structure. It first constructs within-class time series $t_w$ and between-class time series $t_b$ using neighboring information:

$t_w = \{(x_i^p, x_i^q)\}, \quad i = 1, 2, \ldots, c; \; p \neq q; \; p, q = 1, 2, \ldots, N_i$   (4)

where $x_i^p$ and $x_i^q$ belong to the $i$th class, and

$t_b = \{(x_i^p, x_j^q)\}, \quad i \neq j; \; i, j = 1, 2, \ldots, c; \; p = 1, 2, \ldots, N_i; \; q = 1, 2, \ldots, N_j$   (5)

where $x_i^p$ and $x_j^q$ belong to different classes. Based on the sets of time series $t_w$ and $t_b$, the temporal variations $\Delta t_w$ and $\Delta t_b$ are approximated by the time differences $\Delta t_w = \{x_i^p - x_i^q\}$ for $(x_i^p, x_i^q) \in t_w$ and $\Delta t_b = \{x_i^p - x_j^q\}$ for $(x_i^p, x_j^q) \in t_b$. The model of SFDA is

$\arg\min_w \frac{w^T T_w w}{w^T T_b w}$   (6)

where $T_w = \Delta t_w \Delta t_w^T$ and $T_b = \Delta t_b \Delta t_b^T$. The optimization of Eq. (6) can be solved by the generalized eigenvalue problem $T_w w = \lambda T_b w$, and the optimal projections are the eigenvectors $w_1, w_2, \ldots, w_d$ corresponding to the $d$ smallest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$.
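A minimal sketch of this baseline (ours, written under the description above) is given below: it builds $T_w$ and $T_b$ from user-supplied index pairs and solves the generalized eigenproblem of Eq. (6). The small ridge added to $T_b$ is our own choice to keep it numerically invertible; it is not part of the paper.

```python
import numpy as np
from scipy.linalg import eigh

def sfda_ratio(X, within_pairs, between_pairs, d, ridge=1e-8):
    """X: (D, N); *_pairs: lists of (p, q) column indices; d: subspace dim."""
    dw = np.stack([X[:, p] - X[:, q] for p, q in within_pairs], axis=1)
    db = np.stack([X[:, p] - X[:, q] for p, q in between_pairs], axis=1)
    T_w = dw @ dw.T                       # within-class temporal variation
    T_b = db @ db.T                       # between-class temporal variation
    T_b += ridge * np.eye(T_b.shape[0])   # assumption: small ridge keeps T_b definite
    vals, vecs = eigh(T_w, T_b)           # generalized eigenvalues, ascending
    return vecs[:, :d]                    # projections with smallest slowness ratio
```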
3. Adaptive slow feature discriminant analysis

3.1. Motivations of ASFDA

The goal of ASFDA is to extract discriminant slow features for classification by minimizing the temporal variation of within-class time series and maximizing the temporal variation of between-class time series simultaneously. Fig. 2 provides an intuitive illustration of the idea of ASFDA. For discrete data sets that do not have an obvious temporal structure, ASFDA relies heavily on how accurately the time series can be approximated. The selected within-class time series should reflect the local geometric structure of the sub-manifold, and, for the purpose of classification, the selected between-class time series should reflect the local margin information between the manifolds. On the other hand, according to the literature [34], there usually exists a conflict between the projections that minimize within-class temporal variation and the projections that maximize between-class temporal variation. To address this problem, we introduce an adaptive parameter to balance the two temporal variations so as to maximize the discriminant capability. In total, ASFDA consists of two steps: (1) characterizing the within-class and between-class temporal variation, and (2) integrating the two temporal variations using the maximum margin criterion to learn an optimal discriminant subspace for classification.

Fig. 2. An intuitive illustration of the idea of ASFDA. Points with different shapes belong to different classes. (a) The sample points in the original space and the margins between different classes. (b) The sample points in the projected space, in which each class varies slowly and a larger margin is obtained.

3.2. Within-class time series selection

What can be considered a within-class time series in slow feature discriminant analysis is equivalent to how to select a neighbor set that reflects the local manifold structure. Inspired by the literature [17], we develop a new criterion that satisfies the following requirement: the selected time series for each sample point $x_i$ should reflect the nearby relationship to the point $x_i$. The selected time series have two properties: (1) the two points of a time series lie on the same sub-manifold, and (2) the sub-manifold of the time series is smooth.

3.2.1. Notion of time series

Given a data set $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{D \times N}$ sampled from an $r$-dimensional smooth sub-manifold $x = f(\tau)$, where $x \in \mathbb{R}^D$, $\tau \in \mathbb{R}^r$, $f: \Omega \subset \mathbb{R}^r \to \mathbb{R}^D$ and $\Omega$ is an open connected subset. Assuming that two points $x_i$ and $x_j$ can be considered a short time series, $x_j$ can be accurately represented by the Taylor expansion $x_j = x_i + J_{\tau_i}(\tau_j - \tau_i) + \varepsilon(\tau_j - \tau_i)$, where $J_{\tau_i} \in \mathbb{R}^{D \times r}$ is the Jacobian matrix of $f$ at $\tau_i$ and $\varepsilon(\tau_j - \tau_i)$ is the second-order term in $\tau_j - \tau_i$. Since the sub-manifold is smooth, it is guaranteed that $\|\varepsilon(\tau_j - \tau_i)\| < \eta \|\tau_j - \tau_i\|$ for a small constant $\eta \in (0, 1)$. Based on this, a short time series $(x_i, x_j)$ should satisfy the following criterion:

$\|x_j - x_i - J_{\tau_i}(\tau_j - \tau_i)\| < \eta \|\tau_j - \tau_i\|$   (7)

According to the literature [6], $J_{\tau_i}(\tau_j - \tau_i)$ can be estimated by $Q_i \theta_i^j$ and $\tau_j - \tau_i$ can be estimated by $\theta_i^j$, where $\theta_i^j = Q_i^T (x_j - x_i)$ is the local coordinate and $Q_i$ is a set of local orthogonal bases obtained by the singular value decomposition [38]. Thus, Eq. (7) can be written as

$\|x_j - x_i - Q_i \theta_i^j\| < \eta \|\theta_i^j\|$   (8)

The matrix form of Eq. (8) is

$\|X_i - (x_i e^T + Q_i \Theta_i)\|_F < \eta \|\Theta_i\|_F$   (9)

where $e$ is a column vector of all ones, $X_i = \{x_i^j\}_{j=1}^{k_i}$ collects the points that can form a short time series with $x_i$, and $\Theta_i = \{\theta_i^j\}_{j=1}^{k_i}$ is the local coordinate matrix corresponding to $X_i$. According to the properties of the singular value decomposition [38], these quantities can be expressed in terms of the singular values $\sigma_1 \ge \cdots \ge \sigma_r \ge \cdots \ge \sigma_{n_i} > 0$ of $X_i - x_i e^T$: $\|X_i - x_i e^T\|_F^2 = \sum_{l=1}^{n_i} \sigma_l^2$, $\|\Theta_i\|_F^2 = \|Q_i \Theta_i\|_F^2 = \sum_{l=1}^{r} \sigma_l^2$ and $\|X_i - (x_i e^T + Q_i \Theta_i)\|_F^2 = \sum_{l=r+1}^{n_i} \sigma_l^2$. Thus, Eq. (9) can be rewritten as

$\sum_{l=1}^{r} \sigma_l^2 + \sum_{l=r+1}^{n_i} \sigma_l^2 < (1 + \eta) \sum_{l=1}^{r} \sigma_l^2$   (10)

From the above analysis, the criterion for determining the time series set can be summarized as

$\frac{\sum_{l=1}^{r} \sigma_l^2}{\sum_{l=1}^{n_i} \sigma_l^2} > \beta$   (11)

where $r$ is the dimension of the local manifold, $n_i$ is the number of nonzero singular values of $X_i - x_i e^T$ and $\beta = 1/(1 + \eta)$, $\beta \in [0, 1]$. From Eq. (7) we can see that the smaller $\eta$ is, the closer $x_i$ and $x_j$ are; equivalently, the larger $\beta$ is, the closer the points of $X_i$ are to $x_i$. From Eq. (11) we can see that the parameter $\beta$ also measures the PCA energy captured by the first $r$ directions. In summary, if the value of $\beta$ is relatively large, the points in $X_i$ and the point $x_i$ lie along the same principal directions.
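The criterion of Eq. (11) is easy to evaluate numerically. The sketch below (our illustration under the definitions above; names are ours) computes the ratio of the PCA energy captured by the first $r$ principal directions of the centered neighbor set to its total energy and compares it with $\beta$.

```python
import numpy as np

def principal_energy_ratio(Xi, xi, r):
    """Xi: (D, k) neighbor set of xi (D,); returns the ratio in Eq. (11)."""
    sigma = np.linalg.svd(Xi - xi[:, None], compute_uv=False)
    return np.sum(sigma[:r] ** 2) / np.sum(sigma ** 2)

def satisfies_criterion(Xi, xi, r, beta):
    # (xi, x_j) pairs qualify as within-class time series when the neighbors
    # concentrate their energy along the first r principal directions.
    return principal_energy_ratio(Xi, xi, r) > beta
```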

3.2.2. Adaptive criterion for time series selection

Assume that we have obtained a relatively large neighbor set $X_i = [x_i^1, x_i^2, \ldots, x_i^k]$ of the point $x_i$, which can be found by the k-nearest-neighborhood method. Given preset parameters $r$ and $\beta$, if Eq. (11) is not satisfied, we remove a point $x_i^j$ from the set $X_i$. Each removal step should guarantee that the remaining set has the maximum value of the function

$\delta(X_i / x_i^j) = \frac{\sum_{l=1}^{r} \sigma_l^2}{\sum_{l=1}^{n_i} \sigma_l^2}$   (12)

where $X_i / x_i^j$ means that the point $x_i^j$ is removed from the set $X_i$, and $\sigma_l$, $l = 1, 2, \ldots, n_i$, are the nonzero singular values of $X_i / x_i^j - x_i e^T$. The removal step is repeated until Eq. (11) holds. The adaptive time series selection process is summarized in Algorithm 1; a runnable sketch is given at the end of Section 3.4. From Eq. (12) we can see that the remaining points mainly lie along a few principal directions, which favors describing the structure of the sub-manifold. In practice, we usually set the dimension of the local neighborhood to be small and the parameter $\beta$ to be relatively large. This ensures that the selected points lie along the principal directions; points that do not lie along the principal directions are removed.

Algorithm 1. Within-class time series selection.
Input: a data point $x_i$, its k-nearest-neighbor set $X_i = [x_i^1, x_i^2, \ldots, x_i^{k_i}]$ belonging to the same class as $x_i$, and parameters $r$, $\beta$.
Output: $T_i = \{(x_i, x_i^j)\}$, $j = 1, \ldots, k_i$, where $x_i^j \in X_i$ and $k_i$ is the number of points in $X_i$.
1. Calculate the singular values $\sigma_l > 0$ ($l = 1, \ldots, n_i$) of $X_i - x_i e^T$.
2. Calculate $\delta = (\sum_{l=1}^{r} \sigma_l^2) / (\sum_{l=1}^{n_i} \sigma_l^2)$.
3. While $\delta < \beta$ and $k_i > k_{\min}$:
   select the point $x_i^{\tilde{j}}$ with $\tilde{j} = \arg\max_j \delta(X_i / x_i^j)$;
   update $X_i = X_i / x_i^{\tilde{j}}$ and $k_i = k_i - 1$;
   update $\delta = (\sum_{l=1}^{r} \sigma_l^2) / (\sum_{l=1}^{n_i} \sigma_l^2)$.

3.3. Characterization of the within-class slowness scatter

ASFDA aims to extract discriminant slow features by using label information. It is difficult to obtain a set of suitable time series based on the KNN criterion, since the distribution of each class in real-world applications is unknown. A well-constructed set of time series favors describing the manifold structure and thus performs better in recognition. Based on Algorithm 1, an adaptive time series set $t_w = [t_w^1, t_w^2, \ldots, t_w^N]$ can be obtained, where $t_w^i = \{(x_i, x_i^j)\}$, $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, k_i$, and $x_i$ and $x_i^j$ belong to the same class. Based on the set of time series $t_w$, the within-class temporal variation $\Delta t_w = [\Delta t_w^1, \Delta t_w^2, \ldots, \Delta t_w^N]$ can be approximated as follows:

$\Delta t_w^i = \{x_i - x_i^j\}, \quad (x_i, x_i^j) \in t_w^i$   (13)

where $k_i$ is the number of time series in $t_w^i$. The within-class slowness scatter is then defined as

$J_w = \sum_{i=1}^{N} \operatorname{tr}(W^T \Delta t_w^i (\Delta t_w^i)^T W)$   (14)

$J_w = \operatorname{tr}(W^T T_w W)$   (15)

where $N$ is the number of training samples and $T_w = \sum_{i=1}^{N} \Delta t_w^i (\Delta t_w^i)^T$.

3.4. Characterization of the between-class margin scatter

To achieve good classification performance, the margin of between-class separability should be maximized in the low-dimensional space. Due to the nonlinear structure of the manifold, many pairs of close samples have different labels. For each point, we only consider its k nearest points with different labels to calculate the between-class time series. Given a data point $x_i$, we find its k nearest points $\tilde{X}_i = [\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^k]$ that do not belong to the same class as $x_i$, and calculate the between-class temporal variation $\Delta t_b = [\Delta t_b^1, \Delta t_b^2, \ldots, \Delta t_b^N]$ by approximating the time difference, where $\Delta t_b^i = \{x_i - \tilde{x}_i^1, \ldots, x_i - \tilde{x}_i^k\}$, $i = 1, 2, \ldots, N$. The between-class margin scatter is then defined as

$J_b = \sum_{i=1}^{N} \operatorname{tr}(W^T \Delta t_b^i (\Delta t_b^i)^T W)$   (16)

$J_b = \operatorname{tr}(W^T T_b W)$   (17)

where $N$ is the number of samples and $T_b = \sum_{i=1}^{N} \Delta t_b^i (\Delta t_b^i)^T$. It is easy to see that maximizing $J_b$ maximizes the local margin between different classes in the low-dimensional space.
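The following sketch is our reading of Algorithm 1 (it is not the authors' code; the default values r = 1, beta = 0.65 follow the experimental settings in Section 4, while k_min and all names are our assumptions): starting from the k nearest same-class neighbors of $x_i$, it greedily removes the point whose removal maximizes the energy ratio $\delta$ of Eq. (12) until Eq. (11) holds or only $k_{\min}$ points remain.

```python
import numpy as np

def energy_ratio(Xi, xi, r):
    # ratio of Eq. (11)/(12): energy of the first r singular values over total energy
    sigma = np.linalg.svd(Xi - xi[:, None], compute_uv=False)
    return np.sum(sigma[:r] ** 2) / np.sum(sigma ** 2)

def select_within_class_series(xi, Xi, r=1, beta=0.65, k_min=2):
    """xi: (D,); Xi: (D, k) same-class neighbors; returns the surviving neighbors."""
    keep = list(range(Xi.shape[1]))
    delta = energy_ratio(Xi[:, keep], xi, r)
    while delta < beta and len(keep) > k_min:
        # try removing each remaining point; keep the removal that leaves the
        # neighbor set most concentrated along the first r principal directions
        ratios = [energy_ratio(Xi[:, [c for c in keep if c != j]], xi, r)
                  for j in keep]
        best = int(np.argmax(ratios))
        delta = ratios[best]
        keep.pop(best)
    return Xi[:, keep]   # each (xi, column) pair is kept as a within-class time series
```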
3.5. Objective function

With the above preparation, the proposed algorithm is expected to find the optimal projection that minimizes the within-class temporal variation and simultaneously maximizes the between-class temporal variation. We thus have the following optimization problem:

$\min_{w^T w = 1} w^T T_w w, \qquad \max_{w^T w = 1} w^T T_b w$   (18)

This optimization problem can be reformulated as

$\min_{w^T w = 1} w^T (T_w - \alpha T_b) w$   (19)

where $\alpha \ge 0$ is a suitable parameter that balances the within-class and between-class temporal variation. Eq. (19) can easily be reduced to an eigenvalue problem, and the optimal projections can be selected as the eigenvectors $w_1, w_2, \ldots, w_d$ corresponding to the $d$ smallest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$.
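For a fixed $\alpha$, Eq. (19) with $d$ orthonormal projections is solved by an ordinary symmetric eigendecomposition; a minimal sketch (ours, with assumed variable names) is given below.

```python
import numpy as np

def asfda_projection(T_w, T_b, alpha, d):
    """Eigenvectors of (T_w - alpha*T_b) for its d smallest eigenvalues."""
    M = T_w - alpha * T_b              # symmetric difference matrix of Eq. (19)
    vals, vecs = np.linalg.eigh(M)     # eigenvalues returned in ascending order
    return vecs[:, :d]                 # D x d projection matrix W
```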

3.6. The effective discriminant subspace

It is obvious that the effective projections that can be used in feature extraction depend on both matrices $T_w$ and $T_b$. When the parameter $\alpha$ approaches zero, ASFDA degenerates into a feature extraction method that only includes the within-class information $T_w$. On the other hand, when the parameter approaches a large value such as $+\infty$, ASFDA degenerates into a feature extraction method that only considers the between-class information $T_b$. In real-world applications, it is difficult to characterize the true data distribution. According to the literature [34], when there is a conflict between the within-class and between-class temporal variation, it is difficult to know whether the projection given by the within-class term or that given by the between-class term is better for classification. As far as we know, a larger discriminant power of the low-dimensional representation results in better classification performance. In this section, we present a robust method to obtain an adaptive parameter that maximizes the discriminant power. To formally illustrate the effectiveness of the method, we first define the discriminant power for a given $d$ as

$\frac{\operatorname{tr}(W^T T_b W)}{\operatorname{tr}(W^T T_w W)}$   (20)

where $W \in \mathbb{R}^{D \times d}$ and $W^T W = I_d$. In order to acquire good classification performance, the goal is to maximize the discriminant power of the $d$-dimensional representation, and Eq. (19) can be reformulated into the following problem:

$\max_{\alpha} \frac{\operatorname{tr}(W_0^T T_b W_0)}{\operatorname{tr}(W_0^T T_w W_0)} \quad \text{s.t.} \quad W_0 = \arg\min_{W^T W = I_d} \operatorname{tr}(W^T (T_w - \alpha T_b) W)$   (21)

where $D$ is the dimension of the original data and $d$ is the required low dimension. The optimization process contains two main steps.

Removing the null space of the between-class temporal variation. It is well known that the matrices $T_w$ and $T_b$ are both positive semi-definite and that the null space of $T_b$ has no discriminant ability. We assume that removing the null space of $T_b$ does not sacrifice classification accuracy. The eigendecomposition of $T_b$ is

$T_b = U \Lambda_{T_b} U^T$   (22)

where $U = [u_1, u_2, \ldots, u_m]$, $\Lambda_{T_b} = \operatorname{diag}(\lambda_{T_b}^1, \lambda_{T_b}^2, \ldots, \lambda_{T_b}^m)$ with $\lambda_{T_b}^1 \ge \lambda_{T_b}^2 \ge \cdots \ge \lambda_{T_b}^m > 0$, and $m$ is the number of positive eigenvalues of $T_b$. The solution of Eq. (21) is then a linear combination of the columns of $U$, that is, $W = UV$. Writing the two temporal variations as $T_b^U = U^T T_b U$ and $T_w^U = U^T T_w U$, Eq. (21) reduces to

$\min_{V^T V = I_d} \operatorname{tr}(V^T (T_w^U - \alpha T_b^U) V)$   (23)

and the discriminant power is expressed as

$\frac{\operatorname{tr}(V^T T_b^U V)}{\operatorname{tr}(V^T T_w^U V)}$   (24)

where $T_b^U$ is positive definite and $T_w^U$ is positive semi-definite.

Iterative optimization. At each iterative step we start with $V_n \in \mathbb{R}^{m \times d}$ and compute the trade-off parameter as

$\alpha_n = \frac{\operatorname{tr}(V_n^T T_w^U V_n)}{\operatorname{tr}(V_n^T T_b^U V_n)}$   (25)

and then $V_{n+1} \in \mathbb{R}^{m \times d}$ is calculated as

$V_{n+1} = \arg\min_{V_{n+1}^T V_{n+1} = I_d} \operatorname{tr}(V_{n+1}^T (T_w^U - \alpha_n T_b^U) V_{n+1})$   (26)

Since $T_b^U$ is positive definite, the term $\operatorname{tr}(V_n^T T_b^U V_n)$ always has a positive value. The detailed iterative procedure is listed in Algorithm 2; a runnable sketch is given after the ASFDA procedure below.

Algorithm 2. Iterative procedure to obtain the optimal discriminant subspace.
Input: the within-class temporal variation $T_w$ and the between-class temporal variation $T_b$.
Output: $W$.
1. Remove the null space of $T_b$ as in Eq. (22) and write $T_w^U = U^T T_w U$ and $T_b^U = U^T T_b U$.
2. For each $n \in [1, N_{\max}]$:
   compute the balance parameter $\alpha_n$ from the projection matrix $V_{n-1}$ as $\alpha_n = \operatorname{tr}(V_{n-1}^T T_w^U V_{n-1}) / \operatorname{tr}(V_{n-1}^T T_b^U V_{n-1})$;
   calculate the new projection matrix $V_n = \arg\min_{V_n^T V_n = I_d} \operatorname{tr}(V_n^T (T_w^U - \alpha_n T_b^U) V_n)$;
   if $\|V_n - V_{n-1}\| < \varepsilon$ ($\varepsilon$ a small positive value), set $V = V_n$ and break.
3. $W = UV$.

Theorem 1. The iterative procedure in Algorithm 2 converges, since the parameter $\alpha_n$ is monotonically decreasing and bounded. Denote the function $F(V) = \operatorname{tr}(V^T T_w^U V) / \operatorname{tr}(V^T T_b^U V)$; then $F(V_n) \le F(V_{n-1})$ and $F(V) \ge 0$.

Proof. Set $\alpha_n = \operatorname{tr}(V_{n-1}^T T_w^U V_{n-1}) / \operatorname{tr}(V_{n-1}^T T_b^U V_{n-1})$; then $\operatorname{tr}(V_{n-1}^T (T_w^U - \alpha_n T_b^U) V_{n-1}) = 0$. Since $V_n = \arg\min_{V^T V = I_d} \operatorname{tr}(V^T (T_w^U - \alpha_n T_b^U) V)$, we have $\operatorname{tr}(V_n^T (T_w^U - \alpha_n T_b^U) V_n) \le \operatorname{tr}(V_{n-1}^T (T_w^U - \alpha_n T_b^U) V_{n-1}) = 0$, and therefore $\operatorname{tr}(V_n^T T_w^U V_n) / \operatorname{tr}(V_n^T T_b^U V_n) \le \alpha_n$, that is, $F(V_n) \le F(V_{n-1})$. Moreover, since $T_b^U$ is positive definite and $T_w^U$ is positive semi-definite, $F(V) = \operatorname{tr}(V^T T_w^U V) / \operatorname{tr}(V^T T_b^U V) \ge 0$. Therefore, during the iteration the parameter $\alpha_n$ is monotonically decreasing and bounded.

Corollary 1. When Algorithm 2 converges, the maximum discriminant power is obtained simultaneously.

Proof. The discriminant power is the reciprocal of the parameter $\alpha$, so when $\alpha$ reaches its minimum, the discriminant power reaches its maximum.

The algorithmic procedure of ASFDA is now formally summarized as follows.

Step 1: Construct the within-class temporal variation $T_w$ using Algorithm 1.
Step 2: Construct the between-class temporal variation $T_b$ using neighboring information.
Step 3: Use Algorithm 2 to solve the objective $\max_{W, \alpha} \operatorname{tr}(W_0^T T_b W_0) / \operatorname{tr}(W_0^T T_w W_0)$ subject to $W_0 = \arg\min_{W^T W = I_d} \operatorname{tr}(W^T (T_w - \alpha T_b) W)$.
Step 4: After obtaining the optimal transformation matrix $W$, represent a new sample $x$ by its low-dimensional feature $y = W^T x$.
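The sketch below is our reading of Algorithm 2: it removes the null space of $T_b$ as in Eq. (22) and then alternates between updating the balance parameter $\alpha_n$ (Eq. (25)) and the projection $V_n$ (Eq. (26)) until the projection stabilizes. The initialization (equivalent to $\alpha = 1$), the tolerance, the iteration cap and all names are our own assumptions.

```python
import numpy as np

def asfda_iterative(T_w, T_b, d, n_max=50, eps=1e-6):
    # Step 1: keep only directions with positive T_b eigenvalues (null-space removal).
    vals, U = np.linalg.eigh(T_b)
    U = U[:, vals > 1e-10]
    Tw_u, Tb_u = U.T @ T_w @ U, U.T @ T_b @ U
    # Our initialization: solve Eq. (26) once with alpha = 1.
    _, V = np.linalg.eigh(Tw_u - Tb_u)
    V = V[:, :d]
    # Step 2: alternate alpha_n and V_n updates (Eqs. (25)-(26)).
    for _ in range(n_max):
        alpha = np.trace(V.T @ Tw_u @ V) / np.trace(V.T @ Tb_u @ V)
        _, vecs = np.linalg.eigh(Tw_u - alpha * Tb_u)
        V_new = vecs[:, :d]
        # crude convergence check; it ignores possible eigenvector sign flips,
        # which is acceptable for a sketch (the loop is capped by n_max anyway)
        if np.linalg.norm(V_new - V) < eps:
            V = V_new
            break
        V = V_new
    return U @ V    # final projection W = U V
```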

4. Experimental results and analysis

In this section, we evaluate the performance of our method ASFDA in comparison with other classical dimensionality reduction methods, including LDA [2], DLPP [10], MFA [12], MMC [16], SSFACM [29], SSFASP [30], SFDA [31] and SSFA [32], on several publicly available databases. In order to make the comparison fair, we first apply PCA as a preprocessing step to keep 98% of the energy. For MFA, the k-nearest-neighborhood parameter $k_1$ is set as $k_1 = l - 1$ and $k_2$ is set as $c$, where $l$ denotes the number of training samples per class and $c$ denotes the number of classes. Following [11,39], the heat kernel parameter $t$ in DLPP is set as $t = 2^{m/2.5} \sigma_0$, where $\sigma_0$ is the standard deviation of the squared norms of the training samples and $m \in \{-20, \ldots, 0, \ldots, 20\}$. In ASFDA, we set $r = 1$, $\beta = 0.65$ and the between-class neighborhood selection parameter $k = c$. After all methods have been used to extract low-dimensional features, the nearest neighbor classifier with the Euclidean metric is employed to perform the classification task. The recognition accuracy is the percentage of testing samples that are correctly recognized. All experiments are performed on a PC (CPU: Core 2 Duo 2.2 GHz, RAM: 2 GB) with MATLAB 2010a. A minimal sketch of this evaluation pipeline is given below, after the database descriptions.

4.1. Databases

The ORL face database contains 400 images of 40 distinct individuals, and each subject has 10 different images. These images were taken at different times and show variations in lighting conditions, facial expression (open/closed eyes, smiling/not smiling) and facial details (glasses/no glasses). For computational convenience, we manually cropped the face portion of each image to a fixed resolution. Some example images of one person are shown in Fig. 3.

The Extended YaleB face database contains 16,128 images under 9 poses and 64 illumination conditions. In our experiments, we select a subset containing 2431 images of 38 individuals. Before the experiments, each image in the Extended YaleB face database is cropped and resized to a fixed resolution. Fig. 4 shows some sample images.

The CMU PIE face database contains 41,368 images of 68 individuals. The images of each individual were taken under 13 different poses, 43 different illumination conditions and with 4 different expressions. In our experiments, we select a subset containing 11,554 images of 68 individuals. Before the experiments, all face images in the PIE database are resized to a fixed resolution. Some sample images are shown in Fig. 5.

4.2. Experiment for the tradeoff parameter α

In this subsection, we investigate the performance of ASFDA over the reduced dimensions and the value of the tradeoff parameter α. On the YaleB and PIE face databases, 8 samples of each individual are selected for training and the remaining samples are used for testing. On the ORL face database, 5 samples per class are randomly chosen for training and the rest are used for testing. Each experiment is randomly repeated 20 times to obtain the average recognition accuracy. Figs. 7-9 show the recognition accuracy of ASFDA over the dimensionality of the subspace for different values of the parameter α. Table 1 gives the maximal recognition accuracy of ASFDA with different values of α, and Fig. 6 gives the variation of α during the iteration.

Fig. 3. Sample images in the ORL database.

Fig. 4. Sample images in the YaleB database.

Fig. 5. Sample images in the PIE database.
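For reference, the sketch below illustrates the evaluation protocol described above: PCA keeping 98% of the energy as preprocessing, followed by a 1-nearest-neighbor classifier with the Euclidean metric on the extracted features. It is our own illustration under assumed names and data layout, not the authors' MATLAB code.

```python
import numpy as np

def pca_keep_energy(X, energy=0.98):
    """X: (D, N). Returns the PCA basis that keeps the given energy fraction."""
    Xc = X - X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = np.searchsorted(np.cumsum(s ** 2) / np.sum(s ** 2), energy) + 1
    return U[:, :k]

def nn_accuracy(train_feats, train_y, test_feats, test_y):
    """1-NN with Euclidean distance; features are (d, N) column matrices."""
    d2 = ((test_feats[:, :, None] - train_feats[:, None, :]) ** 2).sum(axis=0)
    pred = train_y[np.argmin(d2, axis=1)]
    return float(np.mean(pred == test_y))
```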

From Figs. 6-9 and Table 1, we can see that the recognition rates are sensitive to the tradeoff parameter. ASFDA always has a significant advantage in recognition rate, because our method automatically chooses an optimal parameter to balance the within-class and between-class temporal variation and thus attains the maximum discriminant power in the low-dimensional subspace, so it consistently obtains the best classification performance. As can be seen from Fig. 6, the parameter α quickly reaches its minimum value; at that point ASFDA obtains its maximum discriminant power, because the discriminant power is the reciprocal of α. From Fig. 7, when α is set to 0.001, 0.01, 0.1 and 1, the recognition rate curves are very similar and the recognition rates are less sensitive to α. The reason is that there is less conflict between within-class and between-class temporal variation on the ORL database.

4.3. Experiment for face recognition

In this subsection, we compare the performance of different dimensionality reduction methods. In order to evaluate the effectiveness of the time series constructed by our method, we also extend SFDA [31] to a difference form, so there are two variants of SFDA: SFDA-ratio and SFDA-difference. We randomly select l samples of each individual for training (l = 8, 10, 12, 15 on YaleB and PIE; l = 6, 7, 8 on ORL) and use the remaining samples for testing on the three databases. Each experiment is randomly repeated 20 times to obtain the average recognition accuracy. Tables 2-4 give the maximal average recognition accuracy obtained by the different methods, together with the standard deviations and the corresponding dimensionality of the reduced subspace. In addition, the recognition rate curves of the different methods are drawn in Figs. 10-12.

Fig. 6. The variation of the tradeoff parameter α with the number of iterations on the ORL, YaleB and PIE databases.

Fig. 7. Recognition accuracies of ASFDA on the ORL face database using 5 training samples.

Fig. 8. Recognition accuracies of ASFDA on the YaleB face database using 8 training samples.

Fig. 9. Recognition accuracies of ASFDA on the PIE face database using 8 training samples.

Table 1. The maximal recognition accuracy (%) of ASFDA on the ORL, YaleB and PIE databases for α = 0.001, 0.01, 0.1, 1, 10, 100, 1000 and for the adaptively chosen α of our method (α = 0.09 on ORL, α = 0.25 on YaleB, α = 0.44 on PIE); the best accuracy in each row is shown in bold.

From Figs. 10-13 and Tables 2-4, we can see that ASFDA consistently outperforms LDA, DLPP, MFA and MMC in all experiments on the three face databases. The good performance of ASFDA also demonstrates that it is more effective than the other methods in feature extraction: ASFDA not only captures the structural information of both the sub-manifolds and the multi-manifold, but also obtains the maximum discriminant power from the within-class and between-class temporal variations. As shown in Figs. 10-12, the maximum recognition rate of LDA is higher than that of MMC, and SFDA-ratio also outperforms SFDA-difference; the reason is that MMC and SFDA-difference are sensitive to the tradeoff parameter α. SFDA-ratio and ASFDA outperform SSFA, SSFACM and SSFASP, because SSFA, SSFACM and SSFASP ignore the between-class information. Although SFDA-difference can obtain discriminant features, its performance is not as good as expected in our experiments; the reason may be that the difference criterion relies on the tradeoff parameter α when there is a conflict between within-class and between-class temporal variation. We can also observe that ASFDA outperforms both variants of SFDA, SFDA-ratio and SFDA-difference, because the time series constructed by our method are more helpful for revealing the structural information of the sub-manifolds and the multi-manifold than those of SFDA.

4.4. Influence of parameters on ASFDA performance

The proposed method ASFDA has two parameters: r, the dimension of the local neighborhood, and β, which measures the PCA energy of the r directions in the local neighborhood. In this subsection, we study the impact of r and β on the performance of ASFDA on the ORL, YaleB and PIE face databases. We randomly select 8 samples of each individual for training and use the remaining samples for testing, and we repeat the experiment 20 times to obtain the average recognition accuracy.

Table 2. The maximal average recognition accuracy (%), the corresponding standard deviations, and the optimal dimensions of LDA, DLPP, MFA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio, SFDA-difference and ASFDA across 20 runs on the ORL database with 6, 7 and 8 training samples per class; the best recognition accuracy is shown in bold.

Fig. 10. The recognition rate curves of LDA, DLPP, MFA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio, SFDA-difference and ASFDA versus dimensions on the ORL face database using 7 training samples.

Table 3. The maximal average recognition accuracy (%), the corresponding standard deviations, and the optimal dimensions of the same methods across 20 runs on the YaleB database with 8, 10, 12 and 15 training samples per class; the best recognition accuracy is shown in bold.

Table 4. The maximal average recognition accuracy (%), the corresponding standard deviations, and the optimal dimensions of the same methods across 20 runs on the PIE database with 8, 10, 12 and 15 training samples per class; the best recognition accuracy is shown in bold.

Fig. 13 shows the maximal average recognition accuracy as a function of the two parameters r and β. From Fig. 13 we can clearly see that the proposed ASFDA is, on the whole, stable with respect to the parameters r and β on the three face databases. More specifically, the recognition accuracy of ASFDA increases over a small range either as β increases or as r decreases. When r is set to 1 or 2 and β is set above 0.6, ASFDA obtains better performance on the three face databases. The reason may be that the dimension of the local manifold is small: the points in X_i and the point x_i in the constructed time series are nearby and lie in the same principal directions, which helps reveal the local structure of the manifold.

4.5. Computational efficiency comparison

In this section, we discuss the computational cost of the proposed ASFDA in comparison with LDA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio and SFDA-difference. ASFDA has the same complexity as SFDA once the time series and the parameter α are given. However, ASFDA needs extra computation to construct the adaptive time series and to compute the adaptive parameter α that balances the within-class and between-class temporal variation; consequently, ASFDA requires more arithmetic operations than SFDA. We use the ORL, YaleB and PIE face databases to empirically compare the computational efficiency of these methods. In each database, l = 8 samples of each individual are selected to measure the training cost of each method. The experiments are repeated 20 times and the average training time is computed. Table 5 shows the average training costs of the methods on the different databases.

Fig. 11. The recognition rate curves of LDA, DLPP, MFA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio, SFDA-difference and ASFDA versus dimensions on the YaleB face database using 8 training samples.

Fig. 12. The recognition rate curves of LDA, DLPP, MFA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio, SFDA-difference and ASFDA versus dimensions on the PIE face database using 8 training samples.

Fig. 13. The maximal average recognition accuracy versus parameters r and β on the ORL, YaleB and PIE face databases.

Table 5. The average training time (seconds) of LDA, MMC, SSFA, SSFACM, SSFASP, SFDA-ratio, SFDA-difference and ASFDA across 20 runs on the three face databases.

5. Conclusion

In this paper, we develop a novel feature extraction method called adaptive slow feature discriminant analysis (ASFDA) for face recognition. ASFDA provides a new criterion to generate within-class time series that describe the sub-manifolds, and it uses neighboring information to generate between-class time series that describe the margins between the manifolds. Moreover, ASFDA provides an automatic parameter to balance the within-class and between-class temporal variation so as to obtain an optimal discriminant subspace for classification. The experimental results demonstrate that the proposed ASFDA is superior to some state-of-the-art methods in face recognition.

10 148 X. Gu et al. / Neurocomputing 154 (2015) Acknowledgement This work is supported by the National Natural Science Fund of China (Grant nos , and ), the Project of Ministry of Industry, Information Technology of PRC (Grant no. E0310/1112/02-1) and Fundamental Research Funds for the Central Universities (Grant no. 2013KJ010). References [1] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cogn. Neurosci. 3 (1) (1991) [2] P.N. Belhumeur, J.P. Hespanha, D. Kriegman, Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) [3] S.T. Roweis, L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (5500) (2000) [4] J.B. Tenenbaum, V. De Silva, J.C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (5500) (2000) [5] M. Belkin, P. Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput. 15 (6) (2003) [6] Z.-y. Zhang, H.-y. Zha, Principal manifolds and nonlinear dimensionality reduction via tangent space alignment, J. Shanghai Univ. (English Edition) 8 (4) (2004) [7] X. He, D. Cai, S. Yan, H.-J. Zhang, Neighborhood preserving embedding, in: Tenth IEEE International Conference on Computer Vision, vol. 2, IEEE, Los Alamitos, CA, USA, 2005, pp [8] X. He, S. Yan, Y. Hu, P. Niyogi, H.-J. Zhang, Face recognition using laplacianfaces, IEEE Trans. Pattern Anal. Mach. Intell. 27 (3) (2005) [9] T. Zhang, J. Yang, D. Zhao, X. Ge, Linear local tangent space alignment and application to face recognition, Neurocomputing 70 (7) (2007) [10] W. Yu, X. Teng, C. Liu, Face recognition using discriminant locality preserving projections, Image Vis. Comput. 24 (3) (2006) [11] G.-F. Lu, Z. Lin, Z. Jin, Face recognition using discriminant locality preserving projections based on maximum margin criterion, Pattern Recognit. 43 (10) (2010) [12] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, S. Lin, Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. 29 (1) (2007) [13] H.-T. Chen, H.-W. Chang, T.-L. Liu, Local discriminant embedding and its variants, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, IEEE, Los Alamitos, CA, USA, 2005, pp [14] C. Zhao, D. Miao, Z. Lai, C. Gao, C. Liu, J. Yang, Two-dimensional color uncorrelated discriminant analysis for face recognition, Neurocomputing 113 (3) (2013) [15] C. Zhao, Z. Lai, C. Liu, X. Gu, J. Qian, Fuzzy local maximal marginal embedding for feature extraction, Soft Comput. 16 (1) (2012) [16] H. Li, T. Jiang, K. Zhang, Efficient and robust feature extraction by maximum margin criterion, in: Neural Information Processing Systems, [17] Z. Zhang, J. Wang, H. Zha, Adaptive manifold learning, IEEE Trans. Pattern Anal. Mach. Intell. 34 (2) (2012) [18] C. Zhao, Z. Lai, D. Miao, Z. Wei, C. Liu, Graph embedding discriminant analysis for face recognition, Neural Comput. Appl. 22 (5) (2013) [19] J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, Y. Ma, Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) [20] L.Itti,C.Koch,E.Niebur,etal.,Amodelofsaliency-basedvisualattentionforrapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. 20 (11) (1998) [21] L. Wiskott, T.J. Sejnowski, Slow feature analysis: unsupervised learning of invariances, Neural Comput. 14 (4) (2002) [22] P. 
Berkes, Temporal Slowness as an Unsupervised Learning Principle: Selforganization of Complex-cell Receptive Fields and Application to Pattern Recognition (Ph.D. thesis), Citeseer, [23] S. Dähne, N. Wilbert, L. Wiskott, Slow feature analysis on retinal waves leads to v1 complex cells, PLoS Comput. Biol. 10 (5) (2014) e [24] R. Legenstein, N. Wilbert, L. Wiskott, Reinforcement learning on slow features of high-dimensional input streams, PLoS Comput. Biol. 6 (8) (2010) e [25] M. Franzius, N. Wilbert, L. Wiskott, Invariant object recognition and pose estimation with slow feature analysis, Neural Comput. 23 (9) (2011) [26] T. Blaschke, T. Zito, L. Wiskott, Independent slow feature analysis and nonlinear blind source separation, Neural Comput. 19 (4) (2007) [27] S. Dähne, J. Höhne, M. Schreuder, M. Tangermann, Slow feature analysis-a tool for extraction of discriminating event-related potentials in brain-computer interfaces, in: Artificial Neural Networks and Machine Learning ICANN 2011, Springer, Berlin, Germany, 2011, pp [28] Z. Zhang, D. Tao, Slow feature analysis for human action recognition, IEEE Trans. Pattern Anal. Mach. Intell. 34 (3) (2012) [29] X. Gu, C. Liu, S. Wang, Supervised slow feature analysis for face recognition, in: Biometric Recognition, Springer, Heidelberg, Germany, 2013, pp [30] X. Gu, C. Liu, Z. Yang, Dimensionality reduction based on supervised slow feature analysis for face recognition, Int. J. Signal Process., Image Process. Pattern Recognit. 7 (1) (2014) [31] Y. Huang, J. Zhao, M. Tian, Q. Zou, S. Luo, Slow feature discriminant analysis and its application on handwritten digit recognition, in: International Joint Conference on Neural Networks, IEEE, Piscataway, NJ, USA, 2009, pp [32] Y. Huang, J. Zhao, Y. Liu, S. Luo, Q. Zou, M. Tian, Nonlinear dimensionality reduction using a temporal coherence principle, Inf. Sci. 181 (16) (2011) [33] V. Premachandran, R. Kakarala, Consensus of k-nns for robust neighborhood selection on graph-based manifolds, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Los Alamitos, CA, USA, 2013, pp [34] A.M. Martinez, M. Zhu, Where are linear feature extraction methods applicable? IEEE Trans. Pattern Anal. Mach. Intell. 27 (12) (2005) [35] M. Zhu, A.M. Martinez, Subclass discriminant analysis, IEEE Trans. Pattern Anal. Mach. Intell. 28 (8) (2006) [36] X. Jing, S. Li, D. Zhang, C. Lan, J. Yang, Optimal subset-division based discrimination and its kernelization for face and palmprint recognition, Pattern Recognit. 45 (10) (2012) [37] H. Li, T. Jiang, K. Zhang, Efficient and robust feature extraction by maximum margin criterion, IEEE Trans. Neural Netw. 17 (1) (2006) [38] G.H. Golub, C. Reinsch, Singular value decomposition and least squares solutions, Numer. Math. 14 (5) (1970) [39] L. Zhang, L. Qiao, S. Chen, Graph-optimized locality preserving projections, Pattern Recognit. 43 (6) (2010) Xingjian Gu is now working for Ph.D. degree at the School of Computer Science and Engineering in Nanjing University of Science and Technology. He received his B. S. degree in the college of math and physics at Nanjing University of Information Science and Technology in His research interests mainly focus on Pattern Recognition and Computer Vision. Chuancai Liu is a Full Professor in the School of Computer Science and Engineering of Nanjing University of Science and Technology, China. He obtained his Ph.D. 
degree from the China Ship Research and Development Academy in His research interests include AI, Pattern Recognition and Computer Vision. He has published about 50 papers in International/ National Journals. Sheng Wang received his B.S. degree in automation from Henan University, China, in He obtained his M.S. degree in Control Theory and Control Engineering from the same University. Currently, he is a Ph.D. student at Nanjing University of science and Technology. His research interests include Image Processing, Pattern Recognition and Machine Learning. Cairong Zhao is currently an assistant professor at Tongji University. He received the Ph.D. degree from Nanjing University of Science and Technology, M.S. degree from Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, and B.S. degree from Jilin University, in 2011, 2006 and 2003, respectively. His research interests include Face Recognition, Building Recognition and Vision Attention.


More information

Locally Linear Embedded Eigenspace Analysis

Locally Linear Embedded Eigenspace Analysis Locally Linear Embedded Eigenspace Analysis IFP.TR-LEA.YunFu-Jan.1,2005 Yun Fu and Thomas S. Huang Beckman Institute for Advanced Science and Technology University of Illinois at Urbana-Champaign 405 North

More information

Example: Face Detection

Example: Face Detection Announcements HW1 returned New attendance policy Face Recognition: Dimensionality Reduction On time: 1 point Five minutes or more late: 0.5 points Absent: 0 points Biometrics CSE 190 Lecture 14 CSE190,

More information

Statistical and Computational Analysis of Locality Preserving Projection

Statistical and Computational Analysis of Locality Preserving Projection Statistical and Computational Analysis of Locality Preserving Projection Xiaofei He xiaofei@cs.uchicago.edu Department of Computer Science, University of Chicago, 00 East 58th Street, Chicago, IL 60637

More information

Orthogonal Laplacianfaces for Face Recognition

Orthogonal Laplacianfaces for Face Recognition 3608 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 11, NOVEMBER 2006 [29] G. Deng and J. C. Pinoli, Differentiation-based edge detection using the logarithmic image processing model, J. Math. Imag.

More information

Machine Learning. Data visualization and dimensionality reduction. Eric Xing. Lecture 7, August 13, Eric Xing Eric CMU,

Machine Learning. Data visualization and dimensionality reduction. Eric Xing. Lecture 7, August 13, Eric Xing Eric CMU, Eric Xing Eric Xing @ CMU, 2006-2010 1 Machine Learning Data visualization and dimensionality reduction Eric Xing Lecture 7, August 13, 2010 Eric Xing Eric Xing @ CMU, 2006-2010 2 Text document retrieval/labelling

More information

Informative Laplacian Projection

Informative Laplacian Projection Informative Laplacian Projection Zhirong Yang and Jorma Laaksonen Department of Information and Computer Science Helsinki University of Technology P.O. Box 5400, FI-02015, TKK, Espoo, Finland {zhirong.yang,jorma.laaksonen}@tkk.fi

More information

Supervised locally linear embedding

Supervised locally linear embedding Supervised locally linear embedding Dick de Ridder 1, Olga Kouropteva 2, Oleg Okun 2, Matti Pietikäinen 2 and Robert P.W. Duin 1 1 Pattern Recognition Group, Department of Imaging Science and Technology,

More information

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian

More information

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization Haiping Lu 1 K. N. Plataniotis 1 A. N. Venetsanopoulos 1,2 1 Department of Electrical & Computer Engineering,

More information

Adaptive Affinity Matrix for Unsupervised Metric Learning

Adaptive Affinity Matrix for Unsupervised Metric Learning Adaptive Affinity Matrix for Unsupervised Metric Learning Yaoyi Li, Junxuan Chen, Yiru Zhao and Hongtao Lu Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering,

More information

Machine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.

Machine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang. Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning

More information

Keywords Eigenface, face recognition, kernel principal component analysis, machine learning. II. LITERATURE REVIEW & OVERVIEW OF PROPOSED METHODOLOGY

Keywords Eigenface, face recognition, kernel principal component analysis, machine learning. II. LITERATURE REVIEW & OVERVIEW OF PROPOSED METHODOLOGY Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Eigenface and

More information

Data dependent operators for the spatial-spectral fusion problem

Data dependent operators for the spatial-spectral fusion problem Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.

More information

ECE 661: Homework 10 Fall 2014

ECE 661: Homework 10 Fall 2014 ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;

More information

Enhanced graph-based dimensionality reduction with repulsion Laplaceans

Enhanced graph-based dimensionality reduction with repulsion Laplaceans Enhanced graph-based dimensionality reduction with repulsion Laplaceans E. Kokiopoulou a, Y. Saad b a EPFL, LTS4 lab, Bat. ELE, Station 11; CH 1015 Lausanne; Switzerland. Email: effrosyni.kokiopoulou@epfl.ch

More information

Classification of handwritten digits using supervised locally linear embedding algorithm and support vector machine

Classification of handwritten digits using supervised locally linear embedding algorithm and support vector machine Classification of handwritten digits using supervised locally linear embedding algorithm and support vector machine Olga Kouropteva, Oleg Okun, Matti Pietikäinen Machine Vision Group, Infotech Oulu and

More information

Unsupervised dimensionality reduction

Unsupervised dimensionality reduction Unsupervised dimensionality reduction Guillaume Obozinski Ecole des Ponts - ParisTech SOCN course 2014 Guillaume Obozinski Unsupervised dimensionality reduction 1/30 Outline 1 PCA 2 Kernel PCA 3 Multidimensional

More information

Spectral Regression for Efficient Regularized Subspace Learning

Spectral Regression for Efficient Regularized Subspace Learning Spectral Regression for Efficient Regularized Subspace Learning Deng Cai UIUC dengcai2@cs.uiuc.edu Xiaofei He Yahoo! hex@yahoo-inc.com Jiawei Han UIUC hanj@cs.uiuc.edu Abstract Subspace learning based

More information

Manifold Regularization

Manifold Regularization 9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,

More information

The prediction of membrane protein types with NPE

The prediction of membrane protein types with NPE The prediction of membrane protein types with NPE Lipeng Wang 1a), Zhanting Yuan 1, Xuhui Chen 1, and Zhifang Zhou 2 1 College of Electrical and Information Engineering Lanzhou University of Technology,

More information

Linear Discriminant Analysis Using Rotational Invariant L 1 Norm

Linear Discriminant Analysis Using Rotational Invariant L 1 Norm Linear Discriminant Analysis Using Rotational Invariant L 1 Norm Xi Li 1a, Weiming Hu 2a, Hanzi Wang 3b, Zhongfei Zhang 4c a National Laboratory of Pattern Recognition, CASIA, Beijing, China b University

More information

SINGLE-TASK AND MULTITASK SPARSE GAUSSIAN PROCESSES

SINGLE-TASK AND MULTITASK SPARSE GAUSSIAN PROCESSES SINGLE-TASK AND MULTITASK SPARSE GAUSSIAN PROCESSES JIANG ZHU, SHILIANG SUN Department of Computer Science and Technology, East China Normal University 500 Dongchuan Road, Shanghai 20024, P. R. China E-MAIL:

More information

Linear Subspace Models

Linear Subspace Models Linear Subspace Models Goal: Explore linear models of a data set. Motivation: A central question in vision concerns how we represent a collection of data vectors. The data vectors may be rasterized images,

More information

Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation)

Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation) Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation) PCA transforms the original input space into a lower dimensional space, by constructing dimensions that are linear combinations

More information

LECTURE NOTE #11 PROF. ALAN YUILLE

LECTURE NOTE #11 PROF. ALAN YUILLE LECTURE NOTE #11 PROF. ALAN YUILLE 1. NonLinear Dimension Reduction Spectral Methods. The basic idea is to assume that the data lies on a manifold/surface in D-dimensional space, see figure (1) Perform

More information

Intrinsic Structure Study on Whale Vocalizations

Intrinsic Structure Study on Whale Vocalizations 1 2015 DCLDE Conference Intrinsic Structure Study on Whale Vocalizations Yin Xian 1, Xiaobai Sun 2, Yuan Zhang 3, Wenjing Liao 3 Doug Nowacek 1,4, Loren Nolte 1, Robert Calderbank 1,2,3 1 Department of

More information

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Jianhui Chen, Jieping Ye Computer Science and Engineering Department Arizona State University {jianhui.chen,

More information

Face recognition Computer Vision Spring 2018, Lecture 21

Face recognition Computer Vision Spring 2018, Lecture 21 Face recognition http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 21 Course announcements Homework 6 has been posted and is due on April 27 th. - Any questions about the homework?

More information

Non-negative Matrix Factorization on Kernels

Non-negative Matrix Factorization on Kernels Non-negative Matrix Factorization on Kernels Daoqiang Zhang, 2, Zhi-Hua Zhou 2, and Songcan Chen Department of Computer Science and Engineering Nanjing University of Aeronautics and Astronautics, Nanjing

More information

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi

More information

STUDY ON METHODS FOR COMPUTER-AIDED TOOTH SHADE DETERMINATION

STUDY ON METHODS FOR COMPUTER-AIDED TOOTH SHADE DETERMINATION INTERNATIONAL JOURNAL OF INFORMATION AND SYSTEMS SCIENCES Volume 5, Number 3-4, Pages 351 358 c 2009 Institute for Scientific Computing and Information STUDY ON METHODS FOR COMPUTER-AIDED TOOTH SHADE DETERMINATION

More information

Sparse representation classification and positive L1 minimization

Sparse representation classification and positive L1 minimization Sparse representation classification and positive L1 minimization Cencheng Shen Joint Work with Li Chen, Carey E. Priebe Applied Mathematics and Statistics Johns Hopkins University, August 5, 2014 Cencheng

More information

Gaussian Process Latent Random Field

Gaussian Process Latent Random Field Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Gaussian Process Latent Random Field Guoqiang Zhong, Wu-Jun Li, Dit-Yan Yeung, Xinwen Hou, Cheng-Lin Liu National Laboratory

More information

Nonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve aka manifold.

Nonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve aka manifold. Nonlinear Methods Data often lies on or near a nonlinear low-dimensional curve aka manifold. 27 Laplacian Eigenmaps Linear methods Lower-dimensional linear projection that preserves distances between all

More information

COS 429: COMPUTER VISON Face Recognition

COS 429: COMPUTER VISON Face Recognition COS 429: COMPUTER VISON Face Recognition Intro to recognition PCA and Eigenfaces LDA and Fisherfaces Face detection: Viola & Jones (Optional) generic object models for faces: the Constellation Model Reading:

More information

Multisets mixture learning-based ellipse detection

Multisets mixture learning-based ellipse detection Pattern Recognition 39 (6) 731 735 Rapid and brief communication Multisets mixture learning-based ellipse detection Zhi-Yong Liu a,b, Hong Qiao a, Lei Xu b, www.elsevier.com/locate/patcog a Key Lab of

More information

Nonlinear Manifold Learning Summary

Nonlinear Manifold Learning Summary Nonlinear Manifold Learning 6.454 Summary Alexander Ihler ihler@mit.edu October 6, 2003 Abstract Manifold learning is the process of estimating a low-dimensional structure which underlies a collection

More information

Pattern Recognition 2

Pattern Recognition 2 Pattern Recognition 2 KNN,, Dr. Terence Sim School of Computing National University of Singapore Outline 1 2 3 4 5 Outline 1 2 3 4 5 The Bayes Classifier is theoretically optimum. That is, prob. of error

More information

Regularized Locality Preserving Projections with Two-Dimensional Discretized Laplacian Smoothing

Regularized Locality Preserving Projections with Two-Dimensional Discretized Laplacian Smoothing Report No. UIUCDCS-R-2006-2748 UILU-ENG-2006-1788 Regularized Locality Preserving Projections with Two-Dimensional Discretized Laplacian Smoothing by Deng Cai, Xiaofei He, and Jiawei Han July 2006 Regularized

More information

Eigenface-based facial recognition

Eigenface-based facial recognition Eigenface-based facial recognition Dimitri PISSARENKO December 1, 2002 1 General This document is based upon Turk and Pentland (1991b), Turk and Pentland (1991a) and Smith (2002). 2 How does it work? The

More information

Dimension Reduction and Low-dimensional Embedding

Dimension Reduction and Low-dimensional Embedding Dimension Reduction and Low-dimensional Embedding Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1/26 Dimension

More information

Dimensionality Reduction Using the Sparse Linear Model: Supplementary Material

Dimensionality Reduction Using the Sparse Linear Model: Supplementary Material Dimensionality Reduction Using the Sparse Linear Model: Supplementary Material Ioannis Gkioulekas arvard SEAS Cambridge, MA 038 igkiou@seas.harvard.edu Todd Zickler arvard SEAS Cambridge, MA 038 zickler@seas.harvard.edu

More information

Recognition Using Class Specific Linear Projection. Magali Segal Stolrasky Nadav Ben Jakov April, 2015

Recognition Using Class Specific Linear Projection. Magali Segal Stolrasky Nadav Ben Jakov April, 2015 Recognition Using Class Specific Linear Projection Magali Segal Stolrasky Nadav Ben Jakov April, 2015 Articles Eigenfaces vs. Fisherfaces Recognition Using Class Specific Linear Projection, Peter N. Belhumeur,

More information

Principal component analysis using QR decomposition

Principal component analysis using QR decomposition DOI 10.1007/s13042-012-0131-7 ORIGINAL ARTICLE Principal component analysis using QR decomposition Alok Sharma Kuldip K. Paliwal Seiya Imoto Satoru Miyano Received: 31 March 2012 / Accepted: 3 September

More information

CITS 4402 Computer Vision

CITS 4402 Computer Vision CITS 4402 Computer Vision A/Prof Ajmal Mian Adj/A/Prof Mehdi Ravanbakhsh Lecture 06 Object Recognition Objectives To understand the concept of image based object recognition To learn how to match images

More information

Efficient Kernel Discriminant Analysis via QR Decomposition

Efficient Kernel Discriminant Analysis via QR Decomposition Efficient Kernel Discriminant Analysis via QR Decomposition Tao Xiong Department of ECE University of Minnesota txiong@ece.umn.edu Jieping Ye Department of CSE University of Minnesota jieping@cs.umn.edu

More information

Enhanced Fisher Linear Discriminant Models for Face Recognition

Enhanced Fisher Linear Discriminant Models for Face Recognition Appears in the 14th International Conference on Pattern Recognition, ICPR 98, Queensland, Australia, August 17-2, 1998 Enhanced isher Linear Discriminant Models for ace Recognition Chengjun Liu and Harry

More information

Manifold Learning and it s application

Manifold Learning and it s application Manifold Learning and it s application Nandan Dubey SE367 Outline 1 Introduction Manifold Examples image as vector Importance Dimension Reduction Techniques 2 Linear Methods PCA Example MDS Perception

More information

Discriminative Direction for Kernel Classifiers

Discriminative Direction for Kernel Classifiers Discriminative Direction for Kernel Classifiers Polina Golland Artificial Intelligence Lab Massachusetts Institute of Technology Cambridge, MA 02139 polina@ai.mit.edu Abstract In many scientific and engineering

More information

Automatic Subspace Learning via Principal Coefficients Embedding

Automatic Subspace Learning via Principal Coefficients Embedding IEEE TRANSACTIONS ON CYBERNETICS 1 Automatic Subspace Learning via Principal Coefficients Embedding Xi Peng, Jiwen Lu, Senior Member, IEEE, Zhang Yi, Fellow, IEEE and Rui Yan, Member, IEEE, arxiv:1411.4419v5

More information

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Introduction and Data Representation Mikhail Belkin & Partha Niyogi Department of Electrical Engieering University of Minnesota Mar 21, 2017 1/22 Outline Introduction 1 Introduction 2 3 4 Connections to

More information

Comparative Assessment of Independent Component. Component Analysis (ICA) for Face Recognition.

Comparative Assessment of Independent Component. Component Analysis (ICA) for Face Recognition. Appears in the Second International Conference on Audio- and Video-based Biometric Person Authentication, AVBPA 99, ashington D. C. USA, March 22-2, 1999. Comparative Assessment of Independent Component

More information

A Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier

A Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier A Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier Seiichi Ozawa 1, Shaoning Pang 2, and Nikola Kasabov 2 1 Graduate School of Science and Technology,

More information

ISSN: (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at:

More information

Face Detection and Recognition

Face Detection and Recognition Face Detection and Recognition Face Recognition Problem Reading: Chapter 18.10 and, optionally, Face Recognition using Eigenfaces by M. Turk and A. Pentland Queryimage face query database Face Verification

More information

Online Appearance Model Learning for Video-Based Face Recognition

Online Appearance Model Learning for Video-Based Face Recognition Online Appearance Model Learning for Video-Based Face Recognition Liang Liu 1, Yunhong Wang 2,TieniuTan 1 1 National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences,

More information

Manifold Coarse Graining for Online Semi-supervised Learning

Manifold Coarse Graining for Online Semi-supervised Learning for Online Semi-supervised Learning Mehrdad Farajtabar, Amirreza Shaban, Hamid R. Rabiee, Mohammad H. Rohban Digital Media Lab, Department of Computer Engineering, Sharif University of Technology, Tehran,

More information

Manifold Learning: Theory and Applications to HRI

Manifold Learning: Theory and Applications to HRI Manifold Learning: Theory and Applications to HRI Seungjin Choi Department of Computer Science Pohang University of Science and Technology, Korea seungjin@postech.ac.kr August 19, 2008 1 / 46 Greek Philosopher

More information

Image Analysis & Retrieval. Lec 14. Eigenface and Fisherface

Image Analysis & Retrieval. Lec 14. Eigenface and Fisherface Image Analysis & Retrieval Lec 14 Eigenface and Fisherface Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346. http://l.web.umkc.edu/lizhu Z. Li, Image Analysis & Retrv, Spring

More information

Apprentissage non supervisée

Apprentissage non supervisée Apprentissage non supervisée Cours 3 Higher dimensions Jairo Cugliari Master ECD 2015-2016 From low to high dimension Density estimation Histograms and KDE Calibration can be done automacally But! Let

More information

Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the

More information

PARAMETERIZATION OF NON-LINEAR MANIFOLDS

PARAMETERIZATION OF NON-LINEAR MANIFOLDS PARAMETERIZATION OF NON-LINEAR MANIFOLDS C. W. GEAR DEPARTMENT OF CHEMICAL AND BIOLOGICAL ENGINEERING PRINCETON UNIVERSITY, PRINCETON, NJ E-MAIL:WGEAR@PRINCETON.EDU Abstract. In this report we consider

More information

Reconnaissance d objetsd et vision artificielle

Reconnaissance d objetsd et vision artificielle Reconnaissance d objetsd et vision artificielle http://www.di.ens.fr/willow/teaching/recvis09 Lecture 6 Face recognition Face detection Neural nets Attention! Troisième exercice de programmation du le

More information

Robot Image Credit: Viktoriya Sukhanova 123RF.com. Dimensionality Reduction

Robot Image Credit: Viktoriya Sukhanova 123RF.com. Dimensionality Reduction Robot Image Credit: Viktoriya Sukhanova 13RF.com Dimensionality Reduction Feature Selection vs. Dimensionality Reduction Feature Selection (last time) Select a subset of features. When classifying novel

More information

PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani

PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given

More information

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

More information

A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier

A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier Seiichi Ozawa, Shaoning Pang, and Nikola Kasabov Graduate School of Science and Technology, Kobe

More information

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin 1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)

More information

Image Analysis & Retrieval Lec 14 - Eigenface & Fisherface

Image Analysis & Retrieval Lec 14 - Eigenface & Fisherface CS/EE 5590 / ENG 401 Special Topics, Spring 2018 Image Analysis & Retrieval Lec 14 - Eigenface & Fisherface Zhu Li Dept of CSEE, UMKC http://l.web.umkc.edu/lizhu Office Hour: Tue/Thr 2:30-4pm@FH560E, Contact:

More information