When Fisher meets Fukunaga-Koontz: A New Look at Linear Discriminants

Sheng Zhang, Terence Sim
School of Computing, National University of Singapore
3 Science Drive 2, Singapore 117543
{zhangshe, tsim}@comp.nus.edu.sg

Abstract

The Fisher Linear Discriminant (FLD) is commonly used in pattern recognition. It finds a linear subspace that maximally separates class patterns according to Fisher's Criterion. Several methods of computing the FLD have been proposed in the literature, most of which require the calculation of the so-called scatter matrices. In this paper, we bring a fresh perspective to the FLD via the Fukunaga-Koontz Transform (FKT). We do this by decomposing the whole data space into four subspaces, and show where Fisher's Criterion is maximally satisfied. We prove the relationship between the FLD and the FKT analytically, and propose a method of computing the most discriminative subspace. This method is based on the QR decomposition, which works even when the scatter matrices are singular or too large to be formed. Our method is general and may be applied to different pattern recognition problems. We validate our method by experiments on synthetic and real data.

1. Introduction

In recent years, discriminant subspace analysis has been extensively studied in computer vision and pattern recognition. It has been widely used for feature extraction and dimension reduction in face recognition [1] and text classification [5]. One popular method is the Fisher Linear Discriminant (FLD), also known as Linear Discriminant Analysis (LDA) [2, 3]. It tries to find an optimal subspace in which the separability of the classes is maximized. This is achieved by simultaneously minimizing the within-class distance and maximizing the between-class distance. More specifically, in terms of the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$, Fisher's Criterion can be written as

$$J_F(\Phi) = \frac{|\Phi^\top S_b \Phi|}{|\Phi^\top S_w \Phi|}. \qquad (1)$$

By maximizing the criterion $J_F$, the Fisher Linear Discriminant finds the subspaces in which the classes are most linearly separable. The solution [3] that maximizes $J_F$ is a set of eigenvectors $\{\phi_i\}$ which must satisfy

$$S_b \phi = \lambda S_w \phi. \qquad (2)$$

This is called the generalized eigenvalue problem [2, 3]. The discriminant subspace is spanned by the generalized eigenvectors, and the discriminability of each eigenvector is measured by the corresponding generalized eigenvalue; that is, the most discriminant subspace corresponds to the maximal generalized eigenvalue. The generalized eigenvalue problem can be solved by matrix inversion followed by eigendecomposition, i.e., applying the eigendecomposition to $S_w^{-1} S_b$. Unfortunately, for many applications with high-dimensional data and few training samples, such as face recognition, the scatter matrix $S_w$ is singular because the dimension of the sample data is generally greater than the number of samples. This is known as the undersampled or small sample size problem [3, 2]. A great number of methods have been proposed to circumvent the requirement that $S_w$ be nonsingular, such as Fisherface [1] and LDA/GSVD [5]. In [1], Fisherface first applies PCA [6, 8] to reduce the dimension so that $S_w$ becomes nonsingular, and then applies LDA. The LDA/GSVD algorithm [5] avoids the inversion of $S_w$ by simultaneous diagonalization via the Generalized Singular Value Decomposition (GSVD). However, these methods are designed to overcome the singularity problem and do not directly relate to the generalized eigenvalue, the essential measure of discriminability.
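To make Equations (1) and (2) concrete, here is a minimal NumPy/SciPy sketch, not from the paper (the function and variable names are ours), that computes the classical FLD by solving the generalized eigenvalue problem directly. It assumes $S_w$ is nonsingular, which is precisely the assumption that fails in the undersampled settings discussed above.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_directions(X, y):
    """Classical FLD: solve S_b phi = lambda S_w phi (Eq. 2).

    X: (N, D) data matrix, y: (N,) integer class labels.
    Assumes S_w is nonsingular (roughly N - C >= D); otherwise the
    factorization of S_w inside eigh will fail.
    """
    D = X.shape[1]
    m = X.mean(axis=0)
    Sb = np.zeros((D, D))
    Sw = np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - m, mc - m)   # between-class scatter
        Sw += (Xc - mc).T @ (Xc - mc)              # within-class scatter
    lam, Phi = eigh(Sb, Sw)              # generalized eigenpairs of (S_b, S_w)
    order = np.argsort(lam)[::-1]        # most discriminative directions first
    return lam[order], Phi[:, order]
```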
In this paper, we propose to apply the Fukunaga-Koontz Transform (FKT) [3] to the LDA problem. Based on the eigenvalue ratio of the FKT, we decompose the whole data space into four subspaces. Our theoretical analysis then establishes the relationship between LDA, the FKT and the GSVD. Our work has three main contributions:

1. We present a unifying framework for understanding different methods, namely LDA, the FKT and the GSVD. More specifically, for the LDA problem the GSVD is equivalent to the FKT, and the generalized eigenvalue of LDA is equal to both the eigenvalue ratio of the FKT and the square of the generalized singular value of the GSVD.

2. The proposed theory is useful for pattern recognition. Our theoretical analysis demonstrates how to choose the best subspaces for maximum discriminability, and experiments on synthetic and real data validate the theory.

3. In connecting the FKT with LDA, we show how the FKT, originally meant for 2-class problems, can be generalized to handle multiple classes.

The rest of this paper is organized as follows. Section 2 briefly reviews the mathematical background for LDA, the GSVD and the FKT. In Section 3, we first analyze the discriminant subspaces of LDA based on the FKT, and then establish the connections between LDA, the FKT and the GSVD. We apply our theory to classification problems on synthetic and real data in Section 4, and conclude in Section 5.

2. Mathematical Background

Notation. Let $A = \{a_1, \ldots, a_N\}$, $a_i \in \mathbb{R}^D$, denote a data set of $D$-dimensional vectors. Each data point belongs to exactly one of $C$ object classes $\{L_1, \ldots, L_C\}$. The number of vectors in class $L_i$ is denoted by $N_i$, so $N = \sum_i N_i$. Observe that for high-dimensional data, e.g., face images, generally $C < N \ll D$. The between-class scatter matrix $S_b$, the within-class scatter matrix $S_w$ and the total scatter matrix $S_t$ are defined as follows:

$$S_b = \sum_{i=1}^{C} N_i (m_i - m)(m_i - m)^\top = H_b H_b^\top, \qquad (3)$$

$$S_w = \sum_{i=1}^{C} \sum_{a \in L_i} (a - m_i)(a - m_i)^\top = H_w H_w^\top, \qquad (4)$$

$$S_t = \sum_{i=1}^{N} (a_i - m)(a_i - m)^\top = H_t H_t^\top, \qquad (5)$$

$$S_t = S_b + S_w. \qquad (6)$$

Here $m_i$ denotes the mean of class $L_i$ and $m$ is the global mean of $A$. The matrices $H_b \in \mathbb{R}^{D \times C}$, $H_w \in \mathbb{R}^{D \times N}$ and $H_t \in \mathbb{R}^{D \times N}$ are the precursor matrices of the between-class, within-class and total scatter matrices, respectively. Let us denote the rank of each scatter matrix by $r_w = \mathrm{Rank}(S_w)$, $r_b = \mathrm{Rank}(S_b)$ and $r_t = \mathrm{Rank}(S_t)$. Note that for high-dimensional data ($D \gg N$), $r_b \le C$, $r_w \le N$ and $r_t \le N$.
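As a companion to Equations (3)-(6), the following sketch (our own illustration; the function name is hypothetical) builds the precursor matrices $H_b$, $H_w$ and $H_t$ from a labelled data matrix and checks the identity $S_t = S_b + S_w$. Working with the precursors rather than the $D \times D$ scatter matrices is what later allows the method to scale when the scatter matrices are too large to be formed.

```python
import numpy as np

def precursor_matrices(X, y):
    """Precursor matrices of Eqs. (3)-(5): S_b = Hb Hb^T, S_w = Hw Hw^T, S_t = Ht Ht^T.

    X: (N, D) data matrix, y: (N,) class labels. The columns live in R^D, so the
    D x D scatter matrices never need to be formed explicitly.
    """
    m = X.mean(axis=0)
    Hb_cols, Hw_blocks = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Hb_cols.append(np.sqrt(len(Xc)) * (mc - m))   # one column per class
        Hw_blocks.append((Xc - mc).T)                 # one column per sample
    Hb = np.column_stack(Hb_cols)                     # D x C
    Hw = np.hstack(Hw_blocks)                         # D x N
    Ht = (X - m).T                                    # D x N
    return Hb, Hw, Ht

# Sanity check of Eq. (6), feasible here because D is small.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = rng.integers(0, 3, size=30)
Hb, Hw, Ht = precursor_matrices(X, y)
assert np.allclose(Hb @ Hb.T + Hw @ Hw.T, Ht @ Ht.T)
```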
2.1. Linear Discriminant Analysis

Given the data matrix $A$, which can be divided into $C$ classes, we seek a linear transformation matrix $\Phi \in \mathbb{R}^{D \times d}$, where $d < D$, that maps the high-dimensional data to a low-dimensional space. From the perspective of pattern classification, LDA aims to find the optimal transformation $\Phi$ such that the projected data are well separated. Usually, two types of criteria are used to measure the separability of classes [3]. One type gives an upper bound on the Bayes error, e.g., the Bhattacharyya distance. The other type is based on scatter matrices; as shown in Equation (1), Fisher's criterion belongs to the latter type. The solution of the criterion is given by the generalized eigenvectors and eigenvalues (see Equation 2). As mentioned above, if $S_w$ is nonsingular the problem can be solved by the eigendecomposition $S_w^{-1} S_b \phi = \lambda \phi$. Otherwise, $S_w$ is singular and we have to circumvent the nonsingularity requirement, for example via LDA/GSVD [5].

2.2. Generalized SVD

The Generalized Singular Value Decomposition (GSVD) was developed by Van Loan et al. [4]. We briefly review the mechanism of the GSVD using LDA as an example. Howland et al. [5] extended the applicability of LDA to the case when $S_w$ is singular. This is done by simultaneously diagonalizing the scatter matrices via the GSVD [4]. First, to reduce the computational load, $H_b$ and $H_w$ are used instead of $S_b$ and $S_w$. Then, based on the GSVD, there exist orthogonal matrices $Y \in \mathbb{R}^{C \times C}$, $Z \in \mathbb{R}^{N \times N}$ and a nonsingular matrix $X \in \mathbb{R}^{D \times D}$ such that

$$Y^\top H_b^\top X = [\Sigma_b,\; 0], \qquad (7)$$

$$Z^\top H_w^\top X = [\Sigma_w,\; 0], \qquad (8)$$

where

$$\Sigma_b = \begin{bmatrix} I_b & & \\ & D_b & \\ & & O_b \end{bmatrix}, \qquad \Sigma_w = \begin{bmatrix} O_w & & \\ & D_w & \\ & & I_w \end{bmatrix}.$$

Since $Y$, $Z$ are orthogonal matrices and $X$ is nonsingular, from Equations (7), (8) we obtain

$$H_b^\top = Y [\Sigma_b,\; 0]\, X^{-1}, \qquad (9)$$

$$H_w^\top = Z [\Sigma_w,\; 0]\, X^{-1}. \qquad (10)$$

The matrices $I_b \in \mathbb{R}^{(r_t - r_w) \times (r_t - r_w)}$ and $I_w \in \mathbb{R}^{(r_t - r_b) \times (r_t - r_b)}$ are identity matrices, $O_b \in \mathbb{R}^{(C - r_b) \times (r_t - r_b)}$ and $O_w \in \mathbb{R}^{(N - r_w) \times (r_t - r_w)}$ are rectangular zero matrices which may have no rows or no columns, and $D_b = \mathrm{diag}(\alpha_{r_t - r_w + 1}, \ldots, \alpha_{r_b})$ and $D_w = \mathrm{diag}(\beta_{r_t - r_w + 1}, \ldots, \beta_{r_b})$ satisfy

$$1 > \alpha_{r_t - r_w + 1} \ge \cdots \ge \alpha_{r_b} > 0, \qquad 0 < \beta_{r_t - r_w + 1} \le \cdots \le \beta_{r_b} < 1, \qquad \alpha_i^2 + \beta_i^2 = 1.$$

Thus,

$$\Sigma_b^\top \Sigma_b + \Sigma_w^\top \Sigma_w = I, \qquad (11)$$

where $I \in \mathbb{R}^{r_t \times r_t}$ is an identity matrix. The columns of $X$, the generalized singular vectors of the matrix pair $(H_b^\top, H_w^\top)$, can be used as the discriminant feature subspace based on the GSVD.

2.3. Fukunaga-Koontz Transform

The FKT was designed for the 2-class recognition problem. Given data matrices $A_1$ and $A_2$ from two classes, the autocorrelation matrices $S_1 = A_1 A_1^\top$ and $S_2 = A_2 A_2^\top$ are positive semi-definite (p.s.d.) and symmetric. The sum of these two matrices is still p.s.d. and symmetric, and can be factorized in the form

$$S = S_1 + S_2 = [U,\; U_0] \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U^\top \\ U_0^\top \end{bmatrix}. \qquad (12)$$

Without loss of generality, $S$ may be singular with $r = \mathrm{Rank}(S) < D$; thus $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_r)$ with $\lambda_1 \ge \cdots \ge \lambda_r > 0$. Here $U \in \mathbb{R}^{D \times r}$ is the set of eigenvectors corresponding to the nonzero eigenvalues and $U_0 \in \mathbb{R}^{D \times (D - r)}$ is the orthogonal complement of $U$. We can now whiten $S$ by the transformation operator $P = U D^{-1/2}$. The sum of the two matrices $S_1$, $S_2$ becomes

$$P^\top S P = P^\top (S_1 + S_2) P = \tilde{S}_1 + \tilde{S}_2 = I, \qquad (13)$$

where $\tilde{S}_1 = P^\top S_1 P$, $\tilde{S}_2 = P^\top S_2 P$, and $I \in \mathbb{R}^{r \times r}$ is an identity matrix. Suppose $v$ is an eigenvector of $\tilde{S}_1$ with eigenvalue $\lambda_1$, that is, $\tilde{S}_1 v = \lambda_1 v$. Since $\tilde{S}_1 = I - \tilde{S}_2$, we can rewrite this as

$$(I - \tilde{S}_2)\, v = \lambda_1 v, \qquad (14)$$

$$\tilde{S}_2 v = (1 - \lambda_1)\, v. \qquad (15)$$

This means that $\tilde{S}_2$ has the same eigenvector as $\tilde{S}_1$, but the corresponding eigenvalue is $\lambda_2 = 1 - \lambda_1$. Consequently, the dominant eigenvector of $\tilde{S}_1$ is the weakest eigenvector of $\tilde{S}_2$, and vice versa.

3. Theory

In this section, we first apply the FKT to $S_b$ and $S_w$, which decomposes the whole data space $S_t$ into four subspaces. We then explain the relationship between LDA, the FKT and the GSVD based on the generalized eigenvalue. This gives insight into the different discriminant subspace analyses. In this paper we focus on linear discriminant subspace analysis, but our approach can easily be extended to nonlinear discriminant subspace analysis: for example, based on the kernel method, we can apply the FKT to kernelized $S_b$ and $S_w$.

Figure 1. The whole data space is decomposed into four subspaces via the FKT (the eigenvalues $\lambda_b$ and $\lambda_w$ are plotted against the dimension, with boundaries at $r_t - r_w$, $r_b$, $r_t$ and $D$). In $U_0$, the null space of $S_t$, there is no discriminant information. In $U$, the sum $\lambda_b + \lambda_w$ is equal to 1. Note that we represent all possible subspaces; in practice, some of them may not exist.

3.1. LDA/FKT

Generally speaking, the LDA problem involves more than 2 classes. To handle multiple classes, we replace the autocorrelation matrices $S_1$ and $S_2$ with the scatter matrices $S_b$ and $S_w$. Since $S_b$, $S_w$ and $S_t$ are p.s.d. and symmetric, and $S_t = S_b + S_w$, we can apply the FKT to $S_b$, $S_w$ and $S_t$, which we shall henceforth call LDA/FKT. The whole data space is decomposed into two subspaces, $U$ and $U_0$ (see Fig. 1). On the one hand, $U_0$ is the set of eigenvectors corresponding to the zero eigenvalues of $S_t$; it is the intersection of the null spaces of $S_b$ and $S_w$, and contains no discriminant information. On the other hand, $U$ is the set of eigenvectors corresponding to the nonzero eigenvalues of $S_t$; it contains the discriminant information. Based on the FKT, $\tilde{S}_b = P^\top S_b P$ and $\tilde{S}_w = P^\top S_w P$ share the same eigenvectors, and the two eigenvalues corresponding to the same eigenvector sum to 1:

$$\tilde{S}_b = V \Lambda_b V^\top, \qquad (16)$$

$$\tilde{S}_w = V \Lambda_w V^\top, \qquad (17)$$

$$I = \Lambda_b + \Lambda_w, \qquad (18)$$

where $V \in \mathbb{R}^{r_t \times r_t}$ is the orthogonal eigenvector matrix and $\Lambda_b, \Lambda_w \in \mathbb{R}^{r_t \times r_t}$ are diagonal eigenvalue matrices.
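The following sketch (ours; the function name is an assumption, not the authors' code) carries out LDA/FKT exactly as just described: whiten $S_t$ on its range with $P = U D^{-1/2}$, diagonalize the whitened $S_b$, and read off $\lambda_w = 1 - \lambda_b$ for the shared eigenvectors.

```python
import numpy as np

def lda_fkt_eigenvalues(Sb, Sw, tol=1e-10):
    """Apply the FKT to (S_b, S_w): whiten S_t = S_b + S_w on its range,
    then diagonalize the whitened S_b. Returns (lambda_b, lambda_w, V, P)."""
    St = Sb + Sw
    evals, evecs = np.linalg.eigh(St)
    keep = evals > tol * evals.max()      # range of S_t (columns of U)
    U, D = evecs[:, keep], evals[keep]
    P = U / np.sqrt(D)                    # whitening operator P = U D^{-1/2}
    Sb_t = P.T @ Sb @ P                   # whitened between-class scatter
    Sw_t = P.T @ Sw @ P                   # whitened within-class scatter
    lam_b, V = np.linalg.eigh(Sb_t)       # shared eigenvectors V
    lam_w = np.diag(V.T @ Sw_t @ V)       # same eigenvectors, eigenvalues 1 - lam_b
    return lam_b, lam_w, V, P
```

Calling this on the scatter matrices of any labelled sample and checking `np.allclose(lam_b + lam_w, 1)` reproduces Equation (18).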
According to the eigenvalue ratio $\lambda_b / \lambda_w$, $U$ can be further decomposed into three subspaces. To keep the integrity of the whole data space, we include $U_0$ as the fourth subspace (see Fig. 1):

1. Subspace 1: the span of the eigenvectors $\{v_i\}$ corresponding to $\lambda_w = 0$ and $\lambda_b = 1$. Since $\lambda_w = 0$, the eigenvalue ratio is maximized in this subspace.

2. Subspace 2: the span of the eigenvectors $\{v_i\}$ corresponding to $0 < \lambda_w < 1$ and $0 < \lambda_b < 1$. Since $0 < \lambda_b < 1$, the eigenvalue ratio is smaller than that of Subspace 1.

3. Subspace 3: the span of the eigenvectors $\{v_i\}$ corresponding to $\lambda_w = 1$ and $\lambda_b = 0$. Since $\lambda_b = 0$, the eigenvalue ratio is minimal.

4. Subspace 4: $U_0$, the span of the eigenvectors corresponding to the zero eigenvalues of $S_t$.

Note that in practice any of these four subspaces may fail to exist, depending on the ranks of $S_b$, $S_w$ and $S_t$. For example, if $r_t = r_w + r_b$, then Subspace 2 does not exist. As illustrated in Fig. 1, the null space of $S_w$ is the union of Subspaces 1 and 4, and the null space of $S_b$ is the union of Subspaces 3 and 4, if they exist.

3.2. Relationship between LDA, GSVD and FKT

How do these four subspaces help to maximize the Fisher criterion $J_F$? We explain this in the following theorem, which connects the generalized eigenvalue of $J_F$ to the eigenvalues of the FKT. We begin with a lemma.

Lemma 1. For the LDA problem, the GSVD is equivalent to the FKT, with $X = [U D^{-1/2} V,\; U_0]$, $\Lambda_b = \Sigma_b^\top \Sigma_b$ and $\Lambda_w = \Sigma_w^\top \Sigma_w$, where $X$, $\Sigma_b$ and $\Sigma_w$ are from the GSVD (see Equations 7, 8), and $U$, $U_0$, $D$, $V$, $\Lambda_b$ and $\Lambda_w$ are from the FKT (see Equations 16, 17, 18).

Proof. (GSVD $\Rightarrow$ FKT) Based on the GSVD,

$$S_b = H_b H_b^\top = X^{-\top} \begin{bmatrix} \Sigma_b^\top \Sigma_b & 0 \\ 0 & 0 \end{bmatrix} X^{-1}, \qquad (19)$$

$$S_w = H_w H_w^\top = X^{-\top} \begin{bmatrix} \Sigma_w^\top \Sigma_w & 0 \\ 0 & 0 \end{bmatrix} X^{-1}. \qquad (20)$$

Thus,

$$X^\top (S_b + S_w)\, X = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}. \qquad (21)$$

Since $\Sigma_b^\top \Sigma_b + \Sigma_w^\top \Sigma_w = I \in \mathbb{R}^{r_t \times r_t}$, if we choose the first $r_t$ columns of $X$ as $P$, then $P^\top (S_b + S_w) P = I$. This is exactly the outcome of the FKT. Meanwhile, we conclude that $\Lambda_b = \Sigma_b^\top \Sigma_b$ and $\Lambda_w = \Sigma_w^\top \Sigma_w$.

(FKT $\Rightarrow$ GSVD) Based on the FKT, $P = U D^{-1/2}$ and

$$\tilde{S}_b = P^\top S_b P = D^{-1/2} U^\top H_b H_b^\top U D^{-1/2}, \qquad (22)$$

$$\tilde{S}_b = V \Lambda_b V^\top. \qquad (23)$$

Hence

$$D^{-1/2} U^\top H_b H_b^\top U D^{-1/2} = V \Lambda_b V^\top. \qquad (24)$$

In general, the decomposition in the above equation is not unique, because $H_b H_b^\top = H_b Y Y^\top H_b^\top$ for any orthogonal matrix $Y \in \mathbb{R}^{C \times C}$. That is,

$$D^{-1/2} U^\top H_b Y Y^\top H_b^\top U D^{-1/2} = V \Lambda_b V^\top, \qquad (25)$$

$$Y^\top H_b^\top U D^{-1/2} = \Sigma_b V^\top, \qquad (26)$$

$$Y^\top H_b^\top U D^{-1/2} V = \Sigma_b, \qquad (27)$$

where $\Sigma_b \in \mathbb{R}^{C \times r_t}$ and $\Lambda_b = \Sigma_b^\top \Sigma_b$. If we define $X = [U D^{-1/2} V,\; U_0] \in \mathbb{R}^{D \times D}$, then

$$Y^\top H_b^\top X = Y^\top H_b^\top [U D^{-1/2} V,\; U_0] \qquad (28)$$

$$= [Y^\top H_b^\top U D^{-1/2} V,\; 0] \qquad (29)$$

$$= [\Sigma_b,\; 0]. \qquad (30)$$

Here $H_b^\top U_0 = 0$ and $H_w^\top U_0 = 0$ because $U_0$ is the intersection of the null spaces of $S_b$ and $S_w$. Similarly, we obtain $Z^\top H_w^\top X = [\Sigma_w,\; 0]$, where $Z \in \mathbb{R}^{N \times N}$ is an arbitrary orthogonal matrix, $\Sigma_w \in \mathbb{R}^{N \times r_t}$, and $\Lambda_w = \Sigma_w^\top \Sigma_w$. Since $\Lambda_b + \Lambda_w = I$ with $I \in \mathbb{R}^{r_t \times r_t}$ an identity matrix, it is easy to check that $\Sigma_b^\top \Sigma_b + \Sigma_w^\top \Sigma_w = I$, which satisfies the constraint of the GSVD.

It remains to prove that $X$ is nonsingular:

$$X X^\top = [U D^{-1/2} V,\; U_0] \begin{bmatrix} V^\top D^{-1/2} U^\top \\ U_0^\top \end{bmatrix} = U D^{-1} U^\top + U_0 U_0^\top = [U,\; U_0] \begin{bmatrix} D^{-1} & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} U^\top \\ U_0^\top \end{bmatrix}, \qquad (31)$$

where $V \in \mathbb{R}^{r_t \times r_t}$ and $[U, U_0]$ are orthogonal matrices. Note that $U^\top U_0 = 0$ and $U_0^\top U = 0$. Thus $X X^\top$ has an eigendecomposition with strictly positive eigenvalues, which means $X$ is nonsingular. ∎
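The construction in the proof can be checked numerically. The sketch below is ours (it reuses the hypothetical `precursor_matrices` helper sketched in Section 2); it builds $X = [U D^{-1/2} V,\; U_0]$ for a small undersampled data set and verifies that $X^\top (S_b + S_w) X = \mathrm{diag}(I_{r_t}, 0)$ as in Equation (21), and that $X$ simultaneously diagonalizes $S_b$ and $S_w$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(12, 20))            # N=12 samples in D=20 dimensions (undersampled)
y = rng.integers(0, 3, size=12)

Hb, Hw, Ht = precursor_matrices(A, y)    # helper from the Section 2 sketch
Sb, Sw = Hb @ Hb.T, Hw @ Hw.T
St = Sb + Sw

evals, evecs = np.linalg.eigh(St)
keep = evals > 1e-10 * evals.max()
U, D, U0 = evecs[:, keep], evals[keep], evecs[:, ~keep]
P = U / np.sqrt(D)                       # P = U D^{-1/2}
lam_b, V = np.linalg.eigh(P.T @ Sb @ P)  # FKT on the whitened S_b

X = np.hstack([P @ V, U0])               # X = [U D^{-1/2} V, U_0]
r_t = int(keep.sum())
T = X.T @ St @ X                         # should equal diag(I_{r_t}, 0), cf. Eq. (21)
assert np.allclose(T[:r_t, :r_t], np.eye(r_t), atol=1e-8)
assert np.allclose(T[r_t:, :], 0, atol=1e-8) and np.allclose(T[:, r_t:], 0, atol=1e-8)

B = X.T @ Sb @ X                         # equals diag(Lambda_b, 0), hence diagonal
assert np.allclose(B, np.diag(np.diag(B)), atol=1e-8)
```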

Based on the above lemma, we can now relate the eigenvalue ratio of the FKT to the generalized eigenvalue $\lambda$ of the Fisher criterion $J_F$.

Theorem 1. If $\lambda$ is a solution of Equation (2) (a generalized eigenvalue of $S_b$ and $S_w$), and $\lambda_b$ and $\lambda_w$ are the corresponding eigenvalues obtained by applying the FKT to $S_b$ and $S_w$, then $\lambda = \lambda_b / \lambda_w$, where $\lambda_b + \lambda_w = 1$.

Proof. Based on the GSVD, it is easy to verify that

$$S_b = H_b H_b^\top = X^{-\top} \begin{bmatrix} \Sigma_b^\top \Sigma_b & 0 \\ 0 & 0 \end{bmatrix} X^{-1}. \qquad (32)$$

According to our lemma, $\Lambda_b = \Sigma_b^\top \Sigma_b$, thus

$$S_b = X^{-\top} \begin{bmatrix} \Lambda_b & 0 \\ 0 & 0 \end{bmatrix} X^{-1}. \qquad (33)$$

Similarly, we can rewrite $S_w$ with $\Lambda_w$. Since $S_b \phi = \lambda S_w \phi$,

$$X^{-\top} \begin{bmatrix} \Lambda_b & 0 \\ 0 & 0 \end{bmatrix} X^{-1} \phi = \lambda\, X^{-\top} \begin{bmatrix} \Lambda_w & 0 \\ 0 & 0 \end{bmatrix} X^{-1} \phi. \qquad (34)$$

Let $v = X^{-1} \phi$ and multiply both sides by $X^\top$ to obtain

$$\begin{bmatrix} \Lambda_b & 0 \\ 0 & 0 \end{bmatrix} v = \lambda \begin{bmatrix} \Lambda_w & 0 \\ 0 & 0 \end{bmatrix} v. \qquad (35)$$

If we add $\lambda \begin{bmatrix} \Lambda_b & 0 \\ 0 & 0 \end{bmatrix} v$ to both sides of Equation (35), then

$$(1 + \lambda) \begin{bmatrix} \Lambda_b & 0 \\ 0 & 0 \end{bmatrix} v = \lambda \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} v. \qquad (36)$$

This means that $(1 + \lambda)\lambda_b = \lambda$, which can be rewritten as $\lambda_b = \lambda(1 - \lambda_b) = \lambda \lambda_w$ because $\lambda_b + \lambda_w = 1$. Hence $\lambda = \lambda_b / \lambda_w$. ∎

Corollary 1. If $\lambda$ is the generalized eigenvalue of $S_b$ and $S_w$, $\alpha$ and $\beta$ are the solutions of Equations (7), (8), and $\alpha/\beta$ is the generalized singular value of the matrix pair $(H_b^\top, H_w^\top)$, then $\lambda = \alpha^2 / \beta^2$, where $\alpha^2 + \beta^2 = 1$.

Proof. In Lemma 1 we proved that $\Lambda_b = \Sigma_b^\top \Sigma_b$ and $\Lambda_w = \Sigma_w^\top \Sigma_w$, that is, $\lambda_b = \alpha^2$ and $\lambda_w = \beta^2$. According to Theorem 1, $\lambda = \lambda_b / \lambda_w$. Therefore $\lambda = \alpha^2 / \beta^2$. ∎

Note that $\alpha/\beta$ is the generalized singular value of $(H_b^\top, H_w^\top)$ given by the GSVD, and $\lambda$ is the generalized eigenvalue of $(S_b, S_w)$. The Corollary suggests how to evaluate the discriminant subspaces of LDA/GSVD. Howland et al. [5] applied the Corollary implicitly; in this paper we explicitly connect the generalized singular value $\alpha/\beta$ with $\lambda$, the measure of discriminability. Based on our analysis, the eigenvalue ratio $\lambda_b / \lambda_w$ and the square of the generalized singular value, $\alpha^2 / \beta^2$, are both equal to the generalized eigenvalue $\lambda$. Therefore, according to Fig. 1, Subspace 1, with its infinite eigenvalue ratio, is the most discriminant subspace, followed by Subspace 2 and Subspace 3. Subspace 4, which contains no discriminant information, can be safely discarded.

Figure 2. Algorithm: applying the QR decomposition to compute LDA/FKT.
Input: the data matrix $A$. Output: a projection matrix $\Phi_F$ such that $J_F$ is maximized.
1. Compute $H_b$ and $H_t$ from the data matrix $A$: $H_b = [\sqrt{N_1}(m_1 - m), \ldots, \sqrt{N_C}(m_C - m)]$, $H_t = [a_1 - m, \ldots, a_N - m]$.
2. Apply the QR decomposition $H_t = QR$, where $Q \in \mathbb{R}^{D \times r_t}$, $R \in \mathbb{R}^{r_t \times N}$ and $r_t = \mathrm{Rank}(H_t)$.
3. Let $\bar{S}_t = R R^\top$, since $\bar{S}_t = Q^\top S_t Q = Q^\top H_t H_t^\top Q = R R^\top$.
4. Let $Z = Q^\top H_b$.
5. Let $\bar{S}_b = Z Z^\top$, since $\bar{S}_b = Q^\top S_b Q = Q^\top H_b H_b^\top Q = Z Z^\top$.
6. Compute the eigenvectors $\{v_i\}$ and eigenvalues $\{\lambda_i\}$ of $\bar{S}_t^{-1} \bar{S}_b$.
7. Sort the eigenvectors $v_i$ according to $\lambda_i$ in decreasing order: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > \lambda_{k+1} = \cdots = \lambda_{r_t} = 0$.
8. The final projection matrix is $\Phi_F = Q V$, where $V = [v_1, \ldots, v_{r_t}]$. Note that we can choose the subspaces based on the columns of $V$.

3.3. Algorithm

Although we proved that the FKT is equivalent to the GSVD for the LDA problem, LDA/GSVD is computationally expensive: its running time is $O(D(N + C)^2)$, where $D$ is the dimensionality, $N$ is the number of training samples and $C$ is the number of classes. We therefore propose an efficient way to compute Subspaces 1, 2 and 3 of LDA/FKT based on the QR decomposition (Subspace 4 contains no discriminant information and is omitted). Moreover, we use the smaller matrices $H_b$ and $H_t$ because the matrices $S_b$, $S_w$ and $S_t$ may be too large to be formed. Our LDA/FKT algorithm is shown in Fig. 2; its running time is $O(D N^2)$.
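The procedure in Fig. 2 translates almost line for line into NumPy. The sketch below is our reading of it (the function and variable names are ours, not the authors' code); step 6 is implemented as the equivalent generalized eigenproblem $\bar{S}_b v = \lambda_b \bar{S}_t v$ on the reduced $r_t \times r_t$ matrices.

```python
import numpy as np
from scipy.linalg import qr, eigh

def lda_fkt_qr(X, y):
    """LDA/FKT via QR decomposition, following the steps of Fig. 2.

    X: (N, D) data, y: (N,) labels. Returns the projection matrix Phi_F (D x r_t),
    columns sorted from most to least discriminative, and the eigenvalues lambda_b.
    """
    m = X.mean(axis=0)
    classes = np.unique(y)
    # Step 1: precursor matrices H_b (D x C) and H_t (D x N).
    Hb = np.column_stack([np.sqrt((y == c).sum()) * (X[y == c].mean(axis=0) - m)
                          for c in classes])
    Ht = (X - m).T
    # Step 2: rank-revealing (pivoted) QR of H_t; the column permutation does not
    # change H_t H_t^T, so R R^T still equals Q^T S_t Q below.
    Q, R, _ = qr(Ht, mode='economic', pivoting=True)
    r_t = np.linalg.matrix_rank(R)
    Q, R = Q[:, :r_t], R[:r_t, :]
    # Steps 3-5: reduced r_t x r_t scatter matrices.
    St_bar = R @ R.T                 # = Q^T S_t Q
    Z = Q.T @ Hb
    Sb_bar = Z @ Z.T                 # = Q^T S_b Q
    # Steps 6-7: eigenpairs of St_bar^{-1} Sb_bar, sorted in decreasing order.
    lam, V = eigh(Sb_bar, St_bar)    # generalized problem Sb_bar v = lam St_bar v
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    # Step 8: final projection matrix.
    return Q @ V, lam
```

Columns of $\Phi_F$ whose eigenvalue is numerically 1 span Subspace 1, and columns with eigenvalue 0 span Subspace 3 and can be dropped.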

4. Experiments

4.1. A Toy Problem

First, we experiment on synthetic data: three Gaussian classes with the same covariance matrix but different means. In 3D space, the three classes share the same diagonal covariance matrix and contain the same number of points; they differ only in their means (see Fig. 3(a)).

Figure 3. The original 3 classes (triangle, circle and cross) in 3D space (a), and the 2D projections by (b) LDA/FKT and (c) Fisherface. Observe that the 2D projection of the FKT is more separable than that of Fisherface.

Table 1. Eigenvalue ratio and generalized eigenvalue for the toy problem: the LDA/FKT ratio $\lambda_b / \lambda_w$, the LDA/GSVD value $\alpha^2 / \beta^2$, and the LDA generalized eigenvalue $\lambda$ for each discriminant direction.

As shown in Fig. 4, LDA/FKT decomposes the whole data space into three subspaces: 1, 2 and 3. Here Subspace 4 does not exist because the number of samples is larger than the dimension (3). Note that each subspace consists of only one eigenvector. In terms of the eigenvalue ratio, Subspace 1 is the most discriminative subspace, followed by Subspaces 2 and 3. But if we use Fisherface (PCA followed by LDA), we cannot obtain Subspace 1, because Fisherface is restricted to Subspaces 2 and 3, where $\lambda_w > 0$ so that $S_w$ is invertible. In doing so, Fisherface discards the most discriminative information (Subspace 1). To compare LDA/FKT with Fisherface, we project the original 3D data onto Subspaces 1 and 2, and we also project to 2D via Fisherface. As illustrated in Figs. 3(b) and 3(c), the 2D projection of the FKT is more separable than that of Fisherface, which is consistent with our theory.

Let us recall the Corollary: the eigenvalue ratio of LDA/FKT is equal to the square of the generalized singular value of LDA/GSVD. To check this, we also apply the GSVD to $H_b$ and $H_w$, which is known as LDA/GSVD [5], and obtain the generalized singular values $\alpha/\beta$. From Table 1, we see that $\alpha^2/\beta^2 \approx \lambda_b/\lambda_w$ (the difference depends on the machine precision). Thus we also validate our theory experimentally. Moreover, this means we can compute the generalized eigenvalue $\lambda$ of LDA even though it cannot be obtained directly when $S_w$ is singular.

4.2. Face Recognition

We now apply LDA/FKT to a real problem: face recognition. We perform experiments on the CMU-PIE [7] face dataset. The CMU-PIE database consists of over 40,000 images of 68 subjects. Each subject was recorded across 13 different poses, under 43 different illuminations (with and without background lighting), and with 4 different expressions. Here we use a subset of PIE for our experiments: we choose 67 of the 68 subjects (one subject has fewer frontal face images), and each subject has 24 frontal face images with background lighting. All of these face images are aligned based on the eye coordinates and cropped. Fig. 5 shows a sample of the PIE face images used in our experiments; the major challenge on this data set is face recognition under different illuminations.

To evaluate the performance of LDA/FKT, we compare it with PCA and Fisherface. We employ 1-NN classification after projection onto the respective subspaces. For PCA, the projected subspace contains 66 dimensions; for Fisherface, we first take enough principal components to make $S_w$ nonsingular and then apply LDA; for LDA/FKT, we project the data onto Subspaces 1 and 2. In face recognition we usually have an undersampled problem, which is also the reason for the singularity of $S_w$. To evaluate performance in this situation, we randomly choose $n$ training samples from each subject, for a range of values of $n$, and use the remaining images for testing. Moreover, for each value of $n$ we repeat the random sampling to compute the mean and standard deviation of the classification accuracies.
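To make the protocol concrete, here is a minimal sketch of the evaluation loop (ours, not the authors' code). It assumes the hypothetical `lda_fkt_qr` helper from the Section 3.3 sketch and arrays `images` of shape (num_images, D) and `labels`; the number of repetitions and the retained dimensionality are our own choices.

```python
import numpy as np

def one_nn_accuracy(train_X, train_y, test_X, test_y):
    """1-NN classification in the projected space (Euclidean distance)."""
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    pred = train_y[np.argmin(d2, axis=1)]
    return (pred == test_y).mean()

def evaluate(images, labels, n_train, n_repeats=10, seed=0):
    """Random n_train/rest split per subject, repeated; returns (mean, std) accuracy."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_repeats):
        train_idx, test_idx = [], []
        for c in np.unique(labels):
            idx = rng.permutation(np.flatnonzero(labels == c))
            train_idx.extend(idx[:n_train])
            test_idx.extend(idx[n_train:])
        Xtr, ytr = images[train_idx], labels[train_idx]
        Xte, yte = images[test_idx], labels[test_idx]
        Phi, lam = lda_fkt_qr(Xtr, ytr)        # hypothetical helper from Sec. 3.3 sketch
        k = max(int((lam > 1e-8).sum()), 1)    # keep Subspaces 1 and 2 (lambda_b > 0)
        accs.append(one_nn_accuracy(Xtr @ Phi[:, :k], ytr, Xte @ Phi[:, :k], yte))
    return float(np.mean(accs)), float(np.std(accs))
```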
As illustrated in Fig. 6, the more training samples we use, the better the recognition accuracy. More specifically, for each method, increasing the number of training samples increases the mean recognition rate and decreases the standard deviation.

Figure 4. Eigenvalue curves ($\lambda_b$ and $\lambda_w$ versus dimension) for the toy problem under LDA/FKT. Note that each subspace consists of only one eigenvector.

Figure 5. A sample of face images from the PIE dataset. Each subject has 24 frontal face images under different lighting conditions.

Figure 6. Recognition rate (%) on PIE versus the number of training samples per class, for PCA, Fisherface and LDA/FKT. We show the mean rate and standard deviation over the repeated runs.

Fig. 6 also shows that no matter how many training samples are used, LDA/FKT consistently outperforms PCA and Fisherface. This is easily explained by the subspaces used in each method. LDA/FKT uses Subspaces 1 and 2, which are the most discriminative. Fisherface ignores Subspace 1 and operates in Subspaces 2 and 3. PCA has no notion of discriminative subspaces at all, since it is designed for pattern representation rather than classification [2, 3]. Note that even when we have only 2 or 4 training samples per subject, LDA/FKT still obtains the best recognition accuracy (about 99%) with the smallest standard deviation (below 1%). This means LDA/FKT can handle the small sample size problem with high and stable performance, which is not the case for PCA or Fisherface.

5. Conclusion

In this paper, we showed how the FKT provides valuable insights into LDA by exposing the four subspaces that make up the entire data space. These subspaces differ in discriminability because each has a different value of Fisher's Criterion. We also proved that the common technique of applying PCA followed by LDA (a.k.a. Fisherface) is not optimal, because it discards Subspace 1, the most discriminative subspace. Our results also showed how the FKT, originally proposed for 2-class problems, can be generalized to handle multiple classes; this is achieved by replacing the autocorrelation matrices $S_1$ and $S_2$ with the scatter matrices $S_b$ and $S_w$. In the near future, we intend to investigate further the insights that the FKT reveals about LDA. Another interesting direction is to extend our theory to nonlinear discriminant analysis, for example using the kernel trick employed in SVMs to construct kernelized between-class and within-class scatter matrices. The FKT may yet again reveal new insights into kernelized LDA.

References

[1] P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 1997.
[2] R. Duda, P. Hart, and D. Stork. Pattern Classification, 2nd Edition. John Wiley and Sons, 2001.
[3] K. Fukunaga. Introduction to Statistical Pattern Recognition, 2nd Edition. Academic Press, 1990.
[4] G. Golub and C. Van Loan. Matrix Computations, 3rd Edition. Johns Hopkins Univ. Press, 1996.
[5] P. Howland and H. Park. Generalizing discriminant analysis using the generalized singular value decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8):995-1006, August 2004.
[6] I. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.
[7] T. Sim, S. Baker, and M. Bsat. The CMU pose, illumination, and expression database. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12):1615-1618, December 2003.
[8] M. Turk and A. Pentland. Eigenfaces for Recognition. Journal of Cognitive Neuroscience, 3(1), 1991.
