Two-Dimensional Linear Discriminant Analysis


Jieping Ye, Department of CSE, University of Minnesota
Ravi Janardan, Department of CSE, University of Minnesota
Qi Li, Department of CIS, University of Delaware

Abstract

Linear Discriminant Analysis (LDA) is a well-known scheme for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as face recognition and image retrieval. An intrinsic limitation of classical LDA is the so-called singularity problem, that is, it fails when all scatter matrices are singular. A well-known approach to deal with the singularity problem is to apply an intermediate dimension reduction stage using Principal Component Analysis (PCA) before LDA. The algorithm, called PCA+LDA, is used widely in face recognition. However, PCA+LDA has high costs in time and space, due to the need for an eigen-decomposition involving the scatter matrices. In this paper, we propose a novel LDA algorithm, namely 2DLDA, which stands for 2-Dimensional Linear Discriminant Analysis. 2DLDA overcomes the singularity problem implicitly, while achieving efficiency. The key difference between 2DLDA and classical LDA lies in the model for data representation. Classical LDA works with vectorized representations of data, while the 2DLDA algorithm works with data in matrix representation. To further reduce the dimension by 2DLDA, the combination of 2DLDA and classical LDA, namely 2DLDA+LDA, is studied, where LDA is preceded by 2DLDA. The proposed algorithms are applied to face recognition and compared with PCA+LDA. Experiments show that 2DLDA and 2DLDA+LDA achieve competitive recognition accuracy, while being much more efficient.

1 Introduction

Linear Discriminant Analysis [2, 4] is a well-known scheme for feature extraction and dimension reduction. It has been used widely in many applications such as face recognition [1], image retrieval [6], microarray data classification [3], etc. Classical LDA projects the data onto a lower-dimensional vector space such that the ratio of the between-class distance to the within-class distance is maximized, thus achieving maximum discrimination. The optimal projection (transformation) can be readily computed by applying an eigen-decomposition on the scatter matrices. An intrinsic limitation of classical LDA is that its objective function requires the nonsingularity of one of the scatter matrices. For many applications, such as face recognition, all scatter matrices in question can be singular since the data is from a very high-dimensional space and, in general, the dimension exceeds the number of data points. This is known as the undersampled or singularity problem [5].

In recent years, many approaches have been brought to bear on such high-dimensional, undersampled problems, including pseudo-inverse LDA, PCA+LDA, and regularized LDA. More details can be found in [5]. Among these LDA extensions, PCA+LDA has received a lot of attention, especially for face recognition [1]. In this two-stage algorithm, an intermediate dimension reduction stage using PCA is applied before LDA. The common aspect of previous LDA extensions is the computation of the eigen-decomposition of certain large matrices, which not only degrades the efficiency but also makes it hard to scale them to large datasets.

In this paper, we present a novel approach to alleviate the expensive computation of the eigen-decomposition in previous LDA extensions. The novelty lies in a different data representation model. Under this model, each datum is represented as a matrix, instead of as a vector, and the collection of data is represented as a collection of matrices, instead of as a single large matrix. This model has been previously used in [8, 9, 7] for the generalization of SVD and PCA. Unlike classical LDA, we consider the projection of the data onto a space which is the tensor product of two vector spaces. We formulate our dimension reduction problem as an optimization problem in Section 3. Unlike classical LDA, there is no closed-form solution for the optimization problem; instead, we derive a heuristic, namely 2DLDA. To further reduce the dimension, which is desirable for efficient querying, we consider the combination of 2DLDA and LDA, namely 2DLDA+LDA, where the dimension of the space transformed by 2DLDA is further reduced by LDA. We perform experiments on three well-known face datasets to evaluate the effectiveness of 2DLDA and 2DLDA+LDA and compare with PCA+LDA, which is used widely in face recognition. Our experiments show that: (1) 2DLDA is applicable to high-dimensional undersampled data such as face images, i.e., it implicitly avoids the singularity problem encountered in classical LDA; and (2) 2DLDA and 2DLDA+LDA have distinctly lower costs in time and space than PCA+LDA, and achieve classification accuracy that is competitive with PCA+LDA.

2 An overview of LDA

In this section, we give a brief overview of classical LDA. Some of the important notations used in the rest of this paper are listed in Table 1. Given a data matrix $A \in \mathbb{R}^{N \times n}$, classical LDA aims to find a transformation $G \in \mathbb{R}^{N \times \ell}$ that maps each column $a_i$ of $A$, for $1 \le i \le n$, in the $N$-dimensional space to a vector $b_i$ in the $\ell$-dimensional space. That is, $G: a_i \in \mathbb{R}^N \rightarrow b_i = G^T a_i \in \mathbb{R}^{\ell}$ ($\ell < N$). Equivalently, classical LDA aims to find a vector space $\mathcal{G}$ spanned by $\{g_i\}_{i=1}^{\ell}$, where $G = [g_1, \dots, g_\ell]$, such that each $a_i$ is projected onto $\mathcal{G}$ by $(g_1^T a_i, \dots, g_\ell^T a_i)^T \in \mathbb{R}^{\ell}$.

Assume that the original data in $A$ is partitioned into $k$ classes as $A = \{\Pi_1, \dots, \Pi_k\}$, where $\Pi_i$ contains $n_i$ data points from the $i$th class, and $\sum_{i=1}^{k} n_i = n$. Classical LDA aims to find the optimal transformation $G$ such that the class structure of the original high-dimensional space is preserved in the low-dimensional space. In general, if each class is tightly grouped, but well separated from the other classes, the quality of the cluster is considered to be high.

In discriminant analysis, two scatter matrices, called the within-class ($S_w$) and between-class ($S_b$) matrices, are defined to quantify the quality of the cluster, as follows [4]:
$$S_w = \sum_{i=1}^{k} \sum_{x \in \Pi_i} (x - m_i)(x - m_i)^T, \qquad S_b = \sum_{i=1}^{k} n_i (m_i - m)(m_i - m)^T,$$
where $m_i = \frac{1}{n_i} \sum_{x \in \Pi_i} x$ is the mean of the $i$th class, and $m = \frac{1}{n} \sum_{i=1}^{k} \sum_{x \in \Pi_i} x$ is the global mean.
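For concreteness, the following is a minimal NumPy sketch of these two scatter matrices, assuming the vectorized images are stored as the columns of an $N \times n$ array with integer class labels; the function and variable names are illustrative, not from the paper. Note that both matrices are $N \times N$, which is exactly the size that becomes prohibitive for high-dimensional image data.

```python
import numpy as np

def scatter_matrices(A, labels):
    """Within-class (S_w) and between-class (S_b) scatter matrices.

    A      : (N, n) data matrix, one vectorized image per column.
    labels : (n,) integer class labels.
    """
    N, n = A.shape
    m = A.mean(axis=1, keepdims=True)            # global mean, shape (N, 1)
    S_w = np.zeros((N, N))
    S_b = np.zeros((N, N))
    for c in np.unique(labels):
        A_c = A[:, labels == c]                  # samples of class c
        n_c = A_c.shape[1]
        m_c = A_c.mean(axis=1, keepdims=True)    # class mean, shape (N, 1)
        D = A_c - m_c
        S_w += D @ D.T                           # sum of (x - m_c)(x - m_c)^T
        S_b += n_c * (m_c - m) @ (m_c - m).T     # n_c (m_c - m)(m_c - m)^T
    return S_w, S_b
```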

Notation   Description
n          number of images in the dataset
k          number of classes in the dataset
A_i        ith image in matrix representation
a_i        ith image in vectorized representation
r          number of rows in A_i
c          number of columns in A_i
N          dimension of a_i (N = r × c)
Π_j        jth class in the dataset
L          transformation matrix (left) by 2DLDA
R          transformation matrix (right) by 2DLDA
I          number of iterations in 2DLDA
B_i        reduced representation of A_i by 2DLDA
l_1        number of rows in B_i
l_2        number of columns in B_i

Table 1: Notation

It is easy to verify that trace($S_w$) measures the closeness of the vectors within the classes, while trace($S_b$) measures the separation between classes. In the low-dimensional space resulting from the linear transformation $G$ (or the linear projection onto the vector space $\mathcal{G}$), the within-class and between-class scatter matrices become $S_b^L = G^T S_b G$ and $S_w^L = G^T S_w G$. An optimal transformation $G$ would maximize trace($S_b^L$) and minimize trace($S_w^L$). Common optimizations in classical discriminant analysis include (see [4]):
$$\max_G \; \mathrm{trace}\big((S_w^L)^{-1} S_b^L\big) \quad \text{and} \quad \min_G \; \mathrm{trace}\big((S_b^L)^{-1} S_w^L\big). \qquad (1)$$
The optimization problems in Eq. (1) are equivalent to the following generalized eigenvalue problem: $S_b x = \lambda S_w x$, for $\lambda \neq 0$. The solution can be obtained by applying an eigen-decomposition to the matrix $S_w^{-1} S_b$, if $S_w$ is nonsingular, or $S_b^{-1} S_w$, if $S_b$ is nonsingular. There are at most $k-1$ eigenvectors corresponding to nonzero eigenvalues, since the rank of the matrix $S_b$ is bounded from above by $k-1$. Therefore, the reduced dimension by classical LDA is at most $k-1$. A stable way to compute the eigen-decomposition is to apply SVD on the scatter matrices. Details can be found in [6].

Note that a limitation of classical LDA in many applications involving undersampled data, such as text documents and images, is that at least one of the scatter matrices is required to be nonsingular. Several extensions, including pseudo-inverse LDA, regularized LDA, and PCA+LDA, were proposed in the past to deal with the singularity problem. Details can be found in [5].

3 2-Dimensional LDA

The key difference between classical LDA and the 2DLDA algorithm that we propose in this paper is in the representation of data. While classical LDA uses the vectorized representation, 2DLDA works with data in matrix representation. We will see later in this section that the matrix representation in 2DLDA leads to an eigen-decomposition on matrices with much smaller sizes. More specifically, 2DLDA involves the eigen-decomposition of matrices with sizes $r \times r$ and $c \times c$, which are much smaller than the matrices in classical LDA. This dramatically reduces the time and space complexities of 2DLDA over LDA.
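Before deriving 2DLDA, the classical solution summarized above can be sketched as follows: take the leading eigenvectors of $S_w^{-1} S_b$ (at most $k-1$ of them) as the columns of $G$. The sketch below uses SciPy's symmetric-definite generalized eigensolver and therefore assumes $S_w$ is nonsingular (positive definite), which is exactly what fails for undersampled data; the names are illustrative, not from the paper.

```python
import numpy as np
from scipy.linalg import eigh

def lda_transform(S_w, S_b, num_components):
    """Leading generalized eigenvectors of S_b x = lambda S_w x.

    Assumes S_w is symmetric positive definite; for undersampled data this
    breaks down, which is the singularity problem discussed in the text.
    """
    # eigh(S_b, S_w) solves S_b x = lambda S_w x, eigenvalues in ascending order.
    eigvals, eigvecs = eigh(S_b, S_w)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalues first
    return eigvecs[:, order[:num_components]]    # columns span the LDA subspace

# Usage (hypothetical): with k classes there are at most k - 1 useful directions.
# G = lda_transform(S_w, S_b, num_components=k - 1)
# projected = G.T @ A          # A is the (N, n) data matrix; result is (k-1, n)
```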

Unlike classical LDA, 2DLDA considers the following $(\ell_1 \times \ell_2)$-dimensional space $\mathcal{L} \otimes \mathcal{R}$, which is the tensor product of the following two spaces: $\mathcal{L}$ spanned by $\{u_i\}_{i=1}^{\ell_1}$ and $\mathcal{R}$ spanned by $\{v_i\}_{i=1}^{\ell_2}$. Define two matrices $L = [u_1, \dots, u_{\ell_1}] \in \mathbb{R}^{r \times \ell_1}$ and $R = [v_1, \dots, v_{\ell_2}] \in \mathbb{R}^{c \times \ell_2}$. Then the projection of $X \in \mathbb{R}^{r \times c}$ onto the space $\mathcal{L} \otimes \mathcal{R}$ is $L^T X R \in \mathbb{R}^{\ell_1 \times \ell_2}$.

Let $A_i \in \mathbb{R}^{r \times c}$, for $i = 1, \dots, n$, be the $n$ images in the dataset, clustered into classes $\Pi_1, \dots, \Pi_k$, where $\Pi_i$ has $n_i$ images. Let $M_i = \frac{1}{n_i} \sum_{X \in \Pi_i} X$ be the mean of the $i$th class, $1 \le i \le k$, and $M = \frac{1}{n} \sum_{i=1}^{k} \sum_{X \in \Pi_i} X$ be the global mean. In 2DLDA, we consider images as two-dimensional signals and aim to find two transformation matrices $L \in \mathbb{R}^{r \times \ell_1}$ and $R \in \mathbb{R}^{c \times \ell_2}$ that map each $A_i \in \mathbb{R}^{r \times c}$, for $1 \le i \le n$, to a matrix $B_i \in \mathbb{R}^{\ell_1 \times \ell_2}$ such that $B_i = L^T A_i R$.

Like classical LDA, 2DLDA aims to find the optimal transformations (projections) $L$ and $R$ such that the class structure of the original high-dimensional space is preserved in the low-dimensional space. A natural similarity metric between matrices is the Frobenius norm [8]. Under this metric, the (squared) within-class and between-class distances $D_w$ and $D_b$ can be computed as follows:
$$D_w = \sum_{i=1}^{k} \sum_{X \in \Pi_i} \|X - M_i\|_F^2, \qquad D_b = \sum_{i=1}^{k} n_i \|M_i - M\|_F^2.$$
Using the property of the trace that $\mathrm{trace}(M M^T) = \|M\|_F^2$, for any matrix $M$, we can rewrite $D_w$ and $D_b$ as follows:
$$D_w = \mathrm{trace}\Big(\sum_{i=1}^{k} \sum_{X \in \Pi_i} (X - M_i)(X - M_i)^T\Big), \qquad D_b = \mathrm{trace}\Big(\sum_{i=1}^{k} n_i (M_i - M)(M_i - M)^T\Big).$$
In the low-dimensional space resulting from the linear transformations $L$ and $R$, the within-class and between-class distances become
$$D_w = \mathrm{trace}\Big(\sum_{i=1}^{k} \sum_{X \in \Pi_i} L^T (X - M_i) R R^T (X - M_i)^T L\Big), \qquad D_b = \mathrm{trace}\Big(\sum_{i=1}^{k} n_i L^T (M_i - M) R R^T (M_i - M)^T L\Big).$$
The optimal transformations $L$ and $R$ would maximize $D_b$ and minimize $D_w$. Due to the difficulty of computing the optimal $L$ and $R$ simultaneously, we derive an iterative algorithm in the following. More specifically, for a fixed $R$, we can compute the optimal $L$ by solving an optimization problem similar to the one in Eq. (1). With the computed $L$, we can then update $R$ by solving another optimization problem as the one in Eq. (1). Details are given below. The procedure is repeated a certain number of times, as discussed in Section 4.

Computation of L. For a fixed $R$, $D_w$ and $D_b$ can be rewritten as
$$D_w = \mathrm{trace}\big(L^T S_w^R L\big), \qquad D_b = \mathrm{trace}\big(L^T S_b^R L\big),$$
where
$$S_w^R = \sum_{i=1}^{k} \sum_{X \in \Pi_i} (X - M_i) R R^T (X - M_i)^T, \qquad S_b^R = \sum_{i=1}^{k} n_i (M_i - M) R R^T (M_i - M)^T.$$
Similar to the optimization problem in Eq. (1), the optimal $L$ can be computed by solving the following optimization problem: $\max_L \mathrm{trace}\big((L^T S_w^R L)^{-1} (L^T S_b^R L)\big)$. The solution can be obtained by solving the generalized eigenvalue problem $S_b^R x = \lambda S_w^R x$. Since $S_w^R$ is in general nonsingular, the optimal $L$ can be obtained by computing an eigen-decomposition on $(S_w^R)^{-1} S_b^R$. Note that the size of the matrices $S_w^R$ and $S_b^R$ is $r \times r$, which is much smaller than the size of the matrices $S_w$ and $S_b$ in classical LDA.
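To make the trace rewriting above concrete, here is a small self-contained numerical check (random data, hypothetical sizes) confirming that the projected within-class distance equals $\mathrm{trace}(L^T S_w^R L)$; it is an illustration of the identity, not code from the paper.

```python
import numpy as np

# Check:  sum_i sum_{X in Pi_i} ||L^T (X - M_i) R||_F^2  ==  trace(L^T S_w^R L)
rng = np.random.default_rng(0)
r, c, ell1, ell2 = 6, 5, 3, 2
images = [rng.normal(size=(r, c)) for _ in range(8)]
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
L = rng.normal(size=(r, ell1))
R = rng.normal(size=(c, ell2))

means = {cl: np.mean([X for X, y in zip(images, labels) if y == cl], axis=0)
         for cl in np.unique(labels)}

# Left-hand side: squared Frobenius norms in the projected space.
lhs = sum(np.linalg.norm(L.T @ (X - means[y]) @ R, 'fro') ** 2
          for X, y in zip(images, labels))

# Right-hand side: the compact trace form with the r x r matrix S_w^R.
S_wR = sum((X - means[y]) @ R @ R.T @ (X - means[y]).T
           for X, y in zip(images, labels))
rhs = np.trace(L.T @ S_wR @ L)

print(np.allclose(lhs, rhs))   # True
```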

Computation of R. Next, consider the computation of $R$, for a fixed $L$. A key observation is that $D_w$ and $D_b$ can be rewritten as
$$D_w = \mathrm{trace}\big(R^T S_w^L R\big), \qquad D_b = \mathrm{trace}\big(R^T S_b^L R\big),$$
where
$$S_w^L = \sum_{i=1}^{k} \sum_{X \in \Pi_i} (X - M_i)^T L L^T (X - M_i), \qquad S_b^L = \sum_{i=1}^{k} n_i (M_i - M)^T L L^T (M_i - M).$$
This follows from the property of the trace that $\mathrm{trace}(AB) = \mathrm{trace}(BA)$, for any two matrices $A$ and $B$. Similarly, the optimal $R$ can be computed by solving the following optimization problem: $\max_R \mathrm{trace}\big((R^T S_w^L R)^{-1} (R^T S_b^L R)\big)$. The solution can be obtained by solving the generalized eigenvalue problem $S_b^L x = \lambda S_w^L x$. Since $S_w^L$ is in general nonsingular, the optimal $R$ can be obtained by computing an eigen-decomposition on $(S_w^L)^{-1} S_b^L$. Note that the size of the matrices $S_w^L$ and $S_b^L$ is $c \times c$, much smaller than that of $S_w$ and $S_b$.

The pseudo-code for the 2DLDA algorithm is given below as Algorithm 1.

Algorithm 1: 2DLDA($A_1, \dots, A_n$, $\ell_1$, $\ell_2$)
Input: $A_1, \dots, A_n$, $\ell_1$, $\ell_2$
Output: $L$, $R$, $B_1, \dots, B_n$
1. Compute the mean $M_i$ of the $i$th class for each $i$ as $M_i = \frac{1}{n_i} \sum_{X \in \Pi_i} X$;
2. Compute the global mean $M = \frac{1}{n} \sum_{i=1}^{k} \sum_{X \in \Pi_i} X$;
3. $R_0 \leftarrow (I_{\ell_2}, 0)^T$;
4. For $j$ from 1 to $I$
5.   $S_w^R \leftarrow \sum_{i=1}^{k} \sum_{X \in \Pi_i} (X - M_i) R_{j-1} R_{j-1}^T (X - M_i)^T$,  $S_b^R \leftarrow \sum_{i=1}^{k} n_i (M_i - M) R_{j-1} R_{j-1}^T (M_i - M)^T$;
6.   Compute the first $\ell_1$ eigenvectors $\{\phi_\ell^L\}_{\ell=1}^{\ell_1}$ of $(S_w^R)^{-1} S_b^R$;
7.   $L_j \leftarrow [\phi_1^L, \dots, \phi_{\ell_1}^L]$;
8.   $S_w^L \leftarrow \sum_{i=1}^{k} \sum_{X \in \Pi_i} (X - M_i)^T L_j L_j^T (X - M_i)$,  $S_b^L \leftarrow \sum_{i=1}^{k} n_i (M_i - M)^T L_j L_j^T (M_i - M)$;
9.   Compute the first $\ell_2$ eigenvectors $\{\phi_\ell^R\}_{\ell=1}^{\ell_2}$ of $(S_w^L)^{-1} S_b^L$;
10.  $R_j \leftarrow [\phi_1^R, \dots, \phi_{\ell_2}^R]$;
11. EndFor
12. $L \leftarrow L_I$, $R \leftarrow R_I$;
13. $B_\ell \leftarrow L^T A_\ell R$, for $\ell = 1, \dots, n$;
14. return($L$, $R$, $B_1, \dots, B_n$).
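Putting the two half-steps together gives the full iterative procedure. The sketch below is one possible transcription of Algorithm 1 into NumPy, assuming the images are given as a list of (r, c) arrays with integer labels; it is an illustration of the scheme, not the authors' code, and it uses a symmetric-definite generalized eigensolver, which assumes the within-class matrices are positive definite, as the text argues they generally are.

```python
import numpy as np
from scipy.linalg import eigh

def top_eigvecs(S_b, S_w, num):
    """First `num` eigenvectors of (S_w)^{-1} S_b, via S_b x = lambda S_w x."""
    eigvals, eigvecs = eigh(S_b, S_w)
    return eigvecs[:, np.argsort(eigvals)[::-1][:num]]

def two_d_lda(images, labels, ell1, ell2, num_iters=1):
    """2DLDA (Algorithm 1): returns L (r x ell1), R (c x ell2) and the
    reduced representations B_i = L^T A_i R."""
    images = [np.asarray(X, dtype=float) for X in images]
    labels = np.asarray(labels)
    r, c = images[0].shape
    classes = np.unique(labels)
    # Lines 1-2: class means and global mean.
    class_means = {cl: np.mean([X for X, y in zip(images, labels) if y == cl], axis=0)
                   for cl in classes}
    counts = {cl: int(np.sum(labels == cl)) for cl in classes}
    M = np.mean(images, axis=0)
    # Line 3: initialize R_0 = (I_{ell2}, 0)^T.
    R = np.vstack([np.eye(ell2), np.zeros((c - ell2, ell2))])
    for _ in range(num_iters):                       # Lines 4-11
        # Line 5: r x r scatter matrices for the current (fixed) R.
        RRt = R @ R.T
        S_wR = np.zeros((r, r)); S_bR = np.zeros((r, r))
        for X, y in zip(images, labels):
            D = X - class_means[y]
            S_wR += D @ RRt @ D.T
        for cl in classes:
            Dm = class_means[cl] - M
            S_bR += counts[cl] * (Dm @ RRt @ Dm.T)
        L = top_eigvecs(S_bR, S_wR, ell1)            # Lines 6-7
        # Line 8: c x c scatter matrices for the newly computed L.
        LLt = L @ L.T
        S_wL = np.zeros((c, c)); S_bL = np.zeros((c, c))
        for X, y in zip(images, labels):
            D = X - class_means[y]
            S_wL += D.T @ LLt @ D
        for cl in classes:
            Dm = class_means[cl] - M
            S_bL += counts[cl] * (Dm.T @ LLt @ Dm)
        R = top_eigvecs(S_bL, S_wL, ell2)            # Lines 9-10
    B = [L.T @ X @ R for X in images]                # Line 13
    return L, R, B
```

With the simplification $\ell_1 = \ell_2 = d$ and a single iteration (see Section 4), a call would look like `L, R, B = two_d_lda(images, labels, d, d, num_iters=1)` (hypothetical names).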

It is clear that the most expensive steps in Algorithm 1 are in Lines 5, 8, and 13, and the total time complexity is $O(n \max(\ell_1, \ell_2)(r + c)^2 I)$, where $I$ is the number of iterations. The algorithm depends on the initial choice $R_0$. Our experiments show that choosing $R_0 = (I_{\ell_2}, 0)^T$, where $I_{\ell_2}$ is the identity matrix, produces excellent results. We use this initial $R_0$ in all the experiments. Since the number of rows ($r$) and the number of columns ($c$) of an image $A_i$ are generally comparable, i.e., $r \approx c \approx \sqrt{N}$, we set $\ell_1$ and $\ell_2$ to a common value $d$ in the rest of this paper, for simplicity. However, the algorithm works in the general case. With this simplification, the time complexity of the 2DLDA algorithm becomes $O(ndNI)$. The space complexity of 2DLDA is $O(rc) = O(N)$. The key to the low space complexity of the 2DLDA algorithm is that the matrices $S_w^R$, $S_b^R$, $S_w^L$, and $S_b^L$ can be formed by reading the matrices $A_\ell$ incrementally.

3.1 2DLDA+LDA

As mentioned in the Introduction, PCA is commonly applied as an intermediate dimension-reduction stage before LDA to overcome the singularity problem of classical LDA. In this section, we consider the combination of 2DLDA and LDA, namely 2DLDA+LDA, where the dimension by 2DLDA is further reduced by LDA, since a small reduced dimension is desirable for efficient querying. More specifically, in the first stage of 2DLDA+LDA, each image $A_i \in \mathbb{R}^{r \times c}$ is reduced to $B_i \in \mathbb{R}^{d \times d}$ by 2DLDA, with $d < \min(r, c)$. In the second stage, each $B_i$ is first transformed to a vector $b_i \in \mathbb{R}^{d^2}$ (matrix-to-vector alignment), then $b_i$ is further reduced to $b_i^L \in \mathbb{R}^{k-1}$ by LDA, with $k-1 < d^2$, where $k$ is the number of classes. Here, matrix-to-vector alignment means that the matrix is transformed to a vector by concatenating all its rows together consecutively. The time complexity of the first stage by 2DLDA is $O(ndNI)$. The second stage applies classical LDA to data in $d^2$-dimensional space, hence takes $O(n(d^2)^2) = O(nd^4)$, assuming $n > d^2$. Hence the total time complexity of 2DLDA+LDA is $O(nd(NI + d^3))$.

4 Experiments

In this section, we experimentally evaluate the performance of 2DLDA and 2DLDA+LDA on face images and compare with PCA+LDA, which is used widely in face recognition. For PCA+LDA, we use 200 principal components in the PCA stage, as it produces good overall results. All of our experiments are performed on a P4 1.80GHz Linux machine with 1GB memory. For all the experiments, the 1-Nearest-Neighbor (1NN) algorithm is applied for classification, and ten-fold cross validation is used for computing the classification accuracy.

Datasets: We use three face datasets in our study: PIX, ORL, and PIE, all of which are publicly available. PIX contains 300 face images of 30 persons; we subsample the images to a smaller size. ORL contains 400 face images of 40 persons. PIE is a subset of the CMU PIE face image dataset; it contains 665 face images of 63 persons, which we also subsample to a smaller size. Note that PIE is much larger than the other two datasets.
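One possible way to wire up the 2DLDA+LDA pipeline of Section 3.1 together with the evaluation protocol just described (1-Nearest-Neighbor with ten-fold cross validation) is sketched below using scikit-learn. The estimator classes are standard scikit-learn components, and scikit-learn's LinearDiscriminantAnalysis merely stands in for the classical LDA stage; the script assumes the `two_d_lda` sketch from the previous block and is an illustration, not the authors' setup.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

def two_d_lda_plus_lda_features(images, labels, d):
    """Stage 1: 2DLDA down to d x d; then matrix-to-vector alignment
    (row concatenation) to give d^2-dimensional vectors for stage 2."""
    L, R, B = two_d_lda(images, labels, d, d, num_iters=1)   # from the sketch above
    return np.stack([Bi.reshape(-1) for Bi in B])            # shape (n, d*d)

# Ten-fold cross validation with a 1-Nearest-Neighbor classifier.
# Note: fitting 2DLDA on the full dataset before splitting is a simplification;
# a stricter protocol would refit it inside each training fold.
# X2d = two_d_lda_plus_lda_features(images, labels, d=10)
# clf = make_pipeline(LinearDiscriminantAnalysis(),          # reduces to <= k-1 dims
#                     KNeighborsClassifier(n_neighbors=1))
# scores = cross_val_score(clf, X2d, labels, cv=10)
# print(scores.mean())
```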

Figure 1: Effect of the number of iterations on 2DLDA and 2DLDA+LDA using the three face datasets: PIX, ORL and PIE (from left to right).

The impact of the number, I, of iterations: In this experiment, we study the effect of the number of iterations ($I$ in Algorithm 1) on 2DLDA and 2DLDA+LDA. The results are shown in Figure 1, where the x-axis denotes the number of iterations, and the y-axis denotes the classification accuracy. $d = 10$ is used for both algorithms. It is clear that both accuracy curves are stable with respect to the number of iterations. In general, the accuracy curves of 2DLDA+LDA are slightly more stable than those of 2DLDA. The key consequence is that we need to run the for loop (from Line 4 to Line 11) in Algorithm 1 only once, i.e., $I = 1$, which significantly reduces the total running time of both algorithms.

The impact of the value of the reduced dimension d: In this experiment, we study the effect of the value of $d$ on 2DLDA and 2DLDA+LDA, where the value of $d$ determines the dimensionality of the space transformed by 2DLDA. We did extensive experiments using different values of $d$ on the face image datasets. The results are summarized in Figure 2, where the x-axis denotes the values of $d$ (between 1 and 15) and the y-axis denotes the classification accuracy with 1-Nearest-Neighbor as the classifier. As shown in Figure 2, the accuracy curves on all datasets stabilize around $d = 4$ to 6.

Comparison on classification accuracy and efficiency: In this experiment, we evaluate the effectiveness of the proposed algorithms in terms of classification accuracy and efficiency and compare with PCA+LDA. The results are summarized in Table 2. We can observe that 2DLDA+LDA has similar performance to PCA+LDA in classification, while it outperforms 2DLDA. Hence the LDA stage in 2DLDA+LDA not only reduces the dimension, but also increases the accuracy. Another key observation from Table 2 is that 2DLDA is almost one order of magnitude faster than PCA+LDA, while the running time of 2DLDA+LDA is close to that of 2DLDA. Hence 2DLDA+LDA is a more effective dimension reduction algorithm than PCA+LDA, as it is competitive with PCA+LDA in classification and has the same number of reduced dimensions in the transformed space, while it has much lower time and space costs.

5 Conclusions

An efficient algorithm, namely 2DLDA, is presented for dimension reduction. 2DLDA is an extension of LDA. The key difference between 2DLDA and LDA is that 2DLDA works on the matrix representation of images directly, while LDA uses a vector representation. 2DLDA has asymptotically minimum memory requirements, and lower time complexity than LDA, which is desirable for large face datasets, while it implicitly avoids the singularity problem encountered in classical LDA. We also study the combination of 2DLDA and LDA, namely 2DLDA+LDA, where the dimension by 2DLDA is further reduced by LDA. Experiments show that 2DLDA and 2DLDA+LDA are competitive with PCA+LDA in terms of classification accuracy, while they have significantly lower time and space costs.

Figure 2: Effect of the value of the reduced dimension d on 2DLDA and 2DLDA+LDA using the three face datasets: PIX, ORL and PIE (from left to right).

Dataset    PCA+LDA                 2DLDA                   2DLDA+LDA
           Accuracy   Time(Sec)    Accuracy   Time(Sec)    Accuracy   Time(Sec)
PIX        98.00%
ORL        97.75%
PIE        --         --           99.32%     53           100%       57

Table 2: Comparison on classification accuracy and efficiency. "--" means that PCA+LDA is not applicable for PIE, due to its large size. Note that PCA+LDA involves an eigen-decomposition of the scatter matrices, which requires the whole data matrix to reside in main memory.

Acknowledgment

Research of J. Ye and R. Janardan is sponsored, in part, by the Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAD, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred.

References

[1] P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711-720, 1997.
[2] R.O. Duda, P.E. Hart, and D. Stork. Pattern Classification. Wiley.
[3] S. Dudoit, J. Fridlyand, and T.P. Speed. Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 97(457):77-87, 2002.
[4] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, San Diego, California, USA, 1990.
[5] W.J. Krzanowski, P. Jonathan, W.V. McCarthy, and M.R. Thomas. Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data. Applied Statistics, 44:101-115, 1995.
[6] D.L. Swets and J. Weng. Using discriminant eigenfeatures for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):831-836, 1996.
[7] J. Yang, D. Zhang, A.F. Frangi, and J.Y. Yang. Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1):131-137, 2004.
[8] J. Ye. Generalized low rank approximations of matrices. In ICML Conference Proceedings, 2004.
[9] J. Ye, R. Janardan, and Q. Li. GPCA: An efficient dimension reduction scheme for image compression and retrieval. In ACM SIGKDD Conference Proceedings, 2004.
