Online Appearance Model Learning for Video-Based Face Recognition


Liang Liu 1, Yunhong Wang 2, Tieniu Tan 1
1 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. {lliu, tnt}@nlpr.ia.ac.cn
2 School of Computer Science and Engineering, Beihang University, Beijing, China. yhwang@buaa.edu.cn

Abstract

In this paper, we propose a novel online learning method which can learn appearance models incrementally from a given video stream. The data of each frame can be discarded as soon as it has been processed; we only need to maintain a few linear eigenspace models and a transition matrix to approximately construct face appearance manifolds, and it is convenient to use these learnt models for video-based face recognition. This paper makes two main contributions. First, we propose an algorithm which can learn appearance models online without using a pre-trained model. Second, we propose an eigenspace splitting method that prevents most samples from clustering into the same eigenspace, which is useful for clustering and classification. Experimental results show that the proposed method can both learn appearance models online and achieve a high recognition rate.

1. Introduction

For video-based face recognition, most state-of-the-art algorithms [1, 2, 6, 7, 8, 9, 10, 12, 16, 18, 19] can perform the recognition task in real time. However, the training process usually runs offline in batch mode. Though a few online learning algorithms have been proposed recently, they generally perform online learning on top of a pre-trained model, which is typically trained in batch mode from a manually collected and labeled data set. Collecting and labeling such data is tedious, and a pre-trained model is usually not flexible enough to cope with the variety of conditions encountered online. Learning appearance models online without a pre-trained model is therefore a problem worth solving.

Batch training has several drawbacks. First, it is inconvenient to add training samples: each time new samples are added, the batch algorithm must be run again, and its complexity is generally at least proportional to the total number of training samples. Second, when the training set is huge, the computational cost of batch algorithms is often prohibitive on current computers. Third, batch algorithms cannot be applied to real-time online training.

Online learning, in contrast, has several advantages. First, it is convenient to add training samples: the cost of each update is roughly constant because data is discarded as soon as it has been processed, and only a roughly constant amount of memory is needed to represent the models, which makes online learning well suited to processing video streams. Second, huge data sets can be handled easily by sequential processing. Third, online learning algorithms can be used for real-time online training. For video-based face recognition, these properties are quite desirable.

Based on the framework of probabilistic appearance manifolds proposed by Lee and Kriegman [8], we propose an online learning algorithm which can learn appearance manifolds without a pre-trained model. Similar to [8], we use a set of linear eigenspaces to represent sub-manifolds.
However, in our method the sub-manifolds are learnt completely online, whereas Lee and Kriegman's method [8] learns appearance manifolds based on a pre-trained model. For each subject in the training set, we construct K pose manifolds, each approximately represented by a linear eigenspace (K can be chosen empirically). To exploit the temporal information embedded in video streams, we maintain a transition matrix that records the number of transitions from one eigenspace to another; the transition probability between any two eigenspaces is then easily computed from this matrix. In the online learning process, each time an incoming frame arrives, the eigenspace models are updated using IPCA (Incremental Principal Component Analysis) [4] or the Eigenspace Merging and Splitting (EMS) method, taking transition probabilities into account. For the recognition task, we compute the likelihood that a test frame is generated from each pose manifold and choose the manifold with the highest probability. Experimental results show that the proposed algorithm achieves a high recognition rate.

The remainder of this paper is organized as follows. In Section 2, we briefly review previous work on video-based face recognition. In Section 3, we describe the proposed method in detail. Experimental results are presented in Section 4, and conclusions are drawn in Section 5.

2. Previous work

A general review of the recent face recognition literature can be found in [17]. In this section, we only briefly review work which deals specifically with video-based face recognition. In [16], the Mutual Subspace Method (MSM) is applied, in which similarity is defined by the angle between the input and reference subspaces. Krüger and Zhou [6] proposed an exemplar-based method which selects representative face images as exemplars from training face videos; these exemplars are used to facilitate tracking and recognition. Liu and Chen [10] proposed to use adaptive Hidden Markov Models (HMM) for video-based face recognition. In [12], a KL divergence-based algorithm was proposed. In [14] and [9], frame synchronization was used, and audio information in the videos was also exploited. In [1], a generic shape-illumination manifold is learnt offline; given a new sequence, the learnt model is used to decompose the face appearance manifold into albedo and shape-illumination manifolds. Zhou et al. [18, 19] proposed a generic framework for both tracking and recognition by estimating the joint posterior probability distribution of the motion vector and the identity variable. Lee et al. [7] proposed a method using probabilistic appearance manifolds, in which an appearance manifold is approximated by piecewise linear subspaces together with a transition matrix learnt from an image sequence. An online learning algorithm for constructing a probabilistic appearance manifold was proposed in [8]: an appearance model is incrementally learnt online using a prior generic model and successive frames from the video, and both the generic and individual appearances are represented as appearance manifolds approximated by collections of sub-manifolds (namely pose manifolds) and the connectivity between them. One obvious limitation is that their algorithm requires a generic prior model which must be learnt offline. Our work bears some resemblance to [8] in that both methods use eigenspace models and a transition matrix to approximate pose manifolds; however, in this paper we present an online learning algorithm which does not require a generic model.

3. Online appearance model learning

In Section 3.1, we describe the appearance models and the transition matrix to be learnt online. In Section 3.2, we present a framework for online appearance model learning using Eigenspace Merging and Splitting (EMS). Eigenspace update using IPCA and EMS is a critical part of our method and is discussed in Section 3.3. In Section 3.4, we discuss the computation of distances in more detail.

3.1. Model description

The problem we focus on in this paper can be described as follows.
For the training face video stream of each subject in the data set, we aim to construct K eigenspaces, $\Omega^{(1)}, \Omega^{(2)}, \ldots, \Omega^{(K)}$, which approximately represent the appearance manifold of that subject. Each eigenspace is described by four parameters, namely [4]

$$\Omega^{(i)} = \{\bar{x}^{(i)}, U^{(i)}, \Lambda^{(i)}, N^{(i)}\}, \quad i = 1, \ldots, K. \qquad (1)$$

The meaning of each parameter is as follows.
$\bar{x}$: the center of the eigenspace.
$U$: a matrix whose columns form an orthonormal basis of the eigenspace, namely the eigenvectors.
$\Lambda$: a diagonal matrix whose diagonal elements are the variances along the principal axes, namely the eigenvalues, arranged in descending order.
$N$: the number of samples used to construct the eigenspace.

We use a transition matrix T to record the number of transitions from one eigenspace to another. To make our algorithm more efficient, we also maintain a distance matrix D recording the distance from one eigenspace to another:

$$T = (T_{ij})_{K \times K}, \qquad (2)$$
$$D = (D_{ij})_{K \times K}. \qquad (3)$$

How to learn these models online is the focus of this paper; our method is presented in Section 3.2.
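For concreteness, the per-subject state of Eqs. (1)-(3) can be held in a structure like the following minimal sketch. Python with numpy is assumed; the names Eigenspace and init_matrices are ours, and the uniform prior T_0 anticipates the initialization used by Algorithm 1 in Section 3.2.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Eigenspace:            # the four parameters of Eq. (1)
    x_bar: np.ndarray        # center of the eigenspace (length-m mean)
    U: np.ndarray            # m x q matrix of orthonormal eigenvectors
    lam: np.ndarray          # q eigenvalues, in descending order
    N: float                 # number of samples behind the model

def init_matrices(K: int, T0: float):
    """Eqs. (2)-(3): T holds transition counts, initialized to a uniform
    prior T0; D holds center-to-center distances, initialized to zero."""
    T = np.full((K, K), T0)  # T[i, j]: transitions from model i to model j
    D = np.zeros((K, K))     # D[i, j]: distance between centers i and j
    return T, D
```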

3.2. Online appearance model learning method

Our method is motivated by hierarchical clustering; in effect, we perform hierarchical clustering incrementally. Initially, we assign the first K frames as the centers of K eigenspaces and compute the distance between every pair of eigenspace centers. For each incoming frame $I_t$, we compute the distance between $I_t$ and each eigenspace center. The distance between two eigenspace centers, or between an incoming sample and an eigenspace center, is influenced by both the Euclidean distance and the transition probability between them (see Section 3.4 for details). We then find the nearest pair. If the smallest distance is between $I_t$ and an eigenspace center, we update that eigenspace using IPCA. If the smallest distance is between two eigenspace centers, we merge those two eigenspaces into one. To prevent most frames from clustering into the same eigenspace, we split an eigenspace when it contains too many frames. After every update of the eigenspaces, the transition matrix T and the distance matrix D are updated. Our algorithm is summarized as follows.

Algorithm 1
Input: $\{I_1, I_2, \ldots, I_n\}$: a consecutive face image sequence from one person; K: the number of eigenspace models to be learnt.
Output: $\Omega^{(i)} = \{\bar{x}^{(i)}, U^{(i)}, \Lambda^{(i)}, N^{(i)}\}, i = 1, \ldots, K$.
Method: Initialize matrix D with all zeros. Initialize every element of matrix T with the same positive number $T_0$ (the prior distribution of transition probabilities is assumed uniform).
1. for i ← 1 to n
2.     if i ≤ K
3.         $\bar{x}^{(i)} \leftarrow I_i$, $N^{(i)} \leftarrow 1$, $U^{(i)} \leftarrow \{\}$, $\Lambda^{(i)} \leftarrow \{\}$
4.         Update D and T (Section 3.4).
5.     else
6.         Compute the distance from $I_i$ to each eigenspace center (Section 3.4).
7.         Find the nearest neighbor.
8.         if the smallest distance is between $I_i$ and an eigenspace center
9.             Update that eigenspace with $I_i$ using IPCA (Section 3.3.1).
10.        else
11.            Merge the two nearest eigenspaces (Section 3.3.2).
12.            Assign $I_i$ as the center of a new eigenspace model.
13.        end if
14.        If an eigenspace contains too many samples, split it into two (Section 3.3.3).
15.        Update D and T.
16.    end if
17. end for

We have also considered how to handle outliers. Generally, an eigenspace is more likely to be an outlier if it is constructed from fewer frames or if its center is farther from the other eigenspace centers. In the eigenspace splitting step, we remove the eigenspace which is most likely to be an outlier according to these criteria.

In our algorithm, eigenspace update using IPCA and EMS is a critical part. In Section 3.3, we show how to update eigenspaces without knowing either the original samples or the covariance matrix.

3.3. Eigenspace update

In this section, we discuss the computation of IPCA, eigenspace merging, and eigenspace splitting in more detail. At the end of the section, some tricks are provided to make the calculation more efficient.

3.3.1. IPCA

An IPCA algorithm was proposed by Hall et al. [4], but their method is somewhat complicated. A more concise method is described as follows.

Algorithm 2
Input: $\Omega = \{\bar{x}, U, \Lambda, N\}$: constructed from $x_1, x_2, \ldots, x_N$; $x$: a new sample.
Output: $\Omega' = \{\bar{x}', U', \Lambda', N'\}$: constructed from $x_1, x_2, \ldots, x_N, x$.
Method:
1. $N' \leftarrow N + 1$.
2. $\alpha_1 \leftarrow N / N'$, $\alpha_2 \leftarrow 1 - \alpha_1$.
3. $\bar{x}' \leftarrow \alpha_1 \bar{x} + \alpha_2 x$.
4. Generate artificial data: $Y \leftarrow [\sqrt{\alpha_1}\, U \Lambda^{1/2},\ \sqrt{\alpha_1 \alpha_2}\, (x - \bar{x})]$.
5. Compute the eigenvectors and eigenvalues of $Y^T Y$: $Y^T Y = V \Lambda' V^T$.
6. Compute the eigenvectors of $\Omega'$: $U' \leftarrow Y V \Lambda'^{-1/2}$.
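The following minimal numpy sketch implements Algorithm 2. The function name, the keep truncation (retaining only the larger eigenpairs, as suggested in the complexity discussion below), and the small-eigenvalue guard are our additions rather than part of the paper.

```python
import numpy as np

def ipca_update(x_bar, U, lam, N, x_new, keep=None):
    """One step of Algorithm 2: fold a new sample into an eigenspace."""
    N_new = N + 1                            # step 1
    a1 = N / N_new                           # step 2
    a2 = 1.0 - a1
    x_bar_new = a1 * x_bar + a2 * x_new      # step 3
    # Step 4: artificial data Y = [sqrt(a1) U Lam^(1/2), sqrt(a1 a2)(x - x_bar)]
    p = np.sqrt(a1 * a2) * (x_new - x_bar)
    Y = np.column_stack([np.sqrt(a1) * U * np.sqrt(lam), p])
    # Step 5: eigendecompose the small Gram matrix Y^T Y, never the m x m one
    w, V = np.linalg.eigh(Y.T @ Y)
    order = np.argsort(w)[::-1][:keep]       # descending; optional truncation
    w, V = w[order], V[:, order]
    ok = w > 1e-12                           # guard against dividing by ~0
    w, V = w[ok], V[:, ok]
    U_new = Y @ (V / np.sqrt(w))             # step 6: U' = Y V Lam'^(-1/2)
    return x_bar_new, U_new, w, N_new

# Example: grow one eigenspace from a stream of synthetic frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 1024))         # 50 fake 32x32 frames, flattened
x_bar, U, lam, N = frames[0], np.zeros((1024, 0)), np.zeros(0), 1
for f in frames[1:]:
    x_bar, U, lam, N = ipca_update(x_bar, U, lam, N, f, keep=10)
```

Starting from a zero-dimensional eigenspace, as in the example, matches the initialization of Algorithm 1 ($U \leftarrow \{\}$, $\Lambda \leftarrow \{\}$).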

In Section 3.3.2, we will see that IPCA is just a special case of eigenspace merging. Therefore, the correctness of Algorithm 2 is obvious if Algorithm 3 in Section 3.3.2 is correct; a proof of Algorithm 3 can be found in [13]. For eigenspace merging, if one of the two eigenspaces to be merged has zero dimensions, the problem degenerates to IPCA.

The time complexity of Algorithm 2 is dominated by Steps 5 and 6. If Y is an m × r matrix, Step 5 can be computed in time O(rm + r³) (see Section 3.3.4 for details), and Step 6 takes time O(r²m), so the time complexity of Algorithm 2 is O(r²m + r³). In Steps 5 and 6, we need not retain all the eigenvalues and eigenvectors; it suffices to retain only the few relatively larger eigenvalues and their corresponding eigenvectors.

3.3.2. Eigenspace merging

Skarbek [13] developed an algorithm for eigenspace merging which is more concise than Hall's method [5]; neither method needs to store the covariance matrix of the previous training samples. Given two eigenspace models $\Omega_1$ and $\Omega_2$, we aim to find the eigenspace model of the union of the two original data sets, assuming that the original data is no longer available. Skarbek's algorithm is summarized as follows [13].

Algorithm 3
Input: $\Omega_1 = \{\bar{x}_1, U_1, \Lambda_1, N_1\}$: constructed from $x_1, x_2, \ldots, x_{N_1}$; $\Omega_2 = \{\bar{x}_2, U_2, \Lambda_2, N_2\}$: constructed from $y_1, y_2, \ldots, y_{N_2}$.
Output: $\Omega' = \{\bar{x}', U', \Lambda', N'\}$: constructed from $x_1, \ldots, x_{N_1}, y_1, \ldots, y_{N_2}$.
Method:
1. $N' \leftarrow N_1 + N_2$.
2. $\alpha_1 \leftarrow N_1 / N'$, $\alpha_2 \leftarrow 1 - \alpha_1$.
3. $\bar{x}' \leftarrow \alpha_1 \bar{x}_1 + \alpha_2 \bar{x}_2$.
4. Generate artificial data: $Y \leftarrow [\sqrt{\alpha_1}\, U_1 \Lambda_1^{1/2},\ \sqrt{\alpha_2}\, U_2 \Lambda_2^{1/2},\ \sqrt{\alpha_1 \alpha_2}\, (\bar{x}_1 - \bar{x}_2)]$.
5. Compute the eigenvectors and eigenvalues of $Y^T Y$: $Y^T Y = V \Lambda' V^T$.
6. Compute the eigenvectors of $\Omega'$: $U' \leftarrow Y V \Lambda'^{-1/2}$.

The time complexity of Algorithm 3 is dominated by Steps 5 and 6. If the sizes of Y, $U_1$ and $U_2$ are m × r, m × q₁ and m × q₂ respectively, Step 5 can be computed in time O(q₁q₂m + r³) (see Section 3.3.4 for details), and Step 6 takes time O(r²m), so the time complexity of Algorithm 3 is O(r²m + r³). As in Algorithm 2, only the few relatively larger eigenvalues and their corresponding eigenvectors need to be retained.

3.3.3. Eigenspace splitting

An eigenspace corresponds to a hyper-ellipsoid, and there are infinitely many ways to split it. Here we adopt an intuitive one: we split the hyper-ellipsoid with the hyperplane that passes through the center of the eigenspace and is perpendicular to the longest axis of the hyper-ellipsoid; the longest axis corresponds to the first principal eigenvector. Suppose we split an eigenspace Ω into two new eigenspaces Ω₁ and Ω₂; each new center should be a translation of the original center along the longest axis. Because the original data set from which Ω was constructed is unknown, splitting Ω into two equal halves is a reasonable choice. We obtain Ω₁ and Ω₂ as follows, and it can be verified that Ω is recovered by merging Ω₁ and Ω₂.

Proposition. Assume that
$\Omega = \{\bar{x},\ (u_1, u_2, \ldots, u_q),\ \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_q),\ N\}$,
$\Omega_1 = \{\bar{x} + \sqrt{\lambda_1}\, u_1,\ (u_2, \ldots, u_q),\ \mathrm{diag}(\lambda_2, \ldots, \lambda_q),\ N/2\}$,
$\Omega_2 = \{\bar{x} - \sqrt{\lambda_1}\, u_1,\ (u_2, \ldots, u_q),\ \mathrm{diag}(\lambda_2, \ldots, \lambda_q),\ N/2\}$.
Then Ω can be produced by merging Ω₁ and Ω₂. (N/2 is allowed to be a decimal fraction.)

Proof: Applying Algorithm 3 to Ω₁ and Ω₂ gives

$\alpha_1 = \alpha_2 = 1/2, \qquad (4)$
$Y_1 = \sqrt{\alpha_1}\, (u_2, \ldots, u_q)\, \mathrm{diag}(\lambda_2, \ldots, \lambda_q)^{1/2}, \qquad (5)$
$Y_2 = \sqrt{\alpha_2}\, (u_2, \ldots, u_q)\, \mathrm{diag}(\lambda_2, \ldots, \lambda_q)^{1/2}, \qquad (6)$
$p = \sqrt{\alpha_1 \alpha_2} \cdot 2\sqrt{\lambda_1}\, u_1 = \sqrt{\lambda_1}\, u_1, \qquad (7)$
$Y = [Y_1, Y_2, p], \qquad (8)$
$Y Y^T = (u_2, \ldots, u_q)\, \mathrm{diag}(\lambda_2, \ldots, \lambda_q)\, (u_2, \ldots, u_q)^T + \lambda_1 u_1 u_1^T = U \Lambda U^T. \qquad (9)$
The proposition now follows immediately from Algorithm 3 and the definition of the Singular Value Decomposition (SVD).
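As an illustration, here is a minimal numpy sketch of Algorithm 3 and of the split defined in the Proposition. The tuple layout, the function names, and the keep truncation are our assumptions, not the paper's.

```python
import numpy as np

def merge_eigenspaces(es1, es2, keep=None):
    """Algorithm 3 (Skarbek [13]): merge two eigenspace models without
    the raw data. Each model is a tuple (x_bar, U, lam, N) as in Eq. (1)."""
    x1, U1, l1, N1 = es1
    x2, U2, l2, N2 = es2
    N = N1 + N2                                       # step 1
    a1 = N1 / N                                       # step 2
    a2 = 1.0 - a1
    x_bar = a1 * x1 + a2 * x2                         # step 3
    p = np.sqrt(a1 * a2) * (x1 - x2)                  # step 4
    Y = np.column_stack([np.sqrt(a1) * U1 * np.sqrt(l1),
                         np.sqrt(a2) * U2 * np.sqrt(l2), p])
    # Step 5: per Eq. (10) below, the diagonal blocks of Y^T Y are simply
    # a1*Lam1 and a2*Lam2; for brevity this sketch forms the Gram matrix
    # directly instead of exploiting that structure.
    w, V = np.linalg.eigh(Y.T @ Y)
    order = np.argsort(w)[::-1][:keep]                # descending eigenvalues
    w, V = w[order], V[:, order]
    ok = w > 1e-12                                    # drop numerically null axes
    w, V = w[ok], V[:, ok]
    return x_bar, Y @ (V / np.sqrt(w)), w, N          # step 6

def split_eigenspace(es):
    """The split of the Proposition: translate the center by +/- sqrt(l1) u1,
    drop the first principal axis, and give each half N/2 samples."""
    x_bar, U, lam, N = es
    shift = np.sqrt(lam[0]) * U[:, 0]
    rest = (U[:, 1:], lam[1:], N / 2.0)
    return (x_bar + shift, *rest), (x_bar - shift, *rest)
```

Merging the two halves returned by split_eigenspace reproduces the original center and eigenvalues, which is exactly what the Proposition asserts.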

For eigenspace splitting, the time complexity is dominated by the copying of $(u_2, \ldots, u_q)$; splitting can therefore be computed in time O(qm), where m is the dimension of the feature space.

3.3.4. More efficient calculation

The calculation of $Y^T Y$ in Section 3.3.2 can be simplified by some further analysis [3]:

$$Y^T Y = [Y_1, Y_2, p]^T [Y_1, Y_2, p] = \begin{bmatrix} \alpha_1 \Lambda_1 & Y_1^T Y_2 & Y_1^T p \\ Y_2^T Y_1 & \alpha_2 \Lambda_2 & Y_2^T p \\ p^T Y_1 & p^T Y_2 & p^T p \end{bmatrix}. \qquad (10)$$

Because $Y^T Y$ is symmetric, we only need to calculate $\alpha_1 \Lambda_1$, $\alpha_2 \Lambda_2$, $Y_2^T Y_1$ and $p^T Y$. In this way, the cost of forming $Y^T Y$ becomes much lower. Similarly, the calculation of $Y^T Y$ in Section 3.3.1 can also be simplified:

$$Y^T Y = [Y_1, p]^T [Y_1, p] = \begin{bmatrix} \alpha_1 \Lambda & Y_1^T p \\ p^T Y_1 & p^T p \end{bmatrix}. \qquad (11)$$

We only need to calculate $\alpha_1 \Lambda$ and $p^T Y$.

3.4. Computation of distance

Let $p_{ij}$ be the transition probability from $\Omega^{(i)}$ to $\Omega^{(j)}$. It can be computed directly from the transition matrix T:

$$p_{ij} = p(\Omega^{(j)} \mid \Omega^{(i)}) = \frac{T_{ij}}{\sum_{j=1}^{K} T_{ij}}. \qquad (12)$$

Since a larger $p_{ij}$ should correspond to a smaller $D_{ij}$, we compute $D_{ij}$ as

$$D_{ij} = \|\bar{x}^{(i)} - \bar{x}^{(j)}\|^2 \, (a - p_{ij}), \qquad (13)$$

where a is a constant chosen empirically. (For a vector v, $\|v\|^2$ denotes the sum of the squares of all entries of v.) Similarly, the distance between an incoming sample x and the center of $\Omega^{(i)}$ is computed as

$$d_i = \|x - \bar{x}^{(i)}\|^2 \, (b - p_i), \qquad (14)$$

where b is a constant like a, and $p_i$ is the transition probability from $\Omega^{(i)}$ to the incoming sample:

$$p_i = \begin{cases} p_0 & \text{if the previous sample is in } \Omega^{(i)}, \\ \dfrac{1 - p_0}{K - 1} & \text{otherwise}, \end{cases} \qquad (15)$$

where $p_0$ is a constant and $1/K < p_0 < 1$.

Figure 1. Typical samples of the videos used in our experiments. The images in each row come from a different video sequence.

4. Experimental results

To evaluate the effectiveness of the proposed algorithm, we conduct experiments on a 36-subject face video data set with large pose variation, collected by our lab. The data set contains 36 videos, one per subject. Each video sequence was captured indoors at 30 frames per second. The sequences contain large 2-D (in-plane) and 3-D (out-of-plane) head rotation, with slight expression and illumination changes. The number of frames per video ranges from 236 to 1270.

Since our experiments mainly focus on learning and recognition, we do not pay much attention to automatic face detection and tracking. In each video sequence, the faces are cropped automatically using a boosted cascade face detector [15], or manually when the detection results are not good enough. All cropped images are converted to gray level and resized to a standard size, and a histogram equalization step is then applied to reduce the impact of illumination. Some samples are shown in Fig. 1.

In our experiments, we use the first half of each video sequence, ranging from 118 to 635 frames, for online learning, and the second half for the recognition task. Apart from the proposed method, we also run some other online learning methods for comparison (the distance computation that distinguishes the proposed method is sketched in code after this list):

- EMS + transition: the proposed method.
- EMS: online learning using Eigenspace Merging and Splitting, without considering transition probabilities.
- EM + transition: online learning using Eigenspace Merging with transition probabilities taken into account; eigenspace splitting is not used.
- IPCA: online learning using IPCA; for each subject, only one eigenspace model is learnt.
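To make concrete how the transition probabilities separate EMS + transition from plain EMS, the following minimal numpy sketch implements the distances of Eqs. (12)-(15). The function names and the K x m array layout of the centers are our assumptions.

```python
import numpy as np

def transition_probs(T):
    """Eq. (12): row-normalize the transition-count matrix T."""
    return T / T.sum(axis=1, keepdims=True)

def center_distances(centers, T, a):
    """Eq. (13): D_ij = ||x_i - x_j||^2 (a - p_ij), with `centers` a
    K x m array of eigenspace centers and `a` an empirical constant."""
    P = transition_probs(T)
    diff = centers[:, None, :] - centers[None, :, :]
    return (diff ** 2).sum(axis=2) * (a - P)

def sample_distances(x, centers, prev, b, p0):
    """Eqs. (14)-(15): distances d_i from an incoming sample x to every
    center; `prev` indexes the eigenspace holding the previous frame,
    and p0 is a constant with 1/K < p0 < 1."""
    K = centers.shape[0]
    p = np.full(K, (1.0 - p0) / (K - 1))
    p[prev] = p0
    return ((x - centers) ** 2).sum(axis=1) * (b - p)
```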

For the first three methods, K, namely the number of eigenspace models, is a parameter which is difficult to optimize without experiments. We therefore run experiments with K = 6, 7, 8, 9 to evaluate these three methods. The test sets are constructed by randomly sampling from the second half of each video sequence 10 times, with each set containing 50 independently and identically distributed samples [2]. For each sample in a test set, we compute the likelihood that it is generated from each eigenspace model and choose the maximum to make a classification. The likelihood can be computed using the following formula [11]:

$$p(I_t \mid \Omega^{(i)}) = \frac{\exp\left(-\frac{1}{2}\sum_{i=1}^{q} \frac{y_i^2}{\lambda_i}\right)}{(2\pi)^{q/2}\prod_{i=1}^{q}\lambda_i^{1/2}} \cdot \frac{\exp\left(-\frac{\epsilon^2(x)}{2\rho}\right)}{(2\pi\rho)^{(m-q)/2}}, \qquad (16)$$

where $[y_1, y_2, \ldots, y_q]^T$ is the projection of $I_t$ onto $\Omega^{(i)}$, and $\epsilon^2(x)$ is the squared Euclidean distance from $I_t$ to $\Omega^{(i)}$. The parameter ρ is chosen empirically as $0.3\lambda_q$. For each test set, we use majority voting to make the final decision. The recognition rates shown in Table 1 are averages over all runs.

Table 1. Average recognition rates (%) of the different methods for K = 6, 7, 8, 9. IPCA, which learns a single eigenspace per subject, achieves 38.9% regardless of K; EMS + transition peaks at 96.4% with K = 8.

The results show that the proposed method outperforms the other three online learning methods most of the time. IPCA gives the worst performance because it tries to learn non-linear manifolds in a simple linear way. EM + transition can sometimes give good results but is not stable enough, mainly because most frames sometimes cluster into the same eigenspace, in which case the method degenerates to IPCA. EMS is more stable than EM + transition but performs worse than EMS + transition most of the time, mainly because EMS does not exploit any temporal information. We also note that with K = 8 our method performs much better than the other three methods, with an average recognition rate as high as 96.4%. Fig. 2 shows some eigenspace centers learnt with K = 8.

Figure 2. Some eigenspace centers learnt when choosing K = 8. Compared with Fig. 1, these eigenspace centers are fairly representative of the original sequences.

We also implemented the Probabilistic Manifold online learning algorithm of [8] for comparison. This algorithm starts with a generic manifold which is trained offline, and its online learning process contains two steps. The first step identifies the pose manifold to which the current image belongs with the highest probability. The second step updates the appearance manifold using IPCA: the result of the first step is used to find a set of pre-training images that are expected to appear similar to the current subject in other poses, and all of the other eigenspaces in the appearance manifold are then updated with synthetic images. We use 15 of the 36 video sequences for pre-training; the face images are manually classified into 5 pose clusters, and a 10-D pose subspace is computed from the images in each cluster using PCA [8]. We use the remaining 21 video sequences for online learning and recognition: the first half of each sequence is used for online learning, and the second half for the recognition task. Our proposed algorithm is also run on these 21 sequences to compare performance. The average recognition rates and the processing times are listed in Table 2.
We can notice that the recognition rate of EMS + transition in Table 2 is higher than that in Table 1. This is because the results in Table 2 are obtained on a 21-subject data set, while those in Table 1 are obtained on the 36-subject data set; a smaller data set usually makes the recognition task easier.

Table 2. Performance of the Probabilistic Manifold algorithm and the proposed algorithm. For EMS + transition, we choose K = 8; because EMS + transition needs no pre-training, the corresponding entry is left empty.

Method            | Prob. Manifold | EMS + transition
Recognition rate  | 92.4%          | 97.1%
Online learning   | 34.3 s         | 9.5 s
Pre-training      | 77.3 s         | -

From Table 2, we can see that the proposed algorithm outperforms that of [8], while its processing time is much shorter. This is mainly because the Probabilistic Manifold algorithm uses only 5 pose manifolds, which are not flexible enough to represent different poses; in addition, updating eigenspaces with synthetic face images significantly increases the time complexity. In contrast, the proposed algorithm is more flexible in generating representative eigenspace models.

5. Conclusions

In this paper we have presented a novel method for online appearance model learning which can be applied to video-based face recognition. For each person, we build K linear eigenspace models and a transition matrix to approximately construct the face appearance manifold. Each eigenspace model can be viewed as a pose model representing a particular pose. We update these eigenspace models incrementally, using IPCA, Eigenspace Merging, or Eigenspace Splitting as necessary, and we exploit temporal information by maintaining a transition matrix. The distance between two eigenspace centers, or between an incoming sample and an eigenspace center, is influenced by both the Euclidean distance and the transition probability between them. The learnt models are used for face recognition in our experiments. With K chosen appropriately, the proposed method performs very well: the average recognition rate reaches as high as 97.1%.

Eigenspace models may not fully capture the nonlinear character of face appearance manifolds. An interesting direction for future work is to develop algorithms which can learn nonlinear models of face appearance manifolds online.

6. Acknowledgements

This work was supported by the Program of New Century Excellent Talents in University, the National Natural Science Foundation of China, a joint project of the National Science Foundation of China and the Royal Society of the UK, the National Basic Research Program (Grant No. 2004CB318110), the Hi-Tech Research and Development Program of China (2006AA01Z133, 2006AA01Z193), and the Chinese Academy of Sciences.

References

[1] O. Arandjelović and R. Cipolla. Face recognition from video using the generic shape-illumination manifold. In Proc. European Conf. on Computer Vision, 3594:27-40.
[2] W. Fan and D.-Y. Yeung. Face recognition with image sets using hierarchically extracted exemplars from appearance manifolds. In Proc. 7th International Conf. on Automatic Face and Gesture Recognition.
[3] A. Franco, A. Lumini, and D. Maio. Eigenspace merging for model updating. In Proc. 16th International Conference on Pattern Recognition, volume 2.
[4] P. M. Hall, D. Marshall, and R. R. Martin. Incremental eigenanalysis for classification. In Proc. British Machine Vision Conference.
[5] P. M. Hall, D. Marshall, and R. R. Martin. Merging and splitting eigenspace models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(9).
[6] V. Krüger and S. Zhou. Exemplar-based face recognition from video. In Proc. European Conf. on Computer Vision, volume 4.
[7] K.-C. Lee, J. Ho, M.-H. Yang, and D. Kriegman. Video-based face recognition using probabilistic appearance manifolds. In Proc. CVPR.
[8] K.-C. Lee and D. Kriegman. Online learning of probabilistic appearance manifolds for video-based recognition and tracking. In Proc. CVPR, volume 1.
[9] W. Liu, Z. Li, and X. Tang. Spatio-temporal embedding for statistical face recognition from video. In Proc. European Conf. on Computer Vision, volume 3592.
[10] X. Liu and T. Chen. Video-based face recognition using adaptive hidden Markov models. In Proc. CVPR.
[11] B. Moghaddam and A. Pentland. Probabilistic visual learning for object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7).
[12] G. Shakhnarovich, J. W. Fisher, and T. Darrell. Face recognition from long-term observations. In Proc. European Conf. on Computer Vision, volume 3.
[13] W. Skarbek. Merging subspace models for face recognition. In Proc. CAIP.
[14] X. Tang and Z. Li. Frame synchronization and multi-level subspace analysis for video based face recognition. In Proc. CVPR.
[15] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. CVPR, volume 1.
[16] O. Yamaguchi, K. Fukui, and K.-i. Maeda. Face recognition using temporal image sequence. In Proc. International Conf. on Automatic Face and Gesture Recognition.
[17] W. Zhao, R. Chellappa, P. Phillips, and A. Rosenfeld. Face recognition: A literature survey. ACM Computing Surveys, 35(4).
[18] S. Zhou, V. Krueger, and R. Chellappa. Probabilistic recognition of human faces from video. Computer Vision and Image Understanding, 91.
[19] S. K. Zhou and R. Chellappa. Probabilistic identity characterization for face recognition. In Proc. CVPR, volume 2, 2004.
