Face Recognition. Subspace-Based Face Recognition Algorithms. Application of Face Recognition


Face Recognition
CSED441: Introduction to Computer Vision (2017)
Lecture 10: Subspace Methods and Face Recognition
Bohyung Han, CSE, POSTECH (bhhan@postech.ac.kr)

Face recognition: identify a person based on the appearance of the face.
[Sivic09] J. Sivic et al., "Who are you?" Learning Person Specific Classifiers from Video, CVPR 2009

Application of Face Recognition
- Authentication
- Visual surveillance
- Personal album organization
- Video management
- Tele-conferencing

Subspace-Based Face Recognition Algorithms
There are many face recognition algorithms based on subspace methods:
- Principal Component Analysis (PCA)
- Linear Discriminant Analysis (LDA)
- Independent Component Analysis (ICA)
- Local Non-negative Matrix Factorization (LNMF)
- Sparse representation by L1 minimization
- Many others

Why Use Subspace Methods?
Subspace methods perform dimensionality reduction: they project the original data, in its full dimension, onto a low-dimensional space. Examples: PCA, LDA, ICA, NMF, and others.
Reasons to use subspace methods:
- Data often reside in a subspace.
- The number of data points is very small compared to the number of dimensions: the curse of dimensionality.
- The data in the original dimension contain too many variations and are not sufficiently general for constructing a model.
- By projecting the data onto a subspace, we can obtain a general model of the data that captures the similar or discriminative characteristics among the data.

Eigenface: Face Recognition by PCA
- Project faces onto the subspace that maximizes the appearance variations of faces, obtained by eigen-decomposition.
- Compare two faces by projecting the images onto the subspace and measuring the Euclidean distance between them.
[Turk91] M. Turk and A. Pentland, "Face recognition using eigenfaces," CVPR 1991

Dimensionality Reduction
Key observations:
- High-dimensional images may be significantly correlated.
- It is therefore possible to reduce the dimensionality by finding the subspace that minimizes the information loss.
Examples: converting 2D data to 1D, or 3D data to 2D. (Figure: two candidate projections of the same data, with the error incurred by each projection.) The goal is to minimize the information loss when the data are projected onto a low-dimensional subspace.

Principal Components
Characteristics of principal components:
- They identify the directions of highest variance.
- The first principal component has as high a variance as possible.
- Each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components.

Principal Component Analysis
Geometrically: translate the origin, rotate the axes, and drop the axes carrying the least amount of information. (Figure: the first and second principal components, which are orthogonal.)

Principal Component Analysis: Concept
- An orthogonal transformation that converts correlated observations into uncorrelated variables.
- Dimensionality reduction: minimizing the loss of information is equivalent to maximizing the variance of the data.
- Provides an optimal solution if the samples come from a Gaussian distribution.
Computation of PCA:
- Eigenvalue decomposition of the data covariance matrix, or
- Singular value decomposition of the data matrix.

How to Compute PCA
Given a dataset $x_i \in \mathbb{R}^m$ ($i = 1, \dots, N$), compute the mean ($m \times 1$ vector) and covariance ($m \times m$ matrix):
$$\bar{x} = \frac{1}{N}\sum_{i} x_i, \qquad \Sigma_x = \frac{1}{N}\sum_{i} (x_i - \bar{x})(x_i - \bar{x})^\top.$$
For a projection $y_i = W^\top (x_i - \bar{x})$ with an $m \times k$ matrix $W$, the $k \times k$ covariance of the projected data is
$$\Sigma_y = \frac{1}{N}\sum_{i} (y_i - \bar{y})(y_i - \bar{y})^\top = W^\top \Sigma_x W.$$
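These formulas translate directly into a few lines of NumPy. The sketch below is mine, not the lecture's; the array names (`X`, `W`, `Sigma_x`) are illustrative, and it simply verifies numerically that the covariance of the projected data equals $W^\top \Sigma_x W$ for any projection matrix with orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, k = 200, 10, 3                      # N samples of dimension m, projected to k dims
X = rng.normal(size=(N, m))               # rows are the data points x_i

x_bar = X.mean(axis=0)                    # mean vector, shape (m,)
Xc = X - x_bar                            # centered data
Sigma_x = Xc.T @ Xc / N                   # m x m covariance matrix

W = np.linalg.qr(rng.normal(size=(m, k)))[0]   # any m x k matrix with orthonormal columns
Y = Xc @ W                                     # projected data, y_i = W^T (x_i - x_bar)
Sigma_y = (Y - Y.mean(axis=0)).T @ (Y - Y.mean(axis=0)) / N

# The k x k covariance of the projected data equals W^T Sigma_x W.
assert np.allclose(Sigma_y, W.T @ Sigma_x @ W)
```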

How to Compute PCA: Objective Function
Write the eigen-decomposition of the covariance as $\Sigma_x = U \Lambda U^\top$, where $\Lambda$ is the diagonal matrix of eigenvalues (sorted in decreasing order) and $U = [u_1 \cdots u_m]$ is the orthonormal matrix whose columns are the corresponding eigenvectors.
For a single direction $w$ with $\|w\| = 1$, maximize the variance of the projection:
$$\max_{w} \; w^\top \Sigma_x w = \max_{w} \; w^\top U \Lambda U^\top w = \max_{v} \; v^\top \Lambda v, \qquad v = U^\top w.$$
The maximum is attained at $v = (1, 0, \dots, 0)^\top$, i.e., $w = u_1$, the eigenvector with the largest eigenvalue.
For a $k$-dimensional subspace spanned by the columns of $W$ (with $W^\top W = I$), maximize the total projected variance:
$$\max_{W} \; \operatorname{tr}(W^\top \Sigma_x W) = \max_{W} \; \operatorname{tr}(W^\top U \Lambda U^\top W) = \max_{V} \; \operatorname{tr}(V^\top \Lambda V), \qquad V = U^\top W.$$
The solution is $W = U_{1:k}$, the $k$ eigenvectors corresponding to the $k$ largest eigenvalues of $\Sigma_x$, if the subspace is $k$-dimensional.

Error in PCA
Partition $\Lambda$ into its leading $k \times k$ block $\Lambda_{1:k}$ and trailing $(m-k) \times (m-k)$ block $\Lambda_{k+1:m}$. For a general $V = U^\top W$ split into its first $k$ rows $V_{1:k}$ and remaining rows $V_{k+1:m}$,
$$V^\top \Lambda V = V_{1:k}^\top \Lambda_{1:k} V_{1:k} + V_{k+1:m}^\top \Lambda_{k+1:m} V_{k+1:m}.$$
For the optimal choice $W = U_{1:k}$, the total variance $\operatorname{tr}(\Lambda)$ splits into $\operatorname{tr}(\Lambda_{1:k})$, the information contained in the subspace, and $\operatorname{tr}(\Lambda_{k+1:m})$, the information lost by projecting onto the subspace.

Projection to Subspace
The projection onto the subspace obtained by PCA is a simple matrix-vector computation, during which some information is lost:
$$y_i = U_{1:k}^\top (x_i - \bar{x}),$$
where $x_i$ is an $m \times 1$ vector in the original space, $U_{1:k}$ is the $m \times k$ matrix obtained by PCA, and $y_i$ is a $k \times 1$ vector in the subspace (a scalar in the 1-D example of the figure).
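A minimal sketch of the solution above, under the assumption that the data fit in memory: take the $k$ eigenvectors of $\Sigma_x$ with the largest eigenvalues, and read off the retained and lost eigenvalue mass described in the Error in PCA slide. The function name `pca_basis` and the toy data are mine, not the lecture's.

```python
import numpy as np

def pca_basis(X, k):
    """Return (mean, U_k, eigenvalues) for the k leading principal components.

    X: (N, m) data matrix with one sample per row.
    """
    x_bar = X.mean(axis=0)
    Sigma_x = np.cov(X, rowvar=False, bias=True)   # m x m covariance
    evals, evecs = np.linalg.eigh(Sigma_x)         # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]                # reorder to decreasing eigenvalues
    evals, evecs = evals[order], evecs[:, order]
    return x_bar, evecs[:, :k], evals

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 8))   # correlated toy data
x_bar, U_k, evals = pca_basis(X, k=3)

retained = evals[:3].sum()     # information contained in the subspace
lost = evals[3:].sum()         # information lost by the projection
print(f"retained {retained / evals.sum():.1%} of the total variance")
```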

Back-Projection to the Original Space
Back-projection from the subspace to the original space is also a simple algebraic computation:
$$\hat{x}_i = U_{1:k}\, y_i + \bar{x},$$
where $y_i$ is the $k \times 1$ vector in the subspace, $U_{1:k}$ is the $m \times k$ matrix obtained by PCA, and $\hat{x}_i$ is an $m \times 1$ vector in the original space. The reconstructed data differ from the original data because some information was lost during the projection; the reconstruction error is $\|x_i - \hat{x}_i\|$.

PCA with Face Images
- Vectorize each face image into an $m \times 1$ vector $x_i$.
- Compute the mean and covariance matrix:
$$\bar{x} = \frac{1}{N}\sum_{i} x_i, \qquad \Sigma_x = \frac{1}{N}\sum_{i} (x_i - \bar{x})(x_i - \bar{x})^\top.$$
- Perform eigen-decomposition of the covariance matrix and choose the first $k$ eigenvectors:
$$\Sigma_x = U \Lambda U^\top \approx U_{1:k}\, \Lambda_{1:k,1:k}\, U_{1:k}^\top, \qquad U_{1:k} = [u_1 \cdots u_k].$$
- Projection to the subspace:
$$y_i = U_{1:k}^\top (x_i - \bar{x}) = \big(u_1^\top(x_i - \bar{x}),\, u_2^\top(x_i - \bar{x}),\, \dots,\, u_k^\top(x_i - \bar{x})\big)^\top = (c_1, c_2, \dots, c_k)^\top.$$
- Back-projection (reconstruction):
$$\hat{x}_i = U_{1:k}\, y_i + \bar{x} = c_1 u_1 + c_2 u_2 + \dots + c_k u_k + \bar{x} \approx x_i.$$
Each face is thus represented by its coefficients $c_1, \dots, c_k$ in the eigenface basis.
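The projection and back-projection steps fit in two one-line functions. This is a sketch with made-up data; `U_k` stands in for the $m \times k$ PCA basis and `x_bar` for the mean face, and a real use would take them from an actual PCA fit.

```python
import numpy as np

def project(x, x_bar, U_k):
    """y = U_k^T (x - x_bar): map an m-vector to its k subspace coefficients."""
    return U_k.T @ (x - x_bar)

def back_project(y, x_bar, U_k):
    """x_hat = U_k y + x_bar: map subspace coefficients back to the original space."""
    return U_k @ y + x_bar

rng = np.random.default_rng(1)
m, k = 64, 8
U_k = np.linalg.qr(rng.normal(size=(m, k)))[0]   # stand-in for a real PCA basis
x_bar = rng.normal(size=m)                        # stand-in for the mean face
x = rng.normal(size=m)                            # one "image" as an m-vector

y = project(x, x_bar, U_k)
x_hat = back_project(y, x_bar, U_k)
print("reconstruction error:", np.linalg.norm(x - x_hat))   # nonzero: information was lost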

Eigenfaces
(Figure: the leading eigenfaces, visualized as images.)

Reconstruction of Faces
(Figure: faces reconstructed from 4, 200, and 400 principal components; more components give more faithful reconstructions.)

Issues in PCA Computation
The covariance matrix is too large: $\Sigma_x = \frac{1}{N}\sum_{i}(x_i - \bar{x})(x_i - \bar{x})^\top$ is an $m \times m$ matrix, which is hardly manageable since the number of pixels $m$ is large. How, then, can we obtain the eigenvectors in $\Sigma_x = U \Lambda U^\top$ (eigenvector matrix $U$, diagonal eigenvalue matrix $\Lambda$)? Use the SVD instead of eigen-decomposition.

Singular Value Decomposition (SVD)
The SVD is more efficient when the dimensionality of the data is very high:
$$A = U S V^\top,$$
where $A$ is the $m \times n$ data matrix, $U$ is the $m \times m$ eigenvector (left singular vector) matrix, $S$ is the $m \times n$ matrix of singular values, and $V$ is an $n \times n$ orthonormal matrix.
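A sketch of the SVD route, assuming NumPy and a rows-as-samples data layout (the transpose of the slide's $m \times n$ convention): the economy SVD of the centered data matrix yields the eigenfaces without ever forming the $m \times m$ covariance. The function name and toy data are illustrative, not from the lecture.

```python
import numpy as np

def eigenfaces_via_svd(X, k):
    """Leading k covariance eigenvectors without forming the m x m covariance.

    X: (N, m) matrix of flattened face images (N images, m pixels each).
    Since m >> N for face images, the economy SVD of the centered data is far
    cheaper than eigen-decomposing the m x m covariance matrix directly.
    """
    x_bar = X.mean(axis=0)
    Xc = X - x_bar
    # Xc = U S Vt: the rows of Vt are the eigenvectors of Xc^T Xc = N * covariance,
    # and S**2 / N are the corresponding covariance eigenvalues.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return x_bar, Vt[:k].T, (S ** 2) / len(X)

rng = np.random.default_rng(0)
faces = rng.random(size=(40, 32 * 32))              # 40 toy "images" of 32x32 pixels
x_bar, eigenfaces, evals = eigenfaces_via_svd(faces, k=10)
print(eigenfaces.shape)                              # (1024, 10): one eigenface per column
```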

Characteristics of PCA
Pros:
- Non-iterative, globally optimal solution.
Cons:
- The PCA projection is optimal for reconstruction from a low-dimensional basis, but may NOT be optimal for discrimination.
- PCA assumes that the data follow a Gaussian distribution. What if the data distribution is not Gaussian?

Representation of Faces
Given a set of faces $x_i \in \mathbb{R}^m$ ($i = 1, \dots, N$), each face is represented by its projection onto the subspace: $y_i = U_{1:k}^\top (x_i - \bar{x})$. (Figure: faces plotted by their leading PCA coefficients.)

Training for Face Recognition by PCA (see the sketch below)
1. Align the training images $x_i \in \mathbb{R}^m$ ($i = 1, \dots, N$).
2. Compute the average face: $\bar{x} = \frac{1}{N}\sum_{i} x_i$.
3. Compute the covariance matrix: $\Sigma_x = \frac{1}{N}\sum_{i} (x_i - \bar{x})(x_i - \bar{x})^\top$.
4. Compute the eigenvectors of the covariance matrix (a low-rank reconstruction): $\Sigma_x = U\Lambda U^\top \approx U_{1:k}\,\Lambda_{1:k,1:k}\,U_{1:k}^\top$, with $U_{1:k} = [u_1 \cdots u_k]$ the eigenfaces.
5. Put all faces on the subspace, i.e., compute the training projections $y_i = U_{1:k}^\top (x_i - \bar{x})$.
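The training steps above can be tied together in one small function. This is a sketch of the pipeline as I read it, not the lecture's code; it reuses the SVD trick from the previous sketch, and every name (`train_eigenface_recognizer`, `faces`, `labels`) is illustrative.

```python
import numpy as np

def train_eigenface_recognizer(faces, labels, k):
    """Mean face, eigenfaces, and the subspace coefficients of every training face.

    faces: (N, m) aligned, flattened face images; labels: (N,) identity labels.
    """
    x_bar = faces.mean(axis=0)                           # step 2: average face
    Xc = faces - x_bar
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)    # steps 3-4 via the SVD trick
    U_k = Vt[:k].T                                       # (m, k) eigenfaces
    Y_train = Xc @ U_k                                   # step 5: training projections
    return {"mean": x_bar, "basis": U_k,
            "coeffs": Y_train, "labels": np.asarray(labels)}
```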

Recognition by Nearest Neighbor Classifier
- Compare the query projection with all training examples $y_1, y_2, \dots, y_N$ and assign the query image to the nearest neighbor.
- No training required: all we need is a distance function for our inputs.
- Note: you may use a different classifier.

Recognition by k-Nearest Neighbor Classifier
- A simple extension of the nearest neighbor method: classification based on the majority label among the k nearest neighbors. How to determine k?
- (Figure: the boundary for 7 nearest neighbors; with 5 of the neighbors in Class 1 and 2 in Class 2, the testing face is classified as Class 1.)
- A classifier sketch follows after the LDA introduction below.

Fisherface: Linear Discriminant Analysis (LDA)
- Also known as Fisher's Linear Discriminant (FLD).
- Find the subspace that maximizes the between-class scatter while minimizing the within-class scatter.
- PCA, in contrast, finds the subspace that maximizes the overall scatter of the training images.
- Reference: Peter N. Belhumeur, João P. Hespanha, David J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Trans. Pattern Anal. Mach. Intell. 19(7): 711-720, 1997.

Linear Discriminant Analysis (LDA)
- What is a good projection for classification? (Figure: two candidate 1-D projections of two-class data; one separates the classes well, the other does not.)
- Do not confuse this with Latent Dirichlet Allocation (LDA).
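Here is the classifier sketch referred to above: a k-nearest-neighbor vote over the PCA coefficients produced by the hypothetical `train_eigenface_recognizer` from the earlier sketch. As the slide notes, any other classifier could be substituted.

```python
import numpy as np
from collections import Counter

def knn_classify(y_query, Y_train, labels, k_neighbors=1):
    """Majority vote among the k_neighbors training faces closest to y_query.

    y_query: (k,) projected query face; Y_train: (N, k) training projections.
    With k_neighbors=1 this is the plain nearest neighbor classifier.
    """
    dists = np.linalg.norm(Y_train - y_query, axis=1)   # Euclidean distances in the subspace
    nearest = np.argsort(dists)[:k_neighbors]           # indices of the closest training faces
    votes = Counter(labels[i] for i in nearest)         # majority label among the neighbors
    return votes.most_common(1)[0][0]

# Hypothetical usage with the training sketch above:
# model = train_eigenface_recognizer(train_faces, train_labels, k=50)
# y = model["basis"].T @ (query_face - model["mean"])
# identity = knn_classify(y, model["coeffs"], model["labels"], k_neighbors=7)
```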

PCA vs. LDA
- PCA maximizes the variance of the data; LDA maximizes the discriminativeness of the data.
- LDA may therefore be a better choice for classification than PCA. (Figure: the first PCA component vs. the first LDA component on the same two-class data.)

Scatter
Variables:
- Data: $x_i \in \mathbb{R}^m$ ($i = 1, \dots, N$)
- Classes: $\omega_1, \omega_2, \dots, \omega_C$
Scatters (with class means $\bar{x}_{\omega_c} = \frac{1}{N_c}\sum_{x_i \in \omega_c} x_i$ and overall mean $\bar{x} = \frac{1}{N}\sum_{i} x_i$):
- Within-class scatter: $S_{\omega_c} = \sum_{x_i \in \omega_c} (x_i - \bar{x}_{\omega_c})(x_i - \bar{x}_{\omega_c})^\top$ and $S_W = \sum_{c=1}^{C} S_{\omega_c}$.
- Between-class scatter: $S_B = \sum_{c=1}^{C} N_c\, (\bar{x}_{\omega_c} - \bar{x})(\bar{x}_{\omega_c} - \bar{x})^\top$.
- Total scatter: $S_T = S_W + S_B$.

Illustration (2-class problem)
- Between-class scatter: $S_B = \sum_{c=1}^{2} N_c\, (\bar{x}_{\omega_c} - \bar{x})(\bar{x}_{\omega_c} - \bar{x})^\top$.
- Within-class scatter: $S_W = S_{\omega_1} + S_{\omega_2}$.

Optimization
Find the subspace that maximizes the between-class scatter while minimizing the within-class scatter. Under the subspace projection $y_i = W^\top x_i$, the projected scatters are $\tilde{S}_B = W^\top S_B W$ and $\tilde{S}_W = W^\top S_W W$, and the mathematical formulation is
$$W^* = \arg\max_{W} \frac{|W^\top S_B W|}{|W^\top S_W W|}.$$
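A sketch of the scatter computation, using the unnormalized scatter definitions above (a constant scaling does not change the resulting eigenvectors); the function name and variables are mine, not the lecture's.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Within-class scatter S_W and between-class scatter S_B.

    X: (N, m) data matrix, one sample per row; labels: (N,) class labels.
    """
    x_bar = X.mean(axis=0)
    m = X.shape[1]
    S_W = np.zeros((m, m))
    S_B = np.zeros((m, m))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        d = Xc - mu_c
        S_W += d.T @ d                              # scatter of class c around its own mean
        diff = (mu_c - x_bar).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)            # class size times mean-to-mean scatter
    return S_W, S_B
```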

Solving the Objective Function
By generalized eigenvectors:
$$S_B\, w_i = \lambda_i\, S_W\, w_i \;\;\Longleftrightarrow\;\; S_W^{-1} S_B\, w_i = \lambda_i\, w_i, \qquad i = 1, \dots, k.$$
Find the eigenvectors of $S_W^{-1} S_B$ corresponding to the largest eigenvalues. (A sketch of this step follows after the results below.)

Face Recognition by Fisherface
Training:
- Training data are given.
- Compute the within-class scatters from the training data of each class.
- Compute the between-class scatter from the entire training data.
- Find the optimal projection $W_{1:k} = [w_1 \cdots w_k]$ by maximizing $\frac{|W^\top S_B W|}{|W^\top S_W W|}$.
Testing:
- Take a query image $x \in \mathbb{R}^m$.
- Project $x$ onto the subspace: $y = W_{1:k}^\top (x - \bar{x})$.
- Compare $y$ with all training examples $y_1, y_2, \dots, y_N$ and classify the query image by nearest neighbors or another classifier.

Eigenfaces vs. Fisherfaces
- Data: face images with variations in lighting and facial expressions.
- Results: (Figure: recognition results of Eigenfaces vs. Fisherfaces on this data.)
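The sketch promised above: solving the generalized eigenproblem with SciPy's symmetric solver (assumed available). Since $S_W$ is typically singular for face images ($m \gg N$), practical Fisherface implementations first reduce the data by PCA; here a small ridge term stands in for that step, which is my simplification rather than the lecture's recipe.

```python
import numpy as np
from scipy.linalg import eigh

def fisherface_basis(S_B, S_W, k):
    """Top-k generalized eigenvectors of S_B w = lambda S_W w.

    Returns an (m, k) projection matrix whose columns maximize the Fisher ratio.
    """
    S_W_reg = S_W + 1e-6 * np.eye(S_W.shape[0])   # ridge to keep S_W positive definite
    evals, evecs = eigh(S_B, S_W_reg)             # generalized symmetric eigenproblem
    order = np.argsort(evals)[::-1]               # largest generalized eigenvalues first
    return evecs[:, order[:k]]

# Hypothetical usage with the scatter sketch above (often applied to PCA-reduced data):
# S_W, S_B = scatter_matrices(Y_train, train_labels)
# W = fisherface_basis(S_B, S_W, k=len(np.unique(train_labels)) - 1)
```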

Challenges in Face Recognition
A lot of potential variations:
- Poses
- Lighting conditions
- Viewpoint
- Facial expressions
- Image registration issues
- Handling many people

Performance of Recent Algorithms
Performance of recent algorithms is reported on the Labeled Faces in the Wild (LFW) benchmark: http://vis-www.cs.umass.edu/lfw/