Face Recognition Lecture-14
Face Recognition
Simple Approach: Recognize faces (mug shots) using gray levels (appearance). Each image is mapped to a long vector of gray levels. Several views of each person are collected in the database during training. During recognition, the vector corresponding to an unknown face is compared with all vectors in the database; the database face closest to the unknown face is declared the recognized face.
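A minimal sketch of this nearest-neighbour matching in MATLAB, assuming the stored views are the columns of a matrix db and the unknown face is a column vector u (both names are illustrative, not from the original code):

% db: (M*N) x K matrix, one gray-level vector per stored view (assumed layout)
% u:  (M*N) x 1 gray-level vector of the unknown face
dists = zeros(1, size(db, 2));
for k = 1:size(db, 2)
    dists(k) = norm(db(:, k) - u);   % Euclidean distance to each stored view
end
[~, best] = min(dists);              % index of the closest stored face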
Problems and Solution. Problems: the dimensionality of each face vector is very large (262,144 for a 512x512 image!), and raw gray levels are sensitive to noise and lighting conditions. Solution: reduce the dimensionality of face space by finding the principal components (eigenvectors) that span the face space. Only a few of the most significant eigenvectors are needed to represent a face, thus reducing the dimensionality.
Eigen Vectors and Eigen Values. The eigenvector x of a matrix A is a special vector with the property A x = lambda x, where lambda is called the eigenvalue. To find the eigenvalues of a matrix A, first find the roots of det(A - lambda I) = 0. Then, for each eigenvalue lambda, solve the linear system (A - lambda I) x = 0 to find the corresponding eigenvector.
Example: Eigen Values and Eigen Vectors (the example matrix, its eigenvalues, and its eigenvectors are shown as figures on the original slides).
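The matrix used in the original example is not recoverable from this copy; as an illustration, the same steps in MATLAB for an assumed 2x2 matrix:

A = [2 1; 1 2];                   % illustrative matrix, not the one from the original slide
% eigenvalues are the roots of det(A - lambda*I) = 0, here lambda = 1 and lambda = 3
[V, D] = eig(A);                  % columns of V: eigenvectors, diag(D): eigenvalues
A * V(:, 1) - D(1, 1) * V(:, 1)   % defining property A*x = lambda*x, result ~ [0; 0]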
Face Recognition. Collect all gray levels of an M x N image in a long vector u of length MN. Collect n samples (views) of each of p persons as the columns of a matrix A (MN x pn). Form the correlation matrix L = A A^T (MN x MN). Compute the eigenvectors of L, which form a basis for the whole face space.
Face Recognition. Each face u can now be represented as a linear combination of the eigenvectors: u = sum_i a_i v_i. Eigenvectors of a symmetric matrix are orthonormal: v_i^T v_j = delta_ij.
Face Recognition. Therefore the coefficients are obtained by simple projection: a_i = v_i^T u.
Face Recognition. L is a large matrix, and computing the eigenvectors of a large matrix is time consuming. Therefore compute the eigenvectors of the smaller matrix C = A^T A (pn x pn). If v_i are the eigenvectors of C, then A v_i are the eigenvectors of L: from C v_i = lambda_i v_i it follows that L (A v_i) = A A^T A v_i = lambda_i (A v_i).
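A minimal sketch of this small-matrix trick in MATLAB, assuming A holds the training images as columns (variable names are illustrative):

% A: (M*N) x (p*n) matrix whose columns are the training images (assumed)
C = A' * A;                             % small (p*n) x (p*n) matrix
[Vc, Dc] = eig(C);                      % C * v = lambda * v
Vl = A * Vc;                            % then L * (A*v) = A*A'*A*v = lambda * (A*v)
Vl = Vl * diag(1 ./ sqrt(sum(Vl.^2)));  % normalize each eigenvector of L to unit length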
Training. Create the A matrix from the training images. Compute the C matrix from A. Compute the eigenvectors of C. Compute the eigenvectors of L from the eigenvectors of C. Select the few most significant eigenvectors of L for face recognition. Compute the coefficient vector corresponding to each training image. For each person the coefficients form a cluster; compute the mean of each cluster.
Recognition Create a vector u for the image to be recognized. Compute coefficient vector for this u. Decide which person this image belongs to, based on the distance from the cluster mean for each person.
load faces.mat                           % loads A: each column is one training image
C = A' * A;                              % small (pn x pn) matrix instead of L = A*A'
[vectorC, valueC] = eig(C);
ss = diag(valueC);
[ss, iii] = sort(-ss);                   % sort eigenvalues in descending order
vectorC = vectorC(:, iii);
vectorL = A * vectorC(:, 1:5);           % 5 most significant eigenvectors of L
coeff = A' * vectorL;                    % coefficient vectors of the training images
for i = 1:30                             % 30 persons, 5 views each
    model(i, :) = mean(coeff((5*(i-1)+1):5*i, :));   % cluster mean for person i
end
while (1)
    imagename = input('Enter the index of the image to recognize (0 to stop): ');
    if (imagename < 1)
        break;
    end;
    imageco = A(:, imagename)' * vectorL;
    disp(' ');
    disp('The coefficients for this image are:');
    mess1 = sprintf('%.2f %.2f %.2f %.2f %.2f', imageco(1), imageco(2), imageco(3), imageco(4), imageco(5));
    disp(mess1);
    top = 1;
    for i = 2:30
        if (norm(model(i, :) - imageco, 1) < norm(model(top, :) - imageco, 1))
            top = i;                     % keep the person whose cluster mean is closer
        end
    end
    mess1 = sprintf('The input was an image of person number %d', top);
    disp(mess1);
end
14.2 Face Recognition (Szeliski's book)
Kirby and Sirovich (1990): a face image can be compressed as $\tilde{x} \approx m + \sum_{i=0}^{M-1} a_i u_i$. [Figure: (c) PCA reconstruction (85 bytes), (d) JPEG reconstruction (530 bytes).]
Scatter or covariance matrix: $C = \frac{1}{N}\sum_{j=0}^{N-1}(x_j - m)(x_j - m)^T$. Eigen decomposition: $C = U \Lambda U^T$. Any arbitrary vector x can be represented as $\tilde{x} = m + \sum_{i=0}^{M-1} a_i u_i$, with $a_i = u_i \cdot (x - m)$, since the eigenvectors are orthonormal: $u_i^T u_j = \delta_{ij}$. The distance of a projected face from the mean, DIFS (distance in face space): $\mathrm{DIFS} = \lVert \tilde{x} - m \rVert = \left( \sum_{i=0}^{M-1} a_i^2 \right)^{1/2}$. The distance between two faces: $\lVert \tilde{x} - \tilde{y} \rVert = \left( \sum_{i=0}^{M-1} (a_i - b_i)^2 \right)^{1/2}$.
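A sketch of these distances in MATLAB, assuming U holds the M most significant eigenvectors as columns, m is the mean face, and x, y are two face vectors (all names assumed for illustration):

% U: (M*N) x M matrix of the M most significant eigenvectors (eigenfaces), m: mean face
a = U' * (x - m);                    % coefficients a_i = u_i' * (x - m)
b = U' * (y - m);
x_tilde = m + U * a;                 % reconstruction of x in face space
difs_x_mean = sqrt(sum(a.^2));       % DIFS: distance of the projected face from the mean
difs_x_y = sqrt(sum((a - b).^2));    % DIFS between the two projected faces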
We are not utilizing the eigenvalue information; compute the Mahalanobis distance instead: $\mathrm{DIFS} = (\tilde{x} - m)^T C^{-1} (\tilde{x} - m) = \sum_{i=0}^{M-1} a_i^2 / \lambda_i$. Equivalently, pre-scale the eigenvectors by the eigenvalues, $\hat{U} = U \Lambda^{-1/2}$, so that the plain Euclidean distance in the scaled space equals the Mahalanobis distance. [Figure: Euclidean vs. Mahalanobis.]
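A sketch of the pre-scaling idea, assuming lambda is the vector of eigenvalues matching the columns of U (names assumed as above):

% lambda: M x 1 vector of eigenvalues corresponding to the columns of U (assumed)
a = U' * (x - m);
d_mahal = sum(a.^2 ./ lambda);       % Mahalanobis distance in face space
Uhat = U * diag(1 ./ sqrt(lambda));  % pre-scaled eigenvectors, Uhat = U * Lambda^(-1/2)
ahat = Uhat' * (x - m);              % ahat_i = a_i / sqrt(lambda_i)
d_check = sum(ahat.^2);              % plain Euclidean distance now equals d_mahal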
Problems in Face Recognition: within-class variation vs. between-class variation.
Images Taken under Different Illuminations. Note the wide range of illumination variation, which can be more dramatic than inter-personal variation.
LDA (Linear Discriminant Analysis): Fisher Faces. Slides credit: http://courses.cs.tamu.edu/rgutier/cs790_w02/l6.pdf
LDA: given $N_1$ samples of class 1 and $N_2$ samples of class 2, project each sample x onto a line to obtain a scalar $y = w^T x$.
Find a measure of separation. One candidate is the distance between the projected means, $|\tilde{\mu}_1 - \tilde{\mu}_2|$. By itself it is not a good measure, since it does not consider the standard deviation within each class.
Maximize a function that represents the difference between the means, normalized by a measure of the within-class scatter: Fisher's criterion $J(w) = \frac{|\tilde{\mu}_1 - \tilde{\mu}_2|^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$. How do we find the optimum?
Find the Scatter Matrices. Scatter matrices for the projection: $\tilde{s}_i^2 = w^T S_i w$, with within-class scatter $S_W = S_1 + S_2$ and between-class scatter $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$, so the criterion becomes $J(w) = \frac{w^T S_B w}{w^T S_W w}$.
Find the Optimum. Set the derivative of the criterion to zero:
$\frac{d}{dw}\left[\frac{w^T S_B w}{w^T S_W w}\right] = 0$
$\Rightarrow (w^T S_W w)(2 S_B w) - (w^T S_B w)(2 S_W w) = 0$
Dividing through by $w^T S_W w$: $S_B w - J(w)\, S_W w = 0$
$\Rightarrow S_W^{-1} S_B w = J(w)\, w$, i.e. an eigenvalue problem.
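For the two-class case this eigenvalue problem has the closed-form solution $w^* \propto S_W^{-1}(\mu_1 - \mu_2)$, since $S_B w$ always points along $\mu_1 - \mu_2$. A MATLAB sketch on made-up toy data (the numbers are illustrative, not the example from the original slides):

X1 = [4 2; 2 4; 2 3; 3 6; 4 4];      % class-1 samples, one per row (made-up data)
X2 = [9 10; 6 8; 9 5; 8 7; 10 8];    % class-2 samples (made-up data)
m1 = mean(X1)';  m2 = mean(X2)';
S1 = (size(X1, 1) - 1) * cov(X1);    % class scatter = (N - 1) * sample covariance
S2 = (size(X2, 1) - 1) * cov(X2);
Sw = S1 + S2;                        % within-class scatter
w  = Sw \ (m1 - m2);                 % optimal projection direction (up to scale)
y1 = X1 * w;  y2 = X2 * w;           % 1-D projections of the two classes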
Example (worked numerically as figures on the original slides).
[Example result: the within-class scatter $S_W$, the between-class scatter $S_B$, and the optimal projection $w^*$ were computed numerically on the original slide; the values are garbled in this copy.]
Linear Discriminant Analysis
Linear Discriminant Analysis (LDA). GOAL: perform dimensionality reduction while preserving class-discriminatory information. LDA maximizes the distance (difference) between the class means, normalized by a measure of the within-class scatter (variance): maximize the between-class (inter-class) scatter while minimizing the within-class (intra-class) scatter. Project each sample x onto a line to obtain the scalar $y = w^T x$.
LDA Explained: within-class scatter and between-class scatter.
LDA Explained. Maximizing the criterion is equivalent to a generalized eigenvalue problem, $S_B w = \lambda S_W w$. Like with PCA, the result is a set of projection vectors.
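A sketch of the multi-class case in MATLAB, assuming the scatter matrices Sb and Sw have already been built and X holds samples as columns (the names and the choice of k are assumptions):

% Sb, Sw: between- and within-class scatter matrices (d x d), assumed already built
k = 2;                                % number of discriminants kept (at most C - 1 useful)
[V, D] = eig(Sb, Sw);                 % generalized eigenproblem: Sb * v = lambda * Sw * v
[~, order] = sort(diag(D), 'descend');
W = V(:, order(1:k));                 % projection vectors, analogous to eigenfaces in PCA
Y = W' * X;                           % X: d x N matrix of samples as columns (assumed)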
PCA vs. LDA. PCA squeezes variance into as few dimensions as possible; the number of linear functions equals the number of original variables (dimensions); principal components are always orthogonal ("uncorrelated"). LDA maximizes class discrimination; the number of linear functions equals the number of classes; LDA's linear scores are not necessarily orthogonal.
Failure Case for PCA. Green: PCA; orange: LDA. What this graph illustrates is that PCA would project the data onto the green vector, where the blue and red classes would be jumbled together, whereas LDA would project onto the orange line, where the clusters remain separated.
Sources:
http://matlabdatamining.blogspot.com/2010/12/linear-discriminant-analysis-lda.html
http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf
Richard Szeliski, Computer Vision: Algorithms and Applications.