Principal Components Analysis
CS 5 Lecture #9, April 4th

Announcements
- PA4 is due Monday, April 7th
- Test #2 will be Wednesday, April 9th
- Test #3 is Monday, May 8th at 8AM
  - Just one hour long
  - University schedule says 7:30
- Material for Test #3 begins today

Review: Principal Component Analysis
- PCA $\equiv$ SVD(Cov($X$)) $=$ SVD($XX^T/(n-1)$)
- PCA: $XX^T = R L R^{-1}$
  - $R$ is a rotation matrix (the Eigenvector matrix)
  - $L$ is a diagonal matrix (the diagonal values are the Eigenvalues)
- The Eigenvalues capture how much the dimensions in $X$ co-vary
- The Eigenvectors show which combinations of dimensions tend to vary together

PCA II
- The Eigenvector with the largest Eigenvalue is the direction of maximum variance
- The Eigenvector with the 2nd largest Eigenvalue is orthogonal to the 1st vector and has the next greatest variance, and so on
- The Eigenvalues describe the amount of variance along the Eigenvectors

The PCA Cookbook
Step 1: Image as Vector
- Reshape the image (pixel array $I_{i,j}$) into a single column vector $x = [x_1, \ldots, x_N]^T$, where $N$ is the number of pixels

Step 2: Normalize
- Normalize each vector:
  - Compute the mean value of the vector: $\bar{x} = \frac{1}{N}\sum_i x_i$
  - Subtract the mean value: $x \leftarrow x - \bar{x}$
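The review and the first two cookbook steps can be made concrete with a minimal NumPy sketch. It is illustrative only: the toy image, array shapes, and random data below are assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: image as vector -- reshape a pixel array into a single column vector.
image = rng.random((8, 8))          # a toy 8 x 8 "image"
x = image.reshape(-1)               # vector of N = 64 pixels

# Step 2: normalize -- subtract the vector's mean value.
x = x - x.mean()

# Review: PCA of a data matrix X (samples as columns) is the eigendecomposition
# of the covariance XX^T / (n - 1); the columns of R are the Eigenvectors.
X = rng.random((5, 200))            # 5 dimensions, 200 samples
X = X - X.mean(axis=1, keepdims=True)
cov = X @ X.T / (X.shape[1] - 1)

eigvals, R = np.linalg.eigh(cov)    # L (Eigenvalues) and R (Eigenvector matrix)
order = np.argsort(eigvals)[::-1]   # sort so the largest Eigenvalue comes first
eigvals, R = eigvals[order], R[:, order]

# The first Eigenvector is the direction of maximum variance; each Eigenvalue
# is the variance of the data projected onto the corresponding Eigenvector.
proj = R[:, 0] @ X
print(np.allclose(proj.var(ddof=1), eigvals[0]))   # True
```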
Step 3 (optional): Mean-Center the Data Set
- Form the data matrix, with images (samples) as columns: $X = [\,I_1 \; I_2 \; \cdots \; I_K\,]$
- Subtract the mean image from all columns

Step 4: Covariance of a Data Set
- $\mathrm{Cov}(X) = XX^T$
- Entry $\omega_{i,j}$ measures how pixels $i$ and $j$ co-vary across the $K$ samples:
  $\omega_{i,j} = \sum_{k=1}^{K} (x_i^{k} - \bar{x}_i)(x_j^{k} - \bar{x}_j)$

Step 5: PCA
- Let $I_1, \ldots, I_N$ be normalized images, $X = [\,I_1 \; \cdots \; I_N\,]$
- $\mathrm{Cov}(X) = XX^T$
- $\mathrm{PCA}(X) = \mathrm{SVD}(XX^T) = R^T \Lambda R$

What do we have?
- A data set of samples expressed as points
  - The origin is at the center of mass
- A set of Eigenvectors, describing the axes of maximum variance (maximum change)
  - Note that $V$ is an orthonormal basis
- A set of Eigenvalues, giving the amount of variance along each Eigenvector
  - At most min(N, P-1) non-zero Eigenvalues

Data Projection
- We can project any data sample $x_i$ into the space defined by the Eigenvectors: $y_i = V^T(x_i - m)$
  - Remember to first center this new point/image
- This is just a geometric rotation, so we can easily get $x_i$ back again: $x_i = V y_i + m$

Data Compression
- But what about all the zero Eigenvalues?
  - If P < N, at least N-(P-1) Eigenvalues are zero
  - Probably more zeroes in practice
- Let K be the number of non-zero Eigenvalues
- Since $\lambda_{K+j} = 0$, $v_{K+j} \cdot x_i = 0 \;\; \forall\, j > 0$
- So we can drop all but K Eigenvectors
- Note that the projected vector is only K elements long!
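A hedged sketch of Steps 3-5 plus the projection and compression argument, continuing the NumPy example. The sizes, variable names, and the use of an SVD instead of forming the full N x N covariance are my choices, not the slides'.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 64, 10                        # N pixels per image, P images
X = rng.random((N, P))               # data matrix: images as columns (Step 3)

m = X.mean(axis=1, keepdims=True)    # mean image
Xc = X - m                           # subtract mean image from all columns

# Steps 4-5: the Eigenvectors of Cov(X) = Xc Xc^T are the left singular
# vectors of Xc, so an SVD avoids building the N x N covariance explicitly.
V, s, _ = np.linalg.svd(Xc, full_matrices=False)
eigvals = s**2                       # Eigenvalues of Xc Xc^T

# With P < N and mean-centering, at most P-1 Eigenvalues are non-zero.
K = int(np.sum(eigvals > 1e-10 * eigvals[0]))
print("non-zero Eigenvalues:", K)    # here: P - 1 = 9

VK = V[:, :K]                        # keep only the K useful Eigenvectors

# Data projection: y = V_K^T (x - m); reconstruction: x_hat = V_K y + m.
x = X[:, 0:1]
y = VK.T @ (x - m)                   # K-element compressed representation
x_hat = VK @ y + m
print(np.allclose(x, x_hat))         # True: nothing is lost for training data
```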
Data Compression II
- So the compressed representation has:
  - ~KN values to store the Eigenvectors
  - PK values to store the compressed images
- The original data was PN pixels
- Data compression if K < PN/(P+N)
- Further compression if you drop more Eigenvectors
  - Dropping small Eigenvalues results in small errors
  - Optimal compression in the least-squares sense (Sirovich & Kirby)

Data Matching - Review
- Assume a database of P images
- (Optional) Zero-mean and unit-length each image
- Center the images to form X
- Compute V, Λ; drop zero Eigenvalues, project the data

Data Matching II
- Now introduce a new probe image y
- (Optional) Zero-mean and unit-length y
- Subtract the data set mean m from y
- Project y into the Eigenspace
- Now find the closest x_i (sketched in code at the end of this section)
  - What image x_i do you have?
  - How expensive was it to find?

Data Matching III
- When all non-zero Eigenvectors are kept, then for the training images (the bases for PCA) the Euclidean distance between images in Eigenspace is identical to the Euclidean distance in the original image space.
- To the extent that new images are like the training images, PCA matching is a cheap way to compute the Euclidean distance between many image pairs.
- And, for zero-mean, unit-length images, correlation in image space and correlation in PCA space are the same.

Data Matching IV
- In practice, the assumptions are often violated.
- If Eigenvectors associated with small but non-zero Eigenvalues are dropped, then minor dimensions are removed
  - May correspond to noise, or may not
  - In practice, generally OK: more efficient, minimal damage
- Other distance measures often outperform Euclidean distance
  - Why is a non-trivial question. Example: whitened cosine.
  - Open research topic

PCA: where are we?
- Done:
  - Mechanics (algorithms)
  - Motivation as maximizing variance
- To do:
  - Motivation as a Gaussian random process
  - Image space interpretation
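A sketch of the matching pipeline reviewed above, again on toy data. The probe construction, the rank threshold, and the brute-force nearest-neighbor search are assumptions made for illustration, not the lecture's code.

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 64, 10
X = rng.random((N, P))                       # database of P images as columns
m = X.mean(axis=1, keepdims=True)

V, s, _ = np.linalg.svd(X - m, full_matrices=False)
K = int(np.sum(s**2 > 1e-10 * s[0]**2))      # drop zero Eigenvalues
VK = V[:, :K]
Y = VK.T @ (X - m)                           # database projected into Eigenspace

# A new probe image: center it with the *database* mean, then project.
probe = X[:, 3] + 0.01 * rng.standard_normal(N)
y = VK.T @ (probe - m.ravel())

# Nearest neighbor in the K-dimensional Eigenspace.
dists = np.linalg.norm(Y - y[:, None], axis=0)
print("closest database image:", int(np.argmin(dists)))   # 3

# With all non-zero Eigenvectors kept, Eigenspace distances between training
# images equal their distances in the original image space.
d_image = np.linalg.norm(X[:, 0] - X[:, 1])
d_eigen = np.linalg.norm(Y[:, 0] - Y[:, 1])
print(np.allclose(d_image, d_eigen))         # True
```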
Multivariate Normal Random Variables
- The equation of a 1-D Gaussian:
  $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  - $\mu$ is the mean and $\sigma$ is the standard deviation
- Covariance generalizes variance to n dimensions.
- The N-dimensional Gaussian is defined as:
  $f(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}-\mu)^T \Sigma^{-1} (\mathbf{x}-\mu)}$

Multivariate Normal II
- Consider the case of a 2-D Gaussian:
  $f(x, y) = \frac{1}{2\pi\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}^{T} \begin{bmatrix} \sigma_{xx} & \sigma_{xy} \\ \sigma_{xy} & \sigma_{yy} \end{bmatrix}^{-1} \begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}\right)$
  - The 2x2 matrix $\Sigma$ is the covariance matrix

Special Case: Axis Aligned
- If $\sigma_{xy} = 0$, write $\sigma_{xx} = \sigma_x^2$ and $\sigma_{yy} = \sigma_y^2$; then
  $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\!\left(-\frac{1}{2}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right) = \left[\frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\!\left(-\frac{(x-\mu_x)^2}{2\sigma_x^2}\right)\right] \left[\frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\!\left(-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right)\right]$
  - The axis-aligned 2-D Gaussian factors into the product of two 1-D Gaussians (checked numerically in the sketch after these slides)

Probability Level Curves
- Consider the following axis-aligned Gaussian:
  $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\!\left(-\frac{1}{2}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right)$
  with $\sigma_x^2 = 4$, $\sigma_y^2 = 1$, $\mu_x = 3$, $\mu_y = 5$

Quadratic Forms
- Look at the exponent of the centered ($\mu = 0$) 2-D Gaussian; it has the form $V^T M V$:
  $f(x, y) = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = a x^2 + 2 b x y + c y^2$
- Singular value decomposition tells us that:
  $M = R \Delta R^T = \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} r_{11} & r_{21} \\ r_{12} & r_{22} \end{bmatrix}$
- $R$ rotates coordinates so $M$ is diagonal.

Quadratic Forms Rotated
- We may specify any quadratic form as being rotated from an axis-aligned equivalent:
  $f(u, v) = V^T D V$
  $f(u, v) = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} 8 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$
  $V = RX: \quad \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
  $V^T = (RX)^T: \quad \begin{bmatrix} u & v \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$
  $f(x, y) = X^T R^T D R X = X^T M X$
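A quick numerical check of the axis-aligned special case above: with $\sigma_{xy} = 0$ the 2-D density factors into two 1-D Gaussians. The helper names and the test point are arbitrary choices for this sketch.

```python
import numpy as np

def gauss1d(x, mu, var):
    # 1-D Gaussian density with mean mu and variance var.
    return np.exp(-(x - mu)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def gauss2d(p, mu, cov):
    # General 2-D Gaussian density with mean vector mu and covariance cov.
    d = p - mu
    return np.exp(-0.5 * d @ np.linalg.inv(cov) @ d) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))

mu = np.array([3.0, 5.0])
cov = np.diag([4.0, 1.0])            # sigma_x^2 = 4, sigma_y^2 = 1, sigma_xy = 0
p = np.array([2.0, 6.5])             # arbitrary test point

lhs = gauss2d(p, mu, cov)
rhs = gauss1d(p[0], mu[0], cov[0, 0]) * gauss1d(p[1], mu[1], cov[1, 1])
print(np.allclose(lhs, rhs))         # True: the axis-aligned density factors
```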
Rotated Example
$f(u, v) = 8u^2 + 2v^2$
$f(x, y) = 6.49\,x^2 - 5.19\,xy + 3.5\,y^2$
$R\!\left(\frac{\pi}{6}\right) = \begin{bmatrix} 0.865 & -0.5 \\ 0.5 & 0.865 \end{bmatrix}$
(These coefficients are verified numerically in the sketch after the Summary.)

Summary
- PCA can be applied to any set of registered images
- It extracts the dimensions of maximum covariance
  - In some sense, the structure of the domain
- Dimensions of co-variance may or may not be related to classification
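Finally, a short check of the Rotated Example: building $M = R^T D R$ with $D = \mathrm{diag}(8, 2)$ and $\theta = \pi/6$ reproduces the quoted quadratic form (the slide's 6.49 and 5.19 appear to come from rounding $\cos(\pi/6)$ to 0.865; exact values give 6.50 and 5.20). The script and variable names are mine, not from the lecture.

```python
import numpy as np

theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([8.0, 2.0])

M = R.T @ D @ R                        # f(x, y) = X^T M X
a, b, c = M[0, 0], M[0, 1], M[1, 1]
print(f"{a:.2f} x^2 + {2*b:.2f} xy + {c:.2f} y^2")   # 6.50 x^2 + -5.20 xy + 3.50 y^2

# The Eigen-decomposition of M recovers the axis-aligned coefficients (8 and 2).
print(np.linalg.eigvalsh(M))           # [2. 8.]
```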