EE6B Designing Information Devices and Systems II Lecture 9B Geometry of SVD, PCA Uniqueness of the SVD Find SVD of A 0 A 0 AA T 0 ) ) 0 0 ~u ~u 0 ~u ~u ~u ~u Uniqueness of the SVD Find SVD of A 0 A 0 AA T 0 ) ) 0 ~u ~u ~u ~u ~v A T ~u ~v Uniqueness of the SVD Find SVD of A 0 A 0 ~u ~u ~v A ~u ~v T + ~u ~v T cos sin 0 0 + ) ~v sin cos
Accuracy with Finite Precision Consider matrix A R 5x56 with the following singular values: 0 0-5 Accuracy with Finite Precision Consider matrix A R 5x56 with the following singular values: 0 0-5 - - -5-5 -5-5 -0 0 50 0 50 00 50 300-0 0 50 0 50 00 50 300 - - Precision -4-4 error A T A single - - Precision -4-4 error A T A single -7-6 -8 LAPACK single -7-6 -8 LAPACK single - - - - -5-4 -6 LAPACK double -5-4 -6 LAPACK double -8-8 -8-8 -0 0 50 0 50 00 50 300-0 0 50 0 50 00 50 300 Full Matrix Form of SVD U ~u r 3 4 ~u ~u 0 5 6 S 4... 7 5 V 0 r ~v r 3 4 ~v ~v 5 m r r r n r 3 3 S U 4 U U 5 0 4 5 0 0 m m m n A U V T 3 U T U I mm V T V I nn V 3 4 V V 5 n n Unitary Matrices Multiplying with unitary matrices does not change the length q U~x (U~x) T (U~x) p ~x T U T U~x p ~x T ~x ~x Example: Rotation, or reflection matrices U 0 U 0 0 0
Geometric Interpretation A~x U V T ~x A U V T ) V T ~x re-orients ~x without changing length. ) (V T ~x) Stretches along the axis with singlular values 0 x x 0 x x 3) U( V T ~x) re-orients again without changing length Geometric Interpretation U A~x A~x ~x Q: What vector would amplify the most? V T ~x Symmetric Matrices Properties of Symmetric Matrices We assumed before that, A T A has only real eigenvalues, r of them are positive and the rest are zero A T A has orthonormal eigenvectors (to be proven next time) For symmetric matrices: (AB) T B T A T (A T A) T A T A (AA T ) T AA T Q T Q ) A real-valued symmetric matrix has real eigenvalues and eigenvectors Qx x a + ib a ib Somehow we need to use the symmetric and real-ness property of Q to show that b0 Qx x T Q x x T x T Qx x T x x T Qx x T x x T x x T x ) ) R
Properties of Symmetric Matrices Qx x (Q I)x 0 So x is real as well real Properties of Symmetric Matrices ) Eigenvectors of a symmetrix matrix can be chosen to be orthonormal Choose two distinct eigenvalues and vectors 6 Qx x Qx x x T Qx x T x x T Qx x T x ( )x T x 0 Positiveness of Eigenvalues 3) If Q can be written as Q R T R for real R, then Q is positive semidefinite eigenvalues greater of equal to zero Qx x R T Rx x x T R T Rx (Rx) T (Rx) x T x x T x Rx x ) 0 Principal Component Analysis Application of the SVD to datasets to learn features PCA is a tool in statistics and machine learning, which can be computed using SVD movies viewers
Example -- PCA Consider data s.t. ~a ~a 3~a ~v ~a ~a 3~a + ~v a a a ~v ~v a Example -- PCA Consider miterm data students mid mid mid consistency up-down mobility mid SVD SVD ee6b notes spring 7 4 the test scores. Each data point corresponds to a student and those in the first quadrant (both midterms 0) are those students who scored above average in each midterm. You can see that there were students who scored below average in the first and above average in the second, and vice versa. 50 PCA Procedure Remove averages from column of A From A T A, find i, ~v i ~v i are principal components! students mid mid Midterm Example midterm mid -0-30 -40-50 Midterm mid For this data set the covariance matrix is: " 93 AT A 40 30 0 0 - Midterm -50-40 -30-0 - 0 0 30 40 50 97.69 0.53 0.53 9.07 # ~v direction 50 40 30 0 0 - -0-30 -40 ~v direction where the diagonal entries correspond to the squares of the standard deviations 7.5 and 7.09 for Midterms and, respectively. The positive sign of the EE6B (, M. ) Lustig, entry EECS implies UC Berkeley a positive correlation between the two midterm scores as one would expect. The singular values of A, obtained from the square roots of the eigenvalues of A T A, are s 5.08, s 9.66, and the corresponding eigenvectors of A T A are: " # " 0.70 ~v ~v -50-50 -40-30 -0-0 0 30 40 50 Midterm For this data set the covariance matrix is: " 93 AT A 0.70 #. 97.69 0.53 0.53 9.07 #
PCA in Genetics Reveals Geography Study: Characterized genetic variatios in 3,000 Europeans from 36 Countries Built a matrix of 00K SNPs (single nucleotide polymorphisms) Computed largest principle components Projected subjects on dimentional data A~v A~v Overlayed the result on the map of Europe Genes mirror geography within Europe Nature 456, 98- (6 November 008) PC could be associated with food PC associated with west migration
3 and me