
Eigenvectors and SVD

Definition
An eigenvector of a square matrix A satisfies Ax = λx, with x ≠ 0.
Intuition: x is unchanged by A (except for scaling).
Examples: the axis of a rotation, the stationary distribution of a Markov chain.
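A minimal Matlab sketch of the definition (the 2-by-2 matrix here is only an illustrative choice):

% Verify A*x = lambda*x for each eigenpair of a small example matrix
A = [2 1; 1 2];                   % assumed example matrix
[X, L] = eig(A);                  % columns of X are eigenvectors, diag(L) the eigenvalues
for i = 1:size(A, 1)
    fprintf('eigenpair %d: residual = %g\n', i, norm(A*X(:,i) - L(i,i)*X(:,i)));
end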

Diagonalization
Stack up the eigenvector equations A x_i = λ_i x_i to get AX = XΛ, where
X = [x_1 x_2 ... x_n] ∈ R^{n×n} and Λ = diag(λ_1, ..., λ_n).
If the eigenvectors are linearly independent, X is invertible, so A = X Λ X^{-1}.

Eigenvectors of a symmetric matrix
All eigenvalues are real (not complex).
Eigenvectors are orthonormal: u_i^T u_j = 0 if i ≠ j, and u_i^T u_i = 1.
So U is an orthogonal matrix: U^T U = U U^T = I.

Diagonalizing a symmetric matrix
We have
A = U Λ U^T = λ_1 u_1 u_1^T + ... + λ_n u_n u_n^T.
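A brief Matlab check of this outer-product expansion, using an arbitrary symmetric example matrix:

% Rebuild a symmetric matrix from its eigenvalues and eigenvectors
A = [3 1 0; 1 2 1; 0 1 3];        % arbitrary symmetric example
[U, L] = eig(A);
Ahat = zeros(size(A));
for i = 1:size(A, 1)
    Ahat = Ahat + L(i,i) * U(:,i) * U(:,i)';   % lambda_i * u_i * u_i^T
end
disp(norm(A - Ahat));             % should be essentially zero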

Transformation by an orthogonal matrix
Consider a vector x transformed by the orthogonal matrix U to give x̃ = Ux.
The length of the vector is preserved, since
||x̃||^2 = x̃^T x̃ = x^T U^T U x = x^T x = ||x||^2.
The angle between vectors is preserved:
x̃^T ỹ = x^T U^T U y = x^T y.
Thus multiplication by U can be interpreted as a rigid rotation of the coordinate system.

Geometry of diagonalization
Let A be a symmetric linear transformation. We can always decompose it into a rotation U, a scaling Λ, and a reverse rotation U^T = U^{-1}. Hence A = U Λ U^T.
The inverse mapping is given by A^{-1} = U Λ^{-1} U^T, so
A = λ_1 u_1 u_1^T + ... + λ_m u_m u_m^T,
A^{-1} = (1/λ_1) u_1 u_1^T + ... + (1/λ_m) u_m u_m^T.

Matlab example
Given
A = [1.5 -0.5 0; -0.5 1.5 0; 0 0 3]
diagonalize it:
[U,D] = eig(A)
U =
   -0.7071   -0.7071         0
   -0.7071    0.7071         0
         0         0    1.0000
D =
     1     0     0
     0     2     0
     0     0     3
Here U is (up to sign) a rotation by 45 degrees and D scales the axes by (1, 2, 3).
Check:
>> U*D*U'
ans =
    1.5000   -0.5000         0
   -0.5000    1.5000         0
         0         0    3.0000

Positive definite matrices
A matrix A is positive definite (pd) if x^T A x > 0 for any non-zero vector x.
Hence all the eigenvalues of a pd matrix are positive:
A u_i = λ_i u_i  =>  u_i^T A u_i = λ_i u_i^T u_i = λ_i > 0.
A matrix is positive semi-definite (psd) if λ_i >= 0.
A matrix of all positive entries is not necessarily pd; conversely, a pd matrix can have negative entries:
>> [u,v] = eig([1 2; 3 4])
u =
   -0.8246   -0.4160
    0.5658   -0.9094
v =
   -0.3723         0
         0    5.3723
>> [u,v] = eig([2 -1; -1 2])
u =
   -0.7071   -0.7071
   -0.7071    0.7071
v =
     1     0
     0     3

Multivariate Gaussian
The multivariate normal (MVN) density is
N(x | µ, Σ) = (2π)^{-p/2} |Σ|^{-1/2} exp[ -1/2 (x - µ)^T Σ^{-1} (x - µ) ].
The exponent involves the Mahalanobis distance between x and µ:
Δ = (x - µ)^T Σ^{-1} (x - µ).
Σ is the covariance matrix, which is symmetric positive definite: x^T Σ x > 0 for all x ≠ 0.
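As a small Matlab sanity check (the mean and covariance below are made-up values, not from the slides), the density formula can be evaluated directly and compared with mvnpdf:

% Evaluate the MVN density by the formula and compare with mvnpdf
mu    = [1; 2];                   % assumed example mean
Sigma = [2 0.5; 0.5 1];           % assumed example covariance (symmetric pd)
x     = [0; 0];
p     = length(mu);
d     = x - mu;
maha  = d' * (Sigma \ d);         % Mahalanobis distance
pManual  = exp(-0.5 * maha) / ((2*pi)^(p/2) * sqrt(det(Sigma)));
pBuiltin = mvnpdf(x', mu', Sigma);
fprintf('manual = %g, mvnpdf = %g\n', pManual, pBuiltin);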

Bivariate Gaussian
The covariance matrix is
Σ = [σ_x^2, ρ σ_x σ_y; ρ σ_x σ_y, σ_y^2],
where the correlation coefficient
ρ = Cov(X,Y) / sqrt(Var(X) Var(Y))
satisfies -1 <= ρ <= 1.
The density (for zero mean) is
p(x,y) = 1 / (2π σ_x σ_y sqrt(1 - ρ^2)) exp( -1/(2(1 - ρ^2)) [ x^2/σ_x^2 + y^2/σ_y^2 - 2ρxy/(σ_x σ_y) ] ).

Spherical, diagonal, full covariance
Spherical: Σ = [σ^2, 0; 0, σ^2]
Diagonal: Σ = [σ_x^2, 0; 0, σ_y^2]
Full: Σ = [σ_x^2, ρ σ_x σ_y; ρ σ_x σ_y, σ_y^2]

Surface plots (figure)

Matlab plotting code
stepsize = 0.5;
[x,y] = meshgrid(-5:stepsize:5, -5:stepsize:5);
[r,c] = size(x);
data = [x(:) y(:)];
p = mvnpdf(data, mu, S);
p = reshape(p, r, c);
surfc(x,y,p);    % 3D surface plot
contour(x,y,p);  % plot contours
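The snippet assumes mu and S are already defined; a hypothetical setting for them (not from the slides) would be:

% Example (hypothetical) parameters for the plotting snippet above
mu = [0 0];          % mean, as a row vector (the form mvnpdf expects)
S  = [2 1; 1 2];     % full covariance with positive correlation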

Visualizing a covariance matrix
Let Σ = U Λ U^T. Hence
Σ^{-1} = (U Λ U^T)^{-1} = U Λ^{-1} U^T = (1/λ_1) u_1 u_1^T + ... + (1/λ_p) u_p u_p^T.
Let y = U^T (x - µ) be a transformed coordinate system, translated by µ and rotated by U (so y_i = u_i^T (x - µ)). Then
(x - µ)^T Σ^{-1} (x - µ) = (1/λ_1) (x - µ)^T u_1 u_1^T (x - µ) + ... + (1/λ_p) (x - µ)^T u_p u_p^T (x - µ)
  = y_1^2/λ_1 + ... + y_p^2/λ_p.

Visualizing a covariance matrix
From the previous slide,
(x - µ)^T Σ^{-1} (x - µ) = y_1^2/λ_1 + ... + y_p^2/λ_p.
Recall that the equation of an ellipse in 2D is
y_1^2/λ_1 + y_2^2/λ_2 = 1.
Hence the contours of equal probability are elliptical, with axes given by the eigenvectors and scales given by the eigenvalues of Σ.

Visualizing a covariance matrix
Let X ~ N(0, I) be points on a 2D circle. If
Y = U Λ^{1/2} X,   where Λ^{1/2} = diag(sqrt(Λ_ii)),
then
Cov[Y] = U Λ^{1/2} Cov[X] Λ^{1/2} U^T = U Λ^{1/2} Λ^{1/2} U^T = Σ.
So we can map a set of points on a circle to points on an ellipse.

Implementation in Matlab
Y = U Λ^{1/2} X,   Λ^{1/2} = diag(sqrt(Λ_ii))
function h = gaussPlot2d(mu, Sigma, color)
[U, D] = eig(Sigma);
n = 100;
t = linspace(0, 2*pi, n);
xy = [cos(t); sin(t)];              % points on the unit circle
k = 6; %k = sqrt(chi2inv(0.95, 2));
w = (k * U * sqrt(D)) * xy;         % map the circle onto the ellipse
z = repmat(mu, [1 n]) + w;          % translate by the mean
h = plot(z(1, :), z(2, :), color);
axis equal

Standardizing the data
We can subtract off the mean and divide by the standard deviation of each dimension to get (for case i = 1:n and dimension j = 1:d)
y_ij = (x_ij - x̄_j) / σ_j.
Then E[Y] = 0 and Var[Y_j] = 1. However, Cov[Y] might still be elliptical due to correlation amongst the dimensions.
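A minimal Matlab sketch of this standardization, using a made-up n-by-d data matrix X (one case per row):

% Standardize each column (dimension) of an n-by-d data matrix
X    = randn(500, 3) * [2 0 0; 1 1 0; 0 0 3];     % made-up correlated data
Xbar = mean(X, 1);                                % 1-by-d column means
s    = std(X, 0, 1);                              % 1-by-d column standard deviations
Y    = (X - repmat(Xbar, size(X,1), 1)) ./ repmat(s, size(X,1), 1);
% diag(cov(Y)) is all ones, but off-diagonal entries may remain nonzero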

Whitening the data
Let X ~ N(µ, Σ) and Σ = U Λ U^T. To remove any correlation, we can apply the linear transformation
Y = Λ^{-1/2} U^T X,   where Λ^{-1/2} = diag(1/sqrt(Λ_ii)).
In Matlab (with one data point per column of X):
[U,D] = eig(cov(X'));
Y = sqrt(inv(D)) * U' * X;
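As a quick Matlab check on a made-up correlated sample (the covariance [2 1; 1 2] and the sample size are arbitrary choices), the whitened data should have approximately identity covariance:

% Whitening a correlated sample: cov of the result should be near eye(2)
X = chol([2 1; 1 2], 'lower') * randn(2, 1000);   % 2-by-n correlated data
[U, D] = eig(cov(X'));
Y = sqrt(inv(D)) * U' * X;
disp(cov(Y'));                                    % approximately the identity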

Whitening: example (figure)

Whitening: proof
Let Y = Λ^{-1/2} U^T X, with Λ^{-1/2} = diag(1/sqrt(Λ_ii)).
Using Cov[AX] = A Cov[X] A^T, we have
Cov[Y] = Λ^{-1/2} U^T Σ U Λ^{-1/2} = Λ^{-1/2} U^T (U Λ U^T) U Λ^{-1/2} = Λ^{-1/2} Λ Λ^{-1/2} = I.
Note that the mean is not removed: E[Y] = Λ^{-1/2} U^T E[X].

Singular Value Decomposition
A = U Σ V^T = σ_1 u_1 v_1^T + ... + σ_r u_r v_r^T,
where r = rank(A) and σ_1, ..., σ_r are the singular values (the diagonal entries of Σ).

Right singular vectors are eigenvectors of A^T A
For any matrix A,
A^T A = V Σ^T U^T U Σ V^T = V (Σ^T Σ) V^T,
so
(A^T A) V = V (Σ^T Σ) = V D.

Left singular vectors are eigenvectors of A A^T
A A^T = U Σ V^T V Σ^T U^T = U (Σ Σ^T) U^T,
so
(A A^T) U = U (Σ Σ^T) = U D.
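A short Matlab check of both facts on an arbitrary random matrix; eigenvectors are only determined up to sign and ordering, so the comparison below is between eigenvalues and squared singular values:

% Eigenvalues of A'*A and A*A' equal the squared singular values of A
A = randn(4, 3);                          % arbitrary example matrix
s = svd(A);                               % singular values, in descending order
evalsAtA = sort(eig(A'*A), 'descend');    % eigenvalues of A^T A
evalsAAt = sort(eig(A*A'), 'descend');    % eigenvalues of A A^T (one extra, ~0)
disp([s.^2, evalsAtA, evalsAAt(1:3)]);    % the three columns should agree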

Truncated SVD
A ≈ U_{:,1:k} Σ_{1:k,1:k} V_{:,1:k}^T = σ_1 u_1 v_1^T + ... + σ_k u_k v_k^T.
Keeping only the k largest terms of the spectrum of singular values gives a rank-k approximation to the matrix.

SVD on images
Run the demo:
load clown
[U,S,V] = svd(X, 0);
ranks = [1 2 5 10 20 rank(X)];
for k = ranks
    Xhat = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';
    image(Xhat);
end

Clown example (figure: reconstructions with ranks 1, 2, 5, 10, 20, and 200)

Space savings
A ≈ U_{:,1:k} Σ_{1:k,1:k} V_{:,1:k}^T
Storing A directly costs m·n numbers; storing the rank-k factors costs (m·k) + k + (n·k) = (m + n + 1)k numbers.
For the clown image: 200 × 320 = 64,000 versus (200 + 320 + 1) · 20 = 10,420.
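The same arithmetic as a tiny Matlab sketch (m, n, and k taken from the clown example above):

% Storage cost of the full matrix vs. its rank-k SVD factors
m = 200; n = 320; k = 20;
fullCost = m * n;                % 64,000 numbers
svdCost  = (m + n + 1) * k;      % 10,420 numbers
fprintf('full: %d, rank-%d SVD: %d, compression ratio: %.2f\n', fullCost, k, svdCost, fullCost/svdCost);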