Eigenvalues, Eigenvectors, and an Intro to PCA
Changing Basis
We've talked so far about re-writing our data using a new set of variables, or a new basis. How do we choose this new basis?
PCA
One (very popular) method: start by choosing the basis vectors as the directions in which the variance of the data is maximal.
[Figure: scatter plot of weight (w) vs. height (h), with the first direction v1 drawn along the axis of maximal variance]
PCA
Then, choose subsequent directions that are orthogonal to the first and have the next largest variance.
[Figure: the same weight/height scatter plot, with the second direction v2 orthogonal to v1]
PCA
As it turns out, these directions are given by the eigenvectors of either:
- The Covariance Matrix (data is only centered)
- The Correlation Matrix (data is completely standardized)
Great. What the heck is an eigenvector?
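Before defining eigenvectors formally, here is a minimal numpy sketch of the idea; the toy two-column data matrix (standing in for weight/height) is made up purely for illustration. The eigenvectors of the covariance matrix give the directions v1 and v2 described above.

```python
import numpy as np

# Toy data: rows are observations, columns play the role of weight and height.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])

Xc = X - X.mean(axis=0)        # center the data
S = np.cov(Xc, rowvar=False)   # 2x2 covariance matrix

# eigh is designed for symmetric matrices; it returns eigenvalues in
# ascending order, so reverse to put the largest-variance direction first.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvecs[:, 0])   # v1: direction of maximal variance
print(eigvecs[:, 1])   # v2: orthogonal direction, next largest variance
scores = Xc @ eigvecs  # the data re-written in the new basis
```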
Eigenvalues and Eigenvectors
Effect of Matrix Multiplication
When we multiply a matrix by a vector, what do we get? $Ax = b$: another vector. When the matrix A is square ($n \times n$), then x and b are both vectors in $\mathbb{R}^n$. We can draw (or imagine) them in the same space.
Linear Transformation
In general, multiplying a vector by a matrix changes both its magnitude and direction.
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} -1 \\ 5 \end{pmatrix} \qquad Ax = \begin{pmatrix} 4 \\ -6 \end{pmatrix}$
[Figure: x and Ax plotted along with span(e1) and span(e2); x and Ax point in different directions]
Eigenvectors
However, a matrix may act on certain vectors by changing only their magnitude, not their direction (or possibly reversing it):
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} 1 \\ -3 \end{pmatrix} \qquad Ax = \begin{pmatrix} -2 \\ 6 \end{pmatrix}$
[Figure: x and Ax lie on the same line through the origin, pointing in opposite directions]
The matrix A acts like a scalar!
Eigenvalues and Eigenvectors
For a square matrix A, a non-zero vector x is called an eigenvector if multiplication by A results in a scalar multiple of x:
$Ax = \lambda x$
The scalar λ is called the eigenvalue associated with the eigenvector.
Previous Example
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} 1 \\ -3 \end{pmatrix}$
$Ax = \begin{pmatrix} -2 \\ 6 \end{pmatrix} = -2 \begin{pmatrix} 1 \\ -3 \end{pmatrix} = -2x \;\Longrightarrow\; \lambda = -2$ is the eigenvalue.
[Figure: x and Ax on the line span{x}]
Example 2
Show that x is an eigenvector of A and find the corresponding eigenvalue.
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
$Ax = \begin{pmatrix} 3 \\ 6 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 3x \;\Longrightarrow\; \lambda = 3$ is the eigenvalue corresponding to the eigenvector x.
Example 2
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \qquad Ax = 3x$
[Figure: x and Ax lie on the same line through the origin, pointing in the same direction]
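These examples are easy to verify numerically. A quick numpy check, with the matrix and vectors taken from the slides above (the first vector is the generic, non-eigen vector from the Linear Transformation slide):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [6.0, 0.0]])

x = np.array([-1.0, 5.0])    # generic vector: direction changes
print(A @ x)                 # [ 4. -6.] -- not a multiple of x

x1 = np.array([1.0, -3.0])   # eigenvector with eigenvalue -2
print(A @ x1)                # [-2.  6.] == -2 * x1

x2 = np.array([1.0, 2.0])    # eigenvector with eigenvalue 3
print(A @ x2)                # [3. 6.] == 3 * x2
```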
Let's Practice
1. Show that v is an eigenvector of A and find the corresponding eigenvalue:
$A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \; v = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$ and $A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix}, \; v = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
2. Can a rectangular matrix have eigenvalues/eigenvectors?
Eigenspaces
Fact: Any scalar multiple of an eigenvector is also an eigenvector with the same eigenvalue:
$A = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad x = \begin{pmatrix} 1 \\ -3 \end{pmatrix} \qquad \lambda = -2$
Try $v = \begin{pmatrix} 2 \\ -6 \end{pmatrix}$ or $u = \begin{pmatrix} -1 \\ 3 \end{pmatrix}$ or $z = \begin{pmatrix} 3 \\ -9 \end{pmatrix}$.
In general (#proof): Let $Ax = \lambda x$. If c is some constant, then:
$A(cx) = c(Ax) = c(\lambda x) = \lambda(cx)$
Eigenspaces
There are infinitely many eigenvectors associated with any one eigenvalue: any scalar multiple (positive or negative) can be used. The collection is referred to as the eigenspace associated with the eigenvalue. In the previous example, the eigenspace of $\lambda = -2$ was $\mathrm{span}\left\{ \begin{pmatrix} 1 \\ -3 \end{pmatrix} \right\}$.
With this in mind, what should you expect from software?
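For instance, numpy's np.linalg.eig simply returns one representative from each eigenspace, scaled to unit length (and with an arbitrary sign). A small check on the running example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [6.0, 0.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)    # 3 and -2
print(eigvecs)    # columns are eigenvectors, normalized to unit length

# The column for lambda = -2 is some multiple of (1, -3):
v = eigvecs[:, np.argmin(eigvals)]
print(v / v[0])   # [ 1. -3.] -- software just picks one vector in the eigenspace
```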
Zero Eigenvalues
What if λ = 0 is an eigenvalue for some matrix A? Then $Ax = 0x = 0$, where $x \neq 0$ is an eigenvector. This means that some linear combination of the columns of A equals zero!
$\Longrightarrow$ Columns of A are linearly dependent.
$\Longrightarrow$ A is not full rank.
$\Longrightarrow$ Perfect multicollinearity.
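A small illustration, using a made-up matrix whose third column is the sum of the first two:

```python
import numpy as np

# Third column equals the sum of the first two: perfect multicollinearity.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0]])

eigvals, _ = np.linalg.eig(A)
print(np.round(eigvals, 10))     # one eigenvalue is (numerically) zero
print(np.linalg.matrix_rank(A))  # 2 -- not full rank
```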
Eigenpairs
For a square ($n \times n$) matrix A, eigenvalues and eigenvectors come in pairs. Normally, the pairs are ordered by magnitude of eigenvalue:
$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n|$
$\lambda_1$ is the largest eigenvalue, and eigenvector $v_i$ corresponds to eigenvalue $\lambda_i$.
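Note that numpy makes no promise about the order in which eigenpairs are returned, so to follow this convention you sort them yourself. A short sketch:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [6.0, 0.0]])

eigvals, eigvecs = np.linalg.eig(A)  # order is not guaranteed
idx = np.argsort(-np.abs(eigvals))   # sort by |lambda|, largest first
eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
print(eigvals)                       # [ 3. -2.]: lambda_1 = 3, lambda_2 = -2
```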
Let's Practice
1. For the following matrix, determine the eigenvalue associated with the given eigenvector.
$A = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -1 & 0 & 1 \end{pmatrix} \qquad v = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$
From this eigenvalue, what can you conclude about the matrix A?
2. The matrix M has eigenvectors u and v. What is $\lambda_1$, the first eigenvalue for the matrix M?
$M = \begin{pmatrix} 1 & 1 \\ 6 & 0 \end{pmatrix} \qquad u = \begin{pmatrix} 1 \\ -3 \end{pmatrix} \qquad v = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
3. For the previous problem, is the specific eigenvector (u or v) the only eigenvector associated with $\lambda_1$?
Diagonalization
What happens if I put all the eigenvectors of A into a matrix as columns, $V = [v_1 \; v_2 \; v_3 \; \ldots \; v_n]$, and then compute AV?
$AV = A[v_1 \; v_2 \; \ldots \; v_n] = [Av_1 \; Av_2 \; \ldots \; Av_n] = [\lambda_1 v_1 \; \lambda_2 v_2 \; \ldots \; \lambda_n v_n]$
The effect is that the columns are all scaled by the corresponding eigenvalue.
Diagonalization
$AV = [\lambda_1 v_1 \; \lambda_2 v_2 \; \ldots \; \lambda_n v_n]$
The same thing would happen if we multiplied V on the right by a diagonal matrix of eigenvalues:
$VD = [v_1 \; v_2 \; \ldots \; v_n] \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = [\lambda_1 v_1 \; \lambda_2 v_2 \; \ldots \; \lambda_n v_n]$
$\Longrightarrow AV = VD$
Diagonalization
$AV = VD$. If the matrix A has n linearly independent eigenvectors, then the matrix V has an inverse, so
$V^{-1} A V = D$
This is called the diagonalization of A. It is only possible when A has n linearly independent eigenvectors (so that the matrix V has an inverse).
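A quick numerical confirmation of $AV = VD$ and $V^{-1}AV = D$ on the running example, which has two distinct eigenvalues and hence two linearly independent eigenvectors:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [6.0, 0.0]])

eigvals, V = np.linalg.eig(A)  # columns of V are eigenvectors
D = np.diag(eigvals)           # diagonal matrix of eigenvalues

print(np.allclose(A @ V, V @ D))               # True: AV = VD
print(np.round(np.linalg.inv(V) @ A @ V, 10))  # diag(3, -2): V^{-1} A V = D
```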
Let's Practice
For the matrix $A = \begin{pmatrix} 0 & -4 \\ 1 & 5 \end{pmatrix}$:
a. The pairs $\lambda_1 = 4, \; v_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and $\lambda_2 = 1, \; v_2 = \begin{pmatrix} 4 \\ -1 \end{pmatrix}$ are eigenpairs of A. For each pair, provide another eigenvector associated with that eigenvalue.
b. Based on the eigenvectors listed in part a, can the matrix A be diagonalized? Why or why not? If diagonalization is possible, explain how it would be done.
Symmetric Matrices - Properties
Symmetric matrices (like the covariance matrix, the correlation matrix, and $X^T X$) have some very nice properties:
1. They are always diagonalizable. (They always have n linearly independent eigenvectors.)
2. Their eigenvectors are all mutually orthogonal.
3. Thus, if you normalize the eigenvectors (software does this automatically), they will form an orthogonal matrix V.
$\Longrightarrow AV = VD \;\Longrightarrow\; V^T A V = D$
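A short numpy sketch with a small made-up symmetric matrix; np.linalg.eigh is the routine for symmetric matrices, and it already returns normalized, mutually orthogonal eigenvectors:

```python
import numpy as np

# A symmetric matrix (a covariance matrix has this form).
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, V = np.linalg.eigh(S)  # eigh: for symmetric/Hermitian matrices
D = np.diag(eigvals)

print(np.allclose(V.T @ V, np.eye(2)))  # True: V is orthogonal (V^T V = I)
print(np.allclose(V.T @ S @ V, D))      # True: V^T S V = D
```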
Eigenvectors of Symmetric Matrices
[Figure: two mutually orthogonal eigenvectors plotted on axes running from -5 to 5]