Lecture 8: Eigenvalues and Eigenvectors


CPS290: Algorithmic Foundations of Data Science
February 24, 2017
Lecture 8: Eigenvalues and Eigenvectors
Lecturer: Kamesh Munagala
Scribe: Kamesh Munagala

Hermitian Matrices

It is simpler to begin with matrices with complex entries. Let $x = a + ib$, where $a, b$ are real numbers and $i = \sqrt{-1}$. Then $\bar{x} = a - ib$ is the complex conjugate of $x$. In the discussion below, all matrices and numbers are complex-valued unless stated otherwise.

Let $M$ be an $n \times n$ square matrix with complex entries. Then $\lambda$ is an eigenvalue of $M$ if there is a non-zero vector $v$ such that
$$ M v = \lambda v. $$
This implies $(M - \lambda I) v = 0$, which also means the determinant of $M - \lambda I$ is zero. Since this determinant is a degree-$n$ polynomial in $\lambda$, any $M$ has $n$ real or complex eigenvalues, counted with multiplicity.

A complex-valued matrix $M$ is said to be Hermitian if for all $i, j$ we have $M_{ij} = \bar{M}_{ji}$. If the entries are all real numbers, this reduces to the definition of a symmetric matrix.

In the discussion below, we will need the notion of an inner product. Let $v$ and $w$ be two vectors with complex entries. Define their inner product as
$$ \langle v, w \rangle = \sum_i \bar{v}_i w_i. $$
Since $\overline{\bar{x}} = x$ for any complex number $x$, we have
$$ \langle v, w \rangle = \sum_i \bar{v}_i w_i = \overline{\sum_i \bar{w}_i v_i} = \overline{\langle w, v \rangle}. $$
Furthermore, we define
$$ \langle v, v \rangle = \sum_i \bar{v}_i v_i = \|v\|^2. $$

Claim 1. If $M$ is Hermitian, then all its eigenvalues are real. If further $M$ is real and symmetric, then all its eigenvectors have real entries as well.

Proof. Using the fact that $M_{ij} = \bar{M}_{ji}$, we have the following relations:
$$ \langle M v, v \rangle = \sum_i \sum_j \overline{M_{ij} v_j}\, v_i = \sum_i \sum_j M_{ji} \bar{v}_j v_i = \sum_j \bar{v}_j \Big( \sum_i M_{ji} v_i \Big) = \langle v, M v \rangle. $$
Suppose $M v = \lambda v$. Then
$$ \langle M v, v \rangle = \langle \lambda v, v \rangle = \bar{\lambda} \|v\|^2 \quad \text{and} \quad \langle v, M v \rangle = \langle v, \lambda v \rangle = \lambda \|v\|^2. $$
Since these expressions are equal, $\bar{\lambda} = \lambda$, which means $\lambda$ is real.

Now suppose $M$ is real and symmetric. Let $\lambda$ be some eigenvalue, which by the above argument is real, and let $M v = \lambda v$ for some complex $v$. Since $M$ and $\lambda$ are real, the same relation holds separately for the real and imaginary parts of $v$. Therefore, if $w$ is the real part of $v$ (or the imaginary part, in case the real part is zero), then $M w = \lambda w$. This shows that the eigenvectors can be taken to be real when $M$ is real and symmetric.
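As a quick numerical illustration of Claim 1, here is a minimal sketch assuming NumPy; the matrix is an arbitrary example, not part of the lecture. It builds a Hermitian matrix from a random complex matrix and checks that its eigenvalues are real up to round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
M = B + B.conj().T                   # M_ij = conj(M_ji), so M is Hermitian

eigvals = np.linalg.eigvals(M)       # general eigenvalue routine, no symmetry assumed
print(np.max(np.abs(eigvals.imag)))  # imaginary parts vanish up to round-off
```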

From now on, we will only focus on matrices with real entries. For real vectors the inner product is simply $\langle v, w \rangle = \sum_i v_i w_i = v^T w$.

Claim 2. For a real, symmetric matrix $M$, let $\lambda \neq \lambda'$ be two eigenvalues. Then the corresponding eigenvectors are orthogonal.

Proof. Let $M v = \lambda v$ and $M w = \lambda' w$. Since $M$ is symmetric, it is easy to check that
$$ \langle M v, w \rangle = \langle v, M w \rangle = \sum_{i,j} M_{ij} v_i w_j. $$
But $\langle M v, w \rangle = \lambda \langle v, w \rangle$ and $\langle v, M w \rangle = \lambda' \langle v, w \rangle$. Since $\lambda \neq \lambda'$, this implies $\langle v, w \rangle = 0$, which means the eigenvectors are orthogonal.

We state the next theorem (the spectral theorem for real symmetric matrices) without proof.

Theorem 1. Let $M$ be an $n \times n$ real symmetric matrix, and let $\lambda_1, \ldots, \lambda_n$ denote its eigenvalues. Then there exist $n$ real-valued vectors $v_1, v_2, \ldots, v_n$ such that: $\|v_i\| = 1$ for all $i = 1, 2, \ldots, n$; $\langle v_i, v_j \rangle = 0$ for all $i \neq j \in \{1, 2, \ldots, n\}$; and $M v_i = \lambda_i v_i$ for all $i = 1, 2, \ldots, n$.
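Theorem 1 is easy to check numerically. Below is a minimal sketch using NumPy's `numpy.linalg.eigh`, which returns the eigenvalues and an orthonormal set of eigenvectors of a real symmetric matrix; the test matrix is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = (A + A.T) / 2                    # arbitrary real symmetric matrix

# eigh returns eigenvalues in ascending order; eigenvectors are the columns of V.
eigvals, V = np.linalg.eigh(M)

assert np.allclose(V.T @ V, np.eye(4))            # unit-length, pairwise orthogonal
assert np.allclose(M @ V, V @ np.diag(eigvals))   # M v_i = lambda_i v_i
print("eigenvalues:", eigvals)
```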

Computing Eigenvalues

In this section, we assume $M$ is real and symmetric.

Lemma 1. Let $M$ be a real symmetric matrix. Let $\lambda_1$ denote its largest eigenvalue and $v_1$ the corresponding eigenvector of unit norm. Then
$$ \lambda_1 = \sup_{\|x\| = 1} x^T M x = v_1^T M v_1. $$

Proof. First note that since $\|v_1\| = 1$, we have
$$ v_1^T M v_1 = \lambda_1 v_1^T v_1 = \lambda_1, \qquad \text{so} \qquad \lambda_1 \le \sup_{\|x\| = 1} x^T M x. $$
Now suppose the supremum is achieved at a unit vector $y$. Let $v_1, v_2, \ldots, v_n$ denote the orthogonal eigenvectors of unit length corresponding to the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_n$, respectively. Then we can write $y$ in this basis as
$$ y = \sum_i \alpha_i v_i. $$
Since $y$ has unit length, $\sum_i \alpha_i^2 = 1$. Note next that
$$ M y = \sum_i \alpha_i \lambda_i v_i, \qquad \text{so} \qquad y^T M y = \sum_i \alpha_i^2 \lambda_i \le \lambda_1 \sum_i \alpha_i^2 = \lambda_1. $$
Therefore
$$ \sup_{\|x\| = 1} x^T M x = y^T M y \le \lambda_1, $$
and combined with the first inequality, the supremum has value exactly $\lambda_1$ and is achieved for $x = v_1$.

We can continue the same argument to show the following corollaries.

Corollary 2. Let $\lambda_2$ denote the second largest eigenvalue of a real, symmetric matrix $M$, and let $v_1$ denote the first eigenvector. Then
$$ \lambda_2 = \sup_{\|x\| = 1,\ \langle x, v_1 \rangle = 0} x^T M x. $$

Corollary 3. Let $M$ be a real symmetric matrix, and let $\lambda_n$ denote its smallest eigenvalue. Then
$$ \lambda_n = \inf_{\|x\| = 1} x^T M x. $$
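As a sanity check on Lemma 1, the following sketch (assuming NumPy and an arbitrary test matrix) compares $\lambda_1$ against the best value of $x^T M x$ found over many random unit vectors; by the lemma, the random search can only ever lower-bound $\lambda_1$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2                      # arbitrary real symmetric test matrix

lam_max = np.linalg.eigvalsh(M)[-1]    # largest eigenvalue (ascending order)

# Rayleigh quotients x^T M x over random unit vectors never exceed lam_max,
# and the best one found approaches it.
best = -np.inf
for _ in range(100_000):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    best = max(best, x @ M @ x)

print(f"lambda_1 = {lam_max:.4f}, best random Rayleigh quotient = {best:.4f}")
assert best <= lam_max + 1e-9
```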

Orthonormal Square Matrices

In order to interpret what the above results mean, we first review orthonormal matrices. A square matrix $P$ is orthonormal if its rows (equivalently, its columns) are orthogonal vectors of unit length. More formally, we have
$$ P^T P = P P^T = I. $$
Note that since the matrix is square and its rows are orthogonal, they cannot be expressed as linear combinations of each other; therefore the matrix is invertible. Multiplying the above expressions by $P^{-1}$, it is easy to check that $P^T = P^{-1}$.

In order to interpret what $P$ does, note the following. For any vectors $v$ and $w$,
$$ \langle P v, P w \rangle = v^T P^T P w = v^T w = \langle v, w \rangle. $$
Therefore applying $P$ preserves the angles between vectors, as well as their lengths; in other words, applying $P$ performs a rigid rotation (possibly combined with a reflection) of the space.

Given a real, symmetric matrix $M$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, let $Q$ denote the matrix whose rows are the corresponding eigenvectors of unit length. Since these eigenvectors are orthogonal, $Q$ is orthonormal. Let $D$ be the diagonal matrix whose entries along the diagonal are the $n$ eigenvalues, and whose other entries are zero. It is easy to check that
$$ M Q^T = Q^T D \quad \Longrightarrow \quad M = Q^T D Q. $$
Therefore, applying $M$ to a vector $v$ is the same as applying the rotation $Q$, then stretching dimension $i$ by the factor $\lambda_i$, and rotating back by $Q^T$. Note that $\lambda_i$ could be negative, which flips the sign of the corresponding coordinate. If we were to do this, which direction stretches the most? The dimension corresponding to $\lambda_1$ after the rotation by $Q$; but in the original space, this is the direction of the first eigenvector $v_1$. Now, for a unit vector $x$, we have
$$ x^T M x = (Q x)^T D (Q x). $$
This quantity measures how much the rotated unit vector $Q x$ is stretched along its own direction, and it equals $\lambda_1$ exactly when $x = v_1$. This is the interpretation of the previous result.

Power Iteration

Assuming $\lambda_1$ is strictly larger than $\lambda_2$, there is a simple algorithm to compute the eigenvector corresponding to the largest eigenvalue. Note that
$$ M^k = (Q^T D Q)^k = Q^T D^k Q. $$
Therefore $M^k$ has the same eigenvectors $v_1, v_2, \ldots$, and the corresponding eigenvalues are $\lambda_1^k, \lambda_2^k, \ldots$. Suppose $x = \sum_{i=1}^n \alpha_i v_i$; then
$$ M^k x = \sum_i \alpha_i \lambda_i^k v_i. $$
The claim is that as $k$ becomes large, this vector points more and more in the direction of $v_1$.

Suppose we choose the initial vector $x$ at random (say, a uniformly random unit vector). Then each coefficient $\alpha_i$ is distributed roughly as $N(0, 1/n)$, so with large probability $|\alpha_i| \in [1/n, 1]$ for all $i$. If we assume $\lambda_1 \ge c \lambda_2$, then for $i \ge 2$ we have $|\alpha_i| \lambda_i^k \le \lambda_1^k / c^k$, while $|\alpha_1| \lambda_1^k \ge \lambda_1^k / n$. If we choose $c^k \gg n$, that is $k \gtrsim \log_c n$, then the terms for $i \ge 2$ have vanishingly small coefficients, and the dominant term corresponds to $v_1$. Writing $c = 1 + \epsilon$, we need to choose $k \approx \frac{\log n}{\epsilon}$ for the second and subsequent eigenvectors to have vanishing contributions compared to the first.

This method is termed power iteration. It can be improved by repeated squaring, so that $k$ grows in powers of $2$; the number of iterations then drops to around $\log \frac{1}{\epsilon} + \log \log n$. So even an exponentially small gap, say $\epsilon = \frac{1}{2^n}$, between $\lambda_1$ and $\lambda_2$ is sufficient for the algorithm to converge to the direction $v_1$ in polynomially many iterations.
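Here is a minimal sketch of power iteration in NumPy; the test matrix, iteration count, and function name are illustrative choices, not part of the lecture. Instead of forming $M^k$ explicitly, it renormalizes after each multiplication, which is the standard way to avoid overflow while keeping the same direction.

```python
import numpy as np

def power_iteration(M, num_iters=1000, seed=0):
    """Approximate the top eigenvalue/eigenvector of a real symmetric matrix M
    whose largest eigenvalue also has the largest magnitude."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(M.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iters):
        x = M @ x                    # one application of M
        x /= np.linalg.norm(x)       # renormalize so the iterate stays a unit vector
    return x @ M @ x, x              # Rayleigh quotient estimates lambda_1

# Arbitrary PSD test matrix, so the top eigenvalue dominates in magnitude.
A = np.random.default_rng(1).standard_normal((6, 6))
M = A @ A.T
lam, v = power_iteration(M)
print(lam, np.linalg.eigvalsh(M)[-1])   # the two values should agree closely
```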

Once we have computed $v_1$, to compute $v_2$ we simply pick a random vector and project it onto the subspace perpendicular to $v_1$, that is, subtract its component along $v_1$. Call this projected vector $x$. Since $x = \sum_{i=2}^n \alpha_i v_i$, we again have
$$ M^k x = \sum_{i=2}^n \alpha_i \lambda_i^k v_i. $$
In the limit, this converges to the direction of the second eigenvector, assuming the second eigenvalue is well-separated from the third. And so on for the remaining eigenvectors.

Positive Semidefinite Matrices

Positive semidefinite (PSD) matrices are a special case of real symmetric matrices. A real symmetric matrix $M$ is said to be PSD if $x^T M x \ge 0$ for all $x$. As an example, if $M = A^T A$ for any matrix $A$, then it is easy to see that
$$ x^T M x = (A x)^T (A x) = \|A x\|^2 \ge 0. $$
It is also easy to see that all eigenvalues of a PSD matrix are non-negative. To see this, note that
$$ 0 \le v_i^T M v_i = \lambda_i v_i^T v_i = \lambda_i \|v_i\|^2 \quad \Longrightarrow \quad \lambda_i \ge 0. $$
The above characterization is in fact an "if and only if": a real, symmetric matrix is PSD iff all its eigenvalues are non-negative. (For the converse direction, write $x = \sum_i \alpha_i v_i$; then $x^T M x = \sum_i \alpha_i^2 \lambda_i \ge 0$.)
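The following sketch (assuming NumPy and an arbitrary matrix $A$) illustrates this characterization numerically: $M = A^T A$ has non-negative eigenvalues, and every quadratic form $x^T M x$ is non-negative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))
M = A.T @ A                          # M = A^T A is PSD by construction

# All eigenvalues are non-negative (up to round-off).
eigvals = np.linalg.eigvalsh(M)
assert np.all(eigvals >= -1e-10)

# And every quadratic form x^T M x = ||A x||^2 is non-negative.
for _ in range(1000):
    x = rng.standard_normal(5)
    assert x @ M @ x >= -1e-10

print("eigenvalues of A^T A:", np.round(eigvals, 4))
```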