Matrix Algebra, part 2


Ming-Ching Luoh
2005.9.12

Outline:
Diagonalization and Spectral Decomposition of a Matrix
Optimization

Diagonalization and Spectral Decomposition of a Matrix

This topic is also known as eigenvalues and eigenvectors. We seek solutions of the system

$$Ac = \lambda c,$$

where $\lambda$ is an eigenvalue and $c$ an eigenvector of the square matrix $A$. If $c$ is a solution, then $kc$ is also a solution for any nonzero scalar $k$, so $c$ is usually normalized so that $c'c = 1$.

To solve, note that the system above can be written $Ac = \lambda I_K c$, and therefore

$$(A - \lambda I)c = 0.$$

This has a nonzero solution if and only if the matrix $(A - \lambda I)$ is singular, i.e. has a zero determinant: $|A - \lambda I| = 0$. The resulting polynomial in $\lambda$ is the characteristic equation of $A$.

For example, if

$$A = \begin{bmatrix} 5 & 1 \\ 2 & 4 \end{bmatrix},$$

then

$$|A - \lambda I| = \begin{vmatrix} 5-\lambda & 1 \\ 2 & 4-\lambda \end{vmatrix} = (5-\lambda)(4-\lambda) - 2 = \lambda^2 - 9\lambda + 18 = 0,$$

so $\lambda = 6, 3$.

For the eigenvectors, solve

$$\begin{bmatrix} 5-\lambda & 1 \\ 2 & 4-\lambda \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

For $\lambda = 6$:

$$\begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \pm \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}.$$

For $\lambda = 3$:

$$\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \pm \begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}.$$
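This example is easy to check numerically. A minimal sketch using numpy (an assumed dependency; note that numpy may return the eigenpairs in a different order, and with either sign):

```python
import numpy as np

A = np.array([[5.0, 1.0],
              [2.0, 4.0]])

# eig returns the eigenvalues and unit-length eigenvectors (as columns).
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)   # 6 and 3, in some order
print(eigvecs)   # columns proportional to (1, 1)/sqrt(2) and (1, -2)/sqrt(5)

# Verify Ac = lambda * c for each eigenpair.
for lam, c in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ c, lam * c)
```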

The general results for the eigenvalues and eigenvectors of a symmetric matrix $A$ are as follows. A $K \times K$ symmetric matrix has $K$ distinct eigenvectors $c_1, c_2, \ldots, c_K$. The corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_K$ are real, but need not be distinct. The eigenvectors of a symmetric matrix are orthogonal, i.e., $c_i'c_j = 0$ for $i \neq j$.

We can collect the $K$ eigenvectors in the matrix $C = [c_1, c_2, \ldots, c_K]$, and the $K$ eigenvalues, in the same order, in a diagonal matrix

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_K \end{bmatrix}.$$

From $Ac_i = \lambda_i c_i$, we have $AC = C\Lambda$. Since the eigenvectors are orthogonal and $c_i'c_i = 1$,

$$C'C = \begin{bmatrix} c_1'c_1 & c_1'c_2 & \cdots & c_1'c_K \\ c_2'c_1 & c_2'c_2 & \cdots & c_2'c_K \\ \vdots & \vdots & \ddots & \vdots \\ c_K'c_1 & c_K'c_2 & \cdots & c_K'c_K \end{bmatrix} = I.$$

This implies $C' = C^{-1}$ and $CC' = CC^{-1} = I$.

Diagonalization: since $AC = C\Lambda$, pre-multiplying both sides by $C'$ gives

$$C'AC = C'C\Lambda = \Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_K \end{bmatrix}.$$

On the other hand, if we post-multiply both sides by $C^{-1} = C'$, we have

$$A = C\Lambda C' = [c_1, \ldots, c_K] \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_K \end{bmatrix} \begin{bmatrix} c_1' \\ \vdots \\ c_K' \end{bmatrix} = \sum_{i=1}^{K} \lambda_i c_i c_i'.$$

This is called the spectral decomposition of the matrix $A$.
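These identities are easy to confirm numerically. A sketch assuming numpy; the test matrix is an arbitrary symmetric choice:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 2.0],
              [0.0, 2.0, 5.0]])    # any symmetric matrix will do

# For symmetric input, eigh returns real eigenvalues and orthonormal
# eigenvectors: the columns of C satisfy C'C = I.
lam, C = np.linalg.eigh(A)

assert np.allclose(C.T @ C, np.eye(3))          # C'C = I
assert np.allclose(C.T @ A @ C, np.diag(lam))   # C'AC = Lambda

# Spectral decomposition: A = sum_i lambda_i * c_i c_i'
A_rebuilt = sum(l * np.outer(c, c) for l, c in zip(lam, C.T))
assert np.allclose(A, A_rebuilt)
```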

Rank of a Product. For any matrix $A$ and nonsingular matrices $B$ and $C$, the rank of $BAC$ is equal to the rank of $A$.

Proof: It is known that if $A$ is $n \times K$ and $B$ is a $K \times K$ matrix of rank $K$, then rank($AB$) = rank($A$), and that rank($A$) = rank($A'$). Since $C$ is nonsingular, rank($BAC$) = rank(($BA$)$C$) = rank($BA$). Also, rank($BA$) = rank(($BA$)$'$) = rank($A'B'$). It follows that rank($A'B'$) = rank($A'$), since $B'$ is nonsingular whenever $B$ is. Therefore rank($BAC$) = rank($A'$) = rank($A$).

Using the diagonalization result $C'AC = \Lambda$, and the fact that $C$ and $C'$ are nonsingular (since $C'C = I$ implies $|C| \neq 0$), we have

$$\mathrm{rank}(C'AC) = \mathrm{rank}(A) = \mathrm{rank}(\Lambda).$$

This leads to the following theorem.

Theorem (Rank of a Symmetric Matrix). The rank of a symmetric matrix is the number of nonzero eigenvalues it contains.

It appears that the theorem above applies only to symmetric matrices. However, note that rank($A$) = rank($A'A$), and $A'A$ is always a symmetric square matrix. Therefore:

Theorem. The rank of any matrix $A$ is the number of nonzero eigenvalues of $A'A$.
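A quick numerical illustration of this theorem; a sketch assuming numpy, where the rank-2 construction is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
# A 6 x 4 matrix of rank 2: two independent columns plus two
# linear combinations of them.
B = rng.normal(size=(6, 2))
A = np.hstack([B, B @ rng.normal(size=(2, 2))])

lam = np.linalg.eigvalsh(A.T @ A)          # eigenvalues of the symmetric A'A
rank_from_eigs = int(np.sum(lam > 1e-10))  # count nonzero ones (with tolerance)
assert rank_from_eigs == np.linalg.matrix_rank(A) == 2
```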

The trace of a square $K \times K$ matrix is the sum of its diagonal elements, $\mathrm{tr}(A) = \sum_{i=1}^{K} a_{ii}$. Some results about the trace are as follows.

tr($cA$) = $c\,$tr($A$), where $c$ is a scalar.
tr($A'$) = tr($A$).
tr($A + B$) = tr($A$) + tr($B$).
tr($I_K$) = $K$.

tr($AB$) = tr($BA$).

Proof: Let $A$ be $n \times K$ and $B$ be $K \times n$, and set $C = AB$ and $D = BA$. Since $c_{ii} = a_i'b_i = \sum_{k=1}^{K} a_{ik}b_{ki}$ and $d_{kk} = b_k'a_k = \sum_{i=1}^{n} b_{ki}a_{ik}$,

$$\mathrm{tr}(AB) = \sum_{i=1}^{n} c_{ii} = \sum_{i=1}^{n}\sum_{k=1}^{K} a_{ik}b_{ki} = \sum_{k=1}^{K}\sum_{i=1}^{n} b_{ki}a_{ik} = \sum_{k=1}^{K} d_{kk} = \mathrm{tr}(BA).$$

This permutation rule can be applied to any cyclic permutation in a product:

$$\mathrm{tr}(ABCD) = \mathrm{tr}(BCDA) = \mathrm{tr}(CDAB) = \mathrm{tr}(DABC).$$

Combined with the diagonalization of $A$, this gives a rule for the trace of a matrix:

$$\mathrm{tr}(C'AC) = \mathrm{tr}(ACC') = \mathrm{tr}(AI) = \mathrm{tr}(A) = \mathrm{tr}(\Lambda) = \sum_{k=1}^{K} \lambda_k.$$

Theorem. The trace of a matrix equals the sum of its eigenvalues.

A particularly simple way of calculating the determinant of a matrix is as follows. Since $C'AC = \Lambda$, $|C'AC| = |\Lambda|$, and

$$|C'AC| = |C'||A||C| = |A||C'||C| = |A||C'C| = |A||I| = |A| = |\Lambda| = \prod_{k=1}^{K} \lambda_k.$$

Theorem. The determinant of a matrix equals the product of its eigenvalues.
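Both theorems, and the cyclic trace rule, are easy to verify numerically. A sketch assuming numpy, with an arbitrary random symmetric test matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                 # symmetrize so the results above apply

lam = np.linalg.eigvalsh(A)
assert np.isclose(np.trace(A), lam.sum())         # trace = sum of eigenvalues
assert np.isclose(np.linalg.det(A), lam.prod())   # determinant = product

# The cyclic permutation rule for the trace:
B, C = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```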

With the result that $A = C\Lambda C'$, we can show the following theorem.

Theorem (Eigenvalues of a Matrix Power). For any nonsingular symmetric matrix $A = C\Lambda C'$,

$$A^k = C\Lambda^k C', \qquad k = \ldots, -2, -1, 0, 1, 2, \ldots$$

In other words, the eigenvalues of $A^k$ are $\lambda_i^k$, while the eigenvectors are the same. This holds for all integers $k$.

For real powers, we have:

Theorem (Real Powers of a Positive Definite Matrix). For a positive definite matrix $A = C\Lambda C'$, $A^r = C\Lambda^r C'$ for any real number $r$.

A positive definite matrix is a matrix whose eigenvalues are all positive. A matrix is nonnegative definite if all its eigenvalues are either positive or zero.
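For instance, the matrix square root $A^{1/2}$ and the inverse $A^{-1}$ both come out of the same recipe. A sketch assuming numpy; the test matrix adds an identity term purely to guarantee positive definiteness:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 3))
A = M.T @ M + np.eye(3)          # symmetric positive definite by construction

lam, C = np.linalg.eigh(A)
A_half = C @ np.diag(lam ** 0.5) @ C.T   # A^(1/2) = C Lambda^(1/2) C'
assert np.allclose(A_half @ A_half, A)   # its square recovers A

A_inv = C @ np.diag(lam ** -1.0) @ C.T   # A^(-1) = C Lambda^(-1) C'
assert np.allclose(A_inv, np.linalg.inv(A))
```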

Many optimization problems involve the quadratic form

$$q = x'Ax = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j a_{ij},$$

where $A$ is a symmetric matrix and $x$ is a column vector. For a given $A$:

If $x'Ax > (<)\ 0$ for all nonzero $x$, then $A$ is positive (negative) definite.
If $x'Ax \geq (\leq)\ 0$ for all nonzero $x$, then $A$ is nonnegative definite or positive semidefinite (nonpositive definite).

We can use the results developed so far to check a matrix for definiteness. Since $A = C\Lambda C'$,

$$x'Ax = x'C\Lambda C'x = y'\Lambda y = \sum_{i=1}^{n} \lambda_i y_i^2,$$

where $y = C'x$ is another vector.

Theorem (Definite Matrices). Let $A$ be a symmetric matrix. If all the eigenvalues of $A$ are positive (negative), then $A$ is positive (negative) definite. If some of the eigenvalues are zero, then $A$ is nonnegative (nonpositive) definite if the remainder are positive (negative).

Some results on definite matrices:

If $A$ is nonnegative definite, then $|A| \geq 0$: since the determinant is the product of the eigenvalues, it is nonnegative.
If $A$ is positive definite, so is $A^{-1}$: the eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, which are all positive, so $A^{-1}$ is also positive definite.
The identity matrix $I$ is positive definite: $x'Ix = x'x > 0$ if $x \neq 0$.
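The eigenvalue criterion translates directly into a small classifier. A sketch assuming numpy; the function name and tolerance are illustrative choices:

```python
import numpy as np

def definiteness(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam > -tol):
        return "nonnegative definite"
    if np.all(lam < tol):
        return "nonpositive definite"
    return "indefinite"

print(definiteness(np.eye(3)))             # positive definite
print(definiteness(np.diag([1.0, 0.0])))   # nonnegative definite
print(definiteness(np.diag([1.0, -1.0])))  # indefinite
```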

(Very important) If $A$ is $n \times K$ with full rank and $n > K$, then $A'A$ is positive definite and $AA'$ is nonnegative definite.

Because $A$ has full column rank, $Ax \neq 0$ for any nonzero $x$, so $x'A'Ax = y'y = \sum_i y_i^2 > 0$, where $y = Ax$. For the latter case, since $A'$ has more columns than rows, there exists a nonzero $x$ such that $A'x = 0$; with $y = A'x$ we can only guarantee $y'y \geq 0$.

If $A$ is positive definite and $B$ is nonsingular, then $B'AB$ is positive definite: $x'B'ABx = y'Ay > 0$, where $y = Bx$, and $y$ cannot be zero for nonzero $x$ because $B$ is nonsingular.

For $A$ to be negative definite, all of $A$'s eigenvalues must be negative. The determinant of $A$ is then positive if $A$ is of even order and negative if it is of odd order.
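These facts can all be checked in a few lines; a sketch assuming numpy, where the random matrices have full rank with probability one:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(5, 3))                    # n > K, full column rank (a.s.)

assert np.all(np.linalg.eigvalsh(A.T @ A) > 0) # A'A positive definite
lam = np.linalg.eigvalsh(A @ A.T)
assert np.all(lam > -1e-10)                    # AA' nonnegative definite
assert int(np.sum(lam > 1e-10)) == 3           # rank(AA') = rank(A) = K

# B'PB is positive definite when P is positive definite and B nonsingular.
P = A.T @ A
B = rng.normal(size=(3, 3))                    # nonsingular with probability one
assert np.all(np.linalg.eigvalsh(B.T @ P @ B) > 0)
```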

Optimization

We begin with some definitions. Let

$$y = f(x), \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},$$

where $y$ is a scalar and $x$ is a column vector.

The first derivative, or gradient, is defined as the $n \times 1$ vector

$$\frac{\partial y}{\partial x} = \begin{bmatrix} \partial y/\partial x_1 \\ \partial y/\partial x_2 \\ \vdots \\ \partial y/\partial x_n \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{bmatrix}.$$

The matrix of second derivatives, or Hessian, is the $n \times n$ matrix

$$H = \frac{\partial^2 y}{\partial x\, \partial x'} = \begin{bmatrix} \dfrac{\partial^2 y}{\partial x_1 \partial x_1} & \dfrac{\partial^2 y}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 y}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 y}{\partial x_2 \partial x_1} & \dfrac{\partial^2 y}{\partial x_2 \partial x_2} & \cdots & \dfrac{\partial^2 y}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 y}{\partial x_n \partial x_1} & \dfrac{\partial^2 y}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 y}{\partial x_n \partial x_n} \end{bmatrix}.$$

In the case of a linear function $y = a'x = x'a = \sum_{i=1}^{n} a_i x_i$, where $a$ and $x$ are both column vectors,

$$\frac{\partial y}{\partial x} = \frac{\partial (a'x)}{\partial x} = \begin{bmatrix} \partial y/\partial x_1 \\ \partial y/\partial x_2 \\ \vdots \\ \partial y/\partial x_n \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = a.$$

For a set of linear functions $y = Ax$, with $y$ an $n \times 1$ vector, $x$ an $n \times 1$ vector, and

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \begin{bmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{bmatrix},$$

the $i$-th function is

$$y_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = \sum_{j=1}^{n} a_{ij}x_j = a_i'x.$$

The derivatives are

$$\frac{\partial y_i}{\partial x} = a_i \ (n \times 1), \qquad \frac{\partial y_i}{\partial x'} = a_i',$$

so, stacking over $i$,

$$\frac{\partial (Ax)}{\partial x'} = \begin{bmatrix} \partial y_1/\partial x' \\ \partial y_2/\partial x' \\ \vdots \\ \partial y_n/\partial x' \end{bmatrix} = \begin{bmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{bmatrix} = A.$$

A quadratic form is written as

$$y = x'Ax = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j a_{ij}.$$

Differentiating element by element,

$$\frac{\partial (x'Ax)}{\partial x} = \begin{bmatrix} \sum_{j=1}^{n} a_{1j}x_j + \sum_{i=1}^{n} x_i a_{i1} \\ \sum_{j=1}^{n} a_{2j}x_j + \sum_{i=1}^{n} x_i a_{i2} \\ \vdots \\ \sum_{j=1}^{n} a_{nj}x_j + \sum_{i=1}^{n} x_i a_{in} \end{bmatrix} = Ax + A'x = (A + A')x = 2Ax$$

if $A$ is symmetric. The Hessian matrix is then the $n \times n$ matrix

$$\frac{\partial^2 (x'Ax)}{\partial x\, \partial x'} = 2A.$$
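A finite-difference check of the gradient formula; a sketch assuming numpy, where the step size is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
M = rng.normal(size=(n, n))
A = (M + M.T) / 2                  # symmetric, so the gradient is 2Ax
x = rng.normal(size=n)

grad_formula = 2 * A @ x

# Central differences as an independent check of d(x'Ax)/dx.
eps = 1e-6
grad_numeric = np.empty(n)
for k in range(n):
    e = np.zeros(n)
    e[k] = eps
    grad_numeric[k] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)

assert np.allclose(grad_formula, grad_numeric, atol=1e-4)
```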

In the single-variable case $y = f(x)$, the first-order (necessary) condition for an optimum is $dy/dx = 0$, and the second-order (sufficient) condition is $d^2y/dx^2 < 0$ for a maximum and $d^2y/dx^2 > 0$ for a minimum.

For the optimization of a function of several variables, $y = f(x)$ with $x$ a vector, the first-order condition is

$$\frac{\partial f(x)}{\partial x} = 0,$$

and the second-order condition is that the Hessian matrix

$$H = \frac{\partial^2 f(x)}{\partial x\, \partial x'}$$

must be positive definite for a minimum and negative definite for a maximum.

As an example, suppose we want to find the $x$ that maximizes

$$R = a'x - x'Ax,$$

where $x$ and $a$ are $n \times 1$ and $A$ is symmetric $n \times n$. The first-order condition is

$$\frac{\partial R}{\partial x} = a - 2Ax = 0, \qquad x^* = \frac{1}{2}A^{-1}a,$$

and the Hessian matrix is

$$\frac{\partial^2 R}{\partial x\, \partial x'} = -2A.$$

If the quadratic form $x'Ax$ is positive definite, then $-2A$ is negative definite. This ensures a maximum.
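A numerical sanity check of this optimum; a sketch assuming numpy, with an arbitrary positive definite $A$:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
M = rng.normal(size=(n, n))
A = M.T @ M + np.eye(n)               # symmetric positive definite
a = rng.normal(size=n)

x_star = 0.5 * np.linalg.solve(A, a)  # x* = (1/2) A^{-1} a

def R(x):
    return a @ x - x @ A @ x

# R at the candidate optimum beats R at nearby random points.
for _ in range(100):
    assert R(x_star) >= R(x_star + 0.1 * rng.normal(size=n))
```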

An Example: OLS Regression. The model is

$$y_i = x_i'\beta + u_i, \qquad i = 1, 2, \ldots, n,$$

or, in matrix form, $y = x\beta + u$, where $y$ is $n \times 1$, $x$ is $n \times K$, $\beta$ is $K \times 1$, and $u$ is $n \times 1$. The OLS estimator minimizes the sum of squared residuals $\sum_{i=1}^{n} u_i^2 = u'u$. In other words,

$$\min_{\beta}\ S(\beta) = (y - x\beta)'(y - x\beta) = y'y - \beta'x'y - y'x\beta + \beta'x'x\beta.$$

The first-order condition is

$$\frac{\partial S(\beta)}{\partial \beta} = -2x'y + 2x'x\beta = 0,$$

so

$$\hat{\beta}_{OLS} = (x'x)^{-1}x'y = (x'x)^{-1}x'(x\beta + u) = \beta + (x'x)^{-1}x'u.$$

The Hessian matrix is

$$\frac{\partial^2 S(\beta)}{\partial \beta\, \partial \beta'} = 2x'x,$$

which is positive definite. This ensures a minimum.
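A compact simulation of the OLS formula; a sketch assuming numpy, with invented data-generating values:

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 200, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = x @ beta_true + rng.normal(scale=0.1, size=n)

# beta_hat = (x'x)^{-1} x'y, computed via the normal equations.
beta_hat = np.linalg.solve(x.T @ x, x.T @ y)
print(beta_hat)                                    # close to beta_true

# The Hessian of S(beta) is 2x'x, positive definite when x has full rank.
assert np.all(np.linalg.eigvalsh(2 * x.T @ x) > 0)
```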