Chapter 7: Matrix Factorization and Analysis


Matrix factorizations are an important part of the practice and analysis of signal processing. They are at the heart of many signal-processing algorithms. Their applications include solving linear equations (LU), decorrelating random variables (LDLT, Cholesky), orthogonalizing sets of vectors (QR), and finding low-rank matrix approximations (SVD). Their usefulness is often two-fold: they allow efficient computation of important quantities, and they are (often) designed to minimize round-off error due to finite-precision calculation. An algorithm is called numerically stable, for a particular set of inputs, if the error in the final solution is proportional to the round-off error in the elementary field operations.

7.1 Triangular Systems

A square matrix $L \in \mathbb{F}^{n \times n}$ is called lower triangular (or upper triangular) if all elements above (or below) the main diagonal are zero. Likewise, a triangular matrix (lower or upper) is unit triangular if it has all ones on the main diagonal. A system of linear equations is called triangular if it can be represented by the matrix equation $Ax = b$ where $A$ is either upper or lower triangular.

7.1.1 Solution by Substitution

Let $L \in \mathbb{F}^{n \times n}$ be a lower triangular matrix with entries $l_{ij} = [L]_{ij}$. The matrix equation $Ly = b$ can be solved efficiently using forward substitution, which is defined by the recursion

$$y_j = \frac{1}{l_{jj}} \Big( b_j - \sum_{i=1}^{j-1} l_{ji} y_i \Big), \quad j = 1, 2, \ldots, n.$$

Example 7.1.1. Consider the system

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 9 \end{bmatrix}.$$

Applying the above recursion gives

$$y_1 = 1, \qquad y_2 = 2 - 1 \cdot 1 = 1, \qquad y_3 = 9 - 1 \cdot 1 - 2 \cdot 1 = 6.$$

Let $U \in \mathbb{F}^{n \times n}$ be an upper triangular matrix with entries $u_{ij} = [U]_{ij}$. The matrix equation $Ux = y$ can be solved efficiently using backward substitution, which is defined by the recursion

$$x_j = \frac{1}{u_{jj}} \Big( y_j - \sum_{i=j+1}^{n} u_{ji} x_i \Big), \quad j = n, n-1, \ldots, 1.$$

Example 7.1.2. Consider the system

$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 6 \end{bmatrix}.$$

Applying the above recursion gives

$$x_3 = \frac{6}{2} = 3, \qquad x_2 = 1 - 3 \cdot 3 = -8, \qquad x_1 = 1 - 1 \cdot (-8) - 1 \cdot 3 = 6.$$

The computational complexity of each substitution is roughly $\frac{1}{2} n^2$ operations.
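In code, these recursions become short loops. Below is a minimal sketch in Python with NumPy (the function names are illustrative, not from any particular library), reproducing Examples 7.1.1 and 7.1.2:

    import numpy as np

    def forward_substitution(L, b):
        # Solve Ly = b for lower-triangular L via the recursion above.
        n = L.shape[0]
        y = np.zeros(n)
        for j in range(n):
            y[j] = (b[j] - L[j, :j] @ y[:j]) / L[j, j]
        return y

    def backward_substitution(U, y):
        # Solve Ux = y for upper-triangular U, working from the last row up.
        n = U.shape[0]
        x = np.zeros(n)
        for j in range(n - 1, -1, -1):
            x[j] = (y[j] - U[j, j+1:] @ x[j+1:]) / U[j, j]
        return x

    L = np.array([[1., 0., 0.], [1., 1., 0.], [1., 2., 1.]])
    U = np.array([[1., 1., 1.], [0., 1., 3.], [0., 0., 2.]])
    b = np.array([1., 2., 9.])
    y = forward_substitution(L, b)   # [1., 1., 6.]
    x = backward_substitution(U, y)  # [6., -8., 3.]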

Problem 7.1.3. Show that the set of upper triangular matrices is a subalgebra of the set of all matrices. Since it is clearly a subspace, only two properties must be verified:

1. that the product of two upper triangular matrices is upper triangular
2. that the inverse of an upper triangular matrix is upper triangular

7.1.2 The Determinant

The determinant $\det(A)$ of a square matrix $A \in \mathbb{F}^{n \times n}$ is a scalar which captures a number of important properties of that matrix. For example, $A$ is invertible iff $\det(A) \neq 0$, and the determinant satisfies $\det(AB) = \det(A)\det(B)$ for square matrices $A, B$. Mathematically, it is the unique function mapping matrices to scalars that (i) is linear in each column, (ii) is negated by column transposition, and (iii) satisfies $\det(I) = 1$.

The determinant of a square matrix can be defined recursively using the fact that $\det([a]) = a$. Let $A \in \mathbb{F}^{n \times n}$ be an arbitrary square matrix with entries $a_{ij} = [A]_{ij}$. The $(i,j)$-minor of $A$ is the determinant of the $(n-1) \times (n-1)$ matrix formed by deleting the $i$-th row and $j$-th column of $A$.

Fact 7.1.4 (Laplace's Formula). The determinant of $A$ is given by

$$\det(A) = \sum_{j=1}^{n} a_{ij} (-1)^{i+j} M_{ij} = \sum_{i=1}^{n} a_{ij} (-1)^{i+j} M_{ij},$$

where $M_{ij}$ is the $(i,j)$-minor of $A$.

Theorem 7.1.5. The determinant of a triangular matrix is the product of its diagonal elements.

Proof. For upper (lower) triangular matrices, this can be shown by expanding the determinant along the first column (row) to compute each minor.
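Laplace's formula translates directly into a recursive procedure. The following Python/NumPy sketch (for illustration only: the recursion takes $O(n!)$ operations, so it is practical only for small matrices) expands along the first row:

    import numpy as np

    def det_laplace(A):
        # Laplace expansion along the first row; exponential-time.
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # (1, j)-minor: delete the first row and column j (0-based here)
            minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
            total += A[0, j] * (-1) ** j * det_laplace(minor)
        return total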

7.2 LU Decomposition

7.2.1 Introduction

LU decomposition is a generalization of Gaussian elimination which allows one to efficiently solve a system of linear equations $Ax = b$ multiple times with different right-hand sides. In its basic form, it is numerically stable only if the matrix is positive definite or diagonally dominant. A slight modification, known as partial pivoting, makes it stable for a very large class of matrices.

Any square matrix $A \in \mathbb{F}^{n \times n}$ can be factored as $A = LU$, where $L$ is a unit lower-triangular matrix and $U$ is an upper-triangular matrix. The following example uses elementary row operations to cancel, in each column, all elements below the main diagonal. These elementary row operations are represented using left multiplication by a unit lower-triangular matrix:

$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 2 & 8 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 2 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}.$$

This allows one to write

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix} = LU.$$

LU decomposition can also be used to efficiently compute the determinant of $A$. Since $\det(A) = \det(LU) = \det(L)\det(U)$, the problem is reduced to computing the determinants of triangular matrices. Using Theorem 7.1.5, it is easy to see that $\det(L) = 1$ and $\det(U) = \prod_{i=1}^{n} u_{ii}$.
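The column-by-column cancellation in this example can be written as a short routine. Here is a minimal sketch in Python with NumPy (no pivoting, so it assumes every pivot $a^{(j)}_{j,j}$ is nonzero; see Section 7.2.3 for the pivoted variant):

    import numpy as np

    def lu_decompose(A):
        # LU without pivoting: L collects the multipliers, U the reduced matrix.
        n = A.shape[0]
        L = np.eye(n)
        U = A.astype(float).copy()
        for j in range(n - 1):
            for i in range(j + 1, n):
                L[i, j] = U[i, j] / U[j, j]     # multiplier a_ij^(j) / a_jj^(j)
                U[i, j:] -= L[i, j] * U[j, j:]  # cancel the (i, j) entry
        return L, U

    A = np.array([[1., 1., 1.], [1., 2., 4.], [1., 3., 9.]])
    L, U = lu_decompose(A)        # L @ U == A
    det_A = np.prod(np.diag(U))   # det(A) = det(U) = 2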

7.2.2 Formal Approach

To describe LU decomposition formally, we first need to describe the individual operations that are used to zero out matrix elements.

Definition 7.2.1. Let $A \in \mathbb{F}^{n \times n}$ be an arbitrary matrix, $\alpha \in \mathbb{F}$ be a scalar, and $i, j \in \{1, 2, \ldots, n\}$. Then, adding $\alpha$ times the $j$-th row to the $i$-th row is an elementary row-addition operation. Moreover, $I + \alpha E_{ij}$, where $E_{ij} \triangleq e_i e_j^T$ and $e_k$ is the $k$-th standard basis vector, is the elementary row-addition matrix which effects this operation via left multiplication.

Example 7.2.2. For example, elementary row operations are used to cancel the $(2,1)$ matrix entry in

$$(I - E_{21}) A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 1 & 3 & 9 \end{bmatrix}.$$

Lemma 7.2.3. The following identities capture the important properties of elementary row-operation matrices:

(i) $E_{ij} E_{kl} = \delta_{j,k} E_{il}$
(ii) $(I + \alpha E_{ij})(I + \beta E_{kl}) = I + \alpha E_{ij} + \beta E_{kl}$ if $j \neq k$
(iii) $(I + \alpha E_{ij})^{-1} = I - \alpha E_{ij}$ if $i \neq j$.

Proof. This proof is left as an exercise.

Now, consider the process for computing the LU decomposition of $A$. To initialize the process, we let $A^{(1)} = A$. In each round, we let

$$L_j = I - \sum_{i=j+1}^{n} \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}} E_{i,j}$$

be the product of elementary row-operation matrices which cancel the subdiagonal elements of the $j$-th column. The process proceeds by defining $A^{(j+1)} = L_j A^{(j)}$ so that $A^{(j+1)}$ has all zeros below the diagonal in the first $j$ columns. After $n-1$ rounds, the process terminates with

$$U = A^{(n)} = L_{n-1} L_{n-2} \cdots L_1 A,$$

where $L = L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1}$ is unit lower triangular.
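The identities in Lemma 7.2.3 are easy to check numerically. A small sketch (Python with NumPy; the helper E is ad hoc and uses 0-based indices):

    import numpy as np

    def E(i, j, n=3):
        # E_ij = e_i e_j^T (0-based indices here)
        M = np.zeros((n, n))
        M[i, j] = 1.0
        return M

    I = np.eye(3)
    alpha = 5.0
    # Lemma 7.2.3(iii): (I + a E_ij)^{-1} = I - a E_ij when i != j
    print(np.allclose(np.linalg.inv(I + alpha * E(1, 0)), I - alpha * E(1, 0)))
    # Lemma 7.2.3(i): E_ij E_kl = delta_jk E_il; here the inner indices match
    print(np.allclose(E(1, 0) @ E(0, 2), E(1, 2)))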

Lemma 7.2.4. From the structure of elementary row-operation matrices, we see that

$$\prod_{j=1}^{n-1} \prod_{i=j+1}^{n} (I + \alpha_{ij} E_{i,j}) = I + \sum_{j=1}^{n-1} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j}.$$

Proof. First, we notice that

$$\prod_{i=j+1}^{n} (I + \alpha_{ij} E_{i,j}) = I + \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j}$$

for $j = 1, 2, \ldots, n-1$. Expanding the product shows that any term with two $E$ matrices must contain a product $E_{i,j} E_{l,j}$ with $l > i > j$. By Lemma 7.2.3(i), we see that this term must be zero because $j \neq l$.

Now, we can prove the main result via induction. First, we assume that

$$\prod_{j=1}^{k} \prod_{i=j+1}^{n} (I + \alpha_{ij} E_{i,j}) = I + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j}.$$

Next, we find that if $k \leq n-2$, then

$$\begin{aligned}
\prod_{j=1}^{k+1} \prod_{i=j+1}^{n} (I + \alpha_{ij} E_{i,j})
&= \Big( I + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j} \Big) \prod_{l=k+2}^{n} (I + \alpha_{l,k+1} E_{l,k+1}) \\
&= \Big( I + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j} \Big) \Big( I + \sum_{l=k+2}^{n} \alpha_{l,k+1} E_{l,k+1} \Big) \\
&= I + \sum_{j=1}^{k+1} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j} + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \sum_{l=k+2}^{n} \alpha_{ij} \alpha_{l,k+1} E_{i,j} E_{l,k+1} \\
&= I + \sum_{j=1}^{k+1} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j} + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \sum_{l=k+2}^{n} \alpha_{ij} \alpha_{l,k+1} E_{i,k+1} \delta_{j,l} \\
&= I + \sum_{j=1}^{k+1} \sum_{i=j+1}^{n} \alpha_{ij} E_{i,j},
\end{aligned}$$

where the cross terms vanish because $\delta_{j,l} = 0$ when $j \leq k < k+2 \leq l$. Finally, we point out that the base case $k = 1$ is given by the initial observation.

Theorem 7.2.5. This process generates one column of $L$ per round because

$$[L]_{ij} = \begin{cases} a^{(j)}_{i,j} / a^{(j)}_{j,j} & \text{if } i > j \\ 1 & \text{if } i = j \\ 0 & \text{otherwise}. \end{cases}$$

Proof. First, we note that

$$\begin{aligned}
L &= L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1} \\
&= \prod_{j=1}^{n-1} \Big( I - \sum_{i=j+1}^{n} \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}} E_{i,j} \Big)^{-1} \\
&\overset{(a)}{=} \prod_{j=1}^{n-1} \prod_{i=j+1}^{n} \Big( I - \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}} E_{i,j} \Big)^{-1} \\
&\overset{(b)}{=} \prod_{j=1}^{n-1} \prod_{i=j+1}^{n} \Big( I + \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}} E_{i,j} \Big) \\
&= I + \sum_{j=1}^{n-1} \sum_{i=j+1}^{n} \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}} E_{i,j},
\end{aligned}$$

where (a) follows from Lemma 7.2.3(ii) (i.e., all matrices in the inside product commute), (b) follows from Lemma 7.2.3(iii), and the last step follows from Lemma 7.2.4. Picking off the $(i,j)$ entry of $L$ (e.g., with $e_i^T L e_j$) gives the stated result.

Finally, we note that the LU decomposition can be computed in roughly $\frac{2}{3} n^3$ field operations.

7.2.3 Partial Pivoting

Sometimes the pivot element $a^{(j)}_{j,j}$ can be very small or zero. In this case, the algorithm will either fail (e.g., divide by zero) or return a very unreliable result. The algorithm can be easily modified to avoid this problem by swapping rows of $A^{(j)}$ to increase the magnitude of the pivot element before each cancellation phase. This results in a decomposition of the form $PA = LU$, where $P$ is a permutation matrix.

In this section, we will describe LU decomposition with partial pivoting using the notation from the previous section. The main difference is that, in each round, we will define $A^{(j+1)} = M_j P_j A^{(j)}$, where $P_j$ is a permutation matrix. In particular, left multiplication by $P_j$ swaps row $j$ with row $p_j$, where

$$p_j = \arg\max_{i = j, \ldots, n} \big| a^{(j)}_{i,j} \big|.$$

The matrix $M_j$ is now chosen to cancel the subdiagonal elements in the $j$-th column of $P_j A^{(j)}$. After $n-1$ rounds, the resulting decomposition has the form

$$A^{(n)} = M_{n-1} P_{n-1} M_{n-2} P_{n-2} \cdots M_1 P_1 A = U.$$
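Putting the pivot search and the cancellation together gives the following sketch (Python with NumPy; the function name is illustrative), which returns $P$, $L$, and $U$ with $PA = LU$:

    import numpy as np

    def lu_partial_pivot(A):
        # LU with partial pivoting: returns P, L, U with P @ A == L @ U.
        n = A.shape[0]
        U = A.astype(float).copy()
        L = np.eye(n)
        P = np.eye(n)
        for j in range(n - 1):
            # p_j = argmax_{i >= j} |a_ij^(j)|; swap rows j and p_j
            p = j + np.argmax(np.abs(U[j:, j]))
            U[[j, p], :] = U[[p, j], :]
            P[[j, p], :] = P[[p, j], :]
            L[[j, p], :j] = L[[p, j], :j]  # carry along stored multipliers
            for i in range(j + 1, n):
                L[i, j] = U[i, j] / U[j, j]
                U[i, j:] -= L[i, j] * U[j, j:]
        return P, L, U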

To show this can also be written in the desired form, we need to understand some properties of the permutations. First, we point out that swapping two rows is a transposition and therefore $P_j^2 = I$. Next, we will show that the permutations can be moved to the right.

Lemma 7.2.6. Let $M = I + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \alpha_{ij} E_{ij}$ and let $Q$ be a permutation matrix which swaps row $l \geq k+1$ and row $m > l$. Then, $QM = \widetilde{M} Q$, where

$$\widetilde{M} = I + \sum_{j=1}^{k} \sum_{i=j+1}^{n} \alpha_{ij} Q E_{ij}.$$

Therefore, we can write

$$A^{(n)} = \underbrace{\widetilde{M}_{n-1} \widetilde{M}_{n-2} \cdots \widetilde{M}_1}_{L^{-1}} \underbrace{P_{n-1} \cdots P_2 P_1}_{P} A = U$$

and $PA = LU$.

Proof. The proof is left as an exercise.

7.3 LDLT and Cholesky Decomposition

If the matrix $A \in \mathbb{C}^{n \times n}$ is Hermitian, then the LU decomposition allows the factorization $A = LDL^H$, where $L$ is unit lower triangular and $D$ is diagonal. Since this factorization is typically applied to real matrices, it is referred to as LDLT decomposition. If $A$ is also positive definite, then the diagonal elements of $D$ are positive and we can write

$$A = \big( LD^{1/2} \big) \big( LD^{1/2} \big)^H.$$

The form $A = \widetilde{L} \widetilde{L}^H$, where $\widetilde{L}$ is lower triangular, is known as Cholesky factorization.

To see this, we will describe the LDLT decomposition using the notation from LU decomposition, starting from $A^{(1)} = A$. In the $j$-th round, define $L_j$ to be the product of elementary row-operation matrices which cancel the subdiagonal elements of the $j$-th column of $A^{(j)}$. Then, define $A^{(j+1)} = L_j A^{(j)} L_j^H$ and notice that $A^{(j+1)}$ is Hermitian because $A^{(j)}$ is Hermitian. Next, notice that $A^{(j+1)}$ has zeros below the diagonal in the first $j$ columns and zeros to the right of the diagonal in the first $j$ rows. This follows from the fact that the first $j$ rows of $A^{(j)}$ are not affected by applying $L_j$ on the left; therefore, applying $L_j^H$ on the right also cancels the elements to the right of the diagonal in the $j$-th row. After $n-1$ rounds, we find that $D = A^{(n)}$ is a diagonal matrix.

There are a number of redundancies in the computation described above. First off, the $L$ matrix computed by LU decomposition is identical to the $L$ matrix computed by LDLT decomposition. Therefore, one can save operations by defining $A^{(j+1)} = L_j A^{(j)}$. Moreover, the elements to the right of the diagonal in $A^{(j)}$ do not affect the computation at all. So, one can roughly halve the number of additions and multiplications by only updating the lower triangular part of $A^{(j)}$. The resulting computational complexity is roughly $\frac{1}{3} n^3$ field operations.

7.3.1 Cholesky Decomposition

For a positive-definite matrix $A$, we can first apply the LDLT decomposition and then define $\widetilde{L} = L D^{1/2}$. This gives the Cholesky decomposition $\widetilde{L} \widetilde{L}^H = L D L^H = A$.

The Cholesky decomposition is typically used to compute whitening filters for random variables. For example, one can apply it to the correlation matrix $R = E[XX^H]$ of a random vector $X$. Then, one can define $Y = \widetilde{L}^{-1} X$ and see that

$$E[YY^H] = E\big[ \widetilde{L}^{-1} X X^H \widetilde{L}^{-H} \big] = \widetilde{L}^{-1} R \widetilde{L}^{-H} = I.$$

From this, one sees that $Y$ is a vector of uncorrelated (or white) random variables.
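A minimal numerical sketch of this whitening step (Python with NumPy; numpy.linalg.cholesky supplies the lower-triangular factor, and a sample correlation matrix stands in for $E[XX^H]$):

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((3, 3))
    X = M @ rng.standard_normal((3, 10000))   # correlated random vectors
    R = (X @ X.T) / X.shape[1]                # sample correlation matrix
    L_tilde = np.linalg.cholesky(R)           # R = L~ L~^H
    Y = np.linalg.solve(L_tilde, X)           # Y = L~^{-1} X
    R_Y = (Y @ Y.T) / Y.shape[1]              # correlation of the whitened Y
    print(np.allclose(R_Y, np.eye(3)))        # True: Y is white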

7.3.2 QR Decomposition

A complex matrix $Q \in \mathbb{C}^{n \times n}$ is called unitary if $Q^H Q = Q Q^H = I$. If all elements of the matrix are real, then it is called orthogonal and $Q^T Q = Q Q^T = I$.

Theorem 7.3.1. Any matrix $A \in \mathbb{C}^{m \times n}$ can be factored as $A = QR$, where $Q$ is an $m \times m$ unitary matrix, $Q Q^H = I$, and $R$ is an $m \times n$ upper-triangular matrix.

Proof. To show this decomposition, we start by applying Gram-Schmidt orthogonalization to the columns $a_1, \ldots, a_n$ of $A$. This results in orthonormal vectors $\{q_1, \ldots, q_l\}$, where $l = \min(m, n)$, such that

$$a_j = \sum_{i=1}^{\min(j, l)} r_{i,j} \, q_i \quad \text{for } j = 1, 2, \ldots, n.$$

This gives an $m \times l$ matrix $Q_1 = [q_1 \cdots q_l]$ and an $l \times n$ upper-triangular matrix $R_1$, with entries $[R_1]_{i,j} = r_{i,j}$, such that $A = Q_1 R_1$. If $m \leq n$, then $l = m$, $Q_1$ is unitary, and the decomposition is complete. Otherwise, we must extend the orthonormal set $\{q_1, \ldots, q_l\}$ to an orthonormal basis $\{q_1, \ldots, q_m\}$ of $\mathbb{C}^m$. This gives an $m \times m$ unitary matrix $Q = [q_1 \cdots q_m]$. Adding $m - n$ rows of zeros to the previous $R$ matrix gives an $m \times n$ matrix $R$ such that $A = QR$.

7.4 Hermitian Matrices and Complex Numbers

Definition 7.4.1. A square matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal if $Q^T Q = Q Q^T = I$.

Definition 7.4.2. A square matrix $U \in \mathbb{C}^{n \times n}$ is unitary if $U^H U = U U^H = I$.

It is worth noting that, for unitary (resp. orthogonal) matrices, it suffices to check only that $U^H U = I$ (resp. $Q^T Q = I$) because $U$ is invertible (e.g., it has linearly independent columns) and

$$U^H U = I \implies I = U U^{-1} = U (U^H U) U^{-1} = U U^H.$$

A useful analogy between matrices and complex numbers is as follows. Hermitian matrices, satisfying $A^H = A$, are analogous to real numbers, whose complex conjugates are equal to themselves. Unitary matrices, satisfying $U^H U = I$, are analogous to complex numbers on the unit circle, satisfying $z^* z = 1$. Orthogonal matrices, satisfying $Q^T Q = I$, are analogous to the real numbers $z = \pm 1$, for which $z^2 = 1$. The transformation

$$z = \frac{1 + jr}{1 - jr}$$

maps the real number $r$ onto the unit circle $|z| = 1$. Analogously, by Cayley's formula

$$U = (I + jR)(I - jR)^{-1},$$

a Hermitian matrix $R$ is mapped to a unitary matrix.
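As a closing illustration, Cayley's formula is easy to verify numerically (a Python/NumPy sketch; NumPy writes the imaginary unit as 1j):

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    R = (B + B.conj().T) / 2                       # a random Hermitian matrix
    I = np.eye(4)
    U = (I + 1j * R) @ np.linalg.inv(I - 1j * R)   # Cayley's formula
    print(np.allclose(U.conj().T @ U, I))          # True: U is unitary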