Lectures on Dynamic Systems and Control
Mohammed Dahleh, Munther A. Dahleh, George Verghese
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology

Matrix Norms and Singular Value Decomposition

Introduction

In this lecture, we introduce the notion of a norm for matrices. The singular value decomposition or SVD of a matrix is then presented. The SVD exposes the 2-norm of a matrix, but its value to us goes much further: it enables the solution of a class of matrix perturbation problems that form the basis for the stability robustness concepts introduced later; it solves the so-called total least squares problem, which is a generalization of the least squares estimation problem considered earlier; and it allows us to clarify the notion of conditioning, in the context of matrix inversion. These applications of the SVD are presented at greater length in the next lecture.

Example 1

To provide some immediate motivation for the study and application of matrix norms, we begin with an example that clearly brings out the issue of matrix conditioning with respect to inversion. The question of interest is how sensitive the inverse of a matrix is to perturbations of the matrix. Consider inverting the matrix

$$A = \begin{pmatrix} 100 & 100 \\ 100.2 & 100 \end{pmatrix}.$$

A quick calculation shows that

$$A^{-1} = \begin{pmatrix} -5 & 5 \\ 5.01 & -5 \end{pmatrix}.$$

Now suppose we invert the perturbed matrix

$$A + \Delta A = \begin{pmatrix} 100 & 100 \\ 100.1 & 100 \end{pmatrix}.$$

The result now is

$$(A + \Delta A)^{-1} = A^{-1} + \Delta(A^{-1}) = \begin{pmatrix} -10 & 10 \\ 10.01 & -10 \end{pmatrix}.$$

Here $\Delta A$ denotes the perturbation in $A$, and $\Delta(A^{-1})$ denotes the resulting perturbation in $A^{-1}$. Evidently a 0.1% change in one entry of $A$ has resulted in a 100% change in the entries of $A^{-1}$. If we want to solve the problem $Ax = b$ where $b = [1 \;\; {-1}]^T$, then $x = A^{-1} b = [-10 \;\; 10.01]^T$, while after perturbation of $A$ we get $x + \Delta x = (A + \Delta A)^{-1} b = [-20 \;\; 20.01]^T$. Again, we see a 100% change in the entries of the solution with only a 0.1% change in the starting data.

The situation seen in the above example is much worse than what can ever arise in the scalar case. If $a$ is a scalar, then $d(a^{-1})/(a^{-1}) = -\,da/a$, so the fractional change in the inverse of $a$ has the same magnitude as the fractional change in $a$ itself. What is seen in the above example, therefore, is a purely matrix phenomenon. It would seem to be related to the fact that $A$ is nearly singular, in the sense that its columns are nearly dependent, its determinant is much smaller than its largest element, and so on. In what follows (see the next lecture), we shall develop a sound way to measure nearness to singularity, and show how this measure relates to sensitivity under inversion.

Before understanding such sensitivity to perturbations in more detail, we need ways to measure the "magnitudes" of vectors and matrices. We have already introduced the notion of vector norms in an earlier lecture, so we now turn to the definition of matrix norms.
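The computation in Example 1 is easy to reproduce. Below is a minimal sketch, using NumPy in place of the Matlab the notes assume; the matrices and right-hand side are exactly those of the example.

```python
import numpy as np

# Conditioning example: a 0.1% perturbation of one entry of A produces a
# 100% change in A^{-1} and in the solution of Ax = b.
A  = np.array([[100.0, 100.0], [100.2, 100.0]])
dA = np.array([[0.0, 0.0], [-0.1, 0.0]])      # perturbation: 100.2 -> 100.1
b  = np.array([1.0, -1.0])

print(np.linalg.inv(A))            # approx [[-5, 5], [5.01, -5]]
print(np.linalg.inv(A + dA))       # approx [[-10, 10], [10.01, -10]]
print(np.linalg.solve(A, b))       # approx [-10, 10.01]
print(np.linalg.solve(A + dA, b))  # approx [-20, 20.01]
```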

Matrix Norms

An $m \times n$ complex matrix may be viewed as an operator on the (finite dimensional) normed vector space $\mathbb{C}^n$:

$$A_{m \times n} : (\mathbb{C}^n, \|\cdot\|_2) \longrightarrow (\mathbb{C}^m, \|\cdot\|_2),$$

where the norm here is taken to be the standard Euclidean norm. Define the induced 2-norm of $A$ as follows:

$$\|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \max_{\|x\|_2 = 1} \|Ax\|_2 .$$

The term "induced" refers to the fact that the definition of a norm for vectors such as $Ax$ and $x$ is what enables the above definition of a matrix norm. From this definition, it follows that the induced norm measures the amount of "amplification" the matrix $A$ provides to vectors on the unit sphere in $\mathbb{C}^n$, i.e. it measures the "gain" of the matrix.

Rather than measuring the vectors $x$ and $Ax$ using the 2-norm, we could use any $p$-norm, the interesting cases being $p = 1, 2, \infty$. Our notation for this is

$$\|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p .$$

An important question to consider is whether or not the induced norm is actually a norm, in the sense defined for vectors earlier. Recall the three conditions that define a norm:

1. $\|x\| \geq 0$, and $\|x\| = 0 \Longleftrightarrow x = 0$;
2. $\|\alpha x\| = |\alpha| \, \|x\|$;
3. $\|x + y\| \leq \|x\| + \|y\|$.

Now let us verify that $\|A\|_p$ is a norm on $\mathbb{C}^{m \times n}$, using the preceding definition:

1. $\|A\|_p \geq 0$, since $\|Ax\|_p \geq 0$ for any $x$. Furthermore, $\|A\|_p = 0 \Longleftrightarrow A = 0$, since $\|A\|_p$ is calculated from the maximum of $\|Ax\|_p$ evaluated on the unit sphere.
2. $\|\alpha A\|_p = |\alpha| \, \|A\|_p$, which follows from $\|\alpha y\|_p = |\alpha| \, \|y\|_p$ (for any $y$).
3. The triangle inequality holds since
$$\|A + B\|_p = \max_{\|x\|_p = 1} \|(A + B)x\|_p \leq \max_{\|x\|_p = 1} \bigl( \|Ax\|_p + \|Bx\|_p \bigr) \leq \|A\|_p + \|B\|_p .$$

Induced norms have two additional properties that are very important:

1. $\|Ax\|_p \leq \|A\|_p \, \|x\|_p$, which is a direct consequence of the definition of an induced norm;
2. For $A_{m \times n}$ and $B_{n \times r}$,
$$\|AB\|_p \leq \|A\|_p \, \|B\|_p ,$$
which is called the submultiplicative property. This also follows directly from the definition:
$$\|ABx\|_p \leq \|A\|_p \, \|Bx\|_p \leq \|A\|_p \, \|B\|_p \, \|x\|_p \quad \text{for any } x .$$
Dividing by $\|x\|_p$,
$$\frac{\|ABx\|_p}{\|x\|_p} \leq \|A\|_p \, \|B\|_p ,$$
from which the result follows.

Before we turn to a more detailed study of ideas surrounding the induced 2-norm, which will be the focus of this lecture and the next, we make some remarks about the other induced norms of practical interest, namely the induced 1-norm and induced $\infty$-norm. We shall also
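The "maximum gain" reading of the induced 2-norm, and the submultiplicative property, can be checked numerically. A sketch, assuming NumPy; the random test matrices and their sizes are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

# Gain ||Ax||_2 / ||x||_2 over many random directions: it never exceeds the
# induced 2-norm np.linalg.norm(A, 2), and approaches it from below.
xs = rng.standard_normal((4, 10000))
gains = np.linalg.norm(A @ xs, axis=0) / np.linalg.norm(xs, axis=0)
print(gains.max(), "<=", np.linalg.norm(A, 2))

# Submultiplicative property: ||AB||_2 <= ||A||_2 ||B||_2.
print(np.linalg.norm(A @ B, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
```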

say something about an important matrix norm that is not an induced norm, namely the Frobenius norm.

It is a fairly simple exercise to prove that

$$\|A\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{m} |a_{ij}| \quad \text{(max of absolute column sums of } A\text{)}$$

and

$$\|A\|_\infty = \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \quad \text{(max of absolute row sums of } A\text{)} .$$

(Note that these definitions reduce to the familiar ones for the 1-norm and $\infty$-norm of column vectors in the case $n = 1$.)

The proof for the induced $\infty$-norm involves two stages, namely:

1. Prove that the quantity above provides an upper bound $\beta$:
$$\|Ax\|_\infty \leq \beta \, \|x\|_\infty \quad \forall x ;$$
2. Show that this bound is achievable for some $x = \hat{x}$:
$$\|A\hat{x}\|_\infty = \beta \, \|\hat{x}\|_\infty \quad \text{for some } \hat{x} .$$

In order to show how these steps can be implemented, we give the details for the $\infty$-norm case. Let $x \in \mathbb{C}^n$ and consider

$$\|Ax\|_\infty = \max_{1 \leq i \leq m} \Bigl| \sum_{j=1}^{n} a_{ij} x_j \Bigr| \leq \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \, |x_j| \leq \Bigl( \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \Bigr) \max_{1 \leq j \leq n} |x_j| = \Bigl( \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \Bigr) \|x\|_\infty .$$

The above inequalities show that an upper bound is given by

$$\max_{\|x\|_\infty = 1} \|Ax\|_\infty \leq \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| .$$

Now in order to show that this upper bound is achieved by some vector $\hat{x}$, let $\bar{i}$ be an index at which the row-sum expression achieves its maximum, that is,

$$\beta = \sum_{j=1}^{n} |a_{\bar{i}j}| .$$

Define the vector $\hat{x}$ as

$$\hat{x} = \begin{pmatrix} \operatorname{sgn}(a_{\bar{i}1}) \\ \operatorname{sgn}(a_{\bar{i}2}) \\ \vdots \\ \operatorname{sgn}(a_{\bar{i}n}) \end{pmatrix}.$$

Clearly $\|\hat{x}\|_\infty = 1$, and

$$\|A\hat{x}\|_\infty = \sum_{j=1}^{n} |a_{\bar{i}j}| = \beta .$$

The proof for the 1-norm proceeds in exactly the same way, and is left to the reader.

There are matrix norms, i.e. functions that satisfy the three defining conditions stated earlier, that are not induced norms. The most important example of this for us is the Frobenius norm:

$$\|A\|_F = \Bigl( \sum_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2 \Bigr)^{1/2} = \bigl( \operatorname{trace}(A^* A) \bigr)^{1/2} \quad \text{(verify)} .$$

In other words, the Frobenius norm is defined as the root sum of squares of the entries, i.e. the usual Euclidean 2-norm of the matrix when it is regarded simply as a vector in $\mathbb{C}^{mn}$. Although it can be shown that it is not an induced matrix norm, the Frobenius norm still has the submultiplicative property that was noted for induced norms. Yet other matrix norms may be defined (some of them without the submultiplicative property), but the ones above are the only ones of interest to us.
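The column-sum and row-sum formulas, the achieving vector $\hat{x}$ built from the sign pattern of the maximizing row, and the entrywise reading of the Frobenius norm can all be confirmed numerically. A sketch, assuming NumPy and an arbitrary random real test matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))

# Induced 1-norm: max absolute column sum; induced inf-norm: max absolute row sum.
print(np.linalg.norm(A, 1),      np.abs(A).sum(axis=0).max())
print(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max())

# The bound ||Ax||_inf <= (max row sum) ||x||_inf is tight: the sign pattern
# of the maximizing row achieves it.
i_bar = np.abs(A).sum(axis=1).argmax()
x_hat = np.sign(A[i_bar])                 # ||x_hat||_inf = 1
print(np.linalg.norm(A @ x_hat, np.inf), np.abs(A).sum(axis=1).max())

# Frobenius norm: root sum of squares of the entries = sqrt(trace(A* A)).
print(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A.T @ A)))
```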

; " # If S = S (ie S equals its Hermitian transose, in which case we s a y S is Hermitian), then there exists a unitary matrix such t h a t U SU = [diagonal matrix] For any matrix A, both A A and AA are Hermitian, and thus can always be diagonalized by unitary matrices For any matrix A, the eigenvalues of A A and AA are always real and non-negative (roved easily by contradiction) Theorem (Singular Value Decomosition, or SVD) Given any matrix A C mn, A can be written as mm mn A = nn U V () where U U = I, V V = I, = () r and i = ith nonzero eigenvalue of A The A i are termed the singular values of A, and ie, are arranged in order of descending magnitude, r > : Proof: We will rove this theorem for the case rank(a) = m the general case involves very little more than what is required for this case The matrix AA is Hermitian, and it can therefore be diagonalized by a unitary matrix U C mm, so that U U = AA : Note that = diag( : : : m) has real ositive diagonal entries i due to the fact that AA is ositive denite We can write = = diag( : : : m) Dene V C mn by V = ; U A V has orthonormal rows as can b e seen from the following calculation: V V = AA U ; = I Choose the matrix V in such a w ay that U V = V V is in C nn and unitary Dene the m n matrix = [ j] This imlies that In other words we h a ve A = U V V = V = U A: One cannot always diagonalize an arbitrary matrix cf the Jordan form

Example 2

For the matrix $A$ given at the beginning of this lecture, the SVD, computed easily in Matlab by writing [u, s, v] = svd(A), is (with entries shown to four decimal places)

$$A = \begin{pmatrix} 0.7068 & -0.7075 \\ 0.7075 & 0.7068 \end{pmatrix} \begin{pmatrix} 200.1 & 0 \\ 0 & 0.1 \end{pmatrix} \begin{pmatrix} 0.7075 & 0.7068 \\ 0.7068 & -0.7075 \end{pmatrix}^T .$$

Observations:

i) $$A A^* = U \Sigma V^* V \Sigma^T U^* = U \begin{pmatrix} \Sigma_r^2 & 0 \\ 0 & 0 \end{pmatrix} U^* ,$$ which tells us $U$ diagonalizes $A A^*$;

ii) $$A^* A = V \Sigma^T U^* U \Sigma V^* = V \begin{pmatrix} \Sigma_r^2 & 0 \\ 0 & 0 \end{pmatrix} V^* ,$$ which tells us $V$ diagonalizes $A^* A$;

iii) If $U$ and $V$ are expressed in terms of their columns, i.e.,

$$U = \begin{pmatrix} u_1 & u_2 & \cdots & u_m \end{pmatrix} \quad \text{and} \quad V = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix},$$

then

$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^* ,$$

which is another way to write the SVD. The $u_i$ are termed the left singular vectors of $A$, and the $v_i$ are its right singular vectors. From this we see that we can alternately interpret $Ax$ as

$$Ax = \sum_{i=1}^{r} \sigma_i u_i \underbrace{(v_i^* x)}_{\text{projection}} ,$$

which is a weighted sum of the $u_i$, where the weights are the products of the singular values and the projections of $x$ onto the $v_i$.

Observation (iii) tells us that $\operatorname{Ra}(A) = \operatorname{span}\{u_1, \ldots, u_r\}$ (because $Ax = \sum_{i=1}^{r} c_i u_i$, where the $c_i$ are scalar weights). Since the columns of $U$ are independent, $\dim \operatorname{Ra}(A) = r = \operatorname{rank}(A)$, and $\{u_1, \ldots, u_r\}$ constitute a basis for the range space of $A$.

The null space of $A$ is given by $\operatorname{span}\{v_{r+1}, \ldots, v_n\}$. To see this:

$$U \Sigma V^* x = 0 \;\Longleftrightarrow\; \Sigma V^* x = 0 \;\Longleftrightarrow\; \begin{pmatrix} \sigma_1 v_1^* x \\ \vdots \\ \sigma_r v_r^* x \end{pmatrix} = 0 \;\Longleftrightarrow\; v_i^* x = 0, \;\; i = 1, \ldots, r \;\Longleftrightarrow\; x \in \operatorname{span}\{v_{r+1}, \ldots, v_n\} .$$
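These range and null-space descriptions can be read off directly from a computed SVD. A sketch, assuming NumPy and a deliberately rank-one test matrix chosen for illustration:

```python
import numpy as np

# Build a rank-one 3x4 matrix, then recover rank, range, and null space.
u = np.array([[1.0], [2.0], [3.0]])
v = np.array([[1.0, -1.0, 0.0, 2.0]])
A = u @ v

U, s, Vh = np.linalg.svd(A)
r = np.sum(s > 1e-12)                    # numerical rank (here r = 1)
range_basis = U[:, :r]                   # span{u_1, ..., u_r} = range of A
null_basis  = Vh[r:, :].conj().T         # span{v_{r+1}, ..., v_n} = null space
print(r)
print(np.allclose(A @ null_basis, 0))    # A annihilates the null-space basis
```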

Example 3

One application of the singular value decomposition is to the solution of a system of algebraic equations. Suppose $A$ is an $m \times n$ complex matrix and $b$ is a vector in $\mathbb{C}^m$. Assume that the rank of $A$ is equal to $k$, with $k < m$. We are looking for a solution of the linear system $Ax = b$. By applying the singular value decomposition procedure to $A$, we get

$$A = U \Sigma V^* = U \begin{pmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{pmatrix} V^* ,$$

where $\Sigma_1$ is a $k \times k$ non-singular diagonal matrix. We will express the unitary matrices $U$ and $V$ columnwise as

$$U = \begin{pmatrix} u_1 & u_2 & \cdots & u_m \end{pmatrix}, \qquad V = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}.$$

A necessary and sufficient condition for the solvability of this system of equations is that $u_i^* b = 0$ for all $i$ satisfying $k < i \leq m$; otherwise, the system of equations is inconsistent. This condition means that the vector $b$ must be orthogonal to the last $m - k$ columns of $U$. Therefore the system of linear equations can be written as

$$\begin{pmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{pmatrix} V^* x = U^* b = \begin{pmatrix} u_1^* b \\ \vdots \\ u_k^* b \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

Using the above equation and the invertibility of $\Sigma_1$, we can rewrite the system of equations as

$$\begin{pmatrix} v_1^* \\ v_2^* \\ \vdots \\ v_k^* \end{pmatrix} x = \begin{pmatrix} \frac{1}{\sigma_1} u_1^* b \\ \frac{1}{\sigma_2} u_2^* b \\ \vdots \\ \frac{1}{\sigma_k} u_k^* b \end{pmatrix}.$$

By using the fact that

$$\begin{pmatrix} v_1^* \\ v_2^* \\ \vdots \\ v_k^* \end{pmatrix} \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix} = I ,$$

we obtain a solution of the form

$$x = \sum_{i=1}^{k} \frac{1}{\sigma_i} (u_i^* b) \, v_i .$$

From the observations that were made earlier, we know that the vectors $v_{k+1}, v_{k+2}, \ldots, v_n$ span the kernel of $A$, and therefore a general solution of the system of linear equations is given by

$$x = \sum_{i=1}^{k} \frac{1}{\sigma_i} (u_i^* b) \, v_i + \sum_{i=k+1}^{n} \alpha_i v_i ,$$

where the coefficients $\alpha_i$, with $i$ in the interval $k + 1 \leq i \leq n$, are arbitrary complex numbers.
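The construction in Example 3 translates directly into a few lines of code. A sketch, assuming NumPy, a hypothetical rank-one $A$, and a right-hand side chosen to satisfy the solvability condition:

```python
import numpy as np

# Solve Ax = b for a rank-k A via the SVD, as in Example 3.
u = np.array([[1.0], [2.0]])
A = u @ np.array([[1.0, 1.0, 0.0]])      # 2x3 matrix with rank k = 1
b = 3.0 * u[:, 0]                        # b lies in the range of A

U, s, Vh = np.linalg.svd(A)
k = np.sum(s > 1e-12)

# Solvability test: b orthogonal to u_{k+1}, ..., u_m.
print(np.allclose(U[:, k:].conj().T @ b, 0))

# Particular solution: sum over i of (u_i* b / sigma_i) v_i.
x0 = Vh[:k, :].conj().T @ (U[:, :k].conj().T @ b / s[:k])
print(np.allclose(A @ x0, b))
```

Adding any combination of the rows Vh[k:, :] to x0 gives the rest of the solution set, mirroring the arbitrary coefficients $\alpha_i$ above.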

Relationship to Matrix Norms

The singular value decomposition can be used to compute the induced 2-norm of a matrix $A$.

Theorem.

$$\|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \sigma_1 = \sigma_{\max}(A) ,$$

which tells us that the maximum amplification is given by the maximum singular value.

Proof:

$$\sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \sup_{x \neq 0} \frac{\|U \Sigma V^* x\|_2}{\|x\|_2} = \sup_{x \neq 0} \frac{\|\Sigma V^* x\|_2}{\|V^* x\|_2} = \sup_{y \neq 0} \frac{\|\Sigma y\|_2}{\|y\|_2} = \sup_{y \neq 0} \frac{\bigl( \sum_{i=1}^{r} \sigma_i^2 |y_i|^2 \bigr)^{1/2}}{\bigl( \sum_{i} |y_i|^2 \bigr)^{1/2}} = \sigma_1 .$$

For $y = [1 \; 0 \; \cdots \; 0]^T$, $\|y\|_2 = 1$, and the supremum is attained. (Notice that this corresponds to $x = v_1$. Hence, $A v_1 = \sigma_1 u_1$.)

Another application of the singular value decomposition is in computing the minimal amplification a full rank matrix exerts on elements with 2-norm equal to 1.

Theorem. Given $A \in \mathbb{C}^{m \times n}$, suppose $\operatorname{rank}(A) = n$. Then

$$\min_{\|x\|_2 = 1} \|Ax\|_2 = \sigma_n(A) .$$

Note that if $\operatorname{rank}(A) < n$, then there is an $x$ such that the minimum is zero (rewrite $A$ in terms of its SVD to see this).

Proof: For any $\|x\|_2 = 1$,

$$\|Ax\|_2 = \|U \Sigma V^* x\|_2 = \|\Sigma V^* x\|_2 \quad \text{(invariant under multiplication by unitary matrices)} = \|\Sigma y\|_2$$

for $y = V^* x$. Now

$$\|\Sigma y\|_2 = \Bigl( \sum_{i=1}^{n} \sigma_i^2 |y_i|^2 \Bigr)^{1/2} \geq \sigma_n ,$$

since $\|y\|_2 = 1$. Note that the minimum is achieved for $y = [0 \; \cdots \; 0 \; 1]^T$; thus the proof is complete.
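Both theorems are easy to probe numerically: the largest gain of $A$ equals $\sigma_{\max}$ and is attained at $x = v_1$, while for a full-column-rank $A$ every unit-norm gain stays above $\sigma_n$. A sketch, assuming NumPy and an arbitrary random test matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))          # full column rank almost surely
U, s, Vh = np.linalg.svd(A)

print(np.isclose(np.linalg.norm(A, 2), s[0]))   # ||A||_2 = sigma_max
print(np.allclose(A @ Vh[0], s[0] * U[:, 0]))   # A v_1 = sigma_1 u_1

# Random unit vectors: gains stay within [sigma_n, sigma_1].
xs = rng.standard_normal((3, 10000))
xs /= np.linalg.norm(xs, axis=0)
gains = np.linalg.norm(A @ xs, axis=0)
print(s[-1] <= gains.min(), gains.max() <= s[0] + 1e-12)
```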

[Figure: Graphical depiction of the mapping involving $A$. Note that $A v_1 = \sigma_1 u_1$ and that $A v_2 = \sigma_2 u_2$.]

The Frobenius norm can also be expressed quite simply in terms of the singular values. We leave you to verify that, for any matrix $A$, not necessarily square,

$$\|A\|_F = \Bigl( \sum_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2 \Bigr)^{1/2} = \bigl( \operatorname{trace}(A^* A) \bigr)^{1/2} = \Bigl( \sum_{i=1}^{r} \sigma_i^2 \Bigr)^{1/2} .$$

Example 4 (Matrix Inequality)

We say $A \geq B$, for two square matrices $A$ and $B$, if $x^* A x \geq x^* B x$ for all $x \neq 0$. It follows that

$$\|A\|_2 \leq 1 \;\Longleftrightarrow\; A^* A \leq I .$$
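Both facts are easy to confirm numerically. A sketch, assuming NumPy; the scaling of $A$ in the second check is a hypothetical construction to force $\|B\|_2 < 1$:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
s = np.linalg.svd(A, compute_uv=False)

# Frobenius norm equals the root sum of squared singular values.
print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))

# ||B||_2 <= 1  iff  B* B <= I, i.e. iff I - B* B is positive semidefinite.
B = A / (s[0] + 0.1)                     # scaled so that ||B||_2 < 1
eigs = np.linalg.eigvalsh(np.eye(4) - B.T @ B)
print(np.all(eigs >= -1e-12))            # I - B* B is PSD
```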

Exercises

Exercise 1. Verify that for any $A$, an $m \times n$ matrix, the following holds:

$$\frac{1}{\sqrt{n}} \|A\|_\infty \leq \|A\|_2 \leq \sqrt{m} \, \|A\|_\infty .$$

Exercise 2. Suppose $A^* = A$. Find the exact relation between the eigenvalues and singular values of $A$. Does this hold if $A$ is not conjugate symmetric?

Exercise 3. Show that if $\operatorname{rank}(A) = 1$, then $\|A\|_F = \|A\|_2$.

Exercise 4. This problem leads you through the argument for the existence of the SVD, using an iterative construction. Showing that $A = U \Sigma V^*$, where $U$ and $V$ are unitary matrices, is equivalent to showing that $U^* A V = \Sigma$.

a) Argue from the definition of $\|A\|_2$ that there exist unit vectors (measured in the 2-norm) $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^m$ such that $Ax = \sigma y$, where $\sigma = \|A\|_2$.

b) We can extend both $x$ and $y$ above to orthonormal bases, i.e. we can find unitary matrices $V_1$ and $U_1$ whose first columns are $x$ and $y$ respectively:

$$V_1 = \begin{pmatrix} x & \tilde{V}_1 \end{pmatrix}, \qquad U_1 = \begin{pmatrix} y & \tilde{U}_1 \end{pmatrix}.$$

Show that one way to do this is via Householder transformations, as follows:

$$V_1 = I - 2 \frac{h h^*}{h^* h} , \qquad h = x - \begin{pmatrix} 1 & 0 & \cdots & 0 \end{pmatrix}^* ,$$

and likewise for $U_1$.

c) Now define $A_1 = U_1^* A V_1$. Why is $\|A_1\|_2 = \|A\|_2$?

d) Note that

$$A_1 = \begin{pmatrix} y^* A x & y^* A \tilde{V}_1 \\ \tilde{U}_1^* A x & \tilde{U}_1^* A \tilde{V}_1 \end{pmatrix} = \begin{pmatrix} \sigma & w^* \\ 0 & B \end{pmatrix}.$$

What is the justification for claiming that the lower left element in the above matrix is 0?

e) Now show that

$$\Bigl\| A_1 \begin{pmatrix} \sigma \\ w \end{pmatrix} \Bigr\|_2 \geq \sigma^2 + w^* w ,$$

and combine this with the fact that $\|A_1\|_2 = \|A\|_2 = \sigma$ to deduce that $w = 0$, so

$$A_1 = \begin{pmatrix} \sigma & 0 \\ 0 & B \end{pmatrix}.$$

At the next iteration, we apply the above procedure to $B$, and so on. When the iterations terminate, we have the SVD. [The reason that this is only an existence proof and not an algorithm is that it begins by invoking the existence of $x$ and $y$, but does not show how to compute them. Very good algorithms do exist for computing the SVD; see Golub and Van Loan's classic, Matrix Computations, Johns Hopkins Press, 1989. The SVD is a cornerstone of numerical computations in a host of applications.]

Exercise 5. Suppose the $m \times n$ matrix $A$ is decomposed in the form

$$A = U \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} V^* ,$$

where $U$ and $V$ are unitary matrices, and $\Sigma$ is an invertible $r \times r$ matrix (the SVD could be used to produce such a decomposition). Then the "Moore-Penrose inverse", or pseudo-inverse, of $A$, denoted by $A^+$, can be defined as the $n \times m$ matrix

$$A^+ = V \begin{pmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^* .$$

(You can invoke it in Matlab with pinv(A).)

a) Show that $A^+ A$ and $A A^+$ are symmetric, and that $A A^+ A = A$ and $A^+ A A^+ = A^+$. (These four conditions actually constitute an alternative definition of the pseudo-inverse.)

b) Show that when $A$ has full column rank, then $A^+ = (A^* A)^{-1} A^*$, and that when $A$ has full row rank, then $A^+ = A^* (A A^*)^{-1}$.

c) Show that, of all $x$ that minimize $\|y - Ax\|_2$ (and there will be many, if $A$ does not have full column rank), the one with smallest length $\|x\|_2$ is given by $\hat{x} = A^+ y$.
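The four conditions in Exercise 5(a) are easy to confirm numerically for a rank-deficient matrix; this is an illustration, not a proof. A sketch, assuming NumPy (np.linalg.pinv corresponds to the Matlab pinv mentioned above) and a hypothetical rank-one test matrix:

```python
import numpy as np

u = np.array([[1.0], [1.0], [0.0]])
A = u @ np.array([[2.0, -1.0]])          # 3x2 matrix with rank 1
Ap = np.linalg.pinv(A)                   # pseudo-inverse, computed via the SVD

print(np.allclose(A @ Ap @ A, A))        # A A+ A = A
print(np.allclose(Ap @ A @ Ap, Ap))      # A+ A A+ = A+
print(np.allclose((A @ Ap).T, A @ Ap))   # A A+ symmetric
print(np.allclose((Ap @ A).T, Ap @ A))   # A+ A symmetric

# Minimum-norm least-squares solution of y ~ Ax, as in part (c).
y = np.array([1.0, 2.0, 3.0])
print(Ap @ y)
```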

Exercise 6. All the matrices in this problem are real. Suppose

$$A = Q \begin{pmatrix} R \\ 0 \end{pmatrix},$$

with $Q$ being an $m \times m$ orthogonal matrix and $R$ an $n \times n$ invertible matrix. (Recall that such a decomposition exists for any matrix $A$ that has full column rank.) Also let $Y$ be an $m \times p$ matrix of the form

$$Y = Q \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix},$$

where the partitioning in the expression for $Y$ is conformable with the partitioning for $A$.

(a) What choice $\hat{X}$ of the $n \times p$ matrix $X$ minimizes the Frobenius norm, or equivalently the squared Frobenius norm, of $Y - AX$? In other words, find

$$\hat{X} = \operatorname{argmin} \|Y - AX\|_F .$$

Also determine the value of $\|Y - A\hat{X}\|_F$. (Your answers should be expressed in terms of the matrices $Q$, $R$, $Y_1$ and $Y_2$.)

(b) Can your $\hat{X}$ in (a) also be written as $(A^T A)^{-1} A^T Y$? Can it be written as $A^+ Y$, where $A^+$ denotes the (Moore-Penrose) pseudo-inverse of $A$?

(c) Now obtain an expression for the choice $\bar{X}$ of $X$ that minimizes

$$\|Y - AX\|_F^2 + \|Z - BX\|_F^2 ,$$

where $Z$ and $B$ are given matrices of appropriate dimensions. (Your answer can be expressed in terms of $A$, $B$, $Y$, and $Z$.)

Exercise 7 (Structured Singular Values). Given a complex square matrix $A$, define the structured singular value function as follows:

$$\mu_{\mathbf{\Delta}}(A) = \frac{1}{\min_{\Delta \in \mathbf{\Delta}} \{ \sigma_{\max}(\Delta) \mid \det(I - \Delta A) = 0 \}} ,$$

where $\mathbf{\Delta}$ is some set of matrices.

a) If $\mathbf{\Delta} = \{ \delta I : \delta \in \mathbb{C} \}$, show that $\mu_{\mathbf{\Delta}}(A) = \rho(A)$, where $\rho$ is the spectral radius of $A$, defined as $\rho(A) = \max_i |\lambda_i|$, and the $\lambda_i$'s are the eigenvalues of $A$.

b) If $\mathbf{\Delta} = \{ \Delta : \Delta \in \mathbb{C}^{n \times n} \}$, show that $\mu_{\mathbf{\Delta}}(A) = \sigma_{\max}(A)$.

c) If $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{C} \}$, show that

$$\mu_{\mathbf{\Delta}}(A) = \mu_{\mathbf{\Delta}}(D^{-1} A D) \leq \sigma_{\max}(D^{-1} A D) ,$$

where $D \in \{ \operatorname{diag}(d_1, \ldots, d_n) \mid d_i > 0 \}$.

Exercise 8. Consider again the structured singular value function of a complex square matrix $A$ defined in the preceding problem. If $A$ has more structure, it is sometimes possible to compute $\mu_{\mathbf{\Delta}}(A)$ exactly. In this problem, assume $A$ is a rank-one matrix, so that we can write $A = u v^*$, where $u, v$ are complex vectors of dimension $n$. Compute $\mu_{\mathbf{\Delta}}(A)$ when

(a) $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{C} \}$;

(b) $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{R} \}$.

To simplify the computation, minimize the Frobenius norm of $\Delta$ in the definition of $\mu_{\mathbf{\Delta}}(A)$.