MATH36001 Generalized Inverses and the SVD 2015


1 Generalized Inverses of Matrices

A matrix has an inverse only if it is square and nonsingular. However, there are theoretical and practical applications for which some kind of partial inverse of a matrix that is singular or even rectangular is needed. A generalized inverse of a matrix $A$ is any matrix $X$ satisfying
$$AXA = A. \tag{1}$$
Note that a generalized inverse $X$ must be the usual inverse when $A$ is nonsingular, since multiplication by $A^{-1}$ on both sides of (1) gives $X = A^{-1}$.

1.1 Illustration: Solvability of Linear Systems

Consider the linear system $Ax = b$, where the matrix $A \in \mathbb{C}^{n\times n}$ and the vector $b \in \mathbb{C}^n$ are given and $x$ is an unknown vector. If $A$ is nonsingular there is a unique solution for $x$, given by $x = A^{-1}b$. When $A$ is singular, there may be no solution or infinitely many.

Theorem 1 Let $A \in \mathbb{C}^{m\times n}$ and $b \in \mathbb{C}^m$. If $X$ is any matrix satisfying $AXA = A$ then $Ax = b$ has a solution if and only if $AXb = b$, in which case the general solution is
$$x = Xb + (I - XA)y, \tag{2}$$
where $y \in \mathbb{C}^n$ is an arbitrary vector.

Proof. Existence: let $X$ be such that $AXA = A$.
($\Rightarrow$) Suppose there exists $x$ such that $Ax = b$. Then by (1), $b = Ax = AXAx = AXb$.
($\Leftarrow$) If $AXb = b$ then $x = Xb$ solves the linear system.
General solution: first we show that $Xb + (I - XA)y$ is a solution. Indeed, $A\bigl(Xb + (I - XA)y\bigr) = AXb + (A - AXA)y = b + 0 = b$. Conversely, if $x$ is any solution then $XAx = Xb$ and we can write $x = XAx + (I - XA)x = Xb + (I - XA)x$, so every solution can be expressed in the form (2). $\square$
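To make Theorem 1 concrete, here is a small NumPy illustration (added here; not part of the original notes). It uses the pseudoinverse as one particular generalized inverse, since the pseudoinverse in particular satisfies (1); the matrix and right-hand sides are made up for the example.

```python
import numpy as np

# A singular 3x3 matrix (rank 2): the solvability test of Theorem 1 applies.
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
X = np.linalg.pinv(A)             # one particular generalized inverse
assert np.allclose(A @ X @ A, A)  # X satisfies (1)

b_good = A @ np.array([1., 1., 1.])   # b in range(A): system is consistent
b_bad  = np.array([1., 0., 0.])       # this b is not in range(A)

print(np.allclose(A @ X @ b_good, b_good))  # True  -> Ax = b_good is solvable
print(np.allclose(A @ X @ b_bad,  b_bad))   # False -> Ax = b_bad has no solution

# General solution (2): x = Xb + (I - XA)y for arbitrary y.
y = np.random.randn(3)
x = X @ b_good + (np.eye(3) - X @ A) @ y
print(np.allclose(A @ x, b_good))           # True for every choice of y
```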

1.2 Existence of Generalized Inverses

We now show that for every $A$ there exist one or more generalized inverses. Let $A \in \mathbb{C}^{m\times n}$ and let $E_k, E_{k-1}, \dots, E_1$ be elementary row operations and $P$ a permutation matrix such that
$$EAP = \begin{bmatrix} I_r & K \\ O & O \end{bmatrix}, \qquad r \le \min(m,n),$$
where $E = E_kE_{k-1}\cdots E_1$. The matrix $\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}$ is the reduced row echelon form of $A$ and $\operatorname{rank}(A) = r$ (see the first year Linear Algebra course). Note that the two right-hand submatrices are absent when $r = n$ and the two lower submatrices are absent when $r = m$. It is easy to verify that any $X$ given by
$$X = P\begin{bmatrix} I_r & O \\ O & L \end{bmatrix}E$$
for some $L \in \mathbb{C}^{(n-r)\times(m-r)}$ satisfies $AXA = A$. Indeed,
$$AXA = AP\begin{bmatrix} I_r & O \\ O & L \end{bmatrix}E\,E^{-1}\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}P^{-1}
     = E^{-1}\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}P^{-1}
     = E^{-1}\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}P^{-1} = A,$$
using $A = E^{-1}\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}P^{-1}$ and the fact that $\begin{bmatrix} I_r & K \\ O & O \end{bmatrix}$ is idempotent.

Example 1 Determine a generalized inverse of
$$A = \begin{bmatrix} 1 & 2 \\ 2 & 0 \\ 0 & 2 \end{bmatrix}. \tag{3}$$
Let us find the reduced row echelon form of $A$:
$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & -4 \\ 0 & 2 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1/2 & 0 \\ 0 & -1/4 & 0 \\ 0 & 1/2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & -4 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix},$$
so
$$E = \begin{bmatrix} 1 & 1/2 & 0 \\ 0 & -1/4 & 0 \\ 0 & 1/2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1/2 & 0 \\ 1/2 & -1/4 & 0 \\ -1 & 1/2 & 1 \end{bmatrix}, \qquad P = I_2.$$
Hence (here $r = n = 2$, so the blocks containing $L$ are absent)
$$X = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}E = \begin{bmatrix} 0 & 1/2 & 0 \\ 1/2 & -1/4 & 0 \end{bmatrix}$$
is a generalized inverse of $A$. Check that $AXA = A$.

1.3 The Moore–Penrose Generalized Inverse

The Moore–Penrose generalized inverse of a matrix $A \in \mathbb{C}^{m\times n}$ is the unique matrix $X \in \mathbb{C}^{n\times m}$ satisfying the four Moore–Penrose conditions
$$\text{(i) } AXA = A, \qquad \text{(ii) } XAX = X, \qquad \text{(iii) } (AX)^* = AX, \qquad \text{(iv) } (XA)^* = XA. \tag{4}$$
The Moore–Penrose generalized inverse of $A$ is a generalized inverse since it satisfies (1). It is commonly called the pseudoinverse of $A$ and is denoted by $A^+$. We show the existence of $A^+$ via the singular value decomposition of $A$ in Section 2.1.
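As a quick numerical check (added here; not in the original notes), the following NumPy sketch verifies that the matrix $X$ from Example 1 satisfies condition (i), and also shows that it satisfies (ii) and (iv) but not (iii): it is a generalized inverse without being the Moore–Penrose one.

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 0.],
              [0., 2.]])
X = np.array([[0.,   1/2,  0.],
              [1/2, -1/4,  0.]])

print(np.allclose(A @ X @ A, A))          # (i)   AXA = A       -> True
print(np.allclose(X @ A @ X, X))          # (ii)  XAX = X       -> True
print(np.allclose((A @ X).T, A @ X))      # (iii) AX Hermitian  -> False
print(np.allclose((X @ A).T, X @ A))      # (iv)  XA Hermitian  -> True (XA = I_2)

# The pseudoinverse satisfies all four conditions and differs from X.
print(np.allclose(np.linalg.pinv(A), X))  # False
```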

2 Singular Value Decomposition

The spectral theorem says that normal matrices can be unitarily diagonalized using a basis of eigenvectors. The singular value decomposition can be seen as a generalization of the spectral theorem to arbitrary, not necessarily square, matrices.

Theorem 2 (Singular value decomposition (SVD)) Every matrix $A \in \mathbb{C}^{m\times n}$ has a singular value decomposition (SVD)
$$A = U\Sigma V^*,$$
where $U \in \mathbb{C}^{m\times m}$ and $V \in \mathbb{C}^{n\times n}$ are unitary and $\Sigma = \operatorname{diag}(\sigma_1,\dots,\sigma_p) \in \mathbb{R}^{m\times n}$, $p = \min(m,n)$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0$. If $A$ is real, $U$ and $V$ can be taken real orthogonal.

Proof. (Not examinable.) Let $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^m$ be unit 2-norm vectors ($\|x\|_2 = \|y\|_2 = 1$) such that $Ax = \sigma y$, where $\sigma = \|A\|_2$. Since any orthonormal set can be extended to form an orthonormal basis for the whole space, we can define $V_1 \in \mathbb{C}^{n\times(n-1)}$ and $U_1 \in \mathbb{C}^{m\times(m-1)}$ such that $V = [x, V_1]$ and $U = [y, U_1]$ are unitary. Then
$$U^*AV = \begin{bmatrix} y^*Ax & y^*AV_1 \\ U_1^*Ax & U_1^*AV_1 \end{bmatrix} = \begin{bmatrix} \sigma & w^* \\ 0 & B \end{bmatrix} \equiv A_1,$$
where $w^* = y^*AV_1$ and $B = U_1^*AV_1$ (note that $y^*Ax = \sigma y^*y = \sigma$ and $U_1^*Ax = \sigma U_1^*y = 0$). We now show that $w = 0$. We have
$$\left\| A_1\begin{bmatrix} \sigma \\ w \end{bmatrix} \right\|_2 = \left\| \begin{bmatrix} \sigma^2 + w^*w \\ Bw \end{bmatrix} \right\|_2 \ge \sigma^2 + w^*w = (\sigma^2 + w^*w)^{1/2}\left\| \begin{bmatrix} \sigma \\ w \end{bmatrix} \right\|_2,$$
so $\|A_1\|_2 \ge (\sigma^2 + w^*w)^{1/2}$. But $\|A_1\|_2 = \|U^*AV\|_2 = \|A\|_2 = \sigma$ since $U$ and $V$ are unitary. This implies that $w = 0$ and
$$U^*AV = \begin{bmatrix} \sigma & 0 \\ 0 & B \end{bmatrix}.$$
The proof is completed by the obvious induction on $B$. $\square$

The $\sigma_i$ are called the singular values of $A$. The nonzero singular values of $A$ are the positive square roots of the nonzero eigenvalues of $AA^*$ or $A^*A$. The columns of $U = [u_1,\dots,u_m]$ and $V = [v_1,\dots,v_n]$ are left and right singular vectors of $A$, respectively. The left singular vectors $u_i$ are eigenvectors of $AA^*$ and the right singular vectors $v_i$ are eigenvectors of $A^*A$.

Suppose that the singular values satisfy
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_p = 0, \qquad p = \min(m,n). \tag{5}$$
Then (see Exercise 2)
$$\operatorname{rank}(A) = r, \qquad \operatorname{null}(A) = \operatorname{span}\{v_{r+1},\dots,v_n\}, \tag{6}$$
$$\operatorname{range}(A) = \operatorname{span}\{u_1, u_2, \dots, u_r\}, \tag{7}$$
$$A = \sum_{i=1}^r \sigma_i u_i v_i^*, \qquad Av_i = \sigma_i u_i, \quad A^*u_i = \sigma_i v_i, \quad i = 1,\dots,r. \tag{8}$$

A geometric interpretation of the SVD: we can think of $A \in \mathbb{C}^{m\times n}$ as mapping $x \in \mathbb{C}^n$ to $y = Ax \in \mathbb{C}^m$. Then we can choose one orthogonal coordinate system for $\mathbb{C}^n$ (where the unit axes are the columns of $V$) and another orthogonal coordinate system for $\mathbb{C}^m$ (where the unit axes are the columns of $U$) such that $A$ is diagonal ($\Sigma$), that is, $A$ maps a vector $x = \sum_{i=1}^n \beta_i v_i$ to $y = Ax = \sum_{i=1}^p \sigma_i\beta_i u_i$. In other words, any matrix is diagonal, provided we pick appropriate orthogonal coordinate systems for its domain and range!
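The relations (6)–(8) are easy to check numerically. The following NumPy sketch (an added illustration, not from the notes) does so for a random matrix of known rank:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r by construction

U, s, Vh = np.linalg.svd(A)       # A = U @ diag(s) @ Vh, singular values descending
print(np.sum(s > 1e-12))          # = r: rank(A) is the number of nonzero sigma_i

# (8): A equals the sum of the r rank-one terms sigma_i u_i v_i^*
A_sum = sum(s[i] * np.outer(U[:, i], Vh[i, :]) for i in range(r))
print(np.allclose(A, A_sum))      # True

# (6): the last n - r right singular vectors span null(A)
print(np.allclose(A @ Vh[r:, :].T, 0))                  # True

# (7): A v_i = sigma_i u_i for i <= r, so u_1,...,u_r span range(A)
print(np.allclose(A @ Vh[:r, :].T, U[:, :r] * s[:r]))   # True
```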

Example 2 Compute the singular value decomposition of $A$ in (3).
The eigenvalues of $A^TA = \begin{bmatrix} 5 & 2 \\ 2 & 8 \end{bmatrix}$ are 9 and 4, so the singular values of $A$ are 3 and 2. Normalized eigenvectors of $A^TA$ are
$$v_1 = \frac{1}{\sqrt5}\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad v_2 = \frac{1}{\sqrt5}\begin{bmatrix} -2 \\ 1 \end{bmatrix},$$
and then
$$u_1 = \frac13 Av_1 = \frac{1}{3\sqrt5}\begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix}, \qquad u_2 = \frac12 Av_2 = \frac{1}{\sqrt5}\begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix}.$$
Application of the Gram–Schmidt process to $u_1$, $u_2$ and $e_1$ produces
$$u_3 = \frac{e_1 - \sum_{i=1}^2 (e_1^Tu_i)u_i}{\left\| e_1 - \sum_{i=1}^2 (e_1^Tu_i)u_i \right\|_2} = \begin{bmatrix} 2/3 \\ -1/3 \\ -2/3 \end{bmatrix}.$$
$A$ has the SVD $A = U\Sigma V^T$, where
$$U = \frac{1}{3\sqrt5}\begin{bmatrix} 5 & 0 & 2\sqrt5 \\ 2 & -6 & -\sqrt5 \\ 4 & 3 & -2\sqrt5 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 3 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix}, \qquad V = \frac{1}{\sqrt5}\begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}.$$

What is the singular value decomposition of a nonzero vector $x \in \mathbb{C}^n$? Let $A = U\Sigma V^*$ be a singular value decomposition of $A = x$ with $U \in \mathbb{C}^{n\times n}$ and $V \in \mathbb{C}^{1\times1}$ unitary and $\Sigma = \begin{bmatrix} \sigma_1 \\ 0_{n-1} \end{bmatrix} \in \mathbb{R}^{n\times1}$. Note that $x^*x$ is a positive scalar, so $\sigma_1 = (x^*x)^{1/2} = \|x\|_2$ and we can take $V = [v_1] = [1]$. We have from (8) that $u_1 = \frac{1}{\sigma_1}Av_1 = \frac{x}{\|x\|_2}$. Let $\tilde U$ be any $n\times(n-1)$ matrix with orthonormal columns satisfying $u_1^*\tilde U = 0$. Then $U = [u_1\ \tilde U]$ is unitary and
$$U\Sigma V^* = \begin{bmatrix} \dfrac{x}{\|x\|_2} & \tilde U \end{bmatrix}\begin{bmatrix} \|x\|_2 \\ 0 \end{bmatrix}[1] = x.$$

2.1 Existence of the Moore–Penrose Inverse

Recall that the pseudoinverse $X \in \mathbb{C}^{n\times m}$ of $A \in \mathbb{C}^{m\times n}$ is the unique matrix satisfying the four Moore–Penrose conditions (4), and it is denoted by $A^+$. The next theorem shows that any matrix $A$ has a pseudoinverse.

Theorem 3 If $A = U\Sigma V^* \in \mathbb{C}^{m\times n}$ is an SVD then $A^+ = V\Sigma^+U^*$, where $\Sigma^+ = \operatorname{diag}(\sigma_1^{-1},\dots,\sigma_r^{-1},0,\dots,0)$ is $n\times m$ and $r = \operatorname{rank}(A)$.

Proof. One easily checks that $\Sigma^+$ and $A^+$ satisfy the four Moore–Penrose conditions (4), so that they are the pseudoinverses of $\Sigma$ and $A$, respectively. $\square$

Example 3 From the SVD of $A$ in (3) obtained in Example 2 we have that
$$A^+ = \frac{1}{18}\begin{bmatrix} 2 & 8 & -2 \\ 4 & -2 & 5 \end{bmatrix}.$$

In general it is not the case that $(AB)^+ = B^+A^+$ for $A \in \mathbb{C}^{m\times n}$, $B \in \mathbb{C}^{n\times p}$. For example, let
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}.$$
Then
$$(AB)^+ = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}^+ = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \ne \frac12\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = B^+A^+.$$
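The formula of Theorem 3 is essentially how `np.linalg.pinv` computes the pseudoinverse. A short NumPy sketch (added for illustration) reproducing Example 3 and the counterexample above:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 0.],
              [0., 2.]])

# Pseudoinverse via the SVD (Theorem 3): A+ = V Sigma+ U*
U, s, Vh = np.linalg.svd(A, full_matrices=False)
A_plus = Vh.T @ np.diag(1 / s) @ U.T       # A has full column rank, so all s > 0
print(np.allclose(A_plus, np.array([[2., 8., -2.],
                                    [4., -2., 5.]]) / 18))    # True (Example 3)
print(np.allclose(A_plus, np.linalg.pinv(A)))                 # True

# (AB)+ != B+ A+ in general:
A2 = np.array([[1., 0.], [0., 0.]])
B2 = np.array([[1., 0.], [1., 0.]])
lhs = np.linalg.pinv(A2 @ B2)
rhs = np.linalg.pinv(B2) @ np.linalg.pinv(A2)
print(np.allclose(lhs, rhs))                                  # False
```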

[Figure 1: Low-rank approximation of Dürer's magic square. Panels: rank = 359 (full rank), rank = 1, rank = 20, rank = 100.]

2.2 Low-Rank Approximation

The singular values indicate how near a given matrix is to a matrix of low rank.

Theorem 4 Let the SVD of $A \in \mathbb{C}^{m\times n}$ be given by Theorem 2. Write $U = [u_1,\dots,u_m]$ and $V = [v_1,\dots,v_n]$. If $k < r = \operatorname{rank}(A)$ and $A_k = \sum_{i=1}^k \sigma_iu_iv_i^*$ then
$$\min_{\operatorname{rank}(B)=k} \|A - B\|_2 = \|A - A_k\|_2 = \sigma_{k+1}.$$

Proof. Since $U^*A_kV = \operatorname{diag}(\sigma_1,\dots,\sigma_k,0,\dots,0)$, it follows that $\operatorname{rank}(A_k) = k$ and that $U^*(A - A_k)V = \operatorname{diag}(0,\dots,0,\sigma_{k+1},\dots,\sigma_p)$, and so $\|A - A_k\|_2 = \sigma_{k+1}$. The theorem is proved if we can show that $\|A - B\|_2 \ge \sigma_{k+1}$ for all matrices $B$ of rank $k$. Let $B$ be any matrix of rank $k$, so its null space is of dimension $n - k$. Let $\mathcal{V}_{k+1} = \operatorname{span}\{v_1,\dots,v_{k+1}\}$. Then $\dim \mathcal{V}_{k+1} = k+1$ and $\operatorname{null}(B) \cap \mathcal{V}_{k+1} \ne \{0\}$ since the sum of their dimensions is $(n-k) + (k+1) > n$. Let $z \in \operatorname{null}(B) \cap \mathcal{V}_{k+1}$ be of unit 2-norm (i.e., $\|z\|_2 = 1$), so that $Bz = 0$ and $z = \sum_{i=1}^{k+1}\zeta_iv_i$ with $\sum_{i=1}^{k+1}|\zeta_i|^2 = 1$. Then
$$\|A - B\|_2^2 \ge \|(A - B)z\|_2^2 = \|Az\|_2^2 = \|U\Sigma V^*z\|_2^2 = \|\Sigma V^*z\|_2^2 = \sum_{i=1}^{k+1}\sigma_i^2|\zeta_i|^2 \ge \sigma_{k+1}^2\sum_{i=1}^{k+1}|\zeta_i|^2 = \sigma_{k+1}^2. \qquad \square$$

Example 4 (Image Compression) An $m\times n$ grayscale image is just an $m\times n$ matrix where entry $(i,j)$ is interpreted as the brightness of pixel $(i,j)$, ranging, say, from 0 (black) to 1 (white). Entries between 0 and 1 correspond to various shades of grey. Now let $A$ be the matrix representing a detail from Albrecht Dürer's engraving Melencolia I from 1514 showing a $4\times4$ magic square (see the first plot in Figure 1). The matrix $A$ is of size $359\times371$ and of full rank. Its singular values decrease rapidly (only one exceeds $10^4$ and only six exceed $10^3$). The three other plots in Figure 1 show the low-rank approximations $A_k = \sum_{i=1}^k\sigma_iu_iv_i^*$ for $k = 1$, 20, and 100. The checkerboard-like structure of $A_1$ is typical of very low-rank approximations. The individual numerals are recognizable in the $k = 20$ approximation. There is hardly any difference between the $k = 100$ approximation and the full-rank image. Although low-rank matrix approximations to images do require less computer storage and transmission time than the full image, there are more effective data compression techniques. The primary uses of low-rank approximations in image processing involve feature recognition, such as in handwritten digits, faces, and fingerprints.
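A minimal NumPy sketch (added for illustration) of the best rank-$k$ approximation of Theorem 4; a random matrix stands in for the Dürer image here:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((359, 371))          # stand-in for a grayscale image

U, s, Vh = np.linalg.svd(A, full_matrices=False)

def best_rank_k(k):
    """Truncated SVD: A_k = sum over i <= k of sigma_i u_i v_i^*."""
    return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

for k in (1, 20, 100):
    A_k = best_rank_k(k)
    err = np.linalg.norm(A - A_k, 2)          # spectral norm of the error
    print(k, np.isclose(err, s[k]))           # ||A - A_k||_2 = sigma_{k+1} = s[k]
```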

3 Projectors

Let $S$ be a subspace of $\mathbb{C}^m$ and let $P_S \in \mathbb{C}^{m\times m}$. $P_S$ is a projector onto $S$ if $\operatorname{range}(P_S) = S$ and $P_S^2 = P_S$. The projector is not unique. $P_S$ is the orthogonal projector onto $S$ if $\operatorname{range}(P_S) = S$, $P_S^2 = P_S$, and $P_S^* = P_S$. The orthogonal projector is unique (see Exercise 6). Also, $P_{S^\perp} = I - P_S$ is the orthogonal projector onto the orthogonal complement of $S$ in $\mathbb{C}^m$, denoted by $S^\perp$.

(Solution to Exercise 6.) Let $P_1$ and $P_2$ be orthogonal projectors onto $S$. Since $\operatorname{range}(P_1) = \operatorname{range}(P_2)$, $P_2 = P_1X$ for some $X$. Then $P_1P_2 = P_1^2X = P_1X = P_2$. Likewise, $P_2P_1 = P_1$. Hence, for any $z$,
$$\|(P_1 - P_2)z\|_2^2 = z^*(P_1 - P_2)^*(P_1 - P_2)z = z^*(P_1^2 + P_2^2 - P_1P_2 - P_2P_1)z = z^*(P_1 + P_2 - P_2 - P_1)z = 0.$$
Therefore $P_1 - P_2 = O$.

Example 5 Is the rank-one matrix $\dfrac{xy^*}{y^*x}$, with $x, y \in \mathbb{C}^n$ and $y^*x \ne 0$, a projector? Is it an orthogonal projector?
$\dfrac{xy^*}{y^*x}$ is idempotent since $\left(\dfrac{xy^*}{y^*x}\right)\left(\dfrac{xy^*}{y^*x}\right) = \dfrac{x(y^*x)y^*}{(y^*x)^2} = \dfrac{xy^*}{y^*x}$. For any $u \in \mathbb{C}^n$, $\dfrac{xy^*}{y^*x}u = \dfrac{y^*u}{y^*x}\,x \in \operatorname{range}(x)$, so $P_{\operatorname{range}(x)} := \dfrac{xy^*}{y^*x}$ is a projector onto $\operatorname{range}(x)$. If $x = \alpha y$ for some $\alpha \in \mathbb{R}$, then $P_{\operatorname{range}(x)} = \dfrac{xx^*}{x^*x}$ is an orthogonal projector onto $\operatorname{range}(x)$ since $\dfrac{xx^*}{x^*x}$ is Hermitian.

Let $A \in \mathbb{C}^{m\times n}$ be such that $\operatorname{rank}(A) = r$. In terms of the SVD of $A$, $A = U\Sigma V^*$ with $U = [U_1\ U_2]$ and $V = [V_1\ V_2]$ unitary and partitioned such that $U_1 \in \mathbb{C}^{m\times r}$ and $V_1 \in \mathbb{C}^{n\times r}$, the orthogonal projectors onto the four fundamental subspaces of $A$ are given by (see Exercise 7)
$$P_{\operatorname{range}(A)} = U_1U_1^*, \qquad P_{\operatorname{null}(A^*)} = U_2U_2^*, \qquad P_{\operatorname{range}(A^*)} = V_1V_1^*, \qquad P_{\operatorname{null}(A)} = V_2V_2^*.$$
In terms of the pseudoinverse,
$$P_{\operatorname{range}(A)} = AA^+, \qquad P_{\operatorname{null}(A^*)} = I - AA^+, \qquad P_{\operatorname{range}(A^*)} = A^+A, \qquad P_{\operatorname{null}(A)} = I - A^+A.$$
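A small NumPy check (added here for illustration) that the SVD-based and pseudoinverse-based formulas produce the same orthogonal projector onto $\operatorname{range}(A)$:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r

U, s, Vh = np.linalg.svd(A)
U1 = U[:, :r]

P_range = U1 @ U1.T                                   # U_1 U_1^*
print(np.allclose(P_range, A @ np.linalg.pinv(A)))    # same as A A+
print(np.allclose(P_range @ P_range, P_range))        # idempotent
print(np.allclose(P_range.T, P_range))                # Hermitian
print(np.allclose(P_range @ A, A))                    # fixes range(A)
```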

4 Least Squares Problems

Consider the system of linear equations $Ax = b$, where $A \in \mathbb{C}^{m\times n}$ and $b \in \mathbb{C}^m$ are given and $x$ is an unknown vector. Theorem 1 says that $Ax = b$ has a solution if and only if $AA^+b = b$ or, equivalently, $b \in \operatorname{range}(A)$. In this case the general solution is $x = A^+b + (I - A^+A)y$ for any $y \in \mathbb{C}^n$.

Theorem 5 (Minimum 2-norm solution) For given $A \in \mathbb{C}^{m\times n}$ and $b \in \mathbb{C}^m$ with $b \in \operatorname{range}(A)$, the vector $x = A^+b$ is the solution of minimum 2-norm amongst all the solutions to $Ax = b$.

Proof. We will need the following fact.

Fact 1 For $u, v \in \mathbb{C}^n$: $u^*v = 0 \implies \|u + v\|_2^2 = \|u\|_2^2 + \|v\|_2^2$. This follows directly from $\|u + v\|_2^2 = (u + v)^*(u + v) = \|u\|_2^2 + \|v\|_2^2 + 2\operatorname{Re}(u^*v)$.

Now take $u = A^+b$ and $v = (I - A^+A)y$ with $y \in \mathbb{C}^n$ arbitrary and, using (4) (ii) and (iv), check that
$$v^*u = y^*(I - A^+A)^*A^+b = y^*(A^+ - A^+AA^+)b = y^*(A^+ - A^+)b = 0.$$
It follows that for the general solution $x = A^+b + (I - A^+A)y$ to $Ax = b$,
$$\|x\|_2^2 = \|A^+b\|_2^2 + \|(I - A^+A)y\|_2^2,$$
and the 2-norm of $x$ is minimal when $(I - A^+A)y = 0$, in which case $x = A^+b$. (Note that $(I - A^+A)y = 0 \iff y \in \operatorname{null}(A)^\perp = \operatorname{range}(A^*)$; see Exercise 7(i).) $\square$
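In NumPy (an added illustration, not from the notes), the minimum 2-norm solution of a consistent underdetermined system is what both `np.linalg.pinv` and `np.linalg.lstsq` return:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))      # underdetermined: infinitely many solutions
b = A @ rng.standard_normal(5)       # consistent by construction

x_min = np.linalg.pinv(A) @ b        # x = A+ b, Theorem 5
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_min, x_lstsq))   # True: lstsq also returns the min-norm solution

# Any other solution x = A+ b + (I - A+ A) y has larger 2-norm.
y = rng.standard_normal(5)
x_other = x_min + (np.eye(5) - np.linalg.pinv(A) @ A) @ y
print(np.allclose(A @ x_other, b))                        # still a solution
print(np.linalg.norm(x_min) <= np.linalg.norm(x_other))   # True
```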

Theorem 1 says that there is no solution to $Ax = b$ when $b \notin \operatorname{range}(A)$. However, for some purposes we may be satisfied with a least squares solution, which is a vector $x \in \mathbb{C}^n$ such that $\|Ax - b\|_2$ is minimized.

Theorem 6 (Least squares solutions) For given $A \in \mathbb{C}^{m\times n}$ and $b \in \mathbb{C}^m$ the vectors
$$x = A^+b + (I - A^+A)y, \qquad y \in \mathbb{C}^n \text{ arbitrary},$$
minimize $\|Ax - b\|_2$. Moreover, $x_{LS} = A^+b$ is the least squares solution of minimum 2-norm.

Proof. Let $r = \operatorname{rank}(A)$ and let $A = U\Sigma V^*$ be an SVD. For any $x \in \mathbb{C}^n$ we have
$$\|Ax - b\|_2^2 = \|U^*AV(V^*x) - U^*b\|_2^2 = \|\Sigma y - c\|_2^2 \qquad (y = V^*x,\ c = U^*b)$$
$$= \sum_{i=1}^r |\sigma_iy_i - c_i|^2 + \sum_{i=r+1}^m |c_i|^2.$$
Hence $\|Ax - b\|_2^2$ is minimized if and only if $y_i = c_i/\sigma_i = u_i^*b/\sigma_i$, $i = 1,\dots,r$, where $y_{r+1},\dots,y_n$ are arbitrary. Hence
$$x = Vy = V\begin{bmatrix} u_1^*b/\sigma_1 \\ \vdots \\ u_r^*b/\sigma_r \\ y_{r+1} \\ \vdots \\ y_n \end{bmatrix} = \sum_{i=1}^r \frac{u_i^*b}{\sigma_i}v_i + \sum_{i=r+1}^n y_iv_i,$$
with $y_i$ arbitrary for $i = r+1,\dots,n$. The formula for $x$ in the theorem follows from
$$A^+b = V\Sigma^+U^*b = \sum_{i=1}^r \frac{u_i^*b}{\sigma_i}v_i$$
and $\sum_{i=r+1}^n y_iv_i = (I - A^+A)z$ for some $z \in \mathbb{C}^n$, since $I - A^+A$ is the orthogonal projector onto $\operatorname{null}(A) = \operatorname{span}\{v_{r+1},\dots,v_n\}$. Since $V$ is unitary, $\|x\|_2 = \|y\|_2$, and we get the minimum norm solution $x_{LS}$ by setting $y_{r+1} = \cdots = y_n = 0$. Therefore $x_{LS} = Vy = \sum_{i=1}^r y_iv_i = \sum_{i=1}^r (u_i^*b/\sigma_i)v_i = A^+b$. $\square$

To summarize, we have the following diagram: see Ortega, p. 169. [Diagram not reproduced here.]

5 Polar Decomposition

The polar decomposition is the generalization to matrices of the familiar polar representation $z = re^{i\theta}$ of a complex number. It is intimately related to the singular value decomposition (SVD), as our proof of the decomposition reveals.

Theorem 7 (Polar decomposition) Any $A \in \mathbb{C}^{m\times n}$ with $m \ge n$ can be represented in the form $A = QH$, where $Q \in \mathbb{C}^{m\times n}$ has orthonormal columns and $H \in \mathbb{C}^{n\times n}$ is Hermitian positive semidefinite. If $A$ is real then $Q$ can be taken real orthogonal and $H$ symmetric positive semidefinite.

Proof. Let $A = U\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0_{m-r,n-r} \end{bmatrix}V^*$ be an SVD, $r = \operatorname{rank}(A)$. Then
$$A = U\begin{bmatrix} I_r & 0 \\ 0 & I_{m-r,n-r} \end{bmatrix}V^* \; V\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0_{n-r} \end{bmatrix}V^* \equiv QH. \tag{9}$$
$Q$ has orthonormal columns since $U$, $V$ are unitary and $\begin{bmatrix} I_r & 0 \\ 0 & I_{m-r,n-r} \end{bmatrix}$ has orthonormal columns. $H$ is Hermitian since $H^* = H$, and positive semidefinite since all its eigenvalues are real and nonnegative. $\square$

Example 6 (Aligning two objects) Consider the molecule $A$, which we will specify by the coordinates $a_1,\dots,a_7$ of the centers of the seven spheres that represent some of its atoms, and let $B$ be a second molecule. Define the $3\times7$ matrices $A = [a_1\ \dots\ a_7]$ and $B = [b_1\ \dots\ b_7]$.

[Figure: Molecule A and Molecule B.]

How can we tell whether molecule $A$ and molecule $B$ are the same, that is, whether $B$ was obtained by rotating $A$? In other words, is $A = QB$ for some orthogonal matrix $Q$? Because life is about change and imperfection, we do not expect to obtain exact equality, but we want to make the difference between $A$ and $QB$ as small as possible. So our task is to solve the following problem:
$$\text{minimize } \|A - QB\|_F \text{ subject to } Q^TQ = I.$$
This is called an orthogonal Procrustes problem. It is shown in Exercise 5 that the transpose of the solution $Q$ is the orthogonal polar factor of $BA^T$.
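A NumPy sketch (added for illustration) of the polar decomposition via the SVD, and of the Procrustes solution described in Example 6 and Exercise 5; the molecule coordinates here are random stand-ins:

```python
import numpy as np

def polar(A):
    """Polar decomposition A = Q H via the SVD (Theorem 7)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    Q = U @ Vh                         # factor with orthonormal columns
    H = Vh.T @ np.diag(s) @ Vh         # Hermitian positive semidefinite factor
    return Q, H

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 7))                        # "molecule A"
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))       # a random orthogonal matrix
B = R.T @ A                                            # "molecule B": rotated copy

# Orthogonal Procrustes: Q^T is the orthogonal polar factor of B A^T.
Qt, _ = polar(B @ A.T)
Q = Qt.T
print(np.allclose(Q @ B, A))                           # True: rotation recovered
print(np.linalg.norm(A - Q @ B, 'fro'))                # ~0
```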

Exercises

1. Let $A \in \mathbb{C}^{m\times n}$, $m \ge n$. If $\operatorname{rank}(A) = n$, show that $A^*A$ is nonsingular and that $A^+ = (A^*A)^{-1}A^*$.

2. Let $A \in \mathbb{C}^{m\times n}$ with $m \ge n$ be such that $\operatorname{rank}(A) = r$. The matrix $A$ has the singular value decomposition $A = U\Sigma V^*$, where $U = [u_1,\dots,u_m] \in \mathbb{C}^{m\times m}$ and $V = [v_1,\dots,v_n] \in \mathbb{C}^{n\times n}$ are unitary and $\Sigma = \operatorname{diag}(\sigma_1,\dots,\sigma_n) \in \mathbb{R}^{m\times n}$, where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_n = 0$. Show that
$$\operatorname{null}(A) = \operatorname{span}\{v_{r+1},\dots,v_n\}, \qquad \operatorname{range}(A) = \operatorname{span}\{u_1, u_2, \dots, u_r\}.$$

3. For $A \in \mathbb{C}^{n\times n}$ use the SVD to find expressions for $\|A\|_2$ and $\|A\|_F$ in terms of the singular values of $A$. Hence obtain a bound of the form $c_1\|A\|_2 \le \|A\|_F \le c_2\|A\|_2$, where $c_1$ and $c_2$ are constants that depend on $n$. When is there equality in these two inequalities?

4. Show that the pseudoinverse $A^+$ of $A \in \mathbb{C}^{m\times n}$ solves the problem
$$\min_{X \in \mathbb{C}^{n\times m}} \|AX - I_m\|_F.$$
[Hint: reduce it to $m$ standard least squares problems.] Is the solution unique?

5. Let $A, B \in \mathbb{R}^{m\times n}$. Recall that $\|A\|_F^2 = \operatorname{trace}(A^TA)$ and that, for any matrix $D$ for which the product $AD$ is defined, $\operatorname{trace}(AD) = \operatorname{trace}(DA)$.
(i) Show that the $Q$ that minimizes $\|A - QB\|_F$ over all choices of orthogonal $Q$ also maximizes $\operatorname{trace}(A^TQB)$.
(ii) Suppose that the SVD of the $m\times m$ matrix $BA^T$ is $U\Sigma V^T$, where $U$ and $V$ are $m\times m$ and orthogonal and $\Sigma$ is diagonal with diagonal entries $\sigma_1 \ge \cdots \ge \sigma_m \ge 0$. Define $Z = V^TQU$. Use these definitions and (i) to show that $\operatorname{trace}(A^TQB) = \operatorname{trace}(Z\Sigma) \le \sum_{i=1}^m \sigma_i$.
(iii) Identify the choice of $Q$ that gives equality in the bound of (ii).
(iv) Carefully state a theorem summarizing the solution to the minimization of $\|A - QB\|_F$.

6. Show that the orthogonal projector onto a subspace $S$ is unique.

7. Let $A \in \mathbb{C}^{m\times n}$.
(i) Show that $(\operatorname{range}(A))^\perp = \operatorname{null}(A^*)$ and that $(\operatorname{null}(A))^\perp = \operatorname{range}(A^*)$.
(ii) Show that $AA^+$, $A^+A$, $I - A^+A$ and $I - AA^+$ are the orthogonal projectors onto $\operatorname{range}(A)$, $\operatorname{range}(A^*)$, $\operatorname{null}(A)$ and $\operatorname{null}(A^*)$, respectively.
(iii) Suppose $\operatorname{rank}(A) = r$ and let $A = U\Sigma V^*$ with $U = [U_1\ U_2]$ and $V = [V_1\ V_2]$ unitary and partitioned such that $U_1 \in \mathbb{C}^{m\times r}$ and $V_1 \in \mathbb{C}^{n\times r}$. Show that $U_1U_1^*$, $V_1V_1^*$, $V_2V_2^*$ and $U_2U_2^*$ are the orthogonal projectors onto $\operatorname{range}(A)$, $\operatorname{range}(A^*)$, $\operatorname{null}(A)$ and $\operatorname{null}(A^*)$, respectively.

8. What is the pseudoinverse of a vector $x \in \mathbb{C}^n$?

9. Is it true that $(A^+)^+ = A$?
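For Exercises 8 and 9, a quick numerical sanity check (added here; it does not replace the proofs). From the SVD of a vector worked out in Section 2, the pseudoinverse of a nonzero $x \in \mathbb{C}^n$ is $x^*/(x^*x)$, and applying $+$ twice returns $A$:

```python
import numpy as np

rng = np.random.default_rng(5)

# Exercise 8: the pseudoinverse of a (column) vector x is x*/(x*x).
x = rng.standard_normal((4, 1))
print(np.allclose(np.linalg.pinv(x), x.T / (x.T @ x)))    # True

# Exercise 9: (A+)+ = A.
A = rng.standard_normal((5, 3))
print(np.allclose(np.linalg.pinv(np.linalg.pinv(A)), A))  # True
```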