Linear Algebra. Session 12


Linear Algebra. Session 12 Dr. Marco A Roque Sol 08/01/2017

Example 12.1 Find the constant function that is the least squares fit to the following data:

x    | 0 1 2 3
f(x) | 1 0 1 2

Solution. For $f(x) = c$ the conditions are $c = 1$, $c = 0$, $c = 1$, $c = 2$, i.e., the overdetermined system

$$\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} c = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$

Then, the normal system is

$$(1\ 1\ 1\ 1)\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} c = (1\ 1\ 1\ 1)\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\quad\Longrightarrow\quad c = \tfrac{1}{4}(1 + 0 + 1 + 2) = 1 \ \ \text{(the arithmetic mean)}$$

Thus, the constant function is $f(x) = 1$.

Example 12.2 Find the linear polynomial that is the least squares fit to the same data.

Solution. For $f(x) = c_1 + c_2 x$ the conditions are $c_1 = 1$, $c_1 + c_2 = 0$, $c_1 + 2c_2 = 1$, $c_1 + 3c_2 = 2$, i.e.,

$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$

Then, the normal system is

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$

$$\begin{pmatrix} 4 & 6 \\ 6 & 14 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 8 \end{pmatrix}
\quad\Longrightarrow\quad c_1 = 0.4,\ c_2 = 0.4$$

Thus, the linear function is $f(x) = 0.4 + 0.4x$.

Example 12.3 Find the quadratic polynomial that is the least squares fit to the same data.

Solution. For $f(x) = c_1 + c_2 x + c_3 x^2$ the conditions are $c_1 = 1$, $c_1 + c_2 + c_3 = 0$, $c_1 + 2c_2 + 4c_3 = 1$, $c_1 + 3c_2 + 9c_3 = 2$, i.e.,

$$\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$

Then, the normal system is

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$

$$\begin{pmatrix} 4 & 6 & 14 \\ 6 & 14 & 36 \\ 14 & 36 & 98 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 8 \\ 22 \end{pmatrix}
\quad\Longrightarrow\quad c_1 = 0.9,\ c_2 = -1.1,\ c_3 = 0.5$$

Thus, the quadratic function is $f(x) = 0.9 - 1.1x + 0.5x^2$.
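As a quick numerical cross-check of Examples 12.1-12.3, here is a minimal NumPy sketch (not part of the lecture; the variable names are ad hoc) that builds the design matrix for each degree and solves the normal equations $A^T A\,c = A^T b$:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 0.0, 1.0, 2.0])

for degree in (0, 1, 2):
    # Design matrix with columns 1, x, x^2, ... up to the chosen degree.
    A = np.vander(x, N=degree + 1, increasing=True)
    c = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations A^T A c = A^T b
    print(degree, np.round(c, 4))

# Expected output (up to rounding):
#   0 [1.]             -> f(x) = 1
#   1 [0.4 0.4]        -> f(x) = 0.4 + 0.4x
#   2 [ 0.9 -1.1  0.5] -> f(x) = 0.9 - 1.1x + 0.5x^2
```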

Orthogonal sets

Let $\langle \cdot, \cdot\rangle$ denote the scalar product in $\mathbb{R}^n$.

Definition. Nonzero vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ form an orthogonal set if they are orthogonal to each other: $\langle v_i, v_j\rangle = 0$ for all $i \ne j$. If, in addition, all vectors are of unit length, $\|v_i\| = 1$, then $v_1, v_2, \dots, v_k$ is called an orthonormal set.

For instance, the standard basis $e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ..., $e_n = (0, 0, 0, \dots, 1)$ is an orthonormal set.

Orthonormal bases

Suppose $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$ (i.e., it is a basis and an orthonormal set).

Theorem. Let $x = x_1 v_1 + x_2 v_2 + \dots + x_n v_n$ and $y = y_1 v_1 + y_2 v_2 + \dots + y_n v_n$, where $x_i, y_i \in \mathbb{R}$. Then

i) $\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i$

ii) $\|x\| = \left(\sum_{i=1}^{n} x_i^2\right)^{1/2}$

Proof. i)

$$\langle x, y\rangle = \Big\langle \sum_{i=1}^{n} x_i v_i,\ \sum_{j=1}^{n} y_j v_j \Big\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i y_j \langle v_i, v_j\rangle = \sum_{i=1}^{n} x_i y_i$$

ii) follows from i) when $y = x$.

Suppose $V$ is a subspace of $\mathbb{R}^n$. Let $p$ be the orthogonal projection of a vector $x \in \mathbb{R}^n$ onto $V$.

If $V$ is a one-dimensional subspace spanned by a vector $v$, then $p = \dfrac{\langle x, v\rangle}{\langle v, v\rangle}\, v$.

If $V$ admits an orthogonal basis $v_1, v_2, \dots, v_k$, then

$$p = \frac{\langle x, v_1\rangle}{\langle v_1, v_1\rangle} v_1 + \frac{\langle x, v_2\rangle}{\langle v_2, v_2\rangle} v_2 + \dots + \frac{\langle x, v_k\rangle}{\langle v_k, v_k\rangle} v_k$$

Indeed, $\langle p, v_i\rangle = \sum_{j=1}^{k} \frac{\langle x, v_j\rangle}{\langle v_j, v_j\rangle}\langle v_j, v_i\rangle = \frac{\langle x, v_i\rangle}{\langle v_i, v_i\rangle}\langle v_i, v_i\rangle = \langle x, v_i\rangle$, so $\langle x - p, v_i\rangle = 0$, i.e., $(x - p) \perp v_i$, hence $(x - p) \perp V$.

Coordinates relative to an orthogonal basis

Theorem. If $v_1, v_2, \dots, v_n$ is an orthogonal basis for $\mathbb{R}^n$, then

$$x = \frac{\langle x, v_1\rangle}{\langle v_1, v_1\rangle} v_1 + \frac{\langle x, v_2\rangle}{\langle v_2, v_2\rangle} v_2 + \dots + \frac{\langle x, v_n\rangle}{\langle v_n, v_n\rangle} v_n$$

for any vector $x \in \mathbb{R}^n$.

Corollary. If $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$, then

$$x = \langle x, v_1\rangle v_1 + \langle x, v_2\rangle v_2 + \dots + \langle x, v_n\rangle v_n$$

for any vector $x \in \mathbb{R}^n$.
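A small sketch of the coordinate formula above (illustration only; the orthogonal basis of $\mathbb{R}^3$ below is chosen arbitrarily, not taken from the lecture):

```python
import numpy as np

V = np.array([[1.0,  1.0, 0.0],     # v_1
              [1.0, -1.0, 0.0],     # v_2
              [0.0,  0.0, 2.0]])    # v_3  (orthogonal, not orthonormal)
x = np.array([4.0, 0.0, -1.0])

# Coefficient of v_i in x is <x, v_i> / <v_i, v_i>.
coeffs = np.array([(x @ v) / (v @ v) for v in V])
print(coeffs)                        # [ 2.   2.  -0.5]
print(np.allclose(coeffs @ V, x))    # True: x = sum_i coeff_i * v_i
```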

The Gram-Schmidt orthogonalization process

Let $V$ be a subspace of $\mathbb{R}^n$. Suppose $x_1, x_2, \dots, x_k$ is a basis for $V$. Let

$$v_1 = x_1,$$
$$v_2 = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle} v_1,$$
$$v_3 = x_3 - \frac{\langle x_3, v_1\rangle}{\langle v_1, v_1\rangle} v_1 - \frac{\langle x_3, v_2\rangle}{\langle v_2, v_2\rangle} v_2,$$
$$\vdots$$
$$v_k = x_k - \frac{\langle x_k, v_1\rangle}{\langle v_1, v_1\rangle} v_1 - \frac{\langle x_k, v_2\rangle}{\langle v_2, v_2\rangle} v_2 - \dots - \frac{\langle x_k, v_{k-1}\rangle}{\langle v_{k-1}, v_{k-1}\rangle} v_{k-1}$$

Then $v_1, v_2, \dots, v_k$ is an orthogonal basis for $V$.

Properties of the Gram-Schmidt process:

- any basis $x_1, x_2, \dots, x_k$ is transformed into an orthogonal basis $v_1, v_2, \dots, v_k$;
- $v_j = x_j - (\alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_{j-1} x_{j-1})$, $1 \le j \le k$;
- the span of $v_1, v_2, \dots, v_j$ is the same as the span of $x_1, x_2, \dots, x_j$;
- $v_j$ is orthogonal to $x_1, x_2, \dots, x_{j-1}$;
- $v_j = x_j - p_j$, where $p_j$ is the orthogonal projection of the vector $x_j$ on the subspace spanned by $x_1, x_2, \dots, x_{j-1}$;
- $\|v_j\|$ is the distance from $x_j$ to the subspace spanned by $x_1, x_2, \dots, x_{j-1}$.

Normalization

Let $V$ be a subspace of $\mathbb{R}^n$. Suppose $v_1, v_2, \dots, v_k$ is an orthogonal basis for $V$. Let

$$w_1 = \frac{v_1}{\|v_1\|},\quad w_2 = \frac{v_2}{\|v_2\|},\quad \dots,\quad w_k = \frac{v_k}{\|v_k\|}$$

Then $w_1, w_2, \dots, w_k$ is an orthonormal basis for $V$.

Theorem. Any non-trivial subspace of $\mathbb{R}^n$ admits an orthonormal basis.
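A minimal Gram-Schmidt sketch following the formulas above, with the final normalization step (illustration only; classical Gram-Schmidt as written here can lose accuracy for nearly dependent vectors):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the rows x_1, ..., x_k of X (assumed linearly independent)."""
    ortho = []
    for xj in X:
        vj = xj.astype(float)
        for vi in ortho:
            vj = vj - (xj @ vi) / (vi @ vi) * vi   # subtract the projection onto v_i
        ortho.append(vj)
    # Normalization: w_i = v_i / ||v_i||
    return np.array([v / np.linalg.norm(v) for v in ortho])

X = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
W = gram_schmidt(X)
print(np.round(W @ W.T, 10))   # 2x2 identity: the rows of W are orthonormal
```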

Example 12.4 Let $\Pi$ be the plane spanned by the vectors $x_1 = (1, 1, 0)$ and $x_2 = (0, 1, 1)$.

i) Find the orthogonal projection of the vector $y = (4, 0, -1)$ onto the plane $\Pi$.

ii) Find the distance from $y$ to $\Pi$.

Solution. First we apply the Gram-Schmidt process to the basis $x_1, x_2$:

$$v_1 = x_1 = (1, 1, 0)$$

$$v_2 = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle} v_1 = (0, 1, 1) - \frac{1}{2}(1, 1, 0) = (-1/2,\ 1/2,\ 1)$$

Now that $v_1, v_2$ is an orthogonal basis for $\Pi$, the orthogonal projection of $y$ onto $\Pi$ is

$$p = \frac{\langle y, v_1\rangle}{\langle v_1, v_1\rangle} v_1 + \frac{\langle y, v_2\rangle}{\langle v_2, v_2\rangle} v_2 = \frac{4}{2}(1, 1, 0) + \frac{-3}{3/2}(-1/2,\ 1/2,\ 1) = (3, 1, -2)$$

The distance from $y$ to $\Pi$ is

$$\|y - p\| = \|(1, -1, 1)\| = \sqrt{3}$$
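A quick numerical check of Example 12.4 (illustration only, assuming $y = (4, 0, -1)$ as above):

```python
import numpy as np

v1 = np.array([1.0, 1.0, 0.0])     # from the Gram-Schmidt step above
v2 = np.array([-0.5, 0.5, 1.0])
y  = np.array([4.0, 0.0, -1.0])

p = (y @ v1) / (v1 @ v1) * v1 + (y @ v2) / (v2 @ v2) * v2
print(p)                        # [ 3.  1. -2.]
print(np.linalg.norm(y - p))    # 1.7320... = sqrt(3)
```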

Example 12.5 Find the distance from the point $y = (0, 0, 0, 1)$ to the subspace $V \subset \mathbb{R}^4$ spanned by the vectors $x_1 = (1, -1, 1, -1)$, $x_2 = (1, 1, 3, -1)$, and $x_3 = (-3, 7, 1, 3)$.

Solution. First we apply the Gram-Schmidt process to the basis $x_1, x_2, x_3$ and obtain an orthogonal basis $v_1, v_2, v_3$ for the subspace $V$. Next we compute the orthogonal projection $p$ of the vector $y$ onto $V$:

$$p = \frac{\langle y, v_1\rangle}{\langle v_1, v_1\rangle} v_1 + \frac{\langle y, v_2\rangle}{\langle v_2, v_2\rangle} v_2 + \frac{\langle y, v_3\rangle}{\langle v_3, v_3\rangle} v_3$$

Then the distance from $y$ to $V$ equals $\|y - p\|$.

Alternatively, we can apply the Gram-Schmidt process to the vectors $x_1, x_2, x_3, y$. We should obtain an orthogonal system $v_1, v_2, v_3, v_4$. Then the desired distance will be $\|v_4\|$.

$$v_1 = x_1 = (1, -1, 1, -1)$$
$$v_2 = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle} v_1 = (1, 1, 3, -1) - \frac{4}{4}(1, -1, 1, -1) = (0, 2, 2, 0)$$
$$v_3 = x_3 - \frac{\langle x_3, v_1\rangle}{\langle v_1, v_1\rangle} v_1 - \frac{\langle x_3, v_2\rangle}{\langle v_2, v_2\rangle} v_2 = (-3, 7, 1, 3) - \frac{-12}{4}(1, -1, 1, -1) - \frac{16}{8}(0, 2, 2, 0) = (0, 0, 0, 0)$$

The Gram-Schmidt process can be used to check linear independence of vectors! It failed here because the vector $x_3$ is a linear combination of $x_1$ and $x_2$, so $V$ is a plane, not a 3-dimensional subspace. To fix things, it is enough to drop $x_3$, i.e., we should orthogonalize the vectors $x_1, x_2, y$:

$$\hat v_3 = y - \frac{\langle y, v_1\rangle}{\langle v_1, v_1\rangle} v_1 - \frac{\langle y, v_2\rangle}{\langle v_2, v_2\rangle} v_2 = (0, 0, 0, 1) - \frac{-1}{4}(1, -1, 1, -1) - \frac{0}{8}(0, 2, 2, 0) = (1/4,\ -1/4,\ 1/4,\ 3/4)$$

Then the distance from $y$ to $V$ equals

$$\|\hat v_3\| = \|(1/4, -1/4, 1/4, 3/4)\| = \frac{\sqrt{12}}{4} = \frac{\sqrt{3}}{2}$$
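The same computation in NumPy (illustration only), orthogonalizing $x_1, x_2, y$ and reading off the distance as the norm of the last vector:

```python
import numpy as np

x1 = np.array([1.0, -1.0, 1.0, -1.0])
x2 = np.array([1.0,  1.0, 3.0, -1.0])
y  = np.array([0.0,  0.0, 0.0,  1.0])

v1 = x1
v2 = x2 - (x2 @ v1) / (v1 @ v1) * v1
v3_hat = y - (y @ v1) / (v1 @ v1) * v1 - (y @ v2) / (v2 @ v2) * v2

print(v2)                        # [0. 2. 2. 0.]
print(v3_hat)                    # [ 0.25 -0.25  0.25  0.75]
print(np.linalg.norm(v3_hat))    # 0.8660... = sqrt(3)/2
```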

Norm

The notion of norm generalizes the notion of length of a vector in $\mathbb{R}^n$.

Definition. Let $V$ be a vector space. A function $\alpha : V \to \mathbb{R}$ is called a norm on $V$ if it has the following properties:

$\alpha(x) \ge 0$, and $\alpha(x) = 0$ only for $x = 0$ (Positivity)

$\alpha(rx) = |r|\,\alpha(x)$ for all $r \in \mathbb{R}$ (Homogeneity)

$\alpha(x + y) \le \alpha(x) + \alpha(y)$ (Triangle Inequality)

Notation. The norm of a vector $x \in V$ is usually denoted by $\|x\|$. Different norms on $V$ are distinguished by subscripts, e.g., $\|x\|_1$ and $\|x\|_2$.

Let $V = \mathbb{R}^n$ and let $x = (x_1, x_2, \dots, x_n)$ be a vector in $V$.

$$\|x\|_\infty = \max\{|x_1|, |x_2|, \dots, |x_n|\}$$

Positivity and homogeneity are obvious. Let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then $x + y = (x_1 + y_1, \dots, x_n + y_n)$ and

$$|x_i + y_i| \le |x_i| + |y_i| \le \max_j |x_j| + \max_j |y_j|
\ \Longrightarrow\ \max_i |x_i + y_i| \le \max_j |x_j| + \max_j |y_j|
\ \Longrightarrow\ \|x + y\|_\infty \le \|x\|_\infty + \|y\|_\infty$$

$$\|x\|_1 = |x_1| + |x_2| + \dots + |x_n|$$

Positivity and homogeneity are obvious. Let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then $x + y = (x_1 + y_1, \dots, x_n + y_n)$ and

$$|x_i + y_i| \le |x_i| + |y_i|
\ \Longrightarrow\ \sum_i |x_i + y_i| \le \sum_i |x_i| + \sum_i |y_i|
\ \Longrightarrow\ \|x + y\|_1 \le \|x\|_1 + \|y\|_1$$

$$\|x\|_p = \left(|x_1|^p + |x_2|^p + \dots + |x_n|^p\right)^{1/p},\quad p \ge 1$$

Positivity and homogeneity are obvious. Let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then $x + y = (x_1 + y_1, \dots, x_n + y_n)$. Now, using the Minkowski inequality

$$\left(|x_1 + y_1|^p + \dots + |x_n + y_n|^p\right)^{1/p} \le \left(|x_1|^p + \dots + |x_n|^p\right)^{1/p} + \left(|y_1|^p + \dots + |y_n|^p\right)^{1/p},\quad p \ge 1,$$

it follows that $\|x + y\|_p \le \|x\|_p + \|y\|_p$.
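These norms are available directly in NumPy; the sketch below (illustration only, with arbitrary vectors) evaluates several of them and checks the triangle inequality:

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([0.5, 4.0, -1.0])

for p in (1, 1.5, 2, 3, np.inf):
    lhs = np.linalg.norm(x + y, ord=p)                           # ||x + y||_p
    rhs = np.linalg.norm(x, ord=p) + np.linalg.norm(y, ord=p)    # ||x||_p + ||y||_p
    print(p, round(lhs, 4), "<=", round(rhs, 4), lhs <= rhs + 1e-12)
```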

Normed vector space

Definition. A normed vector space is a vector space endowed with a norm.

The norm defines a distance function on the normed vector space: $\mathrm{dist}(x, y) = \|y - x\|$. Then we say that a vector $x$ is a good approximation of a vector $x_0$ if $\mathrm{dist}(x, x_0)$ is small. Also, we say that a sequence of vectors $x_1, x_2, \dots, x_n, \dots$ converges to a vector $x$ if $\mathrm{dist}(x, x_n) \to 0$ as $n \to \infty$.

Unit circle in the normed vector space $V = \mathbb{R}^2$: $\{x \in V : \|x\|_p = 1\}$, for

$$\|x\|_1 = |x_1| + |x_2|$$
$$\|x\|_{3/2} = \left(|x_1|^{3/2} + |x_2|^{3/2}\right)^{2/3}$$
$$\|x\|_2 = \left(|x_1|^2 + |x_2|^2\right)^{1/2}$$
$$\|x\|_3 = \left(|x_1|^3 + |x_2|^3\right)^{1/3}$$
$$\|x\|_6 = \left(|x_1|^6 + |x_2|^6\right)^{1/6}$$
$$\|x\|_\infty = \max\{|x_1|, |x_2|\}$$

$V = C[a, b]$, the space of continuous functions $f : [a, b] \to \mathbb{R}$:

$$\|f\|_\infty = \max_{a \le x \le b} |f(x)|$$
$$\|f\|_1 = \int_a^b |f(x)|\,dx$$
$$\|f\|_p = \left(\int_a^b |f(x)|^p\,dx\right)^{1/p},\quad p \ge 1$$

Theorem. $\|f\|_p$ is a norm on $C[a, b]$ for any $p \ge 1$.
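A rough numerical sketch of these norms for $f(x) = x$ on $[0, 1]$ (illustration only; the integrals are approximated by a Riemann sum on a fine grid):

```python
import numpy as np

a, b, n = 0.0, 1.0, 100_001
xs = np.linspace(a, b, n)
f = xs                       # f(x) = x on [0, 1]
dx = (b - a) / (n - 1)

norm_inf = np.max(np.abs(f))                    # exact value: 1
norm_1   = np.sum(np.abs(f)) * dx               # exact value: 1/2
norm_2   = np.sqrt(np.sum(np.abs(f)**2) * dx)   # exact value: 1/sqrt(3) ~ 0.5774
print(norm_inf, norm_1, norm_2)
```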

Abstract Linear Algebra

The notion of inner product generalizes the notion of dot product of vectors in $\mathbb{R}^n$.

Definition. Let $V$ be a vector space. A function $\beta : V \times V \to \mathbb{R}$, usually denoted $\beta(x, y) = \langle x, y\rangle$, is called an inner product on $V$ if it is positive, symmetric, and bilinear. That is, if

i) $\langle x, x\rangle \ge 0$, and $\langle x, x\rangle = 0$ only for $x = 0$ (Positivity)

ii) $\langle x, y\rangle = \langle y, x\rangle$ (Symmetry)

iii) $\langle rx, y\rangle = r\langle x, y\rangle$ (Homogeneity)

iv) $\langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle$ (Distributive Law)

An inner product space is a vector space endowed with an inner product.

Examples: $V = \mathbb{R}^n$.

- $\langle x, y\rangle = x \cdot y = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$
- $\langle x, y\rangle = d_1 x_1 y_1 + d_2 x_2 y_2 + \dots + d_n x_n y_n$, where $d_1, d_2, \dots, d_n > 0$
- $\langle x, y\rangle = (Dx) \cdot (Dy)$, where $D$ is an invertible $n \times n$ matrix

Remarks. a) Invertibility of $D$ is necessary to show that $\langle x, x\rangle = 0 \Rightarrow x = 0$. b) The second example is a particular case of the third one with $D = \mathrm{diag}(d_1^{1/2}, d_2^{1/2}, \dots, d_n^{1/2})$.

Example 12.6 Find an inner product on $\mathbb{R}^2$ such that $\langle e_1, e_1\rangle = 2$, $\langle e_2, e_2\rangle = 3$, and $\langle e_1, e_2\rangle = \langle e_2, e_1\rangle = -1$, where $e_1 = (1, 0)$, $e_2 = (0, 1)$.

Solution. Let $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R}^2$. Then, using bilinearity, we obtain

$$\langle x, y\rangle = \langle x_1 e_1 + x_2 e_2,\ y_1 e_1 + y_2 e_2\rangle = x_1 y_1 \langle e_1, e_1\rangle + x_1 y_2 \langle e_1, e_2\rangle + x_2 y_1 \langle e_2, e_1\rangle + x_2 y_2 \langle e_2, e_2\rangle$$

$$\langle x, y\rangle = 2x_1 y_1 - x_1 y_2 - x_2 y_1 + 3x_2 y_2$$

It remains to check that $\langle x, x\rangle > 0$ for $x \ne 0$. Indeed,

$$\langle x, x\rangle = 2x_1^2 - 2x_1 x_2 + 3x_2^2 = (x_1 - x_2)^2 + x_1^2 + 2x_2^2 > 0 \quad\text{for } x \ne 0$$
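Equivalently, this inner product can be written as $\langle x, y\rangle = x^T G y$ with the Gram matrix $G$ whose entries are $\langle e_i, e_j\rangle$; a short check (illustration only):

```python
import numpy as np

G = np.array([[ 2.0, -1.0],    # G[i, j] = <e_{i+1}, e_{j+1}>
              [-1.0,  3.0]])

def inner(x, y):
    return x @ G @ y           # <x, y> = x^T G y

x = np.array([1.0, 2.0])
y = np.array([-3.0, 0.5])
print(inner(x, y), inner(y, x))            # equal: the form is symmetric
print(np.all(np.linalg.eigvalsh(G) > 0))   # True: G is positive definite
```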

$V = M_{m,n}(\mathbb{R})$, the space of $m \times n$ matrices:

$$\langle A, B\rangle = \mathrm{trace}(AB^T)$$

If $A = (a_{ij})$ and $B = (b_{ij})$, then $\langle A, B\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} b_{ij}$.

$V = C[a, b]$:

$$\langle f, g\rangle = \int_a^b f(x) g(x)\,dx, \qquad \langle f, g\rangle = \int_a^b w(x) f(x) g(x)\,dx,$$

where $w$ is bounded, piecewise continuous, and $w > 0$ everywhere on $[a, b]$; $w$ is called the weight function.
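A quick check (illustration only, with random matrices) that the trace formula and the entrywise sum agree:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

print(np.isclose(np.trace(A @ B.T), np.sum(A * B)))   # True: <A, B> = sum_ij a_ij b_ij
```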

Theorem. Suppose $\langle x, y\rangle$ is an inner product on a vector space $V$. Then for all $x, y \in V$

$$\langle x, y\rangle^2 \le \langle x, x\rangle \langle y, y\rangle$$

Proof. For any $t \in \mathbb{R}$ let $v_t = x + ty$. Then

$$\langle v_t, v_t\rangle = \langle x + ty, x + ty\rangle = \langle x, x\rangle + t\langle x, y\rangle + t\langle y, x\rangle + t^2\langle y, y\rangle$$

Now, assume that $y \ne 0$ and let $t = -\dfrac{\langle x, y\rangle}{\langle y, y\rangle}$. Then

$$\langle v_t, v_t\rangle = \langle x, x\rangle - \frac{\langle x, y\rangle^2}{\langle y, y\rangle}$$

Since $\langle v_t, v_t\rangle \ge 0$, the desired inequality follows. In the case $y = 0$, we have $\langle x, y\rangle = \langle y, y\rangle = 0$, so the inequality holds trivially.

Cauchy-Schwarz Inequality

$$|\langle x, y\rangle| \le \sqrt{\langle x, x\rangle}\,\sqrt{\langle y, y\rangle}$$

Corollary. $|\langle x, y\rangle| \le \|x\|\,\|y\|$.

Corollary. For any $f, g \in C[a, b]$,

$$\left(\int_a^b f(x) g(x)\,dx\right)^2 \le \int_a^b f(x)^2\,dx \cdot \int_a^b g(x)^2\,dx$$

Norms induced by inner products

Theorem. Suppose $\langle x, y\rangle$ is an inner product on a vector space $V$. Then $\|x\| = \sqrt{\langle x, x\rangle}$ is a norm.

Proof. Positivity is obvious. Homogeneity:

$$\|rx\| = \sqrt{\langle rx, rx\rangle} = \sqrt{r^2\langle x, x\rangle} = |r|\,\|x\|$$

Triangle inequality (follows from Cauchy-Schwarz):

$$\|x + y\|^2 = \langle x + y, x + y\rangle = \langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle$$

$$\le \langle x, x\rangle + |\langle x, y\rangle| + |\langle y, x\rangle| + \langle y, y\rangle \le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2$$

The length of a vector in $\mathbb{R}^n$,

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2},$$

is the norm induced by the dot product. The norm

$$\|f\|_2 = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}$$

on the vector space $C[a, b]$ is induced by the inner product $\langle f, g\rangle = \int_a^b f(x) g(x)\,dx$.

Angle

Let $V$ be an inner product space with an inner product $\langle \cdot, \cdot\rangle$ and the induced norm $\|\cdot\|$. Then

$$|\langle x, y\rangle| \le \|x\|\,\|y\| \quad\text{for all } x, y \in V$$

(the Cauchy-Schwarz inequality). Therefore we can define the angle between nonzero vectors in $V$ by

$$\angle(x, y) = \arccos\frac{\langle x, y\rangle}{\|x\|\,\|y\|}$$

Then $\langle x, y\rangle = \|x\|\,\|y\|\cos\angle(x, y)$. In particular, vectors $x$ and $y$ are orthogonal (denoted $x \perp y$) if $\langle x, y\rangle = 0$.

Orthogonal sets

Let $V$ be an inner product space with an inner product $\langle \cdot, \cdot\rangle$ and the induced norm $\|\cdot\|$.

Definition. A nonempty set $S \subset V$ of nonzero vectors is called an orthogonal set if all vectors in $S$ are mutually orthogonal. That is, $0 \notin S$ and $\langle x, y\rangle = 0$ for any $x, y \in S$, $x \ne y$. An orthogonal set $S \subset V$ is called orthonormal if $\|x\| = 1$ for any $x \in S$.

Remark. Vectors $v_1, v_2, \dots, v_n \in V$ form an orthonormal set if and only if

$$\langle v_i, v_j\rangle = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases}$$

Singular Value Decomposition

In this section, we assume throughout that $A$ is an $m \times n$ matrix with $m \ge n$. (This assumption is made for convenience only; all the results will also hold if $m < n$.) We will present a method for determining how close $A$ is to a matrix of smaller rank. The method involves factoring $A$ into a product $U\Sigma V^T$, where $U$ is an $m \times m$ orthogonal matrix, $V$ is an $n \times n$ orthogonal matrix, and $\Sigma$ is an $m \times n$ matrix whose off-diagonal entries are all zero and whose diagonal elements satisfy

$$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$$

so that the upper $n \times n$ block of $\Sigma$ is

$$\begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{pmatrix}$$

and the remaining $m - n$ rows of $\Sigma$ are zero. The $\sigma$'s determined by this factorization are unique and are called the singular values of $A$. The factorization $U\Sigma V^T$ is called the singular value decomposition of $A$ or, for short, the SVD of $A$.

The SVD Theorem. If $A$ is an $m \times n$ matrix, then $A$ has a singular value decomposition.

Sketch of the proof. $A^T A$ is a symmetric $n \times n$ matrix. The eigenvalues of $A^T A$ are all real, and it has an orthogonal diagonalizing matrix $V$. Furthermore, its eigenvalues must all be nonnegative. To see this, let $\lambda$ be an eigenvalue of $A^T A$ and $x$ an eigenvector belonging to $\lambda$. It follows that

$$\|Ax\|^2 = x^T A^T A x = x^T \lambda x = \lambda\, x^T x = \lambda \|x\|^2
\quad\Longrightarrow\quad \lambda = \frac{\|Ax\|^2}{\|x\|^2} \ge 0$$

We may assume that the columns of $V$ have been ordered so that the corresponding eigenvalues satisfy $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$. The singular values are given by

$$\sigma_j = \sqrt{\lambda_j},\quad j = 1, 2, \dots, n$$

Let $r$ denote the rank of $A$. The matrix $A^T A$ will also have rank $r$. Since $A^T A$ is symmetric, its rank equals the number of nonzero eigenvalues. Thus,

$$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0, \qquad \sigma_{r+1} = \sigma_{r+2} = \dots = \sigma_n = 0$$

Now let

$$V_1 = (v_1, v_2, \dots, v_r), \qquad V_2 = (v_{r+1}, v_{r+2}, \dots, v_n)$$

The column vectors of $V_1$ are eigenvectors of $A^T A$ belonging to $\lambda_i$, $i = 1, 2, \dots, r$. The column vectors of $V_2$ are eigenvectors of $A^T A$ belonging to $\lambda_j = 0$, $j = r+1, r+2, \dots, n$.

Now let $\Sigma_1$ be the $r \times r$ matrix defined by

$$\Sigma_1 = \begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{pmatrix}$$

The $m \times n$ matrix $\Sigma$ is then given by

$$\Sigma = \begin{pmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{pmatrix}$$

To complete the proof, we must show how to construct an $m \times m$ orthogonal matrix $U$ such that

$$A = U\Sigma V^T \quad\Longleftrightarrow\quad AV = U\Sigma$$

Comparing the first $r$ columns of each side of the last equation, we see that

$$A v_i = \sigma_i u_i,\quad i = 1, 2, \dots, r$$

Thus, if we define

$$u_i = \frac{1}{\sigma_i} A v_i,\quad i = 1, 2, \dots, r$$

and $U_1 = (u_1, u_2, \dots, u_r)$, then it follows that

$$AV_1 = U_1 \Sigma_1$$

The column vectors of $U_1$ form an orthonormal set; thus $u_1, \dots, u_r$ form an orthonormal basis for $R(A)$. The vector space $R(A)^{\perp} = N(A^T)$ has dimension $m - r$. Let $\{u_{r+1}, u_{r+2}, \dots, u_m\}$ be an orthonormal basis for $N(A^T)$ and set

$$U_2 = (u_{r+1}, u_{r+2}, \dots, u_m)$$

Set the $m \times m$ matrix $U$ by

$$U = (U_1,\ U_2)$$

Then the matrices $U$, $\Sigma$, and $V$ satisfy $A = U\Sigma V^T$.
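The proof is constructive; the sketch below (illustration only, not a numerically robust algorithm) mirrors it for a random matrix: diagonalize $A^T A$ to get $V$ and the $\sigma_j$, then set $u_i = (1/\sigma_i) A v_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))            # m >= n, as assumed in this section

lam, V = np.linalg.eigh(A.T @ A)           # eigenvalues come in ascending order
order = np.argsort(lam)[::-1]              # reorder so sigma_1 >= sigma_2 >= ...
lam, V = lam[order], V[:, order]
sigma = np.sqrt(np.clip(lam, 0.0, None))   # sigma_j = sqrt(lambda_j)

r = int(np.sum(sigma > 1e-12))             # numerical rank
U1 = A @ V[:, :r] / sigma[:r]              # u_i = (1/sigma_i) A v_i
print(np.allclose((U1 * sigma[:r]) @ V[:, :r].T, A))            # compact SVD gives A back
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))   # matches NumPy's values
```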

Observations

Let $A$ be an $m \times n$ matrix with a singular value decomposition $A = U\Sigma V^T$.

- The singular values $\sigma_1, \dots, \sigma_n$ of $A$ are unique; however, the matrices $U$ and $V$ are not unique.
- Since $V$ diagonalizes $A^T A$, it follows that the $v_j$'s are eigenvectors of $A^T A$.
- Since $AA^T = U\Sigma\Sigma^T U^T$, it follows that $U$ diagonalizes $AA^T$ and that the $u_j$'s are eigenvectors of $AA^T$.
- The $v_j$'s are called the right singular vectors of $A$, and the $u_j$'s are called the left singular vectors of $A$.

If $A$ has rank $r$, then

(i) $v_1, v_2, \dots, v_r$ form an orthonormal basis for $R(A^T)$;

(ii) $v_{r+1}, v_{r+2}, \dots, v_n$ form an orthonormal basis for $N(A)$;

(iii) $u_1, u_2, \dots, u_r$ form an orthonormal basis for $R(A)$;

(iv) $u_{r+1}, u_{r+2}, \dots, u_m$ form an orthonormal basis for $N(A^T)$.

The rank of the matrix $A$ is equal to the number of its nonzero singular values (where singular values are counted according to multiplicity).

In the case that $A$ has rank $r < n$, if we set

$$V_1 = (v_1, v_2, \dots, v_r), \qquad U_1 = (u_1, u_2, \dots, u_r)$$

and define $\Sigma_1$ as before, then

$$A = U_1 \Sigma_1 V_1^T$$

This factorization is called the compact form of the singular value decomposition of $A$. This form is useful in many applications.

Example 12.7 Let

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 0 & 0 \end{pmatrix}$$

Compute the singular values and the singular value decomposition of $A$.

Solution. The matrix

$$A^T A = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$$

has eigenvalues $\lambda_1 = 4$ and $\lambda_2 = 0$. Consequently, the singular values of $A$ are $\sigma_1 = 2$ and $\sigma_2 = 0$. The eigenvalue $\lambda_1$ has eigenvectors of the form $\alpha(1, 1)^T$, and $\lambda_2$ has eigenvectors of the form $\beta(1, -1)^T$. Therefore, the orthogonal matrix

$$V = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

diagonalizes $A^T A$. From what we discussed before, it follows that

$$u_1 = \frac{1}{\sigma_1} A v_1 = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{pmatrix}$$

The remaining column vectors of $U$ must form an orthonormal basis for $N(A^T)$. We can compute a basis $\{x_2, x_3\}$ for $N(A^T)$ in the usual way:

$$x_2 = (1, -1, 0)^T, \qquad x_3 = (0, 0, 1)^T$$

Since these vectors are already orthogonal, it is not necessary to use the Gram-Schmidt process to obtain an orthonormal basis.

We need only set

$$u_2 = \frac{1}{\|x_2\|} x_2 = \left(\tfrac{1}{\sqrt{2}},\ -\tfrac{1}{\sqrt{2}},\ 0\right)^T, \qquad u_3 = \frac{1}{\|x_3\|} x_3 = (0, 0, 1)^T$$

It then follows that

$$A = U\Sigma V^T = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} & 0 \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{pmatrix}$$
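The same decomposition via np.linalg.svd (illustration only; NumPy may choose different signs or orderings for the columns of $U$ and $V$, since only the singular values are unique):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])

U, s, Vt = np.linalg.svd(A)
print(np.round(s, 6))                        # [2. 0.]
print(np.allclose((U[:, :2] * s) @ Vt, A))   # True: the factorization reproduces A
```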

Observation. If $A$ has singular value decomposition $U\Sigma V^T$, then $A$ can be represented by the outer product expansion

$$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \dots + \sigma_n u_n v_n^T$$

The closest matrix of rank $k$ is obtained by truncating this sum after the first $k$ terms:

$$A_k = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \dots + \sigma_k u_k v_k^T,\quad k < n$$
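A minimal sketch of the rank-$k$ truncation (illustration only, with a random matrix); by the Eckart-Young theorem the spectral-norm error of the best rank-$k$ approximation equals $\sigma_{k+1}$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # first k terms of the outer product expansion

print(np.linalg.matrix_rank(A_k))                     # 2
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))   # spectral-norm error = sigma_{k+1}
```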