Vector and Matrix Norms

Vector Space Algebra → Matrix Algebra: We let $x \mapsto \mathbf{x}$ and $A \mapsto \mathbf{A}$, where, if $x$ is an element of an abstract $n$-dimensional vector space and $A$ maps that space into an $m$-dimensional one, then $\mathbf{x}$ is a complex column vector of length $n$ whose elements are the same as in $x$, and $\mathbf{A}$ is an $m \times n$ matrix such that if $y = Ax$, then $\mathbf{y} = \mathbf{A}\mathbf{x}$. We will refer to the vector spaces for $\mathbf{x}$ and $\mathbf{A}$ as $\mathbb{C}^n$ and $\mathbb{C}^{m \times n}$, respectively.

Column Vector Norms: The natural norms to use are the $p$-norms,
$\|x\|_p = \Bigl(\sum_{j=1}^{n} |x_j|^p\Bigr)^{1/p}$,
in particular $p = 1$, $p = 2$, and $p = \infty$:
$\|x\|_1 = \sum_{j=1}^{n} |x_j|, \qquad \|x\|_2 = \Bigl(\sum_{j=1}^{n} |x_j|^2\Bigr)^{1/2}, \qquad \|x\|_\infty = \max_j |x_j|.$ (7.1)

Inner Product: $\langle x, y\rangle = y^{\mathrm H} x$. The norm generated by this inner product is the $p = 2$ norm,
$\|x\| = \|x\|_2 = \Bigl(\sum_{j=1}^{n} |x_j|^2\Bigr)^{1/2}$.
When we write a norm without a subscript, we usually mean the $p = 2$ norm. (7.2)
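A quick MATLAB check of the three norms (the test vector is an arbitrary choice):

% arbitrary complex test vector
x = [3; -4; 1i];

p1   = sum(abs(x));            % ||x||_1
p2   = sqrt(sum(abs(x).^2));   % ||x||_2
pinf = max(abs(x));            % ||x||_inf

% compare with the built-in norm function
[p1 norm(x,1); p2 norm(x,2); pinf norm(x,Inf)]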

Matrix Norm: The natural norm to use is
$\|A\| = \max_{x \ne 0} \dfrac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|.$
Remark: This norm always exists (see slide 6.5). The definition implies $\|Ax\| \le \|A\|\,\|x\|$. (7.3)

Conditioning and Condition Number
Ill-conditioned systems: There are always errors in a matrix due to measurement difficulties and roundoff errors. An ill-conditioned matrix A will turn a small error $\delta$ in b in the equation $Ax = b$ into a large error in x.
Example:
$0.9999\,x_1 + 1.0001\,x_2 = 1$
$x_1 + x_2 = 1 + \delta$
This system has the solution $x_1 = 0.5 + 5000.5\,\delta$, $x_2 = 0.5 - 4999.5\,\delta$. We see that $\delta$ is multiplied by a factor of around 5000!
How small is small? How large is large? The answer to this question is application-dependent, BUT when we multiply by factors that approach the roundoff limit (about $10^{16}$ in double precision), we will always be in trouble. (7.4)
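A short MATLAB sketch of this example (the size of the perturbation delta is an arbitrary choice) shows the amplification directly:

A = [0.9999 1.0001; 1 1];     % coefficient matrix of the example
b = [1; 1];

delta = 1e-6;                 % hypothetical small error in b
bp = [1; 1 + delta];

x  = A \ b;                   % unperturbed solution: [0.5; 0.5]
xp = A \ bp;                  % perturbed solution

norm(xp - x) / delta          % about 7.1e3: each component moves by roughly 5000*delta
cond(A)                       % about 2e4, the same order of magnitude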

Conditioning and Condition Number
Ill-conditioned systems, geometric interpretation: An ill-conditioned system is nearly degenerate. A picture in two dimensions:
[Figure: two plots of a pair of lines in the $(x_1, x_2)$ plane. In the well-conditioned case the lines cross at a large angle; in the ill-conditioned case (lines with slopes 0.99 and 1.01) they are nearly parallel, so the intersection point is poorly determined.] (7.5)

Conditioning and Condition Number
Condition Number
Definition: For a square $n \times n$ matrix, the condition number is $\kappa(A) = \|A\|\,\|A^{-1}\|$.
Definition: The residual r of an approximate solution $\hat{x}$ of $Ax = b$ is $r = b - A\hat{x} = A(x - \hat{x})$.
Theorem: $\dfrac{\|x - \hat{x}\|}{\|x\|} \le \kappa(A)\,\dfrac{\|r\|}{\|b\|}$.
Proof: From $b = Ax$, we infer $\|b\| \le \|A\|\,\|x\|$. We have $x - \hat{x} = A^{-1} r$, so that $\|x - \hat{x}\| \le \|A^{-1}\|\,\|r\|$. Combining these results, we obtain the theorem.
Remark: With a small condition number and a small residual relative to b, the error in x relative to its norm will be small.
Remark: There will be some variation in the condition number, depending on the choice of norm. Generally, they track each other. (7.6)
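The bound is easy to check numerically. In this MATLAB sketch (matrix, right-hand side, and the artificial error added to the solution are all arbitrary choices), the relative error stays below the condition number times the relative residual:

A = [0.9999 1.0001; 1 1];
b = [1; 2];
x = A \ b;

xhat = x + 1e-8*randn(2,1);    % a hypothetical approximate solution
r    = b - A*xhat;             % its residual

relative_error = norm(x - xhat)/norm(x)
error_bound    = cond(A)*norm(r)/norm(b)   % relative_error never exceeds this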

Some questions that this discussion leads to:
(1) What do we do when the condition number is poor?
(2) What about problems where the matrix A is not square?
The singular value decomposition allows us to deal with these questions and much more!
Theorem: Let A be an $m \times n$ matrix of rank r. Then A can be expressed as a product $A = U \Sigma V^{\mathrm H}$, where U and V are respectively $m \times m$ and $n \times n$ orthogonal (unitary) matrices, and $\Sigma$ is a non-square diagonal $m \times n$ matrix of rank r,
$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Sigma_r = \mathrm{diag}(\sigma_1, \ldots, \sigma_r), \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0.$ (7.7)

Proof: We proceed inductively:
(1) There is a vector $x_1$ with norm equal to 1 ($p = 2$) that satisfies $\|A x_1\| = \|A\|\,\|x_1\| = \|A\| \equiv \sigma_1$, and a vector $y_1 = (1/\sigma_1) A x_1$, whose norm is also equal to 1. We note that $x_1 \in \mathbb{C}^n$ and $y_1 \in \mathbb{C}^m$. We start with a set of orthonormal column vectors that span $\mathbb{C}^n$, for example $e_1, e_2, \ldots, e_n$, where $e_1 = [1\ 0\ \cdots\ 0]^{\mathrm T}$, $e_2 = [0\ 1\ \cdots\ 0]^{\mathrm T}$, ..., $e_n = [0\ 0\ \cdots\ 1]^{\mathrm T}$. We note that $I = [e_1\ e_2\ \cdots\ e_n]$ is the $n \times n$ identity matrix and evidently unitary. We may now use the Gram-Schmidt procedure to create a new orthonormal set of column vectors that span $\mathbb{C}^n$, $v_1, v_2, \ldots, v_n$, where $v_1 = x_1$, and a corresponding matrix $V_1 = [v_1\ v_2\ \cdots\ v_n]$. In a similar way, we create a set of orthonormal vectors that span $\mathbb{C}^m$, $u_1, u_2, \ldots, u_m$, where $u_1 = y_1$, and a corresponding matrix $U_1 = [u_1\ u_2\ \cdots\ u_m]$.
(2) We now construct the matrix $B_1 = U_1^{\mathrm H} A V_1$. We see that $A V_1 = [\sigma_1 y_1\ \ A v_2\ \cdots\ A v_n]$ and $(U_1^{\mathrm H} A V_1)_{kl} = u_k^{\mathrm H} A v_l$. In particular, we have $(U_1^{\mathrm H} A V_1)_{11} = \sigma_1 y_1^{\mathrm H} y_1 = \sigma_1$ and $(U_1^{\mathrm H} A V_1)_{k1} = \sigma_1 u_k^{\mathrm H} y_1 = 0$ when $k > 1$. (7.8)
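Step (1) invokes a unit vector that attains the matrix norm; numerically it is the first right singular vector. A quick MATLAB check (arbitrary test matrix):

A = randn(5,3);              % arbitrary test matrix
[U,S,V] = svd(A);

norm(A, 2)                   % the induced 2-norm ...
S(1,1)                       % ... equals the largest singular value sigma_1
norm(A*V(:,1))               % and the first right singular vector attains it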

Proof (continued):
(2) From the previous step, $B_1 = U_1^{\mathrm H} A V_1$ has the form
$B_1 = \begin{bmatrix} \sigma_1 & w^{\mathrm H} \\ 0 & A_2 \end{bmatrix}$, where $w \in \mathbb{C}^{n-1}$ and $A_2 \in \mathbb{C}^{(m-1) \times (n-1)}$.
We now note that $\|B_1 x\| = \|U_1^{\mathrm H} A V_1 x\| = \|A V_1 x\| = \|A y\|$, where $y = V_1 x$. Since $\|x\| = \|y\|$, we conclude $\|B_1\| = \|A\| = \sigma_1$. Applying $B_1$ to the column vector $[\sigma_1\ \ w]^{\mathrm H}$, whose image has first component $\sigma_1^2 + w^{\mathrm H} w$, we find $\|B_1\| \ge \sqrt{\sigma_1^2 + \|w\|^2}$, which implies $w = 0$. Hence,
$B_1 = \begin{bmatrix} \sigma_1 & 0 \\ 0 & A_2 \end{bmatrix}.$ (7.9)

Proof (continued):
(3) We now proceed with $A_2$ in exactly the same way that we proceeded with A (unless $r = 1$). We have $0 < \|A_2\| \equiv \sigma_2 \le \sigma_1$. We find $x_2 \in \mathbb{C}^{n-1}$ and $y_2 \in \mathbb{C}^{m-1}$ just like before. As before, we obtain unitary matrices $\hat{U}_2$ and $\hat{V}_2$, and we find
$\hat{B}_2 = \hat{U}_2^{\mathrm H} A_2 \hat{V}_2 = \begin{bmatrix} \sigma_2 & 0 \\ 0 & A_3 \end{bmatrix}$, where $A_3 \in \mathbb{C}^{(m-2) \times (n-2)}$.
We can now construct the matrices
$U_2 = \begin{bmatrix} 1 & 0 \\ 0 & \hat{U}_2 \end{bmatrix}, \qquad V_2 = \begin{bmatrix} 1 & 0 \\ 0 & \hat{V}_2 \end{bmatrix}, \qquad B_2 = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & A_3 \end{bmatrix},$
and we note $B_2 = U_2^{\mathrm H} U_1^{\mathrm H} A V_1 V_2$. (7.10)

Proof (continued):
(4) We continue through r iterations, at which point we will have spanned the range of A. Hence, we must have $A_{r+1} = 0$. The desired matrices are:
$U = U_1 U_2 \cdots U_r, \qquad V = V_1 V_2 \cdots V_r, \qquad \Sigma = B_r.$
Remark: Suppose A is an $n \times n$ non-singular matrix. We then find that $A^{-1} = V \Sigma^{-1} U^{\mathrm H}$, so that $\|A^{-1}\| = 1/\sigma_n$, and $\kappa(A) = \sigma_1/\sigma_n$. In general, a large ratio between singular values implies that a matrix is ill-conditioned.
Remark: The theorem is not constructive, since we have not described how to find the vectors that correspond to the maxima. We will discuss algorithms in connection with eigenvalues and eigenvectors.
Corollary: $A^{\mathrm H} = V \Sigma^{\mathrm T} U^{\mathrm H}$; Rank $A^{\mathrm H}$ = Rank A = r; Nullity A = n − r; Nullity $A^{\mathrm H}$ = m − r. (7.11)

What is happening:
$A : \mathbb{C}^n \to \mathbb{C}^m$ and $A^{\mathrm H} : \mathbb{C}^m \to \mathbb{C}^n$. V spans $\mathbb{C}^n$; U spans $\mathbb{C}^m$.
$A$: $v_1 \to u_1,\ v_2 \to u_2,\ \ldots,\ v_r \to u_r$ with multipliers $\sigma_1, \sigma_2, \ldots, \sigma_r$; $v_{r+1}, \ldots, v_n \to 0$.
$A^{\mathrm H}$: $u_1 \to v_1,\ u_2 \to v_2,\ \ldots,\ u_r \to v_r$ with multipliers $\sigma_1, \sigma_2, \ldots, \sigma_r$; $u_{r+1}, \ldots, u_m \to 0$.
The SVD reveals the entire structure of A, including its nearly singular, but not exactly singular, components! (7.12)
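A small MATLAB sketch (arbitrary test matrix) makes this mapping structure concrete: each right singular vector is sent to the corresponding left singular vector stretched by its singular value, and A is the sum of the rank-one pieces $\sigma_i u_i v_i^{\mathrm H}$:

A = randn(4,3);
[U,S,V] = svd(A);
sigma = diag(S);

% A maps v_i to sigma_i*u_i, and A' maps u_i back to sigma_i*v_i
norm(A*V(:,1) - sigma(1)*U(:,1))     % ~ 0
norm(A'*U(:,1) - sigma(1)*V(:,1))    % ~ 0

% A is the sum of rank-one terms sigma_i*u_i*v_i'
Arebuilt = zeros(size(A));
for i = 1:length(sigma)
    Arebuilt = Arebuilt + sigma(i)*U(:,i)*V(:,i)';
end
norm(A - Arebuilt)                   % ~ 0 (roundoff)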

Example: Consider the matrix
$A = \begin{bmatrix} 1/3 & 1/3 & 2/3 \\ 2/3 & 2/3 & 4/3 \\ 1/3 & 2/3 & 3/3 \\ 2/5 & 2/5 & 4/5 \\ 3/5 & 1/5 & 4/5 \end{bmatrix}$
>> A = [1/3 1/3 2/3; 2/3 2/3 4/3; 1/3 2/3 3/3; 2/5 2/5 4/5; 3/5 1/5 4/5]
This matrix has rank 2 (col. 3 = col. 1 + col. 2). The SVD shows that, sort of
>> [U,S,V] = svd(A)
Due to roundoff, the third singular value is non-zero. To see that, we change format
>> format short e
MATLAB can detect that the numerical rank is really 2
>> rank(A)
But that does not work with measurement errors that exceed roundoff
>> A2 = A; A2(5,3) = A2(5,3) + 1e-7
>> rank(A2)
>> [U2,S2,V2] = svd(A2) (7.13)

Example: We can change the tolerance
>> rank(A2,1e-6)
That leads to uncertain results for the solution to $Ax = b$, with $b = [1/3\ \ 2/3\ \ 1/3\ \ 2/5\ \ 3/5]^{\mathrm T}$
>> x = A\b, x2 = A2\b
>> b2 = b; b2(5) = b2(5) + 1e-7
>> x = A\b2, x2 = A2\b2
If b is not close to a column, things get very bad
>> b3 = (1/3)*[ ... ]'
>> x = A\b3, x2 = A2\b3
MATLAB does better with the first than with the second because it knows that the numerical rank of A is really 2.
One way to fix things is to zero out large elements in the SVD of the pseudo-inverse. What is the pseudo-inverse? (7.14)

The (Moore-Penrose) pseudo-inverse
Definition: Writing $A = U \Sigma V^{\mathrm H}$, where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0)$ [an $m \times n$ matrix], the pseudo-inverse $A^{+}$ is given by $A^{+} = V \Sigma^{+} U^{\mathrm H}$, where $\Sigma^{+} = \mathrm{diag}(1/\sigma_1, 1/\sigma_2, \ldots, 1/\sigma_r, 0, \ldots, 0)$ [an $n \times m$ matrix].
The pseudo-inverse equals the inverse for non-singular square matrices.
Theorem: Given the equation $Ax = b$, we have three cases:
(1) If the system has a unique solution, $x = A^{+}b$ produces that solution.
(2) If the system is over-determined, then $x = A^{+}b$ produces the solution from the linear manifold that minimizes $\|Ax - b\|$ (the least-squares solution) that also minimizes $\|x\|$.
(3) If the system is under-determined, then $x = A^{+}b$ produces the solution of $Ax = b$ that minimizes $\|x\|$.
So, it always does something sensible! (7.15)

Example (continued):
>> Ap = pinv(A)
>> Sp = svd(Ap)
>> x = Ap*b, x2 = Ap*b2, x3 = Ap*b3
We get something sensible! Note however that pinv(A)*b is not the same as A\b. MATLAB uses QR for over- or under-determined systems. Like the pseudo-inverse, QR produces the least-squares solution, but, unlike the pseudo-inverse, it produces (nullity A) zeros in the solution.
>> Ap2 = pinv(A2)   [Many elements are very large, a sign of trouble]
>> x = Ap2*b, x2 = Ap2*b2, x3 = Ap2*b3   [Huge, nonsensical numbers in the third case!]
We find the source of the problem by looking at the SVD
>> [Up, Sp, Vp] = svd(Ap2)   [Note the large first element] (7.16)
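A minimal MATLAB sketch of the definition, using the rank-2 example matrix from above: build $\Sigma^{+}$ by inverting only the singular values above a tolerance, and compare with the built-in pinv.

A = [1/3 1/3 2/3; 2/3 2/3 4/3; 1/3 2/3 3/3; 2/5 2/5 4/5; 3/5 1/5 4/5];
[U,S,V] = svd(A);

tol = 1e-6;                    % treat singular values below tol as zero
Sp  = zeros(size(A'));         % n x m
for i = 1:min(size(A))
    if S(i,i) > tol
        Sp(i,i) = 1/S(i,i);    % invert only the significant sigma_i
    end
end
Ap = V*Sp*U';                  % pseudo-inverse built from the definition

norm(Ap - pinv(A, tol))        % ~ 0: agrees with MATLAB's pinv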

Example (continued): We can fix the problem by zeroing out the large element of the SVD
>> Sp(1,1) = 0   [We zero out the first element]
>> Apm = Up*Sp*Vp'   [and reconstruct the pseudo-inverse]
>> x3 = Apm*b3   [and we now get sensible results]
We can avoid this problem by using a tolerance with MATLAB
>> Ap2 = pinv(A2,1e-6)   [The tolerance sets small SVD elements to zero]
>> x3 = Ap2*b3   [and we now get sensible results directly] (7.17)

Least-Squares Method and the QR Factorization
Problem Statement: We often are trying to fit a limited number of parameters to a large, noisy data set. That leads to an over-determined linear system.
Example: We have a set of carbon resistors in a circuit that must carry a specified current. As the resistors age, the voltage that is required to produce this current increases linearly. Hence, we expect $v = v_0 + \alpha y$, where v is the voltage and y is the age in years. We want to determine $v_0$ and $\alpha$ using measurements on resistors of varying ages. We get a curve like the one below:
[Figure: measured voltage versus age in years; the points scatter about a straight line.] (7.18)

Least-Squares Method and the QR Factorization
Problem Statement, Example (continued): With 201 measured points, we get a problem of the form $Ax = b$, where $x = [v_0\ \ \alpha]^{\mathrm T}$ and $b = [v_1\ v_2\ \cdots\ v_{201}]^{\mathrm T}$, in which $v_1, \ldots, v_{201}$ are the measured voltages, and $a_{k1} = 1$, $a_{k2} = y_k = (k-1)/10$, $k = 1, \ldots, 201$. We may generate a numerical example in the following way (the intercept 1 and slope 0.2 below are illustrative values):
>> y = 0:.1:20;
>> v = 1 + 0.2*y;
>> doc randn
>> delta = randn(size(v));
>> vr = v + 0.05*delta; b = vr';
>> doc plot
>> plot(y,v,y,vr,'.')
>> A = ones(201,2);
>> A(1:201,1) = 1;
>> A(1:201,2) = y';
>> x = A\b (7.19)

Least-Squares Method and the QR Factorization
Remark: Efficient solution of this problem is based on QR factorization. We have already seen that any matrix A may be written in the form $A = QR$, using Gram-Schmidt orthogonalization, where Q is an orthogonal matrix and R is upper triangular (see Lecture 6).
Theorem: $\|Ax - b\|^2 = \|Rx - Q^{\mathrm T} b\|^2$.
Proof: $\|Ax - b\|^2 = (QRx - b)^{\mathrm T}(QRx - b) = (x^{\mathrm T} R^{\mathrm T} Q^{\mathrm T} - b^{\mathrm T})(QRx - b) = x^{\mathrm T} R^{\mathrm T} Q^{\mathrm T} Q R x - x^{\mathrm T} R^{\mathrm T} Q^{\mathrm T} b - b^{\mathrm T} Q R x + b^{\mathrm T} b = x^{\mathrm T} R^{\mathrm T} R x - x^{\mathrm T} R^{\mathrm T} Q^{\mathrm T} b - b^{\mathrm T} Q R x + b^{\mathrm T} Q Q^{\mathrm T} b = \|Rx - Q^{\mathrm T} b\|^2$.
Theorem: If A is an $m \times n$ matrix of rank r, and $c = Q^{\mathrm T} b$, then we may find a least-squares solution by back-substitution if $r = n$, or by setting $x_{r+1} = \cdots = x_n = 0$ and then using back-substitution if $r < n$. We then find
$\min_x \|Ax - b\|^2 = \sum_{k=r+1}^{m} c_k^2$. (7.20)
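This identity is what makes QR useful for least squares: minimizing $\|Rx - Q^{\mathrm T} b\|$ reduces to back-substitution on the top block of R. A short MATLAB sketch (continuing the synthetic resistor data, with the same illustrative model values) compares the explicit QR solution with backslash:

y  = (0:.1:20)';                        % ages in years
vr = 1 + 0.2*y + 0.05*randn(size(y));   % noisy synthetic measurements

A = [ones(size(y)) y];           % 201 x 2 design matrix
b = vr;

[Q,R] = qr(A,0);                 % economy-size QR: Q is 201x2, R is 2x2
x_qr = R \ (Q'*b);               % solve R*x = Q'*b by back-substitution
x_bs = A \ b;                    % MATLAB's own least-squares solve

[x_qr x_bs]                      % the two estimates of [v0; alpha] agree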

Least-Squares Method and the QR Factorization
Remark: Except for the addition of column pivoting, which we will describe shortly, this algorithm is how MATLAB calculates least squares. Column pivoting is needed to ensure stability.
Remark: The QR algorithm is more efficient than the SVD, although less robust, and allows the easy introduction of additional data.
Remark: The QR algorithm plays an important role in finding eigenvalues and eigenvectors. To understand this algorithm (and algorithms for solving eigenproblems), we must first study rotations and reflections! (7.21)

Rotations and Reflections
Rotations:
[Figure: the vector (r, 0) is rotated through an angle $\theta$ into $x = (x_1, x_2) = (r\cos\theta, r\sin\theta)$.]
We find that
$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = U \begin{bmatrix} r \\ 0 \end{bmatrix}$, where $U = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} x_1/r & -x_2/r \\ x_2/r & x_1/r \end{bmatrix}$,
and U, a rotation matrix, is an orthonormal matrix with determinant 1. (7.22)

Rotations and Reflections
Rotations: We can systematically apply a series of rotation matrices of the form $U_{jk}$ to eliminate the lower-triangular elements of A and so create QR. Here $U_{jk}$ is the identity matrix except for the four entries in rows and columns j and k:
$(U_{jk})_{jj} = (U_{jk})_{kk} = c, \qquad (U_{jk})_{jk} = s, \qquad (U_{jk})_{kj} = -s,$
with $c = \cos\theta$ and $s = \sin\theta$. (7.23)

Rotations, Example: With $\rho_1 = \sqrt{a_{11}^2 + a_{21}^2}$, the first rotation eliminates the (2,1) element:
$Q_1 = \begin{bmatrix} a_{11}/\rho_1 & a_{21}/\rho_1 & 0 \\ -a_{21}/\rho_1 & a_{11}/\rho_1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},$
$A_2 = Q_1 A = \begin{bmatrix} \rho_1 & (a_{11}a_{12} + a_{21}a_{22})/\rho_1 & (a_{11}a_{13} + a_{21}a_{23})/\rho_1 \\ 0 & (a_{11}a_{22} - a_{21}a_{12})/\rho_1 & (a_{11}a_{23} - a_{21}a_{13})/\rho_1 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$ (7.24)
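A small MATLAB sketch of one such step (arbitrary 3 × 3 example): build the rotation from $a_{11}$ and $a_{21}$ and check that it zeroes the (2,1) entry.

A = [4 1 2; 3 5 1; 1 2 6];      % arbitrary example

rho = hypot(A(1,1), A(2,1));    % sqrt(a11^2 + a21^2)
c = A(1,1)/rho;  s = A(2,1)/rho;

Q1 = [ c  s  0;
      -s  c  0;
       0  0  1];                % Givens rotation acting on rows 1 and 2

A2 = Q1*A                       % (2,1) entry is now zero; (1,1) becomes rho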

Rotations and Reflections
Rotations, Example (continued): Let $a'_{ij}$ denote the entries of $A_2$ (its third row is unchanged, $a'_{3j} = a_{3j}$), and let $\rho_2 = \sqrt{a'^{\,2}_{11} + a_{31}^2}$. The second rotation eliminates the (3,1) element:
$Q_2 = \begin{bmatrix} a'_{11}/\rho_2 & 0 & a_{31}/\rho_2 \\ 0 & 1 & 0 \\ -a_{31}/\rho_2 & 0 & a'_{11}/\rho_2 \end{bmatrix},$
$A_3 = Q_2 A_2 = \begin{bmatrix} \rho_2 & (a'_{11}a'_{12} + a_{31}a_{32})/\rho_2 & (a'_{11}a'_{13} + a_{31}a_{33})/\rho_2 \\ 0 & a'_{22} & a'_{23} \\ 0 & (a'_{11}a_{32} - a_{31}a'_{12})/\rho_2 & (a'_{11}a_{33} - a_{31}a'_{13})/\rho_2 \end{bmatrix}.$
We find $Q_3$ and $A_4$ similarly by eliminating the (3,2) element. We then have $R = A_4$ and $Q = Q_1^{\mathrm T} Q_2^{\mathrm T} Q_3^{\mathrm T}$, so that $A = QR$.
Remark: These rotations are called Givens rotations. Better is to use reflections! (7.25)

Rotations and Reflections
Reflections: We want to reflect $x = (x_1, x_2) = (r\cos\theta, r\sin\theta)$ onto $(r, 0)$. We note that $r\sin\theta = 2r\sin(\theta/2)\cos(\theta/2) = x_2$ and $2r\sin^2(\theta/2) = r - x_1$. In vector notation, $v = x - (r, 0) = (x_1 - r,\ x_2)$ is the difference between x and the desired new vector. Letting $u = v/\|v\|$ and noting $\|v\|^2 = (x_1 - r)^2 + x_2^2 = 2r(r - x_1)$, we find
$U = I - 2uu^{\mathrm T} = I - \dfrac{1}{r(r - x_1)} \begin{bmatrix} (x_1 - r)^2 & (x_1 - r)x_2 \\ (x_1 - r)x_2 & x_2^2 \end{bmatrix}, \qquad Ux = \begin{bmatrix} r \\ 0 \end{bmatrix}.$
Remark: $\det U = -1$, which is a property of reflections.
Remark: Better in general is to use $v = [x_1 + \mathrm{sign}(x_1)\,r,\ \ x_2]^{\mathrm T}$, so that the first element is never zero. In that case, $Ux = [-\mathrm{sign}(x_1)\,r,\ \ 0]^{\mathrm T}$. (7.26)

Rotations and Reflections
Reflections (continued):
[Figure: the vector $x = (r\cos\theta, r\sin\theta)$, the reflection vector v, and the two possible reflected images $(r, 0)$ and $(-r, 0)$ corresponding to the two sign choices.] (7.27)

Reflections, Algorithm: This approach is extended in a very straightforward way to arbitrary dimensions and allows us to zero out an entire column of A at once. We let $r = \|a_1\|$, where $a_1$ is the first column vector of A. We then let $v = [r + a_{11},\ a_{21},\ \ldots,\ a_{m1}]^{\mathrm T}$ and
$Q_1 = Q_1^{\mathrm T} = I - 2vv^{\mathrm T}/\|v\|^2.$
We now proceed recursively:
$A_2 = Q_1 A = \begin{bmatrix} -r & w^{\mathrm T} \\ 0 & \hat{A} \end{bmatrix}$, where $w \in \mathbb{R}^{n-1}$ and $\hat{A} \in \mathbb{R}^{(m-1) \times (n-1)}$.
We let $r_2 = \|\hat{a}_1\|$, where $\hat{a}_1$ is now the first column vector of $\hat{A}$. We now let $v_2 = [r_2 + \hat{a}_{11},\ \hat{a}_{21},\ \ldots,\ \hat{a}_{(m-1)1}]^{\mathrm T}$, where $\hat{a}_{11}, \ldots, \hat{a}_{(m-1)1}$ are the elements of $\hat{a}_1$, and we let
$\hat{Q}_2 = I - 2 v_2 v_2^{\mathrm T}/\|v_2\|^2, \qquad Q_2 = \begin{bmatrix} 1 & 0 \\ 0 & \hat{Q}_2 \end{bmatrix}.$ (7.28)
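A compact MATLAB sketch of this recursion (no column pivoting, arbitrary test matrix): each pass builds the reflection from the leading column of the trailing block and applies it to the whole block.

A = [4 1 2; 3 5 1; 1 2 6; 2 0 3];   % arbitrary 4 x 3 example
[m,n] = size(A);
R = A;  Q = eye(m);

for k = 1:n
    a = R(k:m,k);                           % leading column of the trailing block
    v = a;
    v(1) = v(1) + sign(a(1))*norm(a);       % reflection vector (assumes a(1) ~= 0)
    v = v/norm(v);
    R(k:m,k:n) = R(k:m,k:n) - 2*v*(v'*R(k:m,k:n));   % apply I - 2*v*v' to the block
    Q(:,k:m)   = Q(:,k:m)   - 2*(Q(:,k:m)*v)*v';     % accumulate Q = Q1*Q2*...*Qn
end

norm(Q*R - A)        % ~ 0
norm(tril(R,-1))     % ~ 0: R is (numerically) upper triangular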

Reflections, Algorithm (continued): We now find
$A_3 = Q_2 Q_1 A = \begin{bmatrix} -r & w^{\mathrm T} \\ 0 & \hat{A}_2 \end{bmatrix}, \qquad \hat{A}_2 = \begin{bmatrix} -r_2 & w_2^{\mathrm T} \\ 0 & \hat{\hat{A}} \end{bmatrix},$
where $w_2 \in \mathbb{R}^{n-2}$ and $\hat{\hat{A}} \in \mathbb{R}^{(m-2) \times (n-2)}$.
We continue for r iterations, at which point the algorithm terminates. We then have $R = A_{r+1}$ and $Q = Q_1 Q_2 \cdots Q_r$.
One modification: in order to guarantee stability, at each iteration we calculate $\max_j \|a_j\|$, where the $a_j$ are the column vectors of the current trailing block $\hat{A}_k$, and we permute the column with the largest norm into the leading position. Permuting the column vectors permutes the rows of the solution x, which must be restored.
Remark: These reflections are called Householder reflections. (7.29)

QR Factorization
Example: We again consider the rank-2 matrix from the earlier SVD example.
(1) We have $\|a_1\| = [(1/3)^2 + (2/3)^2 + (1/3)^2 + (2/5)^2 + (3/5)^2]^{1/2} = 1.0893$, $\|a_2\| = 1.0954$, and $\|a_3\| = 2.1218$. So, we first permute columns 1 and 3:
$A = \begin{bmatrix} 0.333 & 0.333 & 0.667 \\ 0.667 & 0.667 & 1.333 \\ 0.333 & 0.667 & 1.000 \\ 0.400 & 0.400 & 0.800 \\ 0.600 & 0.200 & 0.800 \end{bmatrix}, \qquad A_1 = \begin{bmatrix} 0.667 & 0.333 & 0.333 \\ 1.333 & 0.667 & 0.667 \\ 1.000 & 0.667 & 0.333 \\ 0.800 & 0.400 & 0.400 \\ 0.800 & 0.200 & 0.600 \end{bmatrix}.$ (7.30)

QR Factorization
(2) We now have $v = [0.667 + 2.1218,\ \ 1.333,\ \ 1.000,\ \ 0.800,\ \ 0.800]^{\mathrm T}$. Calculating $Q_1 = I - 2vv^{\mathrm T}/\|v\|^2$ and $A_2 = Q_1 A_1$, we obtain
$A_2 = \begin{bmatrix} -2.1218 & -1.0641 & -1.0578 \\ 0.0000 & -0.0015 & 0.0015 \\ 0.0000 & 0.1655 & -0.1655 \\ 0.0000 & -0.0009 & 0.0009 \\ 0.0000 & -0.2009 & 0.2009 \end{bmatrix}.$
(3) The norms of the two columns of $\hat{A}$ (the trailing 4 × 2 block) are the same by inspection. So, we do not permute. We have $\|\hat{a}_1\| = 0.2603$ and $v_2 = [-0.0015 - 0.2603,\ \ 0.1655,\ \ -0.0009,\ \ -0.2009]^{\mathrm T}$. (7.31)

QR Factorization
(4) Calculating $Q_2$ and $R = A_3 = Q_2 Q_1 A_1$, we obtain
$R = \begin{bmatrix} -2.1218 & -1.0641 & -1.0578 \\ 0.0000 & 0.2603 & -0.2603 \\ 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 \end{bmatrix}.$
MATLAB:
>> A = [1/3 1/3 2/3; 2/3 2/3 4/3; 1/3 2/3 3/3; 2/5 2/5 4/5; 3/5 1/5 4/5]
>> [Q,R,E] = qr(A) (7.32)
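The three-output form of qr returns the column permutation explicitly. A short usage sketch with the same matrix: A*E equals Q*R, and the two non-negligible diagonal entries of R confirm that the numerical rank is 2.

A = [1/3 1/3 2/3; 2/3 2/3 4/3; 1/3 2/3 3/3; 2/5 2/5 4/5; 3/5 1/5 4/5];
[Q,R,E] = qr(A);          % E is the column-permutation matrix

norm(A*E - Q*R)           % ~ 0: QR of the column-permuted matrix
abs(diag(R))              % roughly [2.1218; 0.2603; ~1e-16]: numerical rank 2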

Orthogonal and Unitary Matrices
We stated that all our results apply to complex matrices, but all our examples are real. How do reflections and rotations work with complex matrices?
Theorem: The most general 2 × 2 orthogonal (unitary) matrix may be written
$U = \begin{bmatrix} e^{i(\alpha+\gamma)}\cos\theta & e^{i(\alpha+\delta)}\sin\theta \\ -e^{i(\beta+\gamma)}\sin\theta & e^{i(\beta+\delta)}\cos\theta \end{bmatrix}.$
Remark: $\det U = e^{i(\alpha+\beta+\gamma+\delta)}$ can equal +1, corresponding to rotations; $-1$, corresponding to reflections; and any other complex number of modulus 1.
Remark: Building from this, we may always make the first column vector of A real by multiplying it by the m × m unitary matrix
$U_{\varphi} = \mathrm{diag}\bigl(e^{i\varphi_1}, e^{i\varphi_2}, \ldots, e^{i\varphi_m}\bigr),$
chosen so that $e^{i\varphi_1} a_{11}, e^{i\varphi_2} a_{21}, \ldots, e^{i\varphi_m} a_{m1}$ are all real. So, the QR algorithm can easily be made to work with complex matrices. (7.33)
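A minimal MATLAB sketch of that last remark (arbitrary complex test matrix): a diagonal matrix of phases makes the first column real, after which the real Householder machinery applies.

A = randn(4,3) + 1i*randn(4,3);     % arbitrary complex matrix

phi  = angle(A(:,1));               % phases of the first column
Uphi = diag(exp(-1i*phi));          % unitary diagonal matrix of phases

B = Uphi*A;
max(abs(imag(B(:,1))))              % ~ 0: the first column of B is real (and nonnegative)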