More on SVD and Gram-Schmidt Orthogonalization


Indian Statistical Institute, Kolkata
Numerical Analysis (BStat I)
Instructor: Sourav Sen Gupta
Scribes: Saptarshi Chakraborty (BS1504), Rohan Hore (BS1519)
Date of Lecture: 05 February 2016

LECTURE 6
More on SVD and Gram-Schmidt Orthogonalization

6.1 Physical Interpretation of SVD

In this section we give the physical interpretation of the singular value decomposition (SVD). Before we proceed we need the following theorem.

Theorem 6.1. The determinant of an orthogonal matrix is either $+1$ or $-1$.

Proof. Let $A$ be orthogonal, so $A^T A = I_n$. Since $\det(A) = \det(A^T)$,
$$1 = \det(I_n) = \det(A^T A) = \det(A^T)\,\det(A) = (\det(A))^2,$$
and hence $\det(A) = +1$ or $-1$.

Let $A = U \Sigma V^T$ be the SVD of $A_{m \times n}$. To interpret the SVD geometrically, observe that the matrices $U$ and $V^T$ are orthogonal, hence they preserve lengths and angles. Consider a vector $x$. Multiplying by $V^T$ gives its components along the directions $v_i$, where the $v_i$ are the columns of $V$. The matrix $\Sigma$ then scales the component along $v_i$ by the factor $\sigma_i$, and finally $U$ maps the result into the column space of $A$, again preserving lengths and angles.

6.1.1 Relation between det(A) and det(Σ)

If $A_{n \times n}$ is a square matrix, then
$$\det(A) = \det(U)\,\det(\Sigma)\,\det(V^T),$$
and since $\det(U)$ and $\det(V^T)$ are $\pm 1$ by Theorem 6.1,
$$|\det(A)| = \det(\Sigma) = \prod_{i=1}^{n} \sigma_i.$$
Physically, the determinant can be interpreted (up to sign) as the volume of the $n$-dimensional parallelepiped formed by the columns of $A$; this volume equals the product of the diagonal entries of $\Sigma$.
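As a quick numerical check of this relation, here is a minimal sketch using base R's svd() and det(); the random test matrix is an illustration, not part of the notes:

    # |det(A)| should equal the product of the singular values of A
    set.seed(1)
    A <- matrix(rnorm(9), 3, 3)
    s <- svd(A)                  # A = U diag(d) V^T, with s$u, s$d, s$v
    abs(det(A))                  # these two numbers agree up to rounding
    prod(s$d)
    det(s$u); det(s$v)           # each is +1 or -1 (Theorem 6.1)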

6.2 Approximation by low rank matrices

Expanding $A = U \Sigma V^T$ we get
$$A = \sum_{i=1}^{l} \sigma_i\, u_i v_i^T,$$
where $l = \min\{m, n\}$ and $u_i$ and $v_i$ are the $i$-th columns of $U$ and $V$ respectively. The sum stops at $l$ because the remaining terms are zeroed out by $\Sigma$. This shows that the matrix $A$ (say of rank $r$) is a sum of rank-one matrices. In practice, if we drop the terms with small $\sigma_i$, what remains serves as a very good approximation of $A$. This is called a low rank approximation of the matrix $A$.

6.2.1 Application of low rank approximation

One application of low rank approximation is storing a high-quality digital picture in a small amount of memory. The amount of memory needed to store $A$ itself is high; if instead we store a few columns of $U$ and $V$ and the corresponding values $\sigma_i$, we require much less memory. As a simple illustration, Figure 6.1 shows the picture quality obtained when we keep 10, 20 and 30 terms of the sum.

6.3 Gram-Schmidt Process

6.3.1 Regular Gram-Schmidt process

Our first algorithm for QR decomposition is the regular Gram-Schmidt process. Assuming the columns of the matrix $A_{m \times n}$ are linearly independent, we can apply the Gram-Schmidt orthogonalization process to orthonormalize the columns of $A_{m \times n}$. To do this we multiply $A$ by a sequence of upper triangular matrices that carry out the column operations; in this way we obtain $Q$, and by inverting this sequence of upper triangular matrices we obtain $R$. The regular Gram-Schmidt process can be written in R as follows:

    # Regular (classical) Gram-Schmidt: orthonormalize the columns of V (n x k).
    # Assumes the columns of V are linearly independent.
    gram_schmidt <- function(V) {
      n <- nrow(V); k <- ncol(V)
      Q <- matrix(0, n, k)
      Q[, 1] <- V[, 1] / sqrt(sum(V[, 1]^2))      # q_1 = v_1 / ||v_1||
      for (i in seq_len(k)[-1]) {                 # i = 2, ..., k
        p <- rep(0, n)
        for (j in 1:(i - 1)) {                    # project v_i onto q_1, ..., q_{i-1}
          p <- p + sum(V[, i] * Q[, j]) * Q[, j]
        }
        r <- V[, i] - p
        Q[, i] <- r / sqrt(sum(r^2))              # q_i = r / ||r||
      }
      Q                                           # columns form an orthonormal basis
    }

6.3.2 Modified Gram-Schmidt process

Here also we initially have the column vectors $v_1, v_2, \ldots, v_n$ and we go on finding the orthonormal vectors $\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_n$, but now, as soon as we obtain one of the orthonormal vectors (say $\hat{q}_i$), we immediately project $\hat{q}_i$ out of the remaining column vectors $v_{i+1}, v_{i+2}, \ldots$ and so on.
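As a small numerical illustration of the low rank approximation of Section 6.2 and of the truncation behind Figure 6.1, here is a sketch using base R's svd(); the matrix size and the choice $k = 5$ are arbitrary and not from the notes:

    # Rank-k approximation: keep only the k largest singular values
    low_rank_approx <- function(A, k) {
      s <- svd(A)
      s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], k, k) %*% t(s$v[, 1:k, drop = FALSE])
    }

    set.seed(2)
    A  <- matrix(rnorm(100 * 80), 100, 80)   # stand-in for an image matrix
    Ak <- low_rank_approx(A, 5)
    norm(A - Ak, type = "F")                 # approximation error in the Frobenius norm
    # Storage: k*(m + n + 1) numbers instead of m*n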

Figure 6.1: (Clockwise) picture built from 10, 20 and 30 principal components respectively, and the original picture

6.3.2.1 Algorithm for Modified Gram-Schmidt Process

The modified Gram-Schmidt process can be written in R as follows:

    # Modified Gram-Schmidt: orthonormalize the columns of V (n x k).
    # Assumes the columns of V are linearly independent.
    modified_gram_schmidt <- function(V) {
      k <- ncol(V); Q <- matrix(0, nrow(V), k)
      for (i in 1:k) {
        Q[, i] <- V[, i] / sqrt(sum(V[, i]^2))               # q_i = v_i / ||v_i||
        if (i < k) for (j in (i + 1):k)                      # immediately project q_i out
          V[, j] <- V[, j] - sum(V[, j] * Q[, i]) * Q[, i]   # of the remaining columns
      }
      Q                                                      # orthonormal basis q_1, ..., q_k
    }
    # Example input: k random vectors of length n, e.g. V <- matrix(rnorm(n * k), n, k)

6.3.3 Comparison between the Gram-Schmidt processes

The regular Gram-Schmidt algorithm is sometimes not good enough. For instance, due to rounding and other issues, the $\hat{q}_i$ are sometimes not completely orthogonal after the projection step. Our projection formula for finding $p$ within the algorithm, however, only works when the $\hat{q}_i$ are orthogonal. For this reason, in the presence of rounding, the computed projection $p$ of $v_i$ becomes less accurate.
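The difference shows up numerically. A small experiment with three nearly dependent vectors, using the gram_schmidt and modified_gram_schmidt sketches above (the test matrix is my own example, not from the notes):

    # Loss of orthogonality, measured by ||Q^T Q - I||_F
    eps <- 1e-8
    V <- cbind(c(1, eps, 0), c(1, 0, eps), c(1, 0, 0))    # nearly dependent columns
    Q1 <- gram_schmidt(V)
    Q2 <- modified_gram_schmidt(V)
    norm(t(Q1) %*% Q1 - diag(3), type = "F")              # regular GS: large (order 1)
    norm(t(Q2) %*% Q2 - diag(3), type = "F")              # modified GS: tiny (order eps)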

One way to solve this issue is the modified Gram-Schmidt algorithm, which has a similar running time but makes a subtle change in the way projections are computed. Rather than computing, in each iteration $i$, the projection of $v_i$ onto span$\{\hat{q}_1, \ldots, \hat{q}_{i-1}\}$, we project $\hat{q}_i$ out of the remaining vectors $v_{i+1}, \ldots, v_k$ as soon as $\hat{q}_i$ is computed, so that we never have to consider $\hat{q}_i$ again. This way, even if the basis is not completely orthogonal due to rounding, the projection step is valid, since it only projects onto one $\hat{q}_i$ at a time. In the absence of rounding, modified Gram-Schmidt and regular Gram-Schmidt generate identical output; but in terms of machine computation, MGS is better.

6.3.4 Drawbacks of the Gram-Schmidt Process

If any two of the input vectors $v_i$ and $v_j$ are very close, the Gram-Schmidt process becomes unstable. Suppose we provide the two vectors $v_1 = (1, 1)^T$ and $v_2 = (1 + \epsilon, 1)^T$, where $\epsilon \ll 1$. A reasonable basis of span$[v_1, v_2]$ would be $[(0, 1)^T, (1, 0)^T]$. But if we apply the Gram-Schmidt process we obtain
$$\hat{q}_1 = v_1 / \|v_1\| = \tfrac{1}{\sqrt{2}} (1, 1)^T,$$
$$p = (v_2 \cdot \hat{q}_1)\, \hat{q}_1 = \tfrac{2 + \epsilon}{2} (1, 1)^T,$$
$$r = v_2 - p = \tfrac{1}{2} (\epsilon, -\epsilon)^T, \qquad \|r\| = \tfrac{\epsilon}{\sqrt{2}}.$$
So computing $\hat{q}_2 = \tfrac{1}{\sqrt{2}} (1, -1)^T$ requires division by a very small number of order $O(\epsilon)$, which is an unstable numerical operation. In such cases we should avoid applying the Gram-Schmidt process.

6.4 QR decomposition

We can do the QR decomposition by the Gram-Schmidt process. Consider the columns of the given matrix (say $A_{m \times n}$) as the vectors $v_1, v_2, \ldots, v_n$. We apply the Gram-Schmidt process to this set of vectors to get $q_1, q_2, \ldots, q_n$, and then $R$ can be found. Now, $A$ may not have full column rank; in that case we consider only the linearly independent columns and then extend. For a full column rank matrix, the $R$ matrix can be written as
$$R = \begin{pmatrix}
v_1 \cdot q_1 & v_2 \cdot q_1 & \cdots & v_n \cdot q_1 \\
0 & v_2 \cdot q_2 & \cdots & v_n \cdot q_2 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & v_n \cdot q_n
\end{pmatrix}.$$

6.4.1 QR decomposition by other ways

The general process of QR decomposition by Gram-Schmidt can be pictured as
$$A\, U_1 U_2 \cdots U_{n-1} U_n = Q,$$
where $Q$ is the resulting orthogonal matrix and the $U_i$ are the upper triangular matrices we multiplied by; thus $R$ is recovered as $R = U_n^{-1} U_{n-1}^{-1} \cdots U_1^{-1}$. Here we post-multiplied $A$ by upper triangular matrices to get $Q$. But let us instead try to pre-multiply $A$ by some matrices to get $R$. Obviously these matrices are bound to be orthogonal, since $Q^T A = R$; so we are multiplying $A$ by orthogonal matrices. Now, an orthogonal matrix cannot do any kind of stretching, so it can only perform reflections and rotations.
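Section 6.4 can be checked with a short sketch, assuming the modified_gram_schmidt function from Section 6.3.2.1 (the test matrix is arbitrary): form $Q$ from the columns of $A$ and recover $R$ as $Q^T A$.

    # QR via Gram-Schmidt: Q has orthonormal columns, R = Q^T A is upper triangular
    set.seed(3)
    A <- matrix(rnorm(5 * 3), 5, 3)     # full column rank with probability 1
    Q <- modified_gram_schmidt(A)
    R <- t(Q) %*% A
    max(abs(A - Q %*% R))               # essentially zero: A = QR
    round(R, 10)                        # entries below the diagonal are zero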

Let us try rotation first.

6.4.1.1 QR decomposition by rotation

Objective: all the elements below the diagonal of the $k$-th column of the given matrix should be made zero. Thus, for the first column, we want to make every element below the first entry zero; that is, the first column of $A$ should be transformed as
$$(a_{11}, a_{21}, \ldots, a_{n1})^T \rightarrow (a_{11}^{(1)}, 0, a_{31}, \ldots, a_{n1})^T \rightarrow (a_{11}^{(2)}, 0, 0, a_{41}, \ldots, a_{n1})^T \rightarrow \cdots \rightarrow (a_{11}^{(n-1)}, 0, 0, \ldots, 0)^T.$$
Now, since rotation matrices do not perform any kind of stretching, our objective is a rotation which performs $v \rightarrow \|v\|\, e_1$, where $e_1$ is the first vector of the canonical basis. In $\mathbb{R}^2$ we already know how to do this: the rotation matrix is
$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$
We generalise this rotation matrix to $\mathbb{R}^n$ through the Givens rotation matrix. Givens defined the matrix $G(i, j, \theta)$ as follows:
$$G(i, j, \theta) = \begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots & & \vdots \\
0 & \cdots & c & \cdots & s & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & -s & \cdots & c & \cdots & 0 \\
\vdots & & \vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{pmatrix},$$
where $c = \cos\theta$ and $s = \sin\theta$ appear at the intersections of the $i$-th and $j$-th rows and columns. The non-zero elements of $G(i, j, \theta)$ are
$$g_{kk} = 1 \ \text{for } k \neq i, j, \qquad g_{ii} = g_{jj} = c, \qquad g_{ij} = s \ \text{and} \ g_{ji} = -s \ \text{for } i < j.$$
The product $G(i, j, \theta)\, x$ rotates $x$ by $\theta$ radians in the $(i, j)$ plane. Thus, we pre-multiply $A$ by the matrices $G(i, j, \theta)$ with $j > i$, choosing $\theta$ each time so that the targeted subdiagonal entry becomes zero. By performing all of these multiplications we ultimately obtain the matrix $R$, and we are done.

6.4.1.2 QR by reflection

So far we have performed the QR decomposition by pre-multiplying with suitable rotation matrices. Since we began this discussion by asking whether QR can be obtained by pre-multiplying with arbitrary orthogonal matrices, we should also ask whether reflections can be used. The answer is yes, and the way to do it is called the Householder transformation.
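To close, here is a minimal R sketch of the rotation-based QR of Section 6.4.1.1; the function name and the way $\theta$ is chosen from the entries being zeroed are my own illustration, not spelled out in the notes:

    # QR by Givens rotations: zero each entry below the diagonal with a plane rotation
    givens_qr <- function(A) {
      m <- nrow(A); n <- ncol(A)
      R <- A; Q <- diag(m)
      for (k in seq_len(min(n, m - 1))) {
        for (i in (k + 1):m) {                     # zero out R[i, k] using rows k and i
          if (R[i, k] != 0) {
            r  <- sqrt(R[k, k]^2 + R[i, k]^2)
            cs <- R[k, k] / r; sn <- R[i, k] / r   # cos(theta) and sin(theta)
            G  <- rbind(c(cs, sn), c(-sn, cs))     # the 2x2 block of G(k, i, theta)
            R[c(k, i), ] <- G %*% R[c(k, i), ]
            Q[, c(k, i)] <- Q[, c(k, i)] %*% t(G)  # accumulate Q, keeping A = Q R
          }
        }
      }
      list(Q = Q, R = R)
    }

    set.seed(4)
    A   <- matrix(rnorm(16), 4, 4)
    out <- givens_qr(A)
    max(abs(A - out$Q %*% out$R))                  # essentially zero; out$R is upper triangular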