Stability of the Gram-Schmidt process

Orthogonal projection

We learned in multivariable calculus (or physics, or elementary linear algebra) that if q is a unit vector and v is any vector, then the orthogonal projection of v onto span{q} is v∥ = <v, q> q, and the orthogonal complement is v⊥ = v - v∥.

[Figure: a vector v decomposed into its projection <v, q> q along q and the complementary component v - <v, q> q.]

More generally, if Q is a matrix with orthonormal columns, then the orthogonal projection of v onto the column space of Q is QQ^*v, and its orthogonal complement is v - QQ^*v = (I - QQ^*)v. Note that if Q is a square matrix with orthonormal columns then it is orthogonal; but in general this is not true.

The operator P = QQ^* is called an orthogonal projector. A projector satisfies the relation P^2 = P. The complementary projector is P⊥ = I - P. It is also a projector, and P P⊥ = P⊥ P = 0. To say that a projector is orthogonal is to say that its column space and nullspace are orthogonal, or equivalently that the column spaces of P and its complement are orthogonal. This is also equivalent to the condition that P^* = P, because for any matrix A the orthogonal complement of the column space of A is the same thing as the nullspace of A^*. Beware of a possible confusion in the terminology: an orthogonal projector is rarely an orthogonal matrix.

Classical method

Suppose A = [a_1 ... a_n] is a matrix with linearly independent columns. We will use orthogonal projectors to construct a matrix Q = [q_1 ... q_n] satisfying the following conditions:

1. Q^*Q = I.
2. For all j, span{a_1, ..., a_j} = span{q_1, ..., q_j}.

From the second condition we see that A = QR, where R is upper triangular. Our algorithm will also ensure that the diagonal entries of R are positive. With these three conditions, Q and R are uniquely determined.

Here is the classical algorithm:

1. Set r_11 = ||a_1|| and q_1 = a_1/r_11.
2. Given q_1, ..., q_{j-1}, construct q_j and the j-th column of R as follows:
   (a) Let P be the orthogonal projector onto span{q_1, ..., q_{j-1}}.
   (b) Let r_{1j}, ..., r_{j-1,j} be the coordinates of P a_j with respect to q_1, ..., q_{j-1}.
   (c) Let v = a_j - P a_j.
   (d) Let r_jj = ||v||.
   (e) Finally, let q_j = v/r_jj.
3. Repeat step 2 for j = 2, ..., n.

(We are using the 2-norm throughout. A code sketch of this algorithm appears at the end of this section.)

This produces the so-called reduced QR-decomposition; Q has the same size as A, and R is square. The so-called full QR-decomposition completes Q to a square, orthogonal matrix and adds all-zero rows to R so that we still have QR = A. The full form of the QR-decomposition is used less often than the reduced form.

Also, if A does not have linearly independent columns then we can still construct a QR-decomposition. When we come to a column which is linearly dependent on the previous columns, then v = 0 in step 2c, and so we merely skip steps 2d and 2e. If A has rank r then this produces an m × r matrix Q with orthonormal columns and an upper echelon r × n matrix R with positive pivot entries.
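The notes refer to a routine clgs() for this algorithm but do not list its source. Here is a minimal Octave/Matlab sketch of what such an implementation might look like, following the numbered steps above; the function name and the [Q, R] calling convention are taken from the usage later in these notes, while the body itself is only an illustration, not the official course code:

function [Q, R] = clgs(A)
  % Classical Gram-Schmidt: reduced QR-decomposition A = Q*R,
  % assuming the columns of A are linearly independent.
  [m, n] = size(A);
  Q = zeros(m, n);
  R = zeros(n, n);
  for j = 1:n
    v = A(:,j);
    for i = 1:j-1
      R(i,j) = Q(:,i)' * A(:,j);   % coordinate of P*a_j along q_i (step 2b)
      v = v - R(i,j) * Q(:,i);     % after the loop, v = a_j - P*a_j (step 2c)
    end
    R(j,j) = norm(v);              % step 2d
    Q(:,j) = v / R(j,j);           % step 2e
  end
end

For example, [Q, R] = clgs(randn(5,3)) should return a Q whose columns are (nearly) orthonormal and an upper triangular R with positive diagonal entries.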

Stable method

The classical Gram-Schmidt method is useful conceptually but is not numerically stable, as we shall see. There is a simple way to fix the algorithm so as to restore stability: rather than adjust the current column to be orthogonal to the previous columns, adjust the remaining columns to be orthogonal to the current one. Said another way, rather than produce R column by column, produce it row by row. Here is the modified Gram-Schmidt algorithm, which we will also refer to as the stable Gram-Schmidt algorithm:

1. Set Q = A.
2. Let r_jj = ||q_j||.
3. Replace q_j by q_j/r_jj.
4. Make q_{j+1}, ..., q_n orthogonal to q_j, as follows:
   (a) Let r_{jk} = <q_k, q_j>, for k = j+1, ..., n.
   (b) Replace q_k by q_k - r_{jk} q_j, for k = j+1, ..., n.
5. Repeat steps 2, 3, and 4 for j = 1, ..., n.

(A code sketch of this algorithm is given below.)

To appreciate the difference in stability between the two methods, exercises 1 and 2 (below) ask you to run these algorithms by hand on a small, nearly singular, 2 × 2 example. You may use a calculator to do the arithmetic, but round off each computation to 5 decimal places. For this same matrix, exercise 3 asks you to make another check on the stability, called the Householder test. Do this in Matlab or Octave, not by hand. Finally, exercise 4 asks you to determine the efficiency of these two algorithms, by counting the total number of floating point operations (flops) for each. As before, you should probably make three separate tallies for each:

- Additions and subtractions.
- Multiplications.
- Divisions.
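Here, correspondingly, is a minimal Octave/Matlab sketch of what a stable (modified) Gram-Schmidt routine stgs() along the lines of the algorithm above might look like; again the name and calling convention come from the usage later in the notes, and the body is only an illustration:

function [Q, R] = stgs(A)
  % Modified (stable) Gram-Schmidt: reduced QR-decomposition A = Q*R.
  % R is produced row by row rather than column by column.
  [m, n] = size(A);
  Q = A;                               % step 1
  R = zeros(n, n);
  for j = 1:n
    R(j,j) = norm(Q(:,j));             % step 2
    Q(:,j) = Q(:,j) / R(j,j);          % step 3
    for k = j+1:n                      % step 4: orthogonalize the remaining
      R(j,k) = Q(:,j)' * Q(:,k);       %         columns against q_j
      Q(:,k) = Q(:,k) - R(j,k) * Q(:,j);
    end
  end
end

The point to notice is that r_{jk} is computed against the already-updated column q_k rather than against the original column a_k; this is exactly what distinguishes the modified algorithm from the classical one.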

Stability comparison

Let us explore the stability of the three functions clgs(), stgs(), and qr() (which we will explain in a later lesson) using large matrices. We will generate an 80 × 80 matrix with random Q and with an R whose diagonal entries decrease exponentially. To do this we use a factorization we will learn more about later, the singular value decomposition, or SVD. This has the form USV^*, where U and V are orthogonal and S is diagonal with positive entries.

[U,X] = qr(randn(80));      % randn() uses the normal distribution.
[V,X] = qr(randn(80));
S = diag(2.^(-1:-1:-80));   % .^ means element-wise exponentiation.
A = U*S*V';

We have chosen U and V randomly, and S to have diagonal entries 2^-1, ..., 2^-80. When we find the QR-decomposition of A these will not be the exact values of the diagonal of R, but they will be of the same order of magnitude. In particular, at least some of these values are below the machine accuracy, and so will be lost to roundoff error. How many?

Let's compute the QR-decomposition in three ways, collect the computed diagonal of R into three vectors x, y, and z, then plot the logarithms (base 2) of the three together:

[Qc,Rc] = clgs(A);          % Here we are only interested in the R, but
[Qs,Rs] = stgs(A);          % you might also check norm(Q'*Q - eye(80)).
[Q,R]   = qr(A,0);          % The parameter 0 requests the reduced form.
x = sum(tril(Rc));          % This is a way to collect the diagonal entries
y = sum(tril(Rs));          % into a vector -- can you find a better way?
z = sum(abs(tril(R)));      % qr() does not guarantee a positive diagonal.

plot(log2(x), '@;clgs();', ...   % The expression '@;clgs();' is a formatting
     log2(y), '@;stgs();', ...   % command: it says to plot points (rather than
     log2(z), '@;qr();')         % lines, say) and label these as clgs().

[Plot: log2 of the computed diagonal entries of R, for clgs(), stgs(), and qr().]

Note that clgs() becomes lost in roundoff at about 2^-28, which is well before machine accuracy should be a problem. (ε_machine ≈ 2^-56 = (2^-28)^2.) However, both stgs() and qr() remain stable down to the limits of machine accuracy. Can they be distinguished in another way, perhaps by the Householder test?

Application: Legendre polynomials

There are many applications of the QR-decomposition, for example to the solution of linear systems. (See exercise 5.) In this section we explore an application of the Gram-Schmidt algorithm itself. The goal of the algorithm is to take a linearly independent set and produce from it an orthonormal set with the same span. All we need to run the algorithm is a vector space with an inner product. One important, infinite-dimensional example is the space of all continuous functions on [-1, 1], with the inner product defined as follows:

    <f, g> = (1/2) ∫_{-1}^{1} f(t) g(t) dt.

(The factor of 1/2 makes the constant function 1 into a unit vector.) Thus we may speak of two functions being orthogonal on [-1, 1]. For example, sin(πt) and cos(πt) are orthogonal on [-1, 1]. (Check this!)

The Legendre polynomials are the orthogonal polynomials produced by applying the Gram-Schmidt process to the standard monomials t^0, t^1, t^2, t^3, .... Exercise 6 asks you to determine exact formulas for the first four Legendre polynomials, q_0, ..., q_3. Here we approximate these by discretizing the interval [-1, 1]; that is, we take a large number (257 in this example) of equally spaced points from that interval, form a matrix A by evaluating the monomials t^0, ..., t^3 at these points, then compute the QR-decomposition of A. Note that A consists of four columns of a Vandermonde matrix. Finally we plot the four functions q_i.

t = (-128:128)'/128;    % We could also use t = linspace(-1,1,257)';
A = [];
for j = 0:3             % We construct A by hand using element-wise
  A(:,j+1) = t.^j;      % operations, because it would be a waste of
end                     % memory to produce the full square matrix with vander().
[Q,R] = qr(A,0);
plot(Q)                 % This plots the columns as distinct functions.

[Plot: the four columns of Q plotted as functions of t; these are the discretized orthonormal polynomials before rescaling.]

In essence we are using Riemann sums to approximate the integrals, and so we would need to rescale by a factor involving Δt. To take this (and any roundoff errors) into account, let's simply rescale all of these polynomials so that q_i(1) = 1.

for j = 1:4
  Q(:,j) = Q(:,j)/Q(257,j);   % Q(257,j) is the computed value of q_{j-1} at t = 1.
end
plot(Q)

[Plot: the rescaled columns of Q; these approximate the first four Legendre polynomials, normalized so that each equals 1 at t = 1.]
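As a quick sanity check of the discretized inner product, here is a short sketch, assuming the vectors t and Q from the listings above are still in the workspace (the names f, g, and dt are introduced here only for illustration). It checks numerically that sin(πt) and cos(πt) are orthogonal on [-1, 1], and that the rescaled columns of Q are still mutually orthogonal:

dt = 1/128;                      % spacing of the grid t = (-128:128)'/128
f = sin(pi*t);
g = cos(pi*t);
(1/2)*sum(f.*g)*dt               % Riemann sum for (1/2) times the integral of f*g; essentially 0
(1/2)*sum(Q(:,2).*Q(:,4))*dt     % the rescaled q_1 and q_3 are still orthogonal; also essentially 0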

Homework problems: Final version due Friday, 17 March

Do five of the following six problems. In problems 1, 2, and 3 (but not 4, 5, or 6), let

    A = [ 0.70000  0.70711
          0.70001  0.70711 ]

1. For A as above, compute [Q, R] = clgs(A) using 5-place floating-point arithmetic. Is Q orthogonal? Is QR = A?

2. For A as above, compute [Q, R] = stgs(A) using 5-place floating-point arithmetic. Is Q orthogonal? Is QR = A?

3. For A as above, use Matlab or Octave to compute the QR-decomposition three ways: clgs(), stgs(), and qr(). For each of these check ||Q^*Q - I||. (This is called the Householder test.) What should this value be? What do you find?

4. Carefully count flops for clgs() and stgs(). As before, you should probably make three separate tallies for each:

   - Additions and subtractions.
   - Multiplications.
   - Divisions.

   Which is faster?

5. Suppose you are given the (exact!) QR-factorization of a matrix A. (Not necessarily the one above!) Describe an efficient method for solving Ax = b. How many flops does this method require? Is it backward stable?

6. Find explicit formulas for the Legendre polynomials q_0, ..., q_3. Use both the classical and the stable algorithms, but do not bother to keep track of R. Verify that the q_i you find are orthonormal. Which algorithm is easier to use in this context?