Least-Squares Systems and the QR Factorization

Outline: Orthogonality. Least-squares systems. The Gram-Schmidt and Modified Gram-Schmidt processes. The Householder QR and the Givens QR.

Orthogonality; the Gram-Schmidt algorithm

1. Two vectors u and v are orthogonal if (u, v) = 0.
2. A system of vectors {v_1, ..., v_n} is orthogonal if (v_i, v_j) = 0 for i ≠ j, and orthonormal if (v_i, v_j) = δ_ij.
3. A matrix is orthogonal if its columns are orthonormal.

Notation: V = [v_1, ..., v_n] denotes the matrix with column-vectors v_1, ..., v_n.

Least-squares systems

Given: an m × n matrix A with n < m. Problem: find x which minimizes ‖b − Ax‖_2.

A good illustration is data fitting. A typical problem: we seek an unknown function as a linear combination φ of n known functions φ_i (e.g. polynomials, trigonometric functions). Experimental data (not accurate) provide measurements β_1, ..., β_m of this unknown function at the points t_1, ..., t_m. Problem: find the best possible approximation φ to this data. Question: close in what sense?

Least-squares approximation: find φ such that

  φ(t) = Σ_{i=1}^n ξ_i φ_i(t)   and   Σ_{j=1}^m |φ(t_j) − β_j|² = Min.

In linear algebra terms: find the best approximation to a vector b by linear combinations of the vectors f_i, i = 1, ..., n, where

  b = [β_1; β_2; ...; β_m],   f_i = [φ_i(t_1); φ_i(t_2); ...; φ_i(t_m)],

i.e. φ(t) = Σ_{i=1}^n ξ_i φ_i(t) such that φ(t_j) ≈ β_j, j = 1, ..., m.
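To make this setup concrete, here is a minimal MATLAB sketch that assembles F = [f_1, ..., f_n] for the monomial basis φ_i(t) = t^(i−1), using the sample data that appears later in these notes (the variable names are illustrative, not from the slides):

  t = [-1; -0.5; 0; 0.5; 1];       % measurement points t_1, ..., t_m
  b = [0.1; 0.3; 0.3; 0.2; 0.0];   % measured values beta_1, ..., beta_m
  n = 2;                           % number of basis functions
  m = length(t);
  F = zeros(m, n);
  for i = 1:n
      F(:, i) = t.^(i-1);          % i-th column is f_i = [phi_i(t_1); ...; phi_i(t_m)]
  end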

We want to find x = {ξ_i}_{i=1,...,n} such that ‖Σ_{i=1}^n ξ_i f_i − b‖ is minimum. Define

  F = [f_1, f_2, ..., f_n],   x = [ξ_1; ξ_2; ...; ξ_n].

We want to find x to minimize ‖b − F x‖_2. This is a least-squares linear system: F is m × n, with m > n.

THEOREM. The vector x* minimizes ψ(x) = ‖b − F x‖²_2 if and only if it is the solution of the normal equations:

  F^T F x = F^T b.

Proof: Expand out the formula for ψ(x* + δx):

  ψ(x* + δx) = ((b − F x*) − F δx)^T ((b − F x*) − F δx)
             = ψ(x*) − 2 (F δx)^T (b − F x*) + (F δx)^T (F δx)
             = ψ(x*) − 2 (δx)^T [F^T (b − F x*)] + ‖F δx‖²_2.

The bracketed quantity is, up to a constant factor, the gradient of ψ, and the last term is always ≥ 0. In order to have ψ(x* + δx) ≥ ψ(x*) for every δx, the bracketed quantity [the gradient vector] must be zero. Q.E.D.

Illustration of the theorem: x* is the best approximation to the vector b from the subspace span{F} if and only if b − F x* is orthogonal to the whole subspace span{F}. This in turn is equivalent to F^T (b − F x*) = 0, i.e. to the normal equations.

[Figure: b, its projection F x* onto the subspace span{F}, and the residual b − F x*, which is orthogonal to span{F}.]

Example. Points: t_1 = −1, t_2 = −1/2, t_3 = 0, t_4 = 1/2, t_5 = 1. Values: β_1 = 0.1, β_2 = 0.3, β_3 = 0.3, β_4 = 0.2, β_5 = 0.0.

[Figure: plot of the data points (t_j, β_j).]
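As a check on the theorem, a minimal MATLAB sketch that forms and solves the normal equations for this data with the basis φ_1(t) = 1, φ_2(t) = t (the expected result, computed on the next page, is φ(t) = 0.18 − 0.06 t):

  t = [-1; -0.5; 0; 0.5; 1];
  b = [0.1; 0.3; 0.3; 0.2; 0.0];
  F = [ones(5,1), t];           % columns f_1 and f_2
  x = (F'*F) \ (F'*b);          % normal equations: F'F x = F'b
  % x = [0.18; -0.06], i.e. phi(t) = 0.18 - 0.06*t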

1) Approximation by polynomials of degree one: φ_1(t) = 1, φ_2(t) = t. Then

  F = [1 −1; 1 −1/2; 1 0; 1 1/2; 1 1]   (one row [1, t_j] per point),

  F^T F = [5 0; 0 2.5],   F^T b = [0.9; −0.15].

The best approximation is φ(t) = 0.18 − 0.06 t.

[Figure: data points together with the best degree-one fit.]

2) Approximation by polynomials of degree two: φ_1(t) = 1, φ_2(t) = t, φ_3(t) = t². Best polynomial found:

  φ(t) = 0.3085714286 − 0.06 t − 0.2571428571 t².

[Figure: data points together with the best degree-two fit.]

Problem with the normal equations: the condition number is high. If A is square and non-singular, then

  κ_2(A) = ‖A‖_2 ‖A^{-1}‖_2 = σ_max/σ_min,
  κ_2(A^T A) = ‖A^T A‖_2 ‖(A^T A)^{-1}‖_2 = (σ_max/σ_min)².

Example: Let

  A = [1 1; ε 0; 0 ε].

Then κ_2(A) ≈ √2/ε, but κ_2(A^T A) ≈ 2/ε². Moreover,

  fl(A^T A) = fl([1 + ε², 1; 1, 1 + ε²]) = [1 1; 1 1]

is singular to working precision (if ε² < u, the unit roundoff).

Finding an orthonormal basis of a subspace

Goal: find the vector in span(X) closest to b. This is much easier with an orthonormal basis for span(X).

Problem: given X = [x_1, ..., x_n], compute Q = [q_1, ..., q_n] which has orthonormal columns and is such that span(Q) = span(X).

Note: each column of X must be a linear combination of certain columns of Q. We will find Q so that x_j (the j-th column of X) is a linear combination of the first j columns of Q.
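The damage done by forming A^T A is easy to observe numerically. A small MATLAB sketch of the example above, with ε chosen so that ε² falls below the double-precision unit roundoff u ≈ 1.1e−16:

  ep = 1e-9;                  % ep^2 = 1e-18 < u, so fl(1 + ep^2) = 1
  A  = [1 1; ep 0; 0 ep];
  cond(A)                     % approx sqrt(2)/ep, i.e. about 1.4e9
  cond(A'*A)                  % Inf: fl(A'*A) = [1 1; 1 1] is exactly singular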

ALGORITHM 1: Classical Gram-Schmidt

1. For j = 1, 2, ..., n Do:
2.   Set q̂ := x_j
3.   Compute r_ij := (q̂, q_i), for i = 1, ..., j−1
4.   For i = 1, ..., j−1 Do:
5.     Compute q̂ := q̂ − r_ij q_i
6.   EndDo
7.   Compute r_jj := ‖q̂‖_2
8.   If r_jj = 0 then Stop, else q_j := q̂/r_jj
9. EndDo

All n steps can be completed iff x_1, x_2, ..., x_n are linearly independent.

Lines 5 and 7-8 show that

  x_j = r_1j q_1 + r_2j q_2 + ... + r_jj q_j.

If X = [x_1, x_2, ..., x_n], Q = [q_1, q_2, ..., q_n], and R is the n × n upper triangular matrix R = {r_ij}_{i,j=1,...,n}, then the above relation can be written as

  X = QR.

R is upper triangular, Q is orthogonal. This is called the QR factorization of X. What is the cost of the factorization when X ∈ R^{m×n}?

[Diagram: X (original matrix) = Q × R, where Q is orthogonal (Q^H Q = I) and R is upper triangular.]

Stated as a decomposition: a matrix X with linearly independent columns is the product of an orthogonal matrix Q and an upper triangular matrix R.

Better algorithm: Modified Gram-Schmidt.

ALGORITHM 2: Modified Gram-Schmidt

1. For j = 1, 2, ..., n Do:
2.   Define q̂ := x_j
3.   For i = 1, ..., j−1 Do:
4.     r_ij := (q̂, q_i)
5.     q̂ := q̂ − r_ij q_i
6.   EndDo
7.   Compute r_jj := ‖q̂‖_2
8.   If r_jj = 0 then Stop, else q_j := q̂/r_jj
9. EndDo

Only difference: the inner product in line 4 uses the accumulated partial result q̂ instead of the original x_j.
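A direct MATLAB transcription of the two algorithms, as a sketch (save each function in its own file, cgs.m and mgs.m; no safeguards beyond the rank check of line 8):

  function [Q, R] = cgs(X)
  % Classical Gram-Schmidt (Algorithm 1): all inner products r_ij are
  % taken against the original column x_j.
  [m, n] = size(X);  Q = zeros(m, n);  R = zeros(n, n);
  for j = 1:n
      R(1:j-1, j) = Q(:, 1:j-1)' * X(:, j);      % r_ij = (x_j, q_i)
      q = X(:, j) - Q(:, 1:j-1) * R(1:j-1, j);   % subtract all projections at once
      R(j, j) = norm(q);
      if R(j, j) == 0, error('linearly dependent columns'); end
      Q(:, j) = q / R(j, j);
  end

  function [Q, R] = mgs(X)
  % Modified Gram-Schmidt (Algorithm 2): each r_ij is taken against the
  % accumulated, partially orthogonalized vector q.
  [m, n] = size(X);  Q = zeros(m, n);  R = zeros(n, n);
  for j = 1:n
      q = X(:, j);
      for i = 1:j-1
          R(i, j) = Q(:, i)' * q;                % inner product with updated q
          q = q - R(i, j) * Q(:, i);
      end
      R(j, j) = norm(q);
      if R(j, j) == 0, error('linearly dependent columns'); end
      Q(:, j) = q / R(j, j);
  end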

The operations in lines 4 and 5 can be written as

  q̂ := ORTH(q̂, q_i),

where ORTH(x, q) denotes the operation of orthogonalizing a vector x against a unit vector q:

  z = ORTH(x, q) = x − (x, q) q.

[Figure: the vectors x and q, the projection (x, q) q, and the result z = ORTH(x, q), which is orthogonal to q.]

The Modified Gram-Schmidt algorithm is much more stable than Classical Gram-Schmidt in general. [A few examples easily show this.]

Suppose MGS is applied to A, yielding computed matrices Q̂ and R̂. Then there are constants c_i (depending on m and n) such that

  A + E_1 = Q̂ R̂,   ‖E_1‖_2 ≤ c_1 u ‖A‖_2,
  ‖Q̂^T Q̂ − I‖_2 ≤ c_2 u κ_2(A) + O((u κ_2(A))²),

for a certain perturbation matrix E_1, and there exists an orthonormal matrix Q such that

  A + E_2 = Q R̂,   ‖E_2(:, j)‖_2 ≤ c_3 u ‖A(:, j)‖_2,

for a certain perturbation matrix E_2.

An equivalent version:

ALGORITHM 3: Modified Gram-Schmidt (variant)

0. Set Q̂ := X   (i.e. q̂_j := x_j, j = 1, ..., n)
1. For i = 1, ..., n Do:
2.   Compute r_ii := ‖q̂_i‖_2
3.   If r_ii = 0 then Stop, else q_i := q̂_i/r_ii
4.   For j = i + 1, ..., n Do:
5.     r_ij := (q̂_j, q_i)
6.     q̂_j := q̂_j − r_ij q_i
7.   EndDo
8. EndDo

Does exactly the same computation as the previous algorithm, but in a different order: as soon as q_i is available, all remaining columns are orthogonalized against it.

Example: Orthonormalize the system of vectors X = [x_1, x_2, x_3]. The first two steps:

  q_1 = x_1/‖x_1‖_2;
  q̂_2 = x_2 − (x_2, q_1) q_1,   q_2 = q̂_2/‖q̂_2‖_2.
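The stability gap is easy to demonstrate with the cgs and mgs sketches above; an illustrative experiment (the matrix construction is not from the slides, it is chosen only to make κ_2(A) = 10^10):

  m = 50;  n = 20;
  [U, ~] = qr(randn(m, n), 0);              % random m x n orthonormal columns
  [V, ~] = qr(randn(n));                    % random n x n orthogonal matrix
  A = U * diag(logspace(0, -10, n)) * V';   % kappa_2(A) = 1e10
  [Qc, ~] = cgs(A);  [Qm, ~] = mgs(A);
  norm(Qc'*Qc - eye(n))    % O(1): orthogonality completely lost
  norm(Qm'*Qm - eye(n))    % roughly u * kappa_2(A), i.e. about 1e-6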

The third step:

  q̂_3 = x_3 − (x_3, q_1) q_1,
  q̂_3 := q̂_3 − (q̂_3, q_2) q_2,
  q_3 = q̂_3/‖q̂_3‖_2.

For this example: What is Q? What is R? Compute Q^T Q. The result is the identity matrix.

Recall: for any orthogonal matrix Q we have Q^T Q = I (in the complex case: Q^H Q = I). Consequence: for an n × n orthogonal matrix, Q^{-1} = Q^T (Q is orthogonal/unitary).
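These checks can be run on any small matrix with linearly independent columns; a sketch using the mgs function above on an illustrative 4 × 3 matrix (not the matrix of the original example):

  X = [1 1 4; 1 0 2; 1 1 0; 1 0 2];   % full column rank
  [Q, R] = mgs(X);
  norm(Q'*Q - eye(3))      % ~ machine precision: Q has orthonormal columns
  norm(X - Q*R)            % ~ machine precision: X = QR holds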

Another derivation: Recall: span(q) = span(a) So b Ax is minimum when b Ax span{q} Therefore solution x must satisfy Q T (b Ax) = Q T (b QRx) = Rx = Q T b x = R Q T b Also observe that for any vector w w = QQ T w + (I QQ T )w and that w = QQ T w (I QQ T )w Pythagoras theorem w = QQT w + (I QQT )w b Ax = b QRx = (I QQ T )b + Q(Q T b Rx) = (I QQ T )b + Q(Q T b Rx) = (I QQ T )b + Q T b Rx Min is reached when nd term of r.h.s. is zero. 8-5 TB: 7,8,,9; AB:.,.3.4, ;GvL 5, 5.3 QR 8-6 TB: 7,8,,9; AB:.,.3.4, ;GvL 5, 5.3 QR 8-5 8-6 Method: Compute the QR factorization of A, A = QR. Compute the right-hand side f = Q T b Solve the upper triangular system Rx = f. x is the least-squares solution As a rule it is not a good idea to form A T A and solve the normal equations. Methods using the QR factorization are better. Total cost?? (depends on the algorithm use to get the QR decomp. Using matlab find the parabola that fits the data in previous data fitting example (p. 8-) in L.S. sense [verify that the result found is correct.] 8-7 TB: 7,8,,9; AB:.,.3.4, ;GvL 5, 5.3 QR 8-7