Math 407: Linear Optimization


Lecture 16: The Linear Least Squares Problem II
Math Dept, University of Washington
February 28, 2018

Linear Least Squares

A linear least squares problem is one of the form

    (LLS)    minimize_{x in R^n}  (1/2) ||Ax - b||_2^2,

where A in R^{m x n}, b in R^m, and ||y||_2^2 := y_1^2 + y_2^2 + ... + y_m^2.

Theorem: Consider the linear least squares problem LLS.
1. A solution to the normal equations A^T A x = A^T b always exists.
2. A solution to LLS always exists.
3. LLS has a unique solution if and only if Null(A) = {0}, in which case (A^T A)^{-1} exists and the unique solution is given by x = (A^T A)^{-1} A^T b.
4. If Ran(A) = R^m, then (A A^T)^{-1} exists and x = A^T (A A^T)^{-1} b solves LLS; indeed, Ax = b.
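
As a quick numerical illustration of items 1-3 (not part of the lecture; a minimal NumPy sketch with made-up data for a full-column-rank example), the normal equations can be solved directly and checked against a library least squares routine:

    import numpy as np

    # A tall matrix with Null(A) = {0}, so the LLS solution is unique (item 3).
    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([1.0, 2.0, 2.0])

    # Solve the normal equations A^T A x = A^T b.
    x_normal = np.linalg.solve(A.T @ A, A.T @ b)

    # Cross-check against NumPy's least squares solver.
    x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(x_normal, x_lstsq))   # True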

Distance to a Subspace

Let S be a subspace of R^m and suppose b in R^m is not in S. Find the point z̄ in S such that

    ||z̄ - b||_2 <= ||z - b||_2   for all z in S,

or equivalently, solve

    (D)    min_{z in S}  (1/2) ||z - b||_2^2.

Least Squares Connection: Suppose S = Ran(A). Then z̄ in R^m solves D if and only if there is an x̄ in R^n with z̄ = A x̄ such that x̄ solves LLS.

Orthogonal Projections and Orthonormal Bases

Let Q in R^{n x k} be a matrix whose columns form an orthonormal basis for the subspace S, so that k = dim S. Set P = Q Q^T and note that Q^T Q = I_k, the k x k identity matrix. Then

    P^2 = Q Q^T Q Q^T = Q I_k Q^T = Q Q^T = P    and    P^T = (Q Q^T)^T = Q Q^T = P,

so that P = P_S, the orthogonal projection onto S!

Lemma:
(1) The projection P in R^{n x n} is orthogonal if and only if P = P^T.
(2) If the columns of the matrix Q in R^{n x k} form an orthonormal basis for the subspace S of R^n, then P := Q Q^T is the orthogonal projection onto S.
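
A small sketch (my own check, not from the slides) that P = Q Q^T is idempotent and symmetric whenever Q has orthonormal columns; the particular Q below is an arbitrary example produced by a QR factorization:

    import numpy as np

    # Orthonormal basis for a 2-dimensional subspace S of R^4.
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))

    P = Q @ Q.T                       # candidate orthogonal projection onto S = Ran(Q)
    print(np.allclose(P @ P, P))      # P^2 = P  (idempotent)
    print(np.allclose(P.T, P))        # P^T = P  (symmetric)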

Orthogonal Projections and Distance to a Subspace

Theorem: Let S be a subspace of R^m and let b in R^m \ S. Then the unique solution to the least distance problem

    (D)    minimize  ||z - b||_2   subject to   z in S

is z̄ := P_S b, where P_S is the orthogonal projector onto S.

Proposition: Let A in R^{m x n} with m >= n and Null(A) = {0}. Then the orthogonal projector onto Ran(A) is given by

    P_Ran(A) = A (A^T A)^{-1} A^T.
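
The proposition can be checked numerically. Below is a hedged sketch (the example data is my own) comparing the projected point P_Ran(A) b with A x̄, where x̄ solves LLS, exactly as the least squares connection predicts:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3))          # full column rank (almost surely), so Null(A) = {0}
    b = rng.standard_normal(5)

    P = A @ np.linalg.solve(A.T @ A, A.T)    # P_Ran(A) = A (A^T A)^{-1} A^T
    z_bar = P @ b                            # closest point to b in Ran(A)

    x_bar, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(z_bar, A @ x_bar))     # True: z_bar = A x_bar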

Minimal Norm Solutions to Ax = b

Let A in R^{m x n} and suppose that m < n. Then A is short and fat, so A most likely has rank m, or equivalently, Ran(A) = R^m. For the purposes of this discussion we assume that rank(A) = m.

Since m < n, the set of solutions to Ax = b is infinite, since the nullity of A is n - m. Indeed, if x_0 is any particular solution to Ax = b, then the set of solutions is given by

    x_0 + Null(A) := { x_0 + z : z in Null(A) }.

In this setting, one might prefer the solution having least norm:

    min_{z in Null(A)}  (1/2) ||z + x_0||_2^2.

This problem is of the form

    (D)    min { ||z - b||_2 : z in S }

with b = -x_0 and S = Null(A), whose solution is z̄ = P_S b when S is a subspace. Consequently, the solution is given by z̄ = P_S(-x_0), where P_S is now the orthogonal projection onto S := Null(A).

A Formula for P_Null(A)

Observe that

    P_Null(A) = P_{Ran(A^T)^⊥} = I - P_Ran(A^T).

We have already shown that if M in R^{p x q} satisfies Null(M) = {0}, then P_Ran(M) = M (M^T M)^{-1} M^T. If we take M = A^T, then our assumption that Ran(A) = R^m gives

    Null(M) = Null(A^T) = Ran(A)^⊥ = (R^m)^⊥ = {0}.

Hence

    P_Ran(A^T) = A^T (A A^T)^{-1} A

and

    P_Null(A) = P_{Ran(A^T)^⊥} = I - P_Ran(A^T) = I - A^T (A A^T)^{-1} A.
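
A short sketch (not from the lecture) verifying the formula on a random short, fat A: every column of I - A^T (A A^T)^{-1} A should be annihilated by A, and the matrix should be idempotent:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 6))              # short and fat; rank 3 almost surely

    P_null = np.eye(6) - A.T @ np.linalg.solve(A @ A.T, A)   # I - A^T (A A^T)^{-1} A
    print(np.allclose(A @ P_null, 0))            # Ran(P_null) is contained in Null(A)
    print(np.allclose(P_null @ P_null, P_null))  # P_null is a projection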

Solution to (D):  min { ||z - b||_2 : z in S }

We can take x_0 := A^T (A A^T)^{-1} b as our particular solution to Ax = b, since

    A x_0 = A A^T (A A^T)^{-1} b = b.

Hence, our least norm solution is

    x̄ = x_0 + P_Null(A)(-x_0)
       = x_0 + (I - A^T (A A^T)^{-1} A)(-x_0)
       = A^T (A A^T)^{-1} A x_0
       = A^T (A A^T)^{-1} A A^T (A A^T)^{-1} b
       = A^T (A A^T)^{-1} b
       = x_0.

That is, x_0 = A^T (A A^T)^{-1} b is the least norm solution to Ax = b.

Solution to (D):  min { ||z - b||_2 : z in S }

Theorem: Let A in R^{m x n} be such that m <= n and Ran(A) = R^m.
(1) The matrix A A^T is invertible.
(2) The orthogonal projection onto Null(A) is given by P_Null(A) = I - A^T (A A^T)^{-1} A.
(3) For every b in R^m, the system Ax = b is consistent, and the least norm solution to this system is uniquely given by x̄ = A^T (A A^T)^{-1} b.
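
As an illustration (a sketch with made-up data; the comparison with the Moore-Penrose pseudoinverse is an added cross-check, not part of the slides), the least norm formula can be tested directly:

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((3, 6))              # Ran(A) = R^3 almost surely
    b = rng.standard_normal(3)

    x_bar = A.T @ np.linalg.solve(A @ A.T, b)    # x_bar = A^T (A A^T)^{-1} b
    print(np.allclose(A @ x_bar, b))             # x_bar solves Ax = b
    print(np.allclose(x_bar, np.linalg.pinv(A) @ b))  # agrees with the minimum-norm (pseudoinverse) solution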

Gram-Schmidt Orthogonalization and the QR-Factorization

We now define the Gram-Schmidt orthogonalization process for a set of linearly independent vectors a_1, ..., a_n in R^m. (This implies that n <= m. Why?) The orthonormal vectors q_1, ..., q_n are defined inductively, as follows:

    p_1 = a_1,    q_1 = p_1 / ||p_1||,

    p_j = a_j - sum_{i=1}^{j-1} <a_j, q_i> q_i    and    q_j = p_j / ||p_j||    for 2 <= j <= n.

For 1 <= j <= n, q_j in Span{a_1, ..., a_j}, so p_j != 0 by the linear independence of a_1, ..., a_j. An elementary induction argument shows that the q_j's form an orthonormal basis for Span{a_1, ..., a_n}.
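
The process translates almost line for line into code. The following is a minimal classical Gram-Schmidt sketch (my own illustration, not the lecture's implementation, and not the numerically preferred variant):

    import numpy as np

    def gram_schmidt(vectors):
        """Classical Gram-Schmidt on linearly independent vectors a_1, ..., a_n."""
        qs = []
        for a in vectors:
            p = a - sum((a @ q) * q for q in qs)   # p_j = a_j - sum_i <a_j, q_i> q_i
            qs.append(p / np.linalg.norm(p))       # q_j = p_j / ||p_j||; p_j != 0 by independence
        return qs

    a1, a2, a3 = np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])
    Q = np.column_stack(gram_schmidt([a1, a2, a3]))
    print(np.allclose(Q.T @ Q, np.eye(3)))         # the q_j's are orthonormal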

Gram-Schmidt Orthogonalization and the QR-Factorization

If we now define r_jj = ||p_j|| > 0 and r_ij = <a_j, q_i> for 1 <= i < j <= n, then

    a_1 = r_11 q_1,
    a_2 = r_12 q_1 + r_22 q_2,
    a_3 = r_13 q_1 + r_23 q_2 + r_33 q_3,
    ...
    a_n = sum_{i=1}^{n} r_in q_i.

Gram-Schmidt Orthogonalization and the QR-Factorization

Again suppose a_1, ..., a_n in R^m are linearly independent, and set

    A = [a_1 a_2 ... a_n] in R^{m x n},    R = [r_ij] in R^{n x n},    and    Q = [q_1 q_2 ... q_n] in R^{m x n},

where r_ij = 0 for i > j. Then A = QR, where Q has orthonormal columns and R is an upper triangular n x n matrix. In addition, R is invertible since the diagonal entries r_jj are non-zero. This is called the QR factorization of the matrix A.
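
In practice one rarely forms Q and R by hand; a library routine gives the same factorization up to signs. A small sketch (assuming NumPy; the sign caveat is mine, since numpy.linalg.qr does not force positive diagonal entries in R):

    import numpy as np

    A = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])       # linearly independent columns

    Q, R = np.linalg.qr(A)                # reduced QR: Q has orthonormal columns, R is upper triangular
    print(np.allclose(Q @ R, A))          # A = QR
    print(np.allclose(np.triu(R), R))     # R is upper triangular
    print(np.allclose(Q.T @ Q, np.eye(3)))  # columns of Q are orthonormal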

Full QR-Factorization

Suppose A in R^{m x n} with m >= n. Then there exist a permutation matrix P in R^{n x n}, a unitary matrix Q in R^{m x m}, and an upper triangular matrix R in R^{m x n} such that AP = QR. Let Q_1 in R^{m x n} denote the first n columns of Q, Q_2 the remaining (m - n) columns of Q, and R_1 in R^{n x n} the first n rows of R; then

    AP = QR = [Q_1 Q_2] [ R_1 ]
                        [  0  ]  =  Q_1 R_1.        (1)

Full QR-Factorization

Moreover, we have the following:

(a) We may choose R to have nonnegative diagonal entries.

(b) If A is of full rank, then we can choose R with positive diagonal entries, in which case we obtain the condensed factorization A = Q_1 R_1, where R_1 in R^{n x n} is invertible and the columns of Q_1 form an orthonormal basis for Ran(A).

(c) If rank(A) = k < n, then

        R_1 = [ R_11  R_12 ]
              [   0     0  ],

    where R_11 is a k x k invertible upper triangular matrix and R_12 in R^{k x (n-k)}. In particular, this implies that AP = Q_11 [R_11 R_12], where Q_11 consists of the first k columns of Q. In this case, the columns of Q_11 form an orthonormal basis for the range of A and R_11 is invertible.
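
Column-pivoted (rank-revealing) QR factorizations of this kind are available in standard libraries. A hedged sketch using scipy.linalg.qr with pivoting=True (an assumption about the available tooling; the rank-2 test matrix is made up):

    import numpy as np
    from scipy.linalg import qr

    rng = np.random.default_rng(4)
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # 6 x 4 matrix of rank 2

    Q, R, piv = qr(A, mode='economic', pivoting=True)   # column-pivoted QR: A[:, piv] = Q R
    print(np.allclose(A[:, piv], Q @ R))                 # AP = QR
    print(np.abs(np.diag(R)).round(12))                  # trailing diagonal entries are (near) zero since rank(A) = 2 < 4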

Condensed QR-Factorization

Let A in R^{m x n} have rank k <= min{m, n}. Then there exist

    Q in R^{m x k}    with orthonormal columns,
    R in R^{k x n}    full rank and upper triangular, and
    P in R^{n x n}    a permutation matrix,

such that AP = QR. In particular, the columns of the matrix Q form a basis for the range of A. Moreover, the matrix R can be written in the form

    R = [R_1 R_2],

where R_1 in R^{k x k} is nonsingular.

Orthogonal Projections onto the Four Fundamental Subspaces

Let A in R^{m x n} have rank k <= min{m, n}. Let A and A^T have generalized QR factorizations

    AP = Q [R_1 R_2]    and    A^T P̃ = Q̃ [R̃_1 R̃_2].

Since row rank equals column rank,

    P in R^{n x n} is a permutation matrix,    P̃ in R^{m x m} is a permutation matrix,
    Q in R^{m x k} and Q̃ in R^{n x k} have orthonormal columns,
    R_1, R̃_1 in R^{k x k} are both upper triangular nonsingular matrices,
    R_2 in R^{k x (n-k)}, and R̃_2 in R^{k x (m-k)}.

Moreover,

    Q Q^T is the orthogonal projection onto Ran(A),
    I - Q Q^T is the orthogonal projection onto Null(A^T),
    Q̃ Q̃^T is the orthogonal projection onto Ran(A^T), and
    I - Q̃ Q̃^T is the orthogonal projection onto Null(A).
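
A sketch of how these projections might be assembled from two pivoted QR factorizations (my own illustration, not the lecture's; the rank is estimated numerically, which is a simplification):

    import numpy as np
    from scipy.linalg import qr

    rng = np.random.default_rng(5)
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank k = 2
    k = np.linalg.matrix_rank(A)

    Q, _, _ = qr(A, mode='economic', pivoting=True)      # AP = Q [R_1 R_2]
    Qt, _, _ = qr(A.T, mode='economic', pivoting=True)   # A^T P~ = Q~ [R~_1 R~_2]
    Q, Qt = Q[:, :k], Qt[:, :k]                          # orthonormal bases for Ran(A) and Ran(A^T)

    P_ran = Q @ Q.T                                      # projection onto Ran(A)
    P_null = np.eye(A.shape[1]) - Qt @ Qt.T              # projection onto Null(A)
    print(np.allclose(P_ran @ A, A))                     # P_ran fixes the columns of A
    print(np.allclose(A @ P_null, 0))                    # P_null maps into Null(A)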

Solving the Normal Equations with the QR Factorization

Let A in R^{m x n} and b in R^m with k := rank(A) <= min{m, n}. The normal equations

    A^T A x = A^T b

yield solutions to the LLS problem min (1/2) ||Ax - b||_2^2. We show how the QR factorization of A is used to solve these equations.

Suppose A has condensed QR factorization AP = Q [R_1 R_2], where P in R^{n x n} is a permutation matrix, Q in R^{m x k} has orthonormal columns, R_1 in R^{k x k} is nonsingular and upper triangular, and R_2 in R^{k x (n-k)}, with k = rank(A) <= min{n, m}.

Solving the Normal Equations with the QR Factorization

Since A = Q [R_1 R_2] P^T, the normal equations A^T A x = A^T b become

    P [R_1 R_2]^T Q^T Q [R_1 R_2] P^T x  =  A^T A x  =  A^T b  =  P [R_1 R_2]^T Q^T b,

or equivalently, since Q^T Q = I_k,

    P [R_1 R_2]^T [R_1 R_2] P^T x  =  P [R_1 R_2]^T Q^T b.

Solving the Normal Equations with the QR Factorization

Multiply

    P [R_1 R_2]^T [R_1 R_2] P^T x  =  P [R_1 R_2]^T Q^T b

on the left by P^T to obtain

    [R_1 R_2]^T [R_1 R_2] P^T x  =  [R_1 R_2]^T Q^T b.

Defining w := [R_1 R_2] P^T x and setting b̂ := Q^T b gives

    [R_1 R_2]^T w  =  [R_1 R_2]^T b̂.

Take w = b̂ and define x̂ := P^T x. Then

    [R_1 R_2] x̂ = b̂.

Solving the Normal Equations with the QR Factorization

We now need to solve the following equation for x̂:

    [R_1 R_2] x̂ = b̂.

To do this, write

    x̂ = (x̂_1, x̂_2)    with x̂_1 in R^k.

If we assume that x̂_2 = 0, then

    [R_1 R_2] x̂ = R_1 x̂_1,

so we need R_1 x̂_1 = b̂, which we can solve by taking x̂_1 = R_1^{-1} b̂, since R_1 is an invertible upper triangular k x k matrix by construction.

Solving the Normal Equations with the QR Factorization

Then

    x = P x̂ = P (R_1^{-1} b̂, 0).

Does this x solve the normal equations? Yes:

    A^T A x = A^T A P (R_1^{-1} b̂, 0)
            = A^T Q [R_1 R_2] P^T P (R_1^{-1} b̂, 0)      (since A = Q [R_1 R_2] P^T)
            = A^T Q R_1 R_1^{-1} b̂                        (since P^T P = I)
            = A^T Q b̂
            = A^T Q Q^T b
            = P [R_1 R_2]^T Q^T Q Q^T b                    (since A^T = P [R_1 R_2]^T Q^T)
            = P [R_1 R_2]^T Q^T b                          (since Q^T Q = I)
            = A^T b.

Solving the Normal Equations with the QR Factorization

This derivation yields the following recipe for solving the normal equations:

    AP = Q [R_1 R_2]       the general condensed QR factorization    O(m^2 n)
    b̂ = Q^T b              a matrix-vector product                   O(km)
    w_1 = R_1^{-1} b̂       a back solve                              O(k^2)
    x = P (w_1, 0)         a matrix-vector product                   O(kn)
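
Putting the recipe together, here is a hedged end-to-end sketch (my own code, using SciPy's pivoted QR as the condensed factorization and a numerical rank estimate; in the full-rank case it reproduces the usual least squares solution):

    import numpy as np
    from scipy.linalg import qr, solve_triangular

    def normal_equations_via_qr(A, b):
        """Solve A^T A x = A^T b by the recipe above (a sketch, not production code)."""
        Q, R, piv = qr(A, mode='economic', pivoting=True)   # AP = Q [R_1 R_2]
        k = np.linalg.matrix_rank(A)
        Q1, R1 = Q[:, :k], R[:k, :k]                        # condensed factors
        b_hat = Q1.T @ b                                    # b_hat = Q^T b
        w1 = solve_triangular(R1, b_hat)                    # back solve: w_1 = R_1^{-1} b_hat
        x_hat = np.zeros(A.shape[1]); x_hat[:k] = w1        # x_hat = (w_1, 0)
        x = np.empty_like(x_hat); x[piv] = x_hat            # x = P x_hat (undo the column permutation)
        return x

    rng = np.random.default_rng(6)
    A, b = rng.standard_normal((8, 5)), rng.standard_normal(8)
    x = normal_equations_via_qr(A, b)
    print(np.allclose(A.T @ A @ x, A.T @ b))                # x satisfies the normal equations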