Dot product and linear least squares problems


Dot Product

For vectors $u, v \in \mathbb{R}^n$ we define the dot product
$$u \cdot v = u_1 v_1 + \cdots + u_n v_n.$$
Note that we can also write this as the product of a row vector and a column vector:
$$u \cdot v = [u_1, \ldots, u_n] \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + \cdots + u_n v_n.$$
The dot product $u \cdot u = u_1^2 + \cdots + u_n^2$ gives the square of the Euclidean length of the vector, aka the norm of the vector:
$$\|u\| = (u \cdot u)^{1/2} = \left(u_1^2 + \cdots + u_n^2\right)^{1/2}.$$

Theorem (Cauchy-Schwarz inequality): For $a, b \in \mathbb{R}^n$ we have
$$|a \cdot b| \le \|a\|\,\|b\|. \qquad (1)$$

Proof: Step 1, for vectors with $\|a\| = 1$ and $\|b\| = 1$: Then
$$0 \le (b - a) \cdot (b - a) = b \cdot b - 2\, a \cdot b + a \cdot a,$$
so $2\, a \cdot b \le a \cdot a + b \cdot b = 1 + 1$, hence $a \cdot b \le 1$. Using $(b + a)$ instead of $(b - a)$ gives $a \cdot b \ge -1$, i.e., $|a \cdot b| \le 1$.

Step 2, for general vectors $a, b$: If $a = 0$ or $b = 0$ we see that (1) obviously holds. If both vectors are different from $0$ let $u := a/\|a\|$ and $v := b/\|b\|$. Since $\|u\| = 1$ and $\|v\| = 1$ we get from step 1 that $|u \cdot v| \le 1$, i.e.,
$$\frac{|a \cdot b|}{\|a\|\,\|b\|} \le 1.$$

Consider a triangle with the three points $0, a, b$. Then the vector from $a$ to $b$ is given by $c = b - a$, and the lengths of the three sides of the triangle are $\|a\|$, $\|b\|$, $\|c\|$. Let $\theta$ denote the angle between the vectors $a$ and $b$. Then the law of cosines tells us that
$$\|c\|^2 = \|a\|^2 + \|b\|^2 - 2\,\|a\|\,\|b\|\cos\theta.$$
Multiplying out $\|c\|^2 = (b - a) \cdot (b - a)$ gives
$$\|c\|^2 = \|a\|^2 + \|b\|^2 - 2\, a \cdot b.$$
By comparing the last two equations we obtain
$$a \cdot b = \|a\|\,\|b\|\cos\theta.$$
If $a$ and $b$ are different from $0$ we have
$$\cos\theta = \frac{a \cdot b}{\|a\|\,\|b\|}.$$
This tells us how to compute the angle $\theta$ between two vectors:
$$q := \frac{a \cdot b}{\|a\|\,\|b\|}, \qquad \theta := \cos^{-1} q.$$
Because of (1) we have $-1 \le q \le 1$, hence the inverse cosine function gives $0 \le \theta \le \pi$:
- $q = 1$: $\theta = 0$, i.e., $a, b$ point in the same direction
- $q = 0$: $\theta = \pi/2$, i.e., $a, b$ are orthogonal
- $q = -1$: $\theta = \pi$, i.e., $a, b$ point in opposite directions
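As a small illustration of the last formula, here is a Matlab sketch that computes $q$ and $\theta$ for two sample vectors; the particular vectors a and b below are an arbitrary choice, not taken from the notes:

a = [3; 4];                         % sample vectors (arbitrary choice)
b = [-4; 3];
q = dot(a,b) / (norm(a)*norm(b));   % q = (a.b)/(||a|| ||b||), lies in [-1,1] by (1)
theta = acos(q);                    % angle in radians, 0 <= theta <= pi

For this particular choice $a \cdot b = -12 + 12 = 0$, so $q = 0$ and $\theta = \pi/2$: the two vectors are orthogonal.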

Example: Find the angle between two given vectors $a, b \in \mathbb{R}^2$: We compute
$$q := \frac{a \cdot b}{\|a\|\,\|b\|}, \qquad \theta := \cos^{-1} q.$$

[Figure: the vectors $a$ and $b$ plotted in the $(x_1, x_2)$ plane]

We say vectors $a, b \in \mathbb{R}^n$ are orthogonal (written $a \perp b$) if
$$a \cdot b = 0.$$
We say a vector $a \in \mathbb{R}^n$ is orthogonal to a subspace $V$ of $\mathbb{R}^n$ (written $a \perp V$) if
$$a \cdot v = 0 \quad \text{for all } v \in V.$$

Orthogonal projection onto a line

Consider a 1-dimensional subspace $V = \operatorname{span}\{v\}$ of $\mathbb{R}^n$ given by a vector $v \in \mathbb{R}^n$. This is a line through the origin. For a given point $b \in \mathbb{R}^n$ we want to find the point $u \in V$ which is closest to the point $b$:

Find $u \in V$ such that $\|u - b\|$ is minimal.

The point $u$ must have the following properties:
- $u \in V$, i.e., $u = c\,v$ with some unknown $c \in \mathbb{R}$
- $u - b \perp V$, i.e., $v \cdot (u - b) = 0$

By plugging $u = c\,v$ into the second property we get by multiplying out
$$v \cdot (c\,v - b) = c\,(v \cdot v) - (v \cdot b) = 0 \qquad\Longrightarrow\qquad c = \frac{v \cdot b}{v \cdot v}.$$
Therefore the point $u$ on the line given by $v$ which is closest to the point $b$ is given by
$$u = \frac{v \cdot b}{v \cdot v}\, v.$$
We say that $u$ is the orthogonal projection of the point $b$ onto the line given by $v$ and use the notation
$$\operatorname{pr}_v b = \frac{v \cdot b}{v \cdot v}\, v.$$
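The projection formula translates directly into Matlab; the following sketch uses arbitrary sample data for v and b to illustrate it:

v = [1; 2; 2];               % direction vector spanning the line V = span{v}
b = [3; 1; 5];               % point to be projected (sample data)
c = dot(v,b) / dot(v,v);     % c = (v.b)/(v.v)
u = c*v;                     % u = pr_v b, the point on the line closest to b
r = u - b;                   % difference vector
check = dot(v,r)             % should be 0 (up to roundoff): u - b is orthogonal to v

Here dot(v,b) = 15 and dot(v,v) = 9, so c = 5/3 and the check value is zero up to roundoff.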

Orthogonal projection onto a 2-dimensional subspace V

Consider a 2-dimensional subspace $V = \operatorname{span}\{v^{(1)}, v^{(2)}\}$ of $\mathbb{R}^n$ given by two linearly independent vectors $v^{(1)}, v^{(2)} \in \mathbb{R}^n$. This is a plane through the origin. For a given point $b \in \mathbb{R}^n$ we want to find the point $u \in V$ which is closest to the point $b$:

Find $u \in V$ such that $\|u - b\|$ is minimal.

The point $u$ must have the following properties:
- $u \in V$, i.e., $u = c_1 v^{(1)} + c_2 v^{(2)}$ with some unknowns $c_1, c_2 \in \mathbb{R}$
- $u - b \perp V$, i.e., $v^{(1)} \cdot (u - b) = 0$ and $v^{(2)} \cdot (u - b) = 0$

By plugging $u = c_1 v^{(1)} + c_2 v^{(2)}$ into the second property we obtain a linear system of two equations for two unknowns which we can then solve.

We can use the $n \times 2$ matrix $A = [v^{(1)}, v^{(2)}]$ to express the two properties:
- $u \in V$, i.e., $u = Ac$ with an unknown vector $c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \in \mathbb{R}^2$
- $u - b \perp V$, i.e., $v^{(1)} \cdot (u - b) = 0$ and $v^{(2)} \cdot (u - b) = 0$, i.e., $A^\top (u - b) = 0$

By plugging $u = Ac$ into the second property we obtain
$$A^\top (Ac - b) = 0 \qquad\text{or}\qquad A^\top\! A\, c = A^\top b.$$
These are the so-called normal equations (since they express that $u - b$ is orthogonal or "normal" to the subspace $V$).

This is how to find the point $u \in V$ which is closest to $b$:
- find the matrix $M := A^\top\! A \in \mathbb{R}^{2 \times 2}$ and the vector $g := A^\top b \in \mathbb{R}^2$
- solve the linear system $Mc = g$ for $c \in \mathbb{R}^2$
- let $u := Ac$

Example: Consider a plane $V = \operatorname{span}\{v^{(1)}, v^{(2)}\} \subset \mathbb{R}^3$ and find the point $u \in V$ which is closest to a given point $b \in \mathbb{R}^3$: With $A = [v^{(1)}, v^{(2)}]$ we compute $M = A^\top\! A$ and $g = A^\top b$, solve the $2 \times 2$ linear system $Mc = g$, and obtain the closest point as $u = Ac$. We can check that this is correct by finding the difference vector $r = Ac - b$ and checking that $v^{(1)} \cdot r = 0$ and $v^{(2)} \cdot r = 0$.
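The same computation in Matlab, for a sample plane in $\mathbb{R}^3$; the vectors v1, v2 and the point b below are illustrative data, not the ones from the original example:

v1 = [1; 1; 0]; v2 = [0; 1; 1];   % basis of the plane V (sample data)
b  = [1; 2; 3];                   % point to project onto V
A  = [v1, v2];                    % 3 x 2 matrix with the basis vectors as columns
M  = A'*A;                        % matrix of the normal equations
g  = A'*b;                        % right-hand side of the normal equations
c  = M\g;                         % solve M c = g
u  = A*c;                         % closest point in V
check = A'*(u - b)                % residual must be orthogonal to V: A'*(u-b) = 0

For this data $M = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, $g = \begin{bmatrix} 3 \\ 5 \end{bmatrix}$, $c = \begin{bmatrix} 1/3 \\ 7/3 \end{bmatrix}$, and the check vector is zero up to roundoff.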

Orthogonal projection onto a k-dimensional subspace V, aka least squares problem

Consider a k-dimensional subspace $V = \operatorname{span}\{v^{(1)}, \ldots, v^{(k)}\}$ of $\mathbb{R}^n$ given by $k$ linearly independent vectors $v^{(1)}, \ldots, v^{(k)} \in \mathbb{R}^n$. I.e., the vectors $v^{(1)}, \ldots, v^{(k)}$ form a basis for the subspace $V$. For a given point $b \in \mathbb{R}^n$ we want to find the point $u \in V$ which is closest to the point $b$:

Find $u \in V$ such that $\|u - b\|$ is minimal. $\qquad (2)$

Let us define the $n \times k$ matrix $A = [v^{(1)}, \ldots, v^{(k)}]$. The point $u$ must have the following properties:
- $u \in V$, i.e., $u = c_1 v^{(1)} + \cdots + c_k v^{(k)} = Ac$ with some unknown vector $c \in \mathbb{R}^k$
- $u - b \perp V$, i.e., $v^{(1)} \cdot (u - b) = 0, \ldots, v^{(k)} \cdot (u - b) = 0$, i.e., $A^\top (u - b) = 0$

By plugging $u = Ac$ into the second property we obtain
$$A^\top (Ac - b) = 0 \qquad (3)$$
$$A^\top\! A\, c = A^\top b \qquad (4)$$
These are the so-called normal equations (since they express that $u - b$ is orthogonal or "normal" to the subspace $V$). This is how to find the point $u \in V$ which is closest to $b$:

- find the matrix $M := A^\top\! A \in \mathbb{R}^{k \times k}$ and the vector $g := A^\top b \in \mathbb{R}^k$
- solve the $k \times k$ linear system $Mc = g$ for $c \in \mathbb{R}^k$
- let $u := Ac$

Note that the normal equations (4) always have a unique solution:

Theorem: Assume the matrix $A \in \mathbb{R}^{n \times k}$ with $k \le n$ has linearly independent columns (i.e., $\operatorname{rank} A = k$). Then the matrix $M = A^\top\! A$ is nonsingular.

Proof: We have to show that $Mc = 0$ implies $c = 0$. Assume we have $c \in \mathbb{R}^k$ such that $Mc = 0$. Then we can multiply from the left with $c^\top$ and get, with $y := Ac$,
$$0 = c^\top M c = c^\top A^\top\! A\, c = y^\top y = \|y\|^2$$
as $y^\top = c^\top A^\top$. Since $\|y\| = 0$ we have $y = Ac = 0$. This means that we have a linear combination of the columns of $A$ which gives the zero vector. Since by assumption the columns of $A$ are linearly independent we must have $c = 0$.

So solving the normal equations gives us a unique $c \in \mathbb{R}^k$. We then get by $u = Ac$ a point on the subspace $V$. We now want to formally prove that this point $u \in V$ is really the unique answer to our minimization problem (2).

Theorem: Assume the matrix $A \in \mathbb{R}^{n \times k}$ with $k \le n$ has linearly independent columns (i.e., $\operatorname{rank} A = k$). Then for any given $b \in \mathbb{R}^n$ the minimization problem

find $c \in \mathbb{R}^k$ such that $\|Ac - b\|$ is minimal $\qquad (5)$

has a unique solution, which is obtained by solving the normal equations $A^\top\! A\, c = A^\top b$.

Proof: Let $c \in \mathbb{R}^k$ be the unique solution of the normal equations. Consider now $\tilde{c} = c + d$ where $d \in \mathbb{R}^k$ is nonzero. We then have
$$\|A\tilde{c} - b\|^2 = \|(Ac - b) + Ad\|^2 = (Ac - b) \cdot (Ac - b) + 2\,(Ad) \cdot (Ac - b) + (Ad) \cdot (Ad).$$
We have for the middle term
$$(Ad) \cdot (Ac - b) = d^\top A^\top (Ac - b) = 0$$
by the normal equations (3). Hence
$$\|A\tilde{c} - b\|^2 = \|Ac - b\|^2 + \|Ad\|^2.$$
Since $d \ne 0$ we have $Ad \ne 0$ since the columns of $A$ are linearly independent, and hence $\|Ad\|^2 > 0$. This means that for any vector $\tilde{c}$ different from $c$ we get $\|A\tilde{c} - b\| > \|Ac - b\|$, i.e., the vector $c$ from the normal equations is the unique solution of (5).

Least squares problem with orthogonal basis

For a least squares problem we are given $n$ linearly independent vectors $a^{(1)}, \ldots, a^{(n)} \in \mathbb{R}^m$ which form a basis for the subspace $V = \operatorname{span}\{a^{(1)}, \ldots, a^{(n)}\}$. For a given right-hand side vector $b \in \mathbb{R}^m$ we want to find $u \in V$ such that $\|u - b\|$ is minimal. We can write $u = c_1 a^{(1)} + \cdots + c_n a^{(n)} = Ac$ with the matrix $A = [a^{(1)}, \ldots, a^{(n)}] \in \mathbb{R}^{m \times n}$. Hence we want to find $c \in \mathbb{R}^n$ such that $\|Ac - b\|$ is minimal.

Solving this problem is much simpler if we have an orthogonal basis for the subspace $V$: Assume we have vectors $p^{(1)}, \ldots, p^{(n)}$ such that
- $\operatorname{span}\{p^{(1)}, \ldots, p^{(n)}\} = V$
- the vectors are orthogonal to each other: $p^{(i)} \cdot p^{(j)} = 0$ for $i \ne j$

We can then write $u = d_1 p^{(1)} + \cdots + d_n p^{(n)} = Pd$ with the matrix $P = [p^{(1)}, \ldots, p^{(n)}] \in \mathbb{R}^{m \times n}$. Hence we want to find $d \in \mathbb{R}^n$ such that $\|Pd - b\|$ is minimal. The normal equations for this problem give
$$(P^\top P)\, d = P^\top b \qquad (6)$$
where the matrix
$$P^\top P = \begin{bmatrix} p^{(1)\top} \\ \vdots \\ p^{(n)\top} \end{bmatrix} [\,p^{(1)}, \ldots, p^{(n)}\,] = \begin{bmatrix} p^{(1)} \cdot p^{(1)} & \cdots & p^{(1)} \cdot p^{(n)} \\ \vdots & & \vdots \\ p^{(n)} \cdot p^{(1)} & \cdots & p^{(n)} \cdot p^{(n)} \end{bmatrix} = \begin{bmatrix} p^{(1)} \cdot p^{(1)} & & 0 \\ & \ddots & \\ 0 & & p^{(n)} \cdot p^{(n)} \end{bmatrix}$$
is now diagonal since $p^{(i)} \cdot p^{(j)} = 0$ for $i \ne j$. Therefore the normal equations (6) are actually decoupled
$$(p^{(1)} \cdot p^{(1)})\, d_1 = p^{(1)} \cdot b, \qquad \ldots, \qquad (p^{(n)} \cdot p^{(n)})\, d_n = p^{(n)} \cdot b$$
and have the solution
$$d_i = \frac{p^{(i)} \cdot b}{p^{(i)} \cdot p^{(i)}} \qquad \text{for } i = 1, \ldots, n.$$

Gram-Schmidt orthogonalization

We still need a method to construct from a given basis $a^{(1)}, \ldots, a^{(n)}$ an orthogonal basis $p^{(1)}, \ldots, p^{(n)}$. Given $n$ linearly independent vectors $a^{(1)}, \ldots, a^{(n)} \in \mathbb{R}^m$ we want to find vectors $p^{(1)}, \ldots, p^{(n)}$ such that
- $\operatorname{span}\{p^{(1)}, \ldots, p^{(n)}\} = \operatorname{span}\{a^{(1)}, \ldots, a^{(n)}\}$
- the vectors are orthogonal to each other: $p^{(i)} \cdot p^{(j)} = 0$ for $i \ne j$

Step 1: $p^{(1)} := a^{(1)}$

Step 2: $p^{(2)} := a^{(2)} - s_{12}\, p^{(1)}$ where we choose $s_{12}$ such that $p^{(1)} \cdot p^{(2)} = 0$:
$$p^{(1)} \cdot a^{(2)} - s_{12}\, p^{(1)} \cdot p^{(1)} = 0, \qquad\text{hence}\qquad s_{12} = \frac{p^{(1)} \cdot a^{(2)}}{p^{(1)} \cdot p^{(1)}}.$$

Step 3: $p^{(3)} := a^{(3)} - s_{13}\, p^{(1)} - s_{23}\, p^{(2)}$ where we choose $s_{13}, s_{23}$ such that $p^{(1)} \cdot p^{(3)} = 0$ and $p^{(2)} \cdot p^{(3)} = 0$, i.e.,
$$p^{(1)} \cdot a^{(3)} - s_{13}\, p^{(1)} \cdot p^{(1)} - s_{23}\, \underbrace{p^{(1)} \cdot p^{(2)}}_{0} = 0, \qquad\text{hence}\qquad s_{13} = \frac{p^{(1)} \cdot a^{(3)}}{p^{(1)} \cdot p^{(1)}},$$
$$p^{(2)} \cdot a^{(3)} - s_{13}\, \underbrace{p^{(2)} \cdot p^{(1)}}_{0} - s_{23}\, p^{(2)} \cdot p^{(2)} = 0, \qquad\text{hence}\qquad s_{23} = \frac{p^{(2)} \cdot a^{(3)}}{p^{(2)} \cdot p^{(2)}}.$$

Step n: $p^{(n)} := a^{(n)} - s_{1n}\, p^{(1)} - \cdots - s_{n-1,n}\, p^{(n-1)}$ where we choose $s_{1n}, \ldots, s_{n-1,n}$ such that $p^{(j)} \cdot p^{(n)} = 0$ for $j = 1, \ldots, n-1$, which yields
$$s_{jn} = \frac{p^{(j)} \cdot a^{(n)}}{p^{(j)} \cdot p^{(j)}} \qquad \text{for } j = 1, \ldots, n-1.$$
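A Matlab sketch of the classical Gram-Schmidt process just described; the function name gram_schmidt is not from the notes, just a label for this sketch (save it as gram_schmidt.m). It also records the coefficients $s_{jk}$, so it returns the factorization $A = PS$ used below:

function [P, S] = gram_schmidt(A)
% Classical Gram-Schmidt: the columns of P form an orthogonal basis of range(A),
% S is upper triangular with ones on the diagonal, and A = P*S.
[m, n] = size(A);
P = zeros(m, n);
S = eye(n);
for k = 1:n
    p = A(:, k);                                                  % start with a^(k)
    for j = 1:k-1
        S(j, k) = dot(P(:, j), A(:, k)) / dot(P(:, j), P(:, j));  % s_jk
        p = p - S(j, k) * P(:, j);                                % remove component along p^(j)
    end
    P(:, k) = p;                                                  % p^(k) is orthogonal to p^(1),...,p^(k-1)
end
end

Applied to the matrix $A$ of the example below, this should reproduce the matrices $P$ and $S$ computed there.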

Example: We are given the vectors
$$a^{(1)} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad a^{(2)} = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 3 \end{bmatrix}, \qquad a^{(3)} = \begin{bmatrix} 0 \\ 1 \\ 4 \\ 9 \end{bmatrix}.$$
Use Gram-Schmidt orthogonalization to find an orthogonal basis $p^{(1)}, p^{(2)}, p^{(3)}$ for the subspace $V = \operatorname{span}\{a^{(1)}, a^{(2)}, a^{(3)}\}$.

Step 1: $p^{(1)} := a^{(1)} = (1, 1, 1, 1)^\top$

Step 2: $p^{(2)} := a^{(2)} - \dfrac{p^{(1)} \cdot a^{(2)}}{p^{(1)} \cdot p^{(1)}}\, p^{(1)} = a^{(2)} - \dfrac{6}{4}\, p^{(1)} = \left(-\tfrac{3}{2},\, -\tfrac{1}{2},\, \tfrac{1}{2},\, \tfrac{3}{2}\right)^\top$

Step 3: $p^{(3)} := a^{(3)} - \dfrac{p^{(1)} \cdot a^{(3)}}{p^{(1)} \cdot p^{(1)}}\, p^{(1)} - \dfrac{p^{(2)} \cdot a^{(3)}}{p^{(2)} \cdot p^{(2)}}\, p^{(2)} = a^{(3)} - \dfrac{14}{4}\, p^{(1)} - \dfrac{15}{5}\, p^{(2)} = (1, -1, -1, 1)^\top$

Note that we have
$$a^{(1)} = p^{(1)}, \qquad a^{(2)} = p^{(2)} + \tfrac{3}{2}\, p^{(1)}, \qquad a^{(3)} = p^{(3)} + \tfrac{7}{2}\, p^{(1)} + 3\, p^{(2)},$$
which we can write as
$$\underbrace{[\,a^{(1)}, a^{(2)}, a^{(3)}\,]}_{A} = \underbrace{[\,p^{(1)}, p^{(2)}, p^{(3)}\,]}_{P} \underbrace{\begin{bmatrix} 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}}_{S}.$$

In the general case we have
$$a^{(1)} = p^{(1)}, \quad a^{(2)} = p^{(2)} + s_{12}\, p^{(1)}, \quad a^{(3)} = p^{(3)} + s_{13}\, p^{(1)} + s_{23}\, p^{(2)}, \quad \ldots, \quad a^{(n)} = p^{(n)} + s_{1n}\, p^{(1)} + \cdots + s_{n-1,n}\, p^{(n-1)},$$
which we can write as
$$[\,a^{(1)}, a^{(2)}, \ldots, a^{(n)}\,] = [\,p^{(1)}, p^{(2)}, \ldots, p^{(n)}\,] \begin{bmatrix} 1 & s_{12} & \cdots & s_{1n} \\ & 1 & \ddots & \vdots \\ & & \ddots & s_{n-1,n} \\ & & & 1 \end{bmatrix}.$$
Therefore we obtain a decomposition $A = PS$ where

- the matrix $P \in \mathbb{R}^{m \times n}$ has orthogonal columns
- the matrix $S \in \mathbb{R}^{n \times n}$ is upper triangular, with $1$ on the diagonal

Note that the vectors $p^{(1)}, \ldots, p^{(n)}$ are different from $0$: Assume, e.g., that $p^{(3)} = a^{(3)} - s_{13}\, p^{(1)} - s_{23}\, p^{(2)} = 0$; then $a^{(3)} = s_{13}\, p^{(1)} + s_{23}\, p^{(2)}$ is in $\operatorname{span}\{p^{(1)}, p^{(2)}\} = \operatorname{span}\{a^{(1)}, a^{(2)}\}$. This is a contradiction to the assumption that $a^{(1)}, a^{(2)}, a^{(3)}$ are linearly independent.

Solving the least squares problem $\|Ac - b\| = \min$ using orthogonalization

We are given $A \in \mathbb{R}^{m \times n}$ with linearly independent columns and $b \in \mathbb{R}^m$. We want to find $c \in \mathbb{R}^n$ such that $\|Ac - b\| = \min$. From the Gram-Schmidt method we get $A = PS$, hence we want to find $c$ such that
$$\|P \underbrace{Sc}_{d} - b\| = \min.$$
This gives the following method for solving the least squares problem:
- use Gram-Schmidt to find the decomposition $A = PS$
- solve $\|Pd - b\| = \min$: $\quad d_i := \dfrac{p^{(i)} \cdot b}{p^{(i)} \cdot p^{(i)}}$ for $i = 1, \ldots, n$
- solve $Sc = d$ by back substitution

Example: Solve the least squares problem $\|Ac - b\| = \min$ for
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{bmatrix}, \qquad b = \begin{bmatrix} 0 \\ 1 \\ 4 \\ 7 \end{bmatrix}.$$
Gram-Schmidt gives (see above)
$$A = \underbrace{\begin{bmatrix} 1 & -\tfrac{3}{2} & 1 \\ 1 & -\tfrac{1}{2} & -1 \\ 1 & \tfrac{1}{2} & -1 \\ 1 & \tfrac{3}{2} & 1 \end{bmatrix}}_{P} \underbrace{\begin{bmatrix} 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}}_{S},$$
$$d_1 = \frac{p^{(1)} \cdot b}{p^{(1)} \cdot p^{(1)}} = \frac{12}{4} = 3, \qquad d_2 = \frac{p^{(2)} \cdot b}{p^{(2)} \cdot p^{(2)}} = \frac{12}{5}, \qquad d_3 = \frac{p^{(3)} \cdot b}{p^{(3)} \cdot p^{(3)}} = \frac{2}{4} = \frac{1}{2}.$$
Solving
$$\begin{bmatrix} 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} 3 \\ \tfrac{12}{5} \\ \tfrac{1}{2} \end{bmatrix}$$
by back substitution gives $c_3 = \tfrac{1}{2}$, $c_2 = \tfrac{12}{5} - 3 \cdot \tfrac{1}{2} = \tfrac{9}{10}$, $c_1 = 3 - \tfrac{3}{2} \cdot \tfrac{9}{10} - \tfrac{7}{2} \cdot \tfrac{1}{2} = -\tfrac{1}{10}$. Hence the solution of our least squares problem is the vector
$$c = \left(-\tfrac{1}{10},\; \tfrac{9}{10},\; \tfrac{1}{2}\right)^\top.$$

Note: If you want to solve a least squares problem by hand with pencil and paper, it is usually easier to use the normal equations. But for numerical computation on a computer, using orthogonalization is usually more efficient and more accurate.
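A sketch of this method in Matlab, reusing the gram_schmidt function from the sketch above (again an illustrative implementation, not part of the original notes):

A = [1 0 0; 1 1 1; 1 2 4; 1 3 9];   % matrix and right-hand side from the example
b = [0; 1; 4; 7];
[P, S] = gram_schmidt(A);           % A = P*S, columns of P are orthogonal
n = size(A, 2);
d = zeros(n, 1);
for i = 1:n
    d(i) = dot(P(:, i), b) / dot(P(:, i), P(:, i));   % decoupled normal equations
end
c = S \ d                           % back substitution (S is upper triangular)

This should reproduce the solution $c = (-\tfrac{1}{10}, \tfrac{9}{10}, \tfrac{1}{2})^\top$ found above, up to roundoff.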

Finding an orthonormal basis $q^{(1)}, \ldots, q^{(n)}$: the QR decomposition

The Gram-Schmidt method gives an orthogonal basis $p^{(1)}, \ldots, p^{(n)}$ for $V = \operatorname{span}\{a^{(1)}, \ldots, a^{(n)}\}$. Often it is convenient to have a so-called orthonormal basis $q^{(1)}, \ldots, q^{(n)}$ where the basis vectors have length $1$: Define
$$q^{(j)} = \frac{p^{(j)}}{\|p^{(j)}\|} \qquad \text{for } j = 1, \ldots, n;$$
then we have
$$\operatorname{span}\{q^{(1)}, \ldots, q^{(n)}\} = V, \qquad q^{(j)} \cdot q^{(k)} = \begin{cases} 1 & \text{for } j = k \\ 0 & \text{otherwise.} \end{cases}$$
This means that the matrix $Q = [q^{(1)}, \ldots, q^{(n)}]$ satisfies $Q^\top Q = I$ where $I$ is the $n \times n$ identity matrix. Since $p^{(j)} = \|p^{(j)}\|\, q^{(j)}$ we have
$$a^{(1)} = \underbrace{\|p^{(1)}\|}_{r_{11}}\, q^{(1)}, \qquad a^{(2)} = \underbrace{\|p^{(2)}\|}_{r_{22}}\, q^{(2)} + \underbrace{s_{12}\,\|p^{(1)}\|}_{r_{12}}\, q^{(1)}, \qquad \ldots, \qquad a^{(n)} = \underbrace{\|p^{(n)}\|}_{r_{nn}}\, q^{(n)} + \underbrace{s_{1n}\,\|p^{(1)}\|}_{r_{1n}}\, q^{(1)} + \cdots + \underbrace{s_{n-1,n}\,\|p^{(n-1)}\|}_{r_{n-1,n}}\, q^{(n-1)},$$
which we can write as
$$[\,a^{(1)}, a^{(2)}, \ldots, a^{(n)}\,] = [\,q^{(1)}, q^{(2)}, \ldots, q^{(n)}\,] \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \ddots & \vdots \\ & & \ddots & r_{n-1,n} \\ & & & r_{nn} \end{bmatrix}, \qquad\text{i.e.,}\qquad A = QR,$$
where the $n \times n$ matrix $R$ is given by
$$\text{row } j \text{ of } R = \|p^{(j)}\| \cdot (\text{row } j \text{ of } S), \qquad j = 1, \ldots, n.$$
We obtain the so-called QR decomposition $A = QR$ where
- the matrix $Q \in \mathbb{R}^{m \times n}$ has orthonormal columns, $\operatorname{range} Q = \operatorname{range} A$
- the matrix $R \in \mathbb{R}^{n \times n}$ is upper triangular, with nonzero diagonal elements

Example: In our example we have $p^{(1)} \cdot p^{(1)} = 4$, $p^{(2)} \cdot p^{(2)} = 5$, $p^{(3)} \cdot p^{(3)} = 4$, hence
$$q^{(1)} = \frac{p^{(1)}}{2} = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \\ \tfrac{1}{2} \\ \tfrac{1}{2} \end{bmatrix}, \qquad q^{(2)} = \frac{p^{(2)}}{\sqrt{5}} = \begin{bmatrix} -\tfrac{3\sqrt{5}}{10} \\ -\tfrac{\sqrt{5}}{10} \\ \tfrac{\sqrt{5}}{10} \\ \tfrac{3\sqrt{5}}{10} \end{bmatrix}, \qquad q^{(3)} = \frac{p^{(3)}}{2} = \begin{bmatrix} \tfrac{1}{2} \\ -\tfrac{1}{2} \\ -\tfrac{1}{2} \\ \tfrac{1}{2} \end{bmatrix},$$
and we obtain the QR decomposition
$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{3\sqrt{5}}{10} & \tfrac{1}{2} \\ \tfrac{1}{2} & -\tfrac{\sqrt{5}}{10} & -\tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{\sqrt{5}}{10} & -\tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{3\sqrt{5}}{10} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} 2 & 3 & 7 \\ 0 & \sqrt{5} & 3\sqrt{5} \\ 0 & 0 & 2 \end{bmatrix}.$$

In Matlab we can find the QR decomposition using [Q,R] = qr(A,0). This works for symbolic matrices

>> A = sym([1 0 0; 1 1 1; 1 2 4; 1 3 9]);
>> [Q,R] = qr(A,0)
Q =
[ 1/2, -(3*5^(1/2))/10,  1/2]
[ 1/2,     -5^(1/2)/10, -1/2]
[ 1/2,      5^(1/2)/10, -1/2]
[ 1/2,  (3*5^(1/2))/10,  1/2]
R =
[ 2,       3,         7]
[ 0, 5^(1/2), 3*5^(1/2)]
[ 0,       0,         2]

and it works for numerical matrices

>> A = [1 0 0; 1 1 1; 1 2 4; 1 3 9];
>> [Q,R] = qr(A,0)
Q =
   -0.5000    0.6708    0.5000
   -0.5000    0.2236   -0.5000
   -0.5000   -0.2236   -0.5000
   -0.5000   -0.6708    0.5000
R =
   -2.0000   -3.0000   -7.0000
         0   -2.2361   -6.7082
         0         0    2.0000

Note that for numerical matrices Matlab returned the basis $-q^{(1)}, -q^{(2)}, q^{(3)}$ (which is also an orthonormal basis), and hence rows 1 and 2 of the matrix R are $(-1)$ times our previous matrix R.

If we want to find an orthonormal basis for $\operatorname{range} A$ and an orthonormal basis for the orthogonal complement $(\operatorname{range} A)^\perp = \operatorname{null}(A^\top)$ we can use the command [Qh,Rh] = qr(A): It returns matrices $\hat{Q} \in \mathbb{R}^{m \times m}$ and $\hat{R} \in \mathbb{R}^{m \times n}$ with
$$\hat{Q} = [\underbrace{q^{(1)}, \ldots, q^{(n)}}_{\text{basis for } \operatorname{range} A},\ \underbrace{q^{(n+1)}, \ldots, q^{(m)}}_{\text{basis for } (\operatorname{range} A)^\perp}], \qquad \hat{R} = \begin{bmatrix} R \\ (m - n) \text{ rows of zeros} \end{bmatrix}.$$

>> A = [1 0 0; 1 1 1; 1 2 4; 1 3 9];
>> [Qh,Rh] = qr(A)
Qh =
   -0.5000    0.6708    0.5000    0.2236
   -0.5000    0.2236   -0.5000   -0.6708
   -0.5000   -0.2236   -0.5000    0.6708
   -0.5000   -0.6708    0.5000   -0.2236
Rh =
   -2.0000   -3.0000   -7.0000
         0   -2.2361   -6.7082
         0         0    2.0000
         0         0         0

But in most cases we only need an orthonormal basis for $\operatorname{range} A$ and we should use [Q,R] = qr(A,0) (which Matlab calls the "economy size" decomposition).

Solving the least squares problem $\|Ac - b\| = \min$ using the QR decomposition

If we use an orthonormal basis $q^{(1)}, \ldots, q^{(n)}$ for $\operatorname{span}\{a^{(1)}, \ldots, a^{(n)}\}$ we have $Q^\top Q = I$. The solution of $\|Qd - b\| = \min$ is therefore given by the normal equations $(Q^\top Q)\, d = Q^\top b$, i.e., we obtain $d = Q^\top b$. This gives the following method for solving the least squares problem:
- find the QR decomposition $A = QR$
- let $d = Q^\top b$
- solve $Rc = d$ by back substitution

In Matlab we can do this as follows:

[Q,R] = qr(A,0); d = Q'*b; c = R\d;

In our example we have

>> A = [1 0 0; 1 1 1; 1 2 4; 1 3 9]; b = [0;1;4;7];
>> [Q,R] = qr(A,0);
>> d = Q'*b;
>> c = R\d
c =
   -0.1000
    0.9000
    0.5000

This works for both numerical and symbolic matrices. For a numerical matrix A we can use the shortcut c = A\b which actually uses the QR decomposition to find the solution of $\|Ac - b\| = \min$:

>> A = [1 0 0; 1 1 1; 1 2 4; 1 3 9]; b = [0;1;4;7];
>> c = A\b
c =
   -0.1000
    0.9000
    0.5000

Warning: This shortcut does not work for symbolic matrices:

>> A = sym([1 0 0; 1 1 1; 1 2 4; 1 3 9]); b = sym([0;1;4;7]);
>> c = A\b
Warning: The system is inconsistent. Solution does not exist.
c =
 Inf
 Inf
 Inf
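As a final check, the following sketch verifies numerically that the least squares solution computed via the QR decomposition satisfies the normal equations, i.e., that the residual $Ac - b$ is orthogonal to the columns of A (this snippet is only an illustration, not part of the original notes):

A = [1 0 0; 1 1 1; 1 2 4; 1 3 9];
b = [0; 1; 4; 7];
[Q, R] = qr(A, 0);      % economy size QR decomposition
c = R \ (Q'*b);         % least squares solution
check = A'*(A*c - b)    % should be zero up to roundoff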