Orthogonality and Least Squares

6 Orthogonality and Least Squares 6.1 INNER PRODUCT, LENGTH, AND ORTHOGONALITY

INNER PRODUCT If $u$ and $v$ are vectors in $\mathbb{R}^n$, then we regard $u$ and $v$ as $n \times 1$ matrices. The transpose $u^T$ is a $1 \times n$ matrix, and the matrix product $u^T v$ is a $1 \times 1$ matrix, which we write as a single real number (a scalar) without brackets. The number $u^T v$ is called the inner product of $u$ and $v$, and it is written as $u \cdot v$. The inner product is also referred to as a dot product.

INNER PRODUCT If $u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$, then the inner product of $u$ and $v$ is
$$u^T v = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

INNER PRODUCT Theorem 1: Let $u$, $v$, and $w$ be vectors in $\mathbb{R}^n$, and let $c$ be a scalar. Then
a. $u \cdot v = v \cdot u$
b. $(u + v) \cdot w = u \cdot w + v \cdot w$
c. $(cu) \cdot v = c(u \cdot v) = u \cdot (cv)$
d. $u \cdot u \ge 0$, and $u \cdot u = 0$ if and only if $u = 0$.
Properties (b) and (c) can be combined several times to produce the following useful rule:
$$(c_1 u_1 + \cdots + c_p u_p) \cdot w = c_1(u_1 \cdot w) + \cdots + c_p(u_p \cdot w).$$
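
Because the inner product is just a sum of componentwise products, the properties in Theorem 1 are easy to check numerically. A minimal NumPy sketch (not part of the slides; the vectors are arbitrary illustrations):

```python
import numpy as np

u = np.array([2.0, -5.0, -1.0])
v = np.array([3.0, 2.0, -3.0])
w = np.array([1.0, 0.0, 4.0])
c = 7.0

# u . v as a matrix product u^T v (for 1-D arrays, @ is the dot product)
print(u @ v, np.dot(u, v))                        # both give 2*3 + (-5)*2 + (-1)*(-3) = -1

# Theorem 1 properties, checked up to floating-point roundoff
print(np.isclose(u @ v, v @ u))                   # (a) symmetry
print(np.isclose((u + v) @ w, u @ w + v @ w))     # (b) additivity
print(np.isclose((c * u) @ v, c * (u @ v)))       # (c) homogeneity
print(u @ u >= 0)                                 # (d) nonnegativity
```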

THE LENGTH OF A VECTOR If $v$ is in $\mathbb{R}^n$, with entries $v_1, \ldots, v_n$, then the square root of $v \cdot v$ is defined because $v \cdot v$ is nonnegative.
Definition: The length (or norm) of $v$ is the nonnegative scalar $\|v\|$ defined by
$$\|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}, \qquad \text{and} \qquad \|v\|^2 = v \cdot v.$$
Suppose $v$ is in $\mathbb{R}^2$, say, $v = \begin{bmatrix} a \\ b \end{bmatrix}$.

THE LENGTH OF A VECTOR If we identify $v$ with a geometric point in the plane, as usual, then $\|v\|$ coincides with the standard notion of the length of the line segment from the origin to $v$. This follows from the Pythagorean Theorem applied to a triangle such as the one shown in the following figure. For any scalar $c$, the length of $cv$ is $|c|$ times the length of $v$. That is, $\|cv\| = |c|\,\|v\|$.

THE LENGTH OF A VECTOR A vector whose length is 1 is called a unit vector. If we divide a nonzero vector $v$ by its length (that is, multiply by $1/\|v\|$), we obtain a unit vector $u = (1/\|v\|)v$ because the length of $u$ is 1. The process of creating $u$ from $v$ is sometimes called normalizing $v$, and we say that $u$ is in the same direction as $v$.

THE LENGTH OF A VECTOR Example 1: Let $v = (1, -2, 2, 0)$. Find a unit vector $u$ in the same direction as $v$.
Solution: First, compute the length of $v$:
$$\|v\|^2 = v \cdot v = (1)^2 + (-2)^2 + (2)^2 + (0)^2 = 9, \qquad \|v\| = \sqrt{9} = 3.$$
Then, multiply $v$ by $1/\|v\|$ to obtain
$$u = \frac{1}{\|v\|}v = \frac{1}{3}v = \frac{1}{3}\begin{bmatrix} 1 \\ -2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/3 \\ -2/3 \\ 2/3 \\ 0 \end{bmatrix}.$$
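
Normalizing is the same one step in code: divide the vector by its norm. A short NumPy check of Example 1 (not from the slides):

```python
import numpy as np

v = np.array([1.0, -2.0, 2.0, 0.0])
length = np.sqrt(v @ v)          # same as np.linalg.norm(v); here sqrt(9) = 3
u = v / length                   # unit vector in the same direction as v

print(length)                    # 3.0
print(u)                         # [ 1/3, -2/3, 2/3, 0 ]
print(np.linalg.norm(u))         # 1.0 (up to roundoff)
```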

DISTANCE IN $\mathbb{R}^n$ To check that $\|u\| = 1$, it suffices to show that $\|u\|^2 = 1$:
$$\|u\|^2 = u \cdot u = \left(\tfrac{1}{3}\right)^2 + \left(-\tfrac{2}{3}\right)^2 + \left(\tfrac{2}{3}\right)^2 + 0^2 = \tfrac{1}{9} + \tfrac{4}{9} + \tfrac{4}{9} + 0 = 1.$$
Definition: For $u$ and $v$ in $\mathbb{R}^n$, the distance between $u$ and $v$, written as $\mathrm{dist}(u, v)$, is the length of the vector $u - v$. That is,
$$\mathrm{dist}(u, v) = \|u - v\|.$$

DISTANCE IN $\mathbb{R}^n$ Example 2: Compute the distance between the vectors $u = (7, 1)$ and $v = (3, 2)$.
Solution: Calculate
$$u - v = \begin{bmatrix} 7 \\ 1 \end{bmatrix} - \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \qquad \|u - v\| = \sqrt{4^2 + (-1)^2} = \sqrt{17}.$$
The vectors $u$, $v$, and $u - v$ are shown in the figure on the next slide. When the vector $u - v$ is added to $v$, the result is $u$.
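
Example 2 translates directly into code: the distance is the norm of the difference. A minimal sketch (not from the slides):

```python
import numpy as np

u = np.array([7.0, 1.0])
v = np.array([3.0, 2.0])

dist = np.linalg.norm(u - v)     # sqrt(4^2 + (-1)^2) = sqrt(17)
print(dist)                      # approximately 4.123
```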

DISTANCE IN $\mathbb{R}^n$ Notice that the parallelogram in the above figure shows that the distance from $u$ to $v$ is the same as the distance from $u - v$ to $0$.

ORTHOGONAL VECTORS Consider $\mathbb{R}^2$ or $\mathbb{R}^3$ and two lines through the origin determined by vectors $u$ and $v$. See the figure below. The two lines shown in the figure are geometrically perpendicular if and only if the distance from $u$ to $v$ is the same as the distance from $u$ to $-v$. This is the same as requiring the squares of the distances to be the same.

ORTHOGONAL VECTORS Now
$$[\mathrm{dist}(u, -v)]^2 = \|u - (-v)\|^2 = \|u + v\|^2 = (u + v) \cdot (u + v)$$
$$= u \cdot (u + v) + v \cdot (u + v) \qquad \text{Theorem 1(b)}$$
$$= u \cdot u + u \cdot v + v \cdot u + v \cdot v \qquad \text{Theorem 1(a), (b)}$$
$$= \|u\|^2 + \|v\|^2 + 2\,u \cdot v. \qquad \text{Theorem 1(a)}$$
The same calculations with $v$ and $-v$ interchanged show that
$$[\mathrm{dist}(u, v)]^2 = \|u\|^2 + \|{-v}\|^2 + 2\,u \cdot (-v) = \|u\|^2 + \|v\|^2 - 2\,u \cdot v.$$

ORTHOGONAL VECTORS The two squared distances are equal if and only if $2\,u \cdot v = -2\,u \cdot v$, which happens if and only if $u \cdot v = 0$. This calculation shows that when vectors $u$ and $v$ are identified with geometric points, the corresponding lines through the points and the origin are perpendicular if and only if $u \cdot v = 0$.
Definition: Two vectors $u$ and $v$ in $\mathbb{R}^n$ are orthogonal (to each other) if $u \cdot v = 0$.
The zero vector is orthogonal to every vector in $\mathbb{R}^n$ because $0^T v = 0$ for all $v$.
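
Orthogonality is a one-line test on the dot product (with a tolerance in floating point). A sketch with arbitrary example vectors; the helper name is my own, not from the slides:

```python
import numpy as np

def is_orthogonal(u, v, tol=1e-12):
    """Return True when u . v is (numerically) zero."""
    return abs(np.dot(u, v)) < tol

print(is_orthogonal(np.array([4.0, 2.0]), np.array([-1.0, 2.0])))   # True:  -4 + 4 = 0
print(is_orthogonal(np.array([4.0, 2.0]), np.array([1.0, 2.0])))    # False:  4 + 4 = 8
```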

THE PYTHAGOREAN THEOREM Theorem 2: Two vectors $u$ and $v$ are orthogonal if and only if $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.
Orthogonal Complements If a vector $z$ is orthogonal to every vector in a subspace $W$ of $\mathbb{R}^n$, then $z$ is said to be orthogonal to $W$. The set of all vectors $z$ that are orthogonal to $W$ is called the orthogonal complement of $W$ and is denoted by $W^\perp$ (and read as "$W$ perpendicular" or simply "$W$ perp").

ORTHOGONAL COMPLEMENTS
1. A vector $x$ is in $W^\perp$ if and only if $x$ is orthogonal to every vector in a set that spans $W$.
2. $W^\perp$ is a subspace of $\mathbb{R}^n$.
Theorem 3: Let $A$ be an $m \times n$ matrix. The orthogonal complement of the column space of $A$ is the null space of $A^T$:
$$(\mathrm{Col}\,A)^\perp = \mathrm{Nul}\,A^T.$$
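
Theorem 3 can be checked numerically: a basis for $\mathrm{Nul}\,A^T$ (computed here from the SVD) is orthogonal to every column of $A$. A sketch with an arbitrary matrix, not from the slides:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 1.0]])            # 3x2, so Col A is a plane in R^3

# Basis for Nul(A^T) from the SVD of A^T: rows of Vt beyond rank(A^T)
# are right singular vectors with zero singular value, i.e. A^T x = 0.
U, s, Vt = np.linalg.svd(A.T)
rank = np.sum(s > 1e-12)
null_AT = Vt[rank:].T                 # columns form a basis of Nul(A^T)

# A.T @ x lists the dot products of x with each column of A, so this
# confirms every vector in Nul(A^T) is orthogonal to Col A.
print(np.allclose(A.T @ null_AT, 0))  # True
```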

6 Orthogonality and Least Squares 6.2 ORTHOGONAL SETS

ORTHOGONAL SETS A set of vectors $\{u_1, \ldots, u_p\}$ in $\mathbb{R}^n$ is said to be an orthogonal set if each pair of distinct vectors from the set is orthogonal, that is, if $u_i \cdot u_j = 0$ whenever $i \ne j$.
Theorem 4: If $S = \{u_1, \ldots, u_p\}$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then $S$ is linearly independent and hence is a basis for the subspace spanned by $S$.

ORTHOGONAL SETS Proof: If $0 = c_1 u_1 + \cdots + c_p u_p$ for some scalars $c_1, \ldots, c_p$, then
$$0 = 0 \cdot u_1 = (c_1 u_1 + c_2 u_2 + \cdots + c_p u_p) \cdot u_1 = (c_1 u_1) \cdot u_1 + (c_2 u_2) \cdot u_1 + \cdots + (c_p u_p) \cdot u_1$$
$$= c_1(u_1 \cdot u_1) + c_2(u_2 \cdot u_1) + \cdots + c_p(u_p \cdot u_1) = c_1(u_1 \cdot u_1)$$
because $u_1$ is orthogonal to $u_2, \ldots, u_p$. Since $u_1$ is nonzero, $u_1 \cdot u_1$ is not zero and so $c_1 = 0$. Similarly, $c_2, \ldots, c_p$ must be zero.

ORTHOGONAL SETS Thus $S$ is linearly independent.
Definition: An orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ is a basis for $W$ that is also an orthogonal set.
Theorem 5: Let $\{u_1, \ldots, u_p\}$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$. For each $y$ in $W$, the weights in the linear combination $y = c_1 u_1 + \cdots + c_p u_p$ are given by
$$c_j = \frac{y \cdot u_j}{u_j \cdot u_j} \qquad (j = 1, \ldots, p).$$

ORTHOGONAL SETS Proof: The orthogonality of $\{u_1, \ldots, u_p\}$ shows that
$$y \cdot u_1 = (c_1 u_1 + c_2 u_2 + \cdots + c_p u_p) \cdot u_1 = c_1(u_1 \cdot u_1).$$
Since $u_1 \cdot u_1$ is not zero, the equation above can be solved for $c_1$. To find $c_j$ for $j = 2, \ldots, p$, compute $y \cdot u_j$ and solve for $c_j$.
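
The weight formula of Theorem 5 is used throughout the rest of the chapter, so it is worth seeing once as code. A sketch with an arbitrary orthogonal basis; the helper name is my own, not from the slides:

```python
import numpy as np

def orthogonal_basis_weights(y, basis):
    """Weights c_j = (y . u_j) / (u_j . u_j) for an orthogonal basis (Theorem 5)."""
    return np.array([(y @ u) / (u @ u) for u in basis])

u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])       # u1 . u2 = -4 + 5 - 1 = 0, so {u1, u2} is orthogonal

y = 2 * u1 + 3 * u2                   # a vector known to lie in W = Span{u1, u2}
c = orthogonal_basis_weights(y, [u1, u2])
print(c)                              # [2. 3.] -- the weights are recovered exactly
```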

AN ORTHOGONAL PROJECTION Given a nonzero vector $u$ in $\mathbb{R}^n$, consider the problem of decomposing a vector $y$ in $\mathbb{R}^n$ into the sum of two vectors, one a multiple of $u$ and the other orthogonal to $u$. We wish to write
$$y = \hat{y} + z \qquad (1)$$
where $\hat{y} = \alpha u$ for some scalar $\alpha$ and $z$ is some vector orthogonal to $u$. See the following figure.

AN ORTHOGONAL PROJECTION Given any scalar $\alpha$, let $z = y - \alpha u$, so that (1) is satisfied. Then $y - \hat{y}$ is orthogonal to $u$ if and only if
$$0 = (y - \alpha u) \cdot u = y \cdot u - (\alpha u) \cdot u = y \cdot u - \alpha(u \cdot u).$$
That is, (1) is satisfied with $z$ orthogonal to $u$ if and only if $\alpha = \dfrac{y \cdot u}{u \cdot u}$ and $\hat{y} = \dfrac{y \cdot u}{u \cdot u}\,u$. The vector $\hat{y}$ is called the orthogonal projection of $y$ onto $u$, and the vector $z$ is called the component of $y$ orthogonal to $u$.

AN ORTHOGONAL PROJECTION If $c$ is any nonzero scalar and if $u$ is replaced by $cu$ in the definition of $\hat{y}$, then the orthogonal projection of $y$ onto $cu$ is exactly the same as the orthogonal projection of $y$ onto $u$. Hence this projection is determined by the subspace $L$ spanned by $u$ (the line through $u$ and $0$). Sometimes $\hat{y}$ is denoted by $\mathrm{proj}_L y$ and is called the orthogonal projection of $y$ onto $L$. That is,
$$\hat{y} = \mathrm{proj}_L y = \frac{y \cdot u}{u \cdot u}\,u. \qquad (2)$$

AN ORTHOGONAL PROJECTION Example 1: Let $y = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$ and $u = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$. Find the orthogonal projection of $y$ onto $u$. Then write $y$ as the sum of two orthogonal vectors, one in $\mathrm{Span}\{u\}$ and one orthogonal to $u$.
Solution: Compute
$$y \cdot u = \begin{bmatrix} 7 \\ 6 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 40, \qquad u \cdot u = \begin{bmatrix} 4 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 20.$$

AN ORTHOGONAL PROJECTION The orthogonal projection of $y$ onto $u$ is
$$\hat{y} = \frac{y \cdot u}{u \cdot u}\,u = \frac{40}{20}\,u = 2\begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 4 \end{bmatrix},$$
and the component of $y$ orthogonal to $u$ is
$$y - \hat{y} = \begin{bmatrix} 7 \\ 6 \end{bmatrix} - \begin{bmatrix} 8 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}.$$
The sum of these two vectors is $y$.

AN ORTHOGONAL PROJECTION That is,
$$\begin{bmatrix} 7 \\ 6 \end{bmatrix} = \begin{bmatrix} 8 \\ 4 \end{bmatrix} + \begin{bmatrix} -1 \\ 2 \end{bmatrix},$$
where the first term is $\hat{y}$ and the second is $y - \hat{y}$. The decomposition of $y$ is illustrated in the following figure.

AN ORTHOGONAL PROJECTION Note: If the calculations above are correct, then $\{\hat{y},\, y - \hat{y}\}$ will be an orthogonal set. As a check, compute
$$\hat{y} \cdot (y - \hat{y}) = \begin{bmatrix} 8 \\ 4 \end{bmatrix} \cdot \begin{bmatrix} -1 \\ 2 \end{bmatrix} = -8 + 8 = 0.$$
Since the line segment in the figure on the previous slide between $y$ and $\hat{y}$ is perpendicular to $L$, by construction of $\hat{y}$, the point identified with $\hat{y}$ is the closest point of $L$ to $y$.
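
Formula (2) and the check $\hat{y} \cdot (y - \hat{y}) = 0$ can both be reproduced in a few lines. A sketch for Example 1; the function name is my own, not from the slides:

```python
import numpy as np

def proj_onto_line(y, u):
    """Orthogonal projection of y onto the line spanned by u: (y.u / u.u) u."""
    return (y @ u) / (u @ u) * u

y = np.array([7.0, 6.0])
u = np.array([4.0, 2.0])

y_hat = proj_onto_line(y, u)      # [8, 4]
z = y - y_hat                     # [-1, 2], the component of y orthogonal to u

print(y_hat, z)
print(np.isclose(y_hat @ z, 0))   # True: {y_hat, y - y_hat} is an orthogonal set
```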

ORTHONORMAL SETS A set $\{u_1, \ldots, u_p\}$ is an orthonormal set if it is an orthogonal set of unit vectors. If $W$ is the subspace spanned by such a set, then $\{u_1, \ldots, u_p\}$ is an orthonormal basis for $W$, since the set is automatically linearly independent, by Theorem 4. The simplest example of an orthonormal set is the standard basis $\{e_1, \ldots, e_n\}$ for $\mathbb{R}^n$. Any nonempty subset of $\{e_1, \ldots, e_n\}$ is orthonormal, too.

ORTHONORMAL SETS Example 2: Show that $\{v_1, v_2, v_3\}$ is an orthonormal basis of $\mathbb{R}^3$, where
$$v_1 = \begin{bmatrix} 3/\sqrt{11} \\ 1/\sqrt{11} \\ 1/\sqrt{11} \end{bmatrix}, \quad v_2 = \begin{bmatrix} -1/\sqrt{6} \\ 2/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}, \quad v_3 = \begin{bmatrix} -1/\sqrt{66} \\ -4/\sqrt{66} \\ 7/\sqrt{66} \end{bmatrix}.$$
Solution: Compute
$$v_1 \cdot v_2 = -3/\sqrt{66} + 2/\sqrt{66} + 1/\sqrt{66} = 0,$$
$$v_1 \cdot v_3 = -3/\sqrt{726} - 4/\sqrt{726} + 7/\sqrt{726} = 0.$$

ORTHONORMAL SETS
$$v_2 \cdot v_3 = 1/\sqrt{396} - 8/\sqrt{396} + 7/\sqrt{396} = 0.$$
Thus $\{v_1, v_2, v_3\}$ is an orthogonal set. Also,
$$v_1 \cdot v_1 = 9/11 + 1/11 + 1/11 = 1, \quad v_2 \cdot v_2 = 1/6 + 4/6 + 1/6 = 1, \quad v_3 \cdot v_3 = 1/66 + 16/66 + 49/66 = 1,$$
which shows that $v_1$, $v_2$, and $v_3$ are unit vectors. Thus $\{v_1, v_2, v_3\}$ is an orthonormal set. Since the set is linearly independent, its three vectors form a basis for $\mathbb{R}^3$. See the figure on the next slide.
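
A set of columns is orthonormal exactly when $V^T V$ is the identity matrix, which gives a one-line check of Example 2. A sketch, not from the slides:

```python
import numpy as np

v1 = np.array([3.0, 1.0, 1.0]) / np.sqrt(11)
v2 = np.array([-1.0, 2.0, 1.0]) / np.sqrt(6)
v3 = np.array([-1.0, -4.0, 7.0]) / np.sqrt(66)

V = np.column_stack([v1, v2, v3])

# V^T V = I means all pairwise dot products are 0 and each vector has length 1.
print(np.allclose(V.T @ V, np.eye(3)))   # True
```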

ORTHONORMAL SETS When the vectors in an orthogonal set of nonzero vectors are normalized to have unit length, the new vectors will still be orthogonal, and hence the new set will be an orthonormal set.

6 Orthogonality and Least Squares 6.3 ORTHOGONAL PROJECTIONS

ORTHOGONAL PROJECTIONS The orthogonal projection of a point in $\mathbb{R}^2$ onto a line through the origin has an important analogue in $\mathbb{R}^n$. Given a vector $y$ and a subspace $W$ in $\mathbb{R}^n$, there is a vector $\hat{y}$ in $W$ such that (1) $\hat{y}$ is the unique vector in $W$ for which $y - \hat{y}$ is orthogonal to $W$, and (2) $\hat{y}$ is the unique vector in $W$ closest to $y$. See the following figure.

THE ORTHOGONAL DECOMPOSITION THEOREM These two properties of $\hat{y}$ provide the key to finding the least-squares solutions of linear systems.
Theorem 8: Let $W$ be a subspace of $\mathbb{R}^n$. Then each $y$ in $\mathbb{R}^n$ can be written uniquely in the form
$$y = \hat{y} + z \qquad (1)$$
where $\hat{y}$ is in $W$ and $z$ is in $W^\perp$. In fact, if $\{u_1, \ldots, u_p\}$ is any orthogonal basis of $W$, then
$$\hat{y} = \frac{y \cdot u_1}{u_1 \cdot u_1}\,u_1 + \cdots + \frac{y \cdot u_p}{u_p \cdot u_p}\,u_p \qquad (2)$$
and $z = y - \hat{y}$.

THE ORTHOGONAL DECOMPOSITION THEOREM The vector $\hat{y}$ in (1) is called the orthogonal projection of $y$ onto $W$ and often is written as $\mathrm{proj}_W y$. See the following figure.
Proof: Let $\{u_1, \ldots, u_p\}$ be any orthogonal basis for $W$, and define $\hat{y}$ by (2). Then $\hat{y}$ is in $W$ because $\hat{y}$ is a linear combination of the basis vectors $u_1, \ldots, u_p$.

THE ORTHOGONAL DECOMPOSITION THEOREM Let $z = y - \hat{y}$. Since $u_1$ is orthogonal to $u_2, \ldots, u_p$, it follows from (2) that
$$z \cdot u_1 = (y - \hat{y}) \cdot u_1 = y \cdot u_1 - \left(\frac{y \cdot u_1}{u_1 \cdot u_1}\right)(u_1 \cdot u_1) - 0 - \cdots - 0 = y \cdot u_1 - y \cdot u_1 = 0.$$
Thus $z$ is orthogonal to $u_1$. Similarly, $z$ is orthogonal to each $u_j$ in the basis for $W$. Hence $z$ is orthogonal to every vector in $W$. That is, $z$ is in $W^\perp$.

THE ORTHOGONAL DECOMPOSITION THEOREM To show that the decomposition in (1) is unique, suppose $y$ can also be written as $y = \hat{y}_1 + z_1$, with $\hat{y}_1$ in $W$ and $z_1$ in $W^\perp$. Then $\hat{y} + z = \hat{y}_1 + z_1$ (since both sides equal $y$), and so
$$\hat{y} - \hat{y}_1 = z_1 - z.$$
This equality shows that the vector $v = \hat{y} - \hat{y}_1$ is in $W$ and in $W^\perp$ (because $z_1$ and $z$ are both in $W^\perp$, and $W^\perp$ is a subspace). Hence $v \cdot v = 0$, which shows that $v = 0$. This proves that $\hat{y} = \hat{y}_1$ and also $z = z_1$.

THE ORTHOGONAL DECOMPOSITION THEOREM The uniqueness of the decomposition (1) shows that the orthogonal projection $\hat{y}$ depends only on $W$ and not on the particular basis used in (2).
Example 1: Let $u_1 = \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix}$, $u_2 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$, and $y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$. Observe that $\{u_1, u_2\}$ is an orthogonal basis for $W = \mathrm{Span}\{u_1, u_2\}$. Write $y$ as the sum of a vector in $W$ and a vector orthogonal to $W$.

THE ORTHOGONAL DECOMPOSITION THEOREM Solution: The orthogonal projection of $y$ onto $W$ is
$$\hat{y} = \frac{y \cdot u_1}{u_1 \cdot u_1}\,u_1 + \frac{y \cdot u_2}{u_2 \cdot u_2}\,u_2 = \frac{9}{30}\begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix} + \frac{3}{6}\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} = \frac{9}{30}\begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix} + \frac{15}{30}\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix}.$$
Also,
$$y - \hat{y} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} = \begin{bmatrix} 7/5 \\ 0 \\ 14/5 \end{bmatrix}.$$

THE ORTHOGONAL DECOMPOSITION THEOREM Theorem 8 ensures that $y - \hat{y}$ is in $W^\perp$. To check the calculations, verify that $y - \hat{y}$ is orthogonal to both $u_1$ and $u_2$ and hence to all of $W$. The desired decomposition of $y$ is
$$y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} + \begin{bmatrix} 7/5 \\ 0 \\ 14/5 \end{bmatrix}.$$
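
The projection formula of Theorem 8 is Theorem 5 applied term by term, so Example 1 can be checked directly. A sketch; the helper name is my own, not from the slides:

```python
import numpy as np

def proj_onto_subspace(y, basis):
    """proj_W y for an *orthogonal* basis of W (Theorem 8, formula (2))."""
    return sum((y @ u) / (u @ u) * u for u in basis)

u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])
y = np.array([1.0, 2.0, 3.0])

y_hat = proj_onto_subspace(y, [u1, u2])   # [-2/5, 2, 1/5]
z = y - y_hat                             # [ 7/5, 0, 14/5]

print(y_hat, z)
print(np.allclose([z @ u1, z @ u2], 0))   # True: z is orthogonal to W
```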

PROPERTIES OF ORTHOGONAL PROJECTIONS If $\{u_1, \ldots, u_p\}$ is an orthogonal basis for $W$ and if $y$ happens to be in $W$, then the formula for $\mathrm{proj}_W y$ is exactly the same as the representation of $y$ given in Theorem 5 in Section 6.2. In this case, $\mathrm{proj}_W y = y$. If $y$ is in $W = \mathrm{Span}\{u_1, \ldots, u_p\}$, then $\mathrm{proj}_W y = y$.

THE BEST APPROXIMATION THEOREM Theorem 9: Let $W$ be a subspace of $\mathbb{R}^n$, let $y$ be any vector in $\mathbb{R}^n$, and let $\hat{y}$ be the orthogonal projection of $y$ onto $W$. Then $\hat{y}$ is the closest point in $W$ to $y$, in the sense that
$$\|y - \hat{y}\| < \|y - v\| \qquad (3)$$
for all $v$ in $W$ distinct from $\hat{y}$. The vector $\hat{y}$ in Theorem 9 is called the best approximation to $y$ by elements of $W$. The distance from $y$ to $v$, given by $\|y - v\|$, can be regarded as the error of using $v$ in place of $y$. Theorem 9 says that this error is minimized when $v = \hat{y}$.

THE BEST APPROXIMATION THEOREM Inequality (3) leads to a new proof that $\hat{y}$ does not depend on the particular orthogonal basis used to compute it. If a different orthogonal basis for $W$ were used to construct an orthogonal projection of $y$, then this projection would also be the closest point in $W$ to $y$, namely, $\hat{y}$.

THE BEST APPROXIMATION THEOREM Proof: Take $v$ in $W$ distinct from $\hat{y}$. See the following figure. Then $\hat{y} - v$ is in $W$. By the Orthogonal Decomposition Theorem, $y - \hat{y}$ is orthogonal to $W$. In particular, $y - \hat{y}$ is orthogonal to $\hat{y} - v$ (which is in $W$).

THE BEST APPROXIMATION THEOREM Since
$$y - v = (y - \hat{y}) + (\hat{y} - v),$$
the Pythagorean Theorem gives
$$\|y - v\|^2 = \|y - \hat{y}\|^2 + \|\hat{y} - v\|^2.$$
(See the colored right triangle in the figure on the previous slide. The length of each side is labeled.) Now $\|\hat{y} - v\|^2 > 0$ because $\hat{y} - v \ne 0$, and so inequality (3) follows immediately.

PROPERTIES OF ORTHOGONAL PROJECTIONS Example 2: The distance from a point $y$ in $\mathbb{R}^n$ to a subspace $W$ is defined as the distance from $y$ to the nearest point in $W$. Find the distance from $y$ to $W = \mathrm{Span}\{u_1, u_2\}$, where
$$y = \begin{bmatrix} -1 \\ -5 \\ 10 \end{bmatrix}, \quad u_1 = \begin{bmatrix} 5 \\ -2 \\ 1 \end{bmatrix}, \quad u_2 = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}.$$
Solution: By the Best Approximation Theorem, the distance from $y$ to $W$ is $\|y - \hat{y}\|$, where $\hat{y} = \mathrm{proj}_W y$.

PROPERTIES OF ORTHOGONAL PROJECTIONS Since $\{u_1, u_2\}$ is an orthogonal basis for $W$,
$$\hat{y} = \frac{15}{30}\,u_1 + \frac{-21}{6}\,u_2 = \frac{1}{2}\begin{bmatrix} 5 \\ -2 \\ 1 \end{bmatrix} - \frac{7}{2}\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ -8 \\ 4 \end{bmatrix},$$
$$y - \hat{y} = \begin{bmatrix} -1 \\ -5 \\ 10 \end{bmatrix} - \begin{bmatrix} -1 \\ -8 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \\ 6 \end{bmatrix},$$
$$\|y - \hat{y}\|^2 = 3^2 + 6^2 = 45.$$
The distance from $y$ to $W$ is $\sqrt{45} = 3\sqrt{5}$.
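
The distance computation in Example 2 follows the same pattern in code: project, subtract, take the norm. A self-contained sketch, not from the slides:

```python
import numpy as np

u1 = np.array([5.0, -2.0, 1.0])
u2 = np.array([1.0, 2.0, -1.0])
y = np.array([-1.0, -5.0, 10.0])

# proj_W y for the orthogonal basis {u1, u2}
y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2   # [-1, -8, 4]

dist = np.linalg.norm(y - y_hat)     # sqrt(0^2 + 3^2 + 6^2) = sqrt(45)
print(dist, 3 * np.sqrt(5))          # both approximately 6.708
```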

6 Orthogonality and Least Squares 6.5 LEAST-SQUARES PROBLEMS

LEAST-SQUARES PROBLEMS Definition: If $A$ is $m \times n$ and $b$ is in $\mathbb{R}^m$, a least-squares solution of $Ax = b$ is an $\hat{x}$ in $\mathbb{R}^n$ such that
$$\|b - A\hat{x}\| \le \|b - Ax\|$$
for all $x$ in $\mathbb{R}^n$. The most important aspect of the least-squares problem is that no matter what $x$ we select, the vector $Ax$ will necessarily be in the column space, $\mathrm{Col}\,A$. So we seek an $x$ that makes $Ax$ the closest point in $\mathrm{Col}\,A$ to $b$. See the figure on the next slide.

LEAST-SQUARES PROBLEMS Solution of the General Least-Squares Problem: Given $A$ and $b$, apply the Best Approximation Theorem to the subspace $\mathrm{Col}\,A$. Let
$$\hat{b} = \mathrm{proj}_{\mathrm{Col}\,A}\, b.$$

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Because $\hat{b}$ is in the column space of $A$, the equation $Ax = \hat{b}$ is consistent, and there is an $\hat{x}$ in $\mathbb{R}^n$ such that
$$A\hat{x} = \hat{b}. \qquad (1)$$
Since $\hat{b}$ is the closest point in $\mathrm{Col}\,A$ to $b$, a vector $\hat{x}$ is a least-squares solution of $Ax = b$ if and only if $\hat{x}$ satisfies (1). Such an $\hat{x}$ in $\mathbb{R}^n$ is a list of weights that will build $\hat{b}$ out of the columns of $A$. See the figure on the next slide.

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Suppose $\hat{x}$ satisfies $A\hat{x} = \hat{b}$. By the Orthogonal Decomposition Theorem, the projection $\hat{b}$ has the property that $b - \hat{b}$ is orthogonal to $\mathrm{Col}\,A$, so $b - A\hat{x}$ is orthogonal to each column of $A$. If $a_j$ is any column of $A$, then $a_j \cdot (b - A\hat{x}) = 0$, that is, $a_j^T(b - A\hat{x}) = 0$.

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Since each $a_j^T$ is a row of $A^T$,
$$A^T(b - A\hat{x}) = 0. \qquad (2)$$
Thus
$$A^T b - A^T A\hat{x} = 0, \qquad \text{so} \qquad A^T A\hat{x} = A^T b.$$
These calculations show that each least-squares solution of $Ax = b$ satisfies the equation
$$A^T A x = A^T b. \qquad (3)$$
The matrix equation (3) represents a system of equations called the normal equations for $Ax = b$. A solution of (3) is often denoted by $\hat{x}$.

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Theorem 13: The set of least-squares solutions of $Ax = b$ coincides with the nonempty set of solutions of the normal equations $A^T A x = A^T b$.
Proof: The set of least-squares solutions is nonempty, and each least-squares solution $\hat{x}$ satisfies the normal equations. Conversely, suppose $\hat{x}$ satisfies $A^T A\hat{x} = A^T b$. Then $\hat{x}$ satisfies (2), which shows that $b - A\hat{x}$ is orthogonal to the rows of $A^T$ and hence is orthogonal to the columns of $A$. Since the columns of $A$ span $\mathrm{Col}\,A$, the vector $b - A\hat{x}$ is orthogonal to all of $\mathrm{Col}\,A$.

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Hence the equation
$$b = A\hat{x} + (b - A\hat{x})$$
is a decomposition of $b$ into the sum of a vector in $\mathrm{Col}\,A$ and a vector orthogonal to $\mathrm{Col}\,A$. By the uniqueness of the orthogonal decomposition, $A\hat{x}$ must be the orthogonal projection of $b$ onto $\mathrm{Col}\,A$. That is, $A\hat{x} = \hat{b}$, and $\hat{x}$ is a least-squares solution.

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Example 1: Find a least-squares solution of the inconsistent system $Ax = b$ for
$$A = \begin{bmatrix} 4 & 0 \\ 0 & 2 \\ 1 & 1 \end{bmatrix}, \qquad b = \begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix}.$$
Solution: To use the normal equations (3), compute
$$A^T A = \begin{bmatrix} 4 & 0 & 1 \\ 0 & 2 & 1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 17 & 1 \\ 1 & 5 \end{bmatrix}.$$

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM
$$A^T b = \begin{bmatrix} 4 & 0 & 1 \\ 0 & 2 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix} = \begin{bmatrix} 19 \\ 11 \end{bmatrix}.$$
Then the equation $A^T A x = A^T b$ becomes
$$\begin{bmatrix} 17 & 1 \\ 1 & 5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 19 \\ 11 \end{bmatrix}.$$

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Row operations can be used to solve the system on the previous slide, but since $A^T A$ is invertible and $2 \times 2$, it is probably faster to compute
$$(A^T A)^{-1} = \frac{1}{84}\begin{bmatrix} 5 & -1 \\ -1 & 17 \end{bmatrix}$$
and then solve $A^T A x = A^T b$ as
$$\hat{x} = (A^T A)^{-1}A^T b = \frac{1}{84}\begin{bmatrix} 5 & -1 \\ -1 & 17 \end{bmatrix}\begin{bmatrix} 19 \\ 11 \end{bmatrix} = \frac{1}{84}\begin{bmatrix} 84 \\ 168 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.$$
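
Example 1 can be reproduced numerically, either by solving the normal equations directly or with NumPy's built-in least-squares routine; both give $\hat{x} = (1, 2)$. A sketch, not from the slides:

```python
import numpy as np

A = np.array([[4.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.array([2.0, 0.0, 11.0])

# Solve the normal equations A^T A x = A^T b (Theorem 13).
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# NumPy's least-squares routine minimizes ||b - Ax|| directly.
x_lstsq, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)

print(x_normal, x_lstsq)                   # both [1. 2.]
print(np.linalg.norm(b - A @ x_normal))    # distance from b to A x-hat, sqrt(84)
```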

SOLUTION OF THE GENERAL LEAST-SQUARES PROBLEM Theorem 14: Let $A$ be an $m \times n$ matrix. The following statements are logically equivalent:
a. The equation $Ax = b$ has a unique least-squares solution for each $b$ in $\mathbb{R}^m$.
b. The columns of $A$ are linearly independent.
c. The matrix $A^T A$ is invertible.
When these statements are true, the least-squares solution $\hat{x}$ is given by
$$\hat{x} = (A^T A)^{-1}A^T b. \qquad (4)$$
When a least-squares solution $\hat{x}$ is used to produce $A\hat{x}$ as an approximation to $b$, the distance from $b$ to $A\hat{x}$ is called the least-squares error of this approximation.

ALTERNATIVE CALCULATIONS OF LEAST-SQUARES SOLUTIONS Example 2: Find a least-squares solution of $Ax = b$ for
$$A = \begin{bmatrix} 1 & -6 \\ 1 & -2 \\ 1 & 1 \\ 1 & 7 \end{bmatrix}, \qquad b = \begin{bmatrix} -1 \\ 2 \\ 1 \\ 6 \end{bmatrix}.$$
Solution: Because the columns $a_1$ and $a_2$ of $A$ are orthogonal, the orthogonal projection of $b$ onto $\mathrm{Col}\,A$ is given by
$$\hat{b} = \frac{b \cdot a_1}{a_1 \cdot a_1}\,a_1 + \frac{b \cdot a_2}{a_2 \cdot a_2}\,a_2 = \frac{8}{4}\,a_1 + \frac{45}{90}\,a_2. \qquad (5)$$

ALTERNATIVE CALCULATIONS OF LEAST-SQUARES SOLUTIONS
$$\hat{b} = 2\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} -6 \\ -2 \\ 1 \\ 7 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 5/2 \\ 11/2 \end{bmatrix}.$$
Now that $\hat{b}$ is known, we can solve $A\hat{x} = \hat{b}$. But this is trivial, since we already know the weights to place on the columns of $A$ to produce $\hat{b}$. It is clear from (5) that
$$\hat{x} = \begin{bmatrix} 8/4 \\ 45/90 \end{bmatrix} = \begin{bmatrix} 2 \\ 1/2 \end{bmatrix}.$$
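
Because the columns of this particular $A$ are orthogonal, the weights in (5) already form the least-squares solution, and a general least-squares routine agrees. A sketch, not from the slides:

```python
import numpy as np

A = np.array([[1.0, -6.0],
              [1.0, -2.0],
              [1.0,  1.0],
              [1.0,  7.0]])
b = np.array([-1.0, 2.0, 1.0, 6.0])

a1, a2 = A[:, 0], A[:, 1]
print(a1 @ a2)                                   # 0: the columns are orthogonal

# Weights from formula (5): project b onto each column separately.
x_hat = np.array([(b @ a1) / (a1 @ a1),
                  (b @ a2) / (a2 @ a2)])         # [2, 0.5]

x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(x_hat, x_lstsq)                            # both [2.  0.5]
```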