Chapter 6 - Orthogonality


Chapter 6 - Orthogonality
Maggie Myers and Robert A. van de Geijn
The University of Texas at Austin, Fall 2009
http://z.cs.utexas.edu/wiki/pla.wiki/

Orthogonal Vectors and Subspaces

Observations

Let $x, y \in \mathbb{R}^n$ be linearly independent. The subspace of all vectors $\alpha x + \beta y$, $\alpha, \beta \in \mathbb{R}$ (the space spanned by $x$ and $y$) forms a plane. All three vectors $x$, $y$, and $z = y - x$ lie in this plane, and they form a triangle.

[Figure: the triangle with sides $x$, $y$, and $z = y - x$.]

Orthogonality of two vectors

$x$ and $y$ are orthogonal/perpendicular if they meet at a right angle. The (Euclidean) length of $x$ is given by
$$\|x\|_2 = \sqrt{\chi_0^2 + \cdots + \chi_{n-1}^2} = \sqrt{x^T x}.$$
Pythagorean Theorem: the angle at which $x$ and $y$ meet is a right angle if and only if $\|z\|_2^2 = \|x\|_2^2 + \|y\|_2^2$, where $z = y - x$. In this case,
$$\|x\|_2^2 + \|y\|_2^2 = \|z\|_2^2 = \|y - x\|_2^2 = (y - x)^T (y - x) = (y^T - x^T)(y - x)$$
$$= (y^T - x^T)y - (y^T - x^T)x = \underbrace{y^T y}_{\|y\|_2^2} - (x^T y + y^T x) + \underbrace{x^T x}_{\|x\|_2^2} = \|x\|_2^2 - 2 x^T y + \|y\|_2^2.$$

Soooo...

Suppose $x$ and $y$ are orthogonal/perpendicular, i.e., they meet at a right angle. Then
$$\|x\|_2^2 + \|y\|_2^2 = \|x\|_2^2 - 2 x^T y + \|y\|_2^2.$$
Cancelling terms on the left and right of the equality, this implies that $x^T y = 0$.

Definition: Two vectors $x, y \in \mathbb{R}^n$ are said to be orthogonal if $x^T y = 0$.

Notation: Sometimes we will use the notation $x \perp y$ to indicate that $x$ is perpendicular to $y$.
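As a quick numeric sanity check (a sketch in numpy; the two vectors are made up for illustration), we can confirm that a zero inner product coincides with the Pythagorean relation:

```python
import numpy as np

# Two made-up vectors in R^3 with x^T y = 0.
x = np.array([1.0, 2.0, 0.0])
y = np.array([-2.0, 1.0, 5.0])

print(x @ y)  # 0.0, so x is perpendicular to y

# Pythagorean check: ||y - x||^2 == ||x||^2 + ||y||^2 exactly when x^T y = 0.
lhs = np.linalg.norm(y - x) ** 2
rhs = np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2
print(np.isclose(lhs, rhs))  # True
```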

Orthogonality of subspaces

Let $V, W \subseteq \mathbb{R}^n$ be subspaces. Then $V$ and $W$ are said to be orthogonal if $v \in V$ and $w \in W$ implies that $v^T w = 0$. In other words, two subspaces are orthogonal if all vectors in the first subspace are orthogonal to all vectors in the second subspace.

Notation: $V \perp W$ indicates that subspace $V$ is orthogonal to subspace $W$.

Example

Let
$$V = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\} \quad \text{and} \quad W = \operatorname{Span}\left\{ \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}.$$
Show that $V \perp W$.
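A brute-force numeric check of this example (a sketch; by linearity it suffices to test the basis vectors):

```python
import numpy as np

# Basis vectors of V and W from the example, stored as columns.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
W = np.array([[0.0],
              [0.0],
              [1.0]])

# V is orthogonal to W iff every column of V is orthogonal to every
# column of W, i.e., V^T W is the zero matrix.
print(V.T @ W)  # [[0.], [0.]]
```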

Definition: Given a subspace $V \subseteq \mathbb{R}^n$, the set of all vectors in $\mathbb{R}^n$ that are orthogonal to $V$ is denoted by $V^\perp$ (pronounced "V-perp").

Example: Let
$$V = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}.$$
What is $V^\perp$?

Exercise: Let $V, W \subseteq \mathbb{R}^n$ be subspaces. Show that if $V \perp W$ then $V \cap W = \{0\}$, the set containing only the zero vector.

Definition: Whenever $V \cap W = \{0\}$ we will sometimes call this a trivial intersection: trivial in the sense that it contains only the zero vector.

Exercise: Show that if $V \subseteq \mathbb{R}^n$ is a subspace, then $V^\perp$ is a subspace.

Definitions (Recap)

Let $A \in \mathbb{R}^{m \times n}$.

Column space of $A$, $C(A)$: the set of all vectors in $\mathbb{R}^m$ that can be written as $Ax$:
$$C(A) = \{ y \mid y = Ax \text{ for some } x \in \mathbb{R}^n \}.$$

Null space of $A$, $N(A)$: the set of all vectors in $\mathbb{R}^n$ that map to the zero vector:
$$N(A) = \{ x \mid Ax = 0 \}.$$

(New) The row space of $A$:
$$R(A) = \{ y \mid y = A^T x \text{ for some } x \in \mathbb{R}^m \}.$$

(New) The left null space of $A$:
$$\{ x \mid x^T A = 0 \} \;(= N(A^T)).$$
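All four subspaces can be computed numerically. Below is a numpy sketch (the matrix is made up for illustration) that extracts orthonormal bases for each of them from the singular value decomposition; the last line anticipates the theorem below that the row space and column space share the same dimension $r$:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
m, n = A.shape

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))  # numerical rank

col_space  = U[:, :r]     # orthonormal basis for C(A)    (in R^m)
left_null  = U[:, r:]     # orthonormal basis for N(A^T)  (in R^m)
row_space  = Vt[:r, :].T  # orthonormal basis for R(A)    (in R^n)
null_space = Vt[r:, :].T  # orthonormal basis for N(A)    (in R^n)

print(np.allclose(A @ null_space, 0))   # True: N(A) maps to zero
print(np.allclose(A.T @ left_null, 0))  # True: the left null space of A
print(row_space.shape[1] == col_space.shape[1] == r)  # True: both have dim r
```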

Exercise: Show that the left null space of a matrix $A \in \mathbb{R}^{m \times n}$ equals $N(A^T)$.

Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then the null space of $A$ is orthogonal to the row space of $A$: $R(A) \perp N(A)$.

Proof: Assume that $y \in R(A)$ and $x \in N(A)$. Then there exists a vector $z \in \mathbb{R}^m$ such that $y = A^T z$. (Why?) Now,
$$y^T x = (A^T z)^T x = (z^T A) x = z^T (A x) = z^T 0 = 0. \text{ (Why?)}$$

Theorem: The dimension of $R(A)$ equals the dimension of $C(A)$.

Proof: The dimension of the row space equals the number of linearly independent rows, which equals the number of nonzero rows that result from the Gauss-Jordan method, which equals the number of pivots that show up in that method, which equals the number of linearly independent columns.

Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then every $x \in \mathbb{R}^n$ can be written as $x = x_r + x_n$ where $x_r \in R(A)$ and $x_n \in N(A)$.

Proof: The dimension $r$ of $C(A)$ and the dimension of $N(A)$ add to the number of columns, $n$. (Why?) Thus, the dimension of $R(A)$ equals $r$ and the dimension of $N(A)$ equals $n - r$. Suppose some $x \in \mathbb{R}^n$ cannot be written as $x_r + x_n$ as indicated. Then consider the set of vectors that consists of the union of a basis of $R(A)$ and a basis of $N(A)$, plus the vector $x$. This set is linearly independent (the two bases together are linearly independent since $R(A) \perp N(A)$, and $x$ is not in their span) and has $n + 1$ vectors, contradicting the fact that $\mathbb{R}^n$ has dimension $n$.
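A numeric illustration of this splitting (a sketch; the matrix and the vector are made up): project $x$ onto the row space and verify that the remainder lies in the null space.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

# Orthonormal basis for the row space R(A), via the SVD.
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
R = Vt[:r, :].T

x = np.array([1.0, -1.0, 2.0])  # an arbitrary vector in R^n
x_r = R @ (R.T @ x)             # orthogonal projection of x onto R(A)
x_n = x - x_r                   # the remainder

print(np.allclose(x, x_r + x_n))  # True: x = x_r + x_n
print(np.allclose(A @ x_n, 0))    # True: x_n lies in N(A)
```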

Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then $A$ is a one-to-one, onto mapping from $R(A)$ to $C(A)$.

Proof: We first note that $A$ maps $R(A)$ into $C(A)$. This is trivial, since $A$ maps any vector $x \in \mathbb{R}^n$ into $C(A)$.

Uniqueness: We need to show that if $x, y \in R(A)$ and $Ax = Ay$ then $x = y$. Notice that $Ax = Ay$ implies that $A(x - y) = 0$, which means that $(x - y)$ is both in $R(A)$ (since it is a linear combination of $x$ and $y$, both of which are in $R(A)$) and in $N(A)$. Since we just showed that these two spaces are orthogonal, and orthogonal subspaces intersect only in the zero vector, we conclude that $x - y = 0$. Thus $x = y$.

Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then $A$ is a one-to-one, onto mapping from $R(A)$ to $C(A)$.

Proof (continued):

Onto: We need to show that for any $b \in C(A)$ there exists $x_r \in R(A)$ such that $A x_r = b$. Notice that if $b \in C(A)$, then there exists $x \in \mathbb{R}^n$ such that $Ax = b$. We know that $x = x_r + x_n$ where $x_r \in R(A)$ and $x_n \in N(A)$. Then
$$b = Ax = A(x_r + x_n) = A x_r + A x_n = A x_r.$$

Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then

- the left null space of $A$ is orthogonal to the column space of $A$; and
- the dimension of the left null space of $A$ equals $m - r$, where $r$ is the dimension of the column space of $A$.

Proof: This follows trivially by applying the previous theorems to $A^T$.

Summarizing...

[Figure: the four fundamental subspaces. In $\mathbb{R}^n$: the row space of $A$ (dimension $r$) and the nullspace of $A$ (dimension $n - r$). In $\mathbb{R}^m$: the column space of $A$ (dimension $r$) and the left nullspace of $A$ (dimension $m - r$). A vector $x = x_r + x_n$ with $x_r$ in the row space satisfies $A x_r = A x = b$, with $b$ in the column space.]

Motivating Example

Example

Let us consider the following set of points: $(\chi_0, \psi_0) = (1, 1.97)$, $(\chi_1, \psi_1) = (2, 6.97)$, $(\chi_2, \psi_2) = (3, 8.89)$, $(\chi_3, \psi_3) = (4, 10.01)$. Find a line that best fits these points.

[Figure: scatter plot of the four data points.]

[Figure: the four data points, with a line fit through them.]

The Problem

Find a line that interpolates these points as nearly as possible. "Near": the sum of the squares of the distances to the line is minimized.

Express this with matrices and vectors:
$$x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} \quad \text{and} \quad y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \psi_2 \\ \psi_3 \end{pmatrix} = \begin{pmatrix} 1.97 \\ 6.97 \\ 8.89 \\ 10.01 \end{pmatrix}.$$
The equation of the line we want is $y = \gamma_0 + \gamma_1 x$.

IF this line COULD go through all these points THEN
$$\begin{aligned} \psi_0 &= \gamma_0 + \gamma_1 \chi_0 \\ \psi_1 &= \gamma_0 + \gamma_1 \chi_1 \\ \psi_2 &= \gamma_0 + \gamma_1 \chi_2 \\ \psi_3 &= \gamma_0 + \gamma_1 \chi_3 \end{aligned} \quad \text{or, specifically,} \quad \begin{aligned} 1.97 &= \gamma_0 + \gamma_1 \\ 6.97 &= \gamma_0 + 2\gamma_1 \\ 8.89 &= \gamma_0 + 3\gamma_1 \\ 10.01 &= \gamma_0 + 4\gamma_1 \end{aligned}$$

In Matrix Notation...

We would like
$$\begin{pmatrix} \psi_0 \\ \psi_1 \\ \psi_2 \\ \psi_3 \end{pmatrix} = \begin{pmatrix} 1 & \chi_0 \\ 1 & \chi_1 \\ 1 & \chi_2 \\ 1 & \chi_3 \end{pmatrix} \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} \quad \text{or, specifically,} \quad \begin{pmatrix} 1.97 \\ 6.97 \\ 8.89 \\ 10.01 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix}.$$
Just looking at the graph, it is obvious that these points do not lie on the same line. How do we say that mathematically? Therefore all these equations cannot be simultaneously satisfied. Why? So, what do we do?
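One way to say it mathematically: $y$ is not in the column space of the coefficient matrix, so appending $y$ as an extra column raises the rank. A numpy sketch with the data from the example:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.97, 6.97, 8.89, 10.01])

A = np.column_stack([np.ones_like(x), x])  # rows are (1, chi_i)

print(np.linalg.matrix_rank(A))                        # 2
print(np.linalg.matrix_rank(np.column_stack([A, y])))  # 3: no exact solution
```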

How does it relate to column spaces?

For what right-hand sides could we have solved all four equations simultaneously? We would have had to choose $y$ so that $Ac = y$, where
$$A = \begin{pmatrix} 1 & \chi_0 \\ 1 & \chi_1 \\ 1 & \chi_2 \\ 1 & \chi_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{pmatrix} \quad \text{and} \quad c = \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix}.$$
This means that $y$ must be in the column space of $A$! It must be possible to express it as $y = \gamma_0 a_0 + \gamma_1 a_1$, where $A = ( a_0 \;\; a_1 )$! What does this mean if we relate this back to the picture? Only if $\{\psi_0, \ldots, \psi_3\}$ have the property that $\{(1, \psi_0), \ldots, (4, \psi_3)\}$ lie on a line can we find coefficients $\gamma_0$ and $\gamma_1$ such that $Ac = y$.

What are the fundamental questions?

- When does $Ax = b$ have a solution?
- When does $Ax = b$ have more than one solution? How do we characterize all these solutions?
- If $Ax = b$ does not have a solution, how do we find the best approximate solution?

How does this problem relate to orthogonality and projection?

The problem: $b$ does not lie in the column space of $A$. The question is: what vector $z$ that does lie in the column space should we pick, so that we can solve $Ac = z$ instead? Now, if $c$ solves $Ac = z$ exactly, then
$$z = ( a_0 \;\; a_1 ) \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} = \gamma_0 a_0 + \gamma_1 a_1.$$
Well, duh! $z$ is in the column space of $A$. What we want is $y = z + w$, where $w$ is as small (in length) as possible. This happens when $w$ is orthogonal to the column space! So $y = \gamma_0 a_0 + \gamma_1 a_1 + w$, with $a_0^T w = a_1^T w = 0$. The vector $z \in C(A)$ that is closest to $y$ is known as the projection of $y$ onto the column space of $A$. We need a way to compute this projection.

Solving a Linear Least-Squares Problem

Observations

The last problem motivated the following general problem: given $m$ equations in $n$ unknowns, we end up with a system $Ax = b$ where $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$, and $b \in \mathbb{R}^m$.

- This system of equations may have no solutions. This happens when $b$ is not in the column space of $A$.
- This system may have a unique solution. This happens when $b$ is in the column space of $A$ and $r = n$, where $r$ is the rank of the matrix (the dimension of the column space of $A$). It has a unique solution for every $b$ only when $r = m = n$; another way of saying this is that $A$ is square and nonsingular (it has an inverse).
- This system may have many solutions. This happens when $b$ is in the column space of $A$ and $r < n$ (the columns of $A$ are linearly dependent, so that the null space of $A$ is nontrivial).

Overdetermined systems

(Approximately) solve $Ax = b$ where $b$ is not in the column space of $A$. What we want is an approximate solution $\hat{x}$ such that $A\hat{x} = z$, where $z$ is the vector in the column space of $A$ that is closest to $b$. In other words, $b = z + w$ where $w^T v = 0$ for all $v \in C(A)$. From the figure of the four fundamental subspaces we conclude that this means $w$ is in the left null space of $A$. So, $A^T w = 0$. But that means
$$0 = A^T w = A^T (b - z) = A^T (b - A\hat{x}),$$
which we can rewrite as
$$A^T A \hat{x} = A^T b. \quad (1)$$
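For the line-fitting example, forming and solving the normal equations (1) takes a few lines of numpy (a sketch; in production code a QR-based solver is usually preferred for numerical stability):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.97, 6.97, 8.89, 10.01])
A = np.column_stack([np.ones_like(x), x])

# Normal equations: A^T A xhat = A^T b.
xhat = np.linalg.solve(A.T @ A, A.T @ b)
print(xhat)  # [0.45, 2.604]: intercept gamma_0 and slope gamma_1 of the fit
```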

Lemma: If $A \in \mathbb{R}^{m \times n}$ has linearly independent columns, then $A^T A$ is nonsingular (equivalently, it has an inverse, $A^T A \hat{x} = A^T b$ has a solution for all $b$, etc.).

Proof: Proof by contradiction. Assume that $A \in \mathbb{R}^{m \times n}$ has linearly independent columns and $A^T A$ is singular. Then there exists $x \neq 0$ such that $A^T A x = 0$. Hence, $y = Ax$ satisfies $A^T y = 0$. (Why?) This means $y$ is in the left null space of $A$. But $y$ is also in the column space of $A$, since $Ax = y$. Thus, $y = 0$, since the intersection of the column space of $A$ and the left null space of $A$ contains only the zero vector. But then $Ax = 0$ with $x \neq 0$, which contradicts the assumption that $A$ has linearly independent columns.

What does this mean?

If $A$ has linearly independent columns, then:

- The desired $\hat{x}$ that is the best solution to $Ax = b$ is given by $\hat{x} = (A^T A)^{-1} A^T b$.
- The vector $z \in C(A)$ closest to $b$ is given by $z = A\hat{x} = A (A^T A)^{-1} A^T b$.

Thus $z = A (A^T A)^{-1} A^T b$ is the vector in the column space closest to $b$. The matrix $A (A^T A)^{-1} A^T$ projects a vector onto the column space of $A$.
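The projector in action, continuing the example (a sketch): $P = A (A^T A)^{-1} A^T$ sends $b$ to the closest vector $z$ in $C(A)$, and projecting twice changes nothing.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.97, 6.97, 8.89, 10.01])
A = np.column_stack([np.ones_like(x), x])

P = A @ np.linalg.inv(A.T @ A) @ A.T  # projects onto C(A)
z = P @ b                             # the vector in C(A) closest to b

print(np.allclose(P @ P, P))          # True: P is idempotent
print(np.allclose(A.T @ (b - z), 0))  # True: the residual is orthogonal to C(A)
```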

Theorem: Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, and assume that $A$ has linearly independent columns. Then the vector $\hat{x}$ that minimizes $\|b - Ax\|_2$ over all $x \in \mathbb{R}^n$ is given by $\hat{x} = (A^T A)^{-1} A^T b$.
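In practice one rarely forms $(A^T A)^{-1}$ explicitly; library least-squares routines minimize the same quantity. A sketch using numpy's built-in solver, which should reproduce the normal-equations answer above:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.97, 6.97, 8.89, 10.01])
A = np.column_stack([np.ones_like(x), x])

# Minimizes ||b - A x||_2 directly (SVD-based under the hood).
xhat, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(xhat)  # same [gamma_0, gamma_1] as solving A^T A xhat = A^T b
```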

Definition: Let $A \in \mathbb{R}^{m \times n}$.

- If $A$ has linearly independent columns, then $(A^T A)^{-1} A^T$ is called the (left) pseudo inverse. Note that this means $m \geq n$ and $(A^T A)^{-1} A^T A = I$.
- If $A$ has linearly independent rows, then $A^T (A A^T)^{-1}$ is called the right pseudo inverse. Note that this means $m \leq n$ and $A A^T (A A^T)^{-1} = I$.
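A sketch checking the left pseudo inverse on the example matrix; numpy's np.linalg.pinv computes a pseudo inverse via the SVD, which coincides with $(A^T A)^{-1} A^T$ when the columns are linearly independent:

```python
import numpy as np

A = np.column_stack([np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])])

A_left = np.linalg.inv(A.T @ A) @ A.T          # left pseudo inverse (m >= n)
print(np.allclose(A_left @ A, np.eye(2)))      # True: inverts A from the left
print(np.allclose(A_left, np.linalg.pinv(A)))  # True: matches numpy's pinv
```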

Why Least-Squares?

Notice that we are trying to find $\hat{x}$ that minimizes $\|b - Ax\|_2$. If $\hat{x}$ minimizes $\|b - Ax\|_2$, it also minimizes the function
$$F(x) = \|b - Ax\|_2^2 = (b - Ax)^T (b - Ax) = b^T b - 2 b^T A x + x^T A^T A x.$$
Recall how one would find the minimum of a function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \alpha^2 x^2 - 2\beta\alpha x + \beta^2$: take the derivative and set it to zero. Here $F : \mathbb{R}^n \to \mathbb{R}$. Compute the gradient (essentially the derivative) and set it to zero:
$$\nabla F(x) = -2 A^T b + 2 A^T A x = 0, \quad \text{or} \quad A^T A x = A^T b.$$
We are looking for $\hat{x}$ that solves $A^T A x = A^T b$.
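The gradient formula can be sanity-checked against finite differences (a sketch; the test point is arbitrary, and close agreement is expected since $F$ is quadratic):

```python
import numpy as np

A = np.column_stack([np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])])
b = np.array([1.97, 6.97, 8.89, 10.01])

def F(x):
    """F(x) = ||b - Ax||_2^2."""
    r = b - A @ x
    return r @ r

x0 = np.array([0.3, 1.5])              # arbitrary test point
grad = 2 * A.T @ A @ x0 - 2 * A.T @ b  # the gradient formula from the slide

# Central finite-difference approximation of the gradient.
eps = 1e-6
fd = np.array([(F(x0 + eps * e) - F(x0 - eps * e)) / (2 * eps)
               for e in np.eye(2)])
print(np.allclose(grad, fd, atol=1e-4))  # True
```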