
MATH 22A: LINEAR ALGEBRA Chapter 4 Jesús De Loera, UC Davis November 30, 2012

Orthogonality and Least Squares Approximation

QUESTION: Suppose Ax = b has no solution!! What can we do then? Can we find an approximate solution?

Say a police officer wants to estimate the speed of a vehicle from measurements of its distance. At different times t_i we measure the distance b_i. If the vehicle travels at constant speed the relationship should be linear, but the measurements contain errors! We expect the output b to be a linear function of the input t, b = α + βt, and we need to determine α, β. At t = t_i, the error between the measured value b_i and the value predicted by the function is e_i = b_i - (α + βt_i). We can write this as e = b - Ax, where x = (α, β), e is the error vector, b is the data vector, and A is an m × 2 matrix whose i-th row is (1, t_i). We seek the line that minimizes the total squared error (squared Euclidean norm) ‖e‖^2 = Σ_{i=1}^m e_i^2.

KEY GOAL We are looking for the x that minimizes ‖b - Ax‖.

Clearly if b ∈ C(A) then we can make ‖b - Ax‖ = 0. Note that ‖b - Ax‖ is the distance from b to the point Ax, which is an element of the column space! At the minimum, the error vector e = b - Ax must be perpendicular to the column space of A. Thus for each column a_i we have a_i^T (b - Ax) = 0, or in matrix notation: A^T (b - Ax) = 0. This gives A^T A x = A^T b. We are going to study this system A LOT!!! PUNCH LINE If A has independent columns, then A^T A is square, symmetric and invertible.
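To make the recipe concrete, here is a minimal numpy sketch of the speed example above; the time and distance numbers are made-up illustration data, not measurements from the lecture.

```python
import numpy as np

# Hypothetical measurements: times t_i (seconds) and distances b_i (meters).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([2.1, 4.9, 8.2, 10.8, 14.1])

# Design matrix A with rows (1, t_i), so Ax models alpha + beta*t.
A = np.column_stack([np.ones_like(t), t])

# Normal equations  A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
alpha, beta = x
print("intercept alpha =", alpha, " speed beta =", beta)

# Same answer from the library's least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print("lstsq solution  =", x_lstsq)
```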

RECALL We say vectors x, y are perpendicular when they make a 90 degree angle. When that happens, the triangle they define is a right triangle! (WHY?) Lemma Two vectors x, y in R^n are perpendicular if and only if x_1 y_1 + ··· + x_n y_n = x^T y = 0. When this last equation holds we say x, y are orthogonal. Orthogonal Bases: A basis u_1, ..., u_n of V is orthogonal if ⟨u_i, u_j⟩ = 0 for all i ≠ j. Lemma If v_1, v_2, ..., v_k are orthogonal (and nonzero) then they are linearly independent.

The Orthogonality of the Subspaces Definition We say two subspaces V, W of R^n are orthogonal if for every u ∈ V and w ∈ W we have u^T w = 0. Can you see a way to detect when two subspaces are orthogonal?? Through their bases! Theorem: The row space and the nullspace are orthogonal. Similarly, the column space is orthogonal to the left nullspace. proof: If y ∈ N(A^T), then each entry of A^T y is the dot product of a column of A with y, and all of these entries are zero:

A^T y = [ (column 1 of A) · y ; ... ; (column n of A) · y ] = [ 0 ; ... ; 0 ].

So every column of A is perpendicular to every y ∈ N(A^T), which means C(A) is orthogonal to N(A^T). The same argument applied to Ax = 0 shows that every row of A is perpendicular to every x ∈ N(A), so the row space is orthogonal to the nullspace.
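A quick numerical sanity check of the theorem, using a small matrix whose nullspace vector is easy to find by hand (the matrix here is our own illustration, not one from the lecture):

```python
import numpy as np

# A small matrix chosen by hand for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 1.0]])

# x is in the nullspace of A: check A @ x = 0.
x = np.array([2.0, 5.0, -4.0])
print(A @ x)               # [0. 0.]

# Every row of A -- hence every vector in the row space C(A^T) -- is orthogonal to x.
print(A[0] @ x, A[1] @ x)  # 0.0 0.0
```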

There is a stronger relation: for a subspace V of R^n, the set of all vectors orthogonal to V is the orthogonal complement of V, denoted V^⊥. Warning Spaces can be orthogonal without being complements! Exercise Let W be a subspace; show that its orthogonal complement W^⊥ is a subspace and that W ∩ W^⊥ = {0}. Exercise If V ⊆ W are subspaces, then W^⊥ ⊆ V^⊥. Theorem (Fundamental theorem part II) C(A^T)^⊥ = N(A) and N(A)^⊥ = C(A^T). Why? proof: The first equation is easy: x is orthogonal to all vectors of the row space ⟺ x is orthogonal to each of the rows ⟺ x ∈ N(A). The other equality follows from the exercises. Corollary Given an m × n matrix A, the nullspace is the orthogonal complement of the row space in R^n. Similarly, the left nullspace is the orthogonal complement of the column space inside R^m. WHY is this such a big deal?

Theorem Given an m × n matrix A, every vector x in R^n can be written in a unique way as x_n + x_r, where x_n is in the nullspace and x_r is in the row space of A. proof Pick x_n to be the orthogonal projection of x onto N(A) and x_r the orthogonal projection onto C(A^T). Clearly x is the sum of the two, but why is the decomposition unique? If x_n + x_r = x_n' + x_r', then x_n - x_n' = x_r' - x_r. This vector lies in both N(A) and C(A^T), so it must be the zero vector, because N(A) is orthogonal to C(A^T). This has a beautiful consequence: Every matrix A, when we think of it as a linear map, transforms the row space onto its column space!!!
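A short numpy sketch of this splitting; it uses the pseudoinverse to project onto the row space, and the matrix and vector are our own illustration:

```python
import numpy as np

# Illustration matrix and vector (not from the lecture).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 1.0]])
x = np.array([1.0, 1.0, 1.0])

# pinv(A) @ A is the orthogonal projector onto the row space C(A^T).
P_row = np.linalg.pinv(A) @ A
x_r = P_row @ x          # row-space component
x_n = x - x_r            # nullspace component

print(np.allclose(A @ x_n, 0))       # True: x_n is in N(A)
print(np.allclose(x_r + x_n, x))     # True: the pieces add back to x
print(np.allclose(A @ x, A @ x_r))   # True: A only "sees" the row-space part
```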

An important picture

Projections onto Subspaces

QUESTION: Given a subspace S, what is the formula for the projection p of a vector b onto S? Key idea of LEAST SQUARES for regression analysis: think of b as data from experiments; b is not in S because of measurement error, and the projection p is the best choice to replace b. How do we do this projection for a line? Project b onto the line L spanned by the vector a (PICTURE!). The projection of the vector b onto the line in the direction of a is p = (a^T b / a^T a) a.
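A two-line numpy version of this formula (the vectors are chosen here just for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])        # direction of the line
b = np.array([3.0, 0.0, 3.0])        # vector to project (illustration data)

p = (a @ b / (a @ a)) * a            # p = (a^T b / a^T a) a
e = b - p                            # error vector

print(p)
print(np.isclose(a @ e, 0))          # True: the error is perpendicular to a
```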

GENERAL CASE Suppose the subspace S = C(A) is the column space of A, and b is a vector outside the column space!! We want the x that minimizes ‖b - Ax‖; then p = Ax is the projection of b onto C(A). Note that ‖b - Ax‖ is the distance from b to the point Ax, which is an element of the column space! The vector w = b - Ax must be perpendicular to the column space (picture!!!). For each column a_i we have a_i^T (b - Ax) = 0, so in matrix notation A^T (b - Ax) = 0. This gives the normal equation or least-squares equation: A^T A x = A^T b.

Theorem The solution x = (A^T A)^{-1} A^T b gives the coordinates of the projection p in terms of the columns of A. The projection of b onto C(A) is p = A (A^T A)^{-1} A^T b. Theorem The matrix P = A (A^T A)^{-1} A^T is a projection matrix. It has the properties P^T = P and P^2 = P. Lemma A^T A is a symmetric matrix, and A^T A has the same nullspace as A. Why? If x ∈ N(A), then clearly A^T A x = 0. Conversely, if A^T A x = 0 then x^T A^T A x = ‖Ax‖^2 = 0, thus Ax = 0. Corollary If A has independent columns, then A^T A is square, symmetric and invertible. Example 1 We wish to project the vector b = (2, 3, 4, 1) onto the subspace x + y = 0 of R^4. What is the distance?
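Here is a hedged numpy sketch of Example 1; the basis we pick for the subspace {v in R^4 : v_1 + v_2 = 0} is our own choice, and the printed projection and distance are what this computation gives (roughly (-0.5, 0.5, 4, 1) and 2.5·√2).

```python
import numpy as np

# Basis (our choice) for the subspace {v in R^4 : v1 + v2 = 0}.
A = np.array([[ 1.0, 0.0, 0.0],
              [-1.0, 0.0, 0.0],
              [ 0.0, 1.0, 0.0],
              [ 0.0, 0.0, 1.0]])
b = np.array([2.0, 3.0, 4.0, 1.0])

# Projection matrix P = A (A^T A)^{-1} A^T and the projection p = P b.
P = A @ np.linalg.inv(A.T @ A) @ A.T
p = P @ b
dist = np.linalg.norm(b - p)

print(p)        # approximately [-0.5  0.5  4.  1.]
print(dist)     # approximately 3.5355  (= 2.5 * sqrt(2))
print(np.allclose(P @ P, P), np.allclose(P.T, P))   # True True
```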

Example 2 Consider the problem Ax = b with

    A = [  1   2   0 ]        b^T = (1, 0, -1, 2, 2).
        [  3  -1   1 ]
        [ -1   2   1 ]
        [  1  -1  -2 ]
        [  2   1  -1 ]

We can see that there is no EXACT solution to Ax = b, so use the NORMAL EQUATION! Here

    A^T A = [ 16  -2  -2 ]        A^T b = [  8 ]
            [ -2  11   2 ]                [  0 ]
            [ -2   2   7 ]                [ -7 ]

Solving A^T A x = A^T b we get the least-squares solution x ≈ (0.4119, 0.2482, -0.9532)^T with error ‖b - Ax‖ ≈ 0.1799.
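The same computation in numpy, using the matrix as reconstructed above (the signs were garbled in the transcription, so treat them as our best reading; the printed numbers are what this code produces):

```python
import numpy as np

A = np.array([[ 1,  2,  0],
              [ 3, -1,  1],
              [-1,  2,  1],
              [ 1, -1, -2],
              [ 2,  1, -1]], dtype=float)
b = np.array([1.0, 0.0, -1.0, 2.0, 2.0])

# Normal equations A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)                          # approximately [ 0.4119  0.2482 -0.9532]
print(np.linalg.norm(b - A @ x))  # approximately 0.1799
```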

Example 3 A sample of lead-210 gave the following radioactivity measurements at the given times (time in days). Can YOU predict how long it will take until only one percent of the original amount remains?

    time in days:   0     4     8     10    14    18
    mg:             10    8.8   7.8   7.3   6.4   6.4

A linear model does not work here!! The material decays exponentially, m(t) = m_0 e^{βt}, where m_0 is the initial amount of radioactive material and β the (negative) decay rate. Taking logarithms, y(t) = log(m(t)) = log(m_0) + βt, so we can use ordinary least squares to fit the logarithms y_i = log(m_i) of the radioactive mass data.

In this case we have

    A^T = [ 1  1  1  1   1   1  ]
          [ 0  4  8  10  14  18 ]

and the data vector of logarithms is b^T = (2.30258, 2.17475, 2.05412, 1.98787, 1.85630, 1.85630). Thus

    A^T A = [  6   54 ]
            [ 54  700 ]

Solving the NORMAL system A^T A x = A^T b we get log(m_0) = 2.277327661 and β = -0.0265191683. The original amount was 10 mg; after about 173 days less than one percent of the original radioactive material remains.
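A numpy sketch of this fit, including the threshold computation (the 173-day figure compares the fitted curve against one percent of the measured 10 mg):

```python
import numpy as np

t = np.array([0.0, 4.0, 8.0, 10.0, 14.0, 18.0])   # days
m = np.array([10.0, 8.8, 7.8, 7.3, 6.4, 6.4])     # mg

# Fit log(m) = log(m0) + beta*t by least squares on the logarithms.
A = np.column_stack([np.ones_like(t), t])
y = np.log(m)
logm0, beta = np.linalg.solve(A.T @ A, A.T @ y)
print(logm0, beta)           # approximately 2.2773 and -0.02652

# Days until the fitted curve drops below 1% of the original 10 mg.
m0_fit = np.exp(logm0)
days = np.log(0.01 * 10.0 / m0_fit) / beta
print(days)                  # approximately 173
```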

Orthogonal Bases and Gram-Schmidt

Not all bases of a vector space are created equal! Some are better than others!! A basis u_1, ..., u_n of a vector space V is orthonormal if it is orthogonal and each vector has unit length. Observation If the vectors u_1, ..., u_n form an orthogonal basis, their normalizations u_i/‖u_i‖ form an orthonormal basis. Example Of course the standard unit vectors are orthonormal. Example The vectors (1, 2, 1), (0, 1, -2), (5, -2, -1) are an orthogonal basis of R^3.

Why do we care about orthonormal bases? Theorem Let u_1, ..., u_n be an orthonormal basis for an inner product space V. Then one can write any element v ∈ V as a linear combination v = c_1 u_1 + ··· + c_n u_n where c_i = ⟨v, u_i⟩ for i = 1, ..., n. Why? Example Let us rewrite the vector v = (1, 1, -1)^T in terms of the orthonormal basis u_1 = (1/√6, 2/√6, 1/√6)^T, u_2 = (0, 1/√5, -2/√5)^T, u_3 = (5/√30, -2/√30, -1/√30)^T. Computing the dot products: v^T u_1 = 2/√6, v^T u_2 = 3/√5, and v^T u_3 = 4/√30. Thus v = (2/√6) u_1 + (3/√5) u_2 + (4/√30) u_3. A key reason to like matrices that have orthonormal columns: the least-squares equations become even nicer!!!
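A quick numpy check of this expansion:

```python
import numpy as np

u1 = np.array([1, 2, 1]) / np.sqrt(6)
u2 = np.array([0, 1, -2]) / np.sqrt(5)
u3 = np.array([5, -2, -1]) / np.sqrt(30)
v  = np.array([1.0, 1.0, -1.0])

c = [v @ u for u in (u1, u2, u3)]     # coefficients c_i = v . u_i
print(c)                               # approximately [0.8165, 1.3416, 0.7303]
v_rebuilt = c[0]*u1 + c[1]*u2 + c[2]*u3
print(np.allclose(v_rebuilt, v))       # True
```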

Lemma If Q is a rectangular matrix with orthonormal columns, then the normal equations simplify because Q^T Q = I: Q^T Q x = Q^T b simplifies to x = Q^T b. The projection matrix also simplifies: Q (Q^T Q)^{-1} Q^T = Q I Q^T = Q Q^T. Thus the projection point is p = Q Q^T b, that is, p = (q_1^T b) q_1 + (q_2^T b) q_2 + ··· + (q_n^T b) q_n. So how do we compute orthogonal/orthonormal bases for a space?? We use the GRAM-SCHMIDT ALGORITHM.
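A small numpy illustration of the simplification, reusing the first two orthonormal vectors from the example above as the columns of Q (the vector b is illustration data):

```python
import numpy as np

# Q has orthonormal columns.
Q = np.column_stack([np.array([1, 2, 1]) / np.sqrt(6),
                     np.array([0, 1, -2]) / np.sqrt(5)])
b = np.array([3.0, 1.0, 2.0])

print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q^T Q = I
x = Q.T @ b                             # least-squares solution, no inverse needed
p = Q @ x                               # projection of b onto C(Q)
print(np.allclose(Q.T @ (b - p), 0))    # True: error is orthogonal to the columns
```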

So how do we compute orthogonal/orthonormal bases for a space?? We use the GRAM-SCHMIDT ALGORITHM.
Input: linearly independent vectors a_1, ..., a_n. Output: orthogonal vectors q_1, ..., q_n.
Step 1: q_1 = a_1
Step 2: q_2 = a_2 - (a_2^T q_1 / q_1^T q_1) q_1
Step 3: q_3 = a_3 - (a_3^T q_1 / q_1^T q_1) q_1 - (a_3^T q_2 / q_2^T q_2) q_2
Step 4: q_4 = a_4 - (a_4^T q_1 / q_1^T q_1) q_1 - (a_4^T q_2 / q_2^T q_2) q_2 - (a_4^T q_3 / q_3^T q_3) q_3
...
Step j: q_j = a_j - (a_j^T q_1 / q_1^T q_1) q_1 - (a_j^T q_2 / q_2^T q_2) q_2 - ··· - (a_j^T q_{j-1} / q_{j-1}^T q_{j-1}) q_{j-1}
At the end, NORMALIZE all the vectors if you wish to have unit vectors!! (DIVIDE BY LENGTH).

EXAMPLE Consider the subspace W spanned by (1, 2, 0, 1), (1, 0, 0, 1) and (1, 1, 0, 0). Find an orthonormal basis for the space W. ANSWER: (1/√6, 2/√6, 0, 1/√6), (1/√3, -1/√3, 0, 1/√3), (1/√2, 0, 0, -1/√2).
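Here is a minimal numpy sketch of the Gram-Schmidt steps above, run on the vectors of this example (the function name gram_schmidt is ours):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return orthonormal vectors spanning the same space (classical Gram-Schmidt)."""
    qs = []
    for a in vectors:
        q = a.astype(float)
        for prev in qs:                    # subtract the projections onto earlier q's
            q -= (a @ prev) * prev
        qs.append(q / np.linalg.norm(q))   # normalize at the end of each step
    return qs

a1 = np.array([1, 2, 0, 1])
a2 = np.array([1, 0, 0, 1])
a3 = np.array([1, 1, 0, 0])
for q in gram_schmidt([a1, a2, a3]):
    print(np.round(q, 4))
# roughly (1,2,0,1)/sqrt(6), (1,-1,0,1)/sqrt(3), (1,0,0,-1)/sqrt(2)
```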

In this way, the original basis vectors a_1, ..., a_n can be written in a triangular way in terms of the orthogonal vectors q_1, q_2, ..., q_n. Setting r_ij = a_j^T q_i:

a_1 = r_11 (q_1 / q_1^T q_1)
a_2 = r_12 (q_1 / q_1^T q_1) + r_22 (q_2 / q_2^T q_2)
a_3 = r_13 (q_1 / q_1^T q_1) + r_23 (q_2 / q_2^T q_2) + r_33 (q_3 / q_3^T q_3)
...
a_n = r_1n (q_1 / q_1^T q_1) + r_2n (q_2 / q_2^T q_2) + ··· + r_nn (q_n / q_n^T q_n)

Writing these equations in matrix form, with the q_i normalized to unit length, we obtain A = QR, where A = (a_1 ... a_n), Q = (q_1 q_2 ... q_n) and R = (r_ij).

Theorem (QR decomposition) Every m × n matrix A with independent columns can be factored as A = QR, where the columns of Q are orthonormal and R is upper triangular and invertible. NOTE: A and Q have the same column space. The simplest way to compute this decomposition:
1 Use Gram-Schmidt to get the orthonormal vectors q_i.
2 The matrix Q has columns q_1, ..., q_n.
3 The matrix R is filled with the dot products r_ij = a_j^T q_i.
NOTE: A matrix thus has two important decompositions, LU and QR. They are both useful, for different reasons!! One is good for solving equations, the other is good for least-squares.
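A numpy sketch of the factorization, using the library QR routine on the matrix whose columns are the vectors from the Gram-Schmidt example above (np.linalg.qr may flip the signs of some columns of Q and rows of R, which is still a valid QR factorization):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [0.0, 0.0, 0.0],
              [1.0, 1.0, 0.0]])   # columns are the a_i from the example above

# Library QR with Q having orthonormal columns ("reduced" factorization).
Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))            # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: columns of Q are orthonormal
print(np.round(R, 4))                   # upper triangular
```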