Lecture 13: Orthogonal projections and least squares (Section 3.2-3.3) Thang Huynh, UC San Diego 2/9/2018
Orthogonal projection onto subspaces Theorem. Let W be a subspace of R n. Then, each x in R n can be uniquely written as x = x + x. in W in W 1
Orthogonal projection onto subspaces Theorem. Let W be a subspace of R n. Then, each x in R n can be uniquely written as x = x + x. in W in W x is the orthogonal projection of x onto W. 1
Orthogonal projection onto subspaces Theorem. Let W be a subspace of R n. Then, each x in R n can be uniquely written as x = x + x. in W in W x is the orthogonal projection of x onto W. x is the point in W closest to x. 1
Orthogonal projection onto subspaces Theorem. Let W be a subspace of R n. Then, each x in R n can be uniquely written as x = x + x. in W in W x is the orthogonal projection of x onto W. x is the point in W closest to x. If v 1,, v m is an orthogonal basis of W, then x = ( x v 1 ) v v 1 v 1 + + ( x v m ) v 1 v m v m. m 1
Orthogonal projection onto subspaces Theorem. Let W be a subspace of R n. Then, each x in R n can be uniquely written as x = x + x. in W in W x is the orthogonal projection of x onto W. x is the point in W closest to x. If v 1,, v m is an orthogonal basis of W, then Once x = ( x v 1 ) v v 1 v 1 + + ( x v m ) v 1 v m v m. m x is determined, x = x x. 1
Orthogonal projection onto subspaces Example. Let W = span { { 3 0 1, 0 1 0 } }, and x = 0 3 10. Find the orthogonal projection of x onto W. Write x as a vector in W plus a vector orthogonal to W. 2
Orthogonal projection onto subspaces Definition. Let v 1,, v m be an orthogonal basis of W, a subspace of R n. The projection map π W R n R n, given by π W (x) = ( x v 1 v 1 v 1 ) v 1 + + ( x v m v m v m ) v m is linear (why?). The matrix P representing π W with respect to the standard basis is the corresponding projection matrix. 3
Orthogonal projection onto subspaces Definition. Let v 1,, v m be an orthogonal basis of W, a subspace of R n. The projection map π W R n R n, given by π W (x) = ( x v 1 v 1 v 1 ) v 1 + + ( x v m v m v m ) v m is linear (why?). The matrix P representing π W with respect to the standard basis is the corresponding projection matrix. Example. Find the projection matrix P which corresponds to { 3 orthogonal projection onto W = span 0, 0 } 1 in R 3. Then { 1 0 } find the orthogonal projection of x = 0 3 onto W. 10 3
Least squares In practice, Ax b. 4
Least squares In practice, Ax b. Definition. Ax = b if x is a least squares solution of the system x is such that Ax b is as small as possible. 4
Least squares In practice, Ax b. Definition. x is a least squares solution of the system Ax = b if x is such that Ax b is as small as possible. If Ax = b is consistent, i.e. b is in C(A), then a least squares solution x is just an ordinary solution. 4
Least squares In practice, Ax b. Definition. x is a least squares solution of the system Ax = b if x is such that Ax b is as small as possible. If Ax = b is consistent, i.e. b is in C(A), then a least squares solution x is just an ordinary solution. Interesting case: Ax = b is inconsistent, i.e. b is NOT in C(A). 4
Least squares In practice, Ax b. Definition. x is a least squares solution of the system Ax = b if x is such that Ax b is as small as possible. If Ax = b is consistent, i.e. b is in C(A), then a least squares solution x is just an ordinary solution. Interesting case: Ax = b is inconsistent, i.e. b is NOT in C(A). What should we do to find x? 4
Least squares In practice, Ax b. Definition. x is a least squares solution of the system Ax = b if x is such that Ax b is as small as possible. If Ax = b is consistent, i.e. b is in C(A), then a least squares solution x is just an ordinary solution. Interesting case: Ax = b is inconsistent, i.e. b is NOT in C(A). What should we do to find x? replace b with its projection and solve Ax = b. b onto C(A) 4
The normal equations Theorem x is a least squares solution of Ax = b if and only if A T Ax = A T b (the normal equation). Proof. 5
The normal equations Example. Find the least squares solution to Ax = b, where A = 1 1 1 1, b = 2 1. 0 0 1 6
The normal equations Example. Find the least squares solution to Ax = b, where A = 1 1 1 1, b = 2 1. 0 0 1 Solution. A T A = 1 1 0 1 1 1 1 0 1 1 = 2 0 0 0 0 2 and A T b = 1 1 0 2 1 1 0 1 = 1. 1 3 6
The normal equations The normal equation A T Ax = A T b is 2 0 x = 1. 0 2 3 Solving it, we obtain x = 1/2. 3/2 7
The normal equations Example. Find the least squares solution to Ax = b, where A = 4 0 0 2, b = 2 0. 1 1 11 What is the projection of b onto C(A)? 8
The normal equations Example. Find the least squares solution to Ax = b, where A = 4 0 0 2, b = 2 0. 1 1 11 What is the projection of b onto C(A)? Solution. and A T A = 4 0 1 4 0 0 2 1 0 2 = 17 1 1 1 1 5 A T b = 4 0 1 2 0 2 1 0 = 19. 11 11 8
The normal equations The normal equation A T Ax = A T b is 17 1 x = 19. 1 5 11 Solving it, we obtain x = 1. 2 9
The normal equations The normal equation A T Ax = A T b is 17 1 x = 19. 1 5 11 Solving it, we obtain x = 1. 2 The projection of b onto C(A) is Ax = 4 0 0 2 1 = 4 1 1 2 4. 3 9
Why is Ax the projection of b onto C(A)? The projection b of b onto C(A) is b = Ax, with x such that A T Ax = A T b. 10
Why is Ax the projection of b onto C(A)? The projection b of b onto C(A) is b = Ax, with x such that A T Ax = A T b. If A has full column rank (columns of A linearly independent), then b = A(A T A) 1 A T b. The projection matrix for projecting onto C(A) is P = A(A T A) 1 A T. 10
Application: least squares lines Example. Find β 1, β 2 such that the line y = β 1 + β 2 x best fits the data points (2, 1), (5, 2), (7, 3), (8, 3). 11
Application: least squares lines Example. Find β 1, β 2 such that the line y = β 1 + β 2 x best fits the data points (2, 1), (5, 2), (7, 3), (8, 3). Solution. The equations y i = β 1 + β 2 x i in matrix form 1 x 1 1 x 2 1 x 3 1 x 4 design matrix X β 1 = β 2 y 1 y 2 y 3 y 4 observation vector y. 11
Application: least squares lines We need to find a least squares solution to Then 1 2 1 1 5 β 1 1 7 = 2. β 2 3 1 8 3 1 2 X T X = 1 1 1 1 1 5 = 2 5 7 8 1 7 4 22, 22 142 1 8 1 X T y = 1 1 1 1 2 = 2 5 7 8 3 9 57 3 β 1 = 2/7 β 2 5/14. 12