Recall: Dot product on R²: u·v = (u₁, u₂)·(v₁, v₂) = u₁v₁ + u₂v₂, u·u = u₁² + u₂² = ‖u‖². Geometric Meaning: u·v = ‖u‖‖v‖ cos θ, where θ is the angle between u and v.

Reason: The side opposite to θ is given by u − v.
‖u − v‖² = (u − v)·(u − v) = u·u − v·u − u·v + v·v = ‖u‖² + ‖v‖² − 2u·v.
By the Cosine Law, c² = a² + b² − 2ab cos θ, i.e. ‖u − v‖² = ‖u‖² + ‖v‖² − 2‖u‖‖v‖ cos θ.
Comparing the two equalities, we get: u·v = ‖u‖‖v‖ cos θ.

Inner Product: a generalization of the dot product. Direct generalization to Rⁿ: u·v := u₁v₁ + ... + uₙvₙ = Σᵢ₌₁ⁿ uᵢvᵢ. Using matrix notation: Σᵢ₌₁ⁿ uᵢvᵢ = [u₁ ... uₙ](v₁, ..., vₙ)ᵀ = uᵀv = vᵀu. This is called the (standard) inner product on Rⁿ.
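As a quick illustration (a minimal NumPy sketch, not part of the original notes; the sample vectors are arbitrary), the standard inner product on Rⁿ can be computed either entry-by-entry or as the matrix product uᵀv:

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 2.0])

# Entry-by-entry definition: sum of u_i * v_i
ip_sum = np.sum(u * v)

# Matrix-notation form u^T v (the same number for real vectors)
ip_matrix = u @ v

print(ip_sum, ip_matrix)   # both give 1*4 + 2*(-1) + 3*2 = 8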

Thm 1 (P.359): Let u, v, w ∈ Rⁿ and c ∈ R. Then:
(i) u·v = v·u;
(ii) (u + v)·w = u·w + v·w;
(iii) (cu)·v = c(u·v);
(iv) u·u ≥ 0, and u·u = 0 iff u = 0.
Note: (iv) is sometimes called the positive-definite property. A general inner product is defined using the above 4 properties. For a complex inner product, we need to add a complex conjugate to (i).

Def: The length (or norm) of v is defined as: ‖v‖ := √(v·v). Vectors with ‖v‖ = 1 are called unit vectors.
Def: The distance between u and v is defined as: dist(u, v) := ‖u − v‖.
Def: The angle between u and v is defined as: ∠(u, v) := cos⁻¹( (u·v) / (‖u‖‖v‖) ).
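The three definitions above translate directly into code. A small NumPy sketch (illustrative only; the vectors are arbitrary sample values):

import numpy as np

u = np.array([3.0, 4.0])
v = np.array([4.0, -3.0])

norm_u = np.sqrt(u @ u)                  # length ||u|| = sqrt(u . u) = 5
dist_uv = np.linalg.norm(u - v)          # distance ||u - v||
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
angle = np.arccos(cos_theta)             # angle between u and v, in radians

print(norm_u, dist_uv, angle)            # here u . v = 0, so the angle is pi/2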

Extra: The General Inner Product Space. Let V be a vector space over R or C.
Def: An inner product on V is a real/complex-valued function of two vector variables ⟨u, v⟩ such that:
(a) ⟨u, v⟩ = conj(⟨v, u⟩); (conjugate symmetric)
(b) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩;
(c) ⟨cu, v⟩ = c⟨u, v⟩; (linear in the first vector variable)
(d) ⟨u, u⟩ ≥ 0, and ⟨u, u⟩ = 0 iff u = 0. (positive-definite property)

Def: A real/complex vector space V equipped with an inner product is called an inner product space.
Note: (i) An inner product is conjugate linear in the second vector variable: ⟨u, cv₁ + dv₂⟩ = conj(c)⟨u, v₁⟩ + conj(d)⟨u, v₂⟩.
(ii) If we replace (a) by ⟨u, v⟩ = ⟨v, u⟩, consider: ⟨iu, iu⟩ = i²⟨u, u⟩ = −⟨u, u⟩, which is incompatible with (d). When working with a complex inner product space, we must take the complex conjugate when interchanging u and v.

Examples of (general) inner product spaces:
1. The dot product on Cⁿ (* : conjugate transpose): ⟨u, v⟩ := u₁ conj(v₁) + ... + uₙ conj(vₙ) = v*u.
2. A non-standard inner product on R²: ⟨u, v⟩ := u₁v₁ − u₁v₂ − u₂v₁ + 2u₂v₂ = vᵀ [1 −1; −1 2] u.
3. An inner product on the matrix space M_{m×n}: ⟨A, B⟩ := tr(B*A) = Σⱼ₌₁ᵐ Σₖ₌₁ⁿ a_{jk} conj(b_{jk}).
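For example 2, the axioms can be spot-checked numerically. A hedged NumPy sketch (the vectors u, v are arbitrary sample values chosen for the check):

import numpy as np

# Non-standard inner product on R^2: <u, v> = v^T M u with M = [[1, -1], [-1, 2]]
M = np.array([[1.0, -1.0],
              [-1.0, 2.0]])

def ip(u, v):
    # v^T M u = u1*v1 - u1*v2 - u2*v1 + 2*u2*v2 for real vectors
    return v @ M @ u

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

print(np.isclose(ip(u, v), ip(v, u)))   # symmetry
print(ip(u, u) > 0, ip(v, v) > 0)       # positive for these non-zero vectors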

4. Consider the vector space V of continuous real/complex-valued functions defined on the interval [a, b]. Then the following is an inner product on V:
⟨f, g⟩ := (1/(b − a)) ∫ₐᵇ f(t) conj(g(t)) dt.
[In the real case, the norm ‖f‖ gives the root-mean-square (r.m.s.) value of f over the interval [a, b].]

Schwarz's inequality: (a₁b₁ + ... + aₙbₙ)² ≤ (a₁² + ... + aₙ²)(b₁² + ... + bₙ²).
Pf: The following equation in x cannot have two distinct real solutions:
(a₁x + b₁)² + ... + (aₙx + bₙ)² = 0
⇔ (a₁² + ... + aₙ²)x² + 2(a₁b₁ + ... + aₙbₙ)x + (b₁² + ... + bₙ²) = 0.
So the discriminant Δ ≤ 0, and this gives the inequality.

The Cauchy-Schwarz Inequality: |u·v| ≤ ‖u‖‖v‖, and equality holds if, and only if, {u, v} is l.d.
Proof: When u ≠ 0, set û = (1/‖u‖)u. Consider w = v − (v·û)û.
[Picture: v decomposed into the component (v·û)û along u and the component w orthogonal to u.]
Obviously, w·w = ‖w‖² ≥ 0 ⇒ Cauchy-Schwarz Inequality.

Set k = v·û:
0 ≤ (v − kû)·(v − kû) = v·v − 2k(v·û) + k²(û·û) = ‖v‖² − k².
Note that k = (v·u)/‖u‖, so:
k² = (v·u)²/‖u‖² ≤ ‖v‖² ⇒ (u·v)² ≤ ‖u‖²‖v‖².
Taking positive square roots, we obtain the result.

Thm (Triangle Inequality): For u, v ∈ Rⁿ: ‖u + v‖ ≤ ‖u‖ + ‖v‖, and equality holds iff one of the vectors is a non-negative scalar multiple of the other.
Proof: Consider ‖u + v‖².
(u + v)·(u + v) = ‖u‖² + 2(u·v) + ‖v‖² ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)².
Taking square roots, we obtain the inequality.
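Both inequalities are easy to spot-check numerically; a minimal NumPy sketch (random sample vectors, illustrative only):

import numpy as np

rng = np.random.default_rng(0)

for _ in range(5):
    u = rng.normal(size=4)
    v = rng.normal(size=4)
    # Cauchy-Schwarz: |u . v| <= ||u|| ||v||
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    # Triangle inequality: ||u + v|| <= ||u|| + ||v||
    assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

print("both inequalities hold on all samples")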

Orthogonality: Pythagoras' Theorem in vector form: ‖u + v‖² = ‖u‖² + ‖v‖².
[Picture: right triangle with legs u, v and hypotenuse u + v.]
But in general we have: ‖u + v‖² = ‖u‖² + 2(u·v) + ‖v‖², so we need u·v = 0.

Def: Let u, v be two vectors in Rⁿ. When u·v = 0, we say that u is orthogonal to v, denoted by u ⊥ v. This generalizes the concept of perpendicularity. 0 is the only vector that is orthogonal to every vector v in Rⁿ.
Example: In R², we have: (3, 4)ᵀ ⊥ (4, −3)ᵀ.
Thm 2 (P.362): u and v are orthogonal iff ‖u + v‖² = ‖u‖² + ‖v‖².

Common Orthogonality:
Def: Let S be a set of vectors in Rⁿ. If u is orthogonal to every vector in S, we say that u is orthogonal to S, denoted by u ⊥ S. I.e. we can regard u as a common perpendicular to S.
Examples: (i) 0 ⊥ Rⁿ. (ii) In R², let S = x-axis. Then e₂ ⊥ S. (iii) In R³, let S = x-axis. Then both e₂ ⊥ S and e₃ ⊥ S.
Exercise: Let u, v ⊥ S. Show that: (i) (au + bv) ⊥ S for any numbers a, b; (ii) u ⊥ Span S. ***

Orthogonal Complement:
Def: Let S be a set of vectors in Rⁿ. We define: S^⊥ := {u ∈ Rⁿ : u ⊥ S}, called the orthogonal complement of S in Rⁿ. I.e. S^⊥ collects all the common perpendiculars to S.
Examples: (i) {0}^⊥ = Rⁿ, (Rⁿ)^⊥ = {0}. (ii) In R², let S = x-axis. Then S^⊥ = y-axis. (iii) In R³, take S = {e₁}. Then S^⊥ = yz-plane.

Thm: S^⊥ is always a subspace of Rⁿ.
Checking: (i) 0 ⊥ v for every v ∈ S, so 0 ∈ S^⊥. (ii) Pick any u₁, u₂ ∈ S^⊥. For any scalars a, b ∈ R, consider: (au₁ + bu₂)·v = a(u₁·v) + b(u₂·v) = a·0 + b·0 = 0 whenever v ∈ S. So au₁ + bu₂ ∈ S^⊥ (cf. previous exercise).
Note: S itself need not be a subspace.
Thm: (a) S^⊥ = (Span S)^⊥. (b) Span S ⊆ (S^⊥)^⊥.
Pf: (a) S^⊥ ⊇ (Span S)^⊥ is easy to see, since any vector u ⊥ Span S must also satisfy u ⊥ S.

Now, pick any u ∈ S^⊥. For every v ∈ Span S, write: v = c₁v₁ + ... + c_p v_p, with vᵢ ∈ S, i = 1, ..., p. Then since u ⊥ S: u·v = c₁(u·v₁) + ... + c_p(u·v_p) = 0, and hence u ∈ (Span S)^⊥, so S^⊥ ⊆ (Span S)^⊥ is proved.
(b) Pick a vector w ∈ Span S; we have the l.c.: w = c₁v₁ + ... + c_p v_p. For any u ∈ S^⊥: w·u = c₁(v₁·u) + ... + c_p(v_p·u) = 0 ⇒ w ∈ (S^⊥)^⊥.

Thm 3 (P.363): Let A be an m × n matrix. Then: (Row A)^⊥ = Nul A and (Col A)^⊥ = Nul Aᵀ.
Pf: Writing the rows of A as r₁ᵀ, ..., r_mᵀ, the product Ax can be rewritten as: Ax = (r₁ᵀx, ..., r_mᵀx)ᵀ. So x ∈ Nul A ⇔ x ⊥ {r₁, ..., r_m} ⇔ x ⊥ Row A. Hence (Row A)^⊥ = Nul A. Applying the result to Aᵀ, we obtain: (Col A)^⊥ = (Row Aᵀ)^⊥ = Nul Aᵀ.
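A numerical illustration of (Row A)^⊥ = Nul A (a NumPy sketch; the matrix A below is an arbitrary example, and the null space basis is taken from the SVD):

import numpy as np

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0]])

# Basis of Nul A: right singular vectors belonging to (near-)zero singular values.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:].T           # columns span Nul A

# Every row of A is orthogonal to every null-space vector.
print(np.allclose(A @ null_basis, 0.0))   # True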

Orthogonal sets and Orthonormal sets
Def: A set S is called orthogonal if any two vectors in S are always orthogonal to each other.
Def: A set S is called orthonormal if (i) S is orthogonal, and (ii) each vector in S is of unit length.
Example: Orthonormal set: { (1/√11)(3, 1, 1)ᵀ, (1/√6)(−1, 2, 1)ᵀ, (1/√66)(−1, −4, 7)ᵀ }.

Thm 4 (P.366): An orthogonal set S of non-zero vectors is always linearly independent.
Pf: Let S = {u₁, u₂, ..., u_p} and consider the relation: c₁u₁ + c₂u₂ + ... + c_p u_p = 0. Take the inner product with u₁; then:
c₁(u₁·u₁) + c₂(u₂·u₁) + ... + c_p(u_p·u₁) = 0·u₁ ⇒ c₁‖u₁‖² + c₂·0 + ... + c_p·0 = 0.
As u₁ ≠ 0, we must have c₁ = 0. Similarly for the other cᵢ. So S must be l.i.

The method of proof of the previous Thm 4 gives:
Thm 5 (P.367): Let S = {u₁, ..., u_p} be an orthogonal set of non-zero vectors and let v ∈ Span S. Then:
v = [(v·u₁)/‖u₁‖²] u₁ + ... + [(v·u_p)/‖u_p‖²] u_p.
Pf: Let c₁, ..., c_p be such that v = c₁u₁ + ... + c_p u_p. Taking the inner product with u₁, we have: v·u₁ = c₁(u₁·u₁) + ... + c_p(u_p·u₁) = c₁‖u₁‖². So c₁ = (v·u₁)/‖u₁‖². Similarly for the other cᵢ.

Thm 5′: Let S = {û₁, ..., û_p} be an orthonormal set. Then for any v ∈ Span S, we have: v = (v·û₁)û₁ + ... + (v·û_p)û_p.
Remark: This generalizes our familiar expression in R³: v = (v·i)i + (v·j)j + (v·k)k.
Example: Express v as a l.c. of the vectors in S:
v = (1, 2, 3)ᵀ, S = { (3, 1, 1)ᵀ, (−1, 2, 1)ᵀ, (−1, −4, 7)ᵀ }.

New method: Compute c₁, c₂, c₃ directly:
c₁ = [(1, 2, 3)·(3, 1, 1)] / ‖(3, 1, 1)‖², c₂ = [(1, 2, 3)·(−1, 2, 1)] / ‖(−1, 2, 1)‖², c₃ = [(1, 2, 3)·(−1, −4, 7)] / ‖(−1, −4, 7)‖²,
giving: c₁ = 8/11, c₂ = 6/6 = 1, c₃ = 12/66 = 2/11.
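The same computation in code, as a check (a NumPy sketch using the vectors of this example):

import numpy as np

v = np.array([1.0, 2.0, 3.0])
S = [np.array([3.0, 1.0, 1.0]),
     np.array([-1.0, 2.0, 1.0]),
     np.array([-1.0, -4.0, 7.0])]

# Coefficient along each orthogonal basis vector: (v . u) / ||u||^2
coeffs = [(v @ u) / (u @ u) for u in S]
print(coeffs)                                  # [8/11, 1, 2/11]

# Rebuild v from the coefficients to confirm v is in Span S
v_rebuilt = sum(c * u for c, u in zip(coeffs, S))
print(np.allclose(v, v_rebuilt))               # True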

Exercise: Determine if v ∈ Span {u₁, u₂}, where v = (3, 2, 5)ᵀ, u₁ = (1, 2, 2)ᵀ, u₂ = (2, 2, 1)ᵀ. ***

Orthogonal basis and Orthonormal basis
Def: A basis for a subspace W is called an orthogonal basis if it is an orthogonal set.
Def: A basis for a subspace W is called an orthonormal basis if it is an orthonormal set.
Examples: (i) {e₁, ..., eₙ} is an orthonormal basis for Rⁿ.
(ii) S = { (3, 4)ᵀ, (4, −3)ᵀ } is an orthogonal basis for R², and S′ = { (3/5, 4/5)ᵀ, (4/5, −3/5)ᵀ } is an orthonormal basis for R².

(iii) The following set S: S = { (3, 1, 1)ᵀ, (−1, 2, 1)ᵀ, (−1, −4, 7)ᵀ } is an orthogonal basis for R³.
(iv) The columns of an n × n orthogonal matrix A form an orthonormal basis for Rⁿ. (Orthogonal matrix: a square matrix with AᵀA = Iₙ.)

Checking: Write A = [v₁ ... vₙ]. The (i, j)-th entry of AᵀA is vᵢᵀvⱼ = vᵢ·vⱼ, while the (i, j)-th entry of Iₙ is 1 if i = j and 0 if i ≠ j.
The above checking also works for non-square matrices:
Thm 6 (P.371): The n columns of an m × n matrix U are orthonormal iff UᵀU = Iₙ.
But for square matrices, AB = I ⇒ BA = I. So, moreover, the rows of an n × n orthogonal matrix A (written in column form) also form an orthonormal basis for Rⁿ.
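Thm 6 can be checked directly on the orthonormal set from the earlier example (a NumPy sketch, illustrative only):

import numpy as np

# Columns: (1/sqrt(11))(3,1,1), (1/sqrt(6))(-1,2,1), (1/sqrt(66))(-1,-4,7)
U = np.column_stack([
    np.array([3.0, 1.0, 1.0]) / np.sqrt(11),
    np.array([-1.0, 2.0, 1.0]) / np.sqrt(6),
    np.array([-1.0, -4.0, 7.0]) / np.sqrt(66),
])

print(np.allclose(U.T @ U, np.eye(3)))   # True: orthonormal columns (Thm 6)
print(np.allclose(U @ U.T, np.eye(3)))   # also True here, since U is square (an orthogonal matrix)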

Matrices having orthonormal columns are very special:
Thm 7 (P.371): Let T : Rⁿ → Rᵐ be a linear transformation given by an m × n standard matrix U with orthonormal columns. Then for any x, y ∈ Rⁿ:
a. ‖Ux‖ = ‖x‖ (preserving length)
b. (Ux)·(Uy) = x·y (preserving inner product)
c. (Ux)·(Uy) = 0 iff x·y = 0 (preserving orthogonality)
Pf: Direct verifications using UᵀU = Iₙ. The results are not true for columns that are merely orthogonal.

Recall: Let S = {u₁, ..., u_p} be orthogonal. When v ∈ W = Span S, we have: v = [(v·u₁)/‖u₁‖²] u₁ + ... + [(v·u_p)/‖u_p‖²] u_p.
What happens if v ∉ W? Then LHS ≠ RHS, as the RHS is always a vector in W. But v′ = RHS is still computable. What is the relation between v and v′?

LHS = v, RHS = v′ = Σᵢ₌₁ᵖ [(v·uᵢ)/‖uᵢ‖²] uᵢ. Take the inner product of the RHS with uⱼ:
v′·uⱼ = ( Σᵢ₌₁ᵖ [(v·uᵢ)/‖uᵢ‖²] uᵢ )·uⱼ = Σᵢ₌₁ᵖ [(v·uᵢ)/‖uᵢ‖²] (uᵢ·uⱼ) = [(v·uⱼ)/‖uⱼ‖²] (uⱼ·uⱼ) = v·uⱼ,
which is the same as LHS·uⱼ.

In other words, (v − v′)·uⱼ = 0 for j = 1, ..., p.
Thm: The vector z = v − v′ is orthogonal to every vector in Span S, i.e. z ∈ (Span S)^⊥.
[Picture: v, its projection v′ lying in W, and z = v − v′ perpendicular to W.]

Def: Let {u₁, ..., u_p} be an orthogonal basis for W. For each v in Rⁿ, the following vector in W:
proj_W v := [(v·u₁)/‖u₁‖²] u₁ + ... + [(v·u_p)/‖u_p‖²] u_p,
is called the orthogonal projection of v onto W.
Remark: {u₁, ..., u_p} must be orthogonal, otherwise the RHS will not give the correct projection.
Note: v = proj_W v ⇔ v ∈ W.
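The definition turns into a one-line function. A NumPy sketch (the orthogonal pair u₁, u₂ and the vector v are sample values chosen for illustration):

import numpy as np

def proj(v, orthogonal_basis):
    # Orthogonal projection of v onto Span(orthogonal_basis);
    # the basis vectors must be mutually orthogonal and non-zero.
    return sum(((v @ u) / (u @ u)) * u for u in orthogonal_basis)

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 1.0])     # u1 . u2 = 0
v = np.array([1.0, 2.0, 3.0])

p = proj(v, [u1, u2])
print(p)                                              # proj_W v
print(np.isclose((v - p) @ u1, 0.0),
      np.isclose((v - p) @ u2, 0.0))                  # v - proj_W v is orthogonal to W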

Example: In R³, consider S = {e₁, e₂}. Then W = Span S is the xy-plane. For any vector v = (x, y, z)ᵀ ∈ R³, the components [(v·e₁)/‖e₁‖²]e₁ and [(v·e₂)/‖e₂‖²]e₂ give:
proj_W v = (x, y, 0)ᵀ.

Exercise: Consider R³ and W = Span {u₁, u₂}. Find proj_W v:
v = (1, 0, 1)ᵀ, u₁ = (2, 2, −1)ᵀ, u₂ = (2, −1, 2)ᵀ. ***

Def: The decomposition: v = proj_W v + (v − proj_W v), with (v − proj_W v) ∈ W^⊥, is called the orthogonal decomposition of v w.r.t. W.
[Picture: v as the sum of w = proj_W v in W and z = v − proj_W v perpendicular to W.]
Thm 8 (P.376): The orthogonal decomposition w.r.t. W is the unique way to write v = w + z with w ∈ W and z ∈ W^⊥.

Exercise: Find the orthogonal projection of v onto W = Nul A, where A = [1 1 1 1] and v = (1, 2, 3, 4)ᵀ. ***
Thm 9 (P.378): Let v ∈ Rⁿ and let w ∈ W. Then we have: ‖v − proj_W v‖ ≤ ‖v − w‖, and equality holds only when w = proj_W v.

Pf: We can rewrite v − w as: v − w = (v − proj_W v) + (proj_W v − w).
[Picture: right triangle with legs v − proj_W v and proj_W v − w, and hypotenuse v − w.]
We can then apply Pythagoras' Theorem to this right-angled triangle.

‖v − w‖² = ‖v − proj_W v‖² + ‖proj_W v − w‖² ≥ ‖v − proj_W v‖², and equality holds iff proj_W v − w = 0 iff w = proj_W v.
Because of the inequality ‖v − proj_W v‖ ≤ ‖v − w‖, proj_W v is sometimes called the best approximation of v by vectors in W.

Def: The distance of v to W is defined as: dist(v, W) := ‖v − proj_W v‖. Obviously, v ∈ W iff dist(v, W) = 0.
Exercise: Let W = Span {u₁, u₂, u₃} for the given vectors u₁, u₂, u₃, and find dist(v, W) for the given v.
Sol: Remember to check that {u₁, u₂, u₃} is orthogonal. ***

Extension of an Orthogonal Set
Let S = {u₁, ..., u_p} be an orthogonal basis for W = Span S. When W ≠ Rⁿ, we can find a vector v ∉ W, and then: z = v − proj_W v ≠ 0. This vector z is in W^⊥, i.e. it satisfies: z·w = 0 for every w ∈ W. Hence the following set is again orthogonal: S ∪ {z} = {u₁, ..., u_p, z}.

Thm: Span(S ∪ {v}) = Span(S ∪ {z}). In other words, we can extend an orthogonal set S by adding the vector z.
S₁ = {u₁} orthogonal; take v₂ ∉ Span S₁, then compute z₂. S₂ = {u₁, z₂} is again orthogonal, and Span {u₁, v₂} = Span {u₁, z₂}.
S₂ = {u₁, u₂} orthogonal; take v₃ ∉ Span S₂, then compute z₃. S₃ = {u₁, u₂, z₃} is again orthogonal, and Span {u₁, u₂, v₃} = Span {u₁, u₂, z₃}.
...
This is called the Gram-Schmidt orthogonalization process.

Thm 11 (P.383): Let {x₁, ..., x_p} be l.i. Define u₁ = x₁ and:
u₂ = x₂ − [(x₂·u₁)/‖u₁‖²] u₁
u₃ = x₃ − [(x₃·u₂)/‖u₂‖²] u₂ − [(x₃·u₁)/‖u₁‖²] u₁
...
u_p = x_p − Σᵢ₌₁^{p−1} [(x_p·uᵢ)/‖uᵢ‖²] uᵢ.
Then {u₁, ..., u_p} will be orthogonal and, for 1 ≤ k ≤ p: Span {x₁, ..., x_k} = Span {u₁, ..., u_k}.
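Thm 11 can be transcribed almost literally into code. A NumPy sketch (orthogonalization only, no normalization; the input vectors below are those of the next example):

import numpy as np

def gram_schmidt(xs):
    # Gram-Schmidt orthogonalization of a linearly independent list of vectors.
    us = []
    for x in xs:
        # Subtract from x its projection onto each previously built u_i.
        u = x - sum(((x @ u_i) / (u_i @ u_i)) * u_i for u_i in us)
        us.append(u)
    return us

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([2.0, 0.0, 1.0]),
      np.array([1.0, 1.0, 1.0])]

us = gram_schmidt(xs)
print(us)                                  # (1,1,0), (1,-1,1), (-1/3, 1/3, 2/3)
print(np.isclose(us[0] @ us[1], 0.0),
      np.isclose(us[0] @ us[2], 0.0),
      np.isclose(us[1] @ us[2], 0.0))      # all True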

Notes: (i) We must use the {uᵢ} to compute proj_{W_k} x_{k+1}, since the formula proj_{W_k} x_{k+1} = Σᵢ₌₁ᵏ [(x_{k+1}·uᵢ)/‖uᵢ‖²] uᵢ is only valid for an orthogonal set {uᵢ}.
(ii) If we obtain u_k = 0 for some k, i.e. x_k = proj_{W_{k−1}} x_k, then x_k ∈ Span {x₁, ..., x_{k−1}}, so {x₁, ..., x_k} would be l.d. instead.
(iii) All the uᵢ are non-zero vectors, as {xᵢ} is l.i.

Example: Apply the Gram-Schmidt Process to {x₁, x₂, x₃}: x₁ = (1, 1, 0)ᵀ, x₂ = (2, 0, 1)ᵀ, x₃ = (1, 1, 1)ᵀ.
Solution: Take u₁ = x₁. Then:
u₂ = x₂ − [(x₂·u₁)/‖u₁‖²] u₁ = x₂ − (2/2) u₁ = (1, −1, 1)ᵀ,
u₃ = x₃ − (1/3) u₂ − (2/2) u₁ = (−1/3, 1/3, 2/3)ᵀ.

Example: Apply the Gram-Schmidt Process to {x₁, x₃, x₂} (the same vectors, in a different order): x₁ = (1, 1, 0)ᵀ, x₃ = (1, 1, 1)ᵀ, x₂ = (2, 0, 1)ᵀ.
Solution: Take u₁ = x₁. Then:
u₂ = x₃ − [(x₃·u₁)/‖u₁‖²] u₁ = x₃ − (2/2) u₁ = (0, 0, 1)ᵀ,
u₃ = x₂ − (1/1) u₂ − (2/2) u₁ = (1, −1, 0)ᵀ.

Exercise: Find an orthogonal basis for Col A:
A =
[ 1 3 1 2 ]
[ 3 4 2 1 ]
[ 1 1 1 1 ]
[ 1 2 2 2 ]
Sol: First find a basis for Col A (e.g. the pivot columns of A). Then apply the Gram-Schmidt Process. ***

Approximation Problems: Solve Ax = b.
Due to the presence of errors, a consistent system may appear as an inconsistent system:
x₁ + x₂ = 1, x₁ − x₂ = 0, 2x₁ + 2x₂ = 2   may appear as   x₁ + x₂ = 1.01, x₁ − x₂ = 0.01, 2x₁ + 2x₂ = 2.01.
Also, in practice, exact solutions are usually not necessary. How do we obtain a good approximate solution to the above inconsistent system?

Least squares solution: How do we measure the goodness of x₀ as an approximate solution to the system Ax = b?
Minimize the difference ‖x − x₀‖? Problem: but x is unknown...
Another way of approximation: x₀ ≈ x ⇒ Ax₀ ≈ Ax = b.

Analysis: Find x₀ such that Ax₀ = b₀, with b₀ as close to b as possible.
b₀ must be in Col A.
‖b − b₀‖² is a sum of squares ⇒ least squares solution.
Best approximation property of the orthogonal projection: ‖b − proj_W b‖ ≤ ‖b − w‖ for every w in W = Col A. So we should take b₀ = proj_W b.

Example: Find the least squares solution of the inconsistent system:
x₁ + x₂ = 1.01, x₁ − x₂ = 0.01, 2x₁ + 2x₂ = 2.01.
To compute proj_W b, we need an orthogonal basis for W = Col A first. A basis for Col A is: { (1, 1, 2)ᵀ, (1, −1, 2)ᵀ }.

Then by the Gram-Schmidt Process, we get an orthogonal basis for W = Col A: { u₁, u₂ } with u₁ = (1, 1, 2)ᵀ and u₂ = (1/3)(1, −5, 2)ᵀ.
Compute b₀ = proj_W b with b = (1.01, 0.01, 2.01)ᵀ:
b₀ = [(b·u₁)/‖u₁‖²] u₁ + [(b·u₂)/‖u₂‖²] u₂ = (5.04/6)(1, 1, 2)ᵀ + (4.98/30)(1, −5, 2)ᵀ.

Hence: b₀ = (1.006, 0.01, 2.012)ᵀ.
Since b₀ ∈ Col A, the system Ax₀ = b₀ must be consistent. Solving Ax₀ = b₀ by row reduction:
[1 1 | 1.006; 1 −1 | 0.01; 2 2 | 2.012] → [1 0 | 0.508; 0 1 | 0.498; 0 0 | 0].
Thus we have the following least squares solution: x₀ = (0.508, 0.498)ᵀ.

But we have the following result: (Col A)^⊥ = Nul Aᵀ. Then, since we take b₀ = proj_{Col A} b:
(b − b₀) ∈ (Col A)^⊥ ⇒ (b − b₀) ∈ Nul Aᵀ ⇒ Aᵀ(b − b₀) = 0 ⇒ Aᵀb₀ = Aᵀb.
So, if x₀ is a least squares solution (i.e. Ax₀ = b₀), we have: Aᵀ(Ax₀) = Aᵀb.
The above is usually called the normal equation of Ax = b.

Thm 13 (P.389): The least squares solutions of Ax = b are the solutions of the normal equation AᵀAx = Aᵀb.
In the following case, the least squares solution is unique:
Thm 14 (P.391): Let A be an m × n matrix with rank A = n. Then the n × n matrix AᵀA is invertible.
Example: Find again the least squares solution of:
x₁ + x₂ = 1.01, x₁ − x₂ = 0.01, 2x₁ + 2x₂ = 2.01.

Solution: Solve the normal equation. Compute:
AᵀA = [1 1 2; 1 −1 2] [1 1; 1 −1; 2 2] = [6 4; 4 6],
Aᵀb = [1 1 2; 1 −1 2] (1.01, 0.01, 2.01)ᵀ = (5.04, 5.02)ᵀ.
So the normal equation is: [6 4; 4 6] (x₁, x₂)ᵀ = (5.04, 5.02)ᵀ ⇒ (x₁, x₂)ᵀ = (0.508, 0.498)ᵀ.
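The same least squares solution from code (a NumPy sketch; np.linalg.lstsq is shown only as an independent cross-check of the normal-equation route):

import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0],
              [2.0,  2.0]])
b = np.array([1.01, 0.01, 2.01])

# Normal equation A^T A x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Library least-squares routine, for comparison
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_normal, x_lstsq)     # both approximately [0.508, 0.498]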

Least Squares Problems
Linear Regression: Fitting data (xᵢ, yᵢ) with a straight line.
[Picture: data points and a fitted line; we minimize the differences indicated by the red vertical intervals.]

When a straight line y = c + mx can pass through all the points, it will of course best fit the data. This requires the system
c + mx₁ = y₁, ..., c + mxₙ = yₙ, i.e. [1 x₁; ...; 1 xₙ] (c, m)ᵀ = (y₁, ..., yₙ)ᵀ,
to be consistent. But in general the above system Ax = b is inconsistent.

Measurement of closeness: the square sum of the y-distances,
(y₁ − (mx₁ + c))² + ... + (yₙ − (mxₙ + c))².
Note that this is expressed as ‖b − b₀‖², where: b = (y₁, ..., yₙ)ᵀ and b₀ = (c + mx₁, ..., c + mxₙ)ᵀ.
b₀ ∈ Col A since Ax = b₀ is consistent. Use the normal equation!

Example: Find a straight line that best fits the points (2, 1), (5, 2), (7, 3), (8, 3), in the sense of minimizing the square-sum of the y-distances.
Sol: The (inconsistent) system is:
[1 2; 1 5; 1 7; 1 8] (c, m)ᵀ = (1, 2, 3, 3)ᵀ.
We are going to find its least squares solution.

Compute:
AᵀA = [1 1 1 1; 2 5 7 8] [1 2; 1 5; 1 7; 1 8] = [4 22; 22 142],
Aᵀb = [1 1 1 1; 2 5 7 8] (1, 2, 3, 3)ᵀ = (9, 57)ᵀ.

So the normal equation is: [4 22; 22 142] (c, m)ᵀ = (9, 57)ᵀ, which has the unique solution (c, m) = (2/7, 5/14). The best-fit straight line is: y = 2/7 + (5/14)x.
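The same line fit in code (a NumPy sketch built on the normal equation; the data are the four points above):

import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

A = np.column_stack([np.ones_like(xs), xs])   # design matrix with rows [1, x_i]
c, m = np.linalg.solve(A.T @ A, A.T @ ys)     # solve A^T A (c, m)^T = A^T y

print(c, m)              # approximately 0.2857 and 0.3571
print(2/7, 5/14)         # the exact values found above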

Polynomial Curve Fitting:
Example: Find a polynomial curve of degree at most 2 which best fits the data (2, 1), (5, 2), (7, 3), (8, 3), in the sense of least squares.
Sol: Consider the general form of the fitting curve: y = a₀·1 + a₁x + a₂x².

The curve cannot pass through all 4 points, as the system
a₀ + 2a₁ + 2²a₂ = 1, a₀ + 5a₁ + 5²a₂ = 2, a₀ + 7a₁ + 7²a₂ = 3, a₀ + 8a₁ + 8²a₂ = 3
is inconsistent. Again, use the normal equation.

The corresponding normal equation AᵀAx = Aᵀb is:
[4 22 142; 22 142 988; 142 988 7138] (a₀, a₁, a₂)ᵀ = (9, 57, 393)ᵀ,
which has the unique solution (a₀, a₁, a₂) = (19/132, 19/44, −1/132). So the best fitting polynomial is: y = 19/132 + (19/44)x − (1/132)x².
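The quadratic fit can be reproduced the same way (a NumPy sketch; only the design matrix gains an x² column):

import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

A = np.column_stack([np.ones_like(xs), xs, xs**2])   # rows [1, x_i, x_i^2]
a0, a1, a2 = np.linalg.solve(A.T @ A, A.T @ ys)

print(a0, a1, a2)
print(19/132, 19/44, -1/132)    # the exact values found above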

General Curve Fitting:
Example: Find a curve of the form c₀ + c₁ sin x + c₂ sin 2x which best fits the data (π/6, 1), (π/4, 2), (π/3, 3), (π/2, 3), in the sense of least squares.
Sol: Let y = c₀·1 + c₁ sin x + c₂ sin 2x. The system
c₀ + c₁ sin(π/6) + c₂ sin(2π/6) = 1
c₀ + c₁ sin(π/4) + c₂ sin(2π/4) = 2
c₀ + c₁ sin(π/3) + c₂ sin(2π/3) = 3
c₀ + c₁ sin(π/2) + c₂ sin(2π/2) = 3
is inconsistent.

Solving AᵀAx = Aᵀb gives:
c₀ = (184 − 39√2 − 89√3 + 9√6) / (78 − 18√2 − 38√3 + 6√6) ≈ −2.29169, c₁ ≈ 5.31308, c₂ ≈ 0.673095,
where c₁ and c₂ have similar exact expressions in √2, √3 and √6. So the best fitting function is:
(−2.29169) + (5.31308) sin x + (0.673095) sin 2x.

Extra: Continuous Curve Fitting. Find g(x) best fitting a given f(x).
[Picture: the graphs of f(x) and g(x), with the area between the two curves shaded.]
Try to minimize the difference (area) between the two curves.

To minimize the root-mean-square (r.m.s.) difference between the two curves:
√( (1/(b − a)) ∫ₐᵇ |f(x) − g(x)|² dx ).
This is given by the following inner product: ⟨f, g⟩ = (1/(b − a)) ∫ₐᵇ f(x)g(x) dx.
We are not in Rⁿ and this is not the standard inner product... so there is no normal equation. But we can use the orthogonal projection.

Recall: the formula of the orthogonal projection in general:
proj_W y = Σᵢ₌₁ᵖ [⟨y, uᵢ⟩/⟨uᵢ, uᵢ⟩] uᵢ,
where {u₁, ..., u_p} is an orthogonal basis of W.
Example: Fit f(x) = x over [0, 1] by a l.c. of S = {1, sin 2πkx, cos 2πkx : k = 1, 2, ..., n}.
Sol: S is orthogonal under the inner product (direct checking): ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx.

So compute those ⟨y, uᵢ⟩:
⟨f(x), 1⟩ = 1/2, ⟨f(x), sin 2πkx⟩ = −1/(2πk), ⟨f(x), cos 2πkx⟩ = 0.
We also need those ⟨uᵢ, uᵢ⟩:
⟨1, 1⟩ = 1, ⟨sin 2πkx, sin 2πkx⟩ = 1/2, ⟨cos 2πkx, cos 2πkx⟩ = 1/2.

So the best fitting curve is g(x) = proj_W f(x):
g(x) = 1/2 − (1/π)( sin 2πx/1 + sin 4πx/2 + ... + sin 2nπx/n ).
[Picture: the graph of g(x) against f(x) = x on [0, 1] when n = 5.]
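The coefficients can be confirmed by numerical integration (a rough midpoint-rule check in NumPy; the grid size is an arbitrary choice):

import numpy as np

N = 200_000
t = (np.arange(N) + 0.5) / N     # midpoints of N equal subintervals of [0, 1]
dx = 1.0 / N

def ip(f_vals, g_vals):
    # <f, g> = integral of f*g over [0, 1], approximated by a midpoint sum
    return np.sum(f_vals * g_vals) * dx

f = t                            # samples of f(x) = x
for k in (1, 2, 3):
    s = np.sin(2 * np.pi * k * t)
    coeff = ip(f, s) / ip(s, s)              # coefficient of sin(2*pi*k*x) in proj_W f
    print(k, coeff, -1.0 / (np.pi * k))      # agrees with the closed form -1/(pi*k)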

Example: Let f(x) = sgn(x), the sign of x:
sgn(x) = −1 for x < 0, 0 for x = 0, 1 for x > 0.
Find the best r.m.s. approximation function over [−1, 1] using a l.c. of S = {1, sin kπx, cos kπx : k = 1, 2, 3, ..., 2n + 1}.
Sol: The interval has changed, so use the new inner product: ⟨f, g⟩ = (1/2) ∫₋₁¹ f(x)g(x) dx.

Then S is orthogonal (this needs another check) and:
⟨1, 1⟩ = 1, ⟨sin kπx, sin kπx⟩ = 1/2 = ⟨cos kπx, cos kπx⟩.
So, we compute:
⟨sgn(x), 1⟩ = 0;
⟨sgn(x), sin kπx⟩ = 0 if k is even, 2/(kπ) if k is odd;
⟨sgn(x), cos kπx⟩ = 0.

Hence the best r.m.s. approximation to sgn(x) over [−1, 1] is:
(4/π)( sin πx + sin 3πx/3 + sin 5πx/5 + ... + sin(2n + 1)πx/(2n + 1) ).
[Picture: the graph of this approximation against sgn(x) on [−1, 1] when 2n + 1 = 9.]