Linear least squares problem: Example We want to determine n unknown parameters c,,c n using m measurements where m n Here always denotes the -norm v = (v + + v n) / Example problem Fit the experimental data t y 7 with a curve of the form g(t) = c + c t + c t g (t ) g n (t ) Here n =, m = and g (t) =, g (t) = t, g (t) = t We define the matrix A := g (t m ) g n (t m ) We want to find c R such that Ac y = min = 9 Find solution using normal equations: Find M := A A = 6 6 6 6 98 linear system Mc = b using Gaussian elimination, yielding the solution vector c = and b := A y = 9 5 8 Then solve the t = ;;;]; y = ;;;7]; A = t^,t,t^]; c = (A *A)\(A *y) c = A\y % use normal equations % Matlab shortcut (actually uses QR decomposition) te = linspace(-5,5,e) ; % points for plotting plot(t,y, o,te,te^,te,te^]*c) % plot given points and fitted curve legend( given points, least squares fit ) 8 given points least squares fit 6 - -5 5 5 5 5
Least squares problem with orthogonal basis For a least squares problem we are given n linearly independent vectors a (),,a (n) R m which form a basis for the subspace V = span{a (),,a (n) } For a given right hand side vector y R m we want to find u V such that u y is minimal We can write u = c a () + + c n a (n) = Ac with the matrix A = a (),,a (n)] R m n Hence we want to find c R n such that Ac y is minimal Solving this problem is much simpler if we have an orthogonal basis for the subspace V : Assume we have vectors p (),, p (n) such that span { p (),, p (n)} = V the vectors are orthogonal on each other: p (i) p ( j) = for i j We can then write u = d p () + + d n p (n) = Pd with the matrix P = p (),, p (n) ] R m n Hence we want to find d R n such that Pd b is minimal The normal equations for this problem give where the matrix P P = p () p (n) (P P)d = P b () p () p () p () p (n) p (),, p (n)] = = p (n) p () p (n) p (n) p () p () p (n) p (n) is now diagonal since p (i) p ( j) = for i j Therefore the normal equations () are actually decoupled ( p () p ()) d = p () y ( p (n) p (n)) d n = p (n) y and have the solution d i = p(i) y p (i) p (i) for i =,,n Gram-Schmidt orthogonalization We still need a method to construct from a given basis a (),,a (n) an orthogonal basis p (),, p (n) Given n linearly independent vectors a (),,a (n) R m we want to find vectors p (),, p (n) such that span { p (),, p (n)} = span { a (),,a (n)} the vectors are orthogonal on each other: p (i) p ( j) = for i j Step : p () := a () Step : p () := a () s p () where we choose s such that p () p () = : p () a () s p () p () = s = p() a () p () p () Step : p () := a () s p () s p () where we choose s, s such that p () p () =, ie, p () a () s p () p () s p () p () =, hence s = p() a () p () p ()
p () p () =, ie, p () a () s p () p () s p () p () =, hence s = p() a () p () p () Step n: p (n) := a (n) s n p () s n,n p (n ) where we choose s n,,s n,n such that p ( j) p (n) = for j =,,n which yields Example: We are given the vectors a () = s jn = p( j) p (n) p ( j) p ( j) for j =,,n, a() =, a() = find an orthogonal basis p (), p (), p () for the subspace V = span { a (),a (),a ()} Step : p () := a () = Step : p () := a () p() a () p () p () p() = 6 Step : p () := a () p() a () p () p () p() p() a () p () p () p() = Note that we have = a () = p () 9 9 5 Use Gram-Schmidt orthogonalization to 5 = which we can write as In the general case we have a (),a (),a ()] = 9 a () = p () + 6 p() a () = p () + p() + 5 5 p() A = a () = p () p (), p (), p ()] a () = p () + s p () 5 5 5 5 P a () = p () + s p () + s p () 6 5 5 5 5 S a (n) = p (n) + s n p () + + s n,n p (n )
which we can write as Therefore we obtain a decomposition A = PS where P R m n has orthogonal columns s s n a (),a (),,a (n)] = p (), p (),, p (n)] sn,n S R n n is upper triangular, with on the diagonal Note that the vectors p (),, p (n) are different from : Assume, eg, that p () = a () s p () s p () =, then a () = s p () + s p () is in span { p (), p ()} = span { a (),a ()} This is a contradiction to the assumption that a (),a (),a () are linearly independent Solving the least squares problem Ac y = min using orthogonalization We are given A R m n with linearly independent columns, b R n We want to find c R n such that Ac y = min From the Gram-Schmidt method we get A = PS, hence we want to find c such that P Sc y = min d This gives the following method: Algorithm: solve least squares problem Ac y = min using orthogonalization use Gram-Schmidt to find decomposition A = PS solve Pd y = min: d i := solve Sc = d by back substitution p(i) y p (i) for i =,,n p (i) Example: Solve the least squares problem Ac b = min for A = Gram-Schmidt gives A = 5 5 5 5 P 5 5 S 9 (see above), b = d = p() b p () p () = =, d = p() b p () = p () 5 =, d = p() b p () p () = = 5 5 5 c solving c = by back substitution gives c = 5, c = 9, c = c 5 Hence the solution of our least squares problem is the vector c = Note: If you want to solve a least squares problem by hand with pencil and paper, it is usually easier to use the normal equations But for numerical computation on a computer using orthogonalization is usually more efficient and more accurate 9 5 7
Finding an orthonormal basis q (),,q (n) : the QR decomposition The Gram-Schmidt method gives an orthogonal basis p (),, p (n) for V = span { a (),,a (n)} Often it is convenient to have a so-called orthonormal basis q (),,q (n) where the basis vectors have length : Define q ( j) = p ( j) j) p( for j =,,n then we have span { q (),,q (n)} = V { q ( j) q (k) for j = k = otherwise This means that the matrix Q = q (),,q (n) ] satisfies Q Q = I where I is the n n identity matrix Since p ( j) = p ( j) q ( j) we have a () = p () q () r a () = p () q () +s p () r a (n) = p (n) q (n) r nn +s n r () p p () p () p (n ) + + s n,n r n r n,n (n ) q which we can write as a (),a (),,a (n)] = q (),q (),,q (n)] r r r n r rn,n r nn A = QR where the n n matrix R is given by row of R row n of R = p () (row of S) p (n) (row n of S) We obtain the so-called QR decomposition A = QR where the matrix Q R m n has orthonormal columns, rangeq = rangea the matrix R R n n is upper triangular, with nonzero diagonal elements Example: In our example we have p () p () =, p () p () = 5, p () p () =, hence 5 5 q () = p() = 5 5, q() = p () = 5 5 5 5, q() = p() = 5 5 5 5 5 5
and we obtain the QR decomposition 9 = 5 5/ 5 5 5 5/ 5 5 5 5/ 5 5 5 5/ 5 5 In Matlab we can find the QR decomposition using Q,R]=qr(A,) 7 5 5 >> A = ; ; 9] ; >> Q,R] = qr(a,) Q = -5 678 5-5 6-5 -5-6 -5-5 -678 5 R = - - -7-6 -678 Note that Matlab returned the basis q (), q (),q () (which is also an orthonormal basis) and hence rows and of the matrix R is ( ) times our previous matrix R If we want to find an orthonormal basis for rangea and an orthonormal basis for the orthogonal complement (rangea) = nulla we can use the command Qh,Rh]=qr(A) : It returns matrices ˆQ R m m and ˆR R m n with basis for rangea basis for (rangea) {}} { R ˆQ = q (),,q (n), q (n+),,q (m), ˆR = m n rows of zeros >> A = ; ; 9] ; >> Qh,Rh] = qr(a) Qh = -5 678 5 6-5 6-5 -678-5 -6-5 678-5 -678 5-6 Rh = - - -7-6 -678 But in most cases we only need an orthonormal basis for rangea and we should use Q,R]=qr(A,) (which Matlab calls the economy size decomposition) Solving the least squares problem Ac b = min using the QR decomposition If we use an orthonormal basis q (),,q (n) for span{a (),,a (n) } we have Q Q = I The solution of Qd y = min is therefore given by the normal equations (Q Q)d = Q y, ie, we obtain d = Q y
This gives the following method: Algorithm: solve the least squares problem Ac y = min using orthonormalization: find the QR decomposition A = QR let d = Q y solve Rc = d by back substitution In Matlab we can do this as follows: Q,R] = qr(a,); d = Q *y; c = R\d; In our example we have >> A = ; ; 9] ; y = ;;;7]; >> Q,R] = qr(a,); >> d = Q *y; >> c = R\d c = - 9 5 We can use the shortcut c=a\y which actually uses the QR decomposition to find the solution of Ac y = min >> A = ; ; 9] ; y = ;;;7]; >> c = A\y c = - 9 5 Warning: In Matlab symbolic mode the backslash command does not find the least squares solution: >> A = sym( ; ; 9]) ; y = sym(;;;7]); >> c = A\y Warning: The system is inconsistent Solution does not exist c = Inf Inf Inf