The Method of Least Squares

KEY WORDS: curve fitting, least squares.

GOAL: To understand least squares fitting of data. To understand the least squares solution of inconsistent systems of linear equations.

1 Motivation

Curve fitting is a process of finding a relatively simple function to approximate a set of data. The data are often from experimental measurements that may contain errors or have a limited number of significant digits. The approximating function contains parameters that can be adjusted to give agreement with the data. The quality of the fit is most often measured by a distance characterization between values of the approximating function and the given data; the least squares principle asserts that the best fit is the one that minimizes this distance.

For example, consider the problem of fitting the data set

    x | x_0 = 0   x_1 = 2   x_2 = −1   x_3 = 4
    y | y_0 = 2   y_1 = 1   y_2 = −1   y_3 = 3

In standard polynomial interpolation, we would need to find a polynomial of degree 3, denoted by p_3(x), such that p_3(x_i) = y_i, i = 0, 1, 2, 3. If the data involve error, this error will be carried over by the interpolating polynomial p_3(x). Now we ask whether we can fit the given data set using a straight line, a polynomial of degree 1. This means that we need to find a polynomial

    p_1(x) = a_0 + a_1 x    (1)

such that

    p_1(x_i) = y_i,  i = 0, 1, 2, 3.    (2)

Since a linear function has only two coefficients (which can be considered as adjustable parameters), it is generally impossible to satisfy (2); in other words, in general, it is impossible to find a linear function that interpolates four points simultaneously. The least squares method finds a linear function p_1(x) that fits the four points in a best possible way. This procedure can be described as follows.

First, we define the error between the straight line p_1(x) of (1) and the given set of data as follows:

    σ(p_1) = Σ_{i=0}^{3} [p_1(x_i) − y_i]^2.

Next, we adjust the parameters a_0 and a_1 to minimize the error function σ(p_1). To carry out the details for our example, we notice that the error function is really given by

    σ(p_1) = Σ_{i=0}^{3} (a_0 + a_1 x_i − y_i)^2,

which can be expressed as

    σ(p_1) = (a_0 − 2)^2 + (a_0 + 2a_1 − 1)^2 + (a_0 − a_1 + 1)^2 + (a_0 + 4a_1 − 3)^2.

The function σ(p_1) attains its minimum value at a point (a_0, a_1) where the gradient of σ(p_1) vanishes. Therefore, we need to solve the following system of linear equations:

    ∂σ(p_1)/∂a_0 = 2(a_0 − 2) + 2(a_0 + 2a_1 − 1) + 2(a_0 − a_1 + 1) + 2(a_0 + 4a_1 − 3)
                 = 8a_0 + 10a_1 − 10 = 0,
    ∂σ(p_1)/∂a_1 = 4(a_0 + 2a_1 − 1) − 2(a_0 − a_1 + 1) + 8(a_0 + 4a_1 − 3)
                 = 10a_0 + 42a_1 − 30 = 0.

This system of linear equations can then be rewritten as

    4a_0 + 5a_1 = 5,
    5a_0 + 21a_1 = 15,

which can be solved using Gaussian elimination to give

    a_0 = 30/59,  a_1 = 35/59.

(You can check this using the backslash operator in MATLAB.) As a result, the straight line that best fits the given data set in the least squares sense is

    y = 30/59 + (35/59) x.

2 Least Squares Fitting

In this section, we present a general discussion of least squares fitting of the data {(x_r, f_r)}, r = 0, 1, …, n. As before, we assume that f_r = f(x_r).
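The calculation above is easy to check with a short script. The following sketch (plain Python with exact rational arithmetic; the variable names are ours, not from the notes) forms the 2 × 2 normal equations for this data set and solves them by Cramer's rule:

```python
from fractions import Fraction

# Data from the motivating example.
xs = [0, 2, -1, 4]
ys = [2, 1, -1, 3]

# Sums appearing in the normal equations for p1(x) = a0 + a1*x:
#   (n+1)*a0   + (sum x)*a1   = sum y
#   (sum x)*a0 + (sum x^2)*a1 = sum x*y
S0 = len(xs)
Sx = sum(xs)
Sxx = sum(x * x for x in xs)
Sy = sum(ys)
Sxy = sum(x * y for x, y in zip(xs, ys))

# Solve the 2x2 system exactly by Cramer's rule.
det = Fraction(S0 * Sxx - Sx * Sx)
a0 = (Sy * Sxx - Sx * Sxy) / det
a1 = (S0 * Sxy - Sx * Sy) / det
print(a0, a1)  # 30/59 35/59, matching the text
```

The assembled system is exactly 4a_0 + 5a_1 = 5, 5a_0 + 21a_1 = 15 from above; exact fractions avoid any rounding in the check.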

Let q_m(x) be a polynomial of degree at most m, with m ≤ n. Observe that q_m(x_r) − f_r is the error in accepting q_m(x_r) as an approximation to f_r. Thus the sum of the squares of these errors,

    σ(q_m) = Σ_{r=0}^{n} [q_m(x_r) − f_r]^2,

gives a measure of how well q_m(x) fits f(x). The idea is that the smaller the value of σ(q_m), the closer the polynomial q_m(x) fits the data. We say p_m(x) is a least squares polynomial of degree m if p_m(x) is a polynomial of degree at most m with the property that

    σ(p_m) ≤ σ(q_m)

for all polynomials q_m(x) of degree at most m; usually we only have equality if q_m(x) ≡ p_m(x). As shown in an advanced course in numerical analysis or linear algebra, if the points x_0, x_1, …, x_n are distinct and if m ≤ n, there is one and only one least squares polynomial of degree m for these data; thus we say p_m(x) is the least squares polynomial of degree m. So, the polynomial p_m(x) that produces the smallest value σ(p_m) yields the least squares fit of the data.

While p_m(x) produces the best fit of the data in the least squares sense, it may not produce a very useful fit. For example, consider the case m = n. Then the least squares fit p_n(x) is the same as the interpolating polynomial. We have seen already that the interpolating polynomial can be a poor fit in the sense of having a large and highly oscillatory error. So, a close fit in a least squares sense does not necessarily imply a very good fit and, in some cases, the closer the fit is to an interpolant the less useful it might be. Since the least squares criterion relaxes the fitting condition from interpolation to a weaker condition on the coefficients of the polynomial, we need fewer coefficients (that is, a lower degree polynomial) in the representation. As we have indicated, for the problem to be well posed, it is sufficient that all the data points be distinct and that m ≤ n.

EXAMPLE 1. Consider the least squares fit to the data

    i   | 0  1  2  3
    x_i | 1  3  4  5
    f_i | 2  4  3  1

by a straight line p_1(x) = a_0 + a_1 x; that is, the coefficients a_0 and a_1 are to be determined. We have

    σ(p_1) = {p_1(1) − 2}^2 + {p_1(3) − 4}^2 + {p_1(4) − 3}^2 + {p_1(5) − 1}^2
           = {a_0 + a_1 − 2}^2 + {a_0 + 3a_1 − 4}^2 + {a_0 + 4a_1 − 3}^2 + {a_0 + 5a_1 − 1}^2.

Observe that σ(p_1) is quadratic in the unknown coefficients a_0 and a_1. For a minimum of σ(p_1), the unknowns a_0 and a_1 must satisfy the linear equations

    ∂σ(p_1)/∂a_0 = 2{a_0 + a_1 − 2} + 2{a_0 + 3a_1 − 4} + 2{a_0 + 4a_1 − 3} + 2{a_0 + 5a_1 − 1}
                 = −20 + 8a_0 + 26a_1 = 0,
    ∂σ(p_1)/∂a_1 = 2{a_0 + a_1 − 2} + 6{a_0 + 3a_1 − 4} + 8{a_0 + 4a_1 − 3} + 10{a_0 + 5a_1 − 1}
                 = −62 + 26a_0 + 102a_1 = 0;

that is,

    4a_0 + 13a_1 = 10,    (3)
    13a_0 + 51a_1 = 31.   (4)

Gaussian elimination yields the solution

    a_0 = 107/35,  a_1 = −6/35.

(Again, you can check this using the backslash operator in MATLAB.) Therefore, the straight line that best fits the given data set in the least squares sense is

    p_1(x) = 107/35 − (6/35) x.

It can be seen that the method of least squares involves two main steps:
- formulate the total error;
- determine the parameters (coefficients) that minimize the total error.

Generally, if we consider fitting data using a polynomial written in power series form,

    p_m(x) = a_0 + a_1 x + ⋯ + a_m x^m,    (5)

then σ(p_m) is quadratic in the unknown coefficients a_0, a_1, …, a_m. For the data {(x_r, f_r)}, we have

    σ(p_m) = Σ_{r=0}^{n} {p_m(x_r) − f_r}^2    (6)
           = {p_m(x_0) − f_0}^2 + {p_m(x_1) − f_1}^2 + ⋯ + {p_m(x_n) − f_n}^2.    (7)
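Because σ(p_1) is a quadratic function of (a_0, a_1), the computed coefficients can be sanity-checked numerically: σ evaluated at the solution of (3)–(4) should be no larger than at any nearby point. A minimal sketch in plain Python (the helper name `sigma` is ours) for the data of Example 1:

```python
from fractions import Fraction

# Data of Example 1.
xs = [1, 3, 4, 5]
fs = [2, 4, 3, 1]

def sigma(a0, a1):
    """Total squared error of the line a0 + a1*x on the data."""
    return sum((a0 + a1 * x - f) ** 2 for x, f in zip(xs, fs))

# Coefficients obtained from the normal equations (3)-(4).
a0 = Fraction(107, 35)
a1 = Fraction(-6, 35)
best = sigma(a0, a1)

# Perturbing the coefficients in any direction increases the error,
# as expected at the minimum of a positive definite quadratic.
steps = [Fraction(1, 10), Fraction(-1, 10)]
assert all(sigma(a0 + da, a1 + db) > best
           for da in steps for db in steps)
print(best)
```

The strict inequality holds because the quadratic is positive definite whenever the data points are distinct, so the minimizer is unique.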

The coefficients a_0, a_1, …, a_m are determined by solving the linear system

    ∂σ(p_m)/∂a_0 = 0,    (8)
    ∂σ(p_m)/∂a_1 = 0,    (9)
        ⋮                (10)
    ∂σ(p_m)/∂a_m = 0.    (11)

For each value j = 0, 1, …, m, the linear equation ∂σ(p_m)/∂a_j = 0 is formed as follows. Observe that

    ∂p_m(x_r)/∂a_j = x_r^j.

Then, by the chain rule,

    ∂σ(p_m)/∂a_j = ∂/∂a_j [ {p_m(x_0) − f_0}^2 + {p_m(x_1) − f_1}^2 + ⋯ + {p_m(x_n) − f_n}^2 ]    (12)
                 = 2 [ {p_m(x_0) − f_0} ∂p_m(x_0)/∂a_j + {p_m(x_1) − f_1} ∂p_m(x_1)/∂a_j + ⋯ + {p_m(x_n) − f_n} ∂p_m(x_n)/∂a_j ]    (13)
                 = 2 [ {p_m(x_0) − f_0} x_0^j + {p_m(x_1) − f_1} x_1^j + ⋯ + {p_m(x_n) − f_n} x_n^j ]    (14)
                 = 2 Σ_{r=0}^{n} {p_m(x_r) − f_r} x_r^j    (15)
                 = 2 Σ_{r=0}^{n} x_r^j p_m(x_r) − 2 Σ_{r=0}^{n} f_r x_r^j.    (16)

Substituting the power series form (5) for the polynomial p_m(x_r) leads to

    ∂σ(p_m)/∂a_j = 2 Σ_{r=0}^{n} [ x_r^j {a_0 + a_1 x_r + ⋯ + a_m x_r^m} − f_r x_r^j ]    (17)
                 = 2 Σ_{r=0}^{n} [ a_0 x_r^j + a_1 x_r^{j+1} + ⋯ + a_m x_r^{j+m} − f_r x_r^j ].    (18)

Therefore, the equations ∂σ(p_m)/∂a_j = 0, j = 0, 1, …, m, may be rewritten as the normal equations:

    a_0 Σ_{r=0}^{n} x_r^j + a_1 Σ_{r=0}^{n} x_r^{j+1} + ⋯ + a_m Σ_{r=0}^{n} x_r^{j+m} = Σ_{r=0}^{n} f_r x_r^j,  j = 0, 1, …, m.

In matrix form, the normal equations may be written as

    [ Σ 1        Σ x_r        ⋯  Σ x_r^m     ] [ a_0 ]   [ Σ f_r       ]
    [ Σ x_r      Σ x_r^2      ⋯  Σ x_r^{m+1} ] [ a_1 ] = [ Σ f_r x_r   ]    (19)
    [   ⋮           ⋮               ⋮        ] [  ⋮  ]   [     ⋮       ]
    [ Σ x_r^m    Σ x_r^{m+1}  ⋯  Σ x_r^{2m}  ] [ a_m ]   [ Σ f_r x_r^m ]

where all sums run over r = 0, 1, …, n. The coefficient matrix of this system is symmetric and positive definite, permitting use of an accurate, efficient modified version of Gaussian elimination which exploits these properties, without the need for partial pivoting by rows for size. This can be done using the backslash operator \ in MATLAB. (When \ is called with a symmetric coefficient matrix that has positive diagonal elements, MATLAB determines if the matrix is positive definite and, if it is, solves the system in an optimal way. Otherwise, it uses Gaussian elimination with partial pivoting.)

EXAMPLE 2. To compute a straight line fit a_0 + a_1 x to the data {(x_r, f_r)}, we set m = 1 in the normal equations to give

    a_0 Σ 1 + a_1 Σ x_r = Σ f_r,            (20)
    a_0 Σ x_r + a_1 Σ x_r^2 = Σ f_r x_r.    (21)

Substituting the data

    i   | 0  1  2  3
    x_i | 1  3  4  5
    f_i | 2  4  3  1

from Example 1, we have the normal equations

    4a_0 + 13a_1 = 10,
    13a_0 + 51a_1 = 31,

which give the same result as in Example 1.

The next example shows how a quadratic polynomial can be determined in the least squares sense for a given set of data.
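The normal equations (19) can be assembled and solved for any degree m. The sketch below (plain Python with exact Fraction arithmetic; the function name `polyfit_ls` is our own) builds the (m+1) × (m+1) system and solves it with ordinary Gaussian elimination. No pivoting is needed for these small examples because, as noted above, the matrix is symmetric positive definite:

```python
from fractions import Fraction

def polyfit_ls(xs, fs, m):
    """Degree-m least squares polynomial via the normal equations (19)."""
    n1 = m + 1
    # C[j][k] = sum_r x_r^(j+k),  d[j] = sum_r f_r * x_r^j
    C = [[Fraction(sum(x ** (j + k) for x in xs)) for k in range(n1)]
         for j in range(n1)]
    d = [Fraction(sum(f * x ** j for x, f in zip(xs, fs)))
         for j in range(n1)]
    # Forward elimination (safe without pivoting: C is symmetric
    # positive definite when the x_r are distinct and m <= n).
    for p in range(n1):
        for i in range(p + 1, n1):
            factor = C[i][p] / C[p][p]
            for k in range(p, n1):
                C[i][k] -= factor * C[p][k]
            d[i] -= factor * d[p]
    # Back substitution.
    a = [Fraction(0)] * n1
    for i in reversed(range(n1)):
        s = sum(C[i][k] * a[k] for k in range(i + 1, n1))
        a[i] = (d[i] - s) / C[i][i]
    return a

# Example 2: straight-line fit (m = 1) to the data of Example 1.
a = polyfit_ls([1, 3, 4, 5], [2, 4, 3, 1], 1)
print(a)  # [Fraction(107, 35), Fraction(-6, 35)]
```

For m = 1 the assembled system is exactly (20)–(21), i.e. 4a_0 + 13a_1 = 10 and 13a_0 + 51a_1 = 31.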

EXAMPLE 3. To compute a quadratic fit a_0 + a_1 x + a_2 x^2 to the data {(x_r, f_r)}, we set m = 2 in the normal equations to give

    a_0 Σ 1 + a_1 Σ x_r + a_2 Σ x_r^2 = Σ f_r,              (22)
    a_0 Σ x_r + a_1 Σ x_r^2 + a_2 Σ x_r^3 = Σ f_r x_r,      (23)
    a_0 Σ x_r^2 + a_1 Σ x_r^3 + a_2 Σ x_r^4 = Σ f_r x_r^2.  (24)

The least squares formulation permits more general functions p_m(x) than simply polynomials, but the unknown coefficients in p_m(x) must still occur linearly. The most general form is

    p_m(x) = a_0 φ_0(x) + a_1 φ_1(x) + ⋯ + a_m φ_m(x) = Σ_{r=0}^{m} a_r φ_r(x),

where the basis functions φ_0(x), …, φ_m(x) could be, for example, a linear polynomial spline basis, a cubic polynomial B-spline basis, or a set of linearly independent trigonometric functions. By analogy with the power series case, the linear system of normal equations is

    a_0 Σ φ_0(x_r) φ_j(x_r) + a_1 Σ φ_1(x_r) φ_j(x_r) + ⋯ + a_m Σ φ_m(x_r) φ_j(x_r) = Σ f_r φ_j(x_r),

for j = 0, 1, …, m, with all sums over r = 0, 1, …, n. Again, the coefficient matrix of this linear system,

    [ Σ φ_0(x_r)^2         Σ φ_0(x_r) φ_1(x_r)  ⋯  Σ φ_0(x_r) φ_m(x_r) ]
    [ Σ φ_1(x_r) φ_0(x_r)  Σ φ_1(x_r)^2         ⋯  Σ φ_1(x_r) φ_m(x_r) ]
    [         ⋮                     ⋮                       ⋮           ]
    [ Σ φ_m(x_r) φ_0(x_r)  Σ φ_m(x_r) φ_1(x_r)  ⋯  Σ φ_m(x_r)^2        ]

is symmetric and positive definite.

3 Least Squares Solution of Linear Systems

The principle of least squares can be used in other situations. As an example, let us attempt to solve an inconsistent system of linear equations of the form

    Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, …, m;    (25)

that is,

    Ax = b,
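The same machinery works for any linearly independent basis. A minimal sketch (plain Python; the function name `basis_fit` and the particular basis chosen are our illustration, not from the notes) forms the matrix of entries Σ φ_i(x_r) φ_j(x_r) and right-hand side Σ f_r φ_j(x_r); with the monomial basis φ_0(x) = 1, φ_1(x) = x it reproduces the line of Example 1:

```python
from fractions import Fraction

def basis_fit(xs, fs, basis):
    """Least squares fit sum_j a_j * basis[j](x) via the normal equations."""
    m1 = len(basis)
    C = [[Fraction(sum(basis[i](x) * basis[j](x) for x in xs))
          for j in range(m1)] for i in range(m1)]
    d = [Fraction(sum(f * basis[j](x) for x, f in zip(xs, fs)))
         for j in range(m1)]
    # Solve C a = d by Gaussian elimination and back substitution
    # (no pivoting needed: C is symmetric positive definite for a
    # linearly independent basis).
    for p in range(m1):
        for i in range(p + 1, m1):
            factor = C[i][p] / C[p][p]
            for k in range(p, m1):
                C[i][k] -= factor * C[p][k]
            d[i] -= factor * d[p]
    a = [Fraction(0)] * m1
    for i in reversed(range(m1)):
        s = sum(C[i][k] * a[k] for k in range(i + 1, m1))
        a[i] = (d[i] - s) / C[i][i]
    return a

# Monomial basis {1, x}: reproduces the straight line of Example 1.
a = basis_fit([1, 3, 4, 5], [2, 4, 3, 1], [lambda x: 1, lambda x: x])
print(a)  # [Fraction(107, 35), Fraction(-6, 35)]
```

Swapping in spline or trigonometric basis functions changes only the `basis` list; the normal-equations structure is unchanged.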

where A = (a_ij) is m × n, x = [x_1, …, x_n]^T, and b = [b_1, …, b_m]^T. Here m > n; that is, the number of equations is more than the number of unknowns in the system of linear equations (25). Generally speaking, this system of linear equations is inconsistent in the sense that there is no vector x = (x_1, …, x_n)^T which satisfies all the equations in (25). In this case, an alternative is to minimize the sum of the squares of the residuals to obtain the least squares solution of the linear system. This solution can be determined in the following way.

The sum of the squares of the residuals is given by

    σ(x_1, x_2, …, x_n) = Σ_{i=1}^{m} ( Σ_{j=1}^{n} a_ij x_j − b_i )^2.

To find the minimization point for the error function σ(x_1, x_2, …, x_n), we take the partial derivative of σ with respect to each variable x_k, yielding

    ∂σ/∂x_k = 2 Σ_{i=1}^{m} ( Σ_{j=1}^{n} a_ij x_j − b_i ) a_ik,

for k = 1, 2, …, n. Therefore, the minimization point x = (x_1, …, x_n) must be a solution of the linear system

    Σ_{j=1}^{n} ( Σ_{i=1}^{m} a_ij a_ik ) x_j = Σ_{i=1}^{m} a_ik b_i,  k = 1, 2, …, n.

By setting

    c_kj = Σ_{i=1}^{m} a_ij a_ik,  d_k = Σ_{i=1}^{m} a_ik b_i,    (26)

we see that the minimization point x = (x_1, x_2, …, x_n) is the solution of the system of linear equations

    Σ_{j=1}^{n} c_kj x_j = d_k,  k = 1, 2, …, n.

EXAMPLE 4. As an example of the least squares solution of inconsistent linear systems, consider the three equations in two unknowns:

    x_1 + x_2 = 1,
    x_1 − x_2 = 2,
    x_1 + 2x_2 = −1,

or

    [ 1   1 ] [ x_1 ]   [  1 ]
    [ 1  −1 ] [ x_2 ] = [  2 ]
    [ 1   2 ]           [ −1 ]

This system of linear equations is inconsistent because it has no solution at all. Now, from (26), we see that

    c_11 = 3,  c_12 = 2,  c_21 = 2,  c_22 = 6,  d_1 = 2,  d_2 = −3.

Therefore, the least squares solution must solve the 2 × 2 linear system

    3x_1 + 2x_2 = 2,     (27)
    2x_1 + 6x_2 = −3.    (28)

The solution is given by

    x_1 = 9/7,  x_2 = −13/14.

There is a shortcut to the least squares formulation for inconsistent linear systems. Notice that the matrix C = (c_kj), where c_kj is given by (26), is nothing but the product of A^T and A; that is, C = A^T A, and the right-hand side vector d = (d_1, …, d_n)^T is given by d = A^T b. Therefore, the least squares solution of the matrix problem Ax = b is given by the solution of the normal equations

    A^T A x = A^T b.

The following summarizes the procedure in the least squares approach for inconsistent linear systems of the form Ax = b:

- Find the transpose of A, denoted by A^T.
- Multiply the system Ax = b by A^T to obtain A^T A x = A^T b.
- Find the solution of the normal system A^T A x = A^T b.

The solution to the normal system is exactly the least squares solution of the original inconsistent linear system.
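The shortcut A^T A x = A^T b is easy to check on Example 4. A minimal sketch in plain Python (exact arithmetic; the matrix and right-hand side are those of the example):

```python
from fractions import Fraction

# The inconsistent system of Example 4.
A = [[1, 1], [1, -1], [1, 2]]
b = [1, 2, -1]

# Form the normal equations A^T A x = A^T b.
AtA = [[sum(A[i][j] * A[i][k] for i in range(3)) for j in range(2)]
       for k in range(2)]
Atb = [sum(A[i][k] * b[i] for i in range(3)) for k in range(2)]

# Solve the 2x2 system by Cramer's rule.
det = Fraction(AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0])
x1 = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
x2 = (AtA[0][0] * Atb[1] - Atb[0] * AtA[1][0]) / det
print(AtA, Atb)  # [[3, 2], [2, 6]] [2, -3]
print(x1, x2)    # 9/7 -13/14
```

The assembled matrix and right-hand side match (26), i.e. the system (27)–(28), and the solution matches x_1 = 9/7, x_2 = −13/14.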