Lecture 19: Curve fitting I


Lecture 19: Curve fitting I
EE Numerical Computing, Scott Hudson

1 Introduction

Suppose we are presented with eight points of measured data (x_i, y_i). As shown in Fig. 1 on the left, we could represent the underlying function of which these data are samples by interpolating between the data points using one of the methods we have studied previously.

Fig. 1: Measured data with: (left) spline interpolation, (right) line fit.

However, maybe the data are samples of the response of a process that we know, in theory, is supposed to have the form y = f(x) = a x + b, where a, b are constants. Maybe we also know that y is a very weak signal and the sensor used to measure it is noisy; that is, it adds its own (random) signal in with the true y data. Given this, it makes no sense to interpolate the data, because in part we'll be interpolating noise, and we know that the real signal should have the form y = a x + b. In a situation like this we prefer to fit a line to the data rather than perform an interpolation (Fig. 1 at right). If done correctly, this can provide a degree of immunity against the effects of measurement errors and noise. More generally, we want to develop curve-fitting techniques that allow theoretical curves, or models, with unknown parameters (such as a and b in the line case) to be fit to data points.

2 Fitting a constant to measured data

The simplest curve-fitting problem is estimating a parameter from multiple measurements. Suppose m is the mass of an object. We want to measure this using a scale. Unfortunately, the scales in our laboratory are not well calibrated. However, we have nine scales. We expect that if

Fig. 2: Horizontal line is the average of several measurements (dots).

we take measurements with all of them and average the results, we should get a better estimate of the true mass than by relying on the measurement from a single scale. Our results might look something like those shown in Fig. 2. Let the measurement of the ith scale be m_i; then the average measurement is given by

\bar{m} = \frac{1}{n} \sum_{i=1}^{n} m_i    (1)

where n is the number of measurements. This is what we should use as our best estimate of the true mass. Averaging is a very basic form of curve fitting.

3 Least-squares line fit

Going back to the situation illustrated in Fig. 1, how do we figure out the best-fit line? There doesn't seem to be a straightforward way to average the data like we did in Fig. 2. Instead, let's suppose we have n data points (x_i, y_i). We are interested in a linear model of the form y = a x + b, and our task is to calculate the best values for a and b. If all our data actually fell on a line, then the best a and b values would result in y_i - (a x_i + b) = 0 for i = 1, 2, ..., n. More generally, let's define the residual (the "error of the fit") for the ith data point as

r_i = y_i - (a x_i + b)    (2)

A perfect fit would give r_i = 0 for all i. The residual can be positive or negative, but what we are most concerned with is its magnitude. Let's define the mean squared error (MSE) as

MSE = \frac{1}{n} \sum_{i=1}^{n} r_i^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right)^2    (3)
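Definitions (2) and (3) are easy to evaluate directly. A minimal NumPy sketch (a translation; the lecture's own examples use Scilab, and the data points below are made up for illustration):

```python
import numpy as np

# Made-up data points that lie near, but not exactly on, the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

def mse(a, b):
    """Mean squared error of the line y = a*x + b, per equations (2)-(3)."""
    r = y - (a * x + b)        # residuals r_i = y_i - (a*x_i + b)
    return np.mean(r ** 2)     # MSE = (1/n) * sum(r_i^2)

# A line close to the data fits far better than a clearly wrong one.
print(mse(2.0, 1.0))   # small (about 0.025)
print(mse(0.0, 0.0))   # much larger
```

Least-squares fitting amounts to searching over (a, b) for the minimum of this function, which the next section does in closed form.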

We now seek the values of a and b that minimize the MSE. These will satisfy

\frac{\partial MSE}{\partial a} = 0 \quad and \quad \frac{\partial MSE}{\partial b} = 0    (4)

The b derivative is

\frac{\partial MSE}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right) = 0    (5)

Multiplying through by -1/2 and rearranging, we find

\frac{1}{n} \sum_{i=1}^{n} y_i - a \frac{1}{n} \sum_{i=1}^{n} x_i - b = 0    (6)

Now define the average x and y values as

\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i , \quad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i    (7)

Equation (6) then reads

\bar{y} - a \bar{x} - b = 0    (8)

or

a \bar{x} + b = \bar{y}    (9)

This tells us that the point ( \bar{x} , \bar{y} ) (the centroid of the data) falls on the line. The a derivative of the MSE is

\frac{\partial MSE}{\partial a} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right) x_i = 0    (10)

Multiplying through by -1/2 and rearranging, we find

\frac{1}{n} \sum_{i=1}^{n} x_i y_i - a \frac{1}{n} \sum_{i=1}^{n} x_i^2 - b \frac{1}{n} \sum_{i=1}^{n} x_i = 0    (11)

or

\overline{xy} - a \overline{x^2} - b \bar{x} = 0    (12)

with the additional definitions

\overline{xy} = \frac{1}{n} \sum_{i=1}^{n} x_i y_i , \quad \overline{x^2} = \frac{1}{n} \sum_{i=1}^{n} x_i^2    (13)

A final rearrangement gives us

a \overline{x^2} + b \bar{x} = \overline{xy}    (14)

We now have two equations in the two unknowns a, b:

a \bar{x} + b = \bar{y}
a \overline{x^2} + b \bar{x} = \overline{xy}    (15)
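The pair of equations (15) is just a 2-by-2 linear system, so it can also be handed to a numerical solver. A NumPy sketch (not from the lecture, which works the system by hand) that also checks the centroid property (9):

```python
import numpy as np

# Noiseless samples of y = 2x + 3; the fit should recover a = 2, b = 3 exactly.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = 2.0 * x + 3.0

# Build the system (15):  a*x_bar  + b       = y_bar
#                         a*x2_bar + b*x_bar = xy_bar
x_bar, y_bar = x.mean(), y.mean()
x2_bar, xy_bar = (x**2).mean(), (x * y).mean()
M = np.array([[x_bar, 1.0], [x2_bar, x_bar]])
rhs = np.array([y_bar, xy_bar])
a, b = np.linalg.solve(M, rhs)

print(a, b)  # recovers the true slope and intercept
# Centroid property, equation (9): the point (x_bar, y_bar) lies on the line.
print(np.isclose(a * x_bar + b, y_bar))  # True
```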

Fig. 3: Least-squares line fit to noisy data.

Solving the first equation for b,

b = \bar{y} - a \bar{x}    (16)

and substituting this into the second equation, we obtain

a \overline{x^2} + ( \bar{y} - a \bar{x} ) \bar{x} = \overline{xy}    (17)

Solving this for a, we have

a = \frac{ \overline{xy} - \bar{x} \, \bar{y} }{ \overline{x^2} - \bar{x}^2 }    (18)

Equations (18) and (16) provide the best-fit values of a and b. Because we obtained these parameters by minimizing the sum of squared residuals, this is called a least-squares line fit.

Example. The code below generates six points on the line y = 1 - x and adds normally distributed noise of standard deviation 0.1 to the y values. Then (18) and (16) are used to calculate the best-fit values of a and b. The data and fit line are plotted in Fig. 3. The true values are a = -1, b = 1. The fit values are a = -0.90, b = 1.09.

-->x = [0:0.2:1]';
-->y = 1-x+rand(x,'normal')*0.1;
-->a = (mean(x.*y)-mean(x)*mean(y))/(mean(x.^2)-mean(x)^2)
 a  = - 0.90347
-->b = mean(y)-a*mean(x)
 b  = 1.0945
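The same session can be rendered in NumPy (a sketch, not the lecture's code; a different random generator means the fitted digits will not match the Scilab run above exactly):

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed so the run repeats
x = np.linspace(0.0, 1.0, 6)                 # six points, like [0:0.2:1]'
y = 1.0 - x + 0.1 * rng.standard_normal(6)   # y = 1 - x plus noise

# Equation (18) for the slope, then equation (16) for the intercept.
a = (np.mean(x*y) - np.mean(x)*np.mean(y)) / (np.mean(x**2) - np.mean(x)**2)
b = np.mean(y) - a * np.mean(x)

# Cross-check: np.polyfit with degree 1 performs the same least-squares fit.
a_np, b_np = np.polyfit(x, y, 1)
print(np.allclose([a, b], [a_np, b_np]))  # True
print(a, b)  # near the true values a = -1, b = 1
```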

4 Linear least-squares

The least-squares idea can be applied to a linear combination of any functions f_1(x), f_2(x), ..., f_m(x). Our model has the form

y = \sum_{j=1}^{m} c_j f_j(x)    (19)

For example, if m = 2 and f_1(x) = 1, f_2(x) = x, then our model is

y = c_1 + c_2 x    (20)

which is just the linear case we've already dealt with. If we add f_3(x) = x^2, then the model is

y = c_1 + c_2 x + c_3 x^2    (21)

which is an arbitrary quadratic. Or we could have a model such as

y = c_1 \cos(5x) + c_2 \sin(5x) + c_3 \cos(10x) + c_4 \sin(10x)    (22)

In any case we'll continue to define the residuals as the difference between the observed and the modeled y values,

r_i = y_i - \sum_{j=1}^{m} c_j f_j(x_i)    (23)

and the mean-squared error as

MSE = \frac{1}{n} \sum_{i=1}^{n} r_i^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{m} c_j f_j(x_i) \right)^2    (24)

Let's expand this as

\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ y_i^2 - 2 y_i \sum_{j=1}^{m} c_j f_j(x_i) + \left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 \right]    (25)

Call

\overline{y^2} = \frac{1}{n} \sum_{i=1}^{n} y_i^2 \quad and \quad \frac{1}{n} \sum_{i=1}^{n} y_i \sum_{j=1}^{m} c_j f_j(x_i) = \sum_{j=1}^{m} b_j c_j    (26)

with

b_j = \frac{1}{n} \sum_{i=1}^{n} y_i f_j(x_i)    (27)

The last term in (25) can be written

\left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \sum_{j=1}^{m} c_j f_j(x_i) \sum_{k=1}^{m} c_k f_k(x_i)    (28)

Therefore

\frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{m} c_j f_j(x_i) \, c_k f_k(x_i) = \sum_{j=1}^{m} \sum_{k=1}^{m} a_{jk} c_j c_k    (29)

with

a_{jk} = a_{kj} = \frac{1}{n} \sum_{i=1}^{n} f_j(x_i) f_k(x_i)    (30)

Finally, we can write

MSE = \overline{y^2} - 2 \sum_{i=1}^{m} b_i c_i + \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} c_i c_j    (31)

This shows that the MSE is a quadratic function of the unknown coefficients. In the lecture "Optimization in n dimensions" we calculated the solution to a system of this form, except that the second term (with the b coefficients) had a plus rather than a minus sign. Defining the column vectors b and c and the matrix A as

c = [c_j], \quad b = [b_j], \quad A = [a_{ij}]    (32)

the condition for a minimum is (with the minus sign for the b coefficients)

-2b + 2Ac = 0    (33)

and

c = A^{-1} b    (34)

Another way to arrive at this result is to define the column vector

y = [y_i]    (35)

and the matrix

F = [f_{ij}] \quad with \quad f_{ij} = f_j(x_i)    (36)

Then our model is

y = F c    (37)

This is n equations in m < n unknowns and in general will not have a solution. Multiplying both sides on the left by F^T results in the system

F^T F c = F^T y    (38)

Since F^T F is m-by-m and F^T y is m-by-1, this is a system of m equations in m unknowns that, in general, will have a unique solution

c = ( F^T F )^{-1} F^T y    (39)

The elements of F^T F are

[ F^T F ]_{jk} = \sum_{i=1}^{n} f_{ij} f_{ik} = n a_{jk}    (40)

while the elements of F^T y are

[ F^T y ]_j = \sum_{i=1}^{n} f_{ij} y_i = n b_j    (41)

Therefore F^T F c = F^T y, when multiplied through by 1/n, is equivalent to

A c = b    (42)

The linear system (38) is called the normal equation, and we have the following algorithm:

Linear least-squares fit
Given n samples (x_i, y_i) and a model y = \sum_{j=1}^{m} c_j f_j(x):
Form the matrix F with elements f_{ij} = f_j(x_i).
Form the column vector y with elements y_i.
Solve the normal equation F^T F c = F^T y for c.
The modeled y values are \hat{y} = F c.

The matrix F is not square if n > m, so we cannot solve the linear system

y = F c    (43)

by writing

c = F^{-1} y    (44)

because F does not have an inverse. However, as we've seen, we can compute

c = ( F^T F )^{-1} F^T y    (45)

and this c will come as close as possible (in a least-squares sense) to solving (43). This leads us to define the pseudoinverse of F as the matrix

F^{+} = ( F^T F )^{-1} F^T    (46)

Our least-squares solution can now be written

c = F^{+} y    (47)

In Scilab/Matlab the pseudoinverse is computed by the command pinv(F). However, if we simply apply the backslash operator as we would for a square system,

c = F\y

Scilab/Matlab returns the least-squares solution. We do not have to explicitly form the normal

equation or the pseudoinverse.

Example. Noise was added to eleven samples of y = x^2 - x, x = 0, 0.1, 0.2, ..., 1. A least-squares fit of the model c_1 + c_2 x + c_3 x^2 gave c_1 = 0.044, c_2 = -1.047, c_3 = 1.039. Code is shown below and results are plotted in Fig. 4.

-->x = [0:0.1:1]';
-->y0 = x.^2-x;
-->y = y0+rand(y0,'normal')*0.03; //add noise
-->F = [ones(x),x,x.^2];
-->c = F\y
 c  =
   0.043654
 - 1.04735
   1.03903
-->yf = F*c

Fig. 4: f(x) = x^2 - x (dashed curve), samples of f(x) with noise added (dots), and least-squares fit of the model c_1 + c_2 x + c_3 x^2 (solid line).

5 Goodness of fit

Once we've fit a model to data, we may wonder if the fit is good or not. It would be helpful to have a measure of goodness of fit. Doing this rigorously requires details from probability theory; we will present the following results without derivation. Assume our y values are of the form y_i = s_i + \eta_i, where s_i is the signal that we are trying to model and \eta_i is noise. If our model were to perfectly

fit the signal, then the residuals

r_i = y_i - \sum_{j=1}^{m} c_j f_j(x_i)    (48)

would simply be noise: r_i = \eta_i. We can quantify the goodness of fit by comparing the statistics of our residuals to the (assumed known) statistics of the noise. Specifically, for large n and normally distributed noise, a good fit will result in the number

\sigma = \sqrt{ \frac{1}{n-m} \sum_{i=1}^{n} r_i^2 }    (49)

being equal, on average, to the standard deviation of the noise, where n is the number of data points and m is the number of model coefficients. If it is significantly larger than this, it indicates that the model is not accounting for all of the signal, where a fractional change of about 1/\sqrt{2(n-m)} is statistically significant. For example, 1/\sqrt{2 \cdot 50} = 0.1 means that a change of around 10% is statistically significant. If the noise standard deviation is 0.1, a \sigma larger than about 0.1(1.1) = 0.11 implies the signal is not being fully modeled. The following example illustrates the use of this goodness-of-fit measure.

Example. The following code was used to generate 50 samples of the function f(x) = x + x^2 over the interval 0 <= x <= 1, with normally distributed noise of standard deviation 0.05 added to each sample.

n = 50;
rand('seed',1);
x = [linspace(0,1,n)]';
y = x+x.^2+rand(x,'normal')*0.05;

These data were then fit by the four models y = c_1, y = c_1 + c_2 x, y = c_1 + c_2 x + c_3 x^2, and y = c_1 + c_2 x + c_3 x^2 + c_4 x^3. The resulting \sigma values were \sigma_0 = 0.608, \sigma_1 = 0.0864, \sigma_2 = 0.0506, and \sigma_3 = 0.0504. Since 1/\sqrt{2 \cdot 50} = 0.1, a change of about 10% is statistically significant. The fits improved significantly until the last model. The data therefore support the model y = c_1 + c_2 x + c_3 x^2 but not the cubic model. The fits are shown in Fig. 5.

Fig. 5: Data set fit by polynomials. Top-left: y = c_1, \sigma_0 = 0.608. Top-right: y = c_1 + c_2 x, \sigma_1 = 0.0864. Bottom-left: y = c_1 + c_2 x + c_3 x^2, \sigma_2 = 0.0506. Bottom-right: y = c_1 + c_2 x + c_3 x^2 + c_4 x^3, \sigma_3 = 0.0504.
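The whole recipe, fitting a family of polynomial models and comparing their \sigma values from equation (49), can be replayed in NumPy (a sketch paralleling the Scilab example above; a different random generator means the \sigma values will only roughly match those quoted):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 1.0, n)
y = x + x**2 + 0.05 * rng.standard_normal(n)   # f(x) = x + x^2 plus noise

def sigma_fit(degree):
    """Least-squares polynomial fit of the given degree; returns sigma
    from equation (49), with m = degree + 1 model coefficients."""
    F = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    c, *_ = np.linalg.lstsq(F, y, rcond=None)      # solves min ||F c - y||
    r = y - F @ c                                  # residuals
    m = degree + 1
    return np.sqrt(np.sum(r**2) / (n - m))

sigmas = [sigma_fit(d) for d in range(4)]   # constant, line, quadratic, cubic
print(sigmas)
# Expect large drops up to the quadratic model, then essentially no change;
# the quadratic sigma should sit near the noise level 0.05.
```

As in the lecture's example, the stalling of \sigma between the quadratic and cubic fits is the signal to stop adding model terms.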