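ESS 522 2014

17. Line Fitting

A very common problem in data analysis is looking for relationships between different parameters and fitting lines or surfaces to data. The simplest example is fitting a straight line, and we will discuss that here (it is also covered in Chapter 4 of Paul Wessel's notes).

Least-squares straight-line fitting

The process of fitting a straight line is one of the simplest examples of an inverse problem. We take $n$ pairs of data points $(X_i, Y_i)$, $i = 1, \ldots, n$, and fit the data with a relationship

$$ y_i = a + bX_i \tag{17-1} $$

where $y_i$ is the predicted value of $Y_i$. We can use the $\chi^2$ statistic to measure the misfit

$$ \chi^2 = \sum_{i=1}^{n} \left( \frac{Y_i - a - bX_i}{\sigma_i} \right)^2 \tag{17-2} $$

where $\sigma_i$ are the uncertainties, or estimates of the uncertainties (which can be set to unity in the absence of better knowledge), in the $Y_i$ values. Our goal is to find the values of $a$ and $b$ that minimize $\chi^2$. To do this we take the partial derivatives of $\chi^2$ with respect to $a$ and $b$ and solve for the values of $a$ and $b$ at which they are both zero:

$$ \frac{\partial \chi^2}{\partial a} = -2 \sum_{i=1}^{n} \frac{Y_i - a - bX_i}{\sigma_i^2} = 0, \qquad \frac{\partial \chi^2}{\partial b} = -2 \sum_{i=1}^{n} \frac{X_i \left( Y_i - a - bX_i \right)}{\sigma_i^2} = 0 \tag{17-3} $$

If we define

$$ S = \sum_{i=1}^{n} \frac{1}{\sigma_i^2}, \quad S_X = \sum_{i=1}^{n} \frac{X_i}{\sigma_i^2}, \quad S_Y = \sum_{i=1}^{n} \frac{Y_i}{\sigma_i^2}, \quad S_{XX} = \sum_{i=1}^{n} \frac{X_i^2}{\sigma_i^2}, \quad S_{XY} = \sum_{i=1}^{n} \frac{X_i Y_i}{\sigma_i^2} \tag{17-4} $$

then equation (17-3) reduces to

$$ aS + bS_X = S_Y, \qquad aS_X + bS_{XX} = S_{XY} \tag{17-5} $$

The solution is

$$ a = \frac{S_{XX} S_Y - S_X S_{XY}}{\Delta}, \qquad b = \frac{S\,S_{XY} - S_X S_Y}{\Delta} \tag{17-6} $$

with

$$ \Delta = S\,S_{XX} - S_X^2 \tag{17-7} $$

We can also estimate the uncertainties in $a$ and $b$. To do this we sum the variance in $a$ and $b$ resulting from the variance in each of the $Y_i$ values. This can be written mathematically as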

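$$ s_a^2 = \sum_{i=1}^{n} \sigma_i^2 \left( \frac{\partial a}{\partial Y_i} \right)^2, \qquad s_b^2 = \sum_{i=1}^{n} \sigma_i^2 \left( \frac{\partial b}{\partial Y_i} \right)^2 \tag{17-8} $$

After substituting derivatives obtained from equation (17-6) and a fair amount of manipulation we get

$$ s_a^2 = \frac{S_{XX}}{\Delta}, \qquad s_b^2 = \frac{S}{\Delta} \tag{17-9} $$

We can also estimate the covariance of the uncertainties in $a$ and $b$:

$$ s_{ab} = \sum_{i=1}^{n} \sigma_i^2 \, \frac{\partial a}{\partial Y_i} \frac{\partial b}{\partial Y_i} = -\frac{S_X}{\Delta} \tag{17-10} $$

Our estimate of the correlation coefficient between $a$ and $b$ becomes

$$ r_{ab} = \frac{s_{ab}}{s_a s_b} = -\frac{S_X}{\sqrt{S\,S_{XX}}} \tag{17-11} $$

If we assume our estimates of the uncertainty $\sigma_i$ are correct, we can check whether the fit is adequate (significant) at the $\alpha$ level by comparing our value of $\chi^2$ to the critical $\chi^2_\alpha$ for $n - 2$ degrees of freedom. Provided it does not exceed this value, the data are fit adequately by the straight line.

We can test the significance of the correlation of $x$ and $y$ by applying the t-statistic with $n - 2$ degrees of freedom to determine whether the slope and our estimate of its uncertainty are significantly different from 0:

$$ t = \frac{b - 0}{s_b} \tag{17-12} $$

We can write the 95% confidence limits for $b$ as

$$ b \pm t_{0.025}\, s_b \tag{17-13} $$

If these limits enclose zero we cannot be confident that $x$ and $y$ are correlated at the 95% level.

If we do not know the uncertainty of our data but know that the straight-line model is correct, then we can initially assume an uncertainty of 1 for the purpose of getting a straight-line fit and then estimate it from the residuals according to

$$ s^2 = \frac{1}{n - 2} \sum_{i=1}^{n} \left( Y_i - a - bX_i \right)^2 \tag{17-14} $$

We can use $s^2$ in place of the population variance $\sigma^2$ to estimate the slope uncertainties, but this will lead to an underestimate of the uncertainty for small $n$ because there is additional uncertainty arising from using an estimate of $\sigma$ and not its true value.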
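The fitting recipe in equations (17-4) to (17-14) can be sketched in a few lines of Python. This is a minimal illustration, not code from the notes: the function name `fit_line`, its arguments, and the returned dictionary keys are assumptions made for this example, and it relies only on NumPy. When no uncertainties are supplied it assumes $\sigma_i = 1$ and rescales the parameter variances by $s^2$ from equation (17-14).

```python
import numpy as np

def fit_line(X, Y, sigma=None):
    """Least-squares fit of Y = a + b*X (hypothetical helper for illustration).

    sigma: per-point uncertainties of Y. If None, unit uncertainties are
    assumed and the variances are rescaled by s^2 from equation (17-14).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.size
    sigma_known = sigma is not None
    sig = np.broadcast_to(
        np.asarray(sigma if sigma_known else 1.0, dtype=float), X.shape)
    w = 1.0 / sig**2

    # Weighted sums of equation (17-4)
    S, SX, SY = w.sum(), (w * X).sum(), (w * Y).sum()
    SXX, SXY = (w * X * X).sum(), (w * X * Y).sum()

    # Solution of the normal equations, (17-6) and (17-7)
    delta = S * SXX - SX**2
    a = (SXX * SY - SX * SXY) / delta
    b = (S * SXY - SX * SY) / delta

    # Variances and covariance of a and b, (17-9) and (17-10)
    var_a, var_b, cov_ab = SXX / delta, S / delta, -SX / delta

    # Misfit, equation (17-2)
    chi2 = (w * (Y - a - b * X) ** 2).sum()

    if not sigma_known:
        # Estimate the data variance from the residuals, equation (17-14)
        s2 = chi2 / (n - 2)
        var_a, var_b, cov_ab = var_a * s2, var_b * s2, cov_ab * s2

    s_a, s_b = np.sqrt(var_a), np.sqrt(var_b)
    return {
        "a": a, "b": b, "s_a": s_a, "s_b": s_b,
        "r_ab": cov_ab / (s_a * s_b),   # equation (17-11)
        "chi2": chi2,
        "t": b / s_b,                   # equation (17-12)
    }

if __name__ == "__main__":
    # Made-up data, used only to exercise the function
    res = fit_line([0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.0])
    t_crit = 3.182  # t_0.025 for n - 2 = 3 degrees of freedom, from a t table
    print(res["a"], res["b"])
    print("95% limits on b:",
          res["b"] - t_crit * res["s_b"], res["b"] + t_crit * res["s_b"])
```

The critical value of t is passed in by hand here to keep the sketch free of extra dependencies; in practice it would be looked up for the appropriate $n - 2$ degrees of freedom.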

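(Note that Paul Wessel uses $\sigma_b$ etc. instead of $s_b$ in section 4.1, but this is not consistent with the notation he introduces in Chapter 1 and that we have used, since $s_b$ is clearly an estimate based on a limited sample of points, not the entire population.)

Line fitting with errors in x and y

It is important to note that equation (17-2) assumes that our determinations of $x$ have no uncertainty. In some instances this is a good assumption; for example, our determinations of time or spatial coordinate will often have negligible uncertainty. In other instances it is a poor approximation; for example, if we plot the concentrations of two dissolved chemicals in seawater or two trace elements in a rock, they may both have similar analytical errors. If we have errors in both variables then a better measure of misfit is given by

$$ E = \sum_{i=1}^{n} \left[ \frac{\left( X_i - x_i \right)^2}{\sigma_{x,i}^2} + \frac{\left( Y_i - y_i \right)^2}{\sigma_{y,i}^2} \right] \tag{17-15} $$

where $X_i$ and $Y_i$ are the observed data and $x_i$ and $y_i$ are the modeled values that are required to lie on a straight line

$$ y_i = a + bx_i \tag{17-16} $$

Our goal is to find the values of $a$ and $b$ that minimize $E$. To do this we use the method of Lagrange multipliers. We can write equation (17-16) as

$$ f_i = a + bx_i - y_i = 0 \tag{17-17} $$

and since the $f_i$ values are constrained to be zero we can write equation (17-15) as

$$ E = \sum_{i=1}^{n} \left[ \frac{\left( X_i - x_i \right)^2}{\sigma_{x,i}^2} + \frac{\left( Y_i - y_i \right)^2}{\sigma_{y,i}^2} + 2\lambda_i f_i \right] \tag{17-18} $$

where the $\lambda_i$ values are unknown constant Lagrange multipliers and the factor of 2 is for algebraic convenience. We now set the partial derivatives of $E$ to zero to find the values that give a minimum:

$$ \frac{\partial E}{\partial x_i} = \frac{\partial E}{\partial y_i} = \frac{\partial E}{\partial a} = \frac{\partial E}{\partial b} = 0 $$

Now if we make the assumption that all the sigma values are equal to unity this gives

$$ \frac{\partial E}{\partial x_i} = \frac{\partial}{\partial x_i} \left( X_i - x_i \right)^2 + 2\lambda_i \frac{\partial}{\partial x_i} \left( bx_i \right) = -2\left( X_i - x_i \right) + 2b\lambda_i = 0 \tag{17-19} $$

$$ \frac{\partial E}{\partial y_i} = \frac{\partial}{\partial y_i} \left( Y_i - y_i \right)^2 + 2\lambda_i \frac{\partial}{\partial y_i} \left( -y_i \right) = -2\left( Y_i - y_i \right) - 2\lambda_i = 0 \tag{17-20} $$

$$ \frac{\partial E}{\partial a} = \frac{\partial}{\partial a} \sum_{i=1}^{n} 2\lambda_i a = 2 \sum_{i=1}^{n} \lambda_i = 0 \tag{17-21} $$

$$ \frac{\partial E}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{n} 2\lambda_i bx_i = 2 \sum_{i=1}^{n} \lambda_i x_i = 0 \tag{17-22} $$

From equations (17-19) and (17-20) we can write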

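$$ x_i = X_i - b\lambda_i, \qquad y_i = Y_i + \lambda_i \tag{17-23} $$

Substituting for $x_i$ and $y_i$ in equation (17-16) yields

$$ Y_i + \lambda_i = a + b\left( X_i - b\lambda_i \right) = a + bX_i - b^2 \lambda_i \tag{17-24} $$

Solving for $\lambda_i$ yields

$$ \lambda_i = \frac{a + bX_i - Y_i}{1 + b^2} \tag{17-25} $$

Substituting for $\lambda_i$ into equation (17-21) yields

$$ \sum_{i=1}^{n} \frac{a + bX_i - Y_i}{1 + b^2} = 0 \tag{17-26} $$

Substituting for $\lambda_i$ from equation (17-25) and for $x_i$ from (17-23) into equation (17-22) yields

$$ \sum_{i=1}^{n} \frac{a + bX_i - Y_i}{1 + b^2} \left( X_i - b\lambda_i \right) = \sum_{i=1}^{n} \frac{a + bX_i - Y_i}{1 + b^2} \left( X_i - b\,\frac{a + bX_i - Y_i}{1 + b^2} \right) = 0 \tag{17-27} $$

We have now reduced the $n + 2$ equations for $a$, $b$ and $\lambda_i$ to two equations, (17-26) and (17-27), for $a$ and $b$. Since the denominator in equation (17-26) cannot reduce to zero, we can write

$$ \sum_{i=1}^{n} \left( a + bX_i - Y_i \right) = 0 \;\Rightarrow\; na = \sum_{i=1}^{n} Y_i - b \sum_{i=1}^{n} X_i \;\Rightarrow\; a = \bar{Y} - b\bar{X} \tag{17-28} $$

where $\bar{X}$ and $\bar{Y}$ are the mean values of the data. We can substitute equation (17-28) into equation (17-27), multiply by $(1 + b^2)^2$, and use the variables $U_i = X_i - \bar{X}$ and $V_i = Y_i - \bar{Y}$, and after a few lines of manipulation get

$$ b^2 \sum_{i=1}^{n} U_i V_i + b \left( \sum_{i=1}^{n} U_i^2 - \sum_{i=1}^{n} V_i^2 \right) - \sum_{i=1}^{n} U_i V_i = 0. \tag{17-29} $$

This has the solution

$$ b = \frac{\sum V_i^2 - \sum U_i^2 \pm \sqrt{\left( \sum U_i^2 - \sum V_i^2 \right)^2 + 4 \left( \sum U_i V_i \right)^2}}{2 \sum U_i V_i} \tag{17-30} $$

There are two solutions for $b$, each with a corresponding value of $a$ from equation (17-28): one that minimizes $E$ and a second that gives a perpendicular line that maximizes $E$.

Robust Line Fitting

In a least-squares line fit in which we assume all the data have the same uncertainty, we seek to minimize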

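$$ \text{Minimize} \quad E = \sum_{i=1}^{n} \left( Y_i - a - bX_i \right)^2 = \sum_{i=1}^{n} r_i^2 \tag{17-31} $$

This process is sensitive to outliers, particularly so when the outliers lie near the lower or upper limits of the range of $x$. The breakdown point for the least-squares line fit ($L_2$ regression) is $1/n$. We can overcome this problem to a small extent by minimizing the sum of the absolute misfits ($L_1$ regression)

$$ \text{Minimize} \quad E = \sum_{i=1}^{n} \left| r_i \right| \tag{17-32} $$

but the $L_1$ norm also has a breakdown point of $1/n$. A robust approach with a breakdown point of ½ is to minimize the median misfit:

$$ \text{Minimize} \quad \operatorname{median} \left| r_i \right| = \operatorname{median} \left| Y_i - a - bX_i \right| \tag{17-33} $$

This is equivalent to finding the narrowest strip that encloses half the points. The only way to do this is by a systematic search through different values of $b$. For each value of $b$ we calculate $Y_i - bX_i$, and then find the value of $a$ that minimizes the median of $\left| Y_i - bX_i - a \right|$. One then chooses the $a$ and $b$ values that give the minimum median among all the values of $b$ analyzed.

One can use this robust statistical method to find and eliminate outliers, defined as points for which

$$ \frac{\left| Y_i - a - bX_i \right|}{\operatorname{Median} \left| Y_i - a - bX_i \right|} > z_{cut} \tag{17-34} $$

where a value of $z_{cut} = 4.45$ is equivalent to 3 standard deviations for a normal distribution. Once the outliers are eliminated, one can apply the least-squares line fitting approach.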
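The systematic search described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method as implemented in the notes: the function names (`lms_intercept`, `lms_line`, `keep_mask`), the slope grid, and the synthetic data are all invented for this example. For a given slope, the intercept is taken as the midpoint of the shortest interval containing just over half of the values $Y_i - bX_i$, which is the one-dimensional form of the "narrowest strip" idea; this choice is exact for odd $n$ and only approximate for even $n$, where the median averages two order statistics.

```python
import numpy as np

def lms_intercept(z):
    """Intercept a that (approximately) minimizes median|z - a|.

    Uses the midpoint of the shortest window holding h = n//2 + 1 of the
    sorted values, i.e. the narrowest strip enclosing half the points.
    """
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    h = n // 2 + 1
    widths = z[h - 1:] - z[:n - h + 1]   # width of every window of h points
    k = int(np.argmin(widths))
    return 0.5 * (z[k] + z[k + h - 1])

def lms_line(X, Y, slopes):
    """Search the trial slopes for the (a, b) with the smallest median misfit."""
    best = None
    for b in slopes:
        a = lms_intercept(Y - b * X)
        med = np.median(np.abs(Y - a - b * X))   # criterion of (17-33)
        if best is None or med < best[2]:
            best = (a, b, med)
    return best

def keep_mask(X, Y, a, b, z_cut=4.45):
    """True for points that pass the outlier test of equation (17-34).

    Assumes the median absolute residual is non-zero.
    """
    r = np.abs(Y - a - b * X)
    return r <= z_cut * np.median(r)

# Synthetic data (an assumption for this sketch): a line with small
# deterministic scatter and two gross outliers at the upper end of x.
X = np.arange(11, dtype=float)
Y = 1.0 + 2.0 * X + 0.1 * np.cos(3.0 * X)
Y[9] += 30.0
Y[10] += 40.0

a_rob, b_rob, med = lms_line(X, Y, np.linspace(-5.0, 5.0, 1001))
keep = keep_mask(X, Y, a_rob, b_rob)

# Ordinary least squares on the retained points (np.polyfit returns the
# slope first, then the intercept, for a degree-1 fit).
b_ls, a_ls = np.polyfit(X[keep], Y[keep], 1)
print("robust fit:", a_rob, b_rob)
print("least squares after outlier removal:", a_ls, b_ls)
```

The resolution of the slope grid limits how closely the search can approach the true minimum, so in practice one might refine the grid around the best slope found in a first coarse pass.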