A NOTE ON THE TOTAL LEAST SQUARES FIT TO COPLANAR POINTS

A NOTE ON THE TOTAL LEAST SQUARES FIT TO COPLANAR POINTS STEVEN L. LEE Abstract. The Total Least Squares (TLS) fit to the poits (x,y ), =1,,, miimizes the sum of the squares of the perpedicular distaces from the poits to the lie. This sum is the TLS error, ad miimizig its magitude is appropriate if x ad y are ucertai. A priori formulas for the TLS fit ad TLS error to coplaar poits were origially derived by Pearso i [Phil. Mag., (1901), pp. 559-57], ad they are expressed i terms of the mea, stadard deviatio ad correlatio coefficiet for the data. I this ote, these TLS formulas are derived i a more elemetary fashio. The TLS fit is obtaied via the ordiary least squares problem ad the algebraic properties of complex umbers. The TLS error is formulated i terms of the triagle iequality for complex umbers. Key words. regressio, total least squares, triagle iequality AMS subject classificatios. 6J05 1. Itroductio. Give the set of poits (x,y ), =1,,, it is ofte desirable to fid the straight lie (1.1) y = αx + β that miimizes the sum of the squares of the perpedicular distaces from the poits to the lie. Properly speaig, the above is a total least squares (TLS) problem. The desired lie is the TLS fit to the poits (x,y ), ad the TLS error is the sum to be miimized. Such a fittig is appropriate if x ad y are subject to error. For additioal bacgroud o this topic, see the origial paper by Pearso [6] or the recet paper by Nievergelt [5]. A comprehesive aalysis of this problem, ad more geeral TLS problems, is also give by Golub ad Va Loa i [] ad by Va Huffel ad Vaderwalle i [3]. Fially, we remar that algorithms for the TLS-fittig of lies, rectagles ad squares to coplaar poits are give by Gader ad Hřebíče i [1, Chap. 6.].. Mai results. I the least squares (LS) problem, we are iterested i fidig the straight lie y = ax + b that miimizes the sum of the squares of the vertical distaces from the poits (x,y ) to the lie. It is well ow that the solutio to the LS problem ca be obtaied by solvig the followig system of ormal equatios : [ ][ ] [ ] x b y (.1) x x =. a x y The solutio is uique, uless x 1 = =x (i.e., the poits are vertically aliged). This research was supported by Applied Mathematical Scieces Research Program, Office of Eergy Research, ad U.S. Departmet of Eergy cotract DE-AC05-84OR1400 with Marti Marietta Eergy Systems Ic. Mathematical Scieces Sectio, Oa Ridge Natioal Laboratory, P. O. Box 008, Buildig 601, Oa Ridge, Teessee 37831-6367 (a.slee@a-et.orl.gov). 1

STEVEN L. LEE We ca lear a great deal about the TLS problem by first trasformig it ito a LS problem ad the applyig the ormal equatios (.1). A few observatios are eeded before pursuig this approach. I particular, for the mea values x = x ad ȳ = y, let x = x x ad ỹ = y ȳ so that the TLS fit to the poits ( x, ỹ ) passes through the origi [5, p. 59]. The TLS slope α ad TLS error are uaffected by this traslatio. We ow proceed to wor i terms of ( x, ỹ ) to determie the slope ad agle at which the TLS fit crosses the origi. For otatioal coveiece, let us deote these cetered poits as the complex umbers ad let = x +iỹ, 0 τ<π deote the agle (i the couterclocwise directio) that the TLS fit maes with the positive real axis. Thus, we have the trivial relatios that slope α =ta(τ)ad τ=ta 1 (α). The aalysis that follows is simpler if we wor i terms of the TLS agle τ..1. Total Least Squares Fit. I priciple, we ca trasform the TLS problem ito a LS problem by rotatig the poits by τ so that the TLS fit to the poits (.) w = e iτ lies alog the real axis. The TLS error to the poits is uaffected by this rotatio, ad the perpedicular distaces from the poits w to the TLS fit lie strictly i the vertical directio. I this way, we establish the real axis as the solutio to a TLS ad LS problem. Although the τ i (.) is uow, we ca determie some of its basic properties by applyig the ormal equatios (.1) to the poits w ;amely, (.3) [ Re(w ) Re(w ) Re (w ) ][ b a ] = [ Im(w ) Re(w )Im(w ) Equatio (.3) has the solutio a =0,b= 0 sice the real axis is the LS fit to the poits (.). Based o this uique LS solutio, the right-had side of (.3) must also be zero. The secod compoet ]. (.4) Re iτ )Im iτ )=0

TOTAL LEAST SQUARES FIT TO COPLANAR POINTS 3 is especially valuable because the summatio of cross terms suggests a strategy for explicitly determiig τ. (We disregard the first compoet because Im iτ (.5) )=0 for ay value of τ.) Later, we show that we ca obtai the total least squares error, (.6) TLS error d = Im iτ ), via a formula that does ot ivolve computig the total least squares fit. To deduce the TLS agle τ, we itroduce the term (.7) ρ = iτ ) ad display its real ad imagiary parts via iτ ) = [ Re iτ )+iim iτ ) ] (.8) (.9) = Re iτ ) Im iτ )+i { Re iτ )Im iτ ) }. Notice that the summatio of cross terms, i curly bracets, is zero due to coditio (.4). Thus, aother importat property of the uow τ is that ρ i (.7) is always a real umber. To proceed further, we ow compute the complex umber (.10) ad simplify the term to get the expoetial form (.11) s = ρ = iτ ) = e iτ = e iτ s s = ρe iτ. We ca establish ρ as a positive real umber by deotig 0 τ <πas the agle (i the couterclocwise directio) that s maes with the positive real axis. By worig with the complex form of s, we ca exhibit the real ad imagiary parts via s = = ( x + iỹ ) = x ỹ +i x ỹ ad, for the TLS agle τ i (.11), determie that ta(τ) = Im(s) Re(s) = x ỹ (.1) x. ỹ I geeral, there are two values of τ that satisfy (.1); these two values will differ by π/. The TLS agle is the oe that also satisfies (.4). Other approaches to derivig (.1) are described i [4, Chap. 13], [6, p. 566]. If ta(τ) is defied, we the have the TLS slope α =ta(τ). The poit-slope form y ȳ = α(x x) cabeusedtoobtaitheβterm of the TLS fit (1.1).

4 STEVEN L. LEE.. Total Least Squares Error. A a priori formula for the TLS error comes from (.13) e iτ = ad the followig lemma. Lemma.1. For the total least squares agle τ to the poits, (.14) iτ ) =. Proof. From (.7), we have iτ ) = ρ. From the expoetial form of s i (.11), we ow that ρ equals the magitude of s. Thus, we fiish with ρ = s =. For the TLS agle τ, we ca also show that (.15) e iτ = Re iτ )+ Im iτ ) ad, from (.7) (.9), (.16) iτ ) = Re iτ ) Im iτ ). At this stage, otice that (.15) ad (.16) differ by Im iτ ), which is precisely twice the TLS error for the problem; see (.6). Moreover, (.13) ad (.14) show that this differece is a readily computable quatity; amely, (.17) (TLS error) =. Theorem.. Give the poits (x,y ), =1,,, let z = x + iy ad z = 1 z so that = z z. The error for the total least squares fit is (.18) d = 1 ( ),

TOTAL LEAST SQUARES FIT TO COPLANAR POINTS 5 where d is the perpedicular distace from (x,y ) to the fit. Proof. The theorem follows from (.17). The TLS error formula i (.18) is a cocise versio of Pearso s formula [6, p. 566], which is also give i Appedix A. Moreover, by arragig (.18) as (.19) + d =, we ca drop the term d ad thereby obtai the triagle iequality (.0) = as a special case. Thus, Theorem. establishes a importat relatioship betwee the TLS error to z ad the triagle iequality for the complex umbers. I particular, we fid that the error i the TLS fit equals half the error i the triagle iequality applied to the square of the cetered poits, ; see (.18). Ufortuately, this ew derivatio for the TLS fit ad TLS error does ot geeralize to higher dimesios. Algorithms for fittig a hyperplae to higher-dimesioal data are described i [1, Chap. 6.3], [5], for example. Appedix A. Pearso s formulas for the total least squares problem. I [6], formulas for the TLS fit ad TLS error to coplaar poits are expressed i terms of the mea stadard deviatio x = x, ȳ = y, ad correlatio coefficiet σ x = x x, σ y = r xy = x y xȳ σ x σ y y ȳ, for the data. I particular, Pearso shows that the TLS agle τ satisfies ad that d ta(τ) = r xyσ x σ y σ x σ y (Mea sq. residual) = 1 (σ x + σ y) 1 (σ x σ y) +4r xyσ xσ y. The above formulas simplify to those give i (.1) ad (.18), which we obtaied via the ordiary least squares problem, the triagle iequality, ad some elemetary properties of complex umbers.

6 STEVEN L. LEE Acowledgmets. I wish to tha Gee Golub for may helpful poiters to literature cocerig the total least squares problem. REFERENCES [1] W. Gader ad J. Hřebíče, Solvig Problems i Scietific Computig usig MAPLE ad MATLAB, Spriger-Verlag, 1993. [] G. H. Golub ad C. F. Va Loa, A aalysis of the total least squares problem, SIAMJ. Numer. Aal., 17 (1980), pp. 883 893. [3] S. Va Huffel ad J. Vaderwalle, The Total Least Squares Problem: Computatioal Aspects ad Aalysis, Society for Idustrial ad Applied Mathematics, Philadelphia, PA, 1991. [4] I. Lii, Method of Least Squares ad Priciples of the Theory of Observatios, Pergamo Press, New Yor, 1961. [5] Y. Nievergelt, Total least squares: State-of-the-art regressio i umerical aalysis, SIAMReview, 36 (1994), pp. 58 64. [6] K. Pearso, O lies ad plaes of closest fit to poits i space, Phil. Mag., (1901), pp. 559 57.