A NOTE ON THE TOTAL LEAST SQUARES FIT TO COPLANAR POINTS

Similar documents
Linear Regression Demystified

The Method of Least Squares. To understand least squares fitting of data.

Math 155 (Lecture 3)

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

U8L1: Sec Equations of Lines in R 2

Session 5. (1) Principal component analysis and Karhunen-Loève transformation

The z-transform. 7.1 Introduction. 7.2 The z-transform Derivation of the z-transform: x[n] = z n LTI system, h[n] z = re j

REGRESSION (Physics 1210 Notes, Partial Modified Appendix A)

Enumerative & Asymptotic Combinatorics

Most text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y

CHAPTER I: Vector Spaces

ECON 3150/4150, Spring term Lecture 3

Apply change-of-basis formula to rewrite x as a linear combination of eigenvectors v j.

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics

Linear Regression Models

Algebra of Least Squares

11 Correlation and Regression

The random version of Dvoretzky s theorem in l n

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations

We will conclude the chapter with the study a few methods and techniques which are useful

Introduction to Machine Learning DIS10

U8L1: Sec Equations of Lines in R 2

Chimica Inorganica 3

2 Geometric interpretation of complex numbers

If we want to add up the area of four rectangles, we could find the area of each rectangle and then write this sum symbolically as:

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

NEW FAST CONVERGENT SEQUENCES OF EULER-MASCHERONI TYPE

Complex Analysis Spring 2001 Homework I Solution

Ray-triangle intersection

ANSWERS SOLUTIONS iiii i. and 1. Thus, we have. i i i. i, A.

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

8. Applications To Linear Differential Equations

Frequency Domain Filtering

P.3 Polynomials and Special products

Paired Data and Linear Correlation

Improving the Localization of Eigenvalues for Complex Matrices

BHW #13 1/ Cooper. ENGR 323 Probabilistic Analysis Beautiful Homework # 13

APPENDIX F Complex Numbers

1 Inferential Methods for Correlation and Regression Analysis

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

Simple Linear Regression

Sequences, Mathematical Induction, and Recursion. CSE 2353 Discrete Computational Structures Spring 2018

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Polynomial Functions and Their Graphs

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Inverse Matrix. A meaning that matrix B is an inverse of matrix A.

R is a scalar defined as follows:

Section 14. Simple linear regression.

Summary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram.

A NEW CLASS OF 2-STEP RATIONAL MULTISTEP METHODS

Problem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t =

Fitting an ARIMA Process to Data

FLOOR AND ROOF FUNCTION ANALOGS OF THE BELL NUMBERS. H. W. Gould Department of Mathematics, West Virginia University, Morgantown, WV 26506, USA

Complex Numbers Solutions

Mathematics 3 Outcome 1. Vectors (9/10 pers) Lesson, Outline, Approach etc. This is page number 13. produced for TeeJay Publishers by Tom Strang

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

How to Maximize a Function without Really Trying

6.3 Testing Series With Positive Terms

On forward improvement iteration for stopping problems

Chapter 4. Fourier Series

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Correlation Regression

Generating Functions for Laguerre Type Polynomials. Group Theoretic method

SNAP Centre Workshop. Basic Algebraic Manipulation

Math 451: Euclidean and Non-Euclidean Geometry MWF 3pm, Gasson 204 Homework 3 Solutions

Worksheet 23 ( ) Introduction to Simple Linear Regression (continued)

Synopsis of Euler s paper. E Memoire sur la plus grande equation des planetes. (Memoir on the Maximum value of an Equation of the Planets)

Two or more points can be used to describe a rigid body. This will eliminate the need to define rotational coordinates for the body!

Bertrand s Postulate

Ma 530 Introduction to Power Series

Appendix F: Complex Numbers

REVISION SHEET FP1 (MEI) ALGEBRA. Identities In mathematics, an identity is a statement which is true for all values of the variables it contains.

Chapter 7: The z-transform. Chih-Wei Liu

1. Linearization of a nonlinear system given in the form of a system of ordinary differential equations

CHAPTER 5. Theory and Solution Using Matrix Techniques

The Growth of Functions. Theoretical Supplement

Summary: Congruences. j=1. 1 Here we use the Mathematica syntax for the function. In Maple worksheets, the function

Complex Numbers Primer

Introduction to Signals and Systems, Part V: Lecture Summary

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

Uniform Strict Practical Stability Criteria for Impulsive Functional Differential Equations

Ma 530 Infinite Series I

Determinants of order 2 and 3 were defined in Chapter 2 by the formulae (5.1)

Appendix: The Laplace Transform

1 Approximating Integrals using Taylor Polynomials

Machine Learning for Data Science (CS 4786)

Problem 4: Evaluate ( k ) by negating (actually un-negating) its upper index. Binomial coefficient

Modified Decomposition Method by Adomian and. Rach for Solving Nonlinear Volterra Integro- Differential Equations

Rotationally invariant integrals of arbitrary dimensions

Beurling Integers: Part 2

Rademacher Complexity

Course 4: Preparation for Calculus Unit 1: Families of Functions

3.2 Properties of Division 3.3 Zeros of Polynomials 3.4 Complex and Rational Zeros of Polynomials

The Minimum Distance Energy for Polygonal Unknots

REVISION SHEET FP1 (MEI) ALGEBRA. Identities In mathematics, an identity is a statement which is true for all values of the variables it contains.

An Introduction to Randomized Algorithms

Discrete Orthogonal Moment Features Using Chebyshev Polynomials

You may work in pairs or purely individually for this assignment.

Transcription:

A NOTE ON THE TOTAL LEAST SQUARES FIT TO COPLANAR POINTS STEVEN L. LEE Abstract. The Total Least Squares (TLS) fit to the poits (x,y ), =1,,, miimizes the sum of the squares of the perpedicular distaces from the poits to the lie. This sum is the TLS error, ad miimizig its magitude is appropriate if x ad y are ucertai. A priori formulas for the TLS fit ad TLS error to coplaar poits were origially derived by Pearso i [Phil. Mag., (1901), pp. 559-57], ad they are expressed i terms of the mea, stadard deviatio ad correlatio coefficiet for the data. I this ote, these TLS formulas are derived i a more elemetary fashio. The TLS fit is obtaied via the ordiary least squares problem ad the algebraic properties of complex umbers. The TLS error is formulated i terms of the triagle iequality for complex umbers. Key words. regressio, total least squares, triagle iequality AMS subject classificatios. 6J05 1. Itroductio. Give the set of poits (x,y ), =1,,, it is ofte desirable to fid the straight lie (1.1) y = αx + β that miimizes the sum of the squares of the perpedicular distaces from the poits to the lie. Properly speaig, the above is a total least squares (TLS) problem. The desired lie is the TLS fit to the poits (x,y ), ad the TLS error is the sum to be miimized. Such a fittig is appropriate if x ad y are subject to error. For additioal bacgroud o this topic, see the origial paper by Pearso [6] or the recet paper by Nievergelt [5]. A comprehesive aalysis of this problem, ad more geeral TLS problems, is also give by Golub ad Va Loa i [] ad by Va Huffel ad Vaderwalle i [3]. Fially, we remar that algorithms for the TLS-fittig of lies, rectagles ad squares to coplaar poits are give by Gader ad Hřebíče i [1, Chap. 6.].. Mai results. I the least squares (LS) problem, we are iterested i fidig the straight lie y = ax + b that miimizes the sum of the squares of the vertical distaces from the poits (x,y ) to the lie. It is well ow that the solutio to the LS problem ca be obtaied by solvig the followig system of ormal equatios : [ ][ ] [ ] x b y (.1) x x =. a x y The solutio is uique, uless x 1 = =x (i.e., the poits are vertically aliged). This research was supported by Applied Mathematical Scieces Research Program, Office of Eergy Research, ad U.S. Departmet of Eergy cotract DE-AC05-84OR1400 with Marti Marietta Eergy Systems Ic. Mathematical Scieces Sectio, Oa Ridge Natioal Laboratory, P. O. Box 008, Buildig 601, Oa Ridge, Teessee 37831-6367 (a.slee@a-et.orl.gov). 1

STEVEN L. LEE We ca lear a great deal about the TLS problem by first trasformig it ito a LS problem ad the applyig the ormal equatios (.1). A few observatios are eeded before pursuig this approach. I particular, for the mea values x = x ad ȳ = y, let x = x x ad ỹ = y ȳ so that the TLS fit to the poits ( x, ỹ ) passes through the origi [5, p. 59]. The TLS slope α ad TLS error are uaffected by this traslatio. We ow proceed to wor i terms of ( x, ỹ ) to determie the slope ad agle at which the TLS fit crosses the origi. For otatioal coveiece, let us deote these cetered poits as the complex umbers ad let = x +iỹ, 0 τ<π deote the agle (i the couterclocwise directio) that the TLS fit maes with the positive real axis. Thus, we have the trivial relatios that slope α =ta(τ)ad τ=ta 1 (α). The aalysis that follows is simpler if we wor i terms of the TLS agle τ..1. Total Least Squares Fit. I priciple, we ca trasform the TLS problem ito a LS problem by rotatig the poits by τ so that the TLS fit to the poits (.) w = e iτ lies alog the real axis. The TLS error to the poits is uaffected by this rotatio, ad the perpedicular distaces from the poits w to the TLS fit lie strictly i the vertical directio. I this way, we establish the real axis as the solutio to a TLS ad LS problem. Although the τ i (.) is uow, we ca determie some of its basic properties by applyig the ormal equatios (.1) to the poits w ;amely, (.3) [ Re(w ) Re(w ) Re (w ) ][ b a ] = [ Im(w ) Re(w )Im(w ) Equatio (.3) has the solutio a =0,b= 0 sice the real axis is the LS fit to the poits (.). Based o this uique LS solutio, the right-had side of (.3) must also be zero. The secod compoet ]. (.4) Re iτ )Im iτ )=0

TOTAL LEAST SQUARES FIT TO COPLANAR POINTS 3 is especially valuable because the summatio of cross terms suggests a strategy for explicitly determiig τ. (We disregard the first compoet because Im iτ (.5) )=0 for ay value of τ.) Later, we show that we ca obtai the total least squares error, (.6) TLS error d = Im iτ ), via a formula that does ot ivolve computig the total least squares fit. To deduce the TLS agle τ, we itroduce the term (.7) ρ = iτ ) ad display its real ad imagiary parts via iτ ) = [ Re iτ )+iim iτ ) ] (.8) (.9) = Re iτ ) Im iτ )+i { Re iτ )Im iτ ) }. Notice that the summatio of cross terms, i curly bracets, is zero due to coditio (.4). Thus, aother importat property of the uow τ is that ρ i (.7) is always a real umber. To proceed further, we ow compute the complex umber (.10) ad simplify the term to get the expoetial form (.11) s = ρ = iτ ) = e iτ = e iτ s s = ρe iτ. We ca establish ρ as a positive real umber by deotig 0 τ <πas the agle (i the couterclocwise directio) that s maes with the positive real axis. By worig with the complex form of s, we ca exhibit the real ad imagiary parts via s = = ( x + iỹ ) = x ỹ +i x ỹ ad, for the TLS agle τ i (.11), determie that ta(τ) = Im(s) Re(s) = x ỹ (.1) x. ỹ I geeral, there are two values of τ that satisfy (.1); these two values will differ by π/. The TLS agle is the oe that also satisfies (.4). Other approaches to derivig (.1) are described i [4, Chap. 13], [6, p. 566]. If ta(τ) is defied, we the have the TLS slope α =ta(τ). The poit-slope form y ȳ = α(x x) cabeusedtoobtaitheβterm of the TLS fit (1.1).

4 STEVEN L. LEE.. Total Least Squares Error. A a priori formula for the TLS error comes from (.13) e iτ = ad the followig lemma. Lemma.1. For the total least squares agle τ to the poits, (.14) iτ ) =. Proof. From (.7), we have iτ ) = ρ. From the expoetial form of s i (.11), we ow that ρ equals the magitude of s. Thus, we fiish with ρ = s =. For the TLS agle τ, we ca also show that (.15) e iτ = Re iτ )+ Im iτ ) ad, from (.7) (.9), (.16) iτ ) = Re iτ ) Im iτ ). At this stage, otice that (.15) ad (.16) differ by Im iτ ), which is precisely twice the TLS error for the problem; see (.6). Moreover, (.13) ad (.14) show that this differece is a readily computable quatity; amely, (.17) (TLS error) =. Theorem.. Give the poits (x,y ), =1,,, let z = x + iy ad z = 1 z so that = z z. The error for the total least squares fit is (.18) d = 1 ( ),

TOTAL LEAST SQUARES FIT TO COPLANAR POINTS 5 where d is the perpedicular distace from (x,y ) to the fit. Proof. The theorem follows from (.17). The TLS error formula i (.18) is a cocise versio of Pearso s formula [6, p. 566], which is also give i Appedix A. Moreover, by arragig (.18) as (.19) + d =, we ca drop the term d ad thereby obtai the triagle iequality (.0) = as a special case. Thus, Theorem. establishes a importat relatioship betwee the TLS error to z ad the triagle iequality for the complex umbers. I particular, we fid that the error i the TLS fit equals half the error i the triagle iequality applied to the square of the cetered poits, ; see (.18). Ufortuately, this ew derivatio for the TLS fit ad TLS error does ot geeralize to higher dimesios. Algorithms for fittig a hyperplae to higher-dimesioal data are described i [1, Chap. 6.3], [5], for example. Appedix A. Pearso s formulas for the total least squares problem. I [6], formulas for the TLS fit ad TLS error to coplaar poits are expressed i terms of the mea stadard deviatio x = x, ȳ = y, ad correlatio coefficiet σ x = x x, σ y = r xy = x y xȳ σ x σ y y ȳ, for the data. I particular, Pearso shows that the TLS agle τ satisfies ad that d ta(τ) = r xyσ x σ y σ x σ y (Mea sq. residual) = 1 (σ x + σ y) 1 (σ x σ y) +4r xyσ xσ y. The above formulas simplify to those give i (.1) ad (.18), which we obtaied via the ordiary least squares problem, the triagle iequality, ad some elemetary properties of complex umbers.

6 STEVEN L. LEE Acowledgmets. I wish to tha Gee Golub for may helpful poiters to literature cocerig the total least squares problem. REFERENCES [1] W. Gader ad J. Hřebíče, Solvig Problems i Scietific Computig usig MAPLE ad MATLAB, Spriger-Verlag, 1993. [] G. H. Golub ad C. F. Va Loa, A aalysis of the total least squares problem, SIAMJ. Numer. Aal., 17 (1980), pp. 883 893. [3] S. Va Huffel ad J. Vaderwalle, The Total Least Squares Problem: Computatioal Aspects ad Aalysis, Society for Idustrial ad Applied Mathematics, Philadelphia, PA, 1991. [4] I. Lii, Method of Least Squares ad Priciples of the Theory of Observatios, Pergamo Press, New Yor, 1961. [5] Y. Nievergelt, Total least squares: State-of-the-art regressio i umerical aalysis, SIAMReview, 36 (1994), pp. 58 64. [6] K. Pearso, O lies ad plaes of closest fit to poits i space, Phil. Mag., (1901), pp. 559 57.