Least-Squares Fitting of a Hyperplane

Robert K. Moniot

October 20, 2002

Abstract

A method is developed for fitting a hyperplane to a set of data by least-squares, allowing for independent uncertainties in all coordinates of each data point, and including an error analysis.

Note: This paper is adapted from a technical report I wrote as a graduate student in the Department of Physics, University of California, Berkeley, in 1976.

Copyright © 2002, by Robert K. Moniot. All rights reserved.

1 Introduction

A recurring computational problem in the field of isotopic studies of terrestrial and extraterrestrial materials has been the interpretation of the observed mass spectra in terms of mixtures of various source components. Each source component is characterized by fixed, but often unknown, isotopic ratios, but it is present in variable amounts in different measured samples. One would like to verify the hypothesis that a given number of components is adequate to account for all observations, and, if possible, not only to determine the source component compositions, but also to resolve each measured sample into its original components, in order to separate different processes in its origin for study. This paper treats only the first steps in this sequence of analysis, i.e., the investigation into the number and compositions of possible source components.

The measured mass spectra may be denoted by $Y_i(\mu)$, where $i = 1, \ldots, n$ is the sample number, and $\mu$ is the mass number. Since $\mu$ takes on only discrete values $\mu_k$, $k = 1, \ldots, p$, each mass spectrum can be represented by a vector in a $p$-dimensional vector space, $Y_{ik} = Y_i(\mu_k)$. These are assumed to be made up of linear combinations of the component spectra,

$$ Y_i(\mu) = \sum_{j=1}^{m} \alpha_{ij}\, g_j(\mu) \tag{1} $$

where the $\alpha_{ij}$ are scalars between 0 and 1 subject to the normalization condition $\sum_{j=1}^{m} \alpha_{ij} = 1$, $i = 1, \ldots, n$, and the $g_j(\mu)$ are the $m$ different component spectra.
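To make the mixing model of Equation (1) concrete, here is a minimal sketch that builds synthetic sample spectra from a set of component spectra. The function name `mix_spectra` and the numerical values are hypothetical, chosen only for illustration; they do not come from the paper.

```python
def mix_spectra(alpha, components):
    """Return Y[i][k] = sum_j alpha[i][j] * components[j][k], as in Equation (1).

    alpha[i][j]      : mixing fraction of component j in sample i
    components[j][k] : component spectrum g_j evaluated at mass number mu_k
    """
    p = len(components[0])
    spectra = []
    for weights in alpha:
        # Enforce the constraints stated with Equation (1).
        if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
            raise ValueError("mixing fractions must lie in [0, 1] and sum to 1")
        spectra.append([sum(w * g[k] for w, g in zip(weights, components))
                        for k in range(p)])
    return spectra


# Two hypothetical component spectra (m = 2) sampled at p = 3 mass numbers.
components = [[1.0, 0.0, 2.0], [0.0, 4.0, 2.0]]
# Three hypothetical samples: pure component 1, a 1:3 mixture, an even mixture.
alpha = [[1.0, 0.0], [0.25, 0.75], [0.5, 0.5]]
Y = mix_spectra(alpha, components)
```

Each row of `Y` is one measured spectrum $Y_i$; with real data the fractions `alpha` and the component spectra are the unknowns rather than the inputs.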
The problem is analogous to that of curve resolution encountered, for example, in chromatography or spectrophotometry, where it has been treated with considerable success using the technique of principal component analysis. Lawton and Sylvestre (1971), for example, have considered the case of two source components and have developed a method for computing two bands of curves, each containing one of the source components.

The method of principal component analysis, however, runs into difficulties if the data are characterized by widely different experimental uncertainties. This is often the case with mass spectroscopic data. Even if the relative uncertainties in isotopic ratios are similar, the ratios can vary by orders of magnitude. As Anderson (1963) pointed out, the method of principal component analysis is justified only if the ratio of the uncertainty variance to the systematic, i.e. correlation, variance is the same for all components of the data. Nonconformity with this requirement may be remedied to some degree by rescaling the data according to their respective uncertainties. Here we abandon the method of principal component analysis for an alternative approach that is on a better statistical footing in that it takes full account of the estimated uncertainties of the data.

It is easily shown that data points consisting of linear combinations of components according to Equation (1) must lie in an $(m-1)$-dimensional subspace of the full $p$-dimensional vector space. This subspace is defined by the simplex whose vertices are the $m$ distinct components. This paper deals with only the first step in component resolution, namely the determination of the parameters of this subspace. Furthermore, it considers only the simplest case, in which $p = m$, that is, the number of components is the same as the number of coordinates of the space (e.g. the number of isotopic ratios measured in each sample). Thus for 2-dimensional data we seek the equation of a straight line, for 3-dimensional data a plane, and in general a hyperplane of dimension one less than the space in which it is embedded. The general case of arbitrary $m \le p$ is to be dealt with in a future paper.

2 Definition of problem

In accordance with the foregoing discussion, we assume that the measured data points should ideally lie on a hyperplane of $m-1$ dimensions in a space of $m$ dimensions.
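The statement that mixtures obeying Equation (1) lie on such a hyperplane follows directly from the normalization condition. As a short sketch of the step, using only the symbols of Equation (1), eliminating $\alpha_{im}$ gives:

```latex
\begin{aligned}
Y_i(\mu) &= \sum_{j=1}^{m-1} \alpha_{ij}\, g_j(\mu)
            + \Bigl(1 - \sum_{j=1}^{m-1} \alpha_{ij}\Bigr)\, g_m(\mu) \\
Y_i(\mu) - g_m(\mu) &= \sum_{j=1}^{m-1} \alpha_{ij}\,\bigl[\, g_j(\mu) - g_m(\mu) \,\bigr]
\end{aligned}
```

Every difference $Y_i - g_m$ is therefore a combination of the $m-1$ fixed vectors $g_j - g_m$, so the data points span at most $m-1$ dimensions around the point $g_m$.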
If we let $y = [y_1, y_2, \ldots, y_m]$ denote a point on this hyperplane, the equation which the data point ideally should satisfy can be written

$$ f(y) = \sum_{k=1}^{m-1} a_k y_k + a_m - y_m = 0 \tag{2} $$

The measured data consist of a set of $n$ vectors $Y_i$, $i = 1, \ldots, n$. Each coordinate $Y_{ik}$ of each data point has an associated experimental uncertainty $\sigma_{ik}$. (The $\sigma_{ik}$ may or may not be known a priori. We assume that at least their relative magnitudes are known.) The experimental errors will cause the data points $Y_i$ to lie scattered off the hyperplane of Equation (2). Therefore one seeks a best fit to the data.

The method of principal component analysis, referred to above, is equivalent to seeking the hyperplane that minimizes the sum of squares of perpendicular distances from the measured points to the hyperplane. As already mentioned, this method is unable to take proper account of the experimental uncertainties of the data, and is not invariant under a change of scale of one or more axes.

The usual method that one finds in the literature for obtaining a best fit of this kind is based on minimizing a sum of squares of the residuals $f(Y_i)$. This is called a regression of $y_m$ against $y_1$ through $y_{m-1}$. The sum of squares of these residuals is either unweighted or weighted by $1/\sigma_{im}^2$ (Bevington, 1991). This approach ignores the uncertainties in coordinates $y_1$ through $y_{m-1}$. It also gives different results depending on which coordinate is chosen as the dependent coordinate $y_m$.

The correct treatment that properly takes account of the experimental uncertainties was formulated by Deming (1943). Given the data $Y_{ik}$ with associated uncertainties $\sigma_{ik}$, a set of corresponding adjusted values $y_{ik}$ are sought which lie exactly on the hyperplane (2) and minimize the variance

$$ S = \sum_{i=1}^{n} \sum_{k=1}^{m} \frac{(Y_{ik} - y_{ik})^2}{\sigma_{ik}^2} \tag{3} $$

The solution of this formulation of the problem is not straightforward. York (1966) first devised an approach, later improved by Williamson (1968), for the straight-line case. Here we extend Williamson's solution to arbitrary $m \ge 1$.

3 Derivation of solution

We begin with the constraint that the adjusted points $y_i$ are required to satisfy the hyperplane equation, i.e. $f(y_i) = 0$, $i = 1, \ldots, n$, where $f$ is as defined in Equation (2). Defining residuals $V_{ik} = Y_{ik} - y_{ik}$ allows this constraint to be rewritten as

$$ f(y_i) = f(Y_i) - \sum_{k=1}^{m-1} a_k V_{ik} + V_{im} = 0, \qquad i = 1, \ldots, n \tag{4} $$

Now, given any values of the parameters $a_k$, we seek those values of $y_{ik}$ that will minimize $S$ for that choice of the $a_k$, subject to the constraint (4). Thus we require

$$ \tfrac{1}{2}\,\delta S = \sum_{i=1}^{n} \sum_{k=1}^{m} \frac{V_{ik}}{\sigma_{ik}^2}\, \delta V_{ik} = 0 \tag{5} $$

From (4) we have

$$ \delta f(y_i) = 0 = -\sum_{k=1}^{m-1} a_k\, \delta V_{ik} + \delta V_{im}, \qquad i = 1, \ldots, n \tag{6} $$

Multiplying each of the $n$ equations (6) by its own undetermined multiplier $\lambda_i$ and adding them all to Equation (5), we obtain

$$ \sum_{i=1}^{n} \sum_{k=1}^{m-1} \left( \frac{V_{ik}}{\sigma_{ik}^2} - \lambda_i a_k \right) \delta V_{ik} + \sum_{i=1}^{n} \left( \frac{V_{im}}{\sigma_{im}^2} + \lambda_i \right) \delta V_{im} = 0 \tag{7} $$

Since the $V_{ik}$ are independent, the coefficients of $\delta V_{ik}$ in this equation must individually be zero, giving

$$ \begin{aligned} V_{ik} &= \lambda_i a_k \sigma_{ik}^2, & i &= 1, \ldots, n, \quad k = 1, \ldots, m-1 \\ V_{im} &= -\lambda_i \sigma_{im}^2, & i &= 1, \ldots, n \end{aligned} \tag{8} $$

Substituting this result into Equation (4) and solving for $\lambda_i$ yields

$$ \lambda_i = W_i\, f(Y_i) \tag{9} $$

where

$$ W_i = \left[ \sum_{k=1}^{m-1} a_k^2 \sigma_{ik}^2 + \sigma_{im}^2 \right]^{-1} \tag{10} $$

This allows $S$ to be rewritten as

$$ S = \sum_{i=1}^{n} W_i\, f(Y_i)^2 \tag{11} $$

This expresses $S$ in the form of a weighted sum of the residuals as defined for a conventional regression, but with weights that properly take account of the individual uncertainties $\sigma_{ik}$.

Now $S$ is to be minimized with respect to the parameters $a_k$. Setting $\partial S / \partial a_k = 0$ leads to the following set of equations analogous to the normal equations of the conventional regression:

$$ \begin{aligned} \sum_{i=1}^{n} W_i\, f(Y_i)\, y_{ik} &= 0, \qquad k = 1, \ldots, m-1 \\ \sum_{i=1}^{n} W_i\, f(Y_i) &= 0 \end{aligned} \tag{12} $$

Since the parameters $a_k$ occur in $W_i$, $f(Y_i)$, and $y_{ik}$, these equations are nonlinear and cannot be solved in closed form. However, recognizing that $W_i$ and $y_{ik}$ are only weakly dependent on the parameters of the hyperplane, we can linearize Equations (12) by treating those quantities as constants. Writing out $f$ in terms of the parameters, then, we obtain

$$ \begin{aligned} \sum_{k=1}^{m-1} \Bigl( \sum_{i=1}^{n} W_i\, y_{ij} Y_{ik} \Bigr) a_k + \Bigl( \sum_{i=1}^{n} W_i\, y_{ij} \Bigr) a_m &= \sum_{i=1}^{n} W_i\, y_{ij} Y_{im}, \qquad j = 1, \ldots, m-1 \\ \sum_{k=1}^{m-1} \Bigl( \sum_{i=1}^{n} W_i\, Y_{ik} \Bigr) a_k + \Bigl( \sum_{i=1}^{n} W_i \Bigr) a_m &= \sum_{i=1}^{n} W_i\, Y_{im} \end{aligned} \tag{13} $$

Identifying the parenthesized terms in Equation (13) as the elements of an $m \times m$ matrix $M$ and the right-hand sides as elements of a vector $b$ of length $m$, this set of equations is seen as a linear system of the form $Ma = b$.

Solution proceeds iteratively. Starting with an initial guess for the vector of coefficients $a$, the matrix $M$ and vector $b$ are evaluated and the system $Ma = b$ is solved for the new value of $a$. This new value is used to re-evaluate $M$ and $b$, and the equation is solved again. The iteration is continued until convergence is obtained.

In practice, it is not necessary to have a good starting guess for $a$. From Equation (10) it can be seen that the initial choice $a = 0$ gives, as the result of the first iteration, the same parameters as would be obtained if the $\sigma_{ik}$ were zero for all $k$ except $m$. This is the same result as would be given by the conventional weighted regression of $y_m$ against the other coordinates. Incidentally, this means that weighted averages ($m = 1$) are computed correctly in one iteration.
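The iteration described above can be sketched in a few lines of pure Python. This is an illustrative reading of Equations (2), (8) to (10) and (13), not the author's code: the names `solve`, `residual_and_weight` and `fit_hyperplane`, the absolute convergence tolerance, and the small Gaussian-elimination solver are all assumptions made here. There are no safeguards against a singular $M$ or against $\sigma_{im} = 0$.

```python
def solve(M, b):
    """Solve the small dense system M a = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    A = [list(row) + [rhs] for row, rhs in zip(M, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= factor * A[c][j]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        a[r] = (A[r][n] - sum(A[r][j] * a[j] for j in range(r + 1, n))) / A[r][r]
    return a


def residual_and_weight(a, Yi, si):
    """Return f(Y_i) from Eq. (2) and W_i from Eq. (10) for one data point."""
    m = len(a)
    f = sum(a[k] * Yi[k] for k in range(m - 1)) + a[m - 1] - Yi[m - 1]
    W = 1.0 / (sum((a[k] * si[k]) ** 2 for k in range(m - 1)) + si[m - 1] ** 2)
    return f, W


def fit_hyperplane(Y, sigma, max_iter=50, tol=1e-12):
    """Iterate the linearized normal equations (13), starting from a = 0.

    Y[i][k], sigma[i][k]: measurements and uncertainties; the last coordinate
    plays the role of y_m. Returns (a, S) with a = [a_1, ..., a_m].
    """
    m = len(Y[0])
    a = [0.0] * m
    for _ in range(max_iter):
        M = [[0.0] * m for _ in range(m)]
        b = [0.0] * m
        for Yi, si in zip(Y, sigma):
            f, W = residual_and_weight(a, Yi, si)
            # Adjusted coordinates y_ik = Y_ik - W_i f a_k sigma_ik^2 (Eqs. 8-9);
            # the trailing 1 generates the last row of Eq. (13).
            y = [Yi[k] - W * f * a[k] * si[k] ** 2 for k in range(m - 1)] + [1.0]
            X = list(Yi[:m - 1]) + [1.0]  # the trailing 1 multiplies a_m
            for j in range(m):
                b[j] += W * y[j] * Yi[m - 1]
                for k in range(m):
                    M[j][k] += W * y[j] * X[k]
        new_a = solve(M, b)
        change = max(abs(u - v) for u, v in zip(new_a, a))
        a = new_a
        if change < tol:  # assumed stopping rule: absolute change in a
            break
    S = 0.0
    for Yi, si in zip(Y, sigma):
        f, W = residual_and_weight(a, Yi, si)
        S += W * f * f  # Eq. (11)
    return a, S


# Hypothetical data lying exactly on y = 2x + 1, with errors in both coordinates.
Y = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]]
sigma = [[0.1, 0.2]] * 4
a, S = fit_hyperplane(Y, sigma)  # a[0] is the slope, a[1] the intercept
```

For $m = 2$ the returned `a` holds the slope and intercept of the line. For $m = 1$ the same routine reduces to a weighted average, in line with the remark above.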
Experience has shown that the convergence is rapid for data sets where the fit is justified, and only a few iterations are necessary to obtain the coefficients to accuracies that are well within their uncertainties.

3.1 Refinement

It should be mentioned that for the sake of numerical stability, the measured points $Y_i$ should be translated if necessary into a coordinate system whose origin is close to the mean of the points. This stability can be achieved automatically, and the computation simplified somewhat, by reformulating the solution in the following way. First, solve for $a_m$ from the last of the equations (13):

$$ a_m = \bar{Y}_m - \sum_{k=1}^{m-1} a_k \bar{Y}_k \tag{14} $$

where

$$ \bar{Y}_k = \frac{\sum_{i=1}^{n} W_i\, Y_{ik}}{\sum_{i=1}^{n} W_i}, \qquad k = 1, \ldots, m \tag{15} $$

This shows that the point $\bar{Y}$ lies on the best-fit hyperplane. Now define $Y'_i = Y_i - \bar{Y}$ and $z_i = y_i - \bar{Y}$. From Equations (8) and (9) we have

$$ z_{ij} = Y'_{ij} - W_i\, a_j\, \sigma_{ij}^2\, f(Y_i), \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m-1 \tag{16} $$

where we can express $f(Y_i)$ as

$$ f(Y_i) = \sum_{k=1}^{m-1} a_k Y'_{ik} - Y'_{im}, \qquad i = 1, \ldots, n \tag{17} $$

Then upon inserting (14) into the remaining equations (13) we find

$$ \sum_{k=1}^{m-1} \Bigl( \sum_{i=1}^{n} W_i\, z_{ij} Y'_{ik} \Bigr) a_k = \sum_{i=1}^{n} W_i\, z_{ij} Y'_{im}, \qquad j = 1, \ldots, m-1 \tag{18} $$

This reformulation has improved the numerical stability and reduced the order of the set of equations that needs to be solved on each iteration by 1.

4 Error analysis

The variances of the hyperplane parameters can be found by evaluating

$$ \sigma^2(a_j) = \sum_{i=1}^{n} \sum_{k=1}^{m} \sigma_{ik}^2 \left( \frac{\partial a_j}{\partial Y_{ik}} \right)^2 \tag{19} $$

(This equation assumes the data $Y_{ik}$ are uncorrelated.) Since the dependence of $a_j$ on $Y_{ik}$ is not linear as Equation (13) suggests, due to the dependence of $W_i$ and $y_{ik}$ on $a_j$, evaluation of this expression is very complicated. The original version of this paper contained an error in the result of this calculation, and a corrected calculation has not yet been done. To first order, however, ignoring the nonlinearity one obtains the approximation

$$ \sigma^2(a_j) \approx \left( M^{-1} \right)_{jj} \tag{20} $$

that is, the variances of the parameters are given simply by the diagonal elements of the inverse of the normal matrix defined in Equation (13). (The off-diagonal elements of this matrix are the covariances of the parameters.) For well-behaved data such as those used for illustration by York (1966), this approximation is good to within a few percent.
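As a rough illustration of the first-order estimate in Equation (20), the sketch below inverts a small normal matrix and reads off its diagonal. The helper names `invert` and `parameter_variances` are hypothetical, and the matrix `M` is a made-up example: it is the matrix of Equation (13) for four points at $x = 0, 1, 2, 3$ lying exactly on $y = 2x + 1$ with $\sigma_x = 0.1$ and $\sigma_y = 0.2$, for which $W_i = 12.5$ and $y_{ik} = Y_{ik}$.

```python
def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    A = [[float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            raise ValueError("matrix is singular")
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [v - factor * w for v, w in zip(A[r], A[c])]
    return [row[n:] for row in A]  # right half now holds the inverse


def parameter_variances(M):
    """First-order variances of the fitted parameters: diagonal of M^-1, Eq. (20)."""
    inverse = invert(M)
    return [inverse[j][j] for j in range(len(M))]


# Hypothetical converged normal matrix: W * [[sum x^2, sum x], [sum x, n]] with W = 12.5.
M = [[175.0, 75.0], [75.0, 50.0]]
variances = parameter_variances(M)  # [variance of slope, variance of intercept]
```

The off-diagonal entries of `invert(M)` would give the corresponding covariance estimate under the same first-order approximation.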

If the experimenter does not have standard errors $\sigma_{ik}$ for the measured quantities $Y_{ik}$, but only relative uncertainties, the resulting fit is the same using these relative uncertainties, but the variances in the fitted parameters are given by expression (20) multiplied by $S/\nu$, where $\nu = n - m$ is the number of degrees of freedom of the problem. If the errors $\sigma_{ik}$ are known a priori, then the goodness of fit can be inferred from the value of $S/\nu$, which should be close to unity for normally distributed errors. This constitutes a test of the $m$-component hypothesis as set forth in the introduction.

5 Bibliography

Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Annals of Mathematical Statistics, 34, 122-148.

Bevington, P. R. and Robinson, D. K. (1991). Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, New York.

Deming, W. E. (1943). Statistical Adjustment of Data, Wiley, New York.

Lawton, W. H. and Sylvestre, E. A. (1971). Self modeling curve resolution. Technometrics, 13, 617-633.

Williamson, J. H. (1968). Least-squares fitting of a straight line. Canadian Journal of Physics, 46, 1845-1846.

York, D. (1966). Least-squares fitting of a straight line. Canadian Journal of Physics, 44, 1079-1086.