Lecture 19: Curve fitting I


Lecture 19: Curve fitting I
EE Numerical Computing, Scott Hudson

1 Introduction

Suppose we are presented with eight points of measured data (x_i, y_i). As shown in Fig. 1 on the left, we could represent the underlying function of which these data are samples by interpolating between the data points using one of the methods we have studied previously.

Fig. 1: Measured data with: (left) spline interpolation, (right) line fit.

However, maybe the data are samples of the response of a process that we know, in theory, is supposed to have the form y = f(x) = a x + b, where a, b are constants. Maybe we also know that y is a very weak signal and the sensor used to measure it is noisy; that is, it adds its own (random) signal in with the true y data. Given this, it makes no sense to interpolate the data, because in part we'll be interpolating noise, and we know that the real signal should have the form y = a x + b. In a situation like this we prefer to fit a line to the data rather than perform an interpolation (Fig. 1 at right). If done correctly, this can provide a degree of immunity against the effects of measurement errors and noise. More generally, we want to develop curve-fitting techniques that allow theoretical curves, or models, with unknown parameters (such as a and b in the line case) to be fit to data points.

2 Fitting a constant to measured data

The simplest curve-fitting problem is estimating a parameter from multiple measurements. Suppose m is the mass of an object. We want to measure this using a scale. Unfortunately, the scales in our laboratory are not well calibrated. However, we have nine scales. We expect that if

Fig. 2: Horizontal line is the average of several measurements (dots).

we take measurements with all of them and average the results, we should get a better estimate of the true mass than by relying on the measurement from a single scale. Our results might look something like those shown in Fig. 2. Let the measurement of the ith scale be m_i; then the average measurement is given by

\bar{m} = \frac{1}{n} \sum_{i=1}^{n} m_i    (1)

where n is the number of measurements. This is what we should use as our best estimate of the true mass. Averaging is a very basic form of curve fitting.

3 Least-squares line fit

Going back to the situation illustrated in Fig. 1, how do we figure out the best-fit line? There doesn't seem to be a straightforward way to average the data like we did in Fig. 2. Instead, let's suppose we have n data points (x_i, y_i). We are interested in a linear model of the form y = a x + b, and our task is to calculate the best values for a and b. If all our data actually fell on a line, then the best a and b values would result in y_i - (a x_i + b) = 0 for i = 1, 2, ..., n. More generally, let's define the residual (the "error of the fit") for the ith data point as

r_i = y_i - (a x_i + b)    (2)

A perfect fit would give r_i = 0 for all i. The residual can be positive or negative, but what we are most concerned with is its magnitude. Let's define the mean squared error (MSE) as

MSE = \frac{1}{n} \sum_{i=1}^{n} r_i^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right)^2    (3)
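Definitions (2) and (3) are easy to evaluate directly. A minimal NumPy sketch (a translation; the lecture's own examples use Scilab, and the data points below are made up for illustration):

```python
import numpy as np

# Made-up data points that lie near, but not exactly on, the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

def mse(a, b):
    """Mean squared error of the line y = a*x + b, per equations (2)-(3)."""
    r = y - (a * x + b)        # residuals r_i = y_i - (a*x_i + b)
    return np.mean(r ** 2)     # MSE = (1/n) * sum(r_i^2)

# A line close to the data fits far better than a clearly wrong one.
print(mse(2.0, 1.0))   # small (about 0.025)
print(mse(0.0, 0.0))   # much larger
```

Least-squares fitting amounts to searching over (a, b) for the minimum of this function, which the next section does in closed form.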

We now seek the values of a and b that minimize the MSE. These will satisfy

\frac{\partial MSE}{\partial a} = 0 \quad and \quad \frac{\partial MSE}{\partial b} = 0    (4)

The b derivative is

\frac{\partial MSE}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right) = 0    (5)

Multiplying through by -1/2 and rearranging, we find

\frac{1}{n} \sum_{i=1}^{n} y_i - a \frac{1}{n} \sum_{i=1}^{n} x_i - b = 0    (6)

Now define the average x and y values as

\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i , \quad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i    (7)

Equation (6) then reads

\bar{y} - a \bar{x} - b = 0    (8)

or

a \bar{x} + b = \bar{y}    (9)

This tells us that the point ( \bar{x} , \bar{y} ) (the centroid of the data) falls on the line. The a derivative of the MSE is

\frac{\partial MSE}{\partial a} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right) x_i = 0    (10)

Multiplying through by -1/2 and rearranging, we find

\frac{1}{n} \sum_{i=1}^{n} x_i y_i - a \frac{1}{n} \sum_{i=1}^{n} x_i^2 - b \frac{1}{n} \sum_{i=1}^{n} x_i = 0    (11)

or

\overline{xy} - a \overline{x^2} - b \bar{x} = 0    (12)

with the additional definitions

\overline{xy} = \frac{1}{n} \sum_{i=1}^{n} x_i y_i , \quad \overline{x^2} = \frac{1}{n} \sum_{i=1}^{n} x_i^2    (13)

A final rearrangement gives us

a \overline{x^2} + b \bar{x} = \overline{xy}    (14)

We now have two equations in the two unknowns a, b:

a \bar{x} + b = \bar{y}
a \overline{x^2} + b \bar{x} = \overline{xy}    (15)
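The pair of equations (15) is just a 2-by-2 linear system, so it can also be handed to a numerical solver. A NumPy sketch (not from the lecture, which works the system by hand) that also checks the centroid property (9):

```python
import numpy as np

# Noiseless samples of y = 2x + 3; the fit should recover a = 2, b = 3 exactly.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = 2.0 * x + 3.0

# Build the system (15):  a*x_bar  + b       = y_bar
#                         a*x2_bar + b*x_bar = xy_bar
x_bar, y_bar = x.mean(), y.mean()
x2_bar, xy_bar = (x**2).mean(), (x * y).mean()
M = np.array([[x_bar, 1.0], [x2_bar, x_bar]])
rhs = np.array([y_bar, xy_bar])
a, b = np.linalg.solve(M, rhs)

print(a, b)  # recovers the true slope and intercept
# Centroid property, equation (9): the point (x_bar, y_bar) lies on the line.
print(np.isclose(a * x_bar + b, y_bar))  # True
```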

Fig. 3: Least-squares line fit to noisy data.

Solving the first equation for b,

b = \bar{y} - a \bar{x}    (16)

and substituting this into the second equation, we obtain

a \overline{x^2} + ( \bar{y} - a \bar{x} ) \bar{x} = \overline{xy}    (17)

Solving this for a, we have

a = \frac{ \overline{xy} - \bar{x} \, \bar{y} }{ \overline{x^2} - \bar{x}^2 }    (18)

Equations (18) and (16) provide the best-fit values of a and b. Because we obtained these parameters by minimizing the sum of squared residuals, this is called a least-squares line fit.

Example. The code below generates six points on the line y = 1 - x and adds normally distributed noise of standard deviation 0.1 to the y values. Then (18) and (16) are used to calculate the best-fit values of a and b. The data and fit line are plotted in Fig. 3. The true values are a = -1, b = 1. The fit values are a = -0.90, b = 1.09.

-->x = [0:0.2:1]';
-->y = 1-x+rand(x,'normal')*0.1;
-->a = (mean(x.*y)-mean(x)*mean(y))/(mean(x.^2)-mean(x)^2)
 a  = - 0.90347
-->b = mean(y)-a*mean(x)
 b  = 1.0945
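The same session can be rendered in NumPy (a sketch, not the lecture's code; a different random generator means the fitted digits will not match the Scilab run above exactly):

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed so the run repeats
x = np.linspace(0.0, 1.0, 6)                 # six points, like [0:0.2:1]'
y = 1.0 - x + 0.1 * rng.standard_normal(6)   # y = 1 - x plus noise

# Equation (18) for the slope, then equation (16) for the intercept.
a = (np.mean(x*y) - np.mean(x)*np.mean(y)) / (np.mean(x**2) - np.mean(x)**2)
b = np.mean(y) - a * np.mean(x)

# Cross-check: np.polyfit with degree 1 performs the same least-squares fit.
a_np, b_np = np.polyfit(x, y, 1)
print(np.allclose([a, b], [a_np, b_np]))  # True
print(a, b)  # near the true values a = -1, b = 1
```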

4 Linear least-squares

The least-squares idea can be applied to a linear combination of any functions f_1(x), f_2(x), ..., f_m(x). Our model has the form

y = \sum_{j=1}^{m} c_j f_j(x)    (19)

For example, if m = 2 and f_1(x) = 1, f_2(x) = x, then our model is

y = c_1 + c_2 x    (20)

which is just the linear case we've already dealt with. If we add f_3(x) = x^2, then the model is

y = c_1 + c_2 x + c_3 x^2    (21)

which is an arbitrary quadratic. Or we could have a model such as

y = c_1 \cos(5x) + c_2 \sin(5x) + c_3 \cos(10x) + c_4 \sin(10x)    (22)

In any case we'll continue to define the residuals as the difference between the observed and the modeled y values,

r_i = y_i - \sum_{j=1}^{m} c_j f_j(x_i)    (23)

and the mean-squared error as

MSE = \frac{1}{n} \sum_{i=1}^{n} r_i^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{m} c_j f_j(x_i) \right)^2    (24)

Let's expand this as

\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ y_i^2 - 2 y_i \sum_{j=1}^{m} c_j f_j(x_i) + \left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 \right]    (25)

Call

\overline{y^2} = \frac{1}{n} \sum_{i=1}^{n} y_i^2 \quad and \quad \frac{1}{n} \sum_{i=1}^{n} y_i \sum_{j=1}^{m} c_j f_j(x_i) = \sum_{j=1}^{m} b_j c_j    (26)

with

b_j = \frac{1}{n} \sum_{i=1}^{n} y_i f_j(x_i)    (27)

The last term in (25) can be written

\left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \sum_{j=1}^{m} c_j f_j(x_i) \sum_{k=1}^{m} c_k f_k(x_i)    (28)

Therefore

\frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} c_j f_j(x_i) \right)^2 = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{m} c_j f_j(x_i) \, c_k f_k(x_i) = \sum_{j=1}^{m} \sum_{k=1}^{m} a_{jk} c_j c_k    (29)

with

a_{jk} = a_{kj} = \frac{1}{n} \sum_{i=1}^{n} f_j(x_i) f_k(x_i)    (30)

Finally, we can write

MSE = \overline{y^2} - 2 \sum_{i=1}^{m} b_i c_i + \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} c_i c_j    (31)

This shows that the MSE is a quadratic function of the unknown coefficients. In the lecture "Optimization in n dimensions" we calculated the solution to a system of this form, except that the second term (with the b coefficients) had a plus rather than a minus sign. Defining the column vectors b and c and the matrix A as

c = [c_j], \quad b = [b_j], \quad A = [a_{ij}]    (32)

the condition for a minimum is (with the minus sign for the b coefficients)

-2b + 2Ac = 0    (33)

and

c = A^{-1} b    (34)

Another way to arrive at this result is to define the column vector

y = [y_i]    (35)

and the matrix

F = [f_{ij}] \quad with \quad f_{ij} = f_j(x_i)    (36)

Then our model is

y = F c    (37)

This is n equations in m < n unknowns and in general will not have a solution. Multiplying both sides on the left by F^T results in the system

F^T F c = F^T y    (38)

Since F^T F is m-by-m and F^T y is m-by-1, this is a system of m equations in m unknowns that, in general, will have a unique solution

c = ( F^T F )^{-1} F^T y    (39)

The elements of F^T F are

[ F^T F ]_{jk} = \sum_{i=1}^{n} f_{ij} f_{ik} = n a_{jk}    (40)

while the elements of F^T y are

[ F^T y ]_j = \sum_{i=1}^{n} f_{ij} y_i = n b_j    (41)

Therefore F^T F c = F^T y, when multiplied through by 1/n, is equivalent to

A c = b    (42)

The linear system (38) is called the normal equation, and we have the following algorithm:

Linear least-squares fit
Given n samples (x_i, y_i) and a model y = \sum_{j=1}^{m} c_j f_j(x):
Form the matrix F with elements f_{ij} = f_j(x_i).
Form the column vector y with elements y_i.
Solve the normal equation F^T F c = F^T y for c.
The modeled y values are \hat{y} = F c.

The matrix F is not square if n > m, so we cannot solve the linear system

y = F c    (43)

by writing

c = F^{-1} y    (44)

because F does not have an inverse. However, as we've seen, we can compute

c = ( F^T F )^{-1} F^T y    (45)

and this c will come as close as possible (in a least-squares sense) to solving (43). This leads us to define the pseudoinverse of F as the matrix

F^{+} = ( F^T F )^{-1} F^T    (46)

Our least-squares solution can now be written

c = F^{+} y    (47)

In Scilab/Matlab the pseudoinverse is computed by the command pinv(F). However, if we simply apply the backslash operator as we would for a square system,

c = F\y

Scilab/Matlab returns the least-squares solution. We do not have to explicitly form the normal

equation or the pseudoinverse.

Example. Noise was added to eleven samples of y = x^2 - x, x = 0, 0.1, 0.2, ..., 1. A least-squares fit of the model c_1 + c_2 x + c_3 x^2 gave c_1 = 0.044, c_2 = -1.047, c_3 = 1.039. Code is shown below and results are plotted in Fig. 4.

-->x = [0:0.1:1]';
-->y0 = x.^2-x;
-->y = y0+rand(y0,'normal')*0.03; //add noise
-->F = [ones(x),x,x.^2];
-->c = F\y
 c  =
   0.043654
 - 1.04735
   1.03903
-->yf = F*c

Fig. 4: f(x) = x^2 - x (dashed curve), samples of f(x) with noise added (dots), and least-squares fit of the model c_1 + c_2 x + c_3 x^2 (solid line).

5 Goodness of fit

Once we've fit a model to data, we may wonder if the fit is good or not. It would be helpful to have a measure of goodness of fit. Doing this rigorously requires details from probability theory; we will present the following results without derivation. Assume our y values are of the form y_i = s_i + \eta_i, where s_i is the signal that we are trying to model and \eta_i is noise. If our model were to perfectly

fit the signal, then the residuals

r_i = y_i - \sum_{j=1}^{m} c_j f_j(x_i)    (48)

would simply be noise: r_i = \eta_i. We can quantify the goodness of fit by comparing the statistics of our residuals to the (assumed known) statistics of the noise. Specifically, for large n and normally distributed noise, a good fit will result in the number

\sigma = \sqrt{ \frac{1}{n-m} \sum_{i=1}^{n} r_i^2 }    (49)

being equal, on average, to the standard deviation of the noise, where n is the number of data points and m is the number of model coefficients. If it is significantly larger than this, it indicates that the model is not accounting for all of the signal, where a fractional change of about 1/\sqrt{2(n-m)} is statistically significant. For example, 1/\sqrt{2 \cdot 50} = 0.1 means that a change of around 10% is statistically significant. If the noise standard deviation is 0.1, a \sigma larger than about 0.1(1.1) = 0.11 implies the signal is not being fully modeled. The following example illustrates the use of this goodness-of-fit measure.

Example. The following code was used to generate 50 samples of the function f(x) = x + x^2 over the interval 0 <= x <= 1, with normally distributed noise of standard deviation 0.05 added to each sample.

n = 50;
rand('seed',1);
x = [linspace(0,1,n)]';
y = x+x.^2+rand(x,'normal')*0.05;

These data were then fit by the four models y = c_1, y = c_1 + c_2 x, y = c_1 + c_2 x + c_3 x^2, and y = c_1 + c_2 x + c_3 x^2 + c_4 x^3. The resulting \sigma values were \sigma_0 = 0.608, \sigma_1 = 0.0864, \sigma_2 = 0.0506, and \sigma_3 = 0.0504. Since 1/\sqrt{2 \cdot 50} = 0.1, a change of about 10% is statistically significant. The fits improved significantly until the last model. The data therefore support the model y = c_1 + c_2 x + c_3 x^2 but not the cubic model. The fits are shown in Fig. 5.

Fig. 5: Data set fit by polynomials. Top-left: y = c_1, \sigma_0 = 0.608. Top-right: y = c_1 + c_2 x, \sigma_1 = 0.0864. Bottom-left: y = c_1 + c_2 x + c_3 x^2, \sigma_2 = 0.0506. Bottom-right: y = c_1 + c_2 x + c_3 x^2 + c_4 x^3, \sigma_3 = 0.0504.
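The whole recipe, fitting a family of polynomial models and comparing their \sigma values from equation (49), can be replayed in NumPy (a sketch paralleling the Scilab example above; a different random generator means the \sigma values will only roughly match those quoted):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 1.0, n)
y = x + x**2 + 0.05 * rng.standard_normal(n)   # f(x) = x + x^2 plus noise

def sigma_fit(degree):
    """Least-squares polynomial fit of the given degree; returns sigma
    from equation (49), with m = degree + 1 model coefficients."""
    F = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    c, *_ = np.linalg.lstsq(F, y, rcond=None)      # solves min ||F c - y||
    r = y - F @ c                                  # residuals
    m = degree + 1
    return np.sqrt(np.sum(r**2) / (n - m))

sigmas = [sigma_fit(d) for d in range(4)]   # constant, line, quadratic, cubic
print(sigmas)
# Expect large drops up to the quadratic model, then essentially no change;
# the quadratic sigma should sit near the noise level 0.05.
```

As in the lecture's example, the stalling of \sigma between the quadratic and cubic fits is the signal to stop adding model terms.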