ESS Line Fitting


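ESS 5 014

17. Line Fitting

A very common problem in data analysis is looking for relationships between different parameters and fitting lines or surfaces to data. The simplest example is fitting a straight line, and we will discuss that here (it is also covered in Chapter 4 of Paul Wessel's notes).

Least-squares straight-line fitting

The process of fitting a straight line is one of the simplest examples of an inverse problem. Given $N$ pairs of data points $(X_i, Y_i)$, $i = 1, \ldots, N$, we fit the data with a relationship

$$y = a + bX \tag{17-1}$$

where $y$ is the predicted value of $Y$. We can use the $\chi^2$ statistic to measure the misfit

$$\chi^2 = \sum_{i=1}^{N} \left( \frac{Y_i - a - bX_i}{\sigma_i} \right)^2 \tag{17-2}$$

where $\sigma_i$ are the uncertainties (or estimates of the uncertainties, which can be set to unity in the absence of better knowledge) in the $Y_i$ values. Our goal is to find the values of $a$ and $b$ that minimize $\chi^2$. To do this we take the partial derivatives of $\chi^2$ with respect to $a$ and $b$ and solve for the values of $a$ and $b$ at which they are both zero

$$\frac{\partial \chi^2}{\partial a} = -2 \sum_{i=1}^{N} \frac{Y_i - a - bX_i}{\sigma_i^2} = 0, \qquad \frac{\partial \chi^2}{\partial b} = -2 \sum_{i=1}^{N} \frac{X_i \left( Y_i - a - bX_i \right)}{\sigma_i^2} = 0 \tag{17-3}$$

If we define

$$S = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}, \quad S_x = \sum_{i=1}^{N} \frac{X_i}{\sigma_i^2}, \quad S_y = \sum_{i=1}^{N} \frac{Y_i}{\sigma_i^2}, \quad S_{xx} = \sum_{i=1}^{N} \frac{X_i^2}{\sigma_i^2}, \quad S_{xy} = \sum_{i=1}^{N} \frac{X_i Y_i}{\sigma_i^2} \tag{17-4}$$

then equation (17-3) reduces to

$$aS + bS_x = S_y, \qquad aS_x + bS_{xx} = S_{xy} \tag{17-5}$$

The solution is

$$a = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta}, \qquad b = \frac{S S_{xy} - S_x S_y}{\Delta} \tag{17-6}$$

with

$$\Delta = S S_{xx} - S_x^2 \tag{17-7}$$

We can also estimate the uncertainties in $a$ and $b$. To do this we sum the variance in $a$ and $b$ resulting from the variance in each of the $Y_i$ values. This can be written mathematically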

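$$s_a^2 = \sum_{i=1}^{N} \sigma_i^2 \left( \frac{\partial a}{\partial Y_i} \right)^2, \qquad s_b^2 = \sum_{i=1}^{N} \sigma_i^2 \left( \frac{\partial b}{\partial Y_i} \right)^2 \tag{17-8}$$

After substituting derivatives obtained from equation (17-6) and a fair amount of manipulation we get

$$s_a^2 = \frac{S_{xx}}{\Delta}, \qquad s_b^2 = \frac{S}{\Delta} \tag{17-9}$$

We can also estimate the covariance of the uncertainties in $a$ and $b$

$$s_{ab} = \sum_{i=1}^{N} \sigma_i^2 \frac{\partial a}{\partial Y_i} \frac{\partial b}{\partial Y_i} = -\frac{S_x}{\Delta} \tag{17-10}$$

Our estimate of the correlation coefficient between $a$ and $b$ becomes

$$r_{ab} = \frac{s_{ab}}{s_a s_b} = -\frac{S_x}{\sqrt{S S_{xx}}} \tag{17-11}$$

If we assume our estimates of the uncertainty are correct, we can check whether the fit is adequate (significant) at the $\alpha$ level by comparing our value of $\chi^2$ to the critical $\chi^2_\alpha$ for $N - 2$ degrees of freedom. Provided it does not exceed this value, the data are fit adequately by the straight line.

We can test the significance of the correlation of $x$ and $y$ by applying the t-statistic with $N - 2$ degrees of freedom to determine whether the slope, given our estimate of its uncertainty, is significantly different from 0

$$t = \frac{b - 0}{s_b} \tag{17-12}$$

We can write the 95% confidence limits for $b$ as

$$b \pm t_{0.025}\, s_b \tag{17-13}$$

If these limits enclose zero we cannot be confident that $x$ and $y$ are correlated at the 95% level.

If we do not know the uncertainty of our data but know that the straight-line model is correct, then we can initially assume an uncertainty of 1 for the purpose of getting a straight-line fit and then estimate it from the residuals according to

$$s^2 = \frac{1}{N - 2} \sum_{i=1}^{N} \left( Y_i - a - bX_i \right)^2 \tag{17-14}$$

We can use $s^2$ in place of the population variance $\sigma^2$ to estimate the slope uncertainties, but this will lead to an underestimate of the uncertainty for small $N$ because there is additional uncertainty arising from using an estimate of $\sigma$ and not its true value.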
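The closed-form expressions in equations (17-4) to (17-14) can be evaluated in a few lines. The following is a minimal Python sketch using only the standard library; the function names `fit_line` and `residual_scale` and the sample data are illustrative assumptions, not part of the notes.

```python
import math

def fit_line(X, Y, sigma=None):
    """Least-squares fit of y = a + b*X using the sums of equation (17-4)."""
    n = len(X)
    if sigma is None:
        sigma = [1.0] * n  # unit uncertainties when nothing better is known
    w = [1.0 / s ** 2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, X))
    Sy = sum(wi * y for wi, y in zip(w, Y))
    Sxx = sum(wi * x * x for wi, x in zip(w, X))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, X, Y))
    delta = S * Sxx - Sx ** 2                      # equation (17-7)
    a = (Sxx * Sy - Sx * Sxy) / delta              # equation (17-6)
    b = (S * Sxy - Sx * Sy) / delta
    chi2 = sum(((y - a - b * x) / s) ** 2 for x, y, s in zip(X, Y, sigma))
    return {
        "a": a,
        "b": b,
        "s_a": math.sqrt(Sxx / delta),             # equation (17-9)
        "s_b": math.sqrt(S / delta),
        "r_ab": -Sx / math.sqrt(S * Sxx),          # equation (17-11)
        "chi2": chi2,                              # equation (17-2)
    }

def residual_scale(X, Y, a, b):
    """Estimate s from the residuals, equation (17-14)."""
    n = len(X)
    return math.sqrt(sum((y - a - b * x) ** 2 for x, y in zip(X, Y)) / (n - 2))

# Made-up data with unknown uncertainties: fit with sigma = 1, then rescale.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.1, 2.9, 5.2, 6.8, 9.0]
fit = fit_line(X, Y)
s = residual_scale(X, Y, fit["a"], fit["b"])
s_b = s * fit["s_b"]       # slope uncertainty after rescaling by s
t = (fit["b"] - 0.0) / s_b  # equation (17-12)
print(fit["a"], fit["b"], s_b, t)
```

Comparing `t` with a critical value for $N - 2$ degrees of freedom still requires a t-table or a statistics library, which this sketch leaves out.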

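Note that Paul Wessel uses $\sigma_b$ etc. instead of $s_b$ in section 4.1, but this is not consistent with the notation he introduces in Chapter 1 and that we have used, since $s_b$ is clearly an estimate based on a limited sample of points (not the entire population).

Line fitting with errors in x and y

It is important to note that equation (17-2) assumes that our determinations of $x$ have no uncertainty. In some instances this is a good assumption (for example, our determinations of time or spatial coordinate will often have negligible uncertainty). In other instances it is a poor approximation (for example, if we plot the concentrations of two dissolved chemicals in seawater or of two trace elements in a rock, they may both have similar analytical errors). If we have errors in both variables then a better measure of misfit is given by

$$E = \sum_{i=1}^{N} \left[ \frac{(X_i - x_i)^2}{\sigma_{x,i}^2} + \frac{(Y_i - y_i)^2}{\sigma_{y,i}^2} \right] \tag{17-15}$$

where $X_i$ and $Y_i$ are the observed data and $x_i$ and $y_i$ are the modeled values that are required to lie on a straight line

$$y_i = a + bx_i \tag{17-16}$$

Our goal is to find the values of $a$ and $b$ that minimize $E$. To do this we use the method of Lagrange multipliers. We can write equation (17-16) as

$$f_i = a + bx_i - y_i = 0 \tag{17-17}$$

and since the $f_i$ values are constrained to be zero we can write equation (17-15) as

$$E = \sum_{i=1}^{N} \left[ \frac{(X_i - x_i)^2}{\sigma_{x,i}^2} + \frac{(Y_i - y_i)^2}{\sigma_{y,i}^2} + 2\lambda_i f_i \right] \tag{17-18}$$

where the $\lambda_i$ values are unknown constant Lagrange multipliers and the factor of 2 is for algebraic convenience. We now set the partial derivatives of $E$ to zero to find the values that give a minimum

$$\frac{\partial E}{\partial x_i} = \frac{\partial E}{\partial y_i} = \frac{\partial E}{\partial a} = \frac{\partial E}{\partial b} = 0$$

Now if we make the assumption that all the sigma values are equal to unity this gives

$$\frac{\partial E}{\partial x_i} = \frac{\partial}{\partial x_i} (X_i - x_i)^2 + 2\lambda_i \frac{\partial}{\partial x_i} (bx_i) = -2(X_i - x_i) + 2b\lambda_i = 0 \tag{17-19}$$

$$\frac{\partial E}{\partial y_i} = \frac{\partial}{\partial y_i} (Y_i - y_i)^2 - 2\lambda_i \frac{\partial}{\partial y_i} (y_i) = -2(Y_i - y_i) - 2\lambda_i = 0 \tag{17-20}$$

$$\frac{\partial E}{\partial a} = \frac{\partial}{\partial a} \sum_{i=1}^{N} 2\lambda_i a = 2 \sum_{i=1}^{N} \lambda_i = 0 \tag{17-21}$$

$$\frac{\partial E}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{N} 2\lambda_i b x_i = 2 \sum_{i=1}^{N} \lambda_i x_i = 0 \tag{17-22}$$

From equations (17-19) and (17-20) we can write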

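$$x_i = X_i - b\lambda_i, \qquad y_i = Y_i + \lambda_i \tag{17-23}$$

Substituting for $x_i$ and $y_i$ in equation (17-16) yields

$$Y_i + \lambda_i = a + b(X_i - b\lambda_i) = a + bX_i - b^2\lambda_i \tag{17-24}$$

Solving for $\lambda_i$ yields

$$\lambda_i = \frac{a + bX_i - Y_i}{1 + b^2} \tag{17-25}$$

Substituting for $\lambda_i$ into equation (17-21) yields

$$\sum_{i=1}^{N} \frac{a + bX_i - Y_i}{1 + b^2} = 0 \tag{17-26}$$

Substituting for $\lambda_i$ from equation (17-25) and for $x_i$ from (17-23) into equation (17-22) yields

$$\sum_{i=1}^{N} \frac{a + bX_i - Y_i}{1 + b^2} \left( X_i - b\lambda_i \right) = \sum_{i=1}^{N} \frac{a + bX_i - Y_i}{1 + b^2} \left( X_i - b\, \frac{a + bX_i - Y_i}{1 + b^2} \right) = 0 \tag{17-27}$$

We have now reduced the $N + 2$ equations for $a$, $b$ and $\lambda_i$ to equations (17-26) and (17-27) for $a$ and $b$. Since the denominator in equation (17-26) cannot reduce to zero, we can write

$$\sum_{i=1}^{N} \left( a + bX_i - Y_i \right) = 0 \;\Rightarrow\; Na = \sum_{i=1}^{N} Y_i - b \sum_{i=1}^{N} X_i \;\Rightarrow\; a = \bar{Y} - b\bar{X} \tag{17-28}$$

where $\bar{X}$ and $\bar{Y}$ are the mean values of the data. We can substitute equation (17-28) into equation (17-27), multiply by $(1 + b^2)^2$, and use the variables $U_i = X_i - \bar{X}$ and $V_i = Y_i - \bar{Y}$ and after a few lines of manipulation get

$$b^2 \sum U_i V_i + b \left( \sum U_i^2 - \sum V_i^2 \right) - \sum U_i V_i = 0. \tag{17-29}$$

This has the solution

$$b = \frac{\sum V_i^2 - \sum U_i^2 \pm \sqrt{\left( \sum U_i^2 - \sum V_i^2 \right)^2 + 4 \left( \sum U_i V_i \right)^2}}{2 \sum U_i V_i} \tag{17-30}$$

There are two solutions for $b$, each with a corresponding value of $a$ from equation (17-28): one that minimizes $E$ and a second that gives a perpendicular line that maximizes $E$.

Robust Line Fitting

In a least squares line fit in which we assume all the data have the same uncertainty we seek to minimize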

$$\text{Minimize } E = \sum_{i=1}^{N} \left( Y_i - a - bX_i \right)^2 = \sum_{i=1}^{N} r_i^2 \tag{17-31}$$

This process is sensitive to outliers, particularly so when the outliers lie near the lower or upper limits of the range of $x$. The breakdown point for the least squares line fit ($L_2$ regression) is $1/N$. We can overcome this problem to a small extent by minimizing the sum of the absolute misfits ($L_1$ regression)

$$\text{Minimize } E = \sum_{i=1}^{N} \left| r_i \right| \tag{17-32}$$

but the $L_1$ norm also has a breakdown point of $1/N$. A robust approach with a breakdown point of ½ is to minimize the median misfit.

$$\text{Minimize } \operatorname{median} \left| r_i \right| = \operatorname{median} \left| Y_i - a - bX_i \right| \tag{17-33}$$

This is equivalent to finding the narrowest strip that encloses half the points. The only way to do this is by a systematic search through different values of $b$. For each value of $b$ we calculate $Y_i - bX_i$, and then find the value of $a$ that minimizes the median of $\left| Y_i - bX_i - a \right|$. One then chooses the $a$ and $b$ values that give the minimum median among all the values of $b$ analyzed.

One can use this robust statistical method to find and eliminate outliers

$$\frac{\left| Y_i - a - bX_i \right|}{\operatorname{median} \left| Y_i - a - bX_i \right|} > z_{cut} \tag{17-34}$$

where a value of $z_{cut} = 4.45$ is equivalent to 3 standard deviations for a normal distribution. Once the outliers are eliminated, one can apply the least squares line fitting approach.
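The systematic search over $b$ and the outlier test of equation (17-34) might be sketched as follows in Python (standard library only). Several choices here are assumptions rather than part of the notes: the function names, the slope grid, the made-up data, and the way $a$ is found for each $b$ (the midpoint of the shortest window containing just over half of the sorted values $Y_i - bX_i$, a common choice for this kind of median fit).

```python
import statistics

def shortest_half_midpoint(values):
    """Midpoint of the narrowest window holding n//2 + 1 of the sorted values
    (assumed method for choosing the intercept a at a given slope)."""
    d = sorted(values)
    n = len(d)
    h = n // 2 + 1
    start = min(range(n - h + 1), key=lambda i: d[i + h - 1] - d[i])
    return 0.5 * (d[start] + d[start + h - 1])

def robust_line(X, Y, slopes):
    """Grid search over trial slopes for the smallest median absolute misfit."""
    best = None
    for b in slopes:
        d = [y - b * x for x, y in zip(X, Y)]
        a = shortest_half_midpoint(d)
        med = statistics.median([abs(di - a) for di in d])
        if best is None or med < best[2]:
            best = (a, b, med)
    return best  # (a, b, median absolute misfit)

def flag_outliers(X, Y, a, b, z_cut=4.45):
    """True where |residual| exceeds z_cut times the median |residual|.
    Written as a product to avoid dividing by a zero median."""
    r = [abs(y - a - b * x) for x, y in zip(X, Y)]
    med = statistics.median(r)
    return [ri > z_cut * med for ri in r]

# Made-up data: points near y = 2 + 0.5 x, with one gross outlier at X = 9.
X = [float(i) for i in range(10)]
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.01, 0.0]
Y = [2.0 + 0.5 * x + e for x, e in zip(X, noise)]
Y[-1] = 20.0

slopes = [i / 100 for i in range(-200, 201)]  # trial slopes from -2 to 2
a, b, med = robust_line(X, Y, slopes)
flags = flag_outliers(X, Y, a, b)
print(a, b, med, flags)
```

In this example the point at $X = 9$ is flagged. The resolution of the result is limited by the spacing of the trial slopes, so a finer grid near the best value may be worthwhile before the flagged points are removed and the least squares fit is applied.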
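Returning to the fit with errors in both $x$ and $y$, equations (17-28) and (17-30) can also be evaluated directly. The Python sketch below assumes unit uncertainties in both variables, as in the derivation. To decide which of the two roots minimizes $E$, it evaluates the misfit as $E = \sum (a + bX_i - Y_i)^2 / (1 + b^2)$, which follows from substituting equations (17-23) and (17-25) into equation (17-15). The function name and data are illustrative assumptions.

```python
import math

def fit_line_xy_errors(X, Y):
    """Straight-line fit with equal (unit) errors in x and y,
    using equations (17-28) and (17-30)."""
    n = len(X)
    x_mean = sum(X) / n
    y_mean = sum(Y) / n
    U = [x - x_mean for x in X]
    V = [y - y_mean for y in Y]
    Suu = sum(u * u for u in U)
    Svv = sum(v * v for v in V)
    Suv = sum(u * v for u, v in zip(U, V))
    if Suv == 0.0:
        raise ValueError("sum(U*V) is zero, so equation (17-30) is undefined")

    root = math.sqrt((Suu - Svv) ** 2 + 4.0 * Suv ** 2)
    candidates = [(Svv - Suu + sign * root) / (2.0 * Suv) for sign in (1.0, -1.0)]

    def misfit(b):
        a = y_mean - b * x_mean  # equation (17-28)
        return sum((a + b * x - y) ** 2 for x, y in zip(X, Y)) / (1.0 + b * b)

    b = min(candidates, key=misfit)  # keep the root that minimizes E
    a = y_mean - b * x_mean
    return a, b, misfit(b)

# Points lying exactly on y = 1 + 2x, so the fit should recover that line.
X = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 3.0, 5.0, 7.0]
a, b, E = fit_line_xy_errors(X, Y)
print(a, b, E)
```

For these points the two roots of equation (17-30) are 2 and -0.5, which are the slopes of perpendicular lines, and the sketch keeps the first.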