Nonlinear regression


How to analyse data?

How to analyse data? Plot!

How to analyse data? Plot! The human brain is one of the most powerful computational tools, and it works differently than a computer.

What if the data have no linear correlation?

1. Linearization: transform the nonlinear problem into a linear one. Example: $y = B e^{Ax}$, so $\log y = \log B + Ax$, i.e. $Y = b + ax$ with $Y = \log y$, $b = \log B$ and $a = A$.
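
As a quick illustration of this linearization, here is a minimal sketch (synthetic data invented for the example, not from the lecture) that recovers A and B by fitting a straight line to (x, log y):

```python
# Linearization sketch: fit y = B*exp(A*x) by regressing log(y) on x.
# The data are synthetic, generated with known A and B for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.1, 2.0, 20)
y = 1.5 * np.exp(0.8 * x) * rng.lognormal(sigma=0.05, size=x.size)  # true B = 1.5, A = 0.8

# Ordinary least squares on (x, log y): log y = log B + A*x
A_hat, logB_hat = np.polyfit(x, np.log(y), 1)   # returns (slope, intercept)
B_hat = np.exp(logB_hat)
print(A_hat, B_hat)   # should come out close to 0.8 and 1.5
```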

A few words about R: in the case of linear regression, the r coefficient indicates the degree of linear dependency between the data. However, there is a more general approach.

A few words about R: $S_r = \sum_i \big(y_i - f(x_i)\big)^2$ is the error of the model; $S_t = \sum_i (y_i - \bar y)^2$ is the discrepancy between the data and a single estimate (the mean).

A few words about R: $\bar y = \frac{1}{n}\sum_i y_i$, $s_y = \sqrt{\frac{S_t}{n-1}}$.

A few words about R: $s_{y/x} = \sqrt{\frac{S_r}{n-2}}$ is the standard error of the estimate, the spread around the line.

A few words about R: $r^2 = \frac{S_t - S_r}{S_t}$ is the error reduction due to describing the data in terms of a model (a straight line). The difference $S_t - S_r$ is scale dependent, hence the normalization by $S_t$.
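
The quantities above are easy to compute directly; the following sketch (with made-up data) evaluates $S_r$, $S_t$, $s_{y/x}$ and $r^2$ for a straight-line fit:

```python
# Compute S_r, S_t, the standard error of the estimate and r^2
# for a least-squares straight line. Data are synthetic, for illustration only.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.9, 4.1, 5.8, 8.2, 9.9])

a, b = np.polyfit(x, y, 1)          # fit y ~ a*x + b
f = a * x + b

S_r = np.sum((y - f) ** 2)          # error of the model
S_t = np.sum((y - y.mean()) ** 2)   # discrepancy w.r.t. the mean alone
s_yx = np.sqrt(S_r / (len(x) - 2))  # standard error of the estimate
r2 = (S_t - S_r) / S_t              # error reduction due to the model
print(S_r, S_t, s_yx, r2)
```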

Anscombe example: $y = 0.5x + 3$, $r^2 = 0.67$.

2. Polynomial: $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$. Same approach as in the case of linear regression: least squares.

2. Polynomial: $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$. The residuals are $e_i = y_i - f(x_i) = y_i - a_0 - a_1 x_i - a_2 x_i^2 - \dots = y_i - \sum_{j=0}^{n} a_j x_i^j$.

2. Polynomial: $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$.
$$SSE(a_0, a_1, \dots) = \sum_i e_i^2 = \sum_i \Big( y_i - \sum_{j=0}^{n} a_j x_i^j \Big)^2$$

How to adjust a and b so SSE is the smallest? $SSE(a, b) = \sum_i (y_i - a x_i - b)^2$. How to calculate the minimum of the SSE(a, b) function?
$$\frac{\partial SSE(a, b)}{\partial a} = 0, \qquad \frac{\partial SSE(a, b)}{\partial b} = 0$$

How to adjust the coefficients so SSE is the smallest?
$$SSE(a_0, a_1, \dots) = \sum_i e_i^2 = \sum_i \Big( y_i - \sum_j a_j x_i^j \Big)^2$$
$$\frac{\partial SSE}{\partial a_0} = \sum_i \frac{\partial}{\partial a_0}\Big( y_i - \sum_j a_j x_i^j \Big)^2 = -2 \sum_i \Big( y_i - \sum_j a_j x_i^j \Big)$$
$$\frac{\partial SSE}{\partial a_1} = \sum_i \frac{\partial}{\partial a_1}\Big( y_i - \sum_j a_j x_i^j \Big)^2 = -2 \sum_i x_i \Big( y_i - \sum_j a_j x_i^j \Big)$$
$$\frac{\partial SSE}{\partial a_2} = \sum_i \frac{\partial}{\partial a_2}\Big( y_i - \sum_j a_j x_i^j \Big)^2 = -2 \sum_i x_i^2 \Big( y_i - \sum_j a_j x_i^j \Big)$$

How to adjust the coefficients so SSE is the smallest? In general, for the $k$-th coefficient:
$$\frac{\partial SSE}{\partial a_k} = \sum_i \frac{\partial}{\partial a_k}\Big( y_i - \sum_j a_j x_i^j \Big)^2 = -2 \sum_i x_i^k \Big( y_i - \sum_j a_j x_i^j \Big)$$

How to adjust the coefficients so SSE is the smallest? Setting each derivative to zero, we obtain a set of $n+1$ linear equations:
$$\frac{\partial SSE}{\partial a_k} = -2 \sum_i x_i^k \Big( y_i - \sum_j a_j x_i^j \Big) = 0 \;\Rightarrow\; \sum_{j=0}^{n} a_j \sum_i x_i^{j+k} = \sum_i x_i^k y_i, \qquad k = 0, \dots, n$$

How to adjust the coefficients so SSE is the smallest? Written out row by row:
Row 1, $k = 0$: $\;a_0\, n + a_1 \sum x_i + a_2 \sum x_i^2 + \dots + a_n \sum x_i^n = \sum y_i$
Row 2, $k = 1$: $\;a_0 \sum x_i + a_1 \sum x_i^2 + a_2 \sum x_i^3 + \dots + a_n \sum x_i^{n+1} = \sum x_i y_i$
...
Row $n+1$, $k = n$: $\;a_0 \sum x_i^n + a_1 \sum x_i^{n+1} + a_2 \sum x_i^{n+2} + \dots + a_n \sum x_i^{2n} = \sum x_i^n y_i$

How to solve it? It is a linear system; in matrix form:
$$\begin{pmatrix} n & \sum x_i & \cdots & \sum x_i^n \\ \sum x_i & \sum x_i^2 & \cdots & \sum x_i^{n+1} \\ \vdots & \vdots & & \vdots \\ \sum x_i^n & \sum x_i^{n+1} & \cdots & \sum x_i^{2n} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^n y_i \end{pmatrix}$$

This set of $n+1$ linear equations is known as the normal equations.
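
A direct way to see the normal equations at work is to build the matrix of power sums and solve it; the sketch below does this for a degree-2 example with made-up data:

```python
# Polynomial least squares via the normal equations written above:
# sum_j a_j * sum_i x_i^(j+k) = sum_i x_i^k * y_i,  k = 0..n.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 9.2, 19.1, 33.0, 50.8])   # roughly 2*x**2 + 1
n = 2                                             # polynomial degree (illustrative choice)

# (n+1) x (n+1) matrix of power sums and the right-hand side vector
M = np.array([[np.sum(x ** (j + k)) for j in range(n + 1)] for k in range(n + 1)])
rhs = np.array([np.sum((x ** k) * y) for k in range(n + 1)])

a = np.linalg.solve(M, rhs)   # coefficients a_0, a_1, ..., a_n
print(a)                      # compare with np.polyfit(x, y, n)[::-1]
```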

Linear regression: a different approach.
$$y_1 = a x_1 + b, \quad y_2 = a x_2 + b, \quad \dots, \quad y_n = a x_n + b$$
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \mathbf{y} = A\mathbf{z}$$

Linear regression: a different approach. The system $\mathbf{y} = A\mathbf{z}$ cannot be solved exactly: it is overdetermined (too many equations), and $A$ is not a square matrix, so it cannot be inverted. Solution? Let's make it a square matrix!

Linear regression: a different approach. Solution? Let's make it a square matrix! $A^T A$ is a square matrix:
$$A^T A = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} = \begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{pmatrix}$$

Linear regression: a different approach. Multiply both sides of $\mathbf{y} = A\mathbf{z}$ by $A^T$: $A^T\mathbf{y} = A^T A\,\mathbf{z}$, where
$$A^T\mathbf{y} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}$$

Example. Data:
X = W (kg): 0.5   1.5   2.0   2.5   3.0
Y = L (m):  0.77  1.1   1.22  1.31  1.4
Model: $y = a e^{bx}$, so $\ln y = bx + \ln a$, i.e. $Y = Ax + B$ with $Y = \ln y$, $A = b$ and $B = \ln a$. The design matrix and the transformed data are
$$A = \begin{pmatrix} 0.5 & 1 \\ 1.5 & 1 \\ 2.0 & 1 \\ 2.5 & 1 \\ 3.0 & 1 \end{pmatrix}, \qquad \mathbf{Y} = \begin{pmatrix} \ln 0.77 \\ \ln 1.1 \\ \ln 1.22 \\ \ln 1.31 \\ \ln 1.4 \end{pmatrix}$$

Example (continued):
$$A^T A = \begin{pmatrix} 0.5 & 1.5 & 2.0 & 2.5 & 3.0 \\ 1.0 & 1.0 & 1.0 & 1.0 & 1.0 \end{pmatrix} \begin{pmatrix} 0.5 & 1 \\ 1.5 & 1 \\ 2.0 & 1 \\ 2.5 & 1 \\ 3.0 & 1 \end{pmatrix} = \begin{pmatrix} 21.75 & 9.5 \\ 9.5 & 5.0 \end{pmatrix}$$

Example (continued): the normal system and its solution are
$$\begin{pmatrix} 21.75 & 9.5 \\ 9.5 & 5.0 \end{pmatrix} \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 2.095 \\ 0.639 \end{pmatrix} \;\Rightarrow\; \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0.2378 \\ -0.3229 \end{pmatrix}$$
so $a = e^{B} \approx 0.7233$, $b = A \approx 0.2378$, and the fitted model is $y = 0.7233\, e^{0.2378 x}$.
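
The worked example can be reproduced in a few lines; the sketch below forms $A^T A$ and $A^T \mathbf{Y}$ for the data above and solves the 2x2 system:

```python
# Reproduce the worked example: fit y = a*exp(b*x) by linearizing to
# ln y = b*x + ln a and solving the normal equations (A^T A) z = A^T Y.
import numpy as np

x = np.array([0.5, 1.5, 2.0, 2.5, 3.0])
y = np.array([0.77, 1.1, 1.22, 1.31, 1.4])

A = np.column_stack([x, np.ones_like(x)])     # design matrix with rows [x_i, 1]
Y = np.log(y)                                 # linearized response ln(y_i)

AtA = A.T @ A                                 # approx [[21.75, 9.5], [9.5, 5.0]]
AtY = A.T @ Y                                 # approx [2.095, 0.639]
slope, intercept = np.linalg.solve(AtA, AtY)  # approx 0.238 and -0.32

a = np.exp(intercept)                         # approx 0.723
b = slope
print(a, b)                                   # fitted model: y = a * exp(b * x)
```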

Summing up: the linear regression problem can be formulated as follows. Having a set of data, approximate every $y_i$ by a linear function of $x_i$, $y_i = f(x_i) + e_i$ with $f(x) = ax + b$, and look for the parameters $a, b$ for which SSE is the smallest. The solution is
$$\mathbf{z} = (A^T A)^{-1} A^T \mathbf{y}, \qquad A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}, \quad \mathbf{z} = \begin{pmatrix} a \\ b \end{pmatrix}, \quad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
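
In code, the closed form above is one line; in practice a dedicated least-squares solver is usually preferred because forming $A^T A$ explicitly can be ill-conditioned. A short sketch with made-up data:

```python
# z = (A^T A)^{-1} A^T y versus np.linalg.lstsq (same answer, better conditioned).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])        # synthetic data for illustration

A = np.column_stack([x, np.ones_like(x)])       # rows [x_i, 1]
z_normal = np.linalg.inv(A.T @ A) @ A.T @ y     # (a, b) via the normal equations
z_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None) # (a, b) via an SVD-based solver
print(z_normal, z_lstsq)
```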

Example: fit the function $f(x, a_0, a_1) = a_0\,(1 - e^{-a_1 x})$ to the data. Here
$$SSE(a_0, a_1) = \sum_i \big( y_i - a_0 (1 - e^{-a_1 x_i}) \big)^2$$
and the partial derivatives of SSE with respect to $a_0$ and $a_1$ are
$$\frac{\partial SSE}{\partial a_0} = -2 \sum_i \big( y_i - a_0 (1 - e^{-a_1 x_i}) \big)\,(1 - e^{-a_1 x_i}) = 0$$
$$\frac{\partial SSE}{\partial a_1} = -2 \sum_i \big( y_i - a_0 (1 - e^{-a_1 x_i}) \big)\, a_0 x_i e^{-a_1 x_i} = 0$$
We obtain a set of nonlinear equations in $a_0$ and $a_1$. One way is to solve them numerically, with no error control. Another is the Gauss-Newton iterative technique.
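
Before deriving Gauss-Newton, note that in practice such a fit is often handed to a library routine; here is a minimal sketch (not the method derived in these notes) using scipy.optimize.curve_fit on the data of the worked example further below:

```python
# Delegate the nonlinear fit f(x) = a0*(1 - exp(-a1*x)) to SciPy's curve_fit.
# Data taken from the worked example later in these notes; p0 is the initial guess.
import numpy as np
from scipy.optimize import curve_fit

x = np.array([0.25, 0.75, 1.25, 1.75, 2.25])
y = np.array([0.28, 0.57, 0.68, 0.74, 0.79])

def f(x, a0, a1):
    return a0 * (1.0 - np.exp(-a1 * x))

popt, pcov = curve_fit(f, x, y, p0=[1.0, 1.0])
print(popt)   # fitted (a0, a1)
```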

Example. Nonlinear regression is characterized by the fact that the prediction equation depends nonlinearly on one or more unknown parameters. Some nonlinear regression problems can be linearized by a suitable transformation of the model formulation. However, use of a nonlinear transformation (that is, linearization) requires caution. The influences of the data values will change, as will the error structure of the model and the interpretation of any inferential results. These may not be desired effects. On the other hand, depending on what the largest source of error is, a nonlinear transformation may distribute your errors in a normal fashion, so the choice to perform a nonlinear transformation must be informed by modeling considerations. The accuracy obtainable with linearization transformations is not as good as that obtainable with the iterative techniques applied directly to the nonlinear problem. However, even when very high accuracy is required, linearization transformations are of value in providing good initial estimates for the iterative techniques used with nonlinear regression. (Ref: W. G. Dotson, Jr., "Linearization Transformations for Least Squares Problems", Mathematics Magazine, Vol. 39, No. 3, May 1966.)

Iterative technique: the Gauss-Newton method. Problem: having a set of data $(x_i, y_i)$ and a function $f(x, a_0, a_1, a_2, \dots)$, fit
$$y_i = f(x_i, a_0, a_1, a_2, \dots) + e_i \quad \text{for every } i = 1, \dots, n$$
so that the sum of the squared random errors $e_i$ is the smallest. Vector notation: $\mathbf{y} = f(\mathbf{x}, \mathbf{a}) + \mathbf{e}$.

Iterative technique: the Gauss-Newton method. To illustrate the process we use the case with two parameters, $a_0$ and $a_1$. The truncated Taylor expansion that defines the values of the model function f used in each step of the iterative process has the form
$$f(x_i)_{j+1} = f(x_i)_j + \frac{\partial f(x_i)_j}{\partial a_0}\,\Delta a_0 + \frac{\partial f(x_i)_j}{\partial a_1}\,\Delta a_1 \quad \text{for every } i = 1, \dots, n$$
where the subscript $j$ counts iterations. We do not travel in $x$ but rather in $\mathbf{a}$: we look for a better estimate of $\mathbf{a}$. The values of $\Delta a_0$ and $\Delta a_1$ are determined from the least squares computation at each step and represent the increments added to the latest estimates of the parameters to generate the next parameter estimates. This expansion is said to linearize the original model with respect to the parameters.

How to solve a truly non-linear problem. Substituting the linearization into $y_i = f(x_i) + e_i$ gives
$$y_i = f(x_i)_j + \frac{\partial f(x_i)_j}{\partial a_0}\,\Delta a_0 + \frac{\partial f(x_i)_j}{\partial a_1}\,\Delta a_1 + e_i$$
or, in matrix form,
$$\begin{pmatrix} y_1 - f(x_1) \\ y_2 - f(x_2) \\ \vdots \\ y_n - f(x_n) \end{pmatrix} = \begin{pmatrix} \partial f(x_1)/\partial a_0 & \partial f(x_1)/\partial a_1 \\ \partial f(x_2)/\partial a_0 & \partial f(x_2)/\partial a_1 \\ \vdots & \vdots \\ \partial f(x_n)/\partial a_0 & \partial f(x_n)/\partial a_1 \end{pmatrix} \begin{pmatrix} \Delta a_0 \\ \Delta a_1 \end{pmatrix} + \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}$$

How to solve a truly non-linear problem. Now we drop the random error terms to obtain an overdetermined system, to which we apply the least squares computational strategy to determine the normal system:
$$\begin{pmatrix} y_1 - f(x_1) \\ y_2 - f(x_2) \\ \vdots \\ y_n - f(x_n) \end{pmatrix} \approx \begin{pmatrix} \partial f(x_1)/\partial a_0 & \partial f(x_1)/\partial a_1 \\ \vdots & \vdots \\ \partial f(x_n)/\partial a_0 & \partial f(x_n)/\partial a_1 \end{pmatrix} \begin{pmatrix} \Delta a_0 \\ \Delta a_1 \end{pmatrix}$$

How to solve a truly non-linear problem. Introduce the notation
$$\mathbf{b} = \begin{pmatrix} y_1 - f(x_1) \\ y_2 - f(x_2) \\ \vdots \\ y_n - f(x_n) \end{pmatrix}, \qquad J = \begin{pmatrix} \partial f(x_1)/\partial a_0 & \partial f(x_1)/\partial a_1 \\ \vdots & \vdots \\ \partial f(x_n)/\partial a_0 & \partial f(x_n)/\partial a_1 \end{pmatrix}, \qquad \mathbf{z} = \begin{pmatrix} \Delta a_0 \\ \Delta a_1 \end{pmatrix}$$
where $J$ is the Jacobian of the model with respect to the parameters.

How to solve a truly non-linear problem. The system reads $\mathbf{b} \approx J\mathbf{z}$. Multiplying both sides by $J^T$ gives the normal system $J^T\mathbf{b} = J^T J\,\mathbf{z}$.

How to solve a truly non-linear problem. Solving the normal system:
$$\mathbf{z} = (J^T J)^{-1} J^T \mathbf{b} = \begin{pmatrix} \Delta a_0 \\ \Delta a_1 \end{pmatrix}$$

How to solve a truly non-linear problem. The entries of the vector $\mathbf{z} = (\Delta a_0, \Delta a_1)^T$ are used to update the values of the parameters:
$$a_{0,j+1} = a_{0,j} + \Delta a_0, \qquad a_{1,j+1} = a_{1,j} + \Delta a_1$$

How to solve a truly non-linear problem. The iterative procedure continues, and we test for convergence using an approximate relative error test:
$$\left| \frac{a_{0,j+1} - a_{0,j}}{a_{0,j+1}} \right| < \text{tolerance}, \qquad \left| \frac{a_{1,j+1} - a_{1,j}}{a_{1,j+1}} \right| < \text{tolerance}$$

Example. Fit the function $f(x, a_0, a_1) = a_0\,(1 - e^{-a_1 x})$ to the data:
x: 0.25  0.75  1.25  1.75  2.25
y: 0.28  0.57  0.68  0.74  0.79

Example (continued). We use $a_0 = 1.0$ and $a_1 = 1.0$ as initial guesses. The partial derivatives with respect to $a_0$ and $a_1$ are
$$\frac{\partial f}{\partial a_0} = 1 - e^{-a_1 x}, \qquad \frac{\partial f}{\partial a_1} = a_0\, x\, e^{-a_1 x}$$
Next we evaluate the entries of the matrix $J$:
$$J = \begin{pmatrix} \partial f(x_1)/\partial a_0 & \partial f(x_1)/\partial a_1 \\ \vdots & \vdots \\ \partial f(x_5)/\partial a_0 & \partial f(x_5)/\partial a_1 \end{pmatrix} = \begin{pmatrix} 0.22 & 0.19 \\ 0.52 & 0.35 \\ 0.71 & 0.35 \\ 0.82 & 0.30 \\ 0.89 & 0.23 \end{pmatrix}$$

Example (continued). Then we compute the entries of the vector $\mathbf{b}$, $b_i = y_i - f(x_i)$ at the current parameter values (rounded):
$$\mathbf{b} \approx \begin{pmatrix} 0.05 \\ 0.04 \\ -0.03 \\ -0.08 \\ -0.1 \end{pmatrix}$$
Using the normal system equation $\mathbf{z} = (J^T J)^{-1} J^T \mathbf{b}$ we get
$$\mathbf{z} = \begin{pmatrix} \Delta a_0 \\ \Delta a_1 \end{pmatrix} \approx \begin{pmatrix} -0.27 \\ 0.5 \end{pmatrix}$$
so the new set of parameters $a_0$ and $a_1$ is
$$\begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} -0.27 \\ 0.5 \end{pmatrix} = \begin{pmatrix} 0.73 \\ 1.5 \end{pmatrix}$$
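
The sketch below runs this Gauss-Newton iteration to convergence; its first update should roughly reproduce the step computed above:

```python
# Gauss-Newton iteration for f(x, a0, a1) = a0*(1 - exp(-a1*x)),
# starting from a0 = a1 = 1. The first step should land close to (0.73, 1.5).
import numpy as np

x = np.array([0.25, 0.75, 1.25, 1.75, 2.25])
y = np.array([0.28, 0.57, 0.68, 0.74, 0.79])

def f(x, a0, a1):
    return a0 * (1.0 - np.exp(-a1 * x))

a0, a1 = 1.0, 1.0
for _ in range(20):
    # Jacobian of f with respect to (a0, a1) at the current estimate
    J = np.column_stack([1.0 - np.exp(-a1 * x),       # df/da0
                         a0 * x * np.exp(-a1 * x)])   # df/da1
    b = y - f(x, a0, a1)                              # residual vector
    dz = np.linalg.solve(J.T @ J, J.T @ b)            # normal system J^T J dz = J^T b
    a0, a1 = a0 + dz[0], a1 + dz[1]
    if np.all(np.abs(dz) / np.abs(np.array([a0, a1])) < 1e-8):  # relative error test
        break

print(a0, a1)
```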

Problems:
1. It may converge slowly.
2. It may oscillate and continually change direction.
3. It may not converge.

More about the normal distribution. How can we measure the similarity of a given distribution to N(0,1)?
Mean: $\bar{x} = \frac{1}{n}\sum_i x_i$. For the normal distribution, median = mean = mode.
Standard deviation: $\sigma = \sqrt{\frac{1}{n}\sum_i (x_i - \mu)^2}$.
Z distribution: $\mu = 0$, $\sigma = 1$.
Skewness: $s_1 = \dfrac{\frac{1}{n}\sum_i (x_i - \mu)^3}{\big(\frac{1}{n}\sum_i (x_i - \mu)^2\big)^{3/2}}$, equal to 0 for the normal distribution.
Kurtosis (excess): $k_1 = \dfrac{\frac{1}{n}\sum_i (x_i - \mu)^4}{\big(\frac{1}{n}\sum_i (x_i - \mu)^2\big)^{2}} - 3$, equal to 0 for the normal distribution.
[Figure: standard normal curve with cumulative probabilities 0.0013, 0.0228, 0.1587, 0.8413, 0.9772 and interval areas 0.0214, 0.1359, 0.3413 marked at z = -3, -2, -1, 0, 1, 2, 3.]
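
A small sketch of these sample moments (using 1/n population-style estimators; other conventions divide by n-1), applied to synthetic standard normal data:

```python
# Sample mean, standard deviation, skewness and excess kurtosis,
# compared against the N(0,1) reference values 0, 1, 0, 0.
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(10_000)   # synthetic data for illustration

mu = z.mean()
sigma = np.sqrt(np.mean((z - mu) ** 2))
skewness = np.mean((z - mu) ** 3) / sigma ** 3        # ~0 for normal data
kurtosis = np.mean((z - mu) ** 4) / sigma ** 4 - 3.0  # excess kurtosis, ~0 for normal data
print(mu, sigma, skewness, kurtosis)
```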
