REGRESSION (Physics 1210 Notes, Partial Modified Appendix A)


HOW TO PERFORM A LINEAR REGRESSION

Consider the following data points and their graph (Table 1 and Figure 1):

    X     Y
    0     1
    1     3
    2     5
    3     7
    4     9
    5    11

Table 1: Example Data

[Figure 1: Graph of Example Data (Y versus X)]

In this example, the points are on a perfect straight line. The formula of a general straight line is Y = a*X + b, where a is the slope of the line and b is the intercept of that line with the y-axis. In this example, it is easy to verify that a = 2 and b = 1. In general, with the data points you obtain in your experiments, finding a and b is not so easy. We want to use a computer to calculate a and b for us. For this, we use the regression function of Excel.

When you are in Excel, type in your data points as shown in Table 1. Now, we want to do a linear regression on these data points. That will hopefully give us the values for a and b. To do this, look in the menu for [Tools], then select [Data Analysis] and finally select [Regression]. You are then faced with a dialogue box "Regression". For "Input Y Range", select the Y column of your data. For "Input X Range", select the X column of your data, then click on "OK". After a few seconds, you will see a new Excel sheet with an overkill of numbers called the "Summary Output". Of this Summary Output, the only part you need is:

                  Coefficients    Standard Error    t Stat
    Intercept     1               0                 65535
    X Variable    2               0                 65535

From this, you can read the coefficient values for a and b as follows:

    b = Intercept = 1
    a = X Variable = 2

which is what we expected. The equation for the line in this case would be: Y = aX + b = 2X + 1.

The standard errors in a and b are zero here because the points are on a perfect straight line. In general, this will not be the case, because experiments are not perfect, unfortunately. For example, if you were to use the data points in Table 2 (they are the same points as before, except for the last one) and do a linear regression on them, you would get the results in Table 3.
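Before looking at those results, the Excel fit can be cross-checked by hand. The following is a minimal sketch, in Python rather than Excel, of the standard closed-form least-squares formulas for a straight line and the standard errors of its coefficients. The function name `linear_regression` is an illustrative choice, not something from the lab manual, and the second data set assumes the modified last point is Y = 15, the value consistent with the regression output in Table 3.

```python
import math

def linear_regression(xs, ys):
    """Fit Y = a*X + b by least squares; return (a, b, se_a, se_b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                # slope
    b = mean_y - a * mean_x      # intercept
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)
    se_a = math.sqrt(s2 / sxx)
    se_b = math.sqrt(s2 * sum(x * x for x in xs) / (n * sxx))
    return a, b, se_a, se_b

xs = [0, 1, 2, 3, 4, 5]
perfect = linear_regression(xs, [1, 3, 5, 7, 9, 11])   # Table 1 data
modified = linear_regression(xs, [1, 3, 5, 7, 9, 15])  # assumed last point Y = 15
print("perfect line:     a=%.4f b=%.4f se_a=%.4f se_b=%.4f" % perfect)
print("last point moved: a=%.4f b=%.4f se_a=%.4f se_b=%.4f" % modified)
```

For the perfect line this returns a = 2, b = 1 with zero standard errors. For the modified points it gives a slope near 2.5714 and an intercept near 0.2381, with standard errors near 0.3299 and 0.9989.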

    X     Y
    0     1
    1     3
    2     5
    3     7
    4     9
    5    15

Table 2: Not-so-perfect Example Data

                  Coefficients    Standard Error    t Stat
    Intercept     0.238095238     0.99886557        0.238366
    X Variable    2.57142857      0.32991444        7.794229

Table 3: Regression Results of Not-so-perfect Data

And you now have standard errors, which are not zero. You would quote your results, with their standard errors, as:

    b = Intercept = 0.238 ± 0.999
    a = X Variable = 2.5714 ± 0.3299

These ± values are one standard error (roughly a 68% interval); a 95% confidence interval would be wider, about ±2.78 standard errors for the 4 degrees of freedom here. Of course, you must decide for yourself each time how many decimals are realistic and what the unit is.

Linear regression is a very useful tool, and you will need it frequently during this course. In your report, DO NOT include the "regression summary" Excel produces. Instead, when you do a linear regression on your data, all you have to give is the equation of the line (including errors) Excel calculated, and state that the calculation was a linear regression.

    Velocity (m/s)    Measured force (N)
     0                4.9
     1.38             4.8
     1.99             4.6
     2.??             4.6
     2.5?             4.5
     3.06             4.4
     3.77             4.2
     4.09             4.1
     4.65             3.8
     5.5?             3.6
     6.??             3.0
     7.??             2.6
     7.88             2.3
     8.53             2.0
     9.79             1.1
    10.3?             0.8
    10.93             0.5
    11.??             0.2
    11.37             0.0

Table 4. Free Fall Data (? marks a digit that is illegible in the source)

HOW TO PERFORM NONLINEAR TRENDLINE

Besides the linear trendline, Excel has the capability of fitting logarithmic, polynomial of arbitrary order, power or exponential functions to data. For the data presented in Table 4, it appears that a quadratic relationship should produce an excellent fit. Figure 2 substantiates this in that this quadratic trendline has an r² of 0.9986 as compared to a value of 0.96 for the linear fit when the intercept value is set to 4.90 (see the Curve Fitting.xls example). Higher order polynomials may be used, but any increase in r² that is obtained by this increased complexity is rather superficial.

[Figure 2. Velocity vs. Force: measured force against velocity (m/s), with the quadratic trendline (shown; constant term 4.977, R² = 0.9986) and the linear trendline F = -0.3683V + 4.9 (R² = 0.96)]

HOW TO PERFORM NONLINEAR OPTIMIZING SOLVER

If we start over on this problem and apply some basic dynamics to the free fall problem, the summation of forces in this case must be equal to the gravitational body force (m·g) in the downward direction plus a drag force in the upward direction that is some unknown function of velocity. Therefore theory implies that the force versus velocity relationship must have the following general form:

$$F(V) = mg - \mathrm{Drag} \tag{1}$$

but it does not supply any information about how the drag varies with velocity. Our own personal experience indicates that the drag force increases with velocity, and extensive experimental testing over the years has shown that power laws can frequently be used to correlate velocity-drag data over limited velocity ranges. If this is assumed to be the case here, then

$$F = mg - aV^{b} \tag{2}$$

Theory and some empirical insight have therefore been combined to obtain a possible functional form between velocity and force in terms of two arbitrary constants (a, b) that is based upon the physics of the phenomenon and not just blind curve fitting, as was done in the linear and quadratic (Figure 2) curve fit examples. The values of a and b that give the best fit with the experimental data can be determined through the use of the Excel nonlinear optimizing solver. The first requirement of using the nonlinear optimizing solver is the development of a regression function that you want to optimize in terms of minimizing or maximizing its value or obtaining a specified value.

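As a first step toward such a regression function, Equation 2 can be written as a small function, similar in spirit to the Force(m,g,V,a,b) module the spreadsheet refers to. This is only a sketch in Python: the name `predicted_force` is an illustrative choice, and the numerical values are the ones listed with Table 5 (m = 0.5 kg, g = 9.8 m/s², a = 0.08535, b = 1.6635).

```python
def predicted_force(m, g, v, a, b):
    """Equation 2: net force = weight minus a power-law drag a * v**b."""
    return m * g - a * v ** b

m, g = 0.5, 9.8            # kg, m/s^2 (values listed with Table 5)
a, b = 0.08535, 1.6635     # coefficients reported with Table 5

# Evaluate the model at a few of the tabulated velocities.
for v in (0.0, 3.77, 9.79):
    print(v, round(predicted_force(m, g, v, a, b), 2))
```

With these coefficients the model gives close to 4.9, 4.1 and 1.1 N at the three velocities, in line with the predicted-force column of Table 5.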
The trendlines that are presented in the previous two curve fits are based upon least squares regression, in which the following regression function is minimized:

$$\sum_{i=1}^{n} \left(F_i - \hat{F}_i\right)^2 \tag{3}$$

where $F_i$ is the measured force and $\hat{F}_i$ is the corresponding predicted value in the data set that contains n values. In this case Equation 2 would be substituted for $\hat{F}_i$ ($\hat{F}_i = mg - aV_i^{b}$). Instead of doing this, let's maximize r². That is

$$r^2 = 1 - \frac{\sum_{i=1}^{n} \left(F_i - \hat{F}_i\right)^2}{\sum_{i=1}^{n} \left(F_i - \bar{F}\right)^2} \tag{4}$$

where $\bar{F}$ is the mean force of the experimental data set. Excel provides a nonlinear optimizing solver for minimizing or maximizing functions such as Equation 4. However, the problem must be prepared properly to obtain an appropriate solution.

Table 5 presents a copy of the spreadsheet (see file Curve Fitting.xls for the actual spreadsheet) that was used to determine a and b. This table contains six columns: column 1 is the independent variable (velocity); column 2 is the measured variable (acceleration a_i); column 3 is the dependent variable (force F_i) calculated from the measured variable a_i; column 4 is the predicted dependent variable (the force calculated from Equation 2); column 5 is the square of the difference between columns 3 and 4; and column 6 is the square of the difference between column 3 and the average force, which is calculated at the end of column 3. Columns 5 and 6 are then summed and these values are used to calculate the r² value for a guess set of coefficients (a, b). For instance, the guess of (1, 1) produces a very poor r² value of -5.88.

    a = 0.08535 N/(m/s)^b      g = 9.8 m/s^2
    b = 1.6635                 m = 0.5 kg

    Velocity   Accel.     Measured   Predicted*   (Fi - Fi,pred)^2   (Fav - Fi)^2
    (m/s)      (m/s^2)    Fi         Fi,pred      (N^2)              (N^2)
     0         9.8        4.9        4.9          0.00E+00           3.93
     1.38      9.6        4.8        4.8          ?.08E-03           3.54
     1.99      9.2        4.6        4.6          ?.04E-03           2.83
     2.??      9.1        4.6        4.6          ?.?3E-03           2.66
     2.5?      9.0        4.5        4.5          3.75E-05           2.50
     3.06      8.8        4.4        4.4          ?.?7E-03           2.20
     3.77      8.3        4.2        4.1          6.?6E-04           1.52
     4.09      8.1        4.1        4.0          ?.39E-03           1.28
     4.65      7.5        3.8        3.8          ?.67E-03           0.69
     5.5?      7.1        3.6        3.4          ?.?4E-02           0.40
     6.??      6.0        3.0        3.?          ?.?5E-02           0.01
     7.??      5.2        2.6        2.6          ?.79E-04           0.10
     7.88      4.5        2.3        2.3          8.?3E-05           0.45
     8.53      3.9        2.0        1.9          3.97E-03           0.94
     9.79      2.1        1.1        1.1          3.?7E-03           3.49
    10.3?      1.5        0.8        0.8          4.?8E-04           4.70
    10.93      0.9        0.5        0.3          ?.?0E-02           6.09
    11.??      0.3        0.2        0.?          ?.34E-05           7.66
    11.37      0.0        0.0        0.0          ?.64E-03           8.5?

               Fav =      2.9        Sum =        5.80E-02           53.50

    R^2 = 0.9989164 = 1 - SUM(Fi - Fi,pred)^2 / SUM(Fav - Fi)^2
    * Fi,pred = Force(m,g,V,a,b); see Module Force(m,g,V,a,b)

Table 5. Excel Table Used to Perform Nonlinear Regression (? marks a digit that is illegible in the source; in column 5 its position within the entry is not always certain)

Excel uses an iterative approach to solve the nonlinear regression problem once it has an initial guess set to start this iterative process. In this case, the program will systematically vary a and b to determine the local gradient of r² and thereby determine how the (a, b) set should be varied to maximize r². In order to use the solver tool, the tool must be loaded into Excel.

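To make the r² objective of Equation 4 concrete outside Excel, here is a minimal Python sketch. It is not the gradient-based search that Excel's Solver performs. Instead it uses the fact that, for a fixed b, the model is linear in a (so the best a has a closed form) and then runs a golden-section search over b alone. The function names, the search bracket for b, and the noise-free synthetic data (generated from F = 4.9 - 0.085 V^1.663 at integer velocities) are all assumptions made for illustration; they stand in for the measured data of Table 5.

```python
import math

M, G = 0.5, 9.8   # mass (kg) and g (m/s^2), as listed with Table 5

def r_squared(a, b, v, f):
    """Equation 4 for the model F = M*G - a*V**b."""
    f_mean = sum(f) / len(f)
    ss_res = sum((fi - (M * G - a * vi ** b)) ** 2 for vi, fi in zip(v, f))
    ss_tot = sum((fi - f_mean) ** 2 for fi in f)
    return 1.0 - ss_res / ss_tot

def best_a(b, v, f):
    """For a fixed b the model is linear in a, so the best a is closed-form."""
    num = sum((M * G - fi) * vi ** b for vi, fi in zip(v, f))
    den = sum(vi ** (2 * b) for vi in v)
    return num / den

def fit(v, f, b_lo=0.5, b_hi=3.0, tol=1e-6):
    """Golden-section search over b, maximizing r^2 with a = best_a(b)."""
    ratio = (math.sqrt(5) - 1) / 2
    score = lambda b: r_squared(best_a(b, v, f), b, v, f)
    lo, hi = b_lo, b_hi
    while hi - lo > tol:
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if score(c) > score(d):
            hi = d    # the maximum lies in [lo, d]
        else:
            lo = c    # the maximum lies in [c, hi]
    b = (lo + hi) / 2
    return best_a(b, v, f), b

# Synthetic, noise-free stand-in for the measured data (an assumption).
velocity = [float(i) for i in range(12)]
force = [M * G - 0.085 * vi ** 1.663 for vi in velocity]

a_fit, b_fit = fit(velocity, force)
print("r^2 at the guess (1, 1):", r_squared(1.0, 1.0, velocity, force))
print("fitted a, b:", a_fit, b_fit)
print("r^2 at the fit:", r_squared(a_fit, b_fit, velocity, force))
```

Because the denominator of Equation 4 does not depend on a or b, maximizing r² and minimizing the sum of squares in Equation 3 select the same coefficients. On this synthetic data the guess (1, 1) gives a negative r², and the search recovers a and b close to the values used to generate it.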
The solver can be loaded by:

(1) Click on Tools in the main menu bar
(2) Click on Solver in the pull-down menu

If Solver is not an option, then:

(a) Click on Add-Ins in the pull-down menu
(b) Click on Solver Add-In in the Add-Ins dialog box (the check box must be checked)
(c) Click OK
(d) Click Solver

The Solver dialog box is now visible. The first menu item is the target cell, which is r² in this case. The second item delineates what action is to be performed on the target cell. In this example we wish to maximize the target cell. The third item specifies which cells may have their values varied to accomplish the objective, which in this case are the cells containing the guess values of the regression parameters a and b. Note that named cells can be utilized in specifying the cell locations of the target cell and the adjustable cells. As an option, you can set numerical constraints on the adjustable cells. A little thought about the physics of this problem indicates that both a and b are positive, and these constraints may be added. In some problems you may wish to change the default Precision and Tolerance values by first clicking the Options button. Now click OK, and Excel will attempt to find the optimum solution and replace the guess values of the regression parameters with the optimum values.

Table 5 indicates that the combined theoretical/empirical correlation

$$F = 4.9 - 0.085\,V^{1.663} \tag{5}$$

produces an r² of 0.9989, which is slightly better than the quadratic. This correlation is also simpler than the quadratic fit and it is more physically significant. Instead of basing the curve fit on r², try using the least squares regression method to compute the coefficients and compare your results. This example also illustrates the use of a function module. To see it, click Tools, Macro and Visual Basic Editor.

One word of caution: nonlinear functions often have more than one solution, and a given guess set may produce a local solution (in this case, a local maximum of r²) instead of a global solution. Highly nonlinear problems may also require a fairly accurate initial guess to obtain a global solution or any solution. You may have to resort to plots to produce an accurate initial guess. See Nonlinear Regression.xls for another example.

Reference

Physics 1210: Engineering Physics. Lab Manual, Appendix A, University of Wyoming Physics and Astronomy, Spring 1999.