Introduction to Data Analysis

Introduction to Data Analysis. Outline: Analysis of Experimental Errors; How to Report and Use Experimental Errors; Statistical Analysis of Data (simple statistics of data; plotting and displaying the data); Summary.

Errors and Uncertainties. Experimental error ≠ mistake; experimental error ≠ blunder. Experimental error = the inevitable uncertainty of the measurement. The measured value alone is not enough: we also need the experimental error.

Testing the Theories and Models. Experimental data should be consistent with your theory (model) and inconsistent with the alternatives, to prove those wrong. Example: bending of light near the Sun (deflection angle α, in arcseconds): 1. simplest classical theory: 0; 2. careful classical analysis: 0.9; 3. Einstein's general relativity: 1.8. A solar eclipse was needed to check it: Dyson, Eddington and Davidson, in 1919, found α = 2.0, with 95% confidence 1.7 < α < 2.3. Consistent with 1.8 and inconsistent with 0.9!

Types of Experimental Errors. Random errors (e.g., rotating table + stopwatch; voltage measurements): (i) revealed by repeating the measurements; (ii) may be estimated statistically. Systematic errors (e.g., measuring glass; car performance; calibration errors): in general invisible; detected by comparison with the results of an alternative method or equipment.

How to Report & Use the Errors. Experimental result: (measured value of x) = x_best ± δx, where x_best is the best estimate for x and δx is the uncertainty (error, margin of error): x_best − δx < x < x_best + δx. Usually 95% confidence is suggested: 95% sure that x is inside the limits, 5% chance that x is outside the limits.

Some Basic Rules. Experimental errors should always be rounded to one significant digit: g = 9.82 ± 0.02385 is wrong, g = 9.82 ± 0.02 is correct. Thus error calculations become simple estimates. Exception: if the leading significant digit of the error is 1, keep one digit more. Everest is 8848 ± 1.5 m high.
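
The rounding rule is easy to automate. A minimal Matlab sketch, assuming a hypothetical helper saved as round_error.m (not part of the lecture):

    % round_error.m: round an uncertainty to one significant digit
    % (two digits if the leading digit is 1), then round the value
    % to the same decimal place.
    function [v, e] = round_error(value, err)
        d = floor(log10(err));          % decade of the leading digit
        if err / 10^d < 2               % leading digit is 1:
            d = d - 1;                  % keep one more digit
        end
        e = round(err / 10^d) * 10^d;   % rounded uncertainty
        v = round(value / 10^d) * 10^d; % value rounded to match
    end

For example, round_error(9.82, 0.02385) returns 9.82 and 0.02, and round_error(8848, 1.5) returns 8848.0 and 1.5.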

Some Basic Rules. The last significant figure in the answer should be of the same order of magnitude as the uncertainty: 92.8 ± 0.3; 93 ± 3; 90 ± 30. During the calculation retain one more digit than is finally justified, to reduce rounding error.

Error in a Sum or a Difference. Two (or more) independently measured variables: x = x_best ± δx, y = y_best ± δy. Then δ(x ± y) = δx + δy, and in general δ(x ± y ± z ± …) = δx + δy + δz + …. If x, y, z, … are large but (x ± y ± z ± …) is small, trouble! The overall error is large.

Calculating Relative Errors. Relative error (fractional uncertainty) = δx/|x|. Examples: x = 50 ± 1 cm gives δx/x = 0.02; x = 100000 ± 1 cm gives δx/x = 0.00001.

Relative errors of product and ratio of two variables. Let x = x_best ± δx with δx << |x_best| and y = y_best ± δy with δy << |y_best|. What is δ(xy)? xy = (x_best ± δx)(y_best ± δy) ≈ x_best y_best ± (x_best δy + y_best δx), so δ(xy) = x_best δy + y_best δx and δ(xy)/(xy) = δx/x_best + δy/y_best. For the product xy the relative error is the sum of the relative errors of x and y.

Relative errors of product and ratio of two variables. δ(x/y) = ? x/y = (x_best ± δx)/(y_best ± δy) = (x_best/y_best) · (1 ± δx/x_best)/(1 ± δy/y_best). The extreme values are (x/y)_max = (x_best/y_best) · (1 + δx/x_best)/(1 − δy/y_best) ≈ (x_best/y_best)(1 + δx/x_best + δy/y_best) and (x/y)_min = (x_best/y_best) · (1 − δx/x_best)/(1 + δy/y_best) ≈ (x_best/y_best)(1 − δx/x_best − δy/y_best). Hence δ(x/y)/(x/y) = δx/x_best + δy/y_best: for the ratio, too, the relative errors add.

Simple Rules. When the measured quantities are added or subtracted, the errors add. When the measured quantities are multiplied or divided, the relative errors add.

Propagation of Errors. We have upper bounds on the errors for the sum/difference and product/quotient of measurables. Can we do any better? If the errors are independent and random, they add in quadrature: for q = x + y, δq = √((δx)² + (δy)²) ≤ δx + δy; for q = x + … + z − (u + … + w), δq = √((δx)² + … + (δz)² + (δu)² + … + (δw)²) ≤ δx + … + δz + δu + … + δw.

Propagation of Errors. Example: l_1 = 5.3 ± 0.2 cm, l_2 = 7.2 ± 0.2 cm, l = l_1 + l_2. Adding the errors: δl = δl_1 + δl_2 = 2 mm + 2 mm = 4 mm. In quadrature: δl = √((δl_1)² + (δl_2)²) = √((2 mm)² + (2 mm)²) ≈ 3 mm. For 2 measurables there is no great difference, but for n measurables the difference is a factor 1/√n.
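
A quick numerical check of this example in Matlab (a sketch using the numbers from the slide, not code from the lecture):

    l1 = 5.3; dl1 = 0.2;              % cm
    l2 = 7.2; dl2 = 0.2;              % cm
    l  = l1 + l2;                     % 12.5 cm
    dl_sum  = dl1 + dl2;              % worst case: 0.4 cm = 4 mm
    dl_quad = sqrt(dl1^2 + dl2^2);    % independent errors: ~0.28 cm ~ 3 mm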

Propagation of Errors. Relative error of a product/quotient: for q = (x·…·z)/(u·…·w), δq/q = √((δx/x)² + … + (δz/z)² + (δu/u)² + … + (δw/w)²) ≤ δx/x + … + δz/z + δu/u + … + δw/w. If the relative errors of the n measurables are all the same, we gain a factor 1/√n in the relative error.

Propagation of Errors. Example: a D.C. electric motor, V = voltage, I = current. Efficiency: e = (work done by motor)/(energy delivered to motor) = mgh/(VIt). Relative error for m, h, V, I: 1%; relative error for t: 5%. In quadrature: δe/e = √((δm/m)² + (δh/h)² + (δV/V)² + (δI/I)² + (δt/t)²) = √((1%)² + (1%)² + (1%)² + (1%)² + (5%)²) ≈ 5%. Simple sum: δm/m + δh/h + δV/V + δI/I + δt/t = 1% + 1% + 1% + 1% + 5% = 9%.
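
The same arithmetic in Matlab (a sketch, numbers from the slide):

    r = [1 1 1 1 5];        % relative errors of m, h, V, I, t in percent
    sqrt(sum(r.^2))         % quadrature: ~5.4%
    sum(r)                  % simple sum: 9%
    % In either case the 5% error in t dominates the result.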

Propagation of Errors. With many measurables, the error is dominated by the one from the worst-measured quantity: here δe/e ≈ δt/t. What if we measure x but need an error for f(x)? Assume that the error is small and do a Taylor expansion of f about x_best: f(x) = f(x_best) ± (df/dx) δx, so δf = |df/dx| δx.

Propagation of Errors. Relative error of a power: if q = x^n, then δq/q = n·δx/x. General case, a function of several variables q(x, …, z): δq = √((∂q/∂x · δx)² + … + (∂q/∂z · δz)²) for independent random errors, bounded by δq ≤ |∂q/∂x| δx + … + |∂q/∂z| δz.

Propagation of Errors. Example: measuring g with a simple pendulum, L = length, T = oscillation period. T = 2π(L/g)^(1/2), hence g = 4π²L/T², and δg/g = √((δL/L)² + (2 δT/T)²), since δ(T²)/T² = 2 δT/T. Data: L = 92.95 ± 0.1 cm, so δL/L = 0.1%; T = 1.936 ± 0.004 s, so δT/T = 0.2%. Then g_best = 4π² (92.95 cm)/(1.936 s)² = 979 cm/s², δg/g = √((0.1)² + (2·0.2)²) % ≈ 0.4%, δg ≈ 0.004 × 979 cm/s² ≈ 4 cm/s², so g = 979 ± 4 cm/s².
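
A Matlab check of the pendulum numbers (a sketch, not from the lecture):

    L  = 92.95;  dL = 0.1;                    % cm
    T  = 1.936;  dT = 0.004;                  % s
    g  = 4*pi^2*L/T^2;                        % ~979 cm/s^2
    dg = g * sqrt((dL/L)^2 + (2*dT/T)^2);     % ~0.4% of g, ~4 cm/s^2
    fprintf('g = %.0f +/- %.0f cm/s^2\n', g, dg)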

Statistical Analysis of Random Errors. [Figure: four data distributions illustrating the combinations: (1) random error small, systematic small; (2) random large, systematic small; (3) random small, systematic large; (4) random large, systematic large.]

Statistical Analysis of Random Errors. [Figure: the same four distributions, but with the true value unknown; in each panel the random error can still be judged (small or large) while the systematic error is a question mark.]

Statistical Analysis of Random Errors. In the first case we knew the value of the measured quantity: (a) not realistic, (b) not very interesting. The second case is realistic, but there is no chance of determining the systematic error. So we concentrate on the random errors and repeat the measurements several times: x_1, x_2, x_3, x_4, …

Statistical Analysis of Random Errors. Measure x several times and get a set of data: x_1, x_2, …, x_N. Try to estimate the variability (dispersion) of the measurable and its best value. Example: a manufacturer of metal parts receives complaints about non-uniform melting temperatures. Analysis: pick a representative batch of parts and measure the melting temperatures, getting a set of data t_1, t_2, …, t_N for all N parts from the sample.

Statistical Analysis of Random Errors. Calculate the average (mean) of t, our estimate for the true value of t: t_best = t̄ = (1/N) Σ_{i=1}^{N} t_i. Sort the data (t_1 ≤ t_2 ≤ … ≤ t_N) and bin it (here N = 20):

    t_j      310   311   312   313   314   315
    n_j        1     3     8     6     2     0
    n_j/N   1/20  3/20  8/20  6/20  2/20     0

n_j/N is the probability of finding t in the bin t_j ≤ t < t_{j+1}.

Statistical Analysis of Random Errors N i= 1 ti N j j = = j tn N 1 normalized probability Plot a histogram. Increase N Å get a smoother histogram. If the errors are random and small, we get a Gaussian bell-shaped curve at N Å infinity. But we can get more information on random errors before that...

Statistical Analysis of Random Errors. Example: 5 measurements of some value: 71, 72, 72, 73, 71. x_best = x̄ = Σ_{i=1}^{N} x_i / N = (71 + 72 + 72 + 73 + 71)/5 = 71.8.

    Trial   Measured value   Deviation d_i   Deviation squared d_i²
    1       71               -0.8            0.64
    2       72                0.2            0.04
    3       72                0.2            0.04
    4       73                1.2            1.44
    5       71               -0.8            0.64

Here d_i = x_i − x̄; Σ d_i = 0 and Σ d_i² = 2.8.

Statistical Analysis of Random Errors. Linear deviations are no good for characterizing the statistics of random errors: they average to zero. Let's go for the squares. Definition: the standard deviation σ_x is σ_x = √((1/N) Σ_{i=1}^{N} d_i²) = √((1/N) Σ_{i=1}^{N} (x_i − x̄)²), the root mean square (RMS) deviation of the measurements x_1, x_2, …, x_N.
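
In Matlab (a sketch; note the 1/N convention of the slide):

    x = [71 72 72 73 71];
    sigma = sqrt(sum((x - mean(x)).^2) / numel(x));   % sqrt(2.8/5) ~ 0.75
    % std(x,1) gives the same number; plain std(x) divides by N-1 instead.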

Statistical Analysis of Random Errors. The standard deviation is a measure of the accuracy of a single measurement, or of the width of the data distribution (histogram). As N → ∞ the histogram turns into a bell-shaped Gaussian curve.

Statistical Analysis of Random Errors. Standard deviation of a single measurement: σ_x characterizes the average error of the measurements x_1, x_2, …, x_N. At large N the distribution approaches a Gaussian and P(x_true − σ_x < x < x_true + σ_x) = 68%. If you make a single measurement knowing σ_x for the method used, it is 68% probable that your x lies within σ_x of x_true. You may be 68% confident that the single measurement is within σ_x of the correct answer.

Statistical Analysis of Random Errors. Standard deviation of the mean: the measurements x_1, x_2, …, x_N suggest x_best = Σ x_i / N. How good (accurate) is our knowledge of x_best? It may be proven that σ_x̄ = σ_x / √N, the standard deviation of the mean. In other words, the average of N measurements has a 1/√N smaller error than a single measurement: x = x_best ± σ_x/√N.

Statistical Analysis of Random Errors. Example: measuring the elasticity constants of springs: 86, 85, 84, 89, 85, 89, 87, 85, 82, 85 N/m. k̄ = 85.7 N/m, σ_k ≈ 2.16 N/m. Making 10 measurements, δk = σ_k/√10 ≈ 0.7 N/m, so k = 85.7 ± 0.7 N/m. The more measurements we make, or the larger the sample, the more accurate the measurement!
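
The spring example in Matlab (a sketch; std uses the N-1 convention, which matches the slide's 2.16 N/m):

    k     = [86 85 84 89 85 89 87 85 82 85];   % N/m
    kbar  = mean(k);                           % 85.7 N/m
    sigma = std(k);                            % ~2.16 N/m
    dk    = sigma / sqrt(numel(k));            % ~0.7 N/m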

Gaussian Distribution. If the errors of the measurement are random and small, we approach the Gaussian distribution at large N. [Figure: two Gaussian curves, both with mean µ = 5, one with σ = 1 and one with σ = 3.]

Gaussian Distribution. G(x) = (1/(σ√(2π))) exp(−(x − x̄)²/(2σ²)). P(x, x + dx) = G(x)dx (probability density); P(a ≤ x ≤ b) = ∫_a^b G(x)dx; ∫ G(x)dx = 1 (normalized); G(x̄ + dx) = G(x̄ − dx) (symmetric with respect to x̄).

Gaussian Distribution. ∫ x G(x)dx = x̄ = x_best (first moment); ∫ (x − x̄)² G(x)dx = σ² (second moment). Remember the "experimental" x̄ and σ? As N → ∞ the sum Σ is replaced by the integral ∫. Also ∫_{x̄−σ}^{x̄+σ} G(x)dx ≈ 0.68 (68%) and ∫_{x̄−2σ}^{x̄+2σ} G(x)dx ≈ 0.95 (95%).
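
The 68% and 95% figures can be checked numerically in Matlab (a sketch for a unit Gaussian, x̄ = 0, σ = 1):

    G = @(x) exp(-x.^2/2) / sqrt(2*pi);
    integral(G, -1, 1)      % ~0.6827: within one sigma
    integral(G, -2, 2)      % ~0.9545: within two sigma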

Least Squares Fitting. So far: a single measurable x, though with multiple measurements to get the statistics; analysis of random errors. Now: the statistics is known; multiple measurements y_1, y_2, …, y_N at different values of x: x_1, x_2, …, x_N, with y_i = f(x_i). We are figuring out f(x). Two possible approaches. Either: 1. measure the y_i; 2. plot the data; 3. guess the form of y = f(x) and make the fit. Or: 1. have an idea of y = f(x); 2. set up the experiment; 3. plot the data to check the initial guess (model).

Least Squares Fitting. The lore: 1. Get many points. 2. Plot the data. 3. Play with the coordinates (lin-lin, log-lin, log-log). 4. Look for the simplest possible fits.

Least Squares Fitting. The simplest case is linear dependence: y = ax + b, with data x_1, x_2, …, x_N and y_1, y_2, …, y_N. What are the best a and b to fit the data? Assume a Gaussian distribution for each y_i, with known σ_y, and assume the fit curve is the true f(x), so that ax_i + b is the true value of y_i. Construct P(y_1, y_2, …, y_N; a, b), the probability of obtaining the set of data y_1, y_2, …, y_N. Maximize P(y_1, y_2, …, y_N; a, b) with respect to a and b and find the corresponding a and b. Compare a and b with theory (if we have any).

Least Squares Fitting. For a single measurement: P(y_i) ~ (1/σ_y) exp(−(y_i − (a x_i + b))² / (2σ_y²)), where y_i is the measured value and a x_i + b the true value. For a set of measurements: P(y_1, y_2, …, y_N) ~ P(y_1) P(y_2) … P(y_N) ~ (1/σ_y^N) exp(−χ²/2), where χ² = Σ_{i=1}^{N} (y_i − (a x_i + b))² / σ_y². We look for ∂χ²/∂a = 0 and ∂χ²/∂b = 0.

Least Squares Fitting. We have a system of linear equations for a and b: ∂χ²/∂a = −(2/σ_y²) Σ_{i=1}^{N} x_i (y_i − (a x_i + b)) = 0 and ∂χ²/∂b = −(2/σ_y²) Σ_{i=1}^{N} (y_i − (a x_i + b)) = 0, i.e. a Σ x_i² + b Σ x_i = Σ x_i y_i and a Σ x_i + N b = Σ y_i.

Least Squares Fitting. The solutions are: a = (N Σ x_i y_i − Σ x_i Σ y_i) / Δ and b = (Σ x_i² Σ y_i − Σ x_i Σ x_i y_i) / Δ, where Δ = N Σ x_i² − (Σ x_i)². The line a x + b is the least squares fit to the data, or the line of regression of y on x.
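
The closed-form solution in Matlab (a sketch with hypothetical synthetic data; in the lab x and y come from measurements):

    x = (1:10)';
    y = 2*x + 1 + 0.3*randn(size(x));              % data near y = 2x + 1
    N     = numel(x);
    Delta = N*sum(x.^2) - sum(x)^2;
    a     = (N*sum(x.*y) - sum(x)*sum(y)) / Delta;
    b     = (sum(x.^2)*sum(y) - sum(x)*sum(x.*y)) / Delta;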

Linear Least Squares, General case. Our fitting function in the general case is F(x) = a_1 f_1(x) + a_2 f_2(x) + … + a_n f_n(x). Note that the function itself does not have to be linear in x for the problem to be linear in the fitting parameters. Let us find a compact way to formulate the least squares problem.

Linear Least Squares, General case. Thus we have vectors x, y and a: x = (x_1, …, x_N)ᵀ, the points where the data were taken; y = (y_1, …, y_N)ᵀ, the data; a = (a_1, …, a_n)ᵀ, the fitting parameters; and the functions f_1(x), f_2(x), …, f_n(x).

Linear Least Squares, General case. The problem now looks like y_i = F(x_i) + e_i, where e_i is a residual: the mismatch between the measured value and the one predicted by the fit. Let's introduce the vector e = (e_1, …, e_N)ᵀ.

Linear Least Squares, General case. Let us express the problem in matrix notation, with the N×n matrix Z built from the basis functions, Z_ij = f_j(x_i):

    Z = | f_1(x_1)  f_2(x_1)  ...  f_n(x_1) |
        | f_1(x_2)  f_2(x_2)  ...  f_n(x_2) |
        | ...                               |
        | f_1(x_N)  f_2(x_N)  ...  f_n(x_N) |

Overall we now have y = Z a + e: the fitting problem in matrix notation. We look for min Σ_{i=1}^{N} e_i² = min (eᵀe).

Linear Least Squares, General case. Look for min Σ_{i=1}^{N} e_i² = min Σ_{i=1}^{N} (y_i − Σ_{j=1}^{n} Z_ij a_j)² = min (y − Z a)ᵀ(y − Z a). At the minimum ∂(y − Z a)ᵀ(y − Z a)/∂a_k = 0, i.e. (∂(y − Z a)/∂a_k)ᵀ (y − Z a) = 0 for 1 ≤ k ≤ n.

Linear Least Squares, General case. Since ∂a/∂a_k = (0 0 … 1 … 0)ᵀ, we get ∂(y − Z a)/∂a_k = −Z(:,k), the k-th column of Z in Matlab colon notation. So (z_1k z_2k … z_Nk)(y − Z a) = 0 for 1 ≤ k ≤ n, i.e. (Z(:,k))ᵀ Z a = (Z(:,k))ᵀ y. Putting all n equations together: Zᵀ Z a = Zᵀ y.

Linear Least Squares, General case. In the general case the linear least squares problem reduces to a set of linear equations (the normal equations). Ways to solve them: 1. Gaussian elimination; 2. calculating the matrix inverse: a = (ZᵀZ)⁻¹ Zᵀ y. The latter is suitable for Matlab; see homework 9.
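
A Matlab sketch of the general linear case, using the hypothetical basis {1, x, x²}:

    x = linspace(0, 1, 20)';
    y = 1 + 2*x - 3*x.^2 + 0.05*randn(size(x));  % synthetic data
    Z = [ones(size(x)), x, x.^2];                % Z(i,j) = f_j(x_i)
    a = (Z'*Z) \ (Z'*y);                         % normal equations
    % a = Z\y solves the same least squares problem more stably.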

Nonlinear Regression (Least Squares). What if the fitting function is not linear in the fitting parameters? We get a nonlinear equation (or system of equations). Example: f(x) = a_1 (1 − e^(−a_2 x)). In general y_i = f(x_i; a_1, a_2, …, a_m) + e_i, or just y_i = f(x_i) + e_i. Again we look for the minimum of Σ_{i=1}^{N} e_i² with respect to the fitting parameters.

Matlab Function FMINSEARCH. Accepts as input parameters: 1. the name of the function (FUN) to be minimized; 2. a vector with the initial guess X0 for the fitting parameters. Returns: the vector X of fitting parameters providing a local minimum of FUN. FUN accepts a vector X and returns a scalar value dependent on X. In our case (hw10) FUN should calculate Σ_{i=1}^{N} e_i², dependent on the fitting parameters b, m, A_1, A_2.

Matlab Function FMINSEARCH. Syntax: x = FMINSEARCH(FUN,X0) or x = FMINSEARCH(FUN,X0,OPTIONS). See OPTIMSET for the details on OPTIONS. x = FMINSEARCH(FUN,X0,OPTIONS,P1,P2,...) in case you want to pass extra parameters P1, P2, ... through to FUN. If no options are set, use OPTIONS = [] as a placeholder. Use @ to specify FUN: x = fminsearch(@myfun,x0).
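
A usage sketch (hypothetical data following the exponential model of the nonlinear example; not the hw10 problem itself):

    x = (0:0.5:5)';
    y = 2*(1 - exp(-1.5*x)) + 0.05*randn(size(x));
    model = @(p, t) p(1)*(1 - exp(-p(2)*t));
    S     = @(p) sum((y - model(p, x)).^2);   % scalar sum of squared residuals
    p     = fminsearch(S, [1 1])              % returns p(1) ~ 2, p(2) ~ 1.5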

Gauss-Newton method for nonlinear regression. y_i = f(x_i; a_1, a_2, …, a_m) + e_i, or just y_i = f(x_i) + e_i. Look for the minimum of Σ_{i=1}^{N} e_i² with respect to a. 1. Make an initial guess a0 for a. 2. Linearize the equations (Taylor expansion about a0). 3. Solve for Δa, the correction to a0; a1 = a0 + Δa is the improved parameter set and our new initial guess. 4. Go back to (2). 5. Repeat until |a_k,j+1 − a_k,j| < ε for every k (j numbers the iterations).

Gauss-Newton method for nonlinear regression. Linearization by Taylor expansion: f(x_i, a) ≈ f(x_i, a0) + Σ_{j=1}^{n} (∂f(x_i, a0)/∂a_j) Δa_j, so y_i − f(x_i, a0) = Σ_{j=1}^{n} (∂f(x_i, a0)/∂a_j) Δa_j + e_i for i = 1, 2, …, N. In matrix form: D = Z Δa + e, where D_i = y_i − f(x_i, a0) and Z_ij = ∂f(x_i, a0)/∂a_j.

Gauss-Newton method for nonlinear regression. Linear regression: y = Z a + e. Nonlinear regression: D = Z Δa + e. The same good old linear equations, with Δa in place of a, D in place of y, and a Z of partial derivatives in place of the Z of function values. Solve for Δa, use a1 = a0 + Δa as the new initial guess, and repeat the procedure until the convergence criteria are met.
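
A minimal Gauss-Newton loop in Matlab (a sketch for the exponential model above, with the Jacobian written out analytically; the data are hypothetical):

    f = @(a, t) a(1)*(1 - exp(-a(2)*t));
    x = (0:0.5:5)';
    y = 2*(1 - exp(-1.5*x)) + 0.05*randn(size(x));
    a = [1; 1];                                         % initial guess a0
    for iter = 1:50
        Z  = [1 - exp(-a(2)*x), a(1)*x.*exp(-a(2)*x)];  % Z(i,j) = df/da_j
        D  = y - f(a, x);                               % residuals D_i
        da = (Z'*Z) \ (Z'*D);                           % linearized LSQ step
        a  = a + da;                                    % a1 = a0 + da
        if max(abs(da)) < 1e-8, break; end              % convergence test
    end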