PHYS 342L NOTES ON ANALYZING DATA. Spring Semester 2002

Similar documents
XII.3 The EM (Expectation-Maximization) Algorithm

APPENDIX 2 FITTING A STRAIGHT LINE TO OBSERVATIONS

Excess Error, Approximation Error, and Estimation Error

BAYESIAN CURVE FITTING USING PIECEWISE POLYNOMIALS. Dariusz Biskup

Chapter 12 Lyes KADEM [Thermodynamics II] 2007

1.3 Hence, calculate a formula for the force required to break the bond (i.e. the maximum value of F)

Fermi-Dirac statistics

LECTURE :FACTOR ANALYSIS

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede. . For P such independent random variables (aka degrees of freedom): 1 =

System in Weibull Distribution

Least Squares Fitting of Data

CHAPTER 10 ROTATIONAL MOTION

Least Squares Fitting of Data

Math1110 (Spring 2009) Prelim 3 - Solutions

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Measurement and Uncertainties

ON THE NUMBER OF PRIMITIVE PYTHAGOREAN QUINTUPLES

Outline. Review Numerical Approach. Schedule for April and May. Review Simple Methods. Review Notation and Order

COS 511: Theoretical Machine Learning

The Impact of the Earth s Movement through the Space on Measuring the Velocity of Light

Chapter 1. Probability

Elastic Collisions. Definition: two point masses on which no external forces act collide without losing any energy.

A REVIEW OF ERROR ANALYSIS

Solution Thermodynamics

PES 1120 Spring 2014, Spendier Lecture 6/Page 1

Physics 3A: Linear Momentum. Physics 3A: Linear Momentum. Physics 3A: Linear Momentum. Physics 3A: Linear Momentum

Several generation methods of multinomial distributed random number Tian Lei 1, a,linxihe 1,b,Zhigang Zhang 1,c

SIMPLE LINEAR REGRESSION

Multipoint Analysis for Sibling Pairs. Biostatistics 666 Lecture 18

Revision: December 13, E Main Suite D Pullman, WA (509) Voice and Fax

DEMO #8 - GAUSSIAN ELIMINATION USING MATHEMATICA. 1. Matrices in Mathematica

Comparison of Regression Lines

Measurement Uncertainties Reference

AN ANALYSIS OF A FRACTAL KINETICS CURVE OF SAVAGEAU

ACTM State Calculus Competition Saturday April 30, 2011

On Pfaff s solution of the Pfaff problem

AP Physics 1 & 2 Summer Assignment

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

Foundations of Arithmetic

Determination of the Confidence Level of PSD Estimation with Given D.O.F. Based on WELCH Algorithm

Errors for Linear Systems

Solutions for Homework #9

Our focus will be on linear systems. A system is linear if it obeys the principle of superposition and homogenity, i.e.

1 Review From Last Time

χ x B E (c) Figure 2.1.1: (a) a material particle in a body, (b) a place in space, (c) a configuration of the body

On the number of regions in an m-dimensional space cut by n hyperplanes

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede. ) with a symmetric Pcovariance matrix of the y( x ) measurements V

Pre-Calculus Summer Assignment

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

What is LP? LP is an optimization technique that allocates limited resources among competing activities in the best possible manner.

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Xiangwen Li. March 8th and March 13th, 2001

,..., k N. , k 2. ,..., k i. The derivative with respect to temperature T is calculated by using the chain rule: & ( (5) dj j dt = "J j. k i.

Linear Approximation with Regularization and Moving Least Squares

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Discrete Memoryless Channels

Propagation of error for multivariable function

Kernel Methods and SVMs Extension

Module 2. Random Processes. Version 2 ECE IIT, Kharagpur

Physics 5153 Classical Mechanics. Principle of Virtual Work-1

Quantum Particle Motion in Physical Space

Lecture 12: Discrete Laplacian

Source-Channel-Sink Some questions

Numerical Methods. ME Mechanical Lab I. Mechanical Engineering ME Lab I

NUMERICAL DIFFERENTIATION

Computational and Statistical Learning theory Assignment 4

Important Instructions to the Examiners:

Small-Sample Equating With Prior Information

Statistical analysis using matlab. HY 439 Presented by: George Fortetsanakis

Lecture 3: Probability Distributions

Linear Regression Analysis: Terminology and Notation

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM

Difference Equations

Laboratory 3: Method of Least Squares

Linear Momentum. Center of Mass.

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

One Dimensional Axial Deformations

Laboratory 1c: Method of Least Squares

AGC Introduction

Chapter 8 Indicator Variables

Goodness of fit and Wilks theorem

Statistics Chapter 4

Gravitational Acceleration: A case of constant acceleration (approx. 2 hr.) (6/7/11)

University of Washington Department of Chemistry Chemistry 452/456 Summer Quarter 2013

PROBABILITY AND STATISTICS Vol. III - Analysis of Variance and Analysis of Covariance - V. Nollau ANALYSIS OF VARIANCE AND ANALYSIS OF COVARIANCE

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

Analytical Chemistry Calibration Curve Handout

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

Société de Calcul Mathématique SA

8.1 Arc Length. What is the length of a curve? How can we approximate it? We could do it following the pattern we ve used before

Chapter 13: Multiple Regression

Applied Mathematics Letters

Dummy variables in multiple variable regression model

This column is a continuation of our previous column

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ASYMMETRIC TRAFFIC ASSIGNMENT WITH FLOW RESPONSIVE SIGNAL CONTROL IN AN URBAN NETWORK

Uncertainty in measurements of power and energy on power networks

Rockefeller College University at Albany

Transcription:

PHYS 34L OTES O AALYZIG DATA Sprng Seester 00 Departent of Phscs Purdue Unverst

A ajor aspect of eperental phscs (and scence n general) s easureent of soe quanttes and analss of eperentall obtaned data. Whle there are a lot of books devoted to ths proble, n the net paragraphs we wll suarze soe of the portant deas that wll be needed to successfull analze data acqured n PHYS 34L. Students are advsed to consult wth [] for ore detaled dscusson on the topc.. The portance of estatng errors. Suppose ou are asked to easure the length of a pece of notebook paper. You grab a ruler and proceed wth a easureent. The ruler shows 76. Does t ean the length s 76.0000? Most probabl not. Wh? Because the dstance between the neghborng arks on our ruler s, b sang 76 ou cannot eclude, for eaple, length 76. or 59.9. Thus ou assue certan precson (or error) n our easureent, n ths case t s probabl ~0.5, as the dstance between the closest arks s. The result of the eausureent s not just the length of the paper but also the error of ths easureent: (76.0±0.5). In a scentfc eperent, both parts of easureent are portant. Suppose ou easure the length of the net sheet of paper to be (75.5±0.5). Wthn the error of our easureent these two sheets of paper have the sae length.. Precson (or Accurac) of a Measureent. Dstngush between absolute uncertant and relatve uncertant: absolute uncertant 7.6 ± 0. relatve uncertant 0. ± ± 0.0036388 7.6 All these nubers don t ean uch when calculatng the relatve uncertant, so round off to ± 0.004, or, epressed as a percent, ± 0.4%. 3. Cobnng Uncertantes. Suppose that ou easure two quanttes A and B. Suppose ou easure A to an accurac of ±δa and B to an accurac of ±δb. How do ou algebracall cobne these uncertantes? a) When addng: (A ± δa) + (B ± δb)? there are four possbltes: (A + δa) + (B + δb) (A + B) + (δa + δb) (A + δa) + (B - δb) (A + B) + (δa - δb) (A - δa) + (B + δb) (A + B) - (δa - δb) (A - δa) + (B - δb) (A + B) - (δa + δb) clearl, the worst case wll be (A+B)±(δA+δB) () b) When subtractng: (A±δA) - (B±δB)?

Agan consder four cases. Fro above, t should be obvous that the worst case wll be gven b (A-B)±(δA+δB) () c) When ultplng ( A ± δa) ( B ± δb) AB + A( ± δb) + B( ± δa) + ( ± δa)( ± δb) AB ± ( AδB + BδA) δb δa AB ± + B A sall, neglect (3) d) When dvdng A ± δa? B ± δb After soe algebra, ou fnd that A ± δa A δa δb ± + B ± B B (4) δ A B Reeber: relatve uncertantes add when ultplng or dvdng. absolute uncertantes add when addng or subtractng 4. Ssteatc and rando errors. X X X 3 X 4 X 5 X 6 X 7 X 8 X X X 3 X 4 X 5 X 6 X 7 X 8 X true X X X true Fgure : Spread n the easureent of soe quantt n the absence of ssteatc error (left) and n ts presence (rght). If error n our easureents s rando, then the average value should be close to the actual value. In the case of ssteatc error, that s not true. Ths stuaton a occur when, for eaple, usng a clock whch s runnng slow to easure soe te perod. Rando errors are nevtable, whle ssteatc errors can be taken nto account or elnated. 5. Average value and standard devaton. In order to decrease the nfluence of rando error ultple easureents are taken and averaged: n (5) n How close ths average value s to the actual value X? If we have a set of

easureents we can fnd an average error for a sngle easureent. The coonl accepted value to characterze error s called standard devaton, or root ean square (rs): n n ( X ) (6) Snce the actual value X s usuall unkown, we ust use nstead. It can be shown [] that n ths case: n ( ) (7) n The value characterzes error n a sngle easureent of value X. If we take several easureents of the sae value and average the, the resultng value ust n average be closer to actual value as a sngle easureent. It can be shown [] that standard devaton n for the average value of n easureents s: n (8) n 6. Dstrbuton of easureents A seres of easureents a be represented as a hstogra (Fg. ). Fgure. A sple hstogra after takng just fve data ponts (n5). There was onl one data pont fallng nto range of arked as A, B and D, and two easutreents where n regon C. It s dffcult to see an trends after takng just a few data ponts. Make ore easureents and use saller bns and ou ll eventuall get a hstogra that ght look lke ths.

Fgure 3. An hstogra after takng hundreds of easureents. In a lt of large n the dstrbuton s gven b contnuous dstrbutn functon f(), so that f()d s the probablt that a sngle easureent taken at rando wll le n the nterval to +d. The average value can be then found as: And standard devaton: f ( ) d (9) ( ) ( X ) f ( ) d f ( ) d (0) In an cases error dstrbuton functon s well descrbed b Gaussan (also called noral dstrbuton (Fg. 4): f ( X ) ( ) e () π fwh X Fgure 4. Gaussan dstrbuton functon.

The standard devaton for Gaussan dstrbuton can be also epressed as: fwh 0. 45 fwh () ln() where fwh f full wdth at half au, whch can be estated graphcall. Suppose now that we perforeed a sngle easureent whch resulted n value and we also know the standard devaton of ths easureent. Snce the Gaussan dstrbuton s a contnuous functon whch becoes zero onl n nfnt, the easured value a la anwhere fro - to +. What s the probablt that the actual value X whch we are trng to easure s wthn dstance fro ths easured value? Snce f()d s the probablt of easurng value between and +d, the probablt of X + easurng between X- to X+ s gven b ntegral f ( ) d 0.68. X Thus, the probablt of our easureent beng wthn X± - 68% X± - 95% X±3-99.7% X±4-99.994% 6. Cobnng V s. Let us now get back to the case when we add two values A and B, where standard devatons are A and B, correspondngl. What would be the standard devaton of the su A+B? As we alread know, the errors should be added for the worst case scenaro (Eq. ). However, f errors n A and B are rando and utuall uncorrelated, the tend to cancel to soe etent as there s 50% probablt that the have dfferent sgn n one easureent set. It can be shown, that the standard devaton of the su (or dfference) s: + A± B A B (3) otce that A±B A + B. If A B, then ± B Slarl, the relatve standard devaton C /C for product (or rato) of A and B s: A. C C A A B + B (4) where CAB or CA/B. ote, that ths s true onl f errors are uncorrelated and not ssteatc.

7. A sple eaple Suppose ou need to evaluate the charge to ass rato of an electron. Ths s to be done usng the followng equatons where BkI How would ou analze the error n q? q V, (5) B R Suppose ou easure I, V and k as follows: I(.4±0.) A V(40±) V kcol constant(7.5±0.5) 0-4 T/A You also easure R, the radus of the electron s orbt, b easurng ts daeter D. Snce the sallest arkng on the ruler s and ou have to deterne the postons of both sdes of the electron orbt, the precson of such a easureent s not better than 0. c. To reduce rando error ou a want to take several easureents. Suppose ou ake three easureents of D: D 6.0 ±0. c D 5.8±0. c D 3 5.7±0. c R D /3.0±0. c R.9±0. c R 3.85±0. c Usng Eqs. 5 and 7, the average value of R and the standard devaton for one easurent wll be 3.0 +.9 +.85 R.9( c) 3 be 0.08 + 0.0 + 0.07 0.054 Accordng to Eq. 8, for the average of 3 easureents the standard devaton 3 wll 0.054 3 3 0.03 However, the ruler we use has precson ~0. c onl. Usng Eq. 3 we can account for both errors: and we have R + 0. 0. c 3 R (.9±0.) c ow we calculate: q V V B R k I R q (7.5 ± 0.5) 0 (40 ± ) 4 [ ] [.4 ± 0.] [ 0.09 ± 0.00] (6)

Ottng errors we get the value of Usng Eq. 4 we can wrte: q.98 0 C / kg q / n V k I q / n V + k + I + R R (7) ote the factors n the equaton above. These ste fro the fact that correspondng values are squared n Eq. 6,.e. we have products k k, I I and R R. Snce k k s a product of two correlated values, we ust use Eq. 3 the relatve errors spl add up. q / n q / n.98 0 0.63 0 40 C / kg 7.5 + 0.5 0. +.4 0.00 + 0.09 And the fnal result for the rato can be wrtten as: q (.98 ± 0.63) 0 C / kg Here we used Eq. 4 snce values V, k, I and R are uncorrelated. If we assue, just for eaple, a correlated case, we ust use Eq. 3 and 4.e. add relatve errors: and q / n V k I R + + + q / n V k I R.0 0 C kg q / n / The accepted value s.76 0 C/kg. It s eas to ake a sple plot ncludng error bars to graphcall llustrate ths result.

q 4 3.98±0.63.76 Fgure 6: A plot showng the easured value wth error bar. The dotted horzontal lne represents the accepted value. The q/ as has unts of 0 C/kg. The easured value of q/ s ~ hgher than the accepted value. The probablt for that to happen s onl ~5%, whch strongl suggests the presence of soe ssteatc error n our easureent.

8. Least Squares Ft. Suppose ou easure soe data ponts as a functon of a varable called. After the easureents, ou wll have a set of data ponts,,, Soetes ou ght know that the data should ft a straght lne (e.g., fro theoretcal consderatons). The equaton of a straght lne s +b where the slope ght equal a certan quantt of nterest and the ntercept b ght equal soe other quantt of nterest. Fgure 7: A plot showng the best straght lne ft to a collecton of data ponts. In short, f ou could deterne and b, these values a contan estates for useful quanttes. One wa to deterne and b s to plot the data and use a ruler to draw a straght lne through the ponts. Then, b calculatng and b fro the straght lne drawn, ou have produced soe weghted average estate of and b fro all our data. A sple eaple? Suppose ou are asked to deterne π eperentall and suppose ou alread know that for crcles crcuference π (daeter) One wa to proceed ght be to ake a varet of crcles of dfferent daeters and then easure the crcuference of each one. You ght plot the data as follows:

Fgure 8: A plot of how the easured values for the crcuference of dfferent crcles ght var as a functon of the easured daeter. ote that n ths eaple, the ntercept of the best straght lne through the data MUST pass through the co-ordnate orgn. Clearl, the slope of a straght lne through the data contans useful nforaton snce πslope. Q: How can ou deterne the best value for the slope and ntercept wthout prejudce or personal judgent? A: Use the prncple of least squares. Assue ou draw crcles and ake easureents of each crcuference and daeter. Let the ndependent varable (the daeter) be represented b the sbol d. Let the dependent varable (the crcuference) be represented b the sbol C. Also assue the d values are accurate. After the easureent process, ou ll have a set of nubers (d,c): d, C d, C.. d, C It s conventonal to ap these nubers nto the paraeters (,) as follows d, C d, C.. d, C Let the dfference between the best lne through the data and each ndvdual data pont be represented b (δ ). One unabguous wa to specf the best lne through all the data can be defned b the condton that the su of all the (δ ) have a nu value. How are the ndvdual δ defned? Graphcall, the are ndcated n the plot below. ote that at ths pont of the analss, the straght lne drawn need not be the best straght lne through the data.

Fgure 9. A least squares analss requres ou to calculate the devaton of each data pont fro the best straght lne. Matheatcall, ou can calculate the δ as follows. Suppose ou defne a quantt Y such that Y +b where the sbols and b are soehow chosen to represent the best straght lne through the data, whatever that eans. Calculate δ -Y C -(d +b) Least Squares Fttng requres that (where the swtch n notaton fro (C,d) to (,) has been ade) ( ) ( ) [ ] + b nu δ Wrte Σ [ -( +b)] M. The condtons for M to be a nu are 0, 0 b M M Perforng the dervatves and settng the equal to zero gves, after soe algebra, two unque equatons for the best and b. ) ( ) ( Once we have and b, then also calculate the nteredate quantt :

[ ( + b) ] It can be shown that the uncertant n the slope and ntercept b s gven b ( ) ( ) You can conclude that the best values for and b are ± b± b Ths eans that there s a 68% chance of the real lng between - and +. Lkewse, there s a 68% chance of the real b lng between b- b and b+ b. Snce the least squares fttng forulae nvolve sus over varous cobnatons of easured data, the least squares fttng procedure s especall eas to pleent n spread sheets lke Ecel. In fact, ost spread sheet progras have pre-prograed least square ft routnes avalable as analss tools. Reference. G. L. Squres. Practcal Phscs, 4 th Edton, Cabrdge Unverst Press, Cabrdge (00)