The Treatment of Numerical Experimental Results

Memorial University of Newfoundl Department of Physics Physical Oceanography The Treatment of Numerical Experimental Results The purpose of these notes is to introduce you to some techniques of error analysis which you will use in the laboratory The notes are in no sense complete; other techniques such as the χ 2 test t-test are widely used, but are not discussed here Introduction An experimental result has no physical meaning unless an uncertainty (or error [1] )is assigned to it Getting the wrong answer when multiplying or dividing numbers is not an error, but a mistake is something which you should recognize be able to correct The error is not found by comparing your answer to some number in the textbook: uncertainties cannot be avoided in experimental physics Multiple measurements of the same quantity using the same measuring instrument, may not give the same result each time due to rom errors If a systematic error is present, a set of very precise measurements may miss the true value completely For example, a stopwatch is able to measure times to within 001 s, but there is often an associated error due to reaction time of about 03 s Absolute Relative Uncertainty Suppose a ruler is used to measure the length of a rod It is difficult to determine length exactly, but we decide on a value somewhere between 101 cm 103 cm, written as L =102 cm± 01 cm where 102 cm is the most likely for the length the value ±01 is called the absolute uncertainty in the measurement The uncertainty defines a range of possible values for the length, so that 101 cm L 103 cm We often use relative uncertainty, where relative uncertainty = absolute uncertainty, measured value [1] The terms experimental error experimental uncertainty are assumed to have the same meaning in these notes The Treatment of Numerical Experimental Results (1)

which, for the rod measurement is ± 01 102 = ±0009 (no units) The relative uncertainty (or precision of the measurement) is often quoted as a percentage so that the percent uncertainty above is 09% In the laboratory, the size of the error is usually estimated according to the equipment being used You need to select a large enough uncertainty such that the true value lies within the range of uncertainty most of the time This is not always straightforward but becomes easier with experience Combining Errors in Laboratory Results In experiments where the desired result is calculated from two or more quantities (eg speed = distance/time) the errors in each quantity must be combined to give an uncertainty in the final answer To calculate z ±δz, where z = f(x, y) ±δx ±δy are the uncertainties in x y, we could, in principle, calculate z for all values of x ± δx, y ± δy A less tedious method is to calculate the maximum value of δz from the equation: δz = f f δx + δy, (1) x y which for simple mathematical operations reduces to the following forms: z = x + y δz = δx + δy (2a) z = x y δz = δx + δy (2b) z = x y z = x y δz z = δx x + δy y δz z = δx x + δy y (2c) (2d) (2) The Treatment of Numerical Experimental Results

Examples 1) When the desired result depends on more than two quantities, the error may be calculated by breaking down the algebraic expression term by term Thus if a = bc/d, the maximum error in a is given by: δa a = δb b + δc c + δd d (3) 2) If z = x n,wheren can be positive or negative, we differentiate to obtain δz = nx n 1 δx or, as a relative uncertainty, δz z = n δx (4) x 3) The density of a metal cylinder may be calculated from ρ = 4m πd 2 l where the symbols have their usual meaning The corresponding error equation is δρ ρ = δm m +2δd d + δl l Note that 4 π do not contribute to the relative error since they are constants 4) y = a sin θ; or δy y = δa cos θδθ + a sin θ or δy y = δa a +cotθδθ provided θ δθ are measured in radians 5) y = a cos θ; δy = δa sin θ + a cos θδθ (using Eq (1)) or δy = δa cos θ + a sin θδθ δy y = δa a +tanθδθ 6) y =lnx; δy = 1 x δx The Treatment of Numerical Experimental Results (3)

Mean Stard Error The arithmetic mean x of a set of N readings is defined by N i=1 x = (5) N where x i is the ith reading, means add up all the individual values of x i from i =1to N The mean is the best estimate of the true value Repeated measurements generally follow a normal or Gaussian probability distribution; the probability of occurrence of an individual value x i may be calculated from { 1 (xi x) 2 } P (x i )= exp σ x 2π where x is the mean σ x is the stard deviation, defined by (xi x) σ x = 2 (6) N 1 The stard deviation is a measure of the deviation of a typical reading from the mean value It may be shown that 68% of the measurements lie within one stard deviation of x nearly all measurements (95%) are expected to lie within 2σ x of the mean The quantity σx 2 is called the variance It is generally more useful to consider the stard error of the mean, σ x It may be shown that for N measurements, each subject to an error δx i, the error in the mean is Experimental results are usually expressed in the form x i 2σ 2 x σ x = σ x (7) N x ± σ x In the special case of radioactive decay, the mean square deviation (from Poisson statistics) is given by σ = N where N is the number of counts Data are recorded in the form N ± N Hence it is necessary to count for at least 10,000 decays to obtain an accuracy of 1% Foran accuracy of 1 in 10 3, 10 6 decay events are required (4) The Treatment of Numerical Experimental Results

Significant Figures Suppose you use a calculator to obtain a stard deviation of 0987654321 If this is larger than the reading error in your measurements, then this will be the error in each of the datapoints For a sample of N datapoints, the expected uncertainty in the stard deviation (ie, the error in the error) is: δσ est = σ est 2N 2 Suppose δσ est =0232792356 This means that the actual value of the stard deviation lies between 0987654321-0232792356 = 0704861965, 0987654321 + 0232792356 = 1220446677 A moment s thought about this should convince you that many of these digits have no significance The value of the estimated stard deviation is more like 099 ± 023 or maybe even 10 ± 02 Writing 0988 ± 0233 has more digits than are actually significant Even if you repeat a measurement 50 times, the estimated stard deviation has at most only two digits that have any meaning Imagine that one of the data points has a numerical value of 123456789 If we estimate the stard deviation to be 099, then the point value is 1235 ± 099 It would be wrong to say 12345 ± 099, since the 5 has no meaning In the laboratory, the reading error will be the error in each individual measurement This will be little more than a guess made by the experimenter, it is doubtful that you can guess to more than one significant figure Thus a reading error almost by definition has only one significant figure, that number determines the significant figures in the value itself You need to be particularly careful when writing down computer-generated results A slope of 00778465 ± 000217814 is meaningless: the error should written as ±0002, which means that the slope should be written as 0078 to keep the same number of figures after the decimal point Similarly, if the voltage across a resistor is 154 ± 01 volts the current is 17 ± 01 amps, the resistance is not 90588235 ohms because any additional figures beyond the first decimal place are meaningless The final value for R should be written as (91 ± 06) Ω The Treatment of Numerical Experimental Results (5)

The Errors in a Straight Line Graph Often the result of an experiment depends on the slope of a straight line graph Errors associated with the data are illustrated by error bars, the size of which defines the range of uncertainty in one (or both) axes A straight line is represented by the equation where m is the slope b is the y-intercept y = mx + b y v a l u e s δx δy maximum best minimum b xvalues The uncertainty in the slope is estimated by considering extremes of maximum minimum slope which might concievably fit the data, as illustrated in the diagram Denoting the slopes of these lines by m max m min respectively, δm may be calculated from δm = m max m min 2 (8) the error in the intercept: δb = b max b min (9) 2 (6) The Treatment of Numerical Experimental Results

Method of Least Squares Linear regression by the Method of Least Squares finds the equation of the best fit line by minimizing the sum of the squares of deviations of the individual y values from the straight line δx 5 y v a l u e s δx 2 δx 4 δx 3 δx 1 xvalues The slope intercept are given by m = (xy) 1/ N ( x)( y) (x2 ) 1 / N ( x) 2 (10) b = 1 / N ( y m x ) (11) with uncertainties (y2 ) b y m (xy) δb = N 2 δm = (12) δb (x2 ) 1 / N ( x) 2 (13) The Treatment of Numerical Experimental Results (7)

Worked Example Linear regression is easily performed by computer, however, a detailed calculation is shown here for completeness [2] x =10, 20, 30, 40, 50, 60 y =716, 725, 743, 761, 770, 779, Example of Least Squares Fit 78 76 y values 74 72 70 10 20 30 40 50 60 x values i x i y i x 2 i yi 2 x i y i 1 10 716 10 5126 716 2 20 725 40 5256 145 3 30 743 90 5521 2229 4 40 761 160 5791 3044 5 50 770 250 5929 385 6 60 779 360 6068 4674 sum 210 4494 910 33692 15963 [2] Data taken from L Hmurcik et al Linear regression analysis in a first physics lab, Am J Phys, 57, 135 138 (1989) (8) The Treatment of Numerical Experimental Results

hence x =210; y =4494; x 2 =910; y 2 = 33692 xy = 15963 Finally we obtain The corresponding uncertainties are given by m = 15963 1 / 6 210 4494 910 1 / 6 (210) 2 =0134 b = 1 / 6 (4494 0134 210) = 702 33692 702 4494 0134 15963 δb = 4 δm = 004 910 1/ 6 (210) 2 =0009 =004 The relative uncertainty in the slope, therefore, is 0009 =0067 or 67% 0134 the relative uncertainty in the intercept is 004 702 =0005 or 05% When the data points do not follow a straight line, the best curve through the points may also be obtained by linear regression provided the function is a polynomial of the form y = a 0 + a 1 x + a 2 x 2 + a 3 x 3 + Other mathematical functions may be fitted using non-linear regression In most of the lab experiments that you will do, the uncertainty in the slope of the straight line will be greater than the uncertainties in other measured quantities This means that you can usually ignore the errors in everything but the slope The Treatment of Numerical Experimental Results (9)