Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Alexandr Malusek Division of Radiological Sciences Department of Medical and Health Sciences Linköping University 2014-04-15 Alexandr Malusek (Radiation Physics) Estimation of uncertainties 2014-04-15 1 / 39
Outline

1. Repetition
   - Taylor series
   - Random variables
   - Statistics
2. Guide to the expression of uncertainty in measurement
   - Documents
   - Terminology
   - GUM 1995
   - Propagation of distributions using a Monte Carlo method
3. Example
4. Appendix
   - Non-adaptive Monte Carlo method
   - Adaptive Monte Carlo method
Repetition

First, we review what you already know.
Repetition: Taylor series in one variable

The Taylor series of a function $f(x)$ that is infinitely differentiable at a number $a$ is the power series

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \dots$$

where $n!$ denotes the factorial of $n$ and $f^{(n)}(a)$ denotes the $n$th derivative of $f$ evaluated at the point $a$. In sigma notation:

$$f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!}(x - a)^i$$
Repetition: Taylor series in several variables

Taylor series of a function $f$ of $n$ variables $(x_1, \dots, x_n)$ at a point $(a_1, \dots, a_n)$:

$$f(x_1, \dots, x_n) = \sum_{i_1=0}^{\infty} \sum_{i_2=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \frac{(x_1 - a_1)^{i_1} \cdots (x_n - a_n)^{i_n}}{i_1! \cdots i_n!} \left( \frac{\partial^{\,i_1 + \dots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}} \right)(a_1, \dots, a_n)$$

The first order approximation:

$$f(x_1, \dots, x_n) \approx f(a_1, \dots, a_n) + \sum_{i=1}^{n} \frac{\partial f(a_1, \dots, a_n)}{\partial x_i}(x_i - a_i)$$

Example: $f(x_1, x_2) = -x_1^2 - x_2^2$. At $(-1, -1)$:

$$f(x_1, x_2) \approx -2 + 2(x_1 + 1) + 2(x_2 + 1)$$

[Figure: surface plot of f(x_1, x_2) together with its first-order approximation at (-1, -1).]
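The first-order approximation can be checked numerically. The sketch below assumes the example function is $f(x_1, x_2) = -x_1^2 - x_2^2$ expanded at $(-1, -1)$ (signs reconstructed from the slide); the helper names `grad_f` and `taylor1` and the two test points are illustrative choices, not part of the lecture.

```r
# Example function (signs are an assumption): f(x1, x2) = -x1^2 - x2^2
f <- function(x) -x[1]^2 - x[2]^2

# Analytic gradient of f
grad_f <- function(x) c(-2 * x[1], -2 * x[2])

# First-order Taylor approximation of f around the point a
taylor1 <- function(x, a) f(a) + sum(grad_f(a) * (x - a))

a <- c(-1, -1)            # expansion point
x_near <- c(-0.9, -1.1)   # hypothetical point close to a
x_far <- c(0.5, 0.5)      # hypothetical point far from a

# Absolute error of the linear approximation at both points
err_near <- abs(f(x_near) - taylor1(x_near, a))
err_far <- abs(f(x_far) - taylor1(x_far, a))
print(c(near = err_near, far = err_far))
```

For this quadratic function the error equals the squared distance from the expansion point, so the approximation is only useful close to $a$.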
Repetition: Random variables

In probability and statistics, a random variable or stochastic variable is a variable whose value is subject to variations due to chance. For a continuous random variable $X$, we define:

- distribution function: $G_X(\xi) = \Pr(X \le \xi)$
- probability density function: $g_X(\xi) = \mathrm{d}G_X(\xi)/\mathrm{d}\xi$
- expectation: $E[X] = \int_{-\infty}^{\infty} \xi\, g_X(\xi)\, \mathrm{d}\xi$
- variance: $\mathrm{Var}(X) = E[(X - E[X])^2] = \int_{-\infty}^{\infty} (\xi - E[X])^2 g_X(\xi)\, \mathrm{d}\xi$
- standard deviation: $s = [\mathrm{Var}(X)]^{1/2}$
Repetition: Random variables

An example for a normally distributed $X$:

- distribution function: $G_X(\xi) = \int_{-\infty}^{\xi} g_X(\xi')\, \mathrm{d}\xi' = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\xi - \mu}{\sigma\sqrt{2}}\right)\right]$
- probability density function: $g_X(\xi) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(\xi - \mu)^2}{2\sigma^2}\right]$
- expectation: $E[X] = \mu$
- variance: $\mathrm{Var}(X) = \sigma^2$
- standard deviation: $s = \sigma$

[Figure: g(ξ) and G(ξ) plotted against ξ for the standard normal distribution, µ = 0, σ = 1.]
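The definitions of expectation, variance and distribution function can be verified by numerical integration. This is a minimal sketch for the standard normal distribution using base R's `integrate`, `dnorm` and `pnorm`; the variable names are illustrative.

```r
mu <- 0
sigma <- 1

# Probability density function g_X of the normal distribution
g <- function(xi) dnorm(xi, mean = mu, sd = sigma)

# Normalization, expectation and variance by numerical integration
total <- integrate(g, -Inf, Inf)$value
expectation <- integrate(function(xi) xi * g(xi), -Inf, Inf)$value
variance <- integrate(function(xi) (xi - expectation)^2 * g(xi),
                      -Inf, Inf)$value

# Distribution function G_X(1) as the integral of the density up to 1
G_at_1 <- integrate(g, -Inf, 1)$value

print(c(total = total, expectation = expectation,
        variance = variance, G_at_1 = G_at_1, pnorm_1 = pnorm(1)))
```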
Repetition: Statistics

A statistic is a function of random variables that does not depend upon any unknown parameter. For example,

$$Y = X_1 / X_2$$

[Figure: histograms (counts) of x1, x2 and y, and a scatter plot of x2 against x1.]
Repetition: Statistics

The expectation operator $E$ is linear in this sense: for a real number $a$ and random variables $X$ and $Y$,

$$E[aX + Y] = a\,E[X] + E[Y]$$

The covariance operator

$$\mathrm{Cov}(X, Y) \equiv E[(X - E[X])(Y - E[Y])] = \mathrm{Cov}(Y, X)$$

is linear too, in the following sense:

$$\mathrm{Cov}(aX, Y) = E[(aX - E[aX])(Y - E[Y])] = a\,\mathrm{Cov}(X, Y)$$

$$\begin{aligned}
\mathrm{Cov}(X + Y, Z) &= E[(X + Y - E[X + Y])(Z - E[Z])] \\
&= E[(X - E[X])(Z - E[Z]) + (Y - E[Y])(Z - E[Z])] \\
&= \mathrm{Cov}(X, Z) + \mathrm{Cov}(Y, Z)
\end{aligned}$$
Repetition: Statistics

As a consequence, the variance of a linear combination of random variables $aX + bY$ can be calculated as

$$\begin{aligned}
\mathrm{Var}(aX + bY) &= \mathrm{Cov}(aX + bY, aX + bY) \\
&= a\,\mathrm{Cov}(X, aX + bY) + b\,\mathrm{Cov}(Y, aX + bY) \\
&= a^2\,\mathrm{Cov}(X, X) + ab\,\mathrm{Cov}(X, Y) + ab\,\mathrm{Cov}(Y, X) + b^2\,\mathrm{Cov}(Y, Y) \\
&= a^2\,\mathrm{Var}(X) + 2ab\,\mathrm{Cov}(X, Y) + b^2\,\mathrm{Var}(Y)
\end{aligned}$$

In general:

$$\mathrm{Var}\left(\sum_i a_i X_i\right) = \sum_{i,j} a_i a_j\,\mathrm{Cov}(X_i, X_j) = \sum_i a_i^2\,\mathrm{Var}(X_i) + 2 \sum_{i,j:\,i<j} a_i a_j\,\mathrm{Cov}(X_i, X_j)$$
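The two-variable identity also holds for sample variances and covariances, which gives a quick numerical check. In this sketch the coefficients, the sample size and the way the correlated pair is built are arbitrary illustrative choices.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)   # correlated with x by construction

a <- 2
b <- -3

# Left-hand side: sample variance of the linear combination
lhs <- var(a * x + b * y)

# Right-hand side: a^2 Var(X) + 2ab Cov(X, Y) + b^2 Var(Y)
rhs <- a^2 * var(x) + 2 * a * b * cov(x, y) + b^2 * var(y)

print(c(lhs = lhs, rhs = rhs))
```

Both numbers agree up to floating-point rounding, because the sample covariance is bilinear in the same way as the covariance operator.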
Repetition: Covariance

$$\mathrm{Cov}(X, Y) \equiv E[(X - E[X])(Y - E[Y])]$$

$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$

[Figure: surface plot of f(x, y) = xy, and two scatter plots of X_2 - E[X_2] against X_1 - E[X_1], annotated "Cov = 4.06, ρ = 0.94" and "Cov = 4.03, ρ = 0.94".]
Guide to the expression of uncertainty in measurement

The lecture starts now!
Guide to the expression of uncertainty in measurement

This lecture is based on:

- Evaluation of measurement data - Guide to the expression of uncertainty in measurement, JCGM 100:2008 (GUM 1995 with minor corrections)
- Evaluation of measurement data - Supplement 1 to the "Guide to the expression of uncertainty in measurement" - Propagation of distributions using a Monte Carlo method, JCGM 101:2008

The documents are available at http://www.bipm.org/en/publications/guides/gum.html (or search for: JCGM GUM).

JCGM: Joint Committee for Guides in Metrology, http://www.bipm.org/en/committees/jc/jcgm/
Accompanying documents

- Evaluation of measurement data - An introduction to the "Guide to the expression of uncertainty in measurement" and related documents, JCGM 104:2009
- Evaluation of measurement data - Supplement 2 to the "Guide to the expression of uncertainty in measurement" - Extension to any number of output quantities, JCGM 102:2011
- Evaluation of measurement data - The role of measurement uncertainty in conformity assessment, JCGM 106:2012
- Evaluation of measurement data - Supplement 3 to the "Guide to the expression of uncertainty in measurement" - Modeling (in preparation)
- Evaluation of measurement data - Applications of the least-squares method (in preparation)
Terminology

- Coverage interval: interval containing the value of a quantity with a stated probability, based on the information available. (Note that a confidence interval is something else!)
- Coverage probability: probability that the value of a quantity is contained within a specified coverage interval.
GUM 1995: Measurement model

A measurand $Y$ is determined from $N$ other quantities $X_1, \dots, X_N$ through a functional relationship $f$:

$$Y = f(X_1, \dots, X_N)$$

where

- $X_1, \dots, X_N$ are input quantities,
- $Y$ is the output quantity.

Notation:

- $X_i$ denotes both the random quantity and its outcome.
- The expectation of $X_i$ is denoted $x_i$.
GUM 1995: Combined standard uncertainty

The combined standard uncertainty $u_c(y)$ associated with the result of a measurement is estimated as

$$u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} u(x_i, x_j)$$

where

- $u(x_i)$ is the standard uncertainty of the input estimate $x_i$,
- $\partial f / \partial x_i$ is a sensitivity coefficient,
- $u(x_i, x_j)$ is an estimated covariance associated with $x_i$ and $x_j$.
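The double sum in the combined standard uncertainty is the quadratic form $c^{T} V c$, where $c$ holds the sensitivity coefficients and $V$ is the covariance matrix of the input estimates. The sketch below uses that form; the function name `combined_uncertainty`, its arguments and the correlation value 0.5 are hypothetical, and the model $Y = X_1/X_2$ with $x_1 = 40$, $x_2 = 20$, $u(x_1) = u(x_2) = 1$ is borrowed from the R example later in the lecture.

```r
# Combined standard uncertainty from sensitivity coefficients c_sens,
# standard uncertainties u and a correlation matrix r (identity by default)
combined_uncertainty <- function(c_sens, u, r = diag(length(u))) {
  V <- outer(u, u) * r                      # covariance matrix u(x_i, x_j)
  sqrt(drop(t(c_sens) %*% V %*% c_sens))    # quadratic form c^T V c
}

# Model Y = X1 / X2 evaluated at the estimates x1 = 40, x2 = 20
x <- c(40, 20)
u <- c(1, 1)
c_sens <- c(1 / x[2], -x[1] / x[2]^2)       # partial derivatives of x1 / x2

# Uncorrelated inputs
uc_uncorrelated <- combined_uncertainty(c_sens, u)

# Hypothetical correlation of 0.5 between the two inputs
r <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
uc_correlated <- combined_uncertainty(c_sens, u, r)

print(c(uncorrelated = uc_uncorrelated, correlated = uc_correlated))
```

Because the two sensitivity coefficients have opposite signs, a positive correlation between the inputs reduces the combined uncertainty of the ratio.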
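GUM 1995: Derivation

The formula

$$u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} u(x_i, x_j)$$

follows directly from the first order approximation to $f$

$$y = f(x_1, \dots, x_n) \approx f(a_1, \dots, a_n) + \sum_{i=1}^{n} \frac{\partial f(a_1, \dots, a_n)}{\partial x_i}(x_i - a_i)$$

and the formula for the calculation of variance:

$$\mathrm{Var}\left(\sum_i a_i X_i\right) = \sum_i a_i^2\,\mathrm{Var}(X_i) + 2 \sum_{i,j:\,i<j} a_i a_j\,\mathrm{Cov}(X_i, X_j)$$

where the coefficients $a_i$ in the variance formula are the sensitivity coefficients $\partial f / \partial x_i$. Note that, for the constant expansion point $a_i$ of the Taylor approximation, $\mathrm{Var}(x_i - a_i) = \mathrm{Var}(x_i)$ and $\mathrm{Var}(a_i) = 0$.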
GUM 1995: Limitations

If the functional relationship between $Y$ and its input quantities is nonlinear and a first-order Taylor expansion of the relationship is not an acceptable approximation, then the probability distribution of $Y$ cannot be obtained by convolving the distributions of the input quantities. In such cases, other analytical or numerical methods are required.
Calculation of $x_i$ and $u(x_i)$

$$u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} u(x_i, x_j)$$

- Type A evaluation of standard uncertainty $u(x_i)$ is based on statistical evaluation of repeated measurements.
- Type B evaluation of standard uncertainty $u(x_i)$ is based on other methods (prior knowledge about the distribution).

The term "type A standard uncertainty" is sometimes used to denote the result of a type A evaluation (and similarly for type B). Do not confuse these terms with the concepts of random and systematic errors!
Type A evaluation of $u(x_i)$

The expectation $x_i$ is calculated as an average of $N$ measurements $X_{i,1}, \dots, X_{i,N}$:

$$x_i = \bar{X}_i = \frac{1}{N} \sum_{j=1}^{N} X_{i,j}$$

The standard uncertainty $u(x_i)$ is calculated from

$$u^2(x_i) = s^2(\bar{X}_i) = \frac{1}{N(N-1)} \sum_{j=1}^{N} (X_{i,j} - x_i)^2$$

Here $N$ is the number of repeated measurements of $X_i$, not the number of input quantities.
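A Type A evaluation amounts to computing the mean of the repeated readings and the standard deviation of that mean. The sketch below uses made-up readings (the values in `readings` are hypothetical) and compares R's `sd(x) / sqrt(n)` with the explicit formula.

```r
# Hypothetical repeated readings of one input quantity
readings <- c(10.1, 9.8, 10.3, 10.0, 9.9)
n <- length(readings)

# Estimate x_i: arithmetic mean of the readings
x_i <- mean(readings)

# Standard uncertainty u(x_i): standard deviation of the mean
u_x <- sd(readings) / sqrt(n)

# The same quantity from the explicit formula 1 / (N (N - 1)) * sum(...)
u_x_formula <- sqrt(sum((readings - x_i)^2) / (n * (n - 1)))

print(c(x_i = x_i, u_x = u_x, u_x_formula = u_x_formula))
```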
Type B evaluation of $u(x_i)$

The standard uncertainty $u(x_i)$ is evaluated by scientific judgment.

Example: Assume it is possible to estimate only bounds (upper and lower limits) for $X_i$; in particular, to state that the probability that the value of $X_i$ lies within the interval $a_-$ to $a_+$ is for all practical purposes equal to one, and the probability that $X_i$ lies outside this interval is essentially zero. If there is no specific knowledge about the possible values of $X_i$ within the interval, one can only assume that it is equally probable for $X_i$ to lie anywhere within it (a uniform or rectangular distribution of possible values). Then

$$x_i = (a_- + a_+)/2$$

and

$$u^2(x_i) = (a_+ - a_-)^2 / 12$$

See GUM 1995 for more examples.
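The rectangular-distribution result can be checked against the definition of variance. In this sketch the bounds `a_minus` and `a_plus` are hypothetical values chosen for illustration; the variance integral is evaluated with base R's `integrate` and `dunif`.

```r
# Hypothetical lower and upper bounds of the input quantity
a_minus <- 9.5
a_plus <- 10.5

# Estimate and standard uncertainty for a rectangular distribution
x_i <- (a_minus + a_plus) / 2
u_x <- (a_plus - a_minus) / sqrt(12)

# Variance from its definition, integrating over the uniform density
var_num <- integrate(
  function(xi) (xi - x_i)^2 * dunif(xi, min = a_minus, max = a_plus),
  a_minus, a_plus
)$value

print(c(x_i = x_i, u_x = u_x, sqrt_var_num = sqrt(var_num)))
```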
Reporting uncertainty

The expanded uncertainty, $U$, is obtained as

$$U = k\,u_c(y)$$

where $u_c(y)$ is the combined standard uncertainty and $k$ is the coverage factor. Typically $k = 2$ or $k = 3$ for normally distributed $Y$. Information about other distributions is in GUM 1995.

Report your result as

$$Y = y \pm U$$

The relative expanded uncertainty $U/y$, the value of $k$, and the corresponding level of confidence should also be reported; see GUM 1995 for more details.
Example 1

Assume $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are uncorrelated (so the covariance term vanishes), and the model is

$$Y = f(X_1, X_2) = a_1 X_1 + a_2 X_2$$

Then

$$x_1 = \mu_1, \quad x_2 = \mu_2$$
$$u(x_1) = \sigma_1, \quad u(x_2) = \sigma_2$$
$$c_1 = \partial f / \partial x_1 = a_1, \quad c_2 = \partial f / \partial x_2 = a_2$$
$$y = a_1 x_1 + a_2 x_2 = a_1 \mu_1 + a_2 \mu_2$$
$$u_c(y) = [c_1^2 u^2(x_1) + c_2^2 u^2(x_2)]^{1/2} = [c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2]^{1/2}$$
$$Y = y \pm k\,u_c(y)$$
Propagation of distributions using a Monte Carlo method

Consider the following model, where the distributions of $X_1, \dots, X_N$ are known:

$$Y = f(X_1, \dots, X_N)$$

Think of a spreadsheet where each line contains samples of $X_1, \dots, X_N$ drawn from the known distributions and the output quantity calculated from the samples.

    Y                                X_1      X_2      ...   X_N
    y_1 = f(X_{1,1}, ..., X_{N,1})   X_{1,1}  X_{2,1}  ...   X_{N,1}
    y_2 = f(X_{1,2}, ..., X_{N,2})   X_{1,2}  X_{2,2}  ...   X_{N,2}
    ...
    y_M = f(X_{1,M}, ..., X_{N,M})   X_{1,M}  X_{2,M}  ...   X_{N,M}

Estimate the coverage interval of $Y$ from the samples $y_1, \dots, y_M$. As $M$ should be quite large (on the order of 100 000), spreadsheets are not used in practice.
Expectation and standard uncertainty

The average

$$\tilde{y} = \frac{1}{M} \sum_{r=1}^{M} y_r$$

is taken as an estimate $y$ of $Y$. The standard deviation $u(\tilde{y})$, given by

$$u^2(\tilde{y}) = \frac{1}{M-1} \sum_{r=1}^{M} (y_r - \tilde{y})^2$$

is taken as the standard uncertainty $u(y)$ associated with $y$.
Coverage interval

Method 1 (the easy way): The coverage interval is determined using a quantile function provided by a statistical software package (R, Statistica, Matlab, ...).

[Figure: estimated probability density of y.]

Method 2 (described in JCGM 101:2008):

- Sort $y_1, \dots, y_M$.
- For $M = 100$, the index pairs (1, 91), (5, 95) and (10, 100) define 90 % coverage intervals: $(y_1, y_{91})$, $(y_5, y_{95})$, $(y_{10}, y_{100})$.
- Select the shortest or the probabilistically symmetric interval.

[Figure: sorted values of y plotted against their index.]
An example in the statistical software R

Model: $Y = f(X_1, X_2) = X_1 / X_2$, with $X_1 \sim N(40, 1^2)$ and $X_2 \sim N(20, 1^2)$.

Scalar calculation:

    x1 <- 40
    x2 <- 20
    y <- x1 / x2
    print(y)

Vector calculation:

    n <- 100000                          # number of Monte Carlo trials
    x1 <- rnorm(n, mean=40, sd=1)        # samples of X1
    x2 <- rnorm(n, mean=20, sd=1)        # samples of X2
    y <- x1 / x2                         # model evaluated for every trial
    print(mean(y))                       # estimate of Y
    print(quantile(y, c(0.025, 0.975)))  # 95 % coverage interval

R is freely available from http://www.r-project.org/.
Reporting of results

Example 1 (two significant digits in u(y)):

    y = 1.024 V, u(y) = 0.028 V
    shortest 95 % coverage interval = [0.983, 1.088] V

Example 2 (one significant digit in u(y)):

    y = 1.02 V, u(y) = 0.03 V
    shortest 95 % coverage interval = [0.98, 1.09] V
Example: The effect of covariance

Let $K$, $N_Q$, and $M_Q$ be air kerma, calibration coefficient and dosimeter reading, respectively. Then

$$K = N_Q M_Q = \frac{N_Q}{N_{Q_1}} \frac{N_{Q_1}}{N_{Q_0}} N_{Q_0} M_Q = k_1 k_2 N_{Q_0} M_Q$$

For simplicity, consider the product $k_1 k_2$ only. The GUM 1995 formula gives:

$$\left[ \frac{u_c(k_1 k_2)}{k_1 k_2} \right]^2 = \left[ \frac{u(k_1)}{k_1} \right]^2 + \left[ \frac{u(k_2)}{k_2} \right]^2 + 2 \rho(k_1, k_2) \frac{u(k_1)}{k_1} \frac{u(k_2)}{k_2}$$

How does the last term affect the relative combined standard uncertainty?
Example: The effect of covariance

Suppose you got the value of $N_{Q_0}$ from a standards laboratory and you performed one measurement of $N_Q$ and $N_{Q_1}$ each day. Each day, you calculated one sample of $k_1$ and $k_2$. Now you want to analyze the uncertainty of $k_1 k_2$.

Suppose:

$$N_{Q_0} \sim N(10, 0^2), \quad N_{Q_1} \sim N(10, 2^2), \quad N_Q \sim N(10, 1^2)$$

The Monte Carlo method was used to estimate the standard uncertainties of $k_1$ and $k_2$ and the correlation of $k_1$ and $k_2$.
Example: The effect of covariance

Results:

$$k_1 = 1, \quad k_2 = 1$$
$$u(k_1)/k_1 = 0.26, \quad u(k_2)/k_2 = 0.20$$
$$\rho(k_1, k_2) = -0.85$$

The correlation is negative because $N_{Q_1}$ appears in the denominator of $k_1$ and in the numerator of $k_2$.

Without the correlation term:

$$\frac{u_c(k_1 k_2)}{k_1 k_2} = 0.33$$

With the correlation term:

$$\frac{u_c(k_1 k_2)}{k_1 k_2} = 0.14$$

[Figure: scatter plot of k_2 against k_1.]
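The two relative uncertainties follow from the GUM 1995 formula for the product $k_1 k_2$. The sketch below redoes the arithmetic with the values from the results slide; it assumes that the correlation coefficient is negative, $\rho = -0.85$, which is the sign that reproduces the quoted 0.14. The variable names are illustrative.

```r
# Relative standard uncertainties and correlation from the results slide
rel_u1 <- 0.26
rel_u2 <- 0.20
rho <- -0.85   # assumed negative sign

# Relative combined standard uncertainty without the correlation term
without_corr <- sqrt(rel_u1^2 + rel_u2^2)

# Relative combined standard uncertainty with the correlation term
with_corr <- sqrt(rel_u1^2 + rel_u2^2 + 2 * rho * rel_u1 * rel_u2)

print(round(c(without = without_corr, with = with_corr), 2))
```

Ignoring the correlation more than doubles the estimated relative uncertainty in this example.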
The end
Appendix: Terminology

- coverage interval: interval containing the value of a quantity with a stated probability, based on the information available
- coverage probability: probability that the value of a quantity is contained within a specified coverage interval
- length of coverage interval: largest value minus smallest value in a coverage interval
- probabilistically symmetric coverage interval: coverage interval for a quantity such that the probability that the quantity is less than the smallest value in the interval is equal to the probability that the quantity is greater than the largest value in the interval
- shortest coverage interval: coverage interval for a quantity with the shortest length among all coverage intervals for that quantity having the same coverage probability
Appendix: Non-adaptive Monte Carlo method

Step-by-step procedure

1. select the number $M$ of Monte Carlo trials to be made
2. generate $M$ vectors, by sampling from the assigned PDFs, as realizations of the (set of $N$) input quantities
3. for each such vector, form the corresponding model value of $Y$, yielding $M$ model values
4. sort these $M$ model values into strictly increasing order, using the sorted model values to provide $G$
5. use $G$ to form an estimate $y$ of $Y$ and the standard uncertainty $u(y)$ associated with $y$
6. use $G$ to form an appropriate coverage interval for $Y$, for a stipulated coverage probability $p$.
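The six steps can be sketched in a few lines of R. This is an illustration under stated assumptions, not a reference implementation: it reuses the model $Y = X_1/X_2$ with $X_1 \sim N(40, 1^2)$ and $X_2 \sim N(20, 1^2)$ from the R example, picks $M$ and $p$ so that $pM$ and $(M - q)/2$ are integers, forms a probabilistically symmetric interval, and does not treat tied values specially.

```r
set.seed(42)

# Step 1: number of Monte Carlo trials
M <- 100000

# Step 2: sample the input quantities from their assigned distributions
x1 <- rnorm(M, mean = 40, sd = 1)
x2 <- rnorm(M, mean = 20, sd = 1)

# Step 3: evaluate the model for every trial
y <- x1 / x2

# Step 4: sort the model values (discrete representation G)
y_sorted <- sort(y)

# Step 5: estimate of Y and its standard uncertainty
y_est <- mean(y_sorted)
u_y <- sd(y_sorted)

# Step 6: probabilistically symmetric 95 % coverage interval
p <- 0.95
q <- round(p * M)      # p * M is an integer for these values
r <- (M - q) / 2       # also an integer here
ci <- c(low = y_sorted[r], high = y_sorted[r + q])

print(c(y = y_est, u = u_y, ci))
```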
Appendix: Conditions for the application of the MCM

Model: $Y = f(\mathbf{X}) = f(X_1, \dots, X_n)$

Conditions:

1. $f$ is continuous with respect to the elements $X_i$ of $\mathbf{X}$ in the neighborhood of the best estimates $x_i$ of the $X_i$
2. the distribution function for $Y$ is continuous and strictly increasing
3. the PDF for $Y$ is
   1. continuous over the interval for which this PDF is strictly positive
   2. unimodal (single-peaked)
   3. strictly increasing (or zero) to the left of the mode and strictly decreasing (or zero) to the right of the mode
4. $E[Y]$ and $\mathrm{Var}(Y)$ exist
5. a sufficiently large value of $M$ is used
Appendix: Evaluation of the model

Coverage interval: Let $q = pM$ if $pM$ is an integer. Otherwise, take $q$ to be the integer part of $pM + 1/2$. Then $[y_{\mathrm{low}}, y_{\mathrm{high}}]$ is a $100p$ % coverage interval for $Y$, where, for any $r = 1, \dots, M - q$, $y_{\mathrm{low}} = y_{(r)}$ and $y_{\mathrm{high}} = y_{(r+q)}$.

The probabilistically symmetric $100p$ % coverage interval is given by taking $r = (M - q)/2$ if $(M - q)/2$ is an integer, or the integer part of $(M - q + 1)/2$ otherwise.

The shortest $100p$ % coverage interval is given by determining $r^*$ such that, for $r = 1, \dots, M - q$,

$$y_{(r^* + q)} - y_{(r^*)} \le y_{(r + q)} - y_{(r)}$$
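The index rules above translate directly into a small R function. This is a sketch: the function name `coverage_interval`, its `type` argument, the tolerance used to decide whether $pM$ is an integer and the demo data are all hypothetical choices, and edge cases such as $q = M$ are not handled.

```r
# Coverage interval from Monte Carlo model values y, coverage probability p
coverage_interval <- function(y, p, type = c("symmetric", "shortest")) {
  type <- match.arg(type)
  y <- sort(y)
  M <- length(y)
  pM <- p * M
  # q = pM if pM is an integer, else the integer part of pM + 1/2
  q <- if (abs(pM - round(pM)) < 1e-9) round(pM) else floor(pM + 0.5)
  if (type == "symmetric") {
    half <- (M - q) / 2
    r <- if (half == floor(half)) half else floor((M - q + 1) / 2)
  } else {
    # Shortest interval: minimize y_(r+q) - y_(r) over r = 1, ..., M - q
    r_all <- seq_len(M - q)
    r <- which.min(y[r_all + q] - y[r_all])
  }
  c(low = y[r], high = y[r + q])
}

# M = 100 and p = 0.9 give the symmetric index pair (5, 95)
ci_sym <- coverage_interval(1:100, 0.9, "symmetric")

# Skewed demo data: the shortest interval differs from the symmetric one
y_demo <- c(0, 1, 2, 3, 10, 20, 30, 40)
ci_demo_sym <- coverage_interval(y_demo, 0.5, "symmetric")
ci_demo_short <- coverage_interval(y_demo, 0.5, "shortest")

print(rbind(ci_sym, ci_demo_sym, ci_demo_short))
```

For a symmetric distribution the two interval types nearly coincide; for a skewed one the shortest interval shifts toward the region where the values are densest.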
Appendix: Adaptive Monte Carlo method

Numerical tolerance:

- Let $n_{\mathrm{dig}}$ denote the number of significant digits regarded as meaningful in a numerical value $z$.
- Express $z$ in the form $c \times 10^{l}$, where $c$ is an $n_{\mathrm{dig}}$ decimal digit integer and $l$ an integer.
- The numerical tolerance $\delta$ associated with $z$ is given as $\delta = \frac{1}{2} 10^{l}$.

Step-by-step procedure

1. set $n_{\mathrm{dig}}$ to an appropriate small positive integer;
2. set $M = \max(J, 10^4)$, where $J$ is the smallest integer greater than or equal to $100/(1 - p)$;
3. set $h = 1$, denoting the first application of MCM in the sequence;
4. carry out $M$ Monte Carlo trials;
5. use the $M$ model values $y_1, \dots, y_M$ so obtained to calculate $y^{(h)}$, $u(y^{(h)})$, $y_{\mathrm{low}}^{(h)}$ and $y_{\mathrm{high}}^{(h)}$ for the $h$th member of the sequence;
6. if $h = 1$, increase $h$ by one and return to step 4;
7. calculate the standard deviation $s_y$ associated with the average of the estimates $y^{(1)}, \dots, y^{(h)}$ of $Y$, given by
   $$s_y^2 = \frac{1}{h(h-1)} \sum_{r=1}^{h} (y^{(r)} - \bar{y})^2, \quad \bar{y} = \frac{1}{h} \sum_{r=1}^{h} y^{(r)}$$
8. calculate the counterpart of this statistic for $u(y)$, $y_{\mathrm{low}}$ and $y_{\mathrm{high}}$;
9. use all $h \times M$ model values available so far to form $u(y)$;
10. calculate the numerical tolerance $\delta$ associated with $u(y)$;
11. if any of $2s_y$, $2s_{u(y)}$, $2s_{y_{\mathrm{low}}}$ and $2s_{y_{\mathrm{high}}}$ exceeds $\delta$, increase $h$ by one and return to step 4;
12. regard the computation as having stabilized, and use all $h \times M$ model values obtained to calculate $y$, $u(y)$, and a $100p$ % coverage interval.
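The adaptive procedure can be sketched as a loop over batches of $M$ trials. This is an illustration under several assumptions, not the JCGM 101:2008 reference algorithm: the function names, the `sampler`/`model` interface and the `max_batches` safeguard are hypothetical, the interval endpoints come from R's `quantile` (a probabilistically symmetric interval) rather than from the sorted-index rule, and the demo model is again $Y = X_1/X_2$.

```r
# Numerical tolerance: z = c * 10^l with c an n_dig-digit integer,
# delta = 0.5 * 10^l
numerical_tolerance <- function(z, n_dig) {
  l <- floor(log10(abs(z))) - (n_dig - 1)
  0.5 * 10^l
}

adaptive_mcm <- function(model, sampler, p = 0.95, n_dig = 2,
                         max_batches = 200) {
  M <- max(ceiling(100 / (1 - p)), 1e4)          # step 2
  probs <- c((1 - p) / 2, (1 + p) / 2)
  est <- NULL          # one row per batch: y, u(y), y_low, y_high
  y_all <- numeric(0)
  h <- 0
  repeat {
    h <- h + 1
    y <- model(sampler(M))                       # step 4
    est <- rbind(est, c(mean(y), sd(y),
                        quantile(y, probs, names = FALSE)))  # step 5
    y_all <- c(y_all, y)
    if (h == 1) next                             # step 6
    s <- apply(est, 2, sd) / sqrt(h)             # steps 7 and 8
    delta <- numerical_tolerance(sd(y_all), n_dig)  # steps 9 and 10
    # Step 11, with a hypothetical cap on the number of batches
    if (all(2 * s <= delta) || h >= max_batches) break
  }
  # Step 12: final results from all h * M model values
  list(y = mean(y_all), u = sd(y_all),
       interval = quantile(y_all, probs, names = FALSE),
       h = h, M = M, delta = delta)
}

set.seed(7)
sampler <- function(M) cbind(rnorm(M, 40, 1), rnorm(M, 20, 1))
model <- function(x) x[, 1] / x[, 2]
res <- adaptive_mcm(model, sampler)
print(res[c("y", "u", "h", "delta")])
```

With two significant digits in $u(y) \approx 0.11$ the tolerance is $\delta = 0.005$, so only a few batches are typically needed for this model.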