ON USING BOOTSTRAP APPROACH FOR UNCERTAINTY ESTIMATION

Size: px

Start display at page:

Download "ON USING BOOTSTRAP APPROACH FOR UNCERTAINTY ESTIMATION"

Tyler Richard
5 years ago
Views:

1 The First International Proficiency Testing Conference Sinaia, România 11 th 13 th October, 2007 ON USING BOOTSTRAP APPROACH FOR UNCERTAINTY ESTIMATION Grigore Albeanu University of Oradea, UNESCO Chair in Information Technologies, Universitatii Str., No. 1, , Oradea, Romania Abstract Computer-Intensive methods for estimation assessment provide valuable information concerning the adequacy of applied probabilistic models. The bootstrap method is an extensive computational approach to uncertainty estimation based on resampling and statistical estimation. It is a powerful tool, especially when only a small data set is used to predict the behaviour of systems or processes. This paper provides a methodology to investigate uncertainty evaluation by bootstrap and a procedure to obtain confidence bands for linear and nonlinear models used in data analysis and design measurements. Also statistical considerations for proficiency testing are given. Numerical examples and graphical visualizations will be shown for a case study. Key words Bootstrap method, Resampling methods, Data modelling, Sampling variance, Median, Z-scores, Confidence Intervals, Confidence bands, Proficiency Testing 1 INTRODUCTION Many modelling applications in social and engineering sciences give rise to an estimation process (parameters or functional relations). The numerical estimation process always depends on the previously data collection. Even data filtering and outlier s elimination procedures were applied, the ill-conditioned effect could appear. To increase the confidence on the estimations or to improve the estimates, some computer intensive methods were recommended. This paper describes the usage of the bootstrap approach for assessing the estimation process and to provide simultaneous confidence bands for the studied model. 221

2 2 THEORETICAL BACKGROUND Bootstrap is a simple but powerful Monte Carlo method to assess statistical accuracy or to estimate a distribution from sample's statistics. The name may come from phrase Pull yourself up by your bootstraps having the following meaning Improve your situation by your own efforts, as Phrase Finder [13] is able to find. The methods are suitable for any level of modelling being useful for fully parametric, semiparametric, and completely nonparametric analysis. These approaches are not only in use by statisticians, but also are applied anywhere statistics can be used: life sciences, social sciences, econometrics, reliability etc. For the aim of this paper we outline the basic bootstrap principle, the application of bootstrap sampling for accuracy estimation and the method of simultaneous confidence bands for uncertainty management. 2.1 The bootstrap method principle Let X be a random variable and F the cumulative distribution function of the variable X. The bootstrap method was proposed by Efron [5, 6, 7] and it can be used on the estimation of: the distribution function of a random variable R(X, F) ; a functional relation V(F), or the accuracy of a statistics s obtained from a sample (X 1, X 2,..., X n ) of size n from X. In the framework of this paper, the accuracy, describes the variability of s when independent estimations s(1), s(2),..., of the statistics s, are obtained by resampling. The bootstrap technique uses the sample (X 1, X 2,..., X n ) to obtain the sampling cumulative distribution function F n (x) in order to replace the true cumulative distribution function F: F n (x) = (1/n) cardinal { x i x; 1 i n }. To repeatedly simulate bootstrap samples X*:=(X 1 *, X 2 *,..., X n *) from F n, random number generators should be used according to the Monte Carlo approaches. Then, for each bootstrap sample, it is recalculated: the distribution function of the random variable R(X*, F n ) ; the functional relation V(F n ) or V(F n *); the statistics s*( ). The accuracy of the statistics s can be derived under an appropriate statistical inference study on the sequence s*(). The bootstrap resampling can be realised in various ways. Uniform resampling and the importance resampling are the mostly used. We will present, in detail, these resampling algorithms. Uniform resampling assumes that the measurement (observed) values are uniformly sampled from some process. The algorithm for uniform resampling consists in the following steps: 1. Initialise a uniform random number generator over (0,1), say random() ; 2. For k from 1 to n do {u := random(); j := [n u] + 1; X k * := X j }. 222

3 The importance resampling algorithm assumes the generation of sampling values according to a probability distribution {(x i, p i ): i = 1, 2,..., n} such as every p i is a nonnegative real number, and p 1 + p p n = 1. Using a method to generate random samples according to a given discrete distribution function (for more about random variable generation see [14]), the bootstrap sample is obtained in the following steps [1, 2]: 1. Initialise a uniform random number generator over (0,1); 2. For k from 1 to n do steps Let j := 0, and u := random(); 2.2. j := j + 1; 2.3. If u > p 1 + p p j then goto step 2.2; 2.4. X k * := X j. As a common example of the usage of the uniform resampling, we refer to the bootstrap algorithm for estimating standard errors. However, when some observations are more important then others, the importance resampling can provide close to real conclusions. If resampling is based on importance resampling weights, then the bootstrap estimates are reweighted as if uniform resampling is done. 2.2 Uncertainty estimation and bootstrap Terminology For a set (sample) of uncorrelated N random variables, x i, i = 1, 2,, N, let us denote by Mean[X] the sample mean given by: N 1 Mean[X] =. x i N i= 1 The median is the middle value of the group for a particular sample, i.e. half of the results for the sample are higher than it and half are lower (Q2). It is calculated from the sorted values (from lowest to highest). If N is even, the median is the average of the two central values. Let us denote this value by Median[X]. Interquartile range (IQR[X]) is the difference between the lower and upper quartiles = Q3 - Q1. The lower quartile is the value below a quarter of the results lie. Similarly, the upper quartile is the value above a quarter of the results lie. Normalised IQR (NIQR[X]) is a measure of the variability of the results which basically is a robust standard deviation. It is equal to the IQR[X] multiplied by the factor Robust CV (coefficient of variation) is equal to the NIQR[X] divided by the median, expressed as a percentage (i.e. multiplied by 100), it allows for the variability in different tests to be compared. An accepted statistical method for analysis test results in proficiency testing is to calculate a Z-score for each laboratory s result. The standard form for the calculation of Z-scores is 223

4 X i A[ X ] Z i =, B[ X ] where A[X] is the assigned value (sample mean), and B[X] is an estimate of the spread of all results (standard deviation). The classical approach based on mean and standard deviation is signifiquantly influenced by the presence of extreme values (outliers). Therefore, a robust approach based on median and interquantile range is better to be used [11, 16]. Robust Z-scores are calculated by replacing A[X] and B[X] in the classical Z-score by the median and NIQR, respectively. For proficiency testing both betweenlaboratory and within laboratory Z-scores can be used. The standardised sum (S) and the standardised difference (D) for the pair of results are: S = (A+B)/ 2 and D = (A-B)/ 2 (median of A is less than median of B) or D = (B-A)/ 2, otherwise. The between-laboratory Z-score (ZB) is the robust Z-score of S and the withinlaboratory Z-score (ZW) is the robust Z-score of D: and ZB = (S-Median[S])/NIQR(S), ZW = (D - Median(D))/NIQR(D). A methodology based on bootstrap approach can be used to study the robust Z- scores ZB and ZW Simultaneous bootstraping the MEAN and STDEV A Z-score can be computed repeatedly by simultaneous resampling in order to obtain a resample mean and standard deviance. The process is repeated by a number of steps and the final results can be obtained considering the best performance. An 80% approach can be used when a strong acceptance is required. Generally speaking, the best test could be based on a 51% approach Simultaneous bootstraping the sample MEDIAN and NIQR A robust Z-score can be computed repeatedly by simultaneous resampling in order to obtain a resample median and normalised interquantile range. The process is repeated by a number of steps and the final results can be obtained considering the best performance. Also, an 80% approach can be used when a strong acceptance is required. Generally speaking, the best test could be based on a 51% approach. 2.3 Simultaneous confidence bands The general framework for obtaining simultaneous confidence bands for non-linear explicit regression models using resampling techniques is presented in [1, 2, 3]. Let g+, g- be two known nonnegative models (real functions which may depend on the input data), and u+, u- be nonnegative scale factors. The confidence band for the model h (used to explain a relation such as y i = h(x i, θ) + ε i (i = 1, 2, ) is the region: R = {(x, y): x Input domain, h(x, est(θ)) - est(v)u - g - (x) y h(x, est(θ)) + est(v)u + g + (x) }, 224

5 where (x i, y i ) i 1 are observations, θ is a parameter vector, est(θ) is a least squares estimate of θ, (ε i ) i 1 are random variables distributed independently and identically with an unknown cumulative distribution function (cdf) F, but with mean zero and unknown variance σ 2, and est(v) is an estimate of σ. The reference [11] describes the application of this technique for both a parametric log-linear and a kernel estimation model in the normalised input domain (0,1). However, any input domain and any estimation model could be used to obtain an estimation h(x,.). Obtaining the region R asks for solving the following problem: given the models g-, g+, and a (0, 1), search for the non-negative real numbers u- and u+ such that P{h(x, est(θ)) - est(v)u - g - (x) h(x, θ) h(x, est(θ)) + est(v)u + g + (x); x in Input domain} = 1 - a. Various alternatives for choosing g - and g + there exist [1]. A common approach considers the models g - (x) = g + (x) = 1 for all x Input domain. Two approaches related to the scale factors u - and u + are important enough: u + = u - (symmetrically simultaneous confidence bands); or the sum u + + u - must have a minimal value. 2.4 Bootstrap for nonlinear regression and neural network modelling Regression analysis is one of the statistical tools that is frequently used in data analysis [1, 9]. Also designing neural networks for data analysis is a common activity among researchers. Not only nonlinear regression, also neural networks were successfully applied for the prediction of the most important cement quality characteristics, as Guslikov et al. [10] reports. Other successful stories are reported for case-control studies missing data [12] or for environmental risk assessment [15]. The combination of nonlinear regression / neural networks and bootstrap can be used to extract information from cement database. Applying the bootstrap approach a sequence of parameter s estimations and a classical and bootstrap confidence uncertainty analysis can be realised. Mean analysis Bootstrap Mean Figure 1 - Mean analysis by uniform resampling 225

6 Median analysis Bootstrap median Figure 2 - Median analysis by uniform resampling 3 EXPERIMENTAL RESULTS When considering the initial sample of a measurement task and apply a bootstrap resampling for a large moments of time, the bootstrap characteristics of the sample will oscillate between some values to be used in an the uncertainty analysis based on bootstrap intervals. The images shown by figures 1-5 illustrate the bootstrap characteristics of the above sample obtained in a uniform resampling approach. The following characteristics are illustrated: bootstrap mean, bootstrap median, bootstrap IQR, bootstrap N-IQR and bootstrap standard deviation. Figure 6 describes a bootstrap confidence obtained in a nonlinear regression model applied in software testing and reliability prediction according to data given in Table 1. It is easy to see that all 10 bootstrap functional belong to a band easily obtainable considering the domain range values (considering min - max values for all moments and functions). This will permit a bootstrap uncertainty analysis of model s parameters. 226

7 IQR analysis Bootstrap IQR Figure 3 - IQR analysis by uniform resampling N- IQR analysis Bootstrap N-IQR Figure 4 - N-IQR analysis by uniform resampling STDEV analysis Bootstrap STDEV Figure 5 - IQR analysis by uniform resampling 227

8 Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment. Many software reliability growth models are available and most of them assume that detected faults are immediately corrected [3]. Actually, this assumption may not be realistic in practice, and an uncertainty analysis has to be done based on different bootstrap variants as mentioned in [4, 8]. Functional analysis F(t), Fi(t), i=1,..., t F(t) F1(t) F2(t) F3(t) F4(t) F5(t) F6(t) F7(t) F8(t) F9(t) F10(t) Figure 6 - Software reliability growth modelling Table 1 - Collected data and Bootstrap Uniform Resampling Analysis t F(t) F 1 (t) F 2 (t) F 3 (t) F 4 (t) F 5 (t) F 6 (t) F 7 (t) F 8 (t) F 9 (t) F 10 (t)

9 4 CONCLUSIONS The paper provides a methodology to investigate uncertainty evaluation by bootstrap and describes a procedure to obtain confidence bands for linear and nonlinear models used in data analysis and design measurements. The approach can be used both for theoretical applications, and it may addresses practical applications for accuracy assessment, uncertainty analysis for all kind of models, including those based on neural networks. REFERENCES [1] Albeanu G.: The statistical and numerical analysis of some nonlinear regression models, PhD Disertation (in Romanian), Bucharest University, [2] Albeanu G.: Resampling Simultaneous Confidence Bands for Nonlinear Explicit Regression Models, Mathematical Reports, 50(5-6), , (1998). [3] Albeanu G. and Popentiu F.: On the Bootstrap Method: Software Reliability Assessment and Simultaneous Confidence Bands, Annals of Oradea University, Energetics Series, 7(1), , (2001). [4] Cowling A.; Hall P. and Phillips M.J.: Bootstrap Confidence Region for the Intensity of a Poisson Point Process, JASA, 91, , (1996). [5] Efron B.: Bootstrap Methods: Another Look at the Jacknife, Ann. Stat., 7, 1-27, (1979). [6] Efron B.; Tibshirani R.J.: An introduction to the bootstrap, Chapman and Hall, London, [7] Efron B., Computer-Intensive Methods in Statistical Regression, Siam Review, 30, 3, , (1988). [8] Frey H. C.; Burmaster D. E.: Methods for Characterizing Variability and Uncertainty: Comparison of Bootstrap Simulation and Likelihood-Based Approaches, Risk Analysis, 19, 1, , (1999). [9] Gonçalves S.; White H.: Bootstrap Standard Error Estimates for Linear Regression, JASA, 100, 471, , (2005). [10] Guslicov G.; Coarnă M.; Şerbănescu L.; Sorescu V.; Ispeciuc M.: Standard Strength of Cements Predicted via Computer Modeling, Key Engineering Materials, 264(-268), , (2004). [11] Jolliffe I.: Estimation of uncertainty in verification measures, International Verification Methods Workshop, The McGill Faculty Club, Montreal, Quebec, Canada, (2004). [12] Siersma V.; Johansen C.: The use of the bootstrap in the analysis of casecontrol studies missing data, Research Report - Institute of Public Health, University of Copenhagen, (2004). [13] The Phrase Finder: 2007, Meanings and Origins of Phrases, Sayings and Idioms, [14] Vaduva I.: Simulation models by computer, Technical Publishing House (in Romanian), Bucharest, (1977). [15] Verdonck F.; Jaworska J.; Thas O. and Vanrolleghem P.A.: Uncertainty Techniques in Environmental Risk Assessment, Med. Fac. Landbouw. Univ. Gent, 65/4, , (2000). [16] Web1: Testing Interlaboratory comparisons, Asia Pacific Laboratory Acreditation Cooperation, 229

Determining the Spread of a Distribution Variance & Standard Deviation

Determining the Spread of a Distribution Variance & Standard Deviation 1.3 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3 Lecture 3 1 / 32 Outline 1 Describing