Least squares with non-normal data: estimating experimental variance functions


PERSPECTIVE (The Analyst)

Joel Tellinghuisen*

DOI: 10.1039/b708709h

Contrary to popular belief, the method of least squares (LS) does not require that the data have normally distributed (Gaussian) error for its validity. One practically important application of LS fitting that does not involve normal data is the estimation of data variance functions (VFE) from replicate statistics. If the raw data are normal, sampling estimates s² of the variance σ² are χ² distributed. For small degrees of freedom, the χ² distribution is strongly asymmetrical (exponential in the case of three replicates, for example). Monte Carlo computations for linear variance functions demonstrate that with proper weighting, the LS variance-function parameters remain unbiased, minimum-variance estimates of the true quantities. However, the parameters are strongly non-normal: almost exponential for some parameters estimated from s² values derived from three replicates, for example. Similar LS estimates of standard deviation functions (SDFE) from estimated s values have a predictable and correctable bias stemming from the bias inherent in s as an estimator of σ. Because s² and s have uncertainties proportional to their magnitudes, the VFE and SDFE fits require weighting as σ⁻⁴ and σ⁻², respectively. However, these weights must be evaluated on the calculated functions rather than directly from the sampling estimates. The computation is thus iterative but usually converges in a few cycles, with remaining weighting bias sufficiently small as to be of no practical consequence.

Introduction

The method of least squares (LS) is perhaps the most powerful and general-purpose data analysis tool in physical science. Yet its validity does rest on a number of premises about the data, which are generally impossible to confirm in practical cases.
Among these is the statement, often used disparagingly: "Least squares requires that the data have random, normally distributed errors, and because of experimental anomalies, most real data cannot be normal." Actually, that statement is not quite correct; but more importantly, the question of what effect non-normal data have on the outcome of an LS analysis has rarely been investigated. This paper addresses that question in the context of data that can be very far from normal: sampling estimates of variances. The practical need for such fitting is to determine data variance functions, in order to provide correct statistical weights for analyzing the data themselves. For the purpose of the present study, several replicates are taken for each of a number of (x, y) data points. We will see that sampling estimates (s²) of variances (σ²), though wildly non-normal in their distribution (exponential for three normally distributed replicates!), nonetheless yield unbiased, minimum-variance estimates of the LS parameters, at least in the idealized world of Monte Carlo (MC) computations, where we can be sure of the data structure and correct weights at the outset. On the other hand, if the same data are fitted as standard deviation (s) estimates of σ, the results exhibit predictable bias, because now the data themselves are biased, contradicting one of the basic LS premises.

The present study arose from a recent investigation of the effects of neglecting weights in calibration with heteroscedastic data (ref. 1). For minimum-variance estimation of the parameters, the data must be weighted inversely as their variance, w_i ∝ σ_i⁻². Yet, in ref. 1 it was noted (and not for the first time; ref. 6) that weighting as the inverse sampling variance, w_i ∝ s_i⁻², could actually be worse than neglecting weights altogether.

* Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
On the other hand, if the data variance depended in some simple way on the experimental parameters, and the sampling estimates were used collectively to estimate that dependence, then even approximate determination of the true variance function sufficed to yield near-minimum-variance estimates of the calibration parameters. These results left open the questions: (1) is it better to fit standard deviations or variances; (2) how should the s (s²) data be weighted in the variance function estimation (VFE) step in the real world, where true weights are not known in advance; and (3) how many replicates suffice? The last question was already answered partially in ref. 1, by the observation that even a crude determination of the data variance function led to <10% loss in precision in the desired calibration function.

This journal is © The Royal Society of Chemistry 2008. Analyst, 2008, 133.

The MC approach of this study is the same as that of ref. 1 and earlier works in a similar vein (refs. 7 and 8): assume simple functional forms for the variance function and the data response function, generate N equivalent data sets by adding random

normal (Gaussian) error to the true response function, and then analyze each data set and accumulate statistics for all such sets. The choice of response function is unimportant here, except that it must permit variation in the variance function. I have used just the linear function, y = f(x) = a + bx. For the data error, I have also confined my attention to just two simple two-parameter forms: σ² = A₁ + B₁y² and σ = A₂ + B₂y. All of these forms thus have the advantage of being linear in the adjustable parameters; and the error functions are reasonable for many experimental techniques, where constant error must dominate in the weak-signal limit, but where proportional error dominates for strong signal. Since the raw data for the estimation of the data error functions are the replicate-based estimates of s² or s, each of the N data sets consists of n_R replicates at each of n_x values of the independent variable (x). Thereby the reliability of the VFE fitting can be related to the manner in which the n_R × n_x values are distributed among x values and replicates.

Before getting into the details of the computations, it is useful to review the tenets of linear LS (refs. 2-5, 8). The independent variable x is assumed to be error-free. If the data for the dependent variable y are unbiased, with errors that are random, independent, and of finite variance, the LLS estimators will be unbiased and of finite variance. If the data are weighted as σ_i⁻², the LLS estimators will be minimum variance (which essentially means the best possible). When the data error structure is known a priori (as, for example, in MC computations), the parameter variances are also known exactly a priori; they are the diagonal elements of the variance-covariance matrix V_prior.
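As a concrete illustration of this data structure, the following sketch (in Python rather than the paper's FORTRAN; the function name, parameter values, and seed are illustrative, not the paper's) generates n_R replicates at each of n_x x values and forms the replicate means and sampling variances s²:

```python
import random

def make_replicates(a=0.1, b=10.0, A1=1.0, B1=4e-4,
                    n_x=6, n_r=3, x_max=10.0, seed=1):
    """Generate n_r replicate measurements at each of n_x evenly
    spaced x values, with true variance sigma^2 = A1 + B1*y^2.
    Returns a list of (x, mean, sampling variance s^2) tuples."""
    rng = random.Random(seed)
    data = []
    for i in range(n_x):
        x = x_max * i / (n_x - 1)
        y_true = a + b * x
        sigma = (A1 + B1 * y_true ** 2) ** 0.5
        reps = [y_true + rng.gauss(0.0, sigma) for _ in range(n_r)]
        ybar = sum(reps) / n_r
        s2 = sum((y - ybar) ** 2 for y in reps) / (n_r - 1)  # nu = n_r - 1
        data.append((x, ybar, s2))
    return data
```

The s² values returned here are the raw data for the VFE fits discussed below; with n_r = 3 each carries only ν = 2 degrees of freedom.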
If the data errors are normally distributed, then the LLS parameter estimates are normally distributed; and the sum of weighted, squared residuals, S = Σ(δ_i/σ_i)², is χ² distributed for ν degrees of freedom, where ν is the number of fitted data points minus the number of adjustable parameters.

Least squares and Monte Carlo computations

The method of least squares is described in refs 1-8, most of which include the compact matrix notation that I will use here to cover the bare essentials. In LLS the data and adjustable parameters can be expressed as

y = Xβ + δ, (1)

where the column vectors y and δ contain the n measured values of the dependent variable and the residuals, respectively, β contains the p adjustable parameters, and the design matrix X has n rows and p columns. The LS solution minimizes S = Σw_iδ_i² = δᵀWδ with respect to the p adjustable parameters, where the residuals δ_i = y_i − f(x_i), and the weight matrix W is diagonal, with W_ii = w_i. Minimum-variance estimation of the parameters β requires that w_i ∝ σ_i⁻². Unweighted LS (OLS) assumes that all σ_i are the same and uses w_i = 1, giving

b_u = (XᵀX)⁻¹Xᵀy ≡ A_u⁻¹Xᵀy ≡ V_u Xᵀy. (2)

The data variance is then estimated from the residuals, s² = S/(n − p), and the estimated parameter variances are the diagonal elements of the a posteriori variance-covariance matrix,

V_post = s²V_u. (3)

V_u is entirely determined by the x-structure of the data, so it is known from the outset. If the data variance is known to be σ², then

V_prior = σ²V_u, (4)

a result which is both known at the outset and exact. This is a special case of that for data of non-constant σ,

V_prior = (XᵀWX)⁻¹ ≡ A_W⁻¹, (5)

where w_i = σ_i⁻². If the σ_i are known in only a relative sense, we must again resort to a posteriori estimates,

V_post = (S/ν) A_W⁻¹. (6)

Then the prefactor is known as the variance for data of unit weight.
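For the straight-line model used throughout, eqns (2)-(5) reduce to 2 × 2 normal equations that can be solved in closed form. A minimal weighted-LS sketch (the helper name is mine, not the paper's):

```python
def wls_line(x, y, w):
    """Weighted LS fit of y = a + b*x with weights w_i = 1/sigma_i^2.
    Returns (a, b) and the covariance matrix (X^T W X)^{-1} of eqn (5),
    as ((Vaa, Vab), (Vab, Vbb))."""
    Sw  = sum(w)
    Sx  = sum(wi * xi for wi, xi in zip(w, x))
    Sy  = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    D = Sw * Sxx - Sx * Sx          # determinant of X^T W X
    a = (Sxx * Sy - Sx * Sxy) / D
    b = (Sw * Sxy - Sx * Sy) / D
    V = ((Sxx / D, -Sx / D), (-Sx / D, Sw / D))
    return a, b, V
```

With w_i = 1 throughout, this is the OLS solution of eqn (2); with true weights it gives the exact V_prior of eqn (5).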
However, unless the σ_i are known to within a constant factor, eqn (6) is invalid and may be either optimistic or pessimistic (ref. 1).

If the data error is normal and the data are properly weighted, the sampling estimates s² of the variance are distributed as a scaled χ² variate; thus they have the statistical properties of χ², which has mean ν and variance 2ν (ref. 9). In the reduced form χ²/ν, the probability distribution is

P(z) = C z^((ν−2)/2) exp(−νz/2), (7)

where C is a normalization constant and z = χ²/ν. The reduced χ² has mean 1 and variance 2/ν; accordingly, sampling variance estimates (s², V_post,ii) have relative standard deviation (2/ν)^(1/2); and from simple error propagation, estimated standard deviations and parameter errors have relative standard error (RSE) a factor of 2 smaller, or (2ν)^(−1/2). P(z) becomes Gaussian in the large-ν limit, but is far from normal and highly asymmetrical for the small ν of primary concern here (Fig. 1). The distribution of standard deviations is that for (χ²/ν)^(1/2) and can be obtained from P(z) by

P_y(y) dy = P_z(z) dz, (8)

with y = √z. Using either P_z(z) or P_y(y), we can verify that the average of y, ⟨y⟩ = ⟨√z⟩, is

⟨y⟩ = (2/ν)^(1/2) Γ((ν + 1)/2) / Γ(ν/2), (9)

where Γ is the gamma function (ref. 9). The value of ⟨y⟩ is <1 but approaches 1 for large ν. This is the bias in sampling estimates of standard deviations. (The difference between ⟨y⟩ and ⟨y²⟩^(1/2) is familiar to chemists in the context of gas kinetic theory, where the mean and rms molecular speeds differ.) Since both s and s² have standard deviations proportional to their magnitudes, LS fits of s and s² to data error functions should be weighted inversely as σ² and σ⁴, respectively. In MC computations, we know the true σ (σ²) by assignment.
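Eqn (9) is easy to evaluate numerically; a small sketch (the function name is mine), using log-gamma to avoid overflow at large ν:

```python
import math

def sd_bias(nu):
    """<s>/sigma for nu degrees of freedom, eqn (9):
    sqrt(2/nu) * Gamma((nu+1)/2) / Gamma(nu/2),
    computed via lgamma so large nu does not overflow."""
    return math.sqrt(2.0 / nu) * math.exp(
        math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0))
```

For three replicates (ν = 2) this evaluates to √π/2 ≈ 0.886, i.e. s underestimates σ by about 11% on average; the bias vanishes as ν grows.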
However, in dealing with actual data we have two choices, neither equivalent to this theoretical weighting: compute the weights from either the sampling estimates or the fitted functional approximation of σ (σ²). The latter approach requires iteration: the fitted function is not known at the outset, so estimates of the weights must be

revised following the fit. This procedure typically converges adequately within ca. 4 cycles.

Fig. 1 Probability density functions (unnormalized) for χ²/ν (var) and its square root (sd), for the indicated numbers of degrees of freedom ν.

The Monte Carlo computations were done with FORTRAN programs and usually involved N = 10⁵ data sets. Simple statistics and binning were done on the fly to avoid having to store and post-process large data sets. Thus, for example, from running sums of the parameters and their squares, one can compute the averages and variances at the end of the run, from, e.g., ⟨a⟩ = Σa_i/N and s_a² = ⟨a²⟩ − ⟨a⟩². For assessing the statistical significance of apparent bias, the metric for parameters is the standard error divided by √N, while that for the RSEs is (2N)^(−1/2), or 0.224% for N = 10⁵. Thus exact agreement with predictions in the MC sense means parameter disparities smaller than this metric and standard deviations within 0.22%, for 68% confidence.

σ₂ = A₂ + B₂y (11b)

In a formal sense one can show that fitting s² to σ₂² and s to σ₁ yields the respective parameters with identical standard errors. However, the latter forms are non-linear, and that, combined with the relatively large uncertainties of the parameters (up to 100% for three replicates), gives many divergences in the MC runs. Thus I have used just the two, slightly inequivalent functions of eqns (11) for the present MC experiments (getting at most ca. 50 divergences, from the iterative weighting cycle, in 10⁵ data sets). Table 1 summarizes results for a response function having six evenly spaced x values with three replicates at each. Of all the tabulated results, only those for fitting s² with theoretical weighting and calculated weighting (B₂ only) are statistically consistent with the true parameter values and predictions from V_prior.
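The on-the-fly accumulation just described can be sketched as follows (an illustrative class, not the original FORTRAN):

```python
class RunningStats:
    """On-the-fly accumulation of <a> = sum(a_i)/N and
    s_a^2 = <a^2> - <a>^2, as used in the MC runs, without
    storing the individual values."""
    def __init__(self):
        self.n = 0
        self.s1 = 0.0   # running sum of a_i
        self.s2 = 0.0   # running sum of a_i^2
    def add(self, a):
        self.n += 1
        self.s1 += a
        self.s2 += a * a
    def mean(self):
        return self.s1 / self.n
    def variance(self):
        m = self.mean()
        return self.s2 / self.n - m * m
```

One caveat: the naive sum-of-squares form can lose precision when the variance is tiny compared with the mean; Welford's one-pass update is the usual remedy, though for MC statistics of this kind the simple sums normally suffice.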
Results and discussion

To demonstrate the statistical properties of fitting variances and standard deviations to functional forms, I have chosen models for the response function and its error close to those explored in ref. 1, specifically

y = a + bx; a = 0.1; b = 10; 0 ≤ x ≤ 10 (10)

for the response function, and

σ₁² = A₁ + B₁y² (11a)

for the data error.

Table 1 Monte Carlo results (N = 10⁵) for fitting data variance and standard deviation to the model of eqn (10) and eqn (11), with six x values and three replicates at each.ᵃ [The tabulated entries (parameter values, standard errors,ᵇ and S/ν under theoretical, observed, and calculated weighting,ᶜ for both the s² and the sᵈ fits) are not recoverable from this transcription.]

ᵃ Values in bold are the only ones in satisfactory agreement with theory and predictions. ᵇ MC results are for single runs and are given to slightly higher precision than warranted for N = 10⁵. Exact values of standard errors are from V_prior. ᶜ Weightings: theoretical employs true variances from the model; observed uses sampling estimates directly; calculated uses the fitted curve for each data set, with iteration to convergence. ᵈ Correcting the sampling s values for their known bias [here (√π)/2 ≈ 0.886] brings the parameter values into statistical consistency but raises the parameter standard errors above their theoretical values.

Also noteworthy: (1) observed weighting gives catastrophic biases for the important proportionality constants B₁ and B₂; (2) not even theoretical weighting gives agreement in

the fitting of s to eqn (11a); and (3) even for theoretically weighted fitting of s², the standard deviation of S/ν is wrong. The first of these amplifies previous concerns about basing weights on sampling estimates of variance. The second stems from the bias inherent in s, which, if corrected in the data, yields correct parameter values (but too-large parameter standard errors). The third failure is a reminder that the χ² distribution for S and s² is contingent upon the data having normally distributed error. While the raw data are normal by design, the s² estimates which constitute the data for VFE are themselves χ² distributed, which means exponentially distributed for ν = 2 [eqn (7)].

While exponentially distributed data are unusual in LS fitting, a truly surprising outcome is the nearly exponentially distributed results for the fitted values of two of the constants included in Fig. 2. At first reflection, this seems in conflict with the central limit theorem, according to which averages should become normal in the limit of large n. (As a simple example of this, and an easy way to generate normal error for synthetic data, the sum of 12 uniform deviates is very nearly normal, with mean 6 and standard deviation 1.) On reflection, it is clear that the result for A₁ stems from the very strong weighting for the lowest-y variance, meaning that A₁ is effectively determined by just that one point, which itself is χ² distributed with ν = 2. The very strongly biased B₁ and B₂ from observed weighting are harder to explain, but presumably both the distribution and the bias stem from the dominance of anomalously small s² values (which thus are strongly overweighted). None of the S distributions in Fig. 2B approximates the χ² distribution for ν = 4 expected for normal data. The distributions from the fitting of s, shown in Fig. 3, though more biased than those from fitting s², are narrower and more symmetrically distributed.
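The sum-of-12-uniforms recipe in the parenthetical above is easy to verify numerically; a minimal sketch (seed and sample size are arbitrary):

```python
import random

def gauss12(rng):
    """Approximate N(0,1) deviate: sum of 12 U(0,1) deviates minus 6.
    Each U(0,1) has mean 1/2 and variance 1/12, so the sum has mean 6
    and variance exactly 1."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(7)
draws = [gauss12(rng) for _ in range(100000)]
m = sum(draws) / len(draws)
v = sum(d * d for d in draws) / len(draws) - m * m
# m should be near 0 and v near 1
```

The resulting distribution is bounded on (−6, 6) and has slightly thin tails, but for generating synthetic normal error it is more than adequate.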
This is mainly just a consequence of the transformation of eqn (8). None of the S distributions (not shown) resembles the χ² distribution. It is interesting that some of the MC parameter errors are actually smaller than the corresponding exact values, seemingly in conflict with the LS minimum-variance guarantee. However, in each such case the bias in the MC parameter value is appreciable, and when the bias error is included in the variance computation, the MC result no longer undershoots the exact value.

Fig. 2 Histogrammed results of variance fitting summarized in Table 1, for A₁ and B₁ (A) and S/ν (B), under different weightings: theoretical (TW), observed (OW) and calculated (CW). In A the argument X is the (observed − true) difference divided by the exact standard error (TW). The solid curve in A is a declining exponential fitted to all but the first displayed point for A₁ (TW). The solid curve in B is the reduced χ² distribution [eqn (7)] for ν = 4, fitted to the CW data.

Fig. 3 Histogrammed results from standard deviation fitting summarized in Table 1. The argument is as defined in Fig. 2; the bin width is 0.25, but to avoid confusion, not all points are shown. Results for calculated weighting (not shown) are very similar to those for theoretical weighting.

The occurrence of bias in the estimated parameters for observed and calculated

weighting is interesting, because in LLS the use of incorrect weights leads to bias in the parameter error estimates but not in the parameters themselves (ref. 8). The source of bias here is the variation in the weights from data set to data set and has nothing to do with the fitting of non-normal data. This was confirmed by MC computations in which the s² values were fitted using weights that were incorrect but the same for all data sets.

Some analysts choose to express their variance functions and weights in terms of the independent variable x rather than the dependent variable y [eqns (11)], so it is worth asking whether this choice has any bearing on the results for calculated weighting in Table 1. The answer is: none at all for the 2% proportional error involved here. When this error is increased ten-fold, there is a slight statistical discrimination against the results for the y-based functions, from the set-to-set variability in the averaged y values used in eqns (11). However, this difference is of no practical significance.

The choice of only three replicates was designed to emphasize the anomalies. To check the dependence on the structure of the data set, I conducted further MC calculations on variance fitting with weighting on the calculated curve, for varying numbers of calibration points and replicates. To make the constant contribution to the variance significant over a larger range of x, I increased the constant A₁ by a factor of 100 for these tests. Results in Fig. 4 show statistically significant bias in both constants for most probed data structures. The MC standard errors on A₁ and B₁ (not shown) are incremented over their theoretical values by amounts that roughly track the biases, amounting to at most 25% for A₁ and 8% for B₁. The theoretical standard errors for both parameters decrease in the order of the displayed points 5-8 in Fig. 4, by about 30% for A₁ and 20% for B₁.
In part, this trend is due to the increasing total number of degrees of freedom with fewer x values; in part it reflects the greater precision inherent in using more points near the extremes of the range of x (ref. 10). For the intended purpose of VFE (to obtain weights for the fit of the data to the response function) neither the biases nor these precision dependences are significant.

Fig. 4 Bias in variance function coefficients for different data structures. The first four pairs of entries are for six x values and 3, 4, 6, and 10 replicates, respectively. The last four are for ca. 24 total points each, with data values and replicates apportioned as (n_x × n_R): 8 × 3, 5 × 5, 4 × 6, 3 × 8, in order of points 5-8. For these tests, A₁ was increased by a factor of 100 (see text). The error bars indicate the MC precision.

Conclusion

The method of least squares does not require normal data for validity: as long as the data are of finite variance and unbiased, LLS parameter estimates will be of finite variance and unbiased; and they will also be minimum-variance estimates if the data are weighted inversely as their variances. However, biased data produce biased parameter estimates, and non-normal data give non-normal parameter distributions and non-χ²-distributed sums of squared residuals. As a consequence it may be difficult to assign confidence limits and conduct goodness-of-fit tests. In the particular application under study here, variance function estimation (VFE), the data can be far from normal, and this can translate into strongly non-normal parameter distributions: nearly exponential for some parameters estimated from three replicates at each point. However, such problems are of little importance in this application, since the outcome of the analysis is a means to an end, namely the proper weighting of the data themselves. The primary need for data variance functions in analytical chemistry is in the LS determination of calibration functions.
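The calculated-weighting VFE procedure described earlier can be sketched in outline as follows (a minimal Python illustration, not the original FORTRAN; the function name, the fixed iteration count, and the clamping guard against non-positive fitted variances are my choices):

```python
def fit_variance_function(ybar, s2, n_iter=4):
    """Fit sampling variances s2 to sigma^2 = A1 + B1*ybar^2 by
    weighted linear LS, with weights w = 1/sigma_hat^4 evaluated on
    the fitted curve and revised after each fit (calculated weighting).
    The first pass is unweighted, since the curve is not yet known."""
    t = [y * y for y in ybar]        # predictor: y^2 (model is linear in A1, B1)
    w = [1.0] * len(s2)
    A1 = B1 = 0.0
    for _ in range(n_iter):
        Sw  = sum(w)
        St  = sum(wi * ti for wi, ti in zip(w, t))
        Ss  = sum(wi * si for wi, si in zip(w, s2))
        Stt = sum(wi * ti * ti for wi, ti in zip(w, t))
        Sts = sum(wi * ti * si for wi, ti, si in zip(w, t, s2))
        D = Sw * Stt - St * St
        A1 = (Stt * Ss - St * Sts) / D
        B1 = (Sw * Sts - St * Ss) / D
        # re-evaluate weights on the calculated function: w = sigma^-4,
        # clamped to avoid division by a non-positive fitted variance
        w = [1.0 / max(A1 + B1 * ti, 1e-12) ** 2 for ti in t]
    return A1, B1
```

The same skeleton serves for the SDFE fit of s to A₂ + B₂y, with predictor y and weights 1/σ̂² instead of 1/σ̂⁴.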
Monte Carlo computational experiments on simple variance functions, resembling those likely to be encountered in actual data analysis situations, show that:

- The weights should be assessed using the fitted function rather than the raw s² (s) values. This renders the fit iterative, but convergence is typically quick.
- The variance function can be estimated with adequate precision and tolerable bias from as few as three replicates taken at each of six data values.
- It makes no difference whether the variance function is expressed in terms of the dependent variable y or the independent variable x (though the former is theoretically more sensible).
- It makes little difference whether s² or s is fitted; although fitting s yields biased parameters, the bias is correctable and anyway has negligible effect on the data fit for which the VFE or SDFE is needed.

The foregoing is premised upon the existence of a simple relation between the data error and the experimental variables, like eqns (11). Such relations must surely exist in all experimental techniques, e.g. dependence upon wavelength and absorbance in spectrophotometry (ref. 11), or signal level and titrant injection volume in isothermal titration calorimetry (ref. 12). If this dependence cannot be perceived by the analyst, and heteroscedasticity still seems evident in the data, then considerably more than three replicates will be needed at each point to provide reliable data weights. Use of nine replicates gave ca. 10% loss in precision in the calibration tests in ref. 1.

The need to weight the data in VFE/SDFE stems from the inherent properties

of sampling estimates of variances, which have the statistical properties of χ². All estimated variances and standard deviations have proportional error, or uncertainty that is proportional to their magnitude. The proportionality constant is readily calculated from the degrees of freedom: (2/ν)^(1/2) for estimated variances and (2ν)^(−1/2) for estimated standard deviations. The fact that this uncertainty in the uncertainty is so easily calculated makes it all the more surprising that it is so widely neglected by data analysts in the physical sciences.

References

1 J. Tellinghuisen, Analyst, 2007, 132.
2 A. M. Mood and F. A. Graybill, Introduction to the Theory of Statistics, McGraw-Hill, New York, 2nd edn.
3 W. C. Hamilton, Statistics in Physical Science: Estimation, Hypothesis Testing, and Least Squares, The Ronald Press Co., New York.
4 W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes, Cambridge University Press, Cambridge, UK.
5 R. N. Draper and H. Smith, Applied Regression Analysis, Wiley, New York, 3rd edn.
6 R. J. Carroll and D. Ruppert, Transformation and Weighting in Regression, Chapman and Hall, New York.
7 J. Tellinghuisen, J. Phys. Chem. A, 2000, 104.
8 J. Tellinghuisen, J. Chem. Educ., 2005, 82.
9 M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York.
10 N. Francois, B. Govaerts and B. Boulanger, Chemom. Intell. Lab. Syst., 2004, 74.
11 J. Tellinghuisen, Appl. Spectrosc., 2000, 54.
12 J. Tellinghuisen, Anal. Biochem., 2005, 343.


More information

Time-Series Cross-Section Analysis

Time-Series Cross-Section Analysis Time-Series Cross-Section Analysis Models for Long Panels Jamie Monogan University of Georgia February 17, 2016 Jamie Monogan (UGA) Time-Series Cross-Section Analysis February 17, 2016 1 / 20 Objectives

More information

Uncertainty due to Finite Resolution Measurements

Uncertainty due to Finite Resolution Measurements Uncertainty due to Finite Resolution Measurements S.D. Phillips, B. Tolman, T.W. Estler National Institute of Standards and Technology Gaithersburg, MD 899 Steven.Phillips@NIST.gov Abstract We investigate

More information

Variable Selection and Model Building

Variable Selection and Model Building LINEAR REGRESSION ANALYSIS MODULE XIII Lecture - 37 Variable Selection and Model Building Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur The complete regression

More information

1 Random and systematic errors

1 Random and systematic errors 1 ESTIMATION OF RELIABILITY OF RESULTS Such a thing as an exact measurement has never been made. Every value read from the scale of an instrument has a possible error; the best that can be done is to say

More information

Monte Carlo Simulations

Monte Carlo Simulations Monte Carlo Simulations What are Monte Carlo Simulations and why ones them? Pseudo Random Number generators Creating a realization of a general PDF The Bootstrap approach A real life example: LOFAR simulations

More information

Statistical Analysis of Engineering Data The Bare Bones Edition. Precision, Bias, Accuracy, Measures of Precision, Propagation of Error

Statistical Analysis of Engineering Data The Bare Bones Edition. Precision, Bias, Accuracy, Measures of Precision, Propagation of Error Statistical Analysis of Engineering Data The Bare Bones Edition (I) Precision, Bias, Accuracy, Measures of Precision, Propagation of Error PRIOR TO DATA ACQUISITION ONE SHOULD CONSIDER: 1. The accuracy

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some

More information

interval forecasting

interval forecasting Interval Forecasting Based on Chapter 7 of the Time Series Forecasting by Chatfield Econometric Forecasting, January 2008 Outline 1 2 3 4 5 Terminology Interval Forecasts Density Forecast Fan Chart Most

More information

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO LECTURE NOTES FYS 4550/FYS9550 - EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I PROBABILITY AND STATISTICS A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO Before embarking on the concept

More information

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester Physics 403 Propagation of Uncertainties Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Maximum Likelihood and Minimum Least Squares Uncertainty Intervals

More information

An Introduction to Error Analysis

An Introduction to Error Analysis An Introduction to Error Analysis Introduction The following notes (courtesy of Prof. Ditchfield) provide an introduction to quantitative error analysis: the study and evaluation of uncertainty in measurement.

More information

Part I. Experimental Error

Part I. Experimental Error Part I. Experimental Error 1 Types of Experimental Error. There are always blunders, mistakes, and screwups; such as: using the wrong material or concentration, transposing digits in recording scale readings,

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

The Use of Large Intervals in Finite- Difference Equations

The Use of Large Intervals in Finite- Difference Equations 14 USE OF LARGE INTERVALS IN FINITE-DIFFERENCE EQUATIONS up with A7! as a free parameter which can be typed into the machine as occasion demands, no further information being needed. This elaboration of

More information

1 The problem of survival analysis

1 The problem of survival analysis 1 The problem of survival analysis Survival analysis concerns analyzing the time to the occurrence of an event. For instance, we have a dataset in which the times are 1, 5, 9, 20, and 22. Perhaps those

More information

1 Measurement Uncertainties

1 Measurement Uncertainties 1 Measurement Uncertainties (Adapted stolen, really from work by Amin Jaziri) 1.1 Introduction No measurement can be perfectly certain. No measuring device is infinitely sensitive or infinitely precise.

More information

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1 Regression diagnostics As is true of all statistical methodologies, linear regression analysis can be a very effective way to model data, as along as the assumptions being made are true. For the regression

More information

Protean Instrument Dutchtown Road, Knoxville, TN TEL/FAX:

Protean Instrument Dutchtown Road, Knoxville, TN TEL/FAX: Application Note AN-0210-1 Tracking Instrument Behavior A frequently asked question is How can I be sure that my instrument is performing normally? Before we can answer this question, we must define what

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions ASTR 511/O Connell Lec 6 1 STATISTICS OF OBSERVATIONS & SAMPLING THEORY References: Bevington Data Reduction & Error Analysis for the Physical Sciences LLM: Appendix B Warning: the introductory literature

More information

ON REPLICATION IN DESIGN OF EXPERIMENTS

ON REPLICATION IN DESIGN OF EXPERIMENTS ON REPLICATION IN DESIGN OF EXPERIMENTS Bianca FAGARAS 1), Ingrid KOVACS 1), Anamaria OROS 1), Monica RAFAILA 2), Marina Dana TOPA 1), Manuel HARRANT 2) 1) Technical University of Cluj-Napoca, Str. Baritiu

More information

Error Analysis in Experimental Physical Science Mini-Version

Error Analysis in Experimental Physical Science Mini-Version Error Analysis in Experimental Physical Science Mini-Version by David Harrison and Jason Harlow Last updated July 13, 2012 by Jason Harlow. Original version written by David M. Harrison, Department of

More information

The Monte Carlo method what and how?

The Monte Carlo method what and how? A top down approach in measurement uncertainty estimation the Monte Carlo simulation By Yeoh Guan Huah GLP Consulting, Singapore (http://consultglp.com) Introduction The Joint Committee for Guides in Metrology

More information

Multicollinearity and A Ridge Parameter Estimation Approach

Multicollinearity and A Ridge Parameter Estimation Approach Journal of Modern Applied Statistical Methods Volume 15 Issue Article 5 11-1-016 Multicollinearity and A Ridge Parameter Estimation Approach Ghadban Khalaf King Khalid University, albadran50@yahoo.com

More information

MS&E 226. In-Class Midterm Examination Solutions Small Data October 20, 2015

MS&E 226. In-Class Midterm Examination Solutions Small Data October 20, 2015 MS&E 226 In-Class Midterm Examination Solutions Small Data October 20, 2015 PROBLEM 1. Alice uses ordinary least squares to fit a linear regression model on a dataset containing outcome data Y and covariates

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom Central Limit Theorem and the Law of Large Numbers Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Understand the statement of the law of large numbers. 2. Understand the statement of the

More information

arxiv:hep-ex/ v1 2 Jun 2000

arxiv:hep-ex/ v1 2 Jun 2000 MPI H - V7-000 May 3, 000 Averaging Measurements with Hidden Correlations and Asymmetric Errors Michael Schmelling / MPI for Nuclear Physics Postfach 03980, D-6909 Heidelberg arxiv:hep-ex/0006004v Jun

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture V (12.11.07) Contents: Central Limit Theorem Uncertainties: concepts, propagation and properties Central Limit Theorem Consider the sum X of n independent variables,

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski

More information

Optimisation of multi-parameter empirical fitting functions By David Knight

Optimisation of multi-parameter empirical fitting functions By David Knight 1 Optimisation of multi-parameter empirical fitting functions By David Knight Version 1.02, 1 st April 2013. D. W. Knight, 2010-2013. Check the author's website to ensure that you have the most recent

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical

More information

Uncertainty and Graphical Analysis

Uncertainty and Graphical Analysis Uncertainty and Graphical Analysis Introduction Two measures of the quality of an experimental result are its accuracy and its precision. An accurate result is consistent with some ideal, true value, perhaps

More information

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to

More information

FAQ: Linear and Multiple Regression Analysis: Coefficients

FAQ: Linear and Multiple Regression Analysis: Coefficients Question 1: How do I calculate a least squares regression line? Answer 1: Regression analysis is a statistical tool that utilizes the relation between two or more quantitative variables so that one variable

More information

Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models

Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models Benjamin Peherstorfer a,, Boris Kramer a, Karen Willcox a a Department of Aeronautics

More information

1 Using standard errors when comparing estimated values

1 Using standard errors when comparing estimated values MLPR Assignment Part : General comments Below are comments on some recurring issues I came across when marking the second part of the assignment, which I thought it would help to explain in more detail

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Estimation and Confidence Intervals for Parameters of a Cumulative Damage Model

Estimation and Confidence Intervals for Parameters of a Cumulative Damage Model United States Department of Agriculture Forest Service Forest Products Laboratory Research Paper FPL-RP-484 Estimation and Confidence Intervals for Parameters of a Cumulative Damage Model Carol L. Link

More information

ASA Section on Survey Research Methods

ASA Section on Survey Research Methods REGRESSION-BASED STATISTICAL MATCHING: RECENT DEVELOPMENTS Chris Moriarity, Fritz Scheuren Chris Moriarity, U.S. Government Accountability Office, 411 G Street NW, Washington, DC 20548 KEY WORDS: data

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

Statistics for the LHC Lecture 1: Introduction

Statistics for the LHC Lecture 1: Introduction Statistics for the LHC Lecture 1: Introduction Academic Training Lectures CERN, 14 17 June, 2010 indico.cern.ch/conferencedisplay.py?confid=77830 Glen Cowan Physics Department Royal Holloway, University

More information

Basic Analysis of Data

Basic Analysis of Data Basic Analysis of Data Department of Chemical Engineering Prof. Geoff Silcox Fall 008 1.0 Reporting the Uncertainty in a Measured Quantity At the request of your supervisor, you have ventured out into

More information

ON THE CONSEQUENCES OF MISSPECIFING ASSUMPTIONS CONCERNING RESIDUALS DISTRIBUTION IN A REPEATED MEASURES AND NONLINEAR MIXED MODELLING CONTEXT

ON THE CONSEQUENCES OF MISSPECIFING ASSUMPTIONS CONCERNING RESIDUALS DISTRIBUTION IN A REPEATED MEASURES AND NONLINEAR MIXED MODELLING CONTEXT ON THE CONSEQUENCES OF MISSPECIFING ASSUMPTIONS CONCERNING RESIDUALS DISTRIBUTION IN A REPEATED MEASURES AND NONLINEAR MIXED MODELLING CONTEXT Rachid el Halimi and Jordi Ocaña Departament d Estadística

More information

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance

More information

Data Analysis for University Physics

Data Analysis for University Physics Data Analysis for University Physics by John Filaseta orthern Kentucky University Last updated on ovember 9, 004 Four Steps to a Meaningful Experimental Result Most undergraduate physics experiments have

More information

A process capability index for discrete processes

A process capability index for discrete processes Journal of Statistical Computation and Simulation Vol. 75, No. 3, March 2005, 175 187 A process capability index for discrete processes MICHAEL PERAKIS and EVDOKIA XEKALAKI* Department of Statistics, Athens

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Estimating and Testing the US Model 8.1 Introduction

Estimating and Testing the US Model 8.1 Introduction 8 Estimating and Testing the US Model 8.1 Introduction The previous chapter discussed techniques for estimating and testing complete models, and this chapter applies these techniques to the US model. For

More information

Wright-Fisher Models, Approximations, and Minimum Increments of Evolution

Wright-Fisher Models, Approximations, and Minimum Increments of Evolution Wright-Fisher Models, Approximations, and Minimum Increments of Evolution William H. Press The University of Texas at Austin January 10, 2011 1 Introduction Wright-Fisher models [1] are idealized models

More information

Bayesian vs frequentist techniques for the analysis of binary outcome data

Bayesian vs frequentist techniques for the analysis of binary outcome data 1 Bayesian vs frequentist techniques for the analysis of binary outcome data By M. Stapleton Abstract We compare Bayesian and frequentist techniques for analysing binary outcome data. Such data are commonly

More information

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015 Probability Refresher Kai Arras, University of Freiburg Winter term 2014/2015 Probability Refresher Introduction to Probability Random variables Joint distribution Marginalization Conditional probability

More information

Howard Mark and Jerome Workman Jr.

Howard Mark and Jerome Workman Jr. Linearity in Calibration: How to Test for Non-linearity Previous methods for linearity testing discussed in this series contain certain shortcomings. In this installment, the authors describe a method

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA

Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA Statistical Applications in the Astronomy Literature II Jogesh Babu Center for Astrostatistics PennState University, USA 1 The likelihood ratio test (LRT) and the related F-test Protassov et al. (2002,

More information

A test for improved forecasting performance at higher lead times

A test for improved forecasting performance at higher lead times A test for improved forecasting performance at higher lead times John Haywood and Granville Tunnicliffe Wilson September 3 Abstract Tiao and Xu (1993) proposed a test of whether a time series model, estimated

More information

Point-to-point response to reviewers comments

Point-to-point response to reviewers comments Point-to-point response to reviewers comments Reviewer #1 1) The authors analyze only one millennial reconstruction (Jones, 1998) with the argument that it is the only one available. This is incorrect.

More information

Analysis of Type-II Progressively Hybrid Censored Data

Analysis of Type-II Progressively Hybrid Censored Data Analysis of Type-II Progressively Hybrid Censored Data Debasis Kundu & Avijit Joarder Abstract The mixture of Type-I and Type-II censoring schemes, called the hybrid censoring scheme is quite common in

More information

Testing composite hypotheses applied to AR-model order estimation; the Akaike-criterion revised

Testing composite hypotheses applied to AR-model order estimation; the Akaike-criterion revised MODDEMEIJER: TESTING COMPOSITE HYPOTHESES; THE AKAIKE-CRITERION REVISED 1 Testing composite hypotheses applied to AR-model order estimation; the Akaike-criterion revised Rudy Moddemeijer Abstract Akaike

More information

Computer Science Foundation Exam

Computer Science Foundation Exam Computer Science Foundation Exam May 6, 2016 Section II A DISCRETE STRUCTURES NO books, notes, or calculators may be used, and you must work entirely on your own. SOLUTION Question Max Pts Category Passing

More information

Food delivered. Food obtained S 3

Food delivered. Food obtained S 3 Press lever Enter magazine * S 0 Initial state S 1 Food delivered * S 2 No reward S 2 No reward S 3 Food obtained Supplementary Figure 1 Value propagation in tree search, after 50 steps of learning the

More information

Acceptable Ergodic Fluctuations and Simulation of Skewed Distributions

Acceptable Ergodic Fluctuations and Simulation of Skewed Distributions Acceptable Ergodic Fluctuations and Simulation of Skewed Distributions Oy Leuangthong, Jason McLennan and Clayton V. Deutsch Centre for Computational Geostatistics Department of Civil & Environmental Engineering

More information

Estimation of Effect Size From a Series of Experiments Involving Paired Comparisons

Estimation of Effect Size From a Series of Experiments Involving Paired Comparisons Journal of Educational Statistics Fall 1993, Vol 18, No. 3, pp. 271-279 Estimation of Effect Size From a Series of Experiments Involving Paired Comparisons Robert D. Gibbons Donald R. Hedeker John M. Davis

More information

Section 8.1: Interval Estimation

Section 8.1: Interval Estimation Section 8.1: Interval Estimation Discrete-Event Simulation: A First Course c 2006 Pearson Ed., Inc. 0-13-142917-5 Discrete-Event Simulation: A First Course Section 8.1: Interval Estimation 1/ 35 Section

More information

Solar neutrinos are the only known particles to reach Earth directly from the solar core and thus allow to test directly the theories of stellar evolu

Solar neutrinos are the only known particles to reach Earth directly from the solar core and thus allow to test directly the theories of stellar evolu Absence of Correlation between the Solar Neutrino Flux and the Sunspot Number Guenther Walther Dept. of Statistics, Stanford University, Stanford, CA 94305 Abstract There exists a considerable amount of

More information

Advanced Statistical Methods. Lecture 6

Advanced Statistical Methods. Lecture 6 Advanced Statistical Methods Lecture 6 Convergence distribution of M.-H. MCMC We denote the PDF estimated by the MCMC as. It has the property Convergence distribution After some time, the distribution

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

VHF Dipole effects on P-Band Beam Characteristics

VHF Dipole effects on P-Band Beam Characteristics VHF Dipole effects on P-Band Beam Characteristics D. A. Mitchell, L. J. Greenhill, C. Carilli, R. A. Perley January 7, 1 Overview To investigate any adverse effects on VLA P-band performance due the presence

More information