Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv: )

Similar documents
Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Statistical Data Analysis Stat 5: More on nuisance parameters, Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters

Introduction to Likelihoods

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits

Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Statistics for the LHC Lecture 1: Introduction

Topics in Statistical Data Analysis for HEP Lecture 1: Bayesian Methods CERN-JINR European School of High Energy Physics Bautzen, June 2009

Statistics for the LHC Lecture 2: Discovery

Introductory Statistics Course Part II

A Bayesian Treatment of Linear Gaussian Regression

Introduction to Statistical Methods for High Energy Physics

Recent developments in statistical methods for particle physics

Some Statistical Tools for Particle Physics

Discovery significance with statistical uncertainty in the background estimate

Statistical Methods for Particle Physics Lecture 3: systematic uncertainties / further topics

Statistical Methods for Astronomy

Lectures on Statistical Data Analysis

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Minimum Power for PCL

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester

Statistical Methods in Particle Physics Day 4: Discovery and limits

Statistical Methods for Particle Physics (I)

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Stat 5101 Lecture Notes

Statistics and Data Analysis

Frequentist-Bayesian Model Comparisons: A Simple Example

Primer on statistics:

Physics 509: Propagating Systematic Uncertainties. Scott Oser Lecture #12

Parametric Techniques Lecture 3

Statistical Data Analysis 2017/18

Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Bayesian analysis in nuclear physics

Simple Linear Regression for the Climate Data

Some Topics in Statistical Data Analysis

Practical Statistics

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.

Statistical Methods in Particle Physics

Statistical techniques for data analysis in Cosmology

Multivariate statistical methods and data mining in particle physics

Measurement And Uncertainty

arxiv: v1 [physics.data-an] 24 Jul 2016

Bayesian methods in economics and finance

RWTH Aachen Graduiertenkolleg

E. Santovetti lesson 4 Maximum likelihood Interval estimation

You can compute the maximum likelihood estimate for the correlation

Statistical Methods for Particle Physics

Error analysis for efficiency

Parametric Techniques

Model Checking and Improvement

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Parameter estimation Conditional risk

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

Topics in statistical data analysis for high-energy physics

Advanced Statistical Methods. Lecture 6

Confidence Distribution

Asymptotic formulae for likelihood-based tests of new physics

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10

Bayesian Econometrics

Confidence Intervals. First ICFA Instrumentation School/Workshop. Harrison B. Prosper Florida State University

Modern Methods of Data Analysis - SS 2009

EPSE 594: Meta-Analysis: Quantitative Research Synthesis

Negative binomial distribution and multiplicities in p p( p) collisions

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

Chapter 3: Maximum-Likelihood & Bayesian Parameter Estimation (part 1)

arxiv: v3 [physics.data-an] 24 Jun 2013

Part 4: Multi-parameter and normal models

DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling

Inconsistency of Bayesian inference when the model is wrong, and how to repair it

Statistics 203: Introduction to Regression and Analysis of Variance Penalized models

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Modern Methods of Data Analysis - WS 07/08

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods

arxiv: v1 [physics.data-an] 2 Mar 2011

COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

2 Statistical Estimation: Basic Concepts

Stat 451 Lecture Notes Simulating Random Variables

arxiv: v1 [physics.data-an] 3 Jun 2008

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

COS513 LECTURE 8 STATISTICAL CONCEPTS

Multivariate Normal & Wishart

Statistical Methods in Particle Physics

Part 7: Hierarchical Modeling

Linear Models A linear model is defined by the expression

Bayesian Inference. STA 121: Regression Analysis Artin Armagan

PMR Learning as Inference

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

Bias Variance Trade-off

CPSC 540: Machine Learning

32. STATISTICS. 32. Statistics 1

Introduction to Bayesian Inference

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester

Statistics Challenges in High Energy Physics Search Experiments

Transcription:

Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv:1809.05778) Workshop on Advanced Statistics for Physics Discovery aspd.stat.unipd.it Department of Statistical Sciences, University of Padova, 24-25 Sep 2018 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 1

Outline Using measurements with known systematic errors: Least Squares (BLUE) Allowing for uncertainties in the systematic errors Estimates of sys errors ~ Gamma Single-measurement model Asymptotics, Bartlett correction Curve fitting, averages Confidence intervals Goodness-of-fit Sensitivity to outliers Discussion and conclusions Details in: G. Cowan, Statistical Models with Uncertain Error Parameters, arxiv:1809.05778 [physics.data-an] G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 2

Introduction Suppose measurements y have probability (density) P(y µ,θ), µ = parameters of interest θ = nuisance parameters To provide info on nuisance parameters, often treat their best estimates u as indep. Gaussian distributed r.v.s., giving likelihood or log-likelihood (up to additive const.) G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 3

Systematic errors and their uncertainty Often the θ i could represent a systematic bias and its best estimate u i in the real measurement is zero. The σ u,i are the corresponding systematic errors. Sometimes σ u,i is well known, e.g., it is itself a statistical error known from sample size of a control measurement. Other times the u i are from an indirect measurement, Gaussian model approximate and/or the σ u,i are not exactly known. Or sometimes σ u,i is at best a guess that represents an uncertainty in the underlying model ( theoretical error ). In any case we can allow that the σ u,i are not known in general with perfect accuracy. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 4

Gamma model for variance estimates Suppose we want to treat the systematic errors as uncertain, so let the σ u,i be adjustable nuisance parameters. Suppose we have estimates s i for σ u,i or equivalently v i = s i2, is an estimate of σ u,i2. Model the v i as independent and gamma distributed: Set α and β so that they give desired relative uncertainty r in σ u. Similar to method 2 in W.J. Browne and D. Draper, Bayesian Analysis, Volume 1, Number 3 (2006), 473-514. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 5

Distributions of v and s = v For α, β of gamma distribution, relative error on error G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 6

Likelihood for gamma error model Treated like data: y 1,...,y N u 1,...,u N (the primary measurements) (estimates of nuisance par.) v 1,...,v N (estimates of variances of estimates of NP) Parameters: µ 1,...,µ M (parameters of interest) θ 1,...,θ N (bias parameters) σ u,1,..., σ u,n (sys. errors = std. dev. of of NP estimates) G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 7

Profiling over systematic errors We can profile over the σ u,i in closed form which gives the profile log-likelihood (up to additive const.) In limit of small r i, v i σ u,i 2 and the log terms revert back to the quadratic form seen with known σ u,i. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 8

Equivalent likelihood from Student s t We can arrive at same likelihood by defining Since u i ~ Gauss and v i ~ Gamma, z i ~ Student s t with Resulting likelihood same as profile Lʹ(µ,θ) from gamma model G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 9

Single-measurement model As a simplest example consider y ~ Gauss(µ, σ 2 ), v ~ Gamma(α, β), Test values of µ with t µ = -2 ln λ(µ) with G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 10

Distribution of t µ From Wilks theorem, in the asymptotic limit we should find t µ ~ chi-squared(1). Here asymptotic limit means all estimators ~Gauss, which means r 0. For increasing r, clear deviations visible: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 11

Distribution of t µ (2) For larger r, breakdown of asymptotics gets worse: Values of r ~ several tenths are relevant so we cannot in general rely on asymptotics to get confidence intervals, p-values, etc. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 12

One can modify t µ defining Bartlett corrections such that the new statistic s distribution is better approximated by chi-squared for n d degrees of freedom (Bartlett, 1937). For this example E[t µ ] 1 + 3r 2 + 2r 4 works well: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 13

Bartlett corrections (2) Good agreement for r ~ several tenths out to t µʹ ~ several, i.e., good for significances of several sigma: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 14

68.3% CL confidence interval for µ G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 15

Curve fitting, averages Suppose independent y i ~ Gauss, i = 1,...,N, with µ are the parameters of interest in the fit function φ(x;µ), θ are bias parameters constrained by control measurements u i ~ Gauss(θ i, σ u,i ), so that if σ u,i are known we have G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 16

Profiling over θ i with known σ u,i Profiling over the bias parameters θ i for known σ u,i gives usual least-squares (BLUE) Widely used technique for curve fitting in Particle Physics. Generally in real measurement, u i = 0. Generalized to case of correlated y i and u i by summing statistical and systematic covariance matrices. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 17

Curve fitting with uncertain σ u,i Suppose now σ u,i 2 are adjustable parameters with gamma distributed estimates v i. Retaining the θ i but profiling over σ u,i 2 gives Profiled values of θ i from solution to cubic equations G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 18

Goodness of fit Can quantify goodness of fit with statistic where Lʹ (φ,θ) has an adjustable φ i for each y i (the saturated model). Asymptotically should have q ~ chi-squared(n-m). For increasing r i, may need Bartlett correction or MC. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 19

Distributions of q G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 20

Distributions of Bartlett-corrected qʹ G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 21

Example: average of two measurements MINOS interval (= approx. confidence interval) based on with Increased discrepancy between values to be averaged gives larger interval. Interval length saturates at ~level of absolute discrepancy between input values. relative error on sys. error G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 22

Same with interval from p µ = α with nuisance parameters profiled at µ G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 23

Sensitivity of average to outliers Suppose we average 5 values, y = 8, 9, 10, 11, 12, all with stat. and sys. errors of 1.0, and suppose negligible error on error (here take r = 0.01 for all). G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 24

Sensitivity of average to outliers (2) Now suppose the measurement at 10 was actually at 20: outlier Estimate pulled up to 12.0, size of confidence interval ~unchanged (would be exactly unchanged with r 0). G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 25

Average with all r = 0.2 If we assign to each measurement r = 0.2, Estimate still at 10.00, size of interval moves 0.63 0.65 G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 26

Average with all r = 0.2 with outlier Same now with the outlier (middle measurement 10 20) Estimate 10.75 (outlier pulls much less). Half-size of interval 0.78 (inflated because of bad g.o.f.). G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 27

Naive approach to errors on errors Naively one might think that the error on the error in the previous example could be taken into account conservatively by inflating the systematic errors, i.e., But this gives without outlier (middle meas. 10) with outlier (middle meas. 20) So the sensitivity to the outlier is not reduced and the size of the confidence interval is still independent of goodness of fit. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 28

Discussion / Conclusions Gamma model for variance estimates gives confidence intervals that increase in size when the data are internally inconsistent, and gives decreased sensitivity to outliers (known property of Student s t based regression). Equivalence with Student s t model, ν = 1/2r 2 degrees of freedom. Simple profile likelihood quadratic terms replaced by logarithmic: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 29

Discussion / Conclusions (2) Asymptotics can break for increased error-on-error, may need Bartlett correction or MC. Model should be valuable when systematic errors are not well known but enough expert opinion is available to establish meaningful errors on the errors. Could also use e.g. as stress test crank up the r i values until significance of result degrades and ask if you really trust your assigned systematic errors at that level. Here assumed that meaningful r i values can be assigned. Alternatively one could try to fit a global r to all systematic errors, analogous to PDG scale factor method or meta-analysis à la DerSimonian and Laird. ( future work). G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 30

Extra slides G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 31

Gamma model for estimates of variance Suppose the estimated variance v was obtained as the sample variance from n observations of a Gaussian distributed bias estimate u. In this case one can show v is gamma distributed with We can relate α and β to the relative uncertainty r in the systematic uncertainty as reflected by the standard deviation of the sampling distribution of s, σ s G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 32

Exact relation between r parameter and relative error on error r parameter defined as: v ~ Gamma(α, β) so s = v follows a Nakagami distribution G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 33

Exact relation between r parameter and relative error on error (2) The exact relation between the error and the error r s and the parameter r is therefore r s r good for r 1. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 34

PDG scale factor Suppose we do not want to take the quoted errors as known constants. Scale the variances by a factor ϕ, The likelihood function becomes The estimator for µ is the same as before; for ϕ ML gives which has a bias; is unbiased. The variance of µ ^ is inflated by ϕ: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 35

Bayesian approach Given measurements: and (usually) covariances: Predicted value: expectation value control variable parameters bias Frequentist approach: Minimize G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 36

Its Bayesian equivalent Take Joint probability for all parameters and use Bayes theorem: To get desired probability for θ, integrate (marginalize) over b: Posterior is Gaussian with mode same as least squares estimator, σ θ same as from χ 2 = χ 2 min + 1. (Back where we started!) G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 37

Bayesian approach with non-gaussian prior π b (b) Suppose now the experiment is characterized by where s i is an (unreported) factor by which the systematic error is over/under-estimated. Assume correct error for a Gaussian π b (b) would be s i σ i sys, so Width of σ s (s i ) reflects error on the error. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 38

Error-on-error function π s (s) A simple unimodal probability density for 0 < s < 1 with adjustable mean and variance is the Gamma distribution: mean = b/a variance = b/a 2 Want e.g. expectation value of 1 and adjustable standard Deviation σ s, i.e., s In fact if we took π s (s) ~ inverse Gamma, we could find π b (b) in closed form (cf. D Agostini, Dose, von Linden). But Gamma seems more natural & numerical treatment not too painful. G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 39

Prior for bias π b (b) now has longer tails b Gaussian (σ s = 0) P( b > 4σ sys ) = 6.3 10-5 σ s = 0.5 P( b > 4σ sys ) = 0.65% G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 40

A simple test Suppose a fit effectively averages four measurements. Take σ sys = σ stat = 0.1, uncorrelated. Case #1: data appear compatible Posterior p(µ y): measurement p(µ y) experiment µ Usually summarize posterior p(µ y) with mode and standard deviation: G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 41

Simple test with inconsistent data Case #2: there is an outlier Posterior p(µ y): measurement p(µ y) experiment µ Bayesian fit less sensitive to outlier. See also G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 42

Goodness-of-fit vs. size of error In LS fit, value of minimized χ 2 does not affect size of error on fitted parameter. In Bayesian analysis with non-gaussian prior for systematics, a high χ 2 corresponds to a larger error (and vice versa). posterior 2000 repetitions of experiment, σ s = 0.5, here no actual bias. σ µ from least squares χ 2 G. Cowan Padova, 25 Sep 2018 / Statistical Models with Uncertain Error Parameters 43