Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit


Statistics, Lent Term 2015, Prof. Mark Thomson
Lecture 2: The Gaussian Limit

Course outline:
- Lecture 1: Back to Basics. Introduction, probability distribution functions, the binomial distribution, the Poisson distribution.
- Lecture 2: The Gaussian Limit. The central limit theorem, Gaussian errors, error propagation, combination of measurements, multidimensional Gaussian errors, the error matrix.
- Lecture 3: Fitting and Hypothesis Testing. The χ² test, likelihood functions, fitting, binned maximum likelihood, unbinned maximum likelihood.
- Lecture 4: Dark Arts. Bayesian statistics, confidence intervals, systematic errors.

- Problem: given the results of two straight-line fits with errors, calculate the uncertainty on their intersection.
- Solution: first learn about
  - Gaussian errors
  - Correlations
  - Error propagation

The Central Limit Theorem
- We have already shown that for large μ a Poisson distribution tends to a Gaussian.
- This is one example of a more general theorem, the central limit theorem*: if n random variables x_i, each distributed according to any PDF, are combined, then the sum y = Σ x_i has a PDF which, for large n, tends to a Gaussian.
- For this reason the Gaussian distribution plays an important role in statistics. A suitable coordinate transformation gives the normal distribution, with mean = 0 and rms = 1.
* The proof of the central limit theorem is non-trivial and isn't reproduced here.
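A minimal numerical sketch (not from the slides) illustrating the central limit theorem: sum uniformly distributed variables and check that the standardised sum looks Gaussian. The choice of uniform inputs and all numbers are purely illustrative.

```python
import numpy as np

# Illustrative sketch: sum n uniform variates many times and check that the
# standardised sum approaches a standard normal distribution.
rng = np.random.default_rng(1)
n, trials = 50, 100_000
x = rng.uniform(0.0, 1.0, size=(trials, n))   # any PDF would do; uniform chosen here
y = x.sum(axis=1)                              # y = sum of the x_i

# Standardise: uniform(0, 1) has mean 1/2 and variance 1/12
z = (y - n * 0.5) / np.sqrt(n / 12.0)
print("mean ~ 0:", z.mean())
print("rms  ~ 1:", z.std())
# Fraction within |z| < 1 should approach the Gaussian value of ~0.683
print("P(|z|<1) ~ 0.683:", np.mean(np.abs(z) < 1))
```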

A Useful Integral Relationship
- We will often take averages of functions of Gaussian-distributed quantities.
- Hence we are interested in integrals of powers of x weighted by a Gaussian.
- For n odd these integrals vanish by symmetry; for even n they can be evaluated from the basic Gaussian integral.

Properties of the Gaussian Distribution
- Normalised to unity (it's a PDF), with proof.
- Variance σ², with proof.
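The slide equations are not reproduced in the transcription; the standard Gaussian integral results they describe are, in LaTeX:

```latex
% Standard Gaussian moment integrals (a reconstruction of the quoted results,
% not a verbatim copy of the slide):
\[
  I_n = \int_{-\infty}^{\infty} x^{\,n}\, e^{-x^2/2\sigma^2}\, \mathrm{d}x , \qquad
  I_0 = \sqrt{2\pi}\,\sigma , \qquad
  I_n = 0 \ (n \text{ odd}), \qquad
  I_{2m} = (2m-1)!!\,\sigma^{2m}\, I_0 .
\]
% Hence the Gaussian PDF f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/2\sigma^2}
% is normalised to unity (the n = 0 case) and has variance
% \langle (x-\mu)^2 \rangle = I_2 / I_0 = \sigma^2 (the n = 2 case).
```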

Properties of the 1D Gaussian Distribution, continued
- It is natural to express the squared deviation from the mean in terms of the standard error.
- Fractions of events within a given number of standard errors follow from the integrals above.

Averaging Gaussian Measurements
- Suppose we have two independent measurements of a quantity, e.g. the W boson mass. There are two questions we can ask:
  - Are the measurements compatible? [Hypothesis test; we'll return to this.]
  - What is our best estimate of the parameter x (i.e. how to average)?
- In principle we can take any linear combination of the measurements as an unbiased estimator of x, provided the weights sum to unity.
- Clearly we want to give the highest weight to the more precise measurements, e.g. two undergraduate measurements of differing precision.
- Method I: choose the weights to minimise the uncertainty on the combined estimate, subject to the constraint that the weights sum to unity.
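The slide formulas are not transcribed; the standard setup being described, for two measurements x₁ ± σ₁ and x₂ ± σ₂, is:

```latex
% Fractions of events within ±1σ, ±2σ, ±3σ of the mean: 68.3%, 95.4%, 99.7%.
% Setup for the weighted average (standard form of the quoted argument):
\[
  \hat{x} = w_1 x_1 + w_2 x_2 , \qquad w_1 + w_2 = 1
  \;\Rightarrow\; \langle \hat{x} \rangle = x_{\mathrm{true}} \ \text{(unbiased)},
\]
\[
  \sigma_{\hat{x}}^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2
  \quad \text{for independent measurements.}
\]
% Minimising \sigma_{\hat{x}}^2 with respect to w_1 (with w_2 = 1 - w_1)
% gives w_i \propto 1/\sigma_i^2, i.e. inverse-variance weighting.
```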

- Therefore, since the weights sum to unity, the minimisation gives inverse-variance weights.
- Hence for two measurements the combined value and its uncertainty follow. Problem: derive this (it is just error propagation, as described later).

Averaging Gaussian Measurements II
- The same expression can be obtained using a natural probability-based approach:
  - We can interpret the first measurement in terms of a probability distribution for the true value of x, i.e. a Gaussian centred on x_1.
  - Bayes' theorem then tells us how to modify this in the light of a new measurement.
  - The new expression for our knowledge of x is the product of the two Gaussians.
  - Completing the square, plus a little algebra, gives a single Gaussian with the weighted mean and combined uncertainty found above.
  - The product of n Gaussians is a Gaussian.
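The combined result referred to here, in its standard form (the slide expressions themselves are not transcribed):

```latex
% Inverse-variance-weighted average of two measurements:
\[
  \hat{x} = \frac{x_1/\sigma_1^2 + x_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2},
  \qquad
  \frac{1}{\sigma_{\hat{x}}^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} .
\]
% Equivalently, from the Bayesian product of Gaussians:
% p(x \mid x_1, x_2) \propto
%   \exp[-(x-x_1)^2/2\sigma_1^2]\,\exp[-(x-x_2)^2/2\sigma_2^2]
%   \propto \exp[-(x-\hat{x})^2/2\sigma_{\hat{x}}^2].
```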

Error Propagation I
- Suppose we measure a quantity x with a Gaussian uncertainty σ_x; what is the uncertainty on a derived quantity y = f(x)?
- Expand y about the mean of x, define the estimate of y at that mean, and keep the linear term.
- It is easy to understand the origin of the result: a small change in x of size σ_x propagates to a small change in y of size σ_y via the local gradient.
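The standard first-order propagation result being derived (the slide algebra is not transcribed):

```latex
% First-order (linear) error propagation:
\[
  y(x) \simeq y(\mu_x) + \left.\frac{\mathrm{d}y}{\mathrm{d}x}\right|_{\mu_x} (x - \mu_x)
  \quad\Longrightarrow\quad
  \sigma_y = \left| \frac{\mathrm{d}y}{\mathrm{d}x} \right|_{\mu_x} \sigma_x .
\]
```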

A word of warning
- In deriving the error propagation equation we neglected second-order terms in the Taylor expansion.
- This is equivalent to saying that the derivative is constant in the region of interest, which may not always be true: the calculated (Gaussian) PDF can then differ from the correct one.

Example
- Measurement of the transverse momentum of a track from a fit: the transverse momentum is given by the radius of curvature of the track helix, R.
- The track fit returns a Gaussian uncertainty in the curvature, and hence the PDF is Gaussian in that quantity; what is the error on the derived transverse momentum?
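The slide's exact expressions are not transcribed. As an illustration of the warning, assume (an assumption made here, not stated on the slide) that the Gaussian-distributed quantity is the curvature, proportional to 1/p_T, and the derived quantity is its reciprocal; the standard reciprocal-transformation result is then:

```latex
% Illustrative reciprocal case, y = 1/x (assumed to match the p_T example):
\[
  y = \frac{1}{x}
  \quad\Longrightarrow\quad
  \sigma_y = \left|\frac{\mathrm{d}y}{\mathrm{d}x}\right| \sigma_x
           = \frac{\sigma_x}{x^2} = y^2 \sigma_x ,
\]
% so a Gaussian PDF in x maps to a distinctly non-Gaussian PDF in y when
% \sigma_x/x is not small, which is the point of the "calculated vs correct"
% warning above.
```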

Error on Error
- Recall question 2: given 5 measurements of a quantity x: 10.2, 5.5, 6.7, 3.4, 3.5, what is the best estimate of x and what is the estimated uncertainty? The best estimate of x is the sample mean, with its uncertainty estimated from the sample rms.
- But how good is our estimate of the error, i.e. what is the error on the error?
  - It can be shown (but not easily) how the variance of the variance estimator behaves.
  - For a Gaussian distribution this takes a simple form.
  - Hence (by error propagation: show this) the error on the error estimate follows.
  - To obtain a 10% estimate of σ we need the rms of 51 measurements!

Combining Gaussian Errors
- There are many cases where we want to combine measurements to extract a single quantity, e.g. a di-jet invariant mass. What is the uncertainty on the mass, given the uncertainties on the measured jet quantities?
- Start by considering a simple example, a quantity a formed from two measured quantities x and y (e.g. their sum): the mean and variance of a follow from those of x and y.
- Two important points:
  - Errors add in quadrature (i.e. sum the squares).
  - A new term appears: the covariance of x and y.
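The slide formulas are not transcribed; the standard results behind these two slides are given below. The numerical values 5.86, 2.8 and 1.3 are computed here from the quoted data using the usual n−1 sample variance, not read off the slide.

```latex
\[
  \hat{x} = \frac{1}{n}\sum_i x_i = 5.86 , \qquad
  s = \sqrt{\tfrac{1}{n-1}\sum_i (x_i - \hat{x})^2} \approx 2.8 , \qquad
  \sigma_{\hat{x}} = \frac{s}{\sqrt{n}} \approx 1.3 ,
\]
\[
  \frac{\sigma(s)}{s} \approx \frac{1}{\sqrt{2(n-1)}}
  \quad\Rightarrow\quad
  \frac{\sigma(s)}{s} = 0.1 \ \text{needs}\ n = 51 ,
\]
\[
  a = x + y : \qquad
  \sigma_a^2 = \sigma_x^2 + \sigma_y^2 + 2\,\mathrm{cov}(x, y).
\]
```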

Correlated Errors: Covariance
- Suppose a single experiment measures a value of x and a value of y, and imagine repeating the measurement many times.
- If the measurements of x and y are uncorrelated, i.e. independent, the covariance vanishes; if x and y are correlated it is positive; if they are anti-correlated it is negative.
- It is often convenient to express the covariance in terms of the correlation coefficient.
- Consider an experiment which returns two values x and y with y − ȳ = 2(x − x̄): the correlation coefficient is (unsurprisingly) +1, i.e. it expresses the degree of correlation, with values between −1 and +1.
- Going back to the combination above, the covariance term can now be written in terms of the correlation coefficient.
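The standard definitions behind these bullets (the slide's own expressions are not transcribed):

```latex
\[
  \mathrm{cov}(x, y) = \langle (x - \bar{x})(y - \bar{y}) \rangle
                     = \langle xy \rangle - \bar{x}\,\bar{y} , \qquad
  \rho = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} , \qquad -1 \le \rho \le +1 .
\]
% For the quoted example y - \bar{y} = 2(x - \bar{x}):
% cov(x, y) = 2\sigma_x^2 and \sigma_y = 2\sigma_x, so \rho = +1.
% The sum then has \sigma_a^2 = \sigma_x^2 + \sigma_y^2 + 2\rho\,\sigma_x\sigma_y.
```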

Error Propagation II: The General Case
- We can now consider the more general case of a quantity derived from several measured variables.
- In order to estimate the error on a derived quantity we need to know the correlations between the measured variables.

Example continued
- Back to the example introduced above.
- First assume independent errors on the measured quantities and, for simplicity, neglect the covariance term.
- EXERCISE: calculate the uncertainty including the covariance term.
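The general formula being referred to, in its standard form:

```latex
% General first-order error propagation:
\[
  y = f(x_1, \dots, x_n)
  \quad\Longrightarrow\quad
  \sigma_y^2 = \sum_i \left(\frac{\partial f}{\partial x_i}\right)^{\!2} \sigma_i^2
             + 2 \sum_{i < j} \frac{\partial f}{\partial x_i}
                              \frac{\partial f}{\partial x_j}\,
                              \mathrm{cov}(x_i, x_j),
\]
% with the partial derivatives evaluated at the measured central values.
```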

Estimating the Correlation Coefficient
- Correlations can arise from physical effects, e.g. two measured energies E_1 and E_2 might be expected to be (slightly) anti-correlated. Why?
- One can always check (in Monte Carlo) by plotting one quantity against the other. NOTE: the correlation coefficient estimated this way has its own uncertainty.
- Correlations also arise when calculating derived quantities from uncorrelated measurements; this type of correlation can be handled mathematically (see later).

Properties of the 2D Gaussian Distribution
- For two independent variables (x, y) the joint probability distribution is simply the product of the two 1D distributions.
- It can be written in terms of a single quadratic form in the deviations from the means.
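The uncorrelated 2D Gaussian being described, written here with χ² as a generic label for the quadratic form (the slide's own notation is not transcribed):

```latex
\[
  P(x, y) = \frac{1}{2\pi \sigma_x \sigma_y}
            \exp\!\left[ -\frac{1}{2}
              \left( \frac{(x-\mu_x)^2}{\sigma_x^2}
                   + \frac{(y-\mu_y)^2}{\sigma_y^2} \right) \right]
          = \frac{1}{2\pi \sigma_x \sigma_y}\, e^{-\chi^2/2},
\]
% Curves of constant probability are curves of constant \chi^2, i.e. ellipses.
```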

- 68% of events lie within ±1σ_x and 68% within ±1σ_y.
- Now consider contours of constant probability: the 1σ contour corresponds to the curve where the PDF falls to e^(−1/2) of its peak value.
- Only 39% of events lie within the 1σ contour, and only 86% within the 2σ contour.
- To introduce correlations, rotate the ellipse:
  - Same PDF, but now with respect to different axes.
  - It is then simple to derive the general error ellipse with correlations.
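The quoted 39% and 86% follow from the standard probability content of an n-sigma ellipse of a 2D Gaussian:

```latex
\[
  P(\chi^2 \le n^2) = 1 - e^{-n^2/2}:
  \qquad n = 1 \Rightarrow 39.3\%, \qquad n = 2 \Rightarrow 86.5\%,
\]
% compared with 68.3% and 95.4% for the corresponding 1D intervals.
```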

- Let the rotated coordinates be defined by the rotation angle of the ellipse. To find the equivalent correlation coefficient, evaluate the covariance of the rotated variables; to eliminate the rotation angle, compare the resulting quadratic form with the general correlated form, which identifies the correlation coefficient in terms of the rotation angle and the two widths.

Properties of the 2D Gaussian Distribution (continued)
- Start from the uncorrelated 2D Gaussian.
- Make the coordinate transformation (a rotation).
- From the previous page, identify the widths and correlation coefficient of the rotated distribution.
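The comparison described above leads, in standard notation, to the general correlated 2D Gaussian; the textbook form is reproduced here because the slide formulas are not transcribed:

```latex
\[
  P(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1-\rho^2}}
  \exp\!\left\{ -\frac{1}{2(1-\rho^2)}
    \left[ \frac{(x-\mu_x)^2}{\sigma_x^2}
         - \frac{2\rho (x-\mu_x)(y-\mu_y)}{\sigma_x \sigma_y}
         + \frac{(y-\mu_y)^2}{\sigma_y^2} \right] \right\}.
\]
```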

- Note that we have now expressed the same ellipse in terms of the new coordinates, in which the errors are correlated.
- Conversely, if dealing with correlated errors, one can always find a linear combination of variables which is uncorrelated (see the sketch below).
- Example: 2D error ellipses with different correlation coefficients.
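A minimal numerical sketch (not from the slides) of finding the uncorrelated linear combination, by diagonalising a covariance matrix with numpy; the widths and correlation used are made up for illustration.

```python
import numpy as np

# Illustrative covariance matrix for correlated variables (x, y):
# sigma_x = 1.0, sigma_y = 2.0, correlation rho = 0.8  (assumed values)
sx, sy, rho = 1.0, 2.0, 0.8
M = np.array([[sx**2,      rho*sx*sy],
              [rho*sx*sy,  sy**2    ]])

# Eigen-decomposition of the symmetric covariance matrix: the eigenvectors
# define the rotated, uncorrelated linear combinations u = U^T (x, y), and the
# eigenvalues are their variances (the squared semi-axes of the error ellipse).
eigvals, U = np.linalg.eigh(M)
print("uncorrelated variances:", eigvals)
print("rotation matrix (columns = new axes):\n", U)

# Check: in the rotated basis the covariance matrix is diagonal.
print("U^T M U =\n", U.T @ M @ U)
```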

The Error Ellipse and Error Matrix
- We now have the general equation for two correlated Gaussian-distributed quantities; it defines the error ellipse.
- Ultimately we want to generalise this to an N-variable hyper-ellipsoid. This sounds hard but is actually rather simple in matrix form.
- Define the ERROR MATRIX (the matrix of variances and covariances) and the DISCREPANCY VECTOR (the deviations from the means); the exponent of the Gaussian can then be written as a quadratic form in these.
- The beauty of this formalism is that it can be extended to any number of correlated Gaussian-distributed variables: the PDF for n variables (including correlations) takes the same matrix form.
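The matrix form being described is the standard multivariate Gaussian:

```latex
\[
  M_{ij} = \mathrm{cov}(x_i, x_j), \qquad
  d_i = x_i - \mu_i, \qquad
  \chi^2 = \mathbf{d}^{\mathsf{T}} M^{-1} \mathbf{d},
\]
\[
  P(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |M|^{1/2}}
                  \exp\!\left( -\tfrac{1}{2}\, \mathbf{d}^{\mathsf{T}} M^{-1} \mathbf{d} \right).
\]
% For n = 2 this reproduces the correlated 2D Gaussian and its error ellipse.
```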

General Transformation of Errors
- Suppose we have a set of variables x_i with error matrix M, and we now wish to transform to a set of variables y_i defined as functions of the x_i.
- Taylor expansion about the mean gives, to first order, a linear relation between the deviations, and hence between the error matrices.
- T is the error transformation matrix (the matrix of partial derivatives). For Gaussian errors we can now do anything! We can deal with:
  - correlated errors
  - arbitrary dimensions
  - parameter transformations
- Examples follow; a compact statement of the matrix relation is given below.
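The transformation rule being described, in standard notation:

```latex
\[
  y_i = f_i(x_1, \dots, x_n), \qquad
  T_{ij} = \left.\frac{\partial y_i}{\partial x_j}\right|_{\boldsymbol{\mu}}
  \quad\Longrightarrow\quad
  M_y = T\, M_x\, T^{\mathsf{T}} .
\]
% The 1D propagation formula and the covariance of derived quantities are
% special cases of this single matrix equation.
```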

A simple example
- Measure two uncorrelated variables, so the error matrix is diagonal.
- Calculate two derived quantities, write down the transformation matrix of partial derivatives, and apply the matrix relation above to obtain the error matrix of the derived quantities.

A more involved example
- Given the results of two straight-line fits, calculate the uncertainty on their intersection, using the error matrix of the fitted parameters (note that the results of the two fits are mutually uncorrelated).
- The lines intersect where the two fitted lines are equal.
- To calculate the error on the intersection we need the error transformation matrix, i.e. the partial derivatives of the intersection coordinates with respect to the fitted parameters.
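A minimal numerical sketch (not from the slides) of the intersection example. It assumes each fit returns a line y = c + m·x with a 2×2 covariance matrix for (c, m); this parametrisation and all numbers are illustrative, and the slides' own expressions may differ.

```python
import numpy as np

# Assumed fit results (illustrative values only)
c1, m1 = 0.0, 1.0            # fit 1: intercept, slope
c2, m2 = 4.0, -1.0           # fit 2: intercept, slope
M1 = np.array([[0.04, 0.01],     # cov matrix of (c1, m1)
               [0.01, 0.01]])
M2 = np.array([[0.09, -0.02],    # cov matrix of (c2, m2)
               [-0.02, 0.02]])

# The two fits are uncorrelated, so the full 4x4 error matrix is block diagonal.
M = np.block([[M1, np.zeros((2, 2))],
              [np.zeros((2, 2)), M2]])

# Intersection of c1 + m1*x and c2 + m2*x
x0 = (c2 - c1) / (m1 - m2)
y0 = c1 + m1 * x0

# Transformation matrix T_ij = d(x0, y0) / d(c1, m1, c2, m2)
dx_dc1 = -1.0 / (m1 - m2)
dx_dm1 = -x0 / (m1 - m2)
dx_dc2 = 1.0 / (m1 - m2)
dx_dm2 = x0 / (m1 - m2)
T = np.array([
    [dx_dc1,            dx_dm1,           dx_dc2,      dx_dm2],
    [1.0 + m1 * dx_dc1, x0 + m1 * dx_dm1, m1 * dx_dc2, m1 * dx_dm2],
])

My = T @ M @ T.T             # error matrix of (x0, y0)
print("intersection:", (x0, y0))
print("sigma_x0 =", np.sqrt(My[0, 0]), " sigma_y0 =", np.sqrt(My[1, 1]))
print("cov(x0, y0) =", My[0, 1])
```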

" then its just algebra " giving! OK, it is not pretty, but we now have an analytic expression (i.e. once you have done the calculation, computationally very fast) Prof. M.A. Thomson Lent Term 2015 63 " Apply to a special case, intersection with a fixed line Hence which makes perfect sense! The treatment of Gaussian errors via the error matrix is an extremely powerful technique it is also easy to apply (once you understand the basic ideas) Prof. M.A. Thomson Lent Term 2015 64

Summary
- You should now understand:
  - Properties of the Gaussian distribution
  - How to combine errors
  - Propagation of simple 1D errors
  - How to include correlations
  - How to treat multi-dimensional errors
  - How to use the error matrix
- Next up: chi-squared, likelihood fits, ...

Appendix: Error on Error - Justification
- Assume the mean of the distribution is zero (one can always make this transformation without affecting the variance).
- For large n the variance of the variance estimator follows from the fourth moment of the Gaussian; a sketch of the standard argument is given below.
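A sketch of the standard justification (the slide's algebra is not transcribed), assuming n independent Gaussian measurements with zero mean:

```latex
% With zero mean, the variance estimator is \hat{\sigma}^2 = \frac{1}{n}\sum_i x_i^2:
\[
  V(\hat{\sigma}^2) = \frac{1}{n}\left( \langle x^4 \rangle - \langle x^2 \rangle^2 \right)
                    = \frac{2\sigma^4}{n}
  \quad \text{for a Gaussian} \ (\langle x^4 \rangle = 3\sigma^4),
\]
\[
  \sigma(\hat{\sigma}) = \frac{1}{2\hat{\sigma}}\,\sigma(\hat{\sigma}^2)
                       \approx \frac{\sigma}{\sqrt{2n}} ,
\]
% which, with n replaced by n-1 when the mean is estimated from the data,
% is the \sigma/\sqrt{2(n-1)} result quoted earlier.
```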