Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis, Lecture V (12.11.07)
Contents:
Central Limit Theorem
Uncertainties: concepts, propagation and properties

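Central Limit Theorem
Consider the sum $X = \sum_{i=1}^{n} x_i$ of n independent variables $x_i$, with i = 1, 2, 3, ..., n, each taken from a different distribution with expectation value $\mu_i$ and variance $\sigma_i^2$. Then the distribution for X has the following properties:
Its expectation value is $\langle X \rangle = \sum_{i=1}^{n} \mu_i$.
Its variance is $V(X) = \sum_{i=1}^{n} \sigma_i^2$.
It becomes Gaussian distributed for $n \to \infty$.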

Uniform distribution

Uneven Distribution

Exponential Distribution
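The convergence illustrated for these distributions can be checked numerically. The following is a minimal sketch, not the lecture's own code: it sums n = 20 draws from a uniform and from an exponential distribution, standardizes the sum, and compares the fraction of values within ±1 with the Gaussian expectation of about 0.683. The function names, the seed, and the choice n = 20 are illustrative assumptions.

```python
import math
import random
import statistics

def standardized_sums(draw, mu, sigma, n, n_samples, rng):
    """Return n_samples values of (X - n*mu) / (sigma*sqrt(n)), with X a sum of n draws."""
    scale = sigma * math.sqrt(n)
    return [(sum(draw(rng) for _ in range(n)) - n * mu) / scale
            for _ in range(n_samples)]

def fraction_within_one_sigma(values):
    """Fraction of standardized values with |value| < 1 (about 0.683 for a Gaussian)."""
    return sum(1 for v in values if abs(v) < 1.0) / len(values)

rng = random.Random(12345)  # arbitrary seed, for reproducibility only

# Uniform on [0, 1): mu = 1/2, sigma = sqrt(1/12)
uniform_sums = standardized_sums(lambda r: r.random(), 0.5, math.sqrt(1 / 12),
                                 20, 20000, rng)
# Exponential with rate 1: mu = 1, sigma = 1
expo_sums = standardized_sums(lambda r: r.expovariate(1.0), 1.0, 1.0,
                              20, 20000, rng)

for name, sums in (("uniform", uniform_sums), ("exponential", expo_sums)):
    print(name, statistics.fmean(sums), statistics.pstdev(sums),
          fraction_within_one_sigma(sums))
```

The skewed exponential input converges more slowly than the uniform one, so increasing n should move its tails closer to the Gaussian shape.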

But...
For the CLT to work:
No single source may contribute significantly to the overall variance (example: multiple scattering).
Distributions with a few events in large tails need much more statistics to converge.
Unfortunately, the CLT does not work for many physics applications!

Application: Many repeated Measurements

m(B0) = 5279.63 ± 0.53 (stat) ± 0.33 (sys) MeV
CDF has a mass resolution of 16 MeV: the reconstructed mass of a single B meson is spread around the true B mass with σ = 16 MeV.
By averaging over many reconstructed B mesons, the B mass can be measured with much better precision than the single-event resolution.

Law of large numbers...
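As a rough illustration of why many repeated measurements beat the single-event resolution, the sketch below averages N Gaussian-smeared masses and compares the spread of the average with σ/√N. It is an idealized toy: background, systematic effects and the real CDF event count are ignored, and `n_events = 900` is an arbitrary illustrative choice.

```python
import math
import random
import statistics

TRUE_MASS = 5279.63  # MeV, the value quoted above, used here as the "true" mass
RESOLUTION = 16.0    # MeV, single-event mass resolution

def mean_of_n(n, rng):
    """Average of n reconstructed masses, each smeared with the resolution."""
    return statistics.fmean(rng.gauss(TRUE_MASS, RESOLUTION) for _ in range(n))

rng = random.Random(7)   # arbitrary seed
n_events = 900           # assumption: illustrative sample size
n_experiments = 1000     # number of repeated toy experiments

means = [mean_of_n(n_events, rng) for _ in range(n_experiments)]
spread = statistics.pstdev(means)            # observed spread of the average
expected = RESOLUTION / math.sqrt(n_events)  # sigma / sqrt(N)
print(spread, expected)
```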

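Weighted Mean (I)
Combining measurements with different uncertainties:
Two measurements of the speed of a car:
v1 = 67 ± 4 m/s
v2 = 62 ± 2 m/s
The simple (unweighted) mean is 64.5 m/s with an uncertainty of $\frac{1}{2}\sqrt{4^2 + 2^2} \approx 2.2$ m/s.
The uncertainty on the mean is larger than the uncertainty of the better single measurement...???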

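Weighted Mean (II)
To reach σ = 2 m/s, 4 single measurements with σ = 4 m/s are needed. Therefore the measurement with σ = 2 m/s should get 4 times the weight!
More generally, each measurement $x_i$ is weighted with $1/\sigma_i^2$:
$\bar{x} = \frac{\sum_i x_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2}$, $\sigma_{\bar{x}}^2 = \frac{1}{\sum_i 1 / \sigma_i^2}$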
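The weighting argument can be sketched in a few lines, assuming the standard inverse-variance weights $w_i = 1/\sigma_i^2$ (so that σ = 2 m/s gets four times the weight of σ = 4 m/s). The function name `weighted_mean` is an illustrative choice.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / s**2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    return mean, math.sqrt(1.0 / weight_sum)

# The two speed measurements from the example: 67 ± 4 m/s and 62 ± 2 m/s
mean, sigma = weighted_mean([67.0, 62.0], [4.0, 2.0])
print(mean, sigma)  # 63.0 m/s with an uncertainty of about 1.79 m/s
```

The combined uncertainty is now smaller than either single uncertainty, as expected when information is added.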

Weighted Mean (III)
v1 = 67 ± 4 m/s
v2 = 53 ± 2 m/s
The two measurements now differ by about 3σ. Is a weighted mean still meaningful????

New Scientist, 31 March 1988

Neutron Lifetime (PDG 2006)

Particle Data Group: World Average
1) Calculate the weighted mean $\bar{x}$ of the N measurements
2) Calculate $\chi^2 = \sum_i (x_i - \bar{x})^2 / \sigma_i^2$ and compare it with N - 1
There are 3 cases to consider:
$\chi^2/(N-1) < 1$: all fine, the simple weighted mean is OK
$\chi^2/(N-1) \gg 1$: depending on the reason, either make no average at all, or quote the calculated average and make an educated guess of the error, taking into account known problems with the data
$\chi^2/(N-1) > 1$: errors on some or all measurements may have been underestimated; scale all of them with $S = \sqrt{\chi^2/(N-1)}$. To compute S, reject the measurements with larger uncertainties.
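A minimal sketch of this recipe, under simplifying assumptions: the scale factor is taken as $S = \sqrt{\chi^2/(N-1)}$ and applied only when $\chi^2 > N-1$, and the extra step of rejecting measurements with large uncertainties before computing S is left out. The function name `scaled_average` is hypothetical and the input values are the two speed measurements from the Weighted Mean (III) slide, not PDG data.

```python
import math

def scaled_average(values, sigmas):
    """Weighted mean with a PDG-style scale factor (simplified, no rejection step)."""
    weights = [1.0 / s**2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    sigma = math.sqrt(1.0 / weight_sum)
    chi2 = sum(w * (v - mean)**2 for w, v in zip(weights, values))
    ndf = len(values) - 1
    # Inflate the uncertainty only if the measurements scatter more than expected
    scale = math.sqrt(chi2 / ndf) if chi2 > ndf else 1.0
    return mean, sigma * scale, chi2, scale

mean, scaled_sigma, chi2, scale = scaled_average([67.0, 53.0], [4.0, 2.0])
print(mean, scaled_sigma, chi2, scale)
```

For these inputs the scale factor is about 3.1, so the quoted uncertainty grows from about 1.8 m/s to 5.6 m/s.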

Ideogram http://pdg.lbl.gov/2007/reviews/textrpp.pdf

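Reminder: Covariance
The covariance of two variables x, y is defined as:
$\mathrm{cov}(x,y) = \langle (x - \langle x \rangle)(y - \langle y \rangle) \rangle = \langle xy \rangle - \langle x \rangle \langle y \rangle$
If two variables are uncorrelated, cov(x,y) = 0; conversely, if cov(x,y) = 0, x and y are uncorrelated (which does not by itself imply that they are independent).
The correlation is defined as:
$\rho(x,y) = \frac{\mathrm{cov}(x,y)}{\sigma_x \sigma_y}$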

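Properties of Correlations
Example of correlations for two random variables x and y: the covariance V(x,y) or cov(x,y) can be represented by a matrix (often called the error matrix):
$V = \begin{pmatrix} \sigma_x^2 & \mathrm{cov}(x,y) \\ \mathrm{cov}(x,y) & \sigma_y^2 \end{pmatrix}$
General case of correlations for n random variables: there is a covariance between each pair of variables, $V_{ij} = \mathrm{cov}(x_i, x_j)$.
Analogously, the correlation matrix is defined as $\rho_{ij} = V_{ij} / (\sigma_i \sigma_j)$.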
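To make the error matrix and the correlation matrix concrete, here is a small sketch that estimates both from samples. The sample is generated so that the true correlation is 0.8 by construction. The helper names and the generated data are illustrative assumptions, not part of the lecture.

```python
import math
import random

def covariance(xs, ys):
    """Sample covariance <(x - <x>)(y - <y>)>, normalized by the sample size."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def covariance_matrix(columns):
    """Error matrix V_ij = cov(x_i, x_j) for a list of variables."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]

def correlation_matrix(cov):
    """rho_ij = V_ij / (sigma_i * sigma_j)."""
    sig = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sig[i] * sig[j]) for j in range(len(cov))]
            for i in range(len(cov))]

rng = random.Random(1)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# y = 0.8 x + 0.6 * noise has unit variance and correlation 0.8 with x
y = [0.8 * xi + 0.6 * rng.gauss(0.0, 1.0) for xi in x]

V = covariance_matrix([x, y])
rho = correlation_matrix(V)
print(V)
print(rho)
```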

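Error Propagation (I)
$x = (x_1, ..., x_n)$, with $V_{ij}$ and $\mu_i$ known
y(x) is a function of x
First-order Taylor expansion around the mean values:
$y(x) \approx y(\mu) + \sum_{i=1}^{n} \left. \frac{\partial y}{\partial x_i} \right|_{x=\mu} (x_i - \mu_i)$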

Error Propagation (II)

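Gaussian error propagation
Error estimates for functions of several correlated variables:
$\sigma_y^2 = \sum_i \left( \frac{\partial y}{\partial x_i} \right)^2 \sigma_i^2 + \sum_{i \neq j} \frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j} V_{ij}$
The first sum contains the normal errors for uncorrelated variables; the second sum contains the additional terms accounting for correlations.
Special case, uncorrelated variables:
$\sigma_y^2 = \sum_i \left( \frac{\partial y}{\partial x_i} \right)^2 \sigma_i^2$
This is called Gaussian error propagation, although it has nothing to do with Gaussian distributions.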

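And the same in more dimensions, for several functions $y_k(x)$ with covariance matrix U:
$U = A V A^T$, with $A_{ki} = \partial y_k / \partial x_i$ (A is the Jacobi matrix)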

Example: Track Parametrization (I)
A typical tracking chamber measures in 3D and, due to its symmetry, uses cylindrical coordinates (r, φ, z).
We want to know the uncertainties in Cartesian coordinates (x, y, z):
x = r cos φ; y = r sin φ

Example: Track Parametrization(II)
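The transformation from (r, φ) to (x, y) can be propagated with the matrix form of error propagation, $U = A V A^T$, where A is the Jacobi matrix of x = r cos φ, y = r sin φ. The sketch below is a first-order illustration only; the numerical values of r, φ and their uncertainties are made up for the example, and z is left out because it is unchanged.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(row) for row in zip(*a)]

def cartesian_covariance(r, phi, cov_r_phi):
    """First-order propagation of the (r, phi) covariance to (x, y)."""
    jacobian = [[math.cos(phi), -r * math.sin(phi)],   # dx/dr, dx/dphi
                [math.sin(phi),  r * math.cos(phi)]]   # dy/dr, dy/dphi
    return matmul(matmul(jacobian, cov_r_phi), transpose(jacobian))

# Hypothetical measurement: uncorrelated r and phi
sigma_r, sigma_phi = 0.1, 0.01
cov_r_phi = [[sigma_r**2, 0.0], [0.0, sigma_phi**2]]
cov_xy = cartesian_covariance(2.0, math.pi / 4, cov_r_phi)
print(cov_xy)
```

Even though r and φ are uncorrelated in this example, x and y come out correlated (non-zero off-diagonal element).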

Exercise:
x, y are two random variables with the correlation matrix V:
a = 3/7 x + 1/7 y; b = 1/7 x - 2/7 y
Give the correlation matrix of a, b (note that this time it is not an approximation...)

Orthogonal Transformation
For n variables one can always find a linear transformation such that the covariance matrix of the new set of variables is diagonal.
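For two variables the diagonalizing transformation is a rotation, which keeps the example short. The sketch below assumes the rotation angle $\theta = \frac{1}{2}\,\mathrm{atan2}(2V_{xy}, V_{xx} - V_{yy})$ and applies $U = R V R^T$; the input matrix is an arbitrary illustration. For general n one would typically use a numerical eigen-decomposition routine instead.

```python
import math

def diagonalize_2x2(cov):
    """Rotate (x, y) so that the covariance matrix of the new variables is diagonal."""
    theta = 0.5 * math.atan2(2.0 * cov[0][1], cov[0][0] - cov[1][1])
    c, s = math.cos(theta), math.sin(theta)
    rot = [[c, s], [-s, c]]  # u = c*x + s*y, v = -s*x + c*y
    # U = R V R^T, written out element by element
    rv = [[sum(rot[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    new_cov = [[sum(rv[i][k] * rot[j][k] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return theta, new_cov

# Arbitrary example covariance matrix with positive correlation
V = [[4.0, 1.5], [1.5, 1.0]]
theta, U = diagonalize_2x2(V)
print(theta, U)
```

The trace and determinant of the matrix are unchanged by the rotation; only the off-diagonal elements vanish.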

Exercise: Some standard formulae

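Example: Measure Asymmetry (I)
Define A = (F-B)/(F+B), an asymmetry of events in the forward and backward hemispheres of the detector (e.g. at LEP). Here F (B) is the number of events in the forward (backward) hemisphere.
Case I: if the events in the forward and backward hemispheres are uncorrelated, then:
$\sigma_A^2 = \left( \frac{\partial A}{\partial F} \right)^2 \sigma_F^2 + \left( \frac{\partial A}{\partial B} \right)^2 \sigma_B^2 = \frac{4}{(F+B)^4} \left( B^2 \sigma_F^2 + F^2 \sigma_B^2 \right)$
If the errors are individually Poisson distributed for F and B ($\sigma_F^2 = F$, $\sigma_B^2 = B$), then with N = F + B:
$\sigma_A^2 = \frac{4FB}{N^3}$
The errors are dominated by the smaller counting rate.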

Example: Measure Asymmetry (II)
Case II: events in the forward and backward hemispheres are completely anti-correlated; N = F + B is thus fixed.
- The distributions of F and B are then given by the binomial distribution; let p be the probability that an event is in the forward hemisphere. Then $\sigma_F^2 = Np(1-p)$ and, with A = (2F - N)/N, $\sigma_A^2 = \frac{4\sigma_F^2}{N^2} = \frac{4p(1-p)}{N} = \frac{4FB}{N^3}$ (estimating p by F/N)
- This is the same expression as in case I and demonstrates the relationship between Poisson and binomial:
- either: the Poisson probability of obtaining N events altogether times the binomial probability of having F events in the forward hemisphere
- or: two independent Poisson probabilities for the numbers of backward and forward events
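The fixed-N (binomial) case can be cross-checked with a toy simulation. The sketch below assumes the analytic result $\sigma_A = \sqrt{4FB/N^3}$ for A = (F-B)/(F+B) and compares it with the spread of A over many toy experiments. N = 200, p = 0.6, the seed and the function names are illustrative choices.

```python
import math
import random
import statistics

def asymmetry_error(forward, backward):
    """Analytic uncertainty on A = (F - B) / (F + B): sqrt(4 F B / N^3)."""
    n = forward + backward
    return math.sqrt(4.0 * forward * backward / n**3)

def toy_asymmetries(n_events, p_forward, n_toys, rng):
    """Simulate A for fixed N, with each event forward with probability p."""
    results = []
    for _ in range(n_toys):
        forward = sum(1 for _ in range(n_events) if rng.random() < p_forward)
        backward = n_events - forward
        results.append((forward - backward) / n_events)
    return results

rng = random.Random(42)  # arbitrary seed
toys = toy_asymmetries(200, 0.6, 5000, rng)
print(statistics.pstdev(toys), asymmetry_error(120, 80))
```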

Repetition: Histogram
Interpretation of bins
A histogram can be equivalently regarded as:
1. A Poisson distribution in the overall number of events N and the corresponding multinomial distribution of obtaining events in each bin
2. An independent Poisson distribution of the number of entries in each bin of the histogram

Exercise: Detector Efficiency
Compute the error on a measured detector efficiency in two different ways:
Using the binomial distribution; p: probability to detect a traversing particle, N: number of events
Using error propagation, with the number of detected events and the number of not detected events treated as independent variables
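One possible way to check a solution numerically is sketched below. It assumes that the efficiency is estimated as detected / (detected + missed), that the binomial variance of the detected count is N p (1 - p), and that in the error-propagation approach both counts are treated as independent Poisson variables. The function names and the counts 90 and 10 are hypothetical.

```python
import math

def efficiency_error_binomial(n_detected, n_total):
    """Binomial error: sqrt(eps * (1 - eps) / N)."""
    eps = n_detected / n_total
    return math.sqrt(eps * (1.0 - eps) / n_total)

def efficiency_error_propagation(n_detected, n_missed):
    """First-order propagation with independent Poisson counts."""
    n = n_detected + n_missed
    d_detected = n_missed / n**2   # d eps / d n_detected
    d_missed = -n_detected / n**2  # d eps / d n_missed
    # Poisson variances: var(n_detected) = n_detected, var(n_missed) = n_missed
    return math.sqrt(d_detected**2 * n_detected + d_missed**2 * n_missed)

err_binomial = efficiency_error_binomial(90, 100)
err_propagation = efficiency_error_propagation(90, 10)
print(err_binomial, err_propagation)
```

Under these assumptions both approaches give the same number, mirroring the Poisson and binomial equivalence seen in the asymmetry example.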

Be aware...
The approximation using the Taylor expansion breaks down if the function is significantly non-linear in the region ± 1σ around the mean value.
Example: momentum estimate in a B field from the measured track curvature κ; p ~ 1/κ: 10 % momentum bias!

Failure of (1st order) error propagation
An experiment had data to look for a non-zero mass of the electron neutrino. A quantity R was measured.
We don't need to know the details. It is sufficient to know: a, b, c, d, e are measured quantities and K is a constant.
If R < 0.42, then the neutrino must have mass.
The experiment measured R = 0.165 and found with error propagation σ(R) = 0.073, i.e. 3σ evidence for neutrino mass!
However, the formula is highly non-linear... 1st order error propagation is not applicable!

What to do instead?
Use MC methods!
Throw Gaussian distributed values for a, b, c, d, e.
Compute R; repeat the toy many times and check how often you are below the measured value of R.
In the example this happens in 4% of the cases. This is the so-called p-value of this result.
In many physics analyses, simple Gaussian error propagation is not valid or too complicated (e.g. highly non-linear functions, many correlated variables).
The p-value MC method always works!
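The toy procedure can be sketched generically. The actual formula for R is not reproduced in this transcript, so `hypothetical_r` below is an invented non-linear stand-in, and the central values and uncertainties of a, b, c, d, e are made-up numbers. Only the structure (throw Gaussian inputs, evaluate the function, count how often the result falls below the measured value) follows the description above.

```python
import random

def toy_p_value(func, means, sigmas, measured, n_toys=100000, seed=0):
    """Fraction of toy experiments in which func(inputs) falls below `measured`."""
    rng = random.Random(seed)
    below = 0
    for _ in range(n_toys):
        # Throw one Gaussian-distributed value per input quantity
        inputs = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        if func(*inputs) < measured:
            below += 1
    return below / n_toys

def hypothetical_r(a, b, c, d, e):
    """Invented non-linear stand-in, NOT the experiment's formula for R."""
    K = 1.0  # assumed constant
    return K * a * b / (c * c + d * d + e * e)

# Made-up central values and uncertainties for a, b, c, d, e
means = [1.0, 1.0, 1.0, 1.0, 1.0]
sigmas = [0.1, 0.1, 0.1, 0.1, 0.1]
p_value = toy_p_value(hypothetical_r, means, sigmas, measured=0.25)
print(p_value)
```

Because the function is evaluated exactly in every toy, no linearization enters; correlated inputs could be handled by throwing correlated Gaussian values instead of independent ones.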