Statistics and Data Analysis


Statistics and Data Analysis: The Crash Course. Physics 226, Fall 2013. "There are three kinds of lies: lies, damned lies, and statistics." Mark Twain, allegedly after Benjamin Disraeli

2 Intro: The Crash Course
- Definitions: random variables, probability, PDFs
- Point estimators
- Maximum likelihood and least-squares fits
- Hypothesis testing, confidence limits
- Monte Carlo techniques
- Systematic uncertainties

3 Statistics: A Necessary Tool
Physics is an experimental science: both qualitative and quantitative understanding is required to go from observations to laws. We make a set of measurements or observations, summarize the results, and draw most conclusions with some degree of (un)certainty.
- Example: gravity exists (qualitative); F = G_N m1 m2 / r² (quantitative).
- But in reality G_N and the exponent n = 2 are known only to some precision, e.g. G_N = (6.67428 ± 0.00067) × 10⁻¹¹ N m²/kg².
Many measurements are a priori uncertain (e.g. quantum physics) and have to be interpreted in probabilistic terms. Estimating such probabilities from a finite amount of data, and testing whether a given model is consistent with the data, is the basic task of classical statistics.

4 Describing the Data
Data are the results of measurements. In physics we mostly deal with quantitative data, i.e. sets of numbers; other fields may deal with qualitative data ("An American Robin has gray upperparts and head, and orange underparts, usually brighter in the male"). Numbers are easier to handle mathematically (duh), hence statistics deals with quantitative measurements:
- Discrete data, e.g. integers (counts)
- Continuous data, e.g. energies, momenta, etc.
- Recorded with some precision, typically set by the measuring apparatus or other external conditions

5 Examples: Scatter Plot [figure: asymmetry (ppm) vs. Δy (µm)]

6 Examples: Graph [figure-only slide]

7 Examples: Histogram [figure-only slide]

8 Probability: Two Interpretations
Frequentist interpretation: probability is the limiting frequency with which a given outcome occurs when the experiment is repeated an infinite number of times. Measurable parameters are represented by estimators with assigned confidence levels; a CL measures the probability that an estimator would fall in a certain range, given a true value of the parameter. No probability is assigned to constants of nature.
Bayesian interpretation: more general; defines probability as a degree of belief that a given statement is true, e.g. that the true value of parameter x is in the interval [a, b]. This is somewhat subjective, but it follows how most humans think.

9 Bayes' Theorem
Conditional probability of A given B:
P(A|B) = P(A ∩ B) / P(B)
Interpreted within Bayesian statistics as
P(theory|data) ∝ P(data|theory) × P(theory)
where P(theory|data) is the posterior probability, P(data|theory) is the likelihood (the result of the measurement), and P(theory) is the prior probability (the initial prejudice). This allows one to interpret a single experiment as a measure of the (subjective) probability that a given hypothesis is correct (e.g. that some fundamental constant is in some range). It requires assigning some probability interpretation to prior knowledge, which is where the subjectivity comes in.
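
A minimal numerical sketch of this recipe, assuming a single Gaussian measurement and a flat prior (both illustrative choices, not from the slides): multiply likelihood by prior on a grid, normalize, and read off a credible interval.

```python
# Posterior ~ likelihood * prior on a grid of parameter values.
import numpy as np

theta = np.linspace(-5, 5, 1001)            # grid of parameter values
prior = np.ones_like(theta)                 # flat prior ("initial prejudice")
x_meas, sigma = 1.2, 0.8                    # one Gaussian measurement (assumed)
likelihood = np.exp(-0.5 * ((x_meas - theta) / sigma) ** 2)

posterior = likelihood * prior
posterior /= np.trapz(posterior, theta)     # normalize so it integrates to 1

# 68% central credible interval from the posterior CDF
cdf = np.cumsum(posterior) * (theta[1] - theta[0])
lo, hi = theta[np.searchsorted(cdf, 0.16)], theta[np.searchsorted(cdf, 0.84)]
print(f"posterior mean = {np.trapz(theta * posterior, theta):.3f}, "
      f"68% interval = [{lo:.2f}, {hi:.2f}]")
```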

10 Random Variables
Random variable: a numerical outcome of a (repeatable) measurement. Characterized by a probability density function (PDF) f(x; θ), which depends on a set of parameters θ. Cumulative distribution function (CDF):
F(a) = ∫_{−∞}^{a} f(x) dx

11 Expectation Values
Expectation value of a function u(x):
E[u(x)] = ∫ u(x) f(x) dx
Moments of a random variable x:
α_n ≡ E[xⁿ] = ∫ xⁿ f(x) dx (n-th moment)
m_n ≡ E[(x − α_1)ⁿ] = ∫ (x − α_1)ⁿ f(x) dx (n-th central moment)
Special moments: the mean μ ≡ α_1 and the variance σ² ≡ V[x] ≡ m_2 = α_2 − μ².

12 Examples [figure-only slide]

13-15 Sample From a Continuous PDF [figure-only slides]

16-18 Sample From a Discrete PDF [figure-only slides]

19 PDF and Cumulative Distribution [figure-only slide]
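
The sampling slides above illustrate drawing from a PDF via its cumulative distribution. A short sketch of that inverse-CDF (inverse-transform) method, with an exponential PDF and a three-bin discrete PDF as assumed examples:

```python
# Inverse-transform sampling: draw u ~ Uniform(0,1) and solve F(x) = u.
import numpy as np

rng = np.random.default_rng(42)
tau = 2.0                                   # exponential decay constant (assumed)
u = rng.uniform(size=100_000)
x = -tau * np.log(1.0 - u)                  # F^-1(u) for f(x) = exp(-x/tau)/tau
print(f"sample mean = {x.mean():.3f} (expect tau = {tau})")

# For a discrete PDF, the same idea uses cumulative sums of the probabilities:
# pick the first bin whose cumulative probability exceeds u.
p = np.array([0.2, 0.5, 0.3])               # assumed discrete probabilities
k = np.searchsorted(np.cumsum(p), rng.uniform(size=10))
print("sampled bins:", k)
```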

20-21 2D Distribution [figure-only slides]

22 Marginal PDF [figure-only slide]

23 Correlations Between Variables
Covariance: cov[x, y] = E[(x − μ_x)(y − μ_y)] = E[xy] − μ_x μ_y
Represent an N-dimensional parameter space in terms of the covariance matrix V_ij:
- Diagonal elements: variances (squares of the RMS)
- Off-diagonal elements: covariances
Correlation: the normalized covariance ρ_xy = cov[x, y] / (σ_x σ_y)
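
A quick numerical check of these definitions, using an assumed correlated toy dataset:

```python
# Covariance matrix and normalized correlation with numpy.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=10_000)
y = x + rng.normal(0.0, 0.5, size=10_000)   # positively correlated with x

V = np.cov(x, y)                            # covariance matrix V_ij
rho = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])  # rho_xy = cov[x,y]/(sigma_x sigma_y)
print(V)
print(f"rho_xy = {rho:.3f} (numpy agrees: {np.corrcoef(x, y)[0, 1]:.3f})")
```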

24 Profile Histogram [figure-only slide]

25 Common PDFs

26 Poisson Distribution [figure-only slide]

27 Gaussian Distribution

28 χ² Distribution

29 Cauchy (Breit-Wigner) PDF
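
These four distributions are available in standard packages; a small sketch using scipy.stats, with parameter values that are illustrative assumptions:

```python
# The four common PDFs above, via scipy.stats.
from scipy import stats

print(stats.poisson(mu=3.2).pmf(5))          # Poisson: P(n=5 | mean 3.2)
print(stats.norm(loc=0, scale=1).pdf(1.0))   # Gaussian density at x = 1
print(stats.chi2(df=10).sf(15.0))            # chi^2 upper-tail probability
print(stats.cauchy(loc=0, scale=0.5).pdf(0)) # Cauchy/Breit-Wigner, half-width 0.5
```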

30 Point Estimation
Standard problem: a set of values x1, x2, ..., xn (the data) described by a PDF f(x; θ) with parameter(s) θ. Point estimation: we want to construct an estimator of the parameter θ.

31 Estimator Properties
- Consistency: approaches the true value asymptotically for an infinite dataset
- Bias: difference with respect to the true value for a finite dataset
- Efficiency: variance of the estimator (compared to others)
- Sufficiency: dependence on the true value
- Robustness: sensitivity to bad data, e.g. outliers
- Others: physicality, tractability, etc.
There is no ideal recipe; what is best depends on the problem.

32 Basic Estimators
Estimators for the mean and variance; shape of the PDF (fitting):
Maximum likelihood
- Most efficient, but may be biased
- Goodness of fit is not readily available
Least squares (chi-squared)
- ML for Gaussian-distributed data
- Convenient for binned data; analytic solutions for linear functions
- Automatic goodness-of-fit measure
- Be careful with Gaussian approximations (e.g. when a Poisson is approximated by a Gaussian)

33 Mean and Variance from a Sample
Estimators (equally weighted data):
μ̂ = (1/N) Σ_{i=1..N} x_i (N > 0)
σ̂² = (1/(N−1)) Σ_{i=1..N} (x_i − μ̂)² (N > 1)
Variances of these estimators:
V[μ̂] = σ²/N, i.e. σ[μ̂] = σ/√N
V[σ̂²] = (1/N)(m_4 − (N−3)/(N−1) σ⁴), i.e. σ[σ̂] ≈ σ/√(2N) for a Gaussian distribution of x and large N
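
A sketch of these estimators and their uncertainties on an assumed Gaussian toy sample:

```python
# Sample mean, (N-1)-denominator variance, and their large-N uncertainties.
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(10.0, 2.0, size=500)         # assumed Gaussian sample
N = len(x)

mu_hat = x.mean()                           # (1/N) sum x_i
var_hat = x.var(ddof=1)                     # 1/(N-1) denominator (unbiased)
sigma_hat = np.sqrt(var_hat)

err_mu = sigma_hat / np.sqrt(N)             # sigma[mu^] = sigma/sqrt(N)
err_sigma = sigma_hat / np.sqrt(2 * N)      # sigma[sigma^] ~ sigma/sqrt(2N)
print(f"mu = {mu_hat:.3f} +- {err_mu:.3f}, "
      f"sigma = {sigma_hat:.3f} +- {err_sigma:.3f}")
```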

34 Maximum Likelihood Estimators
Define the likelihood for N independent measurements x_i:
L(θ) = Π_{i=1..N} f(x_i; θ)
and maximize it to determine the estimators of θ. This leads to a system of (generally nonlinear) equations for the parameters θ:
∂ln L/∂θ_i = 0, i = 1, ..., n
Solutions of these equations (often obtained numerically) determine the estimators. Their covariance matrix is given by
(V⁻¹)_ij = −∂²ln L/∂θ_i ∂θ_j, evaluated at θ = θ̂
The maximum likelihood method has the nice property that (in the limit of infinite statistics) it produces unbiased estimators with the smallest possible variance. But beware of small-statistics samples! ML fits are implemented in many statistical packages (ROOT, MATLAB) and can be applied to binned or unbinned data.
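
A minimal unbinned ML fit along these lines, assuming an exponential model f(x; τ) = e^(−x/τ)/τ and using scipy to minimize −ln L; the parameter error comes from the curvature (inverse Hessian), as in the covariance formula above:

```python
# Unbinned maximum-likelihood fit of an exponential lifetime.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.5, size=2000)   # data with true tau = 1.5

def nll(params):
    tau = np.exp(params[0])                 # fit ln(tau) so tau stays positive
    return np.sum(np.log(tau) + x / tau)    # -ln L for f(x;tau) = exp(-x/tau)/tau

res = minimize(nll, x0=[0.0], method="BFGS")
tau_hat = np.exp(res.x[0])
# curvature gives the error on ln(tau); propagate: sigma_tau = tau * sigma_lntau
tau_err = tau_hat * np.sqrt(res.hess_inv[0, 0])
print(f"tau = {tau_hat:.3f} +- {tau_err:.3f}")
```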

35 Least Squares Estimators
For a set of Gaussian-distributed variables y_i, define:
χ²(θ) = −2 ln L(θ) + constant = Σ_{i=1..N} (y_i − F(x_i; θ))²/σ_i²
or, for correlated variables,
χ²(θ) = (y − F(θ))ᵀ V⁻¹ (y − F(θ))
Estimators minimize the χ². In particular, if the function F is linear in the parameters θ,
F(x_i; θ) = Σ_{j=1..m} θ_j h_j(x_i),
the LS estimators are found by solving a system of linear equations analytically:
θ̂ = (Hᵀ V⁻¹ H)⁻¹ Hᵀ V⁻¹ y ≡ D y
Least-squares fits are typically done on binned data, and are implemented in most statistical packages (ROOT, MATLAB, even Excel).
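
A sketch of this analytic linear LS solution for a straight-line fit; the data points and their (uncorrelated) errors are assumed for illustration:

```python
# Linear least squares via the normal equations, theta^ = (H^T V^-1 H)^-1 H^T V^-1 y.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])    # roughly y = 2x (assumed data)
sigma = np.full_like(y, 0.3)

H = np.column_stack([np.ones_like(x), x])   # h_1 = 1, h_2 = x (line fit)
Vinv = np.diag(1.0 / sigma**2)              # V^-1 for uncorrelated errors

cov = np.linalg.inv(H.T @ Vinv @ H)         # parameter covariance matrix
theta = cov @ H.T @ Vinv @ y                # LS estimators
chi2 = (y - H @ theta) @ Vinv @ (y - H @ theta)
print(f"intercept = {theta[0]:.2f} +- {np.sqrt(cov[0, 0]):.2f}, "
      f"slope = {theta[1]:.2f} +- {np.sqrt(cov[1, 1]):.2f}, chi2/ndf = {chi2:.1f}/3")
```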

36 Confidence Limits

37-38 Error Intervals From Likelihood Ratio [figure-only slides]

39 χ² Quantiles [figure-only slide]

40 Confidence Limits
Frequentist approach: confidence belts [construction shown in the figure]. Small caveat: the interval is not unique. Use central intervals (equal area on both sides) or decide based on a likelihood ratio (e.g. Feldman-Cousins).

41 Bayesian Approach
Likelihood function + prior -> posterior PDF for the parameter. Treat it as a PDF and integrate. Caveat: the choice of prior.

42 Example (from Ben Hooberman's thesis)
[Figure: likelihood as a function of the branching fractions BF(Υ(3S) → eτ) (left) and BF(Υ(3S) → μτ) (right), in units of 10⁻⁶ [60]. The dotted red curve includes statistical uncertainties only; the solid blue curve includes systematic uncertainties as well. The shaded green regions bounded by the vertical lines indicate 90% of the area under the physical regions of the likelihood curves.]

43 Hypothesis Testing
Setting a confidence interval is a special case of the general problem of hypothesis testing, e.g. the hypothesis that x is within this interval, or that x belongs to a given distribution. Hypothesis testing is a procedure for assigning a significance (confidence) level to a test; it generally involves computing quantiles of a distribution.

44 Example: Gaussian Distribution
The probability that the measured value falls within ±δ of the true value μ:
1 − α = (1/(√(2π) σ)) ∫_{μ−δ}^{μ+δ} exp(−(x−μ)²/2σ²) dx = erf(δ/(√2 σ))

  α          δ    |  α       δ
  0.3173     1σ   |  0.2     1.28σ
  4.55×10⁻²  2σ   |  0.1     1.64σ
  2.7×10⁻³   3σ   |  0.05    1.96σ
  6.3×10⁻⁵   4σ   |  0.01    2.58σ
  5.7×10⁻⁷   5σ   |  0.001   3.29σ
  2.0×10⁻⁹   6σ   |  10⁻⁴    3.89σ
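
The table can be reproduced directly from the erf relation above; a short check with scipy:

```python
# Two-sided Gaussian tail probability alpha for delta = n*sigma.
from math import sqrt
from scipy.special import erf

for n_sigma in (1, 2, 3, 5):
    alpha = 1.0 - erf(n_sigma / sqrt(2.0))  # probability outside +- n sigma
    print(f"{n_sigma} sigma: alpha = {alpha:.3g}")
# 1 sigma: 0.317, 2 sigma: 0.0455, 3 sigma: 0.0027, 5 sigma: 5.7e-07
```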

45 Example: χ² p-values
[Figure (PDG): one minus the cumulative distribution, 1 − F(χ²; n), i.e. the p-value for a test (α for confidence intervals) as a function of χ², for n = 1 to 50 degrees of freedom.]
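
Instead of reading the curve, the p-value can be computed directly; the χ² value and number of degrees of freedom below are assumed for illustration:

```python
# Upper-tail chi^2 p-value, 1 - F(chi^2; n).
from scipy import stats

chi2_obs, ndf = 25.0, 15
p = stats.chi2.sf(chi2_obs, df=ndf)         # survival function = 1 - CDF
print(f"chi2/ndf = {chi2_obs}/{ndf} -> p = {p:.3f}")
```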

46 Systematics

47 Another Class of Errors
Statistical errors: the spread in values one would see if the experiment were repeated multiple times; the RMS of the estimator for an ensemble of experiments done under the same conditions (e.g. the same number of events). Several methods have been discussed:
- (Square root of the) variance of the estimator, if the PDF is known
- Curvature of log(likelihood)
- The Δlog(L) = 1/2 rule (or Δχ² = 1)
But there is another source of uncertainty in the results: systematics.
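
A sketch of the Δlog(L) = 1/2 rule in action: scan −ln L around its minimum and take the interval where it rises by 0.5. The exponential model and scan grid are illustrative assumptions:

```python
# Likelihood-scan interval from Delta(lnL) = 1/2.
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=1.5, size=2000)   # assumed exponential data

taus = np.linspace(1.2, 1.9, 701)
nll = np.array([np.sum(np.log(t) + x / t) for t in taus])
nll -= nll.min()                            # Delta(-lnL) relative to the minimum

tau_hat = taus[np.argmin(nll)]
inside = taus[nll <= 0.5]                   # points within Delta(lnL) = 1/2
print(f"tau = {tau_hat:.3f} +{inside.max() - tau_hat:.3f} "
      f"-{tau_hat - inside.min():.3f}")
```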

48 Simple Example
Mass spectrometer: m = qB²r²/(2V). Measure V and B for each run; the statistical error is the mass resolution, so averaging fluctuations gives a statistical error of resolution/√N. But common errors do not average out:
- Scale of B, V
- Radius r
- Velocity selection
- Energy loss (residual pressure)
- Etc., etc.

49 Combination of Errors
Normally, independent errors are added in quadrature. For instance, if the measurements of r, V, B are uncorrelated, then (to first order)
(σ(m)/m)² = (2σ(r)/r)² + (σ(V)/V)² + (2σ(B)/B)²
This is fine for a single ion. But when we average (take more data), we have to take into account the fact that the errors on r, V, B correlate the measurements of the mass for each ion.

50 Quadrature Sum
Statistical and systematic errors are typically quoted separately in experimental papers (though not in the PDG), e.g. σ = [15 ± 5 (stat.) ± 1 (syst.)] nb. It is understood that:
- The first number scales with the number of events while the second may not; splitting like this gives a feeling for how much a measurement could be improved with more data.
- Stat and syst errors are uncorrelated (if this is not the case, one has to say so explicitly!).
- Stat errors are uncorrelated between different experiments, while syst errors could be correlated (modeling, bias).

51 Classic Example (one of many)

52 Combining Errors
For one measurement with stat and syst errors, this is easy. Suppose we measure x1 = <x1> ± σ1 ± S. Split into random and systematic parts:
x1 = <x1> + x_R + x_S, with <x_R> = <x_S> = 0, <x_R²> = σ1², <x_S²> = S²
Total variance:
V[x1] = <x1²> − <x1>² = <(x_R + x_S)²> = σ1² + S²
Syst and stat errors are combined in quadrature.

53 Error Propagation
Full formula: for a function f(x) of variables x_i with covariance matrix V, assume small errors (i.e. keep the 1st Taylor term):
σ_f² ≈ Σ_ij (∂f/∂x_i)(∂f/∂x_j) V_ij
Consequences:
- If two measurements are correlated, it may be possible to find a combination with zero variance (det(V) = 0).
- For two fully-correlated measurements x1, x2 and X = x1 + x2: σ_X = σ1 + σ2. Errors add up linearly!

54 Error Propagation, General
For m functions f_1, ..., f_m of n variables x_1, ..., x_n:
V(f)_kl = Σ_ij (∂f_k/∂x_i)(∂f_l/∂x_j) V(x)_ij
or in matrix form V(f) = A V(x) Aᵀ, with the Jacobian A_ij = ∂f_i/∂x_j.
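
A sketch of this matrix rule with a numerical Jacobian; the function f and the input covariance are assumed for illustration:

```python
# Matrix error propagation V_f = A V_x A^T with a finite-difference Jacobian.
import numpy as np

def f(x):                                   # m = 2 functions of n = 2 variables
    return np.array([x[0] + x[1], x[0] * x[1]])

x0 = np.array([2.0, 3.0])
Vx = np.array([[0.04, 0.01],                # assumed input covariance matrix
               [0.01, 0.09]])

eps = 1e-6
A = np.column_stack([(f(x0 + eps * np.eye(2)[j]) - f(x0)) / eps
                     for j in range(2)])    # A_ij = df_i/dx_j
Vf = A @ Vx @ A.T                           # propagated covariance of f(x)
print("sigma(f) =", np.sqrt(np.diag(Vf)))
```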

55 Systematic Errors and Fitting
Use the covariance matrix in the χ²:
χ² = dᵀ V⁻¹ d, with d_i = y_i − y_i^fit
The same recipe can be applied to an ML fit (e.g. L ~ exp(−χ²/2)).
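
A minimal sketch of this correlated χ², assuming a common fully-correlated systematic added to diagonal statistical errors (all numbers are illustrative):

```python
# Correlated chi^2 = d^T V^-1 d with stat + common-systematic covariance.
import numpy as np

y = np.array([10.2, 9.8, 10.5])             # assumed measurements
y_fit = np.array([10.0, 10.0, 10.0])        # assumed fit values
stat = np.array([0.3, 0.3, 0.3])
syst = 0.2                                  # common error, 100% correlated

V = np.diag(stat**2) + syst**2 * np.ones((3, 3))
d = y - y_fit
chi2 = d @ np.linalg.inv(V) @ d
print(f"chi2 = {chi2:.2f}")
```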

56 Practical Implications
In the full formalism, one can still use the χ²/ndf test to determine the goodness of fit, but this will not work unless the correlations are taken into account. For simplicity, if all statistical errors are roughly equal and all systematic errors are common, one can do the fit with stat errors only (this determines the stat errors on the parameters), then propagate the syst errors. Limitations:
- More points do not improve the systematic error.
- The goodness of fit will not reveal unsuspected sources of systematics: all points move together, giving the same goodness of fit.