FYST17 Lecture 8 Statistics and hypothesis testing. Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons

Similar documents
Primer on statistics:

Hypothesis testing (cont d)

Use of the likelihood principle in physics. Statistics II

Statistics for the LHC Lecture 1: Introduction

Hypothesis testing:power, test statistic CMS:

Statistics for the LHC Lecture 2: Discovery

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

E. Santovetti lesson 4 Maximum likelihood Interval estimation

Statistical Methods for Particle Physics (I)

Statistical Data Analysis Stat 3: p-values, parameter estimation

Some Statistical Tools for Particle Physics

TUTORIAL 8 SOLUTIONS #

Partitioning the Parameter Space. Topic 18 Composite Hypotheses

ETH Zurich HS Mauro Donegà: Higgs physics meeting name date 1

Hypotheses Testing. Chapter Hypotheses and tests statistics

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

FYST17 Lecture 6 LHC Physics II

Institute of Actuaries of India

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

Hypothesis Testing Chap 10p460

Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters

Practice Problems Section Problems

Modern Methods of Data Analysis - WS 07/08

Advanced statistical methods for data analysis Lecture 1

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

FYST17 Lecture 6 LHC Physics II

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Tutorial 8: Discovery of the Higgs boson

Negative binomial distribution and multiplicities in p p( p) collisions

Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits

Recent developments in statistical methods for particle physics

Statistics for Resonance Search

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods

Statistics Challenges in High Energy Physics Search Experiments

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Search for high mass diphoton resonances at CMS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Discovery and Significance. M. Witherell 5/10/12

Introduction to the Terascale: DESY 2015

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

The Higgs boson discovery. Kern-und Teilchenphysik II Prof. Nicola Serra Dr. Annapaola de Cosa Dr. Marcin Chrzaszcz

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Statistical Methods in Particle Physics

Multivariate statistical methods and data mining in particle physics

Random Number Generation. CS1538: Introduction to simulations

Modern Methods of Data Analysis - WS 07/08

Hypothesis Testing - Frequentist

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!

If there exists a threshold k 0 such that. then we can take k = k 0 γ =0 and achieve a test of size α. c 2004 by Mark R. Bell,

Statistical Methods for LHC Physics

The search for standard model Higgs boson in ZH l + l b b channel at DØ

Combined Higgs Results

Statistical Methods for Astronomy

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

Discovery Potential for the Standard Model Higgs at ATLAS

40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology

Statistical Tests: Discriminants

Evidence for Single Top Quark Production. Reinhard Schwienhorst

STAT 830 Hypothesis Testing

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

Search for tt(h bb) using the ATLAS detector at 8 TeV

Higgs couplings and mass measurements with ATLAS. Krisztian Peters CERN On behalf of the ATLAS Collaboration

Search for SM Higgs Boson at CMS

Chapter 6. Hypothesis Tests Lecture 20: UMP tests and Neyman-Pearson lemma

Parameter Estimation and Fitting to Data

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

Some Anomalous Results from the CMS

8: Hypothesis Testing

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

Lecture Testing Hypotheses: The Neyman-Pearson Paradigm

simple if it completely specifies the density of x

Search for W' tb in the hadronic final state at ATLAS

Probability Density Functions

STAT 830 Hypothesis Testing

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

A Very Brief Summary of Statistical Inference, and Examples

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

arxiv: v1 [hep-ex] 18 Jul 2016

The Compact Muon Solenoid Experiment. Conference Report. Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland. Rare B decays at CMS

Summary of Columbia REU Research. ATLAS: Summer 2014

BEST TESTS. Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized.

Two Early Exotic searches with dijet events at ATLAS

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

P Values and Nuisance Parameters

Discovery significance with statistical uncertainty in the background estimate

Summary of Chapters 7-9

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

ATLAS and CMS Higgs Mass and Spin/CP Measurement

Seminario finale di dottorato

EVENT SELECTION CRITERIA OPTIMIZATION FOR THE DISCOVERY OF NEW PHYSICS WITH THE ATLAS EXPERIMENT

CONTINUOUS RANDOM VARIABLES

Statistical Methods in Particle Physics. Lecture 2

Review. December 4 th, Review

Hypothesis Testing. BS2 Statistical Inference, Lecture 11 Michaelmas Term Steffen Lauritzen, University of Oxford; November 15, 2004

How to find a Higgs boson. Jonathan Hays QMUL 12 th October 2012

Continuous Probability Distributions. Uniform Distribution

Transcription:

FYST17 Lecture 8 Statistics and hypothesis testing Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons 1

Plan for today: Introduction to concepts The Gaussian distribution Likelihood functions Hypothesis testing Including p-values and significance More examples 2

Interpretation of probability 3

PDF = probability density function 4

Definitions 5

PDF examples Binomial: N trials with p chance of success, probability for n successes: f If N and p 0 but Np then we have Poisson: (already N>50, p<0.1 works) f n; λ = λn n! e λ And for large s can use Gaussian 6

The Gaussian distribution The Gaussian pdf is defined by E[x] = V[x] = ² standard Gaussian If y is a Gaussian with, ², then follows (x) x = y μ σ 7

The Gaussian distribution It is useful to know the most common Gaussian integrals: @Wikipedia 8

ATLAS examples of Gaussian distributions Distribution of vertex z coordinate for tracks Invariant mass for K 0 s peak fitted with a Gaussian (!) 9

Central limit theorem What we used already in MC studies Central Limit theorem: The sum on N independent continuous random variables x i with means i and variances i ² becomes a Gaussian random variable with mean = i i and variance ² = i i ² in the limit that N approaches infinity Try for yourselves! Example: sum of 10 uniform numbers = Gaussian! Gaussian functions play important role in applied statistics Uncertainties tend to be Gaussian! 10

Quick exercise Measurement of transverse momentum of a track from a fit Radius of helix given by R=0.3Bp T Track fit returns a Gaussian uncertainty in the curvature, e.g. the pdf is Gaussian in 1/p T What is the error on p T? σ pt p T = p T σ 1/pT 11

Error propagation in two variables 13

Parameter estimation 14

Estimators 15

Likelihood functions Given a PDF f(x) with parameter(s), what is the probability that with N observations, x i falls in the intervals [x i ; x i + dx i ]? Described by the likelihood function: 16

Likelihood functions Given a set of measurements x i and parameter(s), the likelihood function is defined as: The principle of maximum likelihood for parameter estimation consists of maximizing the likelihood of parameter(s) (here ) given some data (here x) The likelihood function plays a central role in statistics, as it can shown to be: Consistent (converges to the right value!) Asymptotically normal (converges with Gaussian errors) Efficient and optimal if it can be applied in practice f Computational: often easier to minimize log likelihood: In problems with Gaussian errors boils down to a ² Two versions, in practice: - Binned likelihood - Unbinned likelihood 17

Binned likelihood Sum over bins in a histogram: 18

Unbinned likelihood Sum over single measurements: 19

Hypothesis testing 20

Hypotheses and acceptance/rejection regions 21

Test statistics 1. State hypothesis (null and alternative) 2. Set criteria for decision, select test statistics, select a significance level 3. Compute the value of the test statistics and from that the probability of observation under null-hypothesis (p-value) 4. Make the decision! Reject null hypothesis if p-value is below significance level Null hypothesis Alternative hypothesis 22

Test statistics 23

Example of hypothesis test The spin of the newly discovered Higgs-like particle (spin 0 or 2?) PDF of spin 2 hypothesis PDF of spin 0 hypothesis Test statistics (Likelihood ratio[decay angles] ) 24

Selection 25

Other selection options 26

ALICE example 27

ALICE example 28

Type I / Type II errors: 29

Signal/background efficiency 30

Neyman-Pearson s lemma The Neyman-Pearson lemma states: to ge the highest purity for a given efficiency, (i.e. highest power for a given significance level), choose the acceptance region such that: g(t H 0 ) g(t H 1 ) > c, where c=constant that determines the efficiency (Chap 5) This even gives that the likelihood ratio, 2 ln L 0 L 1, is the most powerful test 31

Significance tests/goodness of fit 32

p-values 33

Significance of an observed signal! 34

Significance of an observed signal! 35

Significance vs p-value Small p = unexpected 36

Significance of a peak 37

Significance of a peak 38

How many s? 39

Look-elsewhere-effect (LLE) Example from CDF: Is there a bump at 7.2 GeV? (and even 7.75 GeV?!) Excess has significance but when we take into account that the bump(s) could have been anywhere in the spectrum (the lookelsewhere-effect) significance is reduced: p-value(corr) = p-value (number of places it might have been spotted in spectrum) In this case ~ mass interval / width of bump Results in low significance Never saw these again 40

Remember the penta-quark 41

The ATLAS/CMS diphoton bump 42

(example 3.2) CMS 2012 Higgs result Limit setting 43

(example 3.2) Confidence intervals Upper limt on signal cross section/ SM Higgs cross section in H WW channel 44

Summary/ outlook Gaussian distribution very useful Errors tend to be gaussian To check a New Physics hypothesis against the Standard Model Define test statistics Define level of significance Remember the look elsewhere effect P-values gives P(data null hypothesis) It does not say whether the hypothesis is true! 45