Primer on statistics:

Size: px
Start display at page:

Download "Primer on statistics:"

Transcription

1 Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing Insight Data Science - AI Fellows Workshop Feb 16, 018

2 Outline 1. Maximum likelihood estimators. Variance of MLE Confidence intervals Efficiency example 5. A/B testing

3 Is this significant? Statistical questions: How can we calculate the best-fit Data estimate 011 of some parameter? (b) s Total background Point m H =15 estimation GeV, 1 x SM and confidence intervals = 7 TeV, Ldt = 4.8 fb How can we be precise and rigorous about how confident we are that a model is wrong? Hypothesis testing m llll -1 [GeV] Events / 5 GeV 1 Data s Total background m H =15 GeV, 1 x SM = 7 TeV, -1 Ldt = 4.8 fb 3 events ATLAS H ZZ 4l Has a local p0 of % (c) (*) [arxiv: ] m llll [GeV] 3

4 Confidence Intervals A frequentist confidence interval is constructed such that, given the model, if the experiment were repeated, each time creating an interval, 95% (or other CL) of the intervals would contain the true population parameter (i.e. the interval has 95% coverage). They can be one-sided exclusions, e.g. X > 100. at 95% CL Two-sided measurements, e.g. X = 15.1 ± 0. at 68% CL Contours in or more parameters This is not the same as saying There is a 95% probability that the true parameter is in my interval. Any probability assigned to a parameter strictly involves a Bayesian prior probability. Bayes theorem: P(Theory Data) P(Data Theory) P(Theory) likelihood prior 4

5 Maximum likelihood method

6 Maximum likelihood Consider a Gaussian distributed measurement: f 1 (x µ, )= 1 (x µ) p exp If we repeat the measurement, the joint PDF is just a product: f(~x µ, )= Y i f 1 (x i µ, ) The likelihood function is the same function as the PDF, only thought of as a function of the parameters, given the data. The experiment is over. L(µ, ~x) =f(~x µ, ) The likelihood principle states that the best estimate of the true parameters are the values which maximize the likelihood. 6

7 Maximum likelihood It is often more convenient to consider the log likelihood, which has the same maximum. ln L =ln Y f 1 = X ln f 1 = X 1 ln( ) i (x i µ) Maximize: ) X ln = X i x i ˆµ =0 (x i ˆµ) =0, ) ˆµ = 1 N X i x i = x Which agrees with our intuition that the best estimate of the mean of a Gaussian is the sample mean. 7

8 Maximum likelihood Note that in the case of a Gaussian PDF, maximizing likelihood is equivalent to minimizing. ln L = X i 1 ln( ) (x µ) is maximized when is minimized. = X i (x µ) This was a simple example of what statisticians call point estimation. Now we would like to quantify our error on this estimate. 8

9 Variance of MLEs

10 Variance True parameter µ,? Maximum Likelihood Estimator (MLE) ˆµ = 1 N X i x i = x Assuming this model, how confident are we that is close to µ,? What is the variance of ˆµ? ˆµ 10

11 Variance One would think that if the likelihood function varies rather slowly near the peak, then there is a wide range of values of the parameters that are consistent with the data, and thus the estimate should have a large error. To see the behavior of the likelihood function near the peak, consider the Taylor expansion of a general ln L of some parameter, nearitsmaximum likelihood estimate ˆ ln L ln L( ) > 0 1 =lnl(ˆ )+ ( ˆ ln ( ˆ ) + ˆ {z } 1/s Dropping the remaining terms would imply that ( ˆ ) L( ) =L(ˆ ) exp s! Note that the ln L( ) is parabolic. 11

12 Variance So if this were a good approximation, we would expect that the variance of ˆ would be given by V [ˆ ] ˆ = s ln ˆ 1 It turns out that there is more truth to this than you would think, given by an important theorem in statistics, the Cramér-Rao Inequality: V [ˆ ] ln An estimator s e ciency is defined to measure to what extent this inequality is equivalent: h 1/E ln "[ˆ ] ˆ V [ˆ ] ˆ 1

13 Variance It can be shown that in the large sample limit: Maximum likelihood estimators are unbiased and 100% e cient. Therefore, in principle, one can calculate the variance of an ML estimator with 1 ln L V [ˆ ] = Calculating the expectation value would involve an analytic integration over the PDFs of all our possible measurements, or a Monte Carlo simulation of it. In practice, one usually uses the observed maximum likelihood estimate as the expectation. ˆ V [ˆ ] ln ˆ 1 13

14 Variance Let s go back to our simple example of a Gaussian likelihood to test this method of calculating the ML estimator s variance. V [ˆµ] =! ln ln = i X i 1 ln( ) x i ˆµ = X i (x i µ) 1 = N ) V [ˆµ] = N ) ˆµ = p N Which many of you will recognize as the proper error on the sample mean. If you are unfamiliar with it, we can actually derive it analytically in this case. 14

15 Variance Analytic variance of a gaussian: f 1 (x µ, )= 1 (x µ) p exp V [ x] =E[ x * µ ] E[ x]! 0 = E 4 1 X x 1 N N i = 1 N E 4 X x i x j + X i6=j i 0 = X * µ N E[x] + X i6=j i X j x i x j 3 13 A5 µ 5 µ 1 E[x ] A µ Y To find E[x ], consider V [x] = = E[x ] E[x] = E[x ] µ ) E[x ]= + µ 15

16 Variance ) V [ x] = 1 N X i6=j µ + X i 1 ( + µ ) A µ = 1 N (N N)µ + N( + µ ) µ = N Which verifies the result we got from calculating derivatives of the likelihood 1 ln L V [ˆ ] In practice, one usually doesn t calculate this analytically, but instead: calculates the derivatives numerically, or uses the ln L or method, described now ˆ 16

17 Variance by Back to our Taylor expansion of ln L: ln L( ) =lnl(ˆ )+ ln ( ˆ ) + ˆ {z } 1/ ˆ Let ln L( ) ln L( ) ln L(ˆ ) ln L( ) ' ( ˆ ) ˆ! ˆ ± n ˆ ln L(ˆ ± n ˆ ) = (±n ˆ ) ˆ ln L(ˆ ± n ˆ ) = n 17

18 Variance by ln L(ˆ ± n ˆ ) = n This is the most common definition of the 68% and 95% confidence intervals: 68%/ 1 ˆ : ln L = 1 95%/ ˆ : ln L = 18

19 Variance by Recall that in the case that the PDF is Gaussian, the ln L is just the statistic. ln L =, = X (x ) ln L(ˆ ± n ˆ ) =lnl(ˆ ± n ˆ ) ln L max = n 1 ) (ˆ ± n ˆ ) min = n (ˆ ± n ˆ ) =n 68%/ 1 ˆ : =1 95%/ ˆ : =4 (3.84 for σ) 19

20 Variance by Multi-dimensional case Q c = c n=1 n= n=

21 Example: measuring an efficiency & A/B testing

22 Efficiency Out of n trials I measure k conversions Estimate the conversion rate its precision/confidence. Without a precision, we cannot know if observed changes are significant.

23 Efficiency f(k; n, p) = If Binomial Distribution n k p k (1 p) n k k = np(1 p) o k p (1 p) " = = n n p 0.01 ) (0.01) (0.99) " o = n o = 1 10 p n " MLE: 0.01 n ˆ" = n k for p =0.01 n " o " o/p % 1, %, % 10, % % 3

24 Example taken from here: A/B testing Data like a 1-dim, 4-bin histogram Assume conversion rate unchanged from A to B: " = N 0 A N A ' N 0 B N B ' N 0 B + N 0 B N A + N B Construct = X (x ) = (N 0 A N A ") N A " + ((N A N 0 A ) N A (1 ")) N A (1 ") +(A! B) 4

25 A/B testing Plugin above values gives: Remember: = 7.5 1σ (68%) : 1 σ (95%) : σ (99%) : 6.63 >99% CL significant improvement in B model 5

26 Hypothesis test 6

27 Take-aways MLEs are the best Don t just calculate MLE, find its variance! Quantify significance with a confidence interval Under many common assumptions ln L =, = X (x ) and one can calculate to determine confidence intervals/contours. 1σ (68%) : 1 σ (95%) : σ (99%) : 6.63 In the simplest case of measuring an efficiency, A/B-testing amounts to a 4-term that can be calculated by hand. If the is large, the change is significant! 7

28 Back up slides

29 Efficiency Other examples with numbers: Method Numerator Denominator Mean (Mode) Variance Uncertainty σ Poisson Binomial Bayesian (0.0) Method Numerator Denominator Mean (Mode) Variance Uncertainty σ Poisson Binomial Bayesian (0.9433)

30 Hypothesis testing Null hypothesis, H0: the SM Alternative hypothesis, H1: some new physics Type-I error: false positive rate (α) Type-II error: false negative rate (β) Power: 1-β Want to maximize power for a fixed false positive rate Particle physics has a tradition of claiming discovery at 5σ p0 = = 1 in 3.5 million, and presents exclusion with p0 = 5%, (95% CL coverage ). Neyman-Pearson lemma (1933): the most powerful test for fixed α is the likelihood ratio: L(x H 0 ) L(x H 1 ) > k α Neyman 30

31 Power & Significance 31

32 Variance by What if L( ) is not Gaussian, i.e. ln L( ) is not parabolic? Likelihood functions have an invariance property, such that if g(x) is a monotonic function, then the maximum likelihood estimate of g( ) is g(ˆ ). In principle, one can find a change of variables function g( ), for which the ln L(g( )) is parabolic as a function of g( ). Therefore,usingthe invariance of the likelihood function, one can make inferences about a parameter of a non-gaussian likelihood function without actually finding such a transformation [James p. 34]. 3

33 Systematics from: 33

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009 Statistics for Particle Physics Kyle Cranmer New York University 91 Remaining Lectures Lecture 3:! Compound hypotheses, nuisance parameters, & similar tests! The Neyman-Construction (illustrated)! Inverted

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

FYST17 Lecture 8 Statistics and hypothesis testing. Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons

FYST17 Lecture 8 Statistics and hypothesis testing. Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons FYST17 Lecture 8 Statistics and hypothesis testing Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons 1 Plan for today: Introduction to concepts The Gaussian distribution Likelihood functions Hypothesis

More information

Statistics for the LHC Lecture 2: Discovery

Statistics for the LHC Lecture 2: Discovery Statistics for the LHC Lecture 2: Discovery Academic Training Lectures CERN, 14 17 June, 2010 indico.cern.ch/conferencedisplay.py?confid=77830 Glen Cowan Physics Department Royal Holloway, University of

More information

Use of the likelihood principle in physics. Statistics II

Use of the likelihood principle in physics. Statistics II Use of the likelihood principle in physics Statistics II 1 2 3 + Bayesians vs Frequentists 4 Why ML does work? hypothesis observation 5 6 7 8 9 10 11 ) 12 13 14 15 16 Fit of Histograms corresponds This

More information

Some Statistical Tools for Particle Physics

Some Statistical Tools for Particle Physics Some Statistical Tools for Particle Physics Particle Physics Colloquium MPI für Physik u. Astrophysik Munich, 10 May, 2016 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

Hypothesis testing (cont d)

Hypothesis testing (cont d) Hypothesis testing (cont d) Ulrich Heintz Brown University 4/12/2016 Ulrich Heintz - PHYS 1560 Lecture 11 1 Hypothesis testing Is our hypothesis about the fundamental physics correct? We will not be able

More information

Hypothesis Testing - Frequentist

Hypothesis Testing - Frequentist Frequentist Hypothesis Testing - Frequentist Compare two hypotheses to see which one better explains the data. Or, alternatively, what is the best way to separate events into two classes, those originating

More information

Statistics for the LHC Lecture 1: Introduction

Statistics for the LHC Lecture 1: Introduction Statistics for the LHC Lecture 1: Introduction Academic Training Lectures CERN, 14 17 June, 2010 indico.cern.ch/conferencedisplay.py?confid=77830 Glen Cowan Physics Department Royal Holloway, University

More information

Hypothesis Testing: The Generalized Likelihood Ratio Test

Hypothesis Testing: The Generalized Likelihood Ratio Test Hypothesis Testing: The Generalized Likelihood Ratio Test Consider testing the hypotheses H 0 : θ Θ 0 H 1 : θ Θ \ Θ 0 Definition: The Generalized Likelihood Ratio (GLR Let L(θ be a likelihood for a random

More information

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery Statistical Methods in Particle Physics Lecture 2: Limits and Discovery SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits

Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits www.pp.rhul.ac.uk/~cowan/stat_aachen.html Graduierten-Kolleg RWTH Aachen 10-14 February 2014 Glen Cowan Physics Department

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters

Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters http://benasque.org/2018tae/cgi-bin/talks/allprint.pl TAE 2018 Centro de ciencias Pedro Pascual Benasque, Spain 3-15

More information

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester Physics 403 Credible Intervals, Confidence Intervals, and Limits Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Summarizing Parameters with a Range Bayesian

More information

E. Santovetti lesson 4 Maximum likelihood Interval estimation

E. Santovetti lesson 4 Maximum likelihood Interval estimation E. Santovetti lesson 4 Maximum likelihood Interval estimation 1 Extended Maximum Likelihood Sometimes the number of total events measurements of the experiment n is not fixed, but, for example, is a Poisson

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1 Lecture 2 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Basic Concepts of Inference

Basic Concepts of Inference Basic Concepts of Inference Corresponds to Chapter 6 of Tamhane and Dunlop Slides prepared by Elizabeth Newton (MIT) with some slides by Jacqueline Telford (Johns Hopkins University) and Roy Welsch (MIT).

More information

Statistical Methods for Particle Physics (I)

Statistical Methods for Particle Physics (I) Statistical Methods for Particle Physics (I) https://agenda.infn.it/conferencedisplay.py?confid=14407 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Inferring from data. Theory of estimators

Inferring from data. Theory of estimators Inferring from data Theory of estimators 1 Estimators Estimator is any function of the data e(x) used to provide an estimate ( a measurement ) of an unknown parameter. Because estimators are functions

More information

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods www.pp.rhul.ac.uk/~cowan/stat_aachen.html Graduierten-Kolleg RWTH Aachen 10-14 February 2014 Glen Cowan Physics

More information

Theory of Maximum Likelihood Estimation. Konstantin Kashin

Theory of Maximum Likelihood Estimation. Konstantin Kashin Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Multivariate statistical methods and data mining in particle physics

Multivariate statistical methods and data mining in particle physics Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Statistical Methods for Particle Physics Lecture 3: systematic uncertainties / further topics

Statistical Methods for Particle Physics Lecture 3: systematic uncertainties / further topics Statistical Methods for Particle Physics Lecture 3: systematic uncertainties / further topics istep 2014 IHEP, Beijing August 20-29, 2014 Glen Cowan ( Physics Department Royal Holloway, University of London

More information

arxiv: v1 [hep-ex] 9 Jul 2013

arxiv: v1 [hep-ex] 9 Jul 2013 Statistics for Searches at the LHC Glen Cowan Physics Department, Royal Holloway, University of London, Egham, Surrey, TW2 EX, UK arxiv:137.2487v1 [hep-ex] 9 Jul 213 Abstract These lectures 1 describe

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests http://benasque.org/2018tae/cgi-bin/talks/allprint.pl TAE 2018 Benasque, Spain 3-15 Sept 2018 Glen Cowan Physics

More information

Part 4: Multi-parameter and normal models

Part 4: Multi-parameter and normal models Part 4: Multi-parameter and normal models 1 The normal model Perhaps the most useful (or utilized) probability model for data analysis is the normal distribution There are several reasons for this, e.g.,

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy If your experiment needs statistics, you ought to have done a better experiment. -Ernest Rutherford Lecture 1 Lecture 2 Why do we need statistics? Definitions Statistical

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

P Values and Nuisance Parameters

P Values and Nuisance Parameters P Values and Nuisance Parameters Luc Demortier The Rockefeller University PHYSTAT-LHC Workshop on Statistical Issues for LHC Physics CERN, Geneva, June 27 29, 2007 Definition and interpretation of p values;

More information

Lecture 5: Bayes pt. 1

Lecture 5: Bayes pt. 1 Lecture 5: Bayes pt. 1 D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Bayes Probabilities

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit Statistics Lent Term 2015 Prof. Mark Thomson Lecture 2 : The Gaussian Limit Prof. M.A. Thomson Lent Term 2015 29 Lecture Lecture Lecture Lecture 1: Back to basics Introduction, Probability distribution

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture VIc (19.11.07) Contents: Maximum Likelihood Fit Maximum Likelihood (I) Assume N measurements of a random variable Assume them to be independent and distributed according

More information

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009 Statistics for Particle Physics Kyle Cranmer New York University 1 Hypothesis Testing 55 Hypothesis testing One of the most common uses of statistics in particle physics is Hypothesis Testing! assume one

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO LECTURE NOTES FYS 4550/FYS9550 - EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I PROBABILITY AND STATISTICS A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO Before embarking on the concept

More information

PART I INTRODUCTION The meaning of probability Basic definitions for frequentist statistics and Bayesian inference Bayesian inference Combinatorics

PART I INTRODUCTION The meaning of probability Basic definitions for frequentist statistics and Bayesian inference Bayesian inference Combinatorics Table of Preface page xi PART I INTRODUCTION 1 1 The meaning of probability 3 1.1 Classical definition of probability 3 1.2 Statistical definition of probability 9 1.3 Bayesian understanding of probability

More information

Signal detection theory

Signal detection theory Signal detection theory z p[r -] p[r +] - + Role of priors: Find z by maximizing P[correct] = p[+] b(z) + p[-](1 a(z)) Is there a better test to use than r? z p[r -] p[r +] - + The optimal

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Cosmological parameters A branch of modern cosmological research focuses on measuring

More information

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009 Systematic uncertainties in statistical data analysis for particle physics DESY Seminar Hamburg, 31 March, 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter

More information

Introduction to Error Analysis

Introduction to Error Analysis Introduction to Error Analysis Part 1: the Basics Andrei Gritsan based on lectures by Petar Maksimović February 1, 2010 Overview Definitions Reporting results and rounding Accuracy vs precision systematic

More information

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits www.pp.rhul.ac.uk/~cowan/stat_freiburg.html Vorlesungen des GK Physik an Hadron-Beschleunigern, Freiburg, 27-29 June,

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Advanced statistical methods for data analysis Lecture 1

Advanced statistical methods for data analysis Lecture 1 Advanced statistical methods for data analysis Lecture 1 RHUL Physics www.pp.rhul.ac.uk/~cowan Universität Mainz Klausurtagung des GK Eichtheorien exp. Tests... Bullay/Mosel 15 17 September, 2008 1 Outline

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood

More information

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

Some Topics in Statistical Data Analysis

Some Topics in Statistical Data Analysis Some Topics in Statistical Data Analysis Invisibles School IPPP Durham July 15, 2013 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan G. Cowan

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Asymptotic formulae for likelihood-based tests of new physics

Asymptotic formulae for likelihood-based tests of new physics Eur. Phys. J. C (2011) 71: 1554 DOI 10.1140/epjc/s10052-011-1554-0 Special Article - Tools for Experiment and Theory Asymptotic formulae for likelihood-based tests of new physics Glen Cowan 1, Kyle Cranmer

More information

Hypothesis testing: theory and methods

Hypothesis testing: theory and methods Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable

More information

Statistical Distributions and Uncertainty Analysis. QMRA Institute Patrick Gurian

Statistical Distributions and Uncertainty Analysis. QMRA Institute Patrick Gurian Statistical Distributions and Uncertainty Analysis QMRA Institute Patrick Gurian Probability Define a function f(x) probability density distribution function (PDF) Prob [A

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Topic 19 Extensions on the Likelihood Ratio

Topic 19 Extensions on the Likelihood Ratio Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Introductory Statistics Course Part II

Introductory Statistics Course Part II Introductory Statistics Course Part II https://indico.cern.ch/event/735431/ PHYSTAT ν CERN 22-25 January 2019 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Mathematical statistics

Mathematical statistics October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

STAT 730 Chapter 4: Estimation

STAT 730 Chapter 4: Estimation STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1 Estimation Theory Overview Properties Bias, Variance, and Mean Square Error Cramér-Rao lower bound Maximum likelihood Consistency Confidence intervals Properties of the mean estimator Properties of the

More information

Error analysis for efficiency

Error analysis for efficiency Glen Cowan RHUL Physics 28 July, 2008 Error analysis for efficiency To estimate a selection efficiency using Monte Carlo one typically takes the number of events selected m divided by the number generated

More information

Recent developments in statistical methods for particle physics

Recent developments in statistical methods for particle physics Recent developments in statistical methods for particle physics Particle Physics Seminar Warwick, 17 February 2011 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

Statistical Methods for Particle Physics

Statistical Methods for Particle Physics Statistical Methods for Particle Physics Invisibles School 8-13 July 2014 Château de Button Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Physics 736. Experimental Methods in Nuclear-, Particle-, and Astrophysics. - Statistics and Error Analysis -

Physics 736. Experimental Methods in Nuclear-, Particle-, and Astrophysics. - Statistics and Error Analysis - Physics 736 Experimental Methods in Nuclear-, Particle-, and Astrophysics - Statistics and Error Analysis - Karsten Heeger heeger@wisc.edu Schedule No lecture next Monday, April 8, 2013. Walter will be

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised September 2007 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in High Energy Physics. In statistics, we are interested in using a

More information

Hypothesis testing:power, test statistic CMS:

Hypothesis testing:power, test statistic CMS: Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this

More information

ST495: Survival Analysis: Hypothesis testing and confidence intervals

ST495: Survival Analysis: Hypothesis testing and confidence intervals ST495: Survival Analysis: Hypothesis testing and confidence intervals Eric B. Laber Department of Statistics, North Carolina State University April 3, 2014 I remember that one fateful day when Coach took

More information

Practical Statistics part II Composite hypothesis, Nuisance Parameters

Practical Statistics part II Composite hypothesis, Nuisance Parameters Practical Statistics part II Composite hypothesis, Nuisance Parameters W. Verkerke (NIKHEF) Summary of yesterday, plan for today Start with basics, gradually build up to complexity of Statistical tests

More information

Discovery significance with statistical uncertainty in the background estimate

Discovery significance with statistical uncertainty in the background estimate Glen Cowan, Eilam Gross ATLAS Statistics Forum 8 May, 2008 Discovery significance with statistical uncertainty in the background estimate Introduction In a search for a new type of event, data samples

More information

Statistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That

Statistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That Statistics Lecture 2 August 7, 2000 Frank Porter Caltech The plan for these lectures: The Fundamentals; Point Estimation Maximum Likelihood, Least Squares and All That What is a Confidence Interval? Interval

More information

POLI 8501 Introduction to Maximum Likelihood Estimation

POLI 8501 Introduction to Maximum Likelihood Estimation POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised April 1998 by F. James (CERN); February 2000 by R. Cousins (UCLA); October 2001, October 2003, and August 2005 by G. Cowan (RHUL). This chapter gives an overview

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics

More information

Lecture : Probabilistic Machine Learning

Lecture : Probabilistic Machine Learning Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

Midterm Examination. STA 215: Statistical Inference. Due Wednesday, 2006 Mar 8, 1:15 pm

Midterm Examination. STA 215: Statistical Inference. Due Wednesday, 2006 Mar 8, 1:15 pm Midterm Examination STA 215: Statistical Inference Due Wednesday, 2006 Mar 8, 1:15 pm This is an open-book take-home examination. You may work on it during any consecutive 24-hour period you like; please

More information

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45 Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS 21 June 2010 9:45 11:45 Answer any FOUR of the questions. University-approved

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

Review. December 4 th, Review

Review. December 4 th, Review December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information