Statistics and Data Analysis

Size: px
Start display at page:

Download "Statistics and Data Analysis"

Transcription

1 Statistics and Data Analysis The Crash Course Physics 226, Fall 2013 "There are three kinds of lies: lies, damned lies, and statistics. Mark Twain, allegedly after Benjamin Disraeli

2 Statistics and Data Analysis The Crash Course Physics 226, Fall 2013 "There are three kinds of lies: lies, damned lies, and statistics. Mark Twain, allegedly after Benjamin Disraeli

3 2 Intro The Crash Course Definitions: random vars, probability, PDFs Point estimators Max likelihood, least squares fits Hypothesis testing, confidence limits Monte Carlo techniques Systematic uncertainties

4 3 Necessary Tool Statistics Physics = Experimental Science Both qualitative and quantitative understanding is required Observations Laws Make a set of measurements or observations Summarize the results Most of the conclusions are drawn with some degree of (un)certainty» Ex: Gravity exists (qualitative).» F=G N m 1 m 2 /r 2 (quantitative)» But in reality G N and n=2 are known to some precision, e.g. G N = ( ± )*10-11 N*m 2 /kg 2 Many measurements are a priori uncertain (e.g. quantum physics), and have to be interpreted in probabilistic terms Estimating such probabilities given finite amount of data and to test whether a given model is consistent with the data is the basic task of classical statistics

5 4 Describing the Data Data: results of the measurements In physics, we mostly deal with quantitative data, i.e. set of numbers Other fields may deal with qualitative data An American Robin has gray upperparts and head, and orange underparts, usually brighter in the male Numbers are easier to handle mathematically (duh), hence statistics will deal with quantitative measurements Discrete data, e.g. integers (counts) Continuous data, e.g. energies, momenta, etc Write down with some precision, typically set by the measuring apparatus or other external conditions

6 5 Scatter plot Examples Asymmetry (ppm) Δy (µm)

7 6 Examples Graph

8 7 Histogram Examples

9 8 Frequentist interpretation: Probability is a limiting frequency a given outcome is reported when experiments are repeating an infinite number of times Measurable parameters are represented by estimators with assigned confidence levels. CL measures a probability an estimator would fall in a certain range, given a true value of a parameter. No probability is assigned to constants of nature. Bayesian interpretation: Probability: Two Interpretations More general: define probability as a degree of belief that a given statement is true E.g. that the true value of parameter x is in interval [a,b] This is somewhat subjective, but follows how most humans think

10 9 Bayes Theorem Conditional probability of A given B P (A B) = P (A B) P (B) Interpreted within Bayesian statistics as P (theory data) P (data theory)p (theory) Posterior probability Likelihood (result of the measurement) Prior probability (initial prejudice) Allows one to interpret a single experiment as a measure of (subjective) probability that a given hypothesis is correct (e.g. that some fundamental constant is in some range). Requires assigning some probability interpretation to prior knowledge, which is where subjectivity comes in.

11 10 Random Variables Random variable: a numerical outcome of a (repeatable) measurement Characterized by a Probability Density Function Depends on a set of parameters θ Cumulative distribution (CDF): F (a) = a f(x) dx

12 11 Expectation Values Expectation value of function u(x): E[u(x)] = Moments of a random variable x: u(x) f(x) dx Special moments: µ α 1, σ 2 V [x] m 2 = α 2 µ 2 α n E[x n ]= m n E[(x α 1 ) n ]= x n f(x) dx (x α 1 ) n f(x) dx n-th moment n-th central moment Mean Variance

13 12 Examples 09/10/ /15/2011 YGK, Phys226: YGK, Statistics Phys129

14 13 Sample from a Continuous PDF

15 14 Sample From a Continuous PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

16 15 Sample From a Continuous PDF PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

17 16 Sample From a Discrete PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

18 17 Sample From a Discrete PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

19 18 Sample From a Discrete PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

20 19 PDF Cumulative Distribution Cumulative Distribution 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

21 20 2d Distribution dx dy 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

22 21 2d Distribution 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

23 22 Marginal PDF 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

24 23 Correlations Between Variables Covariance: cov[x, y] =E[(x µ x )(y µ y )] = E[xy] µ x µ y Represent N-dimensional parameter space in terms of covariance matrix Vij Diagonal elements: variances (squares of RMS) Off-diagonal: covariances Correlation: normalized covariance ρ xy =cov[x, y]/σ x σ y

25 24 Profile Histogram 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

26 25 Common PDFs

27 26 Poisson Distribution 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

28 27 Gaussian Distribution

29 28 χ 2 Distribution

30 29 Cauchy (Breit-Wigner) PDF

31 30 Point Estimation Standard problem: set of values x 1, x 2,, x n described by PDF data parameter(s) Point estimation: want to construct Estimator of parameter θ

32 31 Estimator Properties Consistency Approaches true value asymptotically for infinite dataset Bias Difference wrt true value for finite dataset Efficiency Variance of the estimator (compared to others) Sufficiency Dependence on true value Robustness Sensitivity to bad data, e.g. outliers Others: physicality, tractable-ness, etc. No ideal recipe, what is best depends on the problem

33 32 Basic Estimators Estimators for mean and variance Shape of the PDF (fitting): Maximum likelihood Most efficient, but may be biased Goodness of fit is not readily available Least Chisquared ML for gaussian-distributed data Convenient for binned data, analytic solutions for linear functions Automatic goodness-of-fit measure Be careful of gaussian approximations (e.g. when Poisson becomes Gaussian)

34 33 Mean and Variance from a Sample Estimators: (equally weighted data) µ = 1 N Variances of these estimators: V [ˆµ] = σ2 N [ σ2] V = 1 N N i=1 σ 2 = 1 N 1 i.e. x i N (x i µ) 2 i=1 N>0 N>1 σ [ˆµ] =σ/ N ( m 4 N 3 ) N 1 σ4 σ [ˆσ] =σ/ 2N for Gaussian distribution of x and large N

35 34 Maximum Likelihood Estimators Define likelihood for N independent measurements x i : L(θ) = N f(x i ; θ) i=1 max to determine estimators of θ This leads to a system of (generally nonlinear) equations for parameters θ: ln L θ i =0, i =1,...,n Solutions of these equations (often done numerically) determine estimators. Their covariance matrix iis given by ( V 1 ) ij = 2 ln L θ i θ j θ Maximum likelihood method has a nice property that (in the limit of infinite. (33 10) can result in an statistics) it produces unbiased estimators with smallest possible variance. But beware of small statistics samples! ML fits are implemented in many statistical packages (ROOT, MATLAB). Can be applied to binned or unbinned data

36 35 Least Squares Estimators For a set of Gaussian-distributed variables y i, define: χ 2 (θ) = 2lnL(θ)+ constant = N i=1 Or, for correlated variables! χ 2 (θ) =(y F (θ)) T V 1 (y F (θ)) Estimators: i (y i F (x i ; θ)) 2 σ 2 i In particular, if the function F is linear in parameters θ, LS estimators are found by y solving a a system of of linear linear equations analytically:! m F (x i ; θ) = θ j h j (x i ) θ =(H T V 1 H) 1 H T V 1 y Dy. j=1 Least-squares east-squares fits fits are are typically typically done done on on binned binned data, data, and and implemented implemented in in most statistical packages (ROOT, MATLAB, even Excel)

37 36 Confidence Limits

38 37 Error Intervals From Likelihood Ratio

39 38 Error Intervals From Likelihood Ratio

40 39 Chi2 Quantiles

41 40 Confidence Limits Frequentist approach: confidence belts Define Small caveat: interval not unique. Use central intervals (equal area on both sides) or decide based on likelihood ratio (e.g. Feldman-Cousins) x 0

42 41 Bayesian Approach Likelihood function + prior -> posterior for parameter Treat as PDF and integrate Caveat: choice of prior

43 42 Ben Hooberman s thesis! Example Likelihood Likelihood BF(ϒ(3S) eτ) ( 10 ) BF(ϒ(3S) µτ) ( 10 ) Figure 2.33: Likelihood as a function of the branching fractions (left) and (right) [60]. The dotted red curve includes statistical uncertainties only, the solid blue curve includes systematic uncertainties as well. The shaded green regions bounded by the vertical lines indicate 90% of the area under the physical ( )regionsofthelikelihood curves.

44 43 Hypothesis Testing Setting a confidence interval is a special case of a general problem of hypothesis testing E.g. hypothesis is that x is within this interval Or x belongs to a distribution Hypothesis testing is a procedure for assigning a significance (confidence) level to a test Generally involves computing quintiles of a distribution

45 44 Example: Gaussian distribution 1 α = 1 2πσ µ+δ µ δ e (x µ)2 /2σ 2 dx =erf ( δ 2 σ ) α/2 that the measured value will fall within ± of the tru 1 α f (x; µ,σ) α/2 α δ α δ σ σ σ σ σ σ σ σ σ σ σ σ ± (x µ)/σ

46 45 Example: chi-squared p-values p-value for test α for confidence intervals n = e 33.1: One minus the cumulative distribution, 1 ( ; ), for de χ 2

47 46 Systematics

48 47 Statistical errors: Another Class of Errors Spread in values one would see if the experiment were repeated multiple times RMS of the estimator for an ensemble of experiments done under the same conditions (e.g. numbers of events) Several methods discussed (sqrt)variance of the estimator if PDF is known Curvature of log(likelihood) Δlog(L) = 1/2 rule (or Δχ 2 =1) But there is another source of uncertainty in results: systematics

49 48 Mass spectrometer error: resolutio Stat error: resolution/sqrt(n) m = qrb2 2V Measure V,B for each run Average fluctuations Common errors do not average out Scale of B,V Radius r Velocity selection Energy loss (residual pressure) Etc, etc. Simple Example 09/10/2013 YGK, Phys226: YGK, Statistics Phys129

50 49 Combination of Errors Normally, independent errors are added in quadrature For instance, if measurements of r,v,b are uncorrelated, then (to first order) "(m) m = #"(r)& % ( $ r ' 2 # + "(V ) & % ( $ V ' This is fine for a single ion But when we average (take more data), have to take into account the fact that errors on r,v,b correlate measurements of mass for each ion 2 # + 2 "(B) & % ( $ B ' 2

51 50 Quadrature Sum Stat and syst errors are typically quoted separately in experimental papers (though not in PDG) E.g. σ=[15 ± 5 (stat.) ± 1 (syst.)] nb It is understood that the first number scales with the number of events while the second may not Splitting like this gives a feeling of how much a measurement could be improved with more data It is also understood that stat and syst errors are uncorrelated (if this is not the case, have to say so explicitly!) It is also understood that stat errors are uncorrelated between different experiments, while syst errors could be correlated (modeling, bias)

52 51 Classic Example (one of many)

53 52 Combining Errors For one measurement with stat and syst errors, this is easy Suppose we measure x 1 =<x 1 >±σ 1 ±S Split into random and systematic parts x 1 =<x 1 >+x R +x S <x R >=<x S >=0, <(x R ) 2 >=σ 1, <(x S ) 2 >=S Total variance V[x 1 ]=<x 1 2>-<x 1 > 2 =< (x R +x S ) 2 >= σ 1 2+S 2 Syst and stat errors are combined in quadrature

54 53 Error Propagation Fully fledged formula Assume small errors (i.e. keep 1st Taylor term): Consequences If two measurements are correlated, it may be possible to find a combination with zero variance (det(v)=0) Two fully-correlated measurements x 1,x 2 ; for X=x 1 +x 2 : Errors add up linearly!

55 54 Error Propagation, General m functions f 1,,f m, n variables x 1,,x n : Or in matrix form and

56 55 Systematic Errors and Fitting Use covariance matrix in χ 2 : d i =(y i -y i fit) Can apply the same recipe for ML fit (e.g. L~exp(-χ 2 /2))

57 56 Practical Implications In the full formalism, can still use χ 2 /df test to determine the goodness of fit But this will not work unless correlations are taken into account For simplicity, if all stat errors are roughly equal and all systematic errors are common, can do the fit with stat errors only (this will determine stat errors on parameters), then propagate syst errors Limitations More points do not improve the systematic error Goodness of fit would not reveal unsuspected sources of systematics All points move together -- same goodness of fit

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO LECTURE NOTES FYS 4550/FYS9550 - EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I PROBABILITY AND STATISTICS A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO Before embarking on the concept

More information

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit Statistics Lent Term 2015 Prof. Mark Thomson Lecture 2 : The Gaussian Limit Prof. M.A. Thomson Lent Term 2015 29 Lecture Lecture Lecture Lecture 1: Back to basics Introduction, Probability distribution

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester Physics 403 Propagation of Uncertainties Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Maximum Likelihood and Minimum Least Squares Uncertainty Intervals

More information

Statistical techniques for data analysis in Cosmology

Statistical techniques for data analysis in Cosmology Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction

More information

2. A Basic Statistical Toolbox

2. A Basic Statistical Toolbox . A Basic Statistical Toolbo Statistics is a mathematical science pertaining to the collection, analysis, interpretation, and presentation of data. Wikipedia definition Mathematical statistics: concerned

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised April 1998 by F. James (CERN); February 2000 by R. Cousins (UCLA); October 2001, October 2003, and August 2005 by G. Cowan (RHUL). This chapter gives an overview

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Introduction to Statistics and Error Analysis

Introduction to Statistics and Error Analysis Introduction to Statistics and Error Analysis Physics116C, 4/3/06 D. Pellett References: Data Reduction and Error Analysis for the Physical Sciences by Bevington and Robinson Particle Data Group notes

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised September 2007 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in High Energy Physics. In statistics, we are interested in using a

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

Lectures on Statistical Data Analysis

Lectures on Statistical Data Analysis Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

E. Santovetti lesson 4 Maximum likelihood Interval estimation

E. Santovetti lesson 4 Maximum likelihood Interval estimation E. Santovetti lesson 4 Maximum likelihood Interval estimation 1 Extended Maximum Likelihood Sometimes the number of total events measurements of the experiment n is not fixed, but, for example, is a Poisson

More information

Introduction to Statistical Methods for High Energy Physics

Introduction to Statistical Methods for High Energy Physics Introduction to Statistical Methods for High Energy Physics 2011 CERN Summer Student Lectures Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits www.pp.rhul.ac.uk/~cowan/stat_freiburg.html Vorlesungen des GK Physik an Hadron-Beschleunigern, Freiburg, 27-29 June,

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

Parameter Estimation and Fitting to Data

Parameter Estimation and Fitting to Data Parameter Estimation and Fitting to Data Parameter estimation Maximum likelihood Least squares Goodness-of-fit Examples Elton S. Smith, Jefferson Lab 1 Parameter estimation Properties of estimators 3 An

More information

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Probability theory and statistical analysis: a review Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Concepts assumed known Histograms, mean, median, spread, quantiles Probability,

More information

Physics 509: Propagating Systematic Uncertainties. Scott Oser Lecture #12

Physics 509: Propagating Systematic Uncertainties. Scott Oser Lecture #12 Physics 509: Propagating Systematic Uncertainties Scott Oser Lecture #1 1 Additive offset model Suppose we take N measurements from a distribution, and wish to estimate the true mean of the underlying

More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information

arxiv:hep-ex/ v1 2 Jun 2000

arxiv:hep-ex/ v1 2 Jun 2000 MPI H - V7-000 May 3, 000 Averaging Measurements with Hidden Correlations and Asymmetric Errors Michael Schmelling / MPI for Nuclear Physics Postfach 03980, D-6909 Heidelberg arxiv:hep-ex/0006004v Jun

More information

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability

More information

Statistics. Lecture 4 August 9, 2000 Frank Porter Caltech. 1. The Fundamentals; Point Estimation. 2. Maximum Likelihood, Least Squares and All That

Statistics. Lecture 4 August 9, 2000 Frank Porter Caltech. 1. The Fundamentals; Point Estimation. 2. Maximum Likelihood, Least Squares and All That Statistics Lecture 4 August 9, 2000 Frank Porter Caltech The plan for these lectures: 1. The Fundamentals; Point Estimation 2. Maximum Likelihood, Least Squares and All That 3. What is a Confidence Interval?

More information

Error analysis for efficiency

Error analysis for efficiency Glen Cowan RHUL Physics 28 July, 2008 Error analysis for efficiency To estimate a selection efficiency using Monte Carlo one typically takes the number of events selected m divided by the number generated

More information

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester Physics 403 Credible Intervals, Confidence Intervals, and Limits Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Summarizing Parameters with a Range Bayesian

More information

Inferring from data. Theory of estimators

Inferring from data. Theory of estimators Inferring from data Theory of estimators 1 Estimators Estimator is any function of the data e(x) used to provide an estimate ( a measurement ) of an unknown parameter. Because estimators are functions

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Practical Statistics

Practical Statistics Practical Statistics Lecture 1 (Nov. 9): - Correlation - Hypothesis Testing Lecture 2 (Nov. 16): - Error Estimation - Bayesian Analysis - Rejecting Outliers Lecture 3 (Nov. 18) - Monte Carlo Modeling -

More information

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Fourier and Stats / Astro Stats and Measurement : Stats Notes Fourier and Stats / Astro Stats and Measurement : Stats Notes Andy Lawrence, University of Edinburgh Autumn 2013 1 Probabilities, distributions, and errors Laplace once said Probability theory is nothing

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10 Physics 509: Error Propagation, and the Meaning of Error Bars Scott Oser Lecture #10 1 What is an error bar? Someone hands you a plot like this. What do the error bars indicate? Answer: you can never be

More information

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM)

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Alexandr Malusek Division of Radiological Sciences Department of Medical and Health Sciences Linköping University 2014-04-15

More information

Primer on statistics:

Primer on statistics: Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

L2: Review of probability and statistics

L2: Review of probability and statistics Probability L2: Review of probability and statistics Definition of probability Axioms and properties Conditional probability Bayes theorem Random variables Definition of a random variable Cumulative distribution

More information

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions ASTR 511/O Connell Lec 6 1 STATISTICS OF OBSERVATIONS & SAMPLING THEORY References: Bevington Data Reduction & Error Analysis for the Physical Sciences LLM: Appendix B Warning: the introductory literature

More information

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006 Astronomical p( y x, I) p( x, I) p ( x y, I) = p( y, I) Data Analysis I Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK 10 lectures, beginning October 2006 4. Monte Carlo Methods

More information

PHYS 6710: Nuclear and Particle Physics II

PHYS 6710: Nuclear and Particle Physics II Data Analysis Content (~7 Classes) Uncertainties and errors Random variables, expectation value, (co)variance Distributions and samples Binomial, Poisson, and Normal Distribution Student's t-distribution

More information

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery

Statistical Methods in Particle Physics Lecture 2: Limits and Discovery Statistical Methods in Particle Physics Lecture 2: Limits and Discovery SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

A732: Exercise #7 Maximum Likelihood

A732: Exercise #7 Maximum Likelihood A732: Exercise #7 Maximum Likelihood Due: 29 Novemer 2007 Analytic computation of some one-dimensional maximum likelihood estimators (a) Including the normalization, the exponential distriution function

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Cosmological parameters A branch of modern cosmological research focuses on measuring

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

Statistical Data Analysis 2017/18

Statistical Data Analysis 2017/18 Statistical Data Analysis 2017/18 London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised September 2009 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in high-energy physics. In statistics, we are interested in using a

More information

Investigation of Possible Biases in Tau Neutrino Mass Limits

Investigation of Possible Biases in Tau Neutrino Mass Limits Investigation of Possible Biases in Tau Neutrino Mass Limits Kyle Armour Departments of Physics and Mathematics, University of California, San Diego, La Jolla, CA 92093 (Dated: August 8, 2003) We study

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Statistische Methoden der Datenanalyse. Kapitel 1: Fundamentale Konzepte. Professor Markus Schumacher Freiburg / Sommersemester 2009

Statistische Methoden der Datenanalyse. Kapitel 1: Fundamentale Konzepte. Professor Markus Schumacher Freiburg / Sommersemester 2009 Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 1 Statistische Methoden der Datenanalyse Kapitel 1: Fundamentale Konzepte Professor Markus Schumacher

More information

BACKGROUND NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2016 PROBABILITY A. STRANDLIE NTNU AT GJØVIK AND UNIVERSITY OF OSLO

BACKGROUND NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2016 PROBABILITY A. STRANDLIE NTNU AT GJØVIK AND UNIVERSITY OF OSLO ACKGROUND NOTES FYS 4550/FYS9550 - EXERIMENTAL HIGH ENERGY HYSICS AUTUMN 2016 ROAILITY A. STRANDLIE NTNU AT GJØVIK AND UNIVERSITY OF OSLO efore embarking on the concept of probability, we will first define

More information

Physics 6720 Introduction to Statistics April 4, 2017

Physics 6720 Introduction to Statistics April 4, 2017 Physics 6720 Introduction to Statistics April 4, 2017 1 Statistics of Counting Often an experiment yields a result that can be classified according to a set of discrete events, giving rise to an integer

More information

Introduction to Error Analysis

Introduction to Error Analysis Introduction to Error Analysis Part 1: the Basics Andrei Gritsan based on lectures by Petar Maksimović February 1, 2010 Overview Definitions Reporting results and rounding Accuracy vs precision systematic

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv: )

Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv: ) Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv:1809.05778) Workshop on Advanced Statistics for Physics Discovery aspd.stat.unipd.it Department of Statistical Sciences, University of

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Introductory Statistics Course Part II

Introductory Statistics Course Part II Introductory Statistics Course Part II https://indico.cern.ch/event/735431/ PHYSTAT ν CERN 22-25 January 2019 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Revised May 1996 by D.E. Groom (LBNL) and F. James (CERN), September 1999 by R. Cousins (UCLA), October 2001 and October 2003 by G. Cowan (RHUL).

Revised May 1996 by D.E. Groom (LBNL) and F. James (CERN), September 1999 by R. Cousins (UCLA), October 2001 and October 2003 by G. Cowan (RHUL). 31. Probability 1 31. PROBABILITY Revised May 1996 by D.E. Groom (LBNL) and F. James (CERN), September 1999 by R. Cousins (UCLA), October 2001 and October 2003 by G. Cowan (RHUL). 31.1. General [1 8] An

More information

Parameter Estimation and Hypothesis Testing

Parameter Estimation and Hypothesis Testing Parameter Estimation and Hypothesis Testing Parameter estimation Maximum likelihood Least squares Hypothesis tests Goodness-of-fit Elton S. Smith Jefferson Lab Con ayuda de Eduardo Medinaceli y Cristian

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture VII (26.11.07) Contents: Maximum Likelihood (II) Exercise: Quality of Estimators Assume hight of students is Gaussian distributed. You measure the size of N students.

More information

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests http://benasque.org/2018tae/cgi-bin/talks/allprint.pl TAE 2018 Benasque, Spain 3-15 Sept 2018 Glen Cowan Physics

More information

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009 Systematic uncertainties in statistical data analysis for particle physics DESY Seminar Hamburg, 31 March, 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Measurement And Uncertainty

Measurement And Uncertainty Measurement And Uncertainty Based on Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST Technical Note 1297, 1994 Edition PHYS 407 1 Measurement approximates or

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Asymptotic formulae for likelihood-based tests of new physics

Asymptotic formulae for likelihood-based tests of new physics Eur. Phys. J. C (2011) 71: 1554 DOI 10.1140/epjc/s10052-011-1554-0 Special Article - Tools for Experiment and Theory Asymptotic formulae for likelihood-based tests of new physics Glen Cowan 1, Kyle Cranmer

More information

Expect Values and Probability Density Functions

Expect Values and Probability Density Functions Intelligent Systems: Reasoning and Recognition James L. Crowley ESIAG / osig Second Semester 00/0 Lesson 5 8 april 0 Expect Values and Probability Density Functions otation... Bayesian Classification (Reminder...3

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

01 Probability Theory and Statistics Review

01 Probability Theory and Statistics Review NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement

More information

Introduction to Statistics and Data Analysis

Introduction to Statistics and Data Analysis Introduction to Statistics and Data Analysis RSI 2005 Staff July 15, 2005 Variation and Statistics Good experimental technique often requires repeated measurements of the same quantity These repeatedly

More information

YETI IPPP Durham

YETI IPPP Durham YETI 07 @ IPPP Durham Young Experimentalists and Theorists Institute Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan Course web page: www.pp.rhul.ac.uk/~cowan/stat_yeti.html

More information

Estimation Theory. as Θ = (Θ 1,Θ 2,...,Θ m ) T. An estimator

Estimation Theory. as Θ = (Θ 1,Θ 2,...,Θ m ) T. An estimator Estimation Theory Estimation theory deals with finding numerical values of interesting parameters from given set of data. We start with formulating a family of models that could describe how the data were

More information

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Data Analysis and Monte Carlo Methods

Data Analysis and Monte Carlo Methods Lecturer: Allen Caldwell, Max Planck Institute for Physics & TUM Recitation Instructor: Oleksander (Alex) Volynets, MPP & TUM General Information: - Lectures will be held in English, Mondays 16-18:00 -

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Estimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction

Estimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction Estimation Tasks Short Course on Image Quality Matthew A. Kupinski Introduction Section 13.3 in B&M Keep in mind the similarities between estimation and classification Image-quality is a statistical concept

More information

Expectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Expectation. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Expectation DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean, variance,

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation

More information

Lectures on Statistics. William G. Faris

Lectures on Statistics. William G. Faris Lectures on Statistics William G. Faris December 1, 2003 ii Contents 1 Expectation 1 1.1 Random variables and expectation................. 1 1.2 The sample mean........................... 3 1.3 The sample

More information