Parameter Estimation and Hypothesis Testing

Similar documents
Parameter Estimation and Fitting to Data

Hypothesis testing (cont d)

Statistical Methods for Astronomy

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistics and Data Analysis

Statistical Methods in Particle Physics

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Statistical Methods for Particle Physics Lecture 3: Systematics, nuisance parameters

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

Practice Problems Section Problems

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Composite Hypotheses and Generalized Likelihood Ratio Tests

Institute of Actuaries of India

Introduction to Error Analysis

Some Statistical Tools for Particle Physics

PHYS 6710: Nuclear and Particle Physics II

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions

Hypothesis testing:power, test statistic CMS:

Statistics for the LHC Lecture 2: Discovery

32. STATISTICS. 32. Statistics 1

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

32. STATISTICS. 32. Statistics 1

E. Santovetti lesson 4 Maximum likelihood Interval estimation

10. Composite Hypothesis Testing. ECE 830, Spring 2014

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Statistical Methods for Particle Physics Lecture 4: discovery, exclusion limits

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

simple if it completely specifies the density of x

2. A Basic Statistical Toolbox

Recall the Basics of Hypothesis Testing

Psychology 282 Lecture #4 Outline Inferences in SLR

Statistical Methods for Particle Physics (I)

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

Frequentist-Bayesian Model Comparisons: A Simple Example

Physics 736. Experimental Methods in Nuclear-, Particle-, and Astrophysics. - Statistics and Error Analysis -

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?

1 Degree distributions and data

32. STATISTICS. 32. Statistics 1

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

Some Statistics. V. Lindberg. May 16, 2007

Practical Statistics

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Modern Methods of Data Analysis - WS 07/08

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Statistics for the LHC Lecture 1: Introduction

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

MS&E 226: Small Data

Journeys of an Accidental Statistician

Goodness of Fit Goodness of fit - 2 classes

Generalized Linear Models (1/29/13)

Chapter 8 - Statistical intervals for a single sample

Primer on statistics:

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

Physics 6720 Introduction to Statistics April 4, 2017

Introductory Statistics Course Part II

Statistical Methods for Particle Physics Lecture 3: systematic uncertainties / further topics

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46

If we want to analyze experimental or simulated data we might encounter the following tasks:

Multivariate statistical methods and data mining in particle physics

1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Appendix A Numerical Tables

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Measurement And Uncertainty

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Negative binomial distribution and multiplicities in p p( p) collisions

Least Squares Regression

Hypothesis Testing - Frequentist

FYST17 Lecture 8 Statistics and hypothesis testing. Thanks to T. Petersen, S. Maschiocci, G. Cowan, L. Lyons

Fourier and Stats / Astro Stats and Measurement : Stats Notes

arxiv: v3 [physics.data-an] 24 Jun 2013

Recent developments in statistical methods for particle physics

Recommendations for presentation of error bars

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011

Least Squares Regression

Ling 289 Contingency Table Statistics

4.5.1 The use of 2 log Λ when θ is scalar

Wald s theorem and the Asimov data set

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation

Topic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing. Road Map Sampling Distributions, Confidence Intervals & Hypothesis Testing

Statistical Methods for Astronomy

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

Inferring from data. Theory of estimators

Simple and Multiple Linear Regression

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Search and Discovery Statistics in HEP

Statistics Challenges in High Energy Physics Search Experiments

Review of Maximum Likelihood Estimators

Parametric Techniques

Transcription:

Parameter Estimation and Hypothesis Testing Parameter estimation Maximum likelihood Least squares Hypothesis tests Goodness-of-fit Elton S. Smith Jefferson Lab Con ayuda de Eduardo Medinaceli y Cristian Peña 1

Dictionary / Diccionario systematic sistematico spin espin background trasfondo scintillator contador de centelleo random aleatorio (al azar) scaler escalar histogram histograma degrees of freedom grados de libertad signal sen~al power potencia maximum likelihood maxima verosimilitud test statistic prueba estadistica least squares minimos cuadrados goodness bondad/fidelidad chi-square ji-cuadrado estimation estima confidence intervals and limits intervalos de confianza y limites significance significancia frequentist frecuentista? conditional p p condicional nuissance parameters parametros molestosos? (A given B) (A dado B) outline indice fit ajuste (encaje) asymmetry asimetria parent distribution distr del patron variance varianza signal-to-background fraction sen~al/trasfondo root-mean-square raiz quadrada media (rms) sample muestra biased sesgo flip-flopping indeciso counting experiments multiple scattering correlations weight experimentos de conteo dispersion multiple correlaciones pesadas

Parameter estimation 3

Properties of estimators 4

An estimator for the mean 5

An estimator for the variance 6

The Likelihood function via example We have a data set given by N data pairs (x i, y i ±σ i ) graphically represented below. The goal is to determine the fixed, but unknown, µ = f(x). σ is known or estimated from the data. 7

Gaussian probabilities (least-squares) We assume that at a fixed value of x i, we have made a measurement y i and that the measurement was drawn from a Gaussian probability distribution with mean y(x i ) = a + bx i and variance σ i. f (y i ;a,b) = e (y i y(x i )) 1 σ i L = N 1 i=1 πσ i πσ i e (y i a bx i ) σ i y(x i ) = a + bx i χ = ln L + k = N i=1 (y i a bx i ) σ i 8

χ minimization χ a = N (y i a bx i ) = 0 i=1 σ i χ b = N x i (y i a bx i ) = 0 i=1 σ i a N i=1 1 σ i + b N i=1 x i σ i = N i=1 y i σ i a N i=1 x i σ i + b N i=1 x i σ i = N i=1 x i y i σ i 9

Solution for linear fit For simplicity, assume constant σ = σ i. Then solve two simultaneous equations for unknowns: a = y i x i x i x i y i ( ) N x i x i b = N x y x i i i y i N x i x i ( ) Parameter uncertainties can be estimated from the curvature of the χ function. V[a] χ a V[b] χ b 1 θ = ˆ θ 1 = = θ = ˆ θ σ x i ( ) N x i x i Nσ ( ) N x i x i 10

Parameter uncertainties In a graphical method the uncertainty in the parameter estimator θ 0 is obtained by changing χ by one unit. χ (θ 0 ±σ θ ) = χ (θ 0 ) + 1 In general, using maximum likelihood lnl(θ 0 ±σ θ ) = lnl(θ 0 ) 1/ 11

The Likelihood function via example What does the fit look like? ROOT fit to pol1 Additional information about the fit: χ and probability Were the assumptions for the fit valid? We will return to this question after a discussion of hypothesis testing 1

More about the likelihood method Recall likelihood for least squares: L = N 1 i=1 πσ i e (y i a bx i ) σ i But the probability density depends on application L = N i=1 f (y i ; parameters) Proceed as before maximizing lnl (χ has minus sign). The values of the estimated parameters might not be very different, but the uncertainties can be greatly affected. 13

Applications L = N 1 i=1 πσ i e (y i a bx i ) σ i a) σ i = constant b) σ i = y i c) σ i = Y(x) Poisson distribution (see PDG Eq. 3.1) L = N i=1 ( a bx i) n i e (a bx i ) n i! Stirling s approx ln(n!) ~ n ln(n) - n ln L = N i 1 [( a + bx ) i n i + n i ln[n i /(a + bx i )]] 14

Statistical tests In addition to estimating parameters, one often wants to assess the validity of statements concerning the underlying distribution Hypothesis tests provide a rule for accepting or rejecting hypotheses depending on the outcome of an experiment. [Comparison of H 0 vs H 1 ] In goodness-of-fit tests one gives the probability to obtain a level of incompatibility with a certain hypothesis that is greater than or equal to the level observed in the actual data. [How good is my assumed H 0?] Formulation of the relevant question is critical to deciding how to answer it. 15

Selecting events background signal 16

Other ways to select events 17

Test statistics 18

Significance level and power of a test 19

Efficiency of event selection 0

Linear test statistic 1

Fisher discriminant

Testing significance/goodness of fit Quantify the level of agreement between the data and a hypothesis without explicit reference to alternative hypotheses. This is done by defining a goodness-of-fit statistic, and the goodness-of-fit is quantified using the p-value. For the case when the χ is the goodness-of-fit statistic, then the p-value is given by p = f (χ ;n d )dχ χ obs The p-value is a function of the observed value χ obs and is therefore itself a random variable. If the hypothesis used to compute the p-value is true, then p will be uniformly distributed between zero and one. 3

χ distribution Gaussian-like 4

p-value for χ distribution 5

Using the goodness-of-fit Data generated using Y(x) = 6.0 + 0. x, σ = 0.5 Compare three different polynomial fits to the same data. y(x) = p 0 + p 1 x y(x) = p 0 y(x) = p 0 + p 1 x + p x 6

χ /DOF vs degrees of freedom 7

Summary of second lecture Parameter estimation Illustrated the method of maximum likelihood using the least squares assumption Reviewed how hypotheses are tested Use of goodness-of-fit statistics to determine the validity of underlying assumptions used to determine parent parameters 8

Constructing a test statistic 9