Statistische Methoden der Datenanalyse. Kapitel 1: Fundamentale Konzepte. Professor Markus Schumacher Freiburg / Sommersemester 2009

Similar documents
Introduction to Statistical Methods for High Energy Physics

Lectures on Statistical Data Analysis

Statistical Methods in Particle Physics. Lecture 2

Statistical Methods for Particle Physics Lecture 1: parameter estimation, statistical tests

YETI IPPP Durham

Statistical Data Analysis 2017/18

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

RWTH Aachen Graduiertenkolleg

Statistical Methods for Particle Physics (I)

The language of random variables

Topics in Statistical Data Analysis for HEP Lecture 1: Bayesian Methods CERN-JINR European School of High Energy Physics Bautzen, June 2009

Statistics for the LHC Lecture 1: Introduction

Lecture 1: Probability Fundamentals

Statistische Methoden der Datenanalyse. Kapitel 3: Die Monte-Carlo-Methode

Statistical Methods in Particle Physics

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

Machine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples

Single Maths B: Introduction to Probability

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

BACKGROUND NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2016 PROBABILITY A. STRANDLIE NTNU AT GJØVIK AND UNIVERSITY OF OSLO

L2: Review of probability and statistics

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com

Basics on Probability. Jingrui He 09/11/2007

Modern Methods of Data Analysis - SS Introduction

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5: Bayes pt. 1

Statistical Methods in Particle Physics

Statistical Data Analysis Stat 3: p-values, parameter estimation

Recitation 2: Probability

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Review (Probability & Linear Algebra)

Statistics and Data Analysis

Data Analysis and Monte Carlo Methods

Communication Theory II

Math Review Sheet, Fall 2008

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015

Random Signals and Systems. Chapter 3. Jitendra K Tugnait. Department of Electrical & Computer Engineering. Auburn University.

E. Santovetti lesson 4 Maximum likelihood Interval estimation

Intro to Probability. Andrei Barbu

2. A Basic Statistical Toolbox

Probability and Information Theory. Sargur N. Srihari

Chapter 2. Probability

Bivariate distributions

Lecture 2: Review of Basic Probability Theory

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Numerical Methods for Data Analysis

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009

ACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3

Machine Learning: Probability Theory

Probability, Entropy, and Inference / More About Inference

Conditional distributions

Statistical Methods for Particle Physics Lecture 2: statistical tests, multivariate methods

Lecture 2: Repetition of probability theory and statistics

Basics of Probability

Probability. Paul Schrimpf. January 23, UBC Economics 326. Probability. Paul Schrimpf. Definitions. Properties. Random variables.

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Review: mostly probability and some statistics

Probability Theory. Prof. Dr. Martin Riedmiller AG Maschinelles Lernen Albert-Ludwigs-Universität Freiburg.

Brief Review of Probability

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015

Introduction to Statistical Hypothesis Testing

An introduction to Bayesian reasoning in particle physics

Multivariate statistical methods and data mining in particle physics

What are the Findings?

Physics 403. Segev BenZvi. Choosing Priors and the Principle of Maximum Entropy. Department of Physics and Astronomy University of Rochester

2 Functions of random variables

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

Probability theory. References:

Statistics for the LHC Lecture 2: Discovery

arxiv: v1 [hep-ex] 9 Jul 2013

Review of probability. Nuno Vasconcelos UCSD

CMPSCI 240: Reasoning Under Uncertainty

Probability Theory Review Reading Assignments

Grundlagen der Künstlichen Intelligenz

Revised May 1996 by D.E. Groom (LBNL) and F. James (CERN), September 1999 by R. Cousins (UCLA), October 2001 and October 2003 by G. Cowan (RHUL).

2) There should be uncertainty as to which outcome will occur before the procedure takes place.

Probability Theory for Machine Learning. Chris Cremer September 2015

Probability - Lecture 4

Appendix A : Introduction to Probability and stochastic processes

Terminology. Experiment = Prior = Posterior =

ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 PROBABILITY. Prof. Steven Waslander

Probability theory for Networks (Part 1) CS 249B: Science of Networks Week 02: Monday, 02/04/08 Daniel Bilar Wellesley College Spring 2008

Statistics for Economists Lectures 6 & 7. Asrat Temesgen Stockholm University

Probabilistic Reasoning

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Statistical Methods for Astronomy

CS37300 Class Notes. Jennifer Neville, Sebastian Moreno, Bruno Ribeiro

Stat 5101 Lecture Notes

Machine Learning Srihari. Probability Theory. Sargur N. Srihari

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

PHYS 6710: Nuclear and Particle Physics II

Transcription:

Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 1 Statistische Methoden der Datenanalyse Kapitel 1: Fundamentale Konzepte Professor Markus Schumacher Freiburg / Sommersemester 2009

Data analysis in particle physics Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 2 Observe events of a certain type Measure characteristics of each event (particle momenta, number of muons, energy of jets,...) Theories (e.g. SM) predict distributions of these properties up to free parameters, e.g., a, G F, M Z, a s, m H,... Some tasks of data analysis: Estimate (measure) the parameters; Quantify the uncertainty of the parameter estimates; Test the extent to which the predictions of a theory are in agreement with the data.

Dealing with uncertainty In particle physics there are various elements of uncertainty: theory is not deterministic quantum mechanics random measurement errors present even without quantum effects things we could know in principle but don t e.g. from limitations of cost, time,... We can quantify the uncertainty using PROBABILITY Lecture 1 page 3 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 3

A definition of probability Consider a set S with subsets A, B,... Kolmogorov axioms (1933) From these axioms we can derive further properties, e.g. Lecture 1 page 4 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 4

Conditional probability, independence Also define conditional probability of A given B (with P(B) 0): E.g. rolling dice: Subsets A, B independent if: If A, B independent, N.B. do not confuse with disjoint subsets, i.e., Lecture 1 page 5 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 5

Interpretation of probability I. Relative frequency A, B,... are outcomes of a repeatable experiment cf. quantum mechanics, particle scattering, radioactive decay... II. Subjective probability A, B,... are hypotheses (statements that are true or false) Both interpretations consistent with Kolmogorov axioms. In particle physics frequency interpretation often most useful, but subjective probability can provide more natural treatment of non-repeatable phenomena: systematic uncertainties, probability that Higgs boson exists,... Lecture 1 page 6 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 6

Bayes theorem From the definition of conditional probability we have, and but, so Bayes theorem First published (posthumously) by the Reverend Thomas Bayes (1702 1761) An essay towards solving a problem in the doctrine of chances, Philos. Trans. R. Soc. 53 (1763) 370; reprinted in Biometrika, 45 (1958) 293. Lecture 1 page 7 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 7

The law of total probability B Consider a subset B of the sample space S, S divided into disjoint subsets A i such that [ i A i = S, A i B A i law of total probability Bayes theorem becomes Lecture 1 page 8 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 8

An example using Bayes theorem Suppose the probability (for anyone) to have AIDS is: prior probabilities, i.e., before any test carried out Consider an AIDS test: result is + or - probabilities to (in)correctly identify an infected person probabilities to (in)correctly identify an uninfected person Suppose your result is +. How worried should you be? Lecture 1 page 9 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 9

Bayes theorem example (cont.) The probability to have AIDS given a + result is posterior probability i.e. you re probably OK! Your viewpoint: my degree of belief that I have AIDS is 3.2% Your doctor s viewpoint: 3.2% of people like this will have AIDS Lecture 1 page 10 Prof. M. Markus Schumacher Schumacher Stat Meth. der Prediscussion Datenanalyse for Lecture Kapi,1: on Fundamentale Statistical Data Konzepten Analysis Uni. Uni. Freiburg Siegen / SoSe09 / SS08 10

Frequentist Statistics general philosophy Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 11 In frequentist statistics, probabilities are associated only with the data, i.e., outcomes of repeatable observations (shorthand: ). Probability = limiting frequency Probabilities such as P (Higgs boson exists), P (0.117 < a s < 0.121), etc. are either 0 or 1, but we don t know which. The tools of frequentist statistics tell us what to expect, under the assumption of certain probabilities, about hypothetical repeated observations. The preferred theories (models, hypotheses,...) are those for which our observations would be considered usual.

Bayesian Statistics general philosophy Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Lecture Uni. 1 Freiburg page / SoSe09 12 12 In Bayesian statistics, use subjective probability for hypotheses: probability of the data assuming hypothesis H (the likelihood) prior probability, i.e., before seeing the data posterior probability, i.e., after seeing the data normalization involves sum over all possible hypotheses Bayes theorem has an if-then character: If your prior probabilities were p (H), then it says how these probabilities should change in the light of the data. No general prescription for priors (subjective!)

Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Lecture Uni. 1 Freiburg page / SoSe09 13 13 Random variables and probability density functions A random variable is a numerical characteristic assigned to an element of the sample space; can be discrete or continuous. Suppose outcome of experiment is continuous value x f(x) = probability density function (pdf) x must be somewhere Or for discrete outcome x i with e.g. i = 1, 2,... we have probability mass function x must take on one of its possible values

Cumulative distribution function Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Lecture Uni. 1 Freiburg page / SoSe09 14 14 Probability to have outcome less than or equal to x is cumulative distribution function Alternatively define pdf with

Mean, median and mode Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 15 e.g.: Maxwells s velocity distribution

Obtaining PDF from Histograms Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 16 pdf = histogram with infinite data sample, zero bin width, normalized to unit area

Multivariate distributions Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 17 Outcome of experiment characterized by several values, e.g. an n-component vector, (x 1,... x n ) joint pdf Normalization:

Marginal pdf Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 18 Sometimes we want only pdf of some (or one) of the components: i marginal pdf x 1, x 2 independent if

Marginal pdf (2) Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Lecture Uni. 1 Freiburg page / SoSe09 19 19 Marginal pdf ~ projection of joint pdf onto individual axes.

Conditional pdf Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 20 Sometimes we want to consider some components of joint pdf as constant. Recall conditional probability: conditional pdfs: Bayes theorem becomes: Recall A, B independent if x, y independent if

Conditional pdfs (2) Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 21 E.g. joint pdf f(x,y) used to find conditional pdfs h(y x 1 ), h(y x 2 ): Basically treat some of the r.v.s as constant, then divide the joint pdf by the marginal pdf of those variables being held constant so that what is left has correct normalization, e.g.,

WDF für zwei Variablen mit Abhängigkeit Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 22

2.4 Functions of a random variable Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 23 A function of a random variable is itself a random variable. Suppose x follows a pdf f(x), consider a function a(x). What is the pdf g(a)? ds = region of x space for which a is in [a, a+da]. For one-variable case with unique inverse this is simply

Mapping the x and a spaces Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 24 probability in a-space g(a)da equals one in x-space f(x)ds figure from Lutz Feld

Mapping the x and a spaces Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 25

Mapping the x and a spaces: two branches Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 26

Functions without unique inverse Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 27 If inverse of a(x) not unique, include all dx intervals in ds which correspond to da: Example:

Functions of more than one r.v. Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 28 Consider r.v.s and a function ds = region of x-space between (hyper)surfaces defined by

Functions of more than one r.v. (2) Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 29 Example: r.v.s x, y > 0 follow joint pdf f(x,y), consider the function z = xy. What is g(z)? (Mellin convolution)

More on transformation of variables Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 30 Consider a random vector with joint pdf Form n linearly independent functions for which the inverse functions exist. Then the joint pdf of the vector of functions is where J is the Jacobian determinant: For e.g. integrate over the unwanted components.

Expectation values G. Cowan Prof. M. Schumacher Stat Meth. Lectures der Datenanalyse on Kapi,1: Statistical Fundamentale Konzepten Lecture Uni. 1 Freiburg page / SoSe09 31 31 Consider continuous r.v. x with pdf f (x). Define expectation (mean) value as Notation (often): ~ centre of gravity of pdf. For a function y(x) with pdf g(y), (equivalent) Variance: Notation: Standard deviation: s ~ width of pdf, same units as x.

Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 32 Covariance and correlation Define covariance cov[x,y] (also use matrix notation V xy ) as Correlation coefficient (dimensionless) defined as If x, y, independent, i.e.,, then x and y, uncorrelated N.B. converse not always true.

Correlations Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 33

Korrelationen Prof. M. Schumacher Stat Meth. der Datenanalyse Kapi,1: Fundamentale Konzepten Uni. Freiburg / SoSe09 34