Numerical Methods for Data Analysis


Michael O. Distler, distler@uni-mainz.de
Bosen (Saar), August 29 - September 3, 2010

Outline: Fundamentals; Probability distributions; Expectation values, error propagation; Parameter estimation; Regression analysis; Maximum likelihood; Linear regression; Advanced topics.

Some statistics books, papers, etc.:
- Volker Blobel and Erich Lohrmann: Statistische und numerische Methoden der Datenanalyse, Teubner Verlag (1998)
- Siegmund Brandt: Datenanalyse, BI Wissenschaftsverlag (1999)
- Philip R. Bevington: Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill (1969)
- Roger J. Barlow: Statistics, John Wiley & Sons (1993)
- Glen Cowan: Statistical Data Analysis, Oxford University Press (1998)
- Frederick James: Statistical Methods in Experimental Physics, 2nd Edition, World Scientific (2006)
- Wes Metzger's lecture notes: www.hef.kun.nl/~wes/stat_course/statist.pdf
- Glen Cowan's lecture notes: www.pp.rhul.ac.uk/~cowan/stat_course.html
- Particle Physics Booklet: http://pdg.lbl.gov/

Introduction

Data analysis in nuclear and particle physics:
- Observe events of a certain type.
- Measure characteristics of each event.
- Theories predict distributions of these properties up to free parameters.

Some tasks of data analysis:
- Estimate (measure) the parameters.
- Quantify the uncertainty of the parameter estimates.
- Test the extent to which the predictions of a theory are in agreement with the data.

Introduction: Philosophy of Science

Karl R. Popper (born July 28, 1902 in Vienna, Austria; died September 17, 1994 in London, England) coined the term critical rationalism. At the heart of his philosophy of science lies the account of the logical asymmetry between verification and falsifiability (Logik der Forschung, 1934). Underlying assumption: the existence of a true value of measured quantities and derived values.

Theory of probability

Probability theory, mathematics: Kolmogorov axioms.

Classical interpretation, frequentist probability. Pragmatic definition of probability:

    p(E) = lim_{N→∞} n(E)/N

where n(E) is the number of occurrences of the event E and N is the number of trials (experiments). The experiments have to be repeatable (at least in principle). Disadvantage: strictly speaking, one cannot make statements about the probability of any true value; only upper and lower limits at a given confidence level are possible.
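As a sketch (not part of the original lecture), the frequentist limit can be illustrated in Python by estimating p(E) as n(E)/N for a simulated repeatable experiment; the event probability and trial counts below are made-up illustration values:

```python
import random

def relative_frequency(p_true, n_trials, seed=0):
    """Estimate p(E) = n(E)/N from N simulated Bernoulli trials.

    p_true is the (normally unknown) true event probability; the
    estimate approaches it as n_trials grows.
    """
    rng = random.Random(seed)
    n_event = sum(rng.random() < p_true for _ in range(n_trials))
    return n_event / n_trials

# The estimate tightens as N grows (frequentist definition of probability).
estimates = [relative_frequency(0.3, n) for n in (100, 10_000, 1_000_000)]
```

The statistical spread of the estimate shrinks like 1/√N, which is why repeatability "in principle" is the operative requirement.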

Theory of probability

Bayesian statistics, subjective probability: prior subjective assumptions enter into the calculation of the probability of a hypothesis H:

    p(H) = degree of belief that H is true

Metaphorically speaking: probabilities are the ratio of the (maximum) wager to the anticipated prize in a bet.

Theory of probability

Bayesian statistics, subjective probability: suppose there is a town with green and yellow taxicabs. In a hit-and-run accident a man was hurt, and a witness saw a green cab. In court the lawyer of the taxi company impeaches the credibility of the witness because of the lighting conditions. A test showed that under similar conditions 10% of witnesses confuse the color of the cabs. Would you believe the witness? What if there were 20 times more yellow cabs than green cabs? Would you still believe the witness?

Theory of probability

Bayesian statistics, subjective probability: prior subjective assumptions enter into the calculation of the probability of a hypothesis H.

    cabs in town   witness reports        a "green" report is ...
    200 yellow     180 yellow, 20 green   wrong: 20/29 = 69%
     10 green        9 green,  1 yellow   true:   9/29 = 31%
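The 9/29 figure is just Bayes' theorem applied to the cab counts; a minimal Python sketch (illustrative, not from the lecture, with the slide's numbers as defaults) reproduces it:

```python
def p_truly_green(n_green, n_yellow, p_correct=0.9):
    """Probability the cab really was green, given the witness says
    'green' and identifies colors correctly with probability p_correct."""
    green_seen_green = n_green * p_correct          # true 'green' reports
    yellow_seen_green = n_yellow * (1 - p_correct)  # false 'green' reports
    return green_seen_green / (green_seen_green + yellow_seen_green)

# With 200 yellow and 10 green cabs the witness is more likely wrong:
posterior = p_truly_green(n_green=10, n_yellow=200)  # 9/29, about 0.31
```

The prior (the cab counts) dominates the 90%-reliable testimony, which is exactly the point of the example.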

Theory of probability

Bayesian statistics, subjective probability: prior subjective assumptions enter into the calculation of the probability of a hypothesis H. Disadvantage: prior hypotheses influence the probability. Advantageous for rare and one-time events, like noisy signals or catastrophe modeling.

In this lecture we will focus on classical statistics, i.e. error estimates are to be understood as confidence regions.

Combining probabilities

Two kinds of events are given: A and B. The probability of A is p(A), that of B is p(B). The probability of A or B is:

    p(A or B) = p(A) + p(B) − p(A and B)

If A and B are mutually exclusive, then p(A and B) = 0.

Example: drawing from a deck of German Skat cards (32 cards):

    p(ace or spades) = 4/32 + 8/32 − 1/32 = 11/32

Special case: B = Ā (A does NOT occur):

    p(A or Ā) = p(A) + p(Ā) = 1
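The Skat example can be checked by brute-force enumeration; a small sketch (not from the lecture) using exact fractions:

```python
from fractions import Fraction

# A German Skat deck: 8 ranks in each of 4 suits, 32 cards total.
ranks = ["7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = [(r, s) for r in ranks for s in suits]

def prob(event):
    """Exact probability of an event over the whole deck."""
    return Fraction(sum(event(c) for c in deck), len(deck))

p_ace = prob(lambda c: c[0] == "A")                        # 4/32
p_spades = prob(lambda c: c[1] == "spades")                # 8/32
p_both = prob(lambda c: c[0] == "A" and c[1] == "spades")  # 1/32
p_ace_or_spades = p_ace + p_spades - p_both                # 11/32
```

Using Fraction avoids floating-point noise, so the addition rule can be verified exactly.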

Combining probabilities

Joint probability of A and B occurring simultaneously:

    p(A and B) = p(A) · p(B|A)

where p(B|A) is called the conditional probability of B given A. If A and B are independent, then p(B|A) = p(B), and hence

    p(A and B) = p(A) · p(B)

Death in the mountains

In a book on the mountaineering achievements of Reinhold Messner one reads the following: "If you consider that the probability of dying on an expedition to an eight-thousander is 3.4%, then Messner had a probability of 3.4% × 29 = 99% of being killed during his 29 expeditions."

That cannot be right. What if Messner sets off on a 30th expedition?

The probability of surviving one expedition is obviously 1 − 0.034 = 0.966. If one assumes that the various expeditions are independent events, the probability of surviving all 29 expeditions is:

    P = 0.966^29 = 0.367
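The corrected calculation is easy to check numerically (a sketch; the book's 3.4% per-expedition risk is taken at face value):

```python
p_die = 0.034        # quoted per-expedition risk
n_expeditions = 29

# Independence: survival probabilities multiply.
p_survive_all = (1 - p_die) ** n_expeditions   # about 0.367
p_die_at_least_once = 1 - p_survive_all        # about 0.633, not 99%

# The book's naive sum overshoots badly and even exceeds 1 for n >= 30:
p_naive = p_die * n_expeditions                # about 0.986
```

That the naive sum would exceed 100% for a 30th expedition is the quickest way to see it cannot be a probability.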

Definitions

Probability mass function (pmf) and probability density function (pdf) of a measured value (= random variable): f(n) discrete, f(x) continuous.

[Figure: an example pmf f(n) and pdf f(x), each plotted over the range 0 to 30.]

Normalization:

    f(n) ≥ 0,  Σ_n f(n) = 1
    f(x) ≥ 0,  ∫ f(x) dx = 1

Probability:

    p(n1 ≤ n ≤ n2) = Σ_{n=n1}^{n2} f(n)
    p(x1 ≤ x ≤ x2) = ∫_{x1}^{x2} f(x) dx

Definitions

Cumulative distribution function (CDF):

    F(x) = ∫_{−∞}^{x} f(x') dx',  F(−∞) = 0,  F(∞) = 1

Example: decay time t of a radioactive nucleus with mean life time τ:

    f(t) = (1/τ) e^{−t/τ},  F(t) = 1 − e^{−t/τ}

[Figure: the scaled pdf f(t) and the CDF F(t) plotted versus t/s for 0 ≤ t ≤ 50 s.]
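The relation F(t) = ∫₀ᵗ f(t') dt' for the decay example can be verified numerically; a sketch (the value of τ is an arbitrary choice):

```python
import math

def pdf(t, tau):
    """Decay-time density f(t) = exp(-t/tau) / tau for t >= 0."""
    return math.exp(-t / tau) / tau

def cdf(t, tau):
    """Closed-form CDF F(t) = 1 - exp(-t/tau)."""
    return 1.0 - math.exp(-t / tau)

def cdf_numeric(t, tau, steps=10_000):
    """Midpoint-rule integration of the pdf from 0 to t."""
    dt = t / steps
    return sum(pdf((i + 0.5) * dt, tau) * dt for i in range(steps))
```

The numerical integral agrees with the closed form to high accuracy, confirming that the pdf is the derivative of the CDF.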

Expectation values and moments

Mean: if a random variable X takes on the values X1, X2, ..., Xn with probabilities p(Xi), then the expected value of X (the "mean") is

    ⟨X⟩ = E[X] = Σ_{i=1}^{n} Xi p(Xi)

The expected value of an arbitrary function h(x) of a continuous random variable is:

    E[h(x)] = ∫ h(x) f(x) dx

The mean is the expected value of x:

    E[x] = x̄ = ∫ x f(x) dx

Expectation values and moments

Standard deviation = {mean (deviation from x̄)²}^{1/2}:

    σ² = ⟨(x − x̄)²⟩ = ∫ (x − x̄)² f(x) dx
       = ∫ (x² − 2x x̄ + x̄²) f(x) dx = ⟨x²⟩ − 2⟨x⟩x̄ + x̄² = ⟨x²⟩ − ⟨x⟩²

σ² = variance, σ = standard deviation. For discrete distributions:

    σ² = (1/N) Σ_{i=1}^{N} (xi − x̄)²  (= ⟨x²⟩ − ⟨x⟩²)

Attention: this is the definition of the variance. To get a bias-free estimate of the variance, 1/N has to be replaced by 1/(N−1).
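The 1/N versus 1/(N−1) distinction is easiest to see in code; a sketch (Python's standard statistics module makes the same split with pvariance versus variance):

```python
def variance(xs, unbiased=False):
    """Variance of a sample.

    unbiased=False: the 1/N definition from the slide (population variance).
    unbiased=True:  the 1/(N-1) bias-free estimator.
    """
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return ss / (n - 1) if unbiased else ss / n
```

The unbiased version is always slightly larger, compensating for the fact that the sample mean is itself estimated from the same data.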

Expectation values and moments

Moments are the expected values of x^n and of (x − ⟨x⟩)^n. They are called the nth algebraic moment µ'n and the nth central moment µn, respectively.

Skewness v(x) is a measure of the asymmetry of the probability distribution of a random variable x:

    v = µ3/σ³ = E[(x − E[x])³]/σ³

Kurtosis is a measure of the "peakedness" of the probability distribution of a random variable x:

    γ2 = µ4/σ⁴ − 3 = E[(x − E[x])⁴]/σ⁴ − 3
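These moment-based shape measures translate directly into code; a sketch (not from the lecture) using the population 1/N moments defined above:

```python
def central_moment(xs, k):
    """k-th central moment, 1/N convention."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

def skewness(xs):
    """v = mu_3 / sigma^3: zero for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """gamma_2 = mu_4 / sigma^4 - 3: zero for a Gaussian."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0
```

The −3 in the kurtosis is chosen so that a Gaussian scores zero; flat-topped distributions come out negative, sharply peaked ones positive.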

Binomial distribution

The binomial distribution is the discrete probability distribution of the number of successes r in a sequence of n independent yes/no experiments, each of which yields success with probability p (Bernoulli experiment):

    P(r) = C(n, r) p^r (1 − p)^{n−r}

P(r) is normalized (proof: binomial theorem with q = 1 − p). The mean of r is:

    ⟨r⟩ = E[r] = Σ_{r=0}^{n} r P(r) = np

The variance σ² is:

    V[r] = E[(r − ⟨r⟩)²] = Σ_{r=0}^{n} (r − ⟨r⟩)² P(r) = np(1 − p)
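Mean and variance of the binomial can be checked by direct summation over the pmf; a sketch (illustrative parameters n = 10, p = 0.6) using Python's math.comb for the binomial coefficient:

```python
from math import comb

def binom_pmf(r, n, p):
    """P(r) = C(n, r) p^r (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.6
norm = sum(binom_pmf(r, n, p) for r in range(n + 1))      # normalization: 1
mean = sum(r * binom_pmf(r, n, p) for r in range(n + 1))  # np = 6
var = sum((r - mean) ** 2 * binom_pmf(r, n, p)
          for r in range(n + 1))                          # np(1 - p) = 2.4
```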

Poisson distribution

The Poisson distribution is given by:

    P(r) = µ^r e^{−µ} / r!

The mean is ⟨r⟩ = µ; the variance is V[r] = σ² = np = µ.

[Figure: four panels showing the Poisson distribution for µ = 0.5, 1, 2, and 4.]
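That the Poisson mean and variance both equal µ can likewise be checked by summing the pmf; a sketch (the sum is truncated where the terms are negligible, and µ = 2 is an arbitrary choice):

```python
import math

def poisson_pmf(r, mu):
    """P(r) = mu^r e^(-mu) / r!."""
    return mu**r * math.exp(-mu) / math.factorial(r)

mu = 2.0
rs = range(100)  # tail beyond r = 100 is negligible for small mu
mean = sum(r * poisson_pmf(r, mu) for r in rs)
var = sum((r - mean) ** 2 * poisson_pmf(r, mu) for r in rs)
```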

Law of large numbers

The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

We perform n independent experiments (Bernoulli trials) in which the result j occurs n_j times, i.e. with relative frequency h_j = n_j/n:

    p_j = E[h_j] = E[n_j/n]

The variance of a binomial distribution gives:

    V[h_j] = σ²(h_j) = σ²(n_j/n) = (1/n²) σ²(n_j) = (1/n²) n p_j(1 − p_j)

From the product p_j(1 − p_j), which is at most 1/4, we can deduce the law of large numbers:

    σ²(h_j) ≤ 1/(4n) < 1/n

The central limit theorem

The central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

Let x_i be a sequence of n independent and identically distributed random variables, each having finite expectation µ and variance σ² > 0. In the limit n → ∞ the random variable

    w = Σ_{i=1}^{n} x_i

will be normally distributed with mean ⟨w⟩ = nµ and variance V[w] = nσ².

Illustration: The central limit theorem

[Figure: four panels (N = 1, 2, 3, 10) comparing the distribution of the sum of N uniformly distributed random variables with the standard normal distribution.]
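The illustration can be reproduced by simulation; a sketch (seed and sample counts are arbitrary choices) that sums N uniform variables and standardizes with mean N/2 and variance N/12:

```python
import random
import statistics

def standardized_uniform_sum(n, rng):
    """Sum of n U(0,1) variables, shifted and scaled to mean 0, variance 1
    (a single U(0,1) variable has mean 1/2 and variance 1/12)."""
    s = sum(rng.random() for _ in range(n))
    return (s - n * 0.5) / (n / 12) ** 0.5

rng = random.Random(42)
samples = [standardized_uniform_sum(10, rng) for _ in range(20_000)]
# By the CLT these samples are approximately N(0, 1) already for N = 10.
```

Histogramming the samples reproduces the N = 10 panel of the figure: the uniform building blocks are forgotten almost immediately.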

Special probability densities

Uniform distribution: this probability density is constant between the limits x = a and x = b:

    f(x) = 1/(b − a) for a ≤ x < b,  0 otherwise

Mean and variance:

    ⟨x⟩ = E[x] = (a + b)/2,  V[x] = σ² = (b − a)²/12
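The quoted mean and variance follow from direct integration of x f(x) and (x − ⟨x⟩)² f(x); a numerical sketch (midpoint rule, with arbitrary limits a = 1, b = 4):

```python
def uniform_moments_numeric(a, b, steps=100_000):
    """Numerically integrate x f(x) and (x - mean)^2 f(x) for the
    uniform density f(x) = 1/(b - a) on [a, b)."""
    dx = (b - a) / steps
    f = 1.0 / (b - a)
    xs = [a + (i + 0.5) * dx for i in range(steps)]
    mean = sum(x * f * dx for x in xs)
    var = sum((x - mean) ** 2 * f * dx for x in xs)
    return mean, var

mean, var = uniform_moments_numeric(1.0, 4.0)
# Expect (a + b)/2 = 2.5 and (b - a)^2 / 12 = 0.75.
```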

Gaussian distribution

The most important probability distribution, also called the normal distribution:

    f(x) = 1/(√(2π) σ) · exp(−(x − µ)²/(2σ²))

The Gaussian distribution has two parameters, the mean µ and the variance σ². The probability distribution with mean µ = 0 and variance σ² = 1 is named the standard normal distribution, or N(0, 1) for short. The Gaussian distribution can be derived from the binomial distribution for large values of n and r, and similarly from the Poisson distribution for large values of µ.

Gaussian distribution

    ∫_{−1}^{+1} N(0, 1) dx = 0.6827 = 1 − 0.3173
    ∫_{−2}^{+2} N(0, 1) dx = 0.9545 = 1 − 0.0455
    ∫_{−3}^{+3} N(0, 1) dx = 0.9973 = 1 − 0.0027

FWHM, useful to estimate the standard deviation:

    FWHM = 2σ √(2 ln 2) = 2.355 σ
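The 1σ/2σ/3σ coverages and the FWHM factor can be reproduced with the error function; a sketch (not from the lecture) using math.erf:

```python
import math

def normal_cdf(x):
    """CDF of N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def central_interval(k):
    """P(-k <= x <= k) for x ~ N(0, 1)."""
    return normal_cdf(k) - normal_cdf(-k)

coverages = {k: central_interval(k) for k in (1, 2, 3)}
fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))  # FWHM = 2.355 sigma
```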

Gaussian distribution

[Figure: two panels.] Left: the binomial distribution for n = 10 and p = 0.6 in comparison to the Gaussian distribution with µ = np = 6 and σ² = np(1 − p) = 2.4. Right: the Poisson distribution for µ = 6 and σ = √6 in comparison to the Gaussian distribution.