Introduction to Probabilistic Reasoning


Introduction to Probabilistic Reasoning: inference, forecasting, decision

Giulio D'Agostini (giulio.dagostini@roma.infn.it)
Dipartimento di Fisica, Università di Roma "La Sapienza"

Part 3 (Stellenbosch, 23 November 2013)

"Probability is good sense reduced to a calculus" (Laplace)

Contents

Summary of relevant formulae.
Propagating uncertainty the straight way.
Simple case of the Gaussian distribution: when (and why) ML provides a good estimate, and why it can never tell us anything meaningful about uncertainties.
The role of conjugate priors, with application to Bernoulli trials (the "binomial problem"): the Beta pdf.
Handling systematics, with the special case of two measurements with Gaussian response of the same detector, affected by a possible systematic error:
- how the overall uncertainty increases
- how the two results become correlated
Examples with OpenBUGS (see web page) [→ JAGS]

Important formulae

(For continuous uncertain variables)

product rule: f_xy(x,y|I) = f_{x|y}(x|y,I) f_y(y|I)

independence: f_xy(x,y|I) = f_x(x|I) f_y(y|I)

marginalization: ∫ f_xy(x,y|I) dy = f_x(x|I);  ∫ f_xy(x,y|I) dx = f_y(y|I)

decomposition: f_x(x|I) = ∫ f_{x|y}(x|y,I) f_y(y|I) dy;  f_y(y|I) = ∫ f_{y|x}(y|x,I) f_x(x|I) dx
→ a weighted average

Important formulae (continued)

Bayes rule (using observation x and true value μ, and simplifying the notation):

f(μ|x,I) = f(μ,x|I) / f(x|I)
         = f(x|μ,I) f(μ|I) / f(x|I)
         = f(x|μ,I) f(μ|I) / ∫ f(x|μ,I) f(μ|I) dμ
         ∝ f(x|μ,I) f(μ|I)

(In the following, the background information I will be implicit in most cases.)
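A minimal numerical sketch of these formulae (not from the slides; all numbers are invented for illustration): put a prior for μ on a grid, apply Bayes rule with a Gaussian response, and recover the evidence f(x|I) from the decomposition formula by integrating over μ.

```python
import numpy as np

# Grid over the uncertain true value mu (hypothetical numbers throughout).
mu = np.linspace(-5.0, 5.0, 1001)
dmu = mu[1] - mu[0]

prior = np.exp(-0.5 * (mu / 2.0) ** 2)   # f(mu|I): a broad Gaussian belief
prior /= prior.sum() * dmu               # normalize to a pdf on the grid

x_obs, sigma = 1.3, 1.0                  # one observation, Gaussian response
likelihood = np.exp(-0.5 * ((x_obs - mu) / sigma) ** 2)  # f(x|mu,I), up to a constant

# Bayes rule: f(mu|x,I) ∝ f(x|mu,I) f(mu|I); the denominator f(x|I)
# comes from marginalization, i.e. numerical integration over mu.
unnorm = likelihood * prior
evidence = unnorm.sum() * dmu
posterior = unnorm / evidence

mean = (mu * posterior).sum() * dmu
std = np.sqrt(((mu - mean) ** 2 * posterior).sum() * dmu)
print(f"E[mu|x] = {mean:.3f}, sigma(mu) = {std:.3f}")
```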

Propagation of uncertainties

What has this to do with Bayesianism? Think of what most of you have been taught in laboratory courses (typically): using probabilistic formulae for objects that are not random variables (in the frequentistic approach). An insult to logic!

In the Bayesian approach it is straightforward, because true values are uncertain variables.

For details see the 2005 CERN lectures (nr. 4, pp. 7-28):
http://indico.cern.ch/conferenceDisplay.py?confId=a04375
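Since true values are uncertain variables, propagation is just probability theory applied to them. A minimal Monte Carlo sketch (the inputs and the function are invented for illustration), compared against the usual linearized formula:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000

# Hypothetical input quantities: a = 10.0 ± 0.3, b = 2.0 ± 0.1, both Gaussian.
a = rng.normal(10.0, 0.3, N)
b = rng.normal(2.0, 0.1, N)

y = a / b**2          # any function of the inputs, linear or not

print(f"y = {y.mean():.3f} +/- {y.std():.3f}")

# Compare with the linearized propagation formula:
# sigma_y^2 ≈ (∂y/∂a)^2 sigma_a^2 + (∂y/∂b)^2 sigma_b^2
sigma_lin = np.sqrt((0.3 / 2.0**2) ** 2 + (2 * 10.0 / 2.0**3 * 0.1) ** 2)
print(f"linearized: {sigma_lin:.3f}")
```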

Back to our slides:
- a very simple inference in a Gaussian model
- conjugate priors
- systematics

Simple case of Gaussian errors

x ~ N(μ, σ):

f(x|μ,I) = 1/(√(2π) σ) · exp[ -(x-μ)² / (2σ²) ]

f(μ|x,I) = f(x|μ,I) f(μ|I) / ∫ f(x|μ,I) f(μ|I) dμ

IF f(μ|I) ≈ const:

f(μ|x,I) = 1/(√(2π) σ) · exp[ -(μ-x)² / (2σ²) ]

→ E[μ] = x, σ(μ) = σ, etc.

Maximum of the posterior = maximum of the likelihood.

Frequentistic equivalent of σ(μ)? NO! → prescriptions!!
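A quick numerical check of this slide, under its stated assumptions (flat prior, Gaussian response; the numbers are arbitrary): the posterior for μ comes out centered at x with width σ.

```python
import numpy as np

mu = np.linspace(-4.0, 8.0, 2001)
dmu = mu[1] - mu[0]
x_obs, sigma = 2.0, 1.0                  # hypothetical observation

# Flat prior: the posterior is just the likelihood, normalized in mu.
likelihood = np.exp(-0.5 * ((x_obs - mu) / sigma) ** 2)
posterior = likelihood / (likelihood.sum() * dmu)

mean = (mu * posterior).sum() * dmu
std = np.sqrt(((mu - mean) ** 2 * posterior).sum() * dmu)
print(mean, std)   # ≈ 2.0 and 1.0: E[mu] = x, sigma(mu) = sigma
```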

Conjugate priors

There have always been computational problems: one survives with approximations, e.g. the Gaussian approximation of the posterior (somehow a reflex of the Central Limit Theorem).

Another famous approximation is to choose a suitable shape for the prior: a compromise between real beliefs and easy math!

→ inferring p of a binomial (black/green board)

Beta distribution

X ~ Beta(r,s):

f(x|Beta(r,s)) = 1/β(r,s) · x^(r-1) (1-x)^(s-1),  with r, s > 0 and 0 ≤ x ≤ 1.

The denominator is just normalization, i.e.

β(r,s) = ∫₀¹ x^(r-1) (1-x)^(s-1) dx,

the beta function, resulting in β(r,s) = Γ(r)Γ(s) / Γ(r+s).

A very flexible distribution → file beta_distribution.pdf
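Conjugacy at work, as a sketch with invented counts (not the board example): with a Beta(r,s) prior on p and k successes in n Bernoulli trials, the posterior is again a Beta, namely Beta(r+k, s+n-k).

```python
from scipy import stats

r, s = 1.0, 1.0        # flat prior: Beta(1,1)
n, k = 12, 9           # hypothetical data: 9 successes in 12 trials

posterior = stats.beta(r + k, s + (n - k))   # conjugate update: Beta(r+k, s+n-k)
print("E[p]         =", posterior.mean())    # (r+k)/(r+s+n)
print("sigma(p)     =", posterior.std())
print("95% interval =", posterior.interval(0.95))
```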

Uncertainties due to systematics

This is another subject where the frequentistic approach fails miserably: no consistent theory, just prescriptions.
1. add them linearly;
2. add them quadratically;
3. do 1 if small, 2 if large; ...

Straightforward in the Bayesian approach: influence quantities on which the final results may depend (calibration constants, etc.) are characterized by an uncertain value, and hence we can attach probabilities to them and use probability theory. (This is what, in practice, also physicists who think they adhere to the frequentistic school finally do, as in error propagation.)

1. Conditional inference

Let's make explicit one of the pieces of information:

f(μ|x₀,I) = f(μ|x₀,h,I₀)

[Figure: response model f(x|μ₀,h) and f(x|μ₀) around the true value μ₀, and the resulting posteriors f(μ|x₀,h) and f(μ|x₀) around the observed value x₀]

Then use probability theory:

f(μ|x₀,I) = ∫ f(μ|x₀,h,I₀) f(h|I₀) dh

Easy and logical, and it works → try it!

2. Joint inference + marginalization

We can infer both μ and h from the data. (One piece of data, two inferred quantities? So what? They will just be 100% correlated.)

f(μ,h|x₀,I) ∝ f(x₀|μ,h,I) f₀(μ,h|I)

Then apply marginalization with respect to the so-called nuisance parameter:

f(μ|x₀,I) = ∫ f(μ,h|x₀,I) dh
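A grid sketch of reasoning 2, with invented numbers: an additive offset h (reading x = μ + h), jointly inferred with μ and then integrated out. The systematic visibly widens the posterior for μ.

```python
import numpy as np

mu = np.linspace(-4.0, 8.0, 401)
h = np.linspace(-3.0, 3.0, 301)
M, H = np.meshgrid(mu, h, indexing="ij")
dmu, dh = mu[1] - mu[0], h[1] - h[0]

x0, sigma, sigma_h = 2.0, 1.0, 0.5
like = np.exp(-0.5 * ((x0 - (M + H)) / sigma) ** 2)  # f(x0|mu,h)
prior_h = np.exp(-0.5 * (H / sigma_h) ** 2)          # h known to be 0 ± sigma_h
joint = like * prior_h                               # flat prior on mu
joint /= joint.sum() * dmu * dh                      # f(mu,h|x0)

post_mu = joint.sum(axis=1) * dh                     # ∫ f(mu,h|x0) dh
mean = (mu * post_mu).sum() * dmu
std = np.sqrt(((mu - mean) ** 2 * post_mu).sum() * dmu)
print(mean, std)   # ≈ 2.0 and sqrt(1² + 0.5²) ≈ 1.118
```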

3. Raw result → corrected result

As a third possibility, we might think of a raw result obtained at a fixed value of h (its nominal, best value), affected only by uncertainties due to random errors (or all others, apart from those coming from our imperfect knowledge about h):

f(μ_R|x₀,I)

Then think of a correction due to all possible values of h, consistent with our best knowledge:

μ = g(μ_R, h)    [g(μ_R,h) is not a pdf!]

and then use propagation of uncertainties.

Example: μ = μ_R + z, where z is an offset known to be 0 ± σ_z → one of the assigned problems.
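The same offset example via reasoning 3, as a sampling sketch (same invented numbers as in the previous sketch): propagating μ = μ_R + z recovers the quadratic combination of the two uncertainties, matching the joint-inference result.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500_000

mu_R = rng.normal(2.0, 1.0, N)   # raw result: random error only (hypothetical)
z = rng.normal(0.0, 0.5, N)      # offset known to be 0 ± sigma_z
mu = mu_R + z                    # corrected result

print(mu.mean(), mu.std())       # ≈ 2.0 and sqrt(1² + 0.5²) ≈ 1.118
```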

Common systematics → correlation

As is well understood, if measurements have a systematic effect in common, the results become correlated, as happens whenever two quantities depend on a third one:

μ₁ = μ_R1 + z
μ₂ = μ_R2 + z

A common uncertainty in z affects μ₁ and μ₂ in the same direction: μ₁ and μ₂ become positively correlated. → assigned problem

For a more detailed example, using reasoning 2, see file common_systematics.pdf (sections 6.8-6.10).
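And the correlation itself, again by sampling (a sketch with invented numbers): the same z corrects both raw results, so μ₁ and μ₂ move together.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500_000

mu_R1 = rng.normal(2.0, 1.0, N)
mu_R2 = rng.normal(5.0, 1.0, N)
z = rng.normal(0.0, 0.5, N)      # common systematic offset, 0 ± sigma_z

mu1, mu2 = mu_R1 + z, mu_R2 + z

print("sigma(mu1)   =", mu1.std())                    # widened to sqrt(1² + 0.5²)
print("rho(mu1,mu2) =", np.corrcoef(mu1, mu2)[0, 1])
# Expected rho = sigma_z² / (sigma² + sigma_z²) = 0.25 / 1.25 = 0.2
```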

Conclusions

Take the courage to use it!
It's easy if you try...!