Basics of Bayesian analysis. Jorge Lillo-Box MCMC Coffee Season 1, Episode 5


Disclaimer This presentation is based heavily on Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data by Ivezić, Connolly, VanderPlas, & Gray (Princeton University Press, 2014), and it is only meant for internal purposes of the MCMC Coffee sessions. Please excuse our plagiarism!

Very brief historical context Reverend Thomas Bayes (1702-1761), a British amateur mathematician, wrote about how to combine an initial belief with new data to arrive at an improved belief. His manuscript was published in 1763 (posthumously). In 1774 Pierre-Simon Laplace rediscovered and clarified Bayes' principles and applied them to astronomy, physics, population statistics, and jurisprudence (the study of law). Along the way, he estimated the mass of Saturn and its uncertainty, a result that remains consistent with the best current measurements. In the early 20th century, Harold Jeffreys revived Bayes' theorem and Laplace's work.

250 years later Source: Roberto Trotta (Imperial College)

The principles of Bayesian inference Probability statements are not limited to data, but can be made for model parameters and models themselves. Inferences are made by producing probability density functions (PDFs). Model parameters are treated as random variables**. Remember: the Bayesian method yields optimum results only under the assumption that all of the supplied information is correct(!). **A random variable is a variable whose value results from the measurement of a quantity that is subject to random variations.

The principles of Bayesian inference: classical statistics vs. Bayesian methods. In classical statistics, the likelihood function is used to find the model parameters that yield the highest data likelihood, and it cannot be interpreted as a probability density function for the model parameters. Bayesian methods extend the concept of the data likelihood function by adding extra information to the analysis (the so-called prior), so one can assign a PDF to the model parameters.
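To make the contrast concrete, here is a minimal sketch (not from the slides; the data are simulated) for a Gaussian sample with known σ: the classical route returns the single μ that maximizes the likelihood, while the Bayesian route combines the likelihood with a Gaussian prior and, by conjugacy, returns a full posterior PDF for μ.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 1.0
data = rng.normal(loc=2.0, scale=sigma, size=20)  # toy data, sigma known

# Classical: the MLE of mu is the sample mean (a single point estimate)
mu_mle = data.mean()

# Bayesian: Gaussian prior N(mu0, tau0^2) -> Gaussian posterior (conjugacy)
mu0, tau0 = 0.0, 10.0  # broad, weakly informative prior (hypothetical values)
tau_post = 1.0 / np.sqrt(1.0 / tau0**2 + len(data) / sigma**2)
mu_post = tau_post**2 * (mu0 / tau0**2 + data.sum() / sigma**2)

print(f"MLE point estimate:   mu = {mu_mle:.3f}")
print(f"Posterior PDF for mu: N({mu_post:.3f}, {tau_post:.3f}^2)")
```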

The Bayes rule Practical example: the Monty Hall problem! (Section 3.1.3)
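As a supplement to the slide (not from the original presentation), a minimal Monte Carlo sketch in Python reproduces the Bayes-rule result that switching doors wins about 2/3 of the time:

```python
import random

def monty_hall(n_trials=100_000):
    """Simulate the Monty Hall game to check the Bayes-rule answer numerically."""
    stay_wins = switch_wins = 0
    for _ in range(n_trials):
        prize = random.randrange(3)    # door hiding the car
        choice = random.randrange(3)   # contestant's initial pick
        # Host opens a door that is neither the pick nor the prize
        opened = next(d for d in range(3) if d != choice and d != prize)
        # Switching means taking the remaining unopened door
        switched = next(d for d in range(3) if d != choice and d != opened)
        stay_wins += (choice == prize)
        switch_wins += (switched == prize)
    print(f"P(win | stay)   = {stay_wins / n_trials:.3f}")    # approx. 1/3
    print(f"P(win | switch) = {switch_wins / n_trials:.3f}")  # approx. 2/3

monty_hall()
```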

The Bayes theorem Bayes' rule itself is not controversial. The frequentist vs. Bayesian controversy sets in when we apply Bayes' rule to the likelihood function p(D|M) to obtain Bayes' theorem:

p(M, θ | D, I) = p(D | M, θ, I) p(M, θ | I) / p(D | I)

Here D = data, M = model, θ = (θ1, ..., θk) are the parameters of the model, and I is any other information. The left-hand side is the posterior probability, p(D | M, θ, I) is the likelihood, p(M, θ | I) is the prior, and p(D | I) is the evidence. An "improved belief" is proportional to the product of an "initial belief" and the probability that the "initial belief" generated the observed data.

The Bayes theorem, term by term. Posterior probability: the posterior probability density function (pdf) for model M with parameters θ, given the data D and the additional information I; it has k+1 dimensions. Likelihood: the likelihood of the data given some model M, some fixed values of the parameters θ describing it, and all other prior information I. Prior: the a priori joint probability for model M and its parameters θ in the absence of any of the data used to compute the likelihood.

The Bayes theorem For parameter estimation, only the numerator (the likelihood times the prior) is needed, since the evidence p(D|I) is just a normalizing constant there; the evidence is, however, also needed for model selection.
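An illustrative grid computation (with simulated data, not from the slides) makes the point: likelihood × prior already fixes the shape and mode of the posterior, and the evidence enters only as the normalizing integral.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0
data = rng.normal(loc=5.0, scale=sigma, size=30)  # toy data, sigma known

mu_grid = np.linspace(0.0, 10.0, 1001)            # grid over the parameter mu

# log-likelihood ln p(D | mu, M, I) at each grid point (up to a constant)
loglike = np.sum(-0.5 * ((data[:, None] - mu_grid[None, :]) / sigma) ** 2, axis=0)
prior = np.full_like(mu_grid, 1.0 / 10.0)         # flat prior p(mu | M, I) on [0, 10]

unnorm_post = np.exp(loglike - loglike.max()) * prior  # likelihood x prior
evidence = np.trapz(unnorm_post, mu_grid)              # normalizing integral
posterior = unnorm_post / evidence                     # proper PDF over mu

# The mode (and the whole shape) of the posterior is set by likelihood x prior;
# dividing by the evidence changes nothing for parameter estimation.
print("posterior mode:", mu_grid[np.argmax(posterior)])
```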

How to choose a prior Informative priors are based on previous measurements. For example, when analyzing the radial velocity of a star modulated by a transiting planet, RV = Vsys + K*cos(2πt/P), the period of the planet can be constrained very well from the transit periodicity as P ± e_P. Uninformative priors are used when no information other than the data being analyzed is available (but note: they can still incorporate weak but objective information); an example is the systemic radial velocity of the star, Vsys.
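As a rough sketch of what these two kinds of prior might look like in code (all numbers and bounds here are hypothetical, not from the slides):

```python
import numpy as np

# Informative prior on the period P: Gaussian centred on the transit
# ephemeris, P = 3.52 +/- 0.01 days (hypothetical values)
P0, eP = 3.52, 0.01

def log_prior_P(P):
    return -0.5 * ((P - P0) / eP) ** 2 - np.log(eP * np.sqrt(2 * np.pi))

# Uninformative (flat) prior on the systemic velocity Vsys, bounded only
# by a plausible instrumental range (hypothetical bounds, km/s)
def log_prior_Vsys(Vsys, lo=-100.0, hi=100.0):
    return -np.log(hi - lo) if lo < Vsys < hi else -np.inf

# The RV model from the slide: RV(t) = Vsys + K * cos(2*pi*t / P)
def rv_model(t, Vsys, K, P):
    return Vsys + K * np.cos(2 * np.pi * t / P)
```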

How to choose a prior Rules for selecting uninformative priors: The principle of indifference states that a set of basic, mutually exclusive possibilities must be assigned equal probabilities (e.g., for a fair six-sided die, each outcome has a prior probability of 1/6). The principle of consistency, based on transformation groups, demands that the prior for a location parameter should not change under translations of the coordinate system, which yields a flat prior; likewise, the prior for a scale parameter should not depend on the choice of units, and the solution is p(σ|I) ∝ 1/σ (equivalently, a flat prior for ln σ), called a scale-invariant prior. The principle of maximum entropy applies when we have additional weak prior information about some parameter, such as a low-order statistic: the prior is chosen to maximize its entropy subject to the constraints imposed by that information.
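A quick numeric check (illustrative only) that the scale-invariant prior p(σ|I) ∝ 1/σ is the same as a flat prior in ln σ: sampling it by inverse-CDF and histogramming ln σ gives roughly equal counts per bin.

```python
import numpy as np

rng = np.random.default_rng(1)
lo, hi = 0.1, 100.0  # support of the prior (must exclude sigma = 0)

# Inverse-CDF sampling from p(sigma) ∝ 1/sigma on [lo, hi]:
# CDF(s) = ln(s/lo) / ln(hi/lo)  =>  s = lo * (hi/lo)**u  for u ~ Uniform(0, 1)
u = rng.uniform(size=100_000)
sigma = lo * (hi / lo) ** u

counts, _ = np.histogram(np.log(sigma), bins=10)
print(counts)  # roughly equal counts per bin: flat in ln(sigma)
```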