Introduction to Bayesian Inference

September 15th, 2010
Reading: Hoff, Chapters 1-2

Probability: Measurement of Uncertainty

- Counting: equally likely outcomes (card and dice games, sampling).
- Long-run frequency: Hurricane in September? Relative frequency from similar years.
- Subjective: a measure of belief. May combine expert opinion, computer simulations, and historic data from not necessarily identical conditions. Applies to unique events where long-run frequencies are unavailable (e.g., the probability of a terrorist attack).

Belief Functions

A belief function assigns numbers to statements such that the larger the number, the higher the degree of belief. Belief functions allow you to combine evidence from different sources and arrive at a degree of belief (represented by a belief function) that takes all the available evidence into account.

For statements $F$, $G$, and $H$:
- $\mathrm{Be}(F) > \mathrm{Be}(G)$ implies that we would prefer to bet that $F$ is true rather than that $G$ is true.
- $\mathrm{Be}(F \mid H) > \mathrm{Be}(G \mid H)$ implies that, if we know $H$ is true, we would prefer to bet that $F$ is also true rather than that $G$ is true.

Axioms of Beliefs

B1: $\mathrm{Be}(\text{not } H \mid H) \le \mathrm{Be}(F \mid H) \le \mathrm{Be}(H \mid H)$
B2: $\mathrm{Be}(F \text{ or } G \mid H) \ge \max\{\mathrm{Be}(F \mid H),\, \mathrm{Be}(G \mid H)\}$
B3: $\mathrm{Be}(F \text{ and } G \mid H)$ can be derived from $\mathrm{Be}(F \mid H)$ and $\mathrm{Be}(G \mid F \text{ and } H)$.

Probability Axioms (Conditional Version)

P1: $0 = P(H^c \mid H) \le P(F \mid H) \le P(H \mid H) = 1$
P2: $P(F \cup G \mid H) = P(F \mid H) + P(G \mid H)$ if $F \cap G = \emptyset$
P3: $P(F \cap G \mid H) = P(F \mid H)\, P(G \mid F \cap H)$

Probability functions satisfy the belief axioms. Dempster-Shafer theory is a generalization of the Bayesian approach.

Probability

Consider an event $E$ and a partition of $\mathcal{H}$ into disjoint sets $H_i$:

$$H_i \cap H_j = \emptyset \text{ for } i \ne j, \qquad \bigcup_i H_i = \mathcal{H}$$

Suppose we are given $P(H_j)$ and $P(E \mid H_j)$. If $E$ occurs, how do we update our beliefs about the $H_i$? Bayes' Theorem.

Bayes' Theorem

Bayes' Theorem reverses the conditioning: what is the probability of $H_i$ given that $E$ has occurred?

$$P(H_i \mid E) = \frac{P(H_i \cap E)}{P(E)} = \frac{P(E \mid H_i)\, P(H_i)}{P(E)}$$

Law of Total Probability: $P(E) = \sum_j P(E \mid H_j)\, P(H_j)$
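
As a concrete illustration, here is a minimal Python sketch of this update over a finite partition (the function `bayes_update` and its interface are hypothetical, not from the slides):

```python
def bayes_update(priors, likelihoods):
    """Return the posteriors P(H_j | E) given priors P(H_j) and likelihoods P(E | H_j)."""
    # Law of Total Probability: P(E) = sum_j P(E | H_j) P(H_j)
    p_e = sum(lik * pri for lik, pri in zip(likelihoods, priors))
    # Bayes' Theorem applied to each element of the partition
    return [lik * pri / p_e for lik, pri in zip(likelihoods, priors)]
```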

Twin Example

Twins are either identical (a single egg that splits) or fraternal (two eggs). The possible sex combinations are Girl/Girl (GG), Boy/Boy (BB), Girl/Boy (GB), and Boy/Girl (BG). GG or BB twins could be identical ($I$) or fraternal ($F$). Given that I have GG twins, what is the probability that they are identical, $P(I \mid GG)$? It is easy to calculate the probability the other way around, $P(GG \mid I)$.

Identical Twins

$$P(I \mid GG) = \frac{P(GG \mid I)\, P(I)}{P(GG \mid I)\, P(I) + P(GG \mid F)\, P(F)}$$

$P(I)$ is the overall probability of having identical twins (among pregnancies with twins). Real-world data show that about 1/3 of all twins are identical; this is our prior information. If the twins are identical, they are either two boys or two girls, and it is reasonable to assume the two cases are equally probable: $P(GG \mid I) = 1/2$. For non-identical twins, the four possible outcomes are assumed to be equally likely, so $P(GG \mid F) = 1/4$.

Result

Substituting the probabilities:

$$P(GG) = \frac{1}{2} \cdot \frac{1}{3} + \frac{1}{4} \cdot \frac{2}{3} = \frac{1}{3}$$

$$P(I \mid GG) = \frac{P(GG \mid I)\, P(I)}{P(GG)} = \frac{\frac{1}{2} \cdot \frac{1}{3}}{\frac{1}{3}} = \frac{1}{2}$$

The probability that the twins are identical, given that they are both girls, is 1/2. The result combines the experimental data (two girls) with the prior information (1/3 of all twins are identical).
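
The arithmetic is easy to verify numerically; a minimal check using the slide's own probabilities:

```python
# Prior and sampling probabilities from the twin example
p_i, p_f = 1/3, 2/3           # P(I), P(F): about 1/3 of twins are identical
p_gg_i, p_gg_f = 1/2, 1/4     # P(GG | I), P(GG | F)

p_gg = p_gg_i * p_i + p_gg_f * p_f   # Law of Total Probability: 1/3
p_i_gg = p_gg_i * p_i / p_gg         # Bayes' Theorem: 1/2
print(p_gg, p_i_gg)                  # 0.3333..., 0.5
```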

Statistical Analysis

Statistical induction is the process of learning about the general characteristics of a population from a subset (sample) of its members.
- The characteristics are often expressed in terms of parameters $\theta$.
- Measurements on the sampled members are given by numerical values $Y$.
- Before the data are observed, both $Y$ and $\theta$ are unknown.
- A probability model describes the observed data as if we knew the true value of $\theta$.
- What if we have prior information about $\theta$?

Bayesian Inference

Bayesian inference provides a formal approach for updating prior beliefs with the observed data to quantify uncertainty about $\theta$ a posteriori.

Prior distribution: $p(\theta)$
Sampling model: $p(y \mid \theta)$
Posterior distribution:

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int_\Theta p(y \mid \theta)\, p(\theta)\, d\theta}$$

(For discrete support of $\theta$, replace the integral with a sum.)
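
When $\theta$ is continuous, the normalizing integral rarely has a closed form. A common first approximation is to evaluate $p(y \mid \theta)\, p(\theta)$ on a grid of $\theta$ values and normalize numerically. In the sketch below, the binomial sampling model, the uniform prior, and the data are illustrative assumptions, not taken from the slides:

```python
import numpy as np
from scipy.stats import binom

theta = np.linspace(0.0, 1.0, 1001)      # grid over the parameter space [0, 1]
prior = np.ones_like(theta)              # assumed uniform prior p(theta)
likelihood = binom.pmf(3, 20, theta)     # assumed data: 3 successes in 20 trials
unnormalized = likelihood * prior        # p(y | theta) p(theta) on the grid
posterior = unnormalized / np.trapz(unnormalized, theta)  # numeric normalization
```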

Analysis Goals

Bayesian methods go beyond the formal updating of the prior distribution to obtain a posterior distribution:
- Estimation of uncertain quantities (parameters) with good statistical properties
- Prediction of future events
- Tests of hypotheses
- Making decisions

Sampling Models

There are at least two physical interpretations of sampling models.

1. Random sampling model:
- a well-defined population
- individuals in the sample are drawn from the population in a random way
- one could conceivably measure the entire population
- the probability model specifies properties of the full population

Infection Rate

Interest is in the prevalence of a disease (say, H1N1 flu) in a city. Rather than taking a census of the population, take a random sample of 20 individuals.
- $\theta$: fraction of infected individuals in the city
- $Y_i$: indicator that the $i$th individual in the sample of 20 is infected

What is a model for $Y_i$ given $\theta$?
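
A natural choice is $Y_i \mid \theta \sim$ i.i.d. Bernoulli($\theta$), so the number infected in the sample is Binomial(20, $\theta$). With a conjugate Beta prior, the posterior is then available in closed form. In this sketch the Beta(1, 1) prior and the observed count are illustrative assumptions; the slide stops at posing the model:

```python
from scipy.stats import beta

a, b = 1, 1      # assumed Beta(1, 1), i.e., uniform, prior on theta
n, y = 20, 3     # assumed data: 3 of the 20 sampled individuals infected

# Conjugacy: theta | y ~ Beta(a + y, b + n - y)
posterior = beta(a + y, b + n - y)
print(posterior.mean())          # posterior mean of the infection rate
print(posterior.interval(0.95))  # central 95% posterior interval
```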

Sampling Models (continued)

2. Observations are made on a system subject to random fluctuations:
- the probability model specifies what would happen if, hypothetically, observations were repeated again and again under the same conditions
- the population is hypothetical
- example: climate change

Mathematical versus Statistical Models

Flipping a coin: given enough physics and accurate measurements of the initial conditions, could the outcome of a coin toss be predicted with certainty? That would be a deterministic model. Errors come from wrong assumptions in the mathematical model, and simplifications in the assumptions produce seemingly random outcomes. "All models are wrong, but some are more useful than others!" (George Box)

Probability Distributions for the Random Outcomes

Parametric probability models:
- Bernoulli/Binomial
- Multinomial
- Poisson
- Normal
- Log-normal
- Exponential
- Gamma

$$Y_i \mid \theta \overset{\text{iid}}{\sim} f(y \mid \theta) \quad \text{for } i = 1, \ldots, n$$

Exchangeability

Let $p(y_1, \ldots, y_n)$ be the joint distribution of $Y_1, \ldots, Y_n$ and let $\pi_1, \ldots, \pi_n$ be a permutation of the indices $1, \ldots, n$. If

$$p(y_1, \ldots, y_n) = p(y_{\pi_1}, \ldots, y_{\pi_n})$$

for all permutations, then $Y_1, \ldots, Y_n$ are exchangeable.

de Finetti's Theorem: Let $Y_1, Y_2, \ldots$ be a sequence of random variables. If, for every $n$, $Y_1, \ldots, Y_n$ are exchangeable, then there exist a prior distribution $p(\theta)$ and a sampling model $p(y \mid \theta)$ such that

$$p(y_1, \ldots, y_n) = \int_\Theta \left\{ \prod_{i=1}^n p(y_i \mid \theta) \right\} p(\theta)\, d\theta$$
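
The representation can be checked concretely for binary sequences. If $p(\theta)$ is Beta($a, b$) and $p(y \mid \theta)$ is Bernoulli($\theta$), the integral above has the closed form $B(a + s,\, b + n - s)/B(a, b)$ with $s = \sum_i y_i$, which depends on the sequence only through $s$ and is therefore invariant under permutations. A minimal sketch; the Beta(2, 3) prior is an illustrative assumption:

```python
from itertools import permutations
from scipy.special import beta as beta_fn

def joint_pmf(ys, a=2.0, b=3.0):
    """Exchangeable joint pmf: integral of prod_i Bernoulli(y_i | theta) against Beta(a, b)."""
    n, s = len(ys), sum(ys)
    # Closed form of the Beta-Bernoulli integral; depends on ys only through s
    return beta_fn(a + s, b + n - s) / beta_fn(a, b)

seq = (1, 1, 0, 1, 0)
print({joint_pmf(p) for p in set(permutations(seq))})  # one value: permutation invariant
```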

Models

$$\left.\begin{array}{l} Y_1, \ldots, Y_n \mid \theta \overset{\text{iid}}{\sim} p(y \mid \theta) \\ \theta \sim p(\theta) \end{array}\right\} \implies Y_1, \ldots, Y_n \text{ are exchangeable for all } n$$

Applicable if $Y_1, \ldots, Y_n$ are:
- outcomes of a repeatable experiment
- a random sample from a finite population with replacement
- sampled from an infinite population without replacement
- sampled from a finite population of size $N \gg n$ without replacement (approximately)

Labels carry no information.

Infectious Disease Example

Does the model of exchangeability make sense here?