Lecture 13: Markov Chain Monte Carlo. Gibbs sampling


Lecture 13: Markov Chain Monte Carlo
Gibbs sampling. Markov chains. (Slide 1)

Recall: Approximate inference using samples
Main idea: we generate samples from our Bayes net, then compute probabilities using (weighted) counts. But we may need a lot of work to get enough samples (e.g. if the CPDs are very extreme). Rejection sampling and likelihood weighting are also specific to directed models. (Slide 2)

Recall: Forward sampling
The sprinkler network: Cloudy, with children Sprinkler and Rain, which are both parents of Wet Grass. CPTs: P(C) = 0.5; P(S | C) = 0.10, P(S | not C) = 0.50; P(R | C) = 0.80, P(R | not C) = 0.20; P(W | S, R) = 0.99, P(W | S, not R) = 0.90, P(W | not S, R) = 0.90, P(W | not S, not R) = 0.00.
Each sample is constructed from scratch! We saw two alternatives for incorporating the evidence: throw away samples that are inconsistent with it, or force the evidence variables, but then weigh the samples. In both cases, after a sample is constructed, we start a new one from scratch. (Slide 3)

A different idea
(Same network and CPTs as above.) Suppose we want to compute a conditional probability given some evidence. We generate one sample, with the given evidence variables instantiated correctly. Then we keep changing it! If we are careful, we will get samples from the correct distribution. (Slide 4)
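As a concrete illustration of forward sampling and of the first evidence-handling alternative (rejection), here is a minimal sketch in Python using the sprinkler network's CPTs; the function and variable names are mine:

```python
import random

random.seed(0)

# CPTs of the sprinkler network: Cloudy -> {Sprinkler, Rain} -> Wet Grass
P_C = 0.5
P_S_given_C = {True: 0.10, False: 0.50}   # P(S=true | C)
P_R_given_C = {True: 0.80, False: 0.20}   # P(R=true | C)
P_W_given_SR = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.00}  # P(W=true | S, R)

def bernoulli(p):
    """Return True with probability p."""
    return random.random() < p

def forward_sample():
    """Draw one joint sample in topological order, from scratch."""
    c = bernoulli(P_C)
    s = bernoulli(P_S_given_C[c])
    r = bernoulli(P_R_given_C[c])
    w = bernoulli(P_W_given_SR[(s, r)])
    return {"C": c, "S": s, "R": r, "W": w}

# Rejection sampling for P(R | W=true): throw away inconsistent samples.
samples = [forward_sample() for _ in range(100_000)]
kept = [x for x in samples if x["W"]]
p_rain_given_wet = sum(x["R"] for x in kept) / len(kept)
```

Note how wasteful this is: every rejected sample is an entire joint assignment built from scratch, which is exactly the inefficiency Gibbs sampling avoids.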

Gibbs sampling
1. Initialization: Set the evidence variables E to the observed values e. Set all other variables to random values (e.g. by forward sampling, or uniform sampling). This gives us a sample (x_1, ..., x_n).
2. Repeat (as much as wanted): Pick a non-evidence variable X_i uniformly at random. Sample a new value x_i' from P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n). Keep all other values: x_j' = x_j for j != i. The new sample is (x_1, ..., x_i', ..., x_n).
3. Alternatively, you can march through the variables in some predefined order. (Slide 5)

Why Gibbs works in Bayes nets
The key step is sampling according to P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n). How do we compute this? In Bayes nets, we know that a variable is conditionally independent of all others given its Markov blanket (parents, children, spouses):
P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n) = P(X_i | MarkovBlanket(X_i))
So we need to sample from P(X_i | MarkovBlanket(X_i)). Let Y_1, ..., Y_k be the children of X_i. You will show (next homework) that:
P(x_i | MarkovBlanket(X_i)) is proportional to P(x_i | Parents(X_i)) * prod_{j=1..k} P(y_j | Parents(Y_j)) (Slide 6)
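The per-variable sampling step can be sketched concretely on the sprinkler network: the Markov blanket of Rain is {Cloudy (parent), Wet Grass (child), Sprinkler (spouse)}, so its Gibbs conditional is proportional to P(r | c) * P(w | s, r). A minimal sketch, with function and variable names of my own choosing:

```python
import random

# Sprinkler-network CPTs (as in the forward-sampling slide)
P_R_given_C = {True: 0.80, False: 0.20}   # P(R=true | C)
P_W_given_SR = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.00}  # P(W=true | S, R)

def resample_rain(c, s, w):
    """One Gibbs step for R.

    P(r | MarkovBlanket(R)) is proportional to P(r | c) * P(w | s, r),
    i.e. R's own CPT entry times the CPT entry of its child W.
    """
    weights = {}
    for r in (True, False):
        p_r = P_R_given_C[c] if r else 1.0 - P_R_given_C[c]
        p_w = P_W_given_SR[(s, r)] if w else 1.0 - P_W_given_SR[(s, r)]
        weights[r] = p_r * p_w
    z = weights[True] + weights[False]           # normalize
    return random.random() < weights[True] / z
```

For instance, with c = true, s = false, w = true the weight of r = false is 0.2 * P(w | not s, not r) = 0.2 * 0 = 0, so the step returns r = true with certainty.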

Example
(Sprinkler network and CPTs as above.) 1. Generate a first sample, with the evidence instantiated correctly. 2. Pick a non-evidence variable, say S, and sample it from its conditional given the current values of the other variables. Suppose we get a new value s'. 3. Our new sample keeps all other values and replaces only s with s'. 4. Continue picking variables and resampling them in the same way. (Slide 7)

Analyzing Gibbs sampling
Consider the variables X_1, ..., X_n. Each possible assignment of values to these variables is a state of the world, (x_1, ..., x_n). In Gibbs sampling, we start from a given state s. Based on this, we generate a new state, s'. s' depends only on s! There is a well-defined probability of going from s to s'. Gibbs sampling constructs a Markov chain over the Bayes net. (Slide 8)

Markov chains
A Markov chain is defined by:
A set of states s_1, ..., s_n.
A starting distribution over the set of states, p_0(s_i); we often put these in a vector p_0.
A stationary transition probability p(s_j | s_i); for convenience, we often put these in an n x n matrix T. (Slide 9)

Steady-state (stationary) distribution
Where will the chain be in 1 step? p_1 = T p_0. In two steps? p_2 = T^2 p_0. In t steps? p_t = T^t p_0. A stationary distribution pi is a distribution left invariant by the chain: pi = T pi. Note that some chains can have more than one stationary distribution! (Slide 10)
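A tiny numerical illustration of p_t = T^t p_0 (the two-state chain and its numbers are my own example, not from the slides): repeatedly applying the transition matrix to any starting distribution drives it toward the stationary distribution.

```python
import numpy as np

# Two-state chain: T[i, j] = p(next = j | current = i), rows sum to 1
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])

p = np.array([1.0, 0.0])   # starting distribution p_0 (all mass on state 0)
for _ in range(200):       # p_t = p_0 applied through T, t times
    p = p @ T

# The stationary distribution satisfies pi = pi T; here pi = (0.75, 0.25)
```

Starting instead from (0.0, 1.0) converges to the same (0.75, 0.25), since this chain has a unique stationary distribution.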

Detailed balance
Consider the stationary distribution: pi(s') = sum_s pi(s) p(s' | s). This can be viewed as a flow property: the flow out of s' has to be equal to the flow coming into s' from all states. One way to ensure this is to make the flow equal between any pair of states:
pi(s) p(s' | s) = pi(s') p(s | s')
This gives us a sufficient condition for stationarity, called detailed balance (why?). A Markov chain with this property is called reversible. (Slide 11)

Monte Carlo Markov Chain (MCMC)
Suppose we want to sample data from some distribution pi. We will set up a Markov chain which has the desired distribution pi as its stationary distribution! For this we would like the chain to have a unique stationary distribution, so that we can get samples from it regardless of the starting distribution. (Slide 12)
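The detailed-balance condition is easy to check numerically for a small chain. The two-state chain below is my own example; its stationary distribution is pi = (0.75, 0.25):

```python
import numpy as np

T = np.array([[0.9, 0.1],
              [0.3, 0.7]])       # row-stochastic transition matrix
pi = np.array([0.75, 0.25])      # stationary distribution of T

# Detailed balance: pi(s) p(s'|s) == pi(s') p(s|s') for every pair of states
balanced = all(
    abs(pi[i] * T[i, j] - pi[j] * T[j, i]) < 1e-12
    for i in range(2) for j in range(2)
)

# Stationarity then follows by summing the balance equations over s: pi = pi T
stationary = np.allclose(pi @ T, pi)
```

Here the pairwise flows match (0.75 * 0.1 = 0.25 * 0.3 = 0.075), so the chain is reversible and pi is indeed left invariant.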

Ergodicity
An ergodic Markov chain is one in which any state is reachable from any other state, and there are no strictly periodic cycles. In such a chain, there is a unique stationary distribution, which can be obtained as:
pi = lim_{t -> infinity} T^t p_0
This is called the equilibrium distribution. Note that the chain reaches the equilibrium distribution regardless of p_0. (Slide 13)

Sampling from the equilibrium distribution
We can sample just by running the chain a long time: set s_0 arbitrarily; for t = 0, ..., m - 1, if s_t = s, sample a value s' for s_{t+1} based on p(s' | s); return s_m. If m is large enough, this will be a sample from pi. In practice, you would like to have a rapidly mixing chain, i.e. one that reaches the equilibrium quickly. (Slide 14)

Implementation issues
The initial samples are influenced by the starting distribution, so they need to be thrown away. This is called the burn-in stage. Because burn-in can take a while, we would like to draw several samples from the same chain! However, if we take consecutive samples, they will be highly correlated. Usually we wait for burn-in, then take every k-th sample, for some k sufficiently large. This will ensure that the samples are (for all practical purposes) uncorrelated. (Slide 15)

Gibbs sampling as MCMC
We have a set of random variables X = {X_1, ..., X_n}, with evidence variables E = e. We want to sample from P(X - E | e). Let X_i be the variable to be sampled, currently set to x_i, and let x_-i be the values of all other variables in X - E. The transition probability for the chain is:
p(s' | s) = P(x_i' | x_-i, e)
Obviously the chain is ergodic. We want to show that P(X - E | e) is the stationary distribution. (Slide 16)
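Burn-in and thinning can be sketched as follows. The toy two-state chain and all the constants here are my own illustration (its stationary distribution puts probability 0.25 on state 1):

```python
import random

random.seed(1)

def chain_step(x):
    """One step of a toy two-state chain: p(0->1) = 0.1, p(1->0) = 0.3."""
    if x == 0:
        return 1 if random.random() < 0.1 else 0
    return 0 if random.random() < 0.3 else 1

BURN_IN, THIN, N = 1000, 10, 5000
x = 0
for _ in range(BURN_IN):      # burn-in: discard samples tied to the start
    x = chain_step(x)

samples = []
while len(samples) < N:
    for _ in range(THIN):     # thinning: keep only every THIN-th sample
        x = chain_step(x)
    samples.append(x)

frac_state1 = sum(samples) / N   # should approach pi(1) = 0.1 / (0.1 + 0.3)
```

With THIN = 10 the kept samples are nearly uncorrelated for this chain, so the empirical fraction of time in state 1 concentrates around 0.25.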

Gibbs satisfies detailed balance
pi(s) p(s' | s) = P(x_i, x_-i | e) P(x_i' | x_-i, e)
= P(x_i | x_-i, e) P(x_-i | e) P(x_i' | x_-i, e)  (by chain rule)
= P(x_i | x_-i, e) P(x_i', x_-i | e)  (backwards chain rule)
= p(s | s') pi(s') (Slide 17)