MCMC and Gibbs Sampling. Kayhan Batmanghelich

Similar documents
Markov Chain Monte Carlo (MCMC)

Monte Carlo Methods. Leon Gu CSD, CMU

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Graphical Models and Kernel Methods

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

17 : Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Machine Learning for Data Science (CS4786) Lecture 24

STA 4273H: Statistical Machine Learning

Results: MCMC Dancers, q=10, n=500

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo

16 : Approximate Inference: Markov Chain Monte Carlo

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Markov Networks.

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018

Markov chain Monte Carlo Lecture 9

19 : Slice Sampling and HMC

16 : Markov Chain Monte Carlo (MCMC)

CPSC 540: Machine Learning

Bayesian Inference and MCMC

MCMC: Markov Chain Monte Carlo

13: Variational inference II

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

The Ising model and Markov chain Monte Carlo

Approximate Inference using MCMC

28 : Approximate Inference - Distributed MCMC

Markov Chain Monte Carlo

Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch.

Markov chain Monte Carlo

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Answers and expectations

Introduction to Machine Learning CMU-10701

Bayes Nets: Sampling

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

Particle-Based Approximate Inference on Graphical Model

Probabilistic Machine Learning

Bayesian Methods for Machine Learning

Markov Chain Monte Carlo, Numerical Integration

Bayesian Phylogenetics:

Bayesian networks: approximate inference

Lecture 13 : Variational Inference: Mean Field Approximation

Sampling Algorithms for Probabilistic Graphical models

Monte Carlo in Bayesian Statistics

Probabilistic Graphical Models

Variational Inference. Sargur Srihari

Monte Carlo Methods. Geoff Gordon February 9, 2006

Inference in Bayesian Networks

Quantifying Uncertainty

14 : Approximate Inference Monte Carlo Methods

Calibration of Stochastic Volatility Models using Particle Markov Chain Monte Carlo Methods

Introduction to Machine Learning

Markov Chain Monte Carlo

Markov Chains and MCMC

Lecture 7 and 8: Markov Chain Monte Carlo

Particle Filtering a brief introductory tutorial. Frank Wood Gatsby, August 2007

CS 343: Artificial Intelligence

Computational statistics

13 : Variational Inference: Loopy Belief Propagation and Mean Field

CS 188: Artificial Intelligence. Bayes Nets

Sequential Monte Carlo and Particle Filtering. Frank Wood Gatsby, November 2007

MCMC algorithms for fitting Bayesian models

MARKOV CHAIN MONTE CARLO

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Convex Optimization CMU-10725

Markov chain Monte Carlo methods for visual tracking

Probabilistic Graphical Networks: Definitions and Basic Results

Sampling Methods (11/30/04)

RANDOM TOPICS. stochastic gradient descent & Monte Carlo

Markov chain Monte Carlo

Markov Chain Monte Carlo methods

An introduction to Sequential Monte Carlo

Stochastic optimization Markov Chain Monte Carlo

Introduction to Stochastic Gradient Markov Chain Monte Carlo Methods

Artificial Intelligence

Pattern Recognition and Machine Learning

Bayesian Inference for Dirichlet-Multinomials

Stochastic Simulation

Simulation - Lectures - Part III Markov chain Monte Carlo

LECTURE 15 Markov chain Monte Carlo

MCMC Sampling for Bayesian Inference using L1-type Priors

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Lecture 6: Markov Chain Monte Carlo

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

Latent Dirichlet Allocation

DAG models and Markov Chain Monte Carlo methods a short overview

Sampling Methods. Bishop PRML Ch. 11. Alireza Ghane. Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo

MCMC Methods: Gibbs and Metropolis

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution

Lecture 8: Bayesian Networks

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17

Reminder of some Markov Chain properties:

I. Bayesian econometrics

Monte Carlo Inference Methods

Lecture 8: Bayesian Estimation of Parameters in State Space Models

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Transcription:

MCMC and Gibbs Sampling Kayhan Batmanghelich 1

Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction tree algorithms l Approximate inference techniques l l l Variational algorithms l l Loopy belief propagation Mean field approximation Stochastic simulation / sampling methods Markov chain Monte Carlo methods Eric Xing @ CMU, 2005-2015 2

<latexit sha1_base64="fc+7qcw5lwx1mpkmfwp1fwazqck=">aaacf3icbva7t8mwghr4lvikmljyvejtuhkebayvklgyw4nqsk1uoy7twnuesh3ukprnspbxwbgasclgv8fpm0dlszzpd98n+86ngrxsml61pewv1bx10kz5c2t7z1ff278tucixsxdeit51kscmhsssvdlsjtlbgctixx1d537nnnbbo/bwpjfxajqiqu8xkkrq6yd2gotqdbpwpjo2zidxdtag7xoem1ts5phcgncm2aidx7cvv4y6mqvcjgzbkqbaq69/2v6ek4ceejmkrm80yulkieukgzmu7usqgoergpceoiekihcyabajpfakb/2iqxnkofv/b2qoecinxdwzxxdzxi7+5/us6v84gq3jrjiqzx7yewzlbpowoec5wzkliidmqforxeokwpgqy7iqwzypveis0/pl3wyfvzpxrrslcaioqbwy4bw0wq1oaqtg8aiewst40560f+1d+5inlmnfzgh4a+3zb6qhnxw=</latexit> <latexit sha1_base64="fc+7qcw5lwx1mpkmfwp1fwazqck=">aaacf3icbva7t8mwghr4lvikmljyvejtuhkebayvklgyw4nqsk1uoy7twnuesh3ukprnspbxwbgasclgv8fpm0dlszzpd98n+86ngrxsml61pewv1bx10kz5c2t7z1ff278tucixsxdeit51kscmhsssvdlsjtlbgctixx1d537nnnbbo/bwpjfxajqiqu8xkkrq6yd2gotqdbpwpjo2zidxdtag7xoem1ts5phcgncm2aidx7cvv4y6mqvcjgzbkqbaq69/2v6ek4ceejmkrm80yulkieukgzmu7usqgoergpceoiekihcyabajpfakb/2iqxnkofv/b2qoecinxdwzxxdzxi7+5/us6v84gq3jrjiqzx7yewzlbpowoec5wzkliidmqforxeokwpgqy7iqwzypveis0/pl3wyfvzpxrrslcaioqbwy4bw0wq1oaqtg8aiewst40560f+1d+5inlmnfzgh4a+3zb6qhnxw=</latexit> <latexit sha1_base64="fc+7qcw5lwx1mpkmfwp1fwazqck=">aaacf3icbva7t8mwghr4lvikmljyvejtuhkebayvklgyw4nqsk1uoy7twnuesh3ukprnspbxwbgasclgv8fpm0dlszzpd98n+86ngrxsml61pewv1bx10kz5c2t7z1ff278tucixsxdeit51kscmhsssvdlsjtlbgctixx1d537nnnbbo/bwpjfxajqiqu8xkkrq6yd2gotqdbpwpjo2zidxdtag7xoem1ts5phcgncm2aidx7cvv4y6mqvcjgzbkqbaq69/2v6ek4ceejmkrm80yulkieukgzmu7usqgoergpceoiekihcyabajpfakb/2iqxnkofv/b2qoecinxdwzxxdzxi7+5/us6v84gq3jrjiqzx7yewzlbpowoec5wzkliidmqforxeokwpgqy7iqwzypveis0/pl3wyfvzpxrrslcaioqbwy4bw0wq1oaqtg8aiewst40560f+1d+5inlmnfzgh4a+3zb6qhnxw=</latexit> <latexit sha1_base64="fc+7qcw5lwx1mpkmfwp1fwazqck=">aaacf3icbva7t8mwghr4lvikmljyvejtuhkebayvklgyw4nqsk1uoy7twnuesh3ukprnspbxwbgasclgv8fpm0dlszzpd98n+86ngrxsml61pewv1bx10kz5c2t7z1ff278tucixsxdeit51kscmhsssvdlsjtlbgctixx1d537nnnbbo/bwpjfxajqiqu8xkkrq6yd2gotqdbpwpjo2zidxdtag7xoem1ts5phcgncm2aidx7cvv4y6mqvcjgzbkqbaq69/2v6ek4ceejmkrm80yulkieukgzmu7usqgoergpceoiekihcyabajpfakb/2iqxnkofv/b2qoecinxdwzxxdzxi7+5/us6v84gq3jrjiqzx7yewzlbpowoec5wzkliidmqforxeokwpgqy7iqwzypveis0/pl3wyfvzpxrrslcaioqbwy4bw0wq1oaqtg8aiewst40560f+1d+5inlmnfzgh4a+3zb6qhnxw=</latexit> <latexit sha1_base64="c3pt6y845+h/bz9d9nfnvlv4uk8=">aaacbnicbvdlssnafj34rpuvdsniybhqpiqiqoci6mzlc8ywmlamk0k7ddijmxoxhozc+ctuxki49rvc+tdo2iy09cdlhs65l5l7/irrqszr21hyxfpewa2svdc3nre2zz3doxmnahmhxywwxr9jwignjqkkkw4icip8rjr+6lrwo/desbrzwzvoibehaachxuhpqw8eukfaohmvzqhjwnn94tjpru2iwuto982a1bamgppelkknlgj1zs83iheaea4wq1l2bctrxoaeopirvoqmkiqij9ca9dtlkclsyyz35pbikwemy6glkzhrf29kkjjyhpl6mkjqkge9qvzp66uqppcyypnuey6nd4upgyqgrsgwoijgxcaaicyo/iveq6sdutq6qg7bnj15njgnjyug3t6tna/kncpghxycordbgwicg9acdsdgetydv/bmpbkvxrvxmr1dmmqdpfahxucp+tmyra==</latexit> <latexit sha1_base64="c3pt6y845+h/bz9d9nfnvlv4uk8=">aaacbnicbvdlssnafj34rpuvdsniybhqpiqiqoci6mzlc8ywmlamk0k7ddijmxoxhozc+ctuxki49rvc+tdo2iy09cdlhs65l5l7/irrqszr21hyxfpewa2svdc3nre2zz3doxmnahmhxywwxr9jwignjqkkkw4icip8rjr+6lrwo/desbrzwzvoibehaachxuhpqw8eukfaohmvzqhjwnn94tjpru2iwuto982a1bamgppelkknlgj1zs83iheaea4wq1l2bctrxoaeopirvoqmkiqij9ca9dtlkclsyyz35pbikwemy6glkzhrf29kkjjyhpl6mkjqkge9qvzp66uqppcyypnuey6nd4upgyqgrsgwoijgxcaaicyo/iveq6sdutq6qg7bnj15njgnjyug3t6tna/kncpghxycordbgwicg9acdsdgetydv/bmpbkvxrvxmr1dmmqdpfahxucp+tmyra==</latexit> <latexit sha1_base64="c3pt6y845+h/bz9d9nfnvlv4uk8=">aaacbnicbvdlssnafj34rpuvdsniybhqpiqiqoci6mzlc8ywmlamk0k7ddijmxoxhozc+ctuxki49rvc+tdo2iy09cdlhs65l5l7/irrqszr21hyxfpewa2svdc3nre2zz3doxmnahmhxywwxr9jwignjqkkkw4icip8rjr+6lrwo/desbrzwzvoibehaachxuhpqw8eukfaohmvzqhjwnn94tjpru2iwuto982a1bamgppelkknlgj1zs83iheaea4wq1l2bctrxoaeopirvoqmkiqij9ca9dtlkclsyyz35pbikwemy6glkzhrf29kkjjyhpl6mkjqkge9qvzp66uqppcyypnuey6nd4upgyqgrsgwoijgxcaaicyo/iveq6sdutq6qg7bnj15njgnjyug3t6tna/kncpghxycordbgwicg9acdsdgetydv/bmpbkvxrvxmr1dmmqdpfahxucp+tmyra==</latexit> <latexit sha1_base64="c3pt6y845+h/bz9d9nfnvlv4uk8=">aaacbnicbvdlssnafj34rpuvdsniybhqpiqiqoci6mzlc8ywmlamk0k7ddijmxoxhozc+ctuxki49rvc+tdo2iy09cdlhs65l5l7/irrqszr21hyxfpewa2svdc3nre2zz3doxmnahmhxywwxr9jwignjqkkkw4icip8rjr+6lrwo/desbrzwzvoibehaachxuhpqw8eukfaohmvzqhjwnn94tjpru2iwuto982a1bamgppelkknlgj1zs83iheaea4wq1l2bctrxoaeopirvoqmkiqij9ca9dtlkclsyyz35pbikwemy6glkzhrf29kkjjyhpl6mkjqkge9qvzp66uqppcyypnuey6nd4upgyqgrsgwoijgxcaaicyo/iveq6sdutq6qg7bnj15njgnjyug3t6tna/kncpghxycordbgwicg9acdsdgetydv/bmpbkvxrvxmr1dmmqdpfahxucp+tmyra==</latexit> Recap: Rejection Sampling Steps: Find Q(x) that is easy to sample from. Find k such that k such that: P (x) kq(x) < 1 Sample auxiliary variable y P(y =1 x) = P (x) kq(x) accept the sample with probability P(y=1 x) 3

<latexit sha1_base64="8dao0lgvwvmig3thzz77mqexmxm=">aaace3icbvdlssnafj3uv62vqks3g0woccurqv0irrfcvjc20iqymu7aozmhmzfsevorbvwvny5u3lpx5984abpq1gmdh3pu5c45xiy4atp8ngoli0vlk8xv0tr6xuzwexvnxkwjpmymkyhkyyokcr4ygzgi1ooli4enwnmbxgv+84fjxapwdkyxcwpsc7npkqetdcphtkcg73np9bitdrgjeidjmxye86gn/erwedus9/rg4otouwlwzanwplfyuke5gp3yl9onabkwekggsrutmwy3jri4fwxcchlfykihpmfamoykympnj6hg+earxexhur8q8et9vzgsqklr4onjlika9tlxp6+dgh/mpjyme2ahnr7ye4ehwlldumsloybgmhaquf4rpn0icqxdy0mxym1gnif2ce28zt2evoqxertftif2urvz6btv0q1qibtr9iie0st6m56mf+pd+jiofox8zxf9gfh5a6e/nx0=</latexit> <latexit sha1_base64="8dao0lgvwvmig3thzz77mqexmxm=">aaace3icbvdlssnafj3uv62vqks3g0woccurqv0irrfcvjc20iqymu7aozmhmzfsevorbvwvny5u3lpx5984abpq1gmdh3pu5c45xiy4atp8ngoli0vlk8xv0tr6xuzwexvnxkwjpmymkyhkyyokcr4ygzgi1ooli4enwnmbxgv+84fjxapwdkyxcwpsc7npkqetdcphtkcg73np9bitdrgjeidjmxye86gn/erwedus9/rg4otouwlwzanwplfyuke5gp3yl9onabkwekggsrutmwy3jri4fwxcchlfykihpmfamoykympnj6hg+earxexhur8q8et9vzgsqklr4onjlika9tlxp6+dgh/mpjyme2ahnr7ye4ehwlldumsloybgmhaquf4rpn0icqxdy0mxym1gnif2ce28zt2evoqxertftif2urvz6btv0q1qibtr9iie0st6m56mf+pd+jiofox8zxf9gfh5a6e/nx0=</latexit> <latexit sha1_base64="8dao0lgvwvmig3thzz77mqexmxm=">aaace3icbvdlssnafj3uv62vqks3g0woccurqv0irrfcvjc20iqymu7aozmhmzfsevorbvwvny5u3lpx5984abpq1gmdh3pu5c45xiy4atp8ngoli0vlk8xv0tr6xuzwexvnxkwjpmymkyhkyyokcr4ygzgi1ooli4enwnmbxgv+84fjxapwdkyxcwpsc7npkqetdcphtkcg73np9bitdrgjeidjmxye86gn/erwedus9/rg4otouwlwzanwplfyuke5gp3yl9onabkwekggsrutmwy3jri4fwxcchlfykihpmfamoykympnj6hg+earxexhur8q8et9vzgsqklr4onjlika9tlxp6+dgh/mpjyme2ahnr7ye4ehwlldumsloybgmhaquf4rpn0icqxdy0mxym1gnif2ce28zt2evoqxertftif2urvz6btv0q1qibtr9iie0st6m56mf+pd+jiofox8zxf9gfh5a6e/nx0=</latexit> <latexit sha1_base64="8dao0lgvwvmig3thzz77mqexmxm=">aaace3icbvdlssnafj3uv62vqks3g0woccurqv0irrfcvjc20iqymu7aozmhmzfsevorbvwvny5u3lpx5984abpq1gmdh3pu5c45xiy4atp8ngoli0vlk8xv0tr6xuzwexvnxkwjpmymkyhkyyokcr4ygzgi1ooli4enwnmbxgv+84fjxapwdkyxcwpsc7npkqetdcphtkcg73np9bitdrgjeidjmxye86gn/erwedus9/rg4otouwlwzanwplfyuke5gp3yl9onabkwekggsrutmwy3jri4fwxcchlfykihpmfamoykympnj6hg+earxexhur8q8et9vzgsqklr4onjlika9tlxp6+dgh/mpjyme2ahnr7ye4ehwlldumsloybgmhaquf4rpn0icqxdy0mxym1gnif2ce28zt2evoqxertftif2urvz6btv0q1qibtr9iie0st6m56mf+pd+jiofox8zxf9gfh5a6e/nx0=</latexit> <latexit sha1_base64="9aljmpz0psa/aqahexhvmw1d3ke=">aaac0niclvlpb9mwfhyyykp8kndk8krvqbuuzeladpg0wyudh1yibfptry7rbfydx7odqzwxbulkx8enf4g/aqcnurdx4umrp3/f+/ls95xlzrsjol9buhxv/optnyedr4+fph3wff7is65qrwhckl6p0xxrypmgiwgg01opkc5ztk/y+ftgp7misrnkfdjlsaclphesyaqbt2xd3/20xoyiz+3idzzh8zfflhxbwihmbgoyn9fgwow6ox83c3t6z5men3cwjshfuqpq0abhzn50koq6zhhloes6krmmw0dh3sahg7+htevgrdlubocz1vvl1r4cxp7l187/cyhlur1ogk0c7ok4bt3uxijr/kxnfallkgzhwotjhekztvgzrjj1n6s1lzjm8tmdechwsfxurmbioo+zgrsv8p8wsgi3hraxwi/l3gc2a9c3tyb8lzaptxewtuzi2lbb1owkmoopobkwzjiixpclb5go5s8k5al7phn/ddq+cfhtk98fyd7wcbip3/ao37xd2egv0gs0qdhar8foaxqhbjfgffwf18hxmalt+c38vk4ng9bzet2i8mcfl1zdig==</latexit> <latexit sha1_base64="9aljmpz0psa/aqahexhvmw1d3ke=">aaac0niclvlpb9mwfhyyykp8kndk8krvqbuuzeladpg0wyudh1yibfptry7rbfydx7odqzwxbulkx8enf4g/aqcnurdx4umrp3/f+/ls95xlzrsjol9buhxv/optnyedr4+fph3wff7is65qrwhckl6p0xxrypmgiwgg01opkc5ztk/y+ftgp7misrnkfdjlsaclphesyaqbt2xd3/20xoyiz+3idzzh8zfflhxbwihmbgoyn9fgwow6ox83c3t6z5men3cwjshfuqpq0abhzn50koq6zhhloes6krmmw0dh3sahg7+htevgrdlubocz1vvl1r4cxp7l187/cyhlur1ogk0c7ok4bt3uxijr/kxnfallkgzhwotjhekztvgzrjj1n6s1lzjm8tmdechwsfxurmbioo+zgrsv8p8wsgi3hraxwi/l3gc2a9c3tyb8lzaptxewtuzi2lbb1owkmoopobkwzjiixpclb5go5s8k5al7phn/ddq+cfhtk98fyd7wcbip3/ao37xd2egv0gs0qdhar8foaxqhbjfgffwf18hxmalt+c38vk4ng9bzet2i8mcfl1zdig==</latexit> <latexit sha1_base64="9aljmpz0psa/aqahexhvmw1d3ke=">aaac0niclvlpb9mwfhyyykp8kndk8krvqbuuzeladpg0wyudh1yibfptry7rbfydx7odqzwxbulkx8enf4g/aqcnurdx4umrp3/f+/ls95xlzrsjol9buhxv/optnyedr4+fph3wff7is65qrwhckl6p0xxrypmgiwgg01opkc5ztk/y+ftgp7misrnkfdjlsaclphesyaqbt2xd3/20xoyiz+3idzzh8zfflhxbwihmbgoyn9fgwow6ox83c3t6z5men3cwjshfuqpq0abhzn50koq6zhhloes6krmmw0dh3sahg7+htevgrdlubocz1vvl1r4cxp7l187/cyhlur1ogk0c7ok4bt3uxijr/kxnfallkgzhwotjhekztvgzrjj1n6s1lzjm8tmdechwsfxurmbioo+zgrsv8p8wsgi3hraxwi/l3gc2a9c3tyb8lzaptxewtuzi2lbb1owkmoopobkwzjiixpclb5go5s8k5al7phn/ddq+cfhtk98fyd7wcbip3/ao37xd2egv0gs0qdhar8foaxqhbjfgffwf18hxmalt+c38vk4ng9bzet2i8mcfl1zdig==</latexit> <latexit sha1_base64="9aljmpz0psa/aqahexhvmw1d3ke=">aaac0niclvlpb9mwfhyyykp8kndk8krvqbuuzeladpg0wyudh1yibfptry7rbfydx7odqzwxbulkx8enf4g/aqcnurdx4umrp3/f+/ls95xlzrsjol9buhxv/optnyedr4+fph3wff7is65qrwhckl6p0xxrypmgiwgg01opkc5ztk/y+ftgp7misrnkfdjlsaclphesyaqbt2xd3/20xoyiz+3idzzh8zfflhxbwihmbgoyn9fgwow6ox83c3t6z5men3cwjshfuqpq0abhzn50koq6zhhloes6krmmw0dh3sahg7+htevgrdlubocz1vvl1r4cxp7l187/cyhlur1ogk0c7ok4bt3uxijr/kxnfallkgzhwotjhekztvgzrjj1n6s1lzjm8tmdechwsfxurmbioo+zgrsv8p8wsgi3hraxwi/l3gc2a9c3tyb8lzaptxewtuzi2lbb1owkmoopobkwzjiixpclb5go5s8k5al7phn/ddq+cfhtk98fyd7wcbip3/ao37xd2egv0gs0qdhar8foaxqhbjfgffwf18hxmalt+c38vk4ng9bzet2i8mcfl1zdig==</latexit> <latexit sha1_base64="bb77uxzkemvnag7tr19p+0k1rj0=">aaadghiclvjbb9mwfhytycnc1sejl0eusu0dxykqbkivjnjhyq+trnm0posc19msookxo9aq+hfwv3jjzyccs6v23csswt7+vvodm04oojpkdf+2lpvbw0d7+4+dj0+fpt9oh774ltm8i3rcup5m5ygwlloethrtnj6ljoi45pqsxhwp+bmfnjmstb6placzgf8llgiekwmfh63fxt/g6jomi5huryber2ufhubhgsafrxif05jy9nwxgjcpon2lqmarxarj8leqwbps3d1dngrwzr4hvif0oz2uzxivlceychc9jfiwywylfgvk0maj/t9uf9bws8zr5x1uoj2dsrybwndu+et+r0ulub29rez8cq2191gzo6x1br96+vj0nflnwin2xx241yfdw2umdmrokgj/8ecpywoakmkxlfppfwpw4ewxwqnppzduyllav3rqzathvm6kaom0da0yhyjnze0uvoimoscxlks4nj7ltsjbxanexu1zfx2yfswruaijqrnfoqevqrmomgczjyqvjifjxkytqk6xmaays+uyixi3w941ju8ghwfe+h3n5hmzjx30cr1gpeshy3scvqirmids+me9sd5aa9u2+/ar7dwuvqvrverbx/70h1kefyw=</latexit> <latexit sha1_base64="bb77uxzkemvnag7tr19p+0k1rj0=">aaadghiclvjbb9mwfhytycnc1sejl0eusu0dxykqbkivjnjhyq+trnm0posc19msookxo9aq+hfwv3jjzyccs6v23csswt7+vvodm04oojpkdf+2lpvbw0d7+4+dj0+fpt9oh774ltm8i3rcup5m5ygwlloethrtnj6ljoi45pqsxhwp+bmfnjmstb6placzgf8llgiekwmfh63fxt/g6jomi5huryber2ufhubhgsafrxif05jy9nwxgjcpon2lqmarxarj8leqwbps3d1dngrwzr4hvif0oz2uzxivlceychc9jfiwywylfgvk0maj/t9uf9bws8zr5x1uoj2dsrybwndu+et+r0ulub29rez8cq2191gzo6x1br96+vj0nflnwin2xx241yfdw2umdmrokgj/8ecpywoakmkxlfppfwpw4ewxwqnppzduyllav3rqzathvm6kaom0da0yhyjnze0uvoimoscxlks4nj7ltsjbxanexu1zfx2yfswruaijqrnfoqevqrmomgczjyqvjifjxkytqk6xmaays+uyixi3w941ju8ghwfe+h3n5hmzjx30cr1gpeshy3scvqirmids+me9sd5aa9u2+/ar7dwuvqvrverbx/70h1kefyw=</latexit> <latexit sha1_base64="bb77uxzkemvnag7tr19p+0k1rj0=">aaadghiclvjbb9mwfhytycnc1sejl0eusu0dxykqbkivjnjhyq+trnm0posc19msookxo9aq+hfwv3jjzyccs6v23csswt7+vvodm04oojpkdf+2lpvbw0d7+4+dj0+fpt9oh774ltm8i3rcup5m5ygwlloethrtnj6ljoi45pqsxhwp+bmfnjmstb6placzgf8llgiekwmfh63fxt/g6jomi5huryber2ufhubhgsafrxif05jy9nwxgjcpon2lqmarxarj8leqwbps3d1dngrwzr4hvif0oz2uzxivlceychc9jfiwywylfgvk0maj/t9uf9bws8zr5x1uoj2dsrybwndu+et+r0ulub29rez8cq2191gzo6x1br96+vj0nflnwin2xx241yfdw2umdmrokgj/8ecpywoakmkxlfppfwpw4ewxwqnppzduyllav3rqzathvm6kaom0da0yhyjnze0uvoimoscxlks4nj7ltsjbxanexu1zfx2yfswruaijqrnfoqevqrmomgczjyqvjifjxkytqk6xmaays+uyixi3w941ju8ghwfe+h3n5hmzjx30cr1gpeshy3scvqirmids+me9sd5aa9u2+/ar7dwuvqvrverbx/70h1kefyw=</latexit> <latexit sha1_base64="bb77uxzkemvnag7tr19p+0k1rj0=">aaadghiclvjbb9mwfhytycnc1sejl0eusu0dxykqbkivjnjhyq+trnm0posc19msookxo9aq+hfwv3jjzyccs6v23csswt7+vvodm04oojpkdf+2lpvbw0d7+4+dj0+fpt9oh774ltm8i3rcup5m5ygwlloethrtnj6ljoi45pqsxhwp+bmfnjmstb6placzgf8llgiekwmfh63fxt/g6jomi5huryber2ufhubhgsafrxif05jy9nwxgjcpon2lqmarxarj8leqwbps3d1dngrwzr4hvif0oz2uzxivlceychc9jfiwywylfgvk0maj/t9uf9bws8zr5x1uoj2dsrybwndu+et+r0ulub29rez8cq2191gzo6x1br96+vj0nflnwin2xx241yfdw2umdmrokgj/8ecpywoakmkxlfppfwpw4ewxwqnppzduyllav3rqzathvm6kaom0da0yhyjnze0uvoimoscxlks4nj7ltsjbxanexu1zfx2yfswruaijqrnfoqevqrmomgczjyqvjifjxkytqk6xmaays+uyixi3w941ju8ghwfe+h3n5hmzjx30cr1gpeshy3scvqirmids+me9sd5aa9u2+/ar7dwuvqvrverbx/70h1kefyw=</latexit> Recap: Importance Sampling Previous slide assumed we could evaluate P (x) = P (x)/z P Z E x p [f(x)] = x f(x)p(x) = Rx f(x) p(x) q(x) q(x) R x p(x) q(x) q(x) Let! ",,! % be samples from q(x). Z f(x)p(x) x Pl f(xl ) p(xl ) q(x l ) P l p(x l ) q(x l ) = LX f(x l )w l l=1 4

Recap: Particle Filters Apply the importance sampling to a temporal distribution!(# $:& ) General idea, change the proposal in each step: ( # & # $:& ) The recursion rule: Forward message: 5

Summary so far General ideas for the sampling approaches Proposal distribution (q(x)): Use another distribution to sample from. Change the proposal distribution with the iterations. Introduce an auxiliary variable to decide keeping a sample or not. Why should we discard samples? Sampling from high-dimension is difficult. Let s incorporate the graphical model into our sampling strategy. Can we use the gradient of the p(x)? 6

Summary so far General ideas for the sampling approaches Proposal distribution (q(x)): Use another distribution to sample from. Change the proposal distribution with the iterations. Introduce an auxiliary variable to decide keeping a sample or not. Why should we discard samples? Sampling from high-dimension is difficult. Let s incorporate the graphical model into our sampling strategy. Can we use the gradient of the p(x)? 7

<latexit sha1_base64="wbxyijmaiemebe6qqkqhmxzfhpu=">aaacahicbzdnssnafivv6l+tf1e3gpvbirgqiqjqqii6csnumlbqhjkzttqhk0mymrrkqbtfxy0lfbc+hjvfxmkbufspdhycey937gkszpr2nc+rslc4tlxsxc2trw9sbtnbo/cqtiwhhol5lbsbvpqzqt3nnkenrficbzzwg/7vuf4fuklylo70mkf+hluchyxgbay2vxedbu2sxusohyn08cntu+xuninqplg5lcfxrw1/tjoxssmqnofyqabrjnrpsnsmcdoqtvjfe0z6ueubbgwoqpkzyqujdgicdgpjaz7qaol+nshwpnqwckxnhhvpzdbg5n+1zqrdmz9jikk1fws6kew50jeax4e6tfki+daajpkzvylswxitbuirmrdc2zpnwtuunffc25ny9tjpowj7cabh4mipvoeaauabgqd4ghd4tr6tz+vnep+2fqx8zhf+ypr4bhbolt8=</latexit> <latexit sha1_base64="wbxyijmaiemebe6qqkqhmxzfhpu=">aaacahicbzdnssnafivv6l+tf1e3gpvbirgqiqjqqii6csnumlbqhjkzttqhk0mymrrkqbtfxy0lfbc+hjvfxmkbufspdhycey937gkszpr2nc+rslc4tlxsxc2trw9sbtnbo/cqtiwhhol5lbsbvpqzqt3nnkenrficbzzwg/7vuf4fuklylo70mkf+hluchyxgbay2vxedbu2sxusohyn08cntu+xuninqplg5lcfxrw1/tjoxssmqnofyqabrjnrpsnsmcdoqtvjfe0z6ueubbgwoqpkzyqujdgicdgpjaz7qaol+nshwpnqwckxnhhvpzdbg5n+1zqrdmz9jikk1fws6kew50jeax4e6tfki+daajpkzvylswxitbuirmrdc2zpnwtuunffc25ny9tjpowj7cabh4mipvoeaauabgqd4ghd4tr6tz+vnep+2fqx8zhf+ypr4bhbolt8=</latexit> <latexit sha1_base64="wbxyijmaiemebe6qqkqhmxzfhpu=">aaacahicbzdnssnafivv6l+tf1e3gpvbirgqiqjqqii6csnumlbqhjkzttqhk0mymrrkqbtfxy0lfbc+hjvfxmkbufspdhycey937gkszpr2nc+rslc4tlxsxc2trw9sbtnbo/cqtiwhhol5lbsbvpqzqt3nnkenrficbzzwg/7vuf4fuklylo70mkf+hluchyxgbay2vxedbu2sxusohyn08cntu+xuninqplg5lcfxrw1/tjoxssmqnofyqabrjnrpsnsmcdoqtvjfe0z6ueubbgwoqpkzyqujdgicdgpjaz7qaol+nshwpnqwckxnhhvpzdbg5n+1zqrdmz9jikk1fws6kew50jeax4e6tfki+daajpkzvylswxitbuirmrdc2zpnwtuunffc25ny9tjpowj7cabh4mipvoeaauabgqd4ghd4tr6tz+vnep+2fqx8zhf+ypr4bhbolt8=</latexit> <latexit sha1_base64="wbxyijmaiemebe6qqkqhmxzfhpu=">aaacahicbzdnssnafivv6l+tf1e3gpvbirgqiqjqqii6csnumlbqhjkzttqhk0mymrrkqbtfxy0lfbc+hjvfxmkbufspdhycey937gkszpr2nc+rslc4tlxsxc2trw9sbtnbo/cqtiwhhol5lbsbvpqzqt3nnkenrficbzzwg/7vuf4fuklylo70mkf+hluchyxgbay2vxedbu2sxusohyn08cntu+xuninqplg5lcfxrw1/tjoxssmqnofyqabrjnrpsnsmcdoqtvjfe0z6ueubbgwoqpkzyqujdgicdgpjaz7qaol+nshwpnqwckxnhhvpzdbg5n+1zqrdmz9jikk1fws6kew50jeax4e6tfki+daajpkzvylswxitbuirmrdc2zpnwtuunffc25ny9tjpowj7cabh4mipvoeaauabgqd4ghd4tr6tz+vnep+2fqx8zhf+ypr4bhbolt8=</latexit> Random Walks of the Annoying Fly Room 1 Room 2 Room 3 p(x t+1 = i x t = j) =M ij 2 3 0.7 0.5 0 40.3 0.3 0.55 0 0.2 0.5 Stationary distribution: Mv 1 = v 1 Eigen vector of the Markov matrix

Exploiting the structure GIBBS SAMPLING 9

Gibbs Sampling Sample one (block of) variable at the time x 2 p(x) x (t+1) x (t) p(x P (x) 1 x (t) 2 ) a) x 1 ( 10

Gibbs Sampling x 2 p(x) x (t+2) x (t+1) P (x) p(x 2 x (t+1) x (t) 1 ) a) x 1 ( 11

Gibbs Sampling x 2 p(x) x (t+2) x (t+1) x (t+3) x (t+4) x (t) P (x) a) x 1 ( 12

Gibbs Sampling Link: https://www.youtube.com/watch?v=aewy6qxwoug https://www.youtube.com/watch?v=zakwpvgmkty 13

Figure from Jordan Ch. 21 Ingredients for Gibb Recipe Full conditionals only need to condition on the Markov Blanket MRF Bayes Net Must be easy to sample from conditionals 14

Gibbs Sampling Sample one (block of) variable at the time Easy to compute The proposal distribution: Markov Blanket Make sure other variables do not change Choose one of the variables randomly with probability q(i) 15

Figure from Jordan Ch. 21 Again. Full conditionals only need to condition on the Markov Blanket MRF Bayes Net Must be easy to sample from conditionals Many conditionals are log-concave and are amenable to adaptive rejection sampling 16

Whiteboard Gibbs Sampling as M-H 17

Case Study: LDA 18

Bayesian Approach LDA Inference Dirichlet Document-specific topic distribution m Topic Dirichlet Topic Intractable assignment z mn Exact Inference? Observed word x mn k N m M K 19

LDA Inference Explicit Gibbs Sampler Dirichlet Document-specific topic distribution m Topic Dirichlet Topic assignment z mn Sampled Observed word x mn k N m M K 20

LDA Inference Collapsed Gibbs Sampler Dirichlet Document-specific topic distribution m Topic Dirichlet Integrated out Topic assignment z mn Sampled Observed word x mn k N m K M 21

Goal: Sampling Draw samples from the posterior p(z X,, ) Integrate out topics ϕ and document-specific distribution over topics θ Algorithm: While not done For each document, m: For each word, n:» Resample a single topic assignment using the full conditionals for z mn 22

Sampling What queries can we answer with samples of! "#? Mean of z mn Mode of z mn Estimate posterior over z mn Estimate of topics ϕ and document-specific distribution over topics θ 23

<latexit sha1_base64="hfiqb911fmjq3kvk18jsqedtuia=">aaaconicbvfda9swfjw9bu3sfwtb414udywuzceehw0pg9lbkpshkwvaqjwzwzybtbikjhk09fzh9jp21n9t2fega3zb6hdpuufsuai4mzyi7jz/0cbjj5tbtzvbz56/enl99frcyeitoiassz1jskgcctq2zhi6uzripoh0irn+wvmxp6k2tiozu1b0lunlwtjgshwtuptb9w9jbl/gcn7bbvy+z9vgmoaiczxhbk+oxxsqks2vlrblgpns1llbvfwj7n/ebk+cd0tdvqlachmrvnqghrolaqccdzp0r0lz6iply7cqgonwvbvf3v4wdjqcdrc2oifagsxdp1eqszftyqnhxkzdqnlzibvlhnoqexwgkkyu8swdoihwts2sbdkuynd1usikdktyalqreyxojvnkivpm2m7nq65u/o+bfjb7ncuzuiwlgiwpygoolu76wyblmhllfw5gopm7k5a5dhfa960df0l48mnrypxh+hkynu73dg7bnlbqw7sd+iheh9ebokijnebea++bd+kn/f3/2d/1vy+lvtfoveh/lb/dazayybg=</latexit> <latexit sha1_base64="hfiqb911fmjq3kvk18jsqedtuia=">aaaconicbvfda9swfjw9bu3sfwtb414udywuzceehw0pg9lbkpshkwvaqjwzwzybtbikjhk09fzh9jp21n9t2fega3zb6hdpuufsuai4mzyi7jz/0cbjj5tbtzvbz56/enl99frcyeitoiassz1jskgcctq2zhi6uzripoh0irn+wvmxp6k2tiozu1b0lunlwtjgshwtuptb9w9jbl/gcn7bbvy+z9vgmoaiczxhbk+oxxsqks2vlrblgpns1llbvfwj7n/ebk+cd0tdvqlachmrvnqghrolaqccdzp0r0lz6iply7cqgonwvbvf3v4wdjqcdrc2oifagsxdp1eqszftyqnhxkzdqnlzibvlhnoqexwgkkyu8swdoihwts2sbdkuynd1usikdktyalqreyxojvnkivpm2m7nq65u/o+bfjb7ncuzuiwlgiwpygoolu76wyblmhllfw5gopm7k5a5dhfa960df0l48mnrypxh+hkynu73dg7bnlbqw7sd+iheh9ebokijnebea++bd+kn/f3/2d/1vy+lvtfoveh/lb/dazayybg=</latexit> <latexit sha1_base64="hfiqb911fmjq3kvk18jsqedtuia=">aaaconicbvfda9swfjw9bu3sfwtb414udywuzceehw0pg9lbkpshkwvaqjwzwzybtbikjhk09fzh9jp21n9t2fega3zb6hdpuufsuai4mzyi7jz/0cbjj5tbtzvbz56/enl99frcyeitoiassz1jskgcctq2zhi6uzripoh0irn+wvmxp6k2tiozu1b0lunlwtjgshwtuptb9w9jbl/gcn7bbvy+z9vgmoaiczxhbk+oxxsqks2vlrblgpns1llbvfwj7n/ebk+cd0tdvqlachmrvnqghrolaqccdzp0r0lz6iply7cqgonwvbvf3v4wdjqcdrc2oifagsxdp1eqszftyqnhxkzdqnlzibvlhnoqexwgkkyu8swdoihwts2sbdkuynd1usikdktyalqreyxojvnkivpm2m7nq65u/o+bfjb7ncuzuiwlgiwpygoolu76wyblmhllfw5gopm7k5a5dhfa960df0l48mnrypxh+hkynu73dg7bnlbqw7sd+iheh9ebokijnebea++bd+kn/f3/2d/1vy+lvtfoveh/lb/dazayybg=</latexit> <latexit sha1_base64="hfiqb911fmjq3kvk18jsqedtuia=">aaaconicbvfda9swfjw9bu3sfwtb414udywuzceehw0pg9lbkpshkwvaqjwzwzybtbikjhk09fzh9jp21n9t2fega3zb6hdpuufsuai4mzyi7jz/0cbjj5tbtzvbz56/enl99frcyeitoiassz1jskgcctq2zhi6uzripoh0irn+wvmxp6k2tiozu1b0lunlwtjgshwtuptb9w9jbl/gcn7bbvy+z9vgmoaiczxhbk+oxxsqks2vlrblgpns1llbvfwj7n/ebk+cd0tdvqlachmrvnqghrolaqccdzp0r0lz6iply7cqgonwvbvf3v4wdjqcdrc2oifagsxdp1eqszftyqnhxkzdqnlzibvlhnoqexwgkkyu8swdoihwts2sbdkuynd1usikdktyalqreyxojvnkivpm2m7nq65u/o+bfjb7ncuzuiwlgiwpygoolu76wyblmhllfw5gopm7k5a5dhfa960df0l48mnrypxh+hkynu73dg7bnlbqw7sd+iheh9ebokijnebea++bd+kn/f3/2d/1vy+lvtfoveh/lb/dazayybg=</latexit> Gibbs Sampling for LDA Full conditionals p(z i = j z i,x,, ) / n(x i) i,j + n ( ) i,j + T (d n i ) i,j + n (d i) i, + K n (x i) i,j n ( ) i,j n (d i) i,j n (d i) i, the number of instances of word! " assigned to topic j, not including current word. total number of words assigned to topic j, not including the current one. the number of words for document # " assigned to topic j. total number of words in the document # " not including the current one. Link: https://people.cs.umass.edu/~wallach/courses/s11/cmpsci791ss/readings/griffiths02gibbs.pdf 24

Gibbs Sampling for LDA Sketch of the derivation of the full conditionals p(z i = k Z i,x,, ) = p(x, Z, ) p(x, Z i, ) p(x, Z, ) = p(x Z, )p(z ) = p(x Z, )p( ) d = K k=1 B( n k + ) B( ) + p(z i = j z i,x,, ) / n(x i) i,j + n ( ) i,j + T n M m=1 p(z )p( ) d B( n m + ) B( ) + i,j + (d i ) n (d i) i, + K 25

Definitions and Theoretical Justification for MCMC MARKOV CHAINS 28

MCMC Goal: Draw approximate, correlated samples from a target distribution p(x) MCMC: Performs a biased random walk to explore the distribution 29

Simulations of MCMC Visualization of Metroplis-Hastings, Gibbs Sampling, and Hamiltonian MCMC: https://www.youtube.com/watch?v=vv3f0qnwvwq 30

Metropolis-Hastings Sampling Consider this mixturefor the proposal Stay where you are Or. Sample from a distribution depends on current location (x) Of course! It should be a proper density! Choose between those options with a probability that depends on a proposed point and current point (0 # $ %, $ 1) 31

Metropolis-Hastings Sampling Consider this mixturefor the proposal Is it a proper density? 32

Metropolis-Hastings Sampling Consider this mixturefor the proposal How to choose!(#, #) and '((# ) #)? p(x) must be the Stationary distribution 33

Designing!(# $, #) MH acceptance function: 34

Designing!(# $, #) MH acceptance function: Detailed Balance: 35

Detailed Balance Detailed balance means that, for each pair of states x and x, arriving at x then x and arriving at x then x are equiprobable. x x x' x' 36

Figure from Bishop (2006) A choice for!(# #) For Metropolis-Hastings, a generic proposal distribution is: If ϵ is large, many rejections If ϵ is small, slow mixing q(x x (t) )=N (0, 2 ) max p(x) min q(x x (t) ) 37

Figure from Bishop (2006) A choice for!(# #) For Rejection Sampling, the accepted samples are are independent But for Metropolis-Hastings, the samples are correlated Question: How long must we wait to get effectively independent samples? min max q(x x (t) ) p(x) A: independent states in the M-H random walk are separated by roughly steps ( max / min ) 2 38

Metropolis-Hastings Algorithm 39

Practical Issues Question: Is it better to move along one dimension or many? Answer: For Metropolis-Hasings, it is sometimes better to sample one dimension at a time Q: Given a sequence of 1D proposals, compare rate of movement for one-at-a-time vs. concatenation. Answer: For Gibbs Sampling, sometimes better to sample a block of variables at a time Q: When is it tractable to sample a block of variables? 42

Practical Issues Question: How do we assess convergence of the Markov chain? Answer: It s not easy! Compare statistics of multiple independent chains Ex: Compare log-likelihoods Chain 1 Chain 2 Log-likelihood Log-likelihood # of MCMC steps # of MCMC steps 43

Practical Issues Question: How do we assess convergence of the Markov chain? Answer: It s not easy! Compare statistics of multiple independent chains Ex: Compare log-likelihoods Chain 1 Chain 2 Log-likelihood Log-likelihood # of MCMC steps # of MCMC steps 44

Practical Issues Question: Is one long Markov chain better than many short ones? Note: typical to discard initial samples (aka. burnin ) since the chain might not yet have mixed Answer: Often a balance is best: Compared to one long chain: More independent samples Compared to many small chains: Less samples discarded for burn-in We can still parallelize Allows us to assess mixing by comparing chains 45