CS 3750 Advanced Machine Learning. Lecture 6: Monte Carlo methods — Markov chain Monte Carlo


CS 3750 Machine Learning. Lecture 6: Monte Carlo methods. Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square.

Markov chain Monte Carlo

Importance sampling: samples are generated according to Q, and every sample from Q is reweighted according to a weight w, but Q may be very far from the target distribution. MCMC is a strategy for generating samples from the target distribution itself, including conditional distributions. MCMC: a Markov chain defines a sampling process that initially generates samples very different from the target distribution (e.g. the posterior) but gradually refines them so that they are closer and closer to the posterior.

MCMC

The construction of a Markov chain requires two basic ingredients: a transition matrix P and an initial distribution pi_0. Assume a finite set S = {1, ..., m} of states; then the transition matrix is

P = [ p_11 p_12 ... p_1m
      p_21 p_22 ... p_2m
      ...
      p_m1 p_m2 ... p_mm ]

where p_ij >= 0 for all i, j in S and sum_{j in S} p_ij = 1 for every i in S.

Markov Chain

A Markov chain defines a random process of selecting states. Chain dynamics: the initial state is selected based on pi_0; each subsequent state is selected based on the previous state and the transition matrix, P(X^{t+1} = x' | X^t = x), the probability of state x' being selected at time t+1 given state x at time t, where x, x' range over Dom(X) and T(x -> x') denotes the corresponding transition-matrix entry.
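The two ingredients above can be sketched directly. A minimal simulation, assuming a hypothetical 3-state transition matrix P and an initial distribution pi_0 concentrated on state 0 (neither is from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state chain: row i of P gives P(X_{t+1} = . | X_t = i).
P = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])
pi0 = np.array([1.0, 0.0, 0.0])  # initial distribution: start in state 0

# Stochastic-matrix requirement: p_ij >= 0 and each row sums to 1.
assert np.all(P >= 0) and np.allclose(P.sum(axis=1), 1.0)

def simulate(P, pi0, n_steps):
    """Draw a trajectory X_0, ..., X_n from the chain (P, pi0)."""
    x = rng.choice(len(pi0), p=pi0)        # initial state from pi_0
    path = [x]
    for _ in range(n_steps):
        x = rng.choice(P.shape[1], p=P[x])  # next state from row x of P
        path.append(x)
    return path

path = simulate(P, pi0, 10_000)
```

The trajectory `path` is exactly the "chain dynamics" described above: state t+1 depends only on state t through a row of P.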

MCMC

A Markov chain satisfies the Markov property:

P(X_n = j | X_0 = i_0, X_1 = i_1, ..., X_{n-1} = i_{n-1}) = P(X_n = j | X_{n-1} = i_{n-1})

Irreducibility: a Markov chain is called irreducible (or indecomposable) if there is a positive transition probability between every pair of states within a finite number of steps.

In irreducible chains there may still exist a periodic structure such that, for each state i, the set of possible return times to i when starting in i is a subset of {p, 2p, 3p, ...} containing all but a finite set of these elements. The smallest number p with this property is the period of the chain:

p_i = gcd{ n in N : p_ii^(n) > 0 }

Aperiodicity: an irreducible chain is called aperiodic (or acyclic) if the period p equals 1 or, equivalently, if for all pairs of states (i, j) there is an integer n_ij such that for all n >= n_ij the probability p_ij^(n) > 0.

If a Markov chain satisfies both irreducibility and aperiodicity, then it converges to an invariant distribution q. A Markov chain with transition matrix P has equilibrium distribution q iff q = qP. A sufficient, but not necessary, condition to ensure that a particular q is the invariant distribution of transition matrix P is the following reversibility (detailed balance) condition:

q(x) P(x -> x') = q(x') P(x' -> x)
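Detailed balance can be checked numerically. A minimal sketch, assuming a hypothetical reversible chain built as a random walk on a graph with symmetric edge weights W (a standard construction, not from the lecture): its invariant distribution is proportional to the row sums of W, and detailed balance holds term by term.

```python
import numpy as np

# Random walk on a weighted graph: symmetric weights W give a reversible chain.
W = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
P = W / W.sum(axis=1, keepdims=True)   # transition matrix, rows sum to 1
q = W.sum(axis=1) / W.sum()            # candidate invariant distribution

# Detailed balance: q_i P_ij = W_ij / W.sum() is symmetric in (i, j) ...
flows = q[:, None] * P
assert np.allclose(flows, flows.T)

# ... which is sufficient for invariance q = qP.
assert np.allclose(q @ P, q)
```

Summing detailed balance over i gives sum_i q_i P_ij = q_j sum_i P_ji = q_j, which is why the condition is sufficient for q = qP.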

Markov Chain Monte Carlo

Objective: generate samples from the posterior distribution. Idea: a Markov chain defines a sampling process that initially generates samples very different from the target posterior but gradually refines them so that they are closer and closer to the posterior.

MCMC: P(X | e) is the query we want to compute, where e_1 and e_2 are known evidence variables. Sampling from the distribution P(X) is very different from sampling from the desired posterior P(X | e).

Markov Chain Monte Carlo

(State-space view: the chain visits a sequence of states x^1, x^2, x^3, x^4, ...)

Goal: a sample from P(X | e).
- Start from some P(X) and generate a sample x^1.
- From x^1 and the transition model, generate x^2 (apply T).
- Repeat for n steps, applying T at each step, until the chain's distribution approaches P(X | e).
- After n steps, the states x^{n+1}, x^{n+2}, ... are samples from the desired P(X | e).

MCMC

In general, an MCMC sampling process doesn't have to converge to a stationary distribution. A finite-state Markov chain has a unique stationary distribution iff the Markov chain is regular. Regular: there exists some k such that, for each pair of states x and x', the probability of getting from x to x' in exactly k steps is greater than 0. We want Markov chains that converge to a unique target distribution from any initial state. How do we build such Markov chains?

Gibbs Sampling

A simple method to define such a Markov chain for a Bayesian belief network (BBN); it can benefit from the structure (independences) in the network. (Example network over binary variables x_1, ..., x_6, each taking values T or F; evidence: x_5 = T, x_6 = T.)
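The regularity condition above is directly testable: some power P^k must have all entries strictly positive. A minimal sketch with two hypothetical 2-state chains (not from the lecture), one regular and one periodic:

```python
import numpy as np

def is_regular(P, max_k=None):
    """Check regularity: some power P^k has all entries strictly positive."""
    m = P.shape[0]
    max_k = max_k or m * m   # small search bound, enough for these examples
    Pk = np.eye(m)
    for _ in range(max_k):
        Pk = Pk @ P
        if np.all(Pk > 0):
            return True
    return False

# A regular chain: the self-loop at state 0 breaks any periodicity.
P_good = np.array([[0.5, 0.5],
                   [1.0, 0.0]])
# A period-2 chain: irreducible, but its powers alternate and are never
# all positive, so it has no unique limiting behavior from every start.
P_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
```

Here `is_regular(P_good)` succeeds at k = 2 (P^2 has all positive entries), while the periodic chain fails for every k, illustrating why regularity, not just irreducibility, is required.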

Gibbs Sampling

Initial state x^0: x_1 = F, x_2 = T, x_3 = T, x_4 = T; the evidence x_5 = x_6 = T is fixed throughout.
- Update the value of x_4 by resampling it given the other variables; say the new value is x_4 = F.
- Update the value of x_3 in the same way, then the remaining unobserved variables, and repeat.
- After many such reassignments, the state x^n is a sample from the desired P(X_rest | e).

Keep resampling each variable using the values of the variables in its local neighborhood, the Markov blanket:

P(X_4 | x_2, x_3, x_5, x_6)
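The resampling loop above can be sketched on a toy model. The lecture's network x_1, ..., x_6 is not specified numerically, so this sketch assumes a hypothetical joint over just two binary variables (a, b), standing in for P(X_rest | e); each step resamples one variable from its exact conditional, which plays the role of the Markov-blanket conditional:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint P(a, b): rows index a in {0, 1}, columns index b.
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])

def gibbs(n_samples, burn_in=500):
    """Gibbs sampler: resample each variable from its conditional in turn."""
    a, b = 0, 0
    samples = []
    for t in range(burn_in + n_samples):
        pa = joint[:, b] / joint[:, b].sum()   # P(a | b)
        a = rng.choice(2, p=pa)
        pb = joint[a, :] / joint[a, :].sum()   # P(b | a)
        b = rng.choice(2, p=pb)
        if t >= burn_in:                       # discard early, pre-mixing states
            samples.append((a, b))
    return np.array(samples)

samples = gibbs(20_000)
emp_pa1 = samples[:, 0].mean()  # empirical P(a = 1); exact marginal is 0.60
```

After the burn-in ("after many reassignments"), the empirical marginal of a should be close to the true marginal 0.15 + 0.45 = 0.60.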

Gibbs Sampling

Gibbs sampling takes advantage of the structure: the Markov blanket makes the variable independent of the rest of the network, e.g. P(X_4 | x_2, x_3, x_5, x_6).

Building a Markov Chain

A reversible Markov chain: a sufficient, but not necessary, condition to ensure that a particular q is the invariant distribution of transition matrix P is the reversibility (detailed balance) condition

q(x) P(x -> x') = q(x') P(x' -> x)

The Metropolis-Hastings algorithm builds a reversible Markov chain. It uses a proposal distribution to generate candidate states, and either accepts a candidate and takes a transition to state x', or rejects it and stays at the current state x.

Building a Markov Chain

The Metropolis-Hastings algorithm builds a reversible Markov chain. It uses a proposal distribution (similar to the proposal distribution in importance sampling) to generate candidates x' for x. A proposal distribution Q defines T_Q(x -> x'); for example, uniform over the values of the variables. The algorithm either accepts a proposal and takes a transition to state x', or rejects it and stays at the current state x, with acceptance probability A(x -> x').

Transition for MH:

T(x -> x') = T_Q(x -> x') A(x -> x')                                    if x' != x
T(x -> x)  = T_Q(x -> x) + sum_{x' != x} T_Q(x -> x') (1 - A(x -> x'))  otherwise

From the reversibility condition q(x) T(x -> x') = q(x') T(x' -> x) we get

A(x -> x') = min[ 1, ( q(x') T_Q(x' -> x) ) / ( q(x) T_Q(x -> x') ) ]
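The acceptance rule above can be sketched with the slide's own example proposal, uniform over the values of the variables. This is a minimal sketch assuming a hypothetical unnormalized target q over four states (not from the lecture); note that because the uniform proposal is symmetric, the T_Q terms in A(x -> x') cancel and only the ratio q(x')/q(x) remains:

```python
import numpy as np

rng = np.random.default_rng(2)

# Unnormalized target over 4 states; MH only ever evaluates ratios of q,
# so the normalizing constant is never needed.
q = np.array([1.0, 2.0, 3.0, 4.0])

def mh_chain(n_steps, burn_in=1_000):
    """Metropolis-Hastings with a uniform (symmetric) proposal."""
    x = 0
    counts = np.zeros(len(q))
    for t in range(burn_in + n_steps):
        x_prop = rng.integers(len(q))         # T_Q(x -> x') = 1/m for all x'
        accept = min(1.0, q[x_prop] / q[x])   # symmetric proposal: T_Q cancels
        if rng.random() < accept:
            x = x_prop                        # accept: move to x'
        # on rejection the chain stays at x, and x is counted again
        if t >= burn_in:
            counts[x] += 1
    return counts / n_steps

freq = mh_chain(50_000)   # should approximate q / q.sum() = [0.1, 0.2, 0.3, 0.4]
```

Counting the current state on rejection is essential: the self-loop mass T(x -> x) in the transition formula is what makes the chain leave q invariant.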

Building a Markov Chain

Comparing MH with Gibbs: Gibbs sampling is a special case of MH for which the acceptance probability is always 1. With the Gibbs proposal T_Q(x -> x') = P(x'_i | x_{-i}) and target q(x) = P(x_i | x_{-i}) P(x_{-i}):

A(x -> x') = min[ 1, ( q(x') T_Q(x' -> x) ) / ( q(x) T_Q(x -> x') ) ]
           = min[ 1, ( P(x'_i | x_{-i}) P(x_{-i}) P(x_i | x_{-i}) ) / ( P(x_i | x_{-i}) P(x_{-i}) P(x'_i | x_{-i}) ) ]
           = min[1, 1] = 1

MH algorithm assumptions:
- We cannot draw samples from q(x) directly.
- We can evaluate q(x) for any x.
- We use a Markov chain that moves from x towards a proposed x* with acceptance probability

min[ 1, ( q(x*) p(x* -> x) ) / ( q(x) p(x -> x*) ) ]

The transition kernel defined by this process satisfies the detailed balance condition.
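The cancellation above can be verified mechanically. A minimal sketch, assuming a hypothetical joint over two binary variables (a, b) as the target (not from the lecture): for every single-coordinate move under the Gibbs proposal, the MH acceptance ratio works out to exactly 1.

```python
import numpy as np

# Hypothetical target q(a, b): rows index a in {0, 1}, columns index b.
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])

def gibbs_proposal_prob(a_new, b):
    """Gibbs proposal for coordinate a: T_Q((a, b) -> (a_new, b)) = P(a_new | b)."""
    return joint[a_new, b] / joint[:, b].sum()

# MH acceptance ratio q(x') T_Q(x' -> x) / (q(x) T_Q(x -> x')) for every
# single-coordinate move (a, b) -> (a_new, b):
for b in (0, 1):
    for a in (0, 1):
        for a_new in (0, 1):
            ratio = (joint[a_new, b] * gibbs_proposal_prob(a, b)) / \
                    (joint[a, b] * gibbs_proposal_prob(a_new, b))
            assert np.isclose(min(1.0, ratio), 1.0)  # Gibbs always accepts
```

The P(x_{-i}) factors and the conditionals cancel pairwise, which is the algebraic content of the derivation above.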

Mixing Time

Mixing time: the number of steps n we take until we collect a sample from the target distribution. After n steps, the states x^{n+1}, x^{n+2}, ... are samples from the desired P(X | e), generated using only local rules.

Summary

- The Markov chain Monte Carlo method attempts to generate samples from the posterior distribution.
- The Metropolis-Hastings algorithm is a general scheme for specifying a Markov chain.
- Gibbs sampling is a special case that takes advantage of the network structure (Markov blanket).