Lecture 13: Markov Chain Monte Carlo
- Gibbs sampling
- Markov chains

Recall: Approximate inference using samples
- Main idea: we generate samples from our Bayes net, then compute probabilities using (weighted) counts.
- But we may need a lot of work to get enough samples (e.g., if the CPDs are very extreme).
- Rejection sampling and likelihood weighting are also specific to directed models.
Recall: Forward sampling
[Figure: the Cloudy / Sprinkler / Rain / WetGrass network, with CPTs P(c) = 0.5; P(s|c) = 0.1, P(s|¬c) = 0.5; P(r|c) = 0.8, P(r|¬c) = 0.2; P(w|s,r) = 0.99, P(w|s,¬r) = 0.90, P(w|¬s,r) = 0.90, P(w|¬s,¬r) = 0.00]
- Each sample is constructed from scratch!
- We saw two alternatives for incorporating the evidence:
  - Throw away samples that are inconsistent with it.
  - Force the evidence variables, but then weight the samples.
- In both cases, after a sample is constructed, we start a new one from scratch.

A different idea
[Figure: the same network and CPTs]
- Suppose we want to compute P(X | e).
- We generate one sample, with the given evidence variables instantiated correctly.
- Then we keep changing it!
- If we are careful, we will get samples from the correct distribution.
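As a concrete companion to the forward-sampling slide, here is a minimal sketch in Python. The CPT values are the ones reconstructed from the slide's tables (the classic sprinkler network), and the function names are illustrative, not from the lecture.

```python
import random

# CPTs as read off the slide's figure (reconstructed values).
# Indexing convention: P_S[c] = P(Sprinkler=True | Cloudy=c), etc.
P_C = 0.5
P_S = {True: 0.1, False: 0.5}
P_R = {True: 0.8, False: 0.2}
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}

def forward_sample(rng):
    """Draw one joint sample in topological order: C, then S and R, then W."""
    c = rng.random() < P_C
    s = rng.random() < P_S[c]
    r = rng.random() < P_R[c]
    w = rng.random() < P_W[(s, r)]
    return {"C": c, "S": s, "R": r, "W": w}

rng = random.Random(0)
samples = [forward_sample(rng) for _ in range(20000)]
# Monte Carlo estimate of P(W = True) from raw counts (exact value is 0.6471)
p_wet = sum(x["W"] for x in samples) / len(samples)
```

Note how each call to `forward_sample` starts from scratch, which is exactly the inefficiency the next slides address.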
Gibbs sampling
1. Initialization:
   - Set the evidence variables E to the observed values e.
   - Set all other variables to random values (e.g., by forward sampling, or uniformly).
   - This gives us a sample (x_1, ..., x_n).
2. Repeat (as much as wanted):
   - Pick a non-evidence variable X_i uniformly at random.
   - Sample x_i' from P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n).
   - Keep all other values: x_j' = x_j for j ≠ i.
   - The new sample is (x_1, ..., x_{i-1}, x_i', x_{i+1}, ..., x_n).
3. Alternatively, you can march through the variables in some predefined order.

Why Gibbs works in Bayes nets
- The key step is sampling according to P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n). How do we compute this?
- In Bayes nets, we know that a variable is conditionally independent of all others given its Markov blanket (parents, children, spouses):
  P(X_i | all other variables) = P(X_i | MarkovBlanket(X_i))
- So we need to sample from P(X_i | MarkovBlanket(X_i)).
- Let Y_1, ..., Y_k be the children of X_i. You will show (next homework) that:
  P(x_i | MarkovBlanket(X_i)) ∝ P(x_i | Parents(X_i)) · ∏_{j=1..k} P(y_j | Parents(Y_j))
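The Markov-blanket formula above can be sketched for one variable of the sprinkler network. Rain's parent is Cloudy and its only child is WetGrass (whose other parent is Sprinkler), so P(r | c, s, w) ∝ P(r | c) · P(w | s, r). The helper names below are illustrative; the CPT values are the ones reconstructed from the slides.

```python
import random

P_R = {True: 0.8, False: 0.2}                      # P(Rain=True | Cloudy=c)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}  # P(Wet=True | s, r)

def rain_conditional(c, s, w):
    """P(Rain=True | c, s, w) via the Markov-blanket formula:
    weight(r) = P(r | parents) * product over children of P(child | its parents)."""
    weights = {}
    for r in (True, False):
        prior = P_R[c] if r else 1.0 - P_R[c]           # P(r | c)
        like = P_W[(s, r)] if w else 1.0 - P_W[(s, r)]  # P(w | s, r)
        weights[r] = prior * like
    return weights[True] / (weights[True] + weights[False])

def sample_rain(c, s, w, rng):
    """One Gibbs draw for Rain, keeping every other variable fixed."""
    return rng.random() < rain_conditional(c, s, w)

# With c = s = w = True: 0.8*0.99 / (0.8*0.99 + 0.2*0.90)
p = rain_conditional(True, True, True)
```

Only the blanket's CPT entries appear in the computation; the rest of the network cancels out in the normalization.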
Example
[Figure: the same network and CPTs]
1. Generate a first sample, with the evidence variables set correctly.
2. Pick one non-evidence variable (say Rain) and sample it from P(Rain | all other variables). Suppose we get a new value for Rain.
3. Our new sample is the old one, with Rain's value replaced.
4. Keep going...

Analyzing Gibbs sampling
- Consider the variables X_1, ..., X_n. Each possible assignment of values to these variables is a state of the world, s = (x_1, ..., x_n).
- In Gibbs sampling, we start from a given state s. Based on this, we generate a new state, s'.
- s' depends only on s! There is a well-defined probability of going from s to s'.
- Gibbs sampling constructs a Markov chain over the states (assignments) of the Bayes net.
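Putting the whole loop together: the sketch below runs Gibbs sampling on the sprinkler network with evidence Sprinkler = WetGrass = True, resampling Cloudy and Rain from their Markov-blanket conditionals, and estimates P(Rain | s, w). The structure is a hypothetical implementation of the slide's procedure; by exact enumeration the answer is ≈ 0.32, so the estimate should land near that.

```python
import random

P_C = 0.5
P_S = {True: 0.1, False: 0.5}
P_R = {True: 0.8, False: 0.2}
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}

def bern(p, v):
    """P(X = v) for a Bernoulli variable with P(True) = p."""
    return p if v else 1.0 - p

def resample_cloudy(s, r, rng):
    # P(c | s, r) ∝ P(c) P(s | c) P(r | c): Cloudy's blanket is its children S, R.
    w = {c: bern(P_C, c) * bern(P_S[c], s) * bern(P_R[c], r) for c in (True, False)}
    return rng.random() < w[True] / (w[True] + w[False])

def resample_rain(c, s, wet, rng):
    # P(r | c, s, w) ∝ P(r | c) P(w | s, r): Rain's blanket is C, W, and S.
    w = {r: bern(P_R[c], r) * bern(P_W[(s, r)], wet) for r in (True, False)}
    return rng.random() < w[True] / (w[True] + w[False])

def gibbs(n, burn_in, rng):
    """Estimate P(Rain=True | Sprinkler=True, WetGrass=True)."""
    c, r = True, True                   # arbitrary init for the non-evidence vars
    hits = 0
    for t in range(n + burn_in):
        if rng.random() < 0.5:          # pick a non-evidence variable at random
            c = resample_cloudy(True, r, rng)
        else:
            r = resample_rain(c, True, True, rng)
        if t >= burn_in:
            hits += r
    return hits / n

est = gibbs(30000, 1000, rng=random.Random(1))
```

Note that the evidence variables are never resampled; only Cloudy and Rain change from state to state.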
Markov chains
A Markov chain is defined by:
- A set of states {s_1, ..., s_n}.
- A starting distribution over the set of states; we often put these probabilities in a vector π_0.
- Stationary transition probabilities: P(S_{t+1} = s' | S_0, ..., S_t) = P(S_{t+1} = s' | S_t) = T(s, s'). For convenience, we often put these in an n × n matrix T.

Steady-state (stationary) distribution
- Where will the chain be in 1 step? π_1 = π_0 T.
- In two steps? π_2 = π_1 T = π_0 T².
- In t steps? π_t = π_0 T^t.
- A stationary distribution is a distribution left invariant by the chain: π = π T.
- Note that some chains can have more than one stationary distribution!
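The update π_t = π_0 T^t and the invariance π = π T are easy to see numerically. The 2-state chain below is a made-up example (not from the lecture); pushing any starting vector through T repeatedly converges to its stationary distribution (0.8, 0.2).

```python
import numpy as np

# A hypothetical 2-state chain: rows are "from", columns are "to"; rows sum to 1.
T = np.array([[0.9, 0.1],
              [0.4, 0.6]])

pi0 = np.array([1.0, 0.0])   # start deterministically in state 0

# pi_t = pi_0 T^t : push the distribution forward one step at a time
pi = pi0.copy()
for _ in range(200):
    pi = pi @ T

# After many steps, pi should be left invariant: pi = pi T
residual = np.abs(pi @ T - pi).max()
```

For this T, solving π = π T by hand gives 0.1·π(0) = 0.4·π(1), i.e. π = (0.8, 0.2), which is what the iteration converges to.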
Detailed balance
- Consider the stationarity condition: π(s') = Σ_s π(s) T(s, s').
- This can be viewed as a flow property: the flow out of s' has to be equal to the flow coming into s' from all states.
- One way to ensure this is to make the flow equal between any pair of states:
  π(s) T(s, s') = π(s') T(s', s)
- This gives us a sufficient condition for stationarity, called detailed balance. (Why sufficient? Sum both sides over s: Σ_s π(s) T(s, s') = π(s') Σ_s T(s', s) = π(s'), which is exactly stationarity.)
- A Markov chain with this property is called reversible.

Markov Chain Monte Carlo (MCMC)
- Suppose we want to sample data from some distribution π.
- We will set up a Markov chain which has the desired distribution as its stationary distribution!
- For this we would like the chain to have a unique stationary distribution, so that we can get samples from it regardless of the starting distribution.
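Detailed balance can be checked mechanically: compute π as the left eigenvector of T for eigenvalue 1, then test whether the flow matrix π(s)·T(s, s') is symmetric. The 3-state birth-death chain below is an illustrative example (such chains, which only move between neighboring states, are always reversible).

```python
import numpy as np

# A 3-state birth-death chain: only moves between neighboring states.
T = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

# Stationary distribution: the left eigenvector of T for eigenvalue 1,
# i.e. a right eigenvector of T transpose, normalized to sum to 1.
vals, vecs = np.linalg.eig(T.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

# Detailed balance: flow[i, j] = pi_i T_ij must equal flow[j, i] = pi_j T_ji.
flow = pi[:, None] * T
balanced = np.allclose(flow, flow.T)
```

Here detailed balance forces π(0)·0.5 = π(1)·0.25 and π(1)·0.25 = π(2)·0.5, giving π = (0.25, 0.5, 0.25).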
Ergodicity
- An ergodic Markov chain is one in which any state is reachable from any other state, and there are no strictly periodic cycles.
- In such a chain, there is a unique stationary distribution π*, which can be obtained as: π* = lim_{t→∞} π_0 T^t.
- This is called the equilibrium distribution.
- Note that the chain reaches the equilibrium distribution regardless of π_0.

Sampling the equilibrium distribution
We can sample just by running the chain a long time:
- Set s_0 to some arbitrary state.
- For t = 0, ..., n-1: given s_t, sample a value s_{t+1} for the next state based on T(s_t, ·).
- Return s_n.
If n is large enough, this will be a sample from π*. In practice, you would like to have a rapidly mixing chain, i.e. one that reaches the equilibrium quickly.
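The procedure above can also be demonstrated empirically: simulate an ergodic chain for many steps and the fraction of time spent in each state approaches the equilibrium distribution, regardless of the start state. The chain below is a hypothetical 2-state example whose equilibrium works out to (0.8, 0.2).

```python
import random

# Transition matrix of a small hypothetical ergodic chain (rows sum to 1).
T = [[0.9, 0.1],
     [0.4, 0.6]]

def run_chain(steps, rng, state=0):
    """Simulate the chain and return the empirical visit frequencies."""
    visits = [0, 0]
    for _ in range(steps):
        state = 0 if rng.random() < T[state][0] else 1
        visits[state] += 1
    return [v / steps for v in visits]

freq = run_chain(100_000, random.Random(3))
```

Because the chain is ergodic, starting from state 1 instead of state 0 would give essentially the same frequencies; only the transient prefix differs.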
Implementation issues
- The initial samples are influenced by the starting distribution, so they need to be thrown away. This is called the burn-in stage.
- Because burn-in can take a while, we would like to draw several samples from the same chain!
- However, if we take consecutive samples s_t, s_{t+1}, s_{t+2}, ..., they will be highly correlated.
- Usually we wait for burn-in, then take every k-th sample, for some k sufficiently large. This will ensure that the samples are (for all practical purposes) uncorrelated.

Gibbs sampling as MCMC
- We have a set of random variables X, with evidence variables E = e.
- We want to sample from P(X - E | e).
- Let X_i be the variable to be sampled, currently set to x_i, and let x̄ be the values of all other variables in X - E.
- The transition probability for the chain is:
  T((x_i, x̄) → (x_i', x̄)) = P(x_i' | x̄, e)
- Obviously the chain is ergodic (assuming all the conditional probabilities are nonzero).
- We want to show that P(X - E | e) is the stationary distribution.
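The effect of burn-in and thinning can be measured directly. The sketch below (an illustrative setup, not from the lecture) builds a slowly mixing 2-state chain, discards a burn-in prefix, keeps every 20th sample, and compares lag-1 autocorrelation before and after thinning.

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var

def run(steps, rng):
    """A slowly mixing 2-state chain: it flips state with probability 0.05."""
    s, out = 0, []
    for _ in range(steps):
        if rng.random() < 0.05:
            s = 1 - s
        out.append(s)
    return out

rng = random.Random(4)
raw = run(120_000, rng)
burned = raw[2000:]        # throw away the burn-in prefix
thinned = burned[::20]     # keep every 20th sample

a_raw = lag1_autocorr(burned)    # close to 0.9 for this chain
a_thin = lag1_autocorr(thinned)  # roughly 0.9**20, i.e. much smaller
```

Thinning by k reduces the lag-1 autocorrelation from roughly λ to λ^k (here λ ≈ 0.9), which is why a sufficiently large k makes the kept samples effectively uncorrelated.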
Gibbs satisfies detailed balance
π(s) T(s, s') = P(x_i, x̄ | e) · P(x_i' | x̄, e)
             = P(x_i | x̄, e) P(x̄ | e) · P(x_i' | x̄, e)   (chain rule on the first factor)
             = P(x_i | x̄, e) · P(x_i', x̄ | e)              (chain rule, backwards, on the last two factors)
             = T(s', s) π(s')
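The derivation can be double-checked numerically on the sprinkler network: with evidence Sprinkler = WetGrass = True, take π(c, r) = P(c, r | s, w) by enumeration and the Gibbs move that resamples Rain, and verify π(s) T(s, s') = π(s') T(s', s) for every pair of states. This is a verification sketch, not part of the lecture.

```python
from itertools import product

P_C = 0.5
P_S = {True: 0.1, False: 0.5}
P_R = {True: 0.8, False: 0.2}
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}

def bern(p, v):
    return p if v else 1.0 - p

S_EV, W_EV = True, True   # evidence: Sprinkler = WetGrass = True

# Target distribution pi(c, r) = P(c, r | s, w), by exact enumeration.
joint = {(c, r): bern(P_C, c) * bern(P_S[c], S_EV) * bern(P_R[c], r)
         * bern(P_W[(S_EV, r)], W_EV)
         for c, r in product((True, False), repeat=2)}
z = sum(joint.values())
pi = {state: p / z for state, p in joint.items()}

def gibbs_rain(c, r_new):
    """T((c, r) -> (c, r_new)): resample Rain from P(r | c, s, w).
    Note it does not depend on the old value of Rain."""
    w = {r: bern(P_R[c], r) * bern(P_W[(S_EV, r)], W_EV) for r in (True, False)}
    return w[r_new] / (w[True] + w[False])

# Detailed balance for the Rain move: forward flow equals reverse flow.
ok = all(abs(pi[(c, r)] * gibbs_rain(c, rp) - pi[(c, rp)] * gibbs_rain(c, r)) < 1e-12
         for c, r, rp in product((True, False), repeat=3))
```

The balance holds exactly here because π(c, r) factors as P(c, s | e-free terms) times the same weight the Gibbs conditional normalizes, mirroring the chain-rule argument above.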