Markov Chain Monte-Carlo (MCMC): what is it for and what does it look like?
A. Favorov, 2003-2017
favorov@sensi.org
favorov@gmail.com
Monte Carlo method: the area of a figure

The area S of a figure inside the unit square is unknown. Let's sample random values (r.v.) x, y, i.i.d. flat on (0,1), and define the indicator

  μ(x,y) = 1 if (x,y) is inside the figure, 0 otherwise.

("i.i.d." means Independently, Identically Distributed.)

Expectation of μ: E[μ] = S, so the average of μ over the samples estimates the area S.
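The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not from the slides; the function name `mc_area` and the quarter-disk example are my own choices.

```python
import random

def mc_area(n_samples, indicator, seed=0):
    """Estimate the area of a figure inside the unit square as the
    fraction of uniform (x, y) points that fall into it."""
    rng = random.Random(seed)
    hits = sum(indicator(rng.random(), rng.random()) for _ in range(n_samples))
    return hits / n_samples

# Quarter of the unit disk; its true area is pi/4 ~ 0.785398.
area = mc_area(100_000, lambda x, y: x * x + y * y <= 1.0)
```

With 100 000 samples the estimate is typically within about 0.005 of pi/4, in line with the 1/sqrt(N) rate discussed on the next slide.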
Monte Carlo method: efficiency

Law of Large Numbers:
  S_N = (1/N) Σ_i μ(x_i, y_i) → S as N → ∞.

Central Limit Theorem:
  √N (S_N − S) → N(0, var μ).

Variance: var μ = E[(μ − E μ)²], also notated as σ².
Monte Carlo integration

We are evaluating I = ∫_D f(x) dx, where D is the domain of f or its subset.

We sample r.v. x_i ∈ D, i.i.d. uniformly in D. The Monte Carlo estimation:

  Î = (|D|/N) Σ_i f(x_i);  E[|D| f(x)] = ∫_D f(x) dx = I;
  √N (Î − I) → N(0, var_D |D| f(x)).

Advantage:
o The 1/√N convergence rate does not depend on the dimension of the space.
Disadvantages:
o A lot of samples are spent in the areas where the value of f(x) is small;
o the variance var_D |D| f(x), which determines the convergence time, can be large.
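A minimal sketch of plain Monte Carlo integration on a 1-D interval (the function name `mc_integrate` and the sin example are mine, not from the slides):

```python
import math
import random

def mc_integrate(f, a, b, n, seed=1):
    """Plain Monte Carlo: I ~ (|D|/N) * sum f(x_i), x_i uniform on D = [a, b]."""
    rng = random.Random(seed)
    return (b - a) / n * sum(f(rng.uniform(a, b)) for _ in range(n))

# Integral of sin(x) over [0, pi] is exactly 2.
est = mc_integrate(math.sin, 0.0, math.pi, 100_000)
```

The same code works unchanged in any dimension once the sampler and |D| are adjusted, which is exactly the advantage noted above.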
Monte Carlo importance integration

We are evaluating I = ∫_D f(x) dx.

Let's sample x_i ∈ D, i.i.d., from a trial distribution g(x) that resembles f(x), with g(x) ≠ 0 wherever f(x) ≠ 0. Thus

  E_g[f(x)/g(x)] = ∫_D (f(x)/g(x)) g(x) dx = ∫_D f(x) dx = I.

MC evaluation:
  Î = (1/N) Σ_i f(x_i)/g(x_i);  √N (Î − I) → N(0, var_D f(x)/g(x)).

The more uniform the ratio f/g is, the better.
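The estimator above, sketched in Python (the helper name `importance_integrate` and the x·exp(−x) example are my own assumptions):

```python
import math
import random

def importance_integrate(f, g_density, g_sampler, n, seed=2):
    """Importance sampling: I ~ (1/N) * sum f(x_i)/g(x_i), x_i ~ g."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = g_sampler(rng)
        total += f(x) / g_density(x)
    return total / n

# Integrate f(x) = x * exp(-x) over [0, inf); the exact value is 1.
# Trial distribution g = Exp(1), which resembles the integrand.
est = importance_integrate(
    f=lambda x: x * math.exp(-x),
    g_density=lambda x: math.exp(-x),
    g_sampler=lambda rng: rng.expovariate(1.0),
    n=100_000,
)
```

Here f/g = x is well-behaved under g, so the variance stays small even though the domain is infinite.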
Another example of importance integration

We are evaluating E_π[h(x)] = ∫ h(x) π(x) dx, where π(x) is a distribution.

Let's sample x_i from a distribution g(x) such that g(x) ≠ 0 wherever π(x) ≠ 0.

Importance weight: w(x) = π(x)/g(x);  E_g[w(x)] = ∫ (π(x)/g(x)) g(x) dx = 1.

Sampling from g(x):

  E_π[h(x)] ≈ Σ_i w(x_i) h(x_i) / Σ_i w(x_i).

Since the weights are normalized by their own sum, π needs to be known only up to a constant factor.
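A minimal sketch of this self-normalized estimator (the function name `self_normalized_mean` and the standard-normal example are my own, not from the slides):

```python
import math
import random

def self_normalized_mean(h, pi_unnorm, g_density, g_sampler, n, seed=3):
    """Self-normalized importance sampling:
    E_pi[h] ~ sum(w_i * h(x_i)) / sum(w_i), with w_i = pi(x_i)/g(x_i).
    pi may be known only up to a constant factor."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        x = g_sampler(rng)
        w = pi_unnorm(x) / g_density(x)
        num += w * h(x)
        den += w
    return num / den

# pi ~ standard normal (unnormalized); g = Uniform(-5, 5); E_pi[x^2] = 1.
est = self_normalized_mean(
    h=lambda x: x * x,
    pi_unnorm=lambda x: math.exp(-0.5 * x * x),
    g_density=lambda x: 0.1,
    g_sampler=lambda rng: rng.uniform(-5.0, 5.0),
    n=200_000,
)
```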
Rejection sampling (Von Neumann, 1951)

We have a distribution f(x) and we want to sample from it.
We are able to calculate c·f(x) for each x (for some, possibly unknown, constant c).
We are able to sample from g(x) such that M·g(x) ≥ c·f(x) for all x.

Thus, we can sample from f(x):
o Draw a value x from g(x).
o Accept the value x with the probability c·f(x) / (M·g(x)).

  P(accept | x) = c·f(x) / (M·g(x));
  P(accept) = ∫ [c·f(x) / (M·g(x))] g(x) dx = c/M;
  P(x | accept) = g(x) · P(accept | x) / P(accept) = f(x).
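The accept/reject loop above in Python (a minimal sketch; the name `rejection_sample` and the triangular density f(x) = 2x are my own example):

```python
import random

def rejection_sample(f_unnorm, g_sampler, g_density, M, n, seed=4):
    """Von Neumann rejection sampling: draw x ~ g, accept with
    probability f_unnorm(x) / (M * g(x)); accepted x follow f."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        x = g_sampler(rng)
        if rng.random() < f_unnorm(x) / (M * g_density(x)):
            out.append(x)
    return out

# Target density f(x) = 2x on [0, 1]; envelope g = Uniform(0, 1), M = 2.
samples = rejection_sample(
    f_unnorm=lambda x: 2.0 * x,
    g_sampler=lambda rng: rng.random(),
    g_density=lambda x: 1.0,
    M=2.0,
    n=50_000,
)
mean = sum(samples) / len(samples)  # E[x] under f(x) = 2x is 2/3
```

Note the cost: the acceptance rate is c/M, so a loose envelope M wastes most draws.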
Metropolis-Hastings algorithm (1953, 1970)

We want to be able to draw x from a distribution π(x). We know how to compute the value of a function f(x) such that f(x) ∝ π(x) at each point, and we are able to draw y from T(y|x) (the instrumental distribution, or transition kernel). T(y|x) is an analog of g in importance sampling; it is symmetric in pure Metropolis.

Let's denote the i-th step result as x^(i).

o Draw y from T(y | x^(i)).
o Acceptance probability:
    α(y | x^(i)) = min{ 1, [T(x^(i) | y) f(y)] / [T(y | x^(i)) f(x^(i))] }.
o The new value is accepted, x^(i+1) = y, with probability α(y | x^(i)); otherwise, it is rejected and x^(i+1) = x^(i).
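A minimal random-walk Metropolis sketch (symmetric Gaussian proposal, so T cancels in α and the ratio reduces to f(y)/f(x)); the function name `metropolis` and the standard-normal target are my own example, not the slides':

```python
import math
import random

def metropolis(log_f, x0, step, n, seed=5):
    """Random-walk Metropolis: symmetric Gaussian proposal, so the
    acceptance probability is min(1, f(y)/f(x)), computed in log space."""
    rng = random.Random(seed)
    x, logf_x = x0, log_f(x0)
    chain = []
    for _ in range(n):
        y = x + rng.gauss(0.0, step)
        logf_y = log_f(y)
        if rng.random() < math.exp(min(0.0, logf_y - logf_x)):  # accept
            x, logf_x = y, logf_y
        chain.append(x)  # on rejection the old x is appended again
    return chain

# Target: standard normal, known only up to a constant.
chain = metropolis(log_f=lambda x: -0.5 * x * x, x0=0.0, step=1.0, n=100_000)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
```

Working with log f avoids overflow and matches the "f known up to a constant" setting: any constant cancels in the difference.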
Why does it work: the local balance

Let's show that if x is already distributed as f(x), then the MH algorithm keeps the distribution.

Local (detailed) balance condition for two points x and y:

  f(x) T(y|x) α(y|x) = f(y) T(x|y) α(x|y).

Let's check it. If T(x|y) f(y) ≥ T(y|x) f(x), then α(y|x) = 1 and α(x|y) = [T(y|x) f(x)] / [T(x|y) f(y)], so

  f(x) T(y|x) · 1 = f(y) T(x|y) · [T(y|x) f(x)] / [T(x|y) f(y)],

and both sides equal f(x) T(y|x); the opposite case is symmetric.

The balance is stable: f(x) T(y|x) α(y|x) is the flow from x to y, and f(y) T(x|y) α(x|y) is the flow from y to x. The stable local balance is enough (BTW, it is not a necessary condition).
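The balance condition can be verified numerically on a toy chain. Below, a 3-state target with a symmetric proposal (my own toy example); the check confirms that the flow x → y equals the flow y → x for every pair of states:

```python
# Detailed-balance check for a 3-state Metropolis chain: with a symmetric
# proposal T and acceptance a(j|i) = min(1, f(j)/f(i)), the flow
# f(i)*T*a(j|i) must equal f(j)*T*a(i|j) for every pair of states.
f = [0.5, 0.3, 0.2]   # target distribution over states 0, 1, 2
T = 0.5               # symmetric proposal: jump to either other state

def accept(i, j):
    # Metropolis acceptance for a symmetric proposal
    return min(1.0, f[j] / f[i])

flows_ok = all(
    abs(f[i] * T * accept(i, j) - f[j] * T * accept(j, i)) < 1e-12
    for i in range(3) for j in range(3) if i != j
)
```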
Markov chains, maximization, simulated annealing

The x^(i) created as described above form a Markov chain (MC) with the transition kernel T(x^(i+1) | x^(i)) α(x^(i+1) | x^(i)). The fact that the chain has a stationary distribution, and the convergence of the chain to that distribution, can be proved by the methods of Markov chain theory.

Minimization: let C(x) be a cost (a fine). Set f(x) = exp(−C(x)/t); sampling from f concentrates near min C. The parameter t characterizes the transition kernel as a temperature. We can then decrease the temperature step by step (simulated annealing).

MCMC and SA are very effective for optimization, since gradient methods tend to get locked in a local optimum, while pure Monte Carlo is extremely inefficient.
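A minimal simulated-annealing sketch of the f(x) = exp(−C(x)/t) idea with a geometric cooling schedule (the function name, the toy quadratic cost, and all schedule parameters are my own assumptions):

```python
import math
import random

def simulated_annealing(cost, x0, step, t0, t_min, cooling, seed=6):
    """Metropolis moves on f(x) = exp(-C(x)/t) while the temperature t
    is gradually decreased; the chain settles near the minimum of C."""
    rng = random.Random(seed)
    x, c = x0, cost(x0)
    best_x, best_c = x, c
    t = t0
    while t > t_min:
        y = x + rng.gauss(0.0, step)
        cy = cost(y)
        # accept downhill moves always, uphill moves with prob exp(-dC/t)
        if cy < c or rng.random() < math.exp((c - cy) / t):
            x, c = y, cy
            if c < best_c:
                best_x, best_c = x, c
        t *= cooling  # geometric cooling schedule
    return best_x, best_c

# Toy cost with the global minimum at x = 3.
best_x, best_c = simulated_annealing(
    cost=lambda x: (x - 3.0) ** 2, x0=-10.0,
    step=0.5, t0=10.0, t_min=1e-4, cooling=0.999,
)
```

In practice the cooling schedule matters: cool too fast and the chain freezes in a local optimum, exactly the failure mode SA is meant to avoid.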
MCMC prior and the Bayesian paradigm

  P(M|D) = P(D|M) P(M) / P(D)

  posterior = likelihood × prior / evidence

MCMC and its variations are often used for the search for the best model. Let's formulate some requirements for the algorithm, and thus for the transition kernel:
o We want it not to depend on the current data.
o We want to minimize the rejection rate.

So, an effective transition kernel is one whose stationary distribution is the prior P(M).
Terminology: names of related algorithms

o MCMC, Metropolis, Metropolis-Hastings, hybrid Metropolis, configurational bias Monte-Carlo, exchange Monte-Carlo, multigrid Monte-Carlo (MGMC), slice sampling, RJMCMC (samples the dimensionality of the space), Multiple-Try Metropolis, Hybrid Monte-Carlo...
o Simulated annealing, Monte-Carlo annealing, statistical cooling, umbrella sampling, probabilistic hill climbing, probabilistic exchange algorithm, parallel tempering, stochastic relaxation.
o Gibbs algorithm, successive over-relaxation.
Gibbs Sampler (Geman and Geman, 1984)

Now, x is a k-dimensional variable (x_1, x_2, ..., x_k).

Let's denote x_{−j} = (x_1, x_2, ..., x_{j−1}, x_{j+1}, ..., x_k).

On each step of the Markov chain we choose the current coordinate j. Then, we calculate the conditional distribution f(x_j | x_{−j}^{(i)}) and draw the next value y_j from it. All other coordinates stay the same as on the previous step: y_{−j} = x_{−j}^{(i)}.

For such a transition kernel,

  α(y | x^{(i)}) = min{ 1, [T(x^{(i)} | y) f(y)] / [T(y | x^{(i)}) f(x^{(i)})] } = 1.

o We have no rejections, so the procedure is very effective.
o The temperature can be decreased rather fast.
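A classic illustration of the Gibbs sweep (my own example, not from the slides): a bivariate normal with unit variances and correlation rho, where each full conditional is again normal, N(rho · other, 1 − rho²):

```python
import math
import random

def gibbs_bivariate_normal(rho, n, seed=7):
    """Gibbs sampler for a bivariate normal with unit variances and
    correlation rho: each full conditional is N(rho * other, 1 - rho^2)."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    x1 = x2 = 0.0
    out = []
    for _ in range(n):
        x1 = rng.gauss(rho * x2, sd)  # draw x1 | x2, keeping x2 fixed
        x2 = rng.gauss(rho * x1, sd)  # draw x2 | x1, keeping x1 fixed
        out.append((x1, x2))
    return out

samples = gibbs_bivariate_normal(rho=0.8, n=100_000)
corr_hat = sum(a * b for a, b in samples) / len(samples)  # should be near rho
```

Every draw is accepted, as the slide notes, but successive samples are correlated: the stronger rho is, the slower the chain explores the target.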
Inverse transform sampling (well-known)

We want to sample from the density function π(x). We know how to calculate the inverse of the cumulative distribution F(x) = ∫_{−∞}^{x} π(t) dt.

o Generate a random number u from the Uniform(0,1) distribution.
o Compute the value x such that F(x) = u, i.e. x = F^{−1}(u).

This x is a random number drawn from the distribution described by π(x).
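A standard worked case of the recipe above (my own example): for the exponential distribution F(x) = 1 − exp(−λx), so the inverse is available in closed form, x = −ln(1 − u)/λ.

```python
import math
import random

def sample_exponential(lam, n, seed=8):
    """Inverse transform sampling for Exp(lam): F(x) = 1 - exp(-lam*x),
    hence x = F^{-1}(u) = -ln(1 - u) / lam with u ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

samples = sample_exponential(lam=2.0, n=100_000)
mean = sum(samples) / len(samples)  # E[x] = 1/lam = 0.5
```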
Slice sampling (Neal, 2003)

Sampling x from f(x) is equivalent to sampling (x, y) pairs uniformly from the area under the curve of f. So, we introduce an auxiliary variable y and iterate as follows:

o given a sample x_t, we choose y_t uniformly at random from the interval (0, f(x_t));
o given y_t, we choose x_{t+1} uniformly at random from the slice {x : f(x) ≥ y_t}.

The sample of x distributed as f(x) is obtained by ignoring the y values.
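A simplified sketch of the two alternating steps for a density on a bounded interval (my own simplification: the horizontal step samples the slice by rejection from [lo, hi] instead of Neal's stepping-out procedure):

```python
import math
import random

def slice_sample(f, x0, lo, hi, n, seed=9):
    """Simple slice sampler for a density f supported on [lo, hi]:
    draw y ~ Uniform(0, f(x)), then draw x' uniformly from the slice
    {x in [lo, hi] : f(x) >= y} by rejection from [lo, hi]."""
    rng = random.Random(seed)
    x = x0
    out = []
    for _ in range(n):
        y = rng.uniform(0.0, f(x))   # vertical step: pick a height under f(x)
        while True:                  # horizontal step: uniform on the slice
            x_new = rng.uniform(lo, hi)
            if f(x_new) >= y:
                x = x_new
                break
        out.append(x)                # the y values are ignored, as the slide says
    return out

# Unnormalized standard normal, truncated to [-5, 5].
chain = slice_sample(lambda x: math.exp(-0.5 * x * x), 0.0, -5.0, 5.0, 20_000)
mean = sum(chain) / len(chain)
```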
Literature

Liu, J.S. (2002) Monte Carlo Strategies in Scientific Computing. Springer-Verlag, NY, Berlin, Heidelberg.
Robert, C.P. (1998) Discretization and MCMC Convergence Assessment. Springer-Verlag.
van Laarhoven, P.M.J. and Aarts, E.H.L. (1988) Simulated Annealing: Theory and Applications. Kluwer Academic Publishers.
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
Besag, J., Green, P., Higdon, D., and Mengersen, K. (1995) Bayesian computation and stochastic systems. Statistical Science, 10(1), 3-66.
Lawrence, C.E., Altschul, S.F., Boguski, M.S., Liu, J.S., Neuwald, A.F., and Wootton, J.C. (1993) Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science, 262, 208-214.
Sivia, D.S. (1996) Data Analysis: A Bayesian Tutorial. Clarendon Press, Oxford.
Neal, Radford M. (2003) Slice sampling. The Annals of Statistics, 31(3), 705-767.
http://civs.ucla.edu/mcmc/mcmc_tutorial.htm
Sheldon Ross. A First Course in Probability.
Sobol, I.M. The Monte Carlo Method. [In Russian: Соболь И.М. Метод Монте-Карло.]

Sometimes, it works:

Favorov, A.V., Andreewski, T.V., Sudomoina, M.A., Favorova, O.O., Parmigiani, G., Ochs, M.F. (2005) A Markov chain Monte Carlo technique for identification of combinations of allelic variants underlying complex diseases in humans. Genetics, 171(4), 2113-2121.
Favorov, A.V., Gelfand, M.S., Gerasimova, A.V., Ravcheev, D.A., Mironov, A.A., Makeev, V.J. (2005) A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length. Bioinformatics, 21(10), 2240-2245.