Lecture 6: Monte-Carlo methods

Miranda Holmes-Cerfon, Applied Stochastic Analysis, Spring 2015

Readings

Recommended: handout on Classes site, from notes by Weinan E, Tiejun Li, and Eric Vanden-Eijnden.

Optional:
- Sokal [1989]. A classic manuscript on Monte Carlo methods. Posted to Classes site.
- Grimmett and Stirzaker [2001]. A section on Markov Chain Monte Carlo.
- Madras [2002]. A short, classic set of notes on Monte Carlo methods.
- Diaconis [2009]. An introduction to MCMC methods that particularly discusses some of the theoretical tools available to analyze them.

Before we start, here is the most important thing to remember about the lecture today:

    "Monte Carlo is an extremely bad method; it should be used only when all alternative methods are worse." (Sokal [1989])

Nevertheless, there are times when all else is worse. This lecture is about those times. It is a very basic introduction to some of the fundamental ideas associated with the words "Monte-Carlo". We will consider three questions:

(1) How can we generate random variables with a particular distribution?
(2) How can we integrate functions, using random variables?
(3) How can we sample from very high-dimensional distributions?

For the last of these we will introduce a widely used set of tools called Markov Chain Monte Carlo.

6.1 Generating Random Variables

Suppose we can generate uniform random variables, i.e. we can produce a random variable $X \sim U([0,1])$. Although doing this is an important topic in itself, we will not say much about it, because most numerical software has libraries to do it. You can learn about common algorithms, such as the Linear Congruential Generator, in references such as Press et al. [2007].

The next question is: how can we generate a random variable $Y$ with distribution function $F(x)$? We will consider two methods to do this.

6.1.1 Inverse Transformation Method

Here is the algorithm:

- Choose $X \sim U([0,1])$.
- Set $Y = F^{-1}(X)$.

Then $Y$ has distribution function $F(x)$.

Proof. Calculate: $P(Y \le y) = P(F^{-1}(X) \le y) = P(X \le F(y)) = F(y)$.

Intuitively, what we are doing is throwing a random variable uniformly onto the vertical axis and seeing where it came from: there is more chance of it landing in regions where $F(y)$ is changing rapidly, and this is where the density is highest.

This algorithm works even when $F$ is not strictly increasing (for example, if $Y$ is discrete). In this case, let $F^{-1}(u) = \inf\{x : F(x) > u\}$.

Example (Exponential($\lambda$)). We have $F(y) = 1 - e^{-\lambda y}$, so $F^{-1}(x) = -\frac{1}{\lambda}\ln(1-x)$. Therefore we can generate an exponential random variable by setting $Y = -\frac{1}{\lambda}\ln X$, where $X \sim U([0,1])$. We have used the fact that $X \stackrel{d}{=} 1 - X$.

Example (Cauchy). This has density $f(x) = \frac{1}{\pi(1+x^2)}$, so $F(x) = \frac{1}{\pi}\arctan(x) + \frac{1}{2}$, so $F^{-1}(y) = \tan\!\left(\pi\left(y - \frac{1}{2}\right)\right)$. Therefore we set $Y = \tan\!\left(\pi\left(X - \frac{1}{2}\right)\right)$, where $X \sim U([0,1])$.

This is a great way of generating random variables if $F^{-1}$ can be easily calculated, because no samples are wasted. However, $F^{-1}$ has an analytic expression only for a small number of distributions (e.g. uniform, exponential, Cauchy, Weibull, logistic, discrete). In other cases we need another method. If the random variables are Gaussian, the Box-Muller transform is very efficient (see handout). Acceptance-rejection, described next, works for general continuous random variables.
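For concreteness, here is a minimal sketch of the two examples above in Python; the NumPy setup and function names are my own, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_exponential(lam, n):
    """Exponential(lam) samples via Y = -(1/lam) ln(1 - X), X ~ U([0,1))."""
    return -np.log(1.0 - rng.random(n)) / lam

def sample_cauchy(n):
    """Standard Cauchy samples via Y = tan(pi (X - 1/2)), X ~ U([0,1))."""
    return np.tan(np.pi * (rng.random(n) - 0.5))

y = sample_exponential(2.0, 100_000)
print(y.mean())   # should be close to 1/lambda = 0.5
```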

6.1.2 Acceptance-Rejection Method

Suppose $X$ has pdf $p(x)$ satisfying $0 \le p(x) \le d$, with support on $[a,b]$. Here is how to generate $X$:

- Choose $X' \sim U([a,b])$, $Y \sim U([0,d])$.
- If $0 \le Y \le p(X')$, accept: set $X = X'$. Otherwise, reject: go back to the beginning and try again.

After the first step, $(X', Y)$ is uniformly distributed in the box $[a,b] \times [0,d]$. The probability of accepting a point at horizontal position $x$ is $p(x)/d \propto p(x)$.

Proof. The pdf of an accepted pair $(X,Y)$ is proportional to $\chi_A(x,y)$, where $A$ is the region under the graph of $p(x)$ and $\chi_A$ is the indicator function. The pdf of $X$ is the marginal distribution: $\int_0^{p(x)} 1\,dy = p(x)$.

Notes:
- This method can handle general pdfs.
- However, it can be very inefficient if $p(x)$ is large in a few small regions and small elsewhere.
- It doesn't work if $p(x)$ is unbounded, e.g. $p(x) \sim 1/\sqrt{x}$ near $x = 0$. It also doesn't handle unbounded regions, such as when the density is defined on $(-\infty,\infty)$.

A more general method works by finding a function $f(x)$ that bounds $p(x)$, i.e. $0 \le p(x) \le f(x)$, and generating $(X', Y)$ uniformly under the graph of $f(x)$. The steps are:
- Let $Z = \int f(x)\,dx$.
- Choose $X'$ with density $Z^{-1} f(x)$.
- Choose $Y \sim U([0, f(X')])$.

Acceptance-Rejection (General Method). Given $f(x)$ such that $0 \le p(x) \le f(x)$, let $F(x) = \int_{-\infty}^x f(x')\,dx'$, and suppose we have an analytic expression for $F^{-1}$. Let $Z = \int f(x)\,dx$. Note that $F(x)$ is not necessarily a cumulative distribution function: $Z \ne 1$ in general.
- Choose $X' = F^{-1}(ZW)$, where $W \sim U([0,1])$ (this is just the inverse-transform method applied to the density $Z^{-1} f$).
- Choose $Y \sim U([0, f(X')])$.
- If $0 \le Y \le p(X')$, accept: set $X = X'$. Otherwise, reject, and try again.

Notes: This works for low-dimensional random variables. However, in high dimensions the corners of the region (where the "holes" are) take up proportionally more and more room, leading to LOTS of rejections, so the method becomes extremely inefficient. Use MCMC (section 6.3) instead.
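A minimal sketch of the basic box version, again in Python, with an example density of my own choosing (a semicircle law, which is bounded, so the simple method applies):

```python
import numpy as np

rng = np.random.default_rng(1)

def accept_reject(p, a, b, d, n):
    """Sample n points from the pdf p on [a,b], assuming 0 <= p(x) <= d.

    Throw (X', Y) uniformly into the box [a,b] x [0,d]; keep X'
    whenever Y falls under the graph of p.
    """
    samples = []
    while len(samples) < n:
        xp = rng.uniform(a, b)
        y = rng.uniform(0.0, d)
        if y <= p(xp):            # accept
            samples.append(xp)
    return np.array(samples)

# Semicircle density on [-1,1]; its maximum is 2/pi.
p = lambda x: (2.0 / np.pi) * np.sqrt(1.0 - x**2)
x = accept_reject(p, -1.0, 1.0, 2.0 / np.pi, 10_000)
print(x.mean(), x.var())   # mean ~ 0, variance ~ 1/4
```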

6.2 Monte-Carlo Integration

Monte-Carlo integration is a technique used to evaluate integrals, particularly in high dimensions. The idea is to choose the points at which to approximate the integral randomly, instead of on a pre-determined grid.

Let's first compare a deterministic method with a random one for a one-dimensional problem. Suppose we want to calculate the integral
$$I(f) = \int_0^1 f(x)\,dx.$$

Deterministically: suppose we have $N$ points, with grid spacing $\Delta x = 1/N$. If we use the trapezoidal rule, then we approximate
$$I(f) \approx \sum_{i=1}^{N-1} \frac{f(x_i) + f(x_{i+1})}{2}\,\Delta x, \qquad \text{Error} \sim (\Delta x)^2 f''(\xi) \le C N^{-2}.$$

Randomly: suppose we choose $N$ random points $X_1, X_2, \ldots, X_N$ uniformly on $[0,1]$. Then we approximate
$$I(f) \approx I_N(f) = \frac{1}{N}\sum_{i=1}^N f(X_i).$$
We know by the Law of Large Numbers that this converges a.s. to $I(f)$. How quickly does it converge? The error has variance
$$\mathbb{E}\,(I_N(f) - I(f))^2 = \frac{1}{N^2}\sum_{i,j=1}^N \mathbb{E}\,(f(X_i) - I)(f(X_j) - I) = \frac{1}{N}\,\mathrm{Var}(f), \qquad (1)$$
where $\mathrm{Var}(f) = \int f^2 - \left(\int f\right)^2$. Therefore the error is roughly $|I_N - I| \approx N^{-1/2}\sqrt{\mathrm{Var}(f)}$. This converges extremely slowly to the true answer; the deterministic method is much better in this case.

What happens if we increase the dimension $d$?

Deterministically: if we choose the points to be evenly spaced, then $\Delta x \approx N^{-1/d}$ (with $k$ points per dimension and spacing $\Delta x$, we have $N = k^d = (1/\Delta x)^d$). Using a second-order method we would obtain Error $\le C(\Delta x)^2 = C N^{-2/d}$. Even for moderate $d$, e.g. $d = 8$, we get Error $\le C N^{-1/4}$. This is terrible convergence!

Randomly: choosing the points to be uniformly distributed in the domain, and repeating the calculation (1), shows that Error $\approx \sqrt{\mathrm{Var}(f)}\,N^{-1/2} = C N^{-1/2}$. This holds no matter what the dimension; a numerical illustration follows below.
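Here is a small numerical check of the $N^{-1/2}$ behaviour, with an integrand of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)

f = lambda x: np.exp(x)        # exact integral over [0,1] is e - 1
I_exact = np.e - 1.0

for N in [10**2, 10**4, 10**6]:
    X = rng.random(N)                # N uniform points on [0,1]
    I_N = f(X).mean()                # Monte-Carlo estimate of the integral
    print(N, abs(I_N - I_exact))     # error shrinks roughly like N^(-1/2)
```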

Therefore, a Monte-Carlo integration method is expected to be better than the trapezoidal rule for $d > 4$. Of course, there are better methods than the trapezoidal rule, but for high-dimensional problems Monte-Carlo always wins.

Note that the asymptotic order of convergence of a deterministic method may not matter if the number of grid points required to achieve it is too high. For example, we expect a fourth-order integration method (such as Simpson's rule) to be better than MC up to dimension $d = 8$. However, if we choose even a modest 10 points per dimension, we need $10^8$ = 100 million points, so obtaining a small error rapidly becomes impractical.

6.2.1 Importance sampling

Importance sampling is one of a number of variance-reduction techniques. The aim is to reduce the prefactor in the error of MC integration, without changing the $N^{-1/2}$ scaling. The idea is as follows: instead of sampling $X$ uniformly on an interval, which wastes points in regions of low probability, choose points where the probability is likely to be large, and account for this by weighting the integrand.

Algorithm to calculate $\int_D f(x)\,dx$, where $D \subset \mathbb{R}^d$:
- Choose points $X_i$ with density $p(x)$ in $D$.
- Calculate $I_N(p) = \frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{p(X_i)}$.

This works because
$$\int f(x)\,dx = \int \frac{f(x)}{p(x)}\,p(x)\,dx = \mathbb{E}_p\!\left(\frac{f}{p}\right) = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{p(X_i)},$$
so the mean value of the estimator is as expected. Let's estimate the error variance:
$$\mathbb{E}\,(I_N(p) - I)^2 = \frac{1}{N}\,\mathrm{Var}_p\!\left(\frac{f}{p}\right) = \frac{1}{N}\left(\int \frac{f^2}{p}\,dx - \left(\int f\,dx\right)^2\right).$$
If we choose $p(x) = Z^{-1} f(x)$, with $Z = \int f\,dx$ (assuming $f \ge 0$, so that this is a density), then the variance above is 0, and $I(f) = I_N(p)$: there is no error!

But $Z$ is what we are trying to calculate in the first place, so if we could do this, then we would already know the answer. Therefore, we should choose $p(x)$ to match $f(x)$ as closely as possible, using an educated guess about how to do this, so as to reduce the overall variance of the answer.
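The following sketch, with an integrand and sampling density of my own choosing, shows the variance reduction: the integrand is concentrated near $x = 0$, and the sampling density roughly matches its shape.

```python
import numpy as np

rng = np.random.default_rng(3)

# Target: I = int_0^1 f(x) dx with f concentrated near x = 0.
f = lambda x: np.exp(-10.0 * x)          # exact integral: (1 - e^{-10}) / 10
I_exact = (1.0 - np.exp(-10.0)) / 10.0
N = 10**5

# Plain MC: uniform points waste effort where f is tiny.
X = rng.random(N)
plain = f(X).mean()

# Importance sampling: draw X with density p(x) = c * e^{-5x} on [0,1],
# which roughly matches the shape of f, and average the weights f/p.
c = 5.0 / (1.0 - np.exp(-5.0))           # normalization of p on [0,1]
U = rng.random(N)
X = -np.log(1.0 - U * (1.0 - np.exp(-5.0))) / 5.0   # inverse transform for p
weighted = (f(X) / (c * np.exp(-5.0 * X))).mean()

print(abs(plain - I_exact), abs(weighted - I_exact))
```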

Importance sampling is particularly effective, and sometimes necessary, for rare-event sampling. The homework contains an example of this.

6.3 Markov Chain Monte Carlo (MCMC)

MCMC is a collection of techniques to sample a probability distribution $\pi$ in a (usually) very high-dimensional space, by constructing a Markov chain that has $\pi$ as its stationary distribution. Typically these techniques work even when we only know a function $g(x)$ that is proportional to the stationary distribution, $g(x) \propto \pi(x)$, with no convenient way to get the normalization factor. A very common algorithm is Metropolis-Hastings.

Examples

(1) (Ising model; see Lecture 2 for a description.) The energy of a particular configuration of $N$ spins $\sigma \in \{\pm 1\}^N$ is
$$H(\sigma) = -\sum_{\langle i,j \rangle} \sigma_i \sigma_j,$$
where $\langle i,j \rangle$ indicates that nodes $i, j$ are neighbours. The energy is lower when neighbouring spins are the same. The stationary distribution is the Gibbs measure / Boltzmann distribution: $\pi(\sigma) = Z^{-1} e^{-\beta H(\sigma)}$. Here $Z$ is a normalization constant, which is almost never known, and $\beta$ is a parameter representing the inverse temperature. In statistical mechanics we would set $\beta = (k_B T)^{-1}$, where $k_B$ is Boltzmann's constant and $T$ is the temperature.

For large $\beta$ (low temperature), $\pi(\sigma)$ is bimodal: the spins are either mostly $+1$ or mostly $-1$. This means the system is magnetized. For small $\beta$ (high temperature), $\pi(\sigma)$ gives most weight to configurations with nearly equal numbers of $+1$ and $-1$ spins, so the system is disordered and loses its magnetization. One question of interest is: at which temperature ($\beta^{-1}$) does this transition occur? We can answer this by calculating $\langle |M| \rangle_\pi$, the average with respect to $\pi$ of the absolute value of the magnetization $M = \frac{1}{N}\sum_{i=1}^N \sigma_i$. Here $\langle f(x) \rangle_\pi = \sum_x \pi_x f(x)$. For this, we need to sample from $\pi$, calculate representative values of $|M|$, and then average these values. As $\beta$ increases, $\langle |M| \rangle_\pi$ should undergo a sharp transition from 0 to 1.

(2) (Particles interacting with a pairwise potential.) A very common model in chemistry and other areas that consider systems of interacting components (e.g. protein folding, materials science, etc.) supposes there is a collection of point particles that interact with a pairwise potential $V(r)$, where $r$ is the distance between the pair. The total energy of a system of $n$ particles is the sum over all the pairwise interactions:
$$U(x) = \sum_{i<j} V(|x_i - x_j|),$$
where $x = (x_1, x_2, \ldots, x_n)$ is the $3n$-dimensional vector of particle positions. The stationary distribution is again the Boltzmann distribution $\pi(x) = Z^{-1} e^{-\beta U(x)}$, where again $\beta$ is the inverse temperature, and we almost never know the normalization constant $Z$. Depending on $\beta$, the system could prefer to be in a number of different states, such as a solid, crystal, liquid, gas, or other phase. To calculate the phase diagram we must sample $\pi(x)$.

There are several difficulties in sampling $\pi$ for these, and many related, examples:

- The size of the state space is often HUGE. The Ising model on an $n \times n$ grid has $2^{n^2}$ elements in its state space. Even for small $n$, say $n = 10$, we have over $10^{30}$ configurations; there is no way we can list them all. Even a small system of 100 particles lives in 300-dimensional space. There is no way we are going to adequately sample every region of this space.

- Often we don't know the normalization constant for $\pi$, only a function it's proportional to. The Boltzmann distribution $Z^{-1} e^{-\beta U(x)}$ arises frequently; we usually know the potential energy $U(x)$ but can almost never calculate $Z$.

- We may not know the true dynamics that give rise to $\pi$, or these may have widely separated time scales and so cannot be efficiently simulated. Ising model: what are the true dynamics of a magnet? For this we would need quantum mechanics. Particles in a box: these should also interact with a solvent (such as water), but we can't simulate all the necessary atoms.

- We can't use the previous methods to generate independent samples from $\pi$ when the dimensionality becomes very high.

Instead, what we can do is invent an artificial dynamics that has $\pi$ as its stationary distribution, and then simulate that dynamics instead.

Metropolis-Hastings Algorithm. Given a set of $n$ nodes and a stationary distribution $\pi$, the algorithm samples it as follows:
- If you are in state $i$, choose a state $j$ from a proposal probability distribution encoded in a transition matrix $H = (h_{ij})$. Here $h_{ij} = P(\text{consider } j \text{ next} \mid \text{in state } i)$.
- Move to state $j$ with probability $a_{ij}$. Otherwise, remain in state $i$.

The induced Markov chain has transition probabilities
$$p_{ij} = \begin{cases} h_{ij}\,a_{ij} & (i \ne j) \\ 1 - \sum_{k \ne i} h_{ik}\,a_{ik} & (i = j). \end{cases}$$

The acceptance matrix $A = (a_{ij})$ is typically chosen to satisfy detailed balance:
$$\pi_i\,h_{ij}\,a_{ij} = \pi_j\,h_{ji}\,a_{ji}.$$
Note that we can ensure this condition holds even if we don't know the normalization constant $Z$! A common choice is
$$a_{ij} = \min\!\left(1, \frac{\pi_j\,h_{ji}}{\pi_i\,h_{ij}}\right).$$
This is the Hastings algorithm. (The Metropolis algorithm refers to the case when $H$ is symmetric, i.e. $h_{ij} = h_{ji}$, so that $a_{ij} = \min(1, \pi_j/\pi_i)$.) We showed on HW2 that the induced Markov chain satisfies detailed balance, and has $\pi$ as its stationary distribution. A concrete sketch for the Ising model follows; some further remarks come after it.
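This is a minimal sketch in Python of single-spin-flip Metropolis for the two-dimensional Ising model; the periodic boundaries, lattice size, and parameter values are my own choices, not specified in the notes.

```python
import numpy as np

rng = np.random.default_rng(4)

def metropolis_ising(n, beta, nsteps):
    """Single-spin-flip Metropolis for the n x n Ising model (periodic boundaries).

    Proposal: pick a site uniformly and flip it. The proposal is symmetric,
    so the acceptance probability is min(1, exp(-beta * dH)).
    """
    sigma = rng.choice([-1, 1], size=(n, n))
    for _ in range(nsteps):
        i, j = rng.integers(n), rng.integers(n)
        # sum of the four neighbouring spins (periodic boundaries)
        nb = (sigma[(i + 1) % n, j] + sigma[(i - 1) % n, j]
              + sigma[i, (j + 1) % n] + sigma[i, (j - 1) % n])
        dH = 2.0 * sigma[i, j] * nb           # energy change if we flip (i,j)
        if rng.random() < min(1.0, np.exp(-beta * dH)):
            sigma[i, j] *= -1                 # accept the flip
    return sigma

sigma = metropolis_ising(20, beta=0.6, nsteps=200_000)
print(abs(sigma.mean()))    # |M|: near 1 at low temperature (large beta)
```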

Notes

- This algorithm works equally well in a continuous state space, provided we interpret $h$ as a density: $h(y|x)$ is the density of jumping to $y$, given that we start at $x$. The acceptance ratio is $a(y|x) = \min\!\left(1, \frac{\pi(y)\,h(x|y)}{\pi(x)\,h(y|x)}\right)$.

- A general form of the acceptance matrix is $a_{ij} = F\!\left(\frac{\pi_j\,h_{ji}}{\pi_i\,h_{ij}}\right)$, where $F : [0,\infty] \to [0,1]$ is any function satisfying $F(z) = z\,F(1/z)$.

- The proposal matrix can be absolutely anything, provided $h_{ij} > 0 \Leftrightarrow h_{ji} > 0$ (so the chain can satisfy detailed balance), and provided the chain described by $H$ is ergodic. Choosing a good proposal matrix, however, is an art. Proposing many far-away moves is good, because it lets you move around state space quickly. Proposing too many of these, however, leads to many rejected moves, which slows convergence. A rule of thumb used to be that you should choose your proposal matrix so that an average of roughly 50% of the proposals are rejected. A recent paper by Andrew Gelman at Columbia argued that 27% is a better number for some processes. What does this mean? No one really knows for sure, and you should choose parameters that seem to work well for the problem at hand.

Here are some example proposal matrices:

(1) (Ising model)
- pick a spin, flip it;
- pick a pair of spins, exchange their values;
- pick a spin or a cluster of spins, and change the value(s) depending on the environment surrounding them, e.g. set the value to be the one that minimizes the energy in the spin's local neighbourhood.

(2) (Particles)
- pick a single particle, move it by some random amount;
- move all particles in the direction of the gradient of the potential, plus some random amount.

To calculate averages and distributions, e.g. $\langle f(x) \rangle_\pi$, one typically discards the initial transient steps. This is because it takes time for the chain to forget its initial state and reach equilibrium. Theoretically, we can bound the time it takes to reach equilibrium using $1/\lambda_2$, where $\lambda_2$ is the second-largest eigenvalue (in absolute value) of the transition matrix. In practice, we almost never know $\lambda_2$, so one must determine empirically whether the chain has converged. (ELFS: can you think of ways to do this?)

The total effective number of samples is less than the actual number of points generated, even once equilibrium is reached, because the points are correlated. There is a nice formula for the effective number of points in terms of the covariance function of the Markov process. We will consider this on the homework; a crude numerical version is sketched below.
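Here is a crude estimate of the effective sample size along those lines; the estimator and the truncation rule are common heuristics of my own choosing, not necessarily the formula from the homework.

```python
import numpy as np

def effective_sample_size(y):
    """Estimate the effective number of samples in a correlated series y.

    Uses N_eff = N / (1 + 2 * sum_k rho_k), where rho_k is the lag-k
    autocorrelation, truncating the sum at the first non-positive rho_k
    (a standard heuristic cutoff).
    """
    y = np.asarray(y, dtype=float) - np.mean(y)
    n, var = len(y), np.var(y)
    tau = 1.0                      # 1 + 2 * (sum of autocorrelations)
    for k in range(1, n // 2):
        rho = np.dot(y[:-k], y[k:]) / ((n - k) * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau
```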

6.4 Monte-Carlo methods in optimization

Suppose you have a non-convex, possibly very rugged, function $U(x)$. [Figure omitted: a rugged one-dimensional energy landscape.] How can you find the global minimum? Deterministic methods (e.g. steepest descent, Gauss-Newton, Levenberg-Marquardt, BFGS, etc.) are very good at finding a local minimum. To find a global minimum, or one that is close to optimal, one must typically search the landscape stochastically.

One way to do this is to create a stationary distribution $\pi$ that puts high probability on the lowest-energy parts of the landscape. A common choice is the Boltzmann distribution $Z^{-1} e^{-\beta U(x)}$ for some inverse temperature $\beta$. Then one constructs a Markov chain to sample this stationary distribution, and keeps track of the smallest value the chain has seen.

The result will be sensitive to the value of $\beta$. How should we choose it? If $\beta$ is large, the global minimum will be the most likely place to be in equilibrium, but it will take a very long time to reach equilibrium. If $\beta$ is small, the chain moves about on the landscape much more quickly, but doesn't tend to spend as much time in the low-energy parts of the space.

One possibility is to cycle $\beta$ periodically: to alternate between high-temperature, fast dynamics, and low-temperature dynamics that tend to find a low minimum and stay there. Another possibility is Simulated Annealing. This is a technique that slowly increases $\beta$, for example as $\beta = \log t$ or $\beta = (1.001)^t$, where $t$ is the number of steps. As $t \to \infty$, it can be shown that the stationary distribution converges to a delta function at the global minimum. In practice it takes exponentially long to do so, but the method still gives good results for many problems. (A sketch of simulated annealing follows after the example below.)

Example (Lennard-Jones clusters). A Lennard-Jones cluster is a set of $n$ points interacting with a pairwise potential $U(r) = \varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]$ for some parameters $\sigma, \varepsilon$. This is a model for many atoms, molecules, or other interacting components with reasonably short-range interactions. The energy landscape is very rugged, with a great many local minima. To explore the landscape, and/or to find the lowest minima, one method is to construct a Markov chain on the set of local minima. This works as follows: at each step of the chain, perturb the positions of the points $x_1, x_2, \ldots, x_n$ by some random amount; then find the nearest local minimum using a deterministic method; then check the energy of this local minimum, and either accept it or reject it according to the Hastings criterion. This method can also be used to solve packing problems, where now the "energy" is a function of the density, or of another quantity to be optimized.
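Returning to simulated annealing: here is a minimal sketch on a rugged one-dimensional landscape of my own choosing, using the $(1.001)^t$ schedule mentioned above.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulated_annealing(U, x0, nsteps, step=0.5):
    """Random-walk Metropolis whose inverse temperature grows as beta_t = 1.001^t.

    Keeps track of the best (lowest-energy) point seen along the way.
    """
    x = x0
    best_x, best_U = x0, U(x0)
    for t in range(nsteps):
        beta = 1.001 ** t
        y = x + step * rng.standard_normal()    # propose a nearby point
        dU = U(y) - U(x)
        if dU <= 0.0 or rng.random() < np.exp(-beta * dU):
            x = y                               # accept the move
            if U(x) < best_U:
                best_x, best_U = x, U(x)
    return best_x, best_U

# Rugged test landscape: wells of a sine superposed on a shallow parabola.
U = lambda x: 0.1 * x**2 + np.sin(5.0 * x)
print(simulated_annealing(U, x0=4.0, nsteps=20_000))
```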

Example (Cryptography). Another use of Monte-Carlo methods in optimization is in cryptography (Diaconis [2009]). A cipher is a function $\varphi : S \to A$, where $S$ is a set of symbols (e.g. a permutation of the alphabet, the Greek letters, a collection of squiggles, etc.), and $A = \{a, b, c, \ldots, x, y, z\}$ is the set of letters. A code is a string of symbols $x_1 x_2 x_3 \ldots x_n$, where $x_i \in S$. An example is a code used by inmates in a prison in California. [Figure omitted: an enciphered prison message, from Diaconis [2009].]

If one knows the language the code is written in, then one can obtain the distribution of letter frequencies $f_1(a_i)$, where $a_i \in A$. One can then construct a plausibility function
$$L_1(x_1, x_2, \ldots, x_n; \varphi) = \prod_{i=1}^n f_1(\varphi(x_i)).$$
To decipher a particular code, one can use a Monte-Carlo method to maximize $L_1$ over all ciphers $\varphi$. This method was actually used to decode the prisoners' messages. It didn't work initially. However, when the plausibility function was updated to include information about the frequencies of pairs of letters $f_2(a_i, a_j)$, as
$$L_2(x_1, x_2, \ldots, x_n; \varphi) = \prod_{i=1}^{n-1} f_2(\varphi(x_i), \varphi(x_{i+1})),$$
then it did work, and the researchers learned about daily life in prison (in a mixture of English, Spanish, and prison slang). [Figure omitted: the decrypted message, from Diaconis [2009].]
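As a sketch of how such a search might look for a substitution cipher, assuming one already has a smoothed table of log pair frequencies for the language; everything here, interface included, is my own illustration rather than the method's actual code.

```python
import numpy as np

rng = np.random.default_rng(6)

def decode_mcmc(code, log_f2, nsteps):
    """Metropolis search over ciphers phi, maximizing the pair plausibility L2.

    code   : sequence of symbol indices 0..m-1 (m <= 26 assumed)
    log_f2 : 26 x 26 array of log pair frequencies, smoothed so that no
             entry is -infinity
    Proposal: swap the letters assigned to two symbols (symmetric), so the
    acceptance probability is min(1, L2(phi')/L2(phi)), computed in log space.
    """
    code = np.asarray(code)
    m = code.max() + 1
    phi = rng.permutation(26)[:m]            # current cipher: symbol -> letter

    def log_plausibility(phi):
        letters = phi[code]
        return log_f2[letters[:-1], letters[1:]].sum()

    cur = log_plausibility(phi)
    for _ in range(nsteps):
        prop = phi.copy()
        i, j = rng.integers(m), rng.integers(m)
        prop[i], prop[j] = prop[j], prop[i]  # swap two letter assignments
        new = log_plausibility(prop)
        if np.log(rng.random()) < new - cur: # Metropolis acceptance
            phi, cur = prop, new
    return phi
```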

References

P. Diaconis. The Markov chain Monte Carlo revolution. Bulletin of the American Mathematical Society, 46:179-205, 2009.

G. Grimmett and D. Stirzaker. Probability and Random Processes. Oxford University Press, 2001.

Neal Madras. Lectures on Monte Carlo Methods. American Mathematical Society, 2002.

W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes. Cambridge University Press, 3rd edition, 2007.

Alan D. Sokal. Monte Carlo methods in statistical mechanics: foundations and new algorithms. In Cours de Troisième Cycle de la Physique en Suisse Romande, Lausanne, 1989.
