Markov and Gibbs Random Fields


1 Markov and Gibbs Random Fields
Bruno Galerne, MAP5, Université Paris Descartes
Master MVA, course "Méthodes stochastiques pour l'analyse d'images"
Monday, March 6, 2017

2 Outline
- The Ising Model
- Markov Random Fields and Gibbs Fields
- Sampling Gibbs Fields

3 Reference
The main reference for this course is Chapter 4 of the book Pattern Theory: The Stochastic Analysis of Real-World Signals by D. Mumford and A. Desolneux [Mumford and Desolneux 2010].

4 Outline
- The Ising Model
- Markov Random Fields and Gibbs Fields
- Sampling Gibbs Fields

5 The Ising Model
The Ising model is the simplest and most famous Gibbs model. It has its origin in statistical mechanics but can be presented in the context of image segmentation.
Framework:
- A real-valued image $I \in \mathbb{R}^{M \times N}$ of size $M \times N$ is given (with gray levels mostly in $[-1, 1]$ after some contrast change).
- One wants to associate with $I$ a binary image $J \in \{-1, 1\}^{M \times N}$ ($-1$ for black, $1$ for white) supposed to model the predominantly black/white areas of $I$.
- There are only finitely many possible values for $J$, namely $2^{MN}$.
[Figure: input image $I$ and a realization $J$ of the associated Ising model (with $T = 1.5$).]

6 The Ising Model
Notation:
- $\Omega_{M,N} = \{0, \dots, M-1\} \times \{0, \dots, N-1\}$ is the set of pixel indexes.
- To keep the notation short, we denote by $\alpha$ or $\beta$ pixel coordinates $(k, l)$, $(m, n)$, so that $I(\alpha) = I(k, l)$ for some $(k, l)$.
- We write $\alpha \sim \beta$ to mean that pixels $\alpha$ and $\beta$ are neighbors for the 4-connectivity (each pixel has 4 neighbors, except at the border).
Energy: To each couple of images $(I, J)$, $I \in \mathbb{R}^{M \times N}$, $J \in \{-1, 1\}^{M \times N}$, one associates the energy
$$E(I, J) = c \sum_{\alpha \in \Omega_{M,N}} (I(\alpha) - J(\alpha))^2 + \sum_{\alpha \sim \beta} (J(\alpha) - J(\beta))^2,$$
where $c > 0$ is a positive constant and the sum over $\alpha \sim \beta$ means that each pair of connected pixels is summed once (and not twice).
The first term of the energy measures the similarity between $I$ and $J$. The second term measures the similarity between neighboring pixels of $J$.
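
As an illustration, here is a minimal NumPy sketch of this energy (the function name and the NumPy implementation are our own, not part of the course):

```python
import numpy as np

def ising_energy(I, J, c=1.0):
    """Energy E(I, J): data term plus smoothness term over
    4-connectivity pairs, each pair counted once."""
    data = c * np.sum((I - J) ** 2)
    horiz = np.sum((J[:, :-1] - J[:, 1:]) ** 2)  # pairs (k,l) ~ (k,l+1)
    vert = np.sum((J[:-1, :] - J[1:, :]) ** 2)   # pairs (k,l) ~ (k+1,l)
    return data + horiz + vert
```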

7 The Ising Model
Energy: $E(I, J) = c \sum_{\alpha \in \Omega_{M,N}} (I(\alpha) - J(\alpha))^2 + \sum_{\alpha \sim \beta} (J(\alpha) - J(\beta))^2$
The Ising model associated with I: For any fixed image $I$, this energy defines a discrete probability distribution on $\{-1, 1\}^{M \times N}$ by
$$p_T(J) = \frac{1}{Z_T} e^{-\frac{1}{T} E(I,J)},$$
where $T > 0$ is a constant called the temperature and $Z_T$ is the normalizing constant
$$Z_T = \sum_{J \in \{-1,1\}^{M \times N}} e^{-\frac{1}{T} E(I,J)}.$$
The probability distribution $p_T$ is the Ising model associated with $I$ (and constant $c$ and temperature $T$).

8 The Ising Model
Energy: $E(I, J) = c \sum_{\alpha \in \Omega_{M,N}} (I(\alpha) - J(\alpha))^2 + \sum_{\alpha \sim \beta} (J(\alpha) - J(\beta))^2$
Probability distribution: $p_T(J) = \frac{1}{Z_T} e^{-\frac{1}{T} E(I,J)}$
Minimizing the energy $E(I, J)$ with respect to $J$ is equivalent to finding the most probable state of the discrete distribution $p_T$. Note that this most probable state (also called the mode of the distribution) is the same for all temperatures $T$.
As $T$ tends to $0$, the distribution $p_T$ tends to be concentrated at this mode. As $T$ tends to $+\infty$, $p_T$ tends to be uniform over all possible image configurations. Hence the temperature parameter $T$ controls the amount of allowed randomness around the most probable state.

9 The Ising Model
The main questions regarding the Ising model are the following:
1. How can we sample efficiently from the discrete distribution $p_T$? Although the distribution is a discrete probability on the finite set of binary images $\{-1, 1\}^{M \times N}$, one cannot compute the table of the probability distribution, even for small images! Indeed, the memory size of the table would be $2^{MN} \times 4$ bytes ($2^{MN}$ binary images, 4 bytes for a float), which is far beyond terabytes already for modest $M$ and $N$.
2. How can we compute the common mode(s) of the distributions $p_T$, that is, (one of) the most probable state(s) of the distributions $p_T$?
3. What is a good order of magnitude for the temperature value $T$?
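
To get a feeling for the first point, here is a quick back-of-the-envelope check (the 16 × 16 image size is our own illustrative choice):

```python
# Even a tiny 16 x 16 binary image makes the probability table hopeless.
M, N = 16, 16                  # hypothetical image size
n_states = 2 ** (M * N)        # number of binary images
table_bytes = 4 * n_states     # 4 bytes per float
print(f"{table_bytes:.3e} bytes")  # about 4.6e+77 bytes
```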

13 Outline
- The Ising Model
- Markov Random Fields and Gibbs Fields
- Sampling Gibbs Fields

14 A General Framework
The Ising model is a very special case of a large class of stochastic models called Gibbs fields.
General framework:
- We consider an arbitrary finite graph $(V, E)$ with vertices $V$ and undirected edges $E \subset V \times V$.
- Each vertex has a random label in a phase space $\Lambda$ (also supposed to be finite, e.g. discrete gray levels).
- The set $\Omega = \Lambda^V$ of all label configurations is called the state space. The elements of $\Omega = \Lambda^V$ are denoted by $x, y, \dots$, that is, $x = (x_\alpha)_{\alpha \in V}$, $x_\alpha \in \Lambda$.
Example of the Ising model:
- The vertices $V$ are the pixels of $\Omega_{M,N}$.
- The edges are the ones of the 4-connectivity.
- The phase space is $\Lambda = \{-1, 1\}$.
- The state space is the set of binary images $\Omega = \{-1, 1\}^{M \times N}$.

15 Markov Random Fields
Definition (Markov Random Field). A random variable $X = (X_\alpha)_{\alpha \in V}$ with values in the state space $\Omega = \Lambda^V$ is a Markov random field (MRF) if for any partition $V = V_1 \cup V_2 \cup V_3$ of the set of vertices $V$ such that there is no edge between vertices of $V_1$ and $V_3$, and denoting by $X_i = X_{V_i}$ the restriction of $X$ to $V_i$, $i = 1, 2, 3$, one has
$$P(X_1 = x_1 \mid X_2 = x_2, X_3 = x_3) = P(X_1 = x_1 \mid X_2 = x_2)$$
whenever both conditional probabilities are well defined, that is, whenever $P(X_2 = x_2, X_3 = x_3) > 0$.
The above property is called the Markov property. It is equivalent to saying that $X_1$ and $X_3$ are conditionally independent given $X_2$, that is,
$$P(X_1 = x_1, X_3 = x_3 \mid X_2 = x_2) = P(X_1 = x_1 \mid X_2 = x_2)\, P(X_3 = x_3 \mid X_2 = x_2)$$
whenever the conditional probabilities are well defined, that is, whenever $P(X_2 = x_2) > 0$.

16 Conditional Independence
Proof: Conditional independence ⟹ Markov property. Assuming $P(X_2 = x_2, X_3 = x_3) > 0$ (since otherwise there is nothing to show):
$$P(X_1 = x_1 \mid X_2 = x_2, X_3 = x_3) = \frac{P(X_1 = x_1, X_2 = x_2, X_3 = x_3)}{P(X_2 = x_2, X_3 = x_3)} = \frac{P(X_1 = x_1, X_3 = x_3 \mid X_2 = x_2)}{P(X_3 = x_3 \mid X_2 = x_2)}$$
$$= \frac{P(X_1 = x_1 \mid X_2 = x_2)\, P(X_3 = x_3 \mid X_2 = x_2)}{P(X_3 = x_3 \mid X_2 = x_2)} = P(X_1 = x_1 \mid X_2 = x_2).$$

17 Conditional Independence
Proof: Markov property ⟹ Conditional independence. Assuming $P(X_2 = x_2, X_3 = x_3) > 0$ (since otherwise there is nothing to show):
$$P(X_1 = x_1 \mid X_2 = x_2)\, P(X_3 = x_3 \mid X_2 = x_2) = P(X_1 = x_1 \mid X_2 = x_2, X_3 = x_3)\, P(X_3 = x_3 \mid X_2 = x_2)$$
$$= \frac{P(X_1 = x_1, X_2 = x_2, X_3 = x_3)}{P(X_2 = x_2, X_3 = x_3)} \cdot \frac{P(X_2 = x_2, X_3 = x_3)}{P(X_2 = x_2)} = P(X_1 = x_1, X_3 = x_3 \mid X_2 = x_2).$$

18 Gibbs Fields
Definition (Cliques). Given a graph $(V, E)$, a subset $C \subset V$ is a clique if any two distinct vertices of $C$ are linked by an edge of $E$:
$$\forall \alpha, \beta \in C, \quad \alpha \neq \beta \implies (\alpha, \beta) \in E.$$
The set of cliques of a graph is denoted by $\mathcal{C}$.
Example:
1. The singletons $\{\alpha\}$ of $V$ are always cliques.
2. For the graph of the 4-connectivity, the cliques are the singletons, the pairs of horizontally adjacent pixels, and the pairs of vertically adjacent pixels (see the sketch below).
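
As a small illustration, the following sketch (our own helper function, not from the course) enumerates the cliques of the 4-connectivity graph:

```python
from itertools import product

def grid_cliques(M, N):
    """Cliques of the 4-connectivity graph on an M x N grid: singletons
    plus horizontally/vertically adjacent pairs (there is no larger
    clique because the grid graph contains no triangle)."""
    singletons = [{(k, l)} for k, l in product(range(M), range(N))]
    horiz = [{(k, l), (k, l + 1)} for k in range(M) for l in range(N - 1)]
    vert = [{(k, l), (k + 1, l)} for k in range(M - 1) for l in range(N)]
    return singletons + horiz + vert

print(len(grid_cliques(3, 3)))  # 9 singletons + 6 + 6 pairs = 21
```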

19 Gibbs Fields
Definition (Families of potentials). A family of functions $U_C : \Omega \to \mathbb{R}$, $C \in \mathcal{C}$, is said to be a family of potentials if each function $U_C$ depends only on the restriction of the configuration to the clique $C$:
$$\forall x, y \in \Omega, \quad x_C = y_C \implies U_C(x) = U_C(y).$$
Definition (Gibbs distribution and Gibbs field). A Gibbs distribution on the state space $\Omega = \Lambda^V$ is a probability distribution $P$ that comes from an energy function deriving from a family of potentials on cliques $(U_C)_{C \in \mathcal{C}}$:
$$P(x) = \frac{1}{Z} e^{-E(x)}, \quad \text{where } E(x) = \sum_{C \in \mathcal{C}} U_C(x),$$
and $Z$ is the normalizing constant $Z = \sum_{x \in \Omega} e^{-E(x)}$.
A Gibbs field is a random variable $X = (X_\alpha)_{\alpha \in V} \in \Omega$ whose law is a Gibbs distribution.

20 Gibbs Fields
Gibbs distribution: $P(x) = \frac{1}{Z} e^{-E(x)}$, where $E(x) = \sum_{C \in \mathcal{C}} U_C(x)$.
Example: The distribution of the Ising model is given by $p_T(J) = \frac{1}{Z_T} e^{-\frac{1}{T} E(I,J)}$, where
$$E(I, J) = c \sum_{\alpha \in \Omega_{M,N}} (I(\alpha) - J(\alpha))^2 + \sum_{\alpha \sim \beta} (J(\alpha) - J(\beta))^2.$$
Hence it is a Gibbs distribution that derives from the family of potentials
$$U_{\{\alpha\}} = \frac{c}{T} (I(\alpha) - J(\alpha))^2, \quad \alpha \in V, \qquad U_{\{\alpha,\beta\}} = \frac{1}{T} (J(\alpha) - J(\beta))^2, \quad (\alpha, \beta) \in E.$$

21 Gibbs Fields
Gibbs distribution: $P(x) = \frac{1}{Z} e^{-E(x)}$, where $E(x) = \sum_{C \in \mathcal{C}} U_C(x)$.
Remark: One can always introduce a temperature parameter $T$: for all $T > 0$,
$$P_T(x) = \frac{1}{Z_T} e^{-\frac{1}{T} E(x)}$$
is also a Gibbs distribution.

22 The Hammersley-Clifford Theorem
Theorem (Hammersley-Clifford). Gibbs fields and MRFs are equivalent in the following sense:
1. If $X$ is a Gibbs field, then it satisfies the Markov property.
2. If $P$ is the distribution of an MRF such that $P(x) > 0$ for all $x \in \Omega$, then there exists an energy function $E$ deriving from a family of potentials $(U_C)_{C \in \mathcal{C}}$ such that
$$P(x) = \frac{1}{Z} e^{-E(x)}.$$
Take-away message: Gibbs fields and Markov random fields are essentially the same thing.

23 Proof that Gibbs Fields are MRFs
Let $X$ be a Gibbs field with distribution $P(x) = \frac{1}{Z} e^{-E(x)}$, where $E(x) = \sum_{C \in \mathcal{C}} U_C(x)$. Let $V = V_1 \cup V_2 \cup V_3$ be any partition of the set of vertices $V$ such that there is no edge between vertices of $V_1$ and $V_3$. For each $i = 1, 2, 3$, one denotes by $x_i = x_{V_i}$ the restriction of a state $x$ to $V_i$.
Remark that a clique $C \in \mathcal{C}$ cannot intersect both $V_1$ and $V_3$, that is, either $C \subset V_1 \cup V_2$ or $C \subset V_2 \cup V_3$. Since $U_C(x)$ only depends on the restriction of $x$ to $C$, $U_C(x)$ depends either on $(x_1, x_2)$ or on $(x_2, x_3)$. One can thus write $E(x) = E_1(x_1, x_2) + E_2(x_2, x_3)$, and hence $P(x) = F_1(x_1, x_2) F_2(x_2, x_3)$ (absorbing the constant $1/Z$). Therefore,
$$P(x_1 \mid x_2, x_3) = \frac{P(x_1, x_2, x_3)}{P(x_2, x_3)} = \frac{P(x_1, x_2, x_3)}{\sum_{y \in \Lambda^{V_1}} P(y, x_2, x_3)} = \frac{F_1(x_1, x_2) F_2(x_2, x_3)}{\sum_{y \in \Lambda^{V_1}} F_1(y, x_2) F_2(x_2, x_3)}$$
$$= \frac{F_1(x_1, x_2)}{\sum_{y \in \Lambda^{V_1}} F_1(y, x_2)} = \frac{P(x_1, x_2)}{P(x_2)} = P(x_1 \mid x_2),$$
where the last-but-one equality follows from the same computation with $x_3$ summed out.

24 The Hammersley-Clifford Theorem
Hence a Gibbs field satisfies the Markov property: it is an MRF. The difficult part of the Hammersley-Clifford theorem is the converse implication! It relies solely on the Möbius inversion formula.
Proposition (Möbius inversion formula). Let $V$ be a finite set and $f$ and $g$ be two functions defined on the set $\mathcal{P}(V)$ of subsets of $V$. Then
$$\left[\forall A \subset V,\ f(A) = \sum_{B \subset A} (-1)^{|A \setminus B|} g(B)\right] \iff \left[\forall A \subset V,\ g(A) = \sum_{B \subset A} f(B)\right].$$
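
The formula is easy to verify numerically. Below is a quick sketch (entirely our own; the 3-element set and the random $g$ are illustrative assumptions):

```python
from itertools import chain, combinations
import random

def subsets(A):
    """All subsets of the tuple A, as tuples."""
    return list(chain.from_iterable(combinations(A, r)
                                    for r in range(len(A) + 1)))

random.seed(0)
V = (0, 1, 2)                                  # hypothetical small set
g = {B: random.random() for B in subsets(V)}   # arbitrary g

# f(A) = sum over B subset of A of (-1)^{|A \ B|} g(B)
f = {A: sum((-1) ** (len(A) - len(B)) * g[B] for B in subsets(A))
     for A in subsets(V)}

# Mobius inversion predicts g(A) = sum over B subset of A of f(B).
for A in subsets(V):
    assert abs(g[A] - sum(f[B] for B in subsets(A))) < 1e-9
```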

25 Proof of the Möbius inversion formula
Simple fact: Let $n \geq 0$ be an integer. Then
$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} = \begin{cases} 1 & \text{if } n = 0, \\ 0 & \text{if } n > 0, \end{cases}$$
(since it is equal to $(1 + (-1))^n$!)
Proof of $\left[\forall A \subset V,\ f(A) = \sum_{B \subset A} (-1)^{|A \setminus B|} g(B)\right] \implies \left[\forall A \subset V,\ g(A) = \sum_{B \subset A} f(B)\right]$: Let $A \subset V$. Then,
$$\sum_{B \subset A} f(B) = \sum_{B \subset A} \sum_{D \subset B} (-1)^{|B \setminus D|} g(D) = \sum_{D \subset A} \sum_{\substack{B \\ D \subset B \subset A}} (-1)^{|B \setminus D|} g(D) = \sum_{D \subset A} g(D) \sum_{E \subset A \setminus D} (-1)^{|E|}$$
$$= \sum_{D \subset A} g(D) \underbrace{\sum_{k=0}^{|A \setminus D|} \binom{|A \setminus D|}{k} (-1)^k}_{=0 \text{ if } |A \setminus D| > 0 \text{ and } 1 \text{ if } D = A} = g(A).$$

26 Proof of the Möbius inversion formula
Proof of $\left[\forall A \subset V,\ g(A) = \sum_{B \subset A} f(B)\right] \implies \left[\forall A \subset V,\ f(A) = \sum_{B \subset A} (-1)^{|A \setminus B|} g(B)\right]$: Let $A \subset V$. Then,
$$\sum_{B \subset A} (-1)^{|A \setminus B|} g(B) = \sum_{B \subset A} (-1)^{|A \setminus B|} \sum_{D \subset B} f(D) = \sum_{D \subset A} f(D) \sum_{E \subset A \setminus D} (-1)^{|A \setminus (E \cup D)|}$$
$$= \sum_{D \subset A} f(D) (-1)^{|A \setminus D|} \sum_{E \subset A \setminus D} (-1)^{|E|} = \sum_{D \subset A} f(D) (-1)^{|A \setminus D|} \underbrace{\sum_{k=0}^{|A \setminus D|} \binom{|A \setminus D|}{k} (-1)^k}_{=0 \text{ if } |A \setminus D| > 0 \text{ and } 1 \text{ if } D = A} = f(A).$$

27 Proof that MRFs with Positive Probability are Gibbs Fields
Notation: Let $\lambda_0 \in \Lambda$ denote a fixed value in the phase space $\Lambda$. For a subset $A \subset V$ and a configuration $x \in \Omega = \Lambda^V$, we denote by $xA$ the configuration that agrees with $x$ at each site of $A$ and is $\lambda_0$ elsewhere, that is,
$$(xA)_\alpha = \begin{cases} x_\alpha & \text{if } \alpha \in A, \\ \lambda_0 & \text{otherwise.} \end{cases}$$
In particular, we have $xV = x$, and we denote by $x^0 = x\emptyset$ the configuration with the value $\lambda_0$ at all the sites.
Candidate energy: Remark that if $P$ is a Gibbs distribution with energy $E$, then
$$P(x) = \frac{1}{Z} e^{-E(x)} \implies E(x) = \log \frac{P(x^0)}{P(x)} + E(x^0).$$
Besides, the constant $E(x^0)$ has no influence since it can be included in $Z$. Hence, given our Markov distribution $P$, we now define a function $E : \Omega \to \mathbb{R}$ by
$$E(x) = \log \frac{P(x^0)}{P(x)}$$
($P(x) > 0$ by hypothesis).

28 Proof that MRFs with Positive Probability are Gibbs Fields
Candidate energy: $E(x) = \log \frac{P(x^0)}{P(x)}$, so that $P(x) = P(x^0) e^{-E(x)}$.
Goal: Show that this energy $E$ derives from a family of potentials on cliques $(U_C)_{C \in \mathcal{C}}$: for all $x \in \Omega$, $E(x) = \sum_{C \in \mathcal{C}} U_C(x)$.
Let $x$ be a given configuration. For every subset $A \subset V$, we define
$$U_A(x) = \sum_{B \subset A} (-1)^{|A \setminus B|} E(xB).$$
By the Möbius inversion formula (applied to $A \mapsto E(xA)$), we have, for all $A \subset V$,
$$E(xA) = \sum_{B \subset A} U_B(x).$$
In particular, taking $A = V$, we get $E(x) = \sum_{A \subset V} U_A(x)$ and thus
$$P(x) = P(x^0) e^{-E(x)} = P(x^0) \exp\left( -\sum_{A \subset V} U_A(x) \right).$$

29 Proof that MRFs with Positive Probability are Gibbs Fields
$$P(x) = P(x^0) e^{-E(x)} = P(x^0) \exp\left( -\sum_{A \subset V} U_A(x) \right).$$
$P$ has nearly the targeted form! Indeed, $U_A(x) = \sum_{B \subset A} (-1)^{|A \setminus B|} E(xB)$ only depends on the values of $x$ at the sites of $A$.
It only remains to show that $U_A(x) = 0$ if $A$ is not a clique. In that case, the energy $E(x) = \sum_{A \subset V} U_A(x) = \sum_{C \in \mathcal{C}} U_C(x)$ derives from a family of potentials, that is, $P$ is a Gibbs distribution.
The proof that $U_A(x) = 0$ if $A$ is not a clique will use the Markov property satisfied by $P$... (not used yet)

30 Proof that MRFs with Positive Probability are Gibbs Fields
Proof of $U_A(x) = 0$ if $A$ is not a clique: Let $A \subset V$ be such that $A$ is not a clique, that is, $A \notin \mathcal{C}$. Then $A$ contains two vertices $\alpha$ and $\beta$ that are not neighbors in the graph $(V, E)$. Splitting the subsets $B \subset A$ according to whether they contain $\alpha$ and/or $\beta$,
$$U_A(x) = \sum_{\substack{B \subset A \\ \alpha \in B,\ \beta \in B}} (-1)^{|A \setminus B|} E(xB) + \sum_{\substack{B \subset A \\ \alpha \in B,\ \beta \notin B}} (-1)^{|A \setminus B|} E(xB) + \sum_{\substack{B \subset A \\ \alpha \notin B,\ \beta \in B}} (-1)^{|A \setminus B|} E(xB) + \sum_{\substack{B \subset A \\ \alpha \notin B,\ \beta \notin B}} (-1)^{|A \setminus B|} E(xB)$$
$$= \sum_{B \subset A \setminus \{\alpha, \beta\}} (-1)^{|A \setminus B|} \Big( \underbrace{E(x(B \cup \{\alpha, \beta\})) + E(xB) - E(x(B \cup \{\alpha\})) - E(x(B \cup \{\beta\}))}_{\text{let us show that this is } 0} \Big).$$

31 Proof that MRFs with Positive Probability are Gibbs Fields
Proof of $U_A(x) = 0$ if $A$ is not a clique (continued). Goal:
$$E(x(B \cup \{\alpha, \beta\})) + E(xB) - E(x(B \cup \{\alpha\})) - E(x(B \cup \{\beta\})) = 0.$$
Since $E(x) = \log \frac{P(x^0)}{P(x)}$,
$$E(x(B \cup \{\alpha, \beta\})) + E(xB) - E(x(B \cup \{\alpha\})) - E(x(B \cup \{\beta\})) = \log \frac{P(x(B \cup \{\alpha\}))\, P(x(B \cup \{\beta\}))}{P(x(B \cup \{\alpha, \beta\}))\, P(xB)}.$$
Let us use the conditional independence (equivalent to the Markov property) with the partition $V_1 = \{\alpha\}$, $V_2 = V \setminus \{\alpha, \beta\}$, $V_3 = \{\beta\}$. Remark that the four states $x(B \cup \{\alpha\})$, $x(B \cup \{\beta\})$, $x(B \cup \{\alpha, \beta\})$, and $xB$ have the same restriction $x_2$ to $V_2 = V \setminus \{\alpha, \beta\}$. Hence
$$P(x(B \cup \{\alpha, \beta\})) = P(X_\alpha = x_\alpha, X_\beta = x_\beta, X_2 = x_2) = P(X_\alpha = x_\alpha \mid X_2 = x_2)\, P(X_\beta = x_\beta \mid X_2 = x_2)\, P(X_2 = x_2).$$
Similarly,
$$P(xB) = P(X_\alpha = \lambda_0 \mid X_2 = x_2)\, P(X_\beta = \lambda_0 \mid X_2 = x_2)\, P(X_2 = x_2),$$
$$P(x(B \cup \{\alpha\})) = P(X_\alpha = x_\alpha \mid X_2 = x_2)\, P(X_\beta = \lambda_0 \mid X_2 = x_2)\, P(X_2 = x_2),$$
$$P(x(B \cup \{\beta\})) = P(X_\alpha = \lambda_0 \mid X_2 = x_2)\, P(X_\beta = x_\beta \mid X_2 = x_2)\, P(X_2 = x_2).$$

32 Proof that MRFs with Positive Probability are Gibbs Fields
Proof of $U_A(x) = 0$ if $A$ is not a clique (end). Goal:
$$E(x(B \cup \{\alpha, \beta\})) + E(xB) - E(x(B \cup \{\alpha\})) - E(x(B \cup \{\beta\})) = 0.$$
Hence,
$$\frac{P(x(B \cup \{\alpha\}))\, P(x(B \cup \{\beta\}))}{P(x(B \cup \{\alpha, \beta\}))\, P(xB)} = 1,$$
which shows that
$$E(x(B \cup \{\alpha, \beta\})) + E(xB) - E(x(B \cup \{\alpha\})) - E(x(B \cup \{\beta\})) = \log \frac{P(x(B \cup \{\alpha\}))\, P(x(B \cup \{\beta\}))}{P(x(B \cup \{\alpha, \beta\}))\, P(xB)} = 0,$$
and thus $U_A(x) = 0$.

33 Outline
- The Ising Model
- Markov Random Fields and Gibbs Fields
- Sampling Gibbs Fields

34 Considered Problems
Gibbs fields pose several difficult computational problems:
1. Compute samples from the distribution.
2. Compute the mode, the most probable state $x^* = \operatorname{argmax}_x P(x)$ (if it is unique...).
3. Compute the marginal distribution of a component $X_\alpha$ for some fixed $\alpha \in V$.
The simplest method to sample from a Gibbs field is to use a type of Monte Carlo Markov Chain (MCMC) method known as the Metropolis algorithm. The main idea of Metropolis algorithms is to let a Markov chain evolve on the space of configurations $\Omega = \Lambda^V$. Running a sampler for a reasonable length of time enables one to estimate the marginals of each component $X_\alpha$, or the mean of other random functions of the state.
Simulated annealing: Samples at very low temperatures cluster around the mode of the field, but the Metropolis algorithm takes a very long time to give good samples if the temperature is too low. Simulated annealing consists in starting the Metropolis algorithm at a sufficiently high temperature and gradually lowering the temperature.

35 Review of Markov Chains
We refer to the textbook [Brémaud 1998] for a complete reference on Markov chains. The main reference for this section is [Mumford and Desolneux 2010].
Definition (Markov Chain). Let $\Omega$ be a finite set of states. A sequence $(X_n)_{n \in \mathbb{N}}$ of random variables taking values in $\Omega$ is said to be a Markov chain if it satisfies the Markov condition: for all $n \geq 1$ and $x_0, x_1, \dots, x_n \in \Omega$,
$$P(X_n = x_n \mid X_0 = x_0, X_1 = x_1, \dots, X_{n-1} = x_{n-1}) = P(X_n = x_n \mid X_{n-1} = x_{n-1}),$$
whenever both conditional probabilities are well defined, that is, whenever $P(X_0 = x_0, X_1 = x_1, \dots, X_{n-1} = x_{n-1}) > 0$.
Moreover, the Markov chain $(X_n)_{n \in \mathbb{N}}$ is said to be homogeneous if this probability does not depend on $n$, that is, for all $n \geq 1$ and $x, y \in \Omega$,
$$P(X_n = y \mid X_{n-1} = x) = P(X_1 = y \mid X_0 = x).$$
In what follows, Markov chains will always be assumed homogeneous. In this case, the transition matrix $Q$ of the chain is defined as the $|\Omega| \times |\Omega|$ matrix of all transition probabilities
$$Q(x, y) = q_{x \to y} = P(X_1 = y \mid X_0 = x) = P(X_n = y \mid X_{n-1} = x), \quad \forall n.$$

36 Review of Markov Chains
Proposition. For all $n \geq 1$, the matrix $Q^n = Q \cdots Q$ gives the law of $X_n$ given the initial state $X_0$, that is, $P(X_n = y \mid X_0 = x) = Q^n(x, y)$.
Proof. By induction. This is true for $n = 1$. Suppose it is true for some $n \geq 1$. Then,
$$P(X_{n+1} = y \mid X_0 = x) = \sum_{z \in \Omega} P(X_{n+1} = y, X_n = z \mid X_0 = x) = \sum_{z \in \Omega} \frac{P(X_{n+1} = y, X_n = z, X_0 = x)}{P(X_0 = x)}$$
$$= \sum_{z \in \Omega} \frac{P(X_{n+1} = y, X_n = z, X_0 = x)}{P(X_n = z, X_0 = x)} \cdot \frac{P(X_n = z, X_0 = x)}{P(X_0 = x)} = \sum_{z \in \Omega} P(X_{n+1} = y \mid X_n = z, X_0 = x)\, P(X_n = z \mid X_0 = x)$$
$$= \sum_{z \in \Omega} P(X_{n+1} = y \mid X_n = z)\, Q^n(x, z) = \sum_{z \in \Omega} Q^n(x, z)\, Q(z, y) = Q^{n+1}(x, y).$$
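
This proposition is easy to check numerically. Here is a small sketch (the 3-state transition matrix is a hypothetical example of our own):

```python
import numpy as np

# Hypothetical 3-state chain (values are our own illustration).
Q = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

# Law of X_n given X_0 = x is the x-th row of Q^n.
Qn = np.linalg.matrix_power(Q, 10)
print(Qn[0])  # distribution of X_10 started from state 0

# Monte Carlo check: simulate 50,000 chains for 10 steps from state 0.
rng = np.random.default_rng(0)
x = np.zeros(50_000, dtype=int)
for _ in range(10):
    u = rng.random(x.size)
    x = (u[:, None] > np.cumsum(Q[x], axis=1)).sum(axis=1)
print(np.bincount(x, minlength=3) / x.size)  # close to Qn[0]
```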

37 Review of Markov Chains
Definition. We say that the Markov chain is irreducible if for all $x, y \in \Omega$ there exists $n \geq 0$ such that $Q^n(x, y) > 0$.
If a Markov chain is irreducible, then every pair of states can be connected by the chain.
Some Markov chains can have a periodic behavior, visiting all the states of $\Omega$ in a prescribed order. This is generally a behavior that is best avoided.
Definition. We say that a Markov chain is aperiodic if for all $x \in \Omega$ the greatest common divisor of all the $n \geq 1$ such that $Q^n(x, x) > 0$ is $1$.

38 Equilibrium Probability Distribution
Definition. A probability distribution $\Pi$ on $\Omega$ is an equilibrium probability distribution of $Q$ if
$$\forall y \in \Omega, \quad \sum_{x \in \Omega} \Pi(x) Q(x, y) = \Pi(y).$$
Interpretation: If the initial state $X_0$ follows the distribution $\Pi$, then all the random variables $X_n$ also follow the distribution $\Pi$:
$$P(X_1 = y) = \sum_{x \in \Omega} P(X_1 = y \mid X_0 = x)\, P(X_0 = x) = \sum_{x \in \Omega} Q(x, y)\, \Pi(x) = \Pi(y).$$
Theorem. If $Q$ is irreducible, then there exists a unique equilibrium probability distribution $\Pi$ for $Q$. If moreover $Q$ is aperiodic, then
$$\forall x, y \in \Omega, \quad \lim_{n \to +\infty} Q^n(x, y) = \Pi(y).$$
Consequence: If $X_0$ follows any distribution $P_0$, then, for all $y \in \Omega$,
$$P(X_n = y) = \sum_{x \in \Omega} P(X_n = y \mid X_0 = x)\, P_0(x) = \sum_{x \in \Omega} Q^n(x, y)\, P_0(x) \xrightarrow[n \to +\infty]{} \sum_{x \in \Omega} \Pi(y)\, P_0(x) = \Pi(y).$$
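
Numerically, the equilibrium distribution can be obtained as a left eigenvector of $Q$ for the eigenvalue $1$. A sketch, reusing the hypothetical 3-state matrix from above:

```python
import numpy as np

Q = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

# Equilibrium distribution Pi: left eigenvector of Q for eigenvalue 1.
w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()                           # normalize to a probability
print(pi)                                # equilibrium Pi
print(np.linalg.matrix_power(Q, 50)[0])  # rows of Q^n converge to Pi
```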

39 Equilibrium Probability Distribution
Consequence: $P(X_n = y) \to \Pi(y)$ as $n \to +\infty$. Whatever the distribution of the initial state $X_0$, as $n$ tends to $+\infty$, the sequence $(X_n)_{n \in \mathbb{N}}$ converges in distribution to $\Pi$.
Approximate simulation of the distribution $\Pi$:
1. Draw randomly an initial state $x_0$ (with some distribution $P_0$).
2. For $n = 1$ to $N$ ($N$ large), compute $x_n$ by simulating the discrete distribution $P(X_n \mid X_{n-1} = x_{n-1}) = Q(x_{n-1}, \cdot)$.
3. Return the random variate $x_N$.
This procedure is called a Monte Carlo Markov Chain (MCMC) method and is at the heart of the Metropolis algorithm to simulate Gibbs fields. (A code sketch of the procedure follows.)
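
A minimal sketch of this procedure (the function name and the 3-state chain are our own illustrative choices):

```python
import numpy as np

def mcmc_sample(Q, x0, n_steps, rng):
    """Run a Markov chain with transition matrix Q for n_steps,
    starting from state x0; return the final state x_N."""
    x = x0
    for _ in range(n_steps):
        x = rng.choice(len(Q), p=Q[x])
    return x

# Usage with the hypothetical 3-state chain above:
Q = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])
rng = np.random.default_rng(1)
samples = [mcmc_sample(Q, 0, 50, rng) for _ in range(2000)]
print(np.bincount(samples, minlength=3) / len(samples))  # approximates Pi
```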

40 Metropolis Algorithm
Framework: $\Omega$ is a finite state space ($\Omega = \Lambda^V$ in our case). Given any energy function $E : \Omega \to \mathbb{R}$, we define a probability distribution on $\Omega$ at every temperature $T$ via the formula
$$P_T(x) = \frac{1}{Z_T} e^{-\frac{1}{T} E(x)}, \quad \text{where } Z_T = \sum_{x \in \Omega} e^{-\frac{1}{T} E(x)}.$$
Idea: Simulate a Markov chain $(X_n)_{n \geq 0}$, $X_n \in \Omega$, whose equilibrium probability distribution is $P_T$, so that, as $n$ tends to $+\infty$, the distribution of $X_n$ tends to $P_T$.
We start with some simple Markov chain, given by a transition matrix $Q(x, y) = (q_{x \to y})_{(x,y) \in \Omega^2}$, $q_{x \to y}$ being the probability of the transition from a state $x \in \Omega$ to a state $y \in \Omega$.
Hypotheses on $Q$: $Q$ is symmetric ($q_{x \to y} = q_{y \to x}$), $Q$ is irreducible, and $Q$ is aperiodic.

41 Metropolis Algorithm
Definition. The Metropolis transition matrix associated with $P_T$ and $Q$ is the matrix $P$ defined by
$$P(x, y) = p_{x \to y} = \begin{cases} q_{x \to y} & \text{if } x \neq y \text{ and } P_T(y) \geq P_T(x), \\[4pt] \dfrac{P_T(y)}{P_T(x)}\, q_{x \to y} & \text{if } x \neq y \text{ and } P_T(y) < P_T(x), \\[4pt] 1 - \sum_{z \neq x} p_{x \to z} & \text{if } y = x. \end{cases}$$
Theorem. The Metropolis transition matrix $P = (p_{x \to y})$ is irreducible and aperiodic if $Q = (q_{x \to y})$ is. In addition, the equilibrium probability distribution of $P$ is $P_T$, that is,
$$\forall x, y \in \Omega, \quad \lim_{n \to +\infty} P^n(x, y) = P_T(y).$$
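
In algorithmic form, one step of this chain is a propose/accept move. A generic sketch (the names `E`, `propose`, and the structure are our own; `propose` stands for any symmetric proposal $Q$):

```python
import numpy as np

def metropolis_step(x, E, T, propose, rng):
    """One Metropolis move for P_T(x) ~ exp(-E(x)/T) with a symmetric
    proposal: accept y with probability min(1, P_T(y)/P_T(x))."""
    y = propose(x, rng)
    dE = E(y) - E(x)
    if dE <= 0 or rng.random() < np.exp(-dE / T):
        return y  # accepted move
    return x      # rejected: stay at x
```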

42 Metropolis Algorithm
Proof: Since $P_T(y) \neq 0$ for all $y \in \Omega$, and by induction,
$$Q(x, y) \neq 0 \implies P(x, y) \neq 0, \qquad Q^n(x, y) \neq 0 \implies P^n(x, y) \neq 0.$$
As $Q$ is irreducible and aperiodic, so is $P$. In consequence, $P$ has a unique equilibrium probability distribution. We now prove that there is detailed balance, which will imply that $P_T$ is the equilibrium probability distribution of $P$:
$$\forall x, y \in \Omega, \quad P_T(x)\, p_{x \to y} = P_T(y)\, p_{y \to x}.$$
Let $x, y \in \Omega$. If $x = y$, the equality is obvious. If $x \neq y$, we may assume that $P_T(x) \geq P_T(y)$. Then,
$$P_T(x)\, p_{x \to y} = P_T(x)\, \frac{P_T(y)}{P_T(x)}\, q_{x \to y} = P_T(y)\, q_{x \to y} = P_T(y)\, q_{y \to x} = P_T(y)\, p_{y \to x}.$$
Summing this equality over all $x$, one gets, for all $y \in \Omega$,
$$\sum_{x \in \Omega} P_T(x)\, p_{x \to y} = P_T(y) \sum_{x \in \Omega} p_{y \to x} = P_T(y),$$
i.e. $P_T$ is the equilibrium probability distribution of the transition matrix $P$.

43 Metropolis Algorithm
Practical aspect: choice of the transition matrix $Q$. The uniform change satisfies the hypotheses:
$$\forall x, y \in \Omega, \quad q_{x \to y} = \frac{1}{|\Omega|}.$$
However, this choice is not practical, since it amounts to attempting to draw a structured image at random from white noise proposals. A more practical transition matrix corresponds to the following procedure: one selects a pixel/vertex $\alpha \in V$ uniformly at random, and one draws uniformly a new value $\lambda \in \Lambda \setminus \{x_\alpha\}$ for this vertex. Hence,
$$q_{x \to y} = \begin{cases} \dfrac{1}{|V|\, (|\Lambda| - 1)} & \text{if } x \text{ and } y \text{ differ at exactly one vertex/pixel}, \\ 0 & \text{otherwise.} \end{cases}$$
One of the main problems of the Metropolis algorithm is that many proposed moves $x \to y$ are not accepted. The Gibbs sampler attempts to reduce this problem.
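
Putting the pieces together for the Ising model, here is a sketch of the single-site Metropolis sampler (for $\Lambda = \{-1, 1\}$ the single-site proposal is exactly a pixel flip; the energy-change formula is our own simplification of $E(I, J)$, and names and defaults are illustrative, not from the course):

```python
import numpy as np

def metropolis_ising(I, T, c=1.0, n_sweeps=100, rng=None):
    """Single-site Metropolis sampler for the Ising model p_T(J) given I.
    Illustrative and unoptimized."""
    if rng is None:
        rng = np.random.default_rng()
    M, N = I.shape
    J = rng.choice([-1.0, 1.0], size=(M, N))  # random initial state
    for _ in range(n_sweeps * M * N):
        k, l = rng.integers(M), rng.integers(N)
        # Sum of the 4-connectivity neighbors of (k, l):
        nb = sum(J[a, b] for a, b in ((k-1, l), (k+1, l), (k, l-1), (k, l+1))
                 if 0 <= a < M and 0 <= b < N)
        # Energy change of the proposed flip J[k, l] -> -J[k, l]:
        dE = 4 * c * I[k, l] * J[k, l] + 4 * J[k, l] * nb
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            J[k, l] = -J[k, l]  # accept the flip
    return J
```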

44 Gibbs Sampler Algorithm
Let $\Omega = \Lambda^V$ be the Gibbs state space. The Gibbs sampler selects one site $\alpha \in V$ at random and then picks a new value $\lambda$ for $x_\alpha$ with the probability distribution given by $P_T$ conditioned on the values of the state at all other vertices:
$$p_{x \to y} = \begin{cases} \dfrac{1}{|V|}\, P_T(X_\alpha = y_\alpha \mid X_{V \setminus \{\alpha\}} = x_{V \setminus \{\alpha\}}) & \text{if } y \text{ differs from } x \text{ only at } \alpha, \\[4pt] \sum_{\alpha \in V} \dfrac{1}{|V|}\, P_T(X_\alpha = x_\alpha \mid X_{V \setminus \{\alpha\}} = x_{V \setminus \{\alpha\}}) & \text{if } y = x, \\[4pt] 0 & \text{otherwise.} \end{cases}$$
The transition matrix $P$ satisfies the same properties as the one of the Metropolis algorithm (exercise).
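
For the Ising model, the conditional law at a pixel has a closed form, which gives the following sketch (our own derivation of the site conditional; names and defaults are illustrative):

```python
import numpy as np

def gibbs_ising(I, T, c=1.0, n_sweeps=100, rng=None):
    """Single-site Gibbs sampler for the Ising model p_T(J) given I:
    resample the visited pixel from its exact conditional distribution.
    Illustrative and unoptimized."""
    if rng is None:
        rng = np.random.default_rng()
    M, N = I.shape
    J = rng.choice([-1.0, 1.0], size=(M, N))
    for _ in range(n_sweeps * M * N):
        k, l = rng.integers(M), rng.integers(N)
        nb = sum(J[a, b] for a, b in ((k-1, l), (k+1, l), (k, l-1), (k, l+1))
                 if 0 <= a < M and 0 <= b < N)
        # Site energies of the two candidate values s = +1, -1
        # (all terms not involving (k, l) cancel in the conditional law):
        e = {s: c * (I[k, l] - s) ** 2 - 2 * s * nb for s in (1, -1)}
        p_plus = 1.0 / (1.0 + np.exp((e[1] - e[-1]) / T))
        J[k, l] = 1.0 if rng.random() < p_plus else -1.0
    return J
```

Every proposed value is accepted by construction, which is exactly how the Gibbs sampler avoids the rejected moves of the Metropolis algorithm.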

45 Bibliographic References
P. Brémaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues, Springer, 1998.
D. Mumford and A. Desolneux, Pattern Theory: The Stochastic Analysis of Real-World Signals, A K Peters Series, 2010.
