The coupling method - Simons Counting Complexity Bootcamp, 2016
1 The coupling method - Simons Counting Complexity Bootcamp, 2016. Nayantara Bhatnagar (University of Delaware), Ivona Bezáková (Rochester Institute of Technology). January 26, 2016.
2 Techniques for bounding the mixing time. Probabilistic techniques - coupling, martingales, strong stationary times, coupling from the past. Eigenvalues and eigenfunctions. Functional, isoperimetric and geometric inequalities - Cheeger's inequality, conductance, Poincaré and Nash inequalities, discrete curvature. (Levin-Peres-Wilmer 2009) and (Aldous-Fill 1999/2014) have comprehensive accounts.
3 In this part of the tutorial: Coupling distributions; Coupling Markov chains; Path Coupling; Exact sampling - coupling from the past.
4 Definition - Total variation distance. Let µ and ν be probability measures on the same measurable space (Ω, F). The total variation distance between µ and ν is given by ‖µ − ν‖_TV = sup_{A ∈ F} |µ(A) − ν(A)|. Here Ω is finite and F = 2^Ω, so ‖µ − ν‖_TV = max_{A ⊆ Ω} |µ(A) − ν(A)| = (1/2) Σ_{x ∈ Ω} |µ(x) − ν(x)|.
5 Definition - Coupling of distributions (Doeblin 1938). Let µ and ν be probability measures on the same measurable space (Ω, F). A coupling of µ and ν is a pair of random variables (X, Y) on the probability space (Ω × Ω, F ⊗ F, P) such that the marginals coincide: P(X ∈ A) = µ(A) and P(Y ∈ A) = ν(A) for all A ∈ F.
6 Example - Biased coins. µ: Bernoulli(p), probability p of heads (= 1). ν: Bernoulli(q), probability q > p of heads. H_µ(n) = no. of heads in n tosses of the p-coin; H_ν(n) = no. of heads in n tosses of the q-coin. Proposition 1. P(H_µ(n) > k) ≤ P(H_ν(n) > k).
7 Example - Biased coins. Proof of Proposition 1. For 1 ≤ i ≤ n, build independent couplings (X_i, Y_i) of µ and ν: let X_i ∼ µ; if X_i = 1, set Y_i = 1; if X_i = 0, set Y_i = 1 w.p. (q − p)/(1 − p) and 0 otherwise. Then Y_i ∼ ν and X_i ≤ Y_i always, so P(X_1 + ··· + X_n > k) ≤ P(Y_1 + ··· + Y_n > k).
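This coupling is easy to simulate. A minimal sketch (plain Python; the function name is mine) draws the pairs (X_i, Y_i) as in the proof and confirms that X_i ≤ Y_i on every toss while the marginals remain Bernoulli(p) and Bernoulli(q):

```python
import random

def coupled_tosses(n, p, q, rng):
    """Monotone coupling of n Bernoulli(p) and Bernoulli(q) tosses, q > p.
    Draw X ~ Bernoulli(p); if X = 1 force Y = 1, else promote Y to 1
    with probability (q - p)/(1 - p), so that Y ~ Bernoulli(q)."""
    xs, ys = [], []
    for _ in range(n):
        x = 1 if rng.random() < p else 0
        y = 1 if x == 1 or rng.random() < (q - p) / (1 - p) else 0
        xs.append(x)
        ys.append(y)
    return xs, ys

rng = random.Random(0)
xs, ys = coupled_tosses(100000, 0.3, 0.6, rng)
monotone = all(x <= y for x, y in zip(xs, ys))  # X_i <= Y_i in every toss
```

Since the sums are ordered pointwise, the event {X_1 + ··· + X_n > k} implies {Y_1 + ··· + Y_n > k}, which is exactly Proposition 1.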
8 Coupling and Total Variation Distance. Lemma 2. Let µ and ν be distributions on Ω. Then ‖µ − ν‖_TV ≤ inf_{couplings (X,Y)} P(X ≠ Y). Proof. For any coupling (X, Y) and A ⊆ Ω, µ(A) − ν(A) = P(X ∈ A) − P(Y ∈ A) ≤ P(X ∈ A, Y ∉ A) ≤ P(X ≠ Y). Similarly, ν(A) − µ(A) ≤ P(X ≠ Y). Therefore, ‖µ − ν‖_TV ≤ P(X ≠ Y).
9 Maximal coupling. Lemma 3. Let µ and ν be distributions on Ω. Then ‖µ − ν‖_TV = inf_{couplings (X,Y)} P(X ≠ Y).
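Lemma 3 is constructive: couple on the overlap min(µ, ν) of the two measures, and spread the leftover mass independently. A sketch for distributions on a finite set (plain Python; the helper names are mine):

```python
import random

def tv(mu, nu):
    """Total variation distance: (1/2) * sum over x of |mu(x) - nu(x)|."""
    keys = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in keys)

def draw(weights, rng):
    """Sample a key with probability proportional to its weight."""
    items = sorted(weights.items())
    r = rng.random() * sum(w for _, w in items)
    for x, w in items:
        r -= w
        if r <= 0:
            return x
    return items[-1][0]

def maximal_coupling(mu, nu, rng):
    """One draw of (X, Y) with the right marginals and P(X != Y) = ||mu - nu||_TV.
    With probability 1 - d, sample from the overlap min(mu, nu) and set X = Y;
    otherwise sample X and Y from the normalized excess masses."""
    keys = set(mu) | set(nu)
    overlap = {x: min(mu.get(x, 0.0), nu.get(x, 0.0)) for x in keys}
    d = 1.0 - sum(overlap.values())  # equals tv(mu, nu)
    if rng.random() >= d:
        x = draw(overlap, rng)
        return x, x
    ex_mu = {x: mu.get(x, 0.0) - overlap[x] for x in keys}
    ex_nu = {x: nu.get(x, 0.0) - overlap[x] for x in keys}
    return draw(ex_mu, rng), draw(ex_nu, rng)
```

The excess masses have disjoint supports, so on that branch X ≠ Y always, and P(X ≠ Y) = d = ‖µ − ν‖_TV, matching the picture of the two overlapping densities.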
10 Coupling Markov chains (Pitman 1974, Griffeath 1974/5, Aldous 1983). A coupling of Markov chains with transition matrix P is a process (X_t, Y_t)_{t≥0} such that both (X_t) and (Y_t) are Markov chains with transition matrix P. Applied to bounding the rate of convergence to stationarity of MCs for sampling tilings of lattice regions, particle processes, card shuffling, random walks on lattices and other natural graphs, Ising and Potts models, colorings, independent sets... (Aldous-Diaconis 1986, Bayer-Diaconis 1992, Bubley-Dyer 1997, Luby-Vigoda 1999, Luby-Randall-Sinclair 2001, Vigoda 2001, Wilson 2004, ...)
11 Simple random walk on {0, 1, ..., n}. SRW: Move up or down with probability 1/2 if possible. Do nothing if the attempted move leaves the interval. Claim 4. If 0 ≤ x ≤ y ≤ n, then P^t(y, 0) ≤ P^t(x, 0).
12 Simple random walk on {0, 1, ..., n}. A coupling (X_t, Y_t) of P^t(x, ·) and P^t(y, ·): X_0 = x, Y_0 = y. Let b_1, b_2, ... be i.i.d. uniform {±1}-valued (Bernoulli(1/2)). At the ith step, attempt to add b_i to both X_{i−1} and Y_{i−1}.
13 Simple random walk on {0, 1, ..., n}. For all t, X_t ≤ Y_t. Therefore, P^t(y, 0) = P(Y_t = 0) ≤ P(X_t = 0) = P^t(x, 0).
14 Simple random walk on {0, 1, ..., n}. Note: Any coupling can be modified so that the chains stay together after the first time they meet. We'll assume this to be the case from now on.
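The monotone coupling above is one line of code per step: feed both walks the same increment and censor moves that leave the interval. A sketch (plain Python, my function names) checking that the order X_t ≤ Y_t is preserved and that the gap never grows:

```python
import random

def coupled_srw(n, x0, y0, steps, rng):
    """Run the coupled SRWs on {0, ..., n} from x0 <= y0: both walks try to
    add the same increment b in {-1, +1}; moves outside the interval are
    suppressed (the walk holds in place). Returns the two trajectories."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        b = 1 if rng.random() < 0.5 else -1
        xs.append(min(max(xs[-1] + b, 0), n))
        ys.append(min(max(ys[-1] + b, 0), n))
    return xs, ys

rng = random.Random(42)
xs, ys = coupled_srw(10, 2, 7, 500, rng)
order_ok = all(x <= y for x, y in zip(xs, ys))
```

The gap Y_t − X_t shrinks exactly when one walk is censored at a boundary and the other is not, which is what drives the chains together.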
15 Distance to stationarity and the mixing time. Define d(t) := max_{x ∈ Ω} ‖P^t(x, ·) − π‖_TV and d̄(t) := max_{x,y ∈ Ω} ‖P^t(x, ·) − P^t(y, ·)‖_TV. By the triangle inequality, d(t) ≤ d̄(t) ≤ 2d(t). The mixing time is t_mix(ε) := min{t : d(t) ≤ ε}. It's standard to work with t_mix := t_mix(1/4) (any constant ε < 1/2 would do) because t_mix(ε) ≤ ⌈log_2 ε^{−1}⌉ t_mix.
16 Bound on the mixing time. Theorem 5. Let (X_t, Y_t) be a coupling with X_0 = x and Y_0 = y. Let τ be the first time they meet (and thereafter, coincide). Then ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P_{x,y}(τ > t). Proof. By Lemma 2, ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P(X_t ≠ Y_t) = P_{x,y}(τ > t).
17 Bound on the mixing time. Corollary 6. t_mix ≤ 4 max_{x,y} E_{x,y}(τ). Proof. By Markov's inequality, d(t) ≤ d̄(t) ≤ max_{x,y} P_{x,y}(τ > t) ≤ max_{x,y} E_{x,y}(τ)/t, which is at most 1/4 when t = 4 max_{x,y} E_{x,y}(τ).
18 Example - Card shuffling by random transpositions. Shuffling MC on Ω = S_n: Choose a card X_t and an independent position Y_t uniformly. Exchange X_t with σ_t(Y_t) (the card at position Y_t). The stationary distribution is uniform over the permutations S_n.
19 Example - Card shuffling by random transpositions. Coupling of σ_t, σ'_t: Choose a card X_t and an independent position Y_t uniformly. Use the same pair (X_t, Y_t) to update both σ_t and σ'_t. Let M_t = number of cards at the same position in σ and σ'.
20 Case 1: X_t in the same position in both decks. M_{t+1} = M_t.
21 Case 2: X_t in different positions, σ(Y_t) = σ'(Y_t). M_{t+1} = M_t.
22 Case 3: X_t in different positions, σ(Y_t) ≠ σ'(Y_t). M_{t+1} > M_t (M_{t+1} = M_t + 1, M_t + 2 or M_t + 3).
23 Example - Card shuffling by random transpositions. Proposition 7 (Broder). Let τ be the first time M_t = n. For any x, y, E_{x,y}(τ) < (π²/6) n², so t_mix = O(n²). Proof. Let τ_i = number of steps to increase M_t from i − 1 to i, so τ = τ_1 + τ_2 + ··· + τ_n. P(M_{t+1} > M_t | M_t = i) = (n − i)²/n², so E(τ_{i+1} | M_t = i) = n²/(n − i)². Therefore, for any x, y, E_{x,y}(τ) ≤ n² Σ_{i=0}^{n−1} 1/(n − i)² < (π²/6) n².
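The coupling is easy to run: apply the same (card, position) pair to both decks and count the steps until they agree. A simulation sketch (plain Python; the function name is mine), compared against the bound of Proposition 7:

```python
import math
import random

def transposition_coupling_time(n, rng):
    """Couple two decks: each step picks a card and a position uniformly
    and, in both decks, swaps the chosen card with the card at the chosen
    position. Returns the number of steps until the decks coincide."""
    deck1 = list(range(n))
    deck2 = list(range(n))
    rng.shuffle(deck2)
    t = 0
    while deck1 != deck2:
        card = rng.randrange(n)           # card X_t
        pos = rng.randrange(n)            # position Y_t
        for deck in (deck1, deck2):
            i = deck.index(card)          # current position of the card
            deck[i], deck[pos] = deck[pos], deck[i]
        t += 1
    return t

rng = random.Random(7)
n = 10
times = [transposition_coupling_time(n, rng) for _ in range(200)]
bound = (math.pi ** 2 / 6) * n ** 2       # Proposition 7: E(tau) < (pi^2/6) n^2
```

Since M_t is nondecreasing under this coupling and reaches n almost surely, each run terminates; the empirical mean should sit comfortably under the (loose) π²n²/6 bound.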
24 In this part of the tutorial: Coupling distributions; Coupling Markov chains - Coloring MC; Path Coupling; Exact sampling - coupling from the past.
25 A few remarks on yesterday's talk. Shuffling by random transpositions has t_mix = O(n log n). Strong stationary times - random times at which the MC is guaranteed to be at stationarity. Maximal/optimal coupling: Lemma 3. Let µ and ν be distributions on Ω. Then ‖µ − ν‖_TV = inf_{couplings (X,Y)} P(X ≠ Y).
26 Sampling Colorings. Graph G = (V, E). Ω = set of proper q-colorings of G. Metropolis MC: Select v ∈ V and k ∈ [q] uniformly. If color k is allowed at v (no neighbor of v has color k), recolor v with k. Stationary distribution: uniform over the colorings Ω (for q > Δ + 1).
29 A coupling for colorings. Recall, Theorem 8. Let (X_t, Y_t) be a coupling with X_0 = x and Y_0 = y. Let τ be the first time they meet (and thereafter, coincide). Then d̄(t) ≤ max_{x,y} P_{x,y}(τ > t). Theorem 9. Let G have max. degree Δ and let q > 4Δ. Then t_mix(ε) ≤ ⌈(1 − 4Δ/q)^{−1} n(log n + log(ε^{−1}))⌉. Coupling: Generate a vertex-color pair (v, k) uniformly. Update both colorings X_t and Y_t by recoloring v with k whenever it is allowed.
30 Case analysis. For colorings X_t, Y_t, let D_t := |{v : X_t(v) ≠ Y_t(v)}|.
31 D_{t+1} = D_t − 1 w.p. at least D_t(q − 2Δ)/(qn); D_{t+1} = D_t + 1 w.p. at most 2ΔD_t/(qn).
32 Hence E(D_{t+1} | X_t, Y_t) ≤ D_t − D_t(q − 2Δ)/(qn) + 2ΔD_t/(qn) = D_t (1 − (q − 4Δ)/(qn)).
33 Iterating, E(D_t | X_0, Y_0) ≤ D_0 (1 − (q − 4Δ)/(qn))^t ≤ n (1 − (q − 4Δ)/(qn))^t.
34 Mixing time. We've shown E(D_t | X_0 = x, Y_0 = y) ≤ n (1 − (q − 4Δ)/(qn))^t ≤ n exp(−(q − 4Δ)t/(qn)). By Theorem 8, d̄(t) ≤ max_{x,y} P_{x,y}(τ > t) = max_{x,y} P(D_t ≥ 1 | X_0 = x, Y_0 = y) ≤ max_{x,y} E(D_t | X_0 = x, Y_0 = y). Therefore, t_mix(ε) ≤ ⌈(1 − 4Δ/q)^{−1} n(log n + log(ε^{−1}))⌉.
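The identity coupling of Theorem 9 can be run directly. A small sketch (plain Python; the graph and parameters are my own toy choice: the cycle C_8, so Δ = 2, with q = 9 > 4Δ) that feeds both chains the same (vertex, color) proposals until the disagreement count D_t hits 0:

```python
import random

def coupled_step(adj, q, X, Y, rng):
    """One step of the identity coupling: the same proposal (v, k) is
    applied to both colorings, accepted in each chain iff no neighbor
    of v already has color k in that chain."""
    v = rng.randrange(len(adj))
    k = rng.randrange(q)
    for coloring in (X, Y):
        if all(coloring[u] != k for u in adj[v]):
            coloring[v] = k

n, q = 8, 9                      # cycle C_8: max degree Delta = 2, q > 4*Delta
adj = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
X = [v % 2 for v in range(n)]    # two different proper colorings
Y = [v % 2 + 2 for v in range(n)]
rng = random.Random(5)
t = 0
while X != Y and t < 100000:
    coupled_step(adj, q, X, Y, rng)
    t += 1
```

Since the contraction rate here is (q − 4Δ)/(qn) = 1/72 per step, E(D_t) decays fast and the chains coalesce well within the step cap.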
35 Fast mixing for q > 2Δ (Jerrum 1995, Salas-Sokal 1997). Theorem 10. Let G have max. degree Δ. If q > 2Δ, the mixing time of the Metropolis chain on colorings satisfies t_mix(ε) ≤ ⌈(q/(q − 2Δ)) n(log n + log(ε^{−1}))⌉. Two ideas: use a path metric on Ω so that only colorings with a single disagreement need to be coupled, simplifying the proof, and couple colorings with just one difference in a smarter way.
36 Contraction in D_t. Recall, we showed contraction in one step: for some α > 0, E(D_{t+1} | X_t, Y_t) ≤ D_t e^{−α}. Then d̄(t) ≤ n e^{−αt} and t_mix(ε) ≤ (log n + log ε^{−1})/α.
39 Path Coupling (Bubley-Dyer 1997). Connected graph (Ω, E_0) on the state space Ω. Length function l : E_0 → [1, ∞). A path from x_0 to x_r is ξ = (x_0, x_1, ..., x_r) with (x_{i−1}, x_i) ∈ E_0. The length of the path ξ is l(ξ) := Σ_{i=1}^r l((x_{i−1}, x_i)). The path metric on Ω is ρ(x, y) = min{l(ξ) : ξ is a path between x and y}.
40 Path Coupling (Bubley-Dyer 1997). Theorem 11 (Bubley-Dyer 97). If for each (x, y) ∈ E_0 there is a coupling (X_1, Y_1) of P(x, ·) and P(y, ·) such that, for some α > 0, E_{x,y}(ρ(X_1, Y_1)) ≤ e^{−α} ρ(x, y), then t_mix(ε) ≤ ⌈(log(diam(Ω)) + log(ε^{−1}))/α⌉, where diam(Ω) = max_{x,y} ρ(x, y).
41 Metropolis chain on extended space. Graph G = (V, E). Ω = set of all colorings of G (including improper ones). Metropolis MC: Select v ∈ V and k ∈ [q] uniformly. If k is allowed at v, update. Stationary distribution: uniform over the proper colorings.
44 Path coupling for colorings. Let x ∼ y in (Ω, E_0) if the colorings differ at exactly 1 vertex. For (x, y) ∈ E_0, let l(x, y) = 1.
47 Coupling. Let v be the disagreeing vertex in colorings x and y. Pick a vertex u and a color k uniformly at random. If u ∉ N(v), attempt to update u with k in both chains. If u ∈ N(v): if k ∉ {x(v), y(v)}, attempt to update u with k in both chains; otherwise, attempt to update u in x with k and in y with the color in {x(v), y(v)} \ {k}. The disagreement at v is removed w.p. at least (q − Δ)/(qn), and each neighbor of v can create a new disagreement w.p. at most 1/(qn), so E_{x,y}(ρ(X_1, Y_1)) ≤ 1 − (q − Δ)/(qn) + Δ/(qn) = 1 − (q − 2Δ)/(qn).
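The one-step contraction can be checked empirically. A sketch (plain Python; the function name and the toy instance, a 6-cycle with Δ = 2 and q = 7 > 2Δ, are mine) that repeats one coupled step from a fixed adjacent pair and compares the average distance with the bound 1 − (q − 2Δ)/(qn):

```python
import random

def smart_coupled_step(adj, q, x, y, v, rng):
    """One step of the path coupling for colorings x, y differing only at
    vertex v. Proposals at a neighbor of v with k in {x[v], y[v]} are
    cross-matched between the chains; all other proposals are shared.
    Returns the pair of new colorings."""
    x, y = x[:], y[:]
    u = rng.randrange(len(adj))
    k = rng.randrange(q)
    if u in adj[v] and k in (x[v], y[v]):
        kx = k
        ky = x[v] if k == y[v] else y[v]      # the other color of the pair
    else:
        kx = ky = k
    if all(x[w] != kx for w in adj[u]):       # Metropolis: accept if allowed
        x[u] = kx
    if all(y[w] != ky for w in adj[u]):
        y[u] = ky
    return x, y

n, q = 6, 7                                  # cycle C_6: Delta = 2, q > 2*Delta
adj = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
x = [v % 2 for v in range(n)]
y = x[:]
y[0] = 2                                     # adjacent pair: one disagreement
rng = random.Random(9)
trials = 40000
total = 0
for _ in range(trials):
    x1, y1 = smart_coupled_step(adj, q, x, y, 0, rng)
    total += sum(1 for a, b in zip(x1, y1) if a != b)
mean_dist = total / trials
contraction = 1 - (q - 2 * 2) / (q * n)      # 1 - (q - 2*Delta)/(qn)
```

The cross-matching is what buys the improvement: of the two colors {x(v), y(v)} proposed at a neighbor u, one pair of attempts is rejected in both chains, so only a single unlucky proposal per neighbor can create a new disagreement.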
48 Sampling colorings. Conjecture: O(n log n) mixing for q ≥ Δ + 2. (Vigoda 1999) O(n² log n) mixing for q > (11/6)Δ. (Hayes-Sinclair 2005) Ω(n log n) lower bound. Further progress in restricted cases - triangle-free graphs, large max. degree - and via non-Markovian couplings. (Dyer, Flaxman, Frieze, Hayes, Molloy, Vigoda)
49 Exact Sampling - Coupling from the past (Propp-Wilson 1996). Suppose two copies of the chain are run forward until they meet. When the chains meet, are they at stationarity? No: in a two-state example with π(a) = 1/3, π(b) = 2/3, the chains never meet at a, so the meeting state is not distributed as π. Coupling from the past (CFTP): if instead we run the chain from a time far enough in the past that all trajectories have met by time 0, the state at time 0 is guaranteed to be at stationarity.
50 Random function representation of a MC. Ergodic MC with transition matrix P. A distribution F over functions f : Ω → Ω is a random function representation (RFR) if P_F(f(x) = y) = P(x, y). E.g., for the SRW on {0, 1, ..., n}, let f₊(i) = min{i + 1, n} and f₋(i) = max{i − 1, 0}, with F uniform on {f₊, f₋}. Proposition 12. Every transition matrix has an RFR, not necessarily unique. F defines a coupling on Ω × Ω via (x, y) → (f(x), f(y)) with f ∼ F.
51 Simulating forwards and backwards in time. Associate a random function f_t ∼ F to each time t ∈ {−∞, ..., ∞}. Forward simulation of the chain from x for t steps: F_0^t(x) = f_{t−1} ∘ ··· ∘ f_0(x), so that P^t(x, y) = P(F_0^t(x) = y). Backward simulation of the chain from x for t steps: F_{−t}^0(x) = f_{−1} ∘ ··· ∘ f_{−t}(x). Forward: let S be a time such that |F_0^S(Ω)| = 1; the chain need not be stationary at time S. CFTP: let S be such that |F_{−S}^0(Ω)| = 1. Then the common value is exactly stationary!
52 Why does CFTP work? lim_{t→∞} P(F_0^t(x) = y) = π(y) = lim_{t→∞} P(F_{−t}^0(x) = y). Let S be such that |F_{−S}^0(Ω)| = 1 and let t > S. Then F_{−t}^0(x) = f_{−1} ∘ ··· ∘ f_{−t}(x) = F_{−S}^0(f_{−S−1} ∘ ··· ∘ f_{−t}(x)) = F_{−S}^0(x), since F_{−S}^0 is constant on Ω. Since t > S was arbitrary, F_{−S}^0(x) has the same distribution as the limit, namely π.
53 Implementing CFTP. Issues/Benefits: Choosing F so that E(S) is small. When Ω is large, detecting when |F_{−S}^0(Ω)| = 1. Exact samples even in the absence of bounds on the mixing time. Monotone CFTP: a partial order on states respected by the coupling, with a maximal state x_max and minimal state x_min. Run CFTP from x_max and x_min; all trajectories will have coalesced when these two do. Examples: Ising model, dimer configurations of the hexagonal grid, bipartite independent set.
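For the SRW on {0, ..., n} the partial order is the usual one, with x_min = 0 and x_max = n, and the RFR {f₊, f₋} from slide 50 is monotone. A sketch of monotone CFTP (plain Python; the names are mine), doubling the lookback window and, crucially, reusing the coin flips already assigned to times −1, −2, ...:

```python
import random

def monotone_cftp(n, rng):
    """Monotone CFTP for the SRW on {0, ..., n} with holding at the
    boundaries. The coin flip at time -(j+1) selects f_plus(i) = min(i+1, n)
    or f_minus(i) = max(i-1, 0). Double the window until the trajectories
    from 0 and n coalesce at time 0; return the common value."""
    flips = []                           # flips[j] drives the map at time -(j+1)
    T = 1
    while True:
        while len(flips) < T:
            flips.append(rng.random() < 0.5)
        lo, hi = 0, n                    # minimal and maximal starting states
        for up in reversed(flips[:T]):   # apply the maps at times -T, ..., -1
            if up:
                lo, hi = min(lo + 1, n), min(hi + 1, n)
            else:
                lo, hi = max(lo - 1, 0), max(hi - 1, 0)
        if lo == hi:
            return lo                    # exact sample from pi
        T *= 2
```

This walk is doubly stochastic, so π is uniform on {0, ..., n}; a frequency check over many CFTP draws confirms the samples are exact.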
54 Strong stationary times. A random variable τ is a stopping time for (X_t) if 1_{τ=t} is a function only of (X_0, ..., X_t). A strong stationary time (SST) for (X_t) with stationary dist. π is a randomized stopping time such that P_x(τ = t, X_τ = y) = P_x(τ = t)π(y). Theorem 13. If τ is an SST, then d(t) ≤ max_x P_x(τ > t).
55 Broder stopping time for random shuffling. MC on S_n: Choose cards L_t and R_t u.a.r. and transpose them if different. Stopping time: Mark R_t if both R_t is unmarked, and either L_t is marked or L_t = R_t. The time τ for all cards to be marked is an SST. τ = τ_0 + ··· + τ_{n−1}, where τ_k = number of transpositions after the kth card is marked, up to and including when the (k + 1)st card is marked. τ_k ∼ Geom((k + 1)(n − k)/n²).
56 Coupon collector estimate. Can also calculate n²/((k + 1)(n − k)) = (n²/(n + 1)) (1/(k + 1) + 1/(n − k)), so E(τ) = Σ_{k=0}^{n−1} E(τ_k) = 2n(log n + O(1)) and Var(τ) = O(n²). Let t_0 = E(τ) + 2√Var(τ). By Chebyshev, P(τ > t_0) ≤ 1/4. Hence t_mix ≤ (2 + o(1)) n log n.
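The partial-fraction step uses (k + 1) + (n − k) = n + 1, so the sum telescopes into harmonic numbers: E(τ) = (n²/(n + 1)) · 2H_n = 2n(log n + O(1)). The identity and the asymptotics can be checked numerically (plain Python; the function names are mine):

```python
import math

def broder_expectation(n):
    """E(tau) = sum_{k=0}^{n-1} n^2 / ((k+1)(n-k)) for the Broder SST."""
    return sum(n**2 / ((k + 1) * (n - k)) for k in range(n))

def closed_form(n):
    """The same sum via partial fractions: (n^2/(n+1)) * 2 * H_n."""
    H = sum(1.0 / j for j in range(1, n + 1))
    return (n**2 / (n + 1)) * 2 * H

n = 1000
direct = broder_expectation(n)
closed = closed_form(n)
ratio = direct / (2 * n * math.log(n))   # tends to 1 as n grows
```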
More informationLecture 17 Brownian motion as a Markov process
Lecture 17: Brownian motion as a Markov process 1 of 14 Course: Theory of Probability II Term: Spring 2015 Instructor: Gordan Zitkovic Lecture 17 Brownian motion as a Markov process Brownian motion is
More informationModel Counting for Logical Theories
Model Counting for Logical Theories Wednesday Dmitry Chistikov Rayna Dimitrova Department of Computer Science University of Oxford, UK Max Planck Institute for Software Systems (MPI-SWS) Kaiserslautern
More informationLecture 19: November 10
CS294 Markov Chain Monte Carlo: Foundations & Applications Fall 2009 Lecture 19: November 10 Lecturer: Prof. Alistair Sinclair Scribes: Kevin Dick and Tanya Gordeeva Disclaimer: These notes have not been
More informationCUTOFF FOR THE ISING MODEL ON THE LATTICE
CUTOFF FOR THE ISING MODEL ON THE LATTICE EYAL LUBETZKY AND ALLAN SLY Abstract. Introduced in 1963, Glauber dynamics is one of the most practiced and extensively studied methods for sampling the Ising
More informationarxiv: v2 [math.pr] 26 Aug 2017
CONSTRAINED PERCOLATION, ISING MODEL AND XOR ISING MODEL ON PLANAR LATTICES ZHONGYANG LI arxiv:1707.04183v2 [math.pr] 26 Aug 2017 Abstract. We study constrained percolation models on planar lattices including
More informationEYAL LUBETZKY AND ALLAN SLY
N EXPOSITION TO INFORMTION PERCOLTION FOR THE ISING MODEL EYL LUBETZKY ND LLN SLY bstract. Information percolation is a new method for analyzing stochastic spin systems through classifying and controlling
More informationINFORMATION PERCOLATION AND CUTOFF FOR THE STOCHASTIC ISING MODEL
INFORMATION PERCOLATION AND CUTOFF FOR THE STOCHASTIC ISING MODEL EYAL LUBETZKY AND ALLAN SLY Abstract. We introduce a new framework for analyzing Glauber dynamics for the Ising model. The traditional
More informationcurvature, mixing, and entropic interpolation Simons Feb-2016 and CSE 599s Lecture 13
curvature, mixing, and entropic interpolation Simons Feb-2016 and CSE 599s Lecture 13 James R. Lee University of Washington Joint with Ronen Eldan (Weizmann) and Joseph Lehec (Paris-Dauphine) Markov chain
More informationMarkov Chains and Mixing Times
Markov Chains and Mixing Times David A. Levin Yuval Peres Elizabeth L. Wilmer with a chapter on coupling from the past by James G. Propp and David B. Wilson DRAFT, version of September 15, 2007. 100 80
More informationn! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2
Order statistics Ex. 4. (*. Let independent variables X,..., X n have U(0, distribution. Show that for every x (0,, we have P ( X ( < x and P ( X (n > x as n. Ex. 4.2 (**. By using induction or otherwise,
More informationAn Introduction to Randomized algorithms
An Introduction to Randomized algorithms C.R. Subramanian The Institute of Mathematical Sciences, Chennai. Expository talk presented at the Research Promotion Workshop on Introduction to Geometric and
More informationSimons Workshop on Approximate Counting, Markov Chains and Phase Transitions: Open Problem Session
Simons Workshop on Approximate Counting, Markov Chains and Phase Transitions: Open Problem Session Scribes: Antonio Blanca, Sarah Cannon, Yumeng Zhang February 4th, 06 Yuval Peres: Simple Random Walk on
More informationNon-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings. Krzysztof Burdzy University of Washington
Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings Krzysztof Burdzy University of Washington 1 Review See B and Kendall (2000) for more details. See also the unpublished
More informationLect4: Exact Sampling Techniques and MCMC Convergence Analysis
Lect4: Exact Sampling Techniques and MCMC Convergence Analysis. Exact sampling. Convergence analysis of MCMC. First-hit time analysis for MCMC--ways to analyze the proposals. Outline of the Module Definitions
More informationG(t) := i. G(t) = 1 + e λut (1) u=2
Note: a conjectured compactification of some finite reversible MCs There are two established theories which concern different aspects of the behavior of finite state Markov chains as the size of the state
More informationLecture 2: September 8
CS294 Markov Chain Monte Carlo: Foundations & Applications Fall 2009 Lecture 2: September 8 Lecturer: Prof. Alistair Sinclair Scribes: Anand Bhaskar and Anindya De Disclaimer: These notes have not been
More informationProblem Points S C O R E Total: 120
PSTAT 160 A Final Exam Solution December 10, 2015 Name Student ID # Problem Points S C O R E 1 10 2 10 3 10 4 10 5 10 6 10 7 10 8 10 9 10 10 10 11 10 12 10 Total: 120 1. (10 points) Take a Markov chain
More informationSampling Adsorbing Staircase Walks Using a New Markov Chain Decomposition Method
66 66 77 D Sampling Adsorbing Staircase Walks Using a New Markov hain Decomposition Method Russell A Martin School of Mathematics Georgia Institute of Technology martin@mathgatechedu Dana Randall ollege
More informationLecture 9: Counting Matchings
Counting and Sampling Fall 207 Lecture 9: Counting Matchings Lecturer: Shayan Oveis Gharan October 20 Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
More informationINTRODUCTION TO MARKOV CHAIN MONTE CARLO
INTRODUCTION TO MARKOV CHAIN MONTE CARLO 1. Introduction: MCMC In its simplest incarnation, the Monte Carlo method is nothing more than a computerbased exploitation of the Law of Large Numbers to estimate
More informationA NOTE ON THE POINCARÉ AND CHEEGER INEQUALITIES FOR SIMPLE RANDOM WALK ON A CONNECTED GRAPH. John Pike
A NOTE ON THE POINCARÉ AND CHEEGER INEQUALITIES FOR SIMPLE RANDOM WALK ON A CONNECTED GRAPH John Pike Department of Mathematics University of Southern California jpike@usc.edu Abstract. In 1991, Persi
More information1 Stat 605. Homework I. Due Feb. 1, 2011
The first part is homework which you need to turn in. The second part is exercises that will not be graded, but you need to turn it in together with the take-home final exam. 1 Stat 605. Homework I. Due
More informationDefinition A finite Markov chain is a memoryless homogeneous discrete stochastic process with a finite number of states.
Chapter 8 Finite Markov Chains A discrete system is characterized by a set V of states and transitions between the states. V is referred to as the state space. We think of the transitions as occurring
More informationRandom Walks on Graphs. One Concrete Example of a random walk Motivation applications
Random Walks on Graphs Outline One Concrete Example of a random walk Motivation applications shuffling cards universal traverse sequence self stabilizing token management scheme random sampling enumeration
More informationMONOTONE COUPLING AND THE ISING MODEL
MONOTONE COUPLING AND THE ISING MODEL 1. PERFECT MATCHING IN BIPARTITE GRAPHS Definition 1. A bipartite graph is a graph G = (V, E) whose vertex set V can be partitioned into two disjoint set V I, V O
More informationConjectures and Questions in Graph Reconfiguration
Conjectures and Questions in Graph Reconfiguration Ruth Haas, Smith College Joint Mathematics Meetings January 2014 The Reconfiguration Problem Can one feasible solution to a problem can be transformed
More informationTheorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( )
Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr Pr = Pr Pr Pr() Pr Pr. We are given three coins and are told that two of the coins are fair and the
More informationMixing Times and Hitting Times
Mixing Times and Hitting Times David Aldous January 12, 2010 Old Results and Old Puzzles Levin-Peres-Wilmer give some history of the emergence of this Mixing Times topic in the early 1980s, and we nowadays
More informationAssignment 2 : Probabilistic Methods
Assignment 2 : Probabilistic Methods jverstra@math Question 1. Let d N, c R +, and let G n be an n-vertex d-regular graph. Suppose that vertices of G n are selected independently with probability p, where
More informationNEW FRONTIERS IN APPLIED PROBABILITY
J. Appl. Prob. Spec. Vol. 48A, 209 213 (2011) Applied Probability Trust 2011 NEW FRONTIERS IN APPLIED PROBABILITY A Festschrift for SØREN ASMUSSEN Edited by P. GLYNN, T. MIKOSCH and T. ROLSKI Part 4. Simulation
More informationLecture 5: Counting independent sets up to the tree threshold
CS 7535: Markov Chain Monte Carlo Algorithms Fall 2014 Lecture 5: Counting independent sets up to the tree threshold Lecturer: Richard Brooks and Rakshit Trivedi Date: December 02 In this lecture, we will
More information