University of Chicago Autumn 2003 CS37101-1 Markov Chain Monte Carlo Methods

Lecture 7: November 11, 2003
Estimating the permanent
Eric Vigoda

We refer the reader to Jerrum's book [1] for the analysis of a Markov chain for generating a random matching of an arbitrary graph. Here we'll look at how to extend the argument to sample perfect matchings in dense graphs and in arbitrary bipartite graphs. The result for dense graphs is due to Jerrum and Sinclair [2]; the extension to arbitrary bipartite graphs is due to Jerrum, Sinclair, and Vigoda [3].

Dense graphs

We'll need some definitions and notation first:

Definition 7.1 A graph $G = (V, E)$ is said to be dense if for every $v \in V$, $\deg(v) > n/2$, where $n = |V|$.

Definition 7.2 Let $G$ be a graph and let $\mathcal{M}_G$ be the set of all perfect matchings of $G$. Similarly, for two distinct vertices $u, v$, let $\mathcal{N}_G(u, v)$ be the set of all perfect matchings of the graph $G \setminus \{u, v\}$. We refer to matchings in $\mathcal{N}_G(u, v)$ as near-perfect matchings, or matchings with holes $u$ and $v$. Whenever $G$ is clear from the context, we shall write simply $\mathcal{M}$ and $\mathcal{N}(u, v)$.

Jerrum's book presents the analysis of a Markov chain for sampling matchings (not necessarily perfect). The Markov chain for sampling perfect matchings in dense graphs is only a slight modification of that chain. The state space $\Omega$ is the set of perfect and near-perfect matchings, i.e.,
$$\Omega = \mathcal{M} \cup \bigcup_{u, v \in V} \mathcal{N}(u, v).$$
The near-perfect matchings are needed to navigate between perfect matchings. For $X_t \in \Omega$:

1. Choose $e = (u, v)$ uniformly at random from $E$.
2. If $u, v$ are both unmatched, let $X' = X_t \cup e$. If $e \in X_t$, let $X' = X_t \setminus e$. If $u$ is unmatched and $(v, w) \in X_t$, let $X' = X_t \cup e \setminus (v, w)$ (and symmetrically if $v$ is unmatched).
3. If $X' \in \Omega$, with probability $1/2$ set $X_{t+1} = X'$; otherwise set $X_{t+1} = X_t$.
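The transition rule above is mechanical enough to state directly in code. Below is a minimal Python sketch of a single step of the chain, assuming a matching is stored as a set of frozenset edges; the names chain_step and in_omega are illustrative, not from the original notes.

```python
import random

def chain_step(X, edges, in_omega):
    """One step of the perfect/near-perfect matching chain (a sketch).

    X        : current matching, a set of frozenset({u, v}) edges
    edges    : list of (u, v) tuples, the edge set E
    in_omega : predicate testing membership in Omega (at most one hole pair)
    """
    u, v = random.choice(edges)          # 1. choose e = (u, v) uniformly from E
    e = frozenset((u, v))
    matched = {x for f in X for x in f}  # vertices covered by X
    X_new = None
    if u not in matched and v not in matched:
        X_new = X | {e}                  # 2a. both endpoints unmatched: add e
    elif e in X:
        X_new = X - {e}                  # 2b. e is in the matching: remove e
    elif u not in matched:               # 2c. u unmatched, v matched: shift
        f = next(g for g in X if v in g)
        X_new = (X - {f}) | {e}
    elif v not in matched:               # 2c'. symmetric case
        f = next(g for g in X if u in g)
        X_new = (X - {f}) | {e}
    # 3. if the proposal is valid, accept it with probability 1/2 (lazy chain)
    if X_new is not None and in_omega(X_new) and random.random() < 0.5:
        return X_new
    return X
```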

The chain is symmetric, thus its stationary distribution $\pi$ is uniform over $\Omega$. For this Markov chain to sample perfect matchings efficiently (in polynomial time), we need a polynomial bound on the ratio of near-perfect matchings to perfect matchings.

Lemma 7.3 For a dense graph, $|\Omega| \le n^2 |\mathcal{M}|$. Thus, $\pi(\mathcal{M}) \ge 1/n^2$.

Proof: We shall prove that $|\mathcal{M}| \ge |\mathcal{N}(u, v)|$ for every $u, v$. We do this by defining an injective map $f_{u,v} : \mathcal{N}(u, v) \to \mathcal{M}$.

Case 1: If $u$ and $v$ are adjacent, then $f_{u,v}(M) = M \cup \{(u, v)\}$ for $M \in \mathcal{N}(u, v)$. Clearly, this map is injective.

Case 2: If $u$ and $v$ are not adjacent, then for $M \in \mathcal{N}(u, v)$ we claim there exists an edge $\{w, z\} \in M$ such that $w \in N(v)$ and $z \in N(u)$. (The set $N(u)$ denotes the neighbors of vertex $u$.) This gives an augmenting path of length three which transforms $M$ into a perfect matching. To prove the claim, let $S(v) = \{z : \{w, z\} \in M,\ w \in N(v)\}$ be the set of $M$-partners of the neighbors of $v$. Since $v$ is unmatched, every vertex in $N(v)$ is matched, and therefore $|S(v)| = |N(v)| > n/2$. Since we also know $|N(u)| > n/2$, we have $S(v) \cap N(u) \neq \emptyset$. Let $z \in S(v) \cap N(u)$ with $\{w, z\} \in M$ and $w \in N(v)$, and let $f_{u,v}(M) = M \cup \{v, w\} \cup \{z, u\} \setminus \{w, z\}$. Again, $f_{u,v}$ is injective.
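For concreteness, here is a small Python sketch of the map $f_{u,v}$ from the proof, assuming the matching is a set of frozenset edges and adjacency is given as a dict of neighbor sets; all names are illustrative.

```python
def f_uv(M, u, v, adj):
    """Map a near-perfect matching M with holes u, v to a perfect matching,
    a sketch of the injective map f_{u,v} from Lemma 7.3.

    M   : set of frozenset({x, y}) edges
    adj : dict mapping each vertex to its set of neighbors
    """
    if v in adj[u]:                          # Case 1: u and v are adjacent
        return M | {frozenset((u, v))}
    partner = {}                             # partner[x] = vertex matched to x
    for edge in M:
        x, y = tuple(edge)
        partner[x], partner[y] = y, x
    # Case 2: find w in N(v) whose partner z lies in N(u); density
    # guarantees such an edge {w, z} exists.
    for w in adj[v]:
        z = partner[w]
        if z in adj[u]:
            # augment along the length-3 path v - w - z - u
            return (M - {frozenset((w, z))}) | {frozenset((v, w)), frozenset((u, z))}
    raise ValueError("graph is not dense: no augmenting edge found")
```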

Theorem 7.4 For dense graphs, $\tau(\epsilon) \le \mathrm{poly}(n, \log 1/\epsilon)$.

Proof (sketch): Our aim is to use the canonical paths and associated encoding $\eta_t$ which we used for the chain on all matchings. We take it for granted that the reader remembers the paths and encoding used earlier. However, there are two problems we must address. Taking a brief look at the various cases, we see that there are two scenarios where the proof from last class uses matchings with 4 holes. This is the issue we must resolve, since 4-hole matchings are not contained in $\Omega$.

Consider a pair of perfect matchings $I, F$. The canonical path used previously, as well as the associated encoding, are legitimate in the sense that both always contain at most two holes.

Suppose $I \in \mathcal{N}(u, v)$ and $F \in \mathcal{M}$. Looking at $I \oplus F$, we have an augmenting path from $u$ to $v$ and a set of alternating cycles. We define the canonical path $\gamma_{IF}$ to unwind, i.e., augment, the path, and then unwind the cycles in some canonical order. Notice that after we unwind the path, we are at a perfect matching. It is easy to verify that this path never has more than two holes (the holes arise in the midst of unwinding an alternating cycle). However, the associated encoding will have 4 holes: after we unwind the path, the encoding has holes at $u$ and $v$, and during the unwinding of any cycle we pick up a further two holes in the encoding. Despite this fact, we keep the same encoding as before, but we now have $\eta_t : \mathrm{cp}(t) \to \overline{\Omega}$, where $\overline{\Omega}$ is the set of matchings with at most 4 holes and $\mathrm{cp}(t)$ is the set of canonical paths that traverse the transition $t$. The encoding is simply an ingredient in the proof, used to bound $|\mathrm{cp}(t)|$; there is no inherent reason why the encoding must be contained in $\Omega$. It will suffice to prove that $|\overline{\Omega}| \le n^2 |\Omega|$.

Finally, consider a pair of near-perfect matchings $I, F$. The canonical path between them would visit 4-hole matchings. To avoid this, we construct different canonical paths: we first choose a random $M \in \mathcal{M}$, and the new canonical path is the concatenation of $\gamma_{IM}$ and $\gamma_{MF}$.

Now we want to bound the congestion. We first analyze the congestion created by paths between $I \in \Omega$ and $F \in \mathcal{M}$. For every transition $t = M \to M'$, we can define an injective map $\eta_t : \mathrm{cp}(t) \to \overline{\Omega}$. Thus, we can bound $|\mathrm{cp}(t)| \le |\overline{\Omega}| \le n^2 |\Omega|$. (The second inequality follows by an argument analogous to Lemma 7.3.) To account for the congestion added by flows between pairs of near-perfect matchings, we use the fact that $|\mathcal{N}| = |\bigcup_{u,v} \mathcal{N}(u, v)| \le n^2 |\mathcal{M}|$. Fix a near-perfect matching $I$ and a perfect matching $M$. For each near-perfect matching $F$, we use $\gamma_{IM}$ with probability $1/|\mathcal{M}|$. Summing over $F$, the expected extra load on $\gamma_{IM}$ is $|\mathcal{N}|/|\mathcal{M}| \le n^2$. Therefore, paths between pairs of near-perfect matchings increase the congestion by a factor of at most $n^2$.

Since $\pi$ is uniform on $\Omega$, we can bound the congestion through a transition $t = M \to M'$, using $\pi(I) = \pi(F) = 1/|\Omega|$, $P(M, M') \ge 1/(2m)$, $|\gamma_{IF}| \le n$, and $|\mathrm{cp}(t)| \le n^2 |\Omega|$:
$$\rho = \frac{1}{\pi(M) P(M, M')} \sum_{\gamma_{IF} \ni t} \pi(I)\,\pi(F)\,|\gamma_{IF}| \;\le\; 2m|\Omega| \cdot \frac{n\,|\mathrm{cp}(t)|}{|\Omega|^2} \;=\; \frac{2mn\,|\mathrm{cp}(t)|}{|\Omega|} \;\le\; 2mn^3.$$

For dense graphs we can therefore generate a random perfect matching in polynomial time by running the chain for the mixing time. If the final state is a perfect matching (which happens with probability at least $1/n^2$), then it is a uniformly random perfect matching. Otherwise, we run the chain again.
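The resulting sampler is plain rejection sampling: run the chain for (an upper bound on) the mixing time and keep the final state only if it is perfect. A sketch, reusing the hypothetical chain_step from above:

```python
def sample_perfect_matching(X0, edges, in_omega, is_perfect, T):
    """Rejection sampling of a uniformly random perfect matching (a sketch).

    Each trial runs the chain for T steps, where T bounds the mixing time.
    By Lemma 7.3 a trial ends at a perfect matching with probability at
    least 1/n^2, so O(n^2) trials suffice in expectation.
    """
    while True:
        X = X0
        for _ in range(T):
            X = chain_step(X, edges, in_omega)
        if is_perfect(X):
            return X
```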

General Bipartite Graphs

Consider a bipartite graph $G = (V_1, V_2, E)$. We'll present the algorithm of Jerrum, Sinclair and Vigoda [3] for generating a random perfect matching of an arbitrary bipartite graph (and thus approximating the permanent of any matrix with non-negative entries). Once we remove the minimum-degree condition, the ratio of perfect matchings to near-perfect matchings can be exponentially small. To ensure perfect matchings are likely in the stationary distribution, we introduce suitable weights on matchings and modify the Markov chain.

The weight of a matching is a function of the location of its holes (if any). Let
$$w^*(M) = \begin{cases} 1 & \text{for } M \in \mathcal{M}, \\[4pt] w^*(u, v) := \dfrac{|\mathcal{M}|}{|\mathcal{N}(u, v)|} & \text{for } M \in \mathcal{N}(u, v). \end{cases}$$

We then modify the Markov chain introduced for dense graphs so that the stationary distribution is proportional to the weights. To do this we simply add what is known as a Metropolis filter. More precisely, we modify the last step of the Markov chain to move to the proposed new matching only with an appropriate probability:

3'. If $X' \in \Omega$, with probability $\frac{1}{2} \min\left\{1, \frac{w^*(X')}{w^*(X_t)}\right\}$ set $X_{t+1} = X'$; otherwise set $X_{t+1} = X_t$.

It is straightforward to verify that, for $M \in \Omega$, $\pi(M) = w^*(M)/Z$ where $Z = \sum_{M' \in \Omega} w^*(M')$.

There is an obvious concern with our proposed weights: in order to compute them, we need to be able to compute the number of perfect and near-perfect matchings, which was our original task. We will tackle this problem later. Let us first show that if we can obtain the weights, then we can generate a random perfect matching efficiently.

To motivate the above weights, let's look at their effect on the stationary distribution. We have
$$\pi(\mathcal{N}(u, v)) = \frac{1}{Z} \sum_{M \in \mathcal{N}(u, v)} w^*(M) = \frac{1}{Z}\,|\mathcal{N}(u, v)|\,\frac{|\mathcal{M}|}{|\mathcal{N}(u, v)|} = \frac{|\mathcal{M}|}{Z}.$$
Similarly, $\pi(\mathcal{M}) = |\mathcal{M}|/Z$. Therefore, each hole pattern is equally likely, i.e., for all $u \in V_1$, $v \in V_2$,
$$\pi(\mathcal{M}) = \pi(\mathcal{N}(u, v)).$$
Since there are $n^2$ hole patterns $(u, v)$ with $u \in V_1$, $v \in V_2$, plus the class of perfect matchings, this gives
$$\pi(\mathcal{M}) = \frac{1}{n^2 + 1}.$$
Thus, in the stationary distribution, we output a random perfect matching with probability $1/(n^2 + 1)$. Equally important, with these ideal weights, or even a reasonable approximation to them, our Markov chain is rapidly mixing.
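Concretely, the only change relative to the dense-graph chain is the acceptance rule in step 3'. A minimal sketch, where w is assumed to be a function returning the weight of a matching (1 if perfect, the hole-pair weight otherwise):

```python
import random

def weighted_accept(X_old, X_new, w):
    """Metropolis filter (a sketch): accept the proposed matching X_new
    with probability (1/2) * min(1, w(X_new) / w(X_old))."""
    return random.random() < 0.5 * min(1.0, w(X_new) / w(X_old))
```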

Lemma 7.5 For the Markov chain run with weights $w$ satisfying
$$w^*(u, v)/2 \;\le\; w(u, v) \;\le\; 2\,w^*(u, v)$$
for all $u, v$ where the associated quantities are defined, we have $\tau(\epsilon) \le \mathrm{poly}(n, \log 1/\epsilon)$.

The proof idea is the same as for dense graphs, and we use the same canonical paths as before. The critical difference in the analysis is that we now have to guarantee that, for a transition $t = M \to M'$,
$$w(I)\,w(F) \;\le\; w(M)\,P(M, M')\,w(\eta_t(I, F)),$$
where $\eta_t : \mathrm{cp}(t) \to \overline{\Omega}$ is the same injective map we used for dense graphs. This inequality is equivalent to
$$\pi(I)\,\pi(F) \;\le\; \pi(M)\,P(M, M')\,\pi(\eta_t(I, F)),$$
and ensures that the map $\eta_t$ has sufficient weight to encode the initial, final pair. Assuming this inequality, and using $|\gamma_{IF}| \le n$, we can still bound the congestion through a transition $t = M \to M'$ as follows:
$$\rho_t = \frac{1}{\pi(M) P(M, M')} \sum_{\gamma_{IF} \ni t} \pi(I)\,\pi(F)\,|\gamma_{IF}| \;\le\; \frac{1}{\pi(M) P(M, M')} \sum_{\gamma_{IF} \ni t} \pi(M)\,P(M, M')\,\pi(\eta_t(I, F))\,n \;=\; n \sum_{\gamma_{IF} \ni t} \pi(\eta_t(I, F)) \;\le\; n,$$
where the last inequality uses the injectivity of $\eta_t$.

For bipartite graphs, it turns out that the same canonical paths and encodings $\eta_t$ as used for dense graphs also satisfy the desired inequality on the weights. This follows by checking several cases. (The inequality does not hold for non-bipartite graphs.) Therefore, if we have these ideal weights $w^*$, we can generate a random perfect matching in a polynomial number of steps (with high probability).

What happens if we run the chain with an approximation $w$ of the ideal weights? The chain still converges rapidly to its stationary distribution, but the stationary distribution is no longer uniform over hole patterns: the likelihood of each hole pattern in the stationary distribution is skewed by how far the weights are from ideal. In particular, we will have
$$\pi(\mathcal{M}) \propto |\mathcal{M}|, \qquad \pi(\mathcal{N}(u, v)) \propto w(u, v)\,|\mathcal{N}(u, v)|.$$
Therefore,
$$\frac{\pi(\mathcal{M})}{\pi(\mathcal{N}(u, v))} \;=\; \frac{|\mathcal{M}|}{w(u, v)\,|\mathcal{N}(u, v)|} \;=\; \frac{w^*(u, v)}{w(u, v)}.$$
We can estimate the left-hand side by generating many samples from the chain and counting the number of occurrences of each hole pattern. Since we know $w(u, v)$, we can then estimate $w^*(u, v)$. In other words, given weights which are reasonably close to ideal, we can revise them to be arbitrarily close to ideal, i.e., we can compute an arbitrarily close approximation to $w^*$.
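A sketch of this bootstrapping step: draw many samples, count how often each hole pattern occurs, and rescale the current weights by the observed ratios. The sampler interface below is an assumption made for illustration.

```python
from collections import Counter

def refine_weights(sample, holes_of, w, num_samples):
    """One refinement phase: estimate the ideal weights w* from samples of
    the chain run with approximate weights w (a sketch).

    sample      : callable returning one (approximate) sample from the chain
    holes_of    : callable; returns None for a perfect matching, else its hole pair
    w           : dict mapping hole pairs (u, v) to the current weights
    num_samples : number of samples used for the frequency estimates
    """
    counts = Counter(holes_of(sample()) for _ in range(num_samples))
    perfect = max(counts[None], 1)   # observed count of perfect matchings
    # pi(M) / pi(N(u, v)) = w*(u, v) / w(u, v), so rescaling the current
    # weight by the observed frequency ratio estimates the ideal weight.
    return {uv: w[uv] * perfect / max(counts[uv], 1) for uv in w}
```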
Before presenting the algorithm for generating a random perfect matching, we need to convert from unweighted graphs to weighted graphs. To avoid confusion with the weights $w$, we call the weights on edges fugacities. Given a bipartite graph $G = (V_1, V_2, E)$, we view it as a complete weighted bipartite graph where, for $u \in V_1$, $v \in V_2$, the edge $(u, v)$ has fugacity
$$\lambda(u, v) = \begin{cases} 1 & \text{if } (u, v) \in E, \\ 0 & \text{otherwise.} \end{cases}$$
A matching $M$ then has fugacity $\lambda(M) = \prod_{e \in M} \lambda(e)$. Similarly, we redefine $w^*(u, v)$ to be over matchings weighted by their fugacities.

Our algorithm starts with a simple graph for which we can compute the weights $w^*$, and then gradually learns the target graph along with its weights $w^*$. The initial graph is the complete bipartite graph, where we can easily compute $w^*$. Iteratively, we reduce the fugacity of some non-edge by a factor of $1/2$. We then run the chain on this new graph with the ideal weights from the previous graph. The old ideal weights are sufficiently close to the new ideal weights that we can estimate the new ideal weights as described above. We repeat this process until all non-edges have negligible fugacities (fugacities of at most $1/n!$ suffice, since there are at most $n!$ perfect matchings in any graph).
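Putting the pieces together, here is a high-level sketch of the outer loop; all helper names are placeholders, and the actual algorithm of [3] also controls sample sizes and failure probabilities.

```python
import math

def anneal_fugacities(n, non_edges, w, refine):
    """Outer loop of the bipartite algorithm (a high-level sketch).

    n         : number of vertices on each side of the bipartition
    non_edges : pairs (u, v), u in V1 and v in V2, that are not edges of G
    w         : ideal weights for the complete bipartite graph (the start)
    refine    : callable (fugacities, weights) -> refined weights, e.g.
                built from refine_weights above
    """
    fugacities = {uv: 1.0 for uv in non_edges}   # complete bipartite graph
    target = 1.0 / math.factorial(n)   # fugacities <= 1/n! are negligible,
                                       # as any graph has <= n! perfect matchings
    while non_edges and max(fugacities.values()) > target:
        uv = max(fugacities, key=fugacities.get)
        fugacities[uv] /= 2.0          # reduce some non-edge's fugacity by 1/2
        # The old ideal weights are within a factor 2 of the new ones, so by
        # Lemma 7.5 the chain still mixes rapidly and we can re-estimate.
        w = refine(fugacities, w)
    return w
```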
References

[1] M. Jerrum. Counting, Sampling and Integrating: Algorithms and Complexity. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 2003.

[2] M. Jerrum and A. Sinclair. Approximating the permanent. SIAM Journal on Computing, 18(6):1149-1178, 1989.

[3] M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. In Proceedings of the 33rd ACM Symposium on Theory of Computing (STOC), pages 712-721, 2001.