Lecture 21: Counting and Sampling Problems

Princeton Univ. F'14 COS 521: Advanced Algorithm Design. Lecturer: Sanjeev Arora.

Today's topic, counting and sampling problems, is motivated by computational problems involving multivariate statistics and estimation, which arise in many fields. For instance, we may have a probability density function $\varphi(x)$ where $x \in \mathbb{R}^n$, and we may want to compute moments or other parameters of the distribution, e.g. $\int x_3\,\varphi(x)\,dx$. Or we may have a model of how links develop faults in a network, and we seek to compute the probability that two nodes $i, j$ stay connected under this model. This is a complicated probability calculation.

In general, such problems can be intractable (e.g., NP-hard). The simple-looking problem of integrating a multivariate function is NP-hard in the worst case, even when we have an explicit expression for the function $f(x_1, x_2, \ldots, x_n)$ that allows $f$ to be computed in time polynomial in $n$:

$$\int_{x_1=0}^{1} \int_{x_2=0}^{1} \cdots \int_{x_n=0}^{1} f(x_1, x_2, \ldots, x_n)\, dx_1\, dx_2 \cdots dx_n.$$

In fact, even approximating such integrals can be NP-hard, as shown by Koutis (2003).

Valiant (1979) showed that the computational heart of such problems is combinatorial counting problems. The goal in such a problem is to compute the size of a set $S$ in which we can test membership in polynomial time. The class of such problems is called #P.

Example 1

#SAT is the problem where, given a boolean formula $\varphi$, we have to compute the number of satisfying assignments to $\varphi$. It is clearly NP-hard, since if we can solve it, we can in particular solve the decision problem: decide whether the number of satisfying assignments is at least 1.

#CYCLE is the problem where, given a graph $G = (V, E)$, we have to compute the number of cycles in $G$. Here the decision problem ("is $G$ acyclic?") is easily solvable using breadth-first search. Nevertheless, the counting problem turns out to be NP-hard.

#SPANNINGTREE is the problem where, given a graph $G = (V, E)$, we have to compute the number of spanning trees in $G$. This has been known to be solvable by a simple determinant computation (Kirchhoff's matrix-tree theorem) since the 19th century.

Valiant's class #P captures most interesting counting problems. Many of these are NP-hard, but not all. You can learn more about them in COS 522: Computational Complexity, usually taught in the spring semester.
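To make the #SPANNINGTREE example concrete, here is a minimal sketch of the matrix-tree computation (the code and test graph are our own illustration, not from the notes): build the graph Laplacian $L = D - A$, delete one row and column, and take the determinant of the resulting minor.

```python
import numpy as np

def count_spanning_trees(n, edges):
    """Kirchhoff's matrix-tree theorem: the number of spanning trees of a
    connected undirected graph equals any cofactor of its Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1
        L[v, v] += 1
        L[u, v] -= 1
        L[v, u] -= 1
    # Delete row 0 and column 0, then take the determinant of the minor.
    return round(np.linalg.det(L[1:, 1:]))

# A 4-cycle has exactly 4 spanning trees (delete any one of its 4 edges).
print(count_spanning_trees(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # -> 4
```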

It is easy to see that the above integration problem can be reduced to a counting problem, with some loss of precision. First, recall that integration basically involves summation: we appropriately discretize the space and then take the sum of the integrand values (assuming the integrand doesn't vary much within each cell of the discretization). Thus the integral reduces to a sum of the form

$$\sum_{x_1 \in [N],\, x_2 \in [N],\, \ldots,\, x_n \in [N]} g(x_1, x_2, \ldots, x_n),$$

where $[N]$ denotes the set of integers $\{0, 1, \ldots, N\}$. Now, assuming $g(\cdot) \ge 0$, this sum is easily estimated using the sizes of the following sets:

$$\{(x, c) : x \in [N]^n;\ c \le g(x) \le c + \epsilon\}.$$

Note that if $g$ is computable in polynomial time, then we can test membership in this set in polynomial time given $(x, c, \epsilon)$, so we have shown that integration reduces to a #P problem.

We will also be interested in sampling a random element of a set $S$. In fact, this will turn out to be intimately related to the problem of counting.

1 Counting vs Sampling

We say that an algorithm is an approximation scheme for a counting problem if, for every $\epsilon > 0$, it can output an estimate of the size of the set that is correct within a multiplicative factor $(1 + \epsilon)$. We say it is a fully polynomial randomized approximation scheme (FPRAS) if it is randomized, runs in $\mathrm{poly}(n, 1/\epsilon, \log 1/\delta)$ time, and has probability at least $1 - \delta$ of outputting such an answer. We will assume $\delta < 1/\mathrm{poly}(n)$, so that we can ignore the probability of outputting an incorrect answer.

A fully polynomial-time approximate sampler for $S$ is one that runs in $\mathrm{poly}(n, 1/\epsilon, \log 1/\delta)$ time and outputs samples $u \in S$ whose distribution is within $\epsilon$ of uniform:

$$\sum_{u \in S} \left| \Pr[u \text{ is output}] - \frac{1}{|S|} \right| \le \epsilon.$$

Theorem 1 (Jerrum, Valiant, Vazirani 1986). For nicely behaved counting problems (the technical term is "downward self-reducible"), sampling in the above sense is equivalent to counting (i.e., an algorithm for one task can be converted into one for the other).

Proof: For concreteness, let us prove this for the problem of counting the number of satisfying assignments to a boolean formula. Let $\#\varphi$ denote the number of satisfying assignments to the formula $\varphi$.

Sampling ⇒ approximate counting: Suppose we have an algorithm that is an approximate sampler for the set of satisfying assignments of any formula. For now, assume it is an exact sampler instead of an approximate one. Take $m$ samples from it, and let $p_0$ be the fraction that have a 0 in the first bit $x_1$, and $p_1$ the fraction that have a 1. Assume $p_0 \ge 1/2$ (otherwise work with $p_1$ instead). Then the estimate of $p_0$ is correct up to a factor $(1 + 1/\sqrt{m})$ by Chernoff bounds. But denoting by $\varphi_{x_1=0}$ the formula obtained from $\varphi$ by fixing $x_1$ to 0, we have

$$p_0 = \frac{\#\varphi_{x_1=0}}{\#\varphi}.$$

Since we have a good estimate of $p_0$, to get a good estimate of $\#\varphi$ it suffices to have a good estimate of $\#\varphi_{x_1=0}$. So produce the formula $\varphi_{x_1=0}$, then use the same algorithm recursively on this smaller formula to estimate $N_0$, the value of $\#\varphi_{x_1=0}$. Then output $N_0/p_0$ as your estimate of $\#\varphi$. (The base case $n = 1$ can of course be solved exactly.) Thus if $\mathrm{Err}_n$ is the multiplicative error in the estimate for formulae with $n$ variables, it satisfies

$$\mathrm{Err}_n \le \left(1 + 1/\sqrt{m}\right) \mathrm{Err}_{n-1},$$

which solves to $\mathrm{Err}_n \le (1 + 1/\sqrt{m})^n$. By picking $m \gg n^2/\epsilon^2$, this error can be made less than $1 + \epsilon$. It is easily checked that if the sampler is not exact but only approximate, the algorithm works essentially unchanged, except that the sampling error also enters the expression for the error in estimating $p_0$.
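The sampling ⇒ counting recursion fits in a few lines. The sketch below is our own toy illustration, not part of the notes: the rejection sampler merely stands in for the polynomial-time approximate sampler assumed by the theorem (and assumes $\varphi$ is satisfiable), and the sample size m is an arbitrary choice.

```python
import random

def sample_satisfying(phi, n):
    """Toy exact sampler via rejection; exponentially slow in general, it
    stands in for the efficient sampler that the theorem assumes."""
    while True:
        x = tuple(random.randint(0, 1) for _ in range(n))
        if phi(x):
            return x

def estimate_count(phi, n, m=4000):
    """JVV reduction, sampling => approximate counting, for a formula phi
    given as a 0/1-valued function on n-bit tuples."""
    if n == 1:                               # base case: count exactly
        return int(phi((0,))) + int(phi((1,)))
    samples = [sample_satisfying(phi, n) for _ in range(m)]
    p0 = sum(1 for s in samples if s[0] == 0) / m
    # Recurse on the likelier branch so the ratio p_b is well estimated.
    b, pb = (0, p0) if p0 >= 0.5 else (1, 1 - p0)
    phi_b = lambda xs, b=b: phi((b,) + xs)   # phi with the first bit fixed to b
    return estimate_count(phi_b, n - 1, m) / pb

# Toy formula on 3 variables: x1 OR x2; it has 6 satisfying assignments.
phi = lambda x: bool(x[0] or x[1])
print(estimate_count(phi, 3))                # close to 6.0
```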

Approximate counting ⇒ sampling: This involves reversing the above reasoning. Given an approximate counting algorithm, we are trying to generate a random satisfying assignment. First use the counting algorithm to approximate $\#\varphi_{x_1=0}$ and $\#\varphi$, and take their ratio to get a good estimate of $p_0$, the fraction of satisfying assignments that have 0 in the first bit. (If $p_0$ is too small, then we instead have a good estimate of $p_1 = 1 - p_0$.) Now toss a coin with $\Pr[\text{heads}] = p_0$. If it comes up heads, output 0 as the first bit of the assignment and then recursively use the same algorithm on $\varphi_{x_1=0}$ to generate the remaining $n-1$ bits. If it comes up tails, output 1 as the first bit and then recursively use the same algorithm on $\varphi_{x_1=1}$ to generate the remaining $n-1$ bits.

Note that the quality $\epsilon$ of the approximation suffers a bit in going between counting and sampling.

1.1 Monte Carlo method

The classical method for counting via sampling is the Monte Carlo method. A simple example is the ancient method for estimating the area of a circle of unit radius: draw the circle inside a square of side 2, throw darts at the square, and measure the fraction that fall inside the circle. Multiplying that fraction by 4 (the area of the square) gives the area of the circle.

Figure 1: Monte Carlo (dart throwing) method to estimate the area of a circle. The fraction of darts that fall inside the disk is $\pi/4$.

Now replace the circle by any set $S$, and the square by any set $\Omega$ that contains $S$ and can be sampled in polynomial time. Take many samples from $\Omega$ and observe the fraction that lie in $S$; this fraction times $|\Omega|$ is an estimate for $|S|$. The problem with this method is that usually the obvious $\Omega$ is much bigger than $S$, and we need about $|\Omega|/|S|$ samples before we see even one that lies in $S$. (For instance, the obvious $\Omega$ for computing $\#\varphi$ is the set of all $2^n$ assignments, which may be exponentially bigger than the set of satisfying assignments.)
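A minimal sketch of the dart-throwing estimator (our own illustration): here $\Omega$ is the square $[-1,1]^2$ with $|\Omega| = 4$, and $S$ is the unit disk.

```python
import random

def monte_carlo_area(num_samples=100_000):
    """Estimate the area of the unit disk S by sampling uniformly from the
    enclosing square Omega = [-1, 1]^2, whose area is 4."""
    hits = 0
    for _ in range(num_samples):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:      # dart landed inside the disk
            hits += 1
    return 4 * hits / num_samples   # fraction landing in S, times |Omega|

print(monte_carlo_area())           # close to pi = 3.14159...
```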

2 Dyer's algorithm for counting solutions to KNAPSACK

The Knapsack problem models the situation of a kid who is given a knapsack and told to buy any number of toys that fit in it. The catch is that not all toys give him the same happiness, so he has to trade off the happiness received from each toy against its size; toys with a high happiness-to-size ratio are preferred. It turns out this problem is NP-hard when the numbers are given in binary. We are interested in a counting version of the problem that uses just the sizes.

Definition 1. Given $n$ weights $w_1, w_2, \ldots, w_n$ and a target weight $W$, a feasible solution to the knapsack problem is a subset $T \subseteq \{1, \ldots, n\}$ such that $\sum_{i \in T} w_i \le W$. We wish to approximately count the number of feasible solutions.

This had been the subject of some very technical papers, until M. Dyer gave a very elementary solution in 2003. First, we note that the counting problem can be solved exactly in $O(nW)$ time, though of course this is not polynomial, since $W$ is given to us in binary, i.e., using $\log W$ bits. The idea is dynamic programming. Let $\mathrm{Count}(i, U)$ denote the number of feasible solutions involving only the first $i$ numbers and whose total weight is at most $U$. The dynamic program follows by observing that there are two types of solutions: those that involve the $i$th element, and those that don't. Thus

$$\mathrm{Count}(i, U) = \begin{cases} 0 & \text{if } U < 0, \\ 2 & \text{if } i = 1 \text{ and } w_1 \le U, \\ 1 & \text{if } i = 1 \text{ and } w_1 > U \ge 0, \\ \mathrm{Count}(i-1, U - w_i) + \mathrm{Count}(i-1, U) & \text{if } i > 1. \end{cases}$$

(The base case counts the empty set and, when it fits, the set $\{1\}$.) Denoting by $S$ the set of feasible solutions, $|S| = \mathrm{Count}(n, W)$. But as observed, computing this exactly is computationally expensive and not polynomial-time.

Dyer's next idea is to find a set $\Omega$ containing $S$ but at most $n$ times bigger, which can be exactly counted as well as sampled from. Then, by the Monte Carlo method, we can estimate the size of $S$ in polynomial time by drawing samples from $\Omega$. The set $\Omega$ is simply the set of solutions to a knapsack instance in which the weights have been rounded so as to lie in $[0, n^2]$. Specifically, let

$$w_i' = \left\lfloor \frac{w_i n^2}{W} \right\rfloor \quad \text{and} \quad W' = n^2.$$

Then $\Omega$ is the set of feasible solutions to this modified knapsack problem.

Claim 1: $S \subseteq \Omega$. (Consequently, $|S| \le |\Omega|$.) This follows since if $T \in S$ is a feasible solution for the original problem, then $\sum_{i \in T} w_i' \le \sum_{i \in T} w_i n^2/W \le n^2$, and so $T$ is a feasible solution for the rounded problem.

Claim 2: $|\Omega| \le n|S|$. To prove this we give a mapping $g$ from $\Omega$ to $S$ that is at most $n$-to-1:

$$g(T') = \begin{cases} T' & \text{if } T' \in S, \\ T' \setminus \{j\} & \text{otherwise, where } j \text{ is the index of the element of } T' \text{ with the highest weight } w_j. \end{cases}$$

In the second case, note that this element $j$ satisfies $w_j > W/n$, which implies $w_j' \ge n$.

Clearly, $g$ is at most $n$-to-1, since a set $T \in S$ has at most $n$ pre-images under $g$. Now let us verify that $T = g(T') = T' \setminus \{j\}$ indeed lies in $S$:

$$\sum_{i \in T} w_i \;\le\; \sum_{i \in T} \frac{W}{n^2}\,(w_i' + 1) \;\le\; \frac{W}{n^2}\left(W' - w_j' + n - 1\right) \;\le\; W \qquad \text{(since } W' = n^2 \text{ and } w_j' \ge n\text{)},$$

which implies $T \in S$.
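Both ingredients of Dyer's algorithm, the exact pseudopolynomial count and the rounding, fit in a short sketch (our own illustration, with our own function names; for brevity the base case is written at $i = 0$ rather than $i = 1$):

```python
from functools import lru_cache

def knapsack_count(weights, W):
    """Exact count of subsets T with total weight <= W, via the O(n*W)
    dynamic program Count(i, U) = Count(i-1, U) + Count(i-1, U - w_i)."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def count(i, U):
        if U < 0:
            return 0                  # overshot the budget
        if i == 0:
            return 1                  # only the empty set remains
        return count(i - 1, U) + count(i - 1, U - weights[i - 1])

    return count(n, W)

def dyer_rounding(weights, W):
    """Dyer's rounding: w_i' = floor(w_i * n^2 / W) and W' = n^2, so that
    Omega (the rounded solution set) contains S and is at most n times bigger."""
    n = len(weights)
    return [w * n * n // W for w in weights], n * n

weights, W = [3, 5, 8, 11], 13
print(knapsack_count(weights, W))       # |S|     -> 8
rw, rW = dyer_rounding(weights, W)
print(knapsack_count(rw, rW))           # |Omega| -> 9
```

On this toy instance $|S| = 8$ and $|\Omega| = 9$, consistent with Claims 1 and 2.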

Sampling algorithm for $\Omega$: To sample from $\Omega$, use our earlier equivalence of approximate counting and sampling. That algorithm needs a count not only of $\Omega$ but also of the subset of $\Omega$ consisting of solutions that contain the first element. This subset is defined by another knapsack instance, so it too can be counted by Dyer's dynamic programming, and the same is true for all the instances arising in the recursion.

Bibliography

1. I. Koutis. On the hardness of approximate multivariate integration. Proc. APPROX-RANDOM 2003, Springer Verlag.

2. L. Valiant. The complexity of enumeration and reliability problems. SIAM J. Computing, 8:3 (1979), pp. 410-421.