On the complexity of approximate multivariate integration

Similar documents
Sampling Contingency Tables

Lecture 21: Counting and Sampling Problems

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan August 30, Notes for Lecture 1

Bounds for pairs in partitions of graphs

Irredundant Families of Subcubes

Counting bases of representable matroids

Lecture 17: D-Stable Polynomials and Lee-Yang Theorems

Counting Linear Extensions of a Partial Order

Length-Increasing Reductions for PSPACE-Completeness

Lecture 6: Random Walks versus Independent Sampling

On the Complexity of the Minimum Independent Set Partition Problem

1 Review of Vertex Cover

Not all counting problems are efficiently approximable. We open with a simple example.

Chapter 1 Preliminaries

6.1 Polynomial Functions

k-protected VERTICES IN BINARY SEARCH TREES

Conductance and Rapidly Mixing Markov Chains

The Complexity of Computing the Sign of the Tutte Polynomial

The Tutte polynomial: sign and approximability

On the Sample Complexity of Noise-Tolerant Learning

Separating Monotone VP and VNP

Hardness of the Covering Radius Problem on Lattices

A Note on the Complexity of Network Reliability Problems. Hans L. Bodlaender Thomas Wolle

VALIANT S MODEL AND THE COST OF COMPUTING INTEGERS

Optimization of a Convex Program with a Polynomial Perturbation

Fine Grained Counting Complexity I

Lecture 6: September 22

1 Randomized Computation

Lecture 22: Counting

DRAFT. Algebraic computation models. Chapter 14

CSC 5170: Theory of Computational Complexity Lecture 5 The Chinese University of Hong Kong 8 February 2010

Some Sieving Algorithms for Lattice Problems

Decomposition of random graphs into complete bipartite graphs

The Lovasz-Vempala algorithm for computing the volume of a convex body in O*(n^4) - Theory and Implementation

An Introduction to Randomized algorithms

Abstract. This paper discusses polynomial-time reductions from Hamiltonian Circuit (HC),

Notes for Lecture 2. Statement of the PCP Theorem and Constraint Satisfaction

CSE200: Computability and complexity Space Complexity

Decomposition of random graphs into complete bipartite graphs

Approximation Algorithms for Maximum. Coverage and Max Cut with Given Sizes of. Parts? A. A. Ageev and M. I. Sviridenko

How many randomly colored edges make a randomly colored dense graph rainbow hamiltonian or rainbow connected?

Nondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity

Convex Optimization Notes

The Complexity of Maximum. Matroid-Greedoid Intersection and. Weighted Greedoid Maximization

c 2010 Society for Industrial and Applied Mathematics

A Necessary Condition for Learning from Positive Examples

Approximating Longest Directed Path

On the mean connected induced subgraph order of cographs

Lecture 11 October 7, 2013

Randomly coloring constant degree graphs

COUNTING NUMERICAL SEMIGROUPS BY GENUS AND SOME CASES OF A QUESTION OF WILF

Limits of Functions (a, L)

ITCS:CCT09 : Computational Complexity Theory Apr 8, Lecture 7

Adding random edges to create the square of a Hamilton cycle

Approximate Counting and Markov Chain Monte Carlo

fsat We next show that if sat P, then fsat has a polynomial-time algorithm. c 2010 Prof. Yuh-Dauh Lyuu, National Taiwan University Page 425

Monotone Submodular Maximization over a Matroid

ON TESTING HAMILTONICITY OF GRAPHS. Alexander Barvinok. July 15, 2014

arxiv: v1 [cs.dm] 12 Jun 2016

PROBABILISTIC ANALYSIS OF THE GENERALISED ASSIGNMENT PROBLEM

Average-case Analysis for Combinatorial Problems,

Questionnaire for CSET Mathematics subset 1

Convex and Semidefinite Programming for Approximation

Connections between exponential time and polynomial time problem Lecture Notes for CS294

A Duality between Clause Width and Clause Density for SAT

Rainbow Hamilton cycles in uniform hypergraphs

New lower bounds for hypergraph Ramsey numbers

Theory of Computer Science to Msc Students, Spring Lecture 2

COMP Analysis of Algorithms & Data Structures

FPT hardness for Clique and Set Cover with super exponential time in k

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ).

Notes 6 : First and second moment methods

10.4 The Kruskal Katona theorem

Polynomial-time counting and sampling of two-rowed contingency tables

Economics 204 Fall 2011 Problem Set 1 Suggested Solutions

On Symmetric Signatures in Holographic Algorithms. Jin-Yi Cai University of Wisconsin, Madison. Pinyan Lu Tsinghua University, Beijing

ACO Comprehensive Exam October 14 and 15, 2013

ON SENSITIVITY OF k-uniform HYPERGRAPH PROPERTIES

Near-Optimal Algorithms for Maximum Constraint Satisfaction Problems

Essential facts about NP-completeness:

1 Agenda. 2 History. 3 Probabilistically Checkable Proofs (PCPs). Lecture Notes Definitions. PCPs. Approximation Algorithms.

Rational exponents in extremal graph theory

Lecture 15 - NP Completeness 1

CSC 5170: Theory of Computational Complexity Lecture 4 The Chinese University of Hong Kong 1 February 2010

Game Theory and Algorithms Lecture 7: PPAD and Fixed-Point Theorems

Background on Coherent Systems

Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem

The Complexity of the Permanent and Related Problems

Worst Case Performance of Approximation Algorithm for Asymmetric TSP

W -Hardness under Linear FPT-Reductions: Structural Properties and Further Applications

Min-Wise independent linear permutations

Model Counting for Logical Theories

Minimum-Dilation Tour (and Path) is NP-hard

DRAFT. Complexity of counting. Chapter 8

Algebra 2: Semester 2 Practice Final Unofficial Worked Out Solutions by Earl Whitney

Basic counting techniques. Periklis A. Papakonstantinou Rutgers Business School

MATHEMATICS Trigonometry. Mathematics 30-1 Mathematics (10 12) /21 Alberta Education, Alberta, Canada (2008)

Lecture 16: Roots of the Matching Polynomial

arxiv: v2 [cs.ds] 3 Oct 2017

Improved Bounds for Flow Shop Scheduling

Transcription:

On the complexity of approximate multivariate integration Ioannis Koutis Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 USA ioannis.koutis@cs.cmu.edu January 11, 2005 Abstract We consider the integration problem for the class of smooth positive functions in n variables. An approximation scheme for this problem was given by Applegate and Kannan [proc. of the 23 rd ACM Symposium on Theory of Computing]. For any given function f, the complexity of the algorithm is polynomial when f is O(log n)-far from log-concavity. Somewhat informally, this holds if for any one-dimensional projection of log f, there is a function g with g = O(log n), such that log f + g is concave. In this paper we show that a polynomial time algorithm for functions that are Ω(log 3 n)-far from log-concavity, implies a 2 o(n) algorithm for the Hamiltonian path problem for graphs with n vertices. In addition, we show that in the general case, for any fixed natural k, it is NP-hard to approximate the integral within a factor of 2 nk. 1 Introduction Suppose F ( ) is a real positive function defined on a cube C in Euclidean n-space R n. We consider the problem of approximating the integral I(F ) of F over C with relative error ɛ, under the additional assumption that F satisfies a smoothness condition. The exact integration of a multivariate function is hard, under the conjectured hardness of #P, given the result in [2], which implies that the exact computation of the volume of an n-dimensional polytope is #P-complete. In view of this, it is natural to ask whether there is an algorithm that returns an ɛ-approximation of I(F ), in other words a value V such that 1/(1 + ɛ) I(F )/V < (1 + ɛ). 1 This paper improves on a preliminary version that appeared in the 6 th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems. 1

The first somewhat surprising answer to this question, came with the major result of Dyer, Frieze and Kannan, who gave a fully polynomial randomized approximation scheme (FPRAS) for the volume of n-dimensional convex bodies [4]. More precisely, they showed that the volume of an n-dimensional convex body K defined by a weak membership oracle M, can be ɛ-approximated with failure probability ξ, with poly(n, ɛ 1, log ξ 1 ) calls to M. Here, M can be thought of as a black-box algorithm that answers in one step whether a given point is in K. This directly implies that there is a FPRAS for the integration of n-variate concave functions which can be evaluated in time poly(n) at any point in the cube C. Subsequently, Applegate and Kannan extended this result to positive smooth and nearly log-concave functions [1]. To be more precise, if we define f(x) = ln F (x), c be the edge length of C, t(n) be an upper bound on the time to evaluate F at any point in C, and α, β satisfying ( ) f(x) f(y) α max x i y i i [1,n] f(λx + (1 λ)y) λf(x) + (1 λ)f(y) β for all x, y C and λ [0, 1], their algorithm has running time O(t(n) n7 c 2 α 2 e 2β log n cαn log ). ɛ 2 ξ ɛ It can be seen that α measures the smoothness of F. Also, if β = 0, the function is logconcave, so β can be seen as a measure of the distance of F from log-concavity. The natural question then becomes whether the dependence on β can be removed or somewhat alleviated. The first contribution of this paper is the following theorem. Theorem 1.1. The integral of any n-variate, polynomial time computable, positive function cannot be 2 n -approximated in time O(poly(α, n)2õ(β1/3) ) (where õ suppresses lower order terms), unless the Hamiltonian path problem for n-vertex graphs has a 2 o(n) algorithm. It is known that the Hamiltonian path problem is hard for the class SNP under subexponential reductions [7]. Hence, a 2 o(n) algorithm would imply that the class SNP, which includes problems such as k-sat, Independent Set, or Set Cover, has subexponential algorithms. The second contribution of this paper is the following theorem. Theorem 1.2. For any fixed natural k it is NP-hard to 2 nk -approximate the integral of positive, polynomial time computable functions, that are smooth in the sense that α = poly(n). Incidentally, we obtain a new proof of the #P-hardness of the 1/n!-approximate integration. We note that in general, only a few negative results concerning the inapproximability of counting problems are known. As observed in [5], the hardness of counting problems in 2

most cases follows either from the NP-completeness of the corresponding decision problem, or from applying some boosting technique which exploits some embedded NP-complete problem. There appears to be a paucity of results that prove the hardness of an approximate counting problem modulo a plausible complexity-theoretic hypothesis. One such case is [3], which proves that there is no FPRAS for counting the number of independent sets in graphs of maximum degree greater than 25, unless NP = RP. As noted in [6], in view of the lack of such satisfactory results, research efforts have often been directed towards proving that certain restricted algorithmic approaches fail (see Section 4 of [6] and the references therein). 2 An outline of the proof technique We derive the result through a reduction from the Hamiltonian path problem (HP for short). Recall that HP is one of the first problems shown to be NP-complete [8]. Given a graph G (in some usual representation) HP asks whether there exists a path of length n that goes exactly one time through every vertex of G. With every graph G, we associate a function F G. If G has n vertices, F G is a function of n log n variables. The function F G has the the following useful characteristics. It can be computed at any point x in a cube C of interest, in time polynomial in n. The parameters α, β of F G (defined in the introduction), are polynomial in n. Also, the value of the integral of F G depends on whether G contains a Hamilton path or not. Specifically, if there is a HP, the integral of F G over a cube C of constant edge size c, is lower bounded by a explicitly known quantity I H. If not, it is upper bounded by I NH, with I H /I NH 2 nk, for any fixed constant k. This proves Theorem 1.2. When k = 1, we additionally have β = O(n 3 log 5 n), which proves Theorem 1.1. In fact, we show that if #HP denotes the number of Hamiltonian paths in G, the integral satisfies (#HP )I H I(F ) (#HP )I H + I NH. Since there are at most n! Hamiltonian paths, it easily follows that computing the integral within a factor of 1 + 1/n!, allows the computation of #HP. Since the problem of counting the Hamiltonian paths is #P-complete [10], this provides a proof of the #P-hardness of the 1/n!-approximate integration. 3 The proof 3.1 Definition and properties of the function F G Let G be a graph with n vertices, and P be the set of length-n paths of G. We will assume without loss of generality that n is a power of 2, and that the vertices are arbitrarily numbered with numbers in [n]. The function F G (X) is a function of n log n variables, X = {x 1,1, x 1,2,..., x n,log n }. The reader can think of {x i,1,..., x i,log n } as the set of variables used by node i. Each path p P is associated with a term f p (X), and F G (X) = p P f p(x). 3

We now describe the term f p (X) for a path p. We consider p as an ordered set of vertices v 1,..., v n, where v i [n]. For any integer i n, we let w(i) be the set of the positions of ones in the binary representation of i and z(i) = [log n] w(i). We let m = m(n), a polynomially bounded function to be discussed later. We define n f p (X) = g i,p (X) with g i,p (X) = i=1 j w(i) xm v i,j j z(i) xm v i,j Let us give a small example to illustrate the construction. Suppose n = 4, and let p 1 = {1, 2, 3, 4} and p 2 = {1, 2, 1, 4}. We have f p1 (X) = f p2 (X) = 1 x m 1,1x m 1,2 1 x m 1,1x m 1,2 xm 2,1 xm 3,2 x m x m 2,2 x 4,1x m m 4,2 3,1 xm 2,1 xm 1,2 x m x m 2,2 x 4,1x m m 4,2 1,1 The first thing to observe is that by the symmetry in the construction, for any f p, the total degree of the numerator is equal to the total degree of the denominator. More importantly, note that in f p2, x m 1,2 appears both in the numerator and the denominator. In contrast, there is no cancellation of monomials in f p1. These observations hold in general. Let p be any non-simple path, and assume a node v appears in the j th and k th positions of p. The binary representations of j and k differ in at least one bit, let this be the t th bit. Then, by construction, x m v,t appears in the numerator of g j,p and in the denominator of g k,p (or vice versa), and so we get a cancellation. On the other hand, if p is a simple path, there is no cancellation of monomials, since each g i,p uses a different subset of the variables. We will integrate F G over the cube C = [1, c] n log n, so we study its properties in this cube. Each term f p (X) is increasing in the variables appearing in the numerator and decreasing in the variables appearing in the denominator. By setting the former to c and the latter to 1, we get that the maximum value of f p (X) is O(c nm log n ). Since there are at most n! paths, it follows that for any X C, F G (X) can be expressed with O(mn 2 log 2 n) bits. As noted in [1], the smoothness parameter α, can be upper bounded by α n log n max,x i X ln F (X) x i = n log n max p P f p(x)/ x i,x i X p P f p(x) Let x i X be any variable. Then, since the exponent of x i is at most nm, for all points X in C, we have f p (X) x nmf p (X) 4 (1)

which combined with inequality 1, gives α n 2 m log n. The definition of β trivially implies that any upper bound for f(x) is also an upper bound for β. From the above analysis we get β O(mn 2 log 2 n). A note about the algorithm of [1] is due here. The algorithm operates on a grid imposed on C. The coordinates of the grid are multiples of γ 1/2α. From the bound on α it follows that we are interested in evaluating F G at points which are rationals that are expressible in polynomial space. From the definition of F G, its value at any point of the grid, is also a rational expressible in polynomial space. We finally note that F G (X) has some additional interesting properties. First, F G has derivatives of any order, everywhere in the cube C. Also, its form is relatively simple, as it is a sum of rational multivariate polynomials. In addition, given a graph G we can easily obtain a closed form for the integral of F G, of exponential length though. 3.2 A polynomial time algorithm to evaluate F G We give an algorithm that computes F G (X) at any point X, in n time steps. We extend the definition of the path terms, to paths of length t. Concretely, we let t f p (X) = g i,p (X) with g i,p (X) = i=1 j w(i) xm v i,j j z(i) xm v i,j Let P t (v) be the set of paths of length t, that end in node v. Inductively, assume that just before time step t, for every v V, we have computed Q t 1 (v) = f p (X) p P t 1 (v) Let N(v) denote the set of neighbors of node v. At step t, for each node v we compute j w(t) Q t (v) = xm v,j Q t 1 (v ) j z(t) xm v,j v N(v) After n steps the quantities Q n (v) have been computed for all vertices v V. Then, F G (X) = v V Q n (v) The computation of Q t (v) is performed with a polynomial number of operations. Since there are n steps and n vertices, it follows that F G can be computed with a polynomial number of operations. The points we are interested in are rationals that can be expressed in polynomial space, and from the observations of the previous subsection all the intermediate quantities can be expressed in polynomial space. It follows that F G (X) can be evaluated exactly, at any point X C, in time polynomial in n. 5

3.3 Bounding the integrals Recall that we integrate F (X) over a cube C = [1, c] n log n. Let dx = dx 1,1... dx n,log n and π be an arbitrary permutation of the variable names. Since f p (X)dX = f p (X)dX p P p P we can consider the integral of each path separately. We will refer to the value of the integral of a term corresponding to a path p as the integral of p. Also, since f p (x 1,..., x n,log n )dx = f p (π(x 1 ),..., π(x n,log n )) dx we can rename the variables in any term of F. It is then easy to see that the integral of a path depends only on the structure of the path and not on the particular vertices appearing on it. In the rest of the chapter we rename the variables, by letting X = {x 1, x 2,..., x nlog n }. We first consider the integral of a Hamiltonian path. Using the fact that there are no cancellations of monomials we have (n log n)/2 n log n I H = 1 dx = i=1 x m i j=(n log n)/2+1 = (m + 1) (n log n)/2 (m 1) (n log n)/2 (1 c m+1 ) (n log n)/2 (c m+1 (n log n)/2 1) x m j Let us now consider the integral of an arbitrary non-simple path p. As noted in Section 3.1, we get at least one cancellation of monomials in f p (X). In general, we have X = X 0 X 1 X 2, where X 0 is the set of variables that do not appear in f p, X 1 is the set of variables that contribute monomials of negative degree, and X 2 is the set of variables that contribute monomials of positive degree. We have f p (X) = g 1 (X 1 )g 2 (X 2 ), and thus I(f p ) = dx 0 g 1 (X 1 )dx 1 g 2 (X 2 )dx 2 X 0 C X 1 C X 2 C It is not hard to see that we have dx 0 (c 1) n log n X 0 C and X 1 C g 1 (X 1 )dx 1 < 1 Because of the cancellation, and using the fact that the total degrees of the nominator and the denominator are equal in f p (X), we have g 2 (X) = x ajm j x j X 2 where a j (n log n)/2 1 j 6

By integrating, we get X 2 g 2 (X 2 )dx 2 j ( c a j m+1 1 ) j ( c a j m+1 ) c (n log n)/2 c j a jm c (n log n)/2 c m(n log n)/2 c m Now suppose that G is not Hamiltonian. Since there are at most n n c n log n paths in the graph, the integral of the associated function is bounded above by I NH c 2n log n c (mn log n)/2 c m On the other hand, if G has #HP Hamiltonian paths then the integral satisfies For m = Ω(n log n), we asymptotically have, (#HP )I H I(F ) (#HP )I H + I NH I H (m + 1) n log n c (n log n)/2 (mn log n)/2 c We assume that m is polynomially bounded and thus m = o(c log2 n ) and (m + 1) n log n c n log3 n. This asymptotically gives the following gap I H I NH c m n log3 n By taking m = n k = Ω(n log n) be any fixed polynomial, Theorem 1.2 is proved. By taking m = 2n log 3 n the ratio is greater than 2 n, and also b = O(n 3 log 5 n). Thus, Theorem 1.1 follows. 4 Conclusions We showed that for any fixed natural k, it is NP-hard to 2 nk -approximate the integral of n- variate positive smooth functions. We also argued that the currently best known integration algorithm cannot be improved significantly, unless the class SNP, which includes several well studied NP-complete problems, has algorithms that break the currently known strongly exponential upper bounds. We wish to note that at the heart of our proof lies an attempt to algebraically isolate the term corresponding to the Hamiltonian path, based only on the fact that it is a simple path. In [9], Valiant suggested that any proof of a lower bound for NP should be preceded by explaining the failure of similar restricted algebraic approaches. In this sense, we feel that the most interesting question left open in this paper, is whether we can show a lower bound on the parameter β for all the polynomial (or more generally subexponential) time computable functions F G which can be constructed using the techniques of this paper. That would prove that the algorithmic approach underlying our proof fails entirely, and would also provide a partial answer to Valiant s question. 7

References [1] David Applegate and Ravi Kannan. Sampling and integration of near log-concave functions. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, 1991. [2] Graham Brightwell and Peter Winkler. Counting linear extensions is #P-complete. In Proceedings of the 23th Annual ACM Symposium on Theory of Computing, 1991. [3] Martin Dyer, Alan Frieze, and Mark Jerrum. On counting independent sets in graphs. In Proceedings of the 40nd Annual Symposium on the Foundations of Computer Science, 1999. [4] Martin Dyer, Alan Frieze, and Ravi Kannan. A random polynomial-time algorithm for approximating the volume of convex bodies. Journal of the ACM, 38:1 17, 1991. [5] Martin Dyer, Leslie Ann Goldberg, Catherine Greenhill, and Mark Jerrum. On the relative complxity of approximate counting problems. Algorithmica, to appear. [6] Martin Dyer and Catherine Greenhill. Random walks on combinatorial objects. Surveys in Combinatorics, 267:101 136, 1999. [7] R. Impagliazzo, R. Paturi, and F. Zane. Which problems have strongly exponential complexity? In Proceedings of the 39th Annual Symposium on Foundations of Computer Science, page 653. IEEE Computer Society, 1998. [8] Richard M. Karp. Reducibility among combinatorial problems. In R. E. Miller and J. W. Thatcher, editors, Complexity of Computer Computations, pages 85 103. Plenum Press, 1972. [9] Leslie G. Valiant. Why is boolean complexity difficult? Boolean Function Complexity, Lond. Math. Soc. Lecure Note Ser., 169. [10] Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal of Computing, 8(3):410 421, 1979. 8