COUNTING INTEGER POINTS IN POLYHEDRA. Alexander Barvinok

Similar documents
Alexander Barvinok and J.A. Hartigan. January 2010

Alexander Barvinok August 2008

HW1 solutions. 1. α Ef(x) β, where Ef(x) is the expected value of f(x), i.e., Ef(x) = n. i=1 p if(a i ). (The function f : R R is given.

1 Maximal Lattice-free Convex Sets

Chapter 1. Preliminaries

Linear Algebra Review: Linear Independence. IE418 Integer Programming. Linear Algebra Review: Subspaces. Linear Algebra Review: Affine Independence

Key Things We Learned Last Time. IE418 Integer Programming. Proving Facets Way #2 Indirect. A More Abstract Example

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

3. Vector spaces 3.1 Linear dependence and independence 3.2 Basis and dimension. 5. Extreme points and basic feasible solutions

Optimization WS 13/14:, by Y. Goldstein/K. Reinert, 9. Dezember 2013, 16: Linear programming. Optimization Problems

Nonlinear Programming Models

Mathematics 530. Practice Problems. n + 1 }

LMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization

Enumerating integer points in polytopes: applications to number theory. Matthias Beck San Francisco State University math.sfsu.

The Triangle Closure is a Polyhedron

TRISTRAM BOGART AND REKHA R. THOMAS

Integer Quadratic Programming is in NP

Sampling Contingency Tables

Linear and non-linear programming

Integer Programming ISE 418. Lecture 12. Dr. Ted Ralphs

Ehrhart polynome: how to compute the highest degree coefficients and the knapsack problem.

Weighted Ehrhart polynomials and intermediate sums on polyhedra

Counting with rational generating functions

Minimizing Cubic and Homogeneous Polynomials over Integers in the Plane

Lattice closures of polyhedra

10 Years BADGeometry: Progress and Open Problems in Ehrhart Theory

MAT-INF4110/MAT-INF9110 Mathematical optimization

A strongly polynomial algorithm for linear systems having a binary solution

Integer Programming, Part 1

Hermite normal form: Computation and applications

ARCS IN FINITE PROJECTIVE SPACES. Basic objects and definitions

Geometry of log-concave Ensembles of random matrices

On John type ellipsoids

3. Linear Programming and Polyhedral Combinatorics

Lattices and Hermite normal form

A Criterion for the Stochasticity of Matrices with Specified Order Relations

The partial-fractions method for counting solutions to integral linear systems

Introduction to Semidefinite Programming I: Basic properties a

Lecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157

Lecture 2: Lattices and Bases

On mixed-integer sets with two integer variables

Cutting Plane Methods I

Lattice closures of polyhedra

Support Vector Machines and Kernel Methods

CSL361 Problem set 4: Basic linear algebra

Mixed-integer linear representability, disjunctions, and variable elimination

Geometric problems. Chapter Projection on a set. The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as

The Triangle Closure is a Polyhedron

Combinatorial Reciprocity Theorems

Lecture 22: Section 4.7

arxiv:math/ v2 [math.co] 9 Oct 2008

Introduction to Decision Sciences Lecture 6

Linear Programming. Larry Blume Cornell University, IHS Vienna and SFI. Summer 2016

Top Ehrhart coefficients of integer partition problems

Lattice Point Enumeration via Rational Functions, and Applications to Optimization and Statistics

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

3. Probability and Statistics

On the Rational Polytopes with Chvátal Rank 1

Convex Optimization and Modeling

MATH 323 Linear Algebra Lecture 12: Basis of a vector space (continued). Rank and nullity of a matrix.

CSCI 1951-G Optimization Methods in Finance Part 01: Linear Programming

15.083J/6.859J Integer Optimization. Lecture 10: Solving Relaxations

Linear algebra for computational statistics

LMI Methods in Optimal and Robust Control

COURSE ON LMI PART I.2 GEOMETRY OF LMI SETS. Didier HENRION henrion

3. Linear Programming and Polyhedral Combinatorics

8. Geometric problems

Minimal inequalities for an infinite relaxation of integer programs

[POLS 8500] Review of Linear Algebra, Probability and Information Theory

CSCI : Optimization and Control of Networks. Review on Convex Optimization

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

Notes on the decomposition result of Karlin et al. [2] for the hierarchy of Lasserre by M. Laurent, December 13, 2012

An algebraic perspective on integer sparse recovery

A Brief Review on Convex Optimization

Elementary maths for GMT

Analysis and Linear Algebra. Lectures 1-3 on the mathematical tools that will be used in C103

Common-Knowledge / Cheat Sheet

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.

Organization of a Modern Compiler. Middle1

DM559 Linear and Integer Programming. Lecture 2 Systems of Linear Equations. Marco Chiarandini

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2

9. Geometric problems

Lecture 7: Positive Semidefinite Matrices

DS-GA 1002 Lecture notes 10 November 23, Linear models

SYMBOL EXPLANATION EXAMPLE

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

(1) is an invertible sheaf on X, which is generated by the global sections

12. Interior-point methods

Some Properties of Convex Hulls of Integer Points Contained in General Convex Sets

Nonlinear Discrete Optimization

Finite-Dimensional Cones 1

Lecture 1 Introduction

web: HOMEWORK 1

ACO Comprehensive Exam October 18 and 19, Analysis of Algorithms

Linear algebra I Homework #1 due Thursday, Oct. 5

Algebraic and Geometric ideas in the theory of Discrete Optimization

MSRI Summer Graduate Workshop: Algebraic, Geometric, and Combinatorial Methods for Optimization. Part IV

MATHEMATICAL PROGRAMMING I

Grothendieck s Inequality

Transcription:

COUNTING INTEGER POINTS IN POLYHEDRA Alexander Barvinok Papers are available at http://www.math.lsa.umich.edu/ barvinok/papers.html Let P R d be a polytope. We want to compute (exactly or approximately) the number P Z d of integer points in P. P Typeset by AMS-TEX

PART I: EXACT COUNTING IN FIXED DIMENSION Rational polyhedra P = (ξ,...,ξ d ) : d a ij ξ j b i, i =,...,n, j= where a ij,b i Z. The input size of P: L(P) = n(d+)+ ij log 2 ( a ij +) + i log 2 ( b i +), the number of bits needed to define P. 2

Generating functions m x m P We consider the sum m P Z d x m, where The motivating example x m = x µ xµ d d for m = (µ,...,µ d ) n m= x m = xn+ x m x The sum over the integer points on an interval. m n 3

Theorem. Let us fix d. There exists a polynomial time algorithm, which, given a rational polyhedron P R d without lines, computes the generating function f(p,x) = m P Z d x m in the form f(p,x) = i I ǫ i x v i ( x u i ) ( x u id), where ǫ i {,}, v i Z d, u ij Z d \{}. The complexity of the algorithm is L O(d), where L = L(P) is the input size of P. In particular, I = L O(d). Proved in 993 (with L O(d2) bound) and then again in 999. 4

Having computed we can Applications f(p,x) = m P Z d x m, ) efficiently count P Z d, the number of integer points in a given rational polytope P. Carefully substitute x =... = x d = into f(p,x); 2) solve integer programming problems of optimizing a given linear function on the set P Z d of integer points in P. There are at least two implementations: LattE (lattice point enumerator) by J. De Loera et al., now subsumed by LattE macchiato by M. Köppe, and barvinok by S. Verdoolaege http://www.math.ucdavis.edu/ latte http://www.kotnet.org/ skimo/barvinok http://freshmeat.net/projects/barvinok/ 5

Reducing polyhedra to cones Some ideas of the proof We can represent the polyhedron as a hyperplane section of a higher-dimensional cone. P P K We have P = K H, where P R d, K R d+, and H R d+ is an affine hyperplane identified with R d. Consequently, we have f(p,x) = f(k, x) x, where x = (x,x d+). d+ xd+ = 6

Dealing with cones: unimodular cones A cone K R d generated by a basis u,...,u d of Z d is called unimodular. For such a cone, we have f(k,x) = m K Z d x m = ( x u ) ( x u d). P Integer points in the non-negative orthant. m Z d + x m = ( x ) ( x d ). A unimodular cone differs from the non-negative orthant by a unimodular transformation. The main construction (993): For any fixed d there is a polynomial time algorithm which decomposes a given rational cone K R d into a combination of unimodular cones K i : [K] = i I ǫ i [K i ], where ǫ i {,}, K i are unimodular cones, and { if x R d [A] : R d R, [A](x) = otherwise is the indicator of a set A. 7

Example: continued fractions in dimension 2 Suppose that K R 2 is generated by (,) and (3,64). First, we compute 64 3 = 5+ 9 3 = 5+ 3+ 4 9 = 5+ 3+ 2+ 4. Hence 64/3 = [5;3,2,4]. Now we compute the convergents: [5;3,2] = 5+ 3+ 2 = 37 7,[5;3] = 5+ 3 = 6 3, [5] = 5. 8

Then we do cutting and pasting of cones: (3, 6) (7, 37) + (3, 6) (, 5) + (, 5) (3, 64) (3, 64) = (7, 37) Let K be the cone generated by (,) and (,). Starting with K, we cut the cone generated by (,) and (,5); paste the cone generated by (, 5) and (3, 6); cut the cone generated by (3, 6) and (7, 37); paste the cone generated by (7, 37) and (3, 64) to finally get K generated by (,) and (3,64). The four cones we cut and paste are unimodular. So we get f(k,x) = ( x )( x 2 ) ( x 2 )( x x 5 2 ) + ( x x 5 2 )( x3 x6 2 ) ( x 3 x6 2 )( x7 x37 2 ) + ( x 7 x37 2 )( x3 x64 2 ). 9

Higher dimensions For K generated by linearly independent u,...,u d Z d, let indk be the volume of the parallelepiped spanned by u,...,u d (so indk = if and only if K is unimodular). Using Minkowski s Convex Body Theorem, compute w Z d \{} such that w = d α i u i where α i (indk) /d. i= Let K i be the cone generated by u,...,u i,w,u i+,...,u d. Then [K] = d ǫ i [K i ] i= +indicators of lower-dimensional cones, where ǫ i {,} and Now iterate. indk i (indk) d d. = + + u 3 u 3 u 3 u u 2 u u 2 w u 2 w u w u 3 u 2 w u u u 2 w u 3 = + + u w u u 2 2 w u 3 u 2 u 3 u w w u u 3

PART II : APPROXIMATE COUNTING IN HIGHER DIMENSIONS We assume that P is defined by a system of linear equations Ax = b and inequalities so the picture looks more like this x, P Here A = (a ij ) is an integer k n matrix of rank k < n and b is an integer n-vector. k integer entries * * * * * * * * * * * * n Let Λ = Ax : x Z n be the lattice in Z k. Unless b Λ, we have P Z n =. A

Joint work with J.A. Hartigan (Yale). Let us consider the function The Gaussian formula g(ξ) = (ξ +)ln(ξ +) ξlnξ for ξ. 3 2 2 4 6 8 x Let us solve the optimization problem: Find max g(x) = n g(ξ j ) j= Subject to: x = (ξ,...,ξ n ) P. Since g is strictly concave, the maximum point z = (ζ,...,ζ n ) is unique and can be found efficiently by interior point methods, for example. Computing z is easy both in theory and in practice. Let us compute a k k matrix Q = (q ij ) by q ij = n ( ) a im a jm ζ 2 m +ζ m. m= We approximate the number of integer points in P by P Z n e g(z) detλ (2π) k/2 (detq) /2. 2

A couple of examples Let us compute the number of 4 4 non-negative integer matrices with row sums 22, 25, 93 and 64 and column sums 8, 286, 7 and 27. 8 286 7 27 22 25 93 64 * * * * * * * * * * * * * * * * The number of such matrices is 2259427676854.23 5. We have n = 6 variables and k = 7 equations (one equation can be thrown out). The Gaussian formula overestimates by about 6%. J. De Loera computed more examples. Here is one of them: the exact number of 3 3 3 arrays of non-negative integers with the sums [3, 22, 87], [5, 3, 77], [42,87,] along the affine coordinate hyperplanes is 8846838772659 8.84 5. The Gaussian formula gives the relative error of.85%. 87 42 5 3 77 3 22 87 Herewe have n = 27 variablesand k = 7 equations (two equationscan be thrown out). 3

The (first level of) intuition A random variable x is geometric if Pr { x = m } = pq m for m =,,... where p+q = and p,q >. We have Ex = q p and varx = q p 2. Conversely, if Ex = z then p = +z, q = z +z and varx = z 2 +z. Theorem. Let P R n be a polytope that is the intersection of an affine subspace in R n and the non-negative orthant R n +. Suppose that P has a non-empty interior (contains a point with strictly positive coordinates). Then the strictly concave function g(x) = n ) ((ξ j +)ln(ξ j +) ξ j lnξ j j= attains its maximum on P at a unique point z = (ζ,...,ζ n ) with positive coordinates. Suppose that x,...,x n are independent geometric random variables with expectations ζ,...,ζ n and let X = (x,...,x n ). Then the probability mass function of X is constant on P Z n and equal to e g(z) at every x P Z n. In particular, P Z n = e g(z) Pr { X P }. 4

Now, suppose that P = { } x : Ax = b, x. Let Y = AX, that is, Y = (y,...,y k ), where n y i = a ij x j. j= Theorem implies that We note that and that P Z n = e g(z) Pr { Y = b }. EY = b cov(y i,y j ) = n a im a jm varx k = m= n ( ) a im a jm ζ 2 m +ζ m. Now, weobservethaty isthesumofnindependent randomvectorsx j A j, wherea j is the j-th column of A, so we make a leap of faith and assume that the distribution of Y in the vicinity of its expectation is close to the distribution of a Gaussian random vector Y with the same expectation b and the covariance matrix Q. m= b Hence it is not unreasonable to assume that Pr { Y = b } Pr { Y b+fundamental domain of Λ } detλ (2π) k/2 (detq) /2. 5

More intuition from statistics and statistical physics In 957, E.T. Jaynes formulated a general principle. Let Ω be a large but finite probability space with an unknown measure µ, let f,...,f k : Ω R be random variables with known expectations Ef i = α i for i =,...,k and let g : Ω R be yet another random variable. Then to compute or estimate Eg one should assume that µ is the probability measure on Ω of the largest entropy such that that Ef i = α i for i =,...,k. Works great for the Maxwell-Boltzmann distribution. In 963, I.J. Good argued that the null hypothesis concerning an unknown probability distribution from a given class should be the one stating that the distribution is the maximum entropy distribution in the class. In our case, Ω is the set Z n + of non-negative integer vectors, f i are the linear equationsdefining polytopep, and µisthecounting probabilitymeasure onp Z n +. We approximate µ by the maximum entropy distribution on Z n + subject to the constraints Ef i = α i, where f i are the linear equations defining P. Fact: Among all distributions on Z + with a given expectation, the geometric distribution has the maximum entropy. The entropy of a geometric distribution with expectation x is g(x) = (x+)ln(x+) xlnx. A S Let us choose a maximum entropy distribution among all probability distributions on a given set S R n with the expectation in a given affine subspace A. It is not hard to argue that the conditional distribution on S A must be uniform. 6

Multi-way transportation polytopes and multi-way contingency tables Transportationpolytopes. Letuschoosepositiver,...,r m andc,...,c n such that r +...+r m = c +...+c n = N. The polytope P of non-negative m n matrices (x ij ) with row sums r,...,r m and column sums c,...,c n is called a (two-index) transportation polytope. We have dimp = (m )(n ). Suppose r i and c j are integer. Integer points in P are called (two-way) contingency tables. ν-way transportation polytopes. Let us fix an integer ν 2. The polytope P of ν-dimensional k... k ν arrays (x j...j ν ) with prescribed sectional sums j k... j i k i j i+ k i+... j ν k ν are called ν-way transportation polytopes. x j...j i,j,j i+...j ν As long as the natural balance conditions are met, P is a polytope with dimp = k k ν (k +...+k ν )+ν. Integer points in P are called ν-way contingency tables. 7

Some results Fix ν and let k,...,k ν grow roughly proportionately. We proved: The number of integer points in P is asymptotically Gaussian provided ν 5. We suspect it is Gaussian already for ν 3. In particular, the number of non-negative integer k k magic cubes, that is, contingency tables with all sectional sums equal to r = αk ν is ( ) ((α+) +o() α+ α α) k ν ( 2πα 2 +2πα ) (kν ν+)/2 k (ν ν 2 )(k )/2 as k +, provided ν 5, r is integer and α is separated away from ; For ν = 2, the asymptotic of the number of integer contingency tables with equalrowsumsandequalcolumnsumswasrecentlycomputedbye.r.canfieldand B. McKay. It is differs from the Gaussian estimate by a constant factor (generally, greater than ). This corresponds to the Edgeworth correction to the Gaussian distribution. 8

The curious case of ν = 2 The Gaussian formula should be multiplied by the correction factor, computed as follows. Let Z = (z ij ) be the matrix maximizing g(x) = ij ((x ij +)ln(x ij +) x ij lnx ij ) on the polytope of non-negative m n matrices with row sums (r,...,r m ) and column sums (c,...,c n ). Let us consider the quadratic form q : R m+n R: Let q(s,t) = 2 ( ) z 2 ij +z ij (si +t j ) 2 for (s,t) = (s,...,s m ;t,...,t n ). ij u =,...,;,..., }{{}}{{} m times n times and let L R m+n be any hyperplane not containing u. Then the restriction of q onto L is strictly positive definite and we consider the Gaussian probability measure on L with the density proportional to e q. Let us define random variables f,h : L R by Let f(s,t) = z ij (z ij +)(2z ij +)(s i +t j ) 3 6 h(s,t) = 24 Then the correction factor is ij and z ij (z ij +) ( zij 2 +6z ij + ) (s i +t j ) 4 ij for (s,t) = (s,...,s m ;t,...,t n ). µ = Ef 2 and ν = Eh. exp { µ } 2 +ν. 9

Ramifications: counting - points in polytopes A similar approach works for the set P {,} n of points with - coordinates in P R n, where P is defined by the system { } P = x : Ax = b, x. Here A = (a ij ) is a k n integer matrix of rank k < n and b Z k. Let Λ = A(Z n ) Z k. The geometric distribution is replaced by the Bernoulli distribution and function g(ξ) = (ξ +)ln(ξ +) ξlnξ is replaced by the standard entropy function h(x) = ξln ξ +( ξ)ln ξ for ξ..6.4.2 We solve a convex optimization problem: Find max h(x) =.2.4.6.8 x n h(ξ j ) j= Subject to: x = (ξ,...,ξ n ) P. Since h is strictly concave, the maximum point z = (ζ,...,ζ n ) is unique and can be found efficiently by interior point methods, for example. We compute the k d matrix Q = (q ij ) by The Gaussian formula: q ij = n ( a im a jm ζm ζm) 2. m= P {,} n e h(z) detλ (2π) k/2 (detq) /2. 2