The Cut-Norm and Combinatorial Optimization
Alan Frieze and Ravi Kannan
Outline of the Talk
The Cut-Norm and a Matrix Decomposition. Max-Cut given the Matrix Decomposition. Quadratic Assignment given the Matrix Decomposition. Constructing the Matrix Decomposition via the Grothendieck Identity (Alon and Naor). Multi-Dimensional Matrices.
The cut-norm of the $R \times C$ matrix $A$ is defined to be
$$\|A\|_\square = \max_{S \subseteq R,\, T \subseteq C} |A(S, T)|.$$

Relation to regularity: $D$ is an $R \times C$ matrix with $D(i,j) = d$; $W$ is an $R \times C$ matrix with $\|W\|_\square \le \epsilon |R||C|$; $A = D + W$. Then
$$\bigl|A(S, T) - d\,|S||T|\bigr| \le \epsilon |R||C|.$$
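Since $S$ and $T$ range over all subsets, the cut-norm can be computed by brute force for tiny matrices. A minimal Python sketch (exponential time, for illustration only; the helper names are ours):

```python
from itertools import chain, combinations

def subsets(indices):
    # All subsets (including the empty set) of a list of row/column indices.
    return chain.from_iterable(combinations(indices, k)
                               for k in range(len(indices) + 1))

def cut_norm(A):
    # ||A||_cut = max over S ⊆ rows, T ⊆ cols of |A(S, T)|,
    # where A(S, T) = sum of A[i][j] over i in S, j in T.
    m, n = len(A), len(A[0])
    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subsets(range(m)) for T in subsets(range(n)))
```

For an all-ones matrix the maximizing pair is everything, while sign cancellation keeps the cut-norm of $\begin{psmallmatrix}1&-1\\-1&1\end{psmallmatrix}$ at 1.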
Cut Matrices. Given $S \subseteq R$, $T \subseteq C$ and a real value $d$, the $R \times C$ cut matrix $C = \mathrm{CUT}(S, T, d)$ is defined by
$$C(i,j) = \begin{cases} d & (i,j) \in S \times T, \\ 0 & \text{otherwise,} \end{cases}
\qquad \text{e.g.} \qquad
\begin{pmatrix}
d & d & d & d & 0 & 0 \\
d & d & d & d & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}.$$
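The cut matrix is straightforward to materialize. A tiny sketch in plain Python (function name is ours):

```python
def cut_matrix(m, n, S, T, d):
    # CUT(S, T, d): the m x n matrix equal to d on S x T and 0 elsewhere.
    return [[d if i in S and j in T else 0 for j in range(n)] for i in range(m)]

# The 5 x 6 example from the slide: d on the first two rows, first four columns.
C = cut_matrix(5, 6, {0, 1}, {0, 1, 2, 3}, 7)
```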
$D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$.

Matrix Decomposition: $A = D^{(1)} + D^{(2)} + \cdots + D^{(s)} + W$.

We want $s$, $\max_t\{|d_t|\}$ and $\|W\|_\square$ to be small. Achievable: $\|W\|_\square \le \epsilon mn$ (where $m = |R|$, $n = |C|$), $\max_t\{|d_t|\} = O(1)$ and $s = O(1/\epsilon^2)$.
Let $\|A\|_F = \left(\sum_{i,j} A(i,j)^2\right)^{1/2}$ be the Frobenius norm of $A$.
Assume inductively that we have found cut matrices $D^{(j)} = \mathrm{CUT}(R_j, C_j, d_j)$ such that $W^{(t)} = A - (D^{(1)} + D^{(2)} + \cdots + D^{(t)})$ satisfies
$$\|W^{(t)}\|_F^2 \le (1 - \epsilon^2 t)\,\|A\|_F^2.$$

Suppose there exist $S \subseteq R$, $T \subseteq C$ such that $|W^{(t)}(S, T)| \ge \epsilon \sqrt{mn}\,\|A\|_F$. Let $R_{t+1} = S$, $C_{t+1} = T$ and $d_{t+1} = \dfrac{W^{(t)}(S, T)}{|S||T|}$. Then
$$\|W^{(t+1)}\|_F^2 - \|W^{(t)}\|_F^2 = \|W^{(t)} - D^{(t+1)}\|_F^2 - \|W^{(t)}\|_F^2
= \sum_{i \in R_{t+1}} \sum_{j \in C_{t+1}} \left((W^{(t)}(i,j) - d_{t+1})^2 - W^{(t)}(i,j)^2\right)
= -|R_{t+1}||C_{t+1}|\, d_{t+1}^2 = -\frac{W^{(t)}(R_{t+1}, C_{t+1})^2}{|R_{t+1}||C_{t+1}|} \le -\epsilon^2 \|A\|_F^2.$$

Conclusion: there exist $D^{(1)}, \ldots, D^{(s)}$, $s \le \epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon \sqrt{mn}\,\|A\|_F$.
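The induction is constructive. A minimal Python sketch of the greedy peeling loop, using exhaustive search as a stand-in for the cut-norm oracle (so it only runs on tiny matrices; names and structure are our illustration, not the paper's implementation):

```python
from itertools import chain, combinations

def nonempty_subsets(k):
    return chain.from_iterable(combinations(range(k), r) for r in range(1, k + 1))

def decompose(A, eps):
    # Peel off CUT(R_{t+1}, C_{t+1}, d_{t+1}) while some pair (S, T) has
    # |W(S, T)| >= eps * sqrt(mn) * ||A||_F; each step lowers ||W||_F^2
    # by at least eps^2 * ||A||_F^2, so at most 1/eps^2 cuts are peeled.
    m, n = len(A), len(A[0])
    frob = sum(a * a for row in A for a in row) ** 0.5
    W = [row[:] for row in A]
    cuts = []
    while True:
        S, T = max(((S, T) for S in nonempty_subsets(m)
                    for T in nonempty_subsets(n)),
                   key=lambda st: abs(sum(W[i][j] for i in st[0] for j in st[1])))
        val = sum(W[i][j] for i in S for j in T)
        if abs(val) < eps * (m * n) ** 0.5 * frob:
            return cuts, W
        d = val / (len(S) * len(T))
        for i in S:
            for j in T:
                W[i][j] -= d
        cuts.append((S, T, d))

cuts, W = decompose([[1, 0], [0, 1]], 0.5)  # a single cut with d = 1/2 suffices
```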
Refinements. Suppose that we can only compute $R_{t+1}, C_{t+1}$ such that
$$|W^{(t)}(R_{t+1}, C_{t+1})| \ge \rho\, \|W^{(t)}\|_\square, \qquad \rho \le 1.$$

Conclusion: we can compute $D^{(1)}, \ldots, D^{(s)}$, $s \le \rho^{-2}\epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon\sqrt{mn}\,\|A\|_F$.
Suppose that $\|W^{(t)}\|_\square \ge \epsilon\sqrt{mn}\,\|A\|_F$ and we have computed $R_{t+1}, C_{t+1}$ such that $|W^{(t)}(R_{t+1}, C_{t+1})| \ge \rho\epsilon\sqrt{mn}\,\|A\|_F$.

If $|R_{t+1}| < m/2$ then, since $W^{(t)}(R, C_{t+1}) = W^{(t)}(R_{t+1}, C_{t+1}) + W^{(t)}(R \setminus R_{t+1}, C_{t+1})$, either
(i) $|W^{(t)}(R, C_{t+1})| \ge \frac{1}{2}\rho\epsilon\sqrt{mn}\,\|A\|_F$ or
(ii) $|W^{(t)}(R \setminus R_{t+1}, C_{t+1})| \ge \frac{1}{2}\rho\epsilon\sqrt{mn}\,\|A\|_F$,
and both $R$ and $R \setminus R_{t+1}$ contain at least $m/2$ rows.

Conclusion: we can compute $D^{(1)}, \ldots, D^{(s)}$, $s \le 4\rho^{-2}\epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon\sqrt{mn}\,\|A\|_F$ and such that $|R_i| \ge m/2$ and $|C_i| \ge n/2$. Then
$$\sum_{t=1}^s |R_t||C_t|\, d_t^2 \le \|A\|_F^2 \implies \sum_{t=1}^s d_t^2 \le \frac{4\|A\|_F^2}{mn}.$$
MAX-CUT
$G = (V, E)$ is a graph with $n \times n$ adjacency matrix $A$ and $A = D^{(1)} + D^{(2)} + \cdots + D^{(s)} + W$, where $D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$, $|d_t| \le 2$, $s = O(1/\epsilon^2)$ and $\|W\|_\square \le \epsilon n^2$.

If $(S, \bar S)$ is a cut in $G$ then the weight of this cut satisfies
$$\left|A(S, \bar S) - (D^{(1)} + D^{(2)} + \cdots + D^{(s)})(S, \bar S)\right| \le \epsilon n^2,$$
where
$$\sum_{t=1}^s D^{(t)}(S, \bar S) = \sum_{t=1}^s d_t f_t g_t, \qquad f_t = |S \cap R_t|,\quad g_t = |\bar S \cap C_t|.$$
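Given the decomposition, the surrogate $\sum_t d_t f_t g_t$ for a candidate cut is cheap to evaluate from the cut matrices alone. A small sketch (the storage convention, triples $(R_t, C_t, d_t)$, is our assumption):

```python
def decomposition_cut_weight(cuts, S, Sbar):
    # sum_t d_t * f_t * g_t with f_t = |S ∩ R_t| and g_t = |Sbar ∩ C_t|;
    # this approximates the true cut weight A(S, Sbar) to within eps * n^2.
    return sum(d * len(set(S) & set(R)) * len(set(Sbar) & set(C))
               for R, C, d in cuts)

# One cut matrix of density 1 on {0,1} x {2,3}: the cut ({0,1}, {2,3})
# scores 1 * |{0,1}| * |{2,3}|.
w = decomposition_cut_weight([((0, 1), (2, 3), 1.0)], {0, 1}, {2, 3})
```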
So we look for $S$ to (approximately) maximize $\sum_{t=1}^s d_t f_t g_t$.

Let $\nu = \epsilon n / s$ and $\bar f_t = \nu \lfloor f_t/\nu \rfloor$, $\bar g_t = \nu \lfloor g_t/\nu \rfloor$. Then
$$\left|\sum_{t=1}^s d_t f_t g_t - \sum_{t=1}^s d_t \bar f_t \bar g_t\right| \le 6\nu n s = 6\epsilon n^2.$$

So we can look for $S$ to (approximately) maximize $\sum_{t=1}^s d_t \bar f_t \bar g_t$.
There are $(2/\epsilon)^{2s}$ choices for the sequence $(\bar f_t, \bar g_t)$, so we enumerate all possibilities and check whether there is a cut with (approximately) these parameters.

$\mathcal{P} = \{V_1, V_2, \ldots, V_k\}$ is the coarsest partition of $V$ (with at most $2^{2s}$ parts) such that each $R_t, C_t$ is the union of sets in $\mathcal{P}$.

We check $(\bar f_t, \bar g_t)$ by solving the LP relaxation of the integer program
$$0 \le x_P \le |P| \quad (P \in \mathcal{P}), \qquad
\bar f_t \le \sum_{P \subseteq R_t} x_P < \bar f_t + \nu, \qquad
\bar g_t \le \sum_{P \subseteq C_t} (|P| - x_P) < \bar g_t + \nu \qquad (1 \le t \le s),$$
and doing some adjusting. (Here $x_P = |S \cap P|$.)
The partition $\mathcal{P}$ has the property that for disjoint $S, T \subseteq V$ we have
$$\left|e(S, T) - \sum_{i \in [k]} \sum_{j \in [k]} d_{i,j}\, |S_i||T_j|\right| \le 2\|W\|_\square \le 2\epsilon n^2,$$
where $d_{i,j} = e(V_i, V_j)/(|V_i||V_j|)$, $S_i = S \cap V_i$ and $T_j = T \cap V_j$.

We could replace $\mathcal{P}$ with an ordinary regular partition; the constants as a function of $\epsilon$ get worse.
Quadratic Assignment. Minimise
$$\sum_{i,j,p,q} A_{i,j,p,q}\, z_{i,p}\, z_{j,q}$$
subject to
$$\sum_k z_{i,k} = \sum_k z_{k,j} = 1, \qquad z_{i,j} \in \{0, 1\} \quad \text{for all } i, j.$$
A set of $n$ items $V$ has to be assigned to a set of $n$ locations $X$, one per location. $z_{i,p} = 1$ means: place item $i$ in position $p = \pi(i)$. $T(i, i') \le 1$ is the amount of traffic between items $i$ and $i'$. $D(x, x')$ is the distance between locations $x$ and $x'$. If item $i$ is assigned to location $\pi(i)$ for $i \in [n]$, the total cost $c(\pi)$ is defined by
$$c(\pi) = \sum_{i=1}^n \sum_{i'=1}^n T(i, i')\, D(\pi(i), \pi(i')).$$
The problem is to minimise $c(\pi)$ over all bijections $\pi : V \to X$.
Metric QAP. $X$ is a metric space with metric $D$ satisfying:
1. $\mathrm{diam}(X) = 1$, i.e. $\max_{x,y} D(x, y) = 1$.
2. For all $\epsilon > 0$ there exists a partition $X = X_1 \cup X_2 \cup \cdots \cup X_l$, $l = l(\epsilon)$, such that $\mathrm{diam}(X_j) \le \epsilon$. We call this an $\epsilon$-refinement of $X$. So there is an $l \times l$ matrix $\hat D$ such that if $x \in X_j$ and $x' \in X_{j'}$ then $|D(x, x') - \hat D(j, j')| \le 2\epsilon$. This partition must be computable in time polynomial in $n$ and $1/\epsilon$.
We call this the metric QAP.
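As a concrete (hypothetical) instance of condition 2: for points of the unit square under the $L_\infty$ metric, grid cells of side $\epsilon/2$ give an $\epsilon$-refinement with $l(\epsilon) = O(1/\epsilon^2)$ parts, computable in linear time. A minimal sketch:

```python
def eps_refinement(points, eps):
    # Bucket points of [0,1]^2 into grid cells of side eps/2; each cell has
    # L-infinity diameter at most eps/2 <= eps, so the nonempty cells form
    # an eps-refinement with O(1/eps^2) parts.
    side = eps / 2
    parts = {}
    for (x, y) in points:
        parts.setdefault((int(x // side), int(y // side)), []).append((x, y))
    return list(parts.values())

# Two nearby points share a cell; the far point gets its own part.
parts = eps_refinement([(0.14, 0.13), (0.16, 0.12), (0.91, 0.93)], 0.2)
```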
We decompose $T = T_1 + T_2 + \cdots + T_s + W$ where $\|W\|_\square \le \epsilon n^2$. For a bijection $\pi : V \to X$ we have
$$c(\pi) = \sum_{k=1}^s \sum_{i,j=1}^n T_k(i,j)\, D(\pi(i), \pi(j)) + E_1.$$
We compute an $O(\epsilon^3)$-refinement of $X$ and let $S_i^{(\pi)} = \pi^{-1}(X_i)$. Then
$$\sum_{k=1}^s \sum_{i,j=1}^n T_k(i,j)\, D(\pi(i), \pi(j)) = \sum_{k=1}^s \sum_{i,j=1}^l d_k\, |R_k \cap S_i^{(\pi)}|\, |C_k \cap S_j^{(\pi)}|\, \hat D(i,j) + E_2,$$
where $E_1, E_2$ are small error terms.
Computing a decomposition. This is reducible to finding an approximation to the cut-norm. Our paper gave algorithms with a small additive error. Alon and Naor gave an approximation algorithm with multiplicative error!
Let
$$\|A\|_1 = \max_{x_i, y_j \in \{\pm 1\}} \sum_{i,j} A(i,j)\, x_i y_j.$$

$$\sum_{i,j} A(i,j)\, x_i y_j = \sum_{\substack{x_i = 1 \\ y_j = 1}} A_{i,j} \;-\; \sum_{\substack{x_i = 1 \\ y_j = -1}} A_{i,j} \;-\; \sum_{\substack{x_i = -1 \\ y_j = 1}} A_{i,j} \;+\; \sum_{\substack{x_i = -1 \\ y_j = -1}} A_{i,j}.$$

So $\|A\|_1 \le 4\|A\|_\square$. A similar argument gives $\|A\|_\square \le \|A\|_1$.
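Both inequalities can be checked by brute force on a small matrix. A sketch (exponential enumeration, illustration only; helper names are ours):

```python
from itertools import product, chain, combinations

def inf_to_one_norm(A):
    # ||A||_1 = max over x in {-1,+1}^m, y in {-1,+1}^n of sum A[i][j] x_i y_j.
    m, n = len(A), len(A[0])
    return max(sum(A[i][j] * x[i] * y[j] for i in range(m) for j in range(n))
               for x in product((-1, 1), repeat=m)
               for y in product((-1, 1), repeat=n))

def cut_norm(A):
    # ||A||_cut = max over subsets S, T of |A(S, T)|.
    m, n = len(A), len(A[0])
    subs = lambda k: chain.from_iterable(combinations(range(k), r)
                                         for r in range(k + 1))
    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subs(m) for T in subs(n))

A = [[1, -2], [3, 4]]
# Sandwich: ||A||_cut <= ||A||_1 <= 4 ||A||_cut.
assert cut_norm(A) <= inf_to_one_norm(A) <= 4 * cut_norm(A)
```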
Grothendieck's Identity. $u, v$ are unit vectors in a Hilbert space $H$ and $z$ is chosen uniformly from $B = \{x : \|x\| = 1\}$. Then
$$\mathbf{E}[\operatorname{sign}(u \cdot z)\,\operatorname{sign}(v \cdot z)] = \frac{2}{\pi} \arcsin(u \cdot v).$$
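The identity is easy to sanity-check by Monte Carlo: a standard Gaussian vector points in a uniformly random direction, and only the direction of $z$ matters to the signs. A sketch in plain Python (function name and parameters are ours):

```python
import math
import random

def sign_correlation(u, v, trials=200_000, seed=1):
    # Estimate E[sign(u.z) sign(v.z)] over random directions z;
    # Grothendieck's identity says this equals (2/pi) * arcsin(u.v).
    random.seed(seed)
    total = 0
    for _ in range(trials):
        z = [random.gauss(0.0, 1.0) for _ in u]
        du = sum(a * b for a, b in zip(u, z))
        dv = sum(a * b for a, b in zip(v, z))
        total += (1 if du > 0 else -1) * (1 if dv > 0 else -1)
    return total / trials

u, v = (1.0, 0.0), (0.6, 0.8)          # unit vectors with u.v = 0.6
estimate = sign_correlation(u, v)
exact = (2 / math.pi) * math.asin(0.6)
```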
Let $(u_i, v_j)$ define
$$L_A = \max_{u_i, v_j} \sum_{i,j} A(i,j)\, u_i \cdot v_j \quad (\ge \|A\|_1),$$
where the $u_i, v_j$ lie in $\mathbf{R}^{m+n}$ and $\|u_i\| = \|v_j\| = 1$.

The $(u_i, v_j)$ are computable via Semi-Definite Programming.
Let $c = \sinh^{-1}(1) = \ln(1 + \sqrt{2})$. Then
$$\sin(c\, u_i \cdot v_j) = \sum_{k=0}^\infty \frac{(-1)^k c^{2k+1}}{(2k+1)!} (u_i \cdot v_j)^{2k+1}
= \sum_{k=0}^\infty \frac{(-1)^k c^{2k+1}}{(2k+1)!} (u_i)^{\otimes(2k+1)} \cdot (v_j)^{\otimes(2k+1)}
= S(u_i) \cdot T(v_j).$$

Here
$$S(u_i) = \bigoplus_{k=0}^\infty (-1)^k \sqrt{\frac{c^{2k+1}}{(2k+1)!}}\; (u_i)^{\otimes(2k+1)}, \qquad
T(v_j) = \bigoplus_{k=0}^\infty \sqrt{\frac{c^{2k+1}}{(2k+1)!}}\; (v_j)^{\otimes(2k+1)},$$
and, for example, $(u_1, u_2, u_3, u_4)^{\otimes 3} = (u_1^3,\, u_1^2 u_2,\, u_1^2 u_3,\, u_1^2 u_4,\, u_1 u_2 u_3,\, \ldots)$.

Note that $c$ has been chosen so that $\|S(u_i)\| = \|T(v_j)\| = 1$.
$$L_A = \sum_{i,j} A_{i,j}\, u_i \cdot v_j
= c^{-1} \sum_{i,j} A_{i,j} \arcsin\bigl(S(u_i) \cdot T(v_j)\bigr)
= \frac{c^{-1}\pi}{2} \sum_{i,j} A_{i,j}\, \mathbf{E}\bigl[\operatorname{sign}(S(u_i) \cdot z)\,\operatorname{sign}(T(v_j) \cdot z)\bigr].$$

We embed the $S(u_i), T(v_j)$ in $\mathbf{R}^{m+n}$, choose $z$ randomly, and put $x_i = \operatorname{sign}(S(u_i) \cdot z)$, $y_j = \operatorname{sign}(T(v_j) \cdot z)$. By choosing many $z$ we get a good estimate of $\|A\|_1$. One can recover a good solution $(x_i, y_j)$ by first deciding whether $x_1 = 1$ or $x_1 = -1$, etcetera.

One can extend the idea to approximate the cut-norm, with the same guarantee.
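A sketch of the random-hyperplane rounding step, with given unit vectors standing in for the SDP/tensor-embedded solution (the vectors and helper names below are hypothetical stand-ins, not the output of an actual SDP solve):

```python
import random

def round_to_signs(A, U, V, samples=200, seed=7):
    # For each random z, set x_i = sign(U_i . z), y_j = sign(V_j . z)
    # and keep the sign pattern maximizing sum_{i,j} A[i][j] x_i y_j.
    random.seed(seed)
    dim = len(U[0])
    best = None
    for _ in range(samples):
        z = [random.gauss(0.0, 1.0) for _ in range(dim)]
        x = [1 if sum(a * b for a, b in zip(u, z)) > 0 else -1 for u in U]
        y = [1 if sum(a * b for a, b in zip(v, z)) > 0 else -1 for v in V]
        val = sum(A[i][j] * x[i] * y[j]
                  for i in range(len(A)) for j in range(len(A[0])))
        if best is None or val > best[0]:
            best = (val, x, y)
    return best

A = [[1, -1], [-1, 1]]
# Orthogonal stand-in vectors: some z separates them, giving the optimum 4.
val, x, y = round_to_signs(A, [(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (0.0, 1.0)])
```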
In Frieze, Kannan we gave a randomised algorithm for computing a weak partition using only $2^{\tilde O(\epsilon^{-2})}$ time.

Subsequently, Alon, Fernandez de la Vega, Kannan, Karpinski and Yuster showed how to compute such a partition using only $\tilde O(\epsilon^{-4})$ probes. (A similar but weaker result was obtained by Anderson, Engebretson.) See also Borgs, Chayes, Lovász, Sós, Vesztergombi.

The above also applies to multi-dimensional arrays.
A multi-dimensional version. Max-r-CSP is the following problem: we are given $m$ Boolean functions $f_i$ defined on $Y_i = (y_1, y_2, \ldots, y_r)$, where $\{y_1, y_2, \ldots, y_r\} \subseteq \{x_1, x_2, \ldots, x_n\}$, and the aim is to choose a setting of the variables $x_1, x_2, \ldots, x_n$ that makes as many of the functions $f_i$ as possible true.

For each $z \in \{0,1\}^r$ and $(i_1, i_2, \ldots, i_r) \in [n]^r$ we define
$$A^{(z)}(i_1, i_2, \ldots, i_r) = \bigl|\{ j : Y_j = (x_{i_1}, \ldots, x_{i_r}) \text{ and } f_j(z_1, \ldots, z_r) = T \}\bigr|.$$

The problem becomes to maximize, over $x_1, x_2, \ldots, x_n$,
$$\sum_z \sum_{i_1, i_2, \ldots, i_r} A^{(z)}(i_1, i_2, \ldots, i_r)\, (-1)^{r - \sum_t z_t} \prod_{t=1}^r (x_{i_t} + z_t - 1).$$
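The product formula works because $(-1)^{r - \sum_t z_t}\prod_{t=1}^r (x_{i_t} + z_t - 1)$ is exactly the 0-1 indicator that the chosen Boolean values match the pattern $z$. A quick exhaustive check (function name is ours):

```python
from itertools import product

def pattern_indicator(x, z):
    # (-1)^(r - sum z) * prod_t (x_t + z_t - 1) for Boolean x_t, z_t.
    p = 1
    for xt, zt in zip(x, z):
        p *= xt + zt - 1
    return (-1) ** (len(z) - sum(z)) * p

# Equals 1 iff x == z, and 0 otherwise, over all Boolean triples.
ok = all(pattern_indicator(x, z) == (1 if x == z else 0)
         for x in product((0, 1), repeat=3) for z in product((0, 1), repeat=3))
```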
For this it is useful to have a decomposition for $r$-dimensional matrices. An $r$-dimensional matrix $A$ on $X_1 \times X_2 \times \cdots \times X_r$ is a map $A : X_1 \times X_2 \times \cdots \times X_r \to \mathbf{R}$. If $S_i \subseteq X_i$ for $i = 1, 2, \ldots, r$ and $d$ is a real number, the matrix $M$ satisfying
$$M(e) = \begin{cases} d & e \in S_1 \times S_2 \times \cdots \times S_r, \\ 0 & \text{otherwise} \end{cases}$$
is called a cut matrix and is denoted $M = \mathrm{CUT}(S_1, S_2, \ldots, S_r;\, d)$.
$B$ is the (2-dimensional) matrix with rows indexed by $Y_1 = X_1 \times \cdots \times X_{\hat r}$, $\hat r = \lceil r/2 \rceil$, and columns indexed by $Y_2 = X_{\hat r + 1} \times \cdots \times X_r$. For
$$i = (x_1, \ldots, x_{\hat r}) \in Y_1, \qquad j = (x_{\hat r + 1}, \ldots, x_r) \in Y_2,$$
let $B(i, j) = A(x_1, x_2, \ldots, x_r)$.
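Flattening the $r$-dimensional matrix into $B$ is just a regrouping of indices. A small sketch with $A$ stored as a dict keyed by index tuples (the split point $\hat r = \lceil r/2 \rceil$ and the helper name are our assumptions):

```python
from itertools import product

def flatten(A, dims):
    # Rows of B are tuples over the first ceil(r/2) coordinates,
    # columns are tuples over the rest; B(i, j) = A(i + j).
    r = len(dims)
    rh = (r + 1) // 2
    rows = list(product(*(range(d) for d in dims[:rh])))
    cols = list(product(*(range(d) for d in dims[rh:])))
    B = [[A[i + j] for j in cols] for i in rows]
    return B, rows, cols

dims = (2, 2, 2)
A = {idx: idx[0] + 2 * idx[1] + 4 * idx[2]
     for idx in product(*(range(d) for d in dims))}
B, rows, cols = flatten(A, dims)  # 4 rows (pairs), 2 columns (singletons)
```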
Applying a decomposition algorithm we obtain $B = D^{(1)} + D^{(2)} + \cdots + D^{(s_0)} + W$, where for $1 \le t \le s_0$, $D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$, and $\|W\|_\square$ is small.
Each $R_t$ defines an $\hat r$-dimensional 0-1 matrix $R^{(t)}$, where $R^{(t)}(x_1, \ldots, x_{\hat r}) = 1$ iff $(x_1, \ldots, x_{\hat r}) \in R_t$; $C^{(t)}$ is defined similarly. Assume inductively that we can further decompose
$$R^{(t)} = D^{(t,1)} + \cdots + D^{(t,s_1)} + W^{(t)}, \qquad
C^{(t)} = \hat D^{(t,1)} + \cdots + \hat D^{(t,\hat s_1)} + \hat W^{(t)}.$$
Here
$$D^{(t,u)} = \mathrm{CUT}(R_{t,u,1}, \ldots, R_{t,u,\hat r};\, d_{t,u}), \qquad
\hat D^{(t,\hat u)} = \mathrm{CUT}(\hat R_{t,\hat u,\hat r+1}, \ldots, \hat R_{t,\hat u,r};\, \hat d_{t,\hat u}).$$

It follows that we can write
$$A = \sum_{t,u,\hat u} \mathrm{CUT}(R_{t,u,1}, \ldots, R_{t,u,\hat r},\, \hat R_{t,\hat u,\hat r+1}, \ldots, \hat R_{t,\hat u,r};\, d_{t,u,\hat u}) + W_1,$$
where $W_1$ is small (in cut-norm).
THANK YOU