The Cut-Norm and Combinatorial Optimization
Alan Frieze and Ravi Kannan
Outline of the Talk
The Cut-Norm and a Matrix Decomposition. Max-Cut given the Matrix Decomposition. Quadratic Assignment given the Matrix Decomposition. Constructing the Matrix Decomposition via the Grothendieck Identity (Alon and Naor). Multi-Dimensional Matrices.
The cut-norm of the $R \times C$ matrix $A$ is defined to be
$$\|A\|_\square = \max_{S \subseteq R,\, T \subseteq C} |A(S, T)|.$$

Relation to regularity: $D$ is an $R \times C$ matrix with $D(i,j) = d$; $W$ is an $R \times C$ matrix with $\|W\|_\square \le \epsilon |R||C|$; $A = D + W$. Then
$$\bigl|A(S, T) - d\,|S||T|\bigr| \le \epsilon |R||C|.$$
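Since $S$ and $T$ range over all subsets, the cut-norm can be computed by brute force for tiny matrices. A minimal Python sketch (exponential time, for illustration only; the helper names are ours):

```python
from itertools import chain, combinations

def subsets(indices):
    # All subsets (including the empty set) of a list of row/column indices.
    return chain.from_iterable(combinations(indices, k)
                               for k in range(len(indices) + 1))

def cut_norm(A):
    # ||A||_cut = max over S ⊆ rows, T ⊆ cols of |A(S, T)|,
    # where A(S, T) = sum of A[i][j] over i in S, j in T.
    m, n = len(A), len(A[0])
    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subsets(range(m)) for T in subsets(range(n)))
```

For an all-ones matrix the maximizing pair is everything, while sign cancellation keeps the cut-norm of $\begin{psmallmatrix}1&-1\\-1&1\end{psmallmatrix}$ at 1.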
Cut Matrices. Given $S \subseteq R$, $T \subseteq C$ and a real value $d$, the $R \times C$ cut matrix $C = \mathrm{CUT}(S, T, d)$ is defined by
$$C(i,j) = \begin{cases} d & (i,j) \in S \times T, \\ 0 & \text{otherwise,} \end{cases}
\qquad \text{e.g.} \qquad
\begin{pmatrix}
d & d & d & d & 0 & 0 \\
d & d & d & d & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}.$$
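The cut matrix is straightforward to materialize. A tiny sketch in plain Python (function name is ours):

```python
def cut_matrix(m, n, S, T, d):
    # CUT(S, T, d): the m x n matrix equal to d on S x T and 0 elsewhere.
    return [[d if i in S and j in T else 0 for j in range(n)] for i in range(m)]

# The 5 x 6 example from the slide: d on the first two rows, first four columns.
C = cut_matrix(5, 6, {0, 1}, {0, 1, 2, 3}, 7)
```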
$D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$.

Matrix Decomposition: $A = D^{(1)} + D^{(2)} + \cdots + D^{(s)} + W$.

We want $s$, $\max_t\{|d_t|\}$ and $\|W\|_\square$ to be small. Achievable: $\|W\|_\square \le \epsilon mn$ (where $m = |R|$, $n = |C|$), $\max_t\{|d_t|\} = O(1)$ and $s = O(1/\epsilon^2)$.
Let $\|A\|_F = \left(\sum_{i,j} A(i,j)^2\right)^{1/2}$ be the Frobenius norm of $A$.
Assume inductively that we have found cut matrices $D^{(j)} = \mathrm{CUT}(R_j, C_j, d_j)$ such that $W^{(t)} = A - (D^{(1)} + D^{(2)} + \cdots + D^{(t)})$ satisfies
$$\|W^{(t)}\|_F^2 \le (1 - \epsilon^2 t)\,\|A\|_F^2.$$

Suppose there exist $S \subseteq R$, $T \subseteq C$ such that $|W^{(t)}(S, T)| \ge \epsilon \sqrt{mn}\,\|A\|_F$. Let $R_{t+1} = S$, $C_{t+1} = T$ and $d_{t+1} = \dfrac{W^{(t)}(S, T)}{|S||T|}$. Then
$$\|W^{(t+1)}\|_F^2 - \|W^{(t)}\|_F^2 = \|W^{(t)} - D^{(t+1)}\|_F^2 - \|W^{(t)}\|_F^2
= \sum_{i \in R_{t+1}} \sum_{j \in C_{t+1}} \left((W^{(t)}(i,j) - d_{t+1})^2 - W^{(t)}(i,j)^2\right)
= -|R_{t+1}||C_{t+1}|\, d_{t+1}^2 = -\frac{W^{(t)}(R_{t+1}, C_{t+1})^2}{|R_{t+1}||C_{t+1}|} \le -\epsilon^2 \|A\|_F^2.$$

Conclusion: there exist $D^{(1)}, \ldots, D^{(s)}$, $s \le \epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon \sqrt{mn}\,\|A\|_F$.
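The induction is constructive. A minimal Python sketch of the greedy peeling loop, using exhaustive search as a stand-in for the cut-norm oracle (so it only runs on tiny matrices; names and structure are our illustration, not the paper's implementation):

```python
from itertools import chain, combinations

def nonempty_subsets(k):
    return chain.from_iterable(combinations(range(k), r) for r in range(1, k + 1))

def decompose(A, eps):
    # Peel off CUT(R_{t+1}, C_{t+1}, d_{t+1}) while some pair (S, T) has
    # |W(S, T)| >= eps * sqrt(mn) * ||A||_F; each step lowers ||W||_F^2
    # by at least eps^2 * ||A||_F^2, so at most 1/eps^2 cuts are peeled.
    m, n = len(A), len(A[0])
    frob = sum(a * a for row in A for a in row) ** 0.5
    W = [row[:] for row in A]
    cuts = []
    while True:
        S, T = max(((S, T) for S in nonempty_subsets(m)
                    for T in nonempty_subsets(n)),
                   key=lambda st: abs(sum(W[i][j] for i in st[0] for j in st[1])))
        val = sum(W[i][j] for i in S for j in T)
        if abs(val) < eps * (m * n) ** 0.5 * frob:
            return cuts, W
        d = val / (len(S) * len(T))
        for i in S:
            for j in T:
                W[i][j] -= d
        cuts.append((S, T, d))

cuts, W = decompose([[1, 0], [0, 1]], 0.5)  # a single cut with d = 1/2 suffices
```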
Refinements. Suppose that we can only compute $R_{t+1}, C_{t+1}$ such that
$$|W^{(t)}(R_{t+1}, C_{t+1})| \ge \rho\, \|W^{(t)}\|_\square, \qquad \rho \le 1.$$

Conclusion: we can compute $D^{(1)}, \ldots, D^{(s)}$, $s \le \rho^{-2}\epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon\sqrt{mn}\,\|A\|_F$.
Suppose that $\|W^{(t)}\|_\square \ge \epsilon\sqrt{mn}\,\|A\|_F$ and we have computed $R_{t+1}, C_{t+1}$ such that $|W^{(t)}(R_{t+1}, C_{t+1})| \ge \rho\epsilon\sqrt{mn}\,\|A\|_F$.

If $|R_{t+1}| < m/2$ then, since $W^{(t)}(R, C_{t+1}) = W^{(t)}(R_{t+1}, C_{t+1}) + W^{(t)}(R \setminus R_{t+1}, C_{t+1})$, either
(i) $|W^{(t)}(R, C_{t+1})| \ge \frac{1}{2}\rho\epsilon\sqrt{mn}\,\|A\|_F$ or
(ii) $|W^{(t)}(R \setminus R_{t+1}, C_{t+1})| \ge \frac{1}{2}\rho\epsilon\sqrt{mn}\,\|A\|_F$,
and both $R$ and $R \setminus R_{t+1}$ contain at least $m/2$ rows.

Conclusion: we can compute $D^{(1)}, \ldots, D^{(s)}$, $s \le 4\rho^{-2}\epsilon^{-2}$, such that $\|W^{(s)}\|_\square \le \epsilon\sqrt{mn}\,\|A\|_F$ and such that $|R_i| \ge m/2$ and $|C_i| \ge n/2$. Then
$$\sum_{t=1}^s |R_t||C_t|\, d_t^2 \le \|A\|_F^2 \implies \sum_{t=1}^s d_t^2 \le \frac{4\|A\|_F^2}{mn}.$$
MAX-CUT
$G = (V, E)$ is a graph with $n \times n$ adjacency matrix $A$ and $A = D^{(1)} + D^{(2)} + \cdots + D^{(s)} + W$, where $D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$, $|d_t| \le 2$, $s = O(1/\epsilon^2)$ and $\|W\|_\square \le \epsilon n^2$.

If $(S, \bar S)$ is a cut in $G$ then the weight of this cut satisfies
$$\left|A(S, \bar S) - (D^{(1)} + D^{(2)} + \cdots + D^{(s)})(S, \bar S)\right| \le \epsilon n^2,$$
where
$$\sum_{t=1}^s D^{(t)}(S, \bar S) = \sum_{t=1}^s d_t f_t g_t, \qquad f_t = |S \cap R_t|,\quad g_t = |\bar S \cap C_t|.$$
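Given the decomposition, the surrogate $\sum_t d_t f_t g_t$ for a candidate cut is cheap to evaluate from the cut matrices alone. A small sketch (the storage convention, triples $(R_t, C_t, d_t)$, is our assumption):

```python
def decomposition_cut_weight(cuts, S, Sbar):
    # sum_t d_t * f_t * g_t with f_t = |S ∩ R_t| and g_t = |Sbar ∩ C_t|;
    # this approximates the true cut weight A(S, Sbar) to within eps * n^2.
    return sum(d * len(set(S) & set(R)) * len(set(Sbar) & set(C))
               for R, C, d in cuts)

# One cut matrix of density 1 on {0,1} x {2,3}: the cut ({0,1}, {2,3})
# scores 1 * |{0,1}| * |{2,3}|.
w = decomposition_cut_weight([((0, 1), (2, 3), 1.0)], {0, 1}, {2, 3})
```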
So we look for $S$ to (approximately) maximize $\sum_{t=1}^s d_t f_t g_t$.

Let $\nu = \epsilon n / s$ and $\bar f_t = \nu \lfloor f_t/\nu \rfloor$, $\bar g_t = \nu \lfloor g_t/\nu \rfloor$. Then
$$\left|\sum_{t=1}^s d_t f_t g_t - \sum_{t=1}^s d_t \bar f_t \bar g_t\right| \le 6\nu n s = 6\epsilon n^2.$$

So we can look for $S$ to (approximately) maximize $\sum_{t=1}^s d_t \bar f_t \bar g_t$.
There are $(2/\epsilon)^{2s}$ choices for the sequence $(\bar f_t, \bar g_t)$, so we enumerate all possibilities and check whether there is a cut with (approximately) these parameters.

$\mathcal{P} = \{V_1, V_2, \ldots, V_k\}$ is the coarsest partition of $V$ (with at most $2^{2s}$ parts) such that each $R_t, C_t$ is the union of sets in $\mathcal{P}$.

We check $(\bar f_t, \bar g_t)$ by solving the LP relaxation of the integer program
$$0 \le x_P \le |P| \quad (P \in \mathcal{P}), \qquad
\bar f_t \le \sum_{P \subseteq R_t} x_P < \bar f_t + \nu, \qquad
\bar g_t \le \sum_{P \subseteq C_t} (|P| - x_P) < \bar g_t + \nu \qquad (1 \le t \le s),$$
and doing some adjusting. (Here $x_P = |S \cap P|$.)
The partition $\mathcal{P}$ has the property that for disjoint $S, T \subseteq V$ we have
$$\left|e(S, T) - \sum_{i \in [k]} \sum_{j \in [k]} d_{i,j}\, |S_i||T_j|\right| \le 2\|W\|_\square \le 2\epsilon n^2,$$
where $d_{i,j} = e(V_i, V_j)/(|V_i||V_j|)$, $S_i = S \cap V_i$ and $T_j = T \cap V_j$.

We could replace $\mathcal{P}$ with an ordinary regular partition; the constants as a function of $\epsilon$ get worse.
Quadratic Assignment. Minimise
$$\sum_{i,j,p,q} A_{i,j,p,q}\, z_{i,p}\, z_{j,q}$$
subject to
$$\sum_k z_{i,k} = \sum_k z_{k,j} = 1, \qquad z_{i,j} \in \{0, 1\} \quad \text{for all } i, j.$$
A set of $n$ items $V$ has to be assigned to a set of $n$ locations $X$, one per location. $z_{i,p} = 1$ means: place item $i$ in position $p = \pi(i)$. $T(i, i') \le 1$ is the amount of traffic between items $i$ and $i'$. $D(x, x')$ is the distance between locations $x$ and $x'$. If item $i$ is assigned to location $\pi(i)$ for $i \in [n]$, the total cost $c(\pi)$ is defined by
$$c(\pi) = \sum_{i=1}^n \sum_{i'=1}^n T(i, i')\, D(\pi(i), \pi(i')).$$
The problem is to minimise $c(\pi)$ over all bijections $\pi : V \to X$.
Metric QAP. $X$ is a metric space with metric $D$ satisfying:
1. $\mathrm{diam}(X) = 1$, i.e. $\max_{x,y} D(x, y) = 1$.
2. For all $\epsilon > 0$ there exists a partition $X = X_1 \cup X_2 \cup \cdots \cup X_l$, $l = l(\epsilon)$, such that $\mathrm{diam}(X_j) \le \epsilon$. We call this an $\epsilon$-refinement of $X$. So there is an $l \times l$ matrix $\hat D$ such that if $x \in X_j$ and $x' \in X_{j'}$ then $|D(x, x') - \hat D(j, j')| \le 2\epsilon$. This partition must be computable in time polynomial in $n$ and $1/\epsilon$.
We call this the metric QAP.
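As a concrete (hypothetical) instance of condition 2: for points of the unit square under the $L_\infty$ metric, grid cells of side $\epsilon/2$ give an $\epsilon$-refinement with $l(\epsilon) = O(1/\epsilon^2)$ parts, computable in linear time. A minimal sketch:

```python
def eps_refinement(points, eps):
    # Bucket points of [0,1]^2 into grid cells of side eps/2; each cell has
    # L-infinity diameter at most eps/2 <= eps, so the nonempty cells form
    # an eps-refinement with O(1/eps^2) parts.
    side = eps / 2
    parts = {}
    for (x, y) in points:
        parts.setdefault((int(x // side), int(y // side)), []).append((x, y))
    return list(parts.values())

# Two nearby points share a cell; the far point gets its own part.
parts = eps_refinement([(0.14, 0.13), (0.16, 0.12), (0.91, 0.93)], 0.2)
```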
We decompose $T = T_1 + T_2 + \cdots + T_s + W$ where $\|W\|_\square \le \epsilon n^2$. For a bijection $\pi : V \to X$ we have
$$c(\pi) = \sum_{k=1}^s \sum_{i,j=1}^n T_k(i,j)\, D(\pi(i), \pi(j)) + E_1.$$
We compute an $O(\epsilon^3)$-refinement of $X$ and let $S_i^{(\pi)} = \pi^{-1}(X_i)$. Then
$$\sum_{k=1}^s \sum_{i,j=1}^n T_k(i,j)\, D(\pi(i), \pi(j)) = \sum_{k=1}^s \sum_{i,j=1}^l d_k\, |R_k \cap S_i^{(\pi)}|\, |C_k \cap S_j^{(\pi)}|\, \hat D(i,j) + E_2,$$
where $E_1, E_2$ are small error terms.
Computing a decomposition. This is reducible to finding an approximation to the cut-norm. Our paper gave algorithms with a small additive error. Alon and Naor gave an approximation algorithm with multiplicative error!
Let
$$\|A\|_1 = \max_{x_i, y_j \in \{\pm 1\}} \sum_{i,j} A(i,j)\, x_i y_j.$$

$$\sum_{i,j} A(i,j)\, x_i y_j = \sum_{\substack{x_i = 1 \\ y_j = 1}} A_{i,j} \;-\; \sum_{\substack{x_i = 1 \\ y_j = -1}} A_{i,j} \;-\; \sum_{\substack{x_i = -1 \\ y_j = 1}} A_{i,j} \;+\; \sum_{\substack{x_i = -1 \\ y_j = -1}} A_{i,j}.$$

So $\|A\|_1 \le 4\|A\|_\square$. A similar argument gives $\|A\|_\square \le \|A\|_1$.
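Both inequalities can be checked by brute force on a small matrix. A sketch (exponential enumeration, illustration only; helper names are ours):

```python
from itertools import product, chain, combinations

def inf_to_one_norm(A):
    # ||A||_1 = max over x in {-1,+1}^m, y in {-1,+1}^n of sum A[i][j] x_i y_j.
    m, n = len(A), len(A[0])
    return max(sum(A[i][j] * x[i] * y[j] for i in range(m) for j in range(n))
               for x in product((-1, 1), repeat=m)
               for y in product((-1, 1), repeat=n))

def cut_norm(A):
    # ||A||_cut = max over subsets S, T of |A(S, T)|.
    m, n = len(A), len(A[0])
    subs = lambda k: chain.from_iterable(combinations(range(k), r)
                                         for r in range(k + 1))
    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subs(m) for T in subs(n))

A = [[1, -2], [3, 4]]
# Sandwich: ||A||_cut <= ||A||_1 <= 4 ||A||_cut.
assert cut_norm(A) <= inf_to_one_norm(A) <= 4 * cut_norm(A)
```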
Grothendieck's Identity. $u, v$ are unit vectors in a Hilbert space $H$ and $z$ is chosen uniformly from $B = \{x : \|x\| = 1\}$. Then
$$\mathbf{E}[\operatorname{sign}(u \cdot z)\,\operatorname{sign}(v \cdot z)] = \frac{2}{\pi} \arcsin(u \cdot v).$$
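The identity is easy to sanity-check by Monte Carlo: a standard Gaussian vector points in a uniformly random direction, and only the direction of $z$ matters to the signs. A sketch in plain Python (function name and parameters are ours):

```python
import math
import random

def sign_correlation(u, v, trials=200_000, seed=1):
    # Estimate E[sign(u.z) sign(v.z)] over random directions z;
    # Grothendieck's identity says this equals (2/pi) * arcsin(u.v).
    random.seed(seed)
    total = 0
    for _ in range(trials):
        z = [random.gauss(0.0, 1.0) for _ in u]
        du = sum(a * b for a, b in zip(u, z))
        dv = sum(a * b for a, b in zip(v, z))
        total += (1 if du > 0 else -1) * (1 if dv > 0 else -1)
    return total / trials

u, v = (1.0, 0.0), (0.6, 0.8)          # unit vectors with u.v = 0.6
estimate = sign_correlation(u, v)
exact = (2 / math.pi) * math.asin(0.6)
```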
Let $(u_i, v_j)$ define
$$L_A = \max_{u_i, v_j} \sum_{i,j} A(i,j)\, u_i \cdot v_j \quad (\ge \|A\|_1),$$
where the $u_i, v_j$ lie in $\mathbf{R}^{m+n}$ and $\|u_i\| = \|v_j\| = 1$.

The $(u_i, v_j)$ are computable via Semi-Definite Programming.
Let $c = \sinh^{-1}(1) = \ln(1 + \sqrt{2})$. Then
$$\sin(c\, u_i \cdot v_j) = \sum_{k=0}^\infty \frac{(-1)^k c^{2k+1}}{(2k+1)!} (u_i \cdot v_j)^{2k+1}
= \sum_{k=0}^\infty \frac{(-1)^k c^{2k+1}}{(2k+1)!} (u_i)^{\otimes(2k+1)} \cdot (v_j)^{\otimes(2k+1)}
= S(u_i) \cdot T(v_j).$$

Here
$$S(u_i) = \bigoplus_{k=0}^\infty (-1)^k \sqrt{\frac{c^{2k+1}}{(2k+1)!}}\; (u_i)^{\otimes(2k+1)}, \qquad
T(v_j) = \bigoplus_{k=0}^\infty \sqrt{\frac{c^{2k+1}}{(2k+1)!}}\; (v_j)^{\otimes(2k+1)},$$
and, for example, $(u_1, u_2, u_3, u_4)^{\otimes 3} = (u_1^3,\, u_1^2 u_2,\, u_1^2 u_3,\, u_1^2 u_4,\, u_1 u_2 u_3,\, \ldots)$.

Note that $c$ has been chosen so that $\|S(u_i)\| = \|T(v_j)\| = 1$.
$$L_A = \sum_{i,j} A_{i,j}\, u_i \cdot v_j
= c^{-1} \sum_{i,j} A_{i,j} \arcsin\bigl(S(u_i) \cdot T(v_j)\bigr)
= \frac{c^{-1}\pi}{2} \sum_{i,j} A_{i,j}\, \mathbf{E}\bigl[\operatorname{sign}(S(u_i) \cdot z)\,\operatorname{sign}(T(v_j) \cdot z)\bigr].$$

We embed the $S(u_i), T(v_j)$ in $\mathbf{R}^{m+n}$, choose $z$ randomly, and put $x_i = \operatorname{sign}(S(u_i) \cdot z)$, $y_j = \operatorname{sign}(T(v_j) \cdot z)$. By choosing many $z$ we get a good estimate of $\|A\|_1$. One can recover a good solution $(x_i, y_j)$ by first deciding whether $x_1 = 1$ or $x_1 = -1$, etcetera.

One can extend the idea to approximate the cut-norm, with the same guarantee.
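A sketch of the random-hyperplane rounding step, with given unit vectors standing in for the SDP/tensor-embedded solution (the vectors and helper names below are hypothetical stand-ins, not the output of an actual SDP solve):

```python
import random

def round_to_signs(A, U, V, samples=200, seed=7):
    # For each random z, set x_i = sign(U_i . z), y_j = sign(V_j . z)
    # and keep the sign pattern maximizing sum_{i,j} A[i][j] x_i y_j.
    random.seed(seed)
    dim = len(U[0])
    best = None
    for _ in range(samples):
        z = [random.gauss(0.0, 1.0) for _ in range(dim)]
        x = [1 if sum(a * b for a, b in zip(u, z)) > 0 else -1 for u in U]
        y = [1 if sum(a * b for a, b in zip(v, z)) > 0 else -1 for v in V]
        val = sum(A[i][j] * x[i] * y[j]
                  for i in range(len(A)) for j in range(len(A[0])))
        if best is None or val > best[0]:
            best = (val, x, y)
    return best

A = [[1, -1], [-1, 1]]
# Orthogonal stand-in vectors: some z separates them, giving the optimum 4.
val, x, y = round_to_signs(A, [(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (0.0, 1.0)])
```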
In Frieze, Kannan we gave a randomised algorithm for computing a weak partition using only $2^{\tilde O(\epsilon^{-2})}$ time.

Subsequently, Alon, Fernandez de la Vega, Kannan, Karpinski and Yuster showed how to compute such a partition using only $\tilde O(\epsilon^{-4})$ probes. (A similar but weaker result was obtained by Anderson, Engebretson.) See also Borgs, Chayes, Lovász, Sós, Vesztergombi.

The above also applies to multi-dimensional arrays.
A multi-dimensional version. Max-r-CSP is the following problem: we are given $m$ Boolean functions $f_i$ defined on $Y_i = (y_1, y_2, \ldots, y_r)$, where $\{y_1, y_2, \ldots, y_r\} \subseteq \{x_1, x_2, \ldots, x_n\}$, and the aim is to choose a setting of the variables $x_1, x_2, \ldots, x_n$ that makes as many of the functions $f_i$ as possible true.

For each $z \in \{0,1\}^r$ and $(i_1, i_2, \ldots, i_r) \in [n]^r$ we define
$$A^{(z)}(i_1, i_2, \ldots, i_r) = \bigl|\{ j : Y_j = (x_{i_1}, \ldots, x_{i_r}) \text{ and } f_j(z_1, \ldots, z_r) = T \}\bigr|.$$

The problem becomes to maximize, over $x_1, x_2, \ldots, x_n$,
$$\sum_z \sum_{i_1, i_2, \ldots, i_r} A^{(z)}(i_1, i_2, \ldots, i_r)\, (-1)^{r - \sum_t z_t} \prod_{t=1}^r (x_{i_t} + z_t - 1).$$
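The product formula works because $(-1)^{r - \sum_t z_t}\prod_{t=1}^r (x_{i_t} + z_t - 1)$ is exactly the 0-1 indicator that the chosen Boolean values match the pattern $z$. A quick exhaustive check (function name is ours):

```python
from itertools import product

def pattern_indicator(x, z):
    # (-1)^(r - sum z) * prod_t (x_t + z_t - 1) for Boolean x_t, z_t.
    p = 1
    for xt, zt in zip(x, z):
        p *= xt + zt - 1
    return (-1) ** (len(z) - sum(z)) * p

# Equals 1 iff x == z, and 0 otherwise, over all Boolean triples.
ok = all(pattern_indicator(x, z) == (1 if x == z else 0)
         for x in product((0, 1), repeat=3) for z in product((0, 1), repeat=3))
```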
For this it is useful to have a decomposition for $r$-dimensional matrices. An $r$-dimensional matrix $A$ on $X_1 \times X_2 \times \cdots \times X_r$ is a map $A : X_1 \times X_2 \times \cdots \times X_r \to \mathbf{R}$. If $S_i \subseteq X_i$ for $i = 1, 2, \ldots, r$ and $d$ is a real number, the matrix $M$ satisfying
$$M(e) = \begin{cases} d & e \in S_1 \times S_2 \times \cdots \times S_r, \\ 0 & \text{otherwise} \end{cases}$$
is called a cut matrix and is denoted $M = \mathrm{CUT}(S_1, S_2, \ldots, S_r;\, d)$.
$B$ is the (2-dimensional) matrix with rows indexed by $Y_1 = X_1 \times \cdots \times X_{\hat r}$, $\hat r = \lceil r/2 \rceil$, and columns indexed by $Y_2 = X_{\hat r + 1} \times \cdots \times X_r$. For
$$i = (x_1, \ldots, x_{\hat r}) \in Y_1, \qquad j = (x_{\hat r + 1}, \ldots, x_r) \in Y_2,$$
let $B(i, j) = A(x_1, x_2, \ldots, x_r)$.
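Flattening the $r$-dimensional matrix into $B$ is just a regrouping of indices. A small sketch with $A$ stored as a dict keyed by index tuples (the split point $\hat r = \lceil r/2 \rceil$ and the helper name are our assumptions):

```python
from itertools import product

def flatten(A, dims):
    # Rows of B are tuples over the first ceil(r/2) coordinates,
    # columns are tuples over the rest; B(i, j) = A(i + j).
    r = len(dims)
    rh = (r + 1) // 2
    rows = list(product(*(range(d) for d in dims[:rh])))
    cols = list(product(*(range(d) for d in dims[rh:])))
    B = [[A[i + j] for j in cols] for i in rows]
    return B, rows, cols

dims = (2, 2, 2)
A = {idx: idx[0] + 2 * idx[1] + 4 * idx[2]
     for idx in product(*(range(d) for d in dims))}
B, rows, cols = flatten(A, dims)  # 4 rows (pairs), 2 columns (singletons)
```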
Applying a decomposition algorithm we obtain $B = D^{(1)} + D^{(2)} + \cdots + D^{(s_0)} + W$, where for $1 \le t \le s_0$, $D^{(t)} = \mathrm{CUT}(R_t, C_t, d_t)$, and $\|W\|_\square$ is small.
Each $R_t$ defines an $\hat r$-dimensional 0-1 matrix $R^{(t)}$, where $R^{(t)}(x_1, \ldots, x_{\hat r}) = 1$ iff $(x_1, \ldots, x_{\hat r}) \in R_t$; $C^{(t)}$ is defined similarly. Assume inductively that we can further decompose
$$R^{(t)} = D^{(t,1)} + \cdots + D^{(t,s_1)} + W^{(t)}, \qquad
C^{(t)} = \hat D^{(t,1)} + \cdots + \hat D^{(t,\hat s_1)} + \hat W^{(t)}.$$
Here
$$D^{(t,u)} = \mathrm{CUT}(R_{t,u,1}, \ldots, R_{t,u,\hat r};\, d_{t,u}), \qquad
\hat D^{(t,\hat u)} = \mathrm{CUT}(\hat R_{t,\hat u,\hat r+1}, \ldots, \hat R_{t,\hat u,r};\, \hat d_{t,\hat u}).$$

It follows that we can write
$$A = \sum_{t,u,\hat u} \mathrm{CUT}(R_{t,u,1}, \ldots, R_{t,u,\hat r},\, \hat R_{t,\hat u,\hat r+1}, \ldots, \hat R_{t,\hat u,r};\, d_{t,u,\hat u}) + W_1,$$
where $W_1$ is small (in cut-norm).
THANK YOU