arxiv: v1 [cs.ds] 22 Oct 2016

Local Maxima and Improved Exact Algorithm for MAX-2-SAT

Matthew B. Hastings
1 Station Q, Microsoft Research, Santa Barbara, CA 93106-6105, USA
2 Quantum Architectures and Computation Group, Microsoft Research, Redmond, WA 98052, USA

arXiv:1610.07100v1 [cs.DS] 22 Oct 2016

Given a MAX-2-SAT instance, we define a local maximum to be an assignment such that changing any single variable reduces the number of satisfied clauses. We consider the question of the number of local maxima that an instance of MAX-2-SAT can have. We give upper bounds in both the sparse and nonsparse case, where the sparse case means that there is a bound $d$ on the average number of clauses involving any given variable. The bounds in the nonsparse case are tight up to polylogarithmic factors, while in the sparse case the bounds are tight up to a multiplicative factor in $d$ for large $d$. Additionally, we generalize to the question of assignments which are maxima up to changing $k > 1$ variables simultaneously; in this case, we give explicit constructions with large (in a sense explained below) numbers of such maxima in the sparse case. The basic idea of the upper bound proof is to consider a random assignment to some subset of the variables and determine the probability that some fraction of the remaining variables can be fixed without considering interactions between them. The bounds hold in the case of weighted MAX-2-SAT as well. Using this technique and combining with ideas from Ref. 6, we find an algorithm for weighted MAX-2-SAT which is faster for large $d$ than previous algorithms which use polynomial space; this algorithm does require additional bounds on maximum weights and degree.

I. INTRODUCTION

Local search algorithms for combinatorial optimization problems such as MAX-SAT can be trapped in local maxima and hence fail to find the global maximum. A natural question then is: how many local maxima can an optimization problem have?
We first consider the question of assignments which are maxima when a single variable's assignment is changed, and we find tight bounds on the number of such maxima up to polylogarithmic factors for nonsparse MAX-2-SAT instances and find other bounds for sparse instances (tight up to certain constants in the exponent explained later). The methods used to prove these bounds lead to an algorithm for weighted MAX-2-SAT which is faster for high degree instances than any previously known algorithm which uses only polynomial space (there is an algorithm of Williams which uses exponential space [1] and is exponentially faster); this algorithm requires combining these results with previous results of Golovnev and Kutzkov [6] and the algorithm does require some additional bounds on maximum weights and degree. The formal definition of a local maximum will be:

Definition 1. Given an instance of MAX-2-SAT, an assignment is a local maximum if it has the property that changing the assignment to any single variable reduces the number of satisfied clauses, while a global maximum is an assignment which maximizes the number of satisfied clauses. (We define local and global maxima for weighted MAX-2-SAT similarly, replacing the number of satisfied clauses with a sum of weights of satisfied clauses.)

In section III, we give a generalization of this definition to changing the assignment to $k = O(1)$ variables simultaneously. We call these $k$-maxima and we construct instances with large numbers of such $k$-maxima.

Note that it is clearly possible for a MAX-2-SAT instance with $N$ variables to have $2^N$ global maxima: simply take an instance with no clauses so that every assignment is a global maximum. However, none of these global maxima are local maxima according to this definition. The following construction [2] shows that it is possible for a MAX-2-SAT instance to have $\Theta(N^{-1/2})\,2^N$ local maxima. Assume $N$ is even. For each pair of variables $b_i, b_j$, with $1 \le i < j \le N$, we have clauses $b_i \vee b_j$ and $\neg b_i \vee \neg b_j$. There are $2\binom{N}{2} \approx N^2$ clauses in total.
For every pair $i,j$, at least one of these clauses is satisfied, and both are satisfied if $i$ is true and $j$ is false or vice-versa. If $n$ of the variables are set to true and the remainder are false, the total number of satisfied clauses is $\binom{N}{2} + n(N-n)$. This is maximized if $n = N/2$ so that every assignment with $n = N/2$ is both a local maximum and a global maximum. Thus, there are $\binom{N}{N/2} = \Theta(N^{-1/2})\,2^N$ local maxima.

Note that if we instead consider MAX-CSP with constraints that are allowed to involve an arbitrary number of variables, then by taking a single clause which is simply a parity function of the variables, we obtain an instance with $(1/2)\,2^N$ local maxima. While it is not hard to show that this number $(1/2)\,2^N$ is optimal for a MAX-CSP instance [3], a natural question is whether $\Theta(N^{-1/2})\,2^N$ is the maximum possible number of local maxima for a MAX-2-SAT instance. In this paper, we prove an upper bound by $\mathrm{polylog}(N)\,N^{-1/2}\,2^N$ for the number of local maxima.

We define the degree of a variable $i$ in a MAX-SAT instance to be the number of other variables $j$ such that there is a clause depending on $b_i$ and $b_j$; note that this does not depend upon the number of such clauses or the weights of such clauses. The construction of Ref. 2 can be modified to give an instance with a large number of local maxima and with bounded degree, by taking multiple copies of the construction. The construction of Ref. 2 has maximum degree $d = N-1$. Consider an instance with $N$ variables, all of degree $d$, with $N$ being a multiple of $d+1$, such that the instance consists of $N/(d+1)$ decoupled copies of the construction of Ref. 2. This gives an instance with $N$ variables, degree $d$, and

$$\left(\Theta\big((d+1)^{-1/2}\big)\,2^{d+1}\right)^{N/(d+1)} = 2^{N\left(1-\frac{1}{2}\frac{\log_2(d)}{d}-O(1/d)\right)}$$

local maxima, where here the $O(\ldots)$ notation refers to asymptotic dependence on $d$ (we use big-O notation for dependence on both $d,N$ in this paper and it should be clear by context what is intended).

In this paper, we prove an upper bound by $2^{N(1-\kappa\log_2(\lambda d)/d)}$ on the number of local maxima in the bounded degree case, for some constants $\kappa,\lambda > 0$; since the constant $\kappa < 1/2$, this bound will not immediately imply the unbounded degree bound, and we prove the two bounds separately. Our upper bounds in the bounded degree case will hold also for bounded average degree. This upper bound is tight up to a multiplicative factor in $d$ for large $d$: for any $\kappa' < \kappa$ for all sufficiently large $d$, the upper bound on the possible number of local maxima for degree $d$ is smaller than the lower bound for degree $\kappa' d$ which follows from the above construction (the need to take $\kappa' < \kappa$ and to take $d$ large is due to terms $N \cdot O(1/d)$ in the exponent).

The bounds in both the bounded and unbounded degree case rely on the same basic idea. We find a subset $T$ of the variables so that interactions between pairs of variables in $T$ are small compared to interactions between variables in $T$ with those outside $T$. Then, we show that, for a random assignment of variables not in $T$, one often finds that for many of the variables in $T$, the value of that variable at a local or global maximum can be fixed by the assignment to variables outside $T$.
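The count of local maxima for the construction of Ref. 2 can be checked by brute force for small $N$. The following is a minimal sketch, not part of the original paper; the function names `ref2_clauses`, `num_satisfied`, and `count_local_maxima` are hypothetical, and a literal is represented (as an assumption of this sketch) by a pair of a variable index and the truth value that satisfies it.

```python
from itertools import combinations, product
from math import comb


def ref2_clauses(n_vars, offset=0):
    """Clauses (b_i OR b_j) and (NOT b_i OR NOT b_j) for every pair i < j.

    A literal is a pair (variable index, truth value that satisfies it).
    """
    clauses = []
    for i, j in combinations(range(offset, offset + n_vars), 2):
        clauses.append(((i, True), (j, True)))
        clauses.append(((i, False), (j, False)))
    return clauses


def num_satisfied(clauses, assignment):
    """Number of clauses with at least one satisfied literal."""
    return sum(any(assignment[v] == val for v, val in clause) for clause in clauses)


def count_local_maxima(clauses, n_vars):
    """Count assignments for which every single-variable flip strictly loses clauses."""
    count = 0
    for assignment in product([False, True], repeat=n_vars):
        base = num_satisfied(clauses, assignment)
        strict = True
        for i in range(n_vars):
            flipped = list(assignment)
            flipped[i] = not flipped[i]
            if num_satisfied(clauses, flipped) >= base:
                strict = False
                break
        if strict:
            count += 1
    return count


if __name__ == "__main__":
    # Single copy: the count should match the central binomial coefficient.
    for n in (2, 4, 6):
        print(n, count_local_maxima(ref2_clauses(n), n), comb(n, n // 2))
    # Two decoupled copies on 4 variables each (degree d = 3, N = 8).
    two_copies = ref2_clauses(4) + ref2_clauses(4, offset=4)
    print(count_local_maxima(two_copies, 8), comb(4, 2) ** 2)
```

Because the copies are decoupled, an assignment is a local maximum exactly when its restriction to each copy is one, which is why the counts multiply.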
The simplest version of this argument gives a bound of $2^{N(1-1/(d+1))}$ on the number of local maxima in the case of bounded maximum degree as follows: construct a graph whose vertices correspond to variables and with an edge between variables if they are both in a clause. This graph can be $(d+1)$-colored and at least one color has at least $N/(d+1)$ variables. Let $T$ be the set of variables with that color so that there is no interaction between variables in $T$. Then, there is at most one local maximum for every assignment to variables outside $T$, so that there are at most $2^{N(1-1/(d+1))}$ local maxima. The stronger bound in the present paper involves choosing a larger set $T$ so that the interaction between variables in $T$ may not vanish; this requires a more complicated probabilistic estimate.

This kind of bound on local maxima naturally leads to an algorithm to find a global maximum: iterate over assignments to variables outside $T$. Then, for each such assignment, for many of the variables in $T$, one can determine the optimal assignment to that variable considering only its interaction with variables outside $T$. We will thus obtain an algorithm which takes time $\tilde{O}(2^{N(1-\kappa\log_2(\lambda d)/d)})$. Previous algorithms for MAX-2-SAT include an algorithm taking time $\tilde{O}(2^{\omega N/3})$ but using exponential space [1], where $\omega$ is the matrix multiplication exponent. Among algorithms using polynomial space [6-8], the fastest [6] for large $d$ takes time $\tilde{O}(2^{N(1-\alpha\ln(d)/d)})$ for any $\alpha < 1$, while others took time $\tilde{O}(2^{N(1-\mathrm{const.}/d)})$ for various constants. This algorithm [6] is faster for large $d$ than the algorithm given in the above paragraph. However, we show how to combine ideas of the two algorithms to obtain a faster algorithm for large $d$, subject to some additional bounds on maximum degree and weights explained later.

Some notation: if not otherwise stated, logarithms are always to base 2. We use const. to denote various positive numerical constants throughout. When we need a more specific value of a constant, we introduce a symbol such as $c,\kappa,\lambda,\ldots$ We use $\Pr(\ldots)$ to denote probabilities. We use $\tilde{O}(\ldots)$ to denote asymptotic behavior up to a polylogarithm of the argument; when the argument is an exponential in $N$, this will be a polynomial in $N$. Given a set of variables $T$, we use $\overline{T}$ to denote the variables not in that set.

As a final point of terminology, we will translate (for notational convenience) both weighted and unweighted MAX-2-SAT instances into Ising instances and we will almost exclusively use the notation of Ising instances later in this paper. Whenever we refer to MAX-2-SAT instances, we will be concerned with assignments that are maxima, but for Ising instances these will become minima.

We begin in section II with definitions and we introduce the notion of an effective Hamiltonian, which is an optimization problem for some subset of variables given an assignment to the others. Then, in section III, we introduce the notion of $k$-minima and $k$-maxima and give constructions with large numbers of such minima; section III is independent of the following sections of the paper and can be skipped if desired. The main result in this section is theorem 1. We show that there is a constant $c > 0$ such that for any $f,l$ there is an Ising instance which has degree $d = f(l-1)$ and which has at least $2^{N\left(1-\frac{f\log(f\sqrt{l}/c)}{l}\right)}$ global minima, all of which have Hamming distance at least $2^f$ from each other. This result implies that one can obtain instances with many such disconnected minima, where there are many minima in that the exponent $N\left(1-\frac{f\log(f\sqrt{l}/c)}{l}\right)$ is only slightly smaller than $N$ once $d$ is large. One could then add additional terms to the objective function to make one of those minima the global minimum while raising all the others in energy slightly, giving an instance with many local minima which are all separated by Hamming distance $2^f$ and with a unique global minimum.
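The simplest version of the counting argument from the introduction (color the interaction graph, take the largest color class as $T$, and observe that each assignment outside $T$ admits at most one local maximum) can be illustrated on a small instance. This is a sketch under stated assumptions rather than the paper's algorithm: the greedy coloring, the ring-shaped test instance, and all function names are choices made here for illustration.

```python
from itertools import product


def ring_instance(n_vars):
    """Hypothetical test instance: for each i, clauses (b_i OR b_{i+1}) and
    (NOT b_i OR NOT b_{i+1}), with indices taken modulo n_vars."""
    clauses = []
    for i in range(n_vars):
        j = (i + 1) % n_vars
        clauses.append(((i, True), (j, True)))
        clauses.append(((i, False), (j, False)))
    return clauses


def interaction_graph(n_vars, clauses):
    """Edge between two variables whenever they appear in a common clause."""
    neighbors = [set() for _ in range(n_vars)]
    for (u, _), (v, _) in clauses:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def largest_color_class(neighbors):
    """Greedy coloring uses at most d+1 colors, so by pigeonhole the largest
    class has at least N/(d+1) variables and no edges inside it."""
    color = {}
    for v in range(len(neighbors)):
        used = {color[u] for u in neighbors[v] if u in color}
        color[v] = min(c for c in range(len(neighbors) + 1) if c not in used)
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(v)
    return max(classes.values(), key=len)


def num_satisfied(clauses, assignment):
    return sum(any(assignment[v] == val for v, val in clause) for clause in clauses)


def count_local_maxima(clauses, n_vars):
    """Brute-force count of strict single-flip local maxima."""
    count = 0
    for assignment in product([False, True], repeat=n_vars):
        base = num_satisfied(clauses, assignment)
        if all(
            num_satisfied(clauses, assignment[:i] + (not assignment[i],) + assignment[i + 1:]) < base
            for i in range(n_vars)
        ):
            count += 1
    return count


if __name__ == "__main__":
    n = 6
    clauses = ring_instance(n)
    neighbors = interaction_graph(n, clauses)
    d = max(len(nb) for nb in neighbors)
    T = largest_color_class(neighbors)
    print("d =", d, "|T| =", len(T))
    print("local maxima:", count_local_maxima(clauses, n), "bound:", 2 ** (n - len(T)))
```

On this ring the bound $2^{N-|T|}$ is far from tight, which is consistent with the text's remark that a larger set $T$ and a probabilistic estimate are needed for the stronger result.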

Then, in section IV, we give upper bounds on the number of local minima of an Ising instance without any degree bound. The main result in this section is theorem 2. Then, in section V, we give upper bounds on the number of local minima of such an instance assuming a degree bound. The main result in this section is theorem 3. In section VI we prove a technical lemma on sums of random variables which we use in sections IV, V; this technical lemma is only needed because we are concerned with the weighted case and we must consider arbitrary weights; with bounds on the weights we would not need this lemma as we could instead use a Berry-Esseen theorem. Finally, in section VII we show how to combine the ideas here with those in Ref. 6 to obtain a faster algorithm.

II. MAX-2-SAT DEFINITIONS AND EFFECTIVE HAMILTONIANS

A. Problem Definitions

We consider a MAX-2-SAT instance, with variables $b_i$ for $i = 1,\ldots,N$ taking values true or false. We re-write the instance as an Ising model to make the notation simpler, setting $S_i = +1$ if $b_i$ is true and $S_i = -1$ if $b_i$ is false. Each 2-SAT clause can be written as a sum of terms which are quadratic and linear in these variables. For example, clause $b_i \vee b_j$ is true if $b_i$ is true or $b_j$ is true. So, the clause is true if $1-(1-S_i)(1-S_j)/4$ is equal to 1 and is false if it is equal to 0. The negation of a variable (replacing $b_i$ by $\neg b_i$) corresponds to replacing $S_i$ with $-S_i$. Note that $1-(1-S_i)(1-S_j)/4 = 3/4 + S_i/4 + S_j/4 - S_i S_j/4$. So, given $C$ clauses, we can express the number of violated clauses as an expression of the form

$$H = \frac{3}{4}C + \frac{1}{4}\sum_i h_i S_i + \frac{1}{4}\sum_{i<j} J_{ij} S_i S_j, \tag{II.1}$$

where $h_i, J_{ij}$ are integers. We will set $J_{ij} = J_{ji}$ and $J_{ii} = 0$. We refer to an $H$ such as in Eq. (II.1) as a Hamiltonian. We drop constant terms such as $(3/4)C$ throughout. We refer to the problem of minimizing an expression such as $H$ over assignments as the Ising problem (as we explain in the next few paragraphs, we will in fact allow $h_i, J_{ij}$ to be arbitrary reals for the Ising problem but we will assume that certain operations involving $h_i, J_{ij}$ can be done in polynomial time).
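The translation from clauses to the integers $h_i, J_{ij}$ can be sketched as follows. This is an illustrative reading of the per-clause identity above, not code from the paper; the function names and the encoding of a literal as a pair (index, sign), with sign $+1$ for $b_i$ and $-1$ for $\neg b_i$, are assumptions of this sketch. The additive constant is ignored, since the text drops constants, and the check at the end only verifies that the violated-clause count and the quadratic form differ by the same constant for every assignment.

```python
from itertools import product


def clauses_to_ising(n_vars, clauses):
    """Return (h, J) with J keyed by (i, j), i < j.

    Each clause is ((i, s_i), (j, s_j)) with i != j and s = +1 for the
    literal b, s = -1 for NOT b. A clause is violated exactly when
    (1 - s_i S_i)(1 - s_j S_j)/4 = 1, and expanding gives
    1/4 - s_i S_i/4 - s_j S_j/4 + s_i s_j S_i S_j/4.
    """
    h = [0] * n_vars
    J = {}
    for (i, si), (j, sj) in clauses:
        h[i] -= si
        h[j] -= sj
        key = (min(i, j), max(i, j))
        J[key] = J.get(key, 0) + si * sj
    return h, J


def quadratic_form(h, J, S):
    """sum_i h_i S_i + sum_{i<j} J_ij S_i S_j, i.e. 4H without the constant."""
    return (sum(hi * Si for hi, Si in zip(h, S))
            + sum(Jij * S[i] * S[j] for (i, j), Jij in J.items()))


def num_violated(clauses, S):
    """A clause is violated when both of its literals are false."""
    return sum(all(s * S[i] == -1 for i, s in clause) for clause in clauses)


if __name__ == "__main__":
    clauses = [((0, 1), (1, 1)), ((0, -1), (2, 1)), ((1, -1), (2, -1)), ((0, 1), (1, -1))]
    h, J = clauses_to_ising(3, clauses)
    offsets = {4 * num_violated(clauses, S) - quadratic_form(h, J, S)
               for S in product([-1, 1], repeat=3)}
    print(h, J, offsets)  # a single offset means the two agree up to a constant
```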
Below, we construct algorithms to solve the Ising problem and bounds on local minima for the Ising problem; this implies corresponding algorithms and bounds for MAX-2-SAT and for weighted MAX-2-SAT (in the case of weighted MAX-2-SAT, the $h_i, J_{ij}$ need not be integers if the weights are not integers). We will allow $h_i, J_{ij}$ to be arbitrary real numbers in what follows. However, we will assume that all the arithmetic we do below (adding and comparing quantities $h_i, J_{ij}$) can be done in polynomial time in $N$.

B. Effective Hamiltonian

We introduce a notion of an effective Hamiltonian, as follows.

Definition 2. Let $V$ denote the set of variables. Let $T \subset V$. Let $\overline{T} \equiv V \setminus T$. Given an assignment $A$ to all variables in $\overline{T}$, we define an effective Hamiltonian $H^{\rm eff}$ on the remaining variables as

$$H^{\rm eff} = \frac{1}{4}\sum_{i\in T} h_i^{\rm eff} S_i + \frac{1}{4}\sum_{i,j\in T,\; i<j} J_{ij} S_i S_j, \tag{II.2}$$

where

$$h_i^{\rm eff} = h_i + \sum_{j\in\overline{T}} J_{ij} S_j. \tag{II.3}$$

Then, given any local minimum of $H$, this local minimum determines some assignment $A$ to variables in $\overline{T}$ and some assignment $B$ to variables in $T$. The assignment $A$ determines an effective Hamiltonian $H^{\rm eff}$. Then, the assignment $B$ must be a local minimum of $H^{\rm eff}$. Further, if we have an algorithm to determine a global minimum of $H^{\rm eff}$ for any assignment $A$ to variables in $\overline{T}$, and given that that algorithm takes time $t$, we can find a global minimum of $H$ in time $\tilde{O}(2^{N-|T|}\,t)$ by iterating over assignments to variables in $\overline{T}$ and finding the minimum of $H^{\rm eff}$ for each assignment. We will consider some special cases.
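The iteration over assignments to $\overline{T}$ can be sketched for the easiest situation, in which no two variables of $T$ interact, so that each variable in $T$ is set independently from the sign of its effective field. This is a minimal illustration and not the paper's full algorithm; the function names and the small test instance are hypothetical, and the energy used is $4H$ with constants dropped.

```python
from itertools import product


def energy(h, J, S):
    """sum_i h_i S_i + sum_{i<j} J_ij S_i S_j (that is, 4H without constants)."""
    return (sum(h[i] * S[i] for i in range(len(h)))
            + sum(Jij * S[i] * S[j] for (i, j), Jij in J.items()))


def effective_fields(h, J, T, S):
    """h_i^eff = h_i + sum over j outside T of J_ij S_j, for i in T (Eq. II.3).

    Only the entries of S outside T are read.
    """
    Tset = set(T)
    heff = {i: h[i] for i in T}
    for (i, j), Jij in J.items():
        if i in Tset and j not in Tset:
            heff[i] += Jij * S[j]
        elif j in Tset and i not in Tset:
            heff[j] += Jij * S[i]
    return heff


def minimize_by_iteration(h, J, T):
    """Global minimum by enumerating the 2^(N-|T|) assignments outside T.

    Assumes J_ij = 0 for all pairs inside T, so each S_i with i in T can be
    set to -sgn(h_i^eff) independently (ties are broken arbitrarily).
    """
    n = len(h)
    Tset = set(T)
    if any(Jij != 0 for (i, j), Jij in J.items() if i in Tset and j in Tset):
        raise ValueError("T must not contain interacting pairs in this sketch")
    outside = [i for i in range(n) if i not in Tset]
    best = None
    for values in product([-1, 1], repeat=len(outside)):
        S = [0] * n  # entries in T are filled in below
        for i, v in zip(outside, values):
            S[i] = v
        heff = effective_fields(h, J, T, S)
        for i in T:
            S[i] = -1 if heff[i] > 0 else 1
        e = energy(h, J, S)
        if best is None or e < best[0]:
            best = (e, tuple(S))
    return best


if __name__ == "__main__":
    h = [1, -2, 0, 3]
    J = {(0, 1): 2, (0, 2): -1, (1, 3): 1, (2, 3): -3, (0, 3): 1}
    print(minimize_by_iteration(h, J, T=[1, 2]))
```

When pairs inside $T$ do interact, the inner step would have to be replaced by a genuine minimization of $H^{\rm eff}$, which is what the later sections address.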

Lemma 1. Suppose that $H^{\rm eff}$ has the property that $J_{ij} = 0$ for all $i,j \in T$. Then, for every $i \in T$ such that $h_i^{\rm eff} \neq 0$, every global minimum of $H^{\rm eff}$ has $S_i = -\mathrm{sgn}(h_i^{\rm eff})$. The variables $i \in T$ with $h_i^{\rm eff} = 0$ can be chosen arbitrarily at a global minimum. There is at most one local minimum; such a local minimum exists only if there are no $i$ with $h_i^{\rm eff} = 0$; in this case, the local minimum is the unique global minimum.

Proof. Immediate.

Another special case we consider is where $J_{ij}$ may be non-vanishing for pairs $i,j \in T$, but for many $i$ we have that $|h_i^{\rm eff}| \ge \sum_{j\in T} |J_{ij}|$.

Definition 3. Let

$$h_i^{\max} \equiv \sum_{j\in T} |J_{ij}|. \tag{II.4}$$

Definition 4. For $i \in T$, if $|h_i^{\rm eff}| \ge h_i^{\max}$ we say that $i$ is fixed, otherwise, $i$ is free.

Lemma 2. If $F$ is the set of free variables for $H^{\rm eff}$, then $H^{\rm eff}$ has at most $2^{|F|}$ local minima. At every global minimum or local minimum of $H^{\rm eff}$, for each $i$ which is fixed with $h_i^{\rm eff} \neq 0$, we have $S_i = -\mathrm{sgn}(h_i^{\rm eff})$. The fixed $i$ with $h_i^{\rm eff} = 0$ can be chosen arbitrarily at a global minimum.

Proof. Immediate.

III. k-MINIMA AND k-MAXIMA

In this section, we give a more general definition of local maximum or minimum, which we call a $k$-maximum or $k$-minimum. This definition allows one to change the assignment to several variables at a time, and also generalizes in a way that is appropriate to describing equilibria of certain local search algorithms. We then give constructions of Ising instances with large numbers of $k$-minima.

Definition 5. Given an assignment $A$ to a MAX-2-SAT problem, such an assignment is called a $k$-maximum if every assignment differing in at least one and at most $k$ variables from assignment $A$ satisfies fewer clauses than $A$ does. Similarly, for an Ising Hamiltonian, an assignment $A$ is called a $k$-minimum if every assignment differing in at least one and at most $k$ variables from assignment $A$ has a larger value for $H$ than $A$ does.

Hence, the definition of a local maximum or local minimum above corresponds to a 1-maximum or 1-minimum. Before giving the construction of instances with large numbers of $k$-minima, we give one other possible generalization of the notion of a minimum:

Definition 6. Given a problem instance of MAX-2-SAT or the Ising problem, and given integer $k \ge 1$, define a graph $G$ whose vertices correspond to assignments. There is one vertex for each assignment such that changing at most $k$ variables of that assignment does not reduce the number of satisfied clauses (in the MAX-2-SAT case) or does not increase the value of $H$ (in the Ising case). Let there be an edge between any two vertices which are within Hamming distance $k$ of each other. Then, we refer to the connected components of this graph as $k$-basins.

Note that for every $k$-minimum, the graph in the above definition will have a connected component containing just the one vertex corresponding to that minimum, so the construction below of instances with large numbers of $k$-minima will give a construction with large numbers of $k$-basins. Given a local search algorithm that iteratively updates an assignment changing at most $k$ variables at a time, if the algorithm only accepts the update if it does not reduce the number of satisfied clauses, then once the assignment is in a given $k$-basin, it cannot leave that basin.

The notation using Ising Hamiltonians rather than MAX-2-SAT problems can be used to slightly simplify the construction of local minima in Ref. 2, so we use that notation and use the Ising problem for the rest of this section. The Hamiltonian used to construct local minima is simply

$$H = \frac{1}{2}\sum_{i<j} S_i S_j = \frac{1}{4}\Big(\sum_i S_i\Big)^2 + \mathrm{const.} \tag{III.1}$$

Hence, the minimum is at $\sum_i S_i = 0$. The local minima of this Hamiltonian are 1-minima. This has degree $d = N-1$. As explained, we can take multiple copies of this construction to give instances with $N$ variables, degree $d$, and $2^{N\left(1-\frac{1}{2}\frac{\log(d)}{d}-O(1/d)\right)}$ local minima.

Now we consider $k > 1$. We begin by giving a construction which is the analogue of the single copy above for which the degree scales with $N$, and then explain how to take multiple copies of this construction to have fixed degree $d$, independent of $N$.

Pick integers $f,l > 0$ with $l$ even. The construction will give $k$-minima for $k = 2^f - 1$. We have $N = l^f$ variables, labelled by a vector $(x_1,\ldots,x_f)$, with $1 \le x_a \le l$. These variables may be thought of as lying on the integer points contained in an $f$-dimensional hypercube $[1,l]^f$. A column $C$ will be labelled by a choice of an integer $b$ with $1 \le b \le f$ and a choice of integers $y_1,y_2,\ldots,y_{b-1},y_{b+1},\ldots,y_f$ with $1 \le y_a \le l$. We say that a variable $i$ is in such a column $C$ if $i$ is labelled by $(x_1,\ldots,x_f)$ with $x_a = y_a$ for all $a \neq b$. For example, in the case $f = 2$, we can regard the variables as arranged in a square of size $l$-by-$l$ and a column is either a row or column of this square depending on whether $b = 1$ or $b = 2$. There are $n_C$ columns, with

$$n_C = f\,l^{f-1}. \tag{III.2}$$

Then, we take the Hamiltonian

$$H = \sum_{\text{columns } C}\Big(\sum_{i\in C} S_i - M_C\Big)^2, \tag{III.3}$$

where $M_C$ is some integer which depends upon column $C$. The case $f = 1$ with $M_C = 0$ for all columns is the Hamiltonian (III.1) up to multiplication and addition by constants. We use this constant $M_C$ to simplify some of the proofs later. An assignment such that $\sum_{i\in C} S_i = M_C$ for all columns $C$ will be called a zero energy assignment. Every zero energy assignment is a global minimum.

First let us consider the case that $M_C = 0$ for all columns $C$. As an explicit example of a zero energy assignment in this case, for a variable labelled by vector $(x_1,\ldots,x_f)$, take $S_i = +1$ if $\sum_a x_a$ is even and $S_i = -1$ if $\sum_a x_a$ is odd. Heuristically, one may guess that this Hamiltonian has roughly $2^N\left(\frac{c}{\sqrt{l}}\right)^{n_C}$ such zero energy assignments, for some constant $c > 0$.
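The column construction and the explicit zero energy assignment can be checked directly for small $f$ and $l$. The sketch below is illustrative only: the function names are hypothetical, variables are stored in a dictionary keyed by their label $(x_1,\ldots,x_f)$, and the brute-force $k$-minimum test simply applies Definition 5.

```python
from itertools import combinations, product


def columns(f, l):
    """All columns of [1, l]^f: points that agree in every coordinate except b."""
    cols = []
    for b in range(f):
        for rest in product(range(1, l + 1), repeat=f - 1):
            cols.append([rest[:b] + (x,) + rest[b:] for x in range(1, l + 1)])
    return cols


def energy(S, cols, M):
    """Hamiltonian (III.3): sum over columns of (column sum - M_C)^2."""
    return sum((sum(S[p] for p in col) - m) ** 2 for col, m in zip(cols, M))


def checkerboard(f, l):
    """S = +1 where the coordinate sum is even, -1 where it is odd."""
    return {p: (1 if sum(p) % 2 == 0 else -1)
            for p in product(range(1, l + 1), repeat=f)}


def is_k_minimum(S, cols, M, k):
    """Definition 5 by brute force: every flip of 1..k variables raises H."""
    base = energy(S, cols, M)
    for r in range(1, k + 1):
        for subset in combinations(list(S), r):
            flipped = dict(S)
            for p in subset:
                flipped[p] = -flipped[p]
            if energy(flipped, cols, M) <= base:
                return False
    return True


if __name__ == "__main__":
    f, l = 2, 4
    cols = columns(f, l)
    M = [0] * len(cols)
    S = checkerboard(f, l)
    print("n_C =", len(cols), "expected", f * l ** (f - 1))
    print("energy of checkerboard:", energy(S, cols, M))
    print("is a (2^f - 1)-minimum:", is_k_minimum(S, cols, M, 2 ** f - 1))
    # Flipping the four corners of a suitable rectangle keeps every row and
    # column sum unchanged, so the assignment is not a 2^f-minimum.
    print("is a 2^f-minimum:", is_k_minimum(S, cols, M, 2 ** f))
```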
This heuristic guess is based on the following. There are $2^N$ possible choices of variables. In a given column $C$, the probability that such a choice gives $\sum_{i\in C} S_i = 0$ is equal to $2^{-l}\binom{l}{l/2} \approx c/\sqrt{l}$, for some constant $c$. Assuming such events are independent between columns, one arrives at this heuristic guess. Of course, the events that $\sum_{i\in C} S_i = 0$ are not independent for different choices of $C$ so we need a more careful analysis.

In the case that $f = 2$, enumerating the number of zero energy assignments of this Hamiltonian is a well-studied question. It is the same as enumerating the number of 0-1 matrices with $l$ rows and $l$ columns such that each row and column sums to $l/2$. It is shown in Ref. 4 that the heuristic guess is a lower bound for the number of such assignments; more detailed estimates are in Ref. 5. However, we also want to consider the case $f > 2$ and the estimates for $f = 2$ do not seem to straightforwardly generalize to $f > 2$. However we have:

Lemma 3. Let $n_{ze}$ denote the number of zero energy assignments. If $n_{ze} > 0$, then the number of global minima is equal to $n_{ze}$. There is a constant $c > 0$ such that for any $f,l$, there exist choices of $M_C$ such that

$$n_{ze} \ge 2^N\left(\frac{c}{f\sqrt{l}}\right)^{n_C}. \tag{III.4}$$

Proof. Suppose that we choose the $S_i = \pm 1$ independently at random with $S_i = +1$ with probability $1/2$, and then define $M_C = \sum_{i\in C} S_i$. This random choice of $S_i$ defines a probability distribution $\Pr(\vec{M})$, where $\vec{M}$ denotes the vector of choices of $M_C$ for each column $C$. To prove the bound (III.4), we need to show that there is some $\vec{M}$ such that $\Pr(\vec{M}) \ge \left(\frac{c}{f\sqrt{l}}\right)^{n_C}$. What we will do is estimate $\sum_{\vec{M}} \Pr(\vec{M})^2$. Note that $\sum_{\vec{M}} \Pr(\vec{M})^2 \le \max_{\vec{M}} \Pr(\vec{M})$, so that our lower bound on $\sum_{\vec{M}} \Pr(\vec{M})^2$ will immediately imply a lower bound on $\max_{\vec{M}} \Pr(\vec{M})$ (very heuristically, one may say that we are choosing $M_C$ by picking $S_i = \pm 1$ independently at random with $S_i = +1$ with probability $1/2$, and then defining $M_C = \sum_{i\in C} S_i$).
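For very small cases the distribution $\Pr(\vec{M})$ can be tabulated exhaustively, which makes both the inequality $\sum_{\vec{M}}\Pr(\vec{M})^2 \le \max_{\vec{M}}\Pr(\vec{M})$ and the heuristic count for $M_C = 0$ concrete. The following sketch is an illustration under stated assumptions, not part of the proof: the function names are hypothetical, points are indexed from 0 for convenience, and the heuristic value uses the exact per-column probability $2^{-l}\binom{l}{l/2}$ in place of $c/\sqrt{l}$.

```python
from collections import Counter
from itertools import product
from math import comb


def column_index_sets(f, l):
    """Columns of the l^f grid as lists of indices into a flat assignment tuple."""
    points = list(product(range(l), repeat=f))
    index = {p: n for n, p in enumerate(points)}
    cols = []
    for b in range(f):
        for rest in product(range(l), repeat=f - 1):
            cols.append([index[rest[:b] + (x,) + rest[b:]] for x in range(l)])
    return len(points), cols


def column_sum_counts(f, l):
    """For every vector M of column sums, count the assignments producing it."""
    n_points, cols = column_index_sets(f, l)
    counts = Counter()
    for S in product((-1, 1), repeat=n_points):
        counts[tuple(sum(S[i] for i in col) for col in cols)] += 1
    return counts, n_points, len(cols)


def heuristic_count(f, l):
    """2^N p^(n_C) with p = 2^(-l) binom(l, l/2), treating columns as independent."""
    n_points, cols = column_index_sets(f, l)
    p = comb(l, l // 2) / 2 ** l
    return 2 ** n_points * p ** len(cols)


if __name__ == "__main__":
    for f, l in ((2, 2), (2, 4)):
        counts, N, n_C = column_sum_counts(f, l)
        total = 2 ** N
        sum_sq = sum(c * c for c in counts.values()) / total ** 2
        print(f"f={f} l={l}: sum Pr^2 = {sum_sq:.5f}, max Pr = {max(counts.values()) / total:.5f}")
        print("  zero energy assignments for M = 0:", counts[(0,) * n_C],
              " heuristic:", round(heuristic_count(f, l), 2))
```

The case $f = 2$, $l = 4$ enumerates $2^{16}$ assignments, so this approach is only meant for sanity checks at toy sizes.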

To estimate $\sum_{\vec{M}} \Pr(\vec{M})^2$, we need to consider the probability that, given two random assignments, both assignments have the same resulting $\vec{M}$. To label the variables in the two different assignments, we will label $2l^f$ variables by a vector $(x_1,\ldots,x_f)$, with $1 \le x_a \le l$, and by an index $\sigma = 1,2$, where $\sigma$ will label one of the two assignments. We label columns by a choice of an integer $b$ with $1 \le b \le f$ and a choice of integers $y_1,y_2,\ldots,y_{b-1},y_{b+1},\ldots$ and an index $\sigma = 1,2$. We say that a variable is in such a column $C$ if $x_a = y_a$ for $a \neq b$ and the $\sigma$ index of the variable agrees with the $\sigma$ index of the column. Then, to compute $\sum_{\vec{M}} \Pr(\vec{M})^2$ we wish to compute the probability that for every pair of columns $C_1$ with $\sigma = 1$ and $C_2$ with $\sigma = 2$ (with $C_1,C_2$ having the same $b$ and $y_a$) we have $\sum_{i\in C_1} S_i = \sum_{i\in C_2} S_i$. Changing the sign of all variables with $\sigma = 2$, this is the same as requiring that $\sum_{i\in C_1} S_i + \sum_{i\in C_2} S_i = 0$.

We now redefine what a column means. For the rest of the proof of this lemma, a column will be labelled by a choice of an integer $b$ with $1 \le b \le f$ and a choice of integers $y_1,y_2,\ldots,y_{b-1},y_{b+1},\ldots$, without any $\sigma$ index. We say that a variable $i$ is in such a column $C$ if $i$ is labelled by $(x_1,\ldots,x_f)$ with $x_a = y_a$ for $a \neq b$; thus, for every column there are $2l$ variables in that column. So, we wish to estimate the probability that $\sum_{i\in C} S_i = 0$ for all columns $C$. We can express this as an integral:

$$\int_{[0,2\pi]^{n_C}} \Big(\prod_i \cos\big(\sum_{C\ni i}\theta_C\big)\Big)\Big(\prod_C \frac{d\theta_C}{2\pi}\Big) = \int_{[0,2\pi]^{n_C}} \Big(\prod_{i\ \mathrm{s.t.}\ \sigma=1} \cos\big(\sum_{C\ni i}\theta_C\big)^2\Big)\Big(\prod_C \frac{d\theta_C}{2\pi}\Big), \tag{III.5}$$

where $\theta_C$ is integrated over from $0$ to $2\pi$ for each column $C$. The product over $i$ in the left-hand side of this equation is over all variables $i$; using the fact that for each vector $(x_1,\ldots,x_f)$ there are two variables labelled by that vector, with $\sigma = 1,2$, we re-write the integral as in the right-hand side of the equation where we take the product only over variables with $\sigma = 1$ but we square the cosine. A similar integral can be used to express the probability in the original problem (i.e., without the $\sigma$ index) that we have a given $\vec{M}$.
However, the reason we have taken this $\sigma$ index is that the cosine term is squared so that now the integral is over a positive function. This makes it easier to lower bound the integral. Restricting to the region of the integral with $|\theta_C| \le 1/(f\sqrt{l})$, the sum $\sum_{C\ni i}\theta_C$ is bounded by $1/\sqrt{l}$ in every case, so that $\cos(\sum_{C\ni i}\theta_C)^2$ is lower bounded by $1 - 1/(2l)$. Hence, the integrand is lower bounded by $(2\pi)^{-n_C}(1-1/(2l))^N \ge (2\pi)^{-n_C}\,\mathrm{const.}^{N/l} = \mathrm{const.}^{n_C}$, for some positive constants. The volume of this integration domain is $(f\sqrt{l})^{-n_C}$. So, the result follows.

Remark: we expect that the factor of $f$ can be removed from Eq. (III.4) by a more careful estimate of the integral. More strongly, we conjecture that a similar lower bound holds for $n_{ze}$ in the case that $M_C = 0$ for all $C$.

Next, we show that

Lemma 4. For any choice of $M_C$ with $n_{ze} > 0$, all global minima are $(2^f-1)$-minima.

Proof. We must show that for every zero energy assignment, there is no other zero energy assignment within Hamming distance less than $2^f$. We prove this inductively on $f$. To prove the case $f = 1$, consider any assignment $A$ which is a zero energy assignment. Any assignment $B$ with Hamming distance 1 from $A$ must have sum $\sum_i S_i$ which is either $M_C + 2$ or $M_C - 2$, depending on whether one changes a single $S_i$ from $-1$ to $+1$ or $+1$ to $-1$.

Now we give the induction step. Assume the result holds for an $(f-1)$-dimensional hypercube. We now prove it for the $f$-dimensional hypercube. We re-write the Hamiltonian as

$$H = \sum_{1\le y\le l} H_0(y) + H_1, \tag{III.6}$$

where

$$H_0(y) = \sum_{\substack{\text{columns } C \text{ s.t.}\\ 1\le b<f \text{ and } y_f = y}}\Big(\sum_{i\in C} S_i - M_C\Big)^2, \tag{III.7}$$

$$H_1 = \sum_{\text{columns } C \text{ s.t. } b=f}\Big(\sum_{i\in C} S_i - M_C\Big)^2. \tag{III.8}$$

That is, the columns such that $b < f$ are in the sum $\sum_y H_0(y)$ with $y = y_f$, while the columns with $b = f$ are in the sum in $H_1$. Let assignment $A$ be a zero energy assignment. Suppose that assignment $B$ differs from assignment $A$ in some variable labelled by $(X_1,\ldots,X_f)$ and suppose that $B$ is a zero energy assignment also.
Then, $A,B$ are both zero energy assignments for $H_0(X_f)$ and so by the induction hypothesis, the assignments $A,B$ differ in at least $2^{f-1}$ variables such that the label of the variable $(x_1,\ldots,x_f)$ has $x_f = X_f$.

Snce B s also a zero energy assgnment of H 1, there must be some other varable labelled by (X 1,...,X f 1,Z f ) wth Z f X f such that A,B dffer n that varable. Then, snce A,B are both zero energy assgnments of H 0 (Z f ), agan by the nducton hypothess, the assgnments A,B dffer n at least 2 f 1 varables such the label of the varable (x 1,...,x f ) has x f = Z f. Hence, A,B dffer n at least 2 f varables. Hence, we arrve at: Lemma 5. There s a constant c > 0 such that for any f,l, there exsts choces of M C such that the Hamltonan (III.3) has at least 2 N ( c f l )nc global mnma, all of whch are 2 f 1 mnma. So, by consderng multple copes of the above nstance, usng that n C = fn/l n the above constructon, we fnd that: Theorem 1. There s a constant c > 0 such that for any f,l there s a Hamltonan whch has degree d = f(l 1) and whch has at least global mnma, all of whch are 2 f 1 mnma. 2 log(f l/c) N (1 f l ) 7 IV. NON-SPARSE CASE In ths secton we gve an upper bound on the number of local maxma for the case wth no degree bound. We frst need a techncal lemma, upper boundng the probablty that a weghted sum of a large number of Bernoull random varables wll fall wthn some nterval. We remark that f all the weghts are the same, then the desred result would follow from the Berry-Esseen theorem: we would have many random varables, all wth bounded second moments (and vanshng frst and thrd moments) and so the dstrbuton would converge to a Gaussan up to 1/ n errors n cumulatve dstrbuton functon. However, snce we allow arbtrary weghts, the sum may be far from a Gaussan and a separate proof s needed. Lemma 6. Let σ for = 1,...,m be ndependent random varables, unformly chosen ±1. Let Σ = a σ. Assume that there are at least n dfferent values of such that wth a a mn, for some a mn δ. Then, max h Pr( Σ+h δ) const. δ/amn n. (IV.1) Proof. Ths lemma s a corollary of lemma 8 proven n secton VI. Now, we prove Theorem 2. Consder an Isng nstance H on N varables, wth J j,h arbtrary. 
Then, there are at most $\mathrm{polylog}(N)\, N^{-1/2}\, 2^N$ local maxima.

Proof. We first construct a set $T$ of variables that are weakly coupled to each other and are at least as strongly coupled to many variables in $\overline{T}$, where the strength of the coupling between two variables $i,j$ is $|J_{ij}|$. We will pick a quantity $\epsilon$ later, with $\epsilon$ proportional to $\log(N)/N$. Let $T_0$ be a randomly chosen set of variables with $|T_0| = \epsilon N$. We will then label variables in this set $T_0$ as "good" or "bad". A variable $i$ is good if there are at least $(1/2)\epsilon^{-1}$ variables $j \notin T_0$ such that $|J_{ij}| \geq \max_{k \in T_0} |J_{ik}|$. Otherwise, $i$ is bad. Colloquially, if $i$ is good, then there are at least $(1/2)\epsilon^{-1}$ variables not in $T_0$ which are at least as strongly coupled to $i$ as any variable in $T_0$ is.

Let us estimate the probability, for a random choice of $T_0$, that a randomly chosen variable $i$ in $T_0$ is bad. This probability attains its maximum in the case that all $|J_{ij}|$ differ in absolute value for different choices of $j$. In this case, we need to estimate the probability that, given a set of $N-1$ elements all differing in magnitude, with $|T_0|-1$ elements chosen at random from this set, we choose at least one of the $(1/2)\epsilon^{-1}$ largest elements (i.e., that $T_0$ contains a $j$ such that $|J_{ij}|$ is one of the $(1/2)\epsilon^{-1}$ largest possible). This probability is not hard to compute exactly, but we give instead a simple estimate. The probability that any given one of these largest elements is chosen is at most $\epsilon$. Hence, the average number chosen is at most $1/2$ and so the probability that $i$ is bad is at most $1/2$. In case some $|J_{ij}|$ have the same absolute value, one can arbitrarily choose a set of $(1/2)\epsilon^{-1}$ distinct $j$ such that $|J_{ij}|$ for each $j$ in this set is at least as large as $|J_{ik}|$ for all $k$ not in this set, and then estimate the probability that one of the elements of this set is chosen to upper bound the probability that $i$ is bad in the same way.
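The first-moment estimate above can be compared against the exact value. The following is a minimal sketch, assuming the worst case discussed in the text (all coupling magnitudes distinct), so that the event that a variable is bad is the event that a uniformly random set of $|T_0|-1$ of the other $N-1$ variables contains one of the $(1/2)\epsilon^{-1}$ most strongly coupled ones. The function names and the sample values of $N$ and $\epsilon$ are illustrative assumptions, not taken from the text.

```python
from math import comb

def exact_bad_probability(n_vars, t0_size, n_top):
    """Exact probability that a random T_0 containing i also contains
    at least one of the n_top variables most strongly coupled to i.

    Assumes all coupling magnitudes are distinct, so the n_top
    variables are well defined.
    """
    others = n_vars - 1        # candidates for the rest of T_0
    chosen = t0_size - 1       # members of T_0 other than i
    miss_all = comb(others - n_top, chosen) / comb(others, chosen)
    return 1.0 - miss_all

def union_bound(n_vars, t0_size, n_top):
    """First-moment bound: expected number of top variables inside T_0."""
    return n_top * (t0_size - 1) / (n_vars - 1)

# Illustrative values (assumptions): N = 1000, epsilon = 0.02.
N, EPS = 1000, 0.02
T0_SIZE = round(EPS * N)       # |T_0| = epsilon * N
N_TOP = round(0.5 / EPS)       # (1/2) / epsilon

exact = exact_bad_probability(N, T0_SIZE, N_TOP)
bound = union_bound(N, T0_SIZE, N_TOP)
```

For these values the exact probability lies below the first-moment bound, which in turn is below $1/2$, consistent with the estimate in the text.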

Hence, for a random choice of $T_0$, the average number of good variables is at least $(1/2)\epsilon N$, and so there must be some choice of $T_0$ such that there are at least $(1/2)\epsilon N$ good variables. Choose $T$ to be the set of good variables for that choice of $T_0$, so that $|T| \geq (1/2)\epsilon N$.

Now, choose a random assignment to all variables in $\overline{T}$. Given such an assignment, we compute the effective Hamiltonian $H^{\mathrm{eff}}$ for variables in $T$. Recall that if $|h_i^{\mathrm{eff}}| \geq h_i^{\max} \equiv \sum_{j \in T} |J_{i,j}|$ we say that $i$ is fixed, otherwise, $i$ is free. We now consider the probability that a given variable $i \in T$ is free. We will apply lemma 6 as follows. Let $a_{\min} = \max_{j \in T} |J_{ij}|$. We can assume that $a_{\min} > 0$, otherwise $i$ is trivially fixed. Then, $h_i^{\max} \leq a_{\min}\epsilon N$ as $|T| \leq \epsilon N$, and $h_i^{\mathrm{eff}} = h_i + \sum_{j \notin T} J_{ij} S_j$. So,
$$\Pr(|h_i^{\mathrm{eff}}| < h_i^{\max}) \leq \max_h \Pr\Bigl(\bigl|h + \sum_{j \notin T} J_{ij} S_j\bigr| \leq a_{\min}\epsilon N\Bigr).$$
There are at least $(1/2)\epsilon^{-1}$ choices of $j \notin T$ such that $|J_{ij}| \geq a_{\min}$, so by lemma 6, the probability that $i$ is free is bounded by $\mathrm{const.} \times (\epsilon N)\epsilon^{1/2}$. Hence, by a union bound, the probability that at least one variable $i \in T$ is free is at most $\mathrm{const.} \times |T|(\epsilon N)\epsilon^{1/2} = cN^2\epsilon^{5/2}$, for some constant $c > 0$.

Now, if no variable in $T$ is free, then $H^{\mathrm{eff}}$ has exactly 1 local minimum by lemma 2. There are $2^{N-|T|}$ assignments to variables in $\overline{T}$, and we have established that at most $cN^2\epsilon^{5/2}\,2^{N-|T|}$ such assignments have more than one local minimum; each of those has at most $2^{|T|}$ local minima. Hence, there are at most
$$2^N\bigl(2^{-|T|} + cN^2\epsilon^{5/2}\bigr) \leq 2^N\bigl(2^{-((1/2)\epsilon N - 1)} + cN^2\epsilon^{5/2}\bigr) \tag{IV.2}$$
local minima for $H$. Choosing $\epsilon = \log(N)/N$, we find that the above equation is bounded by $2^N\bigl(2/\sqrt{N} + \mathrm{polylog}(N)/\sqrt{N}\bigr)$, where the polylog is bounded by a constant times $\log(N)^{5/2}$.

V. SPARSE CASE

In this section, we give an upper bound on local minima for the sparse case. Given an Ising instance, define a graph $G$ whose vertices correspond to variables, with an edge between two variables $i,j$ if $J_{ij} \neq 0$. Let $V$ be the set of vertices. The degree of a vertex is defined as usual in graph theory; it is the number of edges attached to that vertex. We will prove

Theorem 3. There are constants $\kappa,\lambda > 0$ such that the following holds. Consider an Ising instance and define the graph $G$ as above.
Suppose that $G$ has average vertex degree bounded by $d$. Then, there are at most $2^{N(1-\kappa\log(\lambda d)/d)}$ local minima. Further, there is a deterministic algorithm taking polynomial space and time $\tilde{O}(2^{N(1-\kappa\log(\lambda d)/d)})$ which finds the assignment which minimizes $H$.

We prove this theorem by proving a similar bound in the case of bounded maximum degree:

Theorem 4. There are constants $\kappa',\lambda' > 0$ such that the following holds. Consider an Ising instance and define the graph $G$ as above. Suppose that $G$ has maximum vertex degree bounded by $d$. Then, there are at most $2^{N(1-\kappa'\log(\lambda' d)/d)}$ local minima. Further, there is a deterministic algorithm taking polynomial space and time $\tilde{O}(2^{N(1-\kappa'\log(\lambda' d)/d)})$ which finds the assignment which minimizes $H$.

Proof of theorem 3 assuming theorem 4: let $W$ be the set of variables with degree at most $2d$. Since $G$ has average vertex degree $d$, $|W| \geq N/2$. For each of the $2^{N-|W|}$ assignments to variables in $\overline{W} = V \setminus W$, we construct an effective Hamiltonian for variables in $W$. Applying theorem 4 to this effective Hamiltonian shows that there are at most $2^{|W|(1-\kappa'\log(2\lambda' d)/(2d))}$ local minima of this Hamiltonian and so there are at most
$$2^{N-|W|}\, 2^{|W|(1-\kappa'\log(2\lambda' d)/(2d))} \leq 2^{N(1-\kappa'\log(2\lambda' d)/(4d))}$$
local minima of $H$. Similarly, we can minimize $H$ by iterating over assignments to variables in $\overline{W}$ and then minimizing the effective Hamiltonian using the algorithm of theorem 4. So, theorem 3 follows with $\kappa = \kappa'/4$.

So, we now focus on proving theorem 4. First, in subsection V A, we construct a set $T$ which is in some ways analogous to the set $T$ constructed in the non-sparse case above, in that vertices $i$ in $T$ will have many edges with large $|J_{ij}|$ to vertices $j \notin T$ and will have small $h_i^{\max} \equiv \sum_{j \in T} |J_{ij}|$. Then, in subsection V B, we complete the proof of the theorem, by showing that for a random assignment to vertices in $\overline{T}$, the effective Hamiltonian $H^{\mathrm{eff}}$ for vertices in $T$ will have many vertices fixed. The number of local minima of $H^{\mathrm{eff}}$ will be bounded by $2^{|F|}$ where $F$ is the set of free variables, and we will bound the sum of this quantity over all assignments to vertices in $\overline{T}$. We will refer to this sum as a "partition function", $Z$.
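The reduction from theorem 3 to theorem 4 above rests on the counting claim that at least half of the vertices have degree at most twice the average. A small sketch of that step follows, assuming a graph stored as a dictionary of neighbor sets; the helper names and the random test graph are hypothetical and only serve to illustrate the averaging argument.

```python
import itertools
import random

def random_graph(n, m, seed=0):
    """Return an adjacency dict for a graph with n vertices and m random edges."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(pairs, m):
        adj[u].add(v)
        adj[v].add(u)
    return adj

def low_degree_set(adj):
    """Set W of vertices whose degree is at most twice the average degree d."""
    n = len(adj)
    d = sum(len(nbrs) for nbrs in adj.values()) / n   # average degree
    return {v for v, nbrs in adj.items() if len(nbrs) <= 2 * d}

# Illustrative graph (an assumption): 50 vertices, 100 random edges.
adj = random_graph(n=50, m=100, seed=1)
W = low_degree_set(adj)
```

By Markov's inequality fewer than half of the vertices can have degree above $2d$, so the property $|W| \geq N/2$ holds for any graph, not only for this sample.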
Then, the algorithm of theorem 4 will simply be: (1) construct $T$; (2) iterate over assignments to variables in $\overline{T}$. For each assignment, compute $H^{\mathrm{eff}}$ (this can be done in polynomial time) and then compute the set $F$ of free variables in $T$. An optimal assignment to the fixed variables (those in $T \setminus F$) can be computed in linear time given $H^{\mathrm{eff}}$, and then one can define a new effective Hamiltonian for the variables in $F$, taking the assignment to the variables in $T \setminus F$ and the variables in $\overline{T} = V \setminus T$ as given. Finally, this new effective Hamiltonian can have its minimum found by iterating over all assignments to variables in $F$. Since there are $2^{|F|}$ such assignments, the total run time is equal to the sum over all assignments to variables in $\overline{T}$ of $2^{|F|}$ times a polynomial; i.e., it is equal to $Z$ times a polynomial. The polynomial factor is the time required to compute the effective Hamiltonians and find the set $T$ and other sets.

A. Construction of set T

We will give first a randomized construction of $T$, and then use that to give a deterministic algorithm to find $T$. We will have $|T| = \Theta(\epsilon N)$ where we later pick $\epsilon = \log(d)/d$. The run time of the algorithm will be exponential in $N$, but for small $\epsilon$ this runtime is small compared to the upper bounds on the runtime of the algorithm in theorem 4.

It is important to understand that the particular choice of $\epsilon$ does not matter too much. We have picked an optimal value (up to constants) for the proof, as will be clear later. However, to give a rough idea of the appropriate value of $\epsilon$: first, we need to pick $\epsilon$ at least $\mathrm{const.} \times \log(d)/d$, as otherwise, even if all variables in $T$ were fixed with probability 1, we would not obtain a good bound on the number of local minima. Second, we can actually pick $\epsilon$ significantly larger than $\log(d)/d$; we could have for example picked $\epsilon = d^{-\alpha}$ for any exponent $\alpha > 1/2$ and we would still have a meaningful bound (though not quite as tight).
The point is that we will need $d_{T,\overline{T}}$ (as defined in the lemma) large enough that $\sqrt{d_{T,\overline{T}}} \gg d_T$ so that the interactions within the set $T$ (which at worst case have strength $d_T$) are small compared to the interactions between a variable in $T$ and a variable in $\overline{T}$ (which on average have strength $\sqrt{d_{T,\overline{T}}}$).

Lemma 7. Assume the conditions of theorem 4 hold. For all sufficiently small $\epsilon$, there is a set $T \subset V$ with $|T| = \Theta(\epsilon N)$, such that the following properties hold. Define the graph $G$ as above. Define a bipartite graph $G_{T,\overline{T}}$ containing only the edges between vertices $i \in T$ and $j \notin T$. Define a graph $G_T$ which is the induced subgraph of $G$ containing only vertices $i \in T$. Then, first, for every $i \in T$, the degree of that vertex in $G_T$ is at most $d_T$ where $d_T \equiv 99\epsilon d$. Second, for every $i \in T$, the number of $j \notin T$ such that $|J_{ij}| \geq \max_{k \in T} |J_{ik}|$ is at least $d_{T,\overline{T}}$ where $d_{T,\overline{T}} = (1/99)\epsilon^{-1}$. For each $i$, if the degree of the vertex in $G_T$ is nonzero, then we pick $d_{T,\overline{T}}$ edges in $G_{T,\overline{T}}$ which connect $i$ to $j$ such that $|J_{ij}| \geq \max_{k \in T} |J_{ik}|$ and we call these "strong edges". Then, third, for every $i \in T$, the sum over first neighbors of $i$ in $G_{T,\overline{T}}$ of the number of strong edges attached to that first neighbor is at most $\Delta$ with $\Delta = 99d$. Further, such a set $T$ can be constructed by a deterministic algorithm taking time $\mathrm{poly}(N)\binom{N}{|T|}$. This time is $\tilde{O}(c_0^N)$, where $c_0$ tends to 1 as $|T|/N$ tends to 0.

Proof. Let $T_0$ be a randomly chosen subset of $V$ where we independently choose for each vertex whether or not it is in $T_0$, choosing it to be in $T_0$ with probability $\epsilon$. With high probability, $|T_0| \geq (1-o(1))\epsilon N$. We will test various properties of the vertices, labeling certain vertices as "good" or "bad" depending on whether or not they obey these properties. Define a bipartite graph $G_{T_0,\overline{T_0}}$ containing only the edges between vertices $i \in T_0$ and $j \notin T_0$. Define a graph $G_{T_0}$ which is the induced subgraph of $G$ containing only vertices $i \in T_0$. Every $i \in T_0$ which has degree 0 in $G_{T_0}$ is labelled as good. Every $i \in T_0$ which has nonzero degree in $G_{T_0}$ is labelled as good if the following three properties hold, and otherwise we label $i$ as bad. First, the degree of that vertex in $G_{T_0}$ is at most $d_T$.
Second, the number of distinct $j \notin T_0$ such that $|J_{ij}| \geq \max_{k \in T_0} |J_{ik}|$ is at least $d_{T,\overline{T}}$. Assuming $i$ obeys these two criteria, we then randomly choose $d_{T,\overline{T}}$ of these $j$ and call the edge connecting $i,j$ a strong edge. Then, third, for every $i \in T_0$, the sum over first neighbors of $i$ in $G_{T_0,\overline{T_0}}$ of the number of strong edges attached to that first neighbor is at most $\Delta$. What we will show is that a random vertex in $T_0$ has at least some constant positive probability of being good. Hence, there is a choice of $T_0$ such that at least a constant fraction of vertices in $T_0$ are good. We then set $T$ to be the set of good vertices in $T_0$, and the desired properties will follow.

We upper bound the probability that a random vertex $i \in T_0$ does not obey each of the three properties above, using three separate first moment bounds. We then apply a union bound to upper bound the probability that the vertex is bad.

The expected degree of $i$ in $G_{T_0}$ is at most $\epsilon d$, so the probability that the vertex does not obey the first property above is at most $1/99$.

The probability that the number of distinct $j \notin T$ such that $|J_{ij}| \geq \max_{k \in T} |J_{ik}|$ fails to be at least $d_{T,\overline{T}}$ can be bounded similarly to the non-sparse case: construct a set of $j \notin T$ containing $d_{T,\overline{T}}$ elements such that $|J_{ij}|$ is at least as large for all $j$ in this set as $|J_{ik}|$ is for all $k$ not in this set. Note that if $i$ has degree sufficiently small, then it may be necessary to include some $j$ such that $J_{ij} = 0$. Then, the average number of $j$ in this set which are in $T_0$ is $\epsilon d_{T,\overline{T}} = 1/99$, so there is a probability at most $1/99$ that $i$ does not obey the second property.

Before giving a detailed proof of the third property, let us give a heuristic estimate: there are $\Theta(\epsilon N)$ variables in $T$, each with $\Theta(\epsilon^{-1})$ strong edges, so that there are $\Theta(N)$ strong edges in total, so the average number of strong edges attached to each vertex in $\overline{T}$ is $\Theta(1)$. Each $i \in T$ has at most $d$ edges to vertices in $\overline{T}$, so on average one might guess that there are $\Theta(d)$ edges attached to those vertices. To give a proof, though, we need to consider correlations more carefully, as we now do. Note, however, that this heuristic (and the proof below) both show that the value of $\Delta$ does not depend on the $\epsilon$ that we have chosen; this is one of the reasons why the proof of the theorem would work even if we had chosen a larger value of $\epsilon$, as we noted above this lemma.

To show the third property, we first bound the number of triples $i,j,k$ with $i,k \in T_0$ and $j \notin T_0$ with $i,j$ connected by an edge and $j,k$ connected by a strong edge, and $k$ good.
First consider the triples with $i,k$ neighbors. There are at most $|T_0|$ choices of $k$ and for each $k$ there are at most $d_T$ neighbors $i \in T_0$ and at most $d_{T,\overline{T}}$ strong edges in total, so there are at most $|T_0|\, d_T\, d_{T,\overline{T}} \leq |T_0|\, d$ such triples. Now consider the triples with $i,k$ not neighbors. There are at most $|T_0|\, d_{T,\overline{T}}$ choices of $j,k$ and there are at most $|T_0|\, d_{T,\overline{T}}\, d$ vertices $i$ which neighbor $j$. Since $k$ is not a neighbor of $i$, the set of strong edges attached to $k$ is independent of whether or not such $i$ is in $T_0$, and hence there are on average at most $|T_0|\, d_{T,\overline{T}}\, d\,\epsilon \leq |T_0|\, d/99$ such triples. Hence, there are on average at most $|T_0|\, d(1 + 1/99)$ triples in total. Hence, for random $i \in T_0$, the expected number of such $j,k$ is at most $d(1+1/99)$. Hence, the probability that the vertex $i$ does not obey the third property is at most $(1/99)(1+1/99)$.

So, by a union bound, $i$ is bad with probability at most $1/99 + 1/99 + (1/99)(1+1/99) < 4/99$ and so $i$ is good with probability $> 95/99$. So, for a random choice of $T_0$, the average number of good variables is at least $(95/99)|T_0|$ and so there must be some choice of $T_0$ such that there are at least $(95/99)(1-o(1))\epsilon N$ good variables. Choose $T$ to be the set of good variables for that choice of $T_0$.

This gives a randomized construction of $T$. However, a set $T$ with these properties can be constructed using a deterministic algorithm simply by iterating over all $\binom{|W|}{|T|} \leq \binom{N}{|T|}$ choices of $T$ and checking each choice to see whether it has these properties.

B. Bound on Minima and Algorithm

We consider an assignment to variables in $V \setminus T$. We denote this assignment $A$. We will choose $\epsilon = \log(d)/d$ in the construction of the set $T$. We define $H^{\mathrm{eff}}$ as before:
$$H^{\mathrm{eff}} = \frac{1}{4}\sum_{i \in T} h_i^{\mathrm{eff}} S_i + \frac{1}{4}\sum_{i<j,\; i \in T,\, j \in T} J_{ij} S_i S_j, \tag{V.1}$$
where
$$h_i^{\mathrm{eff}} = h_i + \sum_{j \notin T} J_{ij} S_j. \tag{V.2}$$
We define a "partition function" $Z$ (to borrow the language of statistical physics) to be the sum, over all assignments $A$ to variables in $V \setminus T$, of $2^{|F|}$. We will estimate $Z$. Then, as explained at the start of the section, this gives the desired bound on the number of local minima and gives the desired algorithm.
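The role of $Z$ can be illustrated by brute force on a tiny instance. The sketch below makes several assumptions that are not from the text: the energy is taken as $H = \sum_i h_i S_i + \sum_{i<j} J_{ij} S_i S_j$ (overall positive prefactors are dropped since they do not affect which assignments are minima), a local minimum is taken to be strict (every single-spin flip raises the energy), couplings are small random integers, and $T$ is an arbitrary subset rather than the set constructed in lemma 7. All function names are hypothetical.

```python
import itertools
import random

def random_instance(n, seed=0):
    """Random integer fields h and symmetric integer couplings J (zero diagonal)."""
    rng = random.Random(seed)
    h = [rng.randint(-3, 3) for _ in range(n)]
    J = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.randint(-3, 3)
    return h, J

def is_strict_local_min(spins, h, J):
    """True if flipping any single spin strictly raises the energy."""
    n = len(spins)
    for i in range(n):
        field = h[i] + sum(J[i][j] * spins[j] for j in range(n) if j != i)
        # Flipping spin i changes the energy by -2 * spins[i] * field.
        if spins[i] * field >= 0:
            return False
    return True

def count_local_minima(h, J):
    """Brute-force count of strict local minima over all 2^n assignments."""
    n = len(h)
    return sum(is_strict_local_min(s, h, J)
               for s in itertools.product((-1, 1), repeat=n))

def partition_function(h, J, T):
    """Sum of 2^|F| over all assignments to the variables outside T."""
    n = len(h)
    inside = sorted(T)
    outside = [j for j in range(n) if j not in T]
    Z = 0
    for values in itertools.product((-1, 1), repeat=len(outside)):
        S = dict(zip(outside, values))
        n_free = 0
        for i in inside:
            h_eff = h[i] + sum(J[i][j] * S[j] for j in outside)
            h_max = sum(abs(J[i][j]) for j in inside if j != i)
            if abs(h_eff) < h_max:      # i is free; otherwise it is fixed
                n_free += 1
        Z += 2 ** n_free
    return Z

# Illustrative instance (an assumption): 8 variables, T = {0, 1, 2}.
h, J = random_instance(8, seed=0)
T = {0, 1, 2}
Z = partition_function(h, J, T)
n_minima = count_local_minima(h, J)
```

With these conventions the number of strict local minima never exceeds $Z$, since in a strict local minimum each fixed variable is determined by the sign of its effective field.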

We will estimate the number of fixed variables by constructing a sequence of variables $i_1, i_2, \ldots, i_{|T|}$, where every variable in $T$ appears in the sequence exactly once. As we construct this sequence, we estimate the probability that a given variable $i_a$ is fixed or free, where the probability is over random assignments to variables in $V \setminus T$. Thus, the partition function $Z$ is equal to $2^{N-|T|}$ times the sum over all possible sequences of events (i.e., for each $i_a$, whether $i_a$ is fixed or not, so that there are $2^{|T|}$ possible sequences of events) of the probability of that sequence of events times $2^{|F|}$.

The probability that $i_1$ (the first variable in the sequence) is fixed will be easy to estimate similarly to the non-sparse case: the terms $J_{ij}S_j$ will be uncorrelated random variables, so $h^{\mathrm{eff}}_{i_1}$ is a sum of uncorrelated random variables. However, the event that $i_a$ is fixed for $a > 1$ will be correlated with the previous events of whether or not $i_1, i_2, \ldots, i_{a-1}$ are fixed. To keep track of these correlations, we introduce another sequence of sets, $R(a) \subset V \setminus T$; the set $R(a)$ is a set of variables $j$ for $j \notin T$ that have been "revealed" as explained later. We set $R(0) = \emptyset$. During the construction the set $R$ will have additional variables added to it as follows. If variable $i_a$ is free, then $R(a)$ is equal to $R(a-1)$ union the set of $j$ such that $J_{i_a j} \neq 0$. If variable $i_a$ is fixed, then $R(a) = R(a-1)$.

We explain later how we choose the $i_1, i_2, \ldots$ The choice of $i_a$ will depend upon the previous events, i.e., on which $i_b$ for $b < a$ are fixed.

Consider the $a$-th step. We wish to estimate the probability that $|h^{\mathrm{eff}}_{i_a}| < h^{\max}_{i_a}$. We write
$$h^{\mathrm{eff}}_{i_a} = h^{\mathrm{eff},0}_{i_a} + h^{\mathrm{eff},1}_{i_a}, \tag{V.3}$$
where
$$h^{\mathrm{eff},0}_{i_a} = h_{i_a} + \sum_{j \in R(a-1)} J_{i_a j} S_j, \tag{V.4}$$
and
$$h^{\mathrm{eff},1}_{i_a} = \sum_{j \in (V \setminus T) \setminus R(a-1)} J_{i_a j} S_j. \tag{V.5}$$
So, the probability that $|h^{\mathrm{eff}}_{i_a}| < h^{\max}_{i_a}$ is equal to the probability that $|h^{\mathrm{eff},1}_{i_a} + h^{\mathrm{eff},0}_{i_a}| < h^{\max}_{i_a}$.

Consider a given sequence of events, such as $i_1$ free, $i_2$ fixed, $i_3$ fixed, $i_4$ free. We have
$$\Pr(i_1, i_4 \in F;\ i_2, i_3 \notin F) = \Pr(i_4 \in F \mid i_1 \in F;\ i_2, i_3 \notin F)\,\Pr(i_3 \notin F \mid i_1 \in F,\ i_2 \notin F)\,\Pr(i_2 \notin F \mid i_1 \in F)\,\Pr(i_1 \in F) \leq \Pr(i_4 \in F \mid i_1 \in F)\,\Pr(i_1 \in F). \tag{V.6}$$
In general, we can apply this inequality to any sequence of events: the probability that the set $F$ contains exactly the variables $i_{a_1}, i_{a_2}, i_{a_3}, \ldots$ for $a_1 < a_2 < a_3 \ldots$ is bounded by the product of conditional probabilities, $\Pr(i_{a_3} \in F \mid i_{a_1}, i_{a_2} \in F)\,\Pr(i_{a_2} \in F \mid i_{a_1} \in F)\,\Pr(i_{a_1} \in F)\cdots$. This inequality is behind the usage of the term "revealed" above: by computing just this product of conditional probabilities, where the only events conditioned on are events where variables are found to be in $F$ and we never condition on an event that a variable is not in $F$, we can treat all the terms $J_{i_a j}$ for $j$ that have not been revealed as independent random events.

To compute a probability such as $\Pr(i_{a_k} \in F \mid i_{a_1}, \ldots, i_{a_{k-1}} \in F)$, we compute the probability that $|h^{\mathrm{eff},1}_{i_{a_k}} + h^{\mathrm{eff},0}_{i_{a_k}}| < h^{\max}_{i_{a_k}}$. The random variable $h^{\mathrm{eff},0}_{i_{a_k}}$ may be correlated with the event that $i_{a_1}, \ldots, i_{a_{k-1}} \in F$ in some complicated way, and thus conditioning on this event may give some complicated distribution to this random variable. However, the random variable $h^{\mathrm{eff},1}_{i_{a_k}}$ is uncorrelated with the event that $i_{a_1}, \ldots, i_{a_{k-1}} \in F$. We have
$$\Pr(i_{a_k} \in F \mid i_{a_1}, \ldots, i_{a_{k-1}} \in F) = \sum_h \Pr(h^{\mathrm{eff},0}_{i_{a_k}} = h \mid i_{a_1}, \ldots, i_{a_{k-1}} \in F)\,\Pr(|h^{\mathrm{eff},1}_{i_{a_k}} + h| < h^{\max}_{i_{a_k}}) \leq \max_h \Pr(|h^{\mathrm{eff},1}_{i_{a_k}} + h| < h^{\max}_{i_{a_k}}). \tag{V.7}$$

At this point, we use lemma 6 as in the non-sparse case. Let $a_{\min} = \max_{j \in T} |J_{i_{a_k} j}|$, so $h^{\max}_{i_{a_k}} \leq d_T\, a_{\min}$. For any $i$, let $d^{\mathrm{unr}}_i(a)$ denote the number of strong edges connecting $i$ to vertices $j \notin R(a-1)$. That is, it is the number of distinct $j \in (V \setminus T) \setminus R(a-1)$ such that $|J_{ij}| \geq \max_{k \in T} |J_{ik}|$. The suffix "unr" is short for "unrevealed". Then, by lemma 6,
$$\max_h \Pr(|h^{\mathrm{eff},1}_{i_{a_k}} + h| < h^{\max}_{i_{a_k}}) \leq \mathrm{const.} \times d_T/\sqrt{d^{\mathrm{unr}}_{i_{a_k}}(a)}.$$
We next lower bound $d^{\mathrm{unr}}_{i_{a_k}}(a)$.

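Let $T(a) = T \setminus \{i_1, i_2, \ldots, i_{a-1}\}$ so that $|T(a)| = |T| - (a-1)$. Let $E^{\mathrm{unr}}(a)$ equal the sum of $d^{\mathrm{unr}}_i(a)$ over $i \in T(a)$. The average, over $i \in T(a)$, of $d^{\mathrm{unr}}_i(a)$ is equal to $E^{\mathrm{unr}}(a)/(|T| - (a-1))$. Choosing $i_a$ to be a variable in $T(a)$ which maximizes $d^{\mathrm{unr}}_{i_a}$, we can ensure that $d^{\mathrm{unr}}_{i_a}(a) \geq E^{\mathrm{unr}}(a)/(|T| - (a-1))$. Let $f(a)$ denote the number of variables $i_1, \ldots, i_a$ which are free. So,
$$E^{\mathrm{unr}}(a) \geq (|T| - (a-1))\, d_{T,\overline{T}} - \Delta f(a-1). \tag{V.8}$$
Hence,
$$d^{\mathrm{unr}}_{i_a}(a) \geq d_{T,\overline{T}}\left(1 - \frac{\Delta}{d_{T,\overline{T}}}\,\frac{f(a-1)}{|T| - (a-1)}\right). \tag{V.9}$$
Hence,
$$f(a-1) \leq \frac{1}{2}\bigl(|T| - (a-1)\bigr)\frac{d_{T,\overline{T}}}{\Delta} \tag{V.10}$$
$$\Longrightarrow\quad d^{\mathrm{unr}}_{i_a}(a) \geq \frac{d_{T,\overline{T}}}{2}. \tag{V.11}$$
Note that the left-hand side of the inequality on the left-hand side of the implication in Eq. (V.11) is a non-decreasing function of $a$ while the right-hand side of that inequality is a decreasing function of $a$, so if the inequality fails to hold for some given $a$, then it also fails for all larger $a$.

Given a sequence of events of whether or not $i_1, i_2, \ldots, i_a$ are in $F$, we say that the sequence "terminates at $a$" if $f(a-1) \leq \frac{1}{2}(|T| - (a-1))\, d_{T,\overline{T}}/\Delta$ and $f(a) > \frac{1}{2}(|T| - a)\, d_{T,\overline{T}}/\Delta$. We can upper bound $Z$ by summing over sequences of events up to the step $a$ at which the sequence terminates and then using as an upper bound the assumption that after that point, all variables $i_b$ for $b > a$ are free with probability 1.

Before the sequence terminates, we have
$$\max_h \Pr(|h^{\mathrm{eff},1}_{i_{a_k}} + h| < h^{\max}_{i_{a_k}}) \leq \mathrm{const.} \times d_T/\sqrt{d^{\mathrm{unr}}_{i_{a_k}}(a)} \leq \mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}.$$
So,
$$Z \leq 2^N \sum_a \sum_{\substack{\text{events}\\ \text{seq. terminates at } a}} \Bigl[\prod_{b \leq a,\ \text{s.t. } i_b \in F} \bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)\Bigr]\, 2^{|F| - |T|}$$
$$\leq 2^N \sum_a \sum_{\substack{\text{events}\\ \text{seq. terminates at } a}} 2^{-a} \prod_{b \leq a,\ \text{s.t. } i_b \in F} \bigl(2\,\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)$$
$$= 2^N \sum_a \sum_{\substack{\text{events}\\ \text{seq. terminates at } a}} 2^{-a} \bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)^{f(a)}, \tag{V.12}$$
where the sum is over sequences of events terminating at the given $a$. The factor in parenthesis on the first line is the upper bound on the probability that $i_b$ is free, conditioned on previous variables being free; the factor is bounded by 1 because it is a probability. The factor $2^{-a}$ on the second line multiplied by the factor of 2 for every $i_b \in F$ for $b \leq a$ is an upper bound on $2^{|F| - |T|}$.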
On the last line, we absorbed the 2 into the constant, so that the factor in parenthesis in the last line is bounded by 2.

In order for the sequence to terminate at $a$, we must have $f(a) > (1/2)(|T| - a)\, d_{T,\overline{T}}/\Delta$. Thus, $a > |T| - 2f(a)\Delta/d_{T,\overline{T}}$. Thus,
$$Z \leq 2^N \sum_{f(a)} \bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)^{f(a)} \sum_{a > |T| - 2f(a)\Delta/d_{T,\overline{T}}} \binom{a}{f(a)}\, 2^{-a}. \tag{V.13}$$
The factor $\binom{a}{f(a)}$ counts the number of sequences with the given $f(a)$. The factor $\binom{a}{f(a)} 2^{-a}$ is exponentially small in $a$ unless $a \approx 2f(a)$.

We break the sum over $f(a)$ into two parts. The first is a sum over $f(a)$ such that $2f(a)\Delta/d_{T,\overline{T}} \leq |T|/2$. The second part is the sum over the remaining $f(a)$. In the first sum, we always have $a \geq |T|/2 \geq 2f(a)\Delta/d_{T,\overline{T}}$ so that the factor $\binom{a}{f(a)} 2^{-a}$ is exponentially small in $a$; we will have $d_{T,\overline{T}} \ll \Delta$ so that in fact the exponent is close to $1/2$. Thus, the first sum is $O(c_1^{|T|})$ for a constant $c_1 < 1$ (the constant $c_1$ is slightly larger than $1/2$; the amount it is larger depends on $d_{T,\overline{T}}/\Delta$). As for the second sum, each term is bounded by $\bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)^{f(a)}$ where $f(a) \geq (|T|/2)\, d_{T,\overline{T}}/(2\Delta)$. Since the number of terms in the sum is bounded by $|T|$, the second sum is bounded by
$$|T|\, \bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)^{(|T|/2)\, d_{T,\overline{T}}/(2\Delta)}.$$

Hence,
$$Z \leq 2^N \Bigl( O(c_1^{|T|}) + |T|\, \bigl(\mathrm{const.} \times d_T/\sqrt{d_{T,\overline{T}}}\bigr)^{(|T|/2)\, d_{T,\overline{T}}/(2\Delta)} \Bigr). \tag{V.14}$$
We have $d_T/\sqrt{d_{T,\overline{T}}} = O(\log^{3/2}(d)/\sqrt{d})$ and $(|T|/2)\, d_{T,\overline{T}}/(2\Delta) = \Omega(N/d)$. Hence,
$$Z \leq 2^N \Bigl( O(c_1^{\epsilon N}) + O\bigl(\log^{3/2}(d)/\sqrt{d}\bigr)^{\Omega(N/d)} \Bigr) = 2^N\, O\bigl(2^{-\Omega(N\log(d)/d)} + 2^{-\Omega(N\log(d)/d)}\bigr) = 2^N\, 2^{-\Omega(N\log(d)/d)}. \tag{V.15}$$
The reader can now see why we have chosen $\epsilon$ as we did; it is so that both terms will be comparable in the above equation to get the optimal bound. However, even if we had chosen $\epsilon$ larger ($\epsilon = d^{-\alpha}$ for $\alpha > 1/2$), we would have still obtained a bound $Z \leq 2^N\, 2^{-\Omega(N\log(d)/d)}$. The only way in which the bound would be worse would be that the constant hidden by the $\Omega(\ldots)$ notation would be smaller. The reason is that such a larger $\epsilon$ would still lead to $(|T|/2)\, d_{T,\overline{T}}/(2\Delta) = \Omega(N/d)$, but $d_T/\sqrt{d_{T,\overline{T}}}$ would be larger.

VI. SUM OF RANDOM VARIABLES

This section is devoted to the proof of the following lemma:

Lemma 8. Let $\sigma_i$ for $i = 1,\ldots,n$ be independent random variables, uniformly chosen $\pm 1$. Let $\Sigma = \sum_i a_i \sigma_i$, with $|a_i| \geq 1$ for all $i$. Let $\delta \geq 1$. Then,
$$\max_h \Pr(|\Sigma + h| \leq \delta) \leq \mathrm{const.} \times \frac{\delta}{\sqrt{n}}. \tag{VI.1}$$

Proof. We have
$$\max_h \Pr(|\Sigma + h| \leq \delta) \leq e^{1/2} \max_h E\bigl[\exp(-(\Sigma + h)^2/2\delta^2)\bigr], \tag{VI.2}$$
where $E[\ldots]$ denotes the expectation value over choices of $\sigma_i$. Fourier transforming, we wish to evaluate
$$\mathrm{const.} \times \delta \int dk\, \exp(-k^2\delta^2/2)\exp(ikh) \prod_{i=1}^n \cos(a_i k)$$
for some numerical constant. By a triangle inequality, this is bounded by
$$\mathrm{const.} \times \frac{\delta}{\sqrt{2\pi}} \int dk\, \exp(-k^2\delta^2/2) \prod_{i=1}^n |\cos(a_i k)|,$$
which is independent of $h$ so that we do not need to take a maximum. We write
$$\frac{\delta}{\sqrt{2\pi}} \int dk\, \exp(-k^2\delta^2/2) \prod_{i=1}^n |\cos(a_i k)| = E_\delta\Bigl[\prod_{i=1}^n |\cos(a_i k)|\Bigr], \tag{VI.3}$$
where $E_\delta[\ldots]$ denotes an expectation value for a random choice of $k$ from the Gaussian distribution $\frac{\delta}{\sqrt{2\pi}}\exp(-k^2\delta^2/2)\,dk$. This allows us to use the language of probability, which will make certain arguments more clear. For the remainder of the proof, all probabilities and expectation values refer to expectation values with respect to this Gaussian distribution.

We define certain disjoint events. The first will be the event where the product $\prod_{i=1}^n |\cos(a_i k)|$ is in the interval $(e^{-1}, 1]$.
The second is where the product is in the interval $(e^{-2}, e^{-1}]$, and so on, so that in the $b$-th event, this product will be in the interval $(e^{-b}, e^{1-b}]$. In order for the $b$-th event to occur, it must be the case that for at least half of the $i$, we have $|\cos(a_i k)| \geq \exp(-2b/n)$. To estimate the probability of the $b$-th event, we claim (and we show in the next paragraph) that the probability (for any given $i$) that $|\cos(a_i k)| \geq e^{-2b/n}$ is bounded by $\mathrm{const.} \times \delta\sqrt{b/n}$. Hence, the expected number of $i$ such that $|\cos(a_i k)| \geq e^{-2b/n}$ is bounded by $\mathrm{const.} \times \delta n\sqrt{b/n}$. Hence, the probability that at least half of the $a_i$ have $|\cos(a_i k)| \geq e^{-2b/n}$ is bounded by $2\,\mathrm{const.} \times \delta\sqrt{b/n}$. So, Eq. (VI.3) is bounded by
$$\mathrm{const.} \times \sum_{b=1}^{\infty} \exp(1-b)\, \delta\sqrt{b/n} = \mathrm{const.} \times \delta\sqrt{1/n}.$$

We finally show that the probability that $|\cos(a_i k)| \geq e^{-2b/n}$ is indeed bounded by $\mathrm{const.} \times \delta\sqrt{b/n}$. If $|\cos(a_i k)| \geq e^{-2b/n}$, we have $\ln(|\cos(a_i k)|) \geq -2b/n$. We have $\ln(|\cos(x)|) \leq \max_m -(x - \pi m)^2/2$ where the max is over integer $m$ (proof: it suffices to consider the case that $-\pi/2 < x < \pi/2$; on this interval, let $f(x) = \ln(|\cos(x)|)$ and let $g(x) = -x^2/2$; note that $f(0) = g(0)$ and $f'(0) = g'(0)$ where a prime denotes derivative, and $f''(x) = -1/\cos^2(x) \leq g''(x) = -1$). Hence, if $|\cos(a_i k)| \geq e^{-2b/n}$ then $a_i k$ is within distance $2\sqrt{b/n}$ of $m\pi$ for some integer $m$. Hence, $k$ is within $2|a_i|^{-1}\sqrt{b/n}$ of $m\pi a_i^{-1}$. So, the probability is bounded by
$$\sum_m \int_{m\pi a_i^{-1} - 2|a_i|^{-1}\sqrt{b/n}}^{m\pi a_i^{-1} + 2|a_i|^{-1}\sqrt{b/n}} \frac{\delta}{\sqrt{2\pi}} \exp(-k^2\delta^2/2)\,dk.$$
For each choice of $m$, the integral is bounded by $\mathrm{const.} \times \delta|a_i|^{-1}\sqrt{b/n}$. We distinguish two cases, either $\delta \geq |a_i|$ or $\delta < |a_i|$. In the first case, the integral on the intervals with $m \neq 0$ decays exponentially in $m$, so that the sum over $m$ is bounded by $\mathrm{const.} \times \delta|a_i|^{-1}\sqrt{b/n}$. Using $|a_i| \geq 1$, this is bounded by $\mathrm{const.} \times \delta\sqrt{b/n}$. In the second case, for $m > |a_i|/\delta$, the integral for each interval decays exponentially in $m\delta/|a_i|$, so that the probability is bounded by $\mathrm{const.} \times \sqrt{b/n}$, which is bounded by $\mathrm{const.} \times \delta\sqrt{b/n}$ since $\delta \geq 1$.

VII. COMBINED ALGORITHM

Here we explain how to combine the ideas above with an algorithm from Ref. 6. The idea of Ref. 6 is as follows. First, the authors show the following lemma (lemma 4 in that reference, which we repeat here, slightly rephrased):

Lemma 9. Let $G$ have average degree $d$ and $N$ vertices. For any constant $0 < \alpha < 1$ and sufficiently large $d$, there exist two sets $T_1, T_2$, with $T_1 \cap T_2 = \emptyset$ and $|T_1|, |T_2| = \alpha N\ln(d)/d$ such that there are no edges $(u,v)$ with $u \in T_1$, $v \in T_2$.

Proof. For a detailed proof, see Ref. 6. Here is a sketch of the proof: the proof is by the probabilistic method. Choose $T_1$ at random. Then, compute the probability that a vertex not in $T_1$ has no neighbors in $T_1$.
For the given $|T_1|$, this probability is large enough that the expected number of such vertices is greater than $\alpha N\ln(d)/d$. Thus, there must be a choice of $T_1$ such that there are at least $\alpha N\ln(d)/d$ vertices with no neighbor in $T_1$. Take this choice of $T_1$.

Separately in Ref. 6 it is shown how to find these sets $T_1, T_2$ in time small compared to $\tilde{O}(2^{N(1-\alpha\ln(d)/d)})$. One other way to do this is simply to iterate over all such sets.

Then, once these sets $T_1, T_2$ are found the algorithm is simply: iterate over all assignments to variables in $V \setminus (T_1 \cup T_2)$. There are $2^{N(1-2\alpha\ln(d)/d)}$ such assignments. For each such assignment one can then find an optimal assignment for variables in $T_1$ and $T_2$ separately (as no edges connect $T_1$ to $T_2$), and then combine the two assignments. This takes time $\tilde{O}(2^{N\alpha\ln(d)/d})$ for each assignment to variables in $V \setminus (T_1 \cup T_2)$, giving the claimed total time.

We now show how to combine this idea with the method here in the case of an Ising instance for which all $J_{ij}$ are integers subject to a bound: for all $i$ we have
$$\sum_j |J_{ij}| \leq J_{\max}, \tag{VII.1}$$
for some $J_{\max}$. The results will be effective for $J_{\max}$ sufficiently small compared to $d^{3/2}/\log^{3/2}(d)$.

The idea will be as follows. Let $V_0 = V \setminus (T_1 \cup T_2)$. Then, find a $T \subset V_0$ of size $\Theta(\epsilon|V_0|)$ with $\epsilon = \log(d)/d$ such that all the conditions of lemma 7 hold for $T$ and such that additionally vertices in $T$ are only weakly coupled to vertices in $T_1 \cup T_2$, in a sense defined below. Then, apply similar methods as before: iterate over all assignments to variables in $V_0 \setminus T$; for most such assignments many of the variables in $T$ will be fixed independently of the choice of the variables in $T_1 \cup T_2$.

We first need the following lemma which generalizes lemma 7. This lemma will (like lemma 7) assume bounded maximum degree, while lemma 9 assumed bounded average degree; however, the final theorem will only assume bounded average degree. This lemma will allow $T_1, T_2$ to be arbitrary sets; we will not use any specific properties of them from lemma 9. This lemma modifies lemma 7 in two ways.
First, we have a lower bound on the degree of the vertices and we then require that there be at least $d_{T,\overline{T}}$ strong edges connected to each $i \in T$ (in lemma 7,