Lecture 8: February 8


CS271 Randomness & Computation, Spring 2018
Instructor: Alistair Sinclair
Lecture 8: February 8

Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They may be distributed outside this class only with the permission of the Instructor.

8.1 The clique number of a random graph [cont.]

We begin by sketching a proof of the following theorem from last lecture:

Theorem 8.1 For $G \sim G_{n,p}$, for any constant $p \in (0,1)$, the clique number of $G$ is almost surely $\sim 2\log_{1/p}(n)$.

Proof: We restrict attention to the case $p = 1/2$; the generalization to arbitrary $p$ is straightforward. Throughout this proof, all logarithms are base 2. Let $X_k$ be the number of $k$-cliques in a graph $G$ drawn from $G_{n,1/2}$. By linearity of expectation we have
$$g(k) := E X_k = \binom{n}{k} 2^{-\binom{k}{2}},$$
and let us define $k_0(n)$ to be the largest value of $k$ such that $g(k) \ge 1$. An easy calculation (Exercise!) shows that $k_0(n) \sim 2\log n$. (To see this is plausible, note that for $k \ll n$, $g(k)$ is roughly $\frac{n^k}{k!}\, 2^{-k^2/2} \approx 2^{k\log n - k^2/2 - k\log k}$, and the exponent changes sign around $k = 2\log n$.)

Now let $c$ be some small integer constant (independent of $n$) to be decided later. In order to prove our claim, it is enough to show the following, for some constant $c$:

(1) for $k_1(n) = k_0(n) + c$: $\Pr[X_{k_1(n)} > 0] \to 0$ as $n \to \infty$;
(2) for $k_2(n) = k_0(n) - c$: $\Pr[X_{k_2(n)} > 0] \to 1$ as $n \to \infty$.

Note first that, if we work with expectations, then the above two claims are not hard to prove. Specifically, we can see that

(a) $E X_{k_1(n)} \to 0$ as $n \to \infty$;
(b) $E X_{k_2(n)} \to \infty$ as $n \to \infty$.

We will not rigorously prove these claims. However, they follow fairly easily by examining the rate of change of the function $g(k)$ near $k = k_0$. In particular, note that for $k \sim 2\log n$ we have
$$\frac{g(k+1)}{g(k)} = \frac{n-k}{k+1}\, 2^{-k} \approx \frac{n}{2\log n \cdot n^2} \to 0 \quad \text{as } n \to \infty.$$
This indicates that, as $n \to \infty$, in the neighborhood of $k_0(n)$ the graph of $g(k)$ decreases sharply, so for some constant $c$, $g(k_0(n)+c) \to 0$ as $n \to \infty$ and $g(k_0(n)-c) \to \infty$ as $n \to \infty$. The details are left as an Exercise. We illustrate the behavior of $g(k)$ near $k_0$ by depicting $\log g(k)$ in the neighborhood of $k_0$ (Figure 8.1).
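The definition of $k_0(n)$ is easy to explore numerically. The following sketch (the function names are ours, not from the lecture) computes $\log_2 g(k)$ exactly and locates $k_0(n)$; for moderate $n$ the value sits noticeably below the asymptotic $2\log_2 n$, as one expects from the $-k\log k$ term in the exponent:

```python
import math

def log2_g(n, k):
    """log2 of g(k) = E[X_k] = C(n,k) * 2^(-C(k,2)), the expected number of
    k-cliques in G(n, 1/2)."""
    return math.log2(math.comb(n, k)) - k * (k - 1) / 2

def k0(n):
    """Largest k with g(k) >= 1. g is log-concave, hence unimodal, so the
    first downward crossing of 1 is the right one."""
    k = 1
    while log2_g(n, k + 1) >= 0:
        k += 1
    return k

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, k0(n), round(2 * math.log2(n), 2))
```

For example, $k_0(1000) = 15$ while $2\log_2 1000 \approx 19.9$; the asymptotics only bite for very large $n$.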

Figure 8.1: The descent of $\log g(k)$ near $k_0(n)$ for fixed $n$ ($\log g(k)$ plotted against $k$, with $k_2$, $k_0$ and $k_1$ marked on the $k$-axis). Note that the domain of $g(k)$ is actually the integers, but we show an interpolated approximation for readability.

Now, we can go ahead and prove claims (1) and (2). Claim (1) is almost immediate: since $X_{k_1}$ is integer valued, we have $\Pr[X_{k_1} > 0] = \Pr[X_{k_1} \ge 1]$, and then by Markov's inequality we have that $\Pr[X_{k_1} \ge 1] \le E X_{k_1}$. However, $E X_{k_1} \to 0$ as $n \to \infty$ and so, as $n \to \infty$: $\Pr[X_{k_1} > 0] \to 0$.

To prove Claim (2), we will use the second moment method to show that $\Pr[X_{k_2} = 0] \to 0$ as $n \to \infty$. Applying Chebyshev's inequality as we did for the case of 4-cliques earlier, we have
$$\Pr[X_{k_2} = 0] \le \Pr[|X_{k_2} - E X_{k_2}| \ge E X_{k_2}] \le \frac{\mathrm{Var}(X_{k_2})}{(E X_{k_2})^2}.$$
Thus, it is enough to show that $\mathrm{Var}(X_{k_2})/(E X_{k_2})^2 \to 0$ as $n \to \infty$. For ease of notation, let us denote the random variable $X_{k_2}$ by $X$ and, for every subset $S$ of the vertices of $G$ of size $k_2$, let us denote by $X_S$ the following random variable:
$$X_S = \begin{cases} 1, & \text{if } S \text{ is a clique;} \\ 0, & \text{if } S \text{ is not a clique.} \end{cases}$$
Clearly, $X = \sum_S X_S$. We now follow a similar path to our treatment of 4-cliques earlier, but in more generality and more systematically. First, we have
$$\mathrm{Var}(X) = \mathrm{Var}\Big(\sum_S X_S\Big) = \sum_S \mathrm{Var}(X_S) + \sum_{S \ne T} \mathrm{Cov}(X_S, X_T).$$
Writing $S \sim T$ to denote the fact that $X_S, X_T$ are not independent, we have that $\mathrm{Cov}(X_S, X_T) = 0$ unless $S \sim T$. In our current application, $S \sim T$ iff $S \ne T$ and $|S \cap T| \ge 2$. Continuing:
$$\mathrm{Var}(X) = \sum_S \mathrm{Var}(X_S) + \sum_{S \sim T} \mathrm{Cov}(X_S, X_T)
\le \sum_S E(X_S^2) + \sum_{S \sim T} E(X_S X_T)
= \sum_S E X_S + \sum_{S \sim T} E(X_S X_T) \quad (\text{since } X_S \text{ is } 0/1\text{-valued})
= E X + \sum_{S \sim T} E(X_S X_T).$$
It follows that
$$\frac{\mathrm{Var}(X)}{(E X)^2} \le \frac{1}{E X} + \frac{1}{(E X)^2} \sum_{S \sim T} E(X_S X_T).$$
Thus, since $E X \to \infty$, it is enough for us to prove that $\sum_{S \sim T} E(X_S X_T) = o((E X)^2)$.
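The variance bound above can be sanity-checked by brute force on a tiny instance. The sketch below (our own illustration, not part of the proof) enumerates all $2^{15}$ graphs on 6 vertices, computes the exact variance of the number of 4-cliques, and verifies $\mathrm{Var}(X) \le E X + \sum_{S \sim T} E(X_S X_T)$:

```python
import itertools
import math

# X = number of 4-cliques in G(6, 1/2); exact computation by enumeration.
n, k = 6, 4
edges = list(itertools.combinations(range(n), 2))          # 15 potential edges
edge_bit = {e: j for j, e in enumerate(edges)}
subsets = list(itertools.combinations(range(n), k))        # candidate cliques S
clique_mask = [sum(1 << edge_bit[e] for e in itertools.combinations(s, 2))
               for s in subsets]

total = 1 << len(edges)                                    # all 2^15 graphs
ex = exx = 0.0
for g in range(total):
    x = sum((g & cm) == cm for cm in clique_mask)
    ex += x / total
    exx += x * x / total
var = exx - ex ** 2

# The lecture's bound: Var(X) <= E[X] + sum over ordered pairs S ~ T
# (|S ∩ T| >= 2) of E[X_S X_T] = 2^{-(2 C(k,2) - C(i,2))}, i = |S ∩ T|.
bound = ex
for s, t in itertools.permutations(subsets, 2):
    i = len(set(s) & set(t))
    if i >= 2:
        bound += 2.0 ** -(2 * math.comb(k, 2) - math.comb(i, 2))

print(ex, var, bound)   # E[X] = C(6,4) * 2^-6 = 15/64
```

Pairs with $|S \cap T| \le 1$ contribute nothing, since their edge sets are disjoint and the indicators are therefore independent.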

We can go further than this using symmetry of the events $X_S$.¹ We have
$$\sum_{S \sim T} E(X_S X_T) = \sum_{S \sim T} \Pr[X_S = 1 \wedge X_T = 1]
= \sum_{S \sim T} \Pr[X_S = 1]\,\Pr[X_T = 1 \mid X_S = 1]
= \sum_S \Big(\Pr[X_S = 1] \sum_{T: T \sim S} \Pr[X_T = 1 \mid X_S = 1]\Big)
= \Big(\sum_S \Pr[X_S = 1]\Big) \sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1] \quad (\text{for any fixed } S_0, \text{ by symmetry})
= E X \sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1],$$
and so
$$\frac{\mathrm{Var}(X)}{(E X)^2} \le \frac{1}{E X} + \frac{1}{E X} \sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1].$$
Thus, since $E X \to \infty$, to conclude our proof it is enough to show that $\sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1] = o(E X)$ for a fixed $S_0$. Note that the above argument is very general and applies in any symmetric setting.

Specializing now to our clique problem, and letting $S_0$ be any fixed set of vertices of size $k_2(n)$, we have
$$\sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1] = \sum_{i=2}^{k-1} \binom{k}{i}\binom{n-k}{k-i}\, 2^{-\left[\binom{k}{2} - \binom{i}{2}\right]},$$
where $k = k_2(n)$. The sum on the right hand side is taken over $i = |T \cap S_0|$. The factor $\binom{k}{i}$ counts the number of ways that $T$ can choose $i$ vertices of $S_0$ to share; the factor $\binom{n-k}{k-i}$ is all the possible ways that $T$ can choose its other vertices; and $2^{-[\binom{k}{2} - \binom{i}{2}]}$ is the probability that all edges in $T$ that are not shared with $S_0$ appear. Now we have the following:
$$\sum_{T: T \sim S_0} \Pr[X_T = 1 \mid X_{S_0} = 1]
= E X \sum_{i=2}^{k-1} \frac{\binom{k}{i}\binom{n-k}{k-i}}{\binom{n}{k}}\, 2^{\binom{i}{2}}
= E X \sum_{i=2}^{k-1} f(i)
\le E X \cdot k \max_{2 \le i \le k-1} \{f(i)\}, \quad \text{where } f(i) := \frac{\binom{k}{i}\binom{n-k}{k-i}}{\binom{n}{k}}\, 2^{\binom{i}{2}}.$$
Now it can be shown that, provided the constant $c$ is chosen large enough, $f(i)$ is maximized at $i = 2$ (Exercise!). Its value there is
$$f(2) = 2\binom{k}{2}\frac{\binom{n-k}{k-2}}{\binom{n}{k}} \approx \frac{k^4}{n^2} \quad (\text{as } k \ll n).$$
Since $k \sim 2\log n$, we have $k^5/n^2 \to 0$ as $n \to \infty$, so $k \cdot f(2)$ tends to zero as required. This concludes the proof of claim (2) and the proof of the theorem.

¹ I.e., there is an automorphism of the probability space that maps any $X_S$ onto any other.
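For concreteness, $f(i)$ is easy to evaluate numerically. The sketch below (the parameter choices are ours) takes $n = 10{,}000$ and $k = 18 \approx k_0(n) - 3$, and confirms both that $f$ is maximized at $i = 2$ and that $k \cdot f(2)$ is already tiny:

```python
import math

def f(n, k, i):
    """f(i) = C(k,i) C(n-k,k-i) / C(n,k) * 2^{C(i,2)}: the term of the
    overlap sum corresponding to pairs T with |T ∩ S0| = i."""
    return (math.comb(k, i) * math.comb(n - k, k - i) / math.comb(n, k)
            * 2.0 ** math.comb(i, 2))

n = 10000
k = 18                          # roughly k0(n) - 3 (here k0(10000) = 21)
vals = {i: f(n, k, i) for i in range(2, k)}   # i = 2, ..., k-1
i_star = max(vals, key=vals.get)
print(i_star, vals[i_star], k * vals[i_star])
```

The shape is as the proof suggests: $f$ decays steeply from $i = 2$ as the binomial factors shrink, rises again near $i = k - 1$ where the $2^{\binom{i}{2}}$ factor takes over, but never recovers to the value at $i = 2$.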

8.2 Random k-SAT

Background and Motivation. Determining whether a $k$-CNF formula is satisfiable is an NP-complete problem (for $k \ge 3$). However, the problem may be easier to solve for random formulae. This is because of the empirical observation that a random formula with a certain density of clauses is either almost surely satisfiable or almost surely unsatisfiable, depending on whether the density is below or above a certain critical value. There are ties between this threshold phenomenon and so-called spin glasses in statistical physics; indeed, some people believe that this connection may one day unlock the P vs NP question.

Model. We now define a model: $\varphi_k(n, rn)$ is a $k$-CNF formula over $n$ variables generated by choosing $rn$ clauses independently and uniformly at random. What exactly do we mean by choosing the clauses randomly? The two standard possibilities are (i) to choose the $k$ literals in the clause independently and u.a.r. from the $2n$ possibilities (so that repetitions of variables are allowed); or (ii) to choose the clause u.a.r. from the set of all $2^k\binom{n}{k}$ valid $k$-clauses (i.e., without repeated variables). It turns out that most results hold for either model, so we will use the most convenient implementation in each application.

The above model has just one parameter, $r$, which we can think of as the density of the formula. Obviously, as $r$ increases there are more constraints, so it becomes less likely that $\varphi_k(n, rn)$ is satisfiable.

Conjecture 8.2 For every $k$, there exists a threshold value $r_k^* \in \mathbb{R}$ such that
$$r > r_k^* \;\Rightarrow\; \Pr[\varphi_k(n, rn) \text{ is satisfiable}] \to 0 \text{ as } n \to \infty;$$
$$r < r_k^* \;\Rightarrow\; \Pr[\varphi_k(n, rn) \text{ is satisfiable}] \to 1 \text{ as } n \to \infty.$$

This conjecture has been a major open problem for at least 25 years. The first landmark result was the following, which resolves the case $k = 2$.

Theorem 8.3 [CR92, FdlV92, G96] The threshold $r_2^*$ exists and is equal to 1.
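Model (i) is straightforward to implement, and a small seeded simulation around $r = 1$ illustrates the 2-SAT threshold of Theorem 8.3. The generator and brute-force checker below are our own sketch (the exact satisfiability counts will of course vary with the random seed):

```python
import itertools
import random

def random_kcnf(n, k, m, rng):
    """Model (i): each of the m clauses consists of k literals chosen
    independently and u.a.r. from the 2n possibilities (variable
    repetitions allowed). A literal is +v or -v for a variable v in 1..n."""
    return [[rng.choice([1, -1]) * rng.randint(1, n) for _ in range(k)]
            for _ in range(m)]

def satisfiable(n, clauses):
    """Brute force over all 2^n assignments -- only sensible for small n."""
    for bits in itertools.product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

if __name__ == "__main__":
    rng = random.Random(0)
    n, trials = 12, 50
    for r in (0.5, 1.0, 1.5, 2.0):
        sat = sum(satisfiable(n, random_kcnf(n, 2, int(r * n), rng))
                  for _ in range(trials))
        print(f"2-SAT, n={n}, r={r}: {sat}/{trials} trials satisfiable")
```

Even at $n = 12$ the drop in the satisfiable fraction as $r$ crosses 1 is visible, though the transition is far from sharp at such small $n$.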
Some years later, a weaker form of the conjecture was proved by Friedgut, who showed the existence of a non-uniform threshold for every $k$:

Theorem 8.4 [F99] For every $k$, there exists a sequence $\{r_k(n)\}$ such that, for all $\epsilon > 0$,
$$r > (1+\epsilon)\, r_k(n) \;\Rightarrow\; \Pr[\varphi_k(n, rn) \text{ is satisfiable}] \to 0 \text{ as } n \to \infty;$$
$$r < (1-\epsilon)\, r_k(n) \;\Rightarrow\; \Pr[\varphi_k(n, rn) \text{ is satisfiable}] \to 1 \text{ as } n \to \infty.$$

The crucial difference between Theorem 8.4 and Conjecture 8.2 is that in the conjecture $r_k^*$ is a constant, while in the theorem $r_k(n)$ may depend on $n$. Despite this weakness, Friedgut's result was a surprising and significant step. Following a long sequence of developments by several authors, a major breakthrough was achieved very recently in a technical tour-de-force by Ding, Sly and Sun:

Theorem 8.5 [DSS14] For all sufficiently large $k$, the threshold $r_k^*$ exists.

The actual value of the threshold is also given in [DSS14] as the minimum of an explicit function. As had been previously established by Coja-Oghlan and Panagiotou [CP16], the asymptotic form of $r_k^*$ is
$$r_k^* = 2^k \ln 2 - \tfrac{1}{2}(1 + \ln 2) + \varepsilon_k,$$
where $\varepsilon_k \to 0$ as $k \to \infty$.

For small values of $k > 2$, the conjecture remains open. There has been much work establishing bounds on $r_k^*$ (assuming it exists) for the important case of $k = 3$ (3-SAT). For example, it is known that $3.52 < r_3^* < 4.49$, and based on large-scale experiments and heuristic arguments, $r_3^*$ is believed to be close to 4.2. The upper bound here is due to [DKMP09], and the lower bound to [KKL03, HS03].

Upper bounds on $r_k^*$. A more or less tight upper bound on $r_k^*$ can be found by a simple application of the first moment method. Let $X$ be the number of satisfying assignments of $\varphi_k(n, rn)$. Then $X = \sum_\sigma X_\sigma$, where
$$X_\sigma = \begin{cases} 1, & \text{if assignment } \sigma \text{ satisfies } \varphi; \\ 0, & \text{if assignment } \sigma \text{ does not satisfy } \varphi. \end{cases}$$
Assuming the model in which every literal is chosen independently, we clearly have $E[X_\sigma] = (1 - 2^{-k})^{rn}$, and therefore $E[X] = 2^n (1 - 2^{-k})^{rn}$. Now it is easy to see that, if $r \ge 2^k \ln 2$, then $E[X] \to 0$ as $n \to \infty$. By Markov's inequality, this immediately yields:

Corollary 8.6 $r_k^* \le 2^k \ln 2$.

Lower bounds on $r_k^*$. A natural way to prove lower bounds is via algorithms: if a certain algorithm finds a satisfying assignment with high probability in some range of $r$, then obviously a satisfying assignment must exist in this range. However, bounds found in this way (see, e.g., [CF90]) usually have the form $r_k^* \ge c\, 2^k/k$ for some constant $c$, which is a factor $O(k)$ off from the upper bound. An important step was the introduction of non-algorithmic proofs based on the second moment method [AM02, AP04], which yield the much better bound $r_k^* \ge 2^k \ln 2 - O(k)$ for all $k \ge 3$. This result already bounds the value of $r_k^*$ up to a linear term in $k$. The works [CP16, DSS14] mentioned earlier use more sophisticated non-algorithmic ideas inspired by statistical physics to obtain the final tight bound.
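The first moment calculation can be replicated numerically. In the sketch below (function names ours), $\log_2 E[X] = n\,(1 + r \log_2(1 - 2^{-k}))$, so $E[X]$ decays exponentially exactly when $r > -1/\log_2(1 - 2^{-k})$; this exact crossover is slightly below the rounder bound $2^k \ln 2$ of Corollary 8.6, smaller by roughly $(\ln 2)/2$, consistent with the asymptotics of $r_k^*$ quoted above:

```python
import math

def log2_ex(n, k, r):
    """log2 E[X] for X = number of satisfying assignments of phi_k(n, rn):
    E[X] = 2^n (1 - 2^-k)^{rn}, so log2 E[X] = n (1 + r log2(1 - 2^-k))."""
    return n * (1 + r * math.log2(1 - 2.0 ** -k))

def r_first_moment(k):
    """Density at which the exponent changes sign, i.e. where E[X] switches
    from exponential growth to exponential decay."""
    return -1 / math.log2(1 - 2.0 ** -k)

if __name__ == "__main__":
    for k in (3, 4, 10):
        print(k, r_first_moment(k), 2 ** k * math.log(2))
```

For $k = 3$ this gives a first-moment upper bound of about 5.19, compared with $2^3 \ln 2 \approx 5.55$ (both, of course, well above the conjectured $r_3^* \approx 4.2$).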
The gap between algorithmic and non-algorithmic lower bounds suggests that there may exist two different thresholds: the first separating those densities for which it is easy to find a satisfying assignment w.h.p. (and therefore one necessarily exists) from those for which a satisfying assignment exists w.h.p. but is hard to find; the second separating the latter densities from those for which no satisfying assignment exists w.h.p. In the next lecture we will sketch the second moment approach to this lower bound. For now, we return to the proof of Theorem 8.3.

Proof of Theorem 8.3: We begin with the lower bound, namely that if $r \le 1 - \epsilon$ then, with high probability, a random 2-SAT formula with $m = rn \le (1-\epsilon)n$ clauses is satisfiable. Let us recall the well-known fact (easy exercise) that a 2-SAT formula is satisfiable if and only if its graph of implications does not contain a (directed) cycle that contains both a variable and its negation. (The graph of implications contains a vertex for every literal and, for every clause $(\ell_1 \vee \ell_2)$, the edges $(\bar\ell_1, \ell_2)$ and $(\bar\ell_2, \ell_1)$. For example, the edges induced by the clause $(x_1 \vee x_2)$ are as shown in Figure 8.2.) We will actually work with a slightly stronger sufficient condition which is easier to quantify, namely: the formula is satisfiable if it does not contain a bicycle, which is defined as follows.
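As a brief aside, the implication-graph criterion just recalled also yields the classical linear-time 2-SAT algorithm of Aspvall, Plass and Tarjan: build the graph of implications, compute its strongly connected components, and reject iff some variable shares a component with its negation. The sketch below is our own (clauses are given as pairs of signed integers, and we use Kosaraju's two-pass SCC algorithm):

```python
def two_sat(n, clauses):
    """Decide satisfiability of a 2-CNF formula via its graph of implications:
    satisfiable iff no variable lies in the same strongly connected component
    as its negation. Literal +v / -v refers to variable v in 1..n."""
    N = 2 * n
    idx = lambda l: 2 * (abs(l) - 1) + (l < 0)       # literal -> vertex id
    adj = [[] for _ in range(N)]
    radj = [[] for _ in range(N)]
    for a, b in clauses:
        for u, v in ((-a, b), (-b, a)):              # (a ∨ b): ¬a → b, ¬b → a
            adj[idx(u)].append(idx(v))
            radj[idx(v)].append(idx(u))

    # Kosaraju pass 1: record vertices in postorder (iterative DFS).
    seen, order = [False] * N, []
    for s in range(N):
        if not seen[s]:
            seen[s] = True
            stack = [(s, 0)]
            while stack:
                v, i = stack.pop()
                if i < len(adj[v]):
                    stack.append((v, i + 1))
                    w = adj[v][i]
                    if not seen[w]:
                        seen[w] = True
                        stack.append((w, 0))
                else:
                    order.append(v)                  # v is finished

    # Kosaraju pass 2: label SCCs on the reversed graph, latest finisher first.
    comp, c = [-1] * N, 0
    for s in reversed(order):
        if comp[s] == -1:
            comp[s], stack = c, [s]
            while stack:
                v = stack.pop()
                for w in radj[v]:
                    if comp[w] == -1:
                        comp[w] = c
                        stack.append(w)
            c += 1
    return all(comp[2 * v] != comp[2 * v + 1] for v in range(n))
```

For instance, `two_sat(2, [(1, 2), (-1, 2)])` accepts, while the contradictory formula `two_sat(2, [(1, 2), (1, -2), (-1, 2), (-1, -2)])` is rejected because $x_1$ and $\bar x_1$ end up in one component.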

Figure 8.2: The edges induced by the clause $(x_1 \vee x_2)$: $\bar x_1 \to x_2$ and $\bar x_2 \to x_1$.

Definition 8.7 For $k \ge 2$, a bicycle is a path $(u, w_1, w_2, \ldots, w_k, v)$ in the graph of implications, where the $w_i$ are literals on distinct variables and $u, v \in \{w_i, \bar w_i : 1 \le i \le k\}$.

Exercise: Verify that if $\varphi$ does not contain a bicycle then $\varphi$ is satisfiable.

Note that the variables appearing among the $w_i$ must be distinct! Therefore we may bound the probability that $\varphi$ is not satisfiable as follows:
$$\Pr[\varphi \text{ not satisfiable}] \le \Pr[\varphi \text{ contains a bicycle}] \le \sum_{k=2}^{n} n^k\, 2^k\, (2k)^2\, m^{k+1} \left(\frac{1}{4\binom{n}{2}}\right)^{k+1}.$$
Here we are summing over the possible sizes $k$ of the bicycle; the factor $n^k$ is an upper bound on the number of ways to choose $k$ distinct variables for the bicycle; $2^k$ accounts for the sign of each variable in the bicycle; $(2k)^2$ is the number of ways to choose $u$ and $v$; $m^{k+1}$ is an upper bound on the number of ways to choose which of the $m$ clauses of the formula induce the $k+1$ edges of the bicycle; and $\big(1/4\binom{n}{2}\big)^{k+1}$ is the probability that the chosen clauses induce the corresponding edges of the bicycle. (Here we are working with the model which chooses valid clauses u.a.r.)

Now, since $4\binom{n}{2} = 2n(n-1)$, the right hand side of the previous inequality becomes
$$\sum_{k=2}^{n} n^k\, 2^k\, (2k)^2\, m^{k+1} \left(\frac{1}{2n(n-1)}\right)^{k+1} = \frac{m}{2n(n-1)} \sum_{k=2}^{n} (2k)^2 \left(\frac{m}{n-1}\right)^k \to 0 \quad \text{as } n \to \infty.$$
The last expression tends to zero because the factor outside the summation goes to 0 and the summation is bounded by a constant. (This latter fact follows because $\frac{m}{n-1} = \frac{(1-\epsilon)n}{n-1}$ is less than 1 for all sufficiently large $n$, so the terms of the sum form a decreasing geometric series weighted by a polynomial factor.) This concludes the proof of the first part (lower bound) of Theorem 8.3. The upper bound, which uses the second moment method, will be proved in the next lecture.

References

[AM02] D. Achlioptas and C. Moore, The asymptotic order of the random k-SAT threshold, Proceedings of the 43rd IEEE FOCS, 2002, pp. 779–788.

[AP04] D. Achlioptas and Y. Peres, The threshold for random k-SAT is $2^k \log 2 - O(k)$, Journal of the American Mathematical Society 17 (2004), pp. 947–973.

[CF90] M.-T. Chao and J. Franco, Probabilistic analysis of a generalization of the unit-clause literal selection heuristics for the k-satisfiability problem, Information Sciences 51 (1990), pp. 289–314.

[CR92] V. Chvátal and B. Reed, Mick gets some (the odds are on his side), Proceedings of the 33rd IEEE FOCS, 1992, pp. 620–627.

[CP16] A. Coja-Oghlan and K. Panagiotou, The asymptotic k-SAT threshold, Advances in Mathematics 288 (2016), pp. 985–1068.

[DKMP09] J. Diaz, L. Kirousis, D. Mitsche and X. Perez-Gimenez, On the satisfiability threshold of formulas with three literals per clause, Theoretical Computer Science 410 (2009), pp. 2920–2934.

[DSS14] J. Ding, A. Sly and N. Sun, Proof of the satisfiability conjecture for large k, arXiv preprint arXiv:1411.0650 [math.PR], 2014.

[FdlV92] W. Fernandez de la Vega, On random 2-SAT, manuscript, 1992.

[F99] E. Friedgut, Sharp thresholds of graph properties, and the k-SAT problem, Journal of the American Mathematical Society 12 (1999), pp. 1017–1054.

[G96] A. Goerdt, A threshold for unsatisfiability, Journal of Computer and System Sciences 53 (1996), pp. 469–486.

[HS03] M. Hajiaghayi and G.B. Sorkin, The satisfiability threshold for random 3-SAT is at least 3.52, submitted, 2003.

[KKL03] A. Kaporis, L.M. Kirousis and E. Lalas, Selecting complementary pairs of literals, Proc. LICS'03 Workshop on Typical Case Complexity and Phase Transitions, June 2003.