Notes for Lecture 25

Similar documents
Notes for Lecture 11

Certifying polynomials for AC 0 [ ] circuits, with applications

Stanford University CS254: Computational Complexity Handout 8 Luca Trevisan 4/21/2010

CS6840: Advanced Complexity Theory Mar 29, Lecturer: Jayalal Sarma M.N. Scribe: Dinesh K.

Notes for Lecture 15

On the correlation of parity and small-depth circuits

CSC 2429 Approaches to the P vs. NP Question and Related Complexity Questions Lecture 2: Switching Lemma, AC 0 Circuit Lower Bounds

Poly-logarithmic independence fools AC 0 circuits

EXPONENTIAL LOWER BOUNDS FOR DEPTH THREE BOOLEAN CIRCUITS

Lecture 10: Learning DNF, AC 0, Juntas. 1 Learning DNF in Almost Polynomial Time

6.842 Randomness and Computation April 2, Lecture 14

Lecture 4: LMN Learning (Part 2)

1 AC 0 and Håstad Switching Lemma

CSC 5170: Theory of Computational Complexity Lecture 9 The Chinese University of Hong Kong 15 March 2010

Notes for Lecture 14 v0.9

On Probabilistic ACC Circuits. with an Exact-Threshold Output Gate. Richard Beigel. Department of Computer Science, Yale University

Notes for Lecture 3... x 4

Hardness amplification proofs require majority

Lecture 2: Proof of Switching Lemma

Complexity Theory of Polynomial-Time Problems

THE COMPLEXITY OF CONSTRUCTING PSEUDORANDOM GENERATORS FROM HARD FUNCTIONS

1 Circuit Complexity. CS 6743 Lecture 15 1 Fall Definitions

1 Randomized Computation

Meta-Algorithms vs. Circuit Lower Bounds Valentine Kabanets

Pseudorandom Bits for Constant Depth Circuits with Few Arbitrary Symmetric Gates

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan August 30, Notes for Lecture 1

Lecture 3: AC 0, the switching lemma

Finite fields, randomness and complexity. Swastik Kopparty Rutgers University

CS294: Pseudorandomness and Combinatorial Constructions September 13, Notes for Lecture 5

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan 1/29/2002. Notes for Lecture 3

Notes for Lecture 21

2 Completing the Hardness of approximation of Set Cover

Quantum algorithms (CO 781/CS 867/QIC 823, Winter 2013) Andrew Childs, University of Waterloo LECTURE 13: Query complexity and the polynomial method

Notes for Lecture 3... x 4

Impagliazzo s Hardcore Lemma

Lecture 5: February 16, 2012

The complexity of SPP formula minimization

Guest Lecture: Average Case depth hierarchy for,, -Boolean circuits. Boaz Barak

Last time, we described a pseudorandom generator that stretched its truly random input by one. If f is ( 1 2

Lecture 3 (Notes) 1. The book Computational Complexity: A Modern Approach by Sanjeev Arora and Boaz Barak;

2 Natural Proofs: a barrier for proving circuit lower bounds

6.895 Randomness and Computation March 19, Lecture Last Lecture: Boosting Weak Learners Into Strong Learners

Lecture 13: Lower Bounds using the Adversary Method. 2 The Super-Basic Adversary Method [Amb02]

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Matroids and Greedy Algorithms Date: 10/31/16

Notes for Lecture 14

JASS 06 Report Summary. Circuit Complexity. Konstantin S. Ushakov. May 14, 2006

COS598D Lecture 3 Pseudorandom generators from one-way functions

20.1 2SAT. CS125 Lecture 20 Fall 2016

Two Sides of the Coin Problem

CS Communication Complexity: Applications and New Directions

Notes on Computer Theory Last updated: November, Circuits

Notes for Lecture 2. Statement of the PCP Theorem and Constraint Satisfaction

Circuits. Lecture 11 Uniform Circuit Complexity

Computational Learning Theory - Hilary Term : Introduction to the PAC Learning Framework

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan 9/6/2004. Notes for Lecture 3

Notes for Lecture Decision Diffie Hellman and Quadratic Residues

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 22 Lecturer: David Wagner April 24, Notes 22 for CS 170

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 11 Luca Trevisan February 29, 2016

1 PSPACE-Completeness

U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017

Tight bounds on The Fourier Spectrum of AC 0

Some notes on streaming algorithms continued

Graph Isomorphism is not AC 0 reducible to Group Isomorphism

Lecture 2: January 18

More on Noncommutative Polynomial Identity Testing

Essential facts about NP-completeness:

Foundations of Machine Learning and Data Science. Lecturer: Avrim Blum Lecture 9: October 7, 2015

CS Foundations of Communication Complexity

Lecture 29: Computational Learning Theory

Lecture 7: Quantum Fourier Transform over Z N

arxiv: v1 [cs.cc] 13 Feb 2017

Quantum Lower Bound for Recursive Fourier Sampling

Lecture 26. Daniel Apon

ALGEBRAIC METHODS IN THE THEORY OF LOWER BOUNDS FOR BOOLEAN CIRCUIT COMPLEXITY

The Complexity of Maximum. Matroid-Greedoid Intersection and. Weighted Greedoid Maximization

On Uniform Amplification of Hardness in NP

Notes on Space-Bounded Complexity

Complete problems for classes in PH, The Polynomial-Time Hierarchy (PH) oracle is like a subroutine, or function in

CS 151 Complexity Theory Spring Solution Set 5

1 Nisan-Wigderson pseudorandom generator

Approximating MAX-E3LIN is NP-Hard

BCCS: Computational methods for complex systems Wednesday, 16 June Lecture 3

Notes for Lecture 7. 1 Increasing the Stretch of Pseudorandom Generators

Cell-Probe Proofs and Nondeterministic Cell-Probe Complexity

18.5 Crossings and incidences

Lecture 22: Counting

Introduction to Computational Complexity

Constant-Depth Circuits for Arithmetic in Finite Fields of Characteristic Two

NP-Completeness. f(n) \ n n sec sec sec. n sec 24.3 sec 5.2 mins. 2 n sec 17.9 mins 35.

A New Lower Bound Technique for Quantum Circuits without Ancillæ

Notes on Space-Bounded Complexity

The complexity of formula minimization

Majority is incompressible by AC 0 [p] circuits

Rounds in Communication Complexity Revisited

Lecture 4. 1 Circuit Complexity. Notes on Complexity Theory: Fall 2005 Last updated: September, Jonathan Katz

DRAFT. Complexity of counting. Chapter 8

Proof Complexity of Quantified Boolean Formulas

Testing by Implicit Learning

On the Gap Between ess(f) and cnf size(f) (Extended Abstract)

More Approximation Algorithms

Transcription:

U.C. Berkeley CS278: Computational Complexity Handout N25 ofessor Luca Trevisan 12/1/2004 Notes for Lecture 25 Circuit Lower Bounds for Parity Using Polynomials In this lecture we prove a lower bound on the size of a constant depth circuit which computes the XOR of n bits. Before we talk about bounds on the size of a circuit, let us first clarify what we mean by circuit depth and circuit size. The depth of a circuit is defined as the length of the longest path from the input to output. The size of a circuit is the number of AND and OR gates in the circuit. Note that, for our purpose, we assume all the gates have unlimited fan-in and fan-out. We define AC 0 to be the class of decision problems solvable by circuits of polynomial size and constant depth. We want to prove the result that PARITY is not in AC 0. There are two known techniques to prove this result. In this class, we will talk about a proof which uses polynomials; in the next class we will look at a different proof which uses random restrictions. 1 Circuit Upper Bounds for PARITY Before we go into our proof, let us first look at a circuit of constant depth d that computes PARITY. Theorem 1 For every constant d 2, there are circuits of size 2 O n d 1 1 that compute parity. In the next lecture, we will prove a 2 n 1 d 1 of Theorem 1. Today we will prove a weaker 2 n 1 4d size lower bound, establishing the tightness lower bound. oof: [Of Theorem 1] Consider the circuit C shown in Figure 1, which computes the PARITY of n variables. The circuit C is a tree of XOR gates, each of which has fan-in n 1 d 1 ; the tree has depth d 1. Now, since each XOR gate is a function of n 1 d 1 variables, it can be implemented by a CNF or a DNF of size 2 n d 1 1. Let us replace alternating layers of XOR gates in the tree by CNF s and DNF s - for example we replace gates in the first layer by their CNF implementation, gates in the second layer by their DNF implementation, and so on. This gives us a circuit of depth 2(d 1). Now we can use the associativity of OR to collapse consecutive layers of OR gates into a single layer. The same thing can be done for AND to get a circuit of depth d. This gives us a circuit of depth d and size O(n2 n d 1 1 ) which computes PARITY. 1

Figure 1: Circuit for Computing XOR of n variables; each small circuit in the tree computes the XOR of k = n 1 d 1 variables 2 Overview of the Lower Bound oof For our proof, we will utilise a property which is common to all circuits of small size and constant depth, which PARITY does not have. 1 The property is that circuits of small size and constant depth can be represented by low degree polynomials, with high probability. More formally, we show that if a function f : {0, 1} n {0, 1} is computed by a circuit of size s and depth d, then there exists a function g : {0, 1} n R such that x [f(x) = g(x)] 3 4 and ĝ α 0 only for α O((log S) 2d ), where ĝ is the Fourier transform of g. Then we will show that if a function g : {0, 1} n R agrees with PARITY on more than a fraction 3 4 of its inputs, then there is a coefficient α such that ĝ α 0 and α = Ω( n). That is, a function which agrees with PARITY on a large fraction of its inputs, has to have high degree. From these two results, it is easy to see that PARITY cannot be computed by circuits of constant depth and small size. We give a formal definition of degree of a function and then formally state the two results that give our lower bound. Definition 1 We say that a function g : {0, 1} n R has degree at most d if there is a polynomial over the reals of degree at most d such that g and the polynomial agree on {0, 1} n. 1 Incidentally, the property is false with high probability for random functions and it is computable in time 2 O(n) given the truth-table of a function. You may remember that this implies that our lower bound will be a natural proof. 2

An equivalent way of looking at the definition of degree is to consider the size of the largest non-zero coefficient of the Fourier transform of the function. Fact 2 A function g : {0, 1} n R has degree at most d if and only if ĝ α = 0 for all α such that α > d. The following two lemmas are the main results of this lecture. Lemma 3 For every circuit C of size S and depth d, there is a function g : {0, 1} n R of degree O((log S) 2d ) such that g and C agree on at least a 3/4 fraction of {0, 1} n. Lemma 4 Let g : {0, 1} n R be a function that agrees with PARITY on at least a 3/4 fraction of {0, 1} n. Then the degree of g is Ω( n). From Lemma 3 and Lemma 4 it is immediate to derive the following lower bound. Theorem 5 For every constant d 2, if C is a circuit of depth d and size S that computes parity, then S 2 Ω(n1/4d). oof: From Lemma 3 we have that there is a function g : {0, 1} n R that agrees with PARITY on a 3/4 fraction of {0, 1} n, and whose degree is at most O((log S) 2d ). From Lemma 4 we deduce that the degree of g must be at least Ω( n), so that (log S) 2d = Ω( n) which is equivalent to S = 2 Ω(n1/4d ) 3 oof of Lemma 3 Most of the work in the proof of Lemma 3 will be in showing how to give a probabilistic approximation of a single gate using low-degree functions. 3.1 Approximating OR The following lemma says that we can approximately represent OR with a polynomial of degree exponentially small in the the fan-in of the OR gate. We ll use the notation that x is a vector of k bits, x i is the ith bit of x, and 0 is the vector of zeros (of the appropriate size based on context). Lemma 6 For all k and ɛ, there exists a distribution G over functions g : {0, 1} k R such that 1. g is of degree O((log 1 ɛ )(log k)), and 3

2. for all x {0, 1} k, [g(x) = x 1... x k ] 1 ɛ. (1) g G oof Idea: We want a random polynomial p : {0, 1} k R that computes OR. An obvious choice is p bad (x 1,..., x k ) = 1 (1 x i ), (2) i {1,...,k} which computes OR with no error. But it has degree k, whereas we d like it to have logarithmic degree. To accomplish this amazing feat, we ll replace the tests of all k variables with just a few tests of random batches of variables. This gives us a random polynomial which computes OR with one-sided error: when x = 0, we ll have p(x) = 0; and when some x i = 1, we ll almost always (over our choice of p) have p(x) = 1. oof: We pick a random collection S of subsets of the bits of x. (That is, for each S S we have S {1,..., k}). We ll soon see how to pick S, but once the choice has been made, we define our polynomial as p(x 1,..., x k ) = 1 ( 1 ) x i. (3) S S i S Why does p successfully approximate OR? First, suppose x 1... x k = 0. Then we have x = 0, and: ( ) p(0,..., 0) = 1 S S 1 i S So, regardless of the distribution from which we pick S, we have 0 = 0. (4) [p(0) = 0] = 1. (5) S Next, suppose x 1... x k = 1. We have p(x) = 1 if and only if the product term is zero. The product term is zero if and only if the sum in some factor is 1. And that, in turn, happens if and only if there is some S S which includes exactly one x i which is 1. Formally, for any x {0, 1} k, x 0, we want the following to be true with high probability. S S. ( {i S : x i = 1} = 1) (6) Given that we do not want S to be very large (so that the degree of the polynomial is small), we ll have to pick S very carefully. In order to accomplish this, we turn to the Valiant-Vazirani reduction, which you may recall from an earlier class. Lemma 7 (Valiant-Vazirani) Let A {1,..., k}, let a be such that 2 a A 2 a+1, and let H be a family of pairwise independent hash functions of the form h : {1,..., k} {0, 1} a+2. Then if we pick h at random from H, there is probability at least 1/8 that there is a unique element i A such that h(i) = 0. ecisely, h H [ {i A : h(i) = 0} = 1] 1 8 (7) 4

With this as a guide, we will define our collection S in terms of pairwise independent hash functions. Let t > 0 be a value that we will set later in terms of the approximation parameter ɛ. Then we let S = {S a,j } a {0,...,log k},j {1,...,t} where the sets S a,j are defined as follows. For a {0,..., log k}: For j {1,..., t}: Pick random pairwise independent hash function h a,j : {1,..., k} {0, 1} a+2 Define S a,j = h 1 (0). That is, S a,j = {i : h(i) = 0}. Now consider any x 0 which we are feeding to our OR-polynomial p. Let A be the set of bits of x which are 1, i.e., A = {i : x i = 1}, and let a be such that 2 a A 2 a+1. Then we have a {0,..., log k}, so S includes t sets S a,1,..., S a,t. Consider any one such S a,j. By Valiant-Vazirani, we have which implies that h a,j H [ {i A : h a,j(i) = 0} = 1] 1 8 h a,j H [ {i A : i S a,j} = 1] 1 8 so the probability that there is some j for which S a,j A = 1 is at least 1 ( 7 t, 8) which by the reasoning above tells us that p [p(x) = x 1... x k ] 1 (8) (9) ( ) 7 t. (10) 8 Now, to get a success probability of 1 ɛ as required by the lemma, we just pick t = O(log 1 ɛ ). The degree of p will then be S = t(log k) = O((log 1 ɛ )(log k)), which satisfies the degree requirement of the lemma. Note that given this lemma, we can also approximate AND with an exponentially low degree polynomial. Suppose we have some G which approximates OR within ɛ as above. Then we can construct G which approximates AND by drawing g from G and returning g such that g (x 1,..., x k ) = 1 g(1 x 1,..., 1 x k ). (11) Any such g has the same degree as g. Also, for a particular x {0, 1} k, g clearly computes AND if g computes OR, which happens with probability at least 1 ɛ over our choice of g. 3.2 oof of Lemma 3 Given a circuit C of size S and depth d, for every gate we pick independently an approximating function g i with parameter ɛ = 1 4S, and replace the gate by g i. Then, for a given input, the probability that the new function so defined computes C(x) correctly is at least the probability that the results of all the gates are correctly computed, which is at least 3 4. In particular, there is a function among those generated this way that agrees with C() on at least a 3/4 fraction of inputs. Each g i has degree at most O((log S) 2, because the fan-in of each gate is at most S, and the degree of the function defined in the construction is at most O((log S) 2d ). 5

4 oof of Lemma 4 Let g : {0, 1} n R be a function of degree at most t that agrees with PARITY on at least a 3/4 fraction of inputs. Let G : { 1, 1} n R be defined as ( 1 G(x) := 1 2g 2 1 2 x 1,, 1 2 1 ) 2 x n (12) Note that: G is still of degree at most t, G agrees with the function Π(x 1,..., x n ) = x 1 x 2 x n on at least a 3/4 fraction of { 1, 1} n. Define A to be the set of x { 1, 1} n such that G(x) = Π(x). { } n A := x : G(x) = x i. (13) Then A 3 4 2n, by our initial assumption. Now consider the set F of all functions f : A R. These form a vector space of dimension A over the reals. We know that any function f in this set can be written as f(x) = ˆf α x i (14) α Over A, G(x) = n i=1 x i, and so for x A, i=1 i α i α x i = G(x) i/ α x i (15) By our initial assumption, G(x) is a polynomial of degree at most t. Therefore, for every α, such that α n 2, we can replace i α x i by a polynomial of degree less than or equal to t + n 2. Every such function f which belong to F can be written as a polynomial of degree at most t + n 2. Hence the set { i α x } i forms a basis for the set S. As there must α t+ n 2 be at least A such monomials, this means that and, in particular, t+ n 2 k=0 t+ n 2 k= n 2 ( ) n 3 k 4 2n (16) ( ) n 1 k 4 2n (17) We know from Stirling s approximation that every binomial coefficient ( n k) is at most O(2 n / n), so we get ( ) t O 2 n 1 n 4 2n (18) And so t = Ω( n). 6

5 References The proof of the PARITY lower bound using polynomials is due to Razborov [Raz87] and Smolensky [Smo87]. The first proof that PARITY is not in AC 0 used a different argument, and was due to Furst, Saxe and Sipser [FSS84]. The lower bound was improved to exponential by Yao [Yao85], and the optimal lower bound is due to Håstad [Hås86]. References [FSS84] Merrick L. Furst, James B. Saxe, and Michael Sipser. Parity, circuits, and the polynomial-time hierarchy. Mathematical Systems Theory, 17(1):13 27, 1984. 7 [Hås86] Johan Håstad. Almost optimal lower bounds for small depth circuits. In oceedings of the 18th ACM Symposium on Theory of Computing, pages 6 20, 1986. 7 [Raz87] A.A. Razborov. Lower bounds on the size of bounded depth networks over a complete basis with logical addition. Matematicheskie Zametki, 41:598 607, 1987. 7 [Smo87] Roman Smolensky. Algebraic methods in the theory of lower bounds for boolean circuit complexity. In oceedings of the 19th ACM Symposium on Theory of Computing, pages 77 82, 1987. 7 [Yao85] Andrew C Yao. Separating the polynomial-time hierarchy by oracles. In oceedings of the 26th IEEE Symposium on Foundations of Computer Science, pages 1 10, 1985. 7 7