Notes on Complexity Theory                          Last updated: October, 2005
Jonathan Katz                                                         Handout 5

1 An Improved Upper-Bound on Circuit Size

Here we show the result promised in the previous lecture regarding an upper-bound on the size of circuits needed to compute any n-ary function. The key idea is to re-use prior computations whenever possible. We begin with an observation and a definition.

First, the observation: every n-ary function f can be written as

    f(x_1, ..., x_n) = ⊕_{a_1,...,a_n ∈ {0,1}} α_{a_1⋯a_n} · x_1^{a_1} ⋯ x_n^{a_n}

for some set of binary constants {α_{a_1⋯a_n}}, where

    x_i^{a_i}  def=  x_i if a_i = 1,  and  1 if a_i = 0.

(Note that the {α_{a_1⋯a_n}} are simply selector bits which determine whether a given term contributes to the sum.) To see this, consider the arithmetization of f over Z_2: write f in conjunctive normal form and then map ¬x to (1 - x); map C ∧ C' to C · C' (where C, C' are expressions); and map C ∨ C' to 1 - (1 - C)(1 - C'). Expanding everything out, we obtain a sum over all possible monomials. The above is known as the ring-sum expansion of f.
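The coefficients are easy to compute mechanically: the monomial x_1^{a_1} ⋯ x_n^{a_n} evaluates to 1 exactly when a ⊆ x (as bit-vectors), so the expansion inverts to α_a = ⊕_{b⊆a} f(b). The following Python sketch (our addition, not part of the original notes) computes the coefficients by the standard subset-XOR transform and re-evaluates the expansion as a sanity check:

    from itertools import product

    def ring_sum_coefficients(f, n):
        # alpha[a] starts as f(a) and ends as XOR_{b subseteq a} f(b),
        # via the O(n * 2^n) subset-sum (zeta) transform over Z_2.
        alpha = [f(tuple((a >> i) & 1 for i in range(n))) for a in range(2 ** n)]
        for i in range(n):
            for a in range(2 ** n):
                if a & (1 << i):
                    alpha[a] ^= alpha[a ^ (1 << i)]
        return alpha

    def eval_ring_sum(alpha, x_bits):
        # f(x) = XOR of alpha[a] over all monomials a with a subseteq x.
        x = sum(b << i for i, b in enumerate(x_bits))
        return sum(alpha[a] for a in range(len(alpha)) if a & x == a) % 2

    # Sanity check: majority of three bits, whose ring-sum expansion is
    # x1*x2 XOR x1*x3 XOR x2*x3.
    maj = lambda x: int(sum(x) >= 2)
    alpha = ring_sum_coefficients(maj, 3)
    assert all(eval_ring_sum(alpha, x) == maj(x) for x in product((0, 1), repeat=3))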

Now, the definition: given a set of n-ary functions F, let S(F) be the size of the smallest circuit C such that for each f ∈ F there is some gate in C computing the function f. For simplicity in what follows, we will work over the basis {0, 1, ⊕, ∧} (and so S(·) is defined over this basis).

Lemma 1  Given any collection F of n-ary functions and arbitrary p ≥ 1, we have:

    S(F) ≤ 2^n + (2^n/p) · 2^p + |F| · (2^n/p).

Proof  Label the inputs x = x_1, ..., x_n. We first compute the terms x_1^{a_1} ⋯ x_n^{a_n} for all possible values a_1, ..., a_n ∈ {0,1}. Say such a term has length ℓ if the number of non-zero a_i-values is exactly ℓ. By computing all terms of length ℓ - 1 before moving on to the terms of length ℓ, we can compute each additional term using only a single gate. It follows that we can compute all such terms using 2^n gates (this is actually a slight over-estimate).

Next, partition these terms into q = ⌈2^n/p⌉ sets {C_i}_{i=1}^q, each containing at most p terms. Within each set C_i = {t_{i,1}, ..., t_{i,p'}} (with p' ≤ p), compute the ⊕ of all possible subsets of the terms in C_i; i.e., for each S ⊆ [p'], compute

    D_S^i  def=  ⊕_{j∈S} t_{i,j}.

As before, we can compute each value using one additional gate; thus, we can compute all such values, over all sets {C_i}, using (at most) q · 2^p gates.

Relying on the observation made earlier, we can represent the value f(x) for any function f ∈ F as ⊕_{i=1}^q D_{S_i}^i, for some sets {S_1, ..., S_q}. This requires at most (q - 1) ⊕-gates for each f ∈ F, and so at most |F| · (2^n/p) additional gates. Adding everything up gives the result of the lemma.

With the above in mind, we can prove the following theorem, due to Lupanov:

Theorem 2  Every n-ary function can be computed by a circuit with 2^n/n + o(2^n/n) gates (over the basis {0, 1, ⊕, ∧}).

Proof  We first prove the following claim:

Claim 3  Given any n-ary function f, and for any p ≥ 1 and 0 ≤ m ≤ n, there is a circuit computing f of size

    3 · 2^{n-m} + 2^m + (2^m/p) · 2^p + 2^n/p.

This claim implies the theorem by setting m = √n and p = n - m - log n: with this setting each of the first three terms is o(2^n/n), while 2^n/p = 2^n/n + o(2^n/n).

To prove the claim, we construct a circuit of the stated size computing f. First compute each of the terms x_{m+1}^{a_{m+1}} ⋯ x_n^{a_n} for all a_{m+1}, ..., a_n ∈ {0,1}. As in the proof of the previous lemma, this can be done using at most 2^{n-m} ∧-gates. Next, observe that f can be written as

    f(x_1, ..., x_n) = ⊕_{a_{m+1},...,a_n ∈ {0,1}} h_{a_{m+1}⋯a_n}(x_1, ..., x_m) ∧ x_{m+1}^{a_{m+1}} ⋯ x_n^{a_n},    (1)

for some set of 2^{n-m} m-ary functions {h_{a_{m+1}⋯a_n}(·)}. By the previous lemma, we can compute all of these functions using

    2^m + (2^m/p) · 2^p + 2^{n-m} · (2^m/p) = 2^m + (2^m/p) · 2^p + 2^n/p

gates. We then need (at most) an additional 2^{n-m} ∧-gates and (2^{n-m} - 1) ⊕-gates to compute f using Eq. (1). Adding everything up proves the claim.
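For a feel of how the bound behaves, here is a quick numeric check (our addition, not from the notes) of Claim 3 under the parameter choice above, comparing against 2^n/n:

    import math

    def lupanov_bound(n):
        # Claim 3 with m = sqrt(n) and p = n - m - log n.
        m = round(math.sqrt(n))
        p = n - m - round(math.log2(n))
        return 3 * 2 ** (n - m) + 2 ** m + (2 ** m / p) * 2 ** p + 2 ** n / p

    for n in (16, 32, 64, 128, 256):
        print(f"n = {n:3d}:  bound / (2^n/n) = {lupanov_bound(n) / (2 ** n / n):.2f}")

The ratio drifts down toward 1, though slowly (roughly like 1 + (√n + log n)/n), matching the o(2^n/n) lower-order term.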

2 Randomized Computation

There are multiple ways to define a randomized model of computation. The first is via Turing machines with a probabilistic transition function (this is called the online model); the second is by augmenting Turing machines with an additional (read-only) random tape populated with infinitely-many random bits. For the latter approach, one can consider either one-way or two-way random tapes. The difference between these models is unimportant for randomized time complexity classes, but is important for randomized space classes. (In fact, the various models described above are exactly analogous to the various models we discussed in the case of non-deterministic space classes. As there, the difference is immaterial when our complexity measure is time, but becomes important when our complexity measure is space.)

We denote by M(x) a random computation of M on input x, and by M(x; r) the deterministic computation of M on input x using random tape r. We now define some randomized complexity classes. In the following, "ppt" stands for probabilistic polynomial-time.

2.1 RP and coRP

Definition 1  Class RP consists of languages L for which there exists a ppt machine M such that:

    x ∈ L ⇒ Pr[M(x) = 1] ≥ 1/2    and    x ∉ L ⇒ Pr[M(x) = 0] = 1.

When running an RP algorithm for a language L on an input x, if M(x) outputs 1 we are sure that x ∈ L; however, if M(x) outputs 0 we cannot make any definitive claim.

Definition 2  Class coRP consists of languages L whose complement L̄ is in RP. Alternately, L ∈ coRP if there exists a ppt machine M such that:

    x ∈ L ⇒ Pr[M(x) = 1] = 1    and    x ∉ L ⇒ Pr[M(x) = 0] ≥ 1/2.

It is left as an exercise to prove that the two definitions of coRP given above are indeed equivalent. Note that when running a coRP algorithm for a language L on an input x, if M(x) outputs 0 we are sure that x ∉ L, but if M(x) outputs 1 we cannot make any definitive claim.

A simple observation is that RP ⊆ NP, since a random tape r for which M(x; r) = 1 (which is guaranteed to exist if x ∈ L) serves as a witness that x ∈ L. One can also think about this directly in terms of the computation tree of the machine M: an NP algorithm for a language is only required to have a single accepting computation path, whereas an RP machine deciding the same language is guaranteed to have many accepting paths.

2.1.1 Error Reduction

Another straightforward observation is that the constant 1/2 in the above definitions is arbitrary, and can be replaced with any fraction f satisfying 1/p(|x|) ≤ f(|x|) ≤ 1 - e^{-q(|x|)} for polynomials p, q. In particular:

Lemma 4  Fix a language L. If there exists a ppt machine M such that:

    x ∈ L ⇒ Pr[M(x) = 1] ≥ 1/p(|x|)    and    x ∉ L ⇒ Pr[M(x) = 0] = 1

for some polynomial p, then for any polynomial q there exists a ppt machine M' such that:

    x ∈ L ⇒ Pr[M'(x) = 1] ≥ 1 - e^{-q(|x|)}    and    x ∉ L ⇒ Pr[M'(x) = 0] = 1

(here, e is the base of natural logarithms).

Proof  M'(x) is defined as follows: run M(x) a total of t(|x|) = p(|x|) · q(|x|) times, using independent random coins; output 1 iff one or more of the runs of M output 1. Note that t is polynomial, so M' is still a ppt algorithm. No matter how large t is, Pr[M'(x) = 0] = 1 when x ∉ L. On the other hand, for x ∈ L we have:

    Pr[M'(x) = 0] = (Pr[M(x) = 0])^{t(|x|)} ≤ (1 - 1/p(|x|))^{p(|x|) · q(|x|)} < e^{-q(|x|)}.

This concludes the proof.
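As a concrete illustration (our own sketch, not part of the notes), the amplification in the proof is a few lines of Python if we model the RP machine as a black-box predicate M(x, r) taking an m-bit random tape:

    import secrets

    def amplify_rp(M, x, p, q, m):
        # Run M with t = p*q independent m-bit random tapes and accept iff
        # some run accepts.  If x is in L, each run accepts with probability
        # >= 1/p, so Pr[reject] <= (1 - 1/p)^(p*q) < e^(-q); if x is not in
        # L, no run ever accepts, so the one-sided guarantee is preserved.
        return int(any(M(x, secrets.randbits(m)) for _ in range(p * q)))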

The above is an example of what is called error reduction. One natural question is whether we can do better. There are two measures of interest here: how many random bits we use, and how many invocations of the original algorithm M are needed. Let us fix (arbitrarily) 1/p = 1/2 in the above, and assume M uses m random coins. Then the total number of random coins used to obtain error probability e^{-q(|x|)} is O(q · m), and we run M O(q) times.

Another option, which uses fewer random coins but more invocations of M, is to use pairwise-independent random coins. (In class we discussed finite fields, pairwise independence, and a construction of pairwise-independent random variables.) Namely, consider the algorithm M' which runs M a total of q' times using pairwise-independent random coins, and outputs 1 iff one or more of the executions of M output 1. When x ∉ L things are as above. When x ∈ L, define the indicator random variables {X_i}_{i=1}^{q'} such that X_i = 1 iff the i-th execution of M outputs 1. Note that Pr[X_i = 1] ≥ 1/2 and the {X_i} are pairwise independent. Let X def= Σ_{i=1}^{q'} X_i. Using Chebyshev's inequality, and noting that M' outputs 0 only if X = 0, which forces |X - Exp[X]| = Exp[X] ≥ q'/2 ≥ q'/4, we have:

    Pr[M' outputs 0] ≤ Pr[ |X - Exp[X]| ≥ q'/4 ] ≤ Var[X] / (q'/4)^2 = q' · Var[X_1] / (q'/4)^2 ≤ (q'/4) / (q'/4)^2 = 4/q',

where Var[Σ_i X_i] = Σ_i Var[X_i] = q' · Var[X_1] follows from pairwise independence of the {X_i} and the fact that they are identically distributed, and Var[X_1] ≤ 1/4 because X_1 is a 0-1 random variable.

So, to achieve error e^{-q} we need to set q' = 4e^q, and so we are running M exponentially many more times! But the randomness needed is just O(max{q, m}): generating q' pairwise-independent m-bit strings takes only two seed elements of a field of size max{q', 2^m}, i.e., O(max{log q', m}) = O(max{q, m}) random bits. Can we get the best of both worlds (i.e., low randomness and few invocations of M)? Later in the semester, we may see a method based on expander graphs that achieves this.
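The construction alluded to above can be sketched concretely (our addition, not from the notes): take the random tapes to be r_i = a·i + b over a field GF(2^k) with k = max{log q', m}. For brevity the sketch below hardwires k = m = 8, using the AES modulus for GF(2^8); the interface is hypothetical:

    import secrets

    MOD = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

    def gf_mul(x, y):
        # Carry-less multiplication in GF(2^8), reducing modulo MOD.
        out = 0
        while y:
            if y & 1:
                out ^= x
            x <<= 1
            if x & 0x100:
                x ^= MOD
            y >>= 1
        return out

    def pairwise_tapes(q):
        # r_i = a*i + b for one random seed (a, b): each r_i is uniform, and
        # any two are independent, using only 16 seed bits.  Requires q < 256.
        a, b = secrets.randbits(8), secrets.randbits(8)
        return [gf_mul(a, i) ^ b for i in range(1, q + 1)]

    def amplify_rp_pairwise(M, x, q):
        # Accept iff some run accepts; by the Chebyshev bound above, the
        # false-negative probability is at most 4/q.
        return int(any(M(x, r) for r in pairwise_tapes(q)))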

2.2 BPP

RP and coRP are classes of languages that can be decided with one-sided error. BPP is the class of languages that can be decided with two-sided error.

Definition 3  Class BPP consists of languages L for which there exists a ppt machine M such that:

    x ∈ L ⇒ Pr[M(x) = 1] ≥ 2/3    and    x ∉ L ⇒ Pr[M(x) = 0] ≥ 2/3.

In other words: Pr[M(x) = χ_L(x)] ≥ 2/3, where χ_L(x) is the characteristic function of L (i.e., χ_L(x) = 1 if x ∈ L and 0 otherwise). Note that RP ⊆ BPP (and also coRP ⊆ BPP), so BPP is potentially a more powerful complexity class.

As in the case of RP, the constant 2/3 in the definition of BPP is arbitrary. All we require is a noticeable (i.e., polynomially-bounded) gap between acceptance probabilities for strings in and out of the language. Specifically:

Lemma 5  Fix L, and say there exists a ppt machine M such that

    x ∈ L ⇒ Pr[M(x) = 1] ≥ 1/2 + 1/p(|x|)    and    x ∉ L ⇒ Pr[M(x) = 1] < 1/2 - 1/p(|x|)

for some polynomial p. Then L ∈ BPP. (In fact, even the constant 1/2 in the lemma is arbitrary; see [1, Lect. 7].)

Proof  We construct a ppt machine M' such that x ∈ L ⇒ Pr[M'(x) = 1] ≥ 2/3 and x ∉ L ⇒ Pr[M'(x) = 1] < 1/3. This will complete the proof. In designing the algorithm, we rely on a version of the Chernoff bound, stated here for convenience (see [2, Chap. 4]):

Claim 6  Let X_1, ..., X_n be independent 0-1 random variables with Pr[X_i = 1] = ρ for all i. Let X = Σ_{i=1}^n X_i and µ = Exp[X] = ρ · n. Then for 0 < ε < µ:

    Pr[X ≤ µ - ε] ≤ e^{-ε²/2n}.

We define M' as follows: on input x, it runs M(x) a total of t(|x|) = 2p²(|x|) · ln 4 times with independent random inputs (note that t is polynomial). M' outputs 1 iff the fraction of 1's output by these various executions of M is greater than 1/2.

Consider the execution of M' on input x. Let X_i be an indicator random variable which is equal to 1 iff the i-th execution of M(x) outputs the correct answer χ_L(x), and let ρ def= Pr[X_i = 1]. Note that ρ ≥ 1/2 + 1/p(|x|), so ρ · t(|x|) - t(|x|)/p(|x|) ≥ t(|x|)/2, and therefore:

    Pr[M'(x) ≠ χ_L(x)] ≤ Pr[ Σ_{i=1}^{t(|x|)} X_i ≤ t(|x|)/2 ] ≤ Pr[ Σ_{i=1}^{t(|x|)} X_i ≤ ρ · t(|x|) - t(|x|)/p(|x|) ] ≤ e^{-t(|x|)/2p²(|x|)} = 1/4.

Since 1/4 < 1/3, the lemma follows.

Exactly analogous to the case of RP discussed earlier, we can also assume that if L ∈ BPP there is an algorithm M such that Pr[M(x) = χ_L(x)] ≥ 1 - e^{-q(|x|)} for any desired polynomial q. The error reduction technique using pairwise independence also applies to BPP algorithms (although the analysis is slightly more technical).
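The corresponding two-sided amplification is again a short sketch (ours, with the same hypothetical black-box interface as before):

    import math
    import secrets

    def amplify_bpp(M, x, p, m):
        # M(x, r) guesses chi_L(x) correctly with probability >= 1/2 + 1/p.
        # A majority vote over t = 2 * p^2 * ln 4 independent runs has error
        # at most 1/4, by the Chernoff bound of Claim 6.
        t = math.ceil(2 * p * p * math.log(4))
        ones = sum(M(x, secrets.randbits(m)) for _ in range(t))
        return int(2 * ones > t)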

Some miscellaneous comments about BPP follow:

1. BPP seems to represent the class of efficiently-solvable problems (at least until quantum computers become practical...). Clearly P ⊆ BPP, and it was originally thought that P ≠ BPP and so randomness was a way to help solve problems outside of P. More recently, the consensus seems to be that P = BPP (this has not yet been proven to hold unconditionally, but it has been shown to hold based on what seem to be reasonable assumptions).

2. Notwithstanding the above, we do not currently even know whether BPP ⊆ NP or vice versa.

3. BPP is easily seen to be closed under union, intersection, and complement.

Bibliographic Notes

Section 1 is from [3, Section 1.5.2]. Section 2 is adapted from [1, Lecture 7].

References

[1] O. Goldreich. Introduction to Complexity Theory (July 31, 1999).

[2] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[3] H. Vollmer. Introduction to Circuit Complexity. Springer, 1999.