Probablistically Checkable Proofs

Lectue 12 Pobablistically Checkable Poofs May 13, 2004 Lectue: Paul Beame Notes: Chis Re 12.1 Pobablisitically Checkable Poofs Oveview We know that IP = PSPACE. This means thee is an inteactive potocol between a pove P and a veifie V such that Ψ TQBF (P, V ) accepts Ψ with pobability 1 Ψ / TQBF (P, V ) accepts Ψ with Pobability 1 2. Suppose that we wanted to expess the entie stategy of the pove in this potocol fo all possible inteactions with a veifie. This stategy would consist of a ooted tee of the possible inteactions, whee each node of the tee coesponds to the state of the system between ounds of communication and the edges ae labelled by the values sent duing the ounds. At each ound, the veifie sends a polynomial numbe of andom bits (in the size N of the input Ψ) so each node just befoe a veifie move has fan-out 2 N O(1). Each node just befoe a pove move has fan-out 1 which lists the pove s (best) esponse to the pevious communcation. The entie tee has depth N O(1). Because of the depth and fan-out of the tee, it has 2 N O(1) nodes oveall. Thus, the IP potocol coesponds to a table of size 2 N O(1) of which the potocol accesses/queies only N O(1) bits detemined by the choice of N O(1) andom bits. PCP is a genealization of these poof techniques along the two mentioned axes: the numbe of andom bits and the numbe of queies allowed. 12.2 PCP definition Think of a poof as a lage table and the veification has access to only a small numbe of places in the table. This is the main idea of pobabilistically checkable poofs. Definition 12.1. L PCP((n), q(n)) a polynomial time oacle TM V? with access to O((n)) andom bits and that makes O(q(n)) queies to its oacle such that x L Π such that P [V Π (x) accepts] = 1 x / L Π. P [V Π (x) accepts] 1 2 65

LECTURE 12. PROBABLISTICALLY CHECKABLE PROOFS 66 whee Π denotes an oacle which we think of as a poof fo L. Π is viewed as a table listing all the values on which it can be queied. Uppe bound on Π How big does the poof table need to be? Think of a lage table indexed by the queies and the andom stings, this case implies it is at most O(q(n))2 O((n)). It follows that Lemma 12.1. PCP((n), q(n)) NTIME(2 O((n)) q(n)). A moe geneal fomulation Thee ae still two constants in ou fomulation 1 and 1 2. We can genealize PCP futhe by intoducing the notions of the soundness and completeness of the poof system. Definition 12.2. L P CP c,s ((n), q(n)) a polynomial time oacle TM V? with access to O((n)) andom bits and that makes O(q(n)) queies to its oacle such that x L Π such that P[V Π (x) accepts] c (Completeness) x / L Π. P[V Π (x) accepts] s (Soundness) This fomulation is impotant fo some hadness of appoximation esults. If the paametes ae not specified we assume the oiginal fomulation. Definition 12.3. PCP(poly, poly) = k PCP(nk, n k ) Remak. The definition of PCP(poly, poly) was oiginally motivated by a diffeent way of extending the idea of IP. This idea was to allow multiple all-poweful instead of just one pove. With such Multipove Inteactive Poof systems the multiple poves can be used with the estiction that they may not collude with each othe. The set of languages poved in polynomial time in such settings was/is known as MIP. Late it was shown that the languages in MIP ae pecisely those in PCP(poly, poly) and the oacle chaacteization was used but esults ae still sometimes stated as esults about MIP. 12.2.1 Results about PCP Immediately fom the motivating discussion fo the definition of the PCP classes above we have IP PCP(poly, poly) and thus the following lemma. Lemma 12.2. PSPACE PCP(poly, poly). As a coollay to Lemma 12.1 we have Lemma 12.3. PCP(poly, poly) NEXP and fo any polynomial q(n) we have PCP(log n, q(n)) NP. The main esults about PCP ae the following which show stong conveses of the above simple inclusions. Theoem 12.4 (Babai-Fotnow-Lund). PCP(poly, poly) = NEXP. The following esult is so stong it is known as the PCP Theoem. It was the culimination of a sequence impovements of the above esult by Babai, Levin, Fotnow, and Szegedy, by Feige, Goldwasse, Lovasz, Safa, and Szegedy and by Aoa and Safa.

LECTURE 12. PROBABLISTICALLY CHECKABLE PROOFS 67 Theoem 12.5 (Aoa-Lund-Motwani-Szegedy-Sudan). NP = PCP(log n, 1). It is inteesting to note that the O(1) in the numbe of queies and can actually be educed to 3 in the stongest vesions of the PCP Theoem. This is impotant fo many of the esults about appoximation poblems. Fo example in appoximating CLIQUE this stonge vesion is used to show that it is had to appoximate CLIQUE with even an n 1 o(1) facto. Note that the PCP Theoem can be seens as a stict stengthening of the esult of Babai, Fotnow, and Lund since it yields the following coollay. Coollay 12.6. PCP(poly, 1) = NEXP. Ove the next seveal lectues we will go ove the poofs of these theoems. To begin with we need a convenient chaacteization fo NEXP. 12.2.2 A Complete Poblem fo NEXP 3SAT woked nicely as a complete poblem fo NP, so it is natual to look fo an analog of 3SAT fo NEXP. In fact, the Cook-Levin tableau agument will again be ou basis fo showing completeness. The analogous poblem will be an implicitly defined vesion of 3SAT. Fist we will define a poblem ORACLE-3SAT that we will diectly show to be NEXP-complete. ORACLE-3SAT will be 3SAT defined on exponential size fomulas in 3CNF defined on 2 n vaiables and 2 m clauses. Definition 12.4. An oacle tuth assignment is a function A : {0, 1} n {0, 1} whee we intepet A(v) = 1 x v = 1. Definition 12.5. An oacle 3CNF is a map C : {0, 1} m {0, 1} 3n+3 that specifies, given the index w {0, 1} m of a clause, the 3 vaiables in that clause and the thee signs fo those vaiables. Fo convenience, the output C(w) be epesented by a tuple (v 1, v 2, v 3, s 1, s 2, s 3 ) whee v 1, v 2, v 3 {0, 1} n, s 1, s 2, s 3 {0, 1} and the clause epesented is x s 1 v 1 x s 2 v 2 x s 3 v 3 whee x 0 denotes x and x 1 denotes x. Definition 12.6. An oacle 3CNF C is satisfiable if and only if thee exists an oacle tuth assignment A such that fo all w {0, 1} m thee exists an i {1, 2, 3} such that if C(w) = (v 1, v 2, v 3, s 1, s 2, s 3 ) then A(v i ) = s i. Definition 12.7. We will epesent oacle 3CNF fomulas using multi-output combinational cicuits C computing functions C : {0, 1} m {0, 1} 3n+3. Theefoe we define ORACLE-3SAT = { C C is a Boolean cicuit epesenting a satisfiable oacle 3CNF}. Lemma 12.7. ORACLE-3SAT is NEXP-Complete Poof. Given a cicuit C let m be the numbe of inputs to C and 3n + 3 be the numbe of outputs of C. Given C, a NEXP machine can clealy guess and wite down a tuth assignment S fo all 2 n choices of v {0, 1} n and then veify that fo all 2 m values of w, clause C(w) is satisfied by A. Theefoe ORACLE-3SAT NEXP. Let L be an abitay language in NEXP. Now conside the Cook-Levin tableau fo L and how this convets fist to a CIRCUIT-SAT poblem and then to a 3SAT poblem. Fo some polynomial p, The tableau has width 2 p( x ) and height 2 p( x ). The fist x enties in the fist ow depend on x; the emainde of

LECTURE 12. PROBABLISTICALLY CHECKABLE PROOFS 68 the fist ow ae based on geneic nondeteministic guesses y and in the entie est of the tableau the tableau is vey geneic with each enty based on the 3 enties immediately above it in the tableau. Except fo the fist enties in this table the complexity of each local window depends only on the complexity of the Tuing machine fo L which is constant size and the only diffeence is the dependence of the fist x enties on x. Now conside the 3CNF fomula that esults fom the eduction of the CIRCUIT-SAT fomula which simulates this tableau. Given the indices (i, j) in binay of a cell in the tableau it is easy to descibe in polynomial time what the connections of the pieces of the cicuit ae that simulates this tableau and theefoe what the pieces of the 3CNF fomula ae that will be poduced as the 3CNF fomula in the Cook-Levin poof. Thus we have a polynomial-time algoithm C that, based on the index of a clause, will poduce a 3-clause that is in an (exponential-size) 3CNF fomula ϕ such that x L if and only if ϕ is satisfiable. Since C is a polynomial-time algoithm thee is a polynomial size cicuit C that simulates C ; moeove it is vey easy to poduce C given the input x and the desciption of the Tuing machine fo L. The eduction maps x to C. Clealy x L if and only if C ORACLE-3SAT. ORACLE-3SAT is slightly inconvenient to use fo ou puposes so we will use a vaiant of the idea that includes as a single Boolean function both the output of the cicuit and the veification of the 3-CNF clause that it outputs. Suppose that C is a boolean cicuit with g gates. Conside a function B which takes as input: the oiginal input w to C, a tiple of 3 vaiables v 1, v 2, v 3 output by C and pupoted values of each of the gates of C and 3 binay values a 1, a 2, a 3 and states that if on input w, the vaiables appeaing in the output clause of C ae coectly epesented in the input desciption fo B and the values of the gates of C ae coectly epesented in the input desciption to B then the signs of the clause output by C on input w will evaluate to tue if vaiable x v1 = a 1, x v2 = a 2 and x v3 = a 3. Because we have included the values of the intenal gates of C we can easily expess B as a Boolean fomula based on C. Moe fomally: B : {0, 1} m+g+3n+3 {0, 1} is defined as follows whee w = m, z = g, v i = n, and a i = 1. B(w, z, v 1, v 2, v 3, a 1, a 2, a 3 ) = (C(w) = (v 1, v 2, v 3, s 1, s 2, s 3 ) and z epesents values of the gates of C) i. (s i = a i ) s 1,s 2,s 3 The fact that B can epesented as a Boolean fomula simply mios the usual agument conveting CIRCUIT-SAT to SAT. Notice that the definition of B implies that fo an oacle assignment A, A satisfies C w, z, v 1, v 2, v 3 B(w, z, v 1, v 2, v 3, A(v 1 ), A(v 2 ), A(v 3 )). Definition 12.8. A Boolean fomula B in h + 3n + 3 vaiables is a satisfiable implicit 3CNF fomula iff thee is an oacle tuth assignment A : {0, 1} n {0, 1} such that w {0, 1} h, v 1, v 2, v 3 {0, 1} n, B(w, v 1, v 2, v 3, A(v 1 ), A(v 2 ), A(v 3 )) is tue. Definition 12.9. IMPLICIT-3SAT = { B B is a satisfiable implicit 3CNF fomula }. Clealy the convesion fom C to B as defined above is polynomial time so we have that ORACLE- 3SAT is polynomial time educible to IMPLICIT-3SAT and thus: Theoem 12.8. IMPLICIT-3SAT is NEXP-Complete. Now that we have a complete poblem fo NEXP, we need to show how to convince ouselves in the equied time bounds to achieve ou goal of showing that NEXP = PCP(poly, poly).

LECTURE 12. PROBABLISTICALLY CHECKABLE PROOFS 69 12.2.3 Veification pocedue fo an oacle tuth assignment 1st idea The definition of the satisfiability of the implicit 3CNF B can easily be seen to be computation that can be done in conp A since thee ae only univesal quantifies othe than the oacle calls and the evaluation of B. Since conp A P SP ACE A we could ty to modify the IP=PSPACE to use an oacle. This IP pocotol elied on ou ability to aithmetize fomulas, like quantified vesions of B that give values ove {0, 1}, to polynomials ove a field F of modeate size. It then elied on the ability to quey such functions on andomly chosen field elements F. In tying to apply the potocol hee thee is no poblem with aithmetizing B. Howeve, so fa, we only have an oacle A that gives Boolean outputs given an input sting v {0, 1} n ; we would need to be able to evaluate A on elements of F n instead. Fotunately, we can do this by using a multilinea extension of A. Definition 12.10. A function in n vaiables is multilinea if and only if it can be expessed as a multivaiate polynomial which has degee at most 1 in each vaiable. Lemma 12.9. Fo any A : {0, 1} n F thee is a (unique) multilinea polynomial that extends A. That is thee exists a multilinea Â : Fn F such that Â(v) = A(v) fo v {0, 1}n. Poof. Let Â(x) = v {0,1} n A(v) x i (1 xi ). This is the desied multilinea polynomial. Uniqueness is left as an execise. So we have two pats fo the poof table so fa: A table of the multilinea oacle Â and the full inteaction tee fo the IPÂ potocol fo the conpâ poblem of veifying that Â satisfies B. Howeve, this is not enough. The pove might not poduce a coect table of Â! Thus, in addition to Â, the poof table must include pat that allows the veifie to check with some cetainty that Â eally is a multilinea extension of a tuth assignment. Actually, because the pove can only check a small pat of the table thee will be no way to convince the veifie that the table Â eally is multilinea. Howeve, as we will see in the next lectue, we just need it to be close to such an extension which is something the veifie will be able check.