CPSC 536N: Randomized Algorithms Term 2. Lecture 9

Similar documents
Lecture 7: Schwartz-Zippel Lemma, Perfect Matching. 1.1 Polynomial Identity Testing and Schwartz-Zippel Lemma

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ).

Lecture 5 Polynomial Identity Testing

1 Reductions from Maximum Matchings to Perfect Matchings

Advanced Combinatorial Optimization Feb 13 and 25, Lectures 4 and 6

20.1 2SAT. CS125 Lecture 20 Fall 2016

Theory of Computer Science to Msc Students, Spring Lecture 2

Lecture 2: January 18

CSC 5170: Theory of Computational Complexity Lecture 5 The Chinese University of Hong Kong 8 February 2010

CPSC 536N: Randomized Algorithms Term 2. Lecture 2

Essential facts about NP-completeness:

RECAP How to find a maximum matching?

Lecture 22: Counting

UMA Putnam Talk LINEAR ALGEBRA TRICKS FOR THE PUTNAM

EXAMPLES OF PROOFS BY INDUCTION

CSCI-B609: A Theorist s Toolkit, Fall 2016 Oct 4. Theorem 1. A non-zero, univariate polynomial with degree d has at most d roots.

Lecture 24: Randomized Complexity, Course Summary

Lecture 17. In this lecture, we will continue our discussion on randomization.

Math 40510, Algebraic Geometry

Basic Probabilistic Checking 3

1 Randomized Computation

Lecture 10 Oct. 3, 2017

ACO Comprehensive Exam March 17 and 18, Computability, Complexity and Algorithms

15-251: Great Theoretical Ideas In Computer Science Recitation 9 : Randomized Algorithms and Communication Complexity Solutions

Lecture 8 - Algebraic Methods for Matching 1

fsat We next show that if sat P, then fsat has a polynomial-time algorithm. c 2010 Prof. Yuh-Dauh Lyuu, National Taiwan University Page 425

Lecture 5 January 16, 2013

SPECIAL POINTS AND LINES OF ALGEBRAIC SURFACES

Lecture 18: Inapproximability of MAX-3-SAT

compare to comparison and pointer based sorting, binary trees

We simply compute: for v = x i e i, bilinearity of B implies that Q B (v) = B(v, v) is given by xi x j B(e i, e j ) =

Lecture 22: Derandomization Implies Circuit Lower Bounds

Factoring. there exists some 1 i < j l such that x i x j (mod p). (1) p gcd(x i x j, n).

Algebraic approach to exact algorithms, Part III: Polynomials over nite elds of characteristic two

1 The Low-Degree Testing Assumption

Introduction Long transparent proofs The real PCP theorem. Real Number PCPs. Klaus Meer. Brandenburg University of Technology, Cottbus, Germany

Notes on the Matrix-Tree theorem and Cayley s tree enumerator

Graph coloring, perfect graphs

Approximating MAX-E3LIN is NP-Hard

Counting bases of representable matroids

MATH 61-02: PRACTICE PROBLEMS FOR FINAL EXAM

Lecture 10 + additional notes

CS Foundations of Communication Complexity

Lecture 23: Alternation vs. Counting

Problem Set 2. Assigned: Mon. November. 23, 2015

Lecture 3: Oct 7, 2014

Lecture Examples of problems which have randomized algorithms

Unit 4 Systems of Equations Systems of Two Linear Equations in Two Variables

1 More finite deterministic automata

NP-Completeness Review

Lecture 18: More NP-Complete Problems

Lecture 8 (Notes) 1. The book Computational Complexity: A Modern Approach by Sanjeev Arora and Boaz Barak;

Math 5707: Graph Theory, Spring 2017 Midterm 3

Cosets and Lagrange s theorem

Lecture 15: Expanders

A Linear Round Lower Bound for Lovasz-Schrijver SDP Relaxations of Vertex Cover

On enumerating monomials and other combinatorial structures by polynomial interpolation

10.4 The Kruskal Katona theorem

Sydney University Mathematical Society Problems Competition Solutions.

Bipartite Perfect Matching

CSC 5170: Theory of Computational Complexity Lecture 13 The Chinese University of Hong Kong 19 April 2010

Lecture 24: Approximate Counting

Computability and Complexity Theory: An Introduction

1 Primals and Duals: Zero Sum Games

CS151 Complexity Theory. Lecture 9 May 1, 2017

Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem

Lecture 19: Interactive Proofs and the PCP Theorem

1 Adjacency matrix and eigenvalues

U.C. Berkeley CS174: Randomized Algorithms Lecture Note 12 Professor Luca Trevisan April 29, 2003

Projective space. There are some situations when this approach seems to break down; for example with an equation like f(x; y) =y 2 (x 3 5x +3) the lin

Weekly Activities Ma 110

5 Flows and cuts in digraphs

ACO Comprehensive Exam October 18 and 19, Analysis of Algorithms

Lecture 10 February 4, 2013

The integers. Chapter 3

Lecture 6: Finite Fields

MTAT Complexity Theory October 13th-14th, Lecture 6

Randomness. What next?

Reconstruction of full rank algebraic branching programs

Effective Resistance and Schur Complements

GALOIS; The first Memoir

Generating Functions

A Lower Bound for the Size of Syntactically Multilinear Arithmetic Circuits

Pr[X = s Y = t] = Pr[X = s] Pr[Y = t]

Advanced Combinatorial Optimization September 22, Lecture 4

Randomized Algorithms

MATH 115, SUMMER 2012 LECTURE 12

Algebraic function fields

DISTINGUISHING PARTITIONS AND ASYMMETRIC UNIFORM HYPERGRAPHS

Winter 2011 Josh Benaloh Brian LaMacchia

Design and Analysis of Algorithms

Permutations and Polynomials Sarah Kitchen February 7, 2006

Formalizing Randomized Matching Algorithms

Show that the following problems are NP-complete

1 Randomized complexity

Matching Polynomials of Graphs

Lecture 15: A Brief Look at PCP

Math 300: Final Exam Practice Solutions

Proving languages to be nonregular

Chapter 2. Reductions and NP. 2.1 Reductions Continued The Satisfiability Problem (SAT) SAT 3SAT. CS 573: Algorithms, Fall 2013 August 29, 2013

Transcription:

CPSC 536N: Randomized Algorithms 2011-12 Term 2 Prof. Nick Harvey Lecture 9 University of British Columbia 1 Polynomial Identity Testing In the first lecture we discussed the problem of testing equality of two bitstrings in a distributed setting. We reduced that problem to the problem of testing whether a certain polynomial q is the zero polynomial. That was our first glimpse of polynomial identity testing (PIT). Let us now define the general PIT problem, while still being a bit vague. Given two multivariate polynomials p(x 1,..., x n ) and q(x 1,..., x n ) over some field F, we would like to decide if they are equal. Note that this is equivalent to deciding if their difference p q is zero. This equivalent form is often more convenient to think about because it only involves the single polynomial p q. So we could also define PIT to be: given a polynomial p(x 1,..., x n ) over some field F, decide if p is zero. You might think this problem is rather dull and algebraic, but it actually captures many interesting and natural computational problems. For example, we ve already seen that testing equality of strings reduces to PIT (even univariate PIT). We ll see another example in Section 2. Next we discuss an important way in which our definition of PIT is too vague. What is a zero polynomial? Our definition of PIT did not say precisely what it means for p to be zero. There are two possible meanings, which lead to two different computational problems. Our desired definition of PIT uses the second meaning. The Evaluates to Zero Everywhere (EZE) problem. Given a polynomial p(x 1,..., x n ) over F, we must decide whether, for every choice of numbers y 1,..., y n F, the value of p(y 1,..., y n ) is the number 0. The Polynomial Identity Testing (PIT) problem. Given a polynomial p(x 1,..., x n ), we can write it as a sum over monomials with various coefficients. For example, given p(x, y, z) = (x + 2y)(3y z), we can expand it into a sum of monomials as p(x, y, z) = 3xy + 6y 2 xz 2yz. The problem is to decide whether, after expanding p into monomials, are all coefficients of those monomials equal to zero? If so, we say that p is the zero polynomial, or that it is identically zero. It might never have occurred to you that EZE and PIT are different problems. If p is identically zero then definitely it evaluates to zero everywhere. Is the converse true? Over any infinite field, like R or C, the converse is true: a polynomial evaluates to zero at every point if and only if it is identically zero. But over finite fields the converse is not true, so EZE and PIT are different problems. For example, the univariate polynomial p(x) = x 2 + x over F 2 is not identically zero but p(0) = 0 and p(1) = 1 + 1 = 0. It is worth pointing out that trivial approaches do not solve these problems efficiently. The trivial approach for EZE is to exhaustively try every possible choice of numbers y 1,..., y n and evaluate p(y 1,..., y n ). This requires F n evaluations. The trivial approach for PIT is to explicitly expand the 1

polynomial into monomials, but if p has degree d then there can be ( ) n+d d such monomials, which is exponential in d. Let me also clarify what is meant by the degree of a multivariate polynomial (or even monomial). A monomial is any expression of the form α n i=1 xβ i i where α F and β 1,..., β n are non-negative integers. The total degree of that monomial is i β i. The total degree of a polynomial is defined to be the largest total degree of its monomials. Our definitions of EZE and PIT are still vague in another important way. How to represent a polynomial? What does it mean to be given a polynomial? One natural definition is to allow p(x 1,..., x n ) to be presented as an explicit algebraic formula involving only the variables x 1,..., x n, any numbers in F, parentheses, and the operations of addition, subtraction and multiplication. For example, p(x 1, x 2 ) = x 1 (x 1 2 x 2 ). Alternatively, we could allow p to be presented as a black box, meaning that we do not have an explicit representation of p but our algorithms are allowed to evaluate p by plugging in any desired numbers for x 1,..., x n. This is useful because we might have some weird representation for p that is not an explicit formula. For example, let 1 x 1 x 2 1... x n 1 1 1 x 2 x 2 2... x n 1 2 p(x 1,..., x n ) = det.. 1 x n x 2 n. xn n 1 Then p is actually a polynomial in x 1,..., x n of total degree at most n(n 1). We can easily choose numeric values for x 1,..., x n and evaluate p at that point simply by plugging those numbers into the matrix and computing the determinant numerically. But it might be the case that p does not have a concise representation as an explicit formula. (In this particular example the matrix is a Vandermonde matrix, and the explicit formula p(x 1,..., x n ) = i<j (x i x j ) is known.) For our purposes, it does not matter whether the polynomial is given as an explicit formula or as a black box. We discuss this issue further in Section 1.4. 1.1 Complexity Status Unfortunately EZE is conp-hard. There is a simple procedure to encode any 3SAT formula as a polynomial over F 2 such that the polynomial evaluates to zero everywhere if and only if the formula is unsatisfiable. The complexity status of PIT is much more interesting. We will show that there is a randomized algorithm to decide PIT. However, there is no known deterministic algorithm for deciding PIT. Furthermore, if a deterministic algorithm existed then there would be remarkable consequences in complexity theory. 1.2 The Schwartz-Zippel Lemma The main tool that we will use today is the Schwartz-Zippel lemma. Lemma 1 Let p(x 1,..., x n ) be a polynomial of total degree d. Assume that p is not identically zero. Let S F be any finite set. Then, if we pick y 1,..., y n independently and uniformly from S, Pr[ p(y 1,..., y n ) = 0 ] 2 d S.

To help understand the lemma, consider the case of polynomials over R. The theorem says that if we evaluate p at a random point, then we have small probability of seeing a root. Does that mean that polynomials over R have finitely many roots? No! Consider the polynomial p(x 1, x 2 ) = x 1 which has total degree d = 1. It has infinitely many roots because setting x 1 = 0 and x 2 to anything gives a root. However, if we fix S = {0, 1} and randomly choose x 1, x 2 S then our probability of seeing a root is exactly 1/2, which matches the bound given by the Schwartz-Zippel lemma. It is also worth noting that the conclusion of the theorem does not depend on n. Proof: We proceed by induction on n. The base case is the case n = 1, which is the univariate case we discussed in the first lecture. We claimed that any univariate polynomial of degree d has at most d roots. (The reason is that any univariate polynomial of degree d factors uniquely into at most d irreducible polynomials, each of which has at most one root.) So, the probability that y 1 is a root is at most d/ S. Now we assume the theorem is true for polynomials with n 1 variables, and we prove it for those with n variables. The main idea is to obtain polynomials with fewer variables by factoring out the variable x 1 from p. Let k be the largest power of x 1 appearing in any monomial of p. Then p(x 1,..., x n ) = k x i 1 q i (x 2,..., x n ). i=0 By our choice of k, the polynomial q k is not identically zero. Furthermore its total degree is at most d k, so by induction Pr[ q k (y 2,..., y n ) = 0 ] d k. S Define E 1 to be the event q k (y 2,..., y n ) = 0. Let us now randomly choose the values of y 2,..., y n and assume that the event E 1 did not occur. Define f(x 1 ) to be the univariate polynomial f(x 1 ) = k x i 1 q i (y 2,..., y n ) = p(x 1, y 2,..., y n ). i=0 Since E 1 did not occur, the coefficient of x k 1 in f is non-zero, so f is not identically zero. By the argument of our base case, Pr[ f(y 1 ) = 0 E 1 ] k S. Define E 2 to be the event f(y 1 ) = 0. By definition of f, E 2 is equivalent to p(y 1,..., y n ) = 0. We re almost done. The goal of the lemma is to bound Pr[E 2 ], and so far we ve bounded Pr[E 2 E 1 ] and Pr[E 1 ]. By a few manipulations, we get which finishes the proof. Pr[E 2 ] = Pr[E 2 E 1 ] + Pr[E 2 E 1 ] = Pr[E 2 E 1 ] + Pr[E 2 E 1 ] Pr[ E 1 ] Pr[E 1 ] + Pr[E 2 E 1 ] d k + k = d S S S 3

1.3 Solving PIT With the Schwartz-Zippel lemma in hand, we can easily solve PIT. Suppose we are given a polynomial p(x 1,..., x n ) of total degree d. Assume that d < F. (For more on this point, see Section 1.4.) Our algorithm simply picks y 1,..., y n uniformly and independently from F (or if F is infinite, a suitably large finite subset of F). If p(y 1,..., y n ) evaluates to a non-zero value, the algorithm announces that p is not identically zero. In this case we will never make an error: we have conclusive proof that p is not identically zero. Otherwise, if p(y 1,..., y n ) = 0, the algorithm announces that p is identically zero. In this case the Schwartz-Zippel lemma shows that probability of error is at most d/ F < 1. We can decrease the probability of error to any desired level by repeating the algorithm multiple times. This is the amplification by independent trials trick. For example, even if d = F 1, then repeating the algorithm F times reduces the probability of failure to at most ( F 1 ) F F ( 1 1 ) F 1/e. F 1.4 Further Discussion of Complexity Status Our randomized algorithm above assumed that p had total degree d < F. Under that assumption, the Schwartz-Zippel lemma implies that any polynomial that evaluates to zero everywhere must be identically zero. Therefore, under the assumption that d < F, the problems EZE and PIT are actually equivalent, so they can both be solved by our randomized algorithm. This also explains why EZE and PIT are equivalent for infinite fields. It remains to discuss the case d F. In this case: EZE is NP-hard as mentioned above. PIT in the black box model cannot be solved. This is because there are polynomials which are not identically zero, but evaluate to zero everywhere (such as p(x) = x 2 + x over F 2 ) and these cannot be distinguished from the identically zero polynomial. PIT in the explicit formula model is solvable! Since d F, the Schwartz-Zippel lemma does not give any useful information about the number of roots. The trick is to use a field extension of F. Any finite field F can be extended to larger finite field F which contains F as a subfield. So instead of viewing p as a polynomial over F, we view it as a polynomial over a larger field F with d < F. This preserves the property of p being identically zero, so we can simply run our randomized algorithm for p over the field F. 2 Bipartite Matching We conclude by describing using PIT to solve the Bipartite Matching problem. Let G = (U V, E) be a bipartite graph, meaning that U and V are disjoint sets of vertices, and every edge in E has exactly one endpoint in U and exactly one endpoint in V. A matching in G is a set of edges that share no endpoints. A perfect matching in G is a set of edges M E such that every vertex is contained in exactly one edge of M. 4

Polynomial time algorithms are known to decide if G has a perfect matching, and even to construct such a matching. We will give a randomized algorithm to decide if G has a perfect matching, by reducing that problem to PIT. Let A be the matrix whose rows are indexed by the vertices in U and columns are indexed by the vertices in V. The entries of A are: { x u,v (uv E) A u,v = 0 (uv E), where { x u,v : uv E } are distinct variables. Claim 2 det A is identically zero if and only if G has no perfect matching. Proof: By the Leibniz formula for determinants, det A = π sign(π) u U A u,π(u), where the sum is over all bijections π : U V and sign(π) is a function taking values in {+1, 1} whose definition is irrelevant for our purposes. The key observation is that a bijection from U to V is simply a pairing { (u, π(u)) : u U } of elements in U and elements in V such that each vertex appears in exactly one pair. This is almost the same as our definition of a perfect matching. If each of those pairs (u, π(u)) were an edge in E, then that pairing would be exactly a perfect matching in G. Conveniently, the monomial u U A u,π(u) tells us exactly when that happens. More precisely, u U A u,π(u) is a non-zero monomial if and only if π corresponds to a perfect matching in G. Furthermore, since all of these monomials involve distinct sets of variables, there can be no cancellations when we add up the monomials. Therefore u U A u,π(u) appears as a monomial in det A (possibly with a minus sign) if and only if π corresponds to a matching in G. This claim yields the following algorithm for testing if a bipartite graph has a perfect matching. Let F be any field of size at least n 2. We can view det A as a polynomial over F since all coefficients of its monomials are +1 and 1. Assign every variable x u,v a random value from F. Compute the numeric determinant det A. If the determinant is zero, say G has no perfect matching. Otherwise, say G has a perfect matching. Since det A has degree n the Schwartz-Zippel lemma implies that the failure probability of this algorithm is at most n/n 2 < 1/n. 5