U.C. Berkeley CS174: Randomized Algorithms                    Lecture Note 12
Professor Luca Trevisan                                       April 29, 2003

Fingerprints, Polynomials, and Pattern Matching
Notes by Alistair Sinclair

1 Example 1: Checking Matrix Multiplication

Somebody gives you some code that purports to multiply matrices. You want to check it on some sample inputs. So you need an algorithm that solves the following problem:

    Input: three $n \times n$ real matrices $A$, $B$, $C$.
    Question: is $C = A \cdot B$?

Of course, you could just multiply $A$ and $B$ and compare the answer with $C$; this would take $O(n^{2.81})$ time in practice, and $O(n^{2.38})$ time even in theory. You'd like to do the checking in much less time than matrix multiplication itself.

Here's a very simple randomized algorithm, due to Freivalds, that runs in only $O(n^2)$ time:

    pick a vector $r \in \{0,1\}^n$ u.a.r.
    if $A(Br) = Cr$ then output "yes" else output "no"

The algorithm runs in $O(n^2)$ time since it does only 3 matrix-vector multiplications. (Check this.)

Why does it work? Well, if $AB = C$ then the output will always be "yes". (Why?) But if $AB \neq C$ then the algorithm will make a mistake if it happens to choose an $r$ for which $ABr = Cr$. This, however, is unlikely:

Claim: If $AB \neq C$ then $\Pr[ABr = Cr] \le \frac{1}{2}$.

We'll prove the claim in a moment. First, note that it implies:

- If $AB = C$ then the algorithm is always correct.
- If $AB \neq C$ then the algorithm makes an error with probability $\le \frac{1}{2}$.

To make the error probability in the second case very small (say, less than $\epsilon$, the probability that an earthquake will strike while we are running the computation), we just run $t = \log_2(1/\epsilon)$ trials and output "yes" unless some trial says "no". (Check this.)

Proof of Claim: Define $D = AB - C$. Our assumption is that $D \neq 0$. We want to show that $\Pr[Dr = 0] \le \frac{1}{2}$ for randomly chosen $r$.

Since $D \neq 0$, it has a non-zero entry, say $d_{ij}$. Now the $i$th entry of $Dr$ is $\sum_k d_{ik} r_k$, and if $Dr = 0$ then this entry must certainly be zero. But this happens iff $r_j = -\left(\sum_{k \neq j} d_{ik} r_k\right)/d_{ij}$; i.e., once the values $r_k$ for $k \neq j$ have been chosen, there is only one value for $r_j$ that would give us zero. But $r_j$ itself is being chosen from the two possible values $\{0,1\}$, so at least one of these choices must fail to give us zero. (Note that the numerical values here are unimportant; it matters only that there are two different choices.)

Hence $\Pr[Dr = 0] \le \frac{1}{2}$, as claimed.

Ex: In the above proof, where did we use the fact that $d_{ij} \neq 0$?

Ex: Suppose that, instead of picking the entries of $r$ from the set $\{0,1\}$, we picked them from a larger set $S$. Show that the failure probability of a single run of the algorithm would then be at most $1/|S|$. By making $S$ larger, we can thus reduce the error probability. Comment on the relative merits in practice of this scheme and the method based on repeated trials mentioned above.
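A minimal sketch of the repeated-trials version in Python with NumPy; the function name and the default trial count are our own illustrative choices, and integer matrices are assumed so that the equality test is exact:

    import numpy as np

    def freivalds_check(A, B, C, trials=20):
        # Repeated-trials Freivalds check. "False" is always correct
        # (a definite witness that AB != C was found); "True" is wrong
        # with probability at most 2**(-trials) when AB != C.
        # Assumes integer matrices, so equality testing is exact.
        n = C.shape[0]
        for _ in range(trials):
            r = np.random.randint(0, 2, size=n)    # r in {0,1}^n, u.a.r.
            # Three O(n^2) matrix-vector products: Br, A(Br), Cr.
            if not np.array_equal(A @ (B @ r), C @ r):
                return False                        # caught: AB != C
        return True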

2 Example 2: Checking Polynomial Identities

A very frequently used operation in computer algebra systems is the following: given two polynomials $Q_1, Q_2$ in multiple variables $x_1, \ldots, x_n$, are $Q_1, Q_2$ identical, i.e., is $Q_1(x_1,\ldots,x_n) = Q_2(x_1,\ldots,x_n)$ at all points $(x_1,\ldots,x_n)$?

How does the system answer such a question? Obviously, if $Q_1, Q_2$ are written out explicitly, the question is trivially answered in linear time just by comparing their coefficients. But in practice they are usually given in very compact form (e.g., as products of factors, or as determinants of matrices), so that we can evaluate them efficiently, but expanding them out and looking at their coefficients is out of the question.

Ex: Consider the polynomial
$$Q(x_1, x_2, \ldots, x_n) = \prod_{i<j:\, i,j \neq 1} (x_i - x_j) \,-\, \prod_{i<j:\, i,j \neq 2} (x_i - x_j) \,+\, \prod_{i<j:\, i,j \neq 3} (x_i - x_j) \,-\, \cdots \,\pm\, \prod_{i<j:\, i,j \neq n} (x_i - x_j).$$
Show that evaluating $Q$ at any given point can be done efficiently, but that expanding out $Q$ to find all its coefficients is computationally infeasible even for moderate values of $n$.

Here is a very simple randomized algorithm, due to Schwartz and Zippel. Testing $Q_1 \equiv Q_2$ is equivalent to testing $Q \equiv 0$, where $Q = Q_1 - Q_2$. So we focus on this problem.

Algorithm:

    pick $r_1, \ldots, r_n$ independently and u.a.r. from a set $S$
    if $Q(r_1, \ldots, r_n) = 0$ then output "yes" else output "no"

This algorithm is clearly efficient, requiring only the evaluation of $Q$ at a single point. Moreover, if $Q \equiv 0$ then it is always correct. (Why?) In the Theorem below, we'll see that if $Q \not\equiv 0$ then the algorithm is incorrect with probability at most $\frac{1}{2}$, provided the set $S$ is large enough. We can then reduce this error probability to $\epsilon$ by repeated trials as in the previous example.
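A minimal sketch of the repeated-trials test in Python; it anticipates the Theorem below by taking $|S| = 2d$, so each trial errs with probability at most $\frac{1}{2}$. The function name and the black-box-evaluator convention for $Q$ are illustrative assumptions:

    import random

    def schwartz_zippel_test(Q, n, d, trials=20):
        # Tests whether the n-variate polynomial Q (given as a black-box
        # evaluator) with degree <= d is identically zero. "False" is
        # always correct (a nonzero evaluation witnesses Q != 0);
        # "True" is wrong with probability <= 2**(-trials) when Q != 0.
        S = range(1, 2 * d + 1)                     # |S| = 2d
        for _ in range(trials):
            r = [random.choice(S) for _ in range(n)]
            if Q(*r) != 0:
                return False                        # witness: Q not identically 0
        return True

    # To compare two compactly represented polynomials Q1 and Q2, apply
    # the test to their difference:
    #     schwartz_zippel_test(lambda *x: Q1(*x) - Q2(*x), n, d)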

Theorem: Suppose $Q(x_1,\ldots,x_n)$ has degree at most $d$.[1] Then if $Q \not\equiv 0$ we have
$$\Pr[Q(r_1,\ldots,r_n) = 0] \le \frac{d}{|S|}.$$

[1] The degree of a polynomial is the maximum degree of any of its terms. The degree of a term is the sum of the exponents of its variables. E.g., the degree of $Q(x_1,x_2) = 2x_1^3 x_2 + 17 x_1 x_2^2 - 7x_1^2 + x_1 x_2 - 1$ is 4.

With this theorem, to get error probability $\le \frac{1}{2}$ we just take $|S| = 2d$. Thus we could take, e.g., $S = \{1, 2, \ldots, 2d\}$.

Proof of Theorem: Note that the theorem is immediate for polynomials in just one variable (i.e., $n = 1$); this follows from the well-known fact that such a polynomial of degree $d$ can have at most $d$ zeros. To prove the theorem for $n$ variables, we use induction on $n$. We have just proved the base case, $n = 1$, so we now assume $n > 1$.

We can write $Q$ as $Q(x_1,\ldots,x_n) = \sum_{i=0}^{k} x_1^i\, Q_i(x_2,\ldots,x_n)$, where $k$ is the largest exponent of $x_1$ in $Q$. So $Q_k(x_2,\ldots,x_n) \not\equiv 0$ by our definition of $k$, and its degree is at most $d - k$. (Why?) Thus by the induction hypothesis we have that
$$\Pr[Q_k(r_2,\ldots,r_n) = 0] \le \frac{d-k}{|S|}.$$

We now consider two cases, depending on the random values $r_2,\ldots,r_n$ chosen by the algorithm.

Case 1: $Q_k(r_2,\ldots,r_n) \neq 0$. In this case the polynomial $Q'(x_1) = \sum_{i=0}^{k} Q_i(r_2,\ldots,r_n)\, x_1^i$ in the single variable $x_1$ has degree $k$ and is not identically zero, so it has at most $k$ zeros. I.e., $Q'(r_1) = 0$ for at most $k$ values of $r_1$. But this means that in this case $\Pr[Q(r_1,\ldots,r_n) = 0] \le \frac{k}{|S|}$. (Why?)

Case 2: $Q_k(r_2,\ldots,r_n) = 0$. In this case we can't say anything useful about $\Pr[Q(r_1,r_2,\ldots,r_n) = 0]$. However, we know from the above discussion that this case happens with probability at most $\frac{d-k}{|S|}$.

Now let $E$ denote the event that $Q_k(r_2,\ldots,r_n) \neq 0$. Putting together cases 1 and 2 we get:
$$\Pr[Q(r_1,\ldots,r_n) = 0] = \Pr[Q(r_1,\ldots,r_n) = 0 \mid E]\,\Pr[E] + \Pr[Q(r_1,\ldots,r_n) = 0 \mid \overline{E}]\,\Pr[\overline{E}] \le \Pr[Q(r_1,\ldots,r_n) = 0 \mid E] + \Pr[\overline{E}] \le \frac{k}{|S|} + \frac{d-k}{|S|} = \frac{d}{|S|}.$$
This completes the proof by induction.

3 Fingerprinting

Both of the above are examples of fingerprinting. Suppose we want to compare two items, $Z_1$ and $Z_2$. Instead of comparing them directly, we compute random fingerprints $\mathrm{fing}(Z_1)$, $\mathrm{fing}(Z_2)$ and compare these. The fingerprint function fing has the following properties:

- If $Z_1 \neq Z_2$ then $\Pr[\mathrm{fing}(Z_1) = \mathrm{fing}(Z_2)]$ is small.
- It is much more efficient to compute and compare fingerprints than to compare $Z_1, Z_2$ directly.

For Freivalds' algorithm, if $A$ is an $n \times n$ matrix then $\mathrm{fing}(A) = Ar$, for $r$ a random vector in $\{0,1\}^n$. For the Schwartz-Zippel algorithm, if $Q$ is a polynomial in $n$ variables then $\mathrm{fing}(Q) = Q(r_1,\ldots,r_n)$, for the $r_i$ chosen randomly from a set $S$ of size $2d$.
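Schematically, both algorithms instantiate one comparison pattern; the sketch below is purely illustrative, with names of our own choosing:

    def fingerprints_agree(fing, Z1, Z2, sample_r):
        # Draw the shared randomness r once, then compare cheap fingerprints.
        # By the first property above, if Z1 != Z2 this returns True only
        # with small probability; by the second, it is cheap to evaluate.
        r = sample_r()
        return fing(Z1, r) == fing(Z2, r)

    # Freivalds:        fing(A, r) = A @ r,   with r uniform in {0,1}^n
    # Schwartz-Zippel:  fing(Q, r) = Q(*r),   with r uniform in S^n, |S| = 2d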

We give two further applications of this idea.

4 Example 3: Comparing Databases

Alice and Bob are far apart. Each has a copy of a database (of $n$ bits), $a$ and $b$ respectively. They want to check the consistency of their copies. Obviously Alice could send $a$ to Bob, and he could compare it to $b$. But this requires transmission of $n$ bits, which for realistic values of $n$ is costly and error-prone.

Instead, suppose Alice first computes a much smaller fingerprint $\mathrm{fing}(a)$ and sends this to Bob. He then computes $\mathrm{fing}(b)$ and compares it with $\mathrm{fing}(a)$. If the fingerprints are equal, he announces that the copies are identical.

What kind of fingerprint function should we use here? Let's view a copy of the database as an $n$-bit binary number, i.e., $a = \sum_{i=0}^{n-1} a_i 2^i$ and $b = \sum_{i=0}^{n-1} b_i 2^i$. Now define $\mathrm{fing}(a) = a \bmod p$, where $p$ is a prime number chosen at random from the range $\{1,\ldots,k\}$, for some suitable $k$. We want $\Pr[\mathrm{fing}(a) = \mathrm{fing}(b)]$ to be small if $a \neq b$.

Suppose $a \neq b$. When is $\mathrm{fing}(a) = \mathrm{fing}(b)$? Well, for this to happen, we need that $a \bmod p = b \bmod p$, i.e., that $p$ divides $d = a - b \neq 0$. But $d$ is an (at most) $n$-bit number, so the size of $d$ is less than $2^n$. This means that at most $n$ different primes can divide $d$. (Why?) So: as long as we make $k$ large enough so that the number of primes in the range $\{1,\ldots,k\}$ is much larger than $n$, we will be in good shape. To ensure this, we need a standard fact from Number Theory:

Prime Number Theorem: Let $\pi(k)$ denote the number of primes less than $k$. Then $\pi(k) \sim \frac{k}{\ln k}$ as $k \to \infty$.

Now all we need to do is set $k = cn \ln(cn)$ for any $c$ we like. By the Prime Number Theorem, with this choice of $k$,
$$\Pr[\mathrm{fing}(a) = \mathrm{fing}(b) \mid a \neq b] \le \frac{n}{\pi(k)} \approx \frac{1}{c}.$$
So, if we take $c = \frac{1}{\epsilon}$ we will achieve an error probability less than $\epsilon$.

Finally, note that Alice only needs to send to Bob the numbers $a \bmod p$ and $p$ (so that Bob knows which fingerprint to compute), both of which are at most $k$. So the number of bits sent by Alice is at most $2\log_2 k = O(\log n)$.

Ex: We did not explain how Alice selects a random prime $p \in \{1,\ldots,k\}$. What she does, in fact, is to generate a random number in $\{1,\ldots,k\}$, test if it is prime, and if not throw it away and try again. (The test for primality can be done efficiently, in time polynomial in $\log n$, using a randomized algorithm which you may have seen in CS170.) Use the Prime Number Theorem to show that the expected number of trials Alice has to perform before she hits a prime is $\approx \ln k = O(\log n)$.
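A minimal sketch of the whole protocol in Python, including the rejection-sampling choice of $p$ described in the exercise above; sympy.isprime stands in for the randomized primality test, and the function names and the default value of $c$ are illustrative assumptions:

    import math
    import random
    from sympy import isprime    # stands in for a randomized primality test

    def random_prime(k):
        # Rejection sampling; by the Prime Number Theorem the expected
        # number of trials is about ln(k).
        while True:
            p = random.randint(2, k)
            if isprime(p):
                return p

    def copies_agree(a_bits, b_bits, c=100):
        # Alice/Bob check: view each n-bit database as a number and compare
        # them mod a random prime p <= k = c*n*ln(c*n). If a != b, the
        # fingerprints collide with probability roughly 1/c. Alice sends
        # only p and a mod p: O(log n) bits in total.
        n = len(a_bits)
        k = max(4, int(c * n * math.log(c * n)))
        p = random_prime(k)
        a = sum(bit << i for i, bit in enumerate(a_bits))   # a = sum a_i 2^i
        b = sum(bit << i for i, bit in enumerate(b_bits))
        return a % p == b % p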

5 Example 4: Randomized Pattern Matching

Consider the classical problem of searching for a pattern $Y$ in a string $X$. I.e., we want to know whether the string $Y = y_1 y_2 \ldots y_m$ occurs as a contiguous substring of $X = x_1 x_2 \ldots x_n$. The naïve approach of trying every possible match takes $O(nm)$ time. (Why?) There is a rather complicated deterministic algorithm that runs in $O(n+m)$ time (which is clearly best possible). A beautifully simple randomized algorithm, due to Karp and Rabin, also runs in $O(n+m)$ time and is based on the same idea as in the above example.

Assume for simplicity that the alphabet is binary. Let $X(j) = x_j x_{j+1} \ldots x_{j+m-1}$ denote the substring of $X$ of length $m$ starting at position $j$.

Algorithm:

    pick a random prime $p$ in the range $\{1,\ldots,k\}$
    for $j = 1$ to $n - m + 1$ do
        if $X(j) \equiv Y \pmod{p}$ then report "match" and stop

The test in the if-statement here is just $\mathrm{fing}(X(j)) = \mathrm{fing}(Y)$, for the same fingerprint function fing as in the Alice and Bob problem (viewing $X(j)$ and $Y$ as $m$-bit numbers).

If the algorithm runs to completion without reporting a match, then $Y$ definitely does not occur in $X$. (Why?) So the only error the algorithm can make is to report a false match. What is the probability that this happens? By the same analysis as above, for each $j$, if $X(j) \neq Y$ then
$$\Pr[\mathrm{fing}(X(j)) = \mathrm{fing}(Y)] \le \frac{m}{\pi(k)}.$$
Therefore, if $Y \neq X(j)$ for all $j$, we have
$$\Pr[\text{algorithm reports a match}] \le \frac{nm}{\pi(k)} \approx \frac{1}{c}$$
if we choose $k = cnm \ln(cnm)$ (exactly as before).

What about the running time? Well, in a naïve implementation each iteration of the loop requires the computation of a fingerprint of an $m$-bit number, which takes $O(m)$ time, giving a total running time of $O(nm)$. However, this can be improved by noticing that
$$\mathrm{fing}(X(j+1)) = 2\left(\mathrm{fing}(X(j)) - 2^{m-1} x_j\right) + x_{j+m} \bmod p.$$
(Check this.) Hence, under the realistic assumption that arithmetic operations on fingerprints (which are small) can be done in constant time, each iteration actually takes only constant time (except the first, which takes $O(m)$ time to compute $\mathrm{fing}(X(1))$ and $\mathrm{fing}(Y)$). The overall running time of the algorithm is therefore $O(n+m)$, as claimed earlier.

Ex: In practice, we would want to eliminate the possibility of false matches entirely. To do this, we could make the algorithm test any match before reporting it. If it is found to be a false match, the algorithm could simply restart with a new random prime $p$. The resulting algorithm never makes an error. Show that its expected running time is at most $\frac{c}{c-1} T$, where $T$ is the running time of the original algorithm, and that the probability that it runs for at least $(\ell+1)T$ time is at most $c^{-\ell}$.
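A minimal Python sketch of the Karp-Rabin algorithm over a binary alphabet, using the rolling-fingerprint update derived above; the function names and the default value of $c$ are illustrative, random_prime is as in the previous sketch, and the code uses 0-based string indices while reporting 1-based positions as in the notes:

    import math
    import random
    from sympy import isprime    # stands in for a randomized primality test

    def random_prime(k):
        while True:
            p = random.randint(2, k)
            if isprime(p):
                return p

    def karp_rabin(X, Y, c=100):
        # X and Y are strings over {'0','1'}. Returns the first (1-based)
        # position j with fing(X(j)) = fing(Y), or None. A reported match
        # is false with probability at most about 1/c.
        n, m = len(X), len(Y)
        if m > n:
            return None
        k = max(4, int(c * n * m * math.log(c * n * m)))   # k = c*n*m*ln(c*n*m)
        p = random_prime(k)
        fy = int(Y, 2) % p                 # fing(Y), O(m) time
        fx = int(X[:m], 2) % p             # fing(X(1)), O(m) time
        top = pow(2, m - 1, p)             # 2^(m-1) mod p
        for j in range(n - m + 1):
            if fx == fy:
                return j + 1               # report (possible) match
            if j + m < n:
                # Rolling update, O(1) per step:
                # fing(X(j+1)) = 2*(fing(X(j)) - 2^(m-1)*x_j) + x_{j+m} (mod p)
                fx = (2 * (fx - top * int(X[j])) + int(X[j + m])) % p
        return None

Per the last exercise, false matches could be eliminated entirely by verifying X[j-1 : j-1+m] == Y before returning and restarting with a fresh prime on a false match.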