Lecture 15: Expanders


CS 710: Complexity Theory, 10/7/2011
Instructor: Dieter van Melkebeek
Scribe: Li-Hsiang Kuo

In the last lecture we introduced randomized computation in terms of machines that have access to a source of random bits and that return correct answers somewhat more than half of the time. We showed that BPP has polynomial-size circuits, and the conjecture in the community is that BPP = P. Today we introduce expanders, a type of graph that is useful for improving (amplifying) randomized algorithms with little or no overhead in additional random bits. First we prove a theorem relating BPP to the polynomial-time hierarchy.

1 BPP and the Polynomial Time Hierarchy

Although we do not know whether BPP = P, or even whether BPP ⊆ NP, we do know that BPP ⊆ PH:

Theorem 1. $\mathrm{BPP} \subseteq \Sigma_2^p \cap \Pi_2^p$.

Proof. Fix a randomized polynomial-time machine M that accepts L ∈ BPP, and let r be the number of random bits M uses when running on an input x of size n = |x|. We will show that $\mathrm{BPP} \subseteq \Sigma_2^p$. Since BPP = coBPP, it then follows that $\mathrm{BPP} \subseteq \mathrm{co}\Sigma_2^p = \Pi_2^p$, completing the proof.

We need to remove randomness from the computation. For a given input x, the space $\{0,1\}^r$ of r-bit strings gets partitioned into two pieces: Acc(x), the set of random strings on which M accepts x, and Rej(x), the set of strings on which M rejects x. If the error rate $\varepsilon$ of M is small, then Acc(x) will be much larger than Rej(x) when x ∈ L, and much smaller than Rej(x) when x ∉ L. See Figure 1.

If our randomized computation has a small error rate, then it is correct on most random bit strings. We will turn "most" into "all". The idea is that when x ∈ L, a few shifts of Acc(x) will cover the whole space $\{0,1\}^r$ of r-bit random strings, while the same few shifts of Acc(x) will fail to cover the whole space when x ∉ L. For $S \subseteq \{0,1\}^r$ and $\sigma \in \{0,1\}^r$, let $S \oplus \sigma = \{s \oplus \sigma \mid s \in S\}$ denote the shift of S by σ. Since shifting is invertible, $|S \oplus \sigma| = |S|$.

We will use shifts to build a $\Sigma_2$-predicate following the intuition above. Namely, consider

$$x \in L \iff (\exists \sigma_1, \ldots, \sigma_t \in \{0,1\}^r)(\forall \rho \in \{0,1\}^r)\left[\rho \in \bigcup_{i=1}^{t} \mathrm{Acc}(x) \oplus \sigma_i\right]. \tag{1}$$

Now, r is polynomial in n, and so if we can pick t polynomial in n as well, then the above is a $\Sigma_2^p$-predicate, provided that we can verify the membership of ρ in $\bigcup_{i=1}^{t} \mathrm{Acc}(x) \oplus \sigma_i$ in time polynomial in n. The membership check is no problem, since we can equivalently check that $\rho \oplus \sigma_i \in \mathrm{Acc}(x)$ for some i, which takes polynomial time: it corresponds to running M on x with the random bit string $\rho \oplus \sigma_i$, for polynomially many $\sigma_i$.

We show that we can pick t a suitable polynomial making (1) true by showing that we can choose the $\sigma_i$'s randomly with a high rate of success. There are two cases to consider:

Figure 1: If the error rate ε of M is small, then Acc(x) is much larger than Rej(x) when x ∈ L, and much smaller than Rej(x) when x ∉ L.

1. x ∈ L: For $A \subseteq \{0,1\}^r$, define $\mu(A) = |A|/2^r$. For fixed ρ,

$$\Pr_{\sigma_1, \ldots, \sigma_t}\left[\rho \notin \bigcup_{i=1}^{t} \mathrm{Acc}(x) \oplus \sigma_i\right] = \left(\mu(\mathrm{Rej}(x))\right)^t \le \varepsilon^t,$$

where μ(Rej(x)) is the probability that M rejects when it should accept, and is hence at most ε. So, by a union bound,

$$\Pr\left[\{0,1\}^r \not\subseteq \bigcup_{i=1}^{t} \mathrm{Acc}(x) \oplus \sigma_i\right] \le |\{0,1\}^r| \cdot \varepsilon^t = 2^r \varepsilon^t,$$

and hence if $2^r \varepsilon^t < 1$ then some choice of the $\sigma_i$'s must work.

2. x ∉ L: We need that, for any choice of the $\sigma_i$'s, some ρ is covered by none of the shifts, i.e., no $\rho \oplus \sigma_i$ makes M accept x. But

$$\mu\left(\bigcup_{i=1}^{t} \mathrm{Acc}(x) \oplus \sigma_i\right) \le \sum_{i=1}^{t} \mu(\mathrm{Acc}(x) \oplus \sigma_i) = t\,\mu(\mathrm{Acc}(x)) \le t\varepsilon,$$

since $\mu(\mathrm{Acc}(x)) \le \varepsilon$ when $x \notin L$, and hence it suffices to have $t\varepsilon < 1$.

So we need to choose t so that both $t\varepsilon$ and $2^r \varepsilon^t$ are less than 1. Taking t = r works, provided that ε < 1/r. From the definition of BPP we only know that ε ≤ 1/3, but, using the majority-vote trick introduced in the last lecture, we can in k runs reduce the error to $e^{-2k\delta^2}$, where $\delta = 1/2 - \varepsilon > 0$. Since k runs need kr random bits, it suffices to choose k polynomial and large enough that $e^{-2k\delta^2} < 1/(rk)$. In fact, k = O(log r) works, and so an examination of the above shows that the time complexity is O(T log T) if the original running time was T.
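To see the two cases play out, here is a small numerical sanity check (an illustration added to these notes, not part of the lecture; the parameters r = 12, t = r, and ε just below 1/r are toy choices):

```python
# Toy simulation of the shift-covering argument from the proof above.
# All parameters and sets are illustrative assumptions, not from the lecture.
import random

r = 12                      # number of random bits
t = r                       # number of shifts, t = r as in the proof
eps = 1.0 / (r + 1)         # error rate just below 1/r
space = range(2 ** r)
sigmas = [random.randrange(2 ** r) for _ in range(t)]  # random shifts

# Case x in L: Acc(x) is a random subset of {0,1}^r of density ~ 1 - eps.
# The t shifted copies of Acc(x) should cover all of {0,1}^r
# (shifting a set by sigma = XORing each element with sigma).
acc = {rho for rho in space if random.random() > eps}
covered = all(any((rho ^ s) in acc for s in sigmas) for rho in space)
print("x in L, all of {0,1}^r covered:", covered)       # True w.h.p.

# Case x not in L: Acc(x) has density ~ eps, so the union of t shifts
# has measure at most t * eps < 1 and some rho must stay uncovered.
acc = {rho for rho in space if random.random() < eps}
covered = all(any((rho ^ s) in acc for s in sigmas) for rho in space)
print("x not in L, all covered:", covered)              # False
```

On typical runs the first check prints True and the second prints False, matching the calculations $2^r \varepsilon^t < 1$ and $t\varepsilon < 1$ above.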

2 Expanders

The proof of Theorem 1 used the majority-vote trick to get an exponential-in-k increase in accuracy from a linear-in-k increase in random bit usage. Expander graphs, which we introduce here, lead to an amplification technique that gives an exponential-in-k accuracy improvement using only a constant number of additional random bits per repetition (r + O(k) versus rk random bits). Expanders have many other uses, some of which we mention in this lecture, and others we may see in later lectures.

2.1 Definitions

Here we introduce two definitions of expander graphs, namely the combinatorial definition and the algebraic definition.

Definition 1 (Combinatorial definition). A graph G = (V, E) is a (k, c)-expander if every $S \subseteq V$ with $|S| \le k$ satisfies $|\Gamma(S)| \ge c|S|$, where $\Gamma(S) = \{v \in V \mid (\exists s \in S)\, (s, v) \in E\}$ is the neighborhood of S in G.

Notice that it is easy to construct a graph that is (k, c)-expanding for all k with c ≤ 1. Since Γ(V) = V, one can see that it is important not to let S be too large; in particular, for k = |V| and c > 1, no graph is (k, c)-expanding. A typical setting is c a constant greater than 1 and k = N/2, where N = |V|.

Definition 2 (Algebraic definition). Given a regular graph G = (V, E) of degree d, its N × N normalized adjacency matrix A is defined by

$$A_{ij} = \begin{cases} \frac{1}{d} & \text{if } (i, j) \in E, \\ 0 & \text{otherwise.} \end{cases}$$

The normalized adjacency matrix describes the Markov chain of a random walk on G: $A_{ij} = \Pr[\text{go to state } i \text{ from state } j]$, and if p is a column vector with $p_i$ the probability of being at vertex i, then Ap is the distribution after one step of the random walk on G:

$$(Ap)_i = \sum_{j=1}^{N} A_{ij}\, p_j.$$
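As a concrete illustration of Definition 2 (a minimal sketch, not from the lecture; the 3-regular graph on 8 vertices is an arbitrary example), the following code builds a normalized adjacency matrix and takes one step of the random walk:

```python
# Normalized adjacency matrix of a small 3-regular graph: an 8-cycle
# with "diameter" chords i -- i + N/2. The example graph is an
# assumption for illustration only.
import numpy as np

N, d = 8, 3
A = np.zeros((N, N))
for i in range(N):
    for j in (i - 1, i + 1, i + N // 2):   # the three neighbors of i
        A[i, j % N] = 1.0 / d

p = np.zeros(N)
p[0] = 1.0                 # random walk starts at vertex 0
q = A @ p                  # distribution after one step
print(q)                   # 1/3 on each of the three neighbors of 0
print(q.sum())             # columns of A sum to 1, so q is a distribution
```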

2.2 Properties of the normalized adjacency matrix of a regular graph

The normalized adjacency matrix has a number of properties that can be proved using basic linear algebra. We will use the following properties, but we do not prove them here.

Proposition 1. Let A be the normalized adjacency matrix of a regular graph G on N vertices.

1. A is real and symmetric: all eigenvalues of A are real, and there exists a full orthonormal basis of eigenvectors.

2. All eigenvalues λ of A satisfy |λ| ≤ 1.

3. 1 is an eigenvalue of A, with corresponding eigenvector u = (1/N, 1/N, ..., 1/N). The multiplicity of 1 as an eigenvalue equals the number of connected components of G.

4. −1 is an eigenvalue of A iff G is bipartite.

As the uniform distribution is always an eigenvector with eigenvalue 1, we look only at the remaining eigenvectors/eigenvalues. It turns out that the second-largest eigenvalue in absolute value is often useful to work with.

Definition 3. $\lambda(G) = \max\{|\lambda| \mid \lambda \text{ is an eigenvalue of } A \text{ with eigenvector } e_\lambda \text{ orthogonal to } u\}$.

From Proposition 1 we have λ(G) ≤ 1, and λ(G) < 1 iff G is connected and not bipartite. So, for such graphs, the uniform distribution u is the only fixed point of the random walk. The following theorem tells us that the larger the gap between 1 and λ(G), the faster an arbitrary probability distribution converges to the uniform distribution via a random walk on G.

Theorem 2. For any probability distribution vector p,

$$\|A^t p - u\|_1 \le \sqrt{N}\,\lambda^t,$$

where $\|v\|_q = \left[\sum_i |v_i|^q\right]^{1/q}$ is the q-norm of v and λ = λ(G).

One way to understand this theorem: given an initial distribution p for a random walk, after t steps the distribution $A^t p$ is exponentially close to the uniform distribution u, provided λ < 1, that is, provided G is connected and not bipartite.

Proof. Since Au = u, we have $A^t p - u = A^t(p - u)$. Now, $(p - u) \perp u$ because

$$\langle p - u, u \rangle = \langle p, u \rangle - \langle u, u \rangle = \sum_i p_i/N - \sum_i 1/N^2 = 1/N - 1/N = 0,$$

which follows since p is a probability distribution. Write $p - u = \sum_i a_i e_i$ for some real $a_i$, where the $\{e_i\}$ form an orthonormal basis of eigenvectors of A with $A e_i = \lambda_i e_i$. Since the u-component of p − u is 0, every eigenvector appearing in this sum has $|\lambda_i| \le \lambda$, and we get

$$\|A^t(p - u)\|_2 = \left\|A^t \sum_i a_i e_i\right\|_2 = \left\|\sum_i \lambda_i^t a_i e_i\right\|_2 = \left[\sum_i \lambda_i^{2t} a_i^2\right]^{1/2} \le \lambda^t \left[\sum_i a_i^2\right]^{1/2} = \lambda^t \|p - u\|_2.$$

Now

$$\|p - u\|_2^2 + \|u\|_2^2 = \|p\|_2^2 \le \|p\|_1^2 = 1,$$

where $(p - u) \perp u$ implies the first equality, and $\|v\|_2 \le \|v\|_1$ for all vectors v implies the inequality; in particular $\|p - u\|_2 \le 1$. By the Cauchy–Schwarz inequality,

$$\|A^t(p - u)\|_1 = \langle (1 - 2\sigma_1, \ldots, 1 - 2\sigma_N),\, A^t(p - u) \rangle \le \|(1 - 2\sigma_1, \ldots, 1 - 2\sigma_N)\|_2 \, \|A^t(p - u)\|_2 = \sqrt{N}\, \|A^t(p - u)\|_2,$$

where $\sigma_k = 0$ if $(A^t(p - u))_k \ge 0$ and $\sigma_k = 1$ otherwise. Combining all of the above, we get

$$\|A^t(p - u)\|_1 \le \sqrt{N}\, \|A^t(p - u)\|_2 \le \sqrt{N}\, \lambda^t \|p - u\|_2 \le \sqrt{N}\, \lambda^t,$$

completing the proof.
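The bound of Theorem 2 is easy to check numerically. The following sketch (again illustration only, reusing the toy 3-regular graph from the previous sketch) computes λ(G) from the spectrum of A and verifies $\|A^t p - u\|_1 \le \sqrt{N}\,\lambda^t$ for the first few steps:

```python
# Numerical check of Theorem 2 on the toy 3-regular graph from above.
import numpy as np

N, d = 8, 3
A = np.zeros((N, N))
for i in range(N):
    for j in (i - 1, i + 1, i + N // 2):
        A[i, j % N] = 1.0 / d

eigs = np.linalg.eigvalsh(A)        # A is symmetric, so eigenvalues are real
lam = sorted(abs(eigs))[-2]         # second-largest absolute eigenvalue
u = np.full(N, 1.0 / N)             # uniform distribution
p = np.zeros(N)
p[0] = 1.0                          # start the walk at vertex 0

for t in range(1, 11):
    p = A @ p                       # one more step of the walk
    lhs = np.abs(p - u).sum()       # ||A^t p - u||_1
    rhs = np.sqrt(N) * lam ** t     # sqrt(N) * lambda^t
    assert lhs <= rhs + 1e-12
    print(t, lhs, "<=", rhs)
```

This graph is connected and not bipartite, so λ(G) ≈ 0.80 < 1 and the walk indeed converges to u at the rate the theorem predicts.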

Theorem 2 can be used to prove that the random-walk algorithm for the undirected path problem needs only polynomially many steps, i.e., that PATH ∈ BPL. The proof uses Theorem 2 and the following exercise.

Exercise 1. If G is connected and not bipartite then $\lambda(G) \le 1 - \frac{1}{dN^2}$.

3 Next Time

In the next lecture we will prove the expander mixing lemma.

Lemma 1 (Expander Mixing Lemma). For every pair of subsets S, T ⊆ V,

$$\left|\frac{|E(S, T)|}{dN} - \mu(S)\mu(T)\right| \le \lambda \sqrt{\mu(S)(1 - \mu(S))\,\mu(T)(1 - \mu(T))}.$$

This lemma can be used in the following applications:

1. Deterministic error reduction: Given a randomized algorithm R that uses r random bits, we reduce the error below an arbitrary ε without using any additional random bits. This requires running R poly(1/ε) times.

2. Randomness-efficient error reduction: With R and r as above, we reduce the error below an arbitrary ε using only O(log(1/ε)) additional random bits (r + O(log(1/ε)) in total), running R O(log(1/ε)) times.

We will discuss these applications and prove their correctness in the next lecture.

Acknowledgements

In writing the notes for this lecture, I perused the notes by Beth Skubak and Amanda Hittson for lectures 12 and 13 from the Spring 2010 offering of CS 710, and the notes by Nathan Collins for lecture 12 from the Spring 2007 offering of CS 810.