U.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018


Lecture 3

In which we show how to find a planted clique in a random graph.

1 Finding a Planted Clique

We will analyze the following algorithm:

    Input: graph $G$, desired clique size $k$
    Let $A$ be the adjacency matrix of $G$ and define $M := A - \frac{1}{2} J$
    Let $x$ be an eigenvector of the largest eigenvalue of $M$
    Let $I$ be the set of $k$ vertices $v$ that have the largest values of $|x_v|$
    Let $C$ be the set of vertices $v \in V$ that have at least $\frac{3}{4} k$ neighbors in $I$
    Return $C$

We will show that if $k \gg \sqrt{n}$, for example $k > 100 \sqrt{n}$, then with high probability the above algorithm finds the planted clique. Here, "high probability" means one minus an error term that goes to zero faster than the inverse of any polynomial.

2 The eigenvector of the largest eigenvalue of $A - \frac{1}{2} J$

An important theme of this lecture is that when we construct a random object using independent choices, various quantities related to the object concentrate around their averages. Before we see what we mean by this, let us study the expectation of $A$, because, whenever we have a random object, it is always helpful to start by understanding its average.

Let us fix the set $S$ which is the planted clique of size $k$ in our distribution. Then the matrix $\mathbb{E} A$ has zeroes on the diagonal, and its off-diagonal entry $(u,v)$ is $1$ if $u$ and $v$ are both in $S$ and $\frac{1}{2}$ otherwise. So we can write

$$ \mathbb{E} A = \frac{1}{2} J + \frac{1}{2} {\bf 1}_S {\bf 1}_S^T - D $$

where $J$ is the matrix that is one everywhere, ${\bf 1}_S$ is the indicator vector of $S$, whose $v$-th coordinate is $1$ if $v \in S$ and is $0$ otherwise, and $D$ is a diagonal matrix such that $D_{v,v}$ is equal to $1$ if $v \in S$ and equal to $\frac{1}{2}$ otherwise.
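Before continuing with the analysis, here is a purely illustrative numpy sketch of the planted-clique distribution and of the algorithm above; the helper names sample_planted_clique and find_planted_clique are ours and not part of the original notes.

import numpy as np

def sample_planted_clique(n, k, seed=0):
    # Sample G(n, 1/2) and plant a clique on a random set S of k vertices.
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 2, size=(n, n))
    A = np.triu(A, 1)
    A = A + A.T                                # symmetric adjacency matrix, zero diagonal
    S = rng.choice(n, size=k, replace=False)
    A[np.ix_(S, S)] = 1                        # make S a clique
    np.fill_diagonal(A, 0)
    return A, set(S.tolist())

def find_planted_clique(A, k):
    # The algorithm above: top eigenvector of A - J/2, then the cleanup step.
    n = A.shape[0]
    M = A - 0.5 * np.ones((n, n))
    eigenvalues, eigenvectors = np.linalg.eigh(M)   # eigenvalues in ascending order
    x = eigenvectors[:, -1]                         # eigenvector of the largest eigenvalue
    I = np.argsort(-np.abs(x))[:k]                  # k vertices with largest |x_v|
    neighbors_in_I = A[:, I].sum(axis=1)            # for each vertex, its number of neighbors in I
    C = [v for v in range(n) if neighbors_in_I[v] >= 0.75 * k]
    return set(C)

A, S = sample_planted_clique(n=2000, k=400)
print(find_planted_clique(A, k=400) == S)           # typically True at these sizes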

Let us ignore $D$ for a moment. If we had the matrix $\frac{1}{2} J + \frac{1}{2} {\bf 1}_S {\bf 1}_S^T$, we could subtract $\frac{1}{2} J$ from it and we would be left with the rank-one matrix $\frac{1}{2} {\bf 1}_S {\bf 1}_S^T$, whose only non-trivial eigenvector is the indicator of $S$. Of course we do not have such a matrix, but we have the matrix $A$, whose average (up to the diagonal correction) is $\frac{1}{2} J + \frac{1}{2} {\bf 1}_S {\bf 1}_S^T$. So what happens if we pretend that $A$ and $\mathbb{E} A$ are the same, and we compute the eigenvector of the largest eigenvalue of $A - \frac{1}{2} J$? It turns out that we find a vector that is close to ${\bf 1}_S$! We will now see why.

First, we need to establish that $A$ and $\mathbb{E} A$ are close to each other in an appropriate sense. We will measure closeness in the operator norm, which, for symmetric matrices, is the same as the spectral norm. Recall that if $M$ is symmetric, and $\lambda_1, \ldots, \lambda_n$ are its eigenvalues, then the spectral norm of $M$ is

$$ \| M \|_{\rm spectral} := \max_{i=1,\ldots,n} | \lambda_i | $$

its operator norm is

$$ \| M \|_{\rm op} := \max_{x \neq 0} \frac{\| M x \|}{\| x \|} $$

and the two norms are the same. We will denote the norm simply by $\| M \|$. Recall that this is indeed a norm and, in particular, we have $\| a M \| = |a| \cdot \| M \|$ for every real number $a$, and we have the triangle inequality $\| M + M' \| \leq \| M \| + \| M' \|$ for every two symmetric matrices $M$ and $M'$.

We have the following concentration result for the adjacency matrix of a graph coming from the $k$-planted clique problem.

Lemma 1 Fix a set $S$ of size $k$. With probability at least $1 - \frac{1}{n^{\log n}}$ over the choice of a graph $G$ from the $k$-planted clique distribution conditioned on $S$ being the clique, if $A$ is the adjacency matrix of $G$, we have

$$ \| A - \mathbb{E} A \| \leq (1 + o(1)) \sqrt{n} $$

where $n$ is the number of vertices of $G$.

Note that the matrix $D$ described above satisfies $\| D \| = 1$, and so whenever the conclusion of the above lemma holds we also have

$$ \left\| A - \frac{1}{2} J - \frac{1}{2} {\bf 1}_S {\bf 1}_S^T \right\| \leq \| A - \mathbb{E} A \| + \| D \| \leq (1 + o(1)) \sqrt{n} $$

which means that the matrix $A - \frac{1}{2} J$, which we know, and the matrix $\frac{1}{2} {\bf 1}_S {\bf 1}_S^T$, whose non-trivial eigenvector encodes the set we are looking for, are very close.

Now we would like to say that if two matrices are close in spectral norm, then the eigenvectors of their largest eigenvalues are also close. The Davis-Kahan theorem says exactly that.
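Before stating it, here is a quick numerical illustration of the concentration in Lemma 1; the sampling is inlined and the function name is ours, a sketch rather than anything from the original notes.

import numpy as np

def concentration_check(n=2000, k=200, seed=0):
    # Sample a graph from the k-planted clique distribution conditioned on S.
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 2, size=(n, n))
    A = np.triu(A, 1)
    A = A + A.T
    S = rng.choice(n, size=k, replace=False)
    A[np.ix_(S, S)] = 1
    np.fill_diagonal(A, 0)
    # E A = (1/2) J + (1/2) 1_S 1_S^T - D: entries 1 on S x S, 1/2 elsewhere, 0 on the diagonal.
    EA = np.full((n, n), 0.5)
    EA[np.ix_(S, S)] = 1.0
    np.fill_diagonal(EA, 0.0)
    gap = np.linalg.norm(A - EA, ord=2)     # spectral norm (largest singular value)
    return gap, np.sqrt(n)

print(concentration_check())                # the first number should be about the second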

For our purposes, it is sufficient to work with a special case that is easier to state.

Theorem 2 (Davis and Kahan) If $M$ is a symmetric matrix, $yy^T$ is a rank-one symmetric matrix, and $x$ is an eigenvector of the largest eigenvalue of $M$, we have

$$ \sin ( \widehat{xy} ) \leq \frac{\| M - yy^T \|}{\| yy^T \| - \| M - yy^T \|} $$

where $\widehat{xy}$ is the angle between the vectors $x$ and $y$.

That is, if the distance between $M$ and $yy^T$ is small compared to the size of $yy^T$, then the eigenvector of the largest eigenvalue of $M$ is almost aligned with $y$.

Corollary 3 If $\| (A - J/2) - {\bf 1}_S {\bf 1}_S^T / 2 \| \leq (1 + o(1)) \sqrt{n}$, $k > 100 \sqrt{n}$, and $x$ is an eigenvector of the largest eigenvalue of $A - J/2$ scaled so that $\| x \|^2 = k$, then

$$ \min \{ \| x - {\bf 1}_S \|^2 , \| x + {\bf 1}_S \|^2 \} \leq 0.002 k $$

for sufficiently large $n$.

Proof: We apply the Davis-Kahan theorem and we see that

$$ \sin ( \widehat{x {\bf 1}_S} ) \leq \frac{(1 + o(1)) \sqrt{n}}{\frac{k}{2} - (1 + o(1)) \sqrt{n}} \leq \frac{1}{49} + o(1) $$

Now we have to reason about the distance between $x$ and ${\bf 1}_S$. Recall that both vectors have length-squared equal to $k$, so we have

$$ \| x - {\bf 1}_S \|^2 = \| x \|^2 + \| {\bf 1}_S \|^2 - 2 \langle x , {\bf 1}_S \rangle = 2k - 2 \langle x , {\bf 1}_S \rangle $$

We also have

$$ \| x + {\bf 1}_S \|^2 = 2k + 2 \langle x , {\bf 1}_S \rangle $$

So

$$ \min \{ \| x - {\bf 1}_S \|^2 , \| x + {\bf 1}_S \|^2 \} = 2k - 2 | \langle x , {\bf 1}_S \rangle | $$

We also have

$$ | \langle x , {\bf 1}_S \rangle | = \| x \| \cdot \| {\bf 1}_S \| \cdot | \cos ( \widehat{x {\bf 1}_S} ) | = k \cdot | \cos ( \widehat{x {\bf 1}_S} ) | \geq k \sqrt{1 - \sin^2 ( \widehat{x {\bf 1}_S} )} \geq k \sqrt{1 - \frac{1}{49^2} - o(1)} > 0.999 k $$

for sufficiently large $n$.
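Corollary 3 can also be checked numerically. In the illustrative sketch below (the function name is ours), the returned distance should be a small fraction of $k$ when $k$ is sufficiently larger than $\sqrt{n}$; the second returned value is just the reference threshold from the corollary, which is only guaranteed when $k > 100 \sqrt{n}$.

import numpy as np

def corollary3_check(A, S):
    # A: adjacency matrix (numpy array), S: the planted clique as a set of vertices.
    n, k = A.shape[0], len(S)
    M = A - 0.5 * np.ones((n, n))
    x = np.linalg.eigh(M)[1][:, -1]            # eigenvector of the largest eigenvalue
    x = x * np.sqrt(k) / np.linalg.norm(x)     # rescale so that ||x||^2 = k
    one_S = np.zeros(n)
    one_S[list(S)] = 1.0                       # the indicator vector of S
    dist = min(np.sum((x - one_S) ** 2), np.sum((x + one_S) ** 2))
    return dist, 0.002 * k                     # 0.002k is the bound guaranteed when k > 100*sqrt(n)

# e.g., with the sampler sketched in Section 1: corollary3_check(*sample_planted_clique(2000, 400))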

3 Cleaning up

The previous section analyzes what happens in the first two lines of the algorithm: with high probability we have a vector $x$ such that either $x$ or $-x$ is close to ${\bf 1}_S$. How do we use such a vector to find $S$?

The first observation, which analyzes the third line of the algorithm, is that, if we define the set $I$ to be the $k$ vertices $v$ for which $|x_v|$ is largest, then the set $I$ and the set $S$ are almost the same.

Lemma 4 If $x$ is a vector such that $\| x \|^2 = |S| = k$ and

$$ \min \{ \| x - {\bf 1}_S \|^2 , \| x + {\bf 1}_S \|^2 \} \leq \epsilon k $$

and if $I$ is the set of $k$ vertices $v$ with largest $|x_v|$, then $I$ and $S$ have at least $k (1 - 4 \epsilon)$ vertices in common.

Proof: Let us call a vertex $v$ bad if it is in $S$ and we have $|x_v| \leq \frac{1}{2}$, or if it is not in $S$ and we have $|x_v| \geq \frac{1}{2}$. Let $B$ be the set of bad vertices.

The first observation is that $|B| \leq 4 \epsilon k$, because each vertex in $B$ contributes at least $1/4$ both to $\| x - {\bf 1}_S \|^2$ and to $\| x + {\bf 1}_S \|^2$.

The second observation is that $I$ and $S$ must have at least $k - |B|$ vertices in common. To see why, consider two cases. Let $t$ be the smallest value of $|x_v|$ among the vertices $v$ in $I$; that is, if we sorted the vertices of $G$ in decreasing order of $|x_v|$, then $t$ would be the value that we would see in the $k$-th position of the sorted order. If $t > 1/2$, then all the vertices $v \not\in S$ that are included in $I$ have $|x_v| > 1/2$, and so are bad vertices, which means that $I$ contains at most $|B|$ vertices not in $S$, and hence contains at least $k - |B|$ vertices from $S$. If $t \leq 1/2$, then every vertex of $S$ that we do not include in $I$ has $|x_v| \leq 1/2$, and so it is a bad vertex. This means that $I$ contains at least $k - |B|$ of the $k$ vertices of $S$.

We now understand what happens after the first three lines of our algorithm are executed: with high probability, the set $I$ contains at least $(1 - 4 \cdot 0.002) k = .992 k$ of the $k$ elements of $S$.

In conclusion, we need to understand what happens in the fourth line. Note that, if the above happens, then $C$ will include all the elements of $S$, since every element of $S$ has at least $.992 k - 1 > \frac{3}{4} k$ neighbors in $I$. The last thing to prove is that, with high probability, $C$ will not contain any vertex outside of $S$. We will use the following result, which, again, relates the probable value of a random variable to its average.

Theorem 5 (Chernoff bounds) If $X_1, \ldots, X_k$ are independent random variables taking value either $0$ or $1$, and $X := \sum_{i=1}^k X_i$, then, for every $0 < \epsilon < 1$,

$$ \mathbb{P} [ X - \mathbb{E} X > \epsilon k ] \leq e^{-2 \epsilon^2 k} $$

$$ \mathbb{P} [ X - \mathbb{E} X < - \epsilon k ] \leq e^{-2 \epsilon^2 k} $$
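The form of the Chernoff bound we will use below (with $\epsilon = 0.1$) is easy to sanity-check by simulation; the sketch below is illustrative and the function name is ours. It estimates $\mathbb{P}[X - \mathbb{E} X > \epsilon k]$ for a sum of $k$ fair coin flips and compares it to $e^{-2 \epsilon^2 k}$.

import numpy as np

def chernoff_check(k=400, trials=200000, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # X is a sum of k independent 0/1 variables of mean 1/2, i.e. Binomial(k, 1/2).
    X = rng.binomial(k, 0.5, size=trials)
    empirical = np.mean(X - k / 2 > eps * k)   # estimated P[X - E X > eps*k]
    bound = np.exp(-2 * eps ** 2 * k)          # the Chernoff bound e^{-2 eps^2 k}
    return empirical, bound

print(chernoff_check())    # the empirical frequency should be below the bound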

Now, let us go back to our random graph: if $v \not\in S$, then the number of neighbors of $v$ in $S$ is a random variable of average $k/2$ that can be written as a sum of $k$ independent 0/1 random variables, each of average $1/2$. This means that each vertex $v \not\in S$ has a very low probability of having much more than $k/2$ neighbors in $S$.

Corollary 6 Fix a vertex $v \not\in S$. With probability at least $1 - e^{-.02 k}$ over the choice of $G$, the vertex $v$ has at most $.6 k$ neighbors in $S$.

Proof: Apply the Chernoff bound with $\epsilon = 0.1$.

Now we apply the union bound to argue that the same happens for all vertices not in $S$.

Corollary 7 With probability at least $1 - (n - k) \cdot e^{-.02 k}$ over the choice of $G$, every vertex $v \not\in S$ has at most $.6 k$ neighbors in $S$.

Proof: Consider the negation of the event that we are interested in, that is, the event that there is a vertex $v \not\in S$ with more than $.6 k$ neighbors in $S$. This is the OR of $n - k$ events, each of them having probability at most $e^{-.02 k}$, and so its probability is at most $(n - k) \cdot e^{-.02 k}$.

So, with probability $1 - e^{-\Omega(\sqrt{n})}$, each vertex $v \not\in S$ has at most $.6 k$ neighbors in $S$. But the set $I$ contains a subset of $S$ plus at most $.008 k$ other vertices, so each vertex $v \not\in S$ has at most $.608 k$ neighbors in $I$, which is less than $\frac{3}{4} k$. This means that $C$ will (with high probability) not include any vertex $v \not\in S$.
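To close, the union-bound step of Corollary 7 can also be observed numerically. In the illustrative sketch below (name and parameters ours), the largest number of neighbors in $S$ over all vertices outside $S$ should be below $.6 k$; the last returned value is the failure probability bound $(n - k) e^{-.02 k}$ from Corollary 7.

import numpy as np

def corollary7_check(n=2000, k=600, seed=0):
    rng = np.random.default_rng(seed)
    # Sample a planted-clique instance: G(n, 1/2) plus a clique on a random k-set S.
    A = rng.integers(0, 2, size=(n, n))
    A = np.triu(A, 1)
    A = A + A.T
    S = rng.choice(n, size=k, replace=False)
    A[np.ix_(S, S)] = 1
    np.fill_diagonal(A, 0)
    outside = np.setdiff1d(np.arange(n), S)
    # For every vertex outside S, count its neighbors inside S.
    neighbors_in_S = A[np.ix_(outside, S)].sum(axis=1)
    union_bound = (n - k) * np.exp(-0.02 * k)   # failure probability bound from Corollary 7
    return neighbors_in_S.max(), 0.6 * k, union_bound

print(corollary7_check())   # the first value should be below the second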