On Finding Planted Cliques in Random Graphs

Transcription:

On Finding Planted Cliques in Random Graphs. GraphEx 2014. Lev Reyzin, UIC. Joint work with Feldman, Grigorescu, Vempala, and Xiao, in STOC '13.

Erdős-Rényi Random Graphs. G(n,p) generates a graph G on n vertices by including each possible edge independently with probability p. (Example: G(4, 0.5), shown over three slides.)
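To make the model concrete, here is a minimal Python sketch of sampling G(n,p); the function name sample_gnp and the adjacency-dict representation are my own illustration choices, not anything from the talk.

    import itertools
    import random

    def sample_gnp(n, p, rng=random):
        """Sample an Erdos-Renyi graph G(n, p) as a dict of adjacency sets."""
        adj = {v: set() for v in range(n)}
        for u, v in itertools.combinations(range(n), 2):
            if rng.random() < p:  # each possible edge appears independently w.p. p
                adj[u].add(v)
                adj[v].add(u)
        return adj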

Typical Examples. [Figure: three sample graphs with n = 100 and p = 0.01, p = 0.1, p = 0.5. Created using software by Christopher Manning, available at http://bl.ocks.org/christophermanning/4187201]

Erdős-Rényi Random Graphs. E-R random graphs are an interesting object of study in combinatorics. When does G have a giant component? When np = c > 1. When is G connected? There is a sharp connectivity threshold at p = ln(n)/n. How large is the largest clique in G? For p = ½, the largest clique has size k(n) ≈ 2 lg₂(n).

w.h.p. for G ~ G(n,½), k(n) ≈ 2 lg₂(n). Why? Let X_k be the number of k-cliques in G ~ G(n, 0.5). Then

    E[X_k] = (n choose k) · 2^(−(k choose 2)) < 1   for k > 2 lg₂ n.

In fact, (for large n) the largest clique almost certainly has size k(n) = 2 lg₂(n) or 2 lg₂(n) + 1 [Matula 76].
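As a quick numerical check on this first-moment bound, here is a small Python sketch (my own illustration, not from the slides) that computes log₂ E[X_k] and finds where the expectation drops below 1:

    import math

    def log2_expected_k_cliques(n, k):
        """log2 E[X_k] = log2 C(n,k) - C(k,2) for G(n, 1/2); log space avoids overflow."""
        return math.log2(math.comb(n, k)) - math.comb(k, 2)

    n = 1000
    k = next(k for k in range(2, n) if log2_expected_k_cliques(n, k) < 0)
    # the crossover sits just below 2*lg2(n), consistent with E[X_k] < 1 for k > 2*lg2(n)
    print(k, 2 * math.log2(n))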

[Figure, over three slides: a random graph on 8 vertices labeled 1-8, drawn in a circle. Where is the largest clique?]

Finding Large Cliques. For worst-case graphs, finding the largest clique is NP-hard, W[1]-hard, and hard to approximate to within n^(1−ε) for any ε > 0. Hope: in E-R random graphs, finding large cliques is easier.

Finding Large Cliques in G ~ G(n,½). Simply finding a clique of size ≈ lg₂(n) is easy:

    initialize T = Ø, S = V
    while (S ≠ Ø) {
        pick random v ∈ S and add v to T
        remove v and its non-neighbors from S
    }
    return T

A still-open conjecture [Karp 76]: for any ε > 0, there is no efficient method to find cliques of size (1+ε) lg₂ n in E-R random graphs.
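A runnable Python version of this greedy procedure, assuming the adjacency-dict representation from the earlier sample_gnp sketch (the name greedy_clique is mine):

    import random

    def greedy_clique(adj, rng=random):
        """Grow a clique greedily: pick a random surviving vertex, keep it,
        and discard its non-neighbors; finds a clique of size ~lg2(n) w.h.p."""
        T, S = set(), set(adj)
        while S:
            v = rng.choice(sorted(S))
            T.add(v)
            S = {u for u in S if u in adj[v]}  # drops v and its non-neighbors
        return T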

Summary. In E-R random graphs: a clique of size ≈ 2 lg₂ n exists; we can efficiently find a clique of size ≈ lg₂ n; we likely cannot efficiently find cliques of size (1+ε) lg₂ n. What to do? Make the problem easier by planting a large clique to be found! [Jerrum 92]

Planted Clique. The process G ~ G(n,p,k): 1. generate G ~ G(n,p); 2. add a clique on a random subset of k < n vertices of G. Goal: given G ~ G(n,p,k), find the k vertices where the clique was planted (the algorithm knows the values n, p, k). Obvious relationship to community detection and other applications.
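A minimal sketch of the planting process in Python, reusing sample_gnp from the earlier sketch (sample_planted is a hypothetical name):

    import itertools
    import random

    def sample_planted(n, p, k, rng=random):
        """G(n, p, k): sample G(n, p), then plant a clique on k random vertices."""
        adj = sample_gnp(n, p, rng)
        plant = rng.sample(range(n), k)
        for u, v in itertools.combinations(plant, 2):  # force every plant edge
            adj[u].add(v)
            adj[v].add(u)
        return adj, set(plant)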

Progress on Planted Clique. For G ~ G(n,½,k), there is clearly no hope for k ≤ 2 lg₂ n + 1. For k > 2 lg₂ n + 1, there is an obvious n^O(lg n)-time algorithm:

    input: G from G(n, 1/2, k) with k ≥ 2 lg₂ n + 2
    1) Check all S ⊆ V of size |S| = 2 lg₂ n + 2 for an S
       that induces a clique in G.
    2) For each v ∈ V, if (v,w) is an edge for all w in S: S = S ∪ {v}
    3) return S

Unfortunately, this is not polynomial time. What is the smallest value of k for which we have a polynomial-time algorithm? Any guesses?
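A direct Python transcription of this brute-force search, under the same assumptions as the earlier sketches (adjacency dict; quasipoly_find_plant is a made-up name). It relies on the fact that, w.h.p., the only clique of size 2 lg₂ n + 2 lives inside the plant:

    import itertools
    import math

    def quasipoly_find_plant(adj, n):
        """Try every seed S of size 2*lg2(n)+2; if S induces a clique, extend it
        by all vertices adjacent to everything in S. Runs in n^O(lg n) time."""
        s = int(2 * math.log2(n)) + 2
        for seed in itertools.combinations(range(n), s):
            if all(v in adj[u] for u, v in itertools.combinations(seed, 2)):
                S = set(seed)
                S |= {v for v in range(n) if v not in S and S <= adj[v]}
                return S
        return None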

State-of-the-Art for Polynomial Time. k ≥ c(n lg n)^(1/2) is trivial: the degrees of the vertices in the plant stand out (proof via Hoeffding + union bound). k = c·n^(1/2) is the best so far [Alon-Krivelevich-Sudakov 98]:

    input: G from G(n, 1/2, k) with k ≥ 10 n^(1/2)
    1) Find the 2nd eigenvector v₂ of A(G).
    2) Sort V by decreasing order of absolute values of the coordinates
       of v₂. Let W be the top k vertices in this order.
    3) Return Q, the set of vertices with ≥ ¾k neighbors in W.

In fact, (bipartite) planted clique was recently used as an alternate cryptographic primitive for k < n^(1/2−ε) [Applebaum-Barak-Wigderson 09].

My goal: explain why there has been no progress on this problem past n^(1/2). [FGVRX 13]
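A NumPy sketch of this spectral step, following the slide's description (aks_spectral is my name for it; the actual AKS analysis has more moving parts than this sketch):

    import numpy as np

    def aks_spectral(A, k):
        """Take the eigenvector for the 2nd-largest eigenvalue of the adjacency
        matrix A, let W be the k vertices with the largest |coordinate|, and
        return the vertices with at least 3k/4 neighbors in W."""
        vals, vecs = np.linalg.eigh(A)        # eigenvalues ascending for symmetric A
        v2 = vecs[:, np.argsort(vals)[-2]]    # eigenvector of 2nd-largest eigenvalue
        W = list(np.argsort(-np.abs(v2))[:k])
        return {v for v in range(A.shape[0]) if A[v, W].sum() >= 0.75 * k}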

Statistical Query Learning [Kearns 93]. [Diagram, over five slides: instead of receiving labeled examples (x₁,y₁), (x₂,y₂), …, (x_m,y_m) drawn from D, the learner sends a query q: X × {0,1} → {0,1} together with a sample size |S| = poly; the oracle chooses |S| examples from D and returns the average of q(x, f(x)) over those examples. The learner may make polynomially many such queries.]
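A toy Python sketch of such an oracle (the names sq_oracle, dist_sampler, f, and q are all hypothetical; a real SQ oracle only promises an answer within some tolerance τ, which this empirical version merely approximates):

    import random

    def sq_oracle(dist_sampler, f, q, sample_size, rng=random):
        """Answer a statistical query: draw |S| examples x ~ D and return the
        empirical average of q(x, f(x)), never exposing individual examples."""
        total = sum(q(x, f(x)) for x in (dist_sampler(rng) for _ in range(sample_size)))
        return total / sample_size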

Statistical Queries. Theorem [Kearns 93]: if a family of functions is learnable with statistical queries, then it is learnable (in the normal PAC sense) with noise! Theorem [Kearns 93]: parity functions are not learnable with statistical queries. Proof idea: because the parity functions are orthogonal under the uniform distribution, queries are either uninformative or eliminate one wrong parity function at a time (and there are 2^n of them). Theorem [Blum et al 94]: when a family of functions has exponentially high SQ dimension, it is not learnable with statistical queries. SQ dimension is roughly the number of nearly-orthogonal functions (w.r.t. a reference distribution); parity functions have SQ dimension 2^n. Shockingly, almost all learning algorithms can be implemented with statistical queries! So high SQ dimension is a serious barrier to learning, especially under noise.

Summary. Statistical queries and statistical dimension from learning theory are an explanation of why certain classes are hard to learn (almost all our learning algorithms are statistical). Idea: extend this framework to optimization problems and use it to explain the hardness of planted clique!

Traditional Algorithms. [Diagram, over three slides: the algorithm receives the input data directly, processes it ("please wait"), and produces its output.]

Statistical Algorithms. [Diagram, over six slides: the algorithm never touches the input data directly. Instead it repeatedly sends a query q mapping data points to {0,1} together with a sample size |S|, and receives back the average of q over |S| samples from the data; it may use polynomially many such queries.] Turns out most (all?) current optimization algorithms have statistical analogues!

Bipartite Planted Clique. Think of G as its n×n 0/1 matrix A(G). Each row is drawn independently: with probability (n−k)/n the row is uniformly random, and with probability k/n the row is random except in the k plant coordinates, which are all set to 1. [Figure: an example 0/1 matrix A(G).]
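A small Python sketch of this row distribution (sample_bpc_row is a hypothetical name):

    import random

    def sample_bpc_row(n, plant, rng=random):
        """One row of A(G): w.p. (n-k)/n a uniform 0/1 row; w.p. k/n a uniform
        row whose plant coordinates are forced to 1."""
        row = [rng.randint(0, 1) for _ in range(n)]
        if rng.random() < len(plant) / n:  # this row belongs to the plant
            for j in plant:
                row[j] = 1
        return row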

Statistical Algorithms for BPC. The algorithm submits a query q: {0,1}^n → {0,1} together with a sample size |S|; the oracle returns the average value of q on |S| rows drawn from the distribution above (each row random, except that with probability k/n it is random only outside the plant coordinates). The algorithm may make poly(n) such queries. [Figure, over five slides: the matrix A(G) with the query interface drawn around it.]

Results. An extension of the statistical query model to optimization. Proving tighter, more general lower bounds, which apply to learning as well. Gives a new tool for showing problems are difficult.

Results. Main result (almost): for any ε > 0, no polynomial-time statistical algorithm making queries with sample sizes o(n²/k²) can find planted cliques of size n^(1/2−ε). Intuition: there are many planted clique distributions with small overlap (nearly orthogonal in some sense), which are hard to tell apart from normal E-R graphs. This implies that many ideas will fail to work, including Markov chain approaches [Frieze-Kannan 03] for our version of the problem. Since this work, an integrality gap of n^(1/2) was shown for planted clique, giving further evidence for its hardness. [Meka-Wigderson 13]

Any Questions?