On Finding Planted Cliques and Solving Random Linear Equations. Math Colloquium. Lev Reyzin UIC
- Ruby Dalton
- 5 years ago

Transcription
1 On Finding Planted Cliques and Solving Random Linear Equations Math Colloquium Lev Reyzin UIC 1
2 random graphs 2
3 Erdős-Rényi Random Graphs G(n,p) generates a graph G on n vertices by including each possible edge independently with probability p. Example: G(4, 0.5). 3
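A minimal sketch of this sampling process (Python, standard library only; the function name `gnp` is my own):

```python
import itertools
import random

def gnp(n, p, seed=None):
    """Sample G ~ G(n, p): include each of the C(n, 2) possible edges
    independently with probability p. Returns one adjacency set per vertex."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

G = gnp(4, 0.5, seed=1)  # the slide's small example, G(4, 0.5)
```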
6 Typical Examples n = 100, p = 0.01; n = 100, p = 0.1; n = 100, p = 0.5. Created using software by Christopher Manning, available at http://bl.ocks.org/christophermanning/ 6
7 Erdős-Rényi Random Graphs E-R random graphs are an interesting object of study in combinatorics. When does G have a giant component? (when np > 1) When is G connected? (sharp connectivity threshold at p = ln(n)/n) How large is the largest clique in G? (for p = ½, the largest clique has size k(n) ≈ 2lg₂(n)) 7
9 w.h.p. for G ~ G(n,½), k(n) ≈ 2lg₂(n). why? let X_k be the number of k-cliques in G ~ G(n,½): E[X_k] = C(n,k) · 2^(-C(k,2)) < 1 for k > 2lg₂n. in fact, (for large n) the largest clique is almost certainly k(n) = 2lg₂(n) or 2lg₂(n)+1 [Matula 76] 9
12 Where is the largest clique? 12
14 Finding Large Cliques for worst-case graphs: finding the largest clique is NP-Hard (very, very unlikely to have efficient algorithms); W[1]-Hard (very likely gets harder as target cliques grow); hard to approximate to within n^(1-ε) for any ε > 0 (give up). Hope: in E-R random graphs, finding large cliques is easier. 14
18 Finding Large Cliques in G ~ G(n,½) Finding a clique of size ≈ lg₂(n) is easy: initialize T = Ø, S = V while (S ≠ Ø) { pick random v ∈ S and add v to T; remove v and its non-neighbors from S } return T Conjecture [Karp 76]: for any ε > 0, there's no efficient method to find cliques of size (1+ε)lg₂n in E-R random graphs. Still open (would imply P ≠ NP). 18
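The greedy procedure above is easy to implement; a self-contained sketch (Python; the names are mine, and the graph sampler is inlined so the snippet stands alone):

```python
import itertools
import random

def greedy_clique(n, p=0.5, seed=0):
    """T = Ø, S = V; repeatedly pick a random v in S, add it to T, and
    remove v and its non-neighbors from S. Returns (adjacency, clique T)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}          # sample G ~ G(n, p)
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    S, T = set(range(n)), []
    while S:
        v = rng.choice(sorted(S))
        T.append(v)
        S = S & adj[v]                          # v and its non-neighbors drop out
    return adj, T
```

On G(n, ½), each step roughly halves S, so this returns a clique of size about lg₂(n), matching the slide's claim.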
21 Summary In E-R random graphs: a clique of size ≈ 2lg₂n exists; we can efficiently find a clique of size ≈ lg₂n; we likely cannot efficiently find cliques of size (1+ε)lg₂n. What to do? Make the problem easier by planting a large clique to be found [Jerrum 92] 21
23 planted cliques 23
24 Planted Clique the process: G ~ G(n,p,k) 1. generate G ~ G(n,p) 2. add a clique to a random subset of k < n vertices of G Goal: given G ~ G(n,p,k), find the k vertices where the clique was planted (the algorithm knows the values n, p, k) 24
25 Progress on Planted Clique For G ~ G(n,½,k), clearly no hope for k ≤ 2lg₂n+1. For k > 2lg₂n+1, there is an obvious n^O(lg n)-time algorithm: input: G from G(n,1/2,k) with k > 2lg₂n+1 1) Check all S ⊆ V of size |S| = 2lg₂n+2 for an S that induces a clique in G. 2) For each v ∈ V, if (v,w) is an edge for all w in S: S = S ∪ {v} 3) return S Unfortunately, this is not polynomial time. What is the smallest value of k for which we have a polynomial-time algorithm? Any guesses? 25
27 State-of-the-Art for Polynomial Time k ≥ c(n lg n)^(1/2) is trivial: the degrees of the vertices in the plant stand out (proof via a Hoeffding + union bound). k = c·n^(1/2) is the best so far [Alon-Krivelevich-Sudakov 98]: input: G from G(n,1/2,k) with k ≥ 10n^(1/2) 1) Find the 2nd eigenvector v₂ of A(G). 2) Sort V by decreasing order of absolute values of the coordinates of v₂. Let W be the top k vertices in this order. 3) Return Q, the set of vertices with ≥ ¾k neighbors in W. (A bipartite version also exists.) In fact, (bipartite) planted clique was recently used as an alternate cryptographic primitive for k < n^(1/2-ε) [Applebaum-Barak-Wigderson 09]. My goal: explain why there has been no progress on this problem past n^(1/2) [FGRVX 13]. But first we have to discuss solving linear systems. 27
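For intuition, here is a hedged sketch of the "trivial" regime only, not the AKS spectral algorithm: when k is well above (n lg n)^(1/2), the k highest-degree vertices already recover the plant. The function name and the specific parameters are my own choices:

```python
import itertools
import random

def recover_by_degree(n=200, k=60, seed=3):
    """In the trivial regime, a plant vertex has degree about n/2 + k/2
    versus about n/2 outside the plant, so sorting by degree and taking
    the top k vertices recovers the plant. Returns (plant, guess)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}          # G ~ G(n, 1/2)
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    plant = set(rng.sample(range(n), k))        # plant a k-clique
    for u, v in itertools.combinations(sorted(plant), 2):
        adj[u].add(v)
        adj[v].add(u)
    by_degree = sorted(range(n), key=lambda v: len(adj[v]), reverse=True)
    return plant, set(by_degree[:k])
```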
34 linear systems 34
35 Solving Linear Systems Ax = b: A is an m × n matrix (m equations, n variables), b holds the m results; solve for the n unknowns. The linear equations are over GF(2), i.e. x ∈ {0,1}^n. 35
36 Solving Random Linear Systems Ax = b, where the entries of A are chosen at random and x = (x₁, x₂, x₃, …) is the unknown. Choose any m = poly(n): we can solve for the unique x in poly time. How? 36
42 A Twist Ax = b as before, but now the entries of b are flipped independently with prob. 1/100. Choose any m = poly(n): solve for the x that generated the original b, in poly time. How? It's a big open question in theoretical CS, called LPN. The current best algorithm takes 2^O(n/lg n) time [BlumKW 00]. In fact, LPN was recently used as an alternate cryptographic primitive [Peikert 09]. 42
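Generating an instance of this twisted problem is straightforward; a sketch (names and parameters are mine), with the noise rate as a parameter:

```python
import random

def lpn_samples(x, m, eta=0.01, seed=0):
    """Generate m LPN samples for secret x in {0,1}^n: each row a is
    uniform, and the label <a, x> mod 2 is flipped independently with
    probability eta. Returns (A, b)."""
    rng = random.Random(seed)
    n = len(x)
    A, b = [], []
    for _ in range(m):
        a = [rng.randrange(2) for _ in range(n)]
        noise = 1 if rng.random() < eta else 0
        A.append(a)
        b.append((sum(ai & xi for ai, xi in zip(a, x)) + noise) % 2)
    return A, b
```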
46 learning theory 46
47 PAC Learning, in One Slide [Valiant 84] A target f ∈ F is fixed, along with a distribution over examples. The learner receives data a₁, a₂, …, a_m drawn from the distribution, each labeled with f(a_i). It then forms a hypothesis h, and on fresh draws a_{m+1}, a_{m+2}, … its predictions h(a) are compared against the true labels f(a); the goal is for h to agree with f with high probability over the distribution. 47
64 Learning Linear Functions (mod 2) Here the examples are the rows of A, drawn uniformly at random (U) from {0,1}^n; the labels are the corresponding entries of b, i.e. the parities determined by the target x = (x₁, x₂, x₃, …). The learner's hypothesis is another parity x' = (x'₁, x'₂, x'₃, …). 64
70 Learning Linear Functions (mod 2) x = (x₁, x₂, x₃, …) is the target f; x' = (x'₁, x'₂, x'₃, …) is the hypothesis h. Remember the coefficients of the equations are generated uniformly at random from {0,1}^n. So, if ∃i s.t. x_i ≠ x'_i, then f and h will disagree ½ of the time. Hence the 2^n different linear functions are pairwise orthogonal: they form the Fourier (parity) basis. 70
71 When there's noise: the same system Ax = b, but some of the m results in b have been flipped. 71
72 statistical queries 72
73 Statistical Query Learning [Kearns 93] The learner does not see labeled examples directly. Instead it asks statistical queries q: {0,1}^n × {0,1} → {0,1}; the oracle chooses a from the uniform distribution U over {0,1}^n, evaluates q(a, f(a)), and returns the answer (0 or 1). The learner may ask poly(n) such queries. 73
78 Statistical Queries Theorem [Kearns 93]: If a family of functions is learnable with statistical queries, then it is learnable (in the original model) with noise. Theorem [Kearns 93]: Linear functions (mod 2) are not learnable with statistical queries. proof idea: because the linear functions are orthogonal under U, queries are either uninformative or eliminate one wrong linear function at a time (and there are 2^n). Theorem [Blum et al 94]: when a family of functions has exponentially high SQ dimension, it is not learnable with statistical queries. SQ dimension is roughly the number of nearly-orthogonal functions (w.r.t. a reference distribution); linear functions have SQ dimension 2^n. Shockingly, almost all learning algorithms can be implemented with statistical queries, so high SQ dimension is a serious barrier to learning, especially under noise. 78
81 Summary Linear equations with errors seem hard to solve (noisy linear functions over GF(2) seem hard to learn). Statistical queries and statistical dimension from learning theory are an explanation as to why (almost all our learning algorithms are statistical). Idea: extend this framework to optimization problems and use it to explain the hardness of planted clique. 81
83 statistical algorithms [FGRVX 13] 83
84 Traditional Algorithms input data → processing (please wait) → output 84
87 Statistical Algorithms The algorithm never sees the input data directly: it asks queries q (each returning a value in {0,1}), receives the answers q(·), and after poly(n) queries produces an output. Turns out most (all?) current optimization algorithms have statistical analogues. 87
94 Bipartite Planted Clique A(G): each row is generated independently; w.p. (n-k)/n the row is uniformly random, and w.p. k/n it is random except in the plant coordinates (which are set to 1). 94
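The row distribution described above can be sketched as follows (Python; `bpc_row` is my own name):

```python
import random

def bpc_row(n, plant, rng):
    """One row of A(G) in the distributional bipartite planted clique:
    w.p. (n-k)/n the row is uniform over {0,1}^n; w.p. k/n it is uniform
    outside the plant coordinates and all-ones on them."""
    row = [rng.randrange(2) for _ in range(n)]
    if rng.random() < len(plant) / n:       # a "clique" row
        for j in plant:
            row[j] = 1
    return row

rng = random.Random(5)
plant = {0, 3, 7}                           # hypothetical plant coordinates
rows = [bpc_row(20, plant, rng) for _ in range(100)]
```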
97 Statistical Algorithms for BPC The algorithm does not see A(G) itself. It submits queries q: {0,1}^n → {0,1}, each evaluated on a row of A(G) (e.g., q(1,0,1,1,0,1,1)), and may make poly(n) such queries. 97
103 Results Extension of the statistical query model to optimization. Proving tighter, more general lower bounds, which apply to learning as well. Gives a new tool for showing problems are difficult. 103
104 Results Main result (almost): for any ε > 0, no poly-time statistical algorithm using o(n²/k²) samples can find planted cliques of size n^(1/2-ε). intuition: there are many planted clique distributions with small overlap (nearly orthogonal in some sense), which are hard to tell apart from normal E-R graphs. This implies that many ideas will fail to work, including Markov chain approaches [Frieze-Kannan 03], for our version of the problem. Since this work, an integrality gap of n^(1/2) was shown for planted clique, giving further evidence for its hardness [Meka-Wigderson 13]. 104
105 Any Questions? 105
More informationLecture 7: Passive Learning
CS 880: Advanced Complexity Theory 2/8/2008 Lecture 7: Passive Learning Instructor: Dieter van Melkebeek Scribe: Tom Watson In the previous lectures, we studied harmonic analysis as a tool for analyzing
More informationComputational Learning Theory. CS 486/686: Introduction to Artificial Intelligence Fall 2013
Computational Learning Theory CS 486/686: Introduction to Artificial Intelligence Fall 2013 1 Overview Introduction to Computational Learning Theory PAC Learning Theory Thanks to T Mitchell 2 Introduction
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University October 11, 2012 Today: Computational Learning Theory Probably Approximately Coorrect (PAC) learning theorem
More informationLecture 7: February 6
CS271 Randomness & Computation Spring 2018 Instructor: Alistair Sinclair Lecture 7: February 6 Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They
More informationEssential facts about NP-completeness:
CMPSCI611: NP Completeness Lecture 17 Essential facts about NP-completeness: Any NP-complete problem can be solved by a simple, but exponentially slow algorithm. We don t have polynomial-time solutions
More informationWhy almost all k-colorable graphs are easy to color
Why almost all k-colorable graphs are easy to color Amin Coja-Oghlan Michael Krivelevich Dan Vilenchik May 13, 2007 Abstract Coloring a k-colorable graph using k colors (k 3 is a notoriously hard problem.
More informationCS340 Machine learning Lecture 5 Learning theory cont'd. Some slides are borrowed from Stuart Russell and Thorsten Joachims
CS340 Machine learning Lecture 5 Learning theory cont'd Some slides are borrowed from Stuart Russell and Thorsten Joachims Inductive learning Simplest form: learn a function from examples f is the target
More informationCPSC 536N: Randomized Algorithms Term 2. Lecture 2
CPSC 536N: Randomized Algorithms 2014-15 Term 2 Prof. Nick Harvey Lecture 2 University of British Columbia In this lecture we continue our introduction to randomized algorithms by discussing the Max Cut
More informationLecture 5: Probabilistic tools and Applications II
T-79.7003: Graphs and Networks Fall 2013 Lecture 5: Probabilistic tools and Applications II Lecturer: Charalampos E. Tsourakakis Oct. 11, 2013 5.1 Overview In the first part of today s lecture we will
More informationActive Testing! Liu Yang! Joint work with Nina Balcan,! Eric Blais and Avrim Blum! Liu Yang 2011
! Active Testing! Liu Yang! Joint work with Nina Balcan,! Eric Blais and Avrim Blum! 1 Property Testing Instance space X = R n (Distri D over X)! Tested function f : X->{0,1}! A property P of Boolean fn
More informationLecture 6: Random Walks versus Independent Sampling
Spectral Graph Theory and Applications WS 011/01 Lecture 6: Random Walks versus Independent Sampling Lecturer: Thomas Sauerwald & He Sun For many problems it is necessary to draw samples from some distribution
More informationComputational Learning Theory. Definitions
Computational Learning Theory Computational learning theory is interested in theoretical analyses of the following issues. What is needed to learn effectively? Sample complexity. How many examples? Computational
More informationPublic key cryptography based on the clique and learning parity with noise problems for post-quantum cryptography
Public key cryptography based on the clique and learning parity with noise problems for post-quantum cryptography Péter Hudoba peter.hudoba@inf.elte.hu ELTE Eötvös Loránd University Faculty of Informatics
More informationLecture 25 of 42. PAC Learning, VC Dimension, and Mistake Bounds
Lecture 25 of 42 PAC Learning, VC Dimension, and Mistake Bounds Thursday, 15 March 2007 William H. Hsu, KSU http://www.kddresearch.org/courses/spring2007/cis732 Readings: Sections 7.4.17.4.3, 7.5.17.5.3,
More informationFORMULATION OF THE LEARNING PROBLEM
FORMULTION OF THE LERNING PROBLEM MIM RGINSKY Now that we have seen an informal statement of the learning problem, as well as acquired some technical tools in the form of concentration inequalities, we
More informationComputational Learning Theory
Computational Learning Theory Sinh Hoa Nguyen, Hung Son Nguyen Polish-Japanese Institute of Information Technology Institute of Mathematics, Warsaw University February 14, 2006 inh Hoa Nguyen, Hung Son
More informationMachine Learning. Computational Learning Theory. Eric Xing , Fall Lecture 9, October 5, 2016
Machine Learning 10-701, Fall 2016 Computational Learning Theory Eric Xing Lecture 9, October 5, 2016 Reading: Chap. 7 T.M book Eric Xing @ CMU, 2006-2016 1 Generalizability of Learning In machine learning
More informationLearning Large-Alphabet and Analog Circuits with Value Injection Queries
Learning Large-Alphabet and Analog Circuits with Value Injection Queries Dana Angluin 1 James Aspnes 1, Jiang Chen 2, Lev Reyzin 1,, 1 Computer Science Department, Yale University {angluin,aspnes}@cs.yale.edu,
More informationCS369N: Beyond Worst-Case Analysis Lecture #4: Probabilistic and Semirandom Models for Clustering and Graph Partitioning
CS369N: Beyond Worst-Case Analysis Lecture #4: Probabilistic and Semirandom Models for Clustering and Graph Partitioning Tim Roughgarden April 25, 2010 1 Learning Mixtures of Gaussians This lecture continues
More informationCSE446: Linear Regression Regulariza5on Bias / Variance Tradeoff Winter 2015
CSE446: Linear Regression Regulariza5on Bias / Variance Tradeoff Winter 2015 Luke ZeElemoyer Slides adapted from Carlos Guestrin Predic5on of con5nuous variables Billionaire says: Wait, that s not what
More informationOn Basing Lower-Bounds for Learning on Worst-Case Assumptions
On Basing Lower-Bounds for Learning on Worst-Case Assumptions Benny Applebaum Boaz Barak David Xiao Abstract We consider the question of whether P NP implies that there exists some concept class that is
More informationCS 6375: Machine Learning Computational Learning Theory
CS 6375: Machine Learning Computational Learning Theory Vibhav Gogate The University of Texas at Dallas Many slides borrowed from Ray Mooney 1 Learning Theory Theoretical characterizations of Difficulty
More information1 Randomized Computation
CS 6743 Lecture 17 1 Fall 2007 1 Randomized Computation Why is randomness useful? Imagine you have a stack of bank notes, with very few counterfeit ones. You want to choose a genuine bank note to pay at
More informationApproximating MAX-E3LIN is NP-Hard
Approximating MAX-E3LIN is NP-Hard Evan Chen May 4, 2016 This lecture focuses on the MAX-E3LIN problem. We prove that approximating it is NP-hard by a reduction from LABEL-COVER. 1 Introducing MAX-E3LIN
More informationStatistical Algorithms and a Lower Bound for Detecting Planted Cliques
Statistical Algorithms and a Lower Bound for Detecting Planted Cliques Vitaly Feldman Elena Grigorescu Lev Reyzin Santosh S. Vempala Ying Xiao Abstract We introduce a framework for proving lower bounds
More informationOnline Learning, Mistake Bounds, Perceptron Algorithm
Online Learning, Mistake Bounds, Perceptron Algorithm 1 Online Learning So far the focus of the course has been on batch learning, where algorithms are presented with a sample of training data, from which
More informationLearning Theory. Aar$ Singh and Barnabas Poczos. Machine Learning / Apr 17, Slides courtesy: Carlos Guestrin
Learning Theory Aar$ Singh and Barnabas Poczos Machine Learning 10-701/15-781 Apr 17, 2014 Slides courtesy: Carlos Guestrin Learning Theory We have explored many ways of learning from data But How good
More informationCS151 Complexity Theory. Lecture 6 April 19, 2017
CS151 Complexity Theory Lecture 6 Shannon s counting argument frustrating fact: almost all functions require huge circuits Theorem (Shannon): With probability at least 1 o(1), a random function f:{0,1}
More informationUnconditional Lower Bounds for Learning Intersections of Halfspaces
Unconditional Lower Bounds for Learning Intersections of Halfspaces Adam R. Klivans Alexander A. Sherstov The University of Texas at Austin Department of Computer Sciences Austin, TX 78712 USA {klivans,sherstov}@cs.utexas.edu
More informationCS151 Complexity Theory. Lecture 15 May 22, 2017
CS151 Complexity Theory Lecture 15 New topic(s) Optimization problems, Approximation Algorithms, and Probabilistically Checkable Proofs Optimization Problems many hard problems (especially NP-hard) are
More informationIntroduction to Computational Learning Theory
Introduction to Computational Learning Theory The classification problem Consistent Hypothesis Model Probably Approximately Correct (PAC) Learning c Hung Q. Ngo (SUNY at Buffalo) CSE 694 A Fun Course 1
More informationCS168: The Modern Algorithmic Toolbox Lecture #19: Expander Codes
CS168: The Modern Algorithmic Toolbox Lecture #19: Expander Codes Tim Roughgarden & Gregory Valiant June 1, 2016 In the first lecture of CS168, we talked about modern techniques in data storage (consistent
More informationChapter 3. Vector spaces
Chapter 3. Vector spaces Lecture notes for MA1111 P. Karageorgis pete@maths.tcd.ie 1/22 Linear combinations Suppose that v 1,v 2,...,v n and v are vectors in R m. Definition 3.1 Linear combination We say
More informationAlgorithms, Graph Theory, and Linear Equa9ons in Laplacians. Daniel A. Spielman Yale University
Algorithms, Graph Theory, and Linear Equa9ons in Laplacians Daniel A. Spielman Yale University Shang- Hua Teng Carter Fussell TPS David Harbater UPenn Pavel Cur9s Todd Knoblock CTY Serge Lang 1927-2005
More informationIntroduc)on to Ar)ficial Intelligence
Introduc)on to Ar)ficial Intelligence Lecture 13 Approximate Inference CS/CNS/EE 154 Andreas Krause Bayesian networks! Compact representa)on of distribu)ons over large number of variables! (OQen) allows
More informationNear-domination in graphs
Near-domination in graphs Bruce Reed Researcher, Projet COATI, INRIA and Laboratoire I3S, CNRS France, and Visiting Researcher, IMPA, Brazil Alex Scott Mathematical Institute, University of Oxford, Oxford
More information10.1 The Formal Model
67577 Intro. to Machine Learning Fall semester, 2008/9 Lecture 10: The Formal (PAC) Learning Model Lecturer: Amnon Shashua Scribe: Amnon Shashua 1 We have see so far algorithms that explicitly estimate
More informationSome Selected Topics. Spectral Norm in Learning Theory:
Spectral Norm in Learning Theory Slide 1 Spectral Norm in Learning Theory: Some Selected Topics Hans U. Simon (RUB) Email: simon@lmi.rub.de Homepage: http://www.ruhr-uni-bochum.de/lmi Spectral Norm in
More informationAdding random edges to create the square of a Hamilton cycle
Adding random edges to create the square of a Hamilton cycle Patrick Bennett Andrzej Dudek Alan Frieze October 7, 2017 Abstract We consider how many random edges need to be added to a graph of order n
More information1 Last time and today
COMS 6253: Advanced Computational Learning Spring 2012 Theory Lecture 12: April 12, 2012 Lecturer: Rocco Servedio 1 Last time and today Scribe: Dean Alderucci Previously: Started the BKW algorithm for
More informationProbabilistic Method. Benny Sudakov. Princeton University
Probabilistic Method Benny Sudakov Princeton University Rough outline The basic Probabilistic method can be described as follows: In order to prove the existence of a combinatorial structure with certain
More informationRelating Data Compression and Learnability
Relating Data Compression and Learnability Nick Littlestone, Manfred K. Warmuth Department of Computer and Information Sciences University of California at Santa Cruz June 10, 1986 Abstract We explore
More informationLectures on Combinatorial Statistics. Gábor Lugosi
Lectures on Combinatorial Statistics Gábor Lugosi July 12, 2017 Contents 1 The hidden clique problem 4 1.1 A hypothesis testing problem...................... 4 1.2 The clique number of the Erdős-Rényi
More informationNOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or
NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying
More information