Spectral thresholds in the bipartite stochastic block model
Laura Florescu and Will Perkins
NYU and U of Birmingham
September 27, 2016

Stochastic Block Model
Figure: Red edges added with P = p and blue edges with P = q.

Community detection
Goal: Detect communities in networks.

Stochastic Block Model
Entries are not colored, nor ordered.

Stochastic Block Model
First introduced by Holland, Laskey, Leinhardt in 1983.
Motivation: discover communities in large networks.
Theorem (Boppana, Dyer/Frieze, Snijders/Nowicki, Condon/Karp, McSherry, Bickel/Chen, etc.): There are efficient algorithms for exactly recovering the true colors, provided that $p - q$ is large enough as $n \to \infty$.

Bipartite Stochastic Block Model
Figure: Bipartite stochastic block model on $V_1$ and $V_2$. Red edges added with probability $\delta\, p(n_1, n_2)$ and blue edges with probability $(2 - \delta)\, p(n_1, n_2)$.

SBM
Goal: get the planted assignment $\sigma$ (on $V_1$ for the bipartite stochastic block model).
Detection: compute $v$ that agrees with $\sigma$ on a $1/2 + \epsilon$ fraction of vertices.
Recovery: compute $v$ that agrees with $\sigma$ on a $1 - o(1)$ fraction of vertices.

Background
Intermediate step in recovering solutions in planted problems [Feldman, Perkins, Vempala 14].
Planted constraint satisfaction problems (CSP).
Reducing planted problems on $n$ variables will give vertex sets of size $n_1 = n$, $n_2 = n^{k-1}$ ($n_2 \ge n^2$).

Unified planted k-CSP model
Definition (Feldman-Perkins-Vempala 14): Given a planting distribution $Q : \{\pm 1\}^k \to [0, 1]$ and an assignment $\sigma \in \{\pm 1\}^n$, define the random constraint satisfaction problem $F_{Q,\sigma}(n, m)$ by drawing $m$ k-clauses from $C_k$ (the set of all k-tuples) independently according to
$Q_\sigma(C) = \frac{Q(\sigma(C))}{\sum_{C' \in C_k} Q(\sigma(C'))}$,
where $\sigma(C)$ is the vector of values that $\sigma$ assigns to the k-tuple of literals comprising $C$.

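The definition above can be illustrated with a small sampler. This is a minimal sketch, not the authors' construction: the function name `sample_planted_csp`, the representation of a literal as a `(variable index, sign)` pair, the restriction to clauses with distinct variables, and the use of rejection sampling are all assumptions made for illustration. A uniformly random clause is accepted with probability proportional to $Q(\sigma(C))$, which yields a draw from $Q_\sigma$.

```python
import itertools
import random


def sample_planted_csp(n, m, k, Q, sigma, rng):
    """Draw m k-clauses independently from Q_sigma by rejection sampling.

    Q maps a tuple in {+1, -1}^k to a weight in [0, 1].
    sigma is a list of n values in {+1, -1} (the planted assignment).
    A clause is a tuple of k literals (i, s): variable i with sign s,
    where s = -1 means the variable is negated. (Hypothetical encoding.)
    """
    # Largest weight Q can take; used to normalize the acceptance probability.
    q_max = max(Q(vals) for vals in itertools.product((1, -1), repeat=k))
    clauses = []
    while len(clauses) < m:
        # Propose a uniformly random ordered k-tuple of literals
        # on distinct variables (an assumption of this sketch).
        variables = rng.sample(range(n), k)
        clause = tuple((i, rng.choice((1, -1))) for i in variables)
        # sigma(C): the values sigma assigns to the literals of the clause.
        values = tuple(s * sigma[i] for i, s in clause)
        # Accept with probability Q(sigma(C)) / q_max.
        if rng.random() * q_max < Q(values):
            clauses.append(clause)
    return clauses
```

With the planted k-SAT choice of $Q$ (uniform over the value vectors containing at least one $+1$), every accepted clause is satisfied by $\sigma$.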
Planted random k-SAT and Goldreich PRG
Planted random k-SAT: Form a truth assignment $\phi$ of literals, then select each clause independently from the k-tuples of literals where at least one literal is set to 1 by $\phi$.
Goldreich PRG: also add a 0/1 bit, depending on a predicate evaluated on the literals (cryptography).
Feldman, Perkins, Vempala 14 gave a reduction of the above and others to the BSBM.

Information theory threshold
When $p = a/n$ and $q = b/n$:
Theorem (Mossel, Neeman, Sly, 2012): There is a test to distinguish the partition that succeeds with high probability if and only if $a + b > 2$ and $(a - b)^2 > 2(a + b)$.
Proves conjecture of [Decelle, Krzakala, Moore, Zdeborova 13].

Computational threshold
Dyer, Frieze 1989: $p = a/n > q = b/n$ fixed
Condon, Karp 2001: $a - b \gg n^{1/2}$
McSherry 2001: $a - b \gg \sqrt{b \log n}$
Coja-Oghlan 2010: $a - b \gg \sqrt{b}$
Massoulié 2013 and Mossel, Neeman, Sly 2013: detection is possible and efficient when $(a - b)^2 > 2(a + b)$.
Ingenious spectral methods.

Previous work
MNS idea: the neighborhood of a vertex in $G(n, a/n, b/n)$ looks like a random labelled tree, where each vertex gives birth to Pois($a/2$) vertices of the same type and Pois($b/2$) vertices of the different type.
Show that, conditioned on the labels of the boundary of the tree, the label of the root is asymptotically independent of the rest of the graph.

Binary symmetric broadcast model
$T$: Galton-Watson tree whose offspring distribution has mean $b$.
Root $R$ labeled uniformly $+1/-1$; each child takes its parent's label with probability $1 - \eta$ and the opposite label with probability $\eta$.
Goal: reconstruct the value of $R$ from the labels at level $n$.
Theorem (Evans, Kenyon, Peres, Schulman 00): The probability of correct reconstruction of the value of $R$ tends to $1/2$ as $n \to \infty$ if $(1 - 2\eta)^2 \le p_c(T)$, where $p_c(T)$ is the critical probability for percolation on $T$.
Can think of $p_c(T)$ as the edge density at which the tree is connected.
Consider trees with offspring distribution Pois($\frac{a+b}{2}$) and take $1 - \eta = \frac{a}{a+b}$.
Then the threshold reduces to $(a - b)^2 \le 2(a + b)$.

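The last reduction can be spelled out in two lines. The sketch below assumes the standard fact that a Galton-Watson tree with mean offspring $d > 1$ has $p_c(T) = 1/d$ on survival; here $d = (a+b)/2$ and $\eta = b/(a+b)$.

```latex
\[
  1 - 2\eta = 1 - \frac{2b}{a+b} = \frac{a-b}{a+b},
  \qquad
  p_c(T) = \frac{2}{a+b},
\]
\[
  (1 - 2\eta)^2 \le p_c(T)
  \iff \frac{(a-b)^2}{(a+b)^2} \le \frac{2}{a+b}
  \iff (a-b)^2 \le 2(a+b).
\]
```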
Previous work - Spectral methods
Applying some classical results to the bipartite model: using the spectrum with $p = O(1/n_1)$ recovers the partition.
Typical analysis of spectral algorithms: 2nd singular value > spectral norm of the noise matrix $M - \mathbb{E}M$; here $\lambda_2(\mathbb{E}M) = \Theta(p \sqrt{n_1 n_2})$ and the norm of the noise is $\|M - \mathbb{E}M\| = \Theta(\sqrt{p n_2})$.
Feldman, Perkins, Vempala 14: subsampled power iteration recovers the partition whp with $p = \tilde{O}((n_1 n_2)^{-1/2})$.

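The density scale $1/n_1$ follows from comparing the two asymptotics. A short derivation, treating the $\Theta(\cdot)$ constants as 1 (an assumption made only for illustration):

```latex
\[
  \lambda_2(\mathbb{E}M) > \|M - \mathbb{E}M\|
  \iff p\sqrt{n_1 n_2} > \sqrt{p\, n_2}
  \iff p^2 n_1 n_2 > p\, n_2
  \iff p > \frac{1}{n_1}.
\]
```

Below this scale the second singular value of the expected matrix is hidden under the noise, which is what the classical analysis cannot handle.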
Questions
1. Here $\lambda_2 < \|M - \mathbb{E}M\|$. Is SVD doomed for $p \ll 1/n_1$?
2. What is the optimal threshold for detection in the BSBM?

Our results - sharp reconstruction/impossibility
Theorem: On the other hand, if $n_2 \ge n_1$ and $p \le \frac{1}{(\delta - 1)^2 \sqrt{n_1 n_2}}$, then no algorithm can detect the partition.
Idea: Couple to a broadcast model on a multi-type Galton-Watson tree. Show that, conditioned on the labels of a $\log n$ boundary of the tree, the label of the root is asymptotically independent of the rest of the graph.

Our results - sharp reconstruction/impossibility
Theorem: Let $n_2 \ge n_1$. Then there is a polynomial-time algorithm that detects the partition $V_1 = A_1 \cup B_1$ if $p > \frac{1 + \epsilon}{(\delta - 1)^2 \sqrt{n_1 n_2}}$ for any fixed $\epsilon > 0$.
Idea: reduce to the SBM on the graph on $V_1$ induced by paths of length 2 in the bipartite graph.

Proof sketch
Reduce to a graph $G$ by replacing each path of length 2 from $V_1$ to $V_2$ back to $V_1$ with a single edge between the endpoints in $V_1$.
$E = \frac{(1+\epsilon)^2 n_1}{(\delta - 1)^4} (1 + o(1))$
Now we can compute $p_a = P[e = (u, v) \mid \sigma(u) = \sigma(v)]$ and $p_b = P[e = (u, v) \mid \sigma(u) \ne \sigma(v)]$.
Now compute $a$ and $b$ accordingly:
$a = \frac{(1 + \epsilon)(2 - 2\delta + \delta^2)}{(\delta - 1)^4} (1 + o(1))$, $b = \frac{(1 + \epsilon)(2\delta - \delta^2)}{(\delta - 1)^4} (1 + o(1))$.
Apply the criterion $(a - b)^2 \ge (1 + \epsilon)\, 2(a + b)$.

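The final step of the sketch can be checked numerically. The snippet below is a minimal illustration, assuming the formulas for $a$ and $b$ exactly as stated on the slide with the $(1 + o(1))$ factors dropped; the function names are hypothetical. Since $a - b$ is proportional to $2(\delta - 1)^2$ and $a + b$ to $2$, the ratio $(a - b)^2 / (2(a + b))$ comes out to $1 + \epsilon$.

```python
def reduced_sbm_parameters(delta, eps):
    """Return (a, b) for the reduced SBM on V_1, ignoring (1 + o(1)) factors.

    Uses the slide's formulas; requires delta != 1.
    """
    scale = (1.0 + eps) / (delta - 1.0) ** 4
    a = scale * (2.0 - 2.0 * delta + delta ** 2)  # same-label edge parameter
    b = scale * (2.0 * delta - delta ** 2)        # different-label edge parameter
    return a, b


def detection_ratio(a, b):
    """Return (a - b)^2 / (2 (a + b)); detection needs this to exceed 1."""
    return (a - b) ** 2 / (2.0 * (a + b))
```

For any $\delta \ne 1$ the ratio does not depend on $\delta$, which is why the threshold on $p$ carries the $(\delta - 1)^2$ factor in the denominator.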
Implications for planted k-SAT
Detection in the block model exhibits a sharp threshold at $m = \Theta(n^{r/2})$ hyperedges/clauses.
Definition: The distribution complexity $r$ of the planting distribution $Q$ is the smallest $r > 0$ so that $Q$ is an $(r - 1)$-wise independent distribution on $\{\pm 1\}^k$ but not $r$-wise independent.

Spectral algorithms
Standard SVD: Compute the left singular vector of $M$ (adjacency matrix) corresponding to the 2nd singular value, round signs to get $v$; compare $\sigma$ and $v$.
Diagonal deletion SVD: Set the diagonal entries of $MM^T$ to 0, compute the second eigenvector, round signs to get $v$; compare $\sigma$ and $v$.

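The two algorithms can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the planted labels are balanced, pairs with equal labels get edge probability $\delta p$ and the others $(2 - \delta) p$ (the slides do not say which color is which), and "second eigenvector" is read as the eigenvector of the second largest eigenvalue.

```python
import numpy as np


def sample_bsbm(n1, n2, p, delta, rng):
    """Sample an n1 x n2 bipartite adjacency matrix M with balanced planted labels."""
    sigma = np.repeat([1, -1], [n1 // 2, n1 - n1 // 2])  # labels on V_1
    tau = np.repeat([1, -1], [n2 // 2, n2 - n2 // 2])    # labels on V_2
    # Assumed convention: delta * p if labels agree, (2 - delta) * p otherwise.
    probs = p * (1.0 + (delta - 1.0) * np.outer(sigma, tau))
    M = (rng.random((n1, n2)) < probs).astype(float)
    return M, sigma


def round_signs(x):
    """Round a real vector to a +1/-1 vector."""
    return np.where(x >= 0, 1, -1)


def standard_svd(M):
    """Round the left singular vector of the 2nd largest singular value of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)  # singular values descending
    return round_signs(U[:, 1])


def diagonal_deletion_svd(M):
    """Zero the diagonal of M M^T, then round its second-largest eigenvector."""
    B = M @ M.T
    np.fill_diagonal(B, 0.0)
    _, vecs = np.linalg.eigh(B)  # eigenvalues ascending, so column -2 is 2nd largest
    return round_signs(vecs[:, -2])


def agreement(v, sigma):
    """Fraction of V_1 on which v matches sigma, up to a global sign flip."""
    frac = float(np.mean(v == sigma))
    return max(frac, 1.0 - frac)
```

Calling `agreement(diagonal_deletion_svd(M), sigma)` on a sample from `sample_bsbm` gives the quantity used to "compare $\sigma$ and $v$"; a value near 1/2 means no detection.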
Our results - spectral
Theorem: Let $n_2 \ge n_1$, with $n_1 \to \infty$. Then
1. If $p_D > (n_1 n_2)^{-1/2}$, then whp the diagonal deletion SVD algorithm recovers the partition $V_1 = A_1 \cup B_1$.
2. If $p_V > n_1^{-2/3} n_2^{-1/3}$, then whp the standard SVD algorithm recovers the partition.
When $n_2 = n^2$ (taking $n_1 = n$), $p_D \sim n^{-3/2}$ and $p_V \sim n^{-4/3}$.

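As a worked instance of the two thresholds, assume $n_1 = n$ and $n_2 = n^2$ (the sizes the reduction from planted problems gives for $k = 3$; this reading of the slide is an assumption):

```latex
\[
  (n_1 n_2)^{-1/2} = (n \cdot n^2)^{-1/2} = n^{-3/2},
  \qquad
  n_1^{-2/3}\, n_2^{-1/3} = n^{-2/3} \cdot n^{-2/3} = n^{-4/3}.
\]
```

In this regime diagonal deletion tolerates an edge density smaller by a factor of $n^{1/6}$ than the standard SVD.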
Timeline

Our results
Figure: Correlations of computed vectors with planted vector. Plot of correlation as a function of $p(n_1, n_2)$, with $n_1 = 1000$, $n_2 = 100000$, $\delta = 0.2$.
Figure: Eigenvalue separation. Plot of the top 10 normalized eigenvalues of $MM^T$ for $\delta = 0.5$ and $\delta = 1$ in the regimes $p_1 = (n_1 n_2)^{-1/2} \log n_1$, $p_2 = n_1^{-2/3} n_2^{-2/3}$, $p_3 = n_1^{-2/3} n_2^{-1/3} \log n_1$.

Thresholds origins
DiagD: $B = MM^T - D_V$. SVD: $B' = B + D_V$.
$\sigma$: partition, $e_2(B)$: eigenvector of the second largest eigenvalue of $B$, $D_V$: degrees.
By the Sin Theta Theorem (sin of the angle between eigenvector spaces $\le$ norm / eigenvalue gap) and the asymptotics of the 2nd eigenvalue:
DiagD: $\sin(B, \mathbb{E}B) \le \frac{C \|B - \mathbb{E}B\|}{\lambda_2} \le \frac{C\, n_1^{1/2} n_2^{1/2} p}{(\delta - 1)^2 n_1 n_2 p^2} = O\left(\frac{1}{\log n_1}\right)$
SVD: $\sin(B', \mathbb{E}B') \le \frac{C \left(\|B - \mathbb{E}B\| + \|D_V - \mathbb{E}D_V\|\right)}{\lambda_2} \le \frac{C \left(n_1^{1/2} n_2^{1/2} p + c \sqrt{n_2\, p \log n_1}\right)}{(\delta - 1)^2 n_1 n_2 p^2} = O\left(\frac{1}{\log n_1}\right)$
$\|e_2(B) - \sigma / \sqrt{n_1}\| = O(\log^{-1} n_1)$ (by a special case of the Sin Theta Theorem).
Conclude by rounding the signs of $e_2(B)$.

Conclusions
Theorem: Can efficiently detect the partition in the BSBM if $p > \frac{1 + \epsilon}{(\delta - 1)^2 \sqrt{n_1 n_2}}$. Cannot detect if $p \le \frac{1}{(\delta - 1)^2 \sqrt{n_1 n_2}}$.
Spectral method still works if $\lambda_2 <$ norm of noise matrix.
Modifying the adjacency matrix improves recovery significantly.

Open problems
Apply a Diagonal Deletion type of algorithm for improvement over SVD in other problems?
Sharper detection thresholds for planted k-SAT?

Thank you!