
1 ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018

2 Outline
A motivating application: graph clustering
Distance and angles between two subspaces
Eigen-space perturbation theory
Extension: singular subspaces
Extension: eigen-space for asymmetric transition matrices
Spectral methods 2-2

3 A motivating application: graph clustering

4 Graph clustering / community detection. Community structures are common in many social networks (figure credits: The Future Buzz; S. Papadopoulos). Goal: partition users into several clusters based on their friendships / similarities. Spectral methods 2-4

5 A simple model: stochastic block model (SBM). $n$ nodes $\{1, \ldots, n\}$, 2 communities; $n$ unknown variables $x_1, \ldots, x_n \in \{1, -1\}$ encode the community memberships: $x_i = 1$ for the 1st community, $x_i = -1$ for the 2nd community. Spectral methods 2-5

6 A simple model: stochastic block model (SBM). We observe a graph $G$: each pair $(i, j)$ belongs to $G$ independently, with prob. $p$ if $i$ and $j$ are from the same community, and $q$ otherwise. Here, $p > q$ and $p, q \gtrsim \log n / n$. Goal: recover the community memberships of all nodes, i.e. $\{x_i\}$. Spectral methods 2-6

7 Adjacency matrix. Consider the adjacency matrix $A \in \{0,1\}^{n \times n}$ of $G$:
$$A_{i,j} = \begin{cases} 1, & \text{if } (i,j) \in G \\ 0, & \text{else} \end{cases}$$
WLOG, suppose $x_1 = \cdots = x_{n/2} = 1$ and $x_{n/2+1} = \cdots = x_n = -1$. Spectral methods 2-7

8 Adjacency matrix. Decompose
$$A = \underbrace{\mathbb{E}[A]}_{\text{rank }2} + \big(A - \mathbb{E}[A]\big), \qquad
\mathbb{E}[A] = \begin{bmatrix} p\,\mathbf{1}\mathbf{1}^\top & q\,\mathbf{1}\mathbf{1}^\top \\ q\,\mathbf{1}\mathbf{1}^\top & p\,\mathbf{1}\mathbf{1}^\top \end{bmatrix}
= \frac{p+q}{2}\underbrace{\mathbf{1}\mathbf{1}^\top}_{\text{uninformative bias}} + \frac{p-q}{2}\,
\underbrace{\begin{bmatrix}\mathbf{1}\\ -\mathbf{1}\end{bmatrix}\big[\mathbf{1}^\top, -\mathbf{1}^\top\big]}_{=\,x x^\top, \; x := [x_i]_{1\le i \le n}}$$
Spectral methods 2-8

9 Spectral clustering. Recall $A = \underbrace{\mathbb{E}[A]}_{\text{rank }2} + \big(A - \mathbb{E}[A]\big)$.
1. compute the leading eigenvector $\hat u = [\hat u_i]_{1 \le i \le n}$ of $A - \frac{p+q}{2}\mathbf{1}\mathbf{1}^\top$
2. rounding: output $\hat x_i = 1$ if $\hat u_i > 0$, and $\hat x_i = -1$ if $\hat u_i < 0$
Spectral methods 2-9
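To make the two-step recipe concrete, here is a minimal NumPy sketch (not from the slides) that simulates an SBM instance and runs both steps; the parameter values are purely illustrative.

```python
import numpy as np

def spectral_cluster_sbm(n=1000, p=0.1, q=0.02, seed=0):
    """Simulate a 2-community SBM and run the two-step spectral clustering recipe."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])   # true memberships
    prob = np.where(np.outer(x, x) > 0, p, q)                 # p within, q across communities
    upper = np.triu(rng.random((n, n)) < prob, k=1)           # sample the upper triangle
    A = (upper | upper.T).astype(float)                       # symmetric adjacency, zero diagonal
    # step 1: leading eigenvector of A - (p+q)/2 * 11^T
    eigvals, eigvecs = np.linalg.eigh(A - (p + q) / 2 * np.ones((n, n)))
    u_hat = eigvecs[:, np.argmax(eigvals)]
    # step 2: round by sign
    x_hat = np.where(u_hat > 0, 1.0, -1.0)
    return max(np.mean(x_hat == x), np.mean(-x_hat == x))     # accuracy up to a global sign flip

print(spectral_cluster_sbm())   # typically close to 1 in this regime
```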

10 Spectral clustering. Rationale: recovery is reliable if the perturbation $A - \mathbb{E}[A]$ is sufficiently small. In particular, if $A - \mathbb{E}[A] = 0$, then $\hat u \propto \pm \begin{bmatrix}\mathbf{1} \\ -\mathbf{1}\end{bmatrix}$, which gives perfect clustering. Question: how to quantify the effect of the perturbation $A - \mathbb{E}[A]$ on $\hat u$? Spectral methods 2-10

11 Distance and angles between two subspaces

12 Setup and notation. Consider two symmetric matrices $M, \hat M = M + H \in \mathbb{R}^{n\times n}$ with eigen-decompositions
$$M = \sum_{i=1}^n \lambda_i u_i u_i^\top \qquad \text{and} \qquad \hat M = \sum_{i=1}^n \hat\lambda_i \hat u_i \hat u_i^\top$$
where $\lambda_1 \ge \cdots \ge \lambda_n$ and $\hat\lambda_1 \ge \cdots \ge \hat\lambda_n$. For simplicity, write
$$M = [U_0, U_1] \begin{bmatrix} \Lambda_0 & \\ & \Lambda_1 \end{bmatrix} \begin{bmatrix} U_0^\top \\ U_1^\top \end{bmatrix}, \qquad
\hat M = [\hat U_0, \hat U_1] \begin{bmatrix} \hat\Lambda_0 & \\ & \hat\Lambda_1 \end{bmatrix} \begin{bmatrix} \hat U_0^\top \\ \hat U_1^\top \end{bmatrix}$$
Here, $U_0 = [u_1, \ldots, u_r]$ and $\Lambda_0 = \mathrm{diag}([\lambda_1, \ldots, \lambda_r])$. Spectral methods 2-12
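A computational side note (not from the slides): `numpy.linalg.eigh` returns eigenvalues in ascending order, so extracting $U_0, \Lambda_0$ as defined above takes a little bookkeeping. A minimal sketch:

```python
import numpy as np

def top_r_eigenspace(M, r):
    """Return (Lambda0, U0, Lambda1, U1) with eigenvalues sorted in descending order."""
    eigvals, eigvecs = np.linalg.eigh(M)     # ascending order for a symmetric M
    order = np.argsort(eigvals)[::-1]        # descending order, matching the slides' convention
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals[:r], eigvecs[:, :r], eigvals[r:], eigvecs[:, r:]
```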

13 Setup and notation. In matrix form,
$$M = \big[\,\underbrace{u_1 \ \cdots \ u_r}_{U_0}\ \ \underbrace{u_{r+1} \ \cdots \ u_n}_{U_1}\,\big]
\begin{bmatrix} \underbrace{\mathrm{diag}(\lambda_1, \ldots, \lambda_r)}_{\Lambda_0} & \\ & \underbrace{\mathrm{diag}(\lambda_{r+1}, \ldots, \lambda_n)}_{\Lambda_1} \end{bmatrix}
\begin{bmatrix} u_1^\top \\ \vdots \\ u_r^\top \\ u_{r+1}^\top \\ \vdots \\ u_n^\top \end{bmatrix}$$
where the last factor stacks $U_0^\top$ on top of $U_1^\top$. Spectral methods 2-13

14 Setup and notation. $\|M\|$: spectral norm (largest singular value of $M$); $\|M\|_{\mathrm F}$: Frobenius norm, $\|M\|_{\mathrm F} = \sqrt{\mathrm{tr}(M^\top M)} = \sqrt{\sum_{i,j} M_{i,j}^2}$. Spectral methods 2-14

15 Eigen-space perturbation theory. Main focus: how does the perturbation $H$ affect the distance between $U$ and $\hat U$? Question #0: how to define the distance between two subspaces? $\|U - \hat U\|_{\mathrm F}$ and $\|U - \hat U\|$ are not appropriate, since they fall short of accounting for global orthonormal transformations: for any orthonormal $R \in \mathbb{R}^{r\times r}$, $U$ and $UR$ represent the same subspace. Spectral methods 2-15

16 Distance between two eigen-spaces. One metric that takes care of global orthonormal transformations is
$$\mathrm{dist}(X_0, Z_0) := \big\|X_0 X_0^\top - Z_0 Z_0^\top\big\| \qquad (2.1)$$
This metric has several equivalent expressions:
Lemma 2.1. Suppose $X := [X_0, X_1]$ and $Z := [Z_0, Z_1]$ are orthonormal matrices (where $X_1$ and $Z_1$ span the complement subspaces). Then
$$\mathrm{dist}(X_0, Z_0) = \big\|X_0^\top Z_1\big\| = \big\|Z_0^\top X_1\big\|$$
Sanity check: if $X_0 = Z_0$, then $\mathrm{dist}(X_0, Z_0) = \|X_0^\top Z_1\| = 0$. Spectral methods 2-16

17 Principal angles between two eigen-spaces. In addition to distance, one might also be interested in angles. We can quantify the similarity between two lines (represented resp. by unit vectors $x_0$ and $z_0$) by the angle between them: $\theta = \arccos \langle x_0, z_0 \rangle$. Spectral methods 2-17

18 Principal angles between two eigen-spaces. For $r$-dimensional subspaces, one needs $r$ angles. Specifically, since $\|X_0^\top Z_0\| \le 1$, we can write the SVD of $X_0^\top Z_0 \in \mathbb{R}^{r\times r}$ as
$$X_0^\top Z_0 = U \underbrace{\begin{bmatrix} \cos\theta_1 & & \\ & \ddots & \\ & & \cos\theta_r \end{bmatrix}}_{:=\,\cos\Theta} V^\top := U \cos\Theta\, V^\top$$
where $\{\theta_1, \ldots, \theta_r\}$ are called the principal angles between $X_0$ and $Z_0$. Spectral methods 2-18

19 Relations between principal angles and $\mathrm{dist}(\cdot,\cdot)$. As expected, principal angles and distance are closely related.
Lemma 2.2. Suppose $X := [X_0, X_1]$ and $Z := [Z_0, Z_1]$ are orthonormal matrices. Then
$$\big\|X_0^\top Z_1\big\| = \|\sin\Theta\| = \max\{\sin\theta_1, \ldots, \sin\theta_r\}$$
Lemmas 2.1 and 2.2 taken collectively give
$$\mathrm{dist}(X_0, Z_0) = \max\{\sin\theta_1, \ldots, \sin\theta_r\} \qquad (2.2)$$
Spectral methods 2-19
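A quick numerical check of Lemmas 2.1 and 2.2 (a sketch with random orthonormal bases, not from the slides; `scipy.linalg.subspace_angles` is used only to cross-check the principal angles):

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(1)
n, r = 8, 3
X = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthonormal basis [X0, X1]
Z = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthonormal basis [Z0, Z1]
X0, X1 = X[:, :r], X[:, r:]
Z0, Z1 = Z[:, :r], Z[:, r:]

dist = np.linalg.norm(X0 @ X0.T - Z0 @ Z0.T, 2)    # definition (2.1)
theta = subspace_angles(X0, Z0)                     # principal angles between the subspaces
print(dist,
      np.linalg.norm(X0.T @ Z1, 2),
      np.linalg.norm(Z0.T @ X1, 2),
      np.sin(theta).max())                          # all four numbers coincide, as (2.2) claims
```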

20 Proof of Lemma 2.2.
$$\big\|X_0^\top Z_1\big\| = \big\|X_0^\top Z_1 Z_1^\top X_0\big\|^{1/2}
= \big\|X_0^\top \underbrace{(I - Z_0 Z_0^\top)}_{=\,Z_1 Z_1^\top} X_0\big\|^{1/2}
= \big\|I - U \cos^2\Theta\, U^\top\big\|^{1/2} \quad (\text{since } X_0^\top Z_0 = U\cos\Theta\, V^\top)$$
$$= \big\|I - \cos^2\Theta\big\|^{1/2} = \|\sin\Theta\|$$
Spectral methods 2-20

21 Proof of Lemma 2.1. We first claim that the SVD of $X_1^\top Z_0$ can be written as
$$X_1^\top Z_0 = \tilde U \sin\Theta\, V^\top \qquad (2.3)$$
for some orthonormal $\tilde U$ (to be proved later). With this claim in place, one has
$$Z_0 = [X_0, X_1] \begin{bmatrix} X_0^\top Z_0 \\ X_1^\top Z_0 \end{bmatrix}
= [X_0, X_1] \begin{bmatrix} U \cos\Theta\, V^\top \\ \tilde U \sin\Theta\, V^\top \end{bmatrix}
\;\Longrightarrow\;
Z_0 Z_0^\top = [X_0, X_1] \begin{bmatrix} U \cos^2\Theta\, U^\top & U \cos\Theta\sin\Theta\, \tilde U^\top \\ \tilde U \cos\Theta\sin\Theta\, U^\top & \tilde U \sin^2\Theta\, \tilde U^\top \end{bmatrix} \begin{bmatrix} X_0^\top \\ X_1^\top \end{bmatrix}$$
As a consequence,
$$X_0 X_0^\top - Z_0 Z_0^\top = [X_0, X_1] \begin{bmatrix} I - U \cos^2\Theta\, U^\top & -U \cos\Theta\sin\Theta\, \tilde U^\top \\ -\tilde U \cos\Theta\sin\Theta\, U^\top & -\tilde U \sin^2\Theta\, \tilde U^\top \end{bmatrix} \begin{bmatrix} X_0^\top \\ X_1^\top \end{bmatrix}$$
Spectral methods 2-21

22 Proof of Lemma 2.1 (cont.). This further gives
$$\big\|X_0 X_0^\top - Z_0 Z_0^\top\big\|
= \left\| \begin{bmatrix} U & \\ & \tilde U \end{bmatrix}
\begin{bmatrix} \sin^2\Theta & -\cos\Theta\sin\Theta \\ -\cos\Theta\sin\Theta & -\sin^2\Theta \end{bmatrix}
\begin{bmatrix} U^\top & \\ & \tilde U^\top \end{bmatrix} \right\|
= \left\| \underbrace{\begin{bmatrix} \sin^2\Theta & -\cos\Theta\sin\Theta \\ -\cos\Theta\sin\Theta & -\sin^2\Theta \end{bmatrix}}_{\text{each block is a diagonal matrix}} \right\| \quad (\|\cdot\| \text{ is rotationally invariant})$$
$$= \max_{1 \le i \le r} \left\| \begin{bmatrix} \sin^2\theta_i & -\cos\theta_i\sin\theta_i \\ -\cos\theta_i\sin\theta_i & -\sin^2\theta_i \end{bmatrix} \right\|
= \max_{1 \le i \le r} \sin\theta_i \left\| \begin{bmatrix} \sin\theta_i & -\cos\theta_i \\ -\cos\theta_i & -\sin\theta_i \end{bmatrix} \right\|
= \max_{1 \le i \le r} \sin\theta_i = \|\sin\Theta\|$$
Spectral methods 2-22

23 Proof of Lemma 2.1 (cont.). It remains to justify (2.3). To this end, observe that
$$Z_0^\top X_1 X_1^\top Z_0 = Z_0^\top Z_0 - Z_0^\top X_0 X_0^\top Z_0 = I - V \cos^2\Theta\, V^\top = V \sin^2\Theta\, V^\top$$
and hence the right singular space (resp. singular values) of $X_1^\top Z_0$ is given by $V$ (resp. $\sin\Theta$). Spectral methods 2-23

24 Eigen-space perturbation theory

25 Davis-Kahan sin Θ Theorem: a simple case. (Chandler Davis, William Kahan)
Theorem 2.3. Suppose $M \succeq 0$ has rank $r$. If $\|H\| < \lambda_r(M)$, then
$$\mathrm{dist}\big(\hat U_0, U_0\big) \le \frac{\|H U_0\|}{\lambda_r(M) - \|H\|} \le \frac{\|H\|}{\lambda_r(M) - \|H\|}$$
The bound depends on $\lambda_r(M)$, the smallest non-zero eigenvalue of $M$ (i.e. the eigen-gap between $\lambda_r$ and $\lambda_{r+1} = 0$), and on the perturbation size $\|H\|$. Spectral methods 2-25

26 Proof of Theorem 2.3. We intend to control $\hat U_1^\top U_0$ by studying their interactions through $H$:
$$\hat U_1^\top H U_0 = \hat U_1^\top \big(\underbrace{\hat U \hat\Lambda \hat U^\top}_{\hat M = M + H} - \underbrace{U \Lambda U^\top}_{M}\big) U_0
= \hat\Lambda_1 \hat U_1^\top U_0 - \hat U_1^\top U_0 \Lambda_0 \qquad (\text{since } U_1^\top U_0 = \hat U_1^\top \hat U_0 = 0)$$
Therefore,
$$\big\|\hat U_1^\top H U_0\big\| \ge \big\|\hat U_1^\top U_0 \Lambda_0\big\| - \big\|\hat\Lambda_1 \hat U_1^\top U_0\big\| \;\; (\text{triangle inequality})
\;\ge\; \big\|\hat U_1^\top U_0\big\|\, \lambda_r - \big\|\hat U_1^\top U_0\big\|\, \big\|\hat\Lambda_1\big\| \qquad (2.4)$$
In view of Weyl's Theorem, $\|\hat\Lambda_1\| \le \|H\|$, which combined with (2.4) gives
$$\big\|\hat U_1^\top U_0\big\| \le \frac{\big\|\hat U_1^\top H U_0\big\|}{\lambda_r - \|H\|} \le \frac{\|\hat U_1\|\, \|H U_0\|}{\lambda_r - \|H\|} = \frac{\|H U_0\|}{\lambda_r - \|H\|}$$
This together with Lemma 2.1 completes the proof. Spectral methods 2-26

27 Davis-Kahan sin Θ Theorem: more general case.
Theorem 2.4 (Davis-Kahan sin Θ Theorem). Suppose $\lambda_r(M) \ge a + \Delta$ and $\lambda_{r+1}(\hat M) \le a$ for some $\Delta > 0$. Then
$$\mathrm{dist}\big(\hat U_0, U_0\big) \le \frac{\|H U_0\|}{\Delta} \le \frac{\|H\|}{\Delta}$$
Immediate consequence: if $\lambda_r(M) > \lambda_{r+1}(M) + \|H\|$, then
$$\mathrm{dist}\big(\hat U_0, U_0\big) \le \frac{\|H\|}{\underbrace{\lambda_r(M) - \lambda_{r+1}(M)}_{\text{spectral gap}} - \|H\|} \qquad (2.5)$$
Spectral methods 2-27
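A small numerical illustration of (2.5), as a sketch (assumptions: a rank-r PSD ground truth with hand-picked eigenvalues and a symmetric perturbation normalized to spectral norm 1, so the eigen-gap condition holds):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 200, 3
U = np.linalg.qr(rng.standard_normal((n, r)))[0]             # orthonormal n x r
M = U @ np.diag([10.0, 8.0, 6.0]) @ U.T                      # rank-r PSD, lambda_r(M) = 6
H = rng.standard_normal((n, n)); H = (H + H.T) / 2
H /= np.linalg.norm(H, 2)                                    # symmetric perturbation with ||H|| = 1

def top_r_eigvecs(A, r):
    w, V = np.linalg.eigh(A)
    return V[:, np.argsort(w)[::-1][:r]]                     # eigenvectors of the r largest eigenvalues

U0, U0_hat = top_r_eigvecs(M, r), top_r_eigvecs(M + H, r)
dist = np.linalg.norm(U0 @ U0.T - U0_hat @ U0_hat.T, 2)
bound = 1.0 / (6.0 - 0.0 - 1.0)       # ||H|| / (lambda_r(M) - lambda_{r+1}(M) - ||H||), cf. (2.5)
print(dist, bound)                    # dist should not exceed the bound
```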

28 Back to the stochastic block model... Let
$$M = \mathbb{E}[A] - \frac{p+q}{2}\mathbf{1}\mathbf{1}^\top = \frac{p-q}{2}\, x x^\top, \qquad
\hat M = A - \frac{p+q}{2}\mathbf{1}\mathbf{1}^\top, \qquad
u = \frac{1}{\sqrt{n}}\begin{bmatrix}\mathbf{1} \\ -\mathbf{1}\end{bmatrix}$$
Then the Davis-Kahan sin Θ Theorem yields
$$\mathrm{dist}(\hat u, u) \le \frac{\|\hat M - M\|}{\lambda_1(M) - \|\hat M - M\|} = \frac{\|A - \mathbb{E}[A]\|}{\frac{(p-q)n}{2} - \|A - \mathbb{E}[A]\|} \qquad (2.6)$$
Question: how to bound $\|A - \mathbb{E}[A]\|$? Spectral methods 2-28

29 A hammer: matrix Bernstein inequality. Consider a sequence of independent random matrices $\{X_l \in \mathbb{R}^{d_1 \times d_2}\}$ with $\mathbb{E}[X_l] = 0$ and $\|X_l\| \le B$ for each $l$; variance statistic:
$$v := \max\Big\{ \Big\|\mathbb{E}\Big[\sum\nolimits_l X_l X_l^\top\Big]\Big\|, \; \Big\|\mathbb{E}\Big[\sum\nolimits_l X_l^\top X_l\Big]\Big\| \Big\}$$
Theorem 2.5 (Matrix Bernstein inequality). For all $\tau \ge 0$,
$$\mathbb{P}\Big\{ \Big\|\sum\nolimits_l X_l\Big\| \ge \tau \Big\} \le (d_1 + d_2)\, \exp\left( \frac{-\tau^2/2}{v + B\tau/3} \right)$$
Spectral methods 2-29

30 A hammer: matrix Bernstein inequality. Recall
$$\mathbb{P}\Big\{ \Big\|\sum\nolimits_l X_l\Big\| \ge \tau \Big\} \le (d_1 + d_2)\, \exp\left( \frac{-\tau^2/2}{v + B\tau/3} \right)$$
Moderate-deviation regime ($\tau$ is small): sub-Gaussian tail behavior $\exp(-\tau^2/2v)$. Large-deviation regime ($\tau$ is large): sub-exponential tail behavior $\exp(-3\tau/2B)$ (slower decay). User-friendly form (exercise): with prob. $1 - O((d_1+d_2)^{-10})$,
$$\Big\|\sum\nolimits_l X_l\Big\| \lesssim \sqrt{v \log(d_1+d_2)} + B \log(d_1+d_2) \qquad (2.7)$$
Spectral methods 2-30
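The user-friendly form (2.7) is easy to evaluate once $v$ and $B$ are in hand. A sketch for a toy ensemble (an assumption for illustration: independent random signs placed at distinct coordinates, so that $B = 1$ and $v = d$ exactly):

```python
import numpy as np

rng = np.random.default_rng(3)
d = 300
# X_l = eps_l * e_{i_l} e_{j_l}^T, one centered random sign per coordinate (i_l, j_l)
S = rng.choice([-1.0, 1.0], size=(d, d))      # the sum of all X_l is just the sign matrix
B = 1.0                                       # each summand has spectral norm 1
v = float(d)                                  # sum_l E[X_l X_l^T] = d * I (and likewise transposed)

empirical = np.linalg.norm(S, 2)
bound = np.sqrt(v * np.log(2 * d)) + B * np.log(2 * d)   # right-hand side of (2.7), d1 = d2 = d
print(empirical, bound)                       # empirical norm is about 2*sqrt(d), below the bound
```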

31 Bounding $\|A - \mathbb{E}[A]\|$. The matrix Bernstein inequality yields:
Lemma 2.6. Consider the SBM with $p > q \gtrsim \frac{\log n}{n}$. Then with high prob.
$$\|A - \mathbb{E}[A]\| \lesssim \sqrt{np \log n} \qquad (2.8)$$
Spectral methods 2-31
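One can check the scaling in (2.8) empirically. A minimal sketch (parameter choices are illustrative, and the constant hidden by the $\lesssim$ is ignored):

```python
import numpy as np

def sbm_deviation(n, p, q, seed=0):
    """Sample an SBM adjacency matrix and compare ||A - E[A]|| with sqrt(n p log n)."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])
    prob = np.where(np.outer(x, x) > 0, p, q)
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    A = (upper | upper.T).astype(float)
    EA = np.triu(prob, k=1); EA = EA + EA.T            # E[A] with zero diagonal
    return np.linalg.norm(A - EA, 2), np.sqrt(n * p * np.log(n))

for n in [500, 1000, 2000]:
    dev, pred = sbm_deviation(n, p=0.1, q=0.02, seed=n)
    print(n, round(dev, 1), round(pred, 1))            # deviation stays below sqrt(np log n)
```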

32 Statistical accuracy of spectral clustering. Substitute (2.8) into (2.6) to reach
$$\mathrm{dist}(\hat u, u) \le \frac{\|A - \mathbb{E}[A]\|}{\frac{(p-q)n}{2} - \|A - \mathbb{E}[A]\|} \lesssim \frac{\sqrt{np \log n}}{(p-q)n}$$
provided that $(p-q)n \gtrsim \sqrt{np \log n}$. Thus, under the condition $p - q \gtrsim \sqrt{\frac{p \log n}{n}}$, with high prob. one has
$$\mathrm{dist}(\hat u, u) \ll 1 \;\Longrightarrow\; \text{nearly perfect clustering}$$
Spectral methods 2-32

33 Statistical accuracy of spectral clustering.
$$p - q \gtrsim \sqrt{\frac{p \log n}{n}} \;\Longrightarrow\; \text{nearly perfect clustering}$$
Dense regime: if $p \asymp q \asymp 1$, then this condition reads $p - q \gtrsim \sqrt{\frac{\log n}{n}}$.
Sparse regime: if $p = \frac{a \log n}{n}$ and $q = \frac{b \log n}{n}$ for $a, b \asymp 1$, then this condition reads $a - b \gtrsim \sqrt{a}$.
This condition is information-theoretically optimal (up to log factor): Mossel, Neeman, Sly '15; Abbe '18. Spectral methods 2-33

34 Proof of Lemma 2.6. To simplify presentation, assume $A_{i,j}$ and $A_{j,i}$ are independent (check: why this assumption does not change our bounds). Spectral methods 2-34

35 Proof of Lemma 2.6. Write $A - \mathbb{E}[A] = \sum_{i,j} X_{i,j}$, where $X_{i,j} = \big(A_{i,j} - \mathbb{E}[A_{i,j}]\big)\, e_i e_j^\top$.
Since $\mathrm{Var}(A_{i,j}) \le p$, one has $\mathbb{E}[X_{i,j} X_{i,j}^\top] \preceq p\, e_i e_i^\top$, which gives
$$\sum\nolimits_{i,j} \mathbb{E}\big[X_{i,j} X_{i,j}^\top\big] \preceq \sum\nolimits_{i,j} p\, e_i e_i^\top = np\, I$$
Similarly, $\sum_{i,j} \mathbb{E}[X_{i,j}^\top X_{i,j}] \preceq np\, I$. As a result,
$$v = \max\Big\{ \Big\|\sum\nolimits_{i,j} \mathbb{E}\big[X_{i,j} X_{i,j}^\top\big]\Big\|, \; \Big\|\sum\nolimits_{i,j} \mathbb{E}\big[X_{i,j}^\top X_{i,j}\big]\Big\| \Big\} \le np$$
In addition, $\|X_{i,j}\| \le 1 := B$. Take the matrix Bernstein inequality to conclude that with high prob.,
$$\|A - \mathbb{E}[A]\| \lesssim \sqrt{v \log n} + B \log n \lesssim \sqrt{np \log n} \qquad \Big(\text{since } p \gtrsim \frac{\log n}{n}\Big)$$
Spectral methods 2-35

36 Extension: singular subspaces

37 Singular value decomposition. Consider two matrices $M, \hat M = M + H$ with SVDs
$$M = [U_0, U_1] \begin{bmatrix} \Sigma_0 & 0 \\ 0 & \Sigma_1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_0^\top \\ V_1^\top \end{bmatrix}, \qquad
\hat M = [\hat U_0, \hat U_1] \begin{bmatrix} \hat\Sigma_0 & 0 \\ 0 & \hat\Sigma_1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \hat V_0^\top \\ \hat V_1^\top \end{bmatrix}$$
where $U_0$ (resp. $\hat U_0$) and $V_0$ (resp. $\hat V_0$) represent the top-$r$ singular subspaces of $M$ (resp. $\hat M$). Spectral methods 2-37

38 Wedin sin Θ Theorem. The Davis-Kahan Theorem generalizes to singular subspace perturbation:
Theorem 2.7 (Wedin sin Θ Theorem). Suppose $\underbrace{\sigma_r(M)}_{r\text{th singular value}} \ge a + \Delta$ and $\sigma_{r+1}(\hat M) \le a$ for some $\Delta > 0$. Then
$$\max\big\{ \mathrm{dist}\big(\hat U_0, U_0\big), \, \mathrm{dist}\big(\hat V_0, V_0\big) \big\} \le \frac{\overbrace{\max\big\{\|H V_0\|, \, \|H^\top U_0\|\big\}}^{\text{two-sided interactions}}}{\Delta} \le \frac{\|H\|}{\Delta}$$
Spectral methods 2-38
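A quick numerical sanity check of Theorem 2.7 (a sketch, not from the slides: a random rank-r matrix plus small Gaussian noise, with $\Delta$ instantiated as $\sigma_r(M) - \sigma_{r+1}(\hat M)$ and the simplified bound $\|H\|/\Delta$):

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 200, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-r ground truth
H = 0.05 * rng.standard_normal((n, n))                           # small perturbation
M_hat = M + H

def top_r_svd(A, r):
    U, s, Vt = np.linalg.svd(A)
    return U[:, :r], s, Vt[:r].T        # top-r left/right singular subspaces + all singular values

U0, s, V0 = top_r_svd(M, r)
U0_hat, s_hat, V0_hat = top_r_svd(M_hat, r)

dist_U = np.linalg.norm(U0 @ U0.T - U0_hat @ U0_hat.T, 2)
dist_V = np.linalg.norm(V0 @ V0.T - V0_hat @ V0_hat.T, 2)
delta = s[r - 1] - s_hat[r]             # sigma_r(M) - sigma_{r+1}(M_hat)
print(max(dist_U, dist_V), np.linalg.norm(H, 2) / delta)   # left-hand side vs. the Wedin bound
```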

39 Example: low-rank matrix completion. Netflix challenge: Netflix provides highly incomplete ratings from 0.5 million users for 17,770 movies. How to predict unseen user ratings for movies? Spectral methods 2-39

40 Example: low-rank matrix completion. In general, we cannot infer missing ratings: this is an underdetermined system (more unknowns than observations). Spectral methods 2-40

41 Example: low-rank matrix completion... unless the rating matrix has other structure: a few factors explain most of the data. Spectral methods 2-41

42 Example: low-rank matrix completion... unless the rating matrix has other structure: a few factors explain most of the data (low-rank approximation). How to exploit (approx.) low-rank structure in prediction? Spectral methods 2-41

43 Model for low-rank matrix completion (figure credit: Candès). Consider a low-rank matrix $M$; each entry $M_{i,j}$ is observed independently with prob. $p$; goal: fill in the missing entries. Spectral methods 2-42

44 Spectral estimate for matrix completion.
1. Set $\hat M \in \mathbb{R}^{n\times n}$ as
$$\hat M_{i,j} = \begin{cases} \frac{1}{p} M_{i,j}, & \text{if } M_{i,j} \text{ is observed} \\ 0, & \text{else} \end{cases}$$
Rationale for rescaling: $\mathbb{E}[\hat M] = M$.
2. Compute the rank-$r$ SVD $\hat U \hat\Sigma \hat V^\top$ of $\hat M$, and return $(\hat U, \hat\Sigma, \hat V)$.
Spectral methods 2-43
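A minimal sketch of this two-step estimator (illustrative: the rank $r$ is assumed known, and the random observation pattern is simulated inside the function):

```python
import numpy as np

def spectral_completion(M, p, r, seed=0):
    """Observe entries of M independently w.p. p, rescale by 1/p, and return the rank-r SVD."""
    rng = np.random.default_rng(seed)
    mask = rng.random(M.shape) < p
    M_hat = np.where(mask, M / p, 0.0)            # step 1: rescaling so that E[M_hat] = M
    U, s, Vt = np.linalg.svd(M_hat, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r].T              # step 2: rank-r SVD

# usage: rank-1 ground truth with unit-norm factors
rng = np.random.default_rng(1)
n = 1000
u = rng.standard_normal(n); u /= np.linalg.norm(u)
v = rng.standard_normal(n); v /= np.linalg.norm(v)
U_hat, _, V_hat = spectral_completion(np.outer(u, v), p=0.2, r=1)
print(np.linalg.norm(np.outer(u, u) - U_hat @ U_hat.T, 2))   # dist(u_hat, u); shrinks as n*p grows
```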

45 Statistical accuracy of spectral estimate. Let's analyze a simple case where $M = u v^\top$ with
$$u = \frac{1}{\|\tilde u\|_2}\tilde u, \qquad v = \frac{1}{\|\tilde v\|_2}\tilde v, \qquad \tilde u, \tilde v \sim \mathcal{N}(0, I_n)$$
From Wedin's Theorem: if $p \gtrsim \log^3 n / n$, then with high prob.
$$\max\big\{ \mathrm{dist}(\hat u, u), \, \mathrm{dist}(\hat v, v) \big\} \le \frac{\|\hat M - M\|}{\sigma_1(M) - \|\hat M - M\|} \asymp \underbrace{\|\hat M - M\|}_{\text{controlled by matrix Bernstein}} \ll 1 \quad (\text{nearly accurate estimates}) \qquad (2.9)$$
Spectral methods 2-44

46 Sample complexity. For rank-1 matrix completion,
$$p \gtrsim \frac{\log^3 n}{n} \;\Longrightarrow\; \text{nearly accurate estimates}$$
The sample complexity needed for obtaining reliable spectral estimates is therefore
$$n^2 p \gtrsim \underbrace{n \log^3 n}_{\text{optimal up to log factor}}$$
Spectral methods 2-45

47 Proof of (2.9). Write $\hat M - M = \sum_{i,j} X_{i,j}$, where $X_{i,j} = (\hat M_{i,j} - M_{i,j})\, e_i e_j^\top$.
First,
$$\|X_{i,j}\| \le \frac{1}{p} \max_{i,j} |M_{i,j}| \lesssim \frac{\log n}{pn} := B \quad (\text{check})$$
Next, $\mathbb{E}[X_{i,j} X_{i,j}^\top] = \mathrm{Var}(\hat M_{i,j})\, e_i e_i^\top$ and hence
$$\Big\| \mathbb{E}\Big[\sum\nolimits_{i,j} X_{i,j} X_{i,j}^\top\Big] \Big\| = \Big\| \sum\nolimits_i \Big( \sum\nolimits_j \mathrm{Var}\big(\hat M_{i,j}\big) \Big) e_i e_i^\top \Big\| \le \frac{n}{p} \max_{i,j} M_{i,j}^2 \lesssim \frac{\log^2 n}{np} \quad (\text{check})$$
Similar bounds hold for $\mathbb{E}[\sum_{i,j} X_{i,j}^\top X_{i,j}]$. Therefore,
$$v := \max\Big\{ \Big\|\mathbb{E}\Big[\sum\nolimits_{i,j} X_{i,j} X_{i,j}^\top\Big]\Big\|, \; \Big\|\mathbb{E}\Big[\sum\nolimits_{i,j} X_{i,j}^\top X_{i,j}\Big]\Big\| \Big\} \lesssim \frac{\log^2 n}{np}$$
Take the matrix Bernstein inequality to yield: if $p \gtrsim \log^3 n / n$, then
$$\|\hat M - M\| \lesssim \sqrt{v \log n} + B \log n \ll 1$$
Spectral methods 2-46

48 Extension: eigen-space for asymmetric transition matrices

49 Eigen-decomposition for asymmetric matrices. Eigen-decomposition for asymmetric matrices is much trickier; for example: 1. both eigenvalues and eigenvectors might be complex-valued; 2. eigenvectors might not be orthogonal to each other. This lecture focuses on a special case: probability transition matrices. Spectral methods 2-48

50 Probability transition matrices. Consider a Markov chain $\{X_t\}_{t \ge 0}$ with $n$ states:
transition probability $\mathbb{P}\{X_{t+1} = j \mid X_t = i\} = P_{i,j}$; transition matrix $P = [P_{i,j}]_{1 \le i,j \le n}$;
stationary distribution $\pi = [\pi_1, \ldots, \pi_n]$ (with $\pi_1 + \cdots + \pi_n = 1$) is the 1st left eigenvector of $P$: $\pi P = \pi$.
$\{X_t\}_{t \ge 0}$ is said to be reversible if $\pi_i P_{i,j} = \pi_j P_{j,i}$ for all $i, j$. Spectral methods 2-49
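For concreteness, the stationary distribution can be computed numerically as the left eigenvector of $P$ associated with eigenvalue 1; a minimal sketch with a hypothetical 3-state chain:

```python
import numpy as np

P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])           # a small transition matrix (each row sums to 1)

eigvals, eigvecs = np.linalg.eig(P.T)     # left eigenvectors of P = right eigenvectors of P^T
k = np.argmin(np.abs(eigvals - 1.0))      # eigenvalue 1 corresponds to the stationary distribution
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()                        # normalize so that pi_1 + ... + pi_n = 1
print(pi, pi @ P)                         # pi P = pi
```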

51 Eigenvector perturbation for transition matrices. Define $\|a\|_\pi := \sqrt{\sum_{1 \le i \le n} \pi_i a_i^2}$.
Theorem 2.8 (Chen, Fan, Ma, Wang '17). Suppose $P, \hat P$ are transition matrices with stationary distributions $\pi, \hat\pi$, respectively. Assume $P$ induces a reversible Markov chain. If $1 > \max\{\lambda_2(P), -\lambda_n(P)\} + \|\hat P - P\|_\pi$, then
$$\frac{\|\hat\pi - \pi\|_\pi}{\|\pi\|_\pi} \le \frac{\big\|\pi(\hat P - P)\big\|_\pi}{\underbrace{1 - \max\{\lambda_2(P), -\lambda_n(P)\}}_{\text{spectral gap}} - \underbrace{\|\hat P - P\|_\pi}_{\text{perturbation}}}$$
$\hat P$ does not need to induce a reversible Markov chain. Spectral methods 2-50

52 Example: ranking from pairwise comparisons pairwise comparisons for ranking tennis players figure credit: Bozóki, Csató, Temesi Spectral methods 2-51

53 Bradley-Terry-Luce (logistic) model. $n$ items to be ranked; assign a latent (preference) score $\{w_i\}_{1 \le i \le n}$ to each item, so that item $i \succ$ item $j$ if $w_i > w_j$; each pair of items $(i, j)$ is compared independently, with
$$\mathbb{P}\{\text{item } j \text{ beats item } i\} = \frac{w_j}{w_i + w_j}$$
Spectral methods 2-52

54 Bradley-Terry-Luce (logistic) model. $n$ items to be ranked; assign a latent (preference) score $\{w_i\}_{1 \le i \le n}$ to each item, so that item $i \succ$ item $j$ if $w_i > w_j$; each pair of items $(i, j)$ is compared independently, yielding
$$y_{i,j} \overset{\text{ind.}}{=} \begin{cases} 1, & \text{with prob. } \frac{w_j}{w_i + w_j} \\ 0, & \text{else} \end{cases}$$
Spectral methods 2-52

55 Spectral ranking method. Construct a probability transition matrix $\hat P$ obeying
$$\hat P_{i,j} = \begin{cases} \frac{1}{2n}\, y_{i,j}, & \text{if } i \neq j \\ 1 - \sum_{l: l \neq i} \hat P_{i,l}, & \text{if } i = j \end{cases}$$
Return the score estimate as the leading left eigenvector $\hat\pi$ of $\hat P$. Closely related to PageRank! Spectral methods 2-53
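Putting the BTL model and this estimator together, here is a minimal simulation sketch (not from the slides; the score range and the convention y[j, i] = 1 - y[i, j] for a single comparison per pair are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
w = rng.uniform(1.0, 2.0, size=n)                 # latent scores with bounded dynamic range

# one comparison per pair: y[i, j] = 1 if item j beats item i
y = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        y[i, j] = float(rng.random() < w[j] / (w[i] + w[j]))
        y[j, i] = 1.0 - y[i, j]

P_hat = y / (2 * n)                               # off-diagonal entries (diagonal of y is zero)
np.fill_diagonal(P_hat, 1.0 - P_hat.sum(axis=1))  # diagonal makes each row sum to 1

# score estimate: leading left eigenvector of P_hat
eigvals, eigvecs = np.linalg.eig(P_hat.T)
pi_hat = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi_hat = pi_hat / pi_hat.sum()

pi = w / w.sum()                                  # stationary distribution of E[P_hat] (true scores)
print(np.linalg.norm(pi_hat - pi) / np.linalg.norm(pi))   # relative error; shrinks as n grows
```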

56 Rationale behind spectral method. For $i \neq j$,
$$\mathbb{E}\big[\hat P_{i,j}\big] = \frac{1}{2n} \cdot \frac{w_j}{w_i + w_j}$$
so $P := \mathbb{E}[\hat P]$ obeys $w_i P_{i,j} = w_j P_{j,i}$ (detailed balance). Thus, the stationary distribution $\pi$ of $P$ obeys
$$\pi = \frac{1}{\sum_l w_l}\, w \qquad (\text{reveals the true scores})$$
Spectral methods 2-54

57 Statistical guarantees for spectral ranking (Negahban, Oh, Shah '16; Chen, Fan, Ma, Wang '17). Suppose $\max_{i,j} w_i / w_j \asymp 1$. Then with high prob.
$$\frac{\|\hat\pi - \pi\|_2}{\|\pi\|_2} \asymp \frac{\|\hat\pi - \pi\|_\pi}{\|\pi\|_\pi} \lesssim \sqrt{\frac{1}{n}} \;\to\; 0 \qquad (\text{nearly perfect estimate})$$
Consequence of Theorem 2.8 and the matrix Bernstein inequality (exercise). Spectral methods 2-55

58 References
[1] The rotation of eigenvectors by a perturbation, C. Davis, W. Kahan, SIAM Journal on Numerical Analysis.
[2] Perturbation bounds in connection with singular value decomposition, P. Wedin, BIT Numerical Mathematics.
[3] Inference, estimation, and information processing, EE 378B lecture notes, A. Montanari, Stanford University.
[4] COMS 4772 lecture notes, D. Hsu, Columbia University.
[5] High-dimensional statistics: a non-asymptotic viewpoint, M. Wainwright, Cambridge University Press.
[6] Principal component analysis for big data, J. Fan, Q. Sun, W. Zhou, Z. Zhu, arXiv preprint.
Spectral methods 2-56

59 References
[7] Community detection and stochastic block models, E. Abbe, Foundations and Trends in Communications and Information Theory.
[8] Consistency thresholds for the planted bisection model, E. Mossel, J. Neeman, A. Sly, ACM Symposium on Theory of Computing.
[9] Matrix completion from a few entries, R. Keshavan, A. Montanari, S. Oh, IEEE Transactions on Information Theory.
[10] The PageRank citation ranking: bringing order to the web, L. Page, S. Brin, R. Motwani, T. Winograd.
[11] Rank centrality: ranking from pairwise comparisons, S. Negahban, S. Oh, D. Shah, Operations Research.
[12] Spectral method and regularized MLE are both optimal for top-K ranking, Y. Chen, J. Fan, C. Ma, K. Wang, accepted to Annals of Statistics.
Spectral methods 2-57
