Memory-Efficient Low Rank Approximation of Massive Graphs


1 Fast and Memory-Efficient Low Rank Approximation of Massive Graphs
Inderjit S. Dhillon, University of Texas at Austin.
Purdue University, Jan 31, 2012.
Joint work with Berkant Savas, Donghyuk Shin, Si Si, Han Song, Yin Zhang, Lili Qiu, Xin Sui, and Keshav Pingali.

2 Networks. Social networks (Facebook, MySpace, LiveJournal, ...); communication networks (SKT); biological networks (linking genes to diseases); web graphs; collaboration and citation networks; ...
Challenges: massive scale (Facebook's social graph has over 750 million vertices); multiple sources of information. We need scalable algorithms and good predictions.

3 Social Networks. Nodes: individual actors (people or groups of people). Edges: relationships (social interactions) between nodes. Example: Karate Club Network.

4 Social Network Analysis. Structural properties, models, and dynamics of social networks; behavioral analysis; game theory.
Structural properties:
- Proximity measures: distances between pairs of vertices
- Community detection
- Strong ties vs. weak ties
- Link prediction: who will become whose friend in the future
- Identifying important people (centrality)
- Quantifying information flow through a link (communicability)
- Identifying bridging links (betweenness)

5 Link Prediction Problem. Consider a social network that evolves with time: A^{(t-2)}, A^{(t-1)}, A^{(t)}, and the unknown A^{(t+1)}.
[Figure: example graphs at successive time steps over the nodes Alice, Bob, Carol, Dave, and Eve.]
Q: Can we predict links that will form at time step t + 1?
A: Many models exist, e.g., common neighbors, the Katz measure, etc.
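
To make the common-neighbors baseline concrete: the scores can be read off the square of the adjacency matrix. The following is a minimal sketch (not from the slides), assuming a dense, symmetric 0/1 adjacency matrix A:

```python
import numpy as np

def common_neighbor_scores(A):
    """Common-neighbor link scores: (A @ A)[i, j] counts length-2 walks,
    i.e., the neighbors shared by nodes i and j."""
    CN = A @ A
    np.fill_diagonal(CN, 0)           # ignore self-pairs
    return np.where(A > 0, 0, CN)     # keep scores only for non-edges
```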

6 Matrix Functions. The adjacency matrix A = [a_{ij}] of a graph G = (V, E) is given by a_{ij} = w_{ij} if there is an edge between nodes i and j, and a_{ij} = 0 otherwise.
Powers of A: (A^n)_{ij} counts the number of length-n walks between nodes i and j.
Matrix functions serve as a useful conceptual and computational tool, e.g., the matrix exponential and the Katz measure:
f(A) = e^A = I + A + A^2/2! + A^3/3! + ... + A^k/k! + ...
f(A) = (I - \beta A)^{-1} = I + \beta A + \beta^2 A^2 + \beta^3 A^3 + ...
E. Estrada and D. J. Higham. Network Properties Revealed through Matrix Functions. SIAM Review, 2010.
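
For intuition, the Katz series can be evaluated by truncating the sum after a few terms; a small sketch, assuming a dense A and a damping parameter beta chosen small enough for convergence (beta < 1/||A||_2):

```python
import numpy as np

def katz_truncated(A, beta=0.005, terms=10):
    """Truncate f(A) = I + beta*A + beta^2*A^2 + ... after a fixed
    number of terms; beta^n * A^n down-weights longer walks."""
    n = A.shape[0]
    term, K = np.eye(n), np.eye(n)
    for _ in range(terms):
        term = beta * (term @ A)      # beta^n A^n
        K += term
    return K
```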

7 Matrix Functions. Centrality: f(A)_{ii} quantifies node i's well-connectedness. Communicability: f(A)_{ij} measures how easy it is for information to pass from node i to node j. Betweenness: quantifies the influence of a node as information flows around a network, i.e., how communicability changes when the node is removed:
betweenness(r) = \sum_{i \ne j, i \ne r, j \ne r} [f(A)_{ij} - f(A - E(r))_{ij}] / f(A)_{ij},
where A - E(r) is A with all edges of node r removed.

8 Low-Rank Approximations. Truncated SVD: A ≈ V Λ V^T allows approximations of matrix functions:
e^A ≈ V e^Λ V^T
(I - βA)^{-1} ≈ V (I - βΛ)^{-1} V^T
BUT the SVD has a major drawback. For example, the LiveJournal graph: 3.8 million nodes and 65 million edges; average degree = 17.1; it requires about 1 GB of memory to store the adjacency matrix, while a rank-100 approximation requires about 6 GB of memory.
PROBLEM: the SVD does not preserve SPARSITY.
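
These function approximations follow directly from a truncated eigendecomposition; here is a minimal sketch using scipy.sparse.linalg.eigsh for a symmetric adjacency matrix (the rank k and beta values are illustrative choices, not from the slides):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def lowrank_matrix_functions(A, k=100, beta=0.005):
    """A ≈ V Λ V^T with k extremal eigenpairs, then apply f to Λ only:
    e^A ≈ V e^Λ V^T and (I - βA)^{-1} ≈ V (I - βΛ)^{-1} V^T."""
    lam, V = eigsh(A, k=k)                          # k largest-|λ| eigenpairs
    expA = (V * np.exp(lam)) @ V.T                  # column-scale V by e^{λ_i}
    katz = (V * (1.0 / (1.0 - beta * lam))) @ V.T
    return expA, katz
```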

9 SVD is NOT Optimal. Social network with 13,500 nodes and 365,000 edges (from LiveJournal).
[Figures: relative reconstruction error (%) of the SVD vs. our method with 10 clusters, plotted against the rank of approximation and against memory usage (number of floating point numbers).]

10 Proposed Method

11 Graph Partitioning/Clustering. In many applications, the goal is to partition/cluster the nodes of a graph. Example: Karate Club Network.

12 Graph clustering. Cluster the vertices V into c disjoint sets V_i: V = ∪_{i=1}^c V_i and V_i ∩ V_j = ∅ for i ≠ j. Objective functions for clustering (see the sketch below for evaluating them):
RCut = min_{V_1,...,V_c} \sum_{i=1}^c links(V_i, V \ V_i) / |V_i|   (Chan et al. 1994)
NCut = min_{V_1,...,V_c} \sum_{i=1}^c links(V_i, V \ V_i) / degree(V_i)   (Shi and Malik 2000)
KLObj = min_{V_1,...,V_c} \sum_{i=1}^c links(V_i, V \ V_i) s.t. |V_i| = |V_j|   (Kernighan and Lin 1970)
Spectral clustering can be expensive for massive graphs. Fast clustering algorithms without eigenvector computations: GRACLUS (Dhillon et al. 2007) and METIS (Karypis and Kumar 1999).
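
For concreteness, a small sketch that evaluates the RCut and NCut objectives for a given partition (assumptions: dense symmetric weighted A, and an integer cluster label per vertex):

```python
import numpy as np

def cut_objectives(A, labels):
    """Evaluate RCut and NCut for a given clustering.
    links(V_i, V \\ V_i) = total edge weight leaving cluster i."""
    rcut = ncut = 0.0
    deg = A.sum(axis=1)
    for c in np.unique(labels):
        inside = labels == c
        links_out = A[inside][:, ~inside].sum()
        rcut += links_out / inside.sum()          # normalize by cluster size
        ncut += links_out / deg[inside].sum()     # normalize by cluster degree
    return rcut, ncut
```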

13 Multilevel partitioning scheme Repeated coarsening and partitioning

14 Clustering example: arXiv data graph condmat (21,363 vertices, 182,628 edges).
  clustering   # of clusters   edges within clusters
  GRACLUS      10              79.8%
  METIS        10              77.9%
[Figure: block structure A = (A_ij) with diagonal blocks A_11, ..., A_cc after clustering; relative error.]

15 Low rank vs. clustered low rank.
Low rank: A ≈ U Σ V^T.
Clustered low rank: A ≈ [U_1 0; 0 U_2] [S_11 S_12; S_21 S_22] [V_1 0; 0 V_2]^T.
Observe that diag(U_1, U_2, ..., U_c) has the same memory usage as U. Experiments show that most of U is contained in diag(U_1, U_2, ..., U_c), and similarly most of V in diag(V_1, V_2, ..., V_c).

16 Algorithm: Clustered low rank approximation.
Input: an m × m adjacency matrix A of a graph; number of clusters c.
Output: clustered low rank approximation of A.
1: Cluster the graph into c clusters.
2: Compute a low rank approximation of each diagonal block: A_ii ≈ U_i S_i V_i^T.
3: Extend the cluster-wise approximations into an approximation of the entire matrix A: S_ij = U_i^T A_ij V_j.
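
A compact sketch of the algorithm for the symmetric case (so V_i = U_i), assuming cluster labels 0, ..., c-1 are already computed (e.g., by GRACLUS or METIS) and each cluster has more than k vertices; this is illustrative, not the authors' implementation:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

def clustered_low_rank(A, labels, k):
    """Clustered low rank approximation A ≈ U S U^T (sketch).
    A: sparse symmetric adjacency matrix; labels: cluster id per vertex."""
    order = np.argsort(labels)                # group vertices by cluster
    A = A[order][:, order].tocsr()
    sizes = np.bincount(labels)
    blocks, start = [], 0
    for sz in sizes:
        Aii = A[start:start + sz, start:start + sz]
        # Step 2: rank-k SVD of the diagonal block (assumes sz > k)
        Ui, _, _ = svds(Aii.asfptype(), k=k)
        blocks.append(Ui)
        start += sz
    U = sp.block_diag(blocks, format="csr")   # block-diagonal basis
    # Step 3: S = U^T A U, i.e., S_ij = U_i^T A_ij U_j for all blocks
    S = (U.T @ (A @ U)).toarray()
    return U, S, order                        # A[order][:, order] ≈ U S U^T
```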

17 Approximations within each cluster.
Deterministic methods: truncated SVD using ARPACK, PROPACK, or SVDPACK.
Stochastic methods: Y = AΩ or Y = (AA^T)^q AΩ, where q is small (1, 2, or 3) and Ω is a random matrix (standard Gaussian). An approximation is obtained as
A ≈ Â = P_Y A = Y (Y^T Y)^{-1} Y^T A,
where P_Y is the orthogonal projection onto the range space of Y.
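
A minimal sketch of the stochastic method (the parameter names k for target rank and p for oversampling are my own): sample the range with a Gaussian test matrix, optionally apply q power iterations, then project:

```python
import numpy as np

def randomized_range_approx(A, k, p=5, q=1, seed=0):
    """Stochastic approximation (sketch): Y = (A A^T)^q A Ω with Gaussian Ω,
    then A ≈ P_Y A = Q Q^T A, where Q is an orthonormal basis of range(Y)."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + p))  # p = oversampling
    Y = A @ Omega
    for _ in range(q):                                # q is small: 1, 2, or 3
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    return Q, (A.T @ Q).T                             # factors of P_Y A = Q (Q^T A)
```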

18 Deterministic clustered error bound. For any partitioning A = (A_ij), i, j = 1, ..., c, and any Y^{(i)}, form Y = diag(Y^{(1)}, ..., Y^{(c)}); then P_Y = diag(P_{Y^{(1)}}, ..., P_{Y^{(c)}}), and P_Y and each P_{Y^{(i)}} are orthogonal projectors.
Theorem:
||(I - P_Y) A||_F^2 = \sum_{i,j=1}^c ||(I - P_{Y^{(i)}}) A_{ij}||_F^2.

19 Stochastic error bounds in the clustered setting.
Theorem:
E ||(I - P_Y) A||_F^2 ≤ \sum_{i=1}^c (1 + k_i/(p_i - 1)) ||Σ_2^{(i)}||_F^2 + \sum_{i,j=1, i≠j}^c ||A_{ij}||_F^2,
E ||(I - P_Y) A||_2 ≤ \sum_{i=1}^c [ (1 + \sqrt{k_i/(p_i - 1)}) ||Σ_2^{(i)}||_2 + (e \sqrt{k_i + p_i} / p_i) ||Σ_2^{(i)}||_F ] + \sum_{i,j=1, i≠j}^c ||A_{ij}||_2,
where A_ii = U^{(i)} Σ^{(i)} (V^{(i)})^T = U^{(i)} [Σ_1^{(i)} 0; 0 Σ_2^{(i)}] [V_1^{(i)} V_2^{(i)}]^T.

20 LiveJournal graph. 3.8 million vertices; 65 million edges. Edges within clusters: 76.9%, 69.3%, 66.3%; the graph is not highly clusterable. Ranks are k = 20, 50, 100, 150, 200.
[Figures: relative error (%) vs. number of parameters (memory) for svds and for clustered approximations with 22, 61, and 117 clusters; and for svds, randalg, and randalgpow with and without clustering (117 clusters).]

21 Multi-scale Prediction

22 Hierarchical Clustering: arXiv data graph condmat (21,363 vertices, 182,628 edges). Networks exhibit a hierarchical structure.

23 Hierarchical Approach.
Level 0: A.
Level 1: A^{(1)}_{11}, A^{(1)}_{22}.
Level 2: A^{(2)}_{11}, A^{(2)}_{22}, A^{(2)}_{33}, A^{(2)}_{44}.

24 Hierarchical Approach.
Level 0: A, with basis U.
Level 1: A^{(1)}_{11} with U^{(1)}_1; A^{(1)}_{22} with U^{(1)}_2.
Level 2: A^{(2)}_{11} with U^{(2)}_1; A^{(2)}_{22} with U^{(2)}_2; A^{(2)}_{33} with U^{(2)}_3; A^{(2)}_{44} with U^{(2)}_4.

25 Hierarchical Approach (same cluster tree as above). A has 2 clustered low rank approximations.
Level 1: A ≈ A^{(1)} = [U^{(1)}_1 0; 0 U^{(1)}_2] [S^{(1)}_{11} S^{(1)}_{12}; S^{(1)}_{21} S^{(1)}_{22}] [U^{(1)}_1 0; 0 U^{(1)}_2]^T.

26 Hierarchical Approach (same cluster tree as above). A has 2 clustered low rank approximations.
Level 2: A ≈ A^{(2)} = diag(U^{(2)}_1, U^{(2)}_2, U^{(2)}_3, U^{(2)}_4) [S^{(2)}_{ij}]_{i,j=1}^{4} diag(U^{(2)}_1, U^{(2)}_2, U^{(2)}_3, U^{(2)}_4)^T.

27 SVD of parent node. Parent A^{(P)} with basis U^{(P)}; children A^{(C)}_{11} with U^{(C)}_1 and A^{(C)}_{22} with U^{(C)}_2.
Q: Given U^{(C)}_1 and U^{(C)}_2, each of rank k, how do we compute U^{(P)}?
Alternative 1: U^{(P)} = diag(U^{(C)}_1, U^{(C)}_2). But then U^{(P)} has rank 2k.
Alternative 2: U^{(P)} comes from a truncated rank-k SVD of A^{(P)}. But this is computationally slow if U^{(P)} is computed from scratch.

28 Closeness to SVD: Principal Angles. arXiv data graph AstroPh: 17,903 nodes, 394,003 edges.
[Figure: cosines of the principal angles between svd(A^{(P)}) and span(diag(U^{(C)}_1, U^{(C)}_2)), plotted for each basis vector.]

29 Fast approximation of SVD of parent node.
Input: A = A^{(P)}, Ω = diag(U^{(C)}_1, U^{(C)}_2), and integer q.
Output: U^{(P)}.
1: Compute Y = (AA^T)^q AΩ.
2: Compute Q as an orthonormal basis for the range of Y, e.g., via the QR factorization Y = QR.
3: Compute B = Q^T A Q. (So A ≈ Q(Q^T A Q)Q^T.)
4: Compute the eigendecomposition B = V Λ V^T.
5: Compute U^{(P)} = QV.
This is just q steps of power (subspace) iteration followed by orthogonal projection onto the resulting subspace.
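
A sketch of this procedure, assuming a symmetric A^{(P)} (so B is symmetric) and the children's bases stacked block-diagonally as the test matrix Ω:

```python
import numpy as np
from scipy.linalg import block_diag

def parent_basis(A, U_children, q=1):
    """Approximate U^{(P)} by reusing the children's bases:
    Omega = diag(U_1^{(C)}, U_2^{(C)}), Y = (A A^T)^q A Omega,
    then project A onto range(Y) and diagonalize."""
    Omega = block_diag(*U_children)        # m x (sum of child ranks)
    Y = A @ Omega
    for _ in range(q):                     # Y = (A A^T)^q A Omega
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    B = Q.T @ (A @ Q)                      # A ≈ Q B Q^T
    lam, V = np.linalg.eigh(B)             # B is symmetric since A is
    return Q @ V, lam                      # columns of Q V approximate U^{(P)}
```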

30 Closeness to SVD: Principal Angles. arXiv data graph AstroPh: 17,903 nodes, 394,003 edges.
[Figure: cosines of the principal angles between svd(A^{(P)}) and U^{(P)}, for basis reuse with q = 1 and for random Ω with q = 1, 2, 3, 4, 5, plotted for each basis vector.]

31 Multi-Scale prediction. Compute f(A^{(0)}), f(A^{(1)}), f(A^{(2)}), ..., f(A^{(l)}).
Single-scale prediction: g(f(A^{(i)})) for one level i.
Multi-scale prediction: g(w_0 f(A^{(0)}) + w_1 f(A^{(1)}) + ... + w_l f(A^{(l)})), or combine the individual predictions g(f(A^{(i)})), i = 0, 1, ..., l.
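
One way to realize this with the Katz measure: because the clustered basis U is block-orthonormal (U^T U = I), f can be applied to the small core S instead of A. A sketch with illustrative weights (in practice the w_i would be tuned, e.g., on held-out edges):

```python
import numpy as np

def katz_from_clra(U, S, beta=0.005):
    """With A ≈ U S U^T and U^T U = I, the Woodbury identity gives
    (I - beta*A)^{-1} - I ≈ U [(I - beta*S)^{-1} - I] U^T."""
    I = np.eye(S.shape[0])
    M = np.linalg.inv(I - beta * S) - I
    return U @ M @ U.T        # dense; in practice score only candidate pairs

def multiscale_katz(levels, weights):
    """Weighted multi-scale combination of per-level Katz scores.
    levels: list of (U, S) factor pairs, one per hierarchy level."""
    return sum(w * katz_from_clra(U, S) for (U, S), w in zip(levels, weights))
```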

32 Link Inference. Karate-Club: 34 nodes, 78 edges.
[Figure: number of hits among the top-k predictions for CN, Katz, CLRA Katz, MultiScale Katz, and RandClust Katz.]

33 Link Inference. soc-Epinions: 32,223 nodes, 684,026 edges.
[Figure: false negative rate vs. false positive rate for CN, Katz (Lanczos), CLRA Katz, and MultiScale Katz.]

34 Link Prediction. LiveJournal: 1,757,326 nodes; T1: 84,366,676 edges; T2: 85,666,494 edges.
[Figure: false negative rate vs. false positive rate for CN, Katz (Lanczos), CLRA Katz, and HCLRA Katz.]

35 Parallelization

36 Parallelized Clustered Low Rank Approximation. ParMETIS partitions the graph in parallel; each process computes the SVDs of the diagonal blocks independently; heuristics balance the approximation of the off-diagonal blocks.
[Pipeline: matrix A → ParMETIS → graph reorganization → SVDs of diagonal blocks → off-diagonal matrix approximations → approximated matrix Â.]

37 Parallel Implementation. Lonestar (TACC) machine: 1,888 nodes, each with two Xeon 6-core 3.3 GHz processors and 40 GB/s of point-to-point bandwidth. MPI, ParMETIS 3.1.1, ARPACK, and GotoBLAS 1.30; one process is allocated per node. LiveJournal data: 3,828,682 vertices; 39,870,459 edges.

38 Parallel Implementation: Relations data. 9,746,232 vertices; 38,955,401 edges.

39 ParMETIS becomes the bottleneck. Breakdown of runtime: (a) LiveJournal, (b) Relations.

40 Summary and Conclusions. Structural analysis of massive social networks.
New method: clustered low rank approximation. It combines fast clustering with low rank approximation; it is memory-efficient (respects the sparsity of the graph); it improves the quality of approximation; and it is efficiently parallelizable. Applied to the computation of proximity measures and link prediction on large social networks.
Future work: hierarchical/multilevel approximation AND prediction; develop better (parallel) multilevel clustering for scale-free graphs.

41 References.
B. Savas and I. S. Dhillon. Clustered low rank approximation of graphs in information science applications. SIAM Conference on Data Mining (SDM), 2011.
H. H. Song, B. Savas, T. W. Cho, V. Dave, Z. Lu, I. S. Dhillon, Y. Zhang, and L. Qiu. Clustered embedding of massive online social networks. Working manuscript.
I. S. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
E. Estrada and D. J. Higham. Network Properties Revealed through Matrix Functions. SIAM Review, 2010.
D. Easley and J. Kleinberg. Networks, Crowds, and Markets. Cambridge University Press, 2010.
