Heat Kernel Based Community Detection
|
|
- Brian Arnold
- 6 years ago
- Views:
Transcription
1 Heat Kernel Based Community Detection Joint with David F. Gleich, (Purdue), supported by" NSF CAREER CCF Kyle Kloster! Purdue University!
2 Local Community Detection Given seed(s) S in G, find a community that contains S. seed Community?
3 Local Community Detection Given seed(s) S in G, find a community that contains S. seed Community? high internal, low external connectivity!
4 Low-conductance sets are communities # edges leaving T conductance( T ) = # edge endpoints in T = chance a random step exits T
5 Low-conductance sets are communities # edges leaving T conductance( T ) = # edge endpoints in T = chance a random step exits T conductance(comm) = 39/381 =.102 How to find these?!
6 Graph diffusions find low-conductance sets A diffusion propagates rank from a seed across a graph. seed = high! = low! diffusion value!
7 Graph diffusions find low-conductance sets A diffusion propagates rank from a seed across a graph. seed = high! = low! diffusion value! = local community /! low-conductance set! Okay " how does this work?
8 Graph Diffusion A diffusion models how a mass (green dye, money, popularity) spreads from a seed across a network. seed p 0 p 1 p 2 p 3
9 Graph Diffusion diffuse A diffusion models how a mass (green dye, money, popularity) spreads from a seed across a network. seed p 0 p 1 p 2 p 3
10 Graph Diffusion diffuse A diffusion models how a mass seed (green dye, money, popularity) spreads from a seed across a p 0 p 1 p 2 p 3 network." Once mass reaches a node, it propagates to the neighbors, with some decay. decay : dye dilutes, money is taxed, popularity fades
11 Graph Diffusion A diffusion models how a mass seed (green dye, money, popularity) spreads from a seed across a network. " Once mass reaches a node, it propagates to the neighbors, diffuse with some decay. decay : dye dilutes, money is taxed, popularity fades p 0 p 1 p 2 p 3
12 Graph Diffusion A diffusion models how a mass seed (green dye, money, popularity) spreads from a seed across a network." Once mass reaches a node, it propagates to the neighbors, with some decay. decay : dye dilutes, money is taxed, popularity fades p 0 p 1 p 2 p 3
13 Diffusion score diffusion score of a node = " weighted sum of the mass at that node during different stages. c 0 p 0 + c 1 p 1 + c 2 p 2 + c p 3 + 3
14 Diffusion score diffusion score of a node = " weighted sum of the mass at that node during different stages. c 0 p 0 + c 1 p 1 + c 2 p 2 + c p diffusion score vector = f! 1X f = c k P k s k=0 P = random-walk s = c k = transition matrix normalized seed vector weight on stage k
15 Heat Kernel vs. PageRank Diffusions Heat Kernel uses t k /k! " Our work is new analysis for this diffusion. t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! + PageRank uses α k at stage k." Standard, widely-used diffusion we use for comparison. α 0 p 0 + α 1 p 1 + α 2 p 2 + α3 p 3 +
16 Heat Kernel vs. PageRank Behavior HK emphasizes earlier stages of diffusion =0.99 PR, α k Weight 10 5 t=1 t=5 t=15 =0.85 HK, t k /k! Length à involve shorter walks from seed, à so HK looks at smaller sets than PR
17 Heat Kernel vs. PageRank Theory PR good conductance Local Cheeger Inequality:" PR finds set of nearoptimal conductance fast algorithm PPR-push is O(1/(ε(1-α))) in theory, fast in practice [Andersen Chung Lang 06] HK
18 Heat Kernel vs. PageRank Theory PR good conductance Local Cheeger Inequality:" PR finds set of nearoptimal conductance fast algorithm PPR-push is O(1/(ε(1-α))) in theory, fast in practice [Andersen Chung Lang 06] HK Local Cheeger Inequality [Chung 07]
19 Heat Kernel vs. PageRank Theory PR good conductance Local Cheeger Inequality:" PR finds set of nearoptimal conductance fast algorithm PPR-push is O(1/(ε(1-α))) in theory, fast in practice [Andersen Chung Lang 06] HK Local Cheeger Inequality [Chung 07] Our work!
20 Our work on Heat Kernel: theory THEOREM Our algorithm for a relative" ε-accuracy in a degree-weighted norm has runtime <= O( e t (log(1/ε) + log(t)) / ε) (which is constant, regardless of graph size)
21 Our work on Heat Kernel: theory THEOREM Our algorithm for a relative" ε-accuracy in a degree-weighted norm has runtime <= O( e t (log(1/ε) + log(t)) / ε) (which is constant, regardless of graph size) COROLLARY HK is local! (O(1) runtime à diffusion vector has O(1) entries)!
22 Our work on Heat Kernel: results First efficient, deterministic HK algorithm. Deterministic is important to be able to compare the behaviors of HK and PR experimentally: Our key findings! HK more accurately describes ground-truth communities in real-world networks! identifies smaller sets à better precision speed & conductance comparable with PR
23 Python" demo Twitter graph 41.6 M nodes 2.4 B edges un-optimized Python code on a laptop Available for download:!
24 Algorithm Outline Computing HK! 1. Pre-compute push thresholds 2. Do push on all entries above threshold
25 Algorithm Intuition Computing HK given parameters t, ε, seed s! Starting from here How to end up here? seed p 0 p 1 p 2 p 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
26 Algorithm Intuition Begin with mass at seed(s) in a residual staging area, r 0 The residuals r k hold mass that is unprocessed it s like error seed r 0 r 1 r 2 r 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
27 Push Operation push (1) remove entry in r k," (2) put in p, r 0 r 1 r 2 r 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
28 Push Operation push (1) remove entry in r k," (2) put in p, (3) then scale and spread to neighbors in next r! t 1 1! r 0 r 1 r 2 r 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
29 Push Operation push (1) remove entry in r k," (2) put in p, (3) then scale and spread to neighbors in next r (repeat) t 2 2! r 0 r 1 r 2 r 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
30 Push Operation push (1) remove entry in r k," (2) put in p, (3) then scale and spread to neighbors in next r (repeat) r 0 r 1 r 2 r 3 t 2 2! t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
31 Push Operation push (1) remove entry in r k," (2) put in p, (3) then scale and spread to neighbors in next r (repeat) r 0 r 1 r 2 r 3 t 2 2! t 3 3! t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
32 Thresholds entries < threshold ERROR equals weighted sum of entries left in r k à Set threshold so leftovers sum to < ε r 0 r 1 r 2 r 3 t 0 p 0 0! t ! p 1 t 2 p 2 p 3 2! t 3 3! +
33 Thresholds entries < threshold ERROR equals weighted sum of entries left in r k à Set threshold so leftovers sum to < ε Threshold for stage r k is" sum of remaining scale factors" prior scaling factor t 0 p 0 0! r 0 r 1 r 2 r 3 t ! p 1 t 2 p 2 p 3 2! t 3 3! +
34 Algorithm Outline Computing HK! 1. Pre-compute push thresholds 2. Do push on all entries above threshold
35 Communities in Real-world Networks Given a seed in an unidentified real-world community, how well can HK and PR describe that community? Measure quality using F 1 -measure. Graph V E amazon 330 K 930 K dblp 320 K 1 M youtube 1.1 M 3 M lj 4 M 35 M orkut 3.1 M 120 M friendster 66 M 1.8 B Datasets from SNAP collection [Leskovec] F 1 -measure is the harmonic mean of " # correct guesses precision = # total guesses and recall = # answers you get # answers there are
36
37 data F 1 precision set size comm HK PR HK PR HK PR size amazon dblp youtube lj orkut friendster
38 data F 1 precision set size comm HK PR HK PR HK PR size amazon dblp youtube lj orkut friendster PR achieves high recall by guessing a huge set HK identifies a tighter cluster, so attains better precision
39 Runtime &" Conductance" " HK is comparable in runtime and conductance." " " As graphs scale," the diffusions performance becomes even more similar. Runtime (s) log10(conductances) Runtime: hk vs. ppr hkgrow 50% 25% 75% pprgrow 50% 25% 75% log10( V + E ) 10 0 Conductances: hk vs. ppr hkgrow 50% 25% 75% pprgrow 50% 25% 75% log10( V + E )
40 Code, references, future work Code available at!! Ongoing work! - generalizing to other diffusions - simultaneously compute multiple diffusions Questions or suggestions? Kyle Kloster at kkloste-at-purdue-dot-edu!
Facebook Friends! and Matrix Functions
Facebook Friends! and Matrix Functions! Graduate Research Day Joint with David F. Gleich, (Purdue), supported by" NSF CAREER 1149756-CCF Kyle Kloster! Purdue University! Network Analysis Use linear algebra
More informationA Nearly Sublinear Approximation to exp{p}e i for Large Sparse Matrices from Social Networks
A Nearly Sublinear Approximation to exp{p}e i for Large Sparse Matrices from Social Networks Kyle Kloster and David F. Gleich Purdue University December 14, 2013 Supported by NSF CAREER 1149756-CCF Kyle
More informationStrong localization in seeded PageRank vectors
Strong localization in seeded PageRank vectors https://github.com/nassarhuda/pprlocal David F. Gleich" Huda Nassar Computer Science" Computer Science " supported by NSF CAREER CCF-1149756, " IIS-1422918,
More informationOverlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach
Overlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach Author: Jaewon Yang, Jure Leskovec 1 1 Venue: WSDM 2013 Presenter: Yupeng Gu 1 Stanford University 1 Background Community
More informationSeeded PageRank solution paths
Euro. Jnl of Applied Mathematics (2016), vol. 27, pp. 812 845. c Cambridge University Press 2016 doi:10.1017/s0956792516000280 812 Seeded PageRank solution paths D. F. GLEICH 1 and K. KLOSTER 2 1 Department
More informationParallel Local Graph Clustering
Parallel Local Graph Clustering Kimon Fountoulakis, joint work with J. Shun, X. Cheng, F. Roosta-Khorasani, M. Mahoney, D. Gleich University of California Berkeley and Purdue University Based on J. Shun,
More informationStrong Localization in Personalized PageRank Vectors
Strong Localization in Personalized PageRank Vectors Huda Nassar 1, Kyle Kloster 2, and David F. Gleich 1 1 Purdue University, Computer Science Department 2 Purdue University, Mathematics Department {hnassar,kkloste,dgleich}@purdue.edu
More informationLecture: Local Spectral Methods (2 of 4) 19 Computing spectral ranking with the push procedure
Stat260/CS294: Spectral Graph Methods Lecture 19-04/02/2015 Lecture: Local Spectral Methods (2 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide
More informationHeat Kernel Based Community Detection
Heat Kernel Based Community Detection Kyle Kloster Purdue University West Lafayette, IN kkloste@purdueedu David F Gleich Purdue University West Lafayette, IN dgleich@purdueedu ABSTRACT The heat kernel
More informationLocally-biased analytics
Locally-biased analytics You have BIG data and want to analyze a small part of it: Solution 1: Cut out small part and use traditional methods Challenge: cutting out may be difficult a priori Solution 2:
More informationLocal Lanczos Spectral Approximation for Community Detection
Local Lanczos Spectral Approximation for Community Detection Pan Shi 1, Kun He 1( ), David Bindel 2, John E. Hopcroft 2 1 Huazhong University of Science and Technology, Wuhan, China {panshi,brooklet60}@hust.edu.cn
More informationFour graph partitioning algorithms. Fan Chung University of California, San Diego
Four graph partitioning algorithms Fan Chung University of California, San Diego History of graph partitioning NP-hard approximation algorithms Spectral method, Fiedler 73, Folklore Multicommunity flow,
More informationJure Leskovec Joint work with Jaewon Yang, Julian McAuley
Jure Leskovec (@jure) Joint work with Jaewon Yang, Julian McAuley Given a network, find communities! Sets of nodes with common function, role or property 2 3 Q: How and why do communities form? A: Strength
More informationLecture: Local Spectral Methods (3 of 4) 20 An optimization perspective on local spectral methods
Stat260/CS294: Spectral Graph Methods Lecture 20-04/07/205 Lecture: Local Spectral Methods (3 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide
More informationCross-lingual and temporal Wikipedia analysis
MTA SZTAKI Data Mining and Search Group June 14, 2013 Supported by the EC FET Open project New tools and algorithms for directed network analysis (NADINE No 288956) Table of Contents 1 Link prediction
More informationLecture: Local Spectral Methods (1 of 4)
Stat260/CS294: Spectral Graph Methods Lecture 18-03/31/2015 Lecture: Local Spectral Methods (1 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide
More informationKristina Lerman USC Information Sciences Institute
Rethinking Network Structure Kristina Lerman USC Information Sciences Institute Università della Svizzera Italiana, December 16, 2011 Measuring network structure Central nodes Community structure Strength
More informationUsing Local Spectral Methods in Theory and in Practice
Using Local Spectral Methods in Theory and in Practice Michael W. Mahoney ICSI and Dept of Statistics, UC Berkeley ( For more info, see: http: // www. stat. berkeley. edu/ ~ mmahoney or Google on Michael
More informationAttentive Betweenness Centrality (ABC): Considering Options and Bandwidth when Measuring Criticality
Attentive Betweenness Centrality (ABC): Considering Options and Bandwidth when Measuring Criticality Sibel Adalı, Xiaohui Lu, M. Magdon-Ismail September 5, 0 Who is the Most Critical? Would you use the
More informationSlide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.
Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University http://www.mmds.org #1: C4.5 Decision Tree - Classification (61 votes) #2: K-Means - Clustering
More informationLab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018
Lab 8: Measuring Graph Centrality - PageRank Monday, November 5 CompSci 531, Fall 2018 Outline Measuring Graph Centrality: Motivation Random Walks, Markov Chains, and Stationarity Distributions Google
More informationGraph Partitioning Using Random Walks
Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient
More informationTopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs
TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs Zhewei Wei School of Infromation, Renmin University of China zhewei@ruc.edu.cn Sibo Wang University of Queensland sibo.wang@uq.edu.au
More informationLecture 14: Random Walks, Local Graph Clustering, Linear Programming
CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com
More informationDATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS
DATA MINING LECTURE 3 Link Analysis Ranking PageRank -- Random walks HITS How to organize the web First try: Manually curated Web Directories How to organize the web Second try: Web Search Information
More informationDS504/CS586: Big Data Analytics Graph Mining II
Welcome to DS504/CS586: Big Data Analytics Graph Mining II Prof. Yanhua Li Time: 6:00pm 8:50pm Mon. and Wed. Location: SL105 Spring 2016 Reading assignments We will increase the bar a little bit Please
More informationEFFICIENT ALGORITHMS FOR PERSONALIZED PAGERANK
EFFICIENT ALGORITHMS FOR PERSONALIZED PAGERANK A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE
More informationA Nearly-Sublinear Method for Approximating a Column of the Matrix Exponential for Matrices from Large, Sparse Networks
A Nearly-Sublinear Method for Approximating a Column of the Matrix Exponential for Matrices from Large, Sparse Networks Kyle Kloster, and David F. Gleich 2, Purdue University, Mathematics Department 2
More informationPU Learning for Matrix Completion
Cho-Jui Hsieh Dept of Computer Science UT Austin ICML 2015 Joint work with N. Natarajan and I. S. Dhillon Matrix Completion Example: movie recommendation Given a set Ω and the values M Ω, how to predict
More informationGreedy Maximization Framework for Graph-based Influence Functions
Greedy Maximization Framework for Graph-based Influence Functions Edith Cohen Google Research Tel Aviv University HotWeb '16 1 Large Graphs Model relations/interactions (edges) between entities (nodes)
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University.
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu What is the structure of the Web? How is it organized? 2/7/2011 Jure Leskovec, Stanford C246: Mining Massive
More informationLOCAL CHEEGER INEQUALITIES AND SPARSE CUTS
LOCAL CHEEGER INEQUALITIES AND SPARSE CUTS CAELAN GARRETT 18.434 FINAL PROJECT CAELAN@MIT.EDU Abstract. When computing on massive graphs, it is not tractable to iterate over the full graph. Local graph
More informationNETWORKS (a.k.a. graphs) are important data structures
PREPRINT Mixed-Order Spectral Clustering for Networks Yan Ge, Haiping Lu, and Pan Peng arxiv:82.040v [cs.lg] 25 Dec 208 Abstract Clustering is fundamental for gaining insights from complex networks, and
More informationCutting Graphs, Personal PageRank and Spilling Paint
Graphs and Networks Lecture 11 Cutting Graphs, Personal PageRank and Spilling Paint Daniel A. Spielman October 3, 2013 11.1 Disclaimer These notes are not necessarily an accurate representation of what
More informationWeb Structure Mining Nodes, Links and Influence
Web Structure Mining Nodes, Links and Influence 1 Outline 1. Importance of nodes 1. Centrality 2. Prestige 3. Page Rank 4. Hubs and Authority 5. Metrics comparison 2. Link analysis 3. Influence model 1.
More information0.1 Naive formulation of PageRank
PageRank is a ranking system designed to find the best pages on the web. A webpage is considered good if it is endorsed (i.e. linked to) by other good webpages. The more webpages link to it, and the more
More informationLink Analysis. Stony Brook University CSE545, Fall 2016
Link Analysis Stony Brook University CSE545, Fall 2016 The Web, circa 1998 The Web, circa 1998 The Web, circa 1998 Match keywords, language (information retrieval) Explore directory The Web, circa 1998
More informationRegularized Laplacian Estimation and Fast Eigenvector Approximation
Regularized Laplacian Estimation and Fast Eigenvector Approximation Patrick O. Perry Information, Operations, and Management Sciences NYU Stern School of Business New York, NY 1001 pperry@stern.nyu.edu
More informationLink Prediction. Eman Badr Mohammed Saquib Akmal Khan
Link Prediction Eman Badr Mohammed Saquib Akmal Khan 11-06-2013 Link Prediction Which pair of nodes should be connected? Applications Facebook friend suggestion Recommendation systems Monitoring and controlling
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More information6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search
6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search Daron Acemoglu and Asu Ozdaglar MIT September 30, 2009 1 Networks: Lecture 7 Outline Navigation (or decentralized search)
More informationOnline Social Networks and Media. Link Analysis and Web Search
Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information
More informationAlgorithmic Primitives for Network Analysis: Through the Lens of the Laplacian Paradigm
Algorithmic Primitives for Network Analysis: Through the Lens of the Laplacian Paradigm Shang-Hua Teng Computer Science, Viterbi School of Engineering USC Massive Data and Massive Graphs 500 billions web
More informationThe Ties that Bind Characterizing Classes by Attributes and Social Ties
The Ties that Bind WWW April, 2017, Bryan Perozzi*, Leman Akoglu Stony Brook University *Now at Google. Introduction Outline Our problem: Characterizing Community Differences Proposed Method Experimental
More informationOnline Social Networks and Media. Link Analysis and Web Search
Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information
More informationEdo Liberty Principal Scientist Amazon Web Services. Streaming Quantiles
Edo Liberty Principal Scientist Amazon Web Services Streaming Quantiles Streaming Quantiles Manku, Rajagopalan, Lindsay. Random sampling techniques for space efficient online computation of order statistics
More informationEstimating Clustering Coefficients and Size of Social Networks via Random Walk
Estimating Clustering Coefficients and Size of Social Networks via Random Walk Stephen J. Hardiman* Capital Fund Management France Liran Katzir Advanced Technology Labs Microsoft Research, Israel *Research
More informationModel Reduction for Edge-Weighted Personalized PageRank
Model Reduction for Edge-Weighted Personalized PageRank D. Bindel 11 Apr 2016 D. Bindel Purdue 11 Apr 2016 1 / 33 The Computational Science & Engineering Picture Application Analysis Computation MEMS Smart
More informationContinuous Influence Maximization: What Discounts Should We Offer to Social Network Users?
Continuous Influence Maximization: What Discounts Should We Offer to Social Network Users? ABSTRACT Yu Yang Simon Fraser University Burnaby, Canada yya119@sfu.ca Jian Pei Simon Fraser University Burnaby,
More informationStatistical Problem. . We may have an underlying evolving system. (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t
Markov Chains. Statistical Problem. We may have an underlying evolving system (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t Consecutive speech feature vectors are related
More informationOutline for today. Information Retrieval. Cosine similarity between query and document. tf-idf weighting
Outline for today Information Retrieval Efficient Scoring and Ranking Recap on ranked retrieval Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University Efficient
More informationDissertation Defense
Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation
More informationDS504/CS586: Big Data Analytics Graph Mining II
Welcome to DS504/CS586: Big Data Analytics Graph Mining II Prof. Yanhua Li Time: 6-8:50PM Thursday Location: AK233 Spring 2018 v Course Project I has been graded. Grading was based on v 1. Project report
More informationFast Algorithms for Pseudoarboricity
Fast Algorithms for Pseudoarboricity Meeting on Algorithm Engineering and Experiments ALENEX 2016 Markus Blumenstock Institute of Computer Science, University of Mainz January 10th, 2016 Pseudotrees Orientations
More informationOn Top-k Structural. Similarity Search. Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada
On Top-k Structural 1 Similarity Search Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China 2014/10/14 Pei
More informationLecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian
Lecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian Radu Balan February 5, 2018 Datasets diversity: Social Networks: Set of individuals ( agents, actors ) interacting with each other (e.g.,
More informationAn Efficient reconciliation algorithm for social networks
An Efficient reconciliation algorithm for social networks Silvio Lattanzi (Google Research NY) Joint work with: Nitish Korula (Google Research NY) ICERM Stochastic Graph Models Outline Graph reconciliation
More informationA Local Spectral Method for Graphs: With Applications to Improving Graph Partitions and Exploring Data Graphs Locally
Journal of Machine Learning Research 13 (2012) 2339-2365 Submitted 11/12; Published 9/12 A Local Spectral Method for Graphs: With Applications to Improving Graph Partitions and Exploring Data Graphs Locally
More informationLink Mining PageRank. From Stanford C246
Link Mining PageRank From Stanford C246 Broad Question: How to organize the Web? First try: Human curated Web dictionaries Yahoo, DMOZ LookSmart Second try: Web Search Information Retrieval investigates
More informationEdge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier
Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier W. Xie D. Bindel A. Demers J. Gehrke 12 Aug 2015 W. Xie, D. Bindel, A. Demers, J. Gehrke KDD2015 12 Aug 2015 1 / 1 PageRank
More informationMining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University
Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit
More informationSpectral Analysis of Random Walks
Graphs and Networks Lecture 9 Spectral Analysis of Random Walks Daniel A. Spielman September 26, 2013 9.1 Disclaimer These notes are not necessarily an accurate representation of what happened in class.
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationCost and Preference in Recommender Systems Junhua Chen LESS IS MORE
Cost and Preference in Recommender Systems Junhua Chen, Big Data Research Center, UESTC Email:junmshao@uestc.edu.cn http://staff.uestc.edu.cn/shaojunming Abstract In many recommender systems (RS), user
More informationAlgorithms for Picture Analysis. Lecture 07: Metrics. Axioms of a Metric
Axioms of a Metric Picture analysis always assumes that pictures are defined in coordinates, and we apply the Euclidean metric as the golden standard for distance (or derived, such as area) measurements.
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationMachine Learning CPSC 340. Tutorial 12
Machine Learning CPSC 340 Tutorial 12 Random Walk on Graph Page Rank Algorithm Label Propagation on Graph Assume a strongly connected graph G = (V, A) Label Propagation on Graph Assume a strongly connected
More informationA Tale of Two Eigenvalue Problems
A Tale of Two Eigenvalue Problems Music of the Microspheres + Hearing the Shape of a Graph David Bindel Department of Computer Science Cornell University CAM Visit Talk, 28 Feb 214 Brown bag 1 / 22 The
More informationPersonal PageRank and Spilling Paint
Graphs and Networks Lecture 11 Personal PageRank and Spilling Paint Daniel A. Spielman October 7, 2010 11.1 Overview These lecture notes are not complete. The paint spilling metaphor is due to Berkhin
More informationData and Algorithms of the Web
Data and Algorithms of the Web Link Analysis Algorithms Page Rank some slides from: Anand Rajaraman, Jeffrey D. Ullman InfoLab (Stanford University) Link Analysis Algorithms Page Rank Hubs and Authorities
More informationWiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte
Reputation Systems I HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte Yury Lifshits Wiki Definition Reputation is the opinion (more technically, a social evaluation) of the public toward a person, a
More informationLAPLACIAN MATRIX AND APPLICATIONS
LAPLACIAN MATRIX AND APPLICATIONS Alice Nanyanzi Supervisors: Dr. Franck Kalala Mutombo & Dr. Simukai Utete alicenanyanzi@aims.ac.za August 24, 2017 1 Complex systems & Complex Networks 2 Networks Overview
More information) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2.
1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2011 Recitation 8, November 3 Corrected Version & (most) solutions
More informationEfficient Subgraph Matching by Postponing Cartesian Products. Matthias Barkowsky
Efficient Subgraph Matching by Postponing Cartesian Products Matthias Barkowsky 5 Subgraph Isomorphism - Approaches search problem: find all embeddings of a query graph in a data graph NP-hard, but can
More informationCS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine
CS 277: Data Mining Mining Web Link Structure Class Presentations In-class, Tuesday and Thursday next week 2-person teams: 6 minutes, up to 6 slides, 3 minutes/slides each person 1-person teams 4 minutes,
More informationUnderstanding graphs through spectral densities
Understanding graphs through spectral densities David Bindel Department of Computer Science Cornell University SCAN seminar, 29 Feb 216 SCAN 1 / 1 Can One Hear the Shape of a Drum? 2 u = λu on Ω, u = on
More informationAlgorithms, Graph Theory, and the Solu7on of Laplacian Linear Equa7ons. Daniel A. Spielman Yale University
Algorithms, Graph Theory, and the Solu7on of Laplacian Linear Equa7ons Daniel A. Spielman Yale University Rutgers, Dec 6, 2011 Outline Linear Systems in Laplacian Matrices What? Why? Classic ways to solve
More informationThe Push Algorithm for Spectral Ranking
The Push Algorithm for Spectral Ranking Paolo Boldi Sebastiano Vigna March 8, 204 Abstract The push algorithm was proposed first by Jeh and Widom [6] in the context of personalized PageRank computations
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Graph and Network Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Methods Learnt Classification Clustering Vector Data Text Data Recommender System Decision Tree; Naïve
More informationSpectral Graph Theory and its Applications. Daniel A. Spielman Dept. of Computer Science Program in Applied Mathematics Yale Unviersity
Spectral Graph Theory and its Applications Daniel A. Spielman Dept. of Computer Science Program in Applied Mathematics Yale Unviersity Outline Adjacency matrix and Laplacian Intuition, spectral graph drawing
More information6.842 Randomness and Computation March 3, Lecture 8
6.84 Randomness and Computation March 3, 04 Lecture 8 Lecturer: Ronitt Rubinfeld Scribe: Daniel Grier Useful Linear Algebra Let v = (v, v,..., v n ) be a non-zero n-dimensional row vector and P an n n
More informationBeyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian
Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics
More informationExtractors and the Leftover Hash Lemma
6.889 New Developments in Cryptography March 8, 2011 Extractors and the Leftover Hash Lemma Instructors: Shafi Goldwasser, Yael Kalai, Leo Reyzin, Boaz Barak, and Salil Vadhan Lecturer: Leo Reyzin Scribe:
More informationEstimating network degree distributions from sampled networks: An inverse problem
Estimating network degree distributions from sampled networks: An inverse problem Eric D. Kolaczyk Dept of Mathematics and Statistics, Boston University kolaczyk@bu.edu Introduction: Networks and Degree
More informationData dependent operators for the spatial-spectral fusion problem
Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.
More information7.1 Sampling Error The Need for Sampling Distributions
7.1 Sampling Error The Need for Sampling Distributions Tom Lewis Fall Term 2009 Tom Lewis () 7.1 Sampling Error The Need for Sampling Distributions Fall Term 2009 1 / 5 Outline 1 Tom Lewis () 7.1 Sampling
More informationThe Limiting Shape of Internal DLA with Multiple Sources
The Limiting Shape of Internal DLA with Multiple Sources January 30, 2008 (Mostly) Joint work with Yuval Peres Finite set of points x 1,...,x k Z d. Start with m particles at each site x i. Each particle
More informationNetwork Essence: PageRank Completion and Centrality-Conforming Markov Chains
Network Essence: PageRank Completion and Centrality-Conforming Markov Chains Shang-Hua Teng Computer Science and Mathematics USC November 29, 2016 Abstract Jiří Matoušek (1963-2015) had many breakthrough
More informationProtein Complex Identification by Supervised Graph Clustering
Protein Complex Identification by Supervised Graph Clustering Yanjun Qi 1, Fernanda Balem 2, Christos Faloutsos 1, Judith Klein- Seetharaman 1,2, Ziv Bar-Joseph 1 1 School of Computer Science, Carnegie
More informationCapacity Releasing Diffusion for Speed and Locality
000 00 002 003 004 005 006 007 008 009 00 0 02 03 04 05 06 07 08 09 020 02 022 023 024 025 026 027 028 029 030 03 032 033 034 035 036 037 038 039 040 04 042 043 044 045 046 047 048 049 050 05 052 053 054
More informationPerformance evaluation of binary classifiers
Performance evaluation of binary classifiers Kevin P. Murphy Last updated October 10, 2007 1 ROC curves We frequently design systems to detect events of interest, such as diseases in patients, faces in
More informationLocal Flow Partitioning for Faster Edge Connectivity
Local Flow Partitioning for Faster Edge Connectivity Monika Henzinger, Satish Rao, Di Wang University of Vienna, UC Berkeley, UC Berkeley Edge Connectivity Edge-connectivity λ: least number of edges whose
More informationChapter 7 Network Flow Problems, I
Chapter 7 Network Flow Problems, I Network flow problems are the most frequently solved linear programming problems. They include as special cases, the assignment, transportation, maximum flow, and shortest
More informationMath 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank
Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank David Glickenstein November 3, 4 Representing graphs as matrices It will sometimes be useful to represent graphs
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Clustering Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu May 2, 2017 Announcements Homework 2 due later today Due May 3 rd (11:59pm) Course project
More informationDeterministic Decentralized Search in Random Graphs
Deterministic Decentralized Search in Random Graphs Esteban Arcaute 1,, Ning Chen 2,, Ravi Kumar 3, David Liben-Nowell 4,, Mohammad Mahdian 3, Hamid Nazerzadeh 1,, and Ying Xu 1, 1 Stanford University.
More informationDiffusion and random walks on graphs
Diffusion and random walks on graphs Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Structural
More informationFirst, parse the title...
First, parse the title... Eigenvector localization: Eigenvectors are usually global entities But they can be localized in extremely sparse/noisy graphs/matrices Implicit regularization: Usually exactly
More information