Node similarity and classification
|
|
- Louisa Preston
- 6 years ago
- Views:
Transcription
1 Node similarity and classification Davide Mottin, Anton Tsitsulin HassoPlattner Institute Graph Mining course Winter Semester 2017
2 Acknowledgements Some part of this lecture is taken from: Other adapted content is from Social Network Data Analytics (Springer) Ed. Charu Aggarwal, March
3 SimRank Structural context Idea: two objects are similar if they are referenced by similar objects decay factor in [0,1] total # of in-neighbors pairs Avg similarity between in-neighbors of α and in-neighbors of b similarity of inneighbors G(V,E) α b Glen Jeh and Jennifer Widom. SimRank: a measure of structural-context similarity. SIGKDD,
4 SimRank: Intuition G(V,E) G 2 (V 2,E 2 ) Structural context Intuition: Computing SimRank is like propagating on the G " graph of node-node pairs 1. The source of similarity is self-vertices, like (Univ, Univ). 2. Similarity propagates along pair-paths in G ", away from the sources. 4
5 SimRank for Bipartite Graphs G(V,E) A,B Avg similarity between out-neighbors of A and out-neighbors of B music videos music c d c,d Avg similarity between in-neighbors of c and in-neighbors of d [Jeh, Widom 02; Improvements: Antonellis+ 08 SimRank++, C. Li, Han+ 10, Y. Zhang 13, P. Li+ 14 ] 5
6 Lecture road Similarity based Iterative classification Label propagation 6
7 Iterative classification methods Idea: Use features that take into account the neighbor nodes and repeat the classification several time until nothing changes Suppose for each node we have two features: 1. Number of neighbors with class A 2. Number of neighbors without a class [1,1] [1,1] A A Repeat until convergence [1,1] [1,1] A A [1,1] [1,1] A A Learn A Recompute [0,1] Labels and [2,1] [0,1] features on apply to nodes [2,1] [1,0] unlabeled A [2,1] Neville, J. and Jensen, D., Iterative classification in relational data. AAAI. 7
8 Iterative Classification Algorithm (ICA) Train classifier using the labeled instances Until convergence Apply classifier to the unlabeled nodes Updated the feature vectors for unlabeled nodes Return the labels for the labeled nodes Neville, J. and Jensen, D., Iterative classification in relational data. AAAI. 8
9 Extension to multi-labels Each node has a distribution over the labels To avoid noise keep only the top-k labels for each unlabeled node sorted in descending order. Intuition: remove the less confident labels Lu, Q. and Getoor, L., Link-based classification. In ICML. 9
10 Lecture road Similarity based Iterative classification Label propagation 10
11 Guilt-by-Association Techniques????? Given: graph and few labeled nodes Find: class (red/green) for rest nodes Assuming: network effect (homophily/ heterophily) red green Fraudster Accomplice Honest 11
12 Personalized Random Walk with Restarts (RWR) Idea: Propagate labels from a set of nodes to the rest of the graph [Brin+ 98; Haveliwala 03; Tong+ 06; Minkov, Cohen 07] 12
13 PageRank: A kind of random walk John Andrew Bill Laura The importance of John is high if Laura, Andrew, and Bill are also important P_Rank John = -_./ _./01 6/78/ 5-_./01 90:8;< =>?@AB DE FGHIFGJK 13
14 Random Walk In a random walk you do the opposite, you assume that the walker moves randomly and chooses one of the neighbors to visit 1 1/2 Andrew Tom 1/2 John 1. Tom chooses Andrew or Laura with probability 1/2. 2. Once he chooses one it increases the number of times he visited that node 3. Continue the process until nothing changes anymore (at a probabilistic level) 12 Laura John will receive many visits since many nodes are connected to him 14
15 Personalized RWR Tom Now assume that with probability c you perform another move and with probability (1-c) you jump back to Tom Andrew Laura Therefore the probability for the walker of being in Tom place will be Probability of visiting Andrew at time (t-1) Tom t = c Prob Andrew t 1 + Prob Laura t (1 c) Number of Andrew and Laura s neighbors Probability of jumping back to Tom 15
16 Comparing two nodes Start two random walks from the two nodes you want to compare separately Compare the final scores you obtain for each node in the graph using some vector comparison (e.g., cosine similarity, KL-divergence) Tom Andrew Alice John Paul [Tom, Andrew, Alice, John,, Paul] Vector for Tom = [0.2, 0.2, 0.1, 0.3,, 0.01] Vector for John = [0.05, 0.15, 0.2, 0.2,..., 0.2] Compare the two vectors (e.g., subtract) 16
17 Personalized RWR Tom Now assume that with probability c you perform another move and with probability (1-c) you jump back to Tom restart prob [I cd 1 A]x = (1 c)y ½ ½ ½ 0 ½ 0 1 0? graph relevance structure vector starting vector 17
18 Personalized RWR Personalized RWR is defined as the probability of a random surfer of reaching a node n after wandering in the graph for a long time After some time the value for n (and for all the other nodes) does not changes anymore Technically speaking the random walk converges to a stationary distribution Let A be the adjacency matrix of graph G = V, E x = cd Hf Ax + 1 c y where D is a diagonal matrix with the node degree in the diagonal and y is a vector containing the probability of starting from any of the nodes y 3, x is the probability of reaching any node in the graph dim x = dim y = V What is D Hf A? Recall that the inverse of diagonal matrix is a diagonal matrix containing the reciprocal of the elements in the diagonal We need to find x Ix = cd Hf Ax + 1 c y Ix cd Hf Ax = 1 c y (I cd Hf A)x = 1 c y x = I cd Hf A Hf 1 c y 18
19 Another way of looking at the RW x = cd Hf Ax + 1 c y Suppose the surfer starts from the beginning, it will choose one node at random among y In one step he will either choose, with probability (1-c) another starting node or with probability c to move to one neighbor. Why? Think about the process without restart, setting W = D Hf A. x f = Wx r x " = Wx f = WWx r = W " x r x 0 = W 0 x r That means that W 3,o 0 contains the probability of reaching j starting from i in n steps 19
20 Semi-Supervised Learning (SSL) Idea If you have few labeled nodes exploit the structure and the homophily (similarity) between nodes graph-based few labeled nodes edges: similarity between nodes Inference: exploit neighborhood information S T E P 1?? -0.3 S T E P Ji, M., Sun, Y., Danilevsky, M., Han, J. and Gao, J., Graph regularized transductive classification on heterogeneous information networks. ECML/PKDD. 20
21 SSL Equation Laplacian matrix 0.8 [I + a(d A)]x = y ? 1 d graph final homophily strength of structure labels neighbors ~ stiffness of spring d1 d known labels??
22 SSL Equation What does it compute? [I + a(d A)]x = y Hard? Let s unroll it! x = a D A x + y = aax adx + y 0 x 3 = a A 3o x o + (y 3 ad 33 x 3 ) o f Sum the labels from the neighbors Difference between the learned label in the node and the input label 22
23 When is RWR = SSL? [I cd 1 A]x = (1 c)y [I + a(d A)]x = y *all nodes have degree d (1 c) I cd Hf A Hf = I + a D A Hf Hf 1 c I 1 c 1 c DHf A = I + a D A Hf 1 c I 1 c 1 c DHf A = I + a D A 1 c I I 1 c 1 c DHf A = a D A c c I 1 c 1 c DHf A = a D A c 1 c (I DHf A) = a D A c 1 c (I 1 A) = a D A d c D A = ad D A 1 c c d(1 c) = a We assume the graph is d-regular* 23
24 Belief Propagation Iterative message-based method Propagation matrix : ² Homophily class of receiver AI class of sender until st nd round stop criterion fulfilled PL - Pearl, J., Reverend Bayes on inference engines: A distributed hierarchical approach. In AAAI. - Yedidia, J.S., Freeman, W.T. and Weiss, Y., Understanding belief propagation and its generalizations. Exploring artificial intelligence in the new millennium. 24
25 Belief Propagation Iterative message-based method Propagation matrix : ² Homophily class of receiver class of sender ² Heterophily
26 Belief Propagation Equations message(i > j) belief(i) x homophily strength i j 26
27 Belief Propagation Equations belief of i b i (x i ) φ i (x i ) m ij (x i ) prior belief j N(i) messages from neighbors i j 27
28 Fast Belief Propagation Original [Yedidia+]: Belief Propagation m ij (x j ) = φ i (x i ) ψ ij (x i, x j ) m ni (x i ) xi b i (x i ) = η φ i (x i ) m ij (x i ) j N (i) non-linear n N(i)\ j FaBP [Koutra+]: Linearized BP BPis approximated by [I + ad c'a] b h = φ h d1 d2 d3 linear ? prior beliefs Koutra, D., Ke, T.Y., Kang, U., Chau, D.H.P., Pao, H.K.K. and Faloutsos, C., Unifying guilt-by-association approaches: Theorems and fast algorithms. PKDD 28
29 Wait How? m ij (x j ) = φ i (x i ) ψ ij (x i, x j ) m ni (x i ) xi b i (x i ) = η φ i (x i ) m ij (x i ) j N (i) n N(i)\ j [I + ad c'a] b h = φ h [I + ad c'a] b h = φ h Ib ƒ + adb ƒ c Ab = φ ƒ b ƒ = ad c A b + φ ƒ b ƒ i = ad 33 b ƒ (i) c o A 3o b ƒ j + φ ƒ (i) Exchanged messages 29
30 Qualitative Comparison GBA Method Heterophily Scalability Convergence RWR SSL BP? FABP 30
31 Correspondence of Methods RWR SSL BP Random Walk Semi-supervised Belief with Restart Learning Propagation Method Matrix unknown known RWR [I c AD -1 ] x = (1-c)y SSL [I + a(d - A)] x = y FABP [I + a D - c A] b h = φ h ?
32 Questions? 32
33 References Lorrain, F. and White, H.C.. Structural equivalence of individuals in social networks. The Journal of mathematical sociology, Borgatti, S.P. and Everett, M.G., Notions of position in social network analysis. Sociological methodology. Borgatti, S.P. and Everett, M.G., Regular blockmodels of multiway, multimode matrices. Social Networks. Henderson, K., Gallagher, B., Eliassi-Rad, T., Tong, H., Basu, S., Akoglu, L., Koutra, D., Faloutsos, C. and Li, L., Rolx: structural role extraction & mining in large graphs. SIGKDD Henderson, K., Gallagher, B., Li, L., Akoglu, L., Eliassi-Rad, T., Tong, H. and Faloutsos, C., It's who you know: graph mining using recursive structural features. SIGKDD. H. Tong, Y. Koren, & C. Faloutsos. Fast direction-aware proximity for graph mining. SIGKDD, Glen Jeh and Jennifer Widom. SimRank: a measure of structural-context similarity. SIGKDD, 2002 Ji, M., Sun, Y., Danilevsky, M., Han, J. and Gao, J., Graph regularized transductive classification on heterogeneous information networks. ECML/PKDD. Pearl, J., Reverend Bayes on inference engines: A distributed hierarchical approach. In AAAI. Yedidia, J.S., Freeman, W.T. and Weiss, Y., Understanding belief propagation and its generalizations. Exploring artificial intelligence in the new millennium. Koutra, D., Ke, T.Y., Kang, U., Chau, D.H.P., Pao, H.K.K. and Faloutsos, C., Unifying guilt-byassociation approaches: Theorems and fast algorithms. PKDD Neville, J. and Jensen, D., Iterative classification in relational data. AAAI. Lu, Q. and Getoor, L., Link-based classification. In ICML. 33
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms Danai Koutra 1, Tai-You Ke 2,U.Kang 1, Duen Horng (Polo) Chau 1, Hsing-Kuo Kenneth Pao 2, and Christos Faloutsos 1 1 School of Computer
More informationECEN 689 Special Topics in Data Science for Communications Networks
ECEN 689 Special Topics in Data Science for Communications Networks Nick Duffield Department of Electrical & Computer Engineering Texas A&M University Lecture 8 Random Walks, Matrices and PageRank Graphs
More informationAnomaly Detection. Davide Mottin, Konstantina Lazaridou. HassoPlattner Institute. Graph Mining course Winter Semester 2016
Anomaly Detection Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 and Next week February 7, third round presentations Slides are due by February 6,
More informationSemi-supervised learning for node classification in networks
Semi-supervised learning for node classification in networks Jennifer Neville Departments of Computer Science and Statistics Purdue University (joint work with Paul Bennett, John Moore, and Joel Pfeiffer)
More informationClassification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter
Classification Semi-supervised learning based on network Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS 249-2 2017 Winter Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions Xiaojin
More informationOnline Social Networks and Media. Link Analysis and Web Search
Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information
More informationThanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides
Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank
More informationOverlapping Communities
Overlapping Communities Davide Mottin HassoPlattner Institute Graph Mining course Winter Semester 2017 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides GRAPH
More informationThanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides
Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank
More informationTHE POWER OF CERTAINTY
A DIRICHLET-MULTINOMIAL APPROACH TO BELIEF PROPAGATION Dhivya Eswaran* CMU deswaran@cs.cmu.edu Stephan Guennemann TUM guennemann@in.tum.de Christos Faloutsos CMU christos@cs.cmu.edu PROBLEM MOTIVATION
More informationWeb Structure Mining Nodes, Links and Influence
Web Structure Mining Nodes, Links and Influence 1 Outline 1. Importance of nodes 1. Centrality 2. Prestige 3. Page Rank 4. Hubs and Authority 5. Metrics comparison 2. Link analysis 3. Influence model 1.
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More informationSupporting Statistical Hypothesis Testing Over Graphs
Supporting Statistical Hypothesis Testing Over Graphs Jennifer Neville Departments of Computer Science and Statistics Purdue University (joint work with Tina Eliassi-Rad, Brian Gallagher, Sergey Kirshner,
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More informationLarge Graph Mining: Power Tools and a Practitioner s guide
Large Graph Mining: Power Tools and a Practitioner s guide Task 3: Recommendations & proximity Faloutsos, Miller & Tsourakakis CMU KDD'09 Faloutsos, Miller, Tsourakakis P3-1 Outline Introduction Motivation
More information12 : Variational Inference I
10-708: Probabilistic Graphical Models, Spring 2015 12 : Variational Inference I Lecturer: Eric P. Xing Scribes: Fattaneh Jabbari, Eric Lei, Evan Shapiro 1 Introduction Probabilistic inference is one of
More informationThe Linearization of Belief Propagation on Pairwise Markov Random Fields
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) The Linearization of Belief Propagation on Pairwise Markov Random Fields Wolfgang Gatterbauer Carnegie Mellon University
More informationGraphs in Machine Learning
Graphs in Machine Learning Michal Valko Inria Lille - Nord Europe, France TA: Pierre Perrault Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton October 30, 2017
More informationDATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS
DATA MINING LECTURE 3 Link Analysis Ranking PageRank -- Random walks HITS How to organize the web First try: Manually curated Web Directories How to organize the web Second try: Web Search Information
More informationOnline Social Networks and Media. Opinion formation on social networks
Online Social Networks and Media Opinion formation on social networks Diffusion of items So far we have assumed that what is being diffused in the network is some discrete item: E.g., a virus, a product,
More informationOn Top-k Structural. Similarity Search. Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada
On Top-k Structural 1 Similarity Search Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China 2014/10/14 Pei
More informationWalk-Sum Interpretation and Analysis of Gaussian Belief Propagation
Walk-Sum Interpretation and Analysis of Gaussian Belief Propagation Jason K. Johnson, Dmitry M. Malioutov and Alan S. Willsky Department of Electrical Engineering and Computer Science Massachusetts Institute
More informationSlide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.
Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University http://www.mmds.org #1: C4.5 Decision Tree - Classification (61 votes) #2: K-Means - Clustering
More informationUNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS
UNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS JONATHAN YEDIDIA, WILLIAM FREEMAN, YAIR WEISS 2001 MERL TECH REPORT Kristin Branson and Ian Fasel June 11, 2003 1. Inference Inference problems
More informationHou, Ch. et al. IEEE Transactions on Neural Networks March 2011
Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Semi-supervised approach which attempts to incorporate partial information from unlabeled data points Semi-supervised approach which attempts
More informationRaRE: Social Rank Regulated Large-scale Network Embedding
RaRE: Social Rank Regulated Large-scale Network Embedding Authors: Yupeng Gu 1, Yizhou Sun 1, Yanen Li 2, Yang Yang 3 04/26/2018 The Web Conference, 2018 1 University of California, Los Angeles 2 Snapchat
More informationLink Prediction. Eman Badr Mohammed Saquib Akmal Khan
Link Prediction Eman Badr Mohammed Saquib Akmal Khan 11-06-2013 Link Prediction Which pair of nodes should be connected? Applications Facebook friend suggestion Recommendation systems Monitoring and controlling
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationMining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee
Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee wwlee1@uiuc.edu September 28, 2004 Motivation IR on newsgroups is challenging due to lack of
More informationFaloutsos, Tong ICDE, 2009
Large Graph Mining: Patterns, Tools and Case Studies Christos Faloutsos Hanghang Tong CMU Copyright: Faloutsos, Tong (29) 2-1 Outline Part 1: Patterns Part 2: Matrix and Tensor Tools Part 3: Proximity
More informationProbabilistic and Bayesian Machine Learning
Probabilistic and Bayesian Machine Learning Day 4: Expectation and Belief Propagation Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize/navigate it? First try: Human curated Web directories Yahoo, DMOZ, LookSmart
More informationSpectral Clustering. Spectral Clustering? Two Moons Data. Spectral Clustering Algorithm: Bipartioning. Spectral methods
Spectral Clustering Seungjin Choi Department of Computer Science POSTECH, Korea seungjin@postech.ac.kr 1 Spectral methods Spectral Clustering? Methods using eigenvectors of some matrices Involve eigen-decomposition
More informationPageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10
PageRank Ryan Tibshirani 36-462/36-662: Data Mining January 24 2012 Optional reading: ESL 14.10 1 Information retrieval with the web Last time we learned about information retrieval. We learned how to
More informationOnline Social Networks and Media. Link Analysis and Web Search
Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information
More informationLecture: Local Spectral Methods (1 of 4)
Stat260/CS294: Spectral Graph Methods Lecture 18-03/31/2015 Lecture: Local Spectral Methods (1 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide
More information1 Random Walks and Electrical Networks
CME 305: Discrete Mathematics and Algorithms Random Walks and Electrical Networks Random walks are widely used tools in algorithm design and probabilistic analysis and they have numerous applications.
More informationBeyond the Point Cloud: From Transductive to Semi-Supervised Learning
Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of
More informationInference and Representation
Inference and Representation David Sontag New York University Lecture 5, Sept. 30, 2014 David Sontag (NYU) Inference and Representation Lecture 5, Sept. 30, 2014 1 / 16 Today s lecture 1 Running-time of
More information9 Forward-backward algorithm, sum-product on factor graphs
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous
More informationData Mining Techniques
Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!
More informationIntegrated Anchor and Social Link Predictions across Social Networks
Proceedings of the TwentyFourth International Joint Conference on Artificial Intelligence IJCAI 2015) Integrated Anchor and Social Link Predictions across Social Networks Jiawei Zhang and Philip S. Yu
More informationMachine Learning CPSC 340. Tutorial 12
Machine Learning CPSC 340 Tutorial 12 Random Walk on Graph Page Rank Algorithm Label Propagation on Graph Assume a strongly connected graph G = (V, A) Label Propagation on Graph Assume a strongly connected
More informationA Shrinkage Approach for Modeling Non-Stationary Relational Autocorrelation
A Shrinkage Approach for Modeling Non-Stationary Relational Autocorrelation Pelin Angin Department of Computer Science Purdue University pangin@cs.purdue.edu Jennifer Neville Department of Computer Science
More informationLink Analysis Ranking
Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query
More informationCollective classification in large scale networks. Jennifer Neville Departments of Computer Science and Statistics Purdue University
Collective classification in large scale networks Jennifer Neville Departments of Computer Science and Statistics Purdue University The data mining process Network Datadata Knowledge Selection Interpretation
More informationCommunities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices
Communities Via Laplacian Matrices Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices The Laplacian Approach As with betweenness approach, we want to divide a social graph into
More informationBayesian Machine Learning - Lecture 7
Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1
More informationSemi-Supervised Learning of Speech Sounds
Aren Jansen Partha Niyogi Department of Computer Science Interspeech 2007 Objectives 1 Present a manifold learning algorithm based on locality preserving projections for semi-supervised phone classification
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Graph and Network Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Methods Learnt Classification Clustering Vector Data Text Data Recommender System Decision Tree; Naïve
More informationIterative Laplacian Score for Feature Selection
Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,
More informationStatistical Machine Learning
Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x
More informationMathematical Formulation of Our Example
Mathematical Formulation of Our Example We define two binary random variables: open and, where is light on or light off. Our question is: What is? Computer Vision 1 Combining Evidence Suppose our robot
More information11 The Max-Product Algorithm
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms for Inference Fall 2014 11 The Max-Product Algorithm In the previous lecture, we introduced
More informationLink Analysis. Leonid E. Zhukov
Link Analysis Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Structural Analysis and Visualization
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 12: Gaussian Belief Propagation, State Space Models and Kalman Filters Guest Kalman Filter Lecture by
More information0.1 Naive formulation of PageRank
PageRank is a ranking system designed to find the best pages on the web. A webpage is considered good if it is endorsed (i.e. linked to) by other good webpages. The more webpages link to it, and the more
More informationCMU SCS Mining Large Graphs: Patterns, Anomalies, and Fraud Detection
Mining Large Graphs: Patterns, Anomalies, and Fraud Detection Christos Faloutsos CMU Thank you! Prof. Julian McAuley Nicholas Urioste UCSD'16 (c) 2016, C. Faloutsos 2 Roadmap Introduction Motivation Why
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com
More informationProbabilistic Graphical Models Lecture Notes Fall 2009
Probabilistic Graphical Models Lecture Notes Fall 2009 October 28, 2009 Byoung-Tak Zhang School of omputer Science and Engineering & ognitive Science, Brain Science, and Bioinformatics Seoul National University
More informationSUPPORT VECTOR MACHINE
SUPPORT VECTOR MACHINE Mainly based on https://nlp.stanford.edu/ir-book/pdf/15svm.pdf 1 Overview SVM is a huge topic Integration of MMDS, IIR, and Andrew Moore s slides here Our foci: Geometric intuition
More information13 : Variational Inference: Loopy Belief Propagation and Mean Field
10-708: Probabilistic Graphical Models 10-708, Spring 2012 13 : Variational Inference: Loopy Belief Propagation and Mean Field Lecturer: Eric P. Xing Scribes: Peter Schulam and William Wang 1 Introduction
More informationNode Centrality and Ranking on Networks
Node Centrality and Ranking on Networks Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Social
More informationTo Randomize or Not To
To Randomize or Not To Randomize: Space Optimal Summaries for Hyperlink Analysis Tamás Sarlós, Eötvös University and Computer and Automation Institute, Hungarian Academy of Sciences Joint work with András
More information13 : Variational Inference: Loopy Belief Propagation
10-708: Probabilistic Graphical Models 10-708, Spring 2014 13 : Variational Inference: Loopy Belief Propagation Lecturer: Eric P. Xing Scribes: Rajarshi Das, Zhengzhong Liu, Dishan Gupta 1 Introduction
More informationComputer Vision Group Prof. Daniel Cremers. 14. Clustering
Group Prof. Daniel Cremers 14. Clustering Motivation Supervised learning is good for interaction with humans, but labels from a supervisor are hard to obtain Clustering is unsupervised learning, i.e. it
More informationarxiv: v2 [cs.ai] 27 Dec 2016
The Linearization of Belief Propagation on Pairwise Markov Random Fields Wolfgang Gatterbauer Carnegie Mellon University Pittsburgh, Pennsylvania 523 arxiv:502.04956v2 [cs.ai] 27 Dec 206 Abstract Belief
More informationComputing PageRank using Power Extrapolation
Computing PageRank using Power Extrapolation Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, and Gene Golub Stanford University Abstract. We present a novel technique for speeding up the computation
More informationLecture 18 Generalized Belief Propagation and Free Energy Approximations
Lecture 18, Generalized Belief Propagation and Free Energy Approximations 1 Lecture 18 Generalized Belief Propagation and Free Energy Approximations In this lecture we talked about graphical models and
More informationInferring Useful Heuristics from the Dynamics of Iterative Relational Classifiers
Inferring Useful Heuristics from the Dynamics of Iterative Relational Classifiers Aram Galstyan and Paul R. Cohen USC Information Sciences Institute 4676 Admiralty Way, Suite 11 Marina del Rey, California
More informationMin-Max Message Passing and Local Consistency in Constraint Networks
Min-Max Message Passing and Local Consistency in Constraint Networks Hong Xu, T. K. Satish Kumar, and Sven Koenig University of Southern California, Los Angeles, CA 90089, USA hongx@usc.edu tkskwork@gmail.com
More informationDS504/CS586: Big Data Analytics Graph Mining II
Welcome to DS504/CS586: Big Data Analytics Graph Mining II Prof. Yanhua Li Time: 6:00pm 8:50pm Mon. and Wed. Location: SL105 Spring 2016 Reading assignments We will increase the bar a little bit Please
More informationPage rank computation HPC course project a.y
Page rank computation HPC course project a.y. 2015-16 Compute efficient and scalable Pagerank MPI, Multithreading, SSE 1 PageRank PageRank is a link analysis algorithm, named after Brin & Page [1], and
More informationStructured deep models: Deep learning on graphs and beyond
Structured deep models: Deep learning on graphs and beyond Hidden layer Hidden layer Input Output ReLU ReLU, 25 May 2018 CompBio Seminar, University of Cambridge In collaboration with Ethan Fetaya, Rianne
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationThe Origin of Deep Learning. Lili Mou Jan, 2015
The Origin of Deep Learning Lili Mou Jan, 2015 Acknowledgment Most of the materials come from G. E. Hinton s online course. Outline Introduction Preliminary Boltzmann Machines and RBMs Deep Belief Nets
More informationSpectral Clustering. Guokun Lai 2016/10
Spectral Clustering Guokun Lai 2016/10 1 / 37 Organization Graph Cut Fundamental Limitations of Spectral Clustering Ng 2002 paper (if we have time) 2 / 37 Notation We define a undirected weighted graph
More informationTalk 2: Graph Mining Tools - SVD, ranking, proximity. Christos Faloutsos CMU
Talk 2: Graph Mining Tools - SVD, ranking, proximity Christos Faloutsos CMU Outline Introduction Motivation Task 1: Node importance Task 2: Recommendations Task 3: Connection sub-graphs Conclusions Lipari
More informationBelief Update in CLG Bayesian Networks With Lazy Propagation
Belief Update in CLG Bayesian Networks With Lazy Propagation Anders L Madsen HUGIN Expert A/S Gasværksvej 5 9000 Aalborg, Denmark Anders.L.Madsen@hugin.com Abstract In recent years Bayesian networks (BNs)
More informationLinear Response for Approximate Inference
Linear Response for Approximate Inference Max Welling Department of Computer Science University of Toronto Toronto M5S 3G4 Canada welling@cs.utoronto.ca Yee Whye Teh Computer Science Division University
More informationActive and Semi-supervised Kernel Classification
Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering
More informationPageRank algorithm Hubs and Authorities. Data mining. Web Data Mining PageRank, Hubs and Authorities. University of Szeged.
Web Data Mining PageRank, University of Szeged Why ranking web pages is useful? We are starving for knowledge It earns Google a bunch of money. How? How does the Web looks like? Big strongly connected
More informationBEYOND ASSORTATIVITY PROCLIVITY INDEX FOR ATTRIBUTED NETWORKS (PRONE) Reihaneh Rabbany Dhivya Eswaran*
PROCLIVITY INDEX FOR ATTRIBUTED NETWORKS (PRONE) Reihaneh Rabbany rrabbany@andrew.cmu.edu Artur W. Dubrawski awd@andrew.cmu.edu Dhivya Eswaran* deswaran@cs.cmu.edu Christos Faloutsos christos@cs.cmu.edu
More informationBayesian Learning in Undirected Graphical Models
Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul
More informationLINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
LINK ANALYSIS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Retrieval evaluation Link analysis Models
More informationA Probabilistic Embedding Clustering Method for Urban Structure Detection
A Probabilistic Embedding Clustering Method for Urban Structure Detection Xin Lin a,b, Haifeng Li b*, Yan Zhang b, Lei Gao b, Ling Zhao b, Min Deng b a School of Civil Engineering, Central South University,
More informationKristina Lerman USC Information Sciences Institute
Rethinking Network Structure Kristina Lerman USC Information Sciences Institute Università della Svizzera Italiana, December 16, 2011 Measuring network structure Central nodes Community structure Strength
More informationLearning distributions and hypothesis testing via social learning
UMich EECS 2015 1 / 48 Learning distributions and hypothesis testing via social learning Anand D. Department of Electrical and Computer Engineering, The State University of New Jersey September 29, 2015
More informationLecture 14: Random Walks, Local Graph Clustering, Linear Programming
CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These
More informationResults: MCMC Dancers, q=10, n=500
Motivation Sampling Methods for Bayesian Inference How to track many INTERACTING targets? A Tutorial Frank Dellaert Results: MCMC Dancers, q=10, n=500 1 Probabilistic Topological Maps Results Real-Time
More informationRANDOM WALKS WITH RESTARTS FOR GRAPH-BASED CLASSIFICATION: TELEPORTATION TUNING AND SAMPLING DESIGN
RANDOM WALKS WITH RESTARTS FOR GRAPH-BASED CLASSIFICATION: TELEPORTATION TUNING AND SAMPLING DESIGN Dimitris Berberidis 1,2, Athanasios N. Nikolakopoulos 2 and Georgios B. Giannakis 1,2 1 Dept. of ECE
More informationLearning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31
Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking Dengyong Zhou zhou@tuebingen.mpg.de Dept. Schölkopf, Max Planck Institute for Biological Cybernetics, Germany Learning from
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationOverview of Statistical Tools. Statistical Inference. Bayesian Framework. Modeling. Very simple case. Things are usually more complicated
Fall 3 Computer Vision Overview of Statistical Tools Statistical Inference Haibin Ling Observation inference Decision Prior knowledge http://www.dabi.temple.edu/~hbling/teaching/3f_5543/index.html Bayesian
More informationIntroduction to Probabilistic Graphical Models
Introduction to Probabilistic Graphical Models Kyu-Baek Hwang and Byoung-Tak Zhang Biointelligence Lab School of Computer Science and Engineering Seoul National University Seoul 151-742 Korea E-mail: kbhwang@bi.snu.ac.kr
More informationWeb Ranking. Classification (manual, automatic) Link Analysis (today s lesson)
Link Analysis Web Ranking Documents on the web are first ranked according to their relevance vrs the query Additional ranking methods are needed to cope with huge amount of information Additional ranking
More informationMachine Learning for Data Science (CS4786) Lecture 24
Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each
More informationCOMPSCI 650 Applied Information Theory Apr 5, Lecture 18. Instructor: Arya Mazumdar Scribe: Hamed Zamani, Hadi Zolfaghari, Fatemeh Rezaei
COMPSCI 650 Applied Information Theory Apr 5, 2016 Lecture 18 Instructor: Arya Mazumdar Scribe: Hamed Zamani, Hadi Zolfaghari, Fatemeh Rezaei 1 Correcting Errors in Linear Codes Suppose someone is to send
More informationPageRank: Ranking of nodes in graphs
PageRank: Ranking of nodes in graphs Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ October
More information