Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.)


1 Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.)

2 Link prediction: which pair of nodes {i, j} should be connected? Variant: node i is given. Example nodes: Alice, Bob, Charlie. Applications: friend suggestion in Facebook, movie recommendation in Netflix.

3 Predict a link between nodes: with the minimum number of hops; with the maximum number of common neighbors (length-2 paths). Example: Alice and Bob share prolific common friends (1000 followers), which is less evidence; Alice and Charlie share less prolific common friends (8 followers), which is much more evidence. The Adamic/Adar score gives more weight to low-degree common neighbors.

4 Predict a link between nodes: with the minimum number of hops; with more common neighbors (length-2 paths); with a larger Adamic/Adar score; with more short paths (e.g. length-3 paths).
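The two neighbor-based heuristics above can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph stored as a dict of neighbor sets; the toy graph and the node names (`hub`, `quiet`, `x0`...) are hypothetical and only echo the Alice/Bob/Charlie example. Adamic/Adar is taken in its usual form, a sum of 1/log(degree) over common neighbors.

```python
import math

def build_graph(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, i, j):
    """Return the set of nodes linked to both i and j (length-2 paths)."""
    return adj[i] & adj[j]

def adamic_adar(adj, i, j):
    """Sum 1/log(degree) over common neighbors; low-degree ones count more."""
    return sum(1.0 / math.log(len(adj[k]))
               for k in common_neighbors(adj, i, j)
               if len(adj[k]) > 1)  # guard against log(1) = 0

# Hypothetical toy graph: "hub" is a prolific common friend of Alice and Bob,
# "quiet" is a low-degree common friend of Alice and Charlie.
edges = [("alice", "hub"), ("bob", "hub"),
         ("alice", "quiet"), ("charlie", "quiet")]
edges += [("hub", "x%d" % n) for n in range(6)]
adj = build_graph(edges)

for candidate in ("bob", "charlie"):
    print(candidate,
          len(common_neighbors(adj, "alice", candidate)),
          adamic_adar(adj, "alice", candidate))
```

Both candidates have exactly one common neighbor with Alice, so the common-neighbors count cannot separate them, while Adamic/Adar ranks Charlie higher because the shared friend has low degree.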

5 [Bar chart: link prediction accuracy* of Random, Shortest Path, Common Neighbors, Adamic/Adar, and Ensemble of short paths; annotation: "Especially if the graph is sparse".] How do we justify these observations? *Liben-Nowell & Kleinberg, 2003; Brand, 2005; Sarkar & Moore, 2007

6 Raftery et al.'s model: nodes are uniformly distributed in a latent space (a unit-volume universe), and points close in this space are more likely to be connected. The problem of link prediction is to find the nearest neighbor that is not currently linked to the node, which is equivalent to inferring distances in the latent space.

7 Two sources of randomness. Point positions: uniform in $D$-dimensional space. Linkage probability: logistic in the distance, with parameters $\alpha$ and $r$; $\alpha$, $r$, and $D$ are known. Nodes within radius $r$ have a higher probability of linking, and $\alpha$ determines the steepness.
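The two sources of randomness can be simulated directly. The sketch below is an illustration under stated assumptions: the logistic link is taken as 1 / (1 + exp(alpha * (d - r))), the unit-volume universe is taken to be the unit cube, and the function names and parameter values are hypothetical.

```python
import math
import random

def link_probability(d, r, alpha):
    """Logistic link probability: near 1 for d << r, exactly 0.5 at d == r."""
    z = alpha * (d - r)
    if z > 700.0:          # exp would overflow; the probability is ~0 anyway
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def sample_graph(n, dim, r, alpha, seed=0):
    """Sample n uniform points in the unit cube, then link pairs at random."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])   # Euclidean distance
            if rng.random() < link_probability(d, r, alpha):
                edges.add((i, j))
    return points, edges

points, edges = sample_graph(n=30, dim=2, r=0.2, alpha=50.0, seed=1)
print(len(points), "nodes,", len(edges), "edges")
```

Raising `alpha` makes the link rule sharper: in the limit, two nodes are linked exactly when their distance is at most `r`.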

9 $\Pr_2(i,j) = \Pr(\text{common neighbor} \mid d_{ij})$: a product of two logistic probabilities, integrated over a volume determined by $d_{ij}$. As $\alpha \to \infty$, the logistic approaches a step function, which is much easier to analyze!

10 Unit-volume universe; everyone has the same radius $r$. $\eta$ = number of common neighbors of $i$ and $j$. Empirical Bernstein bounds on $\eta$ give bounds on the distance $d_{ij}$. $V(r)$ = volume of a ball of radius $r$ in $D$ dimensions.
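To make the link between common-neighbor counts and distance concrete, here is a minimal sketch for the special case D = 2 with the step-function link. It assumes a node is a common neighbor exactly when it falls in the intersection of two discs of radius r centered at i and j, so the count divided by N estimates that intersection area. It ignores boundary effects and returns a point estimate only, not the empirical Bernstein bounds; the function names are hypothetical.

```python
import math

def lens_area(r, d):
    """Area of the intersection of two discs of radius r at center distance d."""
    if d >= 2.0 * r:
        return 0.0
    return (2.0 * r * r * math.acos(d / (2.0 * r))
            - 0.5 * d * math.sqrt(4.0 * r * r - d * d))

def estimate_distance(eta, n, r, tol=1e-9):
    """Invert lens_area by bisection: find d whose lens area matches eta / n."""
    target = eta / n
    if target >= lens_area(r, 0.0):
        return 0.0
    lo, hi = 0.0, 2.0 * r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lens_area(r, mid) > target:   # area shrinks as d grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# More common neighbors out of the same N imply a smaller estimated distance.
print(estimate_distance(30, 1000, 0.1))
print(estimate_distance(10, 1000, 0.1))
```

Because the intersection area is strictly decreasing in the distance on [0, 2r], ranking candidates by common-neighbor count is the same as ranking them by this estimated distance.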

11 OPT = node closest to $i$; MAX = node with the most common neighbors with $i$. Theorem: w.h.p. $d_{OPT} \le d_{MAX} \le d_{OPT} + 2[\ldots]$. Common neighbors is an asymptotically optimal heuristic as $N \to \infty$.

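12 Node $k$ has radius $r_k$. There is an edge $i \to k$ if $d_{ik} \le r_k$ (directed graph), so $r_k$ captures the popularity of node $k$. A common neighbor $k$ of $i$ and $j$ can be of two types (Type 1 and Type 2), depending on edge direction. If $k$ links to both $i$ and $j$, then $k$ must lie within $r_i$ of $i$ and within $r_j$ of $j$, which has probability $A(r_i, r_j, d_{ij})$. If both $i$ and $j$ link to $k$, then $k$ must lie within $r_k$ of both, which has probability $A(r_k, r_k, d_{ij})$.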

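13 Example graph: $N_1$ nodes of radius $r_1$ and $N_2$ nodes of radius $r_2$, with $r_1 \ll r_2$. The common-neighbor counts are $\eta_1 \sim \mathrm{Bin}[N_1, A(r_1, r_1, d_{ij})]$ and $\eta_2 \sim \mathrm{Bin}[N_2, A(r_2, r_2, d_{ij})]$. Maximize $\Pr[\eta_1, \eta_2 \mid d_{ij}]$, a product of two binomials. The maximizer $d^*$ satisfies $w(r_1)\,E[\eta_1 \mid d^*] + w(r_2)\,E[\eta_2 \mid d^*] = w(r_1)\,\eta_1 + w(r_2)\,\eta_2$. A larger right-hand side requires a larger left-hand side, and hence a smaller $d^*$.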
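One way to see where a weighted sum of this kind comes from is to differentiate the log-likelihood. The following is a sketch, assuming the two counts are independent binomials and writing $A_k(d) = A(r_k, r_k, d)$; the exact normalization of $w(r)$ used in the original work may differ.

```latex
\begin{align*}
\log \Pr[\eta_1, \eta_2 \mid d]
  &= \sum_{k=1}^{2} \Big[ \eta_k \log A_k(d)
     + (N_k - \eta_k) \log\big(1 - A_k(d)\big) \Big] + \text{const}, \\
\frac{\partial}{\partial d} \log \Pr[\eta_1, \eta_2 \mid d]
  &= \sum_{k=1}^{2} A_k'(d)\,
     \frac{\eta_k - N_k A_k(d)}{A_k(d)\,\big(1 - A_k(d)\big)} .
\end{align*}
% Setting the derivative to zero at d = d^*, and defining
%   w(r_k) = -A_k'(d^*) / [ A_k(d^*) (1 - A_k(d^*)) ]  (positive, since A_k decreases in d),
% with E[\eta_k \mid d^*] = N_k A_k(d^*), gives
\begin{equation*}
\sum_{k=1}^{2} w(r_k)\, E[\eta_k \mid d^*] \;=\; \sum_{k=1}^{2} w(r_k)\, \eta_k .
\end{equation*}
```

Under this reading, each weight is a Jacobian term $A_k'(d^*)$ divided by a variance-like term $A_k(d^*)\,(1 - A_k(d^*))$.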

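14 [Figure: the weight as a function of radius $r$, annotated with a Jacobian term and a variance term.] For small $r$ the variance is small and the presence of a common neighbor is more surprising; this regime is labelled $1/r$ and Adamic/Adar, and real-world graphs generally fall in this range. For $r$ close to the maximum radius the variance is again small and the absence of a common neighbor is more surprising.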

16 Common neighbors = 2-hop paths. Analysis of longer paths has two components. 1. Bounding $E(\eta_l \mid d_{ij})$, where $\eta_l$ = number of $l$-hop paths: bound $\Pr_l(i,j)$ by using the triangle inequality on a series of common-neighbor probabilities (triangulation). 2. Showing $\eta_l \approx E(\eta_l \mid d_{ij})$.

17 Common neighbors = 2-hop paths. Analysis of longer paths has two components. 1. Bounding $E(\eta_l \mid d_{ij})$, where $\eta_l$ = number of $l$-hop paths: bound $\Pr_l(i,j)$ by using the triangle inequality on a series of common-neighbor probabilities. 2. Showing $\eta_l \approx E(\eta_l \mid d_{ij})$: $\eta_l$ has bounded dependence on the position of each node, so McDiarmid's inequality can be used to bound $|\eta_l - E(\eta_l \mid d_{ij})|$.
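The quantity analyzed here, the number of l-hop paths between i and j, can be computed on a small graph by brute force. The sketch below counts simple paths with exactly the given number of edges using depth-first search; the adjacency map and the function name are hypothetical, and the cost grows exponentially with the path length, so this is only meant for illustration.

```python
def count_paths(adj, i, j, hops):
    """Count simple paths from i to j that use exactly `hops` edges."""
    def extend(node, remaining, visited):
        if remaining == 0:
            return 1 if node == j else 0
        total = 0
        for nxt in adj[node]:
            if nxt not in visited:            # never revisit a node
                total += extend(nxt, remaining - 1, visited | {nxt})
        return total
    return extend(i, hops, {i})

# Hypothetical 4-node graph; nodes 0 and 3 are not directly linked.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}

print(count_paths(adj, 0, 3, 2))  # 2-hop paths, i.e. common neighbors
print(count_paths(adj, 0, 3, 3))  # 3-hop paths
```

With `hops=2` the count is exactly the number of common neighbors, which is why common neighbors appear as the base case of the longer-path analysis.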

18 Bound $d_{ij}$ as a function of $\eta_l$ using McDiarmid's inequality. For $l' \ge l$ we need $\eta_{l'} \gg \eta_l$ to obtain similar bounds. Also, we can obtain much tighter bounds for long paths if shorter paths are known to exist.

19 Logistic case: the bound is weaker by a factor. It can be made tighter as the logistic approaches the step function.

20 Three key ingredients. 1. Closer points are likelier to be linked (Small World Model: Watts & Strogatz, 1998; Kleinberg). 2. The triangle inequality holds; this is necessary to extend to $l$-hop paths. 3. Points are spread uniformly at random; otherwise properties will depend on location as well as distance.

21 [Bar chart: link prediction accuracy* of Random, Shortest Path, Common Neighbors, Adamic/Adar, and Ensemble of short paths.] Differentiating between different degrees is important. For large dense graphs, common neighbors are enough. The number of paths matters, not the length. In sparse graphs, paths of length 3 or more help in prediction. *Liben-Nowell & Kleinberg, 2003; Brand, 2005; Sarkar & Moore, 2007

23 Summary: a generative model with a few properties answers "which is the most likely neighbor of node $i$: node a or node b?"; link prediction heuristics answer the same question. Comparing the two can justify the empirical observations. We also offer some new prediction algorithms.

24 Combine bounds from different radii. But there might not be enough data to obtain individual bounds from each radius. New sweep estimator: $Q_r$ = fraction of nodes with radius $\le r$ that are common neighbors. Higher $Q_r$ implies smaller $d_{ij}$ w.h.p.

25 $Q_r$ = fraction of nodes with radius $\le r$ that are common neighbors; larger $Q_r$ implies smaller $d_{ij}$ w.h.p. $T_R$ = fraction of nodes with radius $\ge R$ that are common neighbors; smaller $T_R$ implies larger $d_{ij}$ w.h.p.

26 [Figure: number of common neighbors of a given radius $r$.] $Q_r$ = fraction of nodes with radius $\le r$ that are common neighbors; $T_R$ = fraction of nodes with radius $\ge R$ that are common neighbors. Large $Q_r$ implies small $d_{ij}$; small $T_R$ implies large $d_{ij}$.
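The two sweep statistics are simple pooled fractions. The sketch below assumes that Q_r pools all nodes whose radius is at most r and that T_R pools all nodes whose radius is at least R; the transcription does not preserve the inequality signs, so this reading is an assumption. The data and function names are hypothetical, and the step from these fractions to bounds on d_ij is not shown.

```python
def fraction_common(radii, is_common, keep):
    """Fraction of the nodes selected by `keep(radius)` that are common neighbors."""
    flags = [c for rad, c in zip(radii, is_common) if keep(rad)]
    return sum(flags) / len(flags) if flags else None   # None: no node selected

def q_stat(radii, is_common, r):
    """Q_r: pool nodes with radius <= r (assumed reading)."""
    return fraction_common(radii, is_common, lambda rad: rad <= r)

def t_stat(radii, is_common, big_r):
    """T_R: pool nodes with radius >= R (assumed reading)."""
    return fraction_common(radii, is_common, lambda rad: rad >= big_r)

# Hypothetical data: one radius per node, and whether that node is a
# common neighbor of the pair (i, j) under consideration.
radii = [0.05, 0.05, 0.1, 0.3, 0.3, 0.4]
is_common = [True, False, True, True, True, False]

print(q_stat(radii, is_common, 0.1))
print(t_stat(radii, is_common, 0.3))
```

Pooling across radii is what lets the estimator work when no single radius has enough nodes to support its own bound.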


More information

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4 ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian

More information

Decoupled Collaborative Ranking

Decoupled Collaborative Ranking Decoupled Collaborative Ranking Jun Hu, Ping Li April 24, 2017 Jun Hu, Ping Li WWW2017 April 24, 2017 1 / 36 Recommender Systems Recommendation system is an information filtering technique, which provides

More information

Nonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve aka manifold.

Nonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve aka manifold. Nonlinear Methods Data often lies on or near a nonlinear low-dimensional curve aka manifold. 27 Laplacian Eigenmaps Linear methods Lower-dimensional linear projection that preserves distances between all

More information

Shortest paths with negative lengths

Shortest paths with negative lengths Chapter 8 Shortest paths with negative lengths In this chapter we give a linear-space, nearly linear-time algorithm that, given a directed planar graph G with real positive and negative lengths, but no

More information

Binary Principal Component Analysis in the Netflix Collaborative Filtering Task

Binary Principal Component Analysis in the Netflix Collaborative Filtering Task Binary Principal Component Analysis in the Netflix Collaborative Filtering Task László Kozma, Alexander Ilin, Tapani Raiko first.last@tkk.fi Helsinki University of Technology Adaptive Informatics Research

More information

Random Geometric Graphs.

Random Geometric Graphs. Random Geometric Graphs. Josep Díaz Random Geometric Graphs Random Euclidean Graphs, Random Proximity Graphs, Random Geometric Graphs. Random Euclidean Graphs Choose a sequence V = {x i } n i=1 of independent

More information

TOPOLOGY FOR GLOBAL AVERAGE CONSENSUS. Soummya Kar and José M. F. Moura

TOPOLOGY FOR GLOBAL AVERAGE CONSENSUS. Soummya Kar and José M. F. Moura TOPOLOGY FOR GLOBAL AVERAGE CONSENSUS Soummya Kar and José M. F. Moura Department of Electrical and Computer Engineering Carnegie Mellon University, Pittsburgh, PA 15213 USA (e-mail:{moura}@ece.cmu.edu)

More information

Distribution-specific analysis of nearest neighbor search and classification

Distribution-specific analysis of nearest neighbor search and classification Distribution-specific analysis of nearest neighbor search and classification Sanjoy Dasgupta University of California, San Diego Nearest neighbor The primeval approach to information retrieval and classification.

More information

Personalized Social Recommendations Accurate or Private

Personalized Social Recommendations Accurate or Private Personalized Social Recommendations Accurate or Private Presented by: Lurye Jenny Paper by: Ashwin Machanavajjhala, Aleksandra Korolova, Atish Das Sarma Outline Introduction Motivation The model General

More information

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank

More information

Diversity Regularization of Latent Variable Models: Theory, Algorithm and Applications

Diversity Regularization of Latent Variable Models: Theory, Algorithm and Applications Diversity Regularization of Latent Variable Models: Theory, Algorithm and Applications Pengtao Xie, Machine Learning Department, Carnegie Mellon University 1. Background Latent Variable Models (LVMs) are

More information

A Generative Model for Dynamic Contextual Friendship Networks. July 2006 CMU-ML

A Generative Model for Dynamic Contextual Friendship Networks. July 2006 CMU-ML A Generative Model for Dynamic Contextual Friendship Networks Alice Zheng Anna Goldenberg July 2006 CMU-ML-06-107 A Generative Model for Dynamic Contextual Friendship Networks Alice X. Zheng Anna Goldenberg

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Leverage Sparse Information in Predictive Modeling

Leverage Sparse Information in Predictive Modeling Leverage Sparse Information in Predictive Modeling Liang Xie Countrywide Home Loans, Countrywide Bank, FSB August 29, 2008 Abstract This paper examines an innovative method to leverage information from

More information

Adventures in random graphs: Models, structures and algorithms

Adventures in random graphs: Models, structures and algorithms BCAM January 2011 1 Adventures in random graphs: Models, structures and algorithms Armand M. Makowski ECE & ISR/HyNet University of Maryland at College Park armand@isr.umd.edu BCAM January 2011 2 Complex

More information

Nonparametric Bayesian Matrix Factorization for Assortative Networks

Nonparametric Bayesian Matrix Factorization for Assortative Networks Nonparametric Bayesian Matrix Factorization for Assortative Networks Mingyuan Zhou IROM Department, McCombs School of Business Department of Statistics and Data Sciences The University of Texas at Austin

More information

Distance Estimation and Object Location via Rings of Neighbors

Distance Estimation and Object Location via Rings of Neighbors Distance Estimation and Object Location via Rings of Neighbors Aleksandrs Slivkins February 2005 Revised: April 2006, Nov 2005, June 2005 Abstract We consider four problems on distance estimation and object

More information