Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs


1 Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs. Hyokun Yun (work with S.V.N. Vishwanathan), Department of Statistics. Purdue Machine Learning Seminar, November 9, 2011.

2 Overview. Question: how can we efficiently sample graphs from the Multiplicative Attribute Graphs model? We introduce the first sub-quadratic algorithm for sampling Multiplicative Attribute Graphs. Time complexity: $O\big((\log_2 n)^3\,|E|\big)$ under mild conditions, where $n$ is the number of nodes in the graph and $|E|$ is the number of edges. The algorithm exploits the close connection between Stochastic Kronecker Graphs (SKG) and Multiplicative Attribute Graphs (MAG). It can sample a graph with 8 million nodes and 20 billion edges in under 6 hours (the naïve algorithm would take 93 days).

3 Outline. 1. Introduction; 2. Stochastic Kronecker Graphs (SKG) Model; 3. Multiplicative Attribute Graphs (MAG) Model; 4. Quilting Algorithm; 5. Experiments; 6. Conclusion.


5 Motivation. [Figures: a protein-protein interaction network (H. Jeong et al., 2001) and a social network (Facebook).]

6 Need for Statistical Model. Statistical model: a class of probability distributions that describes the stochastic process which could have generated the data, and so captures the uncertainty in the data. Examples: continuous data, Normal distribution; count data, Poisson distribution. Now, graph data: ?! We need a probability distribution on the space of graphs!

7 Initial Works. Erdős-Rényi model (1960). Exponential Random Graph Model (Anderson et al., 1999): usually called ERGM or p*; uses the usual exponential family model with interesting features of the graph as sufficient statistics. Latent Space Model (Hoff et al., 2002): embeds nodes into a latent social space.

8 Initial Works: Problem. For all of these models, parameter estimation and sampling algorithms do not scale (at least $O(n^2)$).

9 The Hunt for Scalability. Stochastic Kronecker Graphs (SKG) Model (Leskovec et al., 2010): parameter estimation takes $O(n \log_2 n)$ for each MCMC step; sampling takes $O(\log_2(n)\,|E|)$. Multiplicative Attribute Graphs (MAG) Model (Kim and Leskovec, 2010): a generalization of the SKG model; parameter estimation takes $O\big((\log_2 n)^2\,|E|\big)$; sampling: ?!


11 Graph and Adjacency Matrix. Let $G$ be a directed graph whose nodes are labeled $1, 2, \ldots, n$. Its adjacency matrix $A$ has $A_{ij} = 1$ if there is an edge from $i$ to $j$, and $A_{ij} = 0$ otherwise. [Figure: a graph $G$ and its adjacency matrix $A$.]

12 Question. You are given an adjacency matrix and you want to describe it compactly! In other words, compress it using a small number of parameters.

13 Kronecker Multiplication. Suppose you are given a $2 \times 2$ matrix $\Theta$ [entries shown on the slide]. You would like to take the Kronecker product of $\Theta$ with itself, $\Theta \otimes \Theta$.

15 Kronecker Multiplication. On the left side, stretch the matrix: every entry of $\Theta$ becomes a $2 \times 2$ block filled with that entry. On the right side, repeat the matrix four times, once per block. Now multiply the two sides element-wise; the result is $\Theta \otimes \Theta$.

21 Kronecker Multiplication. You can take it further and multiply by $\Theta$ again: $\Theta \otimes \Theta \otimes \Theta$, and so on.
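The stretch / repeat / multiply recipe from these slides can be sketched in a few lines of plain Python. This is only an illustration: the function names and the entries of `THETA` are assumptions made for the example, since the slide's actual matrix did not survive in the transcript.

```python
def stretch(a, m):
    """Blow up every entry of the square matrix `a` into an m x m constant block."""
    size = len(a) * m
    return [[a[i // m][j // m] for j in range(size)] for i in range(size)]


def repeat(b, k):
    """Tile the square matrix `b` k times in each direction."""
    m = len(b)
    size = m * k
    return [[b[i % m][j % m] for j in range(size)] for i in range(size)]


def kron(a, b):
    """Kronecker product of square matrices: stretch a, repeat b, multiply element-wise."""
    left = stretch(a, len(b))
    right = repeat(b, len(a))
    size = len(left)
    return [[left[i][j] * right[i][j] for j in range(size)] for i in range(size)]


def kron_power(theta, d):
    """d-fold Kronecker product of theta with itself (d >= 1)."""
    result = theta
    for _ in range(d - 1):
        result = kron(result, theta)
    return result


# Hypothetical 2 x 2 parameter matrix, chosen only for illustration.
THETA = [[0.9, 0.7], [0.5, 0.3]]
THETA_2 = kron_power(THETA, 2)  # 4 x 4
THETA_3 = kron_power(THETA, 3)  # 8 x 8
```

Each Kronecker multiplication by a 2 x 2 matrix doubles the side length, so `d` factors describe a graph on `2 ** d` nodes with only four parameters.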

22 Generative Model. Stochastic Kronecker Graphs (SKG) Model: the probability of an edge between two nodes is given by the corresponding entry of a Kronecker power of the parameter matrix $\Theta$. Each edge is sampled independently given the parameter matrix.
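As a rough sketch of this generative model, the entry of the Kronecker power for a pair of nodes can be computed directly from the binary digits of the two node indices, without building the full matrix. The code below assumes 0-indexed nodes, a single 2 x 2 matrix `theta`, and `n = 2 ** d` nodes; the function names are made up for the example.

```python
import random


def skg_edge_prob(theta, d, i, j):
    """Entry (i, j) of the d-th Kronecker power of a 2 x 2 matrix theta."""
    p = 1.0
    for k in range(d):
        # The k-th binary digits of i and j select one entry of theta.
        bit_i = (i >> (d - 1 - k)) & 1
        bit_j = (j >> (d - 1 - k)) & 1
        p *= theta[bit_i][bit_j]
    return p


def sample_skg_naive(theta, d, rng):
    """Flip one independent coin per cell of the adjacency matrix: O(n^2 d) work."""
    n = 2 ** d
    return [(i, j)
            for i in range(n)
            for j in range(n)
            if rng.random() < skg_edge_prob(theta, d, i, j)]


# Hypothetical parameter matrix, for illustration only.
edges = sample_skg_naive([[0.9, 0.7], [0.5, 0.3]], 3, random.Random(0))
```

This is the cell-by-cell sampler; it is exact but touches every one of the `n * n` cells, which is what the faster algorithm avoids.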

23 Sampling Algorithm. Sampling the entries of the adjacency matrix one by one takes $O(n^2)$ time. The fractal structure of the Kronecker matrix lets us do it in a more clever way: $O(\log_2(n)\,|E|)$.

25 Sampling Algorithm. Divide the matrix into four quadrants and choose one with probability proportional to its weight. Divide the chosen quadrant into four again and choose one in the same way, and so on until a single cell is reached; place an edge there (the slides highlight quadrants with weights 0.9, 0.7, and 0.9 in turn). Repeat this as many times as the number of edges.
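A minimal sketch of this recursive quadrant descent is shown below. It assumes a single 2 x 2 matrix `theta` with non-negative entries that are not all zero, and it takes the number of edges as an input; how that number is chosen, and how repeated hits on the same cell are handled, are not covered by the slides, so here duplicates simply collapse in a set. Function names are illustrative.

```python
import random

QUADRANTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def drop_edge(theta, d, rng):
    """Descend d levels of quadrants and return the (row, col) cell that is reached."""
    weights = [theta[a][b] for a, b in QUADRANTS]
    row = col = 0
    for _ in range(d):
        # Pick a quadrant with probability proportional to its entry of theta.
        a, b = rng.choices(QUADRANTS, weights=weights)[0]
        row, col = 2 * row + a, 2 * col + b
    return row, col


def sample_skg_edges(theta, d, num_edges, rng):
    """Repeat the descent once per edge; each descent costs O(d) = O(log2 n)."""
    edges = set()
    for _ in range(num_edges):
        edges.add(drop_edge(theta, d, rng))  # repeated cells collapse into one edge
    return edges


# Hypothetical parameter matrix and edge count, for illustration only.
edges = sample_skg_edges([[0.9, 0.7], [0.5, 0.3]], 4, 50, random.Random(0))
```

Each edge costs one descent of `d = log2(n)` steps, which is where the $O(\log_2(n)\,|E|)$ running time comes from.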

32 Difficulty of Parameter Estimation. Observed graph data never arrives permuted so that it resembles a Kronecker matrix. You have to find the right permutation to do the inference! Quiz: the following are the same adjacency matrix sampled from the SKG model, permuted differently. Which one is the unpermuted version? [Figure: several permuted versions of one adjacency matrix.]



35 Idea. Suppose there are $d$ attributes which describe each node. Each node either possesses or lacks each attribute. For example, each attribute can be understood as the answer to a yes/no question, such as: Do you like playing StarCraft? Do you speak Esperanto? Do you watch Big Bang Theory? There is a $2 \times 2$ parameter matrix associated with each attribute: for example, $\Theta_{SC}$, $\Theta_{Esp}$, and $\Theta_{BB}$ [entries shown on the slide].

36 Model. Take the parameter matrices $\Theta_{SC}$, $\Theta_{Esp}$, $\Theta_{BB}$ [entries shown on the slide] and consider two people (O: has the attribute, X: lacks it). Yun: (StarCraft O, Esperanto X, Big Bang Theory O). Bill: (StarCraft X, Esperanto X, Big Bang Theory O). The probability of an edge from Yun to Bill is given by $P_{\text{yun,bill}} = \underbrace{0.3}_{SC} \times \underbrace{0.6}_{Esp} \times \underbrace{0.6}_{BB}$. The effect of each attribute is multiplicative. Proposed by Kim and Leskovec (2010).
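The multiplicative rule can be written out as a short function. The matrices below are hypothetical: the slide's entries are not recoverable from the transcript, so the values were picked only so that the three factors for Yun and Bill come out as 0.3, 0.6, and 0.6. The indexing convention `theta[sender_bit][receiver_bit]`, with 1 for "has the attribute", is likewise an assumption.

```python
# Hypothetical parameter matrices, indexed as theta[sender_bit][receiver_bit].
THETA_SC = [[0.9, 0.5], [0.3, 0.8]]
THETA_ESP = [[0.6, 0.2], [0.4, 0.7]]
THETA_BB = [[0.5, 0.1], [0.2, 0.6]]


def mag_edge_prob(thetas, sender, receiver):
    """Multiply one entry per attribute, selected by the two nodes' attribute bits."""
    p = 1.0
    for theta, a, b in zip(thetas, sender, receiver):
        p *= theta[a][b]
    return p


# Attribute order: (StarCraft, Esperanto, Big Bang Theory); 1 = O, 0 = X.
yun = (1, 0, 1)
bill = (0, 0, 1)
p_yun_bill = mag_edge_prob([THETA_SC, THETA_ESP, THETA_BB], yun, bill)
```

With these stand-in matrices, `p_yun_bill` is the product of the three factors, about 0.108.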

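41 Graphical Representation. [Figure: the attributes of senders (rows) and of receivers (columns) determine the edge probability matrix $Q$, from which the adjacency matrix is sampled.] $Q$: edge probability matrix. Each edge is independently sampled. Naively this takes $O(n^2 d)$ time. Can we do it faster?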
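For reference, the naive $O(n^2 d)$ sampler can be sketched as follows. It assumes each attribute is drawn independently with probability `mu` (the attribute density that appears later in the talk); the function names are invented for the example.

```python
import random


def sample_attributes(n, d, mu, rng):
    """Give each of n nodes d binary attributes, each equal to 1 with probability mu."""
    return [tuple(1 if rng.random() < mu else 0 for _ in range(d))
            for _ in range(n)]


def sample_mag_naive(thetas, attrs, rng):
    """Visit every ordered pair of nodes and flip a coin with probability Q[i][j]."""
    edges = []
    for i, sender in enumerate(attrs):
        for j, receiver in enumerate(attrs):
            q = 1.0
            for theta, a, b in zip(thetas, sender, receiver):  # d multiplications
                q *= theta[a][b]
            if rng.random() < q:
                edges.append((i, j))
    return edges


# Hypothetical setup: 3 attributes sharing one illustrative parameter matrix.
rng = random.Random(0)
attrs = sample_attributes(16, 3, 0.5, rng)
edges = sample_mag_naive([[[0.9, 0.7], [0.5, 0.3]]] * 3, attrs, rng)
```

The two nested loops over nodes and the inner loop over attributes make the `n * n * d` cost explicit.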


43 Idea. Suppose we have a set of attributes which looks like the following [attribute table shown on the slide], and the parameter matrix $\Theta$ for each attribute is the same. What would the edge probability matrix $Q$ look like?

44 Idea. This is the answer! Looks familiar? It is a Kronecker power of the matrix! [Figure: the edge probability matrix $Q$, with rows indexed by the attributes of senders and columns by the attributes of receivers, and blocks labeled $\Theta$, $\Theta^{[2]}$, and $\Theta^{[3]}$.]
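This correspondence is easy to check numerically. The sketch below assumes that the attribute table on the slide lists every one of the `2 ** d` configurations exactly once, in binary counting order, and it uses a made-up matrix `THETA`; under those assumptions the MAG edge probability matrix and the Kronecker power agree entry by entry.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def kron_power(theta, d):
    """d-fold Kronecker product of theta with itself (d >= 1)."""
    result = theta
    for _ in range(d - 1):
        result = kron(result, theta)
    return result


def bits(i, d):
    """Binary digits of i as a tuple of length d, most significant first."""
    return tuple((i >> (d - 1 - k)) & 1 for k in range(d))


def mag_prob_matrix(theta, attrs):
    """Edge probability matrix Q of a MAG whose attributes all share one theta."""
    def prob(sender, receiver):
        p = 1
        for a, b in zip(sender, receiver):
            p *= theta[a][b]
        return p
    return [[prob(s, r) for r in attrs] for s in attrs]


d = 3
THETA = [[0.9, 0.7], [0.5, 0.3]]              # hypothetical values
attrs = [bits(i, d) for i in range(2 ** d)]   # every configuration exactly once
Q = mag_prob_matrix(THETA, attrs)
K = kron_power(THETA, d)
same = all(math.isclose(Q[i][j], K[i][j])
           for i in range(2 ** d) for j in range(2 ** d))
```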

48 Connection. If every node has a unique attribute configuration, it suffices to sample from the SKG model! The problem is, there are duplicates... Solution: sample multiple SKGs!

49 Step 1. Partition the nodes into sets such that each node has a unique attribute configuration within its set. [Figure: the nodes split into sets (1) and (2).]
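One simple way to build such a partition, offered here as an assumption rather than as the talk's exact procedure, is to put the k-th node seen with a given configuration into the k-th set. The number of sets then equals the largest number of nodes that share one configuration.

```python
def partition_nodes(attrs):
    """Return a list of sets; each maps an attribute configuration to one node id."""
    seen = {}   # configuration -> how many nodes with it have been placed so far
    sets = []
    for node, config in enumerate(attrs):
        k = seen.get(config, 0)
        seen[config] = k + 1
        if k == len(sets):   # first time any configuration occurs k + 1 times
            sets.append({})
        sets[k][config] = node
    return sets


# Five nodes with two binary attributes; configuration (0, 1) occurs three times.
example = [(0, 1), (1, 1), (0, 1), (0, 0), (0, 1)]
parts = partition_nodes(example)
B = len(parts)  # size of the partition
```

For the example above, `parts` has three sets: the first holds one node for each distinct configuration, and the other two each hold one of the remaining `(0, 1)` nodes.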

50 Step 2. Use this partition to divide the edge probability matrix $Q$ into blocks: $Q = \begin{pmatrix} Q^{(1,1)} & Q^{(1,2)} \\ Q^{(2,1)} & Q^{(2,2)} \end{pmatrix}$.

51 Step 3. When permuted, each $Q^{(k,l)}$ matrix becomes a submatrix of the Kronecker product matrix. [Figure: $Q^{(1,1)}$ before and after the permutation.]

52 Step 4. We sample a graph for each $Q^{(k,l)}$ and quilt these pieces together to form the final graph: $A = \begin{pmatrix} A^{(1,1)} & A^{(1,2)} \\ A^{(2,1)} & A^{(2,2)} \end{pmatrix}$.
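Steps 1 to 4 can be put together in a compact sketch. Two simplifications are assumed here and are not taken from the slides: the partition uses the "k-th occurrence goes to set k" rule, and the SKG sampler is a slow cell-by-cell stand-in (`sample_skg_naive`) where a real implementation would plug in the fast quadrant-descent sampler. All function names are hypothetical.

```python
import random


def config_index(config):
    """Read a 0/1 attribute tuple as a binary number (first attribute = top bit)."""
    idx = 0
    for bit in config:
        idx = 2 * idx + bit
    return idx


def partition_nodes(attrs):
    """Step 1: the k-th node seen with a given configuration goes to set k."""
    seen, sets = {}, []
    for node, config in enumerate(attrs):
        x = config_index(config)
        k = seen.get(x, 0)
        seen[x] = k + 1
        if k == len(sets):
            sets.append({})
        sets[k][x] = node  # configuration index -> node id, unique within set k
    return sets


def sample_skg_naive(thetas, rng):
    """Stand-in SKG sampler over all 2^d x 2^d configuration pairs."""
    d = len(thetas)
    edges = []
    for x in range(2 ** d):
        for y in range(2 ** d):
            p = 1.0
            for k, theta in enumerate(thetas):
                p *= theta[(x >> (d - 1 - k)) & 1][(y >> (d - 1 - k)) & 1]
            if rng.random() < p:
                edges.append((x, y))
    return edges


def quilt(thetas, attrs, rng):
    """Steps 2-4: sample one SKG per pair of sets and stitch the pieces together."""
    sets = partition_nodes(attrs)
    edges = set()
    for senders in sets:
        for receivers in sets:
            for x, y in sample_skg_naive(thetas, rng):  # fresh SKG for this block
                # Keep only cells whose configurations occur in this block,
                # and translate configuration indices back to node ids.
                if x in senders and y in receivers:
                    edges.add((senders[x], receivers[y]))
    return edges


attrs = [(0, 1), (1, 1), (0, 1), (0, 0)]
graph = quilt([[[0.9, 0.7], [0.5, 0.3]]] * 2, attrs, random.Random(0))
```

Every ordered pair of nodes falls into exactly one block, and within that block its cell is sampled exactly once, so each edge is still an independent coin flip with the MAG probability.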

53 Analysis. The size of the partition $B$ is critically important! In the last example $B = 2$, so we had to sample $B^2 = 2^2 = 4$ SKGs. If $B = 3$, then we have to sample $3^2 = 9$ SKGs! If $B = O(n)$, this is useless! When the density of the attribute matrix $\mu$ is 0.5, then $B = O(\log_2 n)$. Since sampling each SKG takes $O(\log_2(n)\,|E|)$, the overall time complexity is $O\big((\log_2 n)^3\,|E|\big)$.
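Written out as one line, the bound combines the number of SKGs with the cost of each. This is only a restatement of the slide's argument, using $|E|$ as an upper bound on the per-SKG edge count:

```latex
\underbrace{B^2}_{\text{number of SKGs}}
\times
\underbrace{O\big(\log_2(n)\,|E|\big)}_{\text{cost per SKG}}
= O\big((\log_2 n)^2\big) \times O\big(\log_2(n)\,|E|\big)
= O\big((\log_2 n)^3\,|E|\big)
```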

54 Analysis. Actually, $B = O(\log_2 n)$ is not even a tight bound. [Figure: observed size of the partition $B$ compared with $\log_2 n$, plotted against the number of nodes $n$.]
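A small simulation gives a feel for how $B$ behaves. The sketch assumes `d = log2(n)` attributes drawn independently with density `mu`, and it takes $B$ to be the largest number of nodes sharing one configuration; it prints `n`, the observed `B`, and `log2(n)` side by side without claiming any particular values.

```python
import random
from collections import Counter


def partition_size(n, d, mu, rng):
    """Largest number of nodes that share one attribute configuration."""
    counts = Counter(
        tuple(rng.random() < mu for _ in range(d)) for _ in range(n)
    )
    return max(counts.values())


rng = random.Random(0)
for d in (8, 10, 12):
    n = 2 ** d
    print(n, partition_size(n, d, 0.5, rng), d)  # n, observed B, log2(n)
```

Changing `0.5` to a value closer to 0 or 1 shows how quickly a few configurations start to dominate.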

55 Further Considerations. Existence of small sets: sampling a whole SKG for a small set is wasteful. $\mu \neq 0.5$: when $\mu$ is close to 0 or 1, some attribute configurations are more frequent than others. [Figure: number of occurrences versus attribute configuration (ranked), for $\mu$ = 0.5, 0.6, 0.7, and 0.9.]


57 Setup. We chose two parameter matrices from the literature (Kim and Leskovec 2010; Moreno and Neville 2009), $\Theta_1$ and $\Theta_2$ [entries shown on the slide]. Set the number of attributes $d = \log_2 n$. Set $\mu = 0.5$. Now increase $n$ and observe the running time!

58 Total Running Time. [Figure: running time (ms) versus number of nodes $n$, Quilting compared with Naive, one panel each for $\Theta_1$ and $\Theta_2$.]

59 Running Time per Edge. [Figure: running time per edge versus number of nodes $n$, Quilting compared with Naive, one panel each for $\Theta_1$ and $\Theta_2$.]

60 The Effect of Attribute Density $\mu$. [Figure: relative running time $\rho(\mu)$ versus attribute probability $\mu$, one panel each for $\Theta_1$ and $\Theta_2$, with curves for $n = 2^{10}, 2^{12}, 2^{14}, 2^{16}, \ldots$]


62 Summary. Question: how can we efficiently sample graphs from the Multiplicative Attribute Graphs model? We introduce the first sub-quadratic algorithm for sampling Multiplicative Attribute Graphs. Time complexity: $O\big((\log_2 n)^3\,|E|\big)$ under mild conditions, where $n$ is the number of nodes in the graph and $|E|$ is the number of edges. The algorithm exploits the close connection between Stochastic Kronecker Graphs (SKG) and Multiplicative Attribute Graphs (MAG). It can sample a graph with 8 million nodes and 20 billion edges in under 6 hours (the naïve algorithm would take 93 days).

63 Further Direction. The analysis for $\mu \neq 0.5$ is only empirical. When the attribute matrix is sparse and high-dimensional, the method fails completely; similarity search algorithms such as Locality Sensitive Hashing (LSH) or Cover Tree may help. The characteristics of such MAG models would be of interest.


RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS RETRIEVAL MODELS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Boolean model Vector space model Probabilistic

More information

MATRIX DETERMINANTS. 1 Reminder Definition and components of a matrix

MATRIX DETERMINANTS. 1 Reminder Definition and components of a matrix MATRIX DETERMINANTS Summary Uses... 1 1 Reminder Definition and components of a matrix... 1 2 The matrix determinant... 2 3 Calculation of the determinant for a matrix... 2 4 Exercise... 3 5 Definition

More information

Robust Principal Component Analysis

Robust Principal Component Analysis ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M

More information

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016 Log Gaussian Cox Processes Chi Group Meeting February 23, 2016 Outline Typical motivating application Introduction to LGCP model Brief overview of inference Applications in my work just getting started

More information

Parameter estimators of sparse random intersection graphs with thinned communities

Parameter estimators of sparse random intersection graphs with thinned communities Parameter estimators of sparse random intersection graphs with thinned communities Lasse Leskelä Aalto University Johan van Leeuwaarden Eindhoven University of Technology Joona Karjalainen Aalto University

More information

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 05 Semi-Discrete Decomposition Rainer Gemulla, Pauli Miettinen May 16, 2013 Outline 1 Hunting the Bump 2 Semi-Discrete Decomposition 3 The Algorithm 4 Applications SDD alone SVD

More information

Content-based Recommendation

Content-based Recommendation Content-based Recommendation Suthee Chaidaroon June 13, 2016 Contents 1 Introduction 1 1.1 Matrix Factorization......................... 2 2 slda 2 2.1 Model................................. 3 3 flda 3

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

More on Neural Networks

More on Neural Networks More on Neural Networks Yujia Yan Fall 2018 Outline Linear Regression y = Wx + b (1) Linear Regression y = Wx + b (1) Polynomial Regression y = Wφ(x) + b (2) where φ(x) gives the polynomial basis, e.g.,

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 7, 04 Reading: See class website Eric Xing @ CMU, 005-04

More information

Curriculum Catalog

Curriculum Catalog 2017-2018 Curriculum Catalog 2017 Glynlyon, Inc. Table of Contents ALGEBRA II COURSE OVERVIEW... 1 UNIT 1: SET, STRUCTURE, AND FUNCTION... 1 UNIT 2: NUMBERS, SENTENCES, AND PROBLEMS... 2 UNIT 3: LINEAR

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

Accelerated Training of Max-Margin Markov Networks with Kernels

Accelerated Training of Max-Margin Markov Networks with Kernels Accelerated Training of Max-Margin Markov Networks with Kernels Xinhua Zhang University of Alberta Alberta Innovates Centre for Machine Learning (AICML) Joint work with Ankan Saha (Univ. of Chicago) and

More information

A Nearly Sublinear Approximation to exp{p}e i for Large Sparse Matrices from Social Networks

A Nearly Sublinear Approximation to exp{p}e i for Large Sparse Matrices from Social Networks A Nearly Sublinear Approximation to exp{p}e i for Large Sparse Matrices from Social Networks Kyle Kloster and David F. Gleich Purdue University December 14, 2013 Supported by NSF CAREER 1149756-CCF Kyle

More information

Data mining in large graphs

Data mining in large graphs Data mining in large graphs Christos Faloutsos University www.cs.cmu.edu/~christos ALLADIN 2003 C. Faloutsos 1 Outline Introduction - motivation Patterns & Power laws Scalability & Fast algorithms Fractals,

More information

Finite Model Theory and Graph Isomorphism. II.

Finite Model Theory and Graph Isomorphism. II. Finite Model Theory and Graph Isomorphism. II. Anuj Dawar University of Cambridge Computer Laboratory visiting RWTH Aachen Beroun, 13 December 2013 Recapitulation Finite Model Theory aims to study the

More information

Community Detection on Euclidean Random Graphs

Community Detection on Euclidean Random Graphs Community Detection on Euclidean Random Graphs Abishek Sankararaman, François Baccelli July 3, 207 Abstract Motivated by applications in online social networks, we introduce and study the problem of Community

More information

Network Event Data over Time: Prediction and Latent Variable Modeling

Network Event Data over Time: Prediction and Latent Variable Modeling Network Event Data over Time: Prediction and Latent Variable Modeling Padhraic Smyth University of California, Irvine Machine Learning with Graphs Workshop, July 25 th 2010 Acknowledgements PhD students:

More information

Algebra 2 Syllabus. Certificated Teacher: Date: Desired Results

Algebra 2 Syllabus. Certificated Teacher: Date: Desired Results Algebra 2 Syllabus Certificated Teacher: Date: 2012-13 Desired Results Course Title/Grade Level: Algebra 2 A and B Credit: one semester (.5) x two semesters (1) Estimate of hours per week engaged in learning

More information

Graph Sparsification III: Ramanujan Graphs, Lifts, and Interlacing Families

Graph Sparsification III: Ramanujan Graphs, Lifts, and Interlacing Families Graph Sparsification III: Ramanujan Graphs, Lifts, and Interlacing Families Nikhil Srivastava Microsoft Research India Simons Institute, August 27, 2014 The Last Two Lectures Lecture 1. Every weighted

More information

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016 A Random Dot Product Model for Weighted Networks arxiv:1611.02530v1 [stat.ap] 8 Nov 2016 Daryl R. DeFord 1 Daniel N. Rockmore 1,2,3 1 Department of Mathematics, Dartmouth College, Hanover, NH, USA 03755

More information

Algorithmic approaches to fitting ERG models

Algorithmic approaches to fitting ERG models Ruth Hummel, Penn State University Mark Handcock, University of Washington David Hunter, Penn State University Research funded by Office of Naval Research Award No. N00014-08-1-1015 MURI meeting, April

More information

Fast Multipole Methods: Fundamentals & Applications. Ramani Duraiswami Nail A. Gumerov

Fast Multipole Methods: Fundamentals & Applications. Ramani Duraiswami Nail A. Gumerov Fast Multipole Methods: Fundamentals & Applications Ramani Duraiswami Nail A. Gumerov Week 1. Introduction. What are multipole methods and what is this course about. Problems from physics, mathematics,

More information

Modeling heterogeneity in random graphs

Modeling heterogeneity in random graphs Modeling heterogeneity in random graphs Catherine MATIAS CNRS, Laboratoire Statistique & Génome, Évry (Soon: Laboratoire de Probabilités et Modèles Aléatoires, Paris) http://stat.genopole.cnrs.fr/ cmatias

More information

Bayesian Learning in Undirected Graphical Models

Bayesian Learning in Undirected Graphical Models Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul

More information

An Introduction to Exponential-Family Random Graph Models

An Introduction to Exponential-Family Random Graph Models An Introduction to Exponential-Family Random Graph Models Luo Lu Feb.8, 2011 1 / 11 Types of complications in social network Single relationship data A single relationship observed on a set of nodes at

More information

MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance

MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance Jingbo Shang, Jian Peng, Jiawei Han University of Illinois, Urbana-Champaign May 6, 2016 Presented by Jingbo Shang 2 Outline

More information

Linear Algebra and Probability

Linear Algebra and Probability Linear Algebra and Probability for Computer Science Applications Ernest Davis CRC Press Taylor!* Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor Sc Francis Croup, an informa

More information

Scaling Neighbourhood Methods

Scaling Neighbourhood Methods Quick Recap Scaling Neighbourhood Methods Collaborative Filtering m = #items n = #users Complexity : m * m * n Comparative Scale of Signals ~50 M users ~25 M items Explicit Ratings ~ O(1M) (1 per billion)

More information

VCMC: Variational Consensus Monte Carlo

VCMC: Variational Consensus Monte Carlo VCMC: Variational Consensus Monte Carlo Maxim Rabinovich, Elaine Angelino, Michael I. Jordan Berkeley Vision and Learning Center September 22, 2015 probabilistic models! sky fog bridge water grass object

More information

1 Complex Networks - A Brief Overview

1 Complex Networks - A Brief Overview Power-law Degree Distributions 1 Complex Networks - A Brief Overview Complex networks occur in many social, technological and scientific settings. Examples of complex networks include World Wide Web, Internet,

More information

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices Communities Via Laplacian Matrices Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices The Laplacian Approach As with betweenness approach, we want to divide a social graph into

More information

Matrix estimation by Universal Singular Value Thresholding

Matrix estimation by Universal Singular Value Thresholding Matrix estimation by Universal Singular Value Thresholding Courant Institute, NYU Let us begin with an example: Suppose that we have an undirected random graph G on n vertices. Model: There is a real symmetric

More information

Spectral Methods for Subgraph Detection

Spectral Methods for Subgraph Detection Spectral Methods for Subgraph Detection Nadya T. Bliss & Benjamin A. Miller Embedded and High Performance Computing Patrick J. Wolfe Statistics and Information Laboratory Harvard University 12 July 2010

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning More Approximate Inference Mark Schmidt University of British Columbia Winter 2018 Last Time: Approximate Inference We ve been discussing graphical models for density estimation,

More information