An Efficient reconciliation algorithm for social networks

1 An Efficient reconciliation algorithm for social networks Silvio Lattanzi (Google Research NY) Joint work with: Nitish Korula (Google Research NY) ICERM Stochastic Graph Models

2 Outline Graph reconciliation Model and theoretical results. Experimental results From theory to practice. Open problems and future directions

3 Graph reconciliation

6 Real world motivations Intra-language network Inter-language network

7 Real world motivations Can we use intra-language information to improve the inter-language graph?

14 Graph reconciliation problem Given two networks, identify as many users as possible across them. Applications: social networks, ontology reconciliation.

18 Previous work Problem of reconciliation introduced by Novak et al. Two main approaches: - ML on user profile features (name, location, image) - ML on neighborhood topology Limitations:

25 Previous work Very rich literature on de-anonymization. Two relevant works: - Backstrom et al. propose an active and a passive attack - Narayanan and Shmatikov present a successful de-anonymization attack

28 Narayanan and Shmatikov experiment Ground truth matching across the two social networks 80 me-links They could re-identify 30.8% of the mappings.

35 Narayanan and Shmatikov experiment Algorithm: Why? Is it necessary to have high degree me-links?

36 Abstraction Input: two graphs and a set of trusted matches. We want to maximize the number of final matches.

39 Is the problem tractable? The problem is similar to graph isomorphism. It seems even harder, because we want to detect similar structure.

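42 Abstraction Formalization of the problem: start from an underlying social network; build two copies by deleting edges independently, so that an edge survives in copy $i$ with probability $p_i$ ($i = 1, 2$); a set of initial matchings is given.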
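
The generation process on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `make_copies` and its arguments are hypothetical, and drawing each initial match independently with probability `l` is an assumption (the slide only says "Initial matchings"; the symbol $l$ appears in later formulas).

```python
import random

def make_copies(nodes, edges, p1, p2, l, seed=0):
    """Derive two noisy copies of an underlying graph, plus seed links.

    nodes: iterable of node ids; edges: iterable of (u, v) pairs.
    Hypothetical helper, not the authors' code.
    """
    rng = random.Random(seed)
    edges = list(edges)
    # Each underlying edge survives in copy i independently with probability p_i.
    g1 = {e for e in edges if rng.random() < p1}
    g2 = {e for e in edges if rng.random() < p2}
    # Assumption: each node is revealed as a trusted match with probability l.
    seeds = {v for v in nodes if rng.random() < l}
    return g1, g2, seeds
```

With `p1 = p2 = 1` both copies equal the underlying graph, which makes a convenient sanity check before adding noise.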

43 Questions Given me-links for a constant fraction of the nodes, can we reconcile the entire network? If we have k me-links, what fraction of the network can we reconcile?

45 Underlying social network Without additional assumptions on the underlying network, the problem still seems very hard. We study two different models for social networks: - G(n,p) - Preferential attachment

46 Our algorithm Algorithm: Narayanan Shmatikov + degree bucketing + acceptance threshold
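
A minimal sketch of how the three ingredients on this slide could fit together. It is an illustration, not the authors' implementation: the names `reconcile`, `threshold` and `max_rounds` are hypothetical, and two details are assumptions made here, namely scoring a candidate pair by the number of already-matched neighbor pairs it shares, and accepting a pair only when it is the unique best candidate for both of its endpoints.

```python
from collections import defaultdict

def _unique_best(scores, side):
    """For each node on one side, its best-scoring pair (None on ties)."""
    best = {}
    for pair, s in scores.items():
        key = pair[side]
        if key not in best or s > best[key][0]:
            best[key] = (s, pair)
        elif s == best[key][0]:
            best[key] = (s, None)
    return best

def reconcile(adj1, adj2, seeds, threshold=2, max_rounds=10):
    """adj1, adj2: dict node -> set of neighbors; seeds: dict node1 -> node2."""
    matched = dict(seeds)
    max_deg = max([len(nb) for nb in adj1.values()] +
                  [len(nb) for nb in adj2.values()] + [1])
    for _ in range(max_rounds):
        before = len(matched)
        # Degree bucketing: start with high-degree nodes, then halve the bar.
        bucket = 1 << (max_deg.bit_length() - 1)
        while bucket >= 1:
            taken = set(matched.values())
            scores = defaultdict(int)
            # Every matched pair (w1, w2) is a witness for each pair of
            # still-unmatched neighbors (u, v) with degree >= bucket.
            for w1, w2 in matched.items():
                for u in adj1.get(w1, ()):
                    if u in matched or len(adj1.get(u, ())) < bucket:
                        continue
                    for v in adj2.get(w2, ()):
                        if v in taken or len(adj2.get(v, ())) < bucket:
                            continue
                        scores[(u, v)] += 1
            best1, best2 = _unique_best(scores, 0), _unique_best(scores, 1)
            # Acceptance threshold: enough witnesses and unique best on both sides.
            for pair, s in scores.items():
                if (s >= threshold and best1[pair[0]][1] == pair
                        and best2[pair[1]][1] == pair):
                    matched[pair[0]] = pair[1]
            bucket //= 2
        if len(matched) == before:
            break
    return matched
```

Processing high-degree buckets first means that early matches rest on many witnesses, and those matches then serve as witnesses for lower-degree nodes in later passes.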

48 G(n,p) Does the technique work if the underlying graph is random? For the two copies of the same node $u$: $E[|N_{G_1}(u) \cap N_{G_2}(u)|] = (n-1)\, p\, p_1 p_2$. For two different nodes $u \ne v$: $E[|N_{G_1}(u) \cap N_{G_2}(v)|] = (n-2)\, p^2 p_1 p_2$.
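
One way to see where the two expectations come from, written out as a short derivation. The node labels $u$, $v$, $w$ are introduced here for illustration; $p$ is the edge probability of the underlying G(n,p) graph and $p_1$, $p_2$ are the edge survival probabilities in the two copies.

```latex
% Same node u in both copies: each of the other n-1 nodes w is a neighbor
% of u in G with probability p; the edge (u,w) then survives in G_1 with
% probability p_1 and, independently, in G_2 with probability p_2.
\mathbb{E}\,\bigl|N_{G_1}(u) \cap N_{G_2}(u)\bigr|
  = \sum_{w \ne u} p \, p_1 \, p_2
  = (n-1)\, p\, p_1 p_2

% Different nodes u \ne v: a third node w must be adjacent to both u and v
% in G (probability p^2); (u,w) must survive in G_1 and (v,w) in G_2.
\mathbb{E}\,\bigl|N_{G_1}(u) \cap N_{G_2}(v)\bigr|
  = \sum_{w \ne u, v} (p\, p_1)(p\, p_2)
  = (n-2)\, p^2 p_1 p_2

% The ratio of the two expectations is roughly 1/p.
\frac{(n-1)\, p\, p_1 p_2}{(n-2)\, p^2 p_1 p_2} = \frac{n-1}{(n-2)\, p} \approx \frac{1}{p}
```

The gap of about $1/p$ between the correct pair and a wrong pair is what a threshold on the number of common neighbors can exploit.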

49 Concentration We assume $\frac{c \log n}{n} \le p \le \frac{1}{6}$ and $l, p_1, p_2 \in O(1)$. Two cases:
- $n p p_1 p_2 l \ge 24 \log n$: a Chernoff bound is enough.
- $n p p_1 p_2 l \le 24 \log n$: we never make errors. With $x = (n-2)\, p^2 p_1 p_2$,
$$P = \Pr\left[\sum_{i=1}^{n} B_i \le 2\right] = (1-x)^n + n x (1-x)^{n-1} + \binom{n}{2} x^2 (1-x)^{n-2} = 1 - n^3 x^3 - o(n^3 x^3)$$
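
The three-term sum on this slide is the probability that a binomial variable with $n$ trials and success probability $x$ is at most 2. A small numerical check, as a sketch: the function name `prob_at_most_two` is hypothetical, and the comparison against $\binom{n}{3} x^3$ (the exact leading term of the complement, which is at most $n^3 x^3$) is added here for illustration.

```python
from math import comb

def prob_at_most_two(n: int, x: float) -> float:
    """Pr[Bin(n, x) <= 2], written as the three-term sum."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) for k in range(3))

# Example values chosen for illustration only.
n, x = 1000, 1e-5
tail = 1 - prob_at_most_two(n, x)      # probability of 3 or more successes
leading = comb(n, 3) * x**3            # leading term of the tail when n*x is small
print(tail, leading, n**3 * x**3)
```

When $n x$ is small the tail is dominated by the three-success term, so it is of order $n^3 x^3$.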

50 More realistic model Preferential attachment: - $G_m^1$ is a single node with $m$ self-loops - $G_m^n$ is obtained from $G_m^{n-1}$ by adding a node and $m$ edges, whose endpoints are chosen with probability proportional to the current degrees
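
A minimal generator for this process, as a sketch. The function name `preferential_attachment` is hypothetical, and one simplification is an assumption made here: all $m$ targets of a new node are drawn against the degrees as they were before that node arrived, so a new node never links to itself.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Return the edge list of a preferential attachment graph on n nodes."""
    rng = random.Random(seed)
    # `endpoints` lists every edge endpoint once, so a uniform draw from it
    # picks a node with probability proportional to its current degree.
    endpoints = [0] * (2 * m)      # node 0 starts with m self-loops
    edges = [(0, 0)] * m
    for v in range(1, n):
        # Draw all m targets before registering the new node's own endpoints.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for t in targets:
            edges.append((v, t))
            endpoints.extend((v, t))
    return edges
```

Keeping a flat list of endpoints trades memory for a very simple degree-proportional sampler; a graph with $n$ nodes has $nm$ edges and $2nm$ list entries.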

51 Preferential attachment A bit harder: - Several nodes have constant degree, so we need a cascade - The objective is to reconcile a constant fraction of the network

54 Sketch of the proof For high-degree nodes we can use concentration results. Different nodes of intermediate degree do not share many neighbors. High-degree nodes help to detect intermediate-degree nodes, which in turn help to detect small-degree nodes.

57 PA structural lemmas
- High-degree nodes are early birds. Nodes inserted after time $\epsilon n$, for a constant $\epsilon$, have degree in $o(\log^2 n)$.
- The rich get richer. For nodes of degree greater than $\log^2 n$, a constant fraction of their neighbors has been inserted after time $\epsilon n$, for a constant $\epsilon$.
- First-mover advantage. All nodes inserted before time $n^{0.3}$ have degree at least $\log^3 n$.

61 High degree nodes are early birds (Figure: timeline of the process from $G_m^1$ to $G_m^n$.) Let $d_i$ be the degree at the beginning of a phase. The probability that a node increases its degree is dominated by the probability of heads for a biased coin that comes up heads with probability $\frac{3 d_i}{n}$.

63 The rich get richer (Figure: timeline of the process from $G_m^1$ to $G_m^n$.) If at time $\epsilon n$ the node has degree less than $\frac{1}{2} d$, we are done. The probability that the node increases its degree is dominated by the probability of heads for a biased coin that comes up heads with probability $\frac{d}{2nm}$.

64 First-mover advantage From the Cooper and Frieze result on the cover time of PA graphs, with $D_k = d_{nm}(v_1) + d_{nm}(v_2) + \dots + d_{nm}(v_k)$:
$$\Pr\left(\left|D_k - 2\sqrt{2kn}\right| \ge 3\sqrt{mn \log(mn)}\right) \le (mn)^{-2}$$
$$\Pr\left(d_n(v_{k+1}) = d+1 \mid D_k - 2k = s\right) \le \frac{s+d}{2N - 2k - s - d}$$
Playing a bit with algebra we can get the final result.

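68 Matching high degree nodes (Figure: timeline of the process from $G_m^1$ to $G_m^n$.) For the two copies of a high-degree node $v$: $E[|N_{G_1}(v) \cap N_{G_2}(v)|] = d(v)\, p_1 p_2 l$, and by Chernoff $|N_{G_1}(v) \cap N_{G_2}(v)| \ge \frac{7}{8} d(v)\, p_1 p_2 l$ w.h.p. For a different node $u \ne v$: $|N_{G_1}(u) \cap N_{G_2}(v)| \le d(v)\, p_1 p_2 l + o(d(v))$, since $u$ has degree at most $\tilde{O}(\sqrt{n})$ and so the probability of connecting to it is $o(1)$.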

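73 Bound the mismatch score (Figure: timeline from $G_m^1$ to $G_m^n$ with the phase boundaries marked, starting at $n^{0.3}$.) Let $n_a = n^{a} = n^{0.3}$ and $n_b = n^{b}$. The probability that 3 nodes arriving between $n_a$ and $n_b$ point to both nodes of some pair is at most
$$\binom{n_b}{2} \sum_{i=n_a}^{n_b} \sum_{j=n_a}^{n_b} \sum_{k=n_a}^{n_b} \frac{\log^{3/2} n}{i-1} \cdot \frac{\log^{3/2} n}{j-1} \cdot \frac{\log^{3/2} n}{k-1},$$
which is of order $n^{2b-3a} \in o(1)$.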

77 Cascade (Figure: timeline from $G_m^1$ to $G_m^n$, marked at $n^{0.3}$ initially and at $n^{0.25}$ after one phase.) In each phase we fail to identify a small fraction of the nodes; in total we lose a small constant fraction.

80 Results Theorem 1: If the underlying network is a G(n,p) graph, it is possible to reconcile it completely. Theorem 2: If the underlying network is a PA graph, it is possible to reconcile a large fraction of it.

81 Experimental results

82 Experiments Experiments on different graphs:

83 PA experiment Are our theoretical results robust?

84 Scalability How does the algorithm scale with the size of the graph?

86 Facebook experiment How does the algorithm perform if the underlying graph is a social network? 80% recall!! Can we explain it in theory?

87 Facebook cascade experiment What happens if we generate the underlying network using a cascade process? We recover almost all of the graph in the intersection. Can we explain it in theory?

88 Affiliation network model What happens if we delete all the edges inside a subset of the communities? More than 80% recall. Can we explain it in theory?

91 Reconcile different graphs DBLP: we generate two co-authorship graphs, one considering only publications in even years and the other only publications in odd years. Gowalla: we generate two co-checkin graphs, one considering only checkins in even years and the other only checkins in odd years. German/French Wikipedia: we crawl the inter-language links, use a few of them as seeds, and check how many links we can recover.
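
The even/odd split used for DBLP can be sketched as follows. The input layout (a list of `(year, authors)` records) and the function name `split_by_year_parity` are assumptions made for illustration; the actual data format used in the experiments is not given on the slides.

```python
from itertools import combinations

def split_by_year_parity(publications):
    """Build two co-authorship edge sets from (year, [authors]) records."""
    even, odd = set(), set()
    for year, authors in publications:
        target = even if year % 2 == 0 else odd
        # Every pair of co-authors on a paper becomes an undirected edge,
        # stored with its endpoints in sorted order.
        for a, b in combinations(sorted(set(authors)), 2):
            target.add((a, b))
    return even, odd

# Toy records, invented for illustration.
pubs = [(2010, ["A", "B"]), (2011, ["B", "C", "A"])]
even_graph, odd_graph = split_by_year_parity(pubs)
```

Authors who publish together in both even and odd years appear in both graphs, which is what gives the reconciliation algorithm shared structure to work with.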

92 Reconcile different graphs Recall for Wikipedia ~30%

93 Reconcile different graphs We have really good performance for high degree nodes

94 Open problems and future directions

95 Extensions Other models of underlying graphs. Other models of network generation. Adversarial underlying network, errors in seed links.

96 Limitation of the current model User degrees vary across different social networks. How can we model this more general setting?

97 Better algorithm We currently explore only the direct neighborhood. Can we design better algorithms?

98 Thanks!


More information

A PECULIAR COIN-TOSSING MODEL

A PECULIAR COIN-TOSSING MODEL A PECULIAR COIN-TOSSING MODEL EDWARD J. GREEN 1. Coin tossing according to de Finetti A coin is drawn at random from a finite set of coins. Each coin generates an i.i.d. sequence of outcomes (heads or

More information

CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis

CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis Eli Upfal Eli Upfal@brown.edu Office: 319 TA s: Lorenzo De Stefani and Sorin Vatasoiu cs155tas@cs.brown.edu It is remarkable

More information

Security in Locally Repairable Storage

Security in Locally Repairable Storage 1 Security in Locally Repairable Storage Abhishek Agarwal and Arya Mazumdar Abstract In this paper we extend the notion of locally repairable codes to secret sharing schemes. The main problem we consider

More information

Network alignment and querying

Network alignment and querying Network biology minicourse (part 4) Algorithmic challenges in genomics Network alignment and querying Roded Sharan School of Computer Science, Tel Aviv University Multiple Species PPI Data Rapid growth

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( )

Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( ) Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr Pr = Pr Pr Pr() Pr Pr. We are given three coins and are told that two of the coins are fair and the

More information

Social Networks. Chapter 9

Social Networks. Chapter 9 Chapter 9 Social Networks Distributed computing is applicable in various contexts. This lecture exemplarily studies one of these contexts, social networks, an area of study whose origins date back a century.

More information

Handout 1: Probability

Handout 1: Probability Handout 1: Probability Boaz Barak Exercises due September 20, 2005 1 Suggested Reading This material is covered in Cormen, Leiserson, Rivest and Smith Introduction to Algorithms Appendix C. You can also

More information

Heat Kernel Based Community Detection

Heat Kernel Based Community Detection Heat Kernel Based Community Detection Joint with David F. Gleich, (Purdue), supported by" NSF CAREER 1149756-CCF Kyle Kloster! Purdue University! Local Community Detection Given seed(s) S in G, find a

More information

Data Mining and Analysis: Fundamental Concepts and Algorithms

Data Mining and Analysis: Fundamental Concepts and Algorithms Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA

More information

Advanced topic: Space complexity

Advanced topic: Space complexity Advanced topic: Space complexity CSCI 3130 Formal Languages and Automata Theory Siu On CHAN Chinese University of Hong Kong Fall 2016 1/28 Review: time complexity We have looked at how long it takes to

More information

On Influential Node Discovery in Dynamic Social Networks

On Influential Node Discovery in Dynamic Social Networks On Influential Node Discovery in Dynamic Social Networks Charu Aggarwal Shuyang Lin Philip S. Yu Abstract The problem of maximizing influence spread has been widely studied in social networks, because

More information

Hidden Markov Models

Hidden Markov Models Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training

More information

High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction

High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction Chapter 11 High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction High-dimensional vectors are ubiquitous in applications (gene expression data, set of movies watched by Netflix customer,

More information

Today. Statistical Learning. Coin Flip. Coin Flip. Experiment 1: Heads. Experiment 1: Heads. Which coin will I use? Which coin will I use?

Today. Statistical Learning. Coin Flip. Coin Flip. Experiment 1: Heads. Experiment 1: Heads. Which coin will I use? Which coin will I use? Today Statistical Learning Parameter Estimation: Maximum Likelihood (ML) Maximum A Posteriori (MAP) Bayesian Continuous case Learning Parameters for a Bayesian Network Naive Bayes Maximum Likelihood estimates

More information

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations EECS 70 Discrete Mathematics and Probability Theory Fall 204 Anant Sahai Note 5 Random Variables: Distributions, Independence, and Expectations In the last note, we saw how useful it is to have a way of

More information

Lecture 6: The Pigeonhole Principle and Probability Spaces

Lecture 6: The Pigeonhole Principle and Probability Spaces Lecture 6: The Pigeonhole Principle and Probability Spaces Anup Rao January 17, 2018 We discuss the pigeonhole principle and probability spaces. Pigeonhole Principle The pigeonhole principle is an extremely

More information

Statistical Methods for the Social Sciences, Autumn 2012

Statistical Methods for the Social Sciences, Autumn 2012 Statistical Methods for the Social Sciences, Autumn 2012 Review Session 3: Probability. Exercises Ch.4. More on Stata TA: Anastasia Aladysheva anastasia.aladysheva@graduateinstitute.ch Office hours: Mon

More information

1 Ways to Describe a Stochastic Process

1 Ways to Describe a Stochastic Process purdue university cs 59000-nmc networks & matrix computations LECTURE NOTES David F. Gleich September 22, 2011 Scribe Notes: Debbie Perouli 1 Ways to Describe a Stochastic Process We will use the biased

More information

Lecture 11: Random Variables

Lecture 11: Random Variables EE5110: Probability Foundations for Electrical Engineers July-November 2015 Lecture 11: Random Variables Lecturer: Dr. Krishna Jagannathan Scribe: Sudharsan, Gopal, Arjun B, Debayani The study of random

More information

Dominating Set. Chapter 7

Dominating Set. Chapter 7 Chapter 7 Dominating Set In this chapter we present another randomized algorithm that demonstrates the power of randomization to break symmetries. We study the problem of finding a small dominating set

More information

Hard-Core Model on Random Graphs

Hard-Core Model on Random Graphs Hard-Core Model on Random Graphs Antar Bandyopadhyay Theoretical Statistics and Mathematics Unit Seminar Theoretical Statistics and Mathematics Unit Indian Statistical Institute, New Delhi Centre New Delhi,

More information

EVOLUTION is manifested to be a common property of. Interest-aware Information Diffusion in Evolving Social Networks

EVOLUTION is manifested to be a common property of. Interest-aware Information Diffusion in Evolving Social Networks Interest-aware Information Diffusion in Evolving Social Networks Jiaqi Liu, Luoyi Fu, Zhe Liu, Xiao-Yang Liu and Xinbing Wang Abstract Many realistic wireless social networks are manifested to be evolving

More information

Symmetric Rendezvous in Graphs: Deterministic Approaches

Symmetric Rendezvous in Graphs: Deterministic Approaches Symmetric Rendezvous in Graphs: Deterministic Approaches Shantanu Das Technion, Haifa, Israel http://www.bitvalve.org/~sdas/pres/rendezvous_lorentz.pdf Coauthors: Jérémie Chalopin, Adrian Kosowski, Peter

More information

6.842 Randomness and Computation Lecture 5

6.842 Randomness and Computation Lecture 5 6.842 Randomness and Computation 2012-02-22 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Michael Forbes 1 Overview Today we will define the notion of a pairwise independent hash function, and discuss its

More information

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These

More information

Solving Random Satisfiable 3CNF Formulas in Expected Polynomial Time

Solving Random Satisfiable 3CNF Formulas in Expected Polynomial Time Solving Random Satisfiable 3CNF Formulas in Expected Polynomial Time Michael Krivelevich and Dan Vilenchik Tel-Aviv University Solving Random Satisfiable 3CNF Formulas in Expected Polynomial Time p. 1/2

More information

CPSC 467: Cryptography and Computer Security

CPSC 467: Cryptography and Computer Security CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 14 October 16, 2013 CPSC 467, Lecture 14 1/45 Message Digest / Cryptographic Hash Functions Hash Function Constructions Extending

More information

Network Augmentation and the Multigraph Conjecture

Network Augmentation and the Multigraph Conjecture Network Augmentation and the Multigraph Conjecture Nathan Kahl Department of Mathematical Sciences Stevens Institute of Technology Hoboken, NJ 07030 e-mail: nkahl@stevens-tech.edu Abstract Let Γ(n, m)

More information

15-780: Graduate Artificial Intelligence. Bayesian networks: Construction and inference

15-780: Graduate Artificial Intelligence. Bayesian networks: Construction and inference 15-780: Graduate Artificial Intelligence ayesian networks: Construction and inference ayesian networks: Notations ayesian networks are directed acyclic graphs. Conditional probability tables (CPTs) P(Lo)

More information

Lecture 6: Entropy Rate

Lecture 6: Entropy Rate Lecture 6: Entropy Rate Entropy rate H(X) Random walk on graph Dr. Yao Xie, ECE587, Information Theory, Duke University Coin tossing versus poker Toss a fair coin and see and sequence Head, Tail, Tail,

More information

k-symmetry Model: A General Framework To Achieve Identity Anonymization In Social Networks

k-symmetry Model: A General Framework To Achieve Identity Anonymization In Social Networks k-symmetry Model: A General Framework To Achieve Identity Anonymization In Social Networks Wentao Wu School of Computer Science and Technology, Fudan University, Shanghai, China 1 Introduction Social networks

More information

Stochastic Generative Hashing

Stochastic Generative Hashing Stochastic Generative Hashing B. Dai 1, R. Guo 2, S. Kumar 2, N. He 3 and L. Song 1 1 Georgia Institute of Technology, 2 Google Research, NYC, 3 University of Illinois at Urbana-Champaign Discussion by

More information

Self Similar (Scale Free, Power Law) Networks (I)

Self Similar (Scale Free, Power Law) Networks (I) Self Similar (Scale Free, Power Law) Networks (I) E6083: lecture 4 Prof. Predrag R. Jelenković Dept. of Electrical Engineering Columbia University, NY 10027, USA {predrag}@ee.columbia.edu February 7, 2007

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Social diversity promotes the emergence of cooperation in public goods games Francisco C. Santos 1, Marta D. Santos & Jorge M. Pacheco 1 IRIDIA, Computer and Decision Engineering Department, Université

More information

Strongly chordal and chordal bipartite graphs are sandwich monotone

Strongly chordal and chordal bipartite graphs are sandwich monotone Strongly chordal and chordal bipartite graphs are sandwich monotone Pinar Heggernes Federico Mancini Charis Papadopoulos R. Sritharan Abstract A graph class is sandwich monotone if, for every pair of its

More information

SOCIAL networks are becoming more and more important

SOCIAL networks are becoming more and more important TO APPEAR IN IEEE TRANACTION ON NETWORK CIENCE AND ENGINEERING, 2018 1 tructure-based ybil Detection in ocial Networks via Local Rule-based Propagation inghui Wang, tudent Member, IEEE, Jinyuan Jia, tudent

More information

Spectral Alignment of Networks Soheil Feizi, Gerald Quon, Muriel Medard, Manolis Kellis, and Ali Jadbabaie

Spectral Alignment of Networks Soheil Feizi, Gerald Quon, Muriel Medard, Manolis Kellis, and Ali Jadbabaie Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-205-005 February 8, 205 Spectral Alignment of Networks Soheil Feizi, Gerald Quon, Muriel Medard, Manolis Kellis, and

More information

Social Interaction Based Video Recommendation: Recommending YouTube Videos to Facebook Users

Social Interaction Based Video Recommendation: Recommending YouTube Videos to Facebook Users Social Interaction Based Video Recommendation: Recommending YouTube Videos to Facebook Users Bin Nie 1 Honggang Zhang 1 Yong Liu 2 1 Fordham University, Bronx, NY. Email: {bnie, hzhang44}@fordham.edu 2

More information

Overlapping Communities

Overlapping Communities Overlapping Communities Davide Mottin HassoPlattner Institute Graph Mining course Winter Semester 2017 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides GRAPH

More information

How many randomly colored edges make a randomly colored dense graph rainbow hamiltonian or rainbow connected?

How many randomly colored edges make a randomly colored dense graph rainbow hamiltonian or rainbow connected? How many randomly colored edges make a randomly colored dense graph rainbow hamiltonian or rainbow connected? Michael Anastos and Alan Frieze February 1, 2018 Abstract In this paper we study the randomly

More information

Random Graphs. 7.1 Introduction

Random Graphs. 7.1 Introduction 7 Random Graphs 7.1 Introduction The theory of random graphs began in the late 1950s with the seminal paper by Erdös and Rényi [?]. In contrast to percolation theory, which emerged from efforts to model

More information

The large deviation principle for the Erdős-Rényi random graph

The large deviation principle for the Erdős-Rényi random graph The large deviation principle for the Erdős-Rényi random graph (Courant Institute, NYU) joint work with S. R. S. Varadhan Main objective: how to count graphs with a given property Only consider finite

More information