CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
|
|
- William Boone
- 6 years ago
- Views:
Transcription
1 CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
2 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 2 (1) New problem: Outbreak detection (2) Develop an approximation algorithm It is a submodular opt. problem! (3) Speed-up greedy hill-climbing Valid for optimizing general submodular functions (i.e., also works for influence maximization) (4) Prove a new data dependent bound on the solution quality Valid for optimizing general submodular functions (i.e., also works for influence maximization)
3 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 3 Given a real city water distribution network And data on how contaminants spread in the network Detect the contaminant as quickly as possible Problem posed by the US Environmental Protection Agency S
4 10/24/2012 [Leskovec et al., KDD 07] Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, Part 2-4 Posts Blogs Information cascade Time ordered hyperlinks Which blogs should one read to detect cascades as effectively as possible?
5 [Leskovec et al., KDD 07] Want to read things before others do. Detect blue & yellow soon but miss red. Detect all stories but late. 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 5
6 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 6 Both of these two are an instance of the same underlying problem! Given a dynamic process spreading over a network We want to select a set of nodes to detect the process effectively Many other applications: Epidemics Influence propagation Network security
7 Utility of placing sensors: Water flow dynamics, demands of households, For each subset S V compute utility f(s) High impact Contamination outbreak S3 S1 S2 Low impact outbreak Medium impact outbreak S3 S1 S4 Set V of all network junctions Sensor reduces impact through early detection! S1 S4 S2 High sensing quality f(s) = 0.9 Low sensing quality f(s)= /24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 7
8 [Leskovec et al., KDD 07] Given: Graph G(V, E) Data on how outbreaks spread over the G: For each outbreak i we know the time T(i, u) when outbreak i contaminates node u Water distribution network (physical pipes and junctions) Simulator of water consumption&flow (built by Mech Eng. people) We simulate the contamination spread for every possible location. 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 8
9 [Leskovec et al., KDD 07] Given: Graph G(V, E) Data on how outbreaks spread over the G: For each outbreak i we know the time T(i, u) when outbreak i contaminates node u a b c b a c The network of the blogosphere Traces of the information flow Collect lots of blogs posts and trace hyperlinks to obtain data about information flow from a given blog. 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 9
10 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 10 Given: Graph G(V, E) Data on how outbreaks spread over the G: For each outbreak i we know the time T(i, u) when outbreak i contaminates node u Goal: Select a subset of nodes S that maximize the expected reward: max S V f S = P i f i S subject to: cost(s) < B i Expected reward for detecting outbreak i f i S is penalty reduction: f i S = π i π i (S)
11 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 11 Reward (1) Minimize time to detection (2) Maximize number of detected propagations (3) Minimize number of infected people Cost (node/location dependent): Reading big blogs is more time consuming Placing a sensor in a remote location is expensive outbreak i Monitoring blue node saves more people than monitoring the green node f(s)
12 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 12 Objective functions: 1) Time to detection (DT) How long does it take to detect a contamination? Penalty: π i (t) = min {t, T mmm } 2) Detection likelihood (DL) How many contaminations do we detect? We incur penalty if we don t detect: π i (t) = 0, π i ( ) = 1 3) Population affected (PA) How many people drank contaminated water? π i (t) = {# of blogs in cascade i at time t}. Observation: In all cases detecting sooner does not hurt!
13 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 13 Observation: Diminishing returns S 1 New sensor: S s S 1 S 3 S 2 S 2 S 4 Placement S={s 1, s 2 } Adding s helps a lot Placement S ={s 1, s 2, s 3, s 4 } Adding s helps very little
14 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 14 Claim: For all A B V and sensors s V\B f A s f A f B s f B Proof: Fix cascade i Show f i A = π i π i (T(A, i)) is submodular Consider A B V and sensor s V\B When does node s detect cascade i? 3 Cases: (1) T s, i T(A, i) then f i A s = f i A, f i B s = f i B and so f i A s f i A = 0 = f i B s f i B Since s detects too late, nobody benefits
15 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 15 Proof (contd.): 3 Cases: (2) T B, i T s, i < T(A, i) then f i A b f i A 0 = f i B s f i B s detects sooner than any node in A but after all in B. So u only helps improve the solution A. (3) T s, i < T(B, i) then f i A s f i A = π i π i T s, i f i (A) π i π i T s, i f i (B) = f i B s f i B Ineqaulity is due to non-decreasingness of f i ( ), i.e., f i A f i (B) So, f i ( ) is submodular! So, f( ) is also submodular f S = P i f i S i
16 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, Part 2-16 a b c d e Hill-climbing reward b Add sensor with highest marginal gain c d a e What do we know about optimizing submodular functions? A hill-climbing (i.e., greedy) is near optimal (1 1 e OOO) But: (1) This only works for unit cost case! (each sensor costs the same) For use each sensor s has cost c(s) (2) Hill-climbing algorithm is slow At each iteration we need to re-evaluate marginal gains of all nodes Runtime O( V K) for placing K sensors
17 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 17 Consider: Hill-climbing that ignores cost Ignore sensor cost Repeatedly select sensor with highest marginal gain Do this until the budget is exhausted How well does this work?
18 [Leskovec et al., KDD 07] Bad example: n sensors, budget B s 1 : reward r, cost B s 2 s n : reward r ε, cost 1 Hill-climbing always prefers more expensive sensor s 1 with reward r (and exhausts the budget) It never selects cheaper sensors with reward r ε For variable cost it can fail arbitrarily badly! Idea: What if we optimize benefit-cost ratio? s i = arg max s V f A i 1 {s} f(a i 1 ) c s Greedily pick sensor s i that maximizes benefit to cost ratio. 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 18
19 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 19 Benefit-cost ratio can also fail arbitrarily badly! Consider: budget B: 2 sensors s 1 and s 2 : Costs: c(s 1 ) = ε, c(s 2 ) = B Only 1 cascade: f(s 1 ) = 2ε, f(s 2 ) = B Then benefit-cost ratio is b c(s 1 ) = 2 and b c(s 2 ) = 1 So, we first select s 1 and then can not afford s 2 We get reward 2ε instead of B. Now send ε 0 and we get arbitrarily bad solution!
20 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 20 CELF (cost-effective lazy forward-selection) A two pass greedy algorithm: Set (solution) SS: Use benefit-cost greedy Set (solution) SSS: Use unit-cost greedy Final solution: S = arg max (f(s ), f(s )) How far is CELF from (unknown) optimal solution? Theorem: CELF is near optimal CELF achieves ½(1-1/e) factor approximation!
21 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, Part 2-21 a b c d e Hill-climbing reward b Add sensor with highest marginal gain c d a e What do we know about optimizing submodular functions? A hill-climbing (i.e., greedy) is near optimal (1 1 OOO) e But: (2) Hill-climbing algorithm is slow! At each iteration we need to reevaluate marginal gains of all nodes Runtime O( V K) for placing K sensors
22 [Leskovec et al., KDD 07] In round i + 1: So far we picked S i = {s 1,, s i } Now pick s i = arg max u f(s i {u}) f(s i ) This our old friend greedy hill-climbing algorithm. It maximizes the marginal benefit δ u (S i ) = f(s i {u}) f(s i ) By submodularity property: f S i u f S i f S j u f S j for i < j Observation: By submodularity: For every u δ u (S i ) δ u (S j ) for i j since S S i j δ u (S i ) δ u (S j ) Marginal benefits δ u only shrink! u (as S grows) Activating node u in step i helps more than activating it at step j (j>i) 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 22
23 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 23 Idea: Use δ i as upper-bound on δ j (j>i) Lazy hill-climbing: Keep an ordered list of marginal benefits δ i from previous iteration Re-evaluate δ i only for top node Re-sort and prune Marginal gain a b c d e S 1 ={a} f(s {u}) f(s) f(t {u}) f(t) S T
24 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 24 Idea: Use δ i as upper-bound on δ j (j>i) Lazy hill-climbing: Keep an ordered list of marginal benefits δ i from previous iteration Re-evaluate δ i only for top node Re-sort and prune Marginal gain a b c d e S 1 ={a} f(s {u}) f(s) f(t {u}) f(t) S T
25 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 25 Idea: Use δ i as upper-bound on δ j (j>i) Lazy hill-climbing: Keep an ordered list of marginal benefits δ i from previous iteration Re-evaluate δ i only for top node Re-sort and prune Marginal gain a d b e c S 1 ={a} S 2 ={a,b} f(s {u}) f(s) f(t {u}) f(t) S T
26 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 26 Back to the solution quality! The (1-1/e) bound for submodular functions is the worst case bound (worst over all possible inputs) Data dependent bound: Value of the bound depends on the input data On easy data, hill climbing may do better than 63% Can we say something about the solution quality when we know the input data?
27 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 27 Suppose S is some solution to f(s) s.t. S k f(s) is monotone & submodular Let T = {t 1,, t k } be the OPT solution For each u S let δ u = f(s {u}) f(s) Order δ u so that δ δ 1 2 δ n S k i=1 δ i Note: This is a data dependent bound (δ u depend on input data) Bound holds for any algorithm Makes no assumption about how S is computed For some inputs it can be very loose (worse than 63%) Then: f T f S +
28 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 28 Claim: For each u S let δ u = f S u f S Order δ u so that δ 1 δ 2 δ n S Then: f T f S + Proof: k i=1 δ i f T f T S = f S + k f S t 1 t i f S t 1 t i 1 i=1 k i=1 k i=1 f S + f S t i f S = f S + k δ ti Instead of taking t i T (of benefit δ ti ), we take the best possible element (δ i ) f S + i=1 δ i f T f S + i=1 δ i k (we proved this last time)
29 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 29 Real metropolitan area network V = 21,000 nodes E = 25,000 pipes water Use a cluster of 50 machines for a month Simulate 3.6 million epidemic scenarios (152 GB of epidemic data) By exploiting sparsity we fit it into main memory (16GB)
30 Solution quality F(A) Higher is better Offline the (1-1/e) bound Data-dependent bound Hill Climbing Number of sensors placed Data-dependent bound is much tighter (gives more accurate estimate of alg. performance) 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 30
31 [w/ Ostfeld et al., J. of Water Resource Planning] Author Score Placement heuristics perform much worse CELF 26 Sandia 21 U Exter 20 Bentley systems 19 Technion (1) 14 Bordeaux 12 U Cyprus 11 U Guelph 7 U Michigan 4 Michigan Tech U 3 Malcolm 2 Proteo 2 Technion (2) 1 Battle of Water Sensor Networks competition 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 31
32 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 32 Different objective functions give different sensor placements Population affected Detection likelihood
33 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 33 CELF is 10 times faster than greedy hill-climbing!
34 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 34 = I have 10 minutes. Which blogs should I read to be most up to date?? = Who are the most influential bloggers?
35 Want to read things before others do. Detect blue & yellow soon but miss red. Detect all stories but late. 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 35
36 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 36 Online bound is much tighter! 13% instead of 37% Old bound Our bound CELF
37 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 37 Heuristics perform much worse! One really needs to perform the optimization
38 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 38 CELF has 2 sub-algorithms. Which wins? Unit cost: CELF picks large popular blogs: instapundit.com, michellemalkin.com Cost-benefit: Cost proportional to the number of posts We can do much better when considering costs
39 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 39 Problem: Then CEF picks lots of small blogs that participate in few cascades We pick best solution that interpolates between the costs f(s)=0.3 Score f(s)=0.4 We can get good solutions with few blogs and few posts f(s)=0.2 Each curve represents a set of solutions S with the same final reward f(s)
40 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, Part 2-40 We want to generalize well to future (unknown) cascades Limiting selection to bigger blogs improves generalization!
41 [Leskovec et al., KDD 07] 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 41 CELF runs 700 times faster than simple hillclimbing algorithm
42 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 42 Observations Models Algorithms Small diameter, Edge clustering Patterns of signed edge creation Viral Marketing, Blogosphere, Memetracking Scale-Free Densification power law, Shrinking diameters Strength of weak ties, Core-periphery Erdös-Renyi model, Small-world model Structural balance, Theory of status Independent cascade model, Game theoretic model Preferential attachment, Copying model Microscopic model of evolving networks Kronecker Graphs Decentralized search Models for predicting edge signs Influence maximization, Outbreak detection, LIM PageRank, Hubs and authorities Link prediction, Supervised random walks Community detection: Girvan-Newman, Modularity
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu Find most influential set S of size k: largest expected cascade size f(s) if set S is activated
More informationCS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University
CS 322: (Social and Inormation) Network Analysis Jure Leskovec Stanord University Initially some nodes S are active Each edge (a,b) has probability (weight) p ab b 0.4 0.4 0.2 a 0.4 0.2 0.4 g 0.2 Node
More informationCS224W: Analysis of Networks Jure Leskovec, Stanford University
CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/30/17 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2
More informationActive Learning and Optimized Information Gathering
Active Learning and Optimized Information Gathering Lecture 13 Submodularity (cont d) CS 101.2 Andreas Krause Announcements Homework 2: Due Thursday Feb 19 Project milestone due: Feb 24 4 Pages, NIPS format:
More informationCSCI 3210: Computational Game Theory. Cascading Behavior in Networks Ref: [AGT] Ch 24
CSCI 3210: Computational Game Theory Cascading Behavior in Networks Ref: [AGT] Ch 24 Mohammad T. Irfan Email: mirfan@bowdoin.edu Web: www.bowdoin.edu/~mirfan Course Website: www.bowdoin.edu/~mirfan/csci-3210.html
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu Intro sessions to SNAP C++ and SNAP.PY: SNAP.PY: Friday 9/27, 4:5 5:30pm in Gates B03 SNAP
More information9. Submodular function optimization
Submodular function maximization 9-9. Submodular function optimization Submodular function maximization Greedy algorithm for monotone case Influence maximization Greedy algorithm for non-monotone case
More informationInfluence Maximization in Continuous Time Diffusion Networks
Manuel Gomez-Rodriguez 1,2 Bernhard Schölkopf 1 1 MPI for Intelligent Systems and 2 Stanford University MANUELGR@STANFORD.EDU BS@TUEBINGEN.MPG.DE Abstract The problem of finding the optimal set of source
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 12/3/2013 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
More informationJure Leskovec Stanford University
Jure Leskovec Stanford University 2 Part 1: Models for networks Part 2: Information flows in networks Information spread in social media 3 Information flows through networks Analyzing underlying mechanisms
More informationMaximizing the Spread of Influence through a Social Network. David Kempe, Jon Kleinberg, Éva Tardos SIGKDD 03
Maximizing the Spread of Influence through a Social Network David Kempe, Jon Kleinberg, Éva Tardos SIGKDD 03 Influence and Social Networks Economics, sociology, political science, etc. all have studied
More informationCSE541 Class 22. Jeremy Buhler. November 22, Today: how to generalize some well-known approximation results
CSE541 Class 22 Jeremy Buhler November 22, 2016 Today: how to generalize some well-known approximation results 1 Intuition: Behavior of Functions Consider a real-valued function gz) on integers or reals).
More informationCS224W: Analysis of Networks Jure Leskovec, Stanford University
Announcements: Please fill HW Survey Weekend Office Hours starting this weekend (Hangout only) Proposal: Can use 1 late period CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu
More informationDiffusion of Innovation and Influence Maximization
Diffusion of Innovation and Influence Maximization Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics
More informationWeb Structure Mining Nodes, Links and Influence
Web Structure Mining Nodes, Links and Influence 1 Outline 1. Importance of nodes 1. Centrality 2. Prestige 3. Page Rank 4. Hubs and Authority 5. Metrics comparison 2. Link analysis 3. Influence model 1.
More informationGreedy Maximization Framework for Graph-based Influence Functions
Greedy Maximization Framework for Graph-based Influence Functions Edith Cohen Google Research Tel Aviv University HotWeb '16 1 Large Graphs Model relations/interactions (edges) between entities (nodes)
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu Non-overlapping vs. overlapping communities 11/10/2010 Jure Leskovec, Stanford CS224W: Social
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 4: Query Optimization Query Optimization Cost estimation Strategies for exploring plans Q min CS 347 Notes 4 2 Cost Estimation Based on
More informationReview: TD-Learning. TD (SARSA) Learning for Q-values. Bellman Equations for Q-values. P (s, a, s )[R(s, a, s )+ Q (s, (s ))]
Review: TD-Learning function TD-Learning(mdp) returns a policy Class #: Reinforcement Learning, II 8s S, U(s) =0 set start-state s s 0 choose action a, using -greedy policy based on U(s) U(s) U(s)+ [r
More informationMore Approximation Algorithms
CS 473: Algorithms, Spring 2018 More Approximation Algorithms Lecture 25 April 26, 2018 Most slides are courtesy Prof. Chekuri Ruta (UIUC) CS473 1 Spring 2018 1 / 28 Formal definition of approximation
More informationPerformance Evaluation. Analyzing competitive influence maximization problems with partial information: An approximation algorithmic framework
Performance Evaluation ( ) Contents lists available at ScienceDirect Performance Evaluation journal homepage: www.elsevier.com/locate/peva Analyzing competitive influence maximization problems with partial
More informationThe Ties that Bind Characterizing Classes by Attributes and Social Ties
The Ties that Bind WWW April, 2017, Bryan Perozzi*, Leman Akoglu Stony Brook University *Now at Google. Introduction Outline Our problem: Characterizing Community Differences Proposed Method Experimental
More informationECS 253 / MAE 253, Lecture 15 May 17, I. Probability generating function recap
ECS 253 / MAE 253, Lecture 15 May 17, 2016 I. Probability generating function recap Part I. Ensemble approaches A. Master equations (Random graph evolution, cluster aggregation) B. Network configuration
More informationInfluence Maximization in Dynamic Social Networks
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang and Xiaoming Sun Department of Computer Science and Technology, Tsinghua University Department of Computer
More informationThanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides
Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize/navigate it? First try: Human curated Web directories Yahoo, DMOZ, LookSmart
More informationECS 253 / MAE 253, Lecture 13 May 15, Diffusion, Cascades and Influence Mathematical models & generating functions
ECS 253 / MAE 253, Lecture 13 May 15, 2018 Diffusion, Cascades and Influence Mathematical models & generating functions Last week: spatial flows and game theory on networks Optimal location of facilities
More informationCS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash
CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash Equilibrium Price of Stability Coping With NP-Hardness
More informationInfluence Spreading Path and its Application to the Time Constrained Social Influence Maximization Problem and Beyond
1 ing Path and its Application to the Time Constrained Social Influence Maximization Problem and Beyond Bo Liu, Gao Cong, Yifeng Zeng, Dong Xu,and Yeow Meng Chee Abstract Influence maximization is a fundamental
More informationAditya Bhaskara CS 5968/6968, Lecture 1: Introduction and Review 12 January 2016
Lecture 1: Introduction and Review We begin with a short introduction to the course, and logistics. We then survey some basics about approximation algorithms and probability. We also introduce some of
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec Stanford University Jure Leskovec, Stanford University http://cs224w.stanford.edu Task: Find coalitions in signed networks Incentives: European
More informationReinforcement Learning. Spring 2018 Defining MDPs, Planning
Reinforcement Learning Spring 2018 Defining MDPs, Planning understandability 0 Slide 10 time You are here Markov Process Where you will go depends only on where you are Markov Process: Information state
More informationIn Search of Influential Event Organizers in Online Social Networks
In Search of Influential Event Organizers in Online Social Networks ABSTRACT Kaiyu Feng 1,2 Gao Cong 2 Sourav S. Bhowmick 2 Shuai Ma 3 1 LILY, Interdisciplinary Graduate School. Nanyang Technological University,
More informationOverlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach
Overlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach Author: Jaewon Yang, Jure Leskovec 1 1 Venue: WSDM 2013 Presenter: Yupeng Gu 1 Stanford University 1 Background Community
More informationUncovering the Temporal Dynamics of Diffusion Networks
Manuel Gomez-Rodriguez,2 David Balduzzi Bernhard Schölkopf MPI for Intelligent Systems and 2 Stanford University MANUELGR@STANFORD.EDU BALDUZZI@TUEBINGEN.MPG.DE BS@TUEBINGEN.MPG.DE Abstract Time plays
More informationAn Online Algorithm for Maximizing Submodular Functions
An Online Algorithm for Maximizing Submodular Functions Matthew Streeter December 20, 2007 CMU-CS-07-171 Daniel Golovin School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 This research
More informationSocial Influence in Online Social Networks. Epidemiological Models. Epidemic Process
Social Influence in Online Social Networks Toward Understanding Spatial Dependence on Epidemic Thresholds in Networks Dr. Zesheng Chen Viral marketing ( word-of-mouth ) Blog information cascading Rumor
More information2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51
2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 Star Joins A common structure for data mining of commercial data is the star join. For example, a chain store like Walmart keeps a fact table whose tuples each
More informationIntroduction to Reinforcement Learning
CSCI-699: Advanced Topics in Deep Learning 01/16/2019 Nitin Kamra Spring 2019 Introduction to Reinforcement Learning 1 What is Reinforcement Learning? So far we have seen unsupervised and supervised learning.
More informationMulti-Round Influence Maximization
Multi-Round Influence Maximization Lichao Sun 1, Weiran Huang 2, Philip S. Yu 1,2, Wei Chen 3, 1 University of Illinois at Chicago, 2 Tsinghua University, 3 Microsoft Research lsun29@uic.edu, huang.inbox@outlook.com,
More informationthe tree till a class assignment is reached
Decision Trees Decision Tree for Playing Tennis Prediction is done by sending the example down Prediction is done by sending the example down the tree till a class assignment is reached Definitions Internal
More informationEffective placement of sensors for efficient early warning system in water distribution network MASHREKI ISLAM SAMI
Effective placement of sensors for efficient early warning system in water distribution network MASHREKI ISLAM SAMI AGENDA Introduction and Problem Statement Methodology Experimentation and Analysis Limitations
More informationCascading Behavior in Networks: Algorithmic and Economic Issues
CHAPTER 24 Cascading Behavior in Networks: Algorithmic and Economic Issues Jon Kleinberg Abstract The flow of information or influence through a large social network can be thought of as unfolding with
More information6.207/14.15: Networks Lecture 12: Generalized Random Graphs
6.207/14.15: Networks Lecture 12: Generalized Random Graphs 1 Outline Small-world model Growing random networks Power-law degree distributions: Rich-Get-Richer effects Models: Uniform attachment model
More informationAdaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization
Adaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization Daniel Golovin California Institute of Technology Pasadena, CA 91125 dgolovin@caltech.edu Andreas Krause California
More informationCSEP 521 Applied Algorithms. Richard Anderson Winter 2013 Lecture 1
CSEP 521 Applied Algorithms Richard Anderson Winter 2013 Lecture 1 CSEP 521 Course Introduction CSEP 521, Applied Algorithms Monday s, 6:30-9:20 pm CSE 305 and Microsoft Building 99 Instructor Richard
More information6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search
6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search Daron Acemoglu and Asu Ozdaglar MIT September 30, 2009 1 Networks: Lecture 7 Outline Navigation (or decentralized search)
More informationCS599: Convex and Combinatorial Optimization Fall 2013 Lecture 24: Introduction to Submodular Functions. Instructor: Shaddin Dughmi
CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 24: Introduction to Submodular Functions Instructor: Shaddin Dughmi Announcements Introduction We saw how matroids form a class of feasible
More informationPageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10
PageRank Ryan Tibshirani 36-462/36-662: Data Mining January 24 2012 Optional reading: ESL 14.10 1 Information retrieval with the web Last time we learned about information retrieval. We learned how to
More informationOn-Line Social Systems with Long-Range Goals
On-Line Social Systems with Long-Range Goals Jon Kleinberg Cornell University Including joint work with Ashton Anderson, Dan Huttenlocher, Jure Leskovec, and Sigal Oren. Long-Range Planning Growth in on-line
More informationOnline Interval Coloring and Variants
Online Interval Coloring and Variants Leah Epstein 1, and Meital Levy 1 Department of Mathematics, University of Haifa, 31905 Haifa, Israel. Email: lea@math.haifa.ac.il School of Computer Science, Tel-Aviv
More informationEdge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier
Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier W. Xie D. Bindel A. Demers J. Gehrke 12 Aug 2015 W. Xie, D. Bindel, A. Demers, J. Gehrke KDD2015 12 Aug 2015 1 / 1 PageRank
More information1 Maximizing a Submodular Function
6.883 Learning with Combinatorial Structure Notes for Lecture 16 Author: Arpit Agarwal 1 Maximizing a Submodular Function In the last lecture we looked at maximization of a monotone submodular function,
More informationHybrid Machine Learning Algorithms
Hybrid Machine Learning Algorithms Umar Syed Princeton University Includes joint work with: Rob Schapire (Princeton) Nina Mishra, Alex Slivkins (Microsoft) Common Approaches to Machine Learning!! Supervised
More informationOn efficient use of entropy centrality for social network analysis and community detection
On efficient use of entropy centrality for social network analysis and community detection ALEXANDER G. NIKOLAEV, RAIHAN RAZIB, ASHWIN KUCHERIYA PRESENTER: PRIYA BALACHANDRAN MARY ICSI 445/660 12/1/2015
More informationLecture 11 October 11, Information Dissemination through Social Networks
CS 284r: Incentives and Information in Networks Fall 2013 Prof. Yaron Singer Lecture 11 October 11, 2013 Scribe: Michael Tingley, K. Nathaniel Tucker 1 Overview In today s lecture we will start the second
More informationCost and Preference in Recommender Systems Junhua Chen LESS IS MORE
Cost and Preference in Recommender Systems Junhua Chen, Big Data Research Center, UESTC Email:junmshao@uestc.edu.cn http://staff.uestc.edu.cn/shaojunming Abstract In many recommender systems (RS), user
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com
More informationDiffusion of Innovation
Diffusion of Innovation Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Social Network Analysis
More informationStructure Learning: the good, the bad, the ugly
Readings: K&F: 15.1, 15.2, 15.3, 15.4, 15.5 Structure Learning: the good, the bad, the ugly Graphical Models 10708 Carlos Guestrin Carnegie Mellon University September 29 th, 2006 1 Understanding the uniform
More informationBin packing and scheduling
Sanders/van Stee: Approximations- und Online-Algorithmen 1 Bin packing and scheduling Overview Bin packing: problem definition Simple 2-approximation (Next Fit) Better than 3/2 is not possible Asymptotic
More informationCS 570: Machine Learning Seminar. Fall 2016
CS 570: Machine Learning Seminar Fall 2016 Class Information Class web page: http://web.cecs.pdx.edu/~mm/mlseminar2016-2017/fall2016/ Class mailing list: cs570@cs.pdx.edu My office hours: T,Th, 2-3pm or
More informationCITS4211 Mid-semester test 2011
CITS4211 Mid-semester test 2011 Fifty minutes, answer all four questions, total marks 60 Question 1. (12 marks) Briefly describe the principles, operation, and performance issues of iterative deepening.
More informationFaster Kriging on Graphs
Faster Kriging on Graphs Omkar Muralidharan Abstract [Xu et al. 2009] introduce a graph prediction method that is accurate but slow. My project investigates faster methods based on theirs that are nearly
More informationA THEORY OF NETWORK SECURITY: PRINCIPLES OF NATURAL SELECTION AND COMBINATORICS
Internet Mathematics, 2:45 204, 206 Copyright Taylor & Francis Group, LLC ISSN: 542-795 print/944-9488 online DOI: 0.080/542795.205.098755 A THEORY OF NETWORK SECURITY: PRINCIPLES OF NATURAL SELECTION
More informationCommunities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices
Communities Via Laplacian Matrices Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices The Laplacian Approach As with betweenness approach, we want to divide a social graph into
More informationProbability Models of Information Exchange on Networks Lecture 6
Probability Models of Information Exchange on Networks Lecture 6 UC Berkeley Many Other Models There are many models of information exchange on networks. Q: Which model to chose? My answer good features
More informationDetermining the Diameter of Small World Networks
Determining the Diameter of Small World Networks Frank W. Takes & Walter A. Kosters Leiden University, The Netherlands CIKM 2011 October 2, 2011 Glasgow, UK NWO COMPASS project (grant #12.0.92) 1 / 30
More informationContinuous Influence Maximization: What Discounts Should We Offer to Social Network Users?
Continuous Influence Maximization: What Discounts Should We Offer to Social Network Users? ABSTRACT Yu Yang Simon Fraser University Burnaby, Canada yya119@sfu.ca Jian Pei Simon Fraser University Burnaby,
More informationThree Disguises of 1 x = e λx
Three Disguises of 1 x = e λx Chathuri Karunarathna Mudiyanselage Rabi K.C. Winfried Just Department of Mathematics, Ohio University Mathematical Biology and Dynamical Systems Seminar Ohio University November
More informationCS264: Beyond Worst-Case Analysis Lecture #4: Parameterized Analysis of Online Paging
CS264: Beyond Worst-Case Analysis Lecture #4: Parameterized Analysis of Online Paging Tim Roughgarden January 19, 2017 1 Preamble Recall our three goals for the mathematical analysis of algorithms: the
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Mark Schmidt University of British Columbia Winter 2019 Last Time: Monte Carlo Methods If we want to approximate expectations of random functions, E[g(x)] = g(x)p(x) or E[g(x)]
More informationWord Alignment via Submodular Maximization over Matroids
Word Alignment via Submodular Maximization over Matroids Hui Lin, Jeff Bilmes University of Washington, Seattle Dept. of Electrical Engineering June 21, 2011 Lin and Bilmes Submodular Word Alignment June
More informationOn Influential Node Discovery in Dynamic Social Networks
On Influential Node Discovery in Dynamic Social Networks Charu Aggarwal Shuyang Lin Philip S. Yu Abstract The problem of maximizing influence spread has been widely studied in social networks, because
More informationCPSC 340: Machine Learning and Data Mining. Regularization Fall 2017
CPSC 340: Machine Learning and Data Mining Regularization Fall 2017 Assignment 2 Admin 2 late days to hand in tonight, answers posted tomorrow morning. Extra office hours Thursday at 4pm (ICICS 246). Midterm
More informationOnline Social Networks and Media. Opinion formation on social networks
Online Social Networks and Media Opinion formation on social networks Diffusion of items So far we have assumed that what is being diffused in the network is some discrete item: E.g., a virus, a product,
More informationStochastic Submodular Cover with Limited Adaptivity
Stochastic Submodular Cover with Limited Adaptivity Arpit Agarwal Sepehr Assadi Sanjeev Khanna Abstract In the submodular cover problem, we are given a non-negative monotone submodular function f over
More informationEM (cont.) November 26 th, Carlos Guestrin 1
EM (cont.) Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University November 26 th, 2007 1 Silly Example Let events be grades in a class w 1 = Gets an A P(A) = ½ w 2 = Gets a B P(B) = µ
More informationMobiHoc 2014 MINIMUM-SIZED INFLUENTIAL NODE SET SELECTION FOR SOCIAL NETWORKS UNDER THE INDEPENDENT CASCADE MODEL
MobiHoc 2014 MINIMUM-SIZED INFLUENTIAL NODE SET SELECTION FOR SOCIAL NETWORKS UNDER THE INDEPENDENT CASCADE MODEL Jing (Selena) He Department of Computer Science, Kennesaw State University Shouling Ji,
More informationA.I.: Beyond Classical Search
A.I.: Beyond Classical Search Random Sampling Trivial Algorithms Generate a state randomly Random Walk Randomly pick a neighbor of the current state Both algorithms asymptotically complete. Overview Previously
More informationTime Constrained Influence Maximization in Social Networks
Time Constrained Influence Maximization in Social Networks Bo Liu Facebook Menlo Park, CA, USA bol@fb.com Gao Cong, Dong Xu School of Computer Engineering Nanyang Technological University, Singapore {gaocong,
More information27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling
10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel
More informationReinforcement Learning
Reinforcement Learning Ron Parr CompSci 7 Department of Computer Science Duke University With thanks to Kris Hauser for some content RL Highlights Everybody likes to learn from experience Use ML techniques
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning First-Order Methods, L1-Regularization, Coordinate Descent Winter 2016 Some images from this lecture are taken from Google Image Search. Admin Room: We ll count final numbers
More informationP, NP, NP-Complete. Ruth Anderson
P, NP, NP-Complete Ruth Anderson A Few Problems: Euler Circuits Hamiltonian Circuits Intractability: P and NP NP-Complete What now? Today s Agenda 2 Try it! Which of these can you draw (trace all edges)
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering
More informationSlide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.
Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University http://www.mmds.org #1: C4.5 Decision Tree - Classification (61 votes) #2: K-Means - Clustering
More informationAdaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization
Journal of Artificial Intelligence Research 42 (2011) 427-486 Submitted 1/11; published 11/11 Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization Daniel Golovin
More informationCPSC 340: Machine Learning and Data Mining. Feature Selection Fall 2018
CPSC 340: Machine Learning and Data Mining Feature Selection Fall 2018 Admin Assignment 3 is due tonight. You can use a late day to submit up to 48 hours late. Solutions will be posted Monday. Midterm
More informationProfit Maximization for Viral Marketing in Online Social Networks
Maximization for Viral Marketing in Online Social Networks Jing TANG Interdisciplinary Graduate School Nanyang Technological University Email: tang0311@ntu.edu.sg Xueyan TANG School of Comp. Sci. & Eng.
More informationUsing first-order logic, formalize the following knowledge:
Probabilistic Artificial Intelligence Final Exam Feb 2, 2016 Time limit: 120 minutes Number of pages: 19 Total points: 100 You can use the back of the pages if you run out of space. Collaboration on the
More informationSearch and Lookahead. Bernhard Nebel, Julien Hué, and Stefan Wölfl. June 4/6, 2012
Search and Lookahead Bernhard Nebel, Julien Hué, and Stefan Wölfl Albert-Ludwigs-Universität Freiburg June 4/6, 2012 Search and Lookahead Enforcing consistency is one way of solving constraint networks:
More informationProject in Computational Game Theory: Communities in Social Networks
Project in Computational Game Theory: Communities in Social Networks Eldad Rubinstein November 11, 2012 1 Presentation of the Original Paper 1.1 Introduction In this section I present the article [1].
More informationAnalysis & Generative Model for Trust Networks
Analysis & Generative Model for Trust Networks Pranav Dandekar Management Science & Engineering Stanford University Stanford, CA 94305 ppd@stanford.edu ABSTRACT Trust Networks are a specific kind of social
More informationThanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides
Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank
More informationMonitoring Stealthy Diffusion
Monitoring Stealthy Diffusion Nika Haghtalab, Aron Laszka, Ariel D. Procaccia, Yevgeniy Vorobeychik and Xenofon Koutsoukos Carnegie Mellon University {nhaghtal,arielpro}@cs.cmu.edu Vanderbilt University
More information1 Submodular functions
CS 369P: Polyhedral techniques in combinatorial optimization Instructor: Jan Vondrák Lecture date: November 16, 2010 1 Submodular functions We have already encountered submodular functions. Let s recall
More informationN/4 + N/2 + N = 2N 2.
CS61B Summer 2006 Instructor: Erin Korber Lecture 24, 7 Aug. 1 Amortized Analysis For some of the data structures we ve discussed (namely hash tables and splay trees), it was claimed that the average time
More informationContaining Cascading Failures in Networks: Applications to Epidemics and Cybersecurity
Containing Cascading Failures in Networks: Applications to Epidemics and Cybersecurity Sudip Saha Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial
More informationData Mining Recitation Notes Week 3
Data Mining Recitation Notes Week 3 Jack Rae January 28, 2013 1 Information Retrieval Given a set of documents, pull the (k) most similar document(s) to a given query. 1.1 Setup Say we have D documents
More information