MobiHoc 2014
MINIMUM-SIZED INFLUENTIAL NODE SET SELECTION FOR SOCIAL NETWORKS UNDER THE INDEPENDENT CASCADE MODEL
Jing (Selena) He, Department of Computer Science, Kennesaw State University
Shouling Ji and Raheem Beyah, School of Electrical and Computer Engineering, Georgia Institute of Technology
Zhipeng Cai, Department of Computer Science, Georgia State University
INTRODUCTION
What is a social network?
The graph of relationships and interactions within a group of individuals.
SOCIAL NETWORK AND SPREAD OF INFLUENCE
Social networks play a fundamental role as a medium for the spread of INFLUENCE among their members: opinions, ideas, information, innovation.
Direct marketing uses word-of-mouth effects to significantly increase profits (Facebook, Twitter, MySpace, ...).
MOTIVATION
900 million users (Apr. 2012); the 3rd largest "country" in the world; more visitors than Google. Actions: update statuses, create events.
More than 4 billion images. Actions: add tags, add favorites.
2009: 2 billion tweets per quarter; 2010: 4 billion tweets per quarter. Actions: post tweets, retweet.
Social networks have already become a bridge connecting our real daily life and the virtual web space.
MOTIVATION (CONT.)
Modeling and tracking users' actions in social networks is an important problem that can benefit many real applications:
Advertising
Social recommendation
Expert finding
Marketing
APPLICATION
Who are the opinion leaders in a community?
[Figure: example social network of a marketer and users George, Ada, Bob, Alice, Frank, Carol, Eve, and David, with numeric labels]
Find a minimum-sized set of nodes (users) in a social network that can influence every node in the network.
OUTLINE
Network Model
Model of influence
Minimum-sized Influence Node Set selection problem
  Problem definition
  Greedy Algorithm
  Proof of performance bound
Experiments
  Data and setting
  Results
NETWORK MODEL
A social network is represented as an undirected graph.
Nodes start either active or inactive.
An active node may trigger activation of neighboring nodes, based on a pre-defined threshold $\tau$.
Monotonicity assumption: active nodes never deactivate.
MODEL OF INFLUENCE
If $u_1$ is active, then the active node set is $I = \{u_1\}$:
$P_1(I) = 1$
$P_2(I) = 0.5$
$P_3(I) = 0.7$
$P_4(I) = 0.6$
MODEL OF INFLUENCE
$P_{ii} = 1$ if $u_i \in I$; $P_{ii} = 0$ otherwise
$P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
If $u_1$ and $u_4$ are active, then the active node set is $I = \{u_1, u_4\}$:
$P_1(I) = 1 - (1 - P_{11})(1 - P_{14}) = 1$
$P_2(I) = 1 - (1 - P_{21})(1 - P_{24}) = 0.9$
$P_3(I) = 1 - (1 - P_{31})(1 - P_{34}) = 0.97$
$P_4(I) = 1 - (1 - P_{41})(1 - P_{44}) = 1$
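The activation probability can be checked numerically. The following is a minimal sketch, assuming the four-node example uses symmetric edge probabilities $P_{12} = 0.5$, $P_{13} = 0.7$, $P_{14} = 0.6$, $P_{23} = 0.4$, $P_{24} = 0.8$, $P_{34} = 0.9$. These values are inferred from the numbers shown on the example slides rather than stated explicitly, and the names `EDGE_PROB`, `pair_prob`, and `activation_prob` are illustrative.

```python
from math import prod

# Symmetric influence probabilities for the four-node example.
# Assumption: inferred from the values printed on the slides.
EDGE_PROB = {
    (1, 2): 0.5, (1, 3): 0.7, (1, 4): 0.6,
    (2, 3): 0.4, (2, 4): 0.8, (3, 4): 0.9,
}


def pair_prob(i, j, active):
    """Return P_ij, with P_ii = 1 if u_i is active and 0 otherwise."""
    if i == j:
        return 1.0 if i in active else 0.0
    # Missing edges are treated as probability 0.
    return EDGE_PROB.get((min(i, j), max(i, j)), 0.0)


def activation_prob(i, active):
    """P_i(I) = 1 - product over u_j in I of (1 - P_ij)."""
    return 1.0 - prod(1.0 - pair_prob(i, j, active) for j in active)


# Reproduce the slide values for I = {u1, u4}.
probs = {i: round(activation_prob(i, {1, 4}), 2) for i in (1, 2, 3, 4)}
print(probs)  # → {1: 1.0, 2: 0.9, 3: 0.97, 4: 1.0}
```

With an empty active set the product is empty, so every node has activation probability 0.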
MINIMUM-SIZED INFLUENCE NODE SET SELECTION PROBLEM (MINS)
Given: a social network $G = (V, E, P)$ and a threshold $\tau$
Goal: the initially selected active node set, denoted by $I$, can influence every node in the network:
$\forall u_i \in V,\ P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
Objective: minimize the size of $I$
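For very small instances, the problem statement can be checked by exhaustive search. The sketch below is only an illustration of the definition, not a method proposed in the slides, and it does not scale beyond a handful of nodes. It assumes the probabilities are given as a dictionary keyed by unordered node pairs; the function names, the tolerance `eps`, and the three-node path instance are hypothetical.

```python
from itertools import combinations
from math import prod


def activation_prob(i, active, p):
    """P_i(I) = 1 - prod over u_j in I of (1 - P_ij); active nodes get 1."""
    if i in active:
        return 1.0
    return 1.0 - prod(1.0 - p.get(frozenset((i, j)), 0.0) for j in active)


def is_feasible(active, nodes, p, tau, eps=1e-9):
    """True if every node reaches activation probability >= tau."""
    return all(activation_prob(i, active, p) >= tau - eps for i in nodes)


def brute_force_mins(nodes, p, tau):
    """Try subsets in order of increasing size; return the first feasible one."""
    for k in range(len(nodes) + 1):
        for subset in combinations(nodes, k):
            if is_feasible(set(subset), nodes, p, tau):
                return set(subset)
    return set(nodes)  # not reached when tau <= 1, since I = V is feasible


# Hypothetical toy instance: a path u1 - u2 - u3 with probability 0.9 per edge.
p = {frozenset((1, 2)): 0.9, frozenset((2, 3)): 0.9}
print(brute_force_mins([1, 2, 3], p, tau=0.8))   # → {2}
print(brute_force_mins([1, 2, 3], p, tau=0.95))  # → {1, 3}
```

Raising the threshold from 0.8 to 0.95 makes the single middle node insufficient, so the minimum set grows to two nodes.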
CONTRIBUTION FUNCTION
$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
Greedy algorithm
  Initialize $I = \emptyset$
  While $f(I) < |V| \tau$ do
    Choose $u$ to maximize $f(I \cup \{u\})$
    $I = I \cup \{u\}$
  End while
  Return $I$
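The greedy loop can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: probabilities are stored in a dictionary keyed by unordered node pairs, ties are broken by smallest node ID (as in the example slides), and a small tolerance `eps` guards the floating-point comparison. The table `EXAMPLE_P` holds edge probabilities inferred from the example slides; it and all function names are illustrative.

```python
from math import prod


def activation_prob(i, active, p):
    """P_i(I) = 1 - prod over u_j in I of (1 - P_ij); active nodes get 1."""
    if i in active:
        return 1.0
    return 1.0 - prod(1.0 - p.get(frozenset((i, j)), 0.0) for j in active)


def contribution(active, nodes, p, tau):
    """f(I) = sum over u_i in V of min(P_i(I), tau)."""
    return sum(min(activation_prob(i, active, p), tau) for i in nodes)


def greedy_mins(nodes, p, tau, eps=1e-9):
    """Add the node with the largest f(I + {u}) until f(I) = |V| * tau.

    Assumes 0 < tau <= 1, so that I = V always reaches the target.
    """
    active = set()
    target = len(nodes) * tau
    while contribution(active, nodes, p, tau) < target - eps:
        candidates = [u for u in sorted(nodes) if u not in active]
        # max() returns the first maximal item, so ties go to the smallest ID.
        # Rounding keeps floating-point noise from breaking genuine ties.
        best = max(
            candidates,
            key=lambda u: round(contribution(active | {u}, nodes, p, tau), 9),
        )
        active.add(best)
    return active


# Assumption: edge probabilities inferred from the four-node example slides.
EXAMPLE_P = {
    frozenset((1, 2)): 0.5, frozenset((1, 3)): 0.7, frozenset((1, 4)): 0.6,
    frozenset((2, 3)): 0.4, frozenset((2, 4)): 0.8, frozenset((3, 4)): 0.9,
}
print(sorted(greedy_mins([1, 2, 3, 4], EXAMPLE_P, tau=0.8)))  # → [1, 4]
```

Each iteration evaluates $f$ once per remaining candidate, so this direct version recomputes every activation probability from scratch; it favors readability over speed.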
EXAMPLE
$\tau = 0.8$, $f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
First round: $I = \emptyset$
Second round (one candidate node per line):
$I = \{u_1\}$: $f(I) = 0.8 + 0.5 + 0.7 + 0.6 = 2.6$
$I = \{u_2\}$: $f(I) = 0.5 + 0.8 + 0.4 + 0.8 = 2.5$
$I = \{u_3\}$: $f(I) = 0.7 + 0.4 + 0.8 + 0.8 = 2.7$
$I = \{u_4\}$: $f(I) = 0.6 + 0.8 + 0.8 + 0.8 = 3.0$
$u_4$ gives the largest value of $f$ and is selected.
EXAMPLE
$\tau = 0.8$, $f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
Third round:
$I = \{u_4, u_1\}$: $f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2$
$I = \{u_4, u_2\}$: $f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2$
$I = \{u_4, u_3\}$: $f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2$
Use node ID to break the tie: $I = \{u_4, u_1\}$
The greedy algorithm stops, since $f(I) = |V| \tau = 4 \times 0.8 = 3.2$.
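The three-way tie can be verified by hand. The worked check below assumes symmetric edge probabilities $P_{12} = 0.5$, $P_{13} = 0.7$, $P_{14} = 0.6$, $P_{23} = 0.4$, $P_{24} = 0.8$, $P_{34} = 0.9$, which are inferred from the numbers on the example slides rather than stated explicitly. For each candidate set, only the two non-seed nodes need to be computed, since the seeds themselves have probability 1.

```latex
\begin{align*}
I = \{u_4, u_1\}:\quad & P_2(I) = 1 - (1 - 0.8)(1 - 0.5) = 0.90, \quad
                         P_3(I) = 1 - (1 - 0.9)(1 - 0.7) = 0.97 \\
I = \{u_4, u_2\}:\quad & P_1(I) = 1 - (1 - 0.6)(1 - 0.5) = 0.80, \quad
                         P_3(I) = 1 - (1 - 0.9)(1 - 0.4) = 0.94 \\
I = \{u_4, u_3\}:\quad & P_1(I) = 1 - (1 - 0.6)(1 - 0.7) = 0.88, \quad
                         P_2(I) = 1 - (1 - 0.8)(1 - 0.4) = 0.88
\end{align*}
```

In every case all four probabilities are at least $\tau = 0.8$, so each term of $f$ is capped at $\tau$ and $f(I) = 4 \times 0.8 = 3.2$.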
THEORETICAL ANALYSIS
Theorem 1. The MINS selection problem is NP-hard.
SIMULATION SETTINGS
Random graphs are generated based on the random graph model $G(n, p) = \{G \mid G$ has $n$ nodes, and an edge between any pair of nodes is generated with probability $p\}$.
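A graph from $G(n, p)$ can be sampled with the standard library alone. This is a minimal sketch: the function name `gnp_random_graph` and the `seed` parameter are illustrative, and the slides do not say how influence probabilities were assigned to the generated edges, so the sketch produces the edge set only.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Sample an undirected graph on nodes 1..n.

    Each of the n * (n - 1) / 2 possible edges is included independently
    with probability p. Returns a list of (u, v) pairs with u < v.
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    return [
        (u, v)
        for u, v in combinations(range(1, n + 1), 2)
        if rng.random() < p
    ]


edges = gnp_random_graph(n=50, p=0.1, seed=42)
print(len(edges), "edges out of", 50 * 49 // 2, "possible")
```

This loop visits every node pair, so it takes time quadratic in $n$; that is fine for simulation-sized graphs but would need a skipping technique for very sparse, very large ones.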
EXPERIMENT DATA
Real-world data set: an academic coauthor network extracted from the academic search system Arnetminer [19].
Co-authorship networks arguably capture many of the key features of social networks more generally.
Resulting graph: 640,134 nodes (authors) and 1,554,643 distinct edges (coauthor relations).
RESULTS: SIMULATION
RESULTS: SIMULATION (CONT.)
RESULTS: REAL DATA
CONCLUSIONS
We introduce a new optimization problem, the Minimum-sized Influential Node Set (MINS) selection problem.
We prove that it is NP-hard under the independent cascade model.
We define a polymatroid contribution function, which suggests a greedy approximation algorithm.
A comprehensive theoretical analysis of its performance ratio is given.
We conduct extensive experiments and simulations to validate the proposed greedy algorithm on both real-world coauthor data sets and random graphs.
FUTURE WORK
Study more realistic network models (e.g., directed graphs)
Study more general influence models
Deal with negative influences
Study how the network evolves over time
Q & A