Approximation and Inapproximation for The Influence Maximization Problem in Social Networks under Deterministic Linear Threshold Model

2011 31st International Conference on Distributed Computing Systems Workshops
1545-0678/11 $26.00 © 2011 IEEE. DOI 10.1109/ICDCSW.2011.33

Zaixin Lu, Wei Zhang, Weili Wu, Bin Fu and Dingzhu Du

Computer Science Department, University of Texas at Dallas, Richardson, USA. Email: zaixinlu,weiliwu,dzdu@utdallas.edu
Math Department, Xi'an Jiaotong University, Xi'an, PRC. Email: dc.zhangwei@gmail.com
Computer Science Department, University of Texas-Pan American, Edinburg, USA. Email: binfu@cs.panam.edu

Abstract

Influence Maximization is the problem of finding a certain number of people in a social network such that their aggregate influence through the network is maximized. In the past this problem has been widely studied under a number of different models. In 2003, Kempe et al. gave a $(1 - 1/e)$-approximation algorithm for the linear threshold model and the independent cascade model, which are the two main models in social network analysis. In addition, Chen et al. proved that the problem of exactly computing the influence given a seed set in the two models is #P-hard. Both the linear threshold model and the independent cascade model are based on randomized propagation. However, the propagation parameters might be obtained by surveys or data mining techniques, which makes a great difference to the properties of the problem. In this paper, we study the Influence Maximization problem in the deterministic linear threshold model. In contrast, we show that in the deterministic linear threshold model there is no polynomial-time $n^{1-\epsilon}$-approximation unless P=NP, even in the simple case that one person needs at most two active neighbors to become active. This inapproximability result is derived with self-contained proofs without using the PCP theorem. In the case that a person can be activated when one of its neighbors becomes active, there is a polynomial-time $e/(e-1)$-approximation, and we prove it is the best possible approximation under a reasonable assumption in complexity theory, $NP \not\subseteq DTIME(n^{\log\log n})$. We also show that the exact computation of the final influence given a seed set can be solved in linear time in the deterministic linear threshold model. The Least Seed Set problem, which aims to find a seed set with the least number of people that activates all the required people in a given social network, is also discussed. Using an analysis framework based on Set Cover, we show an $O(\log n)$-approximation in the case that a person becomes active when one of its neighbors is activated.

Keywords-influence maximization; social network; approximation; deterministic model;

I. INTRODUCTION

A social network is a graph of relationships (edges) and individuals (nodes). One of the issues usually considered by marketing managers in this field is how to maximize the spread of information through a social network. For instance, in order to promote a new product, one can give a few influential people free samples of the product. Those people will probably recommend the product to their friends, and many individuals will ultimately try it through this word-of-mouth effect. The Influence Maximization (IM) problem is how to select the few initial people as a seed set such that the spread of influence is maximized. In order to study the complexity of this problem, we first have to address another problem, namely how to compute the total influence given a seed set. Please see [1], [2], [9] for recent works.

The node selection problem we just mentioned was first proposed by Domingos and Richardson in [4] and [8] respectively. They considered the relations of individuals and proposed a probabilistic propagation model for this problem. Kempe et al. in [6], [7] further formulated it as an optimization problem and studied it on two different models: the independent cascade model proposed by Goldenberg et al. in [5], [12] and the linear threshold model proposed by Granovetter and Schelling in [10] and [11] separately. Kempe et al. proved that the natural greedy algorithm achieves a $(1 - 1/e)$-approximation simply by showing that the influence spread under both models is submodular. Thereafter, Chen et al. in [1], [2] showed that the problem of exactly computing the influence given a seed set is #P-hard in both the independent cascade model and the linear threshold model, which indicates that the greedy algorithm is not a polynomial-time approximation for the two models.

In the independent cascade model, the propagation procedure is probabilistic; individuals can successfully activate their neighbors with certain probabilities. In the linear threshold model, the propagation procedure works in a threshold manner; the influence from an individual $v_i$ to another individual $v_j$ is represented by a weight $w_{i,j}$, and an individual can be activated when the sum of influences it receives exceeds a randomly determined threshold. It is worth mentioning that the thresholds in the linear threshold model are randomly updated during the spread process. Therefore, both the linear threshold model and the independent cascade model are based on randomized propagation. However, the thresholds might be estimated by surveys and data mining techniques. If an individual can be activated when the sum of influences exceeds a predetermined threshold, we say the propagation procedure is based on the deterministic linear threshold model.

In this paper, we focus on studying the approximation and inapproximation for the Influence Maximization problem in the deterministic linear threshold model. The main contributions of this paper include:

1) We show that in the deterministic linear threshold model, there is no polynomial-time $n^{1-\epsilon}$-approximation for the IM problem unless P=NP, even in the simple case that a person needs at most two active neighbors to become active.
2) We also show that the problem of exactly computing the influence given a seed set in this model can be solved in linear time.
3) In the case that a person can be activated after one of its neighbors becomes active, there is a polynomial-time $e/(e-1)$-approximation.
4) The Least Seed Set (LSS) problem, which is a variation of the IM problem, is discussed. It aims at finding a seed set with the least number of people such that all the people of interest in the social network are finally activated. We give an $O(\log n)$-approximation for the case that a node can be activated by any one of its neighbors.

The rest of this paper is organized as follows. In Section II, we present a linear-time exact algorithm to compute the influence spread for a seed set. In Section III, we study the IM problem in the case that a person can be activated by any one of its neighbors. An inapproximation result for the case that a person needs two active neighbors to become active is provided in Section IV. In Section V, we show the approximation and inapproximation for a special case of the LSS problem. In Section VI, we conclude our paper and discuss future work.

II. COMPUTING THE INFLUENCE SPREAD IN THE DETERMINISTIC LINEAR THRESHOLD MODEL

For the linear threshold model and the independent cascade model, the problem of computing the influence spread given a seed set was left as an open problem in [6]. Chen et al. closed this open problem by showing its #P-hardness in [1], [2].

Definition 1. A social network is a directed graph $G(V, E)$. Each node $v_i$ in $V$, with a threshold $t_i$, represents a person in the social network, and each directed edge $(v_i, v_j)$ has a weight $w_{i,j}$ that denotes how much the node $v_j$ is influenced by the node $v_i$. In the deterministic linear threshold model, the propagation process has the following provisions:

1) Let $x_j = 1$ denote that $v_j$ is active, and $x_j = 0$ denote that $v_j$ is not.
2) At any time, $v_i$ becomes active if and only if $\sum_{v_j \in \mathrm{neighbour}(v_i)} x_j w_{j,i} \ge t_i$.
3) The diffusion is a step-by-step process: in step $t$, all nodes that were active in step $t - 1$ remain active, and any node $v$ that satisfies the activation condition is activated.

Definition 2. Given a social network $G(V, E)$ and a seed set $A$ of initially active nodes, the Influence Computation problem is to find all the nodes that will be activated directly or indirectly by the nodes in $A$.

Next, we show that this problem can be solved in linear time in the deterministic linear threshold model.

A. A Linear Time Algorithm

Theorem 1. Given a social network $G(V, E)$ and a seed set $A$, the problem of exactly computing the influence spread can be solved in linear time in the deterministic linear threshold model.

Algorithm 1 Influence Computing
Input: A directed graph $G(V, E)$, a threshold $t_i$ for each node $v_i \in V$, a weight $w_{i,j}$ for each edge $(v_i, v_j) \in E$ and a seed set $A \subseteq V$.
Output: The set of all nodes that will be activated in the network.
For each node $v_i$ in $V$, let $H_i \leftarrow 0$ ($H_i$ holds the sum of influences received by $v_i$).
Mark all the nodes in $A$ as newly activated nodes.
repeat
    for each newly activated node $v_j$ do
        for each non-active neighbor $v_k$ of $v_j$ do
            $H_k \leftarrow H_k + w_{j,k}$
            if $H_k \ge t_k$ then
                mark $v_k$ as newly activated
            end if
        end for
        mark $v_j$ as an active node (note that an active node is no longer a newly activated node)
    end for
until there are no newly activated nodes

Proof: The time complexity follows easily from Alg. 1. It terminates when there are no newly activated nodes. Each edge in $E$ is used only once to update the influence sum of its head node. Assume the input directed graph $G$ does not have isolated nodes, so every node is an endpoint of some edge and $|E| = \Omega(|V|)$. Hence Alg. 1 has time complexity $O(|E|)$.

B. A Lower Bound for Computing the Influence Spread

Theorem 2. For any $\alpha \in [0, 1]$, there is a class of graphs $G(V, E)$ with $|E| = \Theta(n^{1+\alpha})$ such that every algorithm that exactly computes the final activated nodes given an initial seed set needs at least $\Omega(|E|)$ running time, where $n = |V|/2$.

[Figure 1. $(1 + \alpha)$-bipartite graph with node sets $V_1$ and $V_2$, each of size $n$.]

Proof: We can design a graph $G(V_1, V_2, E)$ as shown in Fig. 1, where $|V_1| = |V_2| = n$. Let the in-degree of each node in $V_2$ be $\Theta(n^{\alpha})$, where $\alpha \in [0, 1]$; that is, we allow a constant-factor difference among the degrees of nodes in $V_2$. It is easy to see that graphs of this kind exist, and we call them $(1 + \alpha)$-graphs. Assume that there exists an $o(|E|)$-time algorithm $h(\cdot)$ that finds the set of activated nodes from an initial set of active nodes. Let $G$ be a $(1 + \alpha)$-graph such that each edge $(v_i, v_j)$ has weight $w_{i,j} = 1/\mathrm{degree}(v_j)$, and let the threshold be $t_j = 1$ for all nodes $v_j$ in $G$. Assume the seed set is the set of all the nodes in $V_1$. Since $h(\cdot)$ runs in $o(|E|)$ time, there exists an edge $(v_i, v_j)$ that the algorithm does not access. Let $G'$ be the same graph as $G$ except that $w_{i,j} = 0$. Since $h(G, V_1)$ does not access edge $(v_i, v_j)$, we have $h(G', V_1) = h(G, V_1) = V_1 \cup V_2$. However, in $G'$ the node $v_j$ receives a total influence smaller than 1 and is never activated, which is a contradiction.

III. APPROXIMATION AND INAPPROXIMATION FOR THE ONE-ACTIVATE-ONE MODEL

In this section, we consider the IM problem under the condition that a person can be activated by any one of its neighbors. It is a special case of the general deterministic linear threshold model, in which a person needs multiple active neighbors with weights to be activated. The problem is solved by transforming it into the maximum coverage problem. We derive a constant-factor approximation that matches the inapproximation bound.

The Influence Maximization Problem under the One-Activate-One Model: Let $G(V, E)$ be a directed graph. Each directed edge $(u, v)$ from node $u$ to node $v$ represents that person $u$ can activate person $v$. The problem is to initially activate $k$ people so that the largest number of people will be activated eventually. For an approximation solution, let Approximate($G$, $k$) represent the number of people who are eventually activated, directly or indirectly, by the $k$ selected people, and let Optimal($G$, $k$) represent the optimal solution.
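To make the quantity Approximate($G$, $k$) concrete, the spread of a given seed set can be evaluated with the linear-time procedure of Section II. The following Python sketch is illustrative only: the function name `influence_spread` and the adjacency-dictionary representation are assumptions made for this example, not part of the paper. In the One-Activate-One model every weight and threshold can be taken as 1, so the result is simply the set of nodes reachable from the seeds.

```python
from collections import deque


def influence_spread(out_edges, thresholds, seeds):
    """Return the set of nodes activated by `seeds` in the deterministic
    linear threshold model.

    out_edges  : dict mapping node -> list of (neighbor, weight) pairs
                 (hypothetical representation chosen for this sketch)
    thresholds : dict mapping every node -> its threshold t_i
    seeds      : iterable of initially active nodes
    """
    received = {v: 0.0 for v in thresholds}  # H_i in Algorithm 1
    active = set(seeds)
    newly_activated = deque(active)

    while newly_activated:
        u = newly_activated.popleft()
        # Each outgoing edge of u is examined exactly once, because u
        # enters the queue only once; total work is O(|V| + |E|).
        for v, weight in out_edges.get(u, ()):
            if v in active:
                continue
            received[v] += weight
            if received[v] >= thresholds[v]:
                active.add(v)
                newly_activated.append(v)
    return active


# Small usage example: c needs both b and d before it becomes active.
example_edges = {"a": [("b", 1.0)], "b": [("c", 0.5)], "d": [("c", 0.5)]}
example_thresholds = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
spread_one_seed = influence_spread(example_edges, example_thresholds, ["a"])
spread_two_seeds = influence_spread(example_edges, example_thresholds, ["a", "d"])
```

A queue of newly activated nodes plays the role of the "newly activated" marks in Algorithm 1, and removing a node from the queue corresponds to marking it as active.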
The problem has a polynomial-time approximation ratio $c$ if and only if Approximate($G$, $k$) can be computed in polynomial time and Optimal($G$, $k$) $\le c \cdot$ Approximate($G$, $k$) for every input instance ($G$ and $k$).

The maximum coverage problem is a classical problem in computational complexity theory. The input is a list of $m$ sets and an integer $k$. The target is to select $k$ sets from the list so as to cover the largest number of elements in the ground set. It is well known that the maximum coverage problem, a relative of classical problems such as the Set Cover problem, the Vertex Cover problem, and the Independent Set problem, has an $e/(e-1)$-approximation, and this approximation ratio is optimal.

Theorem 3. There is a polynomial-time $e/(e-1)$-approximation algorithm for the Influence Maximization problem in the One-Activate-One model.

Proof: Assume that we have an input instance $(G, k)$ for the One-Activate-One IM problem. As shown in Alg. 2, for each vertex $v_i$ in $G$, use the depth-first search algorithm to find all the vertices that are reachable from $v_i$. This can be done in linear time for each vertex. The problem then becomes a maximum coverage problem. Therefore, it has a polynomial-time approximation with ratio $e/(e-1)$.

Algorithm 2 Influence Maximization
Input: A directed graph $G(V, E)$ and an integer $k$.
Output: A set of $k$ nodes in $V$.
Let $H$ denote the set of active nodes and $P$ denote the initial set.
for each node $v_i \in V$ that has no incoming edges do
    let $S_{v_i}$ represent the set of nodes reachable from $v_i$
end for
for $j \leftarrow 1$ to $k$ do
    select the set $S_{v_i}$ that maximizes $|H \cup S_{v_i}|$
    $H \leftarrow H \cup S_{v_i}$ and $P \leftarrow P \cup \{v_i\}$
end for

Theorem 4. If the Influence Maximization problem in the One-Activate-One model has an approximation algorithm with approximation ratio $d$, then the maximum coverage problem has an approximation with ratio $d + o(1)$.

Proof: Assume $S_1, \ldots, S_m$ is a list of sets for the maximum coverage problem. We are going to construct a directed graph $G(V, E)$. Assume $S_1 \cup S_2 \cup \cdots \cup S_m = \{a_1, \ldots, a_n\}$. For each set $S_i$, create a vertex $v_i$ in $V$. For each $a_j$, create $hk^2$ vertices $u_{j,1}, \ldots, u_{j,t}$ in $V$, where $t = hk^2$ and $h$ is a large constant. If $a_j \in S_i$, add directed edges from $v_i$ to all of $u_{j,1}, \ldots, u_{j,t}$ in $E$.
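The construction just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names `coverage_to_im_graph` and `activated_count`, the tuple-based node labels, and the default value of `h` are all hypothetical choices made for this example. It builds the two-layer graph and counts how many nodes a seed set activates.

```python
def coverage_to_im_graph(sets, k, h=10):
    """Build the graph used in the reduction from maximum coverage.

    sets : list of Python sets S_1..S_m over comparable elements
    k    : number of sets (seeds) that may be selected
    h    : the "large constant" of the proof (10 is an arbitrary default)

    Returns (edges, t), where edges maps a set-vertex ("v", i) to the list
    of element copies ("u", a, c) it can activate, and t = h * k**2.
    """
    t = h * k * k
    edges = {}
    for i, s in enumerate(sets):
        # One vertex per set, pointing at all t copies of each element.
        edges[("v", i)] = [("u", a, c) for a in sorted(s) for c in range(t)]
    return edges, t


def activated_count(edges, seeds):
    """Count activated nodes; the graph has two layers, so one hop suffices."""
    active = set(seeds)
    for s in seeds:
        active.update(edges.get(s, ()))
    return len(active)


# Usage: three sets, pick k = 2, and use h = 1 to keep the graph tiny.
demo_edges, demo_t = coverage_to_im_graph([{1, 2}, {2, 3}, {4}], k=2, h=1)
demo_activated = activated_count(demo_edges, [("v", 0), ("v", 1)])
```

Selecting the set-vertices for $S_1$ and $S_2$ covers three elements, so the seeds activate $3t + k$ nodes, which matches the count $x_o t + k$ used in the proof.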
We claim that an optimal solution for the maximum coverage problem can cover $x_o$ elements by selecting $k$ sets if and only if the One-Activate-One IM problem can activate $x_o t + k$ people by initially activating $k$ people. Let a $d$-approximation solution for the IM problem activate $yt + k$ people. We have $x_o t + k \le d(yt + k)$. We can assume $y \ge 1$; otherwise, all sets $S_1, \ldots, S_m$ are empty. We also assume $k \ge 1$; otherwise, the problem is trivial. Thus,

$$
\begin{aligned}
x_o &\le \frac{dyt + (d-1)k}{t} && (1) \\
&= dy + \frac{(d-1)k}{t} && (2) \\
&= \left(d + \frac{(d-1)k}{ty}\right) y && (3) \\
&\le \left(d + \frac{d-1}{h}\right) y && (4) \\
&= (d + o(1))\, y \quad (h \text{ is large}). && (5)
\end{aligned}
$$

Step (4) uses $t = hk^2$ together with $k \ge 1$ and $y \ge 1$. Hence, if the One-Activate-One IM problem has a $d$-approximation, we also have a $(d + o(1))$-approximation for the maximum coverage problem.

Corollary 1. There is no polynomial-time $(e/(e-1) - o(1))$-approximation for the One-Activate-One Influence Maximization problem unless $NP \subseteq DTIME(n^{\log\log n})$.

Proof: It follows from Thm. 4 and Feige's paper [3].

IV. INAPPROXIMATION FOR THE TWO-ACTIVATE-ONE MODEL

In this section, we study a more general IM problem in which one person can be activated by one or two people. We derive a strong inapproximation result for this problem even in bounded degree graphs. To show the inapproximation, we reduce the Set Cover problem to the IM problem in polynomial time. The input of a Set Cover problem is an integer $k$ and sets $S_1, \ldots, S_m$ and $S$, where $S_1, \ldots, S_m$ are $m$ subsets of the set $S$. The target is to find $k$ subsets $S_{i_1}, \ldots, S_{i_k}$ such that $S_{i_1} \cup \cdots \cup S_{i_k} = S$. It is well known that the Set Cover problem is NP-hard.

A. Bounded Degree Graphs

Definition 3. A graph $G$ is a directed bounded $(d_1, d_2)$-graph if every node in $G$ has at most $d_1$ incoming edges and at most $d_2$ outgoing edges. The IM problem over such a directed graph is called the $(d_1, d_2)$-Influence Maximization ($(d_1, d_2)$-IM) problem.

Lemma 1. Assume that $w_1, \ldots, w_n$ are nodes with at most one incoming edge, and $x$ is an isolated node. We can create $O(n)$ new nodes to form a $(2, 2)$ graph such that $x$ is activated if and only if one of the nodes in $\{w_1, \ldots, w_n\}$ is activated. Furthermore, $x$ has at most one incoming edge and no outgoing edge.

Proof: We can achieve the goal in $n - 1$ phases. In the first phase, we create a new node $b_1$ and add an edge from $w_1$ to $b_1$ and another edge from $w_2$ to $b_1$. If $w_1$ or $w_2$ is activated, then $b_1$ will be activated. In phase $i + 1$, for any $1 \le i \le n - 2$, we create a new node $b_{i+1}$, add an edge from $b_i$ to $b_{i+1}$ and another edge from $w_{i+2}$ to $b_{i+1}$ such that $b_{i+1}$ will be activated when $b_i$ or $w_{i+2}$ is activated. After phase $n - 1$, we add an edge from $b_{n-1}$ to $x$ so that $b_{n-1}$ can activate $x$. The total number of new nodes is $n - 1$. Each node $b_i$ has two incoming edges and one outgoing edge. Node $x$ has one incoming edge and no outgoing edge. The final $(2, 2)$ graph is shown in Fig. 2; each node in the graph has a determined threshold.

[Figure 2. One-to-One Graph: nodes $w_1, \ldots, w_n$ feed the chain $b_1, \ldots, b_{n-1}$, which ends at $x$.]

Lemma 2. Assume that $w_1, \ldots, w_n$ are nodes with at most one incoming edge, and $x$ is an isolated node. We can create $O(n)$ new nodes to form a $(2, 2)$ graph such that $x$ is activated if and only if $w_1, \ldots, w_n$ are all active. Furthermore, $x$ has at most one incoming edge and no outgoing edge.

Proof: The overall argument is similar to the proof of Lemma 1. We still use $n - 1$ phases to construct the $(2, 2)$ graph. In the first phase, create $b_1$ and add an edge from $w_1$ to $b_1$ with weight $1/2$ and another edge from $w_2$ to $b_1$ with weight $1/2$. If both $w_1$ and $w_2$ are activated, then $b_1$ will be activated. In phase $i + 1$, we create node $b_{i+1}$ and add an edge from $b_i$ to $b_{i+1}$ and an edge from $w_{i+2}$ to $b_{i+1}$, each with weight $1/2$. If both $b_i$ and $w_{i+2}$ are active, then $b_{i+1}$ will be activated. After phase $n - 1$, we add an edge from $b_{n-1}$ to $x$ so that $b_{n-1}$ itself can activate $x$. The total number of new nodes is $n - 1$. Each node $b_i$ has two incoming edges and one outgoing edge. Node $x$ has one incoming edge and no outgoing edge. The final $(2, 2)$ graph is shown in Fig. 3; each node in the graph has a determined threshold.

[Figure 3. All-to-One Graph: the same chain as in Fig. 2, with weight $1/2$ on every edge entering a node $b_i$.]

B. An Inapproximation Result in the Bounded Degree Graphs

Theorem 5. For any constant $\epsilon \in (0, 1)$, there is no polynomial-time $p^{1-\epsilon}$-approximation for the $(2, 2)$-IM problem unless P=NP, where $p$ is the number of nodes in the directed graph of the social network.

Proof: We give a polynomial-time reduction from the Set Cover problem to a directed $(2, 2)$ graph. Let $S_1, \ldots, S_m$ be the input for the Set Cover problem and $S_1 \cup S_2 \cup \cdots \cup S_m = \{a_1, \ldots, a_n\}$. Without loss of generality, assume $\epsilon < 1/100$. Let $p$ be the number of nodes in the graph. We first define the following parameters:

$$
\begin{aligned}
p &= (n + m)^{20/\epsilon}, && (6) \\
g(p) &= p^{\epsilon}, && (7) \\
m_5 &= p^{1-8\epsilon}, && (8) \\
p^{8\epsilon} \ge m_8 &\ge \tfrac{3}{4}\, p^{8\epsilon}, && (9) \\
n &\le p^{\epsilon/10}, && (10) \\
k \le m &\le p^{\epsilon/10}. && (11)
\end{aligned}
$$

The inequalities (10) and (11) follow from equality (6). We construct the $(2, 2)$ graph as follows:

Phase 1: For each set $S_i$, create a vertex $u_i$.
Phase 2: For each set $S_i$ with $a_j \in S_i$, create a vertex $x_{i,j}$ and an edge from $u_i$ to $x_{i,j}$ such that node $u_i$ activates node $x_{i,j}$. If there are more than two elements in $S_i$, create a binary tree with root $u_i$ such that $u_i$ activates all of them.
Phase 3: Let $H_j$ be the set of all nodes $x_{i,j}$. For each group $H_j$, create a vertex $y_j$ and add some additional vertices such that $y_j$ will be activated if one of the nodes in $H_j$ is active. By Lemma 1, this part can be done in polynomial time.
Phase 4: For each element $a_j$, create a vertex $v_j$. For each $y_j$, create an edge from $y_j$ to $v_j$ such that node $y_j$ activates $v_j$.
Phase 5: For each node $v_j$, create a binary tree $T_j$ with root $v_j$ such that the tree has $m_5$ leaves and $v_j$ can activate all of those leaves. We label the leaves as $l_{1,j}, \ldots, l_{m_5,j}$.
Phase 6: Create $m_5$ groups $G_1, \ldots, G_{m_5}$ of nodes. Each group $G_i$ contains nodes $w_{i,1}, \ldots, w_{i,n}$. Create an edge from $l_{i,j}$ to $w_{i,j}$ such that node $l_{i,j}$ activates node $w_{i,j}$.
Phase 7: For each group $G_i$, create a node $x_i$ such that $x_i$ will be activated if and only if all elements $w_{i,1}, \ldots, w_{i,n}$ in $G_i$ are activated. By Lemma 2, this part can be done in linear time for each group $G_i$.
Phase 8: For each $x_i$, create a path, called $Y_i$, starting from $x_i$ and of length $m_8$ such that every node in the path $Y_i$ can activate the next one.

Let $A_{1,7}$ be the set of nodes created from Phase 1 to Phase 7. Let $A_8$ be the set of nodes created in Phase 8. By (6) to (11), the number of nodes in $A_{1,7}$ is bounded by

$$|A_{1,7}| \le O((n + m)^2 m_5) = o(p^{1-\epsilon}). \qquad (12)$$

We also have the following equation:

$$p = |A_{1,7}| + m_5 m_8. \qquad (13)$$

If we can select $k$ nodes among $u_1, \ldots, u_m$ such that the corresponding $k$ subsets cover the entire set $S$, then $v_1, \ldots, v_n$ can be activated. Thus, all the $p$ nodes in the constructed $(2, 2)$ graph will become active. Assume that there is a $g(p)$-approximation algorithm which activates the nodes in a set $B$. We claim that if $|B| < p/g(p)$, then there is no solution for the Set Cover with $k$ subsets. Assume

$$|B| \ge \frac{p}{g(p)} = p^{1-\epsilon}. \qquad (14)$$

Thus, the number of activated nodes in the set $A_8$ is at least

$$
\begin{aligned}
|B| - |A_{1,7}| &\ge |B| - o(p^{1-\epsilon}) && \text{(by (12))} && (15) \\
&\ge \frac{|B|}{2} && \text{(by (14))} && (16) \\
&\ge 50\, k\, m_8. && \text{(by (6) to (11))} && (17)
\end{aligned}
$$

We are going to transform the first $k$ selected vertices so that only nodes in $u_1, \ldots, u_m$ are selected. The transformation follows these rules:

1) If a selected node is in group $G_i$, then remove this node and the nodes activated by the nodes from $G_i$. This reduces the number of activated nodes by at most $O(n) + m_8 \le 2m_8$. The total number of activated nodes lost by this rule is at most $k(O(n) + m_8) \le 2km_8$, since we select at most $k$ nodes to start the activation.
2) For each selected node that is activated by a node $v_j$, replace it by $v_j$. This does not decrease the number of activated nodes.
3) For each selected node $v_j$, replace it by a $u_i$ with $a_j \in S_i$. This does not decrease the number of activated nodes.

Finally, only nodes in $\{u_1, \ldots, u_m\}$ are selected to start the process of activation. The total number of nodes that lose activation is at most $2km_8$. By inequality (17), we have

$$|B| - |A_{1,7}| \ge 50\, k\, m_8 > 2\, k\, m_8. \qquad (18)$$

This implies that some nodes in Phase 8 are activated. Therefore, we have a solution for the Set Cover problem with $k$ subsets.

V. APPROXIMATION FOR THE LEAST SEED SET PROBLEM

In this section, we consider the Least Seed Set (LSS) problem in the case that a person can be activated by any one of its neighbors. This problem was proposed by Ning Chen in [13].

Least Seed Set Problem in the One-Activate-One Model: Let $G$ be a directed graph and $T$ be a given set of nodes that need to be activated. The LSS problem is to select the smallest set of nodes as the seed set so that all the nodes in $T$ will be activated.

Theorem 6. There is a polynomial-time $O(\log n)$-approximation algorithm for the LSS problem in the One-Activate-One model.

Proof: The proof of Thm. 3 shows that the LSS problem can be converted to a Set Cover problem. Therefore, it has a polynomial-time approximation with an $O(\log n)$ factor.

Theorem 7. If the One-Activate-One LSS problem has an approximation algorithm with ratio $d(n)$, then the Set Cover problem has an approximation algorithm with ratio $d(n)$.

Proof: Assume $S_1, \ldots, S_m$ is the list of sets for the Set Cover problem and $S_1 \cup S_2 \cup \cdots \cup S_m = \{a_1, \ldots, a_n\}$. We can construct a social network as follows. For each set $S_i$, create a vertex $u_i$. For each $a_j$, create a vertex $v_j$, and add a directed edge from $u_i$ to $v_j$ if $a_j \in S_i$. An edge from $u_i$ to $v_j$ means $v_j$ can be activated by $u_i$. Let $T$ be the set of vertices $\{v_1, \ldots, v_n\}$. The Set Cover problem is thus converted into a One-Activate-One LSS problem. It is easy to see that the One-Activate-One LSS problem has a $d(n)$-approximation if and only if the Set Cover problem has a $d(n)$-approximation.

Corollary 2. There is no polynomial-time $o(\log n)$-approximation for the LSS problem unless P = NP.

Proof: It follows from Thm. 7 and the well-known inapproximability of the Set Cover problem.

VI. CONCLUSION AND FURTHER RESEARCH

In this paper, we show that the deterministic linear threshold model has no polynomial-time $n^{1-\epsilon}$-approximation unless P=NP, even in the simple case that one person needs at most two active neighbors to become active. In the case that a person can be activated after one of its neighbors becomes active, there is a polynomial-time $e/(e-1)$-approximation, and we prove it is the best possible approximation under a reasonable assumption in complexity theory. We also show that there is an $O(\log n)$-approximation for the LSS problem in the One-Activate-One model. The general IM problem in the deterministic linear threshold model looks very hard, but the Influence Computation problem under this model can be solved in linear time. Therefore, we can fall back to some simple cases to further study this problem.
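As a concrete companion to the $O(\log n)$ result for the One-Activate-One LSS problem recalled above, the sketch below applies the classical greedy Set Cover heuristic to reachability sets. It is a minimal illustration under stated assumptions: the function names `reachable` and `greedy_least_seed_set` and the adjacency-dictionary representation are hypothetical, and the paper does not spell out the algorithm in this form.

```python
def reachable(out_edges, source):
    """Return all nodes reachable from `source` (including itself) by DFS."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in out_edges.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def greedy_least_seed_set(out_edges, nodes, targets):
    """Greedy Set Cover over reachability sets.

    out_edges : dict node -> list of nodes it can activate
    nodes     : list of all nodes (list order breaks ties deterministically)
    targets   : the set T of nodes that must end up active (subset of nodes)
    """
    target_set = set(targets)
    # The "set" contributed by node v is the part of T that v can reach.
    covers = {v: reachable(out_edges, v) & target_set for v in nodes}
    uncovered = set(target_set)
    seeds = []
    while uncovered:
        best = max(nodes, key=lambda v: len(covers[v] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            # Only possible if some target is not listed in `nodes`.
            raise ValueError("some targets cannot be activated by any node")
        seeds.append(best)
        uncovered -= gain
    return seeds


# Usage: a reaches b and c, d reaches e; two seeds suffice for T = {b, c, e}.
lss_edges = {"a": ["b", "c"], "b": ["c"], "d": ["e"]}
lss_seeds = greedy_least_seed_set(lss_edges, ["a", "b", "c", "d", "e"], {"b", "c", "e"})
```

Computing every reachability set separately costs $O(|V|(|V| + |E|))$ time, which is polynomial but not tuned for large graphs.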
The following open problems will be considered in our future research:

1) Is there a polynomial-time $O(\log n)$-approximation for the LSS problem in degree-bounded graphs under the deterministic linear threshold model?
2) Is there a polynomial-time $O(\log n)$-approximation for the LSS problem in the linear threshold model and the independent cascade model?

VII. ACKNOWLEDGMENT

This research work is supported in part by the National Science Foundation of USA under grants CNS 1016320 and CCF 0829993.

REFERENCES

[1] W. Chen, Y. Yuan and L. Zhang: Scalable Influence Maximization in Social Networks under the Linear Threshold Model. The 2010 International Conference on Data Mining, 2010.
[2] W. Chen, C. Wang and Y. Wang: Scalable Influence Maximization for Prevalent Viral Marketing in Large-scale Social Networks. The 2010 ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2010.
[3] U. Feige: A Threshold of ln n for Approximating Set Cover. Journal of the ACM, 45(4): pp. 634-652, 1998.
[4] P. Domingos and M. Richardson: Mining the Network Value of Customers. The 2001 International Conference on Knowledge Discovery and Data Mining, 2001.
[5] J. Goldenberg, B. Libai and E. Muller: Using Complex Systems Analysis to Advance Marketing Theory Development. Academy of Marketing Science Review, 2001.
[6] D. Kempe, J. Kleinberg and É. Tardos: Maximizing the Spread of Influence Through a Social Network. The 2003 International Conference on Knowledge Discovery and Data Mining, pp. 137-146, 2003.
[7] D. Kempe, J. Kleinberg and É. Tardos: Influential Nodes in a Diffusion Model for Social Networks. The 2005 International Colloquium on Automata, Languages and Programming, pp. 1127-1138, 2005.
[8] M. Richardson and P. Domingos: Mining Knowledge-Sharing Sites for Viral Marketing. The 2002 International Conference on Knowledge Discovery and Data Mining, pp. 61-70, 2002.
[9] F. Zou, J. Willson, Z. Zhang and W. Wu: Fast Information Propagation in Social Networks. Discrete Mathematics, Algorithms and Applications (DMAA), 2010.
[10] M. Granovetter: Threshold Models of Collective Behavior. American Journal of Sociology, 83(6): pp. 1420-1443, 1978.
[11] T. Schelling: Micromotives and Macrobehavior. Norton, 1978.
[12] J. Goldenberg, B. Libai and E. Muller: Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth. Marketing Letters, 12(3): pp. 211-223, 2001.
[13] N. Chen: On the Approximability of Influence in Social Networks. The 2008 Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1029-1037, 2008.