An Iterative Incentive Mechanism Design For Crowd Sensing Using Best Response Dynamics

Alireza Chakeri, Luis G. Jaimes
Department of Computer Science and Engineering, University of South Florida, Tampa, FL, chakeri@mail.usf.edu
Department of Computer Science, Florida Polytechnic University, Lakeland, FL, ljaimes@flpoly.org

Abstract

Recent studies have modeled the incentive mechanism as a complete information game where the contributors have common knowledge. However, that assumption is not realistic in real-world scenarios. In this paper, we present an incentive mechanism for crowd sensing (CS) in sealed markets in which participants have incomplete information on the behavior of other participants. An iterative game framework is introduced where the solution is achieved after a number of iterations. We also address the question: do uncoordinated contributors converge to an equilibrium? In fact, we are concerned with the convergence of the contributors to an equilibrium under a natural dynamics. The well-known best response dynamics is studied under different rules of play. In addition, a strategy for the platform to assign the optimal budget for the initial state of the market is presented. We evaluate the performance of the proposed incentive mechanism through theoretical analyses and extensive simulations.

Index Terms: Crowd Sensing, Participatory Sensing, Ubiquitous Computing, Game Theory, Best Response Dynamics.

I. INTRODUCTION

Crowd sensing involves the participation of masses of individuals who use the embedded sensors in their mobile devices to sample a variable of interest. Users' participation includes actions such as taking a picture, recording a sound, and sensing environmental variables like temperature, pollen, or pollution. A CS market consists of data buyers, i.e. crowdsourcers, and data sellers, i.e. contributors. Designing an incentive mechanism is a fundamental step for the successful deployment of any CS market in real-world scenarios.
The main purpose of an incentive mechanism is to compensate contributors for the costs associated with data sampling and collection. These costs include 1) energy consumption for data sensing, processing, and transmission, 2) consumption of the users' data plans, and 3) the users' discomfort in manually sampling and reporting data. Hence, a rational contributor expects a compensation that exceeds its participation costs. On the other hand, every crowdsourcer wants to engage as many participants as possible and, at the same time, to keep the system economically feasible. To address these problems, several mechanisms have been proposed [2], [6], [7], [12], [19]. Recent approaches are centered around the notion of user-centric incentive mechanisms [5], [8], [15]. These approaches often take the form of reverse auctions [11]. Specifically, the contributor's bid partially reveals its true valuation, namely the minimum value that it is willing to accept for its work. All of these approaches suffer from the same problem, i.e. contributor starvation.

A completely different family of monetary incentives are the platform-centric approaches [14], [17], [18]. In platform-centric approaches, a crowdsourcer posts a sensing task and the associated reward for the task. Then, the interested contributors submit the units of time that they are willing to work on the task. In this way, all of the contributors can participate in performing the task, since the reward is distributed among the participants proportionally. The studies in this direction have modeled the incentive mechanism as a complete information game where the contributors have common knowledge about the situation, and the Nash equilibrium of the modeled game is considered as the solution to the market. However, the common knowledge assumption is not realistic in real-world scenarios.
For instance, the contributors usually are not aware of the other contributors' costs and, as a result, are not able to predict the true outcome of the market. Hence, in this paper, we present an incentive mechanism for data collection in sealed markets in which participants have incomplete information on the behavior of other participants. We address a different question: do uncoordinated contributors converge to a Nash equilibrium? Thus, rather than the quality of the resulting sensing plans, we are more concerned with the convergence of the contributors to a stable market under some natural dynamics. In particular, we address this question assuming the well-known best response dynamics [13], which plays a central role in evolutionary game theory [16]. As its name suggests, the best response dynamics consists of players taking turns in some order to adapt their strategy to the most recent known strategies of the others, without considering the effect on future outcomes of the game. In fact, the main focus of this paper is the convergence of the best response dynamics in crowd sensing markets.

II. PLATFORM-CENTRIC INCENTIVE MECHANISMS: GAME THEORETIC PERSPECTIVE

A typical crowd sensing market consists of one crowdsourcer and multiple contributors. The crowdsourcer corresponds to a data buyer or a data consumer that posts a sensing task. The contributors work on the sensing tasks posted

by the crowdsourcer. In particular, consider a market where a crowdsourcer announces a sensing task and the total reward $R$. Let $U = \{1, 2, \dots, n\}$ be the set of contributors that are willing to work on the task. We assume that each participating contributor is available during the task and submits its sensing plan, i.e. the number of units of time to invest in the posted sensing task. However, working on the sensing task entails costs for the contributors, such as extra energy expenses for using the phone's sensors, extra consumption of the user's data plan, and the burden of manual data collection. Let us denote the sensing plan and the unit sensing cost of contributor $i$ by $t_i$ and $k_i$, respectively (without loss of generality, we assume that $k_1 \le k_2 \le \dots \le k_n$). Assume that the reward received by contributor $i$ is proportional to $t_i$. Then the utility of contributor $i$ can be written as [9], [14], [18]

$$u_i = \frac{t_i}{T} R - k_i t_i, \tag{1}$$

where $T = \sum_{i \in U} t_i$. On the other hand, the contributors' sensing plans provide a utility for the crowdsourcer that can be generally written as

$$\bar{u} = g(t_1, t_2, \dots, t_n) - R, \tag{2}$$

where $g(\cdot)$ is the crowdsourcer's valuation function of the contributors' sensing plans. We assume that $g(\cdot)$ is a strictly concave function in the variables $t_1, t_2, \dots, t_n$. This assumption is realistic and general, and it has been adopted in the literature. In particular, the authors in [17] set $g(\cdot)$ to $\lambda \log\left(1 + \sum_{i \in U} \log(1 + t_i)\right)$, where $\lambda$ is a system parameter. The inner log term reflects the crowdsourcer's diminishing return on the work of contributor $i$, and the outer log term reflects the crowdsourcer's diminishing return on participating contributors. In this paper, we also use this utility function. The goals of the crowdsourcer and the contributors are to maximize their corresponding utilities; the former by posting the optimal reward, and the latter by submitting the optimal sensing plans. To solve this problem, Yang et al.
in [18] formulated the problem as a Stackelberg game [3] in which the crowdsourcer is the leader and the contributors are the followers. They considered the posted reward $R$ and the sensing plan $t_i$ as the strategies of the crowdsourcer and contributor $i$, respectively. In particular, when the crowdsourcer posts a reward $R$, a game takes place between the contributors. This game is called the sensing game. In the sensing game, let $t = (t_1, t_2, \dots, t_n)$ be the strategy profile of all contributors. Alternatively, $t$ can be written as $t = (t_i, t_{-i})$, where $t_{-i}$ represents the strategy profile of all contributors except contributor $i$. Contributor $i$ gains the utility $u_i(t)$ for each strategy profile $t$.

Now a natural question arises: how can we find the strategy profile in which none of the contributors has anything to gain by unilaterally changing its current strategy for a fixed reward $R$? This stable configuration corresponds to the concept of Nash equilibrium in game theory. A Nash equilibrium is a strategy profile such that each player's strategy is a best response to the other players' strategies. In particular, $t^* = (t_1^*, \dots, t_n^*)$ is a Nash equilibrium if, for all $i$, $t_i^*$ is a best response to $t_{-i}^*$. We say $t_i^*$ is a best response to $t_{-i}$, denoted by $\beta_i(t_{-i})$, if $u_i(t_i^*, t_{-i}) \ge u_i(\hat{t}_i, t_{-i})$ for all $\hat{t}_i \ne t_i^*$. For the sensing game, the authors in [18] showed that

$$\beta_i(t_{-i}) = \begin{cases} 0, & \text{if } R \le k_i S_i \\ \sqrt{\dfrac{R S_i}{k_i}} - S_i, & \text{otherwise} \end{cases} \tag{3}$$

where $S_i = \sum_{j \in U \setminus \{i\}} t_j$. Interestingly, the authors in [18] proved that the sensing game has a unique Nash equilibrium and provided a linear time algorithm to compute it. In fact, the unit sensing costs are the only factors that determine the survival contributors. The survival contributors are the contributors who submit nonzero sensing plans; they form the set $M$ with cardinality $|M| = m$.
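The utility in (1) and the best response in (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the convention that a contributor earns nothing when the total sensing time is zero are assumptions made here, not part of the original model description.

```python
import math


def contributor_utility(i, t, k, R):
    """Eq. (1): u_i = (t_i / T) * R - k_i * t_i, where T is the total sensing time."""
    T = sum(t)
    if T == 0:
        return 0.0  # assumption: no reward is distributed when nobody contributes
    return t[i] / T * R - k[i] * t[i]


def best_response(i, t, k, R):
    """Eq. (3): best response of contributor i to the others' sensing plans."""
    S = sum(t) - t[i]  # S_i, the total sensing time of the other contributors
    if R <= k[i] * S:
        return 0.0
    return math.sqrt(R * S / k[i]) - S


# Illustrative values: three contributors, reward R = 100, unit costs (1, 2, 3).
plans = [30.0, 10.0, 20.0]
costs = [1.0, 2.0, 3.0]
responses = [best_response(i, plans, costs, 100.0) for i in range(3)]
print([round(x, 2) for x in responses])
```

Each contributor only needs the posted reward, its own cost, and the aggregate sensing time of the others to evaluate `best_response`; the other contributors' costs never appear.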
Since we often refer to the algorithm proposed in [18], it is restated here for the reader's convenience.

Algorithm 1: The Nash equilibrium of the sensing game
1: Sort contributors according to their costs, $k_1 \le k_2 \le \dots \le k_n$
2: $M \leftarrow \{1, 2\}$, $i \leftarrow 3$
3: while $i \le n$ and $k_i < \dfrac{\sum_{j \in M} k_j}{|M| - 1}$ do
4:     $M \leftarrow M \cup \{i\}$, $i \leftarrow i + 1$
5: end while
6: for all $i \in U$ do
7:     if $i \in M$ then
8:         $t_i^* = \dfrac{(m-1)R}{\sum_{j \in M} k_j} \left(1 - \dfrac{(m-1) k_i}{\sum_{j \in M} k_j}\right)$
9:     else
10:         $t_i^* = 0$
11:     end if
12: end for
13: return $t^* = (t_1^*, t_2^*, \dots, t_n^*)$

III. AN ITERATIVE GAME MODEL USING BEST RESPONSE DYNAMICS

According to Algorithm 1, the main assumption in [18] is that each contributor must know the costs of the other contributors in order to calculate its optimal sensing plan, i.e. the contributors require complete information about the game. However, this assumption is very strong in real-world situations. The contributors usually are not aware of the costs of the others and, as a result, are not able to compute their optimal sensing plans. The market still has a unique Nash equilibrium, and each contributor ultimately wants to reach that unknown equilibrium to constitute a stable market, but this aim is not achievable in a single shot of the game. As a result, in this paper, we develop an iterative incentive mechanism for contributors with incomplete information. The mechanism motivates them to reach the stable state of the market iteratively, without knowing the costs of the others. In fact, we let the contributors play the game repeatedly and reach the only equilibrium of the game by changing their sensing plans in an evolutionary manner.
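For reference, the complete-information equilibrium of Algorithm 1, which the iterative mechanism aims to reach, can be sketched as follows. The function name `nash_equilibrium` is a hypothetical choice, the sketch assumes at least two contributors, and it uses a plain sort, so it is meant to illustrate the steps rather than reproduce the implementation in [18].

```python
def nash_equilibrium(k, R):
    """Sketch of Algorithm 1: equilibrium sensing plans for unit costs k and reward R.

    Assumes len(k) >= 2. Returns the plans in the original order of k.
    """
    n = len(k)
    order = sorted(range(n), key=lambda idx: k[idx])  # step 1: sort by unit cost
    M = order[:2]                                      # step 2: start with the two cheapest
    i = 2
    # steps 3-5: admit the next contributor while its cost stays below the threshold
    while i < n and k[order[i]] < sum(k[j] for j in M) / (len(M) - 1):
        M.append(order[i])
        i += 1
    m = len(M)
    cost_sum = sum(k[j] for j in M)
    t = [0.0] * n                                      # non-survival contributors submit 0
    for j in M:                                        # steps 6-12
        t[j] = (m - 1) * R / cost_sum * (1 - (m - 1) * k[j] / cost_sum)
    return t


# Markets of Example 1 and Example 2 (R = 100).
print([round(x, 2) for x in nash_equilibrium([1, 2, 3], 100)])
print([round(x, 2) for x in nash_equilibrium([1.5, 2, 4, 5, 6], 100)])
```

Note that the function takes the full cost vector `k` as input, which is exactly the information a contributor in a sealed market does not have.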

To overcome the problem, we develop an iterative incentive mechanism using the best response (BR) dynamics. It can be summarized as follows. In the first iteration, the contributors submit their initial sensing plans. The initial sensing plan of a contributor can be arbitrary, or it can be based on an estimate of the other contributors' costs and the posted reward for the task. Once the crowdsourcer has distributed the rewards to the crowd, the contributors might need to change their sensing plans in the hope of reaching a stable market. In particular, we suggest that the contributors change their sensing plans to the ones that are the best responses to the sensing plans of the other contributors in the current iteration. The procedure iterates until none of the players changes its sensing plan. More specifically, we use the BR dynamics to obtain the sensing plan of the $i$th contributor at time $l + 1$ from the sensing plans of the other contributors at time $l$, i.e.

$$t_i(l + 1) = \beta_i(t_{-i}(l)). \tag{4}$$

By the definition of BR dynamics, the Nash equilibrium is a fixed point of the BR dynamics, because in the Nash equilibrium the contributors play their best responses. However, from relation (3), $\mathbf{0} = (0, \dots, 0)$ is also a fixed point, since none of the contributors would change its strategy. Hence, the aim is to study whether the trajectories starting from the initial sensing plans converge to the nonzero fixed point (i.e. the Nash equilibrium). Ideally, the dynamics should be insensitive to the initial sensing plans and drive all of the players' sensing plans to the nonzero fixed point, i.e. the nonzero fixed point should be globally stable. But this is very optimistic in real-world scenarios. Thus, we address the weaker notion of asymptotic stability. In particular, the nonzero fixed point needs to be stable in the sense that all initial sensing plans sufficiently close to the nonzero fixed point converge to it.
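The update in (4) and its two fixed points can be checked numerically with a short sketch. Here every contributor updates at once; the helper names are hypothetical, and the market (reward 100, unit costs 1, 2, 3) is an illustrative choice.

```python
import math


def best_response(S, k_i, R):
    """Eq. (3), written in terms of S, the total sensing time of the other contributors."""
    if R <= k_i * S:
        return 0.0
    return math.sqrt(R * S / k_i) - S


def br_step(t, k, R):
    """Eq. (4) applied to every contributor at once: t_i(l+1) = beta_i(t_{-i}(l))."""
    total = sum(t)
    return [best_response(total - t_i, k_i, R) for t_i, k_i in zip(t, k)]


R = 100.0
costs = [1.0, 2.0, 3.0]

# The zero profile is mapped to itself, so it is a fixed point of the dynamics.
print(br_step([0.0, 0.0, 0.0], costs, R))

# The nonzero equilibrium (200/9, 100/9, 0) of this market is also mapped to itself.
equilibrium = [200.0 / 9.0, 100.0 / 9.0, 0.0]
print([round(x, 6) for x in br_step(equilibrium, costs, R)])

# One step from an arbitrary starting profile.
print([round(x, 2) for x in br_step([30.0, 10.0, 20.0], costs, R)])
```

Because a step requires only the aggregate sensing time of the others, the same function could be evaluated by each contributor privately; whether repeated steps approach the nonzero fixed point depends on the rule of play and on the starting profile.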
This stability condition guarantees that any trajectory initialized in the basin of attraction of the nonzero fixed point converges to it. In this regard, we focus on three natural dynamic rules: (1) simultaneous move, where all of the contributors change their sensing plans in each iteration simultaneously, (2) round robin, where the contributors play in a cyclic manner according to a predefined order, and (3) random walk, where a contributor is chosen uniformly at random to change its sensing plan in each iteration.

A. Simultaneous Move

In this rule, each contributor changes its strategy in each iteration. In particular, consider the nonlinear discrete-time dynamical system

$$t(l + 1) = (t_1(l + 1), t_2(l + 1), \dots, t_n(l + 1)), \tag{5}$$

where

$$t_i(l + 1) = \beta_i(t_{-i}(l)). \tag{6}$$

This nonlinear discrete-time system specifies the sensing plan of each contributor in each iteration. To study the stability of the nonzero fixed point, we analyze the behavior of the dynamics in a local neighborhood of the fixed point through its linearization. In particular, the Jacobian matrix of (5) evaluated at the fixed point $t^*$ is

$$J(t^*) = \begin{pmatrix} \frac{\partial f_1}{\partial t_1}(t^*) & \frac{\partial f_1}{\partial t_2}(t^*) & \cdots & \frac{\partial f_1}{\partial t_n}(t^*) \\ \frac{\partial f_2}{\partial t_1}(t^*) & \frac{\partial f_2}{\partial t_2}(t^*) & \cdots & \frac{\partial f_2}{\partial t_n}(t^*) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial t_1}(t^*) & \frac{\partial f_n}{\partial t_2}(t^*) & \cdots & \frac{\partial f_n}{\partial t_n}(t^*) \end{pmatrix} \tag{7}$$

where $f_i(t^*) = \beta_i(t_{-i}^*)$. According to (3), we can see that $J_{ij} = \frac{1}{2}\sqrt{\frac{R}{k_i S_i}} - 1$ for $i \ne j$ and $J_{ii} = 0$ (i.e. the diagonal elements are zero). To study the stability of the nonzero fixed point, we analyze the magnitudes of the eigenvalues of the Jacobian matrix. If they are all less than 1, then the nonzero fixed point is stable [4]. However, we show in Theorem 1 that the nonzero fixed point is not stable under the simultaneous move rule. The assumption in the theorem is realistic, since in a large CS market it is very common that there are at least three non-survival contributors.

Theorem 1. The nonzero fixed point of the dynamics in (4) is unstable under the simultaneous move rule if $m < n - 2$.

Proof.
Please see the Appendix.

Example 1: Consider a market with three contributors and a task with reward $R = 100$. Assume that the costs of the contributors are $k_1 = 1$, $k_2 = 2$ and $k_3 = 3$. The only Nash equilibrium of the sensing game is $t^* = (22.22, 11.11, 0)$. However, since each contributor does not know the exact costs of the others, it is not able to compute its Nash equilibrium sensing plan. Hence, let the contributors initially submit the sensing plans $t(0) = (30, 10, 20)$. Since $t(0)$ is not the Nash equilibrium, the market is not in a stable state. Using the simultaneous move rule, the contributors submit $t(1) = (24.77, 0, 0)$ as their sensing plans for the next iteration. After 18 iterations, the fixed point $t(18) = (22.22, 11.11, 0)$ is reached. At this point, the players have no motivation to change their sensing plans and a stable market is achieved. Figure 1 shows the sensing plans submitted by the contributors over time. To see why they converge, note that all of the eigenvalues of the Jacobian matrix in relation (8), i.e. $(-0.1706 + 0.5808i, -0.1706 - 0.5808i, 0.3412)$, have magnitudes less than 1, which implies the stability of the nonzero fixed point.

$$J = \begin{pmatrix} 0 & 0.5 & 0.5 \\ -0.25 & 0 & -0.25 \\ -0.5 & -0.5 & 0 \end{pmatrix} \tag{8}$$

The above example shows that, for that specific setting, the initial sensing plans converge toward the nonzero fixed point (since the condition $m < n - 2$ does not hold). In the second example, we set up an unstable situation.

Example 2: Consider a market with 5 contributors and a task with reward $R = 100$. Assume that the costs of the contributors are $k_1 = 1.5$, $k_2 = 2$, $k_3 = 4$, $k_4 = 5$

and $k_5 = 6$. The only Nash equilibrium of the game is $t^* = (16.33, 12.24, 0, 0, 0)$. In this case, the eigenvalues of the Jacobian matrix evaluated at the Nash equilibrium are $(-1.01, -0.30, 0.55, 0.60)$. One of the eigenvalues has magnitude greater than 1, which results in the instability of the nonzero fixed point. This is congruent with Theorem 1, since the condition on the number of survival contributors holds ($m = 2$, $n = 5$).

Fig. 1. Contributors' sensing plans over time.

B. Random Walk

In this subsection, we study the convergence of the random walk rule, where in each turn a player is chosen uniformly at random to play. To address the convergence of this rule, we use notions from switched discrete-time linear systems [10]. The idea is to study the Jacobian matrices of the best responses of individual players and to analyze how the products of such matrices grow. The well-known quantity describing the growth rate of matrix products is the joint spectral radius. In fact, a sufficient condition for the convergence of the dynamics is that the joint spectral radius be strictly less than one. Hence, we cast the analysis of the convergence of the random walk BR dynamics as the analysis of the joint spectral radius of the Jacobian matrices of individual players. We show that the equilibrium sensing plan is not stable under this rule either.

Here, we review the essential notions of switched discrete-time linear systems. Consider a dynamical system

$$t(l + 1) = (\beta_i(t_{-i}(l)), t_{-i}(l)), \tag{9}$$

where $i$ can be chosen randomly in each iteration. Such a system is called a switched discrete-time nonlinear system. One can linearize the nonlinear system (9) into a linear system using its Jacobian matrix evaluated at the fixed point. The Jacobian matrix of (9) for contributor $i$ is the identity matrix with its $i$th row replaced:

$$J_i = \begin{pmatrix} 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ h_i & \cdots & 0 & \cdots & h_i \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 1 \end{pmatrix} \tag{10}$$

where $h_i = \frac{1}{2}\sqrt{\frac{R}{k_i S_i}} - 1$. Hence, the switched discrete-time linear system can be written as

$$t(l + 1) = J_i\, t(l). \tag{11}$$

In the switched discrete-time linear system, the contributor $i$ is chosen randomly in each iteration. In particular, consider the set $\mathcal{J} = \{J_1, \dots, J_n\}$ containing the Jacobian matrices of all contributors. If all the Jacobian matrices in $\mathcal{J}$ were the same, stability could be checked by testing whether their spectral radius is less than 1. However, since the matrices are different in our case, the joint spectral radius can be used to address the stability of such a system. The joint spectral radius of a finite set $\mathcal{J}$ of matrices is defined in (12), i.e. $\rho(\mathcal{J})$ represents the largest asymptotic growth rate of all possible products of matrices in $\mathcal{J}$.

$$\rho(\mathcal{J}) = \lim_{k \to \infty} \max\left\{ \left\| J_{d_1} \cdots J_{d_k} \right\|^{1/k} : J_{d_i} \in \mathcal{J} \right\} \tag{12}$$

Although the growth rate depends on the particular sequence of matrices, it was shown that all possible trajectories converge to a fixed point if $\rho(\mathcal{J}) < 1$ evaluated at the fixed point. However, it is very hard to compute $\rho(\mathcal{J})$. Specifically, the question of whether $\rho(\mathcal{J}) < 1$ is undecidable, even if $\mathcal{J}$ consists of only two matrices [1]. Most of the methods to approximate the joint spectral radius are based on the following inequalities, which bound it from below and above:

$$\max\{\rho(J_i) : J_i \in \mathcal{J}\} \le \rho(\mathcal{J}) \le \max\{\|J_i\| : J_i \in \mathcal{J}\} \tag{13}$$

where $\rho(J_i)$ and $\|J_i\|$ are the spectral radius and the norm of matrix $J_i$, respectively. Using this inequality, we provide a theorem on the linear instability of the equilibrium sensing plan under the random walk rule.

Theorem 2. The nonzero fixed point of the dynamics in (4) is linearly unstable under the random walk rule.

Proof. Please see the Appendix.

C. Round Robin

In this rule, the contributors play in a cyclic manner according to a predefined order. In addition, a round is defined as a sequence of play in which each player plays exactly once. The order in all rounds is the same as the order in the first round. Let us fix this order to be $1, 2, \dots, n$.
Also, let us define $\tilde{t}_i$ as the strategy profile obtained from $t$ in which only the $i$th contributor changes its strategy to its best response while the others keep their strategies fixed, i.e.

$$\tilde{t}_i(t) = (\beta_i(t_{-i}), t_{-i}). \tag{14}$$

Hence, the point reached from $t(0)$ after one round of play is

$$t(1) = \tilde{t}_n(\tilde{t}_{n-1}(\cdots(\tilde{t}_1(t(0))))). \tag{15}$$

Using this definition, the total Jacobian matrix is the product of the Jacobian matrices of the best responses of the individual contributors, i.e. $J_R = \prod_{i=1}^{n} J_i$. Although we were not able to prove that all eigenvalues of $J_R$ have magnitude less than one, the conjecture below held in all of the extensive numerical experiments that we conducted.

Conjecture 1. The nonzero fixed point of the dynamics in (4) is stable under the round robin rule.

For instance, we study again the market described in Example 2 and show how the round robin rule makes the nonzero fixed point stable.

Example 3: Consider the same market as in Example 2, but played with the round robin rule. In this case, the Jacobian matrix (which is the product of the Jacobian matrices of the individual contributors) has eigenvalues $(0, 0.17, 0.40 + 0.17i, 0.40 - 0.17i, 0.12)$. All the eigenvalues have magnitude less than 1, which means that the nonzero fixed point is stable. For instance, assume that the contributors initially submit the sensing plans $t(0) = (20, 10, 10, 8, 5)$. Using the round robin rule with the order $\{1, 2, 3, 4, 5\}$, the contributors submit the following sequence of sensing plans in the first round: $(13.90, 10, 10, 8, 5) \to (13.90, 6.06, 10, 8, 5) \to (13.90, 6.06, 0, 8, 5) \to (13.90, 6.06, 0, 0, 5) \to (13.90, 6.06, 0, 0, 0)$. Again, since $t(1)$ is not a Nash equilibrium, the contributors are inclined to change their sensing plans again. After 11 rounds of play (56 iterations), they reach the fixed point $t(11) = (16.33, 12.24, 0, 0, 0)$. Figure 2 shows the sensing plans of all contributors over time.

Fig. 2. Contributors' sensing plans over time.

D. Effect of Initial Sensing Plans

It is clear that the convergence behavior of the dynamics depends on the initial sensing plans. If the initial sensing plans are not in the basin of attraction of the nonzero fixed point, then the dynamics will converge to the zero sensing plans. As a result, the question of which initial sensing plans lead to the nonzero fixed point can be equivalently answered by determining the basin of attraction of the nonzero fixed point. We are not able to solve this problem in general, but we propose a simple yet effective approach to initialize the state of the market. It is reasonable to assume that the contributors have similar costs with some minor differences. As a result, each contributor submits its initial sensing plan as if the other contributors had the same cost as itself. In this case, from Algorithm 1, the initial sensing plan of the $i$th contributor would be

$$t_i(0) = \frac{(n - 1)R}{n^2 k_i}. \tag{16}$$

Extensive experimental results show that this initialization method causes the round robin BR dynamics to converge to the nonzero fixed point (confirming that it is in its basin of attraction).

IV. CROWDSOURCER'S BUDGET ALLOCATION

As mentioned earlier, the authors in [18] modeled the incentive problem as a Stackelberg game in which the crowdsourcer is the leader and the contributors are the followers (it should be noted that the sensing game takes place between the contributors). However, when the game is an incomplete information game, the sensing plans might not converge to the Nash equilibrium of the market for every $R$. If the sensing plans under the round robin rule converge to the zero fixed point, then the utility of the crowdsourcer would be $-R$ for every iteration after reaching the zero fixed point. This makes the crowdsourcer concerned with the utility in the steady state rather than the utilities in the transient states before the fixed point is reached. Hence, the crowdsourcer should be able to predict whether the submitted initial sensing plans converge to the nonzero fixed point. To solve the aforementioned problem, we assume that the crowdsourcer expects each contributor to submit its initial sensing plan as if the other contributors had the same costs (refer to Section III-D).
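The prediction step described above, i.e. starting from the initialization in (16) and letting the contributors play round robin best responses, can be sketched as follows. The helper names, the stopping tolerance, and the cap on the number of rounds are assumptions of this sketch and are not taken from the paper's Matlab implementation.

```python
import math


def best_response(S, k_i, R):
    """Eq. (3) in terms of S, the total sensing time of the other contributors."""
    S = max(S, 0.0)  # guard against tiny negative values caused by rounding
    if R <= k_i * S:
        return 0.0
    return math.sqrt(R * S / k_i) - S


def initial_plans(k, R):
    """Eq. (16): each contributor acts as if all others shared its own unit cost."""
    n = len(k)
    return [(n - 1) * R / (n * n * k_i) for k_i in k]


def round_robin_round(t, k, R):
    """One round in the order 1, 2, ..., n; each update sees the latest plans (eq. (15))."""
    t = list(t)
    for i in range(len(t)):
        t[i] = best_response(sum(t) - t[i], k[i], R)
    return t


def run_round_robin(t0, k, R, max_rounds=1000, tol=1e-9):
    """Repeat rounds until no plan changes by more than tol (hypothetical stopping rule)."""
    t = list(t0)
    for rounds in range(1, max_rounds + 1):
        new_t = round_robin_round(t, k, R)
        if max(abs(a - b) for a, b in zip(new_t, t)) < tol:
            return new_t, rounds
        t = new_t
    return t, max_rounds


costs = [1.5, 2.0, 4.0, 5.0, 6.0]  # market of Example 2
plans, rounds = run_round_robin(initial_plans(costs, 100.0), costs, 100.0)
print([round(x, 2) for x in plans], rounds)
```

With these costs the run ends at the nonzero fixed point of Example 2; starting instead from the zero profile would leave every plan at zero, which is why the choice of initial plans matters to the crowdsourcer.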
Hence, the crowdsourcer would be able to predict the outcome of the game, assuming the contributors play according to the BR dynamics. In this case, the dynamics generally converges to the nonzero fixed point for any $R$. The following example describes our proposed strategy to allocate the optimal budget.

Example 4: Consider the same market as in Example 2, in which the contributors play according to the round robin rule. The goal is to find the optimal $R$ that maximizes the crowdsourcer's utility. Assume that, for any $R$, each contributor submits its initial sensing plan as if the other contributors had the same costs. For instance, if the posted reward is $R = 10$ then the initial sensing plans would be $t(0) = (1.07, 0.8, 0.4, 0.32, 0.27)$, whereas if the posted reward is $R = 40$ then the initial sensing plans would be $t(0) = (37.21, 48.85, 29.29, 61.22, 84.92)$. Hence, the crowdsourcer changes its posted reward to gain the highest utility when the contributors reach the nonzero fixed point. Figure 3 shows the utility of the crowdsourcer for different rewards when $\lambda = 100$. The optimal reward for the crowdsourcer is $R = 36$, with utility $\bar{u} = 116.96$.
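The budget search of Example 4 can be sketched as below. The sketch assumes that the dynamics reaches the nonzero fixed point for every candidate reward, so it evaluates the crowdsourcer's utility (2) directly at the equilibrium given by Algorithm 1 instead of simulating the rounds. Restricting the search to integer rewards from 1 to 100 is also an assumption made here for illustration; the source does not state the search grid.

```python
import math


def nash_equilibrium(k, R):
    """Sketch of Algorithm 1 (assumes at least two contributors)."""
    n = len(k)
    order = sorted(range(n), key=lambda idx: k[idx])
    M = order[:2]
    i = 2
    while i < n and k[order[i]] < sum(k[j] for j in M) / (len(M) - 1):
        M.append(order[i])
        i += 1
    m = len(M)
    cost_sum = sum(k[j] for j in M)
    t = [0.0] * n
    for j in M:
        t[j] = (m - 1) * R / cost_sum * (1 - (m - 1) * k[j] / cost_sum)
    return t


def crowdsourcer_utility(t, R, lam):
    """Eq. (2) with g = lam * log(1 + sum_i log(1 + t_i)), natural logarithms."""
    return lam * math.log(1 + sum(math.log(1 + t_i) for t_i in t)) - R


def best_reward(k, lam, candidates):
    """Pick the candidate reward with the highest steady-state utility."""
    return max(
        candidates,
        key=lambda R: crowdsourcer_utility(nash_equilibrium(k, R), R, lam),
    )


costs = [1.5, 2.0, 4.0, 5.0, 6.0]  # market of Example 2
R_opt = best_reward(costs, 100.0, range(1, 101))
u_opt = crowdsourcer_utility(nash_equilibrium(costs, R_opt), R_opt, 100.0)
print(R_opt, round(u_opt, 2))
```

If the contributors' plans instead collapse to the zero fixed point, `crowdsourcer_utility` returns `-R`, which is the situation the crowdsourcer wants to rule out before committing to a reward.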

Fig. 3. Crowdsourcer's utility for different $R$ when the contributors submit their initial sensing plans based on the strategy in Section III-D.
Fig. 4. Crowdsourcer's utility for different $R$ when the contributors randomly submit their initial sensing plans.
Fig. 5. (a) Impact of the posted reward on the average utility of the contributors in the fixed point. (b) Average utility of the contributors for different numbers of contributors.
Fig. 6. (a) Average and variance of the convergence time for different numbers of contributors. (b) Average and variance of the rounds needed to converge for different numbers of contributors.

In the following example, we analyze the effect of the initial sensing plans on the optimal $R$.

Example 5: Consider a market with 100 contributors whose costs are uniformly distributed in the range $[0, 1]$. Also, assume that the contributors submit their initial sensing plans randomly. In this case, the initial sensing plans might not be in the basin of attraction of the nonzero fixed point and, as a result, the dynamics will converge to the zero fixed point. This makes the utility of the crowdsourcer negative, i.e. $-R$. Figure 4 shows the utility of the crowdsourcer for different rewards when $\lambda = 10$. The figure also shows the utility of the crowdsourcer if the game were played with complete information. As one can see, in an incomplete information game, the utility of the crowdsourcer can be negative if the initial sensing plans converge to the zero fixed point. In this case, the crowdsourcer is not able to find its optimal reward.

V. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the proposed incentive mechanism by analyzing the contributors' utilities and the running time needed to converge to the nonzero fixed point (convergence time). In the experiments, the number of contributors varied from 100 to 1000 in increments of 100.
The costs of the contributors are randomly distributed over the range $[1, 10]$. The $\lambda$ for the crowdsourcer is set to 100. The code has been written in Matlab m code and run on a machine equipped with 7 Intel 3.6 GHz and 16 GB memory. The simulations are done in the following way: we initialize the sensing plans using the method suggested in Section III-D. Then, the round robin BR dynamics is used to dynamically change the sensing plans. In all of the experiments, the dynamics converged to the nonzero fixed point, confirming Conjecture 1.

1- Contributors' utilities: We calculate the average utility of 100 contributors in the fixed point for different posted rewards $R$ and plot it in Figure 5.a. We observe that as the reward increases, the average utility increases almost linearly. We also varied the number of contributors and plot the average utility of the survival contributors in Figure 5.b. As we can see, the average utility decreases as the number of contributors increases, since more competition is involved.

2- Running Time: Here we show the running time, as well as the number of rounds, needed to converge to the nonzero fixed point over 100 runs (in each run the costs are randomly picked) for different numbers of contributors in Figures 6.a and 6.b. The figures show that the number of contributors has an almost linear impact on the convergence time. Also, even when the market size is large (e.g. 1000 contributors), only 12 rounds are needed to converge. Furthermore, we studied the effect of the range of the costs

on convergence time for 600 contributors. In particular, the maximum cost is varied from 2 to 6. The results are shown in Figures 7.a and 7.b. We observe that, as the range of the costs increases, the running time (and the number of rounds) increases too. This is because the initialization strategy in Section III-D is more effective when the costs are similar. In other words, when the initial sensing plans are not close to the ones in the fixed point, convergence takes longer.

Fig. 7. (a) Impact of the range of the costs on the convergence time. (b) Impact of the range of the costs on the number of rounds needed to converge.

VI. CONCLUSIONS

In this paper, we developed an incentive mechanism for crowd sensing markets with incomplete information. We presented an iterative game framework where the equilibrium of the market is achieved after a number of iterations. At the core of the framework is the well-known best response dynamics, studied under three natural models of the players' behavior. We presented theoretical analysis and simulation results on the convergence of the contributors under these behavioral models. Empirically, we observed that the round robin best response dynamics, where the contributors play in a cyclic manner, provides a stable fixed point for the modeled game. In addition, we studied the main factors that affect important metrics such as the number of active contributors, the utility of the crowdsourcer, the average utility of the contributors, and the convergence time. The results of the experiments show that the distribution of the contributors' costs substantially affects these metrics. Finally, a strategy for the crowdsourcer to assign the optimal reward was presented.
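APPENDIX
PROOF OF THEOREMS 1 AND 2

Proof of Theorem 1: First, we prove the following observations for all $i \ne j$:

$$J_{ij} = \begin{cases} \dfrac{1}{2}\sqrt{\dfrac{\sum_{l \in M} k_l}{k_i (m - 1)}} - 1, & i \notin M \\ \dfrac{\sum_{l \in M} k_l}{2 k_i (m - 1)} - 1, & i \in M \end{cases} \tag{17}$$

To prove this, we derive the following relations for the sum of the sensing plans at the Nash equilibrium $t^*$:

$$T = \sum_{i \in U} t_i^* = \sum_{i \in M} \frac{(m - 1)R}{\sum_{j \in M} k_j}\left(1 - \frac{(m - 1)k_i}{\sum_{j \in M} k_j}\right) = \frac{(m - 1)R}{\sum_{j \in M} k_j}. \tag{18}$$

As a result, for $i \notin M$ and $i \in M$, respectively,

$$S_i = T = \frac{(m - 1)R}{\sum_{j \in M} k_j}, \tag{19}$$

$$S_i = T - t_i^* = \frac{(m - 1)R}{\sum_{j \in M} k_j} - \frac{(m - 1)R}{\sum_{j \in M} k_j}\left(1 - \frac{(m - 1)k_i}{\sum_{j \in M} k_j}\right) = \frac{(m - 1)^2 R\, k_i}{\left(\sum_{j \in M} k_j\right)^2}. \tag{20}$$

Thus, by substituting (19) and (20) in $J_{ij} = \frac{1}{2}\sqrt{\frac{R}{k_i S_i}} - 1$, we obtain (17).

Now, we show that $J_{ij} \le -0.5$ for all $i \notin M$. According to Algorithm 1, for all $i \notin M$, $k_i \ge \frac{\sum_{l \in M} k_l}{m - 1}$. Thus

$$J_{ij} = \frac{1}{2}\sqrt{\frac{\sum_{l \in M} k_l}{k_i (m - 1)}} - 1 \le 0.5 - 1 = -0.5.$$

Also, we need to show that $-0.5 \le J_{ij} \le 0$ for all $i \in M$. In particular, we have

$$\frac{\sum_{l \in M} k_l}{2 k_i (m - 1)} < \frac{(m - 1)(k_1 + k_2)}{2 k_i (m - 1)} \le \frac{k_2}{k_i} \le 1 \tag{21}$$

for all $i \in M$ except $i = 1$ (because $k_1 \le k_2$). However, based on the assumption in the statement of the theorem, (21) also holds for $i = 1$. Also, from Algorithm 1, we know that $k_1 \le \dots \le k_m \le \frac{\sum_{j \in M} k_j}{m - 1}$. As a result,

$$-0.5 \le \frac{\sum_{l \in M} k_l}{2 k_i (m - 1)} - 1 = J_{ij}. \tag{22}$$

Hence, in summary, we have

$$\begin{cases} J_{ij} \le -0.5, & i \notin M \\ -0.5 \le J_{ij} \le 0, & i \in M \end{cases} \tag{23}$$

Now, we review the Weyl inequalities [?] on the relations between the eigenvalues of matrices $A$, $B$ and $A + B$. In particular, suppose the eigenvalues are ordered as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$; then

$$\lambda_{i+j-1}(A + B) \le \lambda_i(A) + \lambda_j(B) \tag{24}$$

for $i, j \ge 1$ and $i + j - 1 \le n$. Using the above property, we can write the matrix $J$ as the sum of two matrices $J^e$ and $J^d$, where $J^e_{ij} = \frac{1}{2}\sqrt{\frac{R}{k_i S_i}} - 1$ for all $j$ and $J^d$ is a diagonal matrix with $J^d_{ii} = 1 - \frac{1}{2}\sqrt{\frac{R}{k_i S_i}}$.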

Since matrix $J^e$ has equal elements in each row, the smallest eigenvalue of $J^e$ is the sum of the elements in a column. Also, the greatest eigenvalue of $J^d$ corresponds to a non-survival player. Now, because there are at least three non-survival players, i.e. $m < n - 2$, we have

$$\lambda_n(J) \le \lambda_n(J^e) + \lambda_1(J^d) \le -1 \tag{25}$$

Hence $\rho(J) \ge |\lambda_n(J)| \ge 1$, and the dynamics is not stable under the simultaneous move BR dynamics.

Proof of Theorem 2: We show that $\rho(J_i) = 1$ for any $J_i \in \mathcal{J}$. In particular, $J_i$ is an idempotent matrix, i.e. $J_i J_i = J_i$. As a result, all of its eigenvalues $\lambda_j$ are 0 or 1. Also, we know that $\sum_j \lambda_j = \operatorname{tr}(J_i) = n - 1$. Hence, $J_i$ has $n - 1$ eigenvalues equal to 1 and one eigenvalue equal to 0, resulting in $\rho(J_i) = 1$. Now, from the inequality in (14), $1 \le \rho(\mathcal{J})$. As a result, the switched linear system is not stable.

REFERENCES

[1] O. Brun, B. J. Prabhu, and T. Seregina. On the convergence of the best-response algorithm in routing games. In Proceedings of the 7th International Conference on Performance Evaluation Methodologies and Tools, pages 136-144. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2013.
[2] Michal Feldman, Kevin Lai, and Li Zhang. A price-anticipating resource allocation mechanism for distributed shared clusters. In Proceedings of the 6th ACM Conference on Electronic Commerce, pages 127-136. ACM, 2005.
[3] Drew Fudenberg and Jean Tirole. Game Theory. MIT Press, Cambridge, MA, page 86, 1991.
[4] Richard A. Holmgren. A First Course in Discrete Dynamical Systems. Universitext. Springer, New York, Berlin, Paris, 1996.
[5] Luis G. Jaimes, Idalides J. Vergara-Laurens, and Miguel A. Labrador. A location-based incentive mechanism for participatory sensing systems with budget constraints. In PerCom, pages 103-108, 2012.
[6] Ramesh Johari and John N. Tsitsiklis. Efficiency loss in a network resource allocation game. Mathematics of Operations Research, 29(3):407-435, 2004.
[7] Iordanis Koutsopoulos. Optimal incentive-driven design of participatory sensing systems. In Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14-19, 2013, pages 1402-1410, 2013.
[8] Juong-Sik Lee and Baik Hoh. Dynamic pricing incentive for participatory sensing. Pervasive and Mobile Computing, 6(6):693-708, 2010.
[9] Peng Li and Song Guo. Incentive mechanisms for device-to-device communications. IEEE Network, 29(4):75-79, July 2015.
[10] K. L. Mak, J. G. Peng, Z. B. Xu, and K. F. C. Yiu. A new stability criterion for discrete-time neural networks: Nonlinear spectral radius. Chaos, Solitons and Fractals, 31(2):424-436, 2007.
[11] Wenqian Nan, Bin Guo, Shenlong Huangfu, Zhiwen Yu, and Huihui Chen. A cross-space, multi-interaction-based dynamic incentive mechanism for mobile crowd sensing. 2014.
[12] Thanh Nguyen and Milan Vojnovic. Weighted proportional allocation. In Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, pages 173-184. ACM, 2011.
[13] Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay V. Vazirani. Algorithmic Game Theory, volume 1. Cambridge University Press, Cambridge, 2007.
[14] Jia Peng, Yanmin Zhu, Wei Shu, and Min-You Wu. How multiple crowdsourcers compete for smartphone contributions?
[15] Hong Nhat Pham, Back Sun Sim, and Hee Yong Youn. A novel approach for selecting the participants to collect data in participatory sensing. In SAINT, pages 50-55, 2011.
[16] Jörgen W. Weibull. Evolutionary Game Theory. MIT Press, 1997.
[17] D. Yang, G. Xue, X. Fang, and J. Tang. Incentive mechanisms for crowdsensing: Crowdsourcing with smartphones. IEEE/ACM Transactions on Networking, 24(3):1732-1744, June 2016.
[18] Dejun Yang, Guoliang Xue, Xi Fang, and Jian Tang. Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing. In The 18th Annual International Conference on Mobile Computing and Networking, Mobicom '12, Istanbul, Turkey, August 22-26, 2012, pages 173-184, 2012.
[19] Shusen Yang, Usman Adeel, and Julie A. McCann. Selfish mules: Social profit maximization in sparse sensornets using rationally-selfish human relays. IEEE Journal on Selected Areas in Communications, 31(6):1124-1134, 2013.