Mobility-Driven Service Migration in Mobile Micro-Clouds

Shiqiang Wang, Rahul Urgaonkar, Murtaza Zafer, Kevin Chan, Ting He, and Kin K. Leung
Imperial College London; IBM T. J. Watson Research Center; Samsung Research America; Army Research Laboratory

Abstract: Tactical networks are highly constrained in communication and computation resources, particularly at the edge. This limitation can be effectively addressed by the emerging technology of mobile micro-clouds, which aims to provide seamless computing/data access at the edge of such networks. Mobile micro-clouds can be deployed to deliver critical, timely, and mission-relevant situational awareness to end users in highly dynamic environments. Unlike the traditional cloud, a mobile micro-cloud is smaller and deployed closer to users, typically attached to a fixed or mobile basestation (BS) deployed in the field. Due to the relatively small coverage area of each BS, when a user moves across areas covered by different BSs, the issues of service performance and service migration become important. In this work, we consider such migration issues. We model the general problem as a Markov decision process (MDP). To reduce the state space, we propose a distance-based MDP approximation to the general model, which is exact for 1-dimensional (1-D) uniform user mobility and approximates 2-dimensional (2-D) uniform user mobility with a constant error bound. For the distance-based MDP, we provide closed-form solutions for the cost values under a given service migration policy, and then propose an algorithm that finds the optimal policy based on the closed-form expression of the cost. Simulations with random walk mobility models and real-world mobility traces confirm the effectiveness of the proposed approach.

Index Terms: Cloud technologies, edge computing, Markov decision process (MDP), mobility, optimization, wireless networks

I.
INTRODUCTION

Tactical networks are expected to provide soldiers efficient and effective computing capability at the tactical edge. As described by the DoD Cloud Computing Strategy [1], there is an increasing need for enhanced information services and associated infrastructure, and mobile cloud computing can enhance mission performance by providing warfighters on-demand data and services in dynamic network environments. Typical cloud applications are comprised of a front-end component running on the mobile device carried by the soldier, while the back-end component (which performs data processing and computation) typically runs on the cloud. While this architecture enables applications to take advantage of the on-demand feature of cloud computing, it also introduces new challenges in the form of increased network overhead and latency. A promising approach to address these challenges is to move such computation closer to the network edge. It is envisioned that entities (such as fixed or mobile basestations) deployed in the field closer to the network edge would host smaller-sized cloud-like infrastructure distributed across the network, as shown in Fig. 1. This idea has been variously termed Edge Computing [2], Cloudlets [3], and Follow Me Cloud [4], to name a few. In this work, we use the term Mobile Micro-Cloud (MMC) for this distributed cloud-like infrastructure. The trend towards MMCs is expected to accelerate as the requirements of soldiers at the tactical edge increase. Besides, MMCs can also improve the robustness of cloud computing systems [3].

Note: This paper has been submitted for future publication. Please do not publicly disclose its contents until further notice. Contributions of the author to this work are related to his previous employment at IBM T. J. Watson Research Center.

Figure 1. Application scenario of mobile micro-clouds (MMCs).
One of the key design issues in MMCs is service migration: should a service currently running in one of the MMCs be migrated as the user location changes, and if so, where? This question stems from the basic tradeoff between the cost of service migration and the reduction in network overhead and latency achieved by migration. While conceptually simple, it is challenging to make this decision in an optimal manner because of the uncertainty in node mobility as well as possible non-linearity of the migration and transmission costs. Most existing work on service migration focuses on traditional cloud environments without explicitly considering the impact of user mobility. However, user mobility becomes a key factor in MMCs. The performance of MMCs in the presence of user mobility is studied in [5], but decisions on whether and where to migrate the service are not considered. A preliminary work on mobility-driven service migration based on Markov decision processes (MDPs) is given in [6], which mainly follows standard solution procedures and does not consider two-dimensional (2-D) mobility patterns. The MDP framework has also been used to study optimal handover policies in the context of cellular networks. However, the notion of service migration is quite different from a cellular handover, because a user may continue to receive service from an MMC even if it is no longer associated with that basestation. In this paper, similar to [6], we study a sequential decision-making problem for service migration in MMCs using the MDP framework. However, this paper provides substantial novel contributions, described as follows. Our formulation captures general cost models and provides a mathematical framework to design optimal service migration policies. We note that the resulting problem becomes difficult to solve due to the large state space. In order to overcome this challenge, we propose an approximation of the underlying
state space where we define the states as the distance between the user and service locations¹. This approximation becomes exact for the special case of uniform one-dimensional (1-D) mobility², and we prove several structural properties of the distance-based MDP, including a closed-form solution for the discounted sum cost. We leverage these properties to propose a new algorithm for computing the optimal policy, which reduces the complexity compared with traditional methods based on value or policy iteration [7, Section 6]. Our approach of approximating the state space with the distance becomes inexact for 2-D mobility models. However, we show that for uniform 2-D mobility, the resulting error remains bounded by a constant. Further, this approach allows us to reduce the 2-D problem to a distance-based problem, and it enables the use of results from our distance-based analysis to find an efficient service migration policy for 2-D mobility. Simulation results comparing our approximation approach to the optimal solution suggest that it performs very close to optimal, and the proposed approach is significantly faster than solving the 2-D problem directly. Finally, we describe how to implement our algorithms in practice. Simulations are performed on a set of real-world mobility traces and the results are compared with several baseline strategies that include myopic, never-migrate, and always-migrate policies. It is shown that the proposed approach offers significant gains over those baseline approaches. Our main contributions are summarized as follows: We introduce a general MDP-based formulation of the service migration problem. A simplified MDP where the states represent the distances between the user and the service is presented. For this model, we present a closed-form solution when the service migration policy is given, and then use this result to propose an algorithm to find the optimal policy.
We use the distance-based MDP to approximate the 2-D problem, and show the efficiency of this approximation both analytically and numerically. We present the procedure for applying the proposed algorithms in practice, and perform simulations with real mobility traces and compare with baseline approaches. The remainder of this paper is organized as follows. Section II describes the problem formulation. The distance-based MDP model and its optimal policy are discussed in Section III. Section IV focuses on using the distance-based MDP to solve problems with 2-D mobility. Section V discusses real-world applications, and Section VI draws conclusions.

II. PROBLEM FORMULATION

Consider a mobile user that accesses a cloud-based service hosted on the MMCs. The set of possible user locations is given by L, where L is assumed to be finite (but arbitrarily large). We consider a time-slotted model where the user's location remains fixed for the duration of one slot and changes from one slot to the next according to a Markovian mobility model. The time-slotted model can be regarded as a sampled version of a continuous-time model, and the sampling can be performed either at equal intervals over time or right after each cellular handover instance. In addition, we assume that each location l ∈ L is associated with an MMC that can host the service for the user. The locations in L are represented as 2-D vectors, and there exists a distance metric ||l_1 − l_2|| that can be used to calculate the distance between locations l_1 and l_2.

¹Throughout this paper, by user location we mean the location of the basestation that the user is associated with. ²The 1-D mobility is an important practical scenario often encountered in transportation networks, such as vehicles moving along a road.

Figure 2. Timing of the proposed service migration mechanism.
Note that the distance metric need not be the Euclidean distance. An example of this model is a cellular network in which the user's location is taken as the location of its current basestation and the MMCs are co-located with the basestations. As shown in Section IV, these locations can be represented as pairs (i, j) with respect to an origin (0, 0), and the distance between any two locations can be calculated. We denote the user and service locations at timeslot t as u(t) and h(t), respectively.

A. Control Decisions and Costs

At the beginning of each slot, the MMC controller can choose one of the following control options:

1) Migrate the service from location h(t) to some other location h'(t) ∈ L. This incurs a cost c_m(x) that is assumed to be a non-decreasing function of x, where x is the distance between h(t) and h'(t), i.e., x = ||h(t) − h'(t)||. Once the migration is completed, the system operates under state (u(t), h'(t)). We assume that the time to perform migration is negligible compared to the time-scale of node mobility (as shown in Fig. 2).

2) Do not migrate the service. In this case, we have h'(t) = h(t) and the migration cost is c_m(0) = 0.

In addition to the migration cost, there is a transmission cost incurred by the user for connecting to the currently active service instance. The transmission cost is related to the distance between the service and the user after possible migration, and it is defined as a general non-decreasing function c_d(x), where x = ||u(t) − h'(t)||. We set c_d(0) = 0.

B. Performance Objective

Let us denote the overall system state at the beginning of each timeslot (before possible migration) by s(t) = (u(t), h(t)). The state s(t) is named the initial state of slot t. Consider any policy π that makes control decisions based on the state s(t) of the system, and use a_π(s(t)) to represent the control action taken when the system is in state s(t). This action causes the system to transition to a new intermediate state s'(t) = (u(t), h'(t)) = a_π(s(t)).
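The per-slot accounting in Section II-A can be sketched directly. The distance metric and cost shapes below are illustrative placeholders for a 1-D example, not the paper's parameters:

```python
def one_slot_cost(u, h, h_new, c_m, c_d, dist):
    """One-timeslot cost: migration cost over the distance the service
    moves, plus transmission cost over the resulting user-service
    distance (after possible migration)."""
    return c_m(dist(h, h_new)) + c_d(dist(u, h_new))

# Illustrative 1-D placeholders (not the paper's cost parameters):
dist = lambda a, b: abs(a - b)
c_m = lambda x: 0 if x == 0 else 1 + x   # non-decreasing, c_m(0) = 0
c_d = lambda x: 2 * x                    # non-decreasing, c_d(0) = 0
```

In this snapshot with the user at 4 and the service at 0, migrating all the way to the user costs c_m(4) + c_d(0) = 5, while not migrating costs c_m(0) + c_d(4) = 8, so migration is cheaper for this single slot.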
We also use C_{a_π}(s(t)) to denote the sum of migration and transmission costs incurred by a control a_π(s(t)) in slot t, i.e.,

  C_{a_π}(s(t)) = c_m(||h(t) − h'(t)||) + c_d(||u(t) − h'(t)||).

Starting from any initial state s(0) = s_0, the long-term expected
discounted sum cost incurred by policy π is given by

  V_π(s_0) = lim_{t→∞} E[ Σ_{τ=0}^{t} γ^τ C_{a_π}(s(τ)) | s(0) = s_0 ]   (1)

where 0 < γ < 1 is a discount factor. Our objective is to design a control policy that minimizes the long-term expected discounted sum total cost starting from any initial state, i.e.,

  V*(s_0) = min_π V_π(s_0),  ∀ s_0   (2)

This problem falls within the class of MDPs with infinite-horizon discounted cost. It is well known that the optimal solution is given by a stationary policy and can be obtained as the unique solution to Bellman's equation:

  V*(s_0) = min_a { C_a(s_0) + γ Σ_{s_1 ∈ L×L} P_{a(s_0), s_1} V*(s_1) }   (3)

where P_{a(s_0), s_1} denotes the probability of transitioning from state s'(0) = a(s_0) to s(1) = s_1. Note that the intermediate state s'(t) has no randomness when s(t) and a(·) are given; thus, we only consider the transition probability from s'(t) to the next state s(t+1) in (3). Also note that we always have h(t+1) = h'(t).

C. Characteristics of Optimal Policy

We next characterize some structural properties of the optimal solution. The following lemma states that it is not optimal to migrate the service to a location that is farther away from the user.

Lemma 1. Let a*(s) = (u, h') denote the optimal action at any state s = (u, h). Then, we have ||u − h'|| ≤ ||u − h||. (If the optimal action is not unique, then there exists at least one such optimal action.)

Proof. See [8].

Corollary 1. If c_m(x) and c_d(x) are both constants (possibly of different values) for x > 0, and c_m(0) < c_m(x) and c_d(0) < c_d(x) for x > 0, then migrating to locations other than the current location of the mobile user is not optimal.

Proof. See [8].

D. Simplifying the Search Space

Lemma 1 simplifies the search space for the optimal policy considerably. However, it is still very challenging to derive the optimal control policy for the general model presented above, particularly when the state space {s(t)} is large.
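In principle, (3) can be attacked directly with standard value iteration. The generic sketch below (with a purely hypothetical one-state instance in the usage note) shows the shape of that computation, and also why a large state space makes it expensive: every sweep touches every state:

```python
def value_iteration(states, actions, transitions, cost, gamma, tol=1e-9):
    """Generic value iteration: repeatedly apply the Bellman operator
    of (3) until the discounted sum cost values converge.

    transitions(s, a) returns {next_state: probability};
    cost(s, a) is the one-timeslot cost of action a in state s."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {}
        for s in states:
            V_new[s] = min(
                cost(s, a)
                + gamma * sum(p * V[s1] for s1, p in transitions(s, a).items())
                for a in actions(s)
            )
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new
```

For a toy single-state MDP with one action, per-slot cost 1, and γ = 0.5, this converges to V = 1/(1 − γ) = 2.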
One possible approach to address this challenge is to re-define the state space to represent only the distance between the user and service locations, d(t) = ||u(t) − h(t)||. The motivation for this comes from the observation that the cost functions in our model depend only on the distance. Note that, in general, the optimal control actions can be different for two states s_0 and s_1 that have the same user-service distance. However, it is reasonable to use the distance as an approximation of the state space for many practical scenarios of interest, and this simplification allows us to formulate a far more tractable MDP. We discuss the distance-based MDP in the next section, and show how the results on the distance-based MDP can be applied to 2-D mobility models and real-world mobility traces in Sections IV and V.

Figure 3. Distance-based MDP with the distances {d(t)} (before possible migration) as states 0, 1, ..., N; the distance increases by one with probability p (p_0 from state 0), decreases by one with probability q, and stays unchanged with probability 1 − p − q. In this example, migration is only performed at state N (with a(N) = 1). The solid lines denote state transitions without migration.

In the remainder of this paper, where there is no ambiguity, we reuse the notations P, C_a(·), V(·), and a(·) to respectively represent transition probabilities, one-timeslot costs, discounted sum costs, and actions of the different MDPs.

III. OPTIMAL POLICY FOR DISTANCE-BASED MDP

In this section, we consider a distance-based³ MDP with state transitions as shown in Fig. 3, where the states represent the user-service distances {d(t)} before possible migration. We define the parameter N as an application-specific maximum allowed distance, and we always perform migration when d(t) ≥ N. We set the actions a(d(t)) = a(N) for d(t) > N, so that we only need to focus on the states d(t) ∈ [0, N]. After taking action a(d(t)), the system operates in the intermediate state d'(t) = a(d(t)), and the value of the next state d(t+1) follows the transition probability P_{d'(t), d(t+1)}, which is related to the mobility model of the user.
We assume that the actual migration distance is ||h(t) − h'(t)|| = |d(t) − d'(t)|, which can be satisfied in the 2-D model with a properly chosen distance metric, as will be discussed in Section IV. The one-timeslot cost is then C_a(d(t)) = c_m(|d(t) − d'(t)|) + c_d(d'(t)). The most straightforward approach to solving for the optimal policy of an MDP is value iteration [7, Section 6], which finds the optimal policy from Bellman's equation (3) iteratively. This approach, although easy to understand, may require a large number of iterations before converging to the optimal result. An improved method is policy iteration, which generally requires fewer iterations than value iteration because, in each iteration, it finds the exact values of the discounted sum cost V(d_0), where d_0 = d(0), for the policy resulting from the previous iteration, and performs the iteration based on the exact V(d_0) values. However, in general, the exact V(d_0) values are found by solving a system of linear equations, which has a complexity of O(N³) when using Gaussian elimination. To simplify the solution, we restrict the transition probabilities P_{d'(t), d(t+1)} according to the parameters p_0, p, and q as shown in Fig. 3. Such a restriction is sufficient when the underlying mobility model is a uniform 1-D random walk, in which case p_0 = p = q. It is also sufficient to approximate the uniform 2-D random walk model, as will be discussed in Section IV. Besides, we define the cost functions c_m(x) and c_d(x) in a constant-plus-exponential form. Such a form of cost function can take different shapes and is thus applicable to many realistic scenarios, as shown in Section III-A.

³We assume that the distance is quantized, as will be the case with the 2-D model discussed in later sections.
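For concreteness, the O(N³) Gaussian-elimination policy evaluation mentioned above (the baseline that the closed-form solution of Section III-B replaces) can be sketched for the chain of Fig. 3. This is a sketch under the stated assumptions (a(d) ≤ d per Lemma 1, a(N) < N); the cost functions in the usage note are placeholders:

```python
def evaluate_policy(a, N, p0, p, q, gamma, c_m, c_d):
    """Evaluate a given policy a (a list with a[d] <= d) on the
    distance-based chain of Fig. 3 by solving (I - gamma*P) V = C
    with Gaussian elimination; the chain moves up w.p. p (p0 at
    state 0) and down w.p. q from the intermediate state a[d]."""
    n = N + 1
    C = [c_m(d - a[d]) + c_d(a[d]) for d in range(n)]
    M = [[(1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    for d in range(n):
        dp = a[d]  # intermediate state after (possible) migration
        if dp == 0:
            M[d][0] -= gamma * (1.0 - p0)
            M[d][1] -= gamma * p0
        else:
            M[d][dp - 1] -= gamma * q
            M[d][dp] -= gamma * (1.0 - p - q)
            M[d][min(dp + 1, N)] -= gamma * p  # mass leaving [0, N] kept at N
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        C[col], C[piv] = C[piv], C[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n):
                M[r][j] -= f * M[col][j]
            C[r] -= f * C[col]
    V = [0.0] * n
    for r in range(n - 1, -1, -1):
        V[r] = (C[r] - sum(M[r][j] * V[j] for j in range(r + 1, n))) / M[r][r]
    return V
```

For the always-migrate-to-0 policy on N = 2 with γ = 0.9, p_0 = 0.3, c_m(x) = x, and c_d ≡ 0, the balance equations give V = [2.7, 3.7, 4.7], which the solver reproduces.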
Remark: Under the above model specifications, we show that for the distance-based MDP, the exact V(d) values can be found with O(N) time-complexity. Besides reducing the computational complexity, the solution is also a closed-form solution to V(d) for a given policy, which has analytical importance and can provide insights on the properties of V(d) for further study. Based on this result, we propose an algorithm for finding the optimal policy of the distance-based MDP.

A. Constant-Plus-Exponential Cost Functions

The constant-plus-exponential cost functions are defined as

  c_m(x) = { 0,              if x = 0
           { β_c + β_l μ^x,  if x > 0   (4)

  c_d(x) = { 0,              if x = 0
           { δ_c + δ_l ν^x,  if x > 0   (5)

where β_c, β_l, δ_c, δ_l, μ, and ν are real-valued parameters. The values of these parameters are selected such that c_m(x) ≥ 0, c_d(x) ≥ 0, and both c_m(x) and c_d(x) are non-decreasing in x for x ≥ 0. Explicitly, we have μ ≥ 0; β_l ≥ 0 when μ ≥ 1; β_l ≤ 0 when μ ≤ 1; β_c ≥ −β_l; ν ≥ 0; δ_l ≥ 0 when ν ≥ 1; δ_l ≤ 0 when ν ≤ 1; and δ_c ≥ −δ_l. With this definition, the values of β_c + β_l and δ_c + δ_l can be regarded as the constant terms of the costs: at least this amount of cost is incurred when x > 0. The parameters μ and ν specify the impact of the distance x on the costs, and their values can be related to the network topology and routing mechanism of the network. The parameters β_l and δ_l further adjust the costs proportionally. The constant-plus-exponential cost function can be used to approximate an arbitrary cost function, as discussed in [8].

B. Closed-Form Solution to the Discounted Sum Cost

1) Problem Formulation with Difference Equations: From (1), we get the following balance equation on the discounted sum cost for a given policy π:

  V_π(d_0) = C_{a_π}(d_0) + γ Σ_{d_1 = a_π(d_0)−1}^{a_π(d_0)+1} P_{a_π(d_0), d_1} V_π(d_1)   (6)

For two different states d_{0,1} and d_{0,2}, if the policy π has actions a_π(d_{0,1}) = d_{0,2} and a_π(d_{0,2}) = d_{0,2}, then we have

  V_π(d_{0,1}) = c_m(|d_{0,1} − d_{0,2}|) + V_π(d_{0,2})   (7)

From now on, we omit the subscript π and write d_0 as d for short.
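The constant-plus-exponential costs (4)-(5) translate directly to code; the parameter values in the usage note below are illustrative, and the transmission-cost base is written nu here to match ν:

```python
def c_m(x, beta_c, beta_l, mu):
    """Migration cost (4): zero when no migration occurs, and
    beta_c + beta_l * mu**x for a migration over distance x > 0."""
    return 0.0 if x == 0 else beta_c + beta_l * mu ** x

def c_d(x, delta_c, delta_l, nu):
    """Transmission cost (5): zero when user and service are
    co-located, and delta_c + delta_l * nu**x for distance x > 0."""
    return 0.0 if x == 0 else delta_c + delta_l * nu ** x
```

With delta_c = 1, delta_l = -1, and nu = 0.8 (the shape used later in Section IV-C), c_d rises from 0 toward the asymptote delta_c = 1 as x grows, so the cost increases with distance but saturates; with delta_l > 0 and nu > 1 it would instead grow exponentially.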
For a given policy, the discounted sum cost V(d) for 0 ≤ d ≤ N can be computed from a set of difference equations [9]. Let n_k (k ≥ 0) denote the series of migration states (i.e., a(n_k) ≠ n_k), where n_k ≤ N. For 0 < d < n_0 or n_{k−1} < d < n_k, we have a(d) = d, and according to (6), we have

  V(d) = δ_c + δ_l ν^d + γ Σ_{d_1 = d−1}^{d+1} P_{d, d_1} V(d_1)   (8)

We can rewrite (8) as

  V(d) = φ_1 V(d−1) + φ_2 V(d+1) + φ_3 + φ_4 ν^d   (9)

where φ_1 ≜ γq / (1 − γ(1 − p − q)), φ_2 ≜ γp / (1 − γ(1 − p − q)), φ_3 ≜ δ_c / (1 − γ(1 − p − q)), and φ_4 ≜ δ_l / (1 − γ(1 − p − q)). This difference equation has characteristic roots

  m_1 = (1 − √(1 − 4φ_1φ_2)) / (2φ_2),  m_2 = (1 + √(1 − 4φ_1φ_2)) / (2φ_2)   (10)

When 0 < γ < 1, we always have m_1 ≠ m_2. The homogeneous version of (9) has general solution

  V_h(d) = A_k m_1^d + B_k m_2^d   (11)

where A_k and B_k are coefficients for the interval [0, n_0] or [n_{k−1}, n_k] (k > 0). To solve the non-homogeneous equation (9), we try a particular solution of the form

  V_p(d) = D + H ν^d     if 1 − φ_1/ν − φ_2 ν ≠ 0
  V_p(d) = D + H d ν^d   if 1 − φ_1/ν − φ_2 ν = 0   (12)

where D and H are constant coefficients. By substituting (12) into (9), we get

  D = φ_3 / (1 − φ_1 − φ_2)   (13)

  H = φ_4 / (1 − φ_1/ν − φ_2 ν)   if 1 − φ_1/ν − φ_2 ν ≠ 0
  H = φ_4 / (φ_1/ν − φ_2 ν)       if 1 − φ_1/ν − φ_2 ν = 0   (14)

Therefore, V(d) can be expressed as follows:

  V(d) = A_k m_1^d + B_k m_2^d + D + H ν^d     if 1 − φ_1/ν − φ_2 ν ≠ 0
  V(d) = A_k m_1^d + B_k m_2^d + D + H d ν^d   if 1 − φ_1/ν − φ_2 ν = 0   (15)

2) Finding the Coefficients: The coefficients A_k and B_k are unknowns that need to be evaluated using additional constraints. These coefficients may be different for states within different intervals [n_{k−1}, n_k]. After these coefficients are determined, (15) holds for all d ∈ [n_{k−1}, n_k]. We assume 1 − φ_1/ν − φ_2 ν ≠ 0 and 1 − φ_1 − φ_2 ≠ 0 from now on; the other cases can be derived in a similar way.

Coefficients for 0 ≤ d ≤ n_0: One constraint comes from the balance equation (6) for d = 0, which is

  V(0) = γ p_0 V(1) + γ(1 − p_0) V(0)   (16)

By substituting (15) into (16), we get

  A_0 (1 − φ_0 m_1) + B_0 (1 − φ_0 m_2) = D(φ_0 − 1) + H(φ_0 ν − 1)   (17)

where φ_0 ≜ γ p_0 / (1 − γ(1 − p_0)). Another constraint is obtained by substituting (15) into (7), which gives

  A_0 (m_1^{n_0} − m_1^{a(n_0)}) + B_0 (m_2^{n_0} − m_2^{a(n_0)}) = β_c + β_l μ^{n_0 − a(n_0)} − H(ν^{n_0} − ν^{a(n_0)})   (18)

The values of A_0 and B_0 can be solved from (17) and (18).

Coefficients for n_{k−1} ≤ d ≤ n_k (k > 0): Assume that we have found V(d) for all d ≤ n_{k−1}.
By letting d = n_{k−1} in (15), we have the first constraint, given by

  A_k m_1^{n_{k−1}} + B_k m_2^{n_{k−1}} = V(n_{k−1}) − D − H ν^{n_{k−1}}   (19)

For the second constraint, we consider two cases. If a(n_k) ≤ n_{k−1}, then

  A_k m_1^{n_k} + B_k m_2^{n_k} = β_c + β_l μ^{n_k − a(n_k)} + V(a(n_k)) − D − H ν^{n_k}   (20)

If n_{k−1} < a(n_k) ≤ n_k − 1, then

  A_k (m_1^{n_k} − m_1^{a(n_k)}) + B_k (m_2^{n_k} − m_2^{a(n_k)}) = β_c + β_l μ^{n_k − a(n_k)} − H(ν^{n_k} − ν^{a(n_k)})   (21)
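The constants in (9)-(14) are straightforward to compute; the sketch below handles the non-resonant case (1 − φ_1/ν − φ_2ν ≠ 0), and the sample parameters in the usage note are illustrative only:

```python
import math

def difference_eq_constants(p, q, delta_c, delta_l, nu, gamma):
    """Constants of the difference equation (9) and its solution:
    characteristic roots m1, m2 of (10) and particular-solution
    coefficients D, H of (13)-(14), non-resonant case."""
    denom = 1.0 - gamma * (1.0 - p - q)
    phi1 = gamma * q / denom
    phi2 = gamma * p / denom
    phi3 = delta_c / denom
    phi4 = delta_l / denom
    disc = math.sqrt(1.0 - 4.0 * phi1 * phi2)  # real and positive for 0 < gamma < 1
    m1 = (1.0 - disc) / (2.0 * phi2)
    m2 = (1.0 + disc) / (2.0 * phi2)
    D = phi3 / (1.0 - phi1 - phi2)
    H = phi4 / (1.0 - phi1 / nu - phi2 * nu)
    return phi1, phi2, phi3, phi4, m1, m2, D, H
```

As a sanity check, m_1 and m_2 satisfy the characteristic equation φ_2 m² − m + φ_1 = 0 (so A m_1^d + B m_2^d solves the homogeneous recurrence), and D + H ν^d satisfies the full recurrence (9) term by term, matching (15).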
Algorithm 1 Modified policy iteration algorithm based on difference equations
1:  Initialize a(d) = 0 for all d = 0, 1, 2, ..., N
2:  Find the constants φ_0, φ_1, φ_2, φ_3, φ_4, m_1, m_2, D, and H
3:  repeat
4:    k ← 0
5:    for d = 1 ... N do
6:      if a(d) ≠ d then
7:        n_k ← d
8:        k ← k + 1
9:    for all n_k do
10:     if k = 0 then
11:       Solve for A_0 and B_0 from (17) and (18)
12:       Find V(d) with 0 ≤ d ≤ n_0 from (15) with A_0 and B_0 found above
13:     else if k > 0 then
14:       if a(n_k) ≤ n_{k−1} then
15:         Solve for A_k and B_k from (19) and (20)
16:       else
17:         Solve for A_k and B_k from (19) and (21)
18:       Find V(d) with n_{k−1} < d ≤ n_k from (15) with A_k and B_k found above
19:   for d = 1 ... N do
20:     a_prev(d) ← a(d)
21:     a(d) ← argmin_{a ≤ d} { C_a(d) + γ Σ_{j=a−1}^{a+1} P_{a,j} V(j) }
22: until a_prev(d) = a(d) for all d
23: return a(d) for all d

The values of A_k and B_k can be solved from (19) together with either (20) or (21).

3) Solution is in Closed-Form: We note that A_0 and B_0 can be expressed in closed-form, and A_k and B_k for all k can also be expressed in closed-form by substituting (15) into (19)-(21) where necessary. Therefore, (15) is a closed-form solution for all d ∈ [0, N]. Numerically, we can find V(d) for all d ∈ [0, N] in O(N) time.

C. Algorithm for Finding the Optimal Policy

We propose a modified policy iteration approach to find the optimal policy. The proposed approach uses the above result instead of Gaussian elimination to compute V(d), and it also only checks for migrating to lower states or not migrating (according to Lemma 1). The algorithm is shown in Algorithm 1, where Lines 4-8 find the values of n_k, Lines 9-18 find the discounted sum cost values, and Lines 19-21 update the optimal policy. The overall complexity for each iteration of Algorithm 1 is O(N²), which reduces the complexity because standard⁴ policy iteration has complexity O(N³), and the standard value iteration approach does not compute the exact value function in each iteration and generally has a long convergence time.

IV.
APPROXIMATE SOLUTION FOR 2-D MOBILITY MODEL

In this section, we show that the distance-based MDP can be used to find a near-optimal service migration policy when the user conforms to a uniform 2-D random walk mobility model on an infinite space. This mobility model can be used to approximate real-world mobility traces, as shown in Section V. We consider a hexagonal cell structure in this section, but the approximation procedure can also be used for other 2-D mobility models (such as the Manhattan grid) with some parameter changes. The user is assumed to transition to one of its neighboring cells at the beginning of each timeslot with probability r, and to stay in the same cell with probability 1 − 6r. Define the offset of the user from the service as a 2-D vector e(t) = u(t) − h(t) (recall that u(t) and h(t) are also 2-D vectors). Due to the symmetry of the mobility model, it is sufficient to consider the offset state space {e(t)} rather than the full state space {s(t)}. The distance metric ||l_1 − l_2|| is defined as the minimum number of hops needed to reach cell l_2 from cell l_1 on the hexagon model. We name the states with the same value of ||e(t)|| a ring, and express the states {e(t)} with polar indexes (i, j), where the first index i refers to the ring index and the second index j refers to each of the states within the ring, as shown in Fig. 4. For e(t) = (i, j), we have ||e(t)|| = i. If u(t) = h(t) (i.e., the actual user and service locations (cells) are the same), then we have e(t) = (0, 0) and ||e(t)|| = 0. Similarly to the distance-based MDP, we assume in the 2-D MDP that we always migrate when ||e(t)|| ≥ N, where N is a design parameter, and we only consider the state space {e(t)} with ||e(t)|| ≤ N. The system operates in the intermediate state e'(t) = u(t) − h'(t) = a(e(t)) after taking action a(e(t)). The next state e(t+1) is determined probabilistically according to the transition probability P_{e'(t), e(t+1)}.

⁴We use the term standard here to distinguish from the modified policy iteration mechanism proposed in Algorithm 1.
We have P_{e'(t), e(t+1)} = 1 − 6r when e(t+1) = e'(t); P_{e'(t), e(t+1)} = r when e(t+1) is a neighbor of e'(t); and P_{e'(t), e(t+1)} = 0 otherwise. Note that we always have e(t) − e'(t) = h'(t) − h(t), so the one-timeslot cost is C_a(e(t)) = c_m(||e(t) − e'(t)||) + c_d(||e'(t)||). We note that, even after simplification with the offset model, the 2-D offset-based MDP has a significantly larger number of states than the distance-based MDP, because for a distance-based model with N states (excluding state zero), the 2-D offset model has M = 3N² + 3N states (excluding state (0, 0)). Therefore, we use the distance-based MDP proposed in Section III to approximate the 2-D offset-based MDP, which significantly reduces the computation time, as shown in Section IV-C.

A. Approximation Method

In the approximation, the parameters of the distance-based MDP are chosen as p_0 = 6r, p = 2.5r, and q = 1.5r. The intuition behind this parameter choice is that, at state (i_0, j_0) = (0, 0) in the 2-D MDP, the aggregate probability of transitioning to any state in ring i_1 = 1 is 6r, so we set p_0 = 6r; at any other state (i_0, j_0) ≠ (0, 0), the aggregate probability of transitioning to any state in the higher ring i_1 = i_0 + 1 is either 2r or 3r, and the aggregate probability of transitioning to any state in the lower ring i_1 = i_0 − 1 is either r or 2r, so we set p and q to the median values of these transition probabilities. To find the optimal policy for the 2-D MDP, we first find the optimal policy for the distance-based MDP with the parameters defined above. Then, we map the optimal policy
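The parameter mapping of Section IV-A and the offset-state count are one-liners; a minimal sketch:

```python
def approx_chain_params(r):
    """Distance-based MDP parameters approximating the uniform 2-D
    hexagon random walk (Section IV-A): p0 aggregates the 6 neighbors
    of the center; p and q are the medians of the per-state aggregate
    probabilities of moving to the higher and lower ring."""
    return {"p0": 6.0 * r, "p": 2.5 * r, "q": 1.5 * r}

def num_offset_states(N):
    """Number of states in the 2-D offset model within N rings,
    excluding (0, 0): ring i contains 6i states."""
    return 3 * N ** 2 + 3 * N
```

For N = 10, the 2-D model has 330 offset states versus 10 distance states, which is the gap the distance-based approximation exploits.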
Figure 4. Example of the 2-D offset model on hexagon cells, where N = 3; the states are indexed (i, j) from the center (0, 0) outward to ring 3.

from the distance-based MDP to a policy for the 2-D MDP. To explain this mapping, we note that, in the 2-D hexagon offset model, there always exists at least one shortest path from any state (i, j) to ring i_1; the length of this shortest path is i − i_1, and each ring between i_1 and i is traversed once on the shortest path. When the system is in state (i, j) and the optimal action from the distance-based MDP is a*(i) = i_1, we perform migration along a shortest path from (i, j) to ring i_1. If there exist multiple shortest paths, one path is chosen arbitrarily. For example, if a(3) = 2 in the distance-based MDP, then the corresponding 2-D action migrates the state (3, j) to one of its ring-2 neighbors. With this mapping, the one-timeslot cost C_a(d(t)) for the distance-based MDP and the one-timeslot cost C_a(e(t)) for the 2-D MDP are the same.

B. Error Bound

Error in the approximation arises because the transition probabilities in the distance-based MDP are not exactly the same as those in the 2-D MDP (there is at most a difference of 0.5r). In this subsection, we study the difference in the discounted sum costs when using the policy obtained from the distance-based MDP versus the true optimal policy for the 2-D MDP. The result is summarized in Proposition 1.

Proposition 1. Let V_dist(e_0) denote the discounted sum cost when using the policy from the distance-based MDP, and let V*(e_0) denote the discounted sum cost when using the true optimal policy of the 2-D MDP. Then we have V_dist(e_0) − V*(e_0) ≤ γrk / (1 − γ) for all e_0, where k ≜ max_x {c_m(x) + c_d(x)}.

Proof. (Outline) The proof is completed in three steps.
First, we modify the states of the 2-D MDP in such a way that the aggregate transition probability from any state (i_0, j_0) ≠ (0, 0) to ring i_1 = i_0 + 1 (correspondingly, i_1 = i_0 − 1) is 2.5r (correspondingly, 1.5r). We assume that a given policy is used on both the original and modified 2-D MDPs, and show a bound on the difference in the discounted sum costs of these two MDPs. In the second step, we show that the modified 2-D MDP is equivalent to the distance-based MDP. Intuitively, this is because the modified 2-D MDP has the same transition probabilities as the distance-based MDP when considering only the ring index i, and also because the one-timeslot cost C_a(e(t)) depends only on ||e(t) − a(e(t))|| and ||a(e(t))||, both of which can be determined from the ring indexes of e(t) and a(e(t)). The third step uses the fact that the optimal policy for the distance-based MDP cannot bring a higher discounted sum cost for the distance-based MDP (and hence the modified 2-D MDP) than any other policy. By applying the error bound from the first step twice, we prove the result. For details of the proof, see [8].

The error bound is a constant value when the parameters are given. It increases with γ. However, note that the absolute value of the discounted sum cost also increases with γ, so the relative error can remain low.

C. Numerical Evaluation

The error bound derived in Section IV-B is a worst-case upper bound on the error. In this subsection, we evaluate the performance of the proposed approximation method numerically, focusing on the average performance of the approximation. We consider 2-D random walk mobility with a randomly chosen parameter r. The maximum user-service distance is set as N = 10. The transmission cost function parameters are selected as ν = 0.8, δ_c = 1, and δ_l = −1. With these parameters, we have δ_c + δ_l = 0, which means that there is no constant portion in the cost function.
For the migration cost, we choose μ = 0.8 and fix β_c + β_l = 1 to represent a constant server processing cost for migration. The parameter β_l takes different values in the simulations, to represent different sizes of data to be migrated. The simulations are run with 5 different random seeds. We compare with standard solution methods on the original 2-D model and also with alternative migration policies. The overall results are shown in Fig. 5 for different values of the discount factor γ. It can be observed that the proposed method yields discounted sum costs that are very close to optimal. Meanwhile, the computation time when using the proposed method is only about 0.1% of the computation time of the standard value and policy iteration approaches. For further discussion of the simulation results, see [8].

V. APPLICATION TO REAL-WORLD SCENARIOS

In this section, we discuss how the aforementioned approaches can be applied to service migration in the real world, where multiple users and services are involved.

A. Operating Procedure

In the real-world scenario, we assume that each user follows a sample path of the 2-D hexagon mobility model. The parameter r is estimated from the cell-association history of multiple users. The cost parameters β_c, β_l, μ, δ_c, δ_l, and ν are selected based on the actual application scenario, and their values may vary with the traffic load of the network and the MMC servers. By appropriately adjusting the cost functions according to the load, the proposed approach can also serve the goal of load balancing, and we choose such a cost function in the simulations in Section V-B. The discount factor γ is selected based on the duration of services, and a larger γ is selected for a longer service duration. This is because the discount factor γ determines the amount of time to look ahead: if a user only requires the service for a short time, then there is no need to consider the cost for the long-term future.
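A window-based estimate of r from cell-association history might look as follows; the trace format (one cell ID per sampled timeslot) is an assumption for illustration:

```python
def estimate_r(cell_history):
    """Estimate the hexagon random-walk parameter r from a sampled
    sequence of cell associations (one entry per timeslot).  Under the
    uniform model a user moves to some neighbor w.p. 6r in each slot,
    so r is one sixth of the observed move frequency."""
    moves = sum(1 for prev, cur in zip(cell_history, cell_history[1:]) if cur != prev)
    transitions = len(cell_history) - 1
    return moves / (6.0 * transitions) if transitions > 0 else 0.0
```

In practice the histories of multiple users within the look-back window T_w would be pooled (e.g., summing moves and transitions across users) before dividing, consistent with the assumption that r is the same across the geographical area.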
Figure 5. Simulation results with 2-D random walk: (a) γ = 0.5, (b) γ = 0.9, (c) γ = 0.99. Each subfigure compares policy iteration, value iteration, the optimal (proposed) policy, and the never-migrate, always-migrate, and myopic policies in terms of discounted sum cost and computation time.

Figure 6. Simulation results with real-world traces: (a) average cost per user in each timeslot over a day, where R_t = R_p = 1.5; (b)-(e) average cost reduction compared with the alternative policies (error bars denote the standard deviation); (f) average cost.

1) Initial Service Placement: Upon initialization, the service is placed onto the MMC that is connected to the same basestation as the user is currently connected to. In other words, when a service starts, the user and the service are in the same location, i.e., state (0, 0) of the 2-D MDP. This choice minimizes the initialization cost and the cost of subsequent operation, because the service initially resides closest to the user.

2) Dynamic Service Migration: After the initial service placement, subsequent service migration actions are performed according to the optimal policy found from the mechanisms proposed earlier in this paper. To facilitate the mapping between the real world and the MDP model, we define a timeslot length T which is the same for all MMCs.
The parameter T can be regarded as a protocol parameter, and different MMCs do not need to synchronize their timeslots. We also define a window length T_w > T, which specifies the amount of time to look back when estimating the parameter r. We consider the case where the parameter r is the same across the whole geographical area, which is a reasonable assumption when different locations within the geographical area under consideration have similarities (for example, they all belong to an urban area). More sophisticated cases can be studied in the future. The optimal policy is computed by an MMC controller, and a policy update interval T_u is defined. The operating procedure is as follows. Each MMC obtains the identities of the users associated with the cell connected
to the MMC, at the beginning of each timeslot of length T. Based on this information, the MMC computes the number of users that have left the cell and the total number of users in the cell, and stores this information for each timeslot. At every interval T_u, the MMC controller sends a request to all MMCs to collect the current statistics. After receiving the request, each MMC i computes the empirical probability f_i of users moving out of its cell, based on the statistics on departed and total users within the window T_w. These empirical probabilities are sent, together with other monitored information (such as the current network and server load), to the controller. The controller then computes the average of the empirical probabilities f_i, denoted by f̄, and updates the parameter r = f̄/6. It also computes the transmission and migration cost parameters based on the current system condition. Based on these updated parameters, the controller computes the optimal policy under the current system condition and sends the result to the MMCs.

B. Trace-Driven Simulation

We simulate with real-world mobility traces of taxis in San Francisco [10], [11]. A hexagonal cell structure with 500 m cell separation is assumed, and each taxi is assumed to be connected to the nearest basestation. The parameters for data collection and policy update are set as T = T_u = 60 s and T_w = 3600 s. We choose N = 10, μ = θ = 0.8, and γ = 0.9.

1) Cost definition: It is assumed that the network load is proportional to the number of taxis in operation, and we denote the total amount of transmission bandwidth resource (correspondingly, processing resource at the MMC servers) by a factor R_t (correspondingly, R_p) that is normalized by the maximum number of taxis that operate at the same time.
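A minimal sketch of this load-dependent cost-parameter computation follows; the reciprocal form of the load factors G_t and G_p is our reconstruction of the definition given next, and all identifier names are illustrative assumptions.

```python
def load_factor(m_cur, m_max, R):
    """G = 1/(1 - m_cur/(R*m_max)); grows without bound as the current load
    m_cur approaches the normalized capacity R*m_max (requires m_cur < R*m_max)."""
    return 1.0 / (1.0 - m_cur / (R * m_max))

def cost_params(m_cur, m_max, R_t, R_p):
    """Load-dependent cost parameters with beta_c + beta_l = G_p (constant
    processing part of the migration cost) and delta_c + delta_l = 0 (no
    constant part in the transmission cost)."""
    G_t = load_factor(m_cur, m_max, R_t)
    G_p = load_factor(m_cur, m_max, R_p)
    return {"beta_c": G_p + G_t, "beta_l": -G_t,
            "delta_c": G_t, "delta_l": -G_t}

def cost_reduction(c_proposed, c_other):
    """Cost reduction of the proposed policy relative to an alternative."""
    return (c_other - c_proposed) / c_other

# Hypothetical load snapshot: 300 of at most 500 taxis active, R_t = R_p = 1.5.
params = cost_params(m_cur=300, m_max=500, R_t=1.5, R_p=1.5)
reduction = cost_reduction(46.0, 100.0)
```

Recomputing these parameters at every policy-update interval T_u is what couples the migration policy to the current network and server load.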
Then, we define a quantity G_t (correspondingly, G_p) as G_t = 1/(1 − m_cur/(R_t m_max)) (correspondingly, G_p = 1/(1 − m_cur/(R_p m_max))), where m_cur denotes the number of taxis in operation at the time when the optimal policy is being computed, and m_max denotes the maximum number of taxis that ever operate at the same time. The cost parameters are then defined as β_c = G_p + G_t, β_l = −G_t, δ_c = G_t, and δ_l = −G_t. With such a definition, we have β_c + β_l = G_p and δ_c + δ_l = 0, which means that the constant part of the migration cost is G_p (representing the processing cost for migration) and there is no constant part in the cost for data transmission. We set β_l = δ_l = −G_t because this part of the cost can be regarded as related to data transmission in both c_m(x) and c_d(x). The choice of G_t and G_p can serve the purpose of load balancing [12] and can also represent the delay of both processing and data transmission.

2) Results: The average cost per user in each timeslot (i.e., the C_a(s(t)) values) is collected and shown in Fig. 6. Denoting the cost of the proposed method by C and the cost of the method under comparison by C′, the cost reduction is defined as (C′ − C)/C′. The results show that the proposed approach is beneficial, with cost reductions ranging from 9% to 54% compared with the never-migrate, always-migrate, and myopic policies. The results also show that the costs fluctuate over the day and vary with the amount of available resources, which implies that it is necessary to compute the optimal policy in real time, based on recent observations of the system condition.

VI. CONCLUSIONS

In this paper, we have studied service migration in MMCs. The problem is formulated as an MDP, but its state space can be arbitrarily large. To make the problem tractable, we have reduced the general problem to an MDP that only considers the distance between the user and the service.
The distance-based MDP has several structural properties that allow us to propose an efficient algorithm for finding its optimal policy. We have then shown that the distance-based MDP is a good approximation to scenarios where users move in a 2-D space, which is confirmed by analytical and numerical evaluations as well as by simulations with real-world traces. The results in this paper provide an efficient solution to service migration in MMCs. Further, we envision that the approaches used in this paper can be extended to a range of other problems that share similar properties. The highlights of our approaches include: a closed-form solution for the discounted sum cost of a particular class of MDPs, which can be used to simplify the procedure of finding the optimal policy; a method to approximate an MDP (in a particular class) with one that has a smaller state space, where the approximation error can be bounded analytically; and a method to collect statistics from the real world to serve as parameters of the MDP.

ACKNOWLEDGMENT

This research was sponsored in part by the U.S. Army Research Laboratory and the U.K. Ministry of Defence and was accomplished under Agreement Number W911NF. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

REFERENCES

[1] T. Takai, "Cloud computing strategy," U.S. Department of Defense, Chief Information Officer, Tech. Rep., 2012.
[2] S. Davy, J. Famaey, J. Serrat-Fernandez, J. Gorricho, A. Miron, M. Dramitinos, P. Neves, S. Latre, and E. Goshen, "Challenges to support edge-as-a-service," IEEE Communications Magazine, vol. 52, no. 1, Jan. 2014.
[3] M. Satyanarayanan, G. Lewis, E. Morris, S. Simanta, J. Boleng, and K. Ha, "The role of cloudlets in hostile environments," IEEE Pervasive Computing, vol. 12, no. 4, pp. 40–49, Oct. 2013.
[4] T. Taleb and A. Ksentini, "Follow me cloud: interworking federated clouds and distributed mobile networks," IEEE Network, vol. 27, no. 5, pp. 12–19, Sept. 2013.
[5] ——, "An analytical model for follow me cloud," in Proc. of IEEE GLOBECOM 2013, Dec. 2013.
[6] A. Ksentini, T. Taleb, and M. Chen, "A Markov decision process-based service migration procedure for follow me cloud," in Proc. of IEEE ICC 2014, June 2014.
[7] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, 2009.
[8] S. Wang, R. Urgaonkar, M. Zafer, K. Chan, T. He, and K. K. Leung, "Supplementary materials for mobility-driven service migration in mobile micro-clouds." [Online]. Available:
[9] S. Elaydi, An Introduction to Difference Equations, Third Edition. Springer, 2005.
[10] M. Piorkowski, N. Sarafijanovic-Djukic, and M. Grossglauser, "A parsimonious model of mobile partitioned networks with clustering," in Proc. of COMSNETS, Jan. 2009.
[11] M. Piorkowski, N. Sarafijanovic-Djukic, and M. Grossglauser, "CRAWDAD data set epfl/mobility (v. 2009-02-24)," downloaded Feb. 2009.
[12] M. Chowdhury, M. Rahman, and R. Boutaba, "ViNEYard: Virtual network embedding algorithms with coordinated node and link mapping," IEEE/ACM Trans. on Networking, vol. 20, no. 1, pp. 206–219, 2012.
More informationInventory optimization of distribution networks with discrete-event processes by vendor-managed policies
Inventory optimization of distribution networks with discrete-event processes by vendor-managed policies Simona Sacone and Silvia Siri Department of Communications, Computer and Systems Science University
More informationNode Failure Localization via Network Tomography
Node Failure Localization via Network Tomography Liang Ma Imperial College London, UK l.ma10@imperial.ac.uk Don Towsley University of Massachusetts Amherst, MA, USA towsley@cs.umass.edu Ting He IBM T.
More informationOutage Probability for Two-Way Solar-Powered Relay Networks with Stochastic Scheduling
Outage Probability for Two-Way Solar-Powered Relay Networks with Stochastic Scheduling Wei Li, Meng-Lin Ku, Yan Chen, and K. J. Ray Liu Department of Electrical and Computer Engineering, University of
More informationApproximating the Covariance Matrix with Low-rank Perturbations
Approximating the Covariance Matrix with Low-rank Perturbations Malik Magdon-Ismail and Jonathan T. Purnell Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 12180 {magdon,purnej}@cs.rpi.edu
More informationCellular-Enabled UAV Communication: Trajectory Optimization Under Connectivity Constraint
Cellular-Enabled UAV Communication: Trajectory Optimization Under Connectivity Constraint Shuowen Zhang, Yong Zeng, and Rui Zhang ECE Department, National University of Singapore. Email: {elezhsh,elezeng,elezhang}@nus.edu.sg
More informationWIRELESS systems often operate in dynamic environments
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 61, NO. 9, NOVEMBER 2012 3931 Structure-Aware Stochastic Control for Transmission Scheduling Fangwen Fu and Mihaela van der Schaar, Fellow, IEEE Abstract
More informationAn Improved Bound for Minimizing the Total Weighted Completion Time of Coflows in Datacenters
IEEE/ACM TRANSACTIONS ON NETWORKING An Improved Bound for Minimizing the Total Weighted Completion Time of Coflows in Datacenters Mehrnoosh Shafiee, Student Member, IEEE, and Javad Ghaderi, Member, IEEE
More informationTHE Internet is increasingly being used in the conduct of
94 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 14, NO. 1, FEBRUARY 2006 Global Stability Conditions for Rate Control With Arbitrary Communication Delays Priya Ranjan, Member, IEEE, Richard J. La, Member,
More informationEnergy-Efficient Resource Allocation for Multi-User Mobile Edge Computing
Energy-Efficient Resource Allocation for Multi-User Mobile Edge Computing Junfeng Guo, Zhaozhe Song, Ying Cui Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, P. R. China
More informationOn the static assignment to parallel servers
On the static assignment to parallel servers Ger Koole Vrije Universiteit Faculty of Mathematics and Computer Science De Boelelaan 1081a, 1081 HV Amsterdam The Netherlands Email: koole@cs.vu.nl, Url: www.cs.vu.nl/
More informationtransportation research in policy making for addressing mobility problems, infrastructure and functionality issues in urban areas. This study explored
ABSTRACT: Demand supply system are the three core clusters of transportation research in policy making for addressing mobility problems, infrastructure and functionality issues in urban areas. This study
More informationLinks Failure Analysis of Averaging Executed by. Protocol Push sum
Contemporary Engineering Sciences, Vol. 9, 1, no., 7-3 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.19/ces.1.113 Links Failure Analysis of Averaging Executed by Protocol Push sum Martin Kenyeres Dept.
More informationLinearly-solvable Markov decision problems
Advances in Neural Information Processing Systems 2 Linearly-solvable Markov decision problems Emanuel Todorov Department of Cognitive Science University of California San Diego todorov@cogsci.ucsd.edu
More informationNode Failure Localization in Communication Networks via Network Tomography
Node Failure Localization in Communication Networks via Network Tomography Liang Ma, Ting He, Ananthram Swami, Don Towsley, Kin K. Leung, and Jessica Lowe Imperial College, London, UK. Email: {l.ma10,
More informationSIGNAL STRENGTH LOCALIZATION BOUNDS IN AD HOC & SENSOR NETWORKS WHEN TRANSMIT POWERS ARE RANDOM. Neal Patwari and Alfred O.
SIGNAL STRENGTH LOCALIZATION BOUNDS IN AD HOC & SENSOR NETWORKS WHEN TRANSMIT POWERS ARE RANDOM Neal Patwari and Alfred O. Hero III Department of Electrical Engineering & Computer Science University of
More informationCoded Caching for Multi-level Popularity and Access
Coded Caching for Multi-level Popularity and Access Jad Hachem, Student Member, IEEE, Nikhil Karamchandani, Member, IEEE, and Suhas Diggavi, Fellow, IEEE Abstract arxiv:4046563v2 [csit] 3 Dec 205 To address
More informationPerformance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project
Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore
More information