MARKOV DECISION PROCESSES


In studying Markov processes we have up till now assumed that the system, its states and transition probabilities are given in advance. The problem has been to find the stationary probabilities of the system, or possibly the transient evolution of the probabilities starting from a given initial state distribution. From these one can deduce interesting quantities such as blocking or overflow probabilities.

Often, however, the situation is such that in the operation of the system one can make some choices. The operation is not completely fixed in advance, but the behaviour of the system depends on the chosen operation policy. Then the task is to find an optimal policy which maximizes a given objective function. For instance, routing problems lead to this kind of setting. When the state of the network (calls in progress) is known, the task is to decide upon the arrival of a call in a given class (defined by the origin and destination points, and possibly other attributes) whether the call is admitted, and if so, along which route it shall be carried. The objective may be to maximize in the long run e.g. the number of carried calls or the volume of the carried traffic (call minutes). Similar problems emerge in the context of e.g. buffer management: one has to decide in which order the packets shall be sent, which packets are discarded when the buffer is full, etc.

Markov decision processes, MDPs

The theory of Markov decision processes studies decision problems of the described type when the stochastic behaviour of the system can be described as a Markov process. It combines dynamic programming (Bellman, 1957) and the theory of Markov processes (Howard, 1960).

In a Markov process the state of the system X ∈ S may jump from state i to state j with a given probability p_{i,j}. How state i has been reached does not influence the next and later transitions.

In Markov decision processes, after each transition, when the system is in a new state, one can make a decision or choose an action, which may incur some immediate revenue or costs and which, in addition, affects the next transition probabilities. For instance, when a call is terminated or a new call has just been admitted into the network, one can decide, for as long as one stays in this state, what one will do with a new call which may arrive (reject/accept/which route is chosen). This decision clearly affects which transitions are possible, or more generally, what the probabilities of the different transitions are; the next transition, however, happens stochastically, as the arrival and departure events occur stochastically.

Markov decision processes (continued)

The problem is to find an optimal policy such that the expected revenues are maximized (or the expected costs are minimized; it does not matter which way we formulate the problem).

Under the Markovian assumptions it is clear that the action to be chosen in each state depends only on the state itself. Generally, a policy, optimal or not, defines for each state the action to be chosen. When with each state we associate an action, which in turn determines the transition probabilities of the next transition, these probabilities depend solely on the state, and the state of the system constitutes a Markov process. Each policy defines a different Markov process. Special attention will be given to finding a policy such that the associated Markov process has maximal average revenue.

As with Markov processes in general, Markov decision processes are divided into discrete time and continuous time decision processes.

Discrete time MDPs

The state of the system changes only at discrete points in time, indexed by t = 1, 2, .... When the system has arrived at state i, one has to decide on an action a, which belongs to the set A_i of possible actions in state i, a ∈ A_i.

Action a incurs an immediate revenue r_i(a). The revenue may also be stochastic; then r_i(a) denotes its expectation. At the next instant the system moves into a new state j with the transition probability p_{i,j}(a), which depends on the action chosen in state i. The transition probabilities, however, do not depend on how state i has been reached (Markovian). Moreover, we restrict ourselves to time homogeneous systems, where r_i(a) and p_{i,j}(a) do not depend on the time (index) t.

A policy α defines which action a = a_i(α) is chosen in each state i among the set of possible actions. Then the revenue r_i(a_i(α)) accrued by a visit to state i, as well as the transition probabilities p_{i,j}(a_i(α)), are functions of the policy α and of the state i. For brevity, we will denote them by r_i(α) and p_{i,j}(α).

The equilibrium distribution of a discrete time MDP

Given the policy α, the transition probabilities p_{i,j}(α) are fixed. Under general assumptions, the Markov chain defined by these transition probabilities has a stationary (equilibrium) distribution π_i(α). The equilibrium distribution can be solved, as for any Markov chain, from the balance equations complemented by the normalization condition:

  π_i(α) = Σ_j π_j(α) p_{j,i}(α),   Σ_i π_i(α) = 1,

or, in vector form,

  π(α) = π(α) P(α),   π(α) e^T = 1,

where π(α) = (π_1(α), π_2(α), ...), P(α) = (p_{i,j}(α)) and e = (1, 1, ...). From these equations one can solve π(α). The solution can be written in the form

  π(α) = e (P(α) − I + E)^{−1},

where I is the identity matrix and E is a matrix with all elements equal to 1.
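
As a quick numerical check (a sketch, not part of the original notes), the closed form π(α) = e(P(α) − I + E)^{−1} can be evaluated directly; the two-state transition matrix below is a made-up example.

```python
# A minimal sketch: stationary distribution of a discrete time Markov chain
# via pi = e (P - I + E)^{-1}. The two-state matrix P is a made-up example.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])          # transition probabilities p_{i,j}
n = P.shape[0]
e = np.ones(n)                      # row vector of ones
E = np.ones((n, n))                 # matrix with all elements equal to 1
I = np.eye(n)

pi = e @ np.linalg.inv(P - I + E)   # equilibrium distribution
print(pi)                           # [0.8 0.2]
assert np.allclose(pi @ P, pi)      # balance equations hold
assert np.isclose(pi.sum(), 1.0)    # normalization holds
```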

Average revenue of a discrete time MDP

When the equilibrium distribution π(α) has been solved, one can immediately write the average revenue r̄(α), i.e. the expected revenue per step,

  r̄(α) = Σ_i π_i(α) r_i(α) = π(α) r^T(α),   where r(α) = (r_1(α), r_2(α), ...).

Now the task is to find the optimal policy α* which maximizes the average revenue:

  α* = argmax_α r̄(α),   i.e.   r̄(α*) ≥ r̄(α) ∀α.

Since the set of policies is discrete, we are led to a discrete optimization problem. The solution of such a problem is not quite straightforward, even though, in principle, r̄(α) can be calculated for each possible policy. To find the optimum, one needs a systematic approach. The following approaches have been introduced in the literature:

1. Policy iteration
2. Value iteration
3. Linear programming

In the sequel, we will mainly focus on policy iteration. To this end we are led to study a quantity called the relative value of state i.

Relative values of states

The quantity r̄(α) tells the average revenue per step under the policy α. Now we examine what can be said about the cumulative revenue if we have the additional information that initially the system is in state i. Denote

  V_n(i, α) = the expected cumulative revenue over n steps, when the system starts from state i.

At the first step (initial state) the expected revenue is r_i(α) = e_i r^T(α), where e_i = (0, ..., 0, 1, 0, ..., 0), with the 1 in component i. After the first step the state probability vector is e_i P(α). Similarly, the expected revenue at the second step is e_i P(α) r^T(α). In general, we have

  V_n(i, α) = e_i (I + P(α) + P^2(α) + ... + P^{n−1}(α)) r^T(α).

We know that, irrespective of the initial state, the state probabilities approach the equilibrium distribution: e_i P^n(α) → π(α) when n → ∞.

Relative values of states (continued)

As n increases, the additional terms in V_n(i, α) tend to π(α) r^T(α) = r̄(α). After a large number of steps the information about the initial state is washed out, and the expected revenue per each additional step equals the overall average revenue per step. Only the initial part of the cumulative revenue depends on the initial state, and the total effect of this initial transient can be defined for each state. Define the relative value v_i(α) of state i:

  v_i(α) = lim_{n→∞} (V_n(i, α) − n r̄(α)).

The relative value of state i tells how much greater the expected cumulative revenue over an infinite time horizon is when the system starts from the initial state i (rather than from equilibrium), in comparison with the average cumulative revenue.

Howard's equation

The relative values of states satisfy the Howard equations

  v_i(α) = r_i(α) − r̄(α) + Σ_j p_{i,j}(α) v_j(α),   ∀i.

By defining v(α) = (v_1(α), v_2(α), ...), the equations can be written in vector form:

  v(α) = r(α) − r̄(α) e + v(α) P^T(α).

The Howard equation (the component form) can be interpreted as follows, starting from state i. The deviation of the accrued revenue at the first step from the average is r_i(α) − r̄(α); this is explicitly accounted for. From that step on, one uses the Markovian property: conditioned on the system moving to state j, the deviation of the expected cumulative revenue from the average, from step 2 onwards, is v_j(α). Since p_{i,j}(α) is the probability that the system makes a transition to state j, the sum gives the unconditional deviation of the expected cumulative revenue from step 2 onwards.
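
A minimal sketch of how the Howard equations can be solved numerically for a fixed policy, using the normalization v_1 = 0 introduced on a later page; the two-state example (P, r) is made up for illustration.

```python
# A minimal sketch: solving the Howard equations
#   v_i = r_i - rbar + sum_j p_{i,j} v_j,  with the convention v_1 = 0,
# as one linear system in the unknowns (rbar, v_2, ..., v_n).
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])          # transition probabilities under policy alpha
r = np.array([1.0, 0.0])            # immediate revenues r_i
n = len(r)

# Rearranged equations: rbar + [(I - P) v]_i = r_i, with v_1 fixed to 0.
A = np.zeros((n, n))
A[:, 0] = 1.0                       # coefficient of rbar in every equation
A[:, 1:] = (np.eye(n) - P)[:, 1:]   # coefficients of v_2 .. v_n
x = np.linalg.solve(A, r)
rbar, v = x[0], np.concatenate(([0.0], x[1:]))
print(rbar, v)                      # rbar = 0.8 = pi r^T, v = [0, -2]
```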

Remarks on the Howard equation

Comparison of the balance equation and the Howard equation: in the balance equation

  π_j = Σ_i π_i p_{i,j}

the probability of state i is split and pushed forward. In the last term of Howard's equation, Σ_j p_{i,j} v_j, the cumulative revenues of the future paths from state i are collected backwards. Notice the difference: π P vs. v P^T.

Remarks on the Howard equation (continued)

Since Σ_j p_{i,j} = 1, the Howard equations can be written in the form

  r_i(α) − r̄(α) + Σ_j p_{i,j}(α) (v_j(α) − v_i(α)) = 0,   ∀i.

Only the differences v_j(α) − v_i(α) appear in the equation, so the relative values v_i(α) are determined only up to an additive constant. In the sequel the additive constant is unimportant; only the differences matter. We can therefore arbitrarily set e.g. v_1(α) = 0. Then the number of unknowns v_i(α) is one less than the number of equations. But r̄(α) is also unknown; all in all, there are as many unknowns as there are equations, and r̄(α) is solved along with the others.

Remarks on the Howard equation (continued)

The value r̄(α), which is obtained as a solution of the Howard equations (along with the values v_2(α), v_3(α), ...), is automatically equal to r(α) π^T(α). This can be seen by multiplying (dot product) the vector form Howard equation from the right by π^T(α) (for brevity, we suppress the dependence on the policy α):

  v = r − r̄ e + v P^T
  v π^T = r π^T − r̄ (e π^T) + v (P^T π^T).

Here e π^T = 1 and P^T π^T = (π P)^T = π^T, so

  v π^T = r π^T − r̄ + v π^T   ⟹   r̄ = r π^T.

Policy iteration

Howard's equation determines the relative values of states v_i(α) when the policy α is given. The policy can be improved by choosing the action a in each state i as follows:

  a_i' = argmax_a { r_i(a) − r̄(α) + Σ_j p_{i,j}(a) v_j(α) }.

The idea here is that a single decision is made so as to maximize the expected revenue, taking into account the immediate revenue of the action and its influence on the next transition, but assuming that, from that point on, all decisions are made using the old policy α.

By choosing the action a_i' in each state as defined above, we arrive at a new policy α'. With the new policy α' one can (at least in principle) solve the average revenue r̄(α') and the relative values of states v_i(α'). One can show that the resulting new policy α' is never worse than the policy α one started with, i.e. r̄(α') ≥ r̄(α). The iteration is continued until nothing changes. In general, the policy iteration converges quickly.
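
The improvement step above translates directly into code. The following is a minimal sketch of the complete policy iteration for a made-up two-state, two-action MDP; since r̄(α) does not depend on the action a, it can be dropped from the maximization.

```python
# A minimal sketch of policy iteration for a discrete time MDP.
# P[a][i, j] = transition probability i -> j under action a,
# r[a][i]    = immediate revenue in state i under action a (made-up values).
import numpy as np

P = [np.array([[0.9, 0.1], [0.4, 0.6]]),
     np.array([[0.2, 0.8], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]),
     np.array([0.5, 0.3])]
n, n_actions = 2, 2

def solve_howard(Pa, ra):
    """Solve v = r - rbar*e + v P^T with the first state's v fixed to 0."""
    A = np.zeros((n, n))
    A[:, 0] = 1.0
    A[:, 1:] = (np.eye(n) - Pa)[:, 1:]
    x = np.linalg.solve(A, ra)
    return x[0], np.concatenate(([0.0], x[1:]))

policy = np.zeros(n, dtype=int)          # start from an arbitrary policy
while True:
    Pa = np.array([P[policy[i]][i] for i in range(n)])
    ra = np.array([r[policy[i]][i] for i in range(n)])
    rbar, v = solve_howard(Pa, ra)
    # Improvement step: one-step look-ahead maximization in every state
    # (rbar is the same for all actions, so it is omitted from the argmax).
    new = np.array([max(range(n_actions),
                        key=lambda a: r[a][i] + P[a][i] @ v)
                    for i in range(n)])
    if np.array_equal(new, policy):      # nothing changes: stop
        break
    policy = new
print(policy, rbar)
```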

Value iteration

In the value iteration, we consider the cumulative revenue starting from initial state i. Now it is convenient to index time such that the index of the initial time is n. The last time instant is denoted by index 0, and we set V_0(i) = 0 ∀i.

Determining the optimal policy (with terminal revenues V_0(i) = 0) is done as in dynamic programming, by proceeding from the last instant of time 0 backwards to the initial time n. Which action should be selected at time n, when the system is in state i, and what is the corresponding expected cumulative revenue V_n(i) under the optimal policy? This is solved recursively, assuming that the problem has already been solved for time n − 1, and that the expected cumulative revenue from that point on up to the end, V_{n−1}(i), is known for all states i. A recursion step is defined by the equation (start of the recursion: V_0(i) = 0 ∀i)

  V_n(i) = max_a { r_i(a) + Σ_j p_{i,j}(a) V_{n−1}(j) }.

The expression in braces represents the expected cumulative revenue, given that at time n the system is in state i, that action a is chosen, and that from that point on the policy is optimal. At time n, in state i, the optimal action a is the one which maximizes the expression in braces. The value of the maximum is the expected cumulative revenue when at each step (n, n − 1, n − 2, ..., 1) an optimal action is taken.
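
A minimal sketch of this recursion, reusing the made-up two-state, two-action model from above; the difference V_n(i) − V_{n−1}(i) approaches the optimal average revenue, as discussed on the next page.

```python
# A minimal sketch of the value iteration recursion
#   V_n(i) = max_a { r_i(a) + sum_j p_{i,j}(a) V_{n-1}(j) }.
import numpy as np

P = [np.array([[0.9, 0.1], [0.4, 0.6]]),
     np.array([[0.2, 0.8], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]),
     np.array([0.5, 0.3])]

V = np.zeros(2)                          # start of the recursion: V_0(i) = 0
for step in range(1000):
    V_prev = V
    # One backward step of the recursion for every state i.
    V = np.array([max(r[a][i] + P[a][i] @ V_prev for a in range(2))
                  for i in range(2)])
# V_n(i) - V_{n-1}(i) approaches the optimal average revenue r(alpha*).
print(V - V_prev)
```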

Value iteration (continued)

In the value iteration, the optimal policy and the relative values of states are determined in parallel. As n grows, and the time is farther and farther from the terminal point:

- The action defined by the value iteration becomes independent of the time n and depends solely on the state i; the selection of that action corresponds to the optimal stationary policy α*.
- The expected revenues tend to V_n(i) → v_i(α*) + n r̄(α*) + c, where c is some constant.

When this form is inserted back into the value iteration equation, one obtains

  v_i(α*) = max_a { r_i(a) − r̄(α*) + Σ_j p_{i,j}(a) v_j(α*) },

which is the optimality condition for the policy α*, the relative values v_i(α*) and the average revenue r̄(α*). The optimal policy α* is the policy which in each state i selects the action a realizing the maximum. The maximizing action depends on the relative values v_i, which in turn, as defined by the equation, depend on which action is maximizing.

Value iteration (continued)

This complex interdependence of the optimal policy and the related relative values is no problem in the value iteration: the recursive calculation of the cumulative expected revenues V_n(i) using the value iteration equation is very easy. In the policy iteration, determining the policy and determining the relative values have been separated: the action is selected as defined by a given policy α (no maximization), whence we are left with the Howard equations to determine the relative values. With these relative values, a new policy is determined by the maximization.

Comparison of policy and value iterations: though the policy iteration may look more complicated, it is more efficient. The relative values associated with a given policy are computed once and for all by solving the linear set of Howard equations. In the value iteration, even the solution of this linear system is effectively done iteratively, which is slow (and the solution is interleaved with the policy optimization); one needs many more iterations.

Continuous time MDPs

The foregoing considerations can be straightforwardly adapted to the setting of continuous time Markov decision processes. In each state i a certain action a is chosen, depending solely on the current state. State i together with the chosen action a determines the revenue rate r_i(a) as well as the transition rates q_{i,j}(a) to other states j.

A policy α defines the choice of the action a for each state i, a = a_i(α), whence the revenue rate and the transition rates are functions of the state and the policy. For brevity, we again denote these by r_i(α) and q_{i,j}(α).

The relative values of the states, v_i(α), for a given policy α are again determined by the Howard equations, which analogously with the earlier form read

  r_i(α) − r̄(α) + Σ_{j≠i} q_{i,j}(α) (v_j(α) − v_i(α)) = 0,   ∀i,

where
  v_i(α) = the relative value of state i,
  r_i(α) = the revenue rate in state i,
  r̄(α) = the average revenue rate of policy α.

Howard equation for a continuous time MDP

The continuous time Howard equation can also be written in vector form. To this end, we first modify the previous equation:

  Σ_{j≠i} q_{i,j}(α) (v_j(α) − v_i(α)) = Σ_{j≠i} q_{i,j}(α) v_j(α) − (Σ_{j≠i} q_{i,j}(α)) v_i(α) = Σ_j q_{i,j}(α) v_j(α),

since q_{i,i} = −Σ_{j≠i} q_{i,j}. We obtain

  r(α) − r̄(α) e + v(α) Q^T(α) = 0,

where Q(α) = (q_{i,j}(α)) is the transition rate matrix formed by the transition rates q_{i,j}(α). The policy α explicitly determines the transition rate matrix Q(α); v(α) and r̄(α) are then determined by the Howard equation. For comparison, recall that the equilibrium probabilities π(α) of policy α are determined by the balance condition π(α) Q(α) = 0 (note the difference: Q vs. Q^T).
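
A minimal sketch of solving r(α) − r̄(α)e + v(α)Q^T(α) = 0 for a fixed policy, with a made-up two-state rate matrix Q and the relative value of the first state normalized to zero.

```python
# A minimal sketch: solving the continuous time Howard equation
#   r_i - rbar + sum_j q_{i,j} v_j = 0,  with the first state's v fixed to 0.
# Q and r form a made-up two-state example (rows of Q sum to zero).
import numpy as np

Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])        # transition rate matrix
r = np.array([1.0, 0.0])            # revenue rates r_i
n = len(r)

# Rearranged equations: rbar - sum_j q_{i,j} v_j = r_i.
A = np.zeros((n, n))
A[:, 0] = 1.0                       # coefficient of rbar
A[:, 1:] = -Q[:, 1:]                # coefficients of the remaining v_j
x = np.linalg.solve(A, r)
rbar, v = x[0], np.concatenate(([0.0], x[1:]))
print(rbar, v)                      # rbar equals pi r^T = 2/3
```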

Remarks on the Howard equation

The v_i(α) are determined up to an additive constant: if c e is added to v(α), where c is a constant, the equation remains satisfied, since e Q^T = (Q e^T)^T = 0 (the row sums of Q equal zero). One can set e.g. v_1(α) = 0, and then v_2(α), v_3(α), ... and r̄(α) can be solved from the equation.

The solution r̄(α) thus obtained is automatically the same as the average revenue rate r(α) π^T(α). This can be seen by multiplying the Howard equation from the right by π^T:

  r π^T − r̄ (e π^T) + v (Q^T π^T) = 0.

Here e π^T = 1 and Q^T π^T = (π Q)^T = 0, so r̄ = r π^T.

The relative values of states

The relative value v_i of state i again represents the difference between the expected cumulative revenue (over an infinite time horizon) accrued starting from the initial state i and the average cumulative revenue.

Starting from state i, the state probability vector is initially π(0) = e_i (the i-th component is one, the others are zero). From that state on, the time dependent state probability vector evolves according to

  (d/dt) π(t) = π(t) Q,   i.e.   π(t) = π(0) e^{Qt},

and the revenue rate at time t is then r π^T(t) = r e^{Q^T t} e_i^T.

Let V_i(t) be the cumulative revenue (the integral of the revenue rate over time) in the interval (0, t) starting from state i, and let V(t) be the vector formed by these values. It is easy to see that

  V(t) = r ∫_0^t e^{Q^T u} du,   and therefore   v = lim_{t→∞} (V(t) − r̄ t e).

The latter, constant term does not change the Howard equation (since e Q^T = 0). Now we show that in the limit t → ∞ the first term V(t) does indeed satisfy the Howard equation:

  V(t) Q^T = r ∫_0^t e^{Q^T u} du Q^T = r (e^{Q^T t} − I) → r (π^T e − I) = r̄ e − r,

since e^{Q^T t} → π^T e when t → ∞. Hence

  r − r̄ e + V(t) Q^T → 0 when t → ∞,

so v satisfies the Howard equation.

Policy iteration

The policy iteration starts with some policy α and solves the related relative values of states v_i(α) and the average revenue rate r̄(α) from the Howard equations. Then a new policy is determined by choosing in each state i the action a which realizes the maximum

  max_a { r_i(a) − r̄(α) + Σ_{j≠i} q_{i,j}(a) (v_j(α) − v_i(α)) }.

These choices define a new policy α'. Using the transition rate matrix related to this new policy, Q(α'), one solves from the Howard equations new relative values v_i(α') and r̄(α'), and determines a new policy again by the above maximization. This iteration is continued until nothing changes.

Value iteration

In the value iteration one considers the cumulative revenue V_i(t) in a finite interval (−t, 0), when initially, at time −t, the system is in state i. The index t measures the time from the terminal time 0 backwards. At the terminal time t = 0 the cumulative revenues have the terminal values V_i(0) = 0 ∀i. The recursion progressing backwards in time takes, in continuous time, the form of a differential equation:

  (d/dt) V_i(t) = max_a { r_i(a) + Σ_{j≠i} q_{i,j}(a) (V_j(t) − V_i(t)) }.

This defines both the optimal choice of the action in each state i (at time −t) and the expected cumulative revenues V_i(t) associated with the optimal policy.
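
A minimal sketch of integrating this differential equation with an explicit Euler scheme (a crude but adequate choice here); the two-state, two-action rates are made up, and dt must be small compared to the largest total transition rate for the scheme to be stable.

```python
# A minimal sketch: Euler integration of the continuous time value iteration
#   dV_i/dt = max_a { r_i(a) + sum_{j!=i} q_{i,j}(a) (V_j - V_i) }.
# Since each row of Q[a] sums to zero, Q[a][i] @ V equals the sum over j != i.
import numpy as np

Q = [np.array([[-1.0,  1.0], [ 2.0, -2.0]]),   # rates under action 0
     np.array([[-3.0,  3.0], [ 0.5, -0.5]])]   # rates under action 1
r = [np.array([1.0, 0.0]),
     np.array([0.8, 0.2])]

V = np.zeros(2)                                # terminal values V_i(0) = 0
dt, T = 0.001, 50.0
for _ in range(int(T / dt)):
    dV = np.array([max(r[a][i] + Q[a][i] @ V for a in range(2))
                   for i in range(2)])
    V = V + dt * dV
# Far from the terminal time, dV approximates the optimal average revenue
# rate, the same in every state.
print(dV)
```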

Value iteration (continued)

When the value iteration equation is integrated far enough in backward time (up to a large t), one reaches a stationary regime, where the chosen action depends solely on the state i (not on the time t), and this choice corresponds to the optimal policy α*. Correspondingly, the cumulative revenue grows at a constant rate:

  V_i(t) = v_i(α*) + r̄(α*) t + c,   where c is some constant.

When this is inserted into the value iteration equation, it takes the form

  max_a { r_i(a) − r̄(α*) + Σ_{j≠i} q_{i,j}(a) (v_j(α*) − v_i(α*)) } = 0,

which is an optimality condition for the policy α*, the relative values v_i(α*), as well as the average revenue rate r̄(α*). In practice, it is easiest to consider the cumulative revenues and solve the differential equation far enough, or to separate determining the values and the policy as is done in the policy iteration.

Example 1. Relative values in the M/M/∞ system

As the policy we take free access: each arriving call is admitted (gets a server/trunk of its own). Suppose that revenue is accrued from each ongoing call at a constant rate 1 (per minute charging, e.g. cents/min). When the system is in state n (n calls in progress), the revenue rate is n.

The continuous time Howard equation for state n can be written directly:

  n − a + λ(v_{n+1} − v_n) + μn(v_{n−1} − v_n) = 0.

Here we have exploited the knowledge that the average revenue rate equals the average number of calls in progress, which is a = λ/μ. Denote u_n = μ v_n. Then the equation becomes

  n − a + a(u_{n+1} − u_n) + n(u_{n−1} − u_n) = 0,   i.e.   a(u_{n+1} − u_n − 1) = n(u_n − u_{n−1} − 1).

The solution to this equation is u_{n+1} − u_n = 1, i.e. u_n = n + c, n = 0, 1, ...
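
A minimal sketch checking this solution: with u_n = n − a (the constant fixed as on the next page), the residual of the Howard equation vanishes in every state; λ, μ and the truncation level N are arbitrary choices.

```python
# A minimal sketch: verifying that u_n = n - a solves the M/M/inf Howard
# equation  n - a + a(u_{n+1} - u_n) + n(u_{n-1} - u_n) = 0.
lam, mu, N = 3.0, 1.5, 40
a = lam / mu
u = [n - a for n in range(N)]       # the claimed solution u_n = n - a
for n in range(1, N - 1):
    residual = (n - a) + a * (u[n + 1] - u[n]) + n * (u[n - 1] - u[n])
    assert abs(residual) < 1e-12    # Howard equation satisfied in state n
print("u_n = n - a solves the Howard equations")
```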

Example 1. (continued)

The value of the constant c (unimportant) is fixed by the requirement Σ_n π_n u_n = 0:

  Σ_n π_n n + c Σ_n π_n = a + c = 0   ⟹   c = −a.

The solution is u_n = n − a.

The result is easy to understand physically. The expected occupancy m(t) = E[N(t)] in the M/M/∞ system obeys (irrespective of the initial distribution) the equation

  (d/dt) m(t) = λ − μ m(t)   ⟹   m(t) = (m(0) − a) e^{−μt} + a.

Then, starting from any state n (implying m(0) = n), the excess of the cumulative revenue over the average cumulative revenue is

  ∫_0^∞ (m(t) − a) dt = (m(0) − a) ∫_0^∞ e^{−μt} dt = (n − a)/μ,

whence v_n = (n − a)/μ and u_n = μ v_n = n − a. [Figure: m(t) relaxing exponentially from m(0) = n towards a, with time constant 1/μ.]

Example 2. The relative values of states in an M/M/1 queue

In developing methods for minimizing delays in systems involving M/M/1 queues, it is convenient to consider costs (as opposed to revenues). As the total cost incurred per customer we take the total time the customer spends in the system. Then, in state n (customers in the system) the cost rate is n. In a queueing system one wishes to minimize ∫ m(t) dt, where m(t) = E[N(t)]. In a loss system the corresponding integral represents the expected amount of carried traffic, and one wishes to maximize its value; minimizing the delay and minimizing the blocking are in this sense opposite objectives.

It is easy to write the Howard equation (with the free access policy):

  n − ρ/(1 − ρ) + λ(v_{n+1} − v_n) + μ 1_{n>0} (v_{n−1} − v_n) = 0,

where we have exploited the fact that the average cost rate equals the average number in system, which we know is ρ/(1 − ρ), ρ = λ/μ. (It is also possible to solve the equation without this knowledge.)

Example 2. (continued)

Denote u_n = μ v_n. Then the equation reads

  n − ρ/(1 − ρ) + ρ(u_{n+1} − u_n) + 1_{n>0} (u_{n−1} − u_n) = 0,

which can be rearranged as

  ρ (u_{n+1} − u_n − (n + 1)/(1 − ρ)) = 1_{n>0} (u_n − u_{n−1} − n/(1 − ρ)).

Hence u_{n+1} − u_n = (n + 1)/(1 − ρ), i.e.

  v_{n+1} − v_n = (n + 1)/(μ − λ)   and   v_n = n(n + 1)/(2(μ − λ)),

when one sets v_0 = 0.

Physical interpretation: if the initial occupancy m(0) = n is large, the expected occupancy m(t) = E[N(t)] in an M/M/1 system first decreases approximately linearly, m(t) ≈ m(0) − (μ − λ)t, levelling off at ρ/(1 − ρ) after a time of about m(0)/(μ − λ). The excess cumulative cost is then roughly the area of a triangle with base m(0)/(μ − λ) and height m(0), which is a quadratic function of the initial occupancy, in agreement with v_n above.
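
A minimal sketch checking this solution against the Howard equation of the previous page; λ and μ are arbitrary values with λ < μ (stable queue).

```python
# A minimal sketch: verifying that v_n = n(n+1)/(2(mu - lam)) solves the
# M/M/1 Howard equation with cost rate n and average cost rho/(1 - rho).
lam, mu, N = 0.6, 1.0, 50
rho = lam / mu
v = [n * (n + 1) / (2 * (mu - lam)) for n in range(N)]
for n in range(N - 1):
    up = lam * (v[n + 1] - v[n])                     # arrival transition
    down = mu * (v[n - 1] - v[n]) if n > 0 else 0.0  # departure transition
    residual = n - rho / (1 - rho) + up + down
    assert abs(residual) < 1e-9
print("v_n = n(n+1)/(2(mu - lam)) solves the Howard equations")
```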

Example 3. Optimal routing in the case of two parallel M/M/1 queues

Packets arrive according to a Poisson process at rate λ. An arriving packet can be routed to either of two queues, with service rates μ_1 and μ_2 (labelled so that μ_1 ≥ μ_2). The occupancies n_1 and n_2 are assumed to be known. The task is to find a routing policy such that in the long run the delay of the packets in the system is minimized.

Let us start with the following basic policy (policy 0): an arriving packet is directed to queue 1 with probability p and to queue 2 with probability 1 − p. The queues then receive Poisson streams with intensities λ_1 = pλ and λ_2 = (1 − p)λ, the queues are independent M/M/1 queues, and the mean delay in the system is

  W(p) = p/(μ_1 − pλ) + (1 − p)/(μ_2 − (1 − p)λ).

One can perform a static optimization with respect to the parameter p. Denote x = √(μ_2/μ_1). The expression is minimized by

  p* = 1,                                            when λ ≤ (1 − x)μ_1,
  p* = 1/(1 + x) + (x/(1 + x)) (1 − x)μ_1/λ,         when λ > (1 − x)μ_1.
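
As a sanity check (a sketch, not part of the original notes), the closed form p* can be compared with a crude numerical minimization of W(p); the parameter values λ = 1, μ_1 = 2, μ_2 = 1 are the ones used later in this example.

```python
# A minimal sketch: comparing the closed form p* with a grid-search
# minimization of W(p) = p/(mu1 - p*lam) + (1-p)/(mu2 - (1-p)*lam).
import numpy as np

lam, mu1, mu2 = 1.0, 2.0, 1.0

def W(p):
    return p / (mu1 - p * lam) + (1 - p) / (mu2 - (1 - p) * lam)

x = np.sqrt(mu2 / mu1)
if lam <= (1 - x) * mu1:
    p_star = 1.0
else:
    p_star = 1 / (1 + x) + (x / (1 + x)) * (1 - x) * mu1 / lam

# Crude grid search over the feasible p (both queues must remain stable).
grid = np.linspace(0.01, 0.99, 9999)
grid = grid[(mu1 - grid * lam > 0) & (mu2 - (1 - grid) * lam > 0)]
p_num = grid[np.argmin(W(grid))]
print(p_star, p_num)                # both close to 0.828
```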

Example 3. (continued)

Let us improve the basic policy by making one policy iteration. The implications of each routing decision are estimated by using the relative values of states calculated under policy 0. Since with policy 0 the queues are independent M/M/1 queues, the relative values are as derived in example 2. The cost difference of routing alternatives 1 and 2 is

  Δ = (n_2 + 1)/(μ_2 − λ_2) − (n_1 + 1)/(μ_1 − λ_1),

where the first term is the increase in the cumulative costs if the packet is directed to queue 2, i.e. v_{n_2+1} − v_{n_2} calculated for queue 2, and the latter term is the corresponding quantity for queue 1. If Δ ≥ 0, it is advantageous to put the packet in queue 1, and if Δ < 0, in queue 2. The decision line, which separates the occupancy regions corresponding to the different optimal routing choices, is

  n_2 = [(μ_2 − λ_2)/(μ_1 − λ_1)] (n_1 + 1) − 1.

In the JSQ policy (join the shortest queue) the decision line is along the diagonal; the JSQ policy is not optimal. Neither does the decision line obtained here define the true optimal policy; it is just the result of the first policy iteration.

Example 3. (continued)

One can continue the iteration only by solving the Howard equations numerically. This was done by truncating the state space to the region n_1 < 20 and n_2 < 20 and solving the Howard equations numerically for the parameters λ = 1, μ_1 = 2, μ_2 = 1. The iteration converged to a fixed point in four rounds. In the resulting figure the final optimal policy is indicated by the colouring of the points: a green (light) point directs the packet to queue 2, a red (dark) point directs the packet to queue 1; the decision lines of the JSQ policy and of the first policy iteration are also shown for comparison.

The optimal static policy (p = 0.828) gives the largest mean packet delay of the policies considered; the JSQ policy improves on it, and the policy obtained by the first policy iteration (starting from the optimized static policy) brings the mean delay down further, quite close to that of the optimal policy.

Example 4. Relative costs of states in an M/M/m/m system

In example 1 we considered an infinite server system, M/M/∞, and calculated the relative values of the states, using the amount of carried traffic as the measure of revenue. Now we consider a finite capacity system M/M/m/m, i.e. the Erlang loss system, where there are m servers (trunks). All calls are accepted as long as there is free capacity available.

We could again calculate the relative values of the states measured by the amount of carried traffic. However, here it is more convenient to consider the relative costs of the states measured by the amount of blocked traffic. Technically, there is a slight difference in the calculations: in calculating the revenues, the revenue rate in state n is n; in calculating the costs of loss, the cost rate in the state n = m is λ, and in the other states, n < m, the cost rate is 0. When the system is in the blocking state n = m, the expected rate of blockings is λ. (In fact, to make this comparable with the revenue consideration, λ should be multiplied by the expected revenue per call, which with per minute charging equals the mean holding time multiplied by the charge per minute; this constant factor, however, is inconsequential and will be omitted in the sequel.)

In practice, it does not matter which consideration (revenues/costs) is used; maximizing the carried traffic is equivalent to minimizing the lost traffic.

Example 4. (continued)

On the basis of what was just said, one can write the Howard equations:

  λ − r̄ + μm(v_{m−1} − v_m) = 0                          (last state, no upward transitions)
  −r̄ + λ(v_{n+1} − v_n) + μn(v_{n−1} − v_n) = 0,   n = 0, ..., m − 1.

Here r̄ is the average cost rate. In the equations one can set v_0 = 0, and then solve them for r̄ and v_1, ..., v_m. In advance, though, we already know that r̄ = λE(m, a). The solution of the equations is left as an exercise (a numerical solution is sketched below).

It is, however, instructive to note that the solution of this problem can be derived by direct deduction. In particular, we will deduce the difference Δ_n = v_{n+1} − v_n. According to the definition given before, we can write

  Δ_n = lim_{t→∞} (V_{n+1}(t) − V_n(t)),

where V_n(t) = E[number of blockings in (0, t), when at time 0 the system is in state n].
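
A minimal sketch of the numerical solution mentioned above: the Howard equations are assembled as a linear system in (r̄, v_1, ..., v_m), and the result is checked against r̄ = λE(m, a) and against the formula Δ_n = E(m, a)/E(n, a) derived on the following pages. Here m and a are arbitrary example values, and erlang_b is a hypothetical helper implementing the standard Erlang B recursion.

```python
# A minimal sketch: solving the M/M/m/m Howard equations numerically.
import numpy as np

m, a = 10, 7.0
lam, mu = a, 1.0                    # time scale mu = 1, so a = lam/mu

def erlang_b(n, a):
    """Erlang loss formula E(n, a) via the standard recursion."""
    B = 1.0
    for k in range(1, n + 1):
        B = a * B / (k + a * B)
    return B

# Unknowns x = (r, v_1, ..., v_m), with v_0 = 0. Equations per state n:
#   n < m:  r - lam*v_{n+1} + (lam + mu*n)*v_n - mu*n*v_{n-1} = 0
#   n = m:  r + mu*m*v_m - mu*m*v_{m-1} = lam
A = np.zeros((m + 1, m + 1))
b = np.zeros(m + 1)
for n in range(m + 1):
    A[n, 0] = 1.0                           # coefficient of the unknown r
    if n < m:
        A[n, n + 1] -= lam                  # -lam * v_{n+1}
        if n >= 1:
            A[n, n] += lam + mu * n         # +(lam + mu*n) * v_n
        if n >= 2:
            A[n, n - 1] -= mu * n           # -mu*n * v_{n-1}  (v_0 dropped)
    else:
        A[n, n] += mu * m                   # last state, no upward transition
        A[n, n - 1] -= mu * m
        b[n] = lam
x = np.linalg.solve(A, b)
r, v = x[0], np.concatenate(([0.0], x[1:]))
assert np.isclose(r, lam * erlang_b(m, a))  # average cost rate lam*E(m, a)
for n in range(m):
    assert np.isclose(v[n + 1] - v[n], erlang_b(m, a) / erlang_b(n, a))
print("Delta_n = E(m,a)/E(n,a) confirmed")
```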

Example 4. (continued)

Consider the quantities V_n(t) and V_{n+1}(t). When the system starts from state n, it takes some time before the system first moves to state n + 1. Let us denote this first passage time by t_n*. In the interval (0, t_n*) no blockings can occur. Once the system has moved to state n + 1, it is in precisely the same situation as a system which started from state n + 1 (the Markovian property!). The sample paths from that point on, over periods of equal duration, are statistically indistinguishable, and the expected numbers of blockings in them are equal. Thus we deduce that V_{n+1}(t) − V_n(t) is the same as the expected number of blockings in the interval (t − t_n*, t) in the system which started from state n + 1.

When t → ∞, the initial state no longer affects the behaviour of the system in the interval (t − t_n*, t); the system is in equilibrium, whence the blocking probability is E(m, a). Since the expected number of arrivals in this interval is λE[t_n*], we get

  Δ_n = λ E[t_n*] E(m, a).

Example 4. (continued)

It remains to deduce the value of E[t_n*]. To this end, consider a system with capacity n. Immediately after a blocking event this system is in state n. The next blocking occurs at the instant when the system would have moved to state n + 1. The time between blockings in this limited capacity system is thus distributed as t_n* in the larger system (the first passage time from state n to state n + 1). Hence

  E[t_n*] = E[time between blockings] = 1/(λE(n, a)),

since λE(n, a) is the blocking frequency. Inserting this into the previous equation, we finally end up with the simple and beautiful result

  Δ_n = v_{n+1} − v_n = E(m, a)/E(n, a).

Since n ≤ m, it follows that Δ_n ≤ 1. Recall that this quantity tells the expected increase in the number of blockings in an M/M/m/m system which starts from the initial state n + 1, in comparison with a system which starts from state n. If the system is in state n and an offered call arrives, then Δ_n is the expected future cost of accepting the call. This result is used in the following to find an optimal routing policy.

Dynamic state dependent routing in a circuit switched network

Consider the basic routing problem in a triangular network with nodes A, B and C: link 1 (n_1, C_1, offered traffic a_1) connects A and B directly, while links 2 (n_2, C_2, a_2) and 3 (n_3, C_3, a_3) connect A to C and C to B, respectively. Here

  C_j = capacity of link j,
  n_j = occupancy of link j,
  a_j = traffic intensity offered to link j.

The capacities C_j, the offered traffic intensities a_j, as well as the instantaneous occupancies n_j are supposed to be known.

The basic policy is that calls are offered only to the direct links, and accepted always, as far as the capacity allows. Then we have three separate loss systems, and for each of them we can calculate the relative costs of the states as was done in example 4. Now we wish to find, using the first policy iteration, what would be a good policy for using alternative routes.

Consider the problem from the point of view of calls offered to link 1. The problem is the following: a call is offered to link 1 when the link occupancies are n_1, n_2 and n_3. Should the call be admitted or rejected? If admitted, on which route should the call be carried:

- the direct route AB (link 1), or
- the alternative route ACB (links 2 + 3)?

Dynamic state dependent routing (continued)

The objective of the optimization is to maximize the number of carried calls, that is, to minimize the number of blocked calls (over an infinite time horizon). Using an alternative route consumes more of the resources of the network (it occupies a trunk on two links) and can potentially increase the blocking of future arriving calls. In advance, it is not at all clear under what conditions it is advisable to use the alternative route.

In the first policy iteration, each individual decision is made by minimizing the total costs as they appear if, after the decision, all future decisions are made according to the basic policy. If a new call is admitted to link j, this implies an increase in the expected number of future blockings of E(m_j, a_j)/E(n_j, a_j), as derived in example 4 (we write m_j for the capacity C_j of link j).

Policy: the cost of accepting the call on the direct link is always < 1 when there is capacity to admit the call, i.e. when n_1 < m_1, whereas the value of carrying the call is 1; hence it is always beneficial to admit the call to the direct link whenever possible. If the direct link is occupied, we have to consider the sum of the costs incurred on the different links. The use of the alternative route is beneficial if the following condition is satisfied (a decision rule sketched in code follows below):

  E(m_2, a_2)/E(n_2, a_2) + E(m_3, a_3)/E(n_3, a_3) < 1.
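
A minimal sketch of the resulting admission rule; the function names route_call and erlang_b are hypothetical, and the occupancies, capacities and loads are made-up example values.

```python
# A minimal sketch of the first-policy-iteration routing rule for a call
# offered to link 1 of the triangular network.
def erlang_b(n, a):
    """Erlang loss formula E(n, a) via the standard recursion."""
    B = 1.0
    for k in range(1, n + 1):
        B = a * B / (k + a * B)
    return B

def route_call(n, m, a):
    """n, m, a: occupancies, capacities and offered loads of links 1..3."""
    if n[0] < m[0]:
        return "direct"                  # always accept on the direct link
    cost = (erlang_b(m[1], a[1]) / erlang_b(n[1], a[1])
            + erlang_b(m[2], a[2]) / erlang_b(n[2], a[2]))
    if n[1] < m[1] and n[2] < m[2] and cost < 1:
        return "alternative"             # links 2+3; dynamic trunk reservation
    return "reject"

print(route_call(n=[30, 20, 22], m=[30, 30, 30],
                 a=[20.0, 20.0, 20.0]))  # -> 'alternative'
```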

Dynamic state dependent routing (continued)

If the links on the alternative route are highly occupied, the terms on the left side of the condition

  E(m_2, a_2)/E(n_2, a_2) + E(m_3, a_3)/E(n_3, a_3) < 1

are close to 1 and the sum can exceed 1. The condition thus defines a kind of dynamic trunk reservation principle: one has to leave enough free capacity for fresh one link traffic on the links of the alternative route.

The behaviour of the function Δ_n for a system with capacity m = 30 shows that, assuming both links of the alternative route are identical (m = 30), one typically has to have free capacity (Δ_n ≲ 0.5) of a few trunks on the links of the alternative route in order that the use of the route is beneficial: … trunks if a = 20, and 6 trunks if a = 30. [Figure: Δ_n as a function of n for C = 30 and offered loads a = 10, 20, 30.]

Remarks on dynamic state dependent routing

The above policy is not the ultimate optimal policy, but the result of the first policy iteration. The true optimal policy is obtained when, in making the decisions, the future impacts of each decision are assessed using the optimal policy (which is not yet known).

Implementation of dynamic state dependent routing is technically demanding, since it requires full knowledge of the state of the system, and also of the arrival intensities. In practice one can apply the more robust trunk reservation, where a fixed amount of capacity is reserved for the fresh direct traffic.


More information

Chapter 1. Introduction. 1.1 Stochastic process

Chapter 1. Introduction. 1.1 Stochastic process Chapter 1 Introduction Process is a phenomenon that takes place in time. In many practical situations, the result of a process at any time may not be certain. Such a process is called a stochastic process.

More information

Markov Chain Model for ALOHA protocol

Markov Chain Model for ALOHA protocol Markov Chain Model for ALOHA protocol Laila Daniel and Krishnan Narayanan April 22, 2012 Outline of the talk A Markov chain (MC) model for Slotted ALOHA Basic properties of Discrete-time Markov Chain Stability

More information

CDA5530: Performance Models of Computers and Networks. Chapter 4: Elementary Queuing Theory

CDA5530: Performance Models of Computers and Networks. Chapter 4: Elementary Queuing Theory CDA5530: Performance Models of Computers and Networks Chapter 4: Elementary Queuing Theory Definition Queuing system: a buffer (waiting room), service facility (one or more servers) a scheduling policy

More information

STOCHASTIC MODELS FOR RELIABILITY, AVAILABILITY, AND MAINTAINABILITY

STOCHASTIC MODELS FOR RELIABILITY, AVAILABILITY, AND MAINTAINABILITY STOCHASTIC MODELS FOR RELIABILITY, AVAILABILITY, AND MAINTAINABILITY Ph.D. Assistant Professor Industrial and Systems Engineering Auburn University RAM IX Summit November 2 nd 2016 Outline Introduction

More information

Stochastic process. X, a series of random variables indexed by t

Stochastic process. X, a series of random variables indexed by t Stochastic process X, a series of random variables indexed by t X={X(t), t 0} is a continuous time stochastic process X={X(t), t=0,1, } is a discrete time stochastic process X(t) is the state at time t,

More information

Minimizing response times and queue lengths in systems of parallel queues

Minimizing response times and queue lengths in systems of parallel queues Minimizing response times and queue lengths in systems of parallel queues Ger Koole Department of Mathematics and Computer Science, Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands

More information

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS 63 2.1 Introduction In this chapter we describe the analytical tools used in this thesis. They are Markov Decision Processes(MDP), Markov Renewal process

More information

Input-queued switches: Scheduling algorithms for a crossbar switch. EE 384X Packet Switch Architectures 1

Input-queued switches: Scheduling algorithms for a crossbar switch. EE 384X Packet Switch Architectures 1 Input-queued switches: Scheduling algorithms for a crossbar switch EE 84X Packet Switch Architectures Overview Today s lecture - the input-buffered switch architecture - the head-of-line blocking phenomenon

More information

Calculation of predicted average packet delay and its application for flow control in data network

Calculation of predicted average packet delay and its application for flow control in data network Calculation of predicted average packet delay and its application for flow control in data network C. D Apice R. Manzo DIIMA Department of Information Engineering and Applied Mathematics University of

More information

An M/M/1/N Queuing system with Encouraged Arrivals

An M/M/1/N Queuing system with Encouraged Arrivals Global Journal of Pure and Applied Mathematics. ISS 0973-1768 Volume 13, umber 7 (2017), pp. 3443-3453 Research India Publications http://www.ripublication.com An M/M/1/ Queuing system with Encouraged

More information

Chapter 6 Queueing Models. Banks, Carson, Nelson & Nicol Discrete-Event System Simulation

Chapter 6 Queueing Models. Banks, Carson, Nelson & Nicol Discrete-Event System Simulation Chapter 6 Queueing Models Banks, Carson, Nelson & Nicol Discrete-Event System Simulation Purpose Simulation is often used in the analysis of queueing models. A simple but typical queueing model: Queueing

More information

Introduction to queuing theory

Introduction to queuing theory Introduction to queuing theory Claude Rigault ENST claude.rigault@enst.fr Introduction to Queuing theory 1 Outline The problem The number of clients in a system The client process Delay processes Loss

More information

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8.1 Review 8.2 Statistical Equilibrium 8.3 Two-State Markov Chain 8.4 Existence of P ( ) 8.5 Classification of States

More information

On Tandem Blocking Queues with a Common Retrial Queue

On Tandem Blocking Queues with a Common Retrial Queue On Tandem Blocking Queues with a Common Retrial Queue K. Avrachenkov U. Yechiali Abstract We consider systems of tandem blocking queues having a common retrial queue, for which explicit analytic results

More information

The Markov Decision Process (MDP) model

The Markov Decision Process (MDP) model Decision Making in Robots and Autonomous Agents The Markov Decision Process (MDP) model Subramanian Ramamoorthy School of Informatics 25 January, 2013 In the MAB Model We were in a single casino and the

More information

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property Chapter 1: and Markov chains Stochastic processes We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations, we can treat

More information

Since D has an exponential distribution, E[D] = 0.09 years. Since {A(t) : t 0} is a Poisson process with rate λ = 10, 000, A(0.

Since D has an exponential distribution, E[D] = 0.09 years. Since {A(t) : t 0} is a Poisson process with rate λ = 10, 000, A(0. IEOR 46: Introduction to Operations Research: Stochastic Models Chapters 5-6 in Ross, Thursday, April, 4:5-5:35pm SOLUTIONS to Second Midterm Exam, Spring 9, Open Book: but only the Ross textbook, the

More information

6.842 Randomness and Computation March 3, Lecture 8

6.842 Randomness and Computation March 3, Lecture 8 6.84 Randomness and Computation March 3, 04 Lecture 8 Lecturer: Ronitt Rubinfeld Scribe: Daniel Grier Useful Linear Algebra Let v = (v, v,..., v n ) be a non-zero n-dimensional row vector and P an n n

More information

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes?

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes? IEOR 3106: Introduction to Operations Research: Stochastic Models Fall 2006, Professor Whitt SOLUTIONS to Final Exam Chapters 4-7 and 10 in Ross, Tuesday, December 19, 4:10pm-7:00pm Open Book: but only

More information

THIELE CENTRE. The M/M/1 queue with inventory, lost sale and general lead times. Mohammad Saffari, Søren Asmussen and Rasoul Haji

THIELE CENTRE. The M/M/1 queue with inventory, lost sale and general lead times. Mohammad Saffari, Søren Asmussen and Rasoul Haji THIELE CENTRE for applied mathematics in natural science The M/M/1 queue with inventory, lost sale and general lead times Mohammad Saffari, Søren Asmussen and Rasoul Haji Research Report No. 11 September

More information

MARKOV PROCESSES. Valerio Di Valerio

MARKOV PROCESSES. Valerio Di Valerio MARKOV PROCESSES Valerio Di Valerio Stochastic Process Definition: a stochastic process is a collection of random variables {X(t)} indexed by time t T Each X(t) X is a random variable that satisfy some

More information

MAS275 Probability Modelling Exercises

MAS275 Probability Modelling Exercises MAS75 Probability Modelling Exercises Note: these questions are intended to be of variable difficulty. In particular: Questions or part questions labelled (*) are intended to be a bit more challenging.

More information

Admission control schemes to provide class-level QoS in multiservice networks q

Admission control schemes to provide class-level QoS in multiservice networks q Computer Networks 35 (2001) 307±326 www.elsevier.com/locate/comnet Admission control schemes to provide class-level QoS in multiservice networks q Suresh Kalyanasundaram a,1, Edwin K.P. Chong b, Ness B.

More information

Reinforcement Learning. Introduction

Reinforcement Learning. Introduction Reinforcement Learning Introduction Reinforcement Learning Agent interacts and learns from a stochastic environment Science of sequential decision making Many faces of reinforcement learning Optimal control

More information

The shortest queue problem

The shortest queue problem The shortest queue problem Ivo Adan March 19, 2002 1/40 queue 1 join the shortest queue queue 2 Where: Poisson arrivals with rate Exponential service times with mean 1/ 2/40 queue 1 queue 2 randomly assign

More information

MATH37012 Week 10. Dr Jonathan Bagley. Semester

MATH37012 Week 10. Dr Jonathan Bagley. Semester MATH37012 Week 10 Dr Jonathan Bagley Semester 2-2018 2.18 a) Finding and µ j for a particular category of B.D. processes. Consider a process where the destination of the next transition is determined by

More information

Markov Decision Processes Chapter 17. Mausam

Markov Decision Processes Chapter 17. Mausam Markov Decision Processes Chapter 17 Mausam Planning Agent Static vs. Dynamic Fully vs. Partially Observable Environment What action next? Deterministic vs. Stochastic Perfect vs. Noisy Instantaneous vs.

More information

On Tandem Blocking Queues with a Common Retrial Queue

On Tandem Blocking Queues with a Common Retrial Queue On Tandem Blocking Queues with a Common Retrial Queue K. Avrachenkov U. Yechiali Abstract We consider systems of tandem blocking queues having a common retrial queue. The model represents dynamics of short

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle  holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

Stochastic Modelling Unit 1: Markov chain models

Stochastic Modelling Unit 1: Markov chain models Stochastic Modelling Unit 1: Markov chain models Russell Gerrard and Douglas Wright Cass Business School, City University, London June 2004 Contents of Unit 1 1 Stochastic Processes 2 Markov Chains 3 Poisson

More information

Performance analysis of queueing systems with resequencing

Performance analysis of queueing systems with resequencing UNIVERSITÀ DEGLI STUDI DI SALERNO Dipartimento di Matematica Dottorato di Ricerca in Matematica XIV ciclo - Nuova serie Performance analysis of queueing systems with resequencing Candidato: Caraccio Ilaria

More information