Approximate Dynamic Programming for Networks: Fluid Models and Constraint Reduction


Michael H. Veatch
Department of Mathematics, Gordon College
April 1, 2005

Abstract

This paper demonstrates the feasibility of using approximate linear programming (ALP) to compute nearly optimal average cost for multiclass queueing network control problems. ALP works directly with the LP form of the optimality equations, but approximates the differential cost by a linear form. We use quadratics, piece-wise quadratic value functions from fluid models, and other approximating functions based on the structure of the optimality equations and numerical experience. The ALP contains one constraint for every state-action pair; however, for quadratics and certain other basis functions, the constraints can be reduced algebraically to a smaller equivalent set. On examples with two to six buffers, bounds on average cost were 14-26% below optimal using small LPs; tighter bounds can be achieved by using more approximating functions and larger LPs. Although the size of the LP is exponential in the number of buffers, for a given level of accuracy, the method requires much less computation than standard value iteration or policy iteration, making accurate bounds obtainable for somewhat larger networks. The ability to compute near-optimal performance measures using LP-based techniques could be very useful in the development and testing of heuristic control policies.

1 Introduction

The increasing size, complexity, and flexibility of manufacturing processes, supply chains, communications, and computer systems have made them increasingly difficult to model and to operate efficiently. Recurrent themes seen in many industries are a large number of interacting processes and significant randomness due to customer demand, which must be rapidly served, in addition to uncertainties in the service process. The natural modeling framework for these systems is a multiclass queueing network (MQNET).
Even under the simplest assumptions of exponentially distributed service and interarrival times and linear holding costs, MQNET control problems are NP-hard, so we cannot hope to solve large problems exactly [18]. Standard policy iteration or value iteration is too computationally intensive to use even on moderate-size problems, particularly in heavy traffic. Many heuristics for scheduling and controlling these systems have been proposed, and some implemented, that have the potential to improve performance. However, the justification of these

policies has been less than satisfactory. One would like to know that a proposed heuristic is stabilizing and robustly near optimal. Significant progress has been made in the analysis of stability, but relatively little is known about suboptimality because of the inability to solve or tightly bound the optimal control problem. The commonly used bounds [3], [2] tend to be loose under balanced heavy traffic, which are often the conditions of most interest. Some heuristics have been shown to be asymptotically optimal for the limiting Brownian control problem in heavy traffic or for the fluid control problem, which essentially considers large buffer contents. While asymptotic optimality provides some guidance, it is generally too loose a criterion for designing near-optimal policies. This paper demonstrates the feasibility of using approximate linear programming (ALP) to compute nearly optimal average cost for MQNETs dramatically faster than exact DP methods. ALP works directly with the LP form of the optimality equations. Compared to approximate value iteration or approximate policy iteration, the ALP approach has stronger theoretical results, uses only standard LP, and is potentially faster [6]. To make the number of variables tractable, the value function is approximated by a linear form. Theory and numerical results suggest that if an accurate class of approximating functions can be found, then the ALP will also be accurate, in the sense explained below. Thus, one key to using ALP is selecting a compact but accurate class of approximating functions, or basis. For MQNETs, we exploit knowledge of the value functions from fluid models, the structure of the optimality equations, and numerical experience to construct the appropriate form of value function. A natural starting point in approximating the differential cost function is the cost to drain of the associated fluid model. Fluid cost has been used to initialize the value iteration algorithm [4].
The fluid cost function is either quadratic or piece-wise quadratic and captures the quadratic growth of the differential cost in each direction in the state space. In this paper we find and use these quadratic regions for a three-class example. For larger examples, the two-station fluid relaxation in [14] could be used to form a piece-wise quadratic approximation. We also use approximating functions motivated by an analysis of the optimality equations similar to [15]. The second key to using the ALP method is reducing the number of constraints. The ALP contains one constraint for every state-action pair, which is impractical for the problems of interest. Using a different quadratic on each set of states defined by which buffers are empty, [16] and [17] show that the constraints can be reduced to a finite set. Their work differs from ours in that they focus on upper and lower bounds for a specific policy. We develop constraint reduction and approximation methods for additional approximating functions, leading to improved bounds. Numerical tests were conducted on networks with two to six buffers. The ALP bound on average cost was 14-26% below optimal; however, the bound requires only the solution of a small LP for these examples. Accuracy improves as more basis functions are added. For example, simply adding indicator functions starting in states with small queue lengths guarantees that the sequence of ALPs converges to optimal. The rate of convergence was tested for a two-station series queue with traffic intensity of 0.8. An error of less than 1% was achieved using 113 basis functions. Solving this LP is much faster than value iteration, which requires a state space with 28^2 = 784 states and many iterations to achieve the same accuracy. It seems possible to identify much more efficient bases, increasing the rate of convergence and making accurate bounds obtainable for networks of up to eight or ten buffers.
Although many real systems require larger models, most networks analyzed by researchers are of this size. The ability to compute near-optimal performance measures using LP-based techniques could be very valuable in the development and testing of heuristics. In addition to giving bounds on average cost, ALPs might be useful for obtaining near-optimal policies. For discounted problems, [6] provides an error bound for the ALP value function. In

particular, a suitable weighted norm of the error is bounded by the minimum of this error norm over all functions in the approximating class, multiplied by a constant that does not depend on problem size. Similar bounds are given on the performance of the policy implied by the ALP value function. For average cost problems, the theory is not as clear. Policies derived from the ALP can have arbitrarily poor performance or even be unstable [5]. A modification of the ALP method is presented in [8] for which a performance bound is provided. We do not address performance in this paper, but we conjecture that performance of the ALP policy will be fairly good for the network problems and approximating functions that we consider. The ALP approach was originally proposed by [1]. It is applied to discounted network problems in [6] using quadratic value function approximations. Instead of constraint reduction, they use importance sampling of constraints, which is shown to be probabilistically accurate in [7]. Bounds have also been obtained using the achievable region method [3], [2] and duality [10]. The rest of this paper is organized as follows. Section 2 defines the MQNET sequencing problem and the associated fluid control problem, and Section 3 describes average cost ALPs. In Section 4 a variety of differential cost approximations are presented, including detailed analysis of a series queue and a reentrant line. Numerical results on the accuracy of various ALPs are presented and the differential cost function examined in Section 5. Convergence of a sequence of ALPs to the optimal average cost is shown in Section 6. Some open questions are discussed in Section 7.

2 Open MQNET sequencing: Discrete and fluid models

In this section we describe the standard MQNET model and the fluid model associated with it. There are n job classes and m resources, or stations, each of which serves one or more classes. Associated with each class is a buffer in which jobs wait for processing.
Let x_i(t) be the number of class i jobs at time t, including any that are being processed. Class i jobs are served by station σ(i). The topology of the network is described by the routing matrix P = [p_ij], where p_ij is the probability that a job finishing service at class i will be routed to class j, independent of all other history, and the m × n constituency matrix C with entries C_ji = 1 if station j serves class i and C_ji = 0 otherwise. If routing is deterministic, then p_{i,s(i)} = 1, where s(i) is the successor of class i. Exogenous arrivals occur at one or more classes according to independent Poisson processes with rate α_i in class i. Processing times are assumed to be independently exponentially distributed with mean m_i = 1/μ_i in class i. To create an open MQNET, the routing matrix P is assumed to be transient, i.e., I + P + P^2 + ... is convergent. As a result, there will be a unique solution to the traffic equation

λ = α + P'λ, i.e., λ = (I − P')^{-1} α.

Here λ_i is the effective arrival rate to class i, including exogenous arrivals and routing from other classes, and vectors are formed in the usual way. The traffic intensity is given by

ρ = C diag(m_1, ..., m_n) λ;

that is, ρ_j is the traffic intensity at station j. Stability requires that ρ < 1.
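As a concrete check of these definitions, the traffic equation and station loads can be computed directly. The sketch below uses a two-class tandem topology (class 1 routed to class 2) with the series-queue rates that appear later in Table 3; NumPy is an assumed dependency.

```python
import numpy as np

# Tandem line: class 1 feeds class 2, then jobs exit (deterministic routing).
alpha = np.array([1.0, 0.0])           # exogenous arrival rates alpha_i
mu = np.array([1.5, 1.25])             # service rates; m_i = 1/mu_i
P = np.array([[0.0, 1.0],              # routing matrix: p_12 = 1
              [0.0, 0.0]])
C = np.array([[1, 0],                  # station 1 serves class 1
              [0, 1]])                 # station 2 serves class 2

# Traffic equation: lambda = alpha + P' lambda  =>  lambda = (I - P')^{-1} alpha
lam = np.linalg.solve(np.eye(2) - P.T, alpha)

# Traffic intensity per station: rho = C diag(m_1, ..., m_n) lambda
rho = C @ (lam / mu)
print(lam, rho)    # effective rates [1, 1]; loads [2/3, 0.8]
```

Here ρ = (2/3, 0.8) < 1 componentwise, so the stability condition holds.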

The network has sequencing control: each server must decide which job class to work on next, or possibly to idle. Preemption is allowed. Let u_i(t) = 1 if class i is served at time t and 0 otherwise. Admissible controls are nonanticipating and have

Cu(t) ≤ 1, u_i(t) ≤ x_i(t).

The first constraint states that a server's allocations cannot exceed one; the second prevents serving an empty buffer. The objective is to minimize the long-run average cost

J(x, u) = limsup_{T→∞} (1/T) E_{x,u} ∫_0^T c'x(t) dt.

Here E_{x,u} denotes expectation given the initial state x(0) = x and policy u. We consider only stationary Markov policies and write u(t) = u(x(t)). We use the uniformized, discrete-time Markov chain and assume that the potential event rate is Σ_{i=1}^n (α_i + μ_i) = 1. Let P_u = [p_u(x, y)] be the transition probability matrix under policy u. It is convenient to introduce the one-step operator T_u, defined by

(T_u h)(x) = c'x + Σ_y p_u(x, y) h(y).

Due to the linearity of T_u with respect to u, only extreme points u_i = 0 and u_i = 1 need be considered. Let A(x) be the set of feasible extreme point controls in state x. Under the condition ρ < 1, the control problem has several desirable properties:

1. An optimal policy exists and its average cost is constant, J* = min_u J(x, u) for all x.

2. There is a solution J and h to the average cost optimality equation

J + h(x) = min_{u ∈ A(x)} (T_u h)(x). (1)

3. Under the additional condition that h is bounded below by a constant and above by a quadratic, there is a unique solution J* and h* to (1) satisfying h*(0) = 0. Furthermore, J* is the optimal average cost, any policy

u*(x) = arg min_u (T_u h*)(x)

is optimal, and h* is the differential cost of this policy,

h*(x) = limsup_{T→∞} [ E_{x,u*} ∫_0^T c'x(t) dt − E_{0,u*} ∫_0^T c'x(t) dt ]. (2)

Properties (1) and (2) can be established using general results for MDPs as in [2, Theorems 7.2.3 and 7.5.6].
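For concreteness, the one-step operator T_u of the uniformized chain can be written out for a two-queue series system. The rates below are illustrative choices satisfying α + μ_1 + μ_2 = 1, not parameters from the paper; unused rate mass becomes a self-transition.

```python
import numpy as np

# One-step operator T_u for the uniformized two-queue series system.
alpha, mu1, mu2 = 0.3, 0.4, 0.3        # alpha + mu1 + mu2 = 1
c = np.array([1.0, 1.5])               # holding costs (illustrative)

def T(h, x, u):
    """(T_u h)(x) = c'x + sum_y p_u(x, y) h(y)."""
    x1, x2 = x
    u1, u2 = u
    val = c @ np.array(x, dtype=float)
    val += alpha * h((x1 + 1, x2))     # arrival to queue 1
    # service at station 1 routes a job to queue 2; else self-transition
    val += mu1 * (h((x1 - 1, x2 + 1)) if u1 and x1 > 0 else h((x1, x2)))
    # service at station 2 removes a job from the system
    val += mu2 * (h((x1, x2 - 1)) if u2 and x2 > 0 else h((x1, x2)))
    return val

h = lambda x: x[0] ** 2 + x[1] ** 2    # arbitrary trial function
print(T(h, (2, 1), (1, 1)))            # -> 9.7 for this h and state
```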
The key conditions are that cost is norm-like (so there are only finitely many low-cost states), state 0 is reached in finite mean time from any state under some policy, and any low-cost

state can be reached from 0 in finite mean time. For networks, properties (1) and (2) are shown in [1, Theorem 7]; (3) is obtained by applying standard verification theorems to networks; see [11, Theorem 2.1 and Section 7]. A more comprehensive treatment is [13, Theorem 10.7]. A natural starting point in approximating the differential cost function is the associated fluid model. In this model all transitions are replaced by their mean rates and a continuous state q_i(t) ∈ R_+ is used. In a fluid control problem, for each initial state there is a time horizon T such that q(t) = 0 for all t ≥ T. The fluid control problem corresponding to (1) is

(FCP) V(x) = min ∫_0^T c'q(t) dt
s.t. q̇(t) = α + Bu(t)
Cu(t) ≤ 1
q(0) = x
q(t) ≥ 0, u(t) ≥ 0,

where α = (α_1, ..., α_n)' and B = (P' − I) diag(μ_1, ..., μ_n). An optimal u(t) can be chosen so that it is piece-wise constant, making q(t) piece-wise linear with q̇(t) existing except on a set of zero measure. We will use the fluid cost to drain V(x) to guide our approximation of h(x). The motivation for this approximation is [1, Theorem 7(iv)], based on [11, Theorem 5.2]. It establishes the following connection between the discrete and fluid cost functions:

lim_{κ→∞} h*(κx) / V(κx) = 1. (3)

The optimal fluid policy u(x) partitions R^n_+ into a finite number of control switching sets where the control is constant. Each region is a convex polyhedral cone emanating from the origin. In particular, if a region contains x, it also contains κx, κ > 0. These regions can be subdivided according to the sequence of switching sets that a trajectory enters next. In a region, say S_k, where a certain sequence of switching sets will be visited, V is quadratic:

V(x) = ½ x'Q_k x, x ∈ S_k. (4)

Thus, (3) implies that

h*(x) = ½ x'Q_k x + o(|x|^2), x ∈ S_k, (5)

and the differential cost is dominated by the fluid cost as queue lengths increase. Similar comparisons can be drawn between the optimal policies of the two problems; see [4].
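The cost to drain is easiest to see in the single-queue case: the fluid q(t) = x − (μ − α)t drains linearly, so V(x) = c x^2 / (2(μ − α)). A minimal numerical sketch (illustrative parameters, not from the paper's examples):

```python
import numpy as np

def fluid_cost(x, c, alpha, mu):
    """Cost to drain a single fluid queue: V(x) = c x^2 / (2 (mu - alpha))."""
    assert alpha < mu, "stability requires alpha < mu"
    return c * x ** 2 / (2.0 * (mu - alpha))

# Cross-check by integrating the holding cost along the draining trajectory.
c, alpha, mu, x0 = 1.0, 0.4, 1.0, 3.0
T = x0 / (mu - alpha)                       # time to drain
t = np.linspace(0.0, T, 100001)
q = x0 - (mu - alpha) * t                   # fluid buffer content
numeric = np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(t))   # trapezoidal rule
print(numeric, fluid_cost(x0, c, alpha, mu))            # both 7.5
```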
Although efficient algorithms exist for computing an optimal fluid trajectory, computing the regions S_k requires knowing the optimal policy and suffers from the same combinatorial complexity inherent in the original problem. In this paper we use small examples where the S_k are known. For larger examples, a two-station fluid relaxation as in [14] could be used to form a piece-wise quadratic approximation.

3 Approximate LP: Average cost bounds

In this section we describe a general method for constructing a linear program in a small number of variables that approximates the differential cost and places a lower bound on average cost.

Although value function approximation can also be used with iterative methods, the direct LP method described here appears the most promising [20], [8]. It is well known that, for finite state spaces, an inequality relaxation of Bellman's equation gives an equivalent LP in the same variables,

(LP) max J
s.t. (T_u h)(x) ≥ J + h(x) for all x ∈ Z^n_+, u ∈ A(x)
h(0) = 0.

For countable state spaces, an additional condition is needed on h. A suitable condition for networks is suggested by (5): for some K > 0 and L > 0,

−L ≤ h(x) ≤ K(1 + |x|^2). (6)

A standard argument for the equivalence of (LP) with (6) is as follows. Using the constraints for the optimal policy and letting x_k denote the state after k transitions (including self-transitions of the uniformized chain),

J ≤ c'x_k + E_{u*}[h(x_{k+1}) | x_k] − h(x_k).

Summing, telescoping, and taking expectations yields

J ≤ (1/N) Σ_{k=0}^{N−1} E_{x,u*} c'x_k + (1/N) E_{x,u*} h(x_N) − (1/N) h(x_0).

We need to show that

lim_{N→∞} (1/N) E_{x,u*} h(x_N) = 0 (7)

so that taking the limit as N → ∞ leaves J ≤ J* for any feasible h. Then, since (J*, h*) are feasible, they are optimal for (LP) with (6). To show (7), use the fact that, for all policies u with finite J(x, u),

lim_{N→∞} (1/N) E_{x,u} |x_N|^2 = 0 (8)

[10, Theorem 1]. Although they assume nonidling policies, (8) also holds for weakly nonidling policies where u(t) ≠ 0 if x(t) ≠ 0, which includes u*. Their result also assumes x is in the recurrent class, but for the optimal policy this extends easily to all states. Combining (8) and (6) gives (7). This exact LP has one variable for every state. To create a tractable LP, the differential cost can be approximated by a linear form

h(x) ≈ Σ_{k=1}^K r_k φ_k(x) = (Φr)(x) (9)

using some small set of basis functions φ_k and variables r_k [1]. Assume that φ_k(0) = 0. The resulting approximate LP is

(ALP) J̄ = max J
s.t. (T_u Φr)(x) ≥ J + (Φr)(x) for all x ∈ Z^n_+, u ∈ A(x)
−L ≤ (Φr)(x) ≤ K(1 + |x|^2) for all x.
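The exact LP is easy to set up for a single M/M/1 queue, where its optimal value can be checked against the known average cost cρ/(1 − ρ). The sketch below truncates the state space at N (blocking arrivals there) and uses scipy.optimize.linprog as an assumed dependency; it is an illustration of (LP), not the paper's implementation.

```python
import numpy as np
from scipy.optimize import linprog

# (LP) for a uniformized M/M/1 queue (lam + mu = 1), truncated at state N.
# Variables: z = (J, h(1), ..., h(N)); h(0) = 0 is fixed by normalization.
lam, mu, c, N = 0.4, 0.6, 1.0, 60
A_ub, b_ub = [], []
for x in range(N + 1):
    row = np.zeros(N + 1)
    row[0] = 1.0                      # coefficient of J
    if x >= 1:
        row[x] += 1.0                 # + h(x)
    up = min(x + 1, N)                # arrival; blocked at the truncation
    row[up] -= lam                    # - lam h(x+1)
    if x >= 2:
        row[x - 1] -= mu              # - mu h(x-1); x = 1 services to h(0) = 0
    A_ub.append(row)
    b_ub.append(c * x)                # (T h)(x) >= J + h(x)  <=>  row.z <= c x
res = linprog([-1.0] + [0.0] * N, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (N + 1))
J = -res.fun
print(J)   # approximately c*rho/(1 - rho) = 2.0 with rho = lam/mu = 2/3
```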

The bounds K and L may depend on r; all that is needed is that the bound applies to each φ_k. Since (ALP) is equivalent to the exact LP with the constraints (9) added, the exact LP is a relaxation. Hence, (ALP) gives a lower bound, J̄ ≤ J*. Clearly (ALP) has an optimal solution, say r*. The approximate LP is still not a manageable size because it has one constraint for each state-action pair. In Section 4, various constraint sets are algebraically reduced to a smaller, but still exponentially large, set of constraints. Reduction will also be combined with approximate methods that choose constraints. The relationship between errors in the differential cost (Φr*)(x) − h*(x) and performance of the associated policy is not obvious. The maximum differential cost error is likely to be unbounded; however, one might hope that a suitable norm of this error could be used to bound error in average cost. A differential cost approximation h defines a myopic policy

u_h(x) = arg min_{u ∈ A(x)} (T_u h)(x).

We would like to bound J_ũ − J* in terms of ||(Φr*)(x) − h*(x)|| for some norm, where ũ is the myopic policy found by (ALP). Unfortunately, for general MDPs [5] shows that the system might be unstable under ũ. However, it is clear when the basis is sufficient.

Proposition 1 If (ALP) has a binding constraint for each state x then (i) J̄ = J* and h* = Φr*. (ii) If {φ_k} are linearly independent on Z^n_+, then (ALP) has a unique optimal solution.

Proof. For each x, J̄ and r* satisfy (1), with the minimum achieved by the action u(x) with the binding constraint in (ALP). By assumption, (Φr*)(0) = 0 and uniqueness of solutions to (1) implies (i); (ii) follows.

4 Value function approximation and constraint reduction

This section considers several bases to approximate the differential cost and demonstrates how the constraints of the resulting ALP can be algebraically reduced to a small, or at least more easily approximated, set. In Section 4.1, constraint reduction is given for quadratic approximations. Section 4.2
analyzes a series queue and demonstrates the role of additional approximating functions. A method of reducing piece-wise quadratic approximations, which are suggested by fluid models, is presented in Section 4.3.

4.1 Quadratic approximation

Consider the quadratic differential cost approximation

h(x) = ½ x'Qx + p'x (10)

where Q = [q_ij] is symmetric. This approximation is motivated by (5). It is also interesting to note that for a single uncontrolled queue,

h*(x) = (c / (2(μ − α))) (x^2 + x). (11)
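Equation (11) can be verified directly from Poisson's equation, c x + α[h(x+1) − h(x)] + μ[h(x−1) − h(x)] = J for x ≥ 1. A quick numerical check (illustrative rates, uniformized so α + μ = 1):

```python
# h*(x) = c (x^2 + x) / (2 (mu - alpha)) solves Poisson's equation for the
# uncontrolled single queue, with average cost J = c alpha / (mu - alpha).
alpha, mu, c = 0.4, 0.6, 1.0
h = lambda x: c * (x * x + x) / (2.0 * (mu - alpha))
J = c * alpha / (mu - alpha)           # = c rho / (1 - rho) = 2.0 here

for x in range(1, 200):
    residual = c * x + alpha * (h(x + 1) - h(x)) + mu * (h(x - 1) - h(x)) - J
    assert abs(residual) < 1e-8        # Poisson's equation holds exactly
print("Poisson's equation holds; J =", J)
```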

The quadratic term in (11) is from the fluid, so h*(x) − V(x) is linear; the effect of randomness is to shift the fluid value function 1/2 unit to the left. In [15], (11) is used as a one-dimensional relaxation of the bottleneck station in (1) and shown to be accurate in an asymptotic sense as traffic intensity approaches one. The constraints of (ALP) can be reduced to a finite set for quadratic h [16, Appendix A]. To simplify notation, consider only deterministic routing. First, we write the constraints as

J ≤ Σ_i (c_i x_i + α_i [h(x + e_i) − h(x)] + u_i μ_i [h(x − e_i + e_{s(i)}) − h(x)]). (12)

Unlike a discounted model, only differences in h appear in these constraints, simplifying the analysis. It is convenient to let x = z + u, so that a control u is feasible for all z ∈ Z^n_+. Substituting (10) into (12) yields

J ≤ d^u + c^{u'}z (13)

where

c^u_i = c_i + Σ_j [α_j q_ij + u_j μ_j (q_{i,s(j)} − q_ij)]
d^u = Σ_i [u_i (c^u_i + μ_i (½ q_ii + ½ q_{s(i),s(i)} − q_{i,s(i)} + p_{s(i)} − p_i)) + α_i (½ q_ii + p_i)]

and c^u = [c^u_i]. But (13) is equivalent to

J ≤ d^u (14)
c^u_i ≥ 0 (15)

for all i and u. If the optimal policy is nonidling, then for a given control u, (15) is only needed for i in

N(u) = { i : Σ_{j: σ(i)=σ(j)} u_j = 1 },

i.e., the classes served by busy stations. Under nonidling there are only |N(u)| + 1 constraints for each u instead of n + 1.

4.2 Series queue: Approximating Poisson's equation

Even when the fluid value function is quadratic, for most networks it only captures the behavior at large queue lengths, and quadratic approximation of the differential cost is inadequate. Numerical evidence of this is presented in Section 5. We propose additional functions φ_k that modify h near the boundaries x_i = 0. The choice of functions is based on an analysis of Poisson's equation for reasonable controls near the boundary. We choose functions that are either in the domain of the generator of the process or that can potentially correct for error in Poisson's equation under the quadratic approximation.
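The boundary error that motivates these extra functions can be seen numerically: with a purely quadratic h, the Poisson-equation error E(x) = (T_u h)(x) − h(x) − J grows linearly along rays in the state space. A sketch for the two-queue series system (all parameter values, the trial h, and the guess for J are illustrative, not from the paper):

```python
import numpy as np

# Poisson-equation error E(x) = (T_u h)(x) - h(x) - J for a quadratic h
# on the two-queue series system (uniformized: alpha + mu1 + mu2 = 1).
alpha, mu1, mu2, c1, c2 = 0.3, 0.4, 0.3, 1.0, 1.5

def h(x1, x2):                         # quadratic trial differential cost
    return 0.5 * (2.0 * x1 * x1 + 1.0 * x1 * x2 + 3.0 * x2 * x2) \
        + 0.2 * x1 + 0.1 * x2

def E(x1, x2, u1, u2, J):
    Tuh = (c1 * x1 + c2 * x2
           + alpha * (h(x1 + 1, x2) - h(x1, x2))
           + u1 * mu1 * (h(x1 - 1, x2 + 1) - h(x1, x2))
           + u2 * mu2 * (h(x1, x2 - 1) - h(x1, x2)))
    return Tuh - J

J = 2.0                                # illustrative average-cost guess
# Along the boundary x2 = 0 (serving class 1 only), E is affine in x1,
# so a quadratic alone cannot satisfy Poisson's equation everywhere:
errs = [E(x, 0, 1, 0, J) for x in range(1, 5)]
diffs = np.diff(errs)
assert np.allclose(diffs, diffs[0])    # constant first differences
print(errs)
```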
Consider a series queue with arrivals at rate α to the first queue (Figure 1). For this problem, (12) is

J ≤ c'x + α [h(x + e_1) − h(x)] + u_1 μ_1 [h(x − e_1 + e_2) − h(x)] + u_2 μ_2 [h(x − e_2) − h(x)]. (16)

Figure 1: Two-stage series system

We consider the case c_1 < c_2, so that station 1 might idle (station 2 is always nonidling), and ρ < 1, so that the fluid policy is: idle station 1 when x_2 > 0 [1]. This policy is greedy and the value function is quadratic:

V(x) = ½ (c_1 / (μ_1 − α)) (x_1 + x_2)^2 + ½ ((c_2 − c_1) / (μ_2 − α)) x_2^2. (17)

Let P_1 denote the transition probability matrix when serving class 1 (u_1 = 1 unless x_1 = 0). Define the functions

φ_1(x) = β^{x_2}, φ_2(x) = x_1 β^{x_2}, φ_3(x) = x_2 β^{x_2},

where β = μ_2 / μ_1. Observe that for x_2 > 0,

(P_1 φ_1)(x) = φ_1(x);

i.e., φ_1 is in the domain of the generator of P_1. Since φ_1 is small when x_2 is large, the error introduced when idling should be fairly small. The other functions are suggested by studying the error in Poisson's equation, defined as

E(x) = (T_u h)(x) − h(x) − J.

Using a quadratic h results in linear error in at least some regions. For example, [15] uses the approximation h(x) = V(x) + ½ (c_1 / (μ_1 − α)) (x_1 + x_2) and J = c_1 α / (μ_1 − α). The resulting error is

E(x) = (1 − u_2) c_1 x_1 + u_1 μ_1 (c_2 − c_1) (x_2 + 1);

i.e., error is zero above the switching curve but is linear in x_2 below the curve when x_2 > 0. A different quadratic makes the error constant below the switching curve when x_2 > 0. Now consider the contribution of φ_2 and φ_3 to the error. For φ_2,

(P_1 φ_2)(x) − φ_2(x) = (α − μ_2) β^{x_2}, x_2 > 0;
(P_1 φ_2)(x) − φ_2(x) = (μ_2 − μ_1) x_1 + (α − μ_2), x_2 = 0

when serving class 1, and

(P_0 φ_2)(x) − φ_2(x) = [(μ_1 − μ_2) x_1 + α] β^{x_2}

when idle. For φ_3,

(P_1 φ_3)(x) − φ_3(x) = (μ_2 − μ_1) β^{x_2}, x_2 > 0;
(P_1 φ_3)(x) − φ_3(x) = μ_2, x_2 = 0

when serving class 1, and

(P_0 φ_3)(x) − φ_3(x) = [(μ_1 − μ_2) x_2 − μ_1] β^{x_2}

when idle. Suitable multiples of φ_2 and φ_3 could eliminate the linear error when x_2 = 0 and compensate for it somewhat in the small-x_2 states. Based on these observations, consider the differential cost approximation

h(x) = ½ x'Qx + p'x + r_1 φ_1(x) + r_2 φ_2(x) + r_3 φ_3(x). (18)

Another advantage of using φ_1 in the approximation is that it gives the desired shape of switching curve (logarithmic in x_1), suggested by numerical experience and [15]. Now we reduce the constraints (16) as far as possible. Substituting (18) into (16) and again using x = z + u, (16) has the form

J ≤ d^u + c^{u'}z + (γ^u + δ^{u'}z) β^{z_2 + u_2} (19)

for all z ∈ Z^2_+ and all u that are nonidling at station 2. Here d^u, c^u, γ^u, and δ^u are linear functions of the variables p, Q, and r; see the Appendix. For u = (1, 1), δ^{(1,1)} = 0, so as z_i → ∞ we must have

c^{(1,1)}_i ≥ 0, i = 1, 2. (20)

Given (20), (19) is tightest at z_1 = 0 for each z_2. However, depending on the value of γ^{(1,1)}, it could be tightest at any z_2. Thus, we will approximate (ALP) by including (19) at z_1 = 0, z_2 = 0, ..., N − 1 for some N. Now consider u = (0, 1). For each z_2, the z_1 coefficient must be nonnegative,

c^{(0,1)}_1 + δ^{(0,1)}_1 β^{z_2 + 1} ≥ 0.

Because of the monotonicity in z_2, this is equivalent to

c^{(0,1)}_1 + δ^{(0,1)}_1 β ≥ 0 (21)

and c^{(0,1)}_1 ≥ 0. Letting z_2 → ∞ in (19) gives another constraint, so we have

c^{(0,1)}_i ≥ 0, i = 1, 2. (22)

Given (21) and (22), (19) is tightest at z_1 = 0 but, depending on γ^{(0,1)} and δ^{(0,1)}, could be tightest at any z_2, so we include (19) at u = (0, 1), z_1 = 0, and z_2 = 0, ..., N − 1. Next, for u = (1, 0) we must have z_2 = 0. For (19) to hold as z_1 → ∞, we must have

c^{(1,0)}_1 ≥ 0. (23)

In light of (23), (19) is tightest at z_1 = 0, so we include (19) at u = (1, 0) and z = (0, 0). Finally, we include (19) at u = z = (0, 0).

Figure 2: Three-class, two-station reentrant line.

To summarize, the approximate reduced ALP contains the 2N + 8 constraints: (19) at u = (1, 1), z_1 = 0, z_2 = 0, ..., N − 1; u = (0, 1), z_1 = 0, z_2 = 0, ..., N − 1; u = (1, 0), z = (0, 0); and u = z = (0, 0); plus (20)-(23). Call this relaxation ALP(N). We end this section with a statement about the relaxation.

Proposition 2 Let J̄_N be the optimal value of ALP(N) for the series queue. For some M, J̄_N = J̄ for all N ≥ M.

The proposition follows from the fact that the limiting constraints as x → ∞, namely (20)-(23), are incorporated in ALP(N).

4.3 Fluid approximation for a reentrant line

The network in Section 4.2 has a greedy fluid policy. In contrast, this section considers the reentrant line studied by Weiss [25]. Its fluid policy has a switching curve that depends on the relative size of the queue lengths, not just whether a queue is empty, making the fluid information richer. Such policies have piece-wise quadratic V(x), and may perform better [3]. Moreover, under natural assumptions about the nature of the optimal policy, their switching surfaces give the correct asymptotic slopes of the switching surfaces of the original problem [4]. We formulate the ALP using the piece-wise quadratic regions from the fluid and propose an approximate constraint reduction. Figure 2 shows a three-class, two-station line where jobs arrive at rate α to class 1. Station 2 serves only class 2 and is the bottleneck, m_2 > m_1 + m_3, where m_i = 1/μ_i is the mean service time for class i. Costs are constant, c_i = 1, so the only decision is whether to serve class 1 or 3 at station 1. As Weiss shows, when x_2 = 0 the fluid policy makes a trade-off between serving class 3, which starves the bottleneck, and serving class 1, which feeds the bottleneck. Class 3 is given priority when x_3 ≤ θ x_1, where

θ = (m_2 − m_1 − m_3) / (m_1 + m_3).

When x_2 > 0 class 3 is served.
Although the control is constant on x_2 > 0, V(x) divides this region into three quadratic regions according to which of three controls will be used next on a trajectory starting from x. The correspondence between quadratic regions and control regions is shown in Table 1. One can verify that these are the three quadratic regions by following the trajectories. Trajectories in the first control

region in Table 1 enter the second control region next; the second and third regions feed into the control region x_2 = 0 and x_3 = 0, which leads to x = 0. Note that this optimal policy idles station 1 in the third control region, but this is just for convenience; a nonidling optimal policy also exists.

Quadratic region (x_2 > 0)                        Control region visited next
                                                  State                  Station 1 serves   q̇
S_1: x_3 > θ x_1 + η x_2                          x_2 = 0, x_3 > θ x_1   3                  (α, 0, −μ_3)
S_2: x_3 ≤ θ x_1 + η x_2 and
     x_3 > ((μ_3 − μ_2)/μ_2) x_2                  x_2 = 0, x_3 ≤ θ x_1   1 and 3            (α − μ_1 u_1, 0, μ_1 u_1 − μ_3 (1 − u_1))
S_3: x_3 ≤ ((μ_3 − μ_2)/μ_2) x_2                  x_2 > 0, x_3 = 0       3 and idle         (α, −μ_2, 0)

Table 1: Quadratic regions of the fluid cost in the reentrant line example (η = (θα + μ_3 − μ_2)/μ_2)

Since V(x) is continuous (in fact it is C^1), the regions S_k can be extended to include x_2 = 0, fully defining V. Hence, we approximate h as quadratic on each of these regions:

h(x) = ½ x'Q_k x + p^{k'}x + f_k, x ∈ S_k, (24)

where S_k, k = 1, ..., 3 are defined in Table 1. Note that we do not require h to be continuous, so the treatment of boundaries matters. The assignment of boundaries in Table 1 was chosen for consistency with the control regions. Now consider constraint reduction for a general network using (24). Assume that there are finitely many S_k, each a polyhedral cone from the origin of full dimension. The dual method of [17, Section 3.4] can be extended to reduce the constraints to a finite set; however, certain approximations are needed to make the constraints tractable. For each control, this approach defines sets of states in which the form of the constraints (12) is constant. Given the control, the transition probabilities are translation invariant. Let {ξ^l}, l = 1, ..., q be the transitions, i.e., p_u(x, x + ξ^l) > 0 for some u and all x such that u ∈ A(x). Let ν = (u, k_0, k_1, ..., k_q) and define

X_ν = { x ∈ S_{k_0} ∩ Z^n_+ : x + ξ^l ∈ S_{k_l} for all l such that p_u(x, x + ξ^l) > 0 }.

There is one index ν for each combination of control u, quadratic region S_{k_0} of the current state, and quadratic region S_{k_l} of possible next states.
If transition l does not occur under u, then k_l = k_0. To illustrate these definitions in the reentrant line example, number the service transitions l = 1, 2, 3 and the arrival transition l = 4. Consider, for example, u = (0, 1, 1) and k_0 = 3, i.e., x ∈ S_3 (see Table 1). Then k_1 = 3 because class 1 is not served, and k_3 = k_4 = 3 because these transitions cannot leave S_3. However, a class 2 service completion could stay in S_3 (k_2 = 3), enter S_2 (k_2 = 2), or, for certain parameter values, enter S_1 (k_2 = 1). Specifically, for certain parameter values, x = (0, 1, 1) ∈ S_3 while x + ξ^2 = (0, 0, 2) ∈ S_1, i.e., x ∈ X_ν where ν = (u, 3, 3, 1, 3, 3). Because S_1 and S_3 only meet at the origin, X_ν can contain only states near the origin. In general, if ν contains k_l ≠ k_0 then X_ν lies within one transition of the hyperplane separating S_{k_0} and S_{k_l}. Again using x = z + u, let Z_ν = {z : z + u ∈ X_ν}. The constraints have the form

J ≤ d_ν + c_ν'z + ½ z'M_ν z, z ∈ Z_ν, (25)

13 where d, c, and M are linear functions of Q k, p k, and f k. The quadratic term M is symmetric. It appears because of transitions between regions S k. The rst approximation is to remove the integer restriction by allowing z Z, where Z is a polyhedron, say fz R n : A z b ; z 0g, whose lattice points are (nearly) the set Z. For simplicity, we allow lattice points on the boundary of Z that are not in Z. This overlap could be avoided by adding more cutting planes. Also, because S k has full dimension and the control u is feasible at all z 0, there is no need for equality constraints in Z. If nonidling controls are desired, constraints of the form z i = 0 can be enforced by removing these variables. Checking (5) exactly for a given d, c, and M is related to determining if M is copositive; instead, following [17], we impose the stronger, simpler conditions and M 0 (6) J d + c 0 z A z b (7) z 0: The key observation is that these constraints are colinear in z and the ALP variables. A dual can be constructed that separates z. Fixed values of J, Q k, p k, and f k satisfy (7) if and only if, for each, the LP min s.t. c 0 z A z b z 0 has optimal value w J d, or equivalently, so does its dual max b 0 y s.t. A 0 y c (8) y 0: Thus, (7) is equivalent to (8) and w J d for all. Reintroducing J, Q k, p k, and f k as variables, the dual form of (ALP) is (ALPD) max J s.t. A 0 y c b 0 y J d M 0 y 0: The two approximations made were restrictions of (ALP); hence, the optimal value J D of (ALPD) is also a lower bound, J D J. 13

Region    Extreme directions
S_1       (0, 0, 1), (0, μ_2, θα + μ_3 − μ_2), (1, 0, θ)
S_2       (1, 0, 0), (0, μ_2, θα + μ_3 − μ_2), (1, 0, θ), (0, μ_2, μ_3 − μ_2)
S_3       (1, 0, 0), (0, 1, 0), (0, μ_2, μ_3 − μ_2)

Table 2: Edges of the quadratic regions in the reentrant line example

(ALPD) has many more variables than (ALP) due to the large number of regions indexed by ν. We propose two reductions. First, if nonidling is assumed, then z_i = 0 for i ∉ N(u), and these z_i can be eliminated before forming the dual. Second, (27) can be interpreted geometrically as checking J ≤ d_ν + c_ν'z at every extreme point z and c_ν'δ ≥ 0 in every extreme direction δ of Z̄_ν. Because the hyperplanes bounding each S_k pass through the origin, the ones bounding Z̄_ν pass within roughly one transition of the origin (there are points in Z_ν within one transition of the S_k boundary). Thus, in a certain sense, the extreme points of Z̄_ν lie near the origin. Also, finding the extreme directions is made easier by the fact that the extreme directions of Z̄_ν are a subset of the extreme directions of S_{k_0}. In particular, Z̄_ν has the ones contained in the common boundary of S_{k_0} and all S_{k_l} (because there are transitions into S_{k_l} from Z_ν). The extreme directions for the example in this section are listed in Table 2. Checking extreme directions in the linear constraints (27) is an exact method; however, we will apply it as an approximate check of (25). Requiring (25) at z = tδ for all t ≥ 0 results in quadratic constraints. For simplicity, we use the stronger conditions c_ν'δ ≥ 0 and δ'M_ν δ ≥ 0. These observations suggest the following approximation to (25). Find the extreme directions {δ^{ν,l}} of Z̄_ν. The relaxation ALP(N) contains the constraints (25) for z ∈ Z_ν with z_i ≤ N − 1, together with

c_ν'δ^{ν,l} ≥ 0 (29)
(δ^{ν,l})'M_ν δ^{ν,l} ≥ 0 (30)

for all ν and all directions δ^{ν,l}. The constraints (25) address the extreme points, while the limiting constraints (29) and (30) in the extreme directions allow faster convergence over N.
Notice that ALP(N) is based on the exact constraints, not the linearization (27), suggesting that ALP(N) might give a tighter bound than (ALPD). However, because of the approximate treatment of the limiting constraints, ALP(N) may not converge to (ALP).

4.4 Additional examples

Two other examples using slightly different value function approximations have been tested. The first is Hajek's arrival routing problem [9]. Arrivals at rate λ must be immediately routed to one of two servers (see Figure 3). The differential cost approximation used is

h(x) = (1/2) x'Qx + p'x + r1 x2 ρ1^{x1} + r2 x1 ρ2^{x2} + r3 ρ1^{x1} + r4 ρ2^{x2},  (31)

which is analogous to (18). The quadratic approximation was also tested on the six-class, two-station network of Figure 4. This example is the largest for which DP results are available; they are taken from [19].

Figure 3: Arrival routing system.

Figure 4: A six-class network.

5 Numerical Results

                  λ              μ                            c                  ρmax
Series queue      1.2            1.5; 1.5                     1; 2               0.8
Arrival routing   1              0.65; 0.65                   1; 2               0.77
Reentrant line    9              22; 10; 22                   1; 1; 1            0.9
6-class network   6/140; 6/140   1/4; 1; 1/8; 1/6; 1/2; 1/7   1; 1; 1; 1; 1; 1   0.6

Table 3: Parameters of the examples

The tightness of the ALP bound was tested by computing the optimal average cost using DP value iteration on a truncated state space. The baseline parameters for each of the four examples are shown in Table 3. Note that these parameters have not yet been scaled; for example, the reentrant line λ and μ_i must be divided by their sum of 63. First, the effect of traffic intensity on the quadratic approximation was tested. In Figure 5, ρ = λ/μ was varied while keeping the μ_i fixed in the series queue. The percent error vanishes in light traffic. This is not surprising, since the five variables in (10) can fit h* exactly in the six states with x1 + x2 ≤ 2, which are a sufficient state truncation in light traffic. Although the DP can only be solved up to a certain ρ, the data suggests that the percent error also vanishes in heavy traffic. A data point has been added at ρ = 1 to show this. Traffic intensity affects error in a similar way when the piecewise quadratic approximation (24) is used for the reentrant line (Figure 6). In this example, as ρ → 1, the geometry of the quadratic regions changes and large ALPs (large truncations N) are required. It should be noted that each of these examples has a single bottleneck station.

The accuracy of the various differential cost approximations is reported in Table 4. The column labelled improved ALP uses the differential cost approximations (18) for the series queue, (31) for arrival routing, and a quadratic with indicator functions for the events x_i = 0 and x_i = 1 for the 6-class network. The optimal average cost for the 6-class network is taken from [19]. Although the bounds on average cost are fairly loose, the LPs to obtain them are quite small. For the ALPs that require truncation, the use of limiting constraints is very effective.
ALP(N) for the series queue gives the same solution for all N ≥ 3; the sizes shown in Table 4 are for the smallest N that achieves this constant result. We tried removing the limiting constraints from ALP(N) for the series queue and found that it converged very slowly, requiring a prohibitively large N to approach the same average cost. At least in some cases, accuracy improves significantly when more variables are added. Adding three variables to the series queue ALP cut the error from 40% to 26%. If additional, similarly effective basis functions could be identified, it appears that accurate bounds could be obtained for relatively little computational effort. Even the LPs with a large number of constraints can be rapidly solved using dual-based methods. Note also that these bounds will be tighter not only in light traffic but in (very) heavy traffic.

The form of the optimal differential cost, h*, was also investigated numerically. The dominant feature of h* is its quadratic growth. To view the other features of h*, a quadratic function was fit to it using least squares over the points 0 ≤ x_i ≤ 10. The result for the series queue is shown in Figure 7. The residuals, plotted on the z axis, are small compared to h*; the largest value of h* on this grid is over 500. The percent residual is larger when x is small and particularly when x2 is small. The residual function has a complex shape, giving further evidence that the r1 and r2 terms in (18) are useful. The graph also suggests that higher order terms, such as x1 x2^2 and x1^2 x2, might be effective; however, adding these terms to the ALP only reduced the error in average cost from 26.5% to 25.7%.

Figure 5: Tightness of average cost bound vs. traffic intensity, series queue.

Figure 6: Tightness of average cost bound vs. traffic intensity, reentrant line.

                  Optimal J*   Percent error in ALP average cost (size of LP)
                               Quadratic       Piece-wise quadratic   Improved ALP
Series queue      9.31         40% (13 x 6)    n/a                    26% (15 x 9)
Arrival routing   5.54         19% (13 x 6)    n/a                    14% (75 x 10)
Reentrant line    11.93        20% (18 x 10)   17% (1337 x 22)
6-class network   2.56         19% (89 x 28)   19% (80 x 34)

Table 4: Accuracy of the ALP average cost

6 ALP Sequences

The numerical results in Section 5 suggest that a sequence of ALPs with larger bases could be solved until the desired accuracy is obtained. However, it is not obvious which sequences of basis functions will work. This section establishes that it is possible to select a sequence of basis functions so that the ALP bound converges to the optimal average cost.

Let the sequence (x^k) be any ordering of the states Z^n_+ that is increasing in total queue length, i.e., if i < j then |x^i| ≤ |x^j|. Assign an indicator function to each state, φ_k(x) = 1 if x = x^k and 0 otherwise. Also define N = |x^K| and φ_0(x) = 1{|x| > N} e^{γ(|x| − N)}. Let ALP[K] be the ALP that uses the functions {φ_0, φ_1, ..., φ_K} in the same manner as (9). Denote the optimal value of ALP[K] by J(K). The proof that J(K) converges to optimal is based on Sennott's approximating sequence method for infinite state space MDPs [22]. The basic idea is to construct an approximating sequence (Δ_N) of MDPs by limiting the total queue length to N through turning off arrivals when |x| = N. Let J_N and h_N(x) be a solution to (1) for Δ_N. Such a solution exists and is in fact the optimal average and differential cost, respectively [22, Proposition and Theorem 6.4]. Sennott gives conditions under which J_N converges to J*. First, we show that these conditions hold for our approximating sequence.

Lemma 3 The truncated MDP converges, lim_{N→∞} J_N = J*.

Proof. We need to verify the (AC) assumptions [22, p. 169]. The first assumption, that (1) has a solution for each N, was addressed above.
Since arrivals can only increase cost, 0 ≤ h_N(x) ≤ h*(x), establishing (AC2) and (AC3), and J_N ≤ J* < ∞, establishing (AC4). Convergence of the ALP sequence follows from the lemma.

Theorem 4 For sufficiently large γ, the ALP sequence converges, lim_{K→∞} J(K) = J*.

Proof. Set r_k = h_N(x^k), so that (Φr)(x) = h_N(x) for |x| ≤ N. We will show that, for sufficiently large N, γ, and r_0, J_N and r are feasible for ALP[K]. That makes J(K) ≥ J_N and lim_{K→∞} J(K) ≥ lim_{N→∞} J_N = J*. But the ALP gives a lower bound, J(K) ≤ J*, so lim_{K→∞} J(K) = J*. Recall that the ALP constraints are

c'x + Σ_y p_u(x, y)(Φr)(y) ≥ J + (Φr)(x).  (32)

Figure 7: Residual (z) in the best quadratic fit to h* for the series queue.

The constraints with |x| ≤ N − 1 are the same as the corresponding optimality equations for Δ_N; hence, they are satisfied by J_N and r defined above. Fix N and choose r_0 = max_x h_N(x). For |x| = N, the left side of (32) differs from the corresponding optimality equation for Δ_N by

P{next transition is an arrival}(r_0 − h_N(x)) ≥ 0,

so (32) is satisfied for these states. Now choose N large enough that min{c'x : |x| > N} ≥ J_N and γ large enough so that, for any x and action u,

Σ_{|y|=|x|} p_u(x, y) + e^γ Σ_{|y|=|x|+1} p_u(x, y) ≥ 1.  (33)

Note that N is chosen so that there are no low-cost states outside of Δ_N and that (33) is a Lyapunov drift condition on φ_0 in the states |x| > N. Such a γ will exist because there are uncontrolled arrivals, making the second sum in (33) nonzero. Then for |x| > N,

Σ_y p_u(x, y)(Φr)(y) ≥ (Φr)(x)

and (32) is satisfied in these states as well.

Any fixed set of basis functions can be added to the ALPs in Theorem 4. The rate of convergence was tested for the series queue, using (18) and the indicator functions. An error of less than 1%, compared to 26% for (18) alone, was achieved using the 104 indicator functions with |x| < 13 (a total of 113 basis functions). An error of 3% was achieved using 44 indicator functions. To achieve 1% accuracy using DP, a state space of 28^2 = 784 states is required if arrivals are ignored at the upper boundary. Considering also the large number of iterations required by value iteration, the ALP achieves the same accuracy with much less computation. Although the proof uses a very specific set of basis functions, Section 5 suggests that convergence will occur for other functions and that it should be possible to preselect more efficient functions, increasing the rate of convergence.

7 Summary and Future Work

We have demonstrated the practicality of computing a tight bound on average cost for small to moderate size networks. Unlike other bounds, the quality of the bound does not degrade in heavy traffic, although the size of the LP used to compute it may grow. The method requires only the selection of approximating functions and the solution of an LP. For some approximating functions, constraint reduction can be applied so that the number of constraints in the LP grows linearly in the number of control actions and the number of buffers. Typically the number of actions is exponential in the number of buffers, but with a small base, and the resulting LP is tractable for much larger systems than can be solved exactly by DP. Even when the constraints cannot be reduced to an equivalent finite set, truncation methods can be effective in selecting a small approximate constraint set. As the number of approximating functions increases, the bound becomes tighter fairly rapidly. More work is needed to determine larger sets of approximating functions that give tighter bounds. Two possibilities are:

- A larger set of exponential decay functions, similar to those in Section 4.2.
- Using the principle that states with at least one small buffer are more important, states could be aggregated with partitions at x_i = m.
Functions that are linear in, say, the largest x_i in each partition and zero elsewhere might be effective.

Several questions of interest remain open:

- Can it be shown that the percent error in the average cost bound vanishes in heavy traffic with a single bottleneck, as suggested by the numerical results? How does it perform in balanced heavy traffic?
- Do the policies recovered from the ALP have good performance? For general MDPs, [5] shows that the answer is no, leading the authors to propose a modified algorithm [8]. However, for the class of network problems considered here a positive answer seems possible.
- Can a comparable upper bound be constructed based on the ALP? Previous upper bounds are on the worst-case performance over a broad class of policies, such as nonidling.

Acknowledgements

Much of the numerical work in this paper was done by Michael Frechette, Melissa LeClair, and Jonathan Senning; Daniel Stahl and Anna Moore also assisted. I would also like to thank Sean Meyn for his many suggestions.

References

[1] F. Avram, D. Bertsimas, and M. Ricard. Fluid models of sequencing problems in open queueing networks: an optimal control approach. In F. P. Kelly and R. J. Williams, editors, Stochastic Networks, Vol. 71 of the IMA Volumes in Mathematics and its Applications. Springer-Verlag, New York, 1995.
[2] D. Bertsimas, D. Gamarnik, and J.N. Tsitsiklis. Performance of multiclass Markovian queueing networks via piecewise linear Lyapunov functions. Ann. Appl. Probab., 11, 2001.
[3] D. Bertsimas, I.Ch. Paschalidis, and J.N. Tsitsiklis. Optimization of multiclass queueing networks: polyhedral and nonlinear characterizations of achievable performance. Ann. Appl. Probab., 4:43-75, 1994.
[4] R-R. Chen and S.P. Meyn. Value iteration and optimization of multiclass queueing networks. Queueing Systems Theory and Appl., 32(1-3):65-97, 1999.
[5] D.P. de Farias and B. Van Roy. Approximate linear programming for average-cost dynamic programming. In Advances in Neural Information Processing Systems 15. MIT Press, 2003.
[6] D.P. de Farias and B. Van Roy. The linear programming approach to approximate dynamic programming. Oper. Res., 51(6), 2003.
[7] D.P. de Farias and B. Van Roy. On constraint sampling for the linear programming approach to approximate dynamic programming. Math. Oper. Res., 29(3):462-478, 2004.
[8] D.P. de Farias and B. Van Roy. A linear program for Bellman error minimization with performance guarantees. In Advances in Neural Information Processing Systems 17. MIT Press, 2005.
[9] B. Hajek. Optimal control of two interacting service stations. IEEE Trans. Automat. Control, AC-29, 1984.
[10] P.R. Kumar and S.P. Meyn. Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies. IEEE Trans. Automat. Control, 41(1):4-17, 1996.
[11] S.P. Meyn. The policy iteration algorithm for Markov decision processes with general state space. IEEE Trans. Automat. Control, AC-42, 1997.
[12] S.P. Meyn. Sequencing and routing in multiclass queueing networks. Part I: Feedback regulation. SIAM J. Control Optim., 40, 2001.
[13] S.P. Meyn. Stability, performance evaluation and optimization. In E. Feinberg and A. Shwartz, editors, Handbook of Markov Decision Processes: Methods and Applications. Kluwer, 2001.
[14] S.P. Meyn. Sequencing and routing in multiclass queueing networks. Part II: Workload relaxations. SIAM J. Control Optim., 42(1):178-217, 2003.
[15] S.P. Meyn. Dynamic safety-stocks for asymptotic optimality in stochastic networks. Dept. of Electrical and Computer Eng., University of Illinois at Urbana-Champaign.

[16] J.R. Morrison and P.R. Kumar. New linear program performance bounds for queueing networks. J. Optim. Theory Appl., 100(3), 1999.
[17] J.R. Morrison and P.R. Kumar. Linear programming performance bounds for Markov chains with polyhedrally translation invariant transition probabilities and applications to unreliable manufacturing systems and enhanced wafer fab models. In Proceedings of IMECE 2002, New Orleans, LA, 2002. Full-length version available at uiuc.edu/prkumar.
[18] C.H. Papadimitriou and J.N. Tsitsiklis. The complexity of optimal queueing network control. Math. Oper. Res., 24:293-305, 1999.
[19] I.Ch. Paschalidis, C. Su, and M.C. Caramanis. Target-pursuing scheduling and routing policies for multiclass queueing networks. IEEE Trans. Automat. Control, 49(10), 2004.
[20] D. Schuurmans and R. Patrascu. Direct value-approximation for factored MDPs. In Advances in Neural Information Processing Systems 14. MIT Press, 2001.
[21] P. Schweitzer and A. Seidmann. Generalized polynomial approximations in Markovian decision processes. J. of Mathematical Analysis and Applications, 110:568-582, 1985.
[22] L.I. Sennott. Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley, New York, 1999.
[23] M.H. Veatch. Fluid analysis of arrival routing. IEEE Trans. Automat. Control, 46, 2001.
[24] M.H. Veatch. Using fluid solutions in dynamic scheduling. In S. B. Gershwin, Y. Dallery, C. T. Papadopoulos, and J. M. Smith, editors, Analysis and Modeling of Manufacturing Systems, New York, 2002. Kluwer.
[25] G. Weiss. On optimal draining of fluid reentrant lines. In F. P. Kelly and R. J. Williams, editors, Stochastic Networks, Vol. 71 of the IMA Volumes in Mathematics and its Applications, New York, 1995. Springer-Verlag.

Appendix: ALP Constraints for the Series Queue

For the series queue of Section 4.2 and the differential cost approximation (18), this appendix gives the ALP constraints in terms of z, where x = z + u.
The terms in (16) are

h(x + e1) − h(x) = q11 x1 + q12 x2 + (1/2)q11 + p1 + r2 ρ2^{x2}

h(x − e1 + e2) − h(x) = (−q11 + q12)x1 + (q22 − q12)x2 + (1/2)q11 + (1/2)q22 − q12 − p1 + p2
    − r1(1 − ρ2)ρ2^{x2} − r2[(1 − ρ2)x1 + ρ2]ρ2^{x2} − r3[(1 − ρ2)x2 − ρ2]ρ2^{x2}

h(x − e2) − h(x) = −q12 x1 − q22 x2 + (1/2)q22 − p2
    + r1(ρ2^{−1} − 1)ρ2^{x2} + r2 x1(ρ2^{−1} − 1)ρ2^{x2} + r3[(ρ2^{−1} − 1)x2 − ρ2^{−1}]ρ2^{x2}.

Using these, (16) can be written as (19), which we restate here:

J ≤ d_u + c_u'z + (φ_u + ψ_u'z) ρ2^{z2+u2}.

For the control u = (1, 1), ψ^{(1,1)} = 0 and

c1^{(1,1)} = c1 − (μ1 − λ)q11 + (μ1 − μ2)q12
c2^{(1,1)} = c2 − (μ1 − λ)q12 + (μ1 − μ2)q22
d^{(1,1)} = c1^{(1,1)} + c2^{(1,1)} + (1/2)(λ + μ1)q11 − μ1 q12 + (1/2)(μ1 + μ2)q22 + (λ − μ1)p1 + (μ1 − μ2)p2
φ^{(1,1)} = r2(λ − μ2) − r3(μ1 − μ2).

For u = (0, 1),

c1^{(0,1)} = c1 + λq11 − μ2 q12
c2^{(0,1)} = c2 + λq12 − μ2 q22
d^{(0,1)} = c2^{(0,1)} + (1/2)λ q11 + (1/2)μ2 q22 + λp1 − μ2 p2
ψ1^{(0,1)} = r2(μ1 − μ2)
ψ2^{(0,1)} = r3(μ1 − μ2)
φ^{(0,1)} = ψ2^{(0,1)} + r1(μ1 − μ2) + λr2 − μ1 r3.

For u = (1, 0), we must have z2 = 0 and ψ1^{(1,0)} = ψ2^{(1,0)} = 0,

c1^{(1,0)} = c1 − (μ1 − λ)q11 + μ1 q12 − μ1 r2(1 − ρ2)
d^{(1,0)} = c1^{(1,0)} + (1/2)(λ + μ1)q11 − μ1 q12 + (1/2)μ1 q22 + (λ − μ1)p1 + μ1 p2 − μ1 r1(1 − ρ2) − r2(μ1 ρ2 − λ) + μ1 ρ2 r3.

Finally, for u = x = (0, 0), φ^{(0,0)} = 0 and

d^{(0,0)} = (1/2)λ q11 + λp1 + λr2.
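The finite differences used in this appendix can be checked numerically. The sketch below assumes the quadratic-plus-geometric form h(x) = (1/2)x'Qx + p'x + (r1 + r2 x1 + r3 x2)ρ^{x2}; this reading of (18), and every coefficient value below, is an illustrative assumption rather than fitted data:

```python
import random

# Assumed form: h(x) = 0.5 x'Qx + p'x + (r1 + r2*x1 + r3*x2) * rho**x2,
# with arbitrary illustrative coefficients.
q11, q12, q22 = 1.3, -0.4, 2.1
p1, p2 = 0.7, 1.9
r1, r2, r3 = 2.0, -1.0, 0.5
rho = 0.8

def h(x1, x2):
    quad = 0.5 * q11 * x1 * x1 + q12 * x1 * x2 + 0.5 * q22 * x2 * x2
    return quad + p1 * x1 + p2 * x2 + (r1 + r2 * x1 + r3 * x2) * rho ** x2

random.seed(0)
for _ in range(100):
    x1, x2 = random.randint(1, 30), random.randint(1, 30)
    # Arrival difference: linear in x plus a single geometric term.
    lhs = h(x1 + 1, x2) - h(x1, x2)
    rhs = q11 * x1 + q12 * x2 + 0.5 * q11 + p1 + r2 * rho ** x2
    assert abs(lhs - rhs) < 1e-8
    # Station-1 service completion difference (x1 down, x2 up).
    lhs = h(x1 - 1, x2 + 1) - h(x1, x2)
    rhs = ((-q11 + q12) * x1 + (q22 - q12) * x2
           + 0.5 * q11 + 0.5 * q22 - q12 - p1 + p2
           - (r1 * (1 - rho) + r2 * ((1 - rho) * x1 + rho)
              + r3 * ((1 - rho) * x2 - rho)) * rho ** x2)
    assert abs(lhs - rhs) < 1e-8
print("identities hold")
```

Because each one-step difference stays linear in x apart from a ρ^{x2} factor, the ALP constraint for each control splits into a linear part and a geometric part, which is exactly the d, c, φ, ψ bookkeeping above.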


More information

On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming

On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 462 478 issn 0364-765X eissn 1526-5471 04 2903 0462 informs doi 10.1287/moor.1040.0094 2004 INFORMS On Constraint Sampling in the Linear

More information

Distributed Optimization. Song Chong EE, KAIST

Distributed Optimization. Song Chong EE, KAIST Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

Approximate Linear Programming for Network Control: Column Generation and Subproblems

Approximate Linear Programming for Network Control: Column Generation and Subproblems Approximate Linear Programming for Network Control: Column Generation and Subproblems Michael H. Veatch and Nathan Walker Department of Mathematics, Gordon College, Wenham, MA 01984 mike.veatch@gordon.edu

More information

Production Capacity Modeling of Alternative, Nonidentical, Flexible Machines

Production Capacity Modeling of Alternative, Nonidentical, Flexible Machines The International Journal of Flexible Manufacturing Systems, 14, 345 359, 2002 c 2002 Kluwer Academic Publishers Manufactured in The Netherlands Production Capacity Modeling of Alternative, Nonidentical,

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Applications. Stephen J. Stoyan, Maged M. Dessouky*, and Xiaoqing Wang

Applications. Stephen J. Stoyan, Maged M. Dessouky*, and Xiaoqing Wang Introduction to Large-Scale Linear Programming and Applications Stephen J. Stoyan, Maged M. Dessouky*, and Xiaoqing Wang Daniel J. Epstein Department of Industrial and Systems Engineering, University of

More information

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 43, NO. 3, MARCH

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 43, NO. 3, MARCH IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 43, NO. 3, MARCH 1998 315 Asymptotic Buffer Overflow Probabilities in Multiclass Multiplexers: An Optimal Control Approach Dimitris Bertsimas, Ioannis Ch. Paschalidis,

More information

ECON2285: Mathematical Economics

ECON2285: Mathematical Economics ECON2285: Mathematical Economics Yulei Luo Economics, HKU September 17, 2018 Luo, Y. (Economics, HKU) ME September 17, 2018 1 / 46 Static Optimization and Extreme Values In this topic, we will study goal

More information

Resource Pooling for Optimal Evacuation of a Large Building

Resource Pooling for Optimal Evacuation of a Large Building Proceedings of the 47th IEEE Conference on Decision and Control Cancun, Mexico, Dec. 9-11, 28 Resource Pooling for Optimal Evacuation of a Large Building Kun Deng, Wei Chen, Prashant G. Mehta, and Sean

More information

Optimal scaling of average queue sizes in an input-queued switch: an open problem

Optimal scaling of average queue sizes in an input-queued switch: an open problem DOI 10.1007/s11134-011-9234-1 Optimal scaling of average queue sizes in an input-queued switch: an open problem Devavrat Shah John N. Tsitsiklis Yuan Zhong Received: 9 May 2011 / Revised: 9 May 2011 Springer

More information

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018 Section Notes 9 Midterm 2 Review Applied Math / Engineering Sciences 121 Week of December 3, 2018 The following list of topics is an overview of the material that was covered in the lectures and sections

More information

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti 1 MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti Historical background 2 Original motivation: animal learning Early

More information

Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank

Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank David Glickenstein November 3, 4 Representing graphs as matrices It will sometimes be useful to represent graphs

More information

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974 LIMITS FOR QUEUES AS THE WAITING ROOM GROWS by Daniel P. Heyman Ward Whitt Bell Communications Research AT&T Bell Laboratories Red Bank, NJ 07701 Murray Hill, NJ 07974 May 11, 1988 ABSTRACT We study the

More information

OPTIMAL CONTROL OF STOCHASTIC NETWORKS - AN APPROACH VIA FLUID MODELS

OPTIMAL CONTROL OF STOCHASTIC NETWORKS - AN APPROACH VIA FLUID MODELS OPTIMAL CONTROL OF STOCHASTIC NETWORKS - AN APPROACH VIA FLUID MODELS Nicole Bäuerle Department of Mathematics VII, University of Ulm D-8969 Ulm, Germany, baeuerle@mathematik.uni-ulm.de Abstract We consider

More information

8 Periodic Linear Di erential Equations - Floquet Theory

8 Periodic Linear Di erential Equations - Floquet Theory 8 Periodic Linear Di erential Equations - Floquet Theory The general theory of time varying linear di erential equations _x(t) = A(t)x(t) is still amazingly incomplete. Only for certain classes of functions

More information

Information Relaxation Bounds for Infinite Horizon Markov Decision Processes

Information Relaxation Bounds for Infinite Horizon Markov Decision Processes Information Relaxation Bounds for Infinite Horizon Markov Decision Processes David B. Brown Fuqua School of Business Duke University dbbrown@duke.edu Martin B. Haugh Department of IE&OR Columbia University

More information

LP Duality: outline. Duality theory for Linear Programming. alternatives. optimization I Idea: polyhedra

LP Duality: outline. Duality theory for Linear Programming. alternatives. optimization I Idea: polyhedra LP Duality: outline I Motivation and definition of a dual LP I Weak duality I Separating hyperplane theorem and theorems of the alternatives I Strong duality and complementary slackness I Using duality

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE MULTIPLE CHOICE QUESTIONS DECISION SCIENCE 1. Decision Science approach is a. Multi-disciplinary b. Scientific c. Intuitive 2. For analyzing a problem, decision-makers should study a. Its qualitative aspects

More information

Dynamic Control of a Tandem Queueing System with Abandonments

Dynamic Control of a Tandem Queueing System with Abandonments Dynamic Control of a Tandem Queueing System with Abandonments Gabriel Zayas-Cabán 1 Jungui Xie 2 Linda V. Green 3 Mark E. Lewis 1 1 Cornell University Ithaca, NY 2 University of Science and Technology

More information

Value and Policy Iteration

Value and Policy Iteration Chapter 7 Value and Policy Iteration 1 For infinite horizon problems, we need to replace our basic computational tool, the DP algorithm, which we used to compute the optimal cost and policy for finite

More information

Solving Extensive Form Games

Solving Extensive Form Games Chapter 8 Solving Extensive Form Games 8.1 The Extensive Form of a Game The extensive form of a game contains the following information: (1) the set of players (2) the order of moves (that is, who moves

More information

On the Approximate Linear Programming Approach for Network Revenue Management Problems

On the Approximate Linear Programming Approach for Network Revenue Management Problems On the Approximate Linear Programming Approach for Network Revenue Management Problems Chaoxu Tong School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,

More information

Quadratic and Copositive Lyapunov Functions and the Stability of Positive Switched Linear Systems

Quadratic and Copositive Lyapunov Functions and the Stability of Positive Switched Linear Systems Proceedings of the 2007 American Control Conference Marriott Marquis Hotel at Times Square New York City, USA, July 11-13, 2007 WeA20.1 Quadratic and Copositive Lyapunov Functions and the Stability of

More information

A Hierarchy of Suboptimal Policies for the Multi-period, Multi-echelon, Robust Inventory Problem

A Hierarchy of Suboptimal Policies for the Multi-period, Multi-echelon, Robust Inventory Problem A Hierarchy of Suboptimal Policies for the Multi-period, Multi-echelon, Robust Inventory Problem Dimitris J. Bertsimas Dan A. Iancu Pablo A. Parrilo Sloan School of Management and Operations Research Center,

More information

Microeconomics, Block I Part 1

Microeconomics, Block I Part 1 Microeconomics, Block I Part 1 Piero Gottardi EUI Sept. 26, 2016 Piero Gottardi (EUI) Microeconomics, Block I Part 1 Sept. 26, 2016 1 / 53 Choice Theory Set of alternatives: X, with generic elements x,

More information

STABILITY OF QUEUEING NETWORKS AND SCHEDULING POLICIES. P. R. Kumar and Sean P. Meyn y. Abstract

STABILITY OF QUEUEING NETWORKS AND SCHEDULING POLICIES. P. R. Kumar and Sean P. Meyn y. Abstract STABILITY OF QUEUEING NETWORKS AND SCHEDULING POLICIES P. R. Kumar and Sean P. Meyn y Abstract Usually, the stability of queueing networks is established by explicitly determining the invariant distribution.

More information

Stability of the two queue system

Stability of the two queue system Stability of the two queue system Iain M. MacPhee and Lisa J. Müller University of Durham Department of Mathematical Science Durham, DH1 3LE, UK (e-mail: i.m.macphee@durham.ac.uk, l.j.muller@durham.ac.uk)

More information

QUALIFYING EXAM IN SYSTEMS ENGINEERING

QUALIFYING EXAM IN SYSTEMS ENGINEERING QUALIFYING EXAM IN SYSTEMS ENGINEERING Written Exam: MAY 23, 2017, 9:00AM to 1:00PM, EMB 105 Oral Exam: May 25 or 26, 2017 Time/Location TBA (~1 hour per student) CLOSED BOOK, NO CHEAT SHEETS BASIC SCIENTIFIC

More information

1. Introduction. Consider a single cell in a mobile phone system. A \call setup" is a request for achannel by an idle customer presently in the cell t

1. Introduction. Consider a single cell in a mobile phone system. A \call setup is a request for achannel by an idle customer presently in the cell t Heavy Trac Limit for a Mobile Phone System Loss Model Philip J. Fleming and Alexander Stolyar Motorola, Inc. Arlington Heights, IL Burton Simon Department of Mathematics University of Colorado at Denver

More information

Robust Solutions to Multi-Objective Linear Programs with Uncertain Data

Robust Solutions to Multi-Objective Linear Programs with Uncertain Data Robust Solutions to Multi-Objective Linear Programs with Uncertain Data M.A. Goberna yz V. Jeyakumar x G. Li x J. Vicente-Pérez x Revised Version: October 1, 2014 Abstract In this paper we examine multi-objective

More information

Multicommodity Flows and Column Generation

Multicommodity Flows and Column Generation Lecture Notes Multicommodity Flows and Column Generation Marc Pfetsch Zuse Institute Berlin pfetsch@zib.de last change: 2/8/2006 Technische Universität Berlin Fakultät II, Institut für Mathematik WS 2006/07

More information

On Finding Optimal Policies for Markovian Decision Processes Using Simulation

On Finding Optimal Policies for Markovian Decision Processes Using Simulation On Finding Optimal Policies for Markovian Decision Processes Using Simulation Apostolos N. Burnetas Case Western Reserve University Michael N. Katehakis Rutgers University February 1995 Abstract A simulation

More information

In search of sensitivity in network optimization

In search of sensitivity in network optimization In search of sensitivity in network optimization Mike Chen, Charuhas Pandit, and Sean Meyn June 13, 2002 Abstract This paper concerns policy synthesis in large queuing networks. The results provide answers

More information

Electronic Companion Fluid Models for Overloaded Multi-Class Many-Server Queueing Systems with FCFS Routing

Electronic Companion Fluid Models for Overloaded Multi-Class Many-Server Queueing Systems with FCFS Routing Submitted to Management Science manuscript MS-251-27 Electronic Companion Fluid Models for Overloaded Multi-Class Many-Server Queueing Systems with FCFS Routing Rishi Talreja, Ward Whitt Department of

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information

Asymptotics for Polling Models with Limited Service Policies

Asymptotics for Polling Models with Limited Service Policies Asymptotics for Polling Models with Limited Service Policies Woojin Chang School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 30332-0205 USA Douglas G. Down Department

More information

Efficient Implementation of Approximate Linear Programming

Efficient Implementation of Approximate Linear Programming 2.997 Decision-Making in Large-Scale Systems April 12 MI, Spring 2004 Handout #21 Lecture Note 18 1 Efficient Implementation of Approximate Linear Programming While the ALP may involve only a small number

More information

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT MARCH 29, 26 LECTURE 2 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT (Davidson (2), Chapter 4; Phillips Lectures on Unit Roots, Cointegration and Nonstationarity; White (999), Chapter 7) Unit root processes

More information

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Semiparametric Multinomial Choice Models Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper

More information

and to estimate the quality of feasible solutions I A new way to derive dual bounds:

and to estimate the quality of feasible solutions I A new way to derive dual bounds: Lagrangian Relaxations and Duality I Recall: I Relaxations provide dual bounds for the problem I So do feasible solutions of dual problems I Having tight dual bounds is important in algorithms (B&B), and

More information

On the Partitioning of Servers in Queueing Systems during Rush Hour

On the Partitioning of Servers in Queueing Systems during Rush Hour On the Partitioning of Servers in Queueing Systems during Rush Hour This paper is motivated by two phenomena observed in many queueing systems in practice. The first is the partitioning of server capacity

More information

Optimal Rejuvenation for. Tolerating Soft Failures. Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi.

Optimal Rejuvenation for. Tolerating Soft Failures. Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi. Optimal Rejuvenation for Tolerating Soft Failures Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi Abstract In the paper we address the problem of determining the optimal time

More information

A Starvation-free Algorithm For Achieving 100% Throughput in an Input- Queued Switch

A Starvation-free Algorithm For Achieving 100% Throughput in an Input- Queued Switch A Starvation-free Algorithm For Achieving 00% Throughput in an Input- Queued Switch Abstract Adisak ekkittikul ick ckeown Department of Electrical Engineering Stanford University Stanford CA 9405-400 Tel

More information

Linear Programming. Scheduling problems

Linear Programming. Scheduling problems Linear Programming Scheduling problems Linear programming (LP) ( )., 1, for 0 min 1 1 1 1 1 11 1 1 n i x b x a x a b x a x a x c x c x z i m n mn m n n n n! = + + + + + + = Extreme points x ={x 1,,x n

More information

15 Closed production networks

15 Closed production networks 5 Closed production networks In the previous chapter we developed and analyzed stochastic models for production networks with a free inflow of jobs. In this chapter we will study production networks for

More information

Simplex Algorithm for Countable-state Discounted Markov Decision Processes

Simplex Algorithm for Countable-state Discounted Markov Decision Processes Simplex Algorithm for Countable-state Discounted Markov Decision Processes Ilbin Lee Marina A. Epelman H. Edwin Romeijn Robert L. Smith November 16, 2014 Abstract We consider discounted Markov Decision

More information

Heuristic Search Algorithms

Heuristic Search Algorithms CHAPTER 4 Heuristic Search Algorithms 59 4.1 HEURISTIC SEARCH AND SSP MDPS The methods we explored in the previous chapter have a serious practical drawback the amount of memory they require is proportional

More information

Distributionally Robust Convex Optimization

Distributionally Robust Convex Optimization Submitted to Operations Research manuscript OPRE-2013-02-060 Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However,

More information

Approximation Metrics for Discrete and Continuous Systems

Approximation Metrics for Discrete and Continuous Systems University of Pennsylvania ScholarlyCommons Departmental Papers (CIS) Department of Computer & Information Science May 2007 Approximation Metrics for Discrete Continuous Systems Antoine Girard University

More information

Positive Harris Recurrence and Diffusion Scale Analysis of a Push Pull Queueing Network. Haifa Statistics Seminar May 5, 2008

Positive Harris Recurrence and Diffusion Scale Analysis of a Push Pull Queueing Network. Haifa Statistics Seminar May 5, 2008 Positive Harris Recurrence and Diffusion Scale Analysis of a Push Pull Queueing Network Yoni Nazarathy Gideon Weiss Haifa Statistics Seminar May 5, 2008 1 Outline 1 Preview of Results 2 Introduction Queueing

More information