Using Static Flow Patterns in Time-Staged Resource Allocation Problems


Arun Marar, Warren B. Powell, Hugo P. Simão
Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ
September 6, 2006

Abstract

We address the problem of combining a cost-based simulation model, which makes decisions over time by minimizing a cost model, with rule-based policies, where a knowledgeable user would like certain types of decisions to happen with a specified frequency when averaged over the entire simulation. These rules are designed to capture issues that are difficult to quantify as costs, but which produce more realistic behaviors in the judgment of a knowledgeable user. We consider patterns that are specified as averages over time, which have to be enforced in a model that makes decisions while stepping through time (for example, while optimizing the assignment of resources to tasks). We show how an existing simulation, as long as it uses a cost-based optimization model while stepping through time, can be modified to more closely match exogenously specified patterns.

Introduction

Frequently, we find that optimization models of complex operational problems produce results which run against the insights of knowledgeable experts. It is nice when these differences represent improvements that save money, but it is frequently the case that the differences simply reflect missing or incomplete information about the real problem. For example, a truckload carrier may need to assign longer loads to drivers who own their tractors (as opposed to drivers who use company-owned equipment) because these drivers need to make more money to cover the equipment costs. We may not be able to quantify the cost of assigning such a driver to a shorter load, but we do know that we are happy if the average length of the loads to which these drivers are assigned matches a corporate goal.

Making optimization models match corporate goals (as opposed to simply minimizing costs) is very common in engineering practice, and it is usually achieved through the inclusion of soft bonuses and penalties to encourage the model to produce certain behaviors. Tuning these soft parameters is typically ad hoc and can be quite time consuming. A more formal strategy, introduced by Marar et al. (2006), is to add a penalty term to produce a modified objective function of the form

$$\min_{x \in X} \; C(x) + \theta \| x - x^p \|,$$

where $x$ is the flow produced by the model, and $x^p$ is a flow that we are trying to match using an exogenously specified pattern. The resulting problem is a nonlinear programming problem that can be solved using standard algorithms.

We often encounter time-staged problems where the same challenge of meeting corporate operating statistics arises. The problems may be stochastic, or we may be using a temporal decomposition simply because the problems are too large. For example, we may be simulating the assignment of drivers to loads over a planning horizon.
We know the cost of assigning a driver to a load (we can minimize these costs at a point in time), but by the end of the simulation, we want the model to produce statistics that meet certain goals when averaged over the entire simulation.
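The penalty idea above can be illustrated with a toy problem. The sketch below (all names and numbers are hypothetical, and a simple grid search stands in for a real nonlinear programming solver) shows how the weight $\theta$ trades off pure cost minimization against matching a target flow $x^p$:

```python
# Illustrative sketch (not from the paper): a two-decision toy problem showing
# how the penalty theta * ||x - x^p||^2 pulls a cost-minimizing flow toward a
# user-specified pattern.  All names and numbers here are hypothetical.

def objective(x, cost, x_pattern, theta):
    """Cost plus a quadratic penalty on deviation from the target flow."""
    c = sum(ci * xi for ci, xi in zip(cost, x))
    dev = sum((xi - pi) ** 2 for xi, pi in zip(x, x_pattern))
    return c + theta * dev

def best_split(total, cost, x_pattern, theta, steps=100):
    """Grid-search the split of `total` units between two decisions."""
    best = None
    for i in range(steps + 1):
        x = (total * i / steps, total * (1 - i / steps))
        val = objective(x, cost, x_pattern, theta)
        if best is None or val < best[0]:
            best = (val, x)
    return best[1]

cost = (1.0, 2.0)          # decision 1 is cheaper
x_pattern = (3.0, 7.0)     # but the pattern asks for mostly decision 2
x_cheap = best_split(10.0, cost, x_pattern, theta=0.0)
x_match = best_split(10.0, cost, x_pattern, theta=100.0)
# With theta = 0 the model sends everything to the cheap decision;
# with a large theta the flow moves toward the exogenous pattern.
```

With $\theta = 0$ the cheap decision takes all ten units; with a large $\theta$ the optimal split moves to roughly the (3, 7) pattern, which is the behavior the penalty term is designed to produce.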

In our applications, these corporate goals are always expressed as static patterns. This means that while we are solving the problem using a method that steps through time, the decisions, when aggregated over the entire simulation, need to match specific targets. This challenge arises in virtually every project we encounter with the sponsors of CASTLE Laboratory at Princeton. Examples of specific projects (all of which have been solved using the techniques in this paper) include:

- Locomotive management at a railroad - One pattern was to assign a particular type of locomotive (e.g., a specific horsepower class) to a particular train (e.g., intermodal trains) 80 percent of the time (intermodal trains need to move quickly to compete with trucks).

- Routing and scheduling for cryogenic gases - The pattern specified that drivers who had just delivered gases at a particular customer would then move an average distance to the next customer (this helped provide more realistic clustering of customers).

- Managing drivers at a major less-than-truckload carrier - One pattern specified that drivers in Chicago, with a home domicile of Cleveland, might be assigned to a load going to Indianapolis 10 percent of the time (this tells the model that it is possible, but not common, to send drivers in this direction).

- Military airlift problems - The pattern might specify that C-5 aircraft should be assigned to move cargo into the Middle East 7 percent of the time (bases in the Middle East might not have good repair facilities for this type of aircraft).

- Truckload motor carriers - Team drivers (drivers moving in pairs) should be assigned to loads that were between 700 and 800 miles in length 20 percent of the time (this helped the model match average length of haul statistics).

- Managing boxcars for a railroad - Customers requesting boxcars would receive empties from a particular location 40 percent of the time (sometimes a customer had special needs that were met by cars from a specific location).
All of these problems were solved using methods that stepped through time using the techniques of approximate dynamic programming (see, for example, Topaloglu & Powell (2006),

Powell & Van Roy (2004) or Powell et al. (2005)). In each case, our ability to gain user acceptance was significantly improved by our ability to match user-specified patterns to obtain more realistic behaviors from the model. This required the ability to solve a problem at a point in time, while matching statistics that were measured over the entire simulation.

In this paper, we assume that we are solving a dynamic model one time period at a time, stepping forward through time (for example, using a myopic simulation, a rolling horizon procedure or a more advanced technique such as approximate dynamic programming). We assume that we are given static flow patterns that we wish to use to guide the behavior of a dynamic model. Thus, we might like to assign a particular type of driver to long loads 70 percent of the time, but in any one time period we may not be able to meet this target. It is not necessary (and often may not be possible) to match the pattern at any one point in time. The goal is to match it over time.

Although our original motivation was to match exogenous patterns to improve user acceptance, there is another use of static patterns which we investigate in this paper. All of our problems are defined over a time horizon and are too hard to solve as a single optimization problem, either because the problem is stochastic or because the problem is simply too large. As a result, we are forced to use some sort of approximation. It is typically the case, however, that we can solve static versions of the same model using commercial solvers. We can view the optimal solution of the static model as an exogenous pattern, and test whether this improves the quality of the solution produced by our dynamic approximation.

We propose an algorithm that modifies an existing (typically approximate) algorithm which steps through time, producing results that more closely match a static pattern. We establish the following properties for the algorithm.
1) For the case of continuous, nonreusable resources (resources are consumed in each time period), we introduce a modified model to be solved at each point in time that guarantees that the deviation from a static flow pattern is reduced after each time period.

2) For the case of reusable resources, we introduce an iterative algorithm which adapts to static patterns.

3) We show experimentally that using the optimal solution of a static problem (which is much smaller) as a pattern to guide an approximate solution of a dynamic problem improves overall solution quality.

The organization of the paper is as follows. In section 1 we present the dynamic resource allocation problem which is our motivating application. We present the dynamic resource allocation model in two settings: reusable resources, which arise in the context of fleet management, and nonreusable resources, which arise in the context of production planning. In section 2 we introduce our approach for incorporating static flow patterns in the optimization model. This approach combines a traditional cost function with a proximal term called the pattern metric, which measures the deviation between the static flow patterns and the patterns generated by solving the time-staged approximation. The technique is then developed for two major problem classes. The first, presented in section 3, assumes that resources are nonreusable, which is to say that decisions made about resources in one time period do not affect the resources available in the next time period. This special case is easily solved to optimality in a time-staged manner (since each time period is independent), allowing us to focus on the challenge of making decisions over time that match a static pattern. We are able to prove specific convergence results for this problem class. Then, section 4 introduces the problem of reusable resources, where decisions made in one time period need to consider the downstream impact on future time periods. Section 5 describes a specific resource allocation problem as an instance of the more difficult case of reusable resources, for which we want to demonstrate that static flow patterns can improve the solution obtained by approximate policies that are applied over time in a simulation. Experimental results in section 6 show that we can improve the overall solution quality when we introduce static flow patterns. We present our conclusions in section 7.
1 The Dynamic Resource Allocation Problem

We begin by presenting a model of a resource allocation problem where the resources are reusable. Our work is motivated by problems in freight transportation which involve the management of vehicles (aircraft, tractors, trailers, box cars, containers) which have to be moved over space and time. After finishing a move, the vehicle becomes empty and available to be assigned to a new load of freight or to be repositioned (empty) to another location. To model this problem, we use the following notation. Our problem is modeled in discrete

time over the set $T = \{1, \ldots, T\}$. Resources are modeled using:

$a$ = vector of attributes of a resource.
$A$ = attribute space of $a$.
$R_{ta}$ = number of resources with attribute vector $a$ at time $t$.
$R_t = (R_{ta})_{a \in A}$, known as the resource state vector.

Decisions and costs are given by:

$D_a$ = set of specific decisions that can be applied to resources with attribute vector $a$.
$x_{tad}$ = number of resources with attribute vector $a$ acted on by decision $d \in D_a$ at time $t$.
$x_t = (x_{tad})_{a \in A, d \in D_a}$.
$c_{tad}$ = cost of making a decision $d$ on resource attribute vector $a$ at time $t$, $a \in A$, $d \in D_a$, $t \in \{1, \ldots, T\}$.
$c_t = (c_{tad})_{a \in A, d \in D_a}$.

The optimization problem over a finite horizon is written as:

$$\min \sum_{t \in T} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} \quad (1)$$

subject to, for all $t \in T$,

$$A_t x_t - R_t = 0, \quad (2)$$
$$B_t x_t - R_{t+1} = 0, \quad (3)$$
$$x_t \geq 0. \quad (4)$$

The problem in (1) can be hard to solve because of complexities such as uncertainty, integrality constraints, time windows on tasks and a high level of detail in defining actual operations.
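As a concrete toy instance of the model (1)-(4), the sketch below (all data hypothetical) evaluates the cost objective and checks the flow-conservation constraint (2) for a single period:

```python
# Hypothetical toy instance of the model (1)-(4): two attribute vectors, two
# decisions, one period.  We evaluate the cost objective (1) and check that the
# flows use exactly the available resources (constraint (2)).

def total_cost(c, x):
    """Objective (1): sum of c_tad * x_tad over all (t, a, d)."""
    return sum(c[key] * x[key] for key in x)

def conserves_resources(x, R, t):
    """Constraint (2): for each attribute a, flows out of a sum to R_ta."""
    out = {}
    for (tt, a, d), v in x.items():
        if tt == t:
            out[a] = out.get(a, 0.0) + v
    return all(abs(out.get(a, 0.0) - R[(t, a)]) < 1e-9
               for (tt, a) in R if tt == t)

c = {(1, 'a1', 'd1'): 3.0, (1, 'a1', 'd2'): 5.0, (1, 'a2', 'd1'): 4.0}
x = {(1, 'a1', 'd1'): 2.0, (1, 'a1', 'd2'): 1.0, (1, 'a2', 'd1'): 3.0}
R = {(1, 'a1'): 3.0, (1, 'a2'): 3.0}
cost = total_cost(c, x)        # 3*2 + 5*1 + 4*3 = 23
ok = conserves_resources(x, R, 1)
```

A real instance would of course be solved with a linear (or integer) programming solver; this fragment only makes the bookkeeping behind (1)-(2) concrete.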

It is common to solve time-staged problems such as (1) using techniques that step through time. Let:

$X^\pi_t(R_t)$ = vector of decisions returned by a policy $\pi \in \Pi$ given the resource state $R_t$.

There are several classes of policies that illustrate this function. A myopic policy uses the rule:

$$X^\pi_t(R_t) = \arg\min_{x_t \in X_t} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad}, \quad t \in T, \quad (5)$$

where $X_t$ is the feasible set defined by the constraints (2)-(4). A rolling horizon policy would plan events over a planning horizon $T^{ph} \leq T$ in the future and is given by

$$X^\pi_t(R_t) = \arg\min_{x_t, \ldots, x_{t+T^{ph}}} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \sum_{t'=t+1}^{t+T^{ph}} \sum_{a \in A} \sum_{d \in D_a} c_{t'ad} x_{t'ad}, \quad t \in \{1, \ldots, T - T^{ph}\},$$

where we optimize over $x_t, \ldots, x_{t+T^{ph}}$ but only implement $x_t$. Finally, we might use a dynamic programming policy:

$$X^\pi_t(R_t) = \arg\min_{x_t \in X_t} \left( \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \bar{V}_{t+1}(R_{t+1}) \right), \quad (6)$$

where $\bar{V}_{t+1}$ is an approximation of the value of being in resource state $R_{t+1} = B_t x_t$.

For simplicity of notation, we have presented our model assuming single-period transformation times, that is, resources which are acted on in period $t$ reappear in period $t+1$. An important special case arises when resources are not reusable, which we would represent using $B_t = 0$. Our goal is to obtain flows $x_t$ at a point in time which, when averaged over time, closely match the static flow patterns. In the next section we introduce the basis of our methodology that allows us to make decisions that match the static flow patterns.
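The myopic policy (5) can be sketched for the special case in which constraint (2) decouples by attribute, so each pool of resources simply takes its cheapest decision. All names and data below are hypothetical illustration values:

```python
# Sketch of the myopic policy (5) for the special case where constraint (2)
# decouples by attribute: each pool of R_ta resources is assigned entirely to
# its cheapest decision.  Data and names are illustrative only.

def myopic_policy(costs, R_t):
    """costs: {a: {d: c_tad}}; R_t: {a: supply}.  Returns flows {(a, d): x}."""
    x = {}
    for a, supply in R_t.items():
        d_best = min(costs[a], key=costs[a].get)  # cheapest decision for a
        x[(a, d_best)] = supply
    return x

costs = {'a1': {'d1': 3.0, 'd2': 5.0}, 'a2': {'d1': 4.0, 'd2': 2.0}}
x = myopic_policy(costs, {'a1': 3.0, 'a2': 1.0})
# All of a1 goes to d1, and all of a2 goes to d2.
```

When decisions compete for shared capacity the minimization in (5) no longer decouples and a network or LP solver is needed, but the policy structure (solve one small problem per period, implement it, step forward) is the same.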

2 Representation of Static Flow Patterns

We first develop the notation to represent information pertaining to static flow patterns. We assume that exogenous patterns are specified in the form

$\rho^s_{ad}$ = the fraction of time that resources with attribute $a$ are acted on with decisions of type $d$.

Thus the vector $\rho^s_a = (\rho^s_{ad})_{d \in D_a}$ represents the probability mass function of the decisions $d$ acting on the resource attribute vector $a$. In practice, it is typically the case that attributes (and decisions) are expressed at some level of aggregation, although we do not consider this possibility in this paper. To compare with static flow patterns, we normalize the decisions made by the model over the entire time horizon as shown below:

$$\rho_{ad}(x) = \frac{\sum_{t \in T} x_{tad}}{\sum_{t \in T} \sum_{d' \in D_a} x_{tad'}}, \quad d \in D_a, \; a \in A. \quad (7)$$

We now present the optimization model in the following form:

$$\arg\min \left[ \sum_{t \in T} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \theta H(\rho(x), \rho^s) \right] \quad (8)$$

subject to

$$A_t x_t = R_t, \quad x_t \geq 0, \quad t \in T, \quad (9)$$

where $H$ is a penalty function known as the pattern metric that penalizes deviations of the vector $\rho(x) = (\rho_{ad}(x))_{a \in A, d \in D_a}$ from the static flow patterns $\rho^s = (\rho^s_{ad})_{a \in A, d \in D_a}$. This penalty is weighted by a positive scaling factor $\theta$. The formulation in (8) holds true for both reusable and nonreusable resources if we note that in the case of nonreusable resources $B_t = 0$. In the next paragraphs we derive the functional form of the pattern metric from a goodness-of-fit test metric used widely in statistics.

We adopt a quadratic form of the pattern metric in (8) motivated by the popular Pearson goodness-of-fit metric (Read & Cressie (1988), Pearson (1900)). The Pearson goodness-of-fit

metric is a popular statistical test of whether a particular sample of data might have been drawn from a hypothesized probability distribution denoted by $H_0$. Consider observing a random variable which can take one of the possible outcomes in the set $\{d_i\}_{i \in I}$, where $\rho_i$ is the probability of outcome $d_i$. The outcomes $\{d_i\}_{i \in I}$ are mutually exclusive and $\sum_{i \in I} \rho_i = 1$. We assume $\rho_i > 0$ for all $i \in I$. Consider a scenario where we observe $N$ realizations of this random variable. We can summarize our observations using the vector $(\hat\rho_i)_{i \in I}$, where $\hat\rho_i$ denotes the fraction of the sample that is observed with outcome $d_i$, $i \in I$. We hypothesize a probability vector for the null model using $H_0 : \rho = (\rho_i)_{i \in I}$ where $\rho_i > 0$ for all $i \in I$. If the observations are independent and identically distributed, the Pearson goodness-of-fit metric is a chi-squared statistic given by

$$\chi^2 = \sum_{i \in I} \frac{N}{\rho_i} (\hat\rho_i - \rho_i)^2. \quad (10)$$

The null hypothesis $H_0$ (that is, that the observations of the random variable follow the distribution $\rho$) is rejected if the Pearson goodness-of-fit metric in (10) exceeds a certain threshold.

The Pearson goodness-of-fit metric in its original form has a disadvantage because of the presence of the probabilities in the denominator of the function. This is particularly inconvenient because we do not require the time-staged model to prohibit decisions that do not occur in the static flow pattern. Thus we adopt a simple variant of the Pearson goodness-of-fit metric as our functional form of the pattern metric, giving the optimization model

$$\min_{x_t \in X_t, \, t \in T} \sum_{t=1}^{T} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \theta \sum_{a \in A} \sum_{d \in D_a} \left( \frac{\sum_{t=1}^{T} x_{tad}}{\bar{R}_a} - \rho^s_{ad} \right)^2, \quad (11)$$

where $X_t$ is the feasible region defined by constraints (2)-(4) for time $t$ and $\bar{R}_a = \sum_{t \in T} \sum_{d \in D_a} x_{tad}$

is the total number of resources with attribute $a$ over the entire horizon.

We first develop our methodology for solving the model with a pattern metric in (11) in a setting with nonreusable resources. In this setting, each time period represents a separate optimization problem with no coupling across time periods. Thus, if we do not consider static flow patterns, we can obtain the overall optimal solution simply by optimizing each time period. Introducing static flow patterns requires that we make decisions which, over time, minimize deviations from the exogenous pattern.

3 Static Flow Patterns with Nonreusable Resources

In this section, we focus on the problem of nonreusable resources, by which we mean that resources in time period $t$ are not carried forward to the next time period. If we did not face the challenge of matching a static flow pattern (which applies to activities over all time periods) we would be able to solve each time period independently. Such models tend to arise in strategic planning settings where the time periods are fairly large. In subsection 3.1 we present an algorithm for the case of continuous resources. Subsection 3.2 proves convergence of the algorithm. Finally, subsection 3.3 shows how to adapt the algorithm for the case of discrete resources.

3.1 The Continuous Case

A dynamic resource allocation model with nonreusable resources is solved as a sequence of models over the set $T$ given by

$$\bar{x}_t = \arg\min_{x_t \in X_t} \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad}, \quad t \in T. \quad (12)$$

Our goal is to develop a methodology that solves the model in (11) in a time-staged manner compatible with the techniques introduced in section 1. We let the optimal solution of our

objective function with the pattern metric be

$$x^*_t(\theta) = \arg\min_{x_t \in X_t} \left[ \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \theta H_t(x_t) \right], \quad t \in T, \quad (13)$$

where $H_t$ is a function whose specific form we derive from the pattern metric $H(\rho(x), \rho^s)$ later in this subsection. Thus, $x^*_t(\theta)$ is our solution with the pattern metric, while $\bar{x}_t = x^*_t(0)$ is the solution obtained using only the cost function. With the application of the policy in (13) in the case of nonreusable resources we can show that

$$H(\rho(x^*(\theta)), \rho^s) \leq H(\rho(\bar{x}), \rho^s) \quad \forall \theta > 0, \quad (14)$$

where $x^*(\theta) = (x^*_t(\theta))_{t \in T}$ and $\bar{x} = (\bar{x}_t)_{t \in T}$. The rest of this subsection is devoted to deriving the functional form of $H_t$.

The normalized decision variables over the entire time horizon are given by

$$\rho_{ad}(x) = \frac{\sum_{t=1}^{T} x_{tad}}{\bar{R}_a}, \quad a \in A, \; d \in D_a. \quad (15)$$

We suppress the dependence of $x$ on $\theta$ to simplify notation. The pattern metric proposed in (11) is given by

$$H(\rho(x), \rho^s) = \sum_{a \in A} \sum_{d \in D_a} (\rho_{ad}(x) - \rho^s_{ad})^2. \quad (16)$$

We can define the normalized decision variables specific to a stage $t$ using

$$\rho_{tad} = \frac{x_{tad}}{R_{ta}}, \quad a \in A, \; d \in D_a, \; t \in \{1, \ldots, T\}.$$

Analogous to the decision variable $x_{tad}$ we define

$\bar{x}_{tad}$ = number of resources with attribute vector $a$ acted on by decision $d \in D_a$ at time $t$ in the optimal solution of (12).
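The normalization (15) and the quadratic pattern metric (16) can be made concrete with a small sketch. The flows and targets below are hypothetical illustration values:

```python
# Hypothetical sketch of equations (15)-(16): normalize flows over the horizon
# and compute the quadratic pattern metric against the exogenous pattern rho^s.

def pattern_metric(x, rho_s):
    """x: {(t, a, d): flow}; rho_s: {(a, d): target fraction}."""
    flow_ad, flow_a = {}, {}
    for (t, a, d), v in x.items():
        flow_ad[(a, d)] = flow_ad.get((a, d), 0.0) + v
        flow_a[a] = flow_a.get(a, 0.0) + v
    rho = {(a, d): v / flow_a[a] for (a, d), v in flow_ad.items()}    # eq. (15)
    return sum((rho.get(k, 0.0) - p) ** 2 for k, p in rho_s.items())  # eq. (16)

x = {(1, 'a1', 'd1'): 4.0, (1, 'a1', 'd2'): 1.0,
     (2, 'a1', 'd1'): 2.0, (2, 'a1', 'd2'): 3.0}
H = pattern_metric(x, {('a1', 'd1'): 0.5, ('a1', 'd2'): 0.5})
# rho = (0.6, 0.4) against the target (0.5, 0.5), so H = 0.01 + 0.01 = 0.02.
```

Note that the metric only sees fractions aggregated over the whole horizon: the period-1 flows are far from the target on their own, but the metric charges only the horizon-wide deviation, which is exactly the behavior described in the introduction.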

Using the same notation for $\rho$ we let

$$\bar\rho_{tad} = \frac{\bar{x}_{tad}}{R_{ta}}, \quad a \in A, \; d \in D_a, \; t \in \{1, \ldots, T\}. \quad (17)$$

Similar to the expression in (15) we define the normalized solution to the problem in (12) using

$$\bar\rho_{ad} = \frac{\sum_{t=1}^{T} \bar{x}_{tad}}{\bar{R}_a}, \quad a \in A, \; d \in D_a,$$

which we may rewrite as

$$\bar\rho_{ad} = \sum_{t=1}^{T} \frac{\bar{x}_{tad}}{\bar{R}_a} = \sum_{t=1}^{T} \frac{R_{ta}}{\bar{R}_a} \cdot \frac{\bar{x}_{tad}}{R_{ta}} = \sum_{t=1}^{T} \left( \frac{R_{ta}}{\bar{R}_a} \right) \bar\rho_{tad}, \quad (18)$$

where the last step uses the substitution in equation (17). We denote the gradient of $H(\rho(x), \rho^s)$ with respect to the normalized decision variable $\rho_{ad}$ at the value $\bar\rho_{ad}$ as $\bar{h}_{ad}$, which is found using

$$\bar{h}_{ad} = \left. \frac{\partial H}{\partial \rho_{ad}} \right|_{\rho_{ad} = \bar\rho_{ad}} = 2 \left( \bar\rho_{ad} - \rho^s_{ad} \right), \quad a \in A, \; d \in D_a. \quad (19)$$

Using equation (18) and the relation $\sum_{t=1}^{T} R_{ta} = \bar{R}_a$ we can rewrite equation (19) as

$$\bar{h}_{ad} = 2 \left( \sum_{t=1}^{T} \left( \frac{R_{ta}}{\bar{R}_a} \right) \bar\rho_{tad} - \rho^s_{ad} \right), \quad a \in A, \; d \in D_a. \quad (20)$$

When we solve a subproblem at time $t$ using equation (13), we have already obtained the solution vectors $x^*_{t'}$ for all $1 \leq t' < t$. Our static flow pattern may be telling us to send 30 percent of a particular type of vehicle to a particular location, whereas if we look at the time periods before $t$, we may be doing this only 20 percent of the time. This information could be used

as we progress through time to help us match the static flow pattern, but it is ignored in the expression for the gradient of the pattern metric in (20). We incorporate information regarding prior decisions by adopting a Gauss-Seidel strategy (see Strang (1988), p. 381). We first define

$$\rho^{GS}_{tad} = \underbrace{\sum_{t'=1}^{t-1} \frac{x^*_{t'ad}}{\bar{R}_a}}_{\mathrm{I}} + \underbrace{\sum_{t'=t}^{T} \frac{\bar{x}_{t'ad}}{\bar{R}_a}}_{\mathrm{II}}, \quad a \in A, \; d \in D_a, \; t \in \{1, \ldots, T\}, \quad (21)$$

where term I uses the decisions already implemented in the periods before $t$ and term II uses the base solution of (12) for the current and later periods. The Gauss-Seidel gradient of the pattern metric is given by

$$h^*_{tad} = 2 \left( \rho^{GS}_{tad} - \rho^s_{ad} \right), \quad a \in A, \; d \in D_a, \; t \in \{1, \ldots, T\}. \quad (22)$$

The pattern metric itself can be calculated at the beginning of every subproblem using

$$H^*_{t-1} = \sum_{a \in A} \sum_{d \in D_a} \left( \rho^{GS}_{tad} - \rho^s_{ad} \right)^2, \quad t \in \{1, \ldots, T+1\}.$$

Note that $H^*_0$ is simply the pattern metric that evaluates the optimal solution $\bar{x}$ of model (12) and $H^*_T$ is the pattern metric that evaluates the solution $x^*$ of model (13), which incorporates the static flow patterns.

3.2 Convergence Results

We establish two useful results. The first shows that the Gauss-Seidel version of the algorithm monotonically improves the pattern metric as we step forward in time during a single iteration. We then establish overall convergence of the algorithm. The following theorem establishes monotonic improvement of the pattern metric within an iteration:

Theorem 1 For all $t$, $t \in \{1, \ldots, T\}$, if we solve the following quadratic programming problem:

$$x^*_t(\theta) = \arg\min_{x_t} \sum_{a \in A} \sum_{d \in D_a} \left( c_{tad} x_{tad} + \theta \int_0^{x_{tad}} \left[ \frac{h^*_{tad}}{\bar{R}_a} + 2(u - \bar{x}_{tad}) \right] du \right) \quad (23)$$

subject to

$$A_t x_t = R_t, \quad x_t \geq 0,$$

then we obtain

$$H^*_T \leq H^*_{T-1} \leq \cdots \leq H^*_0. \quad (24)$$

Thus, the pattern metric evaluated after solving each subproblem in (23) forms a monotonically decreasing sequence in time $t$. Consequently, the function $H_t(x_t)$ that we adopt in the formulation given in (13) is given by:

$$H_t(x_t) = \sum_{a \in A} \sum_{d \in D_a} \int_0^{x_{tad}} \left[ \frac{h^*_{tad}}{\bar{R}_a} + 2(u - \bar{x}_{tad}) \right] du.$$

Proof: See appendix.

Theorem 1 proves the expression in (14), thus validating our approach of solving the time-staged sequence of models stated in (13). We next show that the decisions produced by equation (23) produce the optimal solution to the objective function given in equation (11). The proof of convergence is obtained by showing that the model in (23) is identical to solving the model in (11) using an iterative method known as the block coordinate descent (BCD) method. The proof uses existing convergence results for this class of algorithms.

The block coordinate descent method is a popular technique for minimizing a real-valued continuously differentiable function $f$ of $m$ real variables subject to upper bounding constraints. In this method the coordinates of $f$ are partitioned into $M$ blocks and at each iteration, $f$ is minimized with respect to one of the coordinate blocks while the other coordinates are held fixed. This method is closely related to Gauss-Seidel methods for equation solving (Ortega & Rheinboldt (1970) and Warga (1963)). Convergence of the block coordinate descent method typically requires that $f$ be strictly convex, differentiable and, taking into account the bounded constraints, have bounded level sets (Sargent & Sebastian (1973), Warga (1963)).
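The Gauss-Seidel quantities in equations (21)-(22) can be sketched with a small numeric example. All data below are hypothetical, and for simplicity a single $(a, d)$ pair is tracked as a list of flows over time:

```python
# Illustrative sketch (hypothetical data) of equations (21)-(22): flows for
# periods before t come from the solutions already implemented this pass,
# while periods t..T still use the base solution x_bar; the gradient then
# compares the blended fraction to the exogenous target rho_s.

def gs_rho(t, x_star, x_bar, r_bar_a):
    """rho^GS_tad for one (a, d): periods 1..t-1 from x_star, t..T from x_bar."""
    done = sum(x_star[:t - 1])        # implemented periods 1 .. t-1 (term I)
    remaining = sum(x_bar[t - 1:])    # periods t .. T from the base solution (term II)
    return (done + remaining) / r_bar_a

def gs_gradient(t, x_star, x_bar, r_bar_a, rho_s):
    return 2.0 * (gs_rho(t, x_star, x_bar, r_bar_a) - rho_s)

# T = 3 periods, 10 resources of attribute a in total, target rho_s = 0.5.
x_bar = [2.0, 2.0, 2.0]    # base solution uses the decision 6/10 of the time
x_star = [4.0, 0.0, 0.0]   # period 1 actually used it more heavily
g = gs_gradient(2, x_star, x_bar, r_bar_a=10.0, rho_s=0.5)
# Blended fraction = (4 + 2 + 2) / 10 = 0.8, so the gradient is positive,
# pushing period 2 to use this decision less.
```

This is the sense in which the Gauss-Seidel strategy lets earlier periods' actual decisions steer later subproblems toward the static pattern.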

We formally describe the BCD algorithm below using the notation developed in Tseng (2000):

Initialization. Choose any $x^0 = (x^0_1, \ldots, x^0_M) \in X$.

Iteration $n$, $n \geq 1$. Given $x^{n-1} = (x^{n-1}_1, \ldots, x^{n-1}_M) \in X$, choose an index $s \in \{1, \ldots, M\}$ and compute a new iterate $x^n = (x^n_1, \ldots, x^n_M) \in X$ satisfying

$$x^n_s = \arg\min_{x_s} f(x^{n-1}_1, \ldots, x^{n-1}_{s-1}, x_s, x^{n-1}_{s+1}, \ldots, x^{n-1}_M), \quad (25)$$

$$x^n_j = x^{n-1}_j, \quad j \neq s, \; j \in \{1, \ldots, M\}.$$

The minimization in (25) is attained if the set $\{x : f(x) \leq f(x^0)\}$ is bounded and $f$ is lower semicontinuous on this compact set (Rockafellar (1972)). To ensure convergence of the algorithm it is further required that each coordinate block be chosen sufficiently often. One of the most commonly used methods to achieve this is the cyclic rule. According to the cyclic rule there exists a constant $\bar{M} \geq M$ such that every index $j \in \{1, \ldots, M\}$ is chosen at least once between the $n$-th iteration and the $(n + \bar{M} - 1)$-th iteration. A well-known case of this rule is when $\bar{M} = M$, according to which an index $s$ is set to $k \in \{1, \ldots, M\}$ at iterations $k$, $k + M$, $k + 2M$, ....

It is clear why the BCD method is attractive for solving the model with a pattern metric given in (11) in the case where the resources are nonreusable. The number of blocks is equal to the number of time periods $T$. By fixing the values for $T - 1$ blocks at any iteration we only need to optimize over the decision variables representing one time period, say index $t$. In the case where the resources are nonreusable the advantage of the BCD method is realized because we can optimize over the feasible region $X_t$ ignoring all other constraints. This is exactly what we exploited in developing our algorithm in (23). It should be noted that we do not require the initial solution $x^0$ to be the optimal solution of the optimization model

solved without the pattern metric. We used this as our initial solution in theorem 1 only to validate our approach of capturing information in an optimization model.

If we adopt the cyclic rule in the BCD methodology applied to our optimization model in (11), then at any iteration $n \geq 1$ the time period (block) $t$ that we minimize over is given by:

$$t = n - \left\lfloor \frac{n-1}{T} \right\rfloor T,$$

where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$. The key to understanding the connection between the BCD methodology and our problem is that a subproblem solved at time period $t$ is an iteration of the BCD methodology. The Gauss-Seidel gradient of the pattern metric given in (22) after iteration $n$ can be expressed as

$$h^{*,n}_{tad} = 2 \left( \frac{\sum_{t'=1}^{T} x^{*,n}_{t'ad}}{\bar{R}_a} - \rho^s_{ad} \right), \quad n \geq 1, \; a \in A, \; d \in D_a, \; t \in T. \quad (26)$$

We compute $x^{*,n}$ as follows. If $t = n - \lfloor (n-1)/T \rfloor T$, then $x^{*,n}_t = \arg\min_{x \in X} f^n(x)$, where

$$f^n(x) = \sum_{a \in A} \sum_{d \in D_a} \left( c_{tad} x_{tad} + \theta \int_0^{x_{tad}} \left[ \frac{h^{*,n-1}_{tad}}{\bar{R}_a} + 2(u - x^{*,n-1}_{tad}) \right] du \right). \quad (27)$$

Otherwise, we simply set $x^{*,n}_{t'} = x^{*,n-1}_{t'}$. Any feasible solution in $X$ can be used to initialize $x^{*,0}$, so we may use $x^{*,0}_t = \bar{x}_t$.

A direct application of the BCD methodology suggests the following procedure. At iteration $n \geq 1$, if $t = n - \lfloor (n-1)/T \rfloor T$, then $x^{*,n}_t = \arg\min_{x \in X} f^{BCD,n}(x)$, where

$$f^{BCD,n}(x) = \sum_{t' \neq t} \sum_{a \in A} \sum_{d \in D_a} c_{t'ad} x^{*,n-1}_{t'ad} + \sum_{a \in A} \sum_{d \in D_a} c_{tad} x_{tad} + \theta \sum_{a \in A} \sum_{d \in D_a} \left( \frac{\sum_{t' \neq t} x^{*,n-1}_{t'ad} + x_{tad}}{\bar{R}_a} - \rho^s_{ad} \right)^2. \quad (28)$$

Otherwise, we simply set $x^{*,n}_{t'} = x^{*,n-1}_{t'}$.

We conclude this subsection by showing that our methodology in (23) is a provably convergent algorithm for solving the optimization model with a pattern metric. We first show

that the application of the BCD method to the optimization model with a pattern metric given in (11) and our methodology in (23) are exactly the same, that is, we prove the following:

Theorem 2 The minimizers of $f^{BCD,n}$ given in (28) and $f^n$ given in (27) are identical, that is:

$$\arg\min_{x \in X} f^{BCD,n}(x) = \arg\min_{x \in X} f^n(x), \quad n \geq 1, \; t = n - \left\lfloor \frac{n-1}{T} \right\rfloor T.$$

Proof: The proof is provided in the appendix.

The proof of convergence follows directly from the properties of the optimization model with a pattern metric. In Warga (1963) it is shown that the application of the BCD methodology to a convex function converges to the optimal solution if the following statements are true:

- The optimization model with a pattern metric given by equation (11) is continuously differentiable in some neighborhood (relative to $X = \prod_{t \in T} X_t$) of every stationary point of this function.

- For every $t$, $t = 1, \ldots, T$, $f^{BCD,n}(x_t)$ (or $f^n(x_t)$) is a strictly convex function of $x_t$ for all iterations $n \geq 1$.

- $X$ is compact.

The first condition holds since the model with a pattern metric is differentiable everywhere. The second condition of strict convexity also holds because of the quadratic form of the pattern metric (see appendix). The feasible region $X$ is compact in most real-world applications.

3.3 The Discrete Case

Many operational problems are characterized by integrality constraints on the decision variables, as is indicated by the wide application of integer resource allocation problems. Such applications arise in airline fleet assignment (Barnhart et al. (2000), Hane et al. (1995)), air

traffic control (Bertsimas & Patterson (2000)), railcar management (Holmberg et al. (1998), Jordan & Turnquist (1983)), container distribution (Crainic et al. (1993)) and general fleet management (Powell & Carvalho (1997)). In this subsection we show how we can approximate the model in (23) to generate integer solutions. Moreover, we show that we can solve the resulting problem as a network if the original structure of the problem (that is, the cost function without the pattern metric) is a network.

There is a literature on solving quadratic cost functions and more general convex cost problems as network flow problems. Minoux (1984) developed a polynomial-time algorithm for obtaining a real-valued optimal solution of a quadratic form of the objective function similar to the model objective in (23). It is further shown in Minoux (1986) that this method can be used to obtain an integer optimal solution to the general convex flow problem. We use a method (see Ahuja et al. (1993)) that approximates a quadratic function using a piecewise linear model. We then show that this formulation can be solved as a network and use the well-known fact that solving a network with integer data as a linear program yields integer solutions.

The objective function in (23) can be expressed as $\sum_{a \in A} \sum_{d \in D_a} C_{tad}(x_{tad})$, where:

$$C_{tad}(x_{tad}) = c_{tad} x_{tad} + \theta \int_0^{x_{tad}} \left[ \frac{h^*_{tad}}{\bar{R}_a} + 2(u - \bar{x}_{tad}) \right] du.$$

Since $x_{tad}$ cannot exceed the number of occurrences of state $a$ at time $t$, denoted by $R_{ta}$, we can approximate $C_{tad}(x_{tad})$ by at most $\lceil R_{ta} \rceil = \lfloor R_{ta} \rfloor + I_{\{R_{ta} - \lfloor R_{ta} \rfloor > 0\}}$ linear segments. The set $\{0, 1, \ldots, \lceil R_{ta} \rceil\}$ denotes the breakpoints of the piecewise linear approximation. The linear cost coefficient on any interval $[u-1, u]$, $u \in \{1, \ldots, \lceil R_{ta} \rceil\}$, is obtained by taking the gradient of $C_{tad}(x_{tad})$ evaluated at $x_{tad} = u$, which is given by $c_{tad} + \theta \left[ h^*_{tad}/\bar{R}_a + 2(u - \bar{x}_{tad}) \right]$. Let $\sum_{u=1}^{\lceil R_{ta} \rceil} y^u_{tad} = x_{tad}$, where $0 \leq y^u_{tad} \leq 1$, $u \in \{1, \ldots, \lceil R_{ta} \rceil\}$.
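The per-segment coefficients just described can be sketched numerically. All inputs below are made-up illustration values; the point is that the coefficients grow with the segment index $u$:

```python
# Hypothetical sketch of the piecewise-linear segment costs described above:
# the quadratic per-arc cost C_tad is replaced by unit-capacity segments whose
# coefficient on [u-1, u] is the gradient c + theta*(h/R_bar + 2*(u - x_bar))
# evaluated at x = u.

import math

def segment_costs(c, theta, h, r_bar_a, x_bar, r_ta):
    """Linear coefficient for each unit segment u = 1, ..., ceil(r_ta)."""
    n_segments = math.ceil(r_ta)
    return [c + theta * (h / r_bar_a + 2.0 * (u - x_bar))
            for u in range(1, n_segments + 1)]

costs = segment_costs(c=5.0, theta=1.0, h=2.0, r_bar_a=10.0, x_bar=2.0, r_ta=4)
# The coefficients increase with u, so a min-cost flow fills the segments in
# order, which is what makes the piecewise-linear network formulation valid.
```

The increasing coefficients reflect the convexity of the quadratic penalty: each additional unit of flow on the same pattern arc costs more than the last, so a solver never skips a cheap segment in favor of an expensive one.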
Using the piecewise linear approximation of $C_{tad}(x_{tad})$ we can represent the quadratic formulation in (23) using

$$y^*_t(\theta) = \arg\min_{y_t \in Y_t} \sum_{a \in A} \sum_{d \in D_a} \sum_{u=1}^{\lceil R_{ta} \rceil} \left( c_{tad} + \theta \left[ \frac{h^*_{tad}}{\bar{R}_a} + 2(u - \bar{x}_{tad}) \right] \right) y^u_{tad}. \quad (29)$$

$Y_t$ is the feasible region obtained from transforming the feasible region $X_t$ using the equations

$\sum_{u=1}^{\lceil R_{ta} \rceil} y^u_{tad} = x_{tad}$ and the constraints $0 \leq y^u_{tad} \leq 1$, $u \in \{1, \ldots, \lceil R_{ta} \rceil\}$. If the feasible region $X_t$ for all $t \in T$ defines network flow constraints, we see that the formulation in (29) retains the network structure. In the presence of integer data this formulation yields integer solutions. The disadvantage of the network formulation in (29) is that we need to replace the single arc representing the pattern $(a, d)$ with multiple arcs, each of whose upper bound is one unit of flow, where the number of arcs associated with a particular pattern $(a, d)$ is given by $\lceil R_{ta} \rceil$.

A simpler version of our piecewise linear approximation is simply to use a linear approximation as shown below:

$$x_t(\theta) = \arg\min_{x_t \in X_t} \sum_{a \in A} \sum_{d \in D_a} \left( c_{tad} + \theta \frac{h^*_{tad}}{\bar{R}_a} \right) x_{tad}. \quad (30)$$

In the next section we extend the algorithm developed in this section to the reusable resource case.

4 Extension to Reusable Resources

When time periods are relatively short, the decisions to act on resources in one time period impact the resources available in a later time period. In this case, the time periods are coupled. A natural algorithmic strategy is to use approximate dynamic programming methods. Decisions made in time period $t$ can capture the impact on time period $t+1$ by using an approximate value function $\bar{V}_{t+1}(R_{t+1})$, where $R_{t+1} = B_t x_t$, as presented in equation (6). When we allocate resources, our decisions $(x_{tad})_{a \in A, d \in D_a}$ must be chosen subject to the resource constraint:

$$R^n_{ta} = \sum_{d \in D_a} x^n_{tad}, \quad a \in A, \; t \in T.$$

In the case of reusable resources, the resource vector $R_t = (R_{ta})_{a \in A}$, for $t \geq 1$, depends on decisions made in earlier time periods. We let $V_t(R_t)$ be the function that describes (at least

approximately) the optimal value of having $R_t$ resources at the beginning of time period $t$ for the remainder of the horizon. An outline of the basic algorithm is given in figure 1. We use $U^V$ to denote an updating function that updates the value function approximations for the resource state $R^n_t = (R^n_{ta})_{a \in A}$, $t \in T$, at every iteration $n \geq 1$. Examples of such approximations for resource allocation problems can be found in Powell et al. (2002), Godfrey & Powell (2001) and Godfrey & Powell (2002). A general treatment of approximate dynamic programming methods can be found in Bertsekas & Tsitsiklis (1996) and Si et al. (2004). In practice, these methods do not produce optimal solutions for most problems, and as a result we lose our ability to prove overall convergence of the algorithm. However, we can show that our pattern matching algorithm improves our ability to match an exogenously specified pattern. In addition, we can show experimentally that we can improve overall solution quality when the exogenous pattern is based on solving a static model to optimality.

Step 0 Initialization: Set the iteration counter $n = 1$. Choose an approximation $\bar{V}^0_t(\cdot)$ for $V_t(\cdot)$, $t \in T$.
Step 1 Forward pass:
  Step 1.0 Initialize forward pass: Initialize $R^1_1$. Set $t = 1$.
  Step 1.1 Solve subproblem: For time period $t$ solve equation (6) to get the solution vector $x^n_t$.
  Step 1.2 Apply the system dynamics to update the resource attributes after transformation.
  Step 1.3 Advance time: $t = t + 1$. If $t \leq T$ go to Step 1.1.
Step 2 Value function update: Set $\bar{V}^n_t(\cdot) = U^V(\bar{V}^{n-1}_t(\cdot), R^n_t)$, $t \in T$.
Step 3 Advance iteration counter: Stop if the convergence criterion is satisfied. If not, set $n = n + 1$ and go to Step 1.

Figure 1: Value iteration methodology for dynamic resource allocation problems with reusable resources.

In this section we show how we can apply the optimization model introduced in subsection 3.3 to the iterative setting represented by figure 1. We define the normalized model flows as

shown below:

$$\rho^n_{ad}(x) = \frac{\sum_{t \in \mathcal{T}} x^n_{tad}}{\sum_{t \in \mathcal{T}} R^n_{ta}}, \quad a \in \mathcal{A},\ d \in \mathcal{D}_a.$$

As the model decision variables change with each iteration, it is necessary to define the pattern metric given by equation (16) at every iteration as a function of the normalized model flows obtained from the previous iteration. We denote the pattern metric at the beginning of iteration $n$ by $H^{n-1} = H(\rho^{n-1}(x), \rho^s, R^{n-1})$, where $H$ is given by the expression in equation (16). Note that we use the additional argument $R^{n-1}$ in denoting the pattern metric $H^{n-1}$ to take into account the fact that when we have reusable resources the number of resources with attribute $a$ varies across iterations. We let

$$R^n_a = \sum_{t \in \mathcal{T}} R^n_{ta}, \quad a \in \mathcal{A}.$$

We assume we have the initialized values $\rho^0$ and $R^0$. We denote the gradient of $H^{n-1}$ with respect to the normalized decision variable $\rho_{ad}$, evaluated at $\rho^{n-1}_{ad}$, as

$$h^n_{ad} = \frac{\partial H^{n-1}}{\partial \rho_{ad}}\bigg|_{\rho_{ad} = \rho^{n-1}_{ad}}, \quad a \in \mathcal{A},\ d \in \mathcal{D}_a,$$

from which we obtain

$$h^n_{ad} = 2 R^{n-1}_a \left( \rho^{n-1}_{ad} - \rho^s_{ad} \right), \quad a \in \mathcal{A},\ d \in \mathcal{D}_a.$$

The Gauss-Seidel variant of the gradient of the pattern metric, denoted by $\bar h^n_{tad}$, is given by

$$\bar h^n_{tad} = 2 R^{n-1}_a \left( \rho^{n,t}_{ad} - \rho^s_{ad} \right), \quad a \in \mathcal{A},\ d \in \mathcal{D}_a,\ t \in \{1, \ldots, T\}, \qquad (31)$$

where, as before, we define

$$\rho^{n,t}_{ad} = \sum_{t'=1}^{t-1} \frac{x^n_{t'ad}}{R^{n-1}_a} + \sum_{t'=t}^{T} \frac{x^{n-1}_{t'ad}}{R^{n-1}_a}, \quad a \in \mathcal{A},\ d \in \mathcal{D}_a,\ t \in \{1, \ldots, T\}.$$
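The Gauss-Seidel gradient of equation (31) mixes flows already decided in the current iteration (periods before $t$) with flows carried over from the previous iteration (periods $t$ onward). A minimal sketch, with illustrative dictionaries keyed by $(t, a, d)$:

```python
# Sketch of the Gauss-Seidel gradient of equation (31): flows decided so far in
# iteration n (periods 1..t-1) are combined with flows from iteration n-1
# (periods t..T). All dictionary names and the toy data are illustrative.

def gauss_seidel_rho(x_new, x_old, R_a, a, d, t, T):
    """rho^{n,t}_{ad}: partially updated normalized flow at time t."""
    done = sum(x_new.get((tp, a, d), 0.0) for tp in range(1, t))
    todo = sum(x_old.get((tp, a, d), 0.0) for tp in range(t, T + 1))
    return (done + todo) / R_a

def gauss_seidel_gradient(x_new, x_old, R_a, rho_s, a, d, t, T):
    """h^n_{tad} = 2 R_a (rho^{n,t}_{ad} - rho^s_{ad})."""
    rho = gauss_seidel_rho(x_new, x_old, R_a, a, d, t, T)
    return 2.0 * R_a * (rho - rho_s)

# Toy instance: T = 2; period 1 has been re-solved at iteration n, period 2
# still carries the iteration n-1 flow.
x_old = {(1, "a", "d"): 1.0, (2, "a", "d"): 1.0}
x_new = {(1, "a", "d"): 2.0}
h = gauss_seidel_gradient(x_new, x_old, R_a=4.0, rho_s=0.5, a="a", d="d", t=2, T=2)
```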

Note that we index the Gauss-Seidel gradient of the pattern metric by $n$ since it reflects the use of decision variables from prior time periods obtained at iteration $n$. Within the approximate dynamic programming technique proposed to solve this problem, we adopt a linear value function approximation

$$\bar V_t(R_t) = \sum_{a \in \mathcal{A}} \bar v^n_{ta} R_{ta},$$

where $\bar v^n_{ta}$ is an approximation of the marginal value of resources of type $a$ at time $t$. Let the attribute transition function be defined using

$a^M(a, d)$ = the attribute of a resource produced by acting on a resource with attribute $a$ using decision $d$.

The slope $\bar v^n_{t+1, a^M(a,d)}$ represents the future (marginal) value at time $t+1$ of a decision $d$ acting at time $t$ on resource attribute vector $a$. If we use a linear value function approximation, then our subproblem at time $t$ becomes

$$\min_{y^n_t \in \mathcal{Y}^n_t} \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}_a} \sum_{u=1}^{R^n_{ta}} \left( c_{tad} + \bar v^{n-1}_{t+1, a^M(a,d)} + \theta \left[ \frac{\bar h^n_{tad}}{R^{n-1}_a} + 2\left(u - x^{n-1}_{tad}\right) \right] \right) y^{u,n}_{tad}. \qquad (32)$$

Here $\mathcal{Y}^n_t$ denotes the feasible region at iteration $n$ and time $t$, and we use the notation $y^{u,n}_{tad}$ to indicate the iteration-specific dependence of the flow decomposition variables.

We obtain a myopic policy by simply setting $\bar v^{n-1}_{t+1,a} = 0$, giving us

$$\min_{y^n_t \in \mathcal{Y}^n_t} \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}_a} \sum_{u=1}^{R^n_{ta}} \left( c_{tad} + \theta \left[ \frac{\bar h^n_{tad}}{R^{n-1}_a} + 2\left(u - x^{n-1}_{tad}\right) \right] \right) y^{u,n}_{tad}. \qquad (33)$$

We can obtain an estimate of $\bar v^n_{ta}$ by letting $\hat v^n_{ta}$ be the dual variable of the supply constraint (equation (2)) for resource attribute $a$ in the subproblem solved at time $t$ at iteration $n$. Since these duals fluctuate randomly (even for deterministic problems), we update our estimates $\bar v^n_{ta}$ using

$$\bar v^n_{ta} = (1 - \alpha_n)\, \bar v^{n-1}_{ta} + \alpha_n\, \hat v^n_{ta}, \quad a \in \mathcal{A},\ t \in \mathcal{T},$$
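The smoothed update of the value slopes is a standard exponential smoothing step applied attribute by attribute. A minimal sketch, with illustrative dictionaries keyed by $(t, a)$:

```python
# Sketch of the value-slope update: the duals of the supply constraints
# fluctuate between iterations, so they are smoothed with alpha_n in (0, 1).
# v_bar / v_hat map (t, a) -> marginal value; names are illustrative.

def smooth_values(v_bar, v_hat, alpha):
    """v^n = (1 - alpha) * v^{n-1} + alpha * vhat^n, attribute by attribute."""
    return {key: (1.0 - alpha) * v_bar[key] + alpha * v_hat[key] for key in v_bar}

v_bar = {(1, "a1"): 10.0}   # smoothed estimate from iteration n-1
v_hat = {(1, "a1"): 20.0}   # dual from the subproblem at iteration n
v_new = smooth_values(v_bar, v_hat, alpha=0.2)
```

With a declining $\alpha_n$, early duals move the estimate quickly while later duals are averaged in more conservatively.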

where $\alpha_n \in (0, 1)$ is a smoothing factor.

5 A Resource Allocation Problem

There are two applications of our pattern matching logic that we would like to test. First, we wish to demonstrate the degree to which our algorithm can improve our ability to match exogenously specified patterns; this ability improves user acceptance of these complex models. Second, we wish to measure the value of using the optimal solution of a static model to guide the approximate solution of a dynamic model in the more difficult context of reusable resources. To demonstrate the usefulness of the approach, we use as our test setting a problem known as the military airlift problem, which requires managing over time different types of cargo aircraft to move a set of loads ("requirements") within a network of airbases. Cargo aircraft can be moved loaded or repositioned empty. The problem was chosen in part because, while it exhibits the difficult time-staged nature of all of our problems, it is still small enough that we can solve the dynamic version of the model using a commercial solver. This allows us to evaluate all of our solutions relative to the optimal solution. We first present the multicommodity flow problem in section 5.1. In section 5.2 we detail the static model that we solve to generate the static flow patterns. Section 5.3 then presents the dynamic model, where we show how to formulate the decision to hold a resource for the next time period, a decision that is absent when solving the static model. The results from the actual experiments are reported in section 6.

5.1 The Multicommodity Flow Problem

Our experimental design is centered around a dynamic, multicommodity flow problem where resources are assigned to tasks that are moved from one location to another. On completion of these tasks, the resources are allowed to cover other tasks starting from that location or to move empty to a different location to cover other tasks.
Typically tasks have a time window during which they are available for assignment. There is a reward for covering a

Table 1: Compatibility matrix

task based on the type of resource assigned to it. In addition there is a cost of moving empty between two locations. The data for our experiment is motivated by the military airlift problem, where a fleet of cargo aircraft is used to move loads of freight over time. We consider five types of aircraft and five types of tasks. We conducted experiments with five sets of data. Each dataset is characterized by a label L-A(#)-T(#)-TP, where L denotes the number of locations, A the number of aircraft, T the number of tasks and TP the number of time periods. For the same number of aircraft we have different data sets characterizing the attributes of the aircraft, and this difference is indicated as a counter A(#) for aircraft (we use T(#) for tasks). For example, 20-200(1)-2000(1)-30 indicates an experiment that has 20 aircraft locations, 200 aircraft characterized by dataset 1, 2000 tasks characterized by dataset 1 and 30 time periods. Each task is characterized by an origin, a destination and a type. Covering a task generates a negative cost (a reward) that is a function of the type of the task and the type of resource assigned to the task. Each task is associated with a value specified in dollars, and the reward for covering the task with a resource is based on a compatibility matrix of dimensionality 5 × 5 that indicates the fraction of the reward received when covering a particular task type with a specific resource type. The compatibility matrix for our experiment is shown in table 1. There is an empty cost in dollars per mile associated with moving empty from one location to another; the empty cost is the same for all resource types. The data set is generated so that the number of demands going out of a location is negatively correlated with the number of demands going into that location in a given time period. This results in more empty repositioning moves and more temporal flow imbalances.

The resource attribute vector $a$ is given by

$$a = \{\text{location}, \text{aircraft type}\}. \qquad (34)$$

We denote the set of locations in the network by $\mathcal{J}$, and let $a_{location}$ be the location attribute of the resource attribute vector $a$. For any location $j \in \mathcal{J}$ we define $\mathcal{L}(j)$ as the set of tasks whose origin is $j$. A task expires from the system if it has not been assigned at the time it is available for assignment; that is, we do not assume time windows on tasks in the dynamic model. There is no reward generated for expired tasks. The decision set for a resource with attribute $a$ at time $t$ is given by

$$\mathcal{D}_a = \{\text{move assigned with } l \in \mathcal{L}(a_{location}),\ \text{move empty to } j \in \mathcal{J}\}, \quad a \in \mathcal{A}.$$

We let $\mathcal{D}^e_a$ be the set of decisions to move a vehicle empty.

5.2 The Static Model

We solve the static model characterizing the resource allocation problem presented in the above subsection as a flow-balancing network model. To denote the transformation of resources in the static model we define the following indicator variable:

$$\delta_{a'}(a, d) = \begin{cases} 1 & \text{if decision } d \in \mathcal{D}_a \text{ transforms the resource with attribute vector } a \in \mathcal{A} \text{ into the state } a' \in \mathcal{A}, \\ 0 & \text{otherwise.} \end{cases}$$

The static flow-balancing network model is given by

$$x^s = \arg\min_{x \in \mathcal{X}} \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}_a} c_{ad}\, x_{ad}, \qquad (35)$$

where we let $c_{ad}$ be the unit cost of transforming a resource with attribute vector $a \in \mathcal{A}$

using a decision $d \in \mathcal{D}_a$. The feasible region $\mathcal{X}$ is defined by the constraints

$$\sum_{d \in \mathcal{D}_a} x_{ad} - \sum_{a' \in \mathcal{A}} \sum_{d' \in \mathcal{D}_{a'}} \delta_a(a', d')\, x_{a'd'} = 0, \quad a \in \mathcal{A},$$
$$x_{ad} \ge 0, \quad a \in \mathcal{A},\ d \in \mathcal{D}_a.$$

The cost vector $c$ consists of negative values (rewards) for covering a task and positive values (costs) for moving empty between locations. We represent the normalized optimal flows of empties from the static model as static flow patterns in the time-staged resource allocation model. The normalized static flow patterns $\rho^s_{ad}$ are derived from the flow of empties using

$$\rho^s_{ad} = \frac{x^s_{ad}}{\sum_{d' \in \mathcal{D}^e_a} x^s_{ad'}}, \quad a \in \mathcal{A},\ d \in \mathcal{D}^e_a.$$

The static model is able to globally balance flows over the entire network. As such, it can capture network-level patterns that may be missed by approximate models that step forward through time. The experimental challenge is to measure the size of this benefit.

5.3 The Dynamic Model

The objective function for the time-staged model is given by

$$\max_{x \in \mathcal{X}} \sum_{t \in \mathcal{T}} \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}_a} c_{tad}\, x_{tad}.$$

In our dynamic model, the cost vector has to reflect the timing of activities; thus, a load that is moved late is assessed a service penalty. A problem we face in using flows from a static model to guide a dynamic model is that the static model does not provide any guidance on how much flow should be held at a location (the hold option) in a given time period. Let $\beta^n_a \in [0, 1]$ be an estimate of the fraction of hold flows for resources with attribute vector $a$ at iteration $n$. We use the total number of empties from the static model to derive the scaling factor $\beta^n_a$:

$$\beta^n_a = \max\left( 0,\ 1 - \frac{\sum_{d \in \mathcal{D}^e_a} x^s_{ad}}{\sum_{d \in \mathcal{D}^e_a} \sum_{t \in \mathcal{T}} x^{n-1}_{tad}} \right),$$
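Deriving the normalized static flow patterns from the static solution is a per-attribute normalization of the empty flows. A minimal sketch, with an illustrative dictionary mapping (origin attribute, destination) to optimal empty flow:

```python
# Sketch of deriving the normalized static flow patterns rho^s_{ad} from the
# empty flows of the static solution. x_static maps (a, d) -> optimal empty
# flow; the attribute labels below are illustrative.

def static_patterns(x_static):
    """Normalize empty flows per attribute a so each row sums to one."""
    totals = {}
    for (a, d), flow in x_static.items():
        totals[a] = totals.get(a, 0.0) + flow
    return {(a, d): flow / totals[a] for (a, d), flow in x_static.items()}

x_static = {("FL", "VA"): 30.0, ("FL", "MS"): 10.0, ("SC", "VA"): 5.0}
rho_s = static_patterns(x_static)  # FL->VA: 0.75, FL->MS: 0.25, SC->VA: 1.0
```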

where $\sum_{d \in \mathcal{D}^e_a} \sum_{t \in \mathcal{T}} x^{n-1}_{tad}$ is the total number of resources with attribute vector $a$ in the model, covering the flow of empties and the hold decisions from the previous iteration. Instead of using an iteration-independent $\rho^s_{ad}$ to represent static flow patterns at every iteration, we use

$$\rho^{s,n}_{ad} = \begin{cases} \beta^n_a & \text{for the hold decision,} \\ (1 - \beta^n_a)\, \rho^s_{ad} & \text{for moves to another location.} \end{cases}$$

Thus, we are employing a parameter that specifies the fraction of vehicles held at a location, and then factoring down the movements to other locations so that the pattern still sums to one. The new vector of probabilities $\{\rho^{s,n}_{ad}\}_{d \in \mathcal{D}^e_a}$ satisfies the following condition at every iteration $n$:

$$\sum_{d \in \mathcal{D}^e_a} \rho^{s,n}_{ad} = 1, \quad a \in \mathcal{A}.$$

The new pattern metric at the end of every iteration $n$ is given by

$$H^n(\rho^n(x), \rho^{s,n}, R^n) = \sum_{a \in \mathcal{A}} \left( \sum_{t=1}^{T} R^n_{ta} \right) \sum_{d \in \mathcal{D}^e_a} \left( \rho^n_{ad} - \rho^{s,n}_{ad} \right)^2, \qquad (36)$$

where we use the compact notation $\rho^{s,n} = \{\rho^{s,n}_{ad}\}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$. A summary of the algorithm we use to incorporate the pattern logic in a dynamic model is given in figure 2.

6 Experimental Results

We have three questions we wish to answer experimentally: 1) How quickly does the algorithm converge? 2) How well does the algorithm match exogenous patterns for problems with reusable resources? 3) If the exogenous pattern is the optimal solution to a static problem, how much does this improve the solution when we are using an approximate algorithm (for problems with reusable resources)?
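The hold rescaling and the pattern metric of equation (36) can be sketched together as follows; the `HOLD` label, the flow dictionaries, and the toy numbers are illustrative stand-ins:

```python
# Sketch combining the hold-rescaled pattern rho^{s,n} and the pattern metric
# of equation (36). beta_a, the dictionaries, and HOLD are illustrative.

HOLD = "hold"

def scaled_pattern(rho_s_row, beta_a):
    """rho^{s,n}_{ad}: hold gets beta_a, moves are scaled by (1 - beta_a)."""
    row = {d: (1.0 - beta_a) * p for d, p in rho_s_row.items()}
    row[HOLD] = beta_a
    return row

def pattern_metric(rho_n, rho_sn, R_a):
    """H^n = sum_a R_a * sum_d (rho^n_{ad} - rho^{s,n}_{ad})^2."""
    return sum(
        R_a[a] * sum((rho_n[a][d] - rho_sn[a][d]) ** 2 for d in rho_n[a])
        for a in rho_n
    )

rho_sn = {"a1": scaled_pattern({"VA": 1.0}, beta_a=0.2)}  # {'VA': 0.8, HOLD: 0.2}
rho_n = {"a1": {"VA": 0.5, HOLD: 0.5}}
H = pattern_metric(rho_n, rho_sn, {"a1": 10.0})
```

The rescaled row always sums to one, so the metric compares two probability vectors weighted by the number of resources with that attribute.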

Step 0 Initialization: Set iteration counter $n$ to 1. Initialize the following for $n = 1$:
    $\bar h^0_{tad} = 0$, $t \in \mathcal{T}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$;
    $R^0_{ta} = 0$, $t \in \mathcal{T}$, $a \in \mathcal{A}$;
    $\rho^{s,0}_{ad} = \rho^s_{ad}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$.
Step 1 Set time $t = 1$:
    Step 1.0 If $n > 1$: Derive the network arc cost using the Gauss-Seidel gradient of the pattern metric as in equation (31) and apply smoothing to this cost.
    Step 1.1 Solve the time-staged model with the linear value function approximations indicated in (32), or the myopic policy indicated in (33), for stage $t$.
    Step 1.2 Increment $t = t + 1$: If $t \le T$ go to Step 1.0, else go to Step 2.
Step 2:
    Step 2.0 Calculate aggregate decision variables: $x^n_{ad} = \sum_{t \in \mathcal{T}} x^n_{tad}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$.
    Step 2.1 Derive $\rho^n_{ad} = x^n_{ad} / \sum_{d' \in \mathcal{D}^e_a} x^n_{ad'}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$. If $\sum_{d' \in \mathcal{D}^e_a} x^n_{ad'} = 0$, set $\rho^n_{ad} = 1$ for the decision $d$ with $\delta_a(a, d) = 1$, and $\rho^n_{ad} = 0$ otherwise.
    Step 2.2 Scaling: Derive $\rho^{s,n}_{ad}$, $a \in \mathcal{A}$, $d \in \mathcal{D}^e_a$, to reflect hold decisions.
    Step 2.3 Derive the pattern metric $H^n$ using (36).
    Step 2.4 Advance iteration counter if convergence is not satisfied: Set $n = n + 1$ and go to Step 1.

Figure 2: Piecewise linear version of the algorithm for incorporating static flow patterns in a time-staged resource allocation model.

Flow patterns from the static model

Origin, Aircraft-Type (a)   Destination (d)   Proportion ($\rho^s_{ad}$)   Total Flow ($x^s_{ad}$)
FL-34, A                    VA                —                            —
FL-34, A                    MS                —                            —
SC-29, A                    VA                —                            —
CA-95, B                    MO                —                            —
CA-95, B                    OR                —                            —
IA-51, D                    AK                —                            —
UT-84, D                    NM                —                            —
UT-84, D                    MO                —                            —

Table 2: Percent of flow moving empty from origin to destination by aircraft type, as produced by the static model. These are the patterns used to guide the dynamic model.

We address these questions using the problem described in section 5. A sample file of patterns representing the flow of empties between locations, obtained from solving the static model for the military airlift problem, is highlighted in table 2. In our experiments we are able to solve the dynamic resource allocation model exactly to get the optimal solution. Based on experimentation, we found that a scaling factor $\theta = 1000$ is appropriate when incorporating patterns with a linear value function, and a scaling factor $\theta = $ is appropriate when incorporating patterns while using a myopic policy, which performs very poorly for this problem class. In our experiments we use $\alpha_n = \frac{2}{10+n}$ as the smoothing factor to update the linear value function approximations. The smoothing factor that we apply to the Gauss-Seidel gradient of the pattern metric is $\frac{20}{40+n}$. We initialize all the smoothed gradients and costs for $n = 0$ to 0.

6.1 Rate of convergence

We have proven that our algorithm monotonically reduces the pattern metric, even for the case of reusable resources where we are unable to prove global convergence (since we are using an approximate algorithm to step through time). Unresolved, however, is the rate of convergence. In the introduction, we described a number of projects where we are using this methodology. We have consistently found that the Gauss-Seidel strategy produces very fast convergence. Figure 3 shows how well we match a historical pattern (normalized to 100) after each
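Assuming the two declining stepsize schedules quoted above, $\alpha_n = 2/(10+n)$ for the value-function slopes and $20/(40+n)$ for the smoothed pattern-metric gradient, a minimal sketch makes their behavior easy to inspect:

```python
# Sketch of the two declining stepsize schedules assumed from the text:
# alpha_n = 2/(10+n) for smoothing the value-function slopes, and 20/(40+n)
# for smoothing the Gauss-Seidel gradient; both stay in (0, 1) for n >= 1.

def value_stepsize(n):
    """Stepsize used to update the linear value function approximations."""
    return 2.0 / (10.0 + n)

def gradient_stepsize(n):
    """Stepsize used to smooth the Gauss-Seidel pattern-metric gradient."""
    return 20.0 / (40.0 + n)

# Both schedules decline toward zero, so early iterations adapt quickly while
# later iterations average out fluctuations.
early, late = value_stepsize(1), value_stepsize(100)
```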

Figure 3: Rate of convergence of the pattern metric (normalized metric versus iterations).

iteration of the algorithm. The model was judged to be acceptable (by a knowledgeable user) if the performance was within the bounds shown in the figure (approximately two percent above and below the target). We found that the Gauss-Seidel algorithm converged to this target within three to four iterations. We have used this algorithm in a number of projects, and this performance is typical. The fast performance is due to the ability of the algorithm to adjust, after each time step, whether it should do more or less of an activity in order to match a target statistic, based on how well we are tracking the goal over the last $T$ time periods (which may include time periods from a previous iteration). If we are using an approximate dynamic programming algorithm, we have to simulate the problem iteratively, and the pattern logic adds only a nominal computational burden. If we were to use a simple myopic policy, which normally would require stepping through the data once, this logic now requires that we repeat the simulation three or four times.

6.2 Matching patterns and improving solution quality

We now report on experiments where we measure both how well the procedure matches exogenous patterns, and the degree to which patterns derived from solving a static model to optimality improve the quality of heuristics used to solve the dynamic problem.

Linear Pattern Logic and a Myopic Policy

Data          % optimality   % optimality   % improvement      % improvement
              with θ = 0     with θ =       in obj. function   in pattern metric
(1)-2000(1)   —              —              —                  —
(1)-4000(1)   —              —              —                  —
(1)-6000(1)   —              —              —                  —
(2)-4000(2)   —              —              —                  —
(3)-4000(3)   —              —              —                  —

Table 3: Effect of patterns when using a myopic policy (value functions are zero) and a linear pattern metric.

Piecewise Linear Pattern Logic using a Myopic Policy

Data          % optimality   % optimality   % improvement      % improvement
              with θ = 0     with θ =       in obj. function   in pattern metric
(1)-2000(1)   —              —              —                  —
(1)-4000(1)   —              —              —                  —
(1)-6000(1)   —              —              —                  —
(2)-4000(2)   —              —              —                  —
(3)-4000(3)   —              —              —                  —

Table 4: Effect of patterns when using a myopic policy and a piecewise-linear pattern metric.

Tables 3 and 4 summarize our experimental results when implementing our algorithm using a myopic policy. We see that there is a significant improvement in the percentage of optimality obtained by incorporating patterns using either the linear (equation (30)) or piecewise linear (equation (29)) versions of our algorithm. In most cases we achieve around 70 percent of the optimal solution, an improvement of around 40 percent. While this is far below optimal, we must point out that the myopic policy is especially poor: it does not allow us to move equipment empty to a different location to cover demands that might arise in the future, resulting in excess inventories of equipment at some locations that become unproductive. We also see that both the linear and piecewise linear versions of our methodology produce a significant reduction in the pattern metric, showing that we are doing a much better job of matching the pattern. In tables 5 and 6 we report our results for incorporating patterns when we use linear value function approximations to convey information among subproblems. We see that even without incorporating patterns, the use of linear value function approximations allows us to achieve more than 90 percent of the optimal solution. Despite this, both linear and


More information

An Approximate Dynamic Programming Algorithm for the Allocation of High-Voltage Transformer Spares in the Electric Grid

An Approximate Dynamic Programming Algorithm for the Allocation of High-Voltage Transformer Spares in the Electric Grid An Approximate Dynamic Programming Algorithm for the Allocation of High-Voltage Transformer Spares in the Electric Grid Johannes Enders Department of Operations Research and Financial Engineering Princeton

More information

3E4: Modelling Choice. Introduction to nonlinear programming. Announcements

3E4: Modelling Choice. Introduction to nonlinear programming. Announcements 3E4: Modelling Choice Lecture 7 Introduction to nonlinear programming 1 Announcements Solutions to Lecture 4-6 Homework will be available from http://www.eng.cam.ac.uk/~dr241/3e4 Looking ahead to Lecture

More information

A Gentle Introduction to Reinforcement Learning

A Gentle Introduction to Reinforcement Learning A Gentle Introduction to Reinforcement Learning Alexander Jung 2018 1 Introduction and Motivation Consider the cleaning robot Rumba which has to clean the office room B329. In order to keep things simple,

More information

Bayesian Active Learning With Basis Functions

Bayesian Active Learning With Basis Functions Bayesian Active Learning With Basis Functions Ilya O. Ryzhov Warren B. Powell Operations Research and Financial Engineering Princeton University Princeton, NJ 08544, USA IEEE ADPRL April 13, 2011 1 / 29

More information

The design of Demand-Adaptive public transportation Systems: Meta-Schedules

The design of Demand-Adaptive public transportation Systems: Meta-Schedules The design of Demand-Adaptive public transportation Systems: Meta-Schedules Gabriel Teodor Crainic Fausto Errico ESG, UQAM and CIRRELT, Montreal Federico Malucelli DEI, Politecnico di Milano Maddalena

More information

Algorithms for Nonsmooth Optimization

Algorithms for Nonsmooth Optimization Algorithms for Nonsmooth Optimization Frank E. Curtis, Lehigh University presented at Center for Optimization and Statistical Learning, Northwestern University 2 March 2018 Algorithms for Nonsmooth Optimization

More information

The network maintenance problem

The network maintenance problem 22nd International Congress on Modelling and Simulation, Hobart, Tasmania, Australia, 3 to 8 December 2017 mssanz.org.au/modsim2017 The network maintenance problem Parisa Charkhgard a, Thomas Kalinowski

More information

16.410/413 Principles of Autonomy and Decision Making

16.410/413 Principles of Autonomy and Decision Making 6.4/43 Principles of Autonomy and Decision Making Lecture 8: (Mixed-Integer) Linear Programming for Vehicle Routing and Motion Planning Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute

More information

Complexity Metrics. ICRAT Tutorial on Airborne self separation in air transportation Budapest, Hungary June 1, 2010.

Complexity Metrics. ICRAT Tutorial on Airborne self separation in air transportation Budapest, Hungary June 1, 2010. Complexity Metrics ICRAT Tutorial on Airborne self separation in air transportation Budapest, Hungary June 1, 2010 Outline Introduction and motivation The notion of air traffic complexity Relevant characteristics

More information

A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions

A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions Huseyin Topaloglu School of Operations Research and Information Engineering, Cornell

More information

Linear-Quadratic Optimal Control: Full-State Feedback

Linear-Quadratic Optimal Control: Full-State Feedback Chapter 4 Linear-Quadratic Optimal Control: Full-State Feedback 1 Linear quadratic optimization is a basic method for designing controllers for linear (and often nonlinear) dynamical systems and is actually

More information

What you should know about approximate dynamic programming

What you should know about approximate dynamic programming What you should know about approximate dynamic programming Warren B. Powell Department of Operations Research and Financial Engineering Princeton University, Princeton, NJ 08544 December 16, 2008 Abstract

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

An Optimization-Based Heuristic for the Split Delivery Vehicle Routing Problem

An Optimization-Based Heuristic for the Split Delivery Vehicle Routing Problem An Optimization-Based Heuristic for the Split Delivery Vehicle Routing Problem Claudia Archetti (1) Martin W.P. Savelsbergh (2) M. Grazia Speranza (1) (1) University of Brescia, Department of Quantitative

More information

Stochastic programs with binary distributions: Structural properties of scenario trees and algorithms

Stochastic programs with binary distributions: Structural properties of scenario trees and algorithms INSTITUTT FOR FORETAKSØKONOMI DEPARTMENT OF BUSINESS AND MANAGEMENT SCIENCE FOR 12 2017 ISSN: 1500-4066 October 2017 Discussion paper Stochastic programs with binary distributions: Structural properties

More information

Runtime Reduction Techniques for the Probabilistic Traveling Salesman Problem with Deadlines

Runtime Reduction Techniques for the Probabilistic Traveling Salesman Problem with Deadlines Runtime Reduction Techniques for the Probabilistic Traveling Salesman Problem with Deadlines Ann Melissa Campbell, Barrett W. Thomas Department of Management Sciences, University of Iowa 108 John Pappajohn

More information

Dynamic Programming Approximations for Stochastic, Time-Staged Integer Multicommodity Flow Problems

Dynamic Programming Approximations for Stochastic, Time-Staged Integer Multicommodity Flow Problems Dynamic Programming Approximations for Stochastic, Time-Staged Integer Multicommody Flow Problems Huseyin Topaloglu School of Operations Research and Industrial Engineering, Cornell Universy, Ithaca, NY

More information

CS 6901 (Applied Algorithms) Lecture 3

CS 6901 (Applied Algorithms) Lecture 3 CS 6901 (Applied Algorithms) Lecture 3 Antonina Kolokolova September 16, 2014 1 Representative problems: brief overview In this lecture we will look at several problems which, although look somewhat similar

More information

Real-time Systems: Scheduling Periodic Tasks

Real-time Systems: Scheduling Periodic Tasks Real-time Systems: Scheduling Periodic Tasks Advanced Operating Systems Lecture 15 This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of

More information

M08/5/MATSD/SP1/ENG/TZ1/XX/M+ MARKSCHEME. May 2008 MATHEMATICAL STUDIES. Standard Level. Paper pages

M08/5/MATSD/SP1/ENG/TZ1/XX/M+ MARKSCHEME. May 2008 MATHEMATICAL STUDIES. Standard Level. Paper pages M08/5/MATSD/SP1/ENG/TZ1/XX/M+ MARKSCHEME May 008 MATHEMATICAL STUDIES Standard Level Paper 1 0 pages M08/5/MATSD/SP1/ENG/TZ1/XX/M+ This markscheme is confidential and for the exclusive use of examiners

More information

Supplementary Technical Details and Results

Supplementary Technical Details and Results Supplementary Technical Details and Results April 6, 2016 1 Introduction This document provides additional details to augment the paper Efficient Calibration Techniques for Large-scale Traffic Simulators.

More information

On service level measures in stochastic inventory control

On service level measures in stochastic inventory control On service level measures in stochastic inventory control Dr. Roberto Rossi The University of Edinburgh Business School, The University of Edinburgh, UK roberto.rossi@ed.ac.uk Friday, June the 21th, 2013

More information

On the static assignment to parallel servers

On the static assignment to parallel servers On the static assignment to parallel servers Ger Koole Vrije Universiteit Faculty of Mathematics and Computer Science De Boelelaan 1081a, 1081 HV Amsterdam The Netherlands Email: koole@cs.vu.nl, Url: www.cs.vu.nl/

More information

Robust multi-sensor scheduling for multi-site surveillance

Robust multi-sensor scheduling for multi-site surveillance DOI 10.1007/s10878-009-9271-4 Robust multi-sensor scheduling for multi-site surveillance Nikita Boyko Timofey Turko Vladimir Boginski David E. Jeffcoat Stanislav Uryasev Grigoriy Zrazhevsky Panos M. Pardalos

More information

Approximate Dynamic Programming: Solving the curses of dimensionality

Approximate Dynamic Programming: Solving the curses of dimensionality Approximate Dynamic Programming: Solving the curses of dimensionality Informs Computing Society Tutorial October, 2008 Warren Powell CASTLE Laboratory Princeton University http://www.castlelab.princeton.edu

More information

DRAFT Formulation and Analysis of Linear Programs

DRAFT Formulation and Analysis of Linear Programs DRAFT Formulation and Analysis of Linear Programs Benjamin Van Roy and Kahn Mason c Benjamin Van Roy and Kahn Mason September 26, 2005 1 2 Contents 1 Introduction 7 1.1 Linear Algebra..........................

More information

15-850: Advanced Algorithms CMU, Fall 2018 HW #4 (out October 17, 2018) Due: October 28, 2018

15-850: Advanced Algorithms CMU, Fall 2018 HW #4 (out October 17, 2018) Due: October 28, 2018 15-850: Advanced Algorithms CMU, Fall 2018 HW #4 (out October 17, 2018) Due: October 28, 2018 Usual rules. :) Exercises 1. Lots of Flows. Suppose you wanted to find an approximate solution to the following

More information

Knapsack and Scheduling Problems. The Greedy Method

Knapsack and Scheduling Problems. The Greedy Method The Greedy Method: Knapsack and Scheduling Problems The Greedy Method 1 Outline and Reading Task Scheduling Fractional Knapsack Problem The Greedy Method 2 Elements of Greedy Strategy An greedy algorithm

More information

Lecture 1. Stochastic Optimization: Introduction. January 8, 2018

Lecture 1. Stochastic Optimization: Introduction. January 8, 2018 Lecture 1 Stochastic Optimization: Introduction January 8, 2018 Optimization Concerned with mininmization/maximization of mathematical functions Often subject to constraints Euler (1707-1783): Nothing

More information

Probabilistic Planning. George Konidaris

Probabilistic Planning. George Konidaris Probabilistic Planning George Konidaris gdk@cs.brown.edu Fall 2017 The Planning Problem Finding a sequence of actions to achieve some goal. Plans It s great when a plan just works but the world doesn t

More information

Surge Pricing and Labor Supply in the Ride- Sourcing Market

Surge Pricing and Labor Supply in the Ride- Sourcing Market Surge Pricing and Labor Supply in the Ride- Sourcing Market Yafeng Yin Professor Department of Civil and Environmental Engineering University of Michigan, Ann Arbor *Joint work with Liteng Zha (@Amazon)

More information

Chapter 4. Greedy Algorithms. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Chapter 4. Greedy Algorithms. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved. Chapter 4 Greedy Algorithms Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved. 1 4.1 Interval Scheduling Interval Scheduling Interval scheduling. Job j starts at s j and

More information

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty 2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of

More information

Complexity of Routing Problems with Release Dates and Deadlines

Complexity of Routing Problems with Release Dates and Deadlines Complexity of Routing Problems with Release Dates and Deadlines Alan Erera, Damian Reyes, and Martin Savelsbergh H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology

More information

Birgit Rudloff Operations Research and Financial Engineering, Princeton University

Birgit Rudloff Operations Research and Financial Engineering, Princeton University TIME CONSISTENT RISK AVERSE DYNAMIC DECISION MODELS: AN ECONOMIC INTERPRETATION Birgit Rudloff Operations Research and Financial Engineering, Princeton University brudloff@princeton.edu Alexandre Street

More information

Numerical Methods. V. Leclère May 15, x R n

Numerical Methods. V. Leclère May 15, x R n Numerical Methods V. Leclère May 15, 2018 1 Some optimization algorithms Consider the unconstrained optimization problem min f(x). (1) x R n A descent direction algorithm is an algorithm that construct

More information

Sparse Gaussian conditional random fields

Sparse Gaussian conditional random fields Sparse Gaussian conditional random fields Matt Wytock, J. ico Kolter School of Computer Science Carnegie Mellon University Pittsburgh, PA 53 {mwytock, zkolter}@cs.cmu.edu Abstract We propose sparse Gaussian

More information

Planning in Markov Decision Processes

Planning in Markov Decision Processes Carnegie Mellon School of Computer Science Deep Reinforcement Learning and Control Planning in Markov Decision Processes Lecture 3, CMU 10703 Katerina Fragkiadaki Markov Decision Process (MDP) A Markov

More information

Regularized optimization techniques for multistage stochastic programming

Regularized optimization techniques for multistage stochastic programming Regularized optimization techniques for multistage stochastic programming Felipe Beltrán 1, Welington de Oliveira 2, Guilherme Fredo 1, Erlon Finardi 1 1 UFSC/LabPlan Universidade Federal de Santa Catarina

More information

Microeconomic Algorithms for Flow Control in Virtual Circuit Networks (Subset in Infocom 1989)

Microeconomic Algorithms for Flow Control in Virtual Circuit Networks (Subset in Infocom 1989) Microeconomic Algorithms for Flow Control in Virtual Circuit Networks (Subset in Infocom 1989) September 13th, 1995 Donald Ferguson*,** Christos Nikolaou* Yechiam Yemini** *IBM T.J. Watson Research Center

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Payments System Design Using Reinforcement Learning: A Progress Report

Payments System Design Using Reinforcement Learning: A Progress Report Payments System Design Using Reinforcement Learning: A Progress Report A. Desai 1 H. Du 1 R. Garratt 2 F. Rivadeneyra 1 1 Bank of Canada 2 University of California Santa Barbara 16th Payment and Settlement

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Anticipatory Freight Selection in Intermodal Long-haul Round-trips

Anticipatory Freight Selection in Intermodal Long-haul Round-trips Anticipatory Freight Selection in Intermodal Long-haul Round-trips A.E. Pérez Rivera and M.R.K. Mes Department of Industrial Engineering and Business Information Systems, University of Twente, P.O. Box

More information

The Optimizing-Simulator: An Illustration using the Military Airlift Problem

The Optimizing-Simulator: An Illustration using the Military Airlift Problem The Optimizing-Simulator: An Illustration using the Military Airlift Problem Tongqiang Tony Wu Warren B. Powell Princeton University and Alan Whisman Air Mobility Command There have been two primary modeling

More information

Traffic Modelling for Moving-Block Train Control System

Traffic Modelling for Moving-Block Train Control System Commun. Theor. Phys. (Beijing, China) 47 (2007) pp. 601 606 c International Academic Publishers Vol. 47, No. 4, April 15, 2007 Traffic Modelling for Moving-Block Train Control System TANG Tao and LI Ke-Ping

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Dynamic Programming Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: So far we focussed on tree search-like solvers for decision problems. There is a second important

More information

M11/5/MATSD/SP2/ENG/TZ2/XX/M MARKSCHEME. May 2011 MATHEMATICAL STUDIES. Standard Level. Paper pages

M11/5/MATSD/SP2/ENG/TZ2/XX/M MARKSCHEME. May 2011 MATHEMATICAL STUDIES. Standard Level. Paper pages M11/5/MATSD/SP/ENG/TZ/XX/M MARKSCHEME May 011 MATHEMATICAL STUDIES Standard Level Paper 9 pages M11/5/MATSD/SP/ENG/TZ/XX/M This markscheme is confidential and for the exclusive use of examiners in this

More information

1 Introduction. 2 Successive Convexification Algorithm

1 Introduction. 2 Successive Convexification Algorithm 1 Introduction There has been growing interest in cooperative group robotics [], with potential applications in construction and assembly. Most of this research focuses on grounded or mobile manipulator

More information

6. DYNAMIC PROGRAMMING I

6. DYNAMIC PROGRAMMING I 6. DYNAMIC PROGRAMMING I weighted interval scheduling segmented least squares knapsack problem RNA secondary structure Lecture slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley Copyright 2013

More information

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE MULTIPLE CHOICE QUESTIONS DECISION SCIENCE 1. Decision Science approach is a. Multi-disciplinary b. Scientific c. Intuitive 2. For analyzing a problem, decision-makers should study a. Its qualitative aspects

More information

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong 2014 Workshop

More information

Chapter 3: Discrete Optimization Integer Programming

Chapter 3: Discrete Optimization Integer Programming Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Sito web: http://home.deib.polimi.it/amaldi/ott-13-14.shtml A.A. 2013-14 Edoardo

More information

1 Bewley Economies with Aggregate Uncertainty

1 Bewley Economies with Aggregate Uncertainty 1 Bewley Economies with Aggregate Uncertainty Sofarwehaveassumedawayaggregatefluctuations (i.e., business cycles) in our description of the incomplete-markets economies with uninsurable idiosyncratic risk

More information