Optimizing a Dynamic Order-Picking Process


Yossi Bukchin, Eugene Khmelnitsky, Pini Yakuel
Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv 69978, ISRAEL

Abstract

This research studies the problem of batching orders in a dynamic, finite-horizon environment to minimize order tardiness and overtime costs of the pickers. The problem introduces the following trade-off: at every period, the picker has to decide whether to go on a tour and pick the accumulated orders, or to wait for more orders to arrive. By waiting, the picker risks higher tardiness of existing orders in exchange for lower tardiness of future orders. We use a Markov Decision Process (MDP) based approach to set an optimal decision-making policy. In order to evaluate the potential improvement of the proposed approach in practice, we compare the optimal policy with two naïve heuristics: (1) go on tour immediately after an order arrives, and (2) wait as long as the current orders can be picked and supplied on time. The optimal policy shows a considerable improvement over the naïve heuristics, in the range of 7%-99%, where the specific values depend on the picking process parameters. We have found that one measure, the slack percentage of the picking process, associated with the difference between the promised lead time and the single-item picking time, predicts quite accurately the cost reduction generated by the optimal policy. The structure and the properties of the optimal solutions have led to the construction of a more comprehensive heuristic method. Numerical results show that the proposed heuristic, MDP-H, outperforms the naïve heuristics in all experiments. As compared to the optimal solution, MDP-H provides close to optimal results for a slack of up to 40%.

1. Introduction

Order-picking is the process of retrieving items from stocking locations in a warehouse to satisfy given demands.
This process may involve as much as 60% of all labor activities in a warehouse and may account for as much as 65% of all operating expenses (Gademann and Van de Velde, 2005).

The performance of an order picking system is typically determined by seven factors: batching, picking sequence, storage policy, zoning, layout design, picking equipment and design of picking information. Some research has been concerned mainly with studying the joint effect of several factors on the performance of order picking systems. Petersen and Aase (2004) evaluated a number of picking, routing and storing methods in order to determine which combination of these factors is best in terms of picking time. Each combination was compared to a basic scenario, in which orders are picked separately, items are stored randomly and the traversal strategy is used for routing. They concluded that batching of orders leads to the largest improvement, especially when small-sized orders are frequent. Moreover, an improved storage policy (one which is not random, for example, class based) also achieves significant improvement, and with less sensitivity to order size. The best combination reduced the picking time by almost 30%. Other papers address each order picking performance factor separately. In that context, batching related studies are very common. Generally, the order batching problem is the problem of simultaneously assigning orders to batches and determining a picking tour for every batch so as to optimize an objective function. The main driver for batching is to reduce the average picking travel distance and thereby increase the throughput and improve due date performance. Gademann and Van de Velde (2005) addressed the problem of batching orders to minimize total travel time in a parallel aisle warehouse. This problem is also referred to as proximity batching, since the obvious motivation is to batch orders that are stored in nearby locations. They proved that the problem is NP-hard in the strong sense, but can be solved in polynomial time when the batch size is no greater than two orders.
In the past, many heuristics have been presented in the literature for proximity batching. Most of these heuristics first select a seed order for a batch and subsequently expand the batch with orders that have "proximity" to the seed order, as long as the picking cart capacity is not exceeded. The distinctive factor is the measure of the proximity of orders. Armstrong et al. (1979) considered proximity batching with fixed batch sizes and presented an integer programming model. Gibson and Sharp (1992) considered order batching in an order picking operation of storage and retrieval (S/R) machines. Elsayed and Lee (1996) investigated automated storage/retrieval (AS/R) systems where a due date is specified for each retrieval order. They considered the inclusion of both order retrieval and storage in the same tour when possible. Their main results include a set of rules for sequencing and batching orders to tours such that the total tardiness of retrievals per group of orders is minimized.

The routing strategies of pickers in the warehouse were investigated by Hall (1993). Three strategies for routing manual pickers are compared: (1) traversal, (2) midpoint, and (3) largest gap. The comparison was made by estimating the expected route length of each strategy. The results include a few rules of thumb which assist in choosing one strategy over another. For example, the third strategy is best when the average number of picks per aisle is relatively small. Another study was conducted by Roodbergen and de Koster (2001), who considered a parallel aisle warehouse where order pickers can change aisles at the ends of every aisle and also at a cross aisle halfway along the aisles. They concluded that in many cases the average order picking time can be decreased significantly by adding a middle aisle to the layout. In zoning, the warehouse is divided into zones, so that each order is divided into sub-orders which are allocated to the different zones. Every sub-order is picked in the respective zone, and the entire order is rejoined in the packing area. Jane and Laih (2005) studied a synchronized zone order picking system. In such a system, the pickers of all zones work on the same order simultaneously. In order to prevent balance loss, the authors suggest storing items which are likely to be a part of the same order in different zones. Next, they developed a natural cluster model for item assignment in the warehouse. In one case study, the proposed item clustering approach improved the system's efficiency by 29% and the order picking time by 8%. Jane (2000) developed a heuristic algorithm for a progressive zone picking system. Unlike synchronized zoning, under progressive zoning each order is processed by one zone picker at a time. The research objective was to balance workloads among all pickers, so that each one has almost the same load, and to adjust the zone size for order volume fluctuations.
The proposed method was illustrated and verified to achieve the objective through empirical data and simulation experiments. As described above, most of the related literature deals with the static problem of picking a fixed number of orders in the most efficient way while finding the best picking sequence or picking strategy (batching or zoning). However, in many warehouses and distribution centers (DC), the picking activity is executed under uncertainty, since the inter-arrival time of customer orders is stochastic by nature. Both a DC satisfying customer orders made via the Internet and an automotive warehouse providing spare parts for auto-shops are examples of such an environment. In this research, we address the problem of batching orders in a dynamic, finite-horizon environment to minimize order tardiness and overtime costs of the pickers. This problem is solved to optimality using a Markov decision process based approach. The performance of the optimal procedure was compared with two naïve heuristics and found to be significantly superior. The structure and the properties of the obtained solutions lead to the construction of an efficient heuristic, called MDP-H. The comparison between the proposed heuristic and the optimal one shows that MDP-H provides close to optimal solutions (up to 0.62%) for a slack of up to 40%. In all experiments, MDP-H provides better solutions than the two naïve heuristics. Although this paper mainly refers to the manual order picking system, we expect the results to be applicable to automatic systems as well, where AS/R machines are responsible for the picking operations. Equipped with a dual or triple shuttle, an AS/R machine is capable of picking a small number of orders simultaneously, just like a human picker who uses a multi-bin picking cart. Given this analogy, one can see the possible applicability to automatic systems: consider, for example, an AS/R machine that operates in a Blockbuster DVD rental center. Customers ask to rent DVDs at random times during the day, and a picking policy for the S/R machine must be defined with the purpose of maximizing the customer service level (minimizing order tardiness). The structure of this paper is as follows. In Section 2 the problem description is presented. Section 3 formulates the problem as a Markov Decision Process (MDP) and briefly outlines the solution algorithm. In Section 4 the optimal solution is compared with naïve batching strategies and some numerical results are presented. Section 5 analyzes a new heuristic, which is developed on the basis of the optimal strategies' properties learned from the MDP solutions. The performance of the heuristic is then compared both with the optimal approach and with the naïve heuristics. Finally, in Section 6 we discuss the main contribution of the paper and indicate further research opportunities.

2. Problem description

The problem studied can be outlined in the following manner. Orders, each of a single line item, are picked by one picker who uses a cart of limited capacity.
Different orders/items are placed in different bins of the cart during the picking tour. This picking method is referred to as sort-while-pick. Orders arrive according to a Poisson process with a mean of λ orders per period of time. All orders are supplied according to the same service level, by having the same customer lead time. Whenever an order is supplied after its due date, a penalty proportional to the number of tardiness periods is incurred. A finite horizon is considered, as the warehouse is closed at the end of each working day, after fulfilling all the orders of that day. Consequently, another kind of penalty is incurred whenever the picker keeps working after the end of the working day. This penalty is proportional to the number of overtime periods. The fundamental trade-off in our problem can be explained as follows. At every period, regardless of whether a new order has arrived, the picker has to decide whether to go on a picking tour and supply the orders accumulated so far, or to wait for more orders to arrive (to batch orders). The former decision may speed up the supply of the currently available orders; however, the picker may thereby miss an opportunity to batch more orders had he waited one more period. That is, by waiting, the picker risks higher tardiness of existing orders in exchange for potentially lower tardiness of future orders. Our goal is to set a decision-making policy that minimizes the average cost of order tardiness and worker overtime during a finite working day. It is clear that the time to pick a batch of n orders changes according to their storage locations in the warehouse. In this model, however, we assume that the picking tour time of n items, T(n), is an increasing function of the number of items, n, and independent of their locations. In addition, we assume that T(n) is a concave function of n, and therefore there is a motivation for batching items before going on a tour.

3. The solution approach

We have modeled the above problem via an MDP. The time horizon over which the optimal order picking policy is obtained is finite and associated with one working day. The working day is divided into periods of time, where the period length is set in such a way that the probability of the arrival of more than one order during a period is negligible. At the beginning of each period (i.e., at each decision epoch), the decision maker has two actions to choose from: to go on a picking tour or to wait another period. If he decides to wait, he receives no reward.
Otherwise, if he decides to go on a picking tour, he incurs a cost proportional to the tardiness of the orders batched so far. The system then evolves into the next decision epoch according to the transition probability matrix of the Markov chain. Each state is defined by the number of orders batched and their corresponding remaining times to supply. The number of states in our problem is finite, since the picker cannot accumulate more orders than the number of bins in the picking cart, and since we have defined the waiting time of an order to be limited (i.e., every order must be supplied within a predefined amount of time). The elements of the MDP model are described next.
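As a side check, the requirement that at most one order arrives per period can be quantified directly for a Poisson stream. The sketch below (with illustrative per-period values of λ, not taken from the paper) computes the probability of more than one arrival in a single period:

```python
from math import exp

def p_more_than_one(lam_per_period: float) -> float:
    """P(more than one Poisson arrival in one period), i.e. 1 - P(0) - P(1)."""
    return 1.0 - exp(-lam_per_period) * (1.0 + lam_per_period)

for lam in (0.05, 0.1, 0.2):  # illustrative per-period arrival rates
    print(lam, round(p_more_than_one(lam), 5))
```

For small λ this probability is of order λ²/2, so shortening the period (and scaling λ down accordingly) makes the single-arrival assumption as accurate as desired.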

Decision epochs

Let {1, 2, ..., N} be a finite set of decision epochs, where decision epoch N denotes the end of a working day. According to the policy of the DC, no orders arrive in the last I periods of the working day, in order to allow the picker to supply all the orders that arrived during the first (N − I) periods. If I is chosen to be relatively small, then there is a good chance that the DC would have to remain open after decision epoch N and therefore pay for overtime. If I is relatively large, then there is only a small chance of overtime.

System states

Let S = S' ∪ Δ denote the set of the possible system states, where S' is the set of states describing the order batching process, and Δ is the set of states describing the picking tour. Let γ_i denote the remaining time to supply order i, and Γ_n = (γ_1, γ_2, ..., γ_n) the vector of remaining times to supply the n orders batched so far, where γ_1 < γ_2 < ... < γ_n. Recall that the strict inequality results from the fact that at most one order can enter the system within a single period. A member of the set S' = {s' | s' = (n, (γ_1, γ_2, ..., γ_n))} contains the number of orders batched, n, and their corresponding remaining times to supply, Γ_n. For example, the system's state s' = (2, (3, 5)) implies that two orders were batched so far: the first order is due in three periods and the second one in five periods. The state set S' is bounded, because the values of n and γ_i are bounded. For all i, γ_i is bounded from above by d, the planned lead time of an order, and from below by L, the lowest time left to the due date, so that L ≤ γ_i ≤ d; n is bounded by the number of bins in the picking cart, C. In case either the remaining time to supply the oldest order, γ_1, reaches the value of L, or the cart is full, the picker is forced to go on a picking tour. The state s' = (0, ∅) describes the system with no orders. As mentioned above, Δ is the set of states describing the picking tour.
The members of this set are defined by the time left to the end of the picking tour and the total length of the tour; i.e., Δ = {δ | δ = (k, T(n))}, where k is the time left to the end of the tour and T(n) is the length of the tour. For example, the system's state δ = (3, T(5)) implies that a picking tour will be over in three periods and its total length is T(5) periods. The Δ state space counts the periods left in the picking tour, in order to determine the epoch at which the system comes back to the S' state space. The tour length is also kept as part of the state in order to calculate the correct transition probability to the S' state space.
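To make the finiteness of S = S' ∪ Δ concrete, the following sketch enumerates both state sets for small, hypothetical values of C, d, L and a hypothetical tour time function (none of these values come from the paper's experiments):

```python
from itertools import combinations

# Illustrative parameters, kept small so the state space is easy to inspect.
C, d, L = 2, 4, 0

def T(n):
    """Hypothetical tour-time function (increasing in n)."""
    return 2 + n

# S': the empty state (0, ()) plus every (n, (gamma_1 < ... < gamma_n))
# with L <= gamma_i <= d and n <= C. combinations() yields strictly
# increasing tuples, matching the strict inequality gamma_1 < ... < gamma_n.
batch_states = [(0, ())]
for n in range(1, C + 1):
    for gammas in combinations(range(L, d + 1), n):
        batch_states.append((n, gammas))

# Delta: (k, T(n)) for k = T(n)-1, ..., 1, plus the absorbing state (0, 0).
tour_states = {(0, 0)}
for n in range(1, C + 1):
    for k in range(1, T(n)):
        tour_states.add((k, T(n)))

print(len(batch_states), len(tour_states))
```

With C = 2, d = 4, L = 0 this yields 1 + 5 + 10 = 16 batching states and 6 tour states, illustrating why backwards induction over the full state space is tractable.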

The state set Δ consists of the following members:

Δ = {(T(n) − 1, T(n)), (T(n) − 2, T(n)), ..., (1, T(n)), (0, 0)} for all n = 1, ..., C,

where δ = (0, 0) is the state of a picking tour that ends in one of the last I periods of the working day (i.e., an absorbing state, since no more orders arrive in the last I periods).

Actions

The action set A_s depends on state s and includes at most two actions for each state. The first action, a_1, is to wait for one more period, and the second action, a_2, is to go on a picking tour. Clearly, a choice of a_2 is prohibited during a picking tour and when no orders have been batched in the system. More precisely,

A_s = {a_1}, if s ∈ Δ, or if s ∈ S' and n = 0;
A_s = {a_2}, if s ∈ S' and n = C or γ_1 = L;
A_s = {a_1, a_2}, otherwise.

Rewards

Whenever action a_1 is chosen, the decision maker receives no reward. If action a_2 is chosen, then an immediate penalty, which is proportional to the tardiness of all the orders accumulated thus far, is incurred. Notice that since the length of the tour given n is assumed known, T(n), the tardiness can be calculated before the tour has actually started. We denote by c_T the tardiness penalty per period and by c_O the overtime penalty per period. Note that the overtime penalty is incurred only once, at the end of the working day, in epoch N. The value of the tardiness penalty at every time of the working day, t, and for every possible action and state combination, is

r_t(s, a) = 0, if a = a_1;
r_t(s, a) = −c_T Σ_{i=1..n} max(T(n) − γ_i, 0), if a = a_2 and s = (n, (γ_1, ..., γ_n)),

and the overtime penalty value at time N is

r_N((k, T(n))) = −k c_O, k = 1, ..., T(n);
r_N((0, 0)) = 0.

Transition probabilities

At each decision epoch t, given the system state s and the action a, we determine the probability of reaching a state j in the subsequent decision epoch t + 1. Two additional assumptions are made: the expected time to pick one order is larger than two periods; and the probability of more than C orders entering during the longest picking tour, T(C), is negligible. The transition probabilities differ in two distinct time frames. The first time frame comprises the first N − I periods, during which orders can enter the system; the second time frame comprises the last I periods, during which orders do not enter the system. The transition probability matrix of the first time frame is presented in (1). For t < N − I,

P_t(j | s, a) =
  1, if s ∈ S', n > 0; j ∈ Δ, j = (T(n) − 1, T(n)); a = a_2;
  1, if s ∈ Δ, s = (k, T(n)); j ∈ Δ, j = (k − 1, T(n)), k = 2, ..., T(n); a = a_1;
  e^(−λ), if s ∈ S', s = (n < C, (γ_1 > L, γ_2, ..., γ_n)); j ∈ S', j = (n, (γ_1 := γ_1 − 1, ..., γ_n := γ_n − 1)); a = a_1;
  1 − e^(−λ), if s ∈ S', s = (n < C, (γ_1 > L, γ_2, ..., γ_n)); j ∈ S', j = (n + 1, (γ_1 := γ_1 − 1, ..., γ_n := γ_n − 1, γ_{n+1} := d)); a = a_1;
  P_{n'}, if s ∈ Δ, s = (1, T(n)); j ∈ S', j = (n', (γ_1 = d − (T(n) − p_1), ..., γ_{n'} = d − (T(n) − p_{n'}))) if n' > 0, or j = (0, ∅) if n' = 0; a = a_1;
  0, otherwise,     (1)

where P_{n'} = [e^(−λT(n)) (λT(n))^{n'} / n'!] / C(T(n), n').

In the first line of (1), action a_2 (go on tour) is chosen, and the system evolves into the set of the picking tour states with probability 1. The remaining tour time in the next state, j, is T(n) − 1. In the second line, the system occupies a state from Δ and moves to another state from Δ with probability 1; the remaining tour time is decreased by one period. This is true for all Δ states apart from δ = (1, T(n)). The third and fourth lines consider a case in which the system occupies a state from S' and does not have to go on a tour immediately; that is, the number of batched orders is smaller than C and the oldest order has more than L periods left to its due date. Then, if action a_1 is chosen, the next state is determined by whether an order has entered the system (line 4) or not (line 3). The fifth line addresses a transition from the state δ = (1, T(n)) into a state from S'.
The transition to a specific state, j, is determined by both the number of orders that entered the system during the picking tour, n', and the time periods of the picking tour, denoted by p_1, p_2, ..., p_{n'}, in which the n' orders entered the system. To elaborate on the transition presented in the fifth line of (1), we consider the following example, presented in Figure 1. Let a picking tour last five periods, and let two orders enter the system during that tour, at periods 2 and 4. Then, n' = 2, p_1 = 2 and p_2 = 4.

Figure 1. Arrivals during a picking tour (after which the system returns to a state from S').

The state to which the system transits is defined as (2, (d − 3, d − 1)): there are two orders to be supplied, with ages of three periods and one period, respectively. The probability P in the example is calculated as follows. The probability that two orders arrive within a picking tour of five periods is e^(−5λ)(5λ)^2 / 2!. Due to the lack-of-memory property of the Poisson distribution, the two orders could have entered in any two of the five periods with the same probability. Since the number of such options is C(5, 2), the transition probability is finally obtained as P = [e^(−5λ)(5λ)^2 / 2!] / C(5, 2).

The transition probability matrix for the second time frame is given in (2). In this frame a picking tour is taken immediately, since no new orders can arrive. For N − I ≤ t < N,

P_t(j | s, a) =
  1, if s ∈ S', n > 0; j ∈ Δ, j = (T(n) − 1, T(n)); a = a_2; N − t < T(n);
  1, if s ∈ S', n > 0; j ∈ Δ, j = (0, 0); a = a_2; N − t ≥ T(n);
  1, if s ∈ S', n = 0; j ∈ Δ, j = (0, 0); a = a_1;
  1, if s ∈ Δ, s = (0, 0); j ∈ Δ, j = (0, 0); a = a_1;
  1, if s ∈ Δ, s = (k, T(n)); j ∈ Δ, j = (k − 1, T(n)), k = 2, ..., T(n); a = a_1;
  P*, if s ∈ Δ, s = (1, T(n)); j ∈ S', j = (n', (γ_1 = d − (T(n) − p_1), ..., γ_{n'} = d − (T(n) − p_{n'}))), or j = (0, ∅) if n' = 0; a = a_1; t + 1 − T(n) < N − I;
  0, otherwise,     (2)

where P* = [e^(−λ(N − I − t − 1 + T(n))) (λ(N − I − t − 1 + T(n)))^{n'} / n'!] / C(N − I − t − 1 + T(n), n').
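The example's probability can be checked numerically. The sketch below assumes an illustrative value λ = 0.1 (the paper does not fix λ at this point), and verifies that summing the per-placement probability over all placements recovers the Poisson probability of exactly two arrivals during the tour:

```python
from math import comb, exp, factorial, isclose

def poisson_pmf(k: int, mean: float) -> float:
    """P(exactly k arrivals) for a Poisson count with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

def p_specific_arrival_pattern(lam: float, tour_len: int, n_arrivals: int) -> float:
    """Probability of one specific placement of n_arrivals single arrivals
    within a tour of tour_len periods (uniform over all placements)."""
    return poisson_pmf(n_arrivals, lam * tour_len) / comb(tour_len, n_arrivals)

lam = 0.1  # illustrative arrival rate per period
P = p_specific_arrival_pattern(lam, tour_len=5, n_arrivals=2)

# Summing over all C(5, 2) equally likely placements recovers the Poisson
# probability of exactly two arrivals during the five-period tour.
assert isclose(comb(5, 2) * P, poisson_pmf(2, 5 * lam))
print(round(P, 6))
```

The same construction, with the arrival window shortened to N − I − t − 1 + T(n) periods, gives P* of the second time frame.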

In the first line of (2), N − t < T(n), and hence the last tour does not end before the end of the working day. In this case, the overtime length is kept for future calculation of the overtime cost. In the second line, there is enough time to complete the tour, and the system evolves to the absorbing state. In the third and fourth lines, the system moves to the absorbing state immediately. The dynamics of the states within a picking tour is addressed in the fifth line. The sixth line is similar to line 5 in (1); the only difference is that orders enter the system only in the first N − I − t − 1 + T(n) periods of the tour, rather than during the entire tour.

An optimal policy

The problem described above is characterized by a finite set of states, S, and a finite set of actions, A_s, for each s ∈ S. Therefore, there exists an optimal deterministic Markovian policy, as stated in Puterman (1994). Let u*_t(s_t) be the maximum total expected reward, starting from state s_t at decision epochs t, t + 1, ..., N. Then, u*_t(s_t) is obtained by the following backwards induction algorithm, which also gives the optimal actions A*_{s_t, t} for each state and each epoch.

The backwards induction algorithm
1. Set t = N and u*_N(s_N) = r_N(s_N) for all s_N ∈ S.
2. Substitute t − 1 for t and compute u*_t(s_t) for each s_t ∈ S by
   u*_t(s_t) = max_{a ∈ A_{s_t}} { r_t(s_t, a) + Σ_{j ∈ S} P_t(j | s_t, a) u*_{t+1}(j) }.
   Set A*_{s_t, t} = arg max_{a ∈ A_{s_t}} { r_t(s_t, a) + Σ_{j ∈ S} P_t(j | s_t, a) u*_{t+1}(j) }.
3. If t = 1, stop. Otherwise return to step 2.

4. Experiments

4.1. MDP model versus naïve heuristics

The main objectives of the experiments conducted in this section are:
- To validate the mathematical model.
- To evaluate the possible cost reduction from applying the proposed approach in a real order picking system.
- To gain insights into the structure and the properties of optimal solutions that will assist in developing new MDP based heuristic methods.
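The backwards induction recursion above can be sketched for a generic finite-horizon MDP. The states, rewards and transition kernel below are a hypothetical toy instance, not the order-picking model itself; the loop structure is what matters:

```python
# Backwards induction for a finite-horizon MDP, following steps 1-3 above.
# Toy instance: two states, penalties expressed as negative rewards.

N = 4                                    # decision epochs 1..N
states = ["wait", "tour"]
actions = {"wait": ["a1", "a2"], "tour": ["a1"]}

def r(t, s, a):
    """Illustrative reward: terminal penalty in 'tour', cost for acting a2."""
    if t == N:
        return -2.0 if s == "tour" else 0.0
    return -1.0 if a == "a2" else 0.0

def P(t, j, s, a):
    """Illustrative transition kernel (each row sums to 1)."""
    if a == "a2":
        return 1.0 if j == "tour" else 0.0
    return 0.5  # under a1, move to either state with equal probability

u = {s: r(N, s, None) for s in states}   # step 1: u*_N = r_N
policy = {}
for t in range(N - 1, 0, -1):            # steps 2-3: t = N-1, ..., 1
    u_next = {}
    for s in states:
        value, best_a = max(
            (r(t, s, a) + sum(P(t, j, s, a) * u[j] for j in states), a)
            for a in actions[s]
        )
        u_next[s] = value
        policy[(s, t)] = best_a
    u = u_next

print(policy[("wait", 1)], u["wait"])
```

In the paper's model, the outer loop runs over the day's periods and the inner loops over S = S' ∪ Δ, with r and P given by the reward and transition definitions of Section 3.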

In order to implement the MDP model, we have developed a computer code and obtained as output a table containing the optimal policy. Each row in the table stands for one of the possible system's states, excluding the Δ states, which do not involve any decision. Each column of the table stands for one time period of the working day. The data within the table specify the optimal action choice: "1" means "go on tour" and "0" means "wait another time period". An example is presented in Figure 2, where the upper left hand side of an optimal policy table is shown. For demonstration purposes, the "go on tour" policy was painted green, while the "wait another time period" policy was painted red. One can see, for example, that when the system contains a single item, the picker waits when the time to supply the order is relatively large; however, when this value is smaller than or equal to 6, the picker goes on a tour.

Figure 2. An example of an optimal policy table.

A simulation model of the order picking system was developed to evaluate the performance of the proposed MDP based solution procedure versus two naïve heuristics. The first heuristic (referred to as the Green heuristic from here on) is quite straightforward: whenever an order is waiting to be picked and the picker is available, the picker goes on a picking tour. The second heuristic (referred to as the Slack heuristic from here on) indicates that "waiting another time period" is preferred as long as no certain tardiness will occur. We say that the system has slack if the orders' picking time is smaller than the orders' remaining times to supply. The heuristic is called Slack since, as long as there is slack available in the system, the action choice is "wait another time period"; once there is no slack, the action choice is "go on tour".
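The two naïve decision rules can be stated compactly. The parameter values and tour time function below are hypothetical, chosen only to exercise the rules:

```python
# Illustrative parameters (hypothetical, not the paper's experimental values).
C, d, L = 3, 10, 0

def T(n):
    """Hypothetical tour time for n items."""
    return 4 + 2 * n

def green_action(gammas):
    """Green heuristic: go on tour whenever any order is waiting."""
    return "tour" if gammas else "wait"

def slack_action(gammas):
    """Slack heuristic: wait as long as no tardiness is certain, i.e. as long
    as the tour time for the current batch does not exceed the remaining time
    to supply of the oldest batched order (and no forced-tour condition holds)."""
    if not gammas:
        return "wait"
    n = len(gammas)
    forced = n == C or min(gammas) == L
    has_slack = T(n) <= min(gammas)
    return "wait" if has_slack and not forced else "tour"

print(green_action(()), slack_action((9,)), slack_action((3, 5)))
```

For example, with one order due in 9 periods and T(1) = 6, Slack waits; with two orders due in 3 and 5 periods and T(2) = 8, tardiness is certain and Slack goes on tour, as would Green.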

When examining the tables of the optimal policies obtained in the experiments, we were able to identify two major effects, demonstrated in Figure 3:

1. Steady state effect: at a certain point in time, far back from the end of the working day, the optimal policy becomes independent of time. In fact, at the steady state, the optimal policy can be expressed by a vector instead of a table, as each element denotes the optimal action given a certain state. The steady state effect is clearly illustrated in the left hand side of Figure 3(a).

2. Transient state effect: toward the end of the working day, the optimal policy shows a time-dependent, irregular pattern, as different actions are associated with the same state at different points in time (see the right hand side of Figure 3(a)). Note that in the last I periods, where no new orders arrive, the only action is "go on tour". Moreover, in some experiments we were able to identify an additional red shape adjacent to the last I periods, denoted as the tail. An example of such a tail is shown in Figure 3(b). In the "tail" region, despite the certain tardiness, the picker chooses to wait in order to save future overtime costs. We were also able to identify an influence of the cost parameters on the transient state, as the length of this state increases with the ratio of the overtime and tardiness cost parameters.

Figure 3. The two major effects identified in the optimal policy ((a) and (b)).

Another observation indicated that the optimal solution is mostly "green", i.e., the action "go on tour" is chosen more frequently than the action "wait another time period". We believe such behavior results from the relatively low order arrival rate. Indeed, when the arrival rate is low, the chance to batch an additional order while waiting another period is relatively low as well.

4.2. Experimental design

When analyzing the results of the preliminary experiments, we have determined the configuration of the final experiments in such a way that all the assumptions of the model are satisfied and all aspects of the optimal policy are clearly expressed. In particular, λ is chosen so that the probability of more than C orders arriving during a picking tour is small enough (1%). For tractability purposes, the value of C was set to three orders. The values of the other parameters are detailed in Table 1. Overall, we have conducted 25 experiments that model 25 different warehouse configurations. The tour time function is chosen with T(2) = T(1) + 1 and T(3) = T(2) + 1, which satisfies the concavity assumption of Section 2.

Table 1. The values of the problem parameters in the main experiments: C = 3; I = T(3) − 1; d = 25, 27, 30, 32, 35; L = 0; Slack: d − T(1) = 5, 7, 10, 12, 15.

4.3. Analysis of results

As noted above, the optimal policy was compared via simulation against the two naïve heuristics, denoted as Green and Slack. The simulation results are presented in Table 2, where each row refers to one experiment. The order lead time, d, and the system's slack (defined as d − T(1)) are displayed in the second and third columns, respectively. Next, the slack as a percentage of d is given. Column 5 contains the MDP optimal steady state vector, determined by the parameters n1 and n2, which represent the threshold between the green area and the red area in terms of the slack for one- and two-order states, respectively. The value Slack indicates that all the slack periods are painted green, and therefore the entire steady state vector is green. The value "0" indicates that all the slack periods are red, and therefore the steady state vector becomes identical to the vector of the Slack heuristic. The value "1" indicates that all of the slack periods, except the last one, are painted red. Surprisingly, in all the experiments, n1 and n2 took only three values: 0, 1 and Slack.
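Under the reading that the 25 configurations pair every lead time d with every slack value (as the sorted listing in Table 4 suggests), the range of slack percentages can be reproduced directly:

```python
# Slack percentage, 100 * slack / d, over the 25 experimental configurations
# (d and slack values as listed in Table 1; the full pairing is inferred).
ds = [25, 27, 30, 32, 35]
slacks = [5, 7, 10, 12, 15]

pcts = sorted(round(100 * s / d) for d in ds for s in slacks)
print(pcts[0], pcts[-1])  # smallest and largest slack percentage
```

The extremes, 5/35 and 15/25, give roughly 14% and 60%, matching the range of slack percentages reported in Tables 3 and 4.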
That is, as slack decreases (from 15 down to 5), n1 and n2 decrease as well, while skipping all intermediate values between Slack and 1. The existence of a tail is indicated in column 6. Note that the tail appears only for states with two orders. Columns 7, 8 and 9 contain the average daily cost (based on a simulation of 10,000 replications of a working day) of the optimal policy, the Green heuristic and the Slack heuristic, respectively. The last column contains the relative improvement of the optimal policy against the best heuristic.

Table 2. Experimental results. Each row reports one experiment: the lead time d, the slack and slack percentage, the MDP steady state vector (n1, n2), whether a tail exists, the average daily costs of the optimal policy, the Green heuristic and the Slack heuristic, and the relative improvement over the best heuristic [%].

The results indicate that the MDP optimal solution outperforms both heuristics in all the experiments; i.e., its average cost is always lower than the average costs of the two heuristics. Another observation, demonstrated in Figure 4, is that the slack percentage predicts the relative improvement over the best heuristic quite accurately, as the improvement percentage increases with the slack percentage.

Figure 4. The relative improvement as a function of the slack value.

Clearly, systems with larger slacks suffer from less tardiness and consequently enjoy lower average costs. One can notice from Table 2 that cases with high relative improvement are associated with low absolute values of improvement, which may sometimes be negligible. To stress this point, we have divided the results into three groups with respect to high, medium and small relative slack (see Table 3). In the medium relative slack scenarios, the average improvement is still significant while the average cost is far from negligible. Therefore, we conclude that the strength of our model lies in medium slack size scenarios.

Table 3. Absolute improvement versus relative improvement, by slack percentage range (44%-60%, 26%-43% and 14%-23%): average cost and improvement over the best heuristic.

The steady state vector of the Green heuristic is characterized by n1 = Slack and n2 = Slack. Similarly, for the Slack heuristic, n1 = 0 and n2 = 0. Now, we can notice that in all of the experiments, the MDP steady state vector is almost identical (there might be a difference of one or two action choices in the entire vector) to the steady state vector of one of the two naïve heuristics. Therefore, we conclude that the major part of the MDP model's benefit is due to the transient state effect. In addition, the structure of the optimal policy indicates that the higher the slack percentage, the more preferable the Green heuristic is against the Slack heuristic.

5. Heuristic methods

5.1. Background

In this section, a heuristic approach for large-scale problems is proposed. To this end, the structure of the optimal policy, expressed by the colored table (see Section 4.1), was analyzed. Fortunately, regular patterns were identified in the optimal policy. These patterns and their characteristics were the cornerstones of our heuristic design. We distinguish between patterns of the steady state and patterns of the transient state, and use these patterns in developing the heuristic. The main purpose of the proposed heuristic is to provide a close to optimal procedure which outperforms the best practice heuristics, named Green and Slack in the previous section. The patterns of the optimal procedure are outlined next.

Patterns in the steady state

We define the steady state as a time frame in which the action choice depends only on the system's state s and not on the decision epoch t. Thus, the steady state can be defined by a policy vector instead of a policy table. According to the optimal results, the steady state vector has only a few configurations. The structure of the steady state is described by the two parameters n1 and n2, each of which takes only three values. Specifically, one form of the steady state is a green vector, which prescribes going on tour for every possible state. Figure 5(a) illustrates such a case. Another form of the steady state vector is one of full slack usage, or one of full slack usage minus one. This is illustrated in Figure 5(b). Rarely, the slack usage could be uneven between states with two orders and states with one order; namely, n1 and n2 are not necessarily equal in all of the optimal solutions.
Full slack usage indicates that system states, in which slack is available, are painted red. Similarly, full slack usage minus one indicates that the same states are painted red, apart from the state with only one slack time period. This state is painted green. 6
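The vector-versus-table distinction can be illustrated with a simplified sketch (invented names; the paper's states also encode the number of waiting orders, whereas state r below only tracks the remaining slack of the oldest order):

```python
# Illustrative sketch: a steady-state policy is a vector over states;
# a policy table additionally indexes decision epochs.
WAIT, GO = "a1", "a2"   # a1 = wait another period, a2 = go on tour

def steady_state_vector(slack: int, minus_one: bool = False) -> dict:
    """Full slack usage: wait (red) whenever slack remains; the
    'minus one' variant turns the one-period-left state green."""
    vec = {r: (WAIT if r > 0 else GO) for r in range(slack + 1)}
    if minus_one:
        vec[1] = GO
    return vec

def to_policy_table(vector: dict, n_epochs: int) -> dict:
    """Paint every decision epoch with the same vector."""
    return {(t, r): a for t in range(n_epochs) for r, a in vector.items()}

vec = steady_state_vector(slack=5, minus_one=True)
table = to_policy_table(vec, n_epochs=3)
```

A green vector would simply map every state to GO; the table form becomes necessary only in the transient stage, where the action also depends on t.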

Figure 5. Steady state and transient patterns for a green solution with four red cubes (a) and a full slack usage solution (b).

When analyzing the results of the main experiments, we noticed that the steady state vector is strongly linked to the slack percentage. This observation was extremely helpful in constructing the heuristic policy. Table 4 shows the 25 experiments, sorted by the slack percentage. It is easily seen that (i) in low slack percentage systems the steady state is described by a green vector (i.e., n1 = n2 = 0); (ii) in medium slack percentage systems the steady state is described by a full slack usage minus one vector (i.e., n1 = n2 = 1); (iii) in high slack percentage systems the steady state is described by a full slack usage vector (i.e., n1 = n2 = Slack). As mentioned above, note that in experiments 3 and 7 the values of n1 and n2 are not equal. We refer to this issue later on.

Table 4: The 25 experiments sorted by the slack percentage.

  # exp.   MDP steady state (n1,n2)   Slack   Slack (%)   Tail (Y/N)   d
  15       (0,0)                      5       14%         Y            35
  20       (0,0)                      5       16%         Y            32
  5        (0,0)                      5       17%         Y            30
  10       (0,0)                      5       19%         Y            27
  14       (0,0)                      7       20%         Y            35
  25       (0,0)                      5       20%         Y            25
  19       (0,0)                      7       22%         Y            32
  4        (0,0)                      7       23%         Y            30
  9        (0,0)                      7       26%         Y            27
  24       (0,0)                      7       28%         Y            25
  13       (0,0)                      10      29%         Y            35
  18       (0,0)                      10      31%         Y            32
  3        (0,1)                      10      33%         N            30
  12       (0,0)                      12      34%         N            35
  8        (1,1)                      10      37%         N            27
  17       (1,1)                      12      38%         N            32
  2        (1,1)                      12      40%         N            30
  23       (1,1)                      10      40%         N            25
  11       (1,1)                      15      43%         N            35
  7        (1,Slack)                  12      44%         N            27
  16       (Slack,Slack)              15      47%         N            32
  22       (Slack,Slack)              12      48%         N            25
  1        (Slack,Slack)              15      50%         N            30
  6        (Slack,Slack)              15      56%         N            27
  21       (Slack,Slack)              15      60%         N            25

Patterns in the transient state

In the transient time, just before the end of the planning period, the system shows unstable behavior. Nevertheless, clear and repetitive patterns still exist. One clear pattern occurs in systems whose steady state vector is not green: in these cases, at least three green holes are seen in the policy table, as illustrated in Figure 5(b). Another noticeable pattern occurs in systems whose steady state vector is green: in these cases, at least four red cubes are observed in the policy table, as illustrated in Figure 5(a). Furthermore, the thickness of the cubes is the same in most of the cases. The transient state patterns also depend strongly on the slack percentage. Interestingly, they also depend on the length of the picking tour. For example, we have discovered that the exact starting position of each of the three green holes is determined by T(1). This is illustrated in Figure 6.

Figure 6. The position of the three green holes, determined by T(1).

The tail patterns are also apparent in the transient state. These patterns usually appear in low slack systems; specifically, in systems with a slack percentage lower than 32% (see Table 4). The tail is typically characterized by a fixed thickness and appears at specified places in the policy table (see Figure 3(b)). The last pattern associated with the transient state shows that the last I periods are always green, since in that time period no orders arrive, and therefore there is no need to wait.

Key points of the heuristic design

Three main principles have guided us in designing the heuristic approach:

1. A rough cut of the optimal policy. The basic idea of our design is to follow the general visual form of the optimal policy table. Consequently, we identify several types of problems based on their parameters and construct a typical generic heuristic policy for each problem type, based on the above patterns of the optimal policy. Still, our heuristic policy does not imitate the exact pattern of the optimal policy. For example, we ignore the jagged left side of the red patterns shown in Figure 6 and replace it with a rectangular pattern.

2. The maximum similarity principle. The heuristic policy comprises several parameters. The strategy for setting all these parameters was based on the results of the optimal solutions. Consequently, we identified empirical properties of the optimal solution with regard to each parameter, and determined the parameters of the heuristic policy accordingly.

3. A don't damage approach. The heuristic policy attempts to achieve better results than the two naive heuristics. Accordingly, we wanted the MDP heuristic policy to indicate action choices different from the naïve heuristics only when such an action yields improved performance over the best naïve heuristic. Therefore, we were very conservative in the parameter setting. When the optimal policy follows a pattern similar to the one in Figure 5(a), the green policy clearly outperforms the slack heuristic; in this case, we added only those cubes that were observed in all cases. Similarly, when the optimal policy follows a pattern similar to the one in Figure 5(b), the slack heuristic clearly outperforms the green heuristic; in these cases, only those green holes that were identified in all of the cases were added.

Figure 7 illustrates the rough cut approach by showing four policies, two optimal and two heuristic, for two problems.

Figure 7. Optimal versus heuristic policy in high and low slack percentage systems: (a) optimal policy (high slack); (b) optimal policy (low slack); (c) heuristic policy (high slack); (d) heuristic policy (low slack).

5.2 Algorithmic formulation

The following steps serve as instructions for constructing the MDP-based heuristic. These instructions are general and fit different warehouses with different configurations. Every parameter in the formulation below was set according to the maximum similarity principle and the don't damage approach described above.

Step 1: Calculate the slack percentage.

Step 2: Set n1 and n2 in the following manner:

  n1 = 0 if 0 <= slack percentage <= 0.36; 1 if 0.36 < slack percentage <= 0.46; Slack if 0.46 < slack percentage.
  n2 = 0 if 0 <= slack percentage <= 0.36; 1 if 0.36 < slack percentage <= 0.44; Slack if 0.44 < slack percentage.

Step 3 (generalize the steady state vector over the entire policy table):
1. According to the values of n1 and n2, set the steady state policy vector.
2. Use the steady state policy vector to paint the entire policy table in a unified manner.

Step 4 (set the three green holes or the four red cubes): If the slack percentage is lower than 0.44, perform the steps in the left column (constructing three green holes); otherwise, perform the steps in the right column (constructing four red cubes).

Three green holes:
1. t1 = N - I - T(1): the starting point of the first green hole.
2. t2 = t1 - T(1): the starting point of the second green hole.
3. t3 = t2 - T(1): the starting point of the third green hole.
4. A = T(1): the length of the first hole.
5. B = T(1): the length of the second hole.
6. C = T(1): the length of the third hole.
7. V1 = {t1, t1+1, ..., t1+A}: the group of decision epochs in which the first hole is present.
8. V2 = {t2, t2+1, ..., t2+B}: the group of decision epochs in which the second hole is present.
9. V3 = {t3, t3+1, ..., t3+C}: the group of decision epochs in which the third hole is present.
10. V = V1 ∪ V2 ∪ V3.
Set the three green holes as follows: for every decision epoch t in V and for every state s in S, set a = a2 (go on tour).

Four red cubes:
1. t1 = N - I: the starting point of the first red cube.
2. t2 = t1 - T(3): the starting point of the second red cube.
3. t3 = t2 - T(3): the starting point of the third red cube.
4. t4 = t3 - T(3): the starting point of the fourth red cube.
5. A = T(3): the length of the first cube.
6. B = T(3): the length of the second cube.
7. C = T(3): the length of the third cube.
8. D = T(3): the length of the fourth cube.
9. W1 = {t1, t1-1, ..., t1-A}: the group of decision epochs in which the first cube is present.
10. W2 = {t2, t2-1, ..., t2-B}: the group of decision epochs in which the second cube is present.
11. W3 = {t3, t3-1, ..., t3-C}: the group of decision epochs in which the third cube is present.
12. W4 = {t4, t4-1, ..., t4-D}: the group of decision epochs in which the fourth cube is present.
13. W = W1 ∪ W2 ∪ W3 ∪ W4.
Set the four red cubes as follows: for every decision epoch t in W and for every state s that uses the slack in full except the last unit (i.e., n1 and n2 equal 1
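Under our reading, Step 4 reduces to computing two sets of decision epochs. The sketch below is a reconstruction, not the authors' code; the sign conventions for the cube sets and names such as green_hole_epochs are ours:

```python
# Reconstruction sketch of Step 4: compute the decision epochs occupied
# by the three green holes or the four red cubes. N = working-day
# length, I = final always-green stretch, t1_pick / t3_pick = the
# picking-tour lengths T(1) and T(3).

def green_hole_epochs(N: int, I: int, t1_pick: int) -> set:
    """Three holes of length T(1), spaced T(1) apart, ending at N - I."""
    first = N - I - t1_pick
    starts = [first - k * t1_pick for k in range(3)]
    return {t for s in starts for t in range(s, s + t1_pick + 1)}

def red_cube_epochs(N: int, I: int, t3_pick: int) -> set:
    """Four cubes of length T(3), placed backwards from N - I."""
    first = N - I
    starts = [first - k * t3_pick for k in range(4)]
    return {t for s in starts for t in range(s - t3_pick, s + 1)}

def carve(table: dict, slack_pct: float, N: int, I: int,
          t1_pick: int, t3_pick: int) -> dict:
    """Apply Step 4 to a policy table keyed by (epoch, state)."""
    if slack_pct < 0.44:
        hole = green_hole_epochs(N, I, t1_pick)
        for (t, s) in table:
            if t in hole:
                table[(t, s)] = "a2"        # go on tour
    else:
        cube = red_cube_epochs(N, I, t3_pick)
        for (t, s) in table:
            if t in cube and s == 1:        # one slack period remaining
                table[(t, s)] = "a1"        # wait another period
    return table
```

Note that under this reading the three holes tile the interval [N - I - 3T(1), N - I] and the four cubes tile [N - I - 4T(3), N - I], which matches the repetitive spacing visible in Figure 6.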

), set a = a1 (wait another time period).

Step 5 (set the tail): On the basis of the slack percentage, set m1 and m2 (which determine whether a tail exists in states of one order and two orders, respectively), the tail thickness, and the stopping state of the tail. The stopping state indicates that no tails appear below it. The tail's parameters are determined according to Table 5.

Table 5. Tail parameter determination (m1 = 1 if a tail exists in one-order states and 0 otherwise; likewise m2 for two-order states; the tail thickness is uniform; the stopping state satisfies γ2 = T(2) in all but the highest slack range, where γ2 = T(2)+1).

Based on these parameters (m1, m2, tail thickness, stopping state), set the action a1 (wait another period) on the regions of the tail.

5.3 Experiments

After completing the design of the MDP-based heuristic, we conducted experiments to evaluate the performance of the heuristic procedure. The main purpose of the experiments was to compare its performance with the two naïve heuristics and the optimal algorithm. In addition, the effect of the length of the planning period was also examined. Since the slack percentage turned out to be very meaningful in the first session of experiments, we now determine T(1) indirectly, by defining the slack percentage as a direct independent parameter. Three parameters were examined. First, the length of the planning period was set to two levels (in decision epochs). Next, the order lead time, d, was set to 15, 30 and 45. Last, the slack percentage, identified in the previous set of experiments as the most influential parameter, was set to five values: 20, 33, 40, 53 and 60 percent. All other parameters were left the same as in the first session of experiments. The experimental results are presented in Table 6.
The first five columns contain the experiment number, the length (in time periods) of the working day N, the order lead time d, the picking time of one order T(1), and the slack percentage. These data define the warehouse configuration. The next four columns contain the average daily cost, evaluated by 10,000 runs of the simulation model (10,000 working days), for each of the four order-picking policies (the optimal MDP policy, the MDP-based heuristic and the two naive heuristics). The tenth column presents the relative improvement (in terms of average daily cost) of the MDP-based heuristic over the better naive heuristic. The next column indicates whether this difference is statistically significant (at a significance level of 95%). Finally, the percentile distance between the optimal policy and the MDP heuristic policy is shown.

Table 6. Experimental results in the second session.

  #   N   d   T(1)   Slack (%)   Opt.   MDP-H   Green   Slack   MDP-H vs. best naive H (%)   Significance   MDP-H vs. optimal (%)

MDP heuristic versus the two naive heuristics

Results show that the MDP-based heuristic (MDP-H) outperforms the best naïve heuristic in all of the experiments. The slack percentage has been identified as the only parameter that significantly affects the difference between MDP-H and the best naïve heuristic; this difference increases with the slack (see Figure 8). Note that the large difference in high slack systems stems from the fact that the total cost in these systems is very low and mostly associated with the transient stage. This issue is addressed next.
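The significance column described above can be reproduced in spirit with a paired comparison of simulated daily costs at the 95% level. The sketch below uses a plain normal approximation and invented cost streams, not the paper's simulator:

```python
import math

def significantly_different(costs_a, costs_b, z_crit=1.96):
    """Two-sided paired z-test on per-day cost differences (~95% level)."""
    diffs = [a - b for a, b in zip(costs_a, costs_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((x - mean) ** 2 for x in diffs) / (n - 1)
    se = math.sqrt(var / n)                 # standard error of the mean
    return abs(mean) > z_crit * se

# Invented example: policy A is cheaper by about 1 cost unit per day,
# evaluated over 10,000 simulated working days.
a = [100.0 + (i % 5) for i in range(10_000)]
b = [101.0 + ((i + 2) % 5) for i in range(10_000)]
print(significantly_different(a, b))
```

With 10,000 paired days the standard error shrinks enough that even small average differences register as significant, which is consistent with testing at the 95% level over long simulation runs.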

Figure 8. Average improvement of MDP-H over the best naïve heuristic.

Sensitivity to different working day lengths

Two types of costs were considered in the analysis: the tardiness cost and the overtime cost. The former occurred mostly during the steady state, while the latter was observed in the transient state. When analyzing different lengths of the working day, the relative effect of the transient state clearly decreases with the length of the working day. Since the main advantage of MDP-H over the naïve heuristics is related to the transient state, we expected the difference between the two heuristics to increase as the working day becomes shorter. When we compared the experiments with the shorter working day to those with the longer one, no significant effect of the length of the working day was identified in high slack systems (slacks of 53% and 60%). The reason is that in such systems the tardiness cost, associated with the steady state, was close to zero, and hence the overtime cost became the most significant cost element; as a result, the effect of the working day length became negligible. Small and medium slack systems (up to 40%) demonstrate the increased effectiveness of the shorter planning horizon, as shown in Figure 9.

Distance from the optimal policy

The optimal policy was generated and compared to MDP-H. Not surprisingly, we have seen that in this case, too, the slack percentage significantly affects the distance from optimality. In particular, the distance from optimality increases with the slack. Figure 10 demonstrates this phenomenon: the difference between MDP-H and the optimal solution is relatively small (up to 0.62%) for a slack percentage of up to 0.4. For higher slacks, much larger differences are observed. However, as can be seen in Figure 4, high slack systems are characterized by very low costs; hence, the absolute value of the difference is insignificant.


More information

Revenue Maximization in a Cloud Federation

Revenue Maximization in a Cloud Federation Revenue Maximization in a Cloud Federation Makhlouf Hadji and Djamal Zeghlache September 14th, 2015 IRT SystemX/ Telecom SudParis Makhlouf Hadji Outline of the presentation 01 Introduction 02 03 04 05

More information

Chapter 2 Class-Based Storage with a Finite Number of Items in AS/RS

Chapter 2 Class-Based Storage with a Finite Number of Items in AS/RS Chapter 2 Class-Based Storage with a Finite Number of Items in AS/RS Abstract Class-based storage is widely studied in the literature and applied in practice. It divides all stored items into a number

More information

Statistical Quality Control - Stat 3081

Statistical Quality Control - Stat 3081 Statistical Quality Control - Stat 3081 Awol S. Department of Statistics College of Computing & Informatics Haramaya University Dire Dawa, Ethiopia March 2015 Introduction Lot Disposition One aspect of

More information

Real-Time Systems. Event-Driven Scheduling

Real-Time Systems. Event-Driven Scheduling Real-Time Systems Event-Driven Scheduling Hermann Härtig WS 2018/19 Outline mostly following Jane Liu, Real-Time Systems Principles Scheduling EDF and LST as dynamic scheduling methods Fixed Priority schedulers

More information

Chapter 2 SIMULATION BASICS. 1. Chapter Overview. 2. Introduction

Chapter 2 SIMULATION BASICS. 1. Chapter Overview. 2. Introduction Chapter 2 SIMULATION BASICS 1. Chapter Overview This chapter has been written to introduce the topic of discreteevent simulation. To comprehend the material presented in this chapter, some background in

More information

Queueing systems. Renato Lo Cigno. Simulation and Performance Evaluation Queueing systems - Renato Lo Cigno 1

Queueing systems. Renato Lo Cigno. Simulation and Performance Evaluation Queueing systems - Renato Lo Cigno 1 Queueing systems Renato Lo Cigno Simulation and Performance Evaluation 2014-15 Queueing systems - Renato Lo Cigno 1 Queues A Birth-Death process is well modeled by a queue Indeed queues can be used to

More information

On the static assignment to parallel servers

On the static assignment to parallel servers On the static assignment to parallel servers Ger Koole Vrije Universiteit Faculty of Mathematics and Computer Science De Boelelaan 1081a, 1081 HV Amsterdam The Netherlands Email: koole@cs.vu.nl, Url: www.cs.vu.nl/

More information

Online Interval Coloring and Variants

Online Interval Coloring and Variants Online Interval Coloring and Variants Leah Epstein 1, and Meital Levy 1 Department of Mathematics, University of Haifa, 31905 Haifa, Israel. Email: lea@math.haifa.ac.il School of Computer Science, Tel-Aviv

More information

Lecture 13: Dynamic Programming Part 2 10:00 AM, Feb 23, 2018

Lecture 13: Dynamic Programming Part 2 10:00 AM, Feb 23, 2018 CS18 Integrated Introduction to Computer Science Fisler, Nelson Lecture 13: Dynamic Programming Part 2 10:00 AM, Feb 23, 2018 Contents 1 Holidays 1 1.1 Halloween..........................................

More information

Figure 10.1: Recording when the event E occurs

Figure 10.1: Recording when the event E occurs 10 Poisson Processes Let T R be an interval. A family of random variables {X(t) ; t T} is called a continuous time stochastic process. We often consider T = [0, 1] and T = [0, ). As X(t) is a random variable

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MH4702/MAS446/MTH437 Probabilistic Methods in OR

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MH4702/MAS446/MTH437 Probabilistic Methods in OR NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 2013-201 MH702/MAS6/MTH37 Probabilistic Methods in OR December 2013 TIME ALLOWED: 2 HOURS INSTRUCTIONS TO CANDIDATES 1. This examination paper contains

More information

250 (headphones list price) (speaker set s list price) 14 5 apply ( = 14 5-off-60 store coupons) 60 (shopping cart coupon) = 720.

250 (headphones list price) (speaker set s list price) 14 5 apply ( = 14 5-off-60 store coupons) 60 (shopping cart coupon) = 720. The Alibaba Global Mathematics Competition (Hangzhou 08) consists of 3 problems. Each consists of 3 questions: a, b, and c. This document includes answers for your reference. It is important to note that

More information

Coordinated Replenishments at a Single Stocking Point

Coordinated Replenishments at a Single Stocking Point Chapter 11 Coordinated Replenishments at a Single Stocking Point 11.1 Advantages and Disadvantages of Coordination Advantages of Coordination 1. Savings on unit purchase costs.. Savings on unit transportation

More information

Optimal Control of a Production-Inventory System with both Backorders and Lost Sales

Optimal Control of a Production-Inventory System with both Backorders and Lost Sales Optimal Control of a Production-Inventory System with both Backorders and Lost Sales Saif Benjaafar Mohsen ElHafsi Tingliang Huang 3 Industrial & Systems Engineering, Department of Mechanical Engineering,

More information

Another max flow application: baseball

Another max flow application: baseball CS124 Lecture 16 Spring 2018 Another max flow application: baseball Suppose there are n baseball teams, and team 1 is our favorite. It is the middle of baseball season, and some games have been played

More information

Latent voter model on random regular graphs

Latent voter model on random regular graphs Latent voter model on random regular graphs Shirshendu Chatterjee Cornell University (visiting Duke U.) Work in progress with Rick Durrett April 25, 2011 Outline Definition of voter model and duality with

More information

Dynamic Call Center Routing Policies Using Call Waiting and Agent Idle Times Online Supplement

Dynamic Call Center Routing Policies Using Call Waiting and Agent Idle Times Online Supplement Dynamic Call Center Routing Policies Using Call Waiting and Agent Idle Times Online Supplement Wyean Chan DIRO, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal (Québec), H3C 3J7, CANADA,

More information

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS 63 2.1 Introduction In this chapter we describe the analytical tools used in this thesis. They are Markov Decision Processes(MDP), Markov Renewal process

More information

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.262 Discrete Stochastic Processes Midterm Quiz April 6, 2010 There are 5 questions, each with several parts.

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 0 Output Analysis for a Single Model Purpose Objective: Estimate system performance via simulation If θ is the system performance, the precision of the

More information

Control Theory : Course Summary

Control Theory : Course Summary Control Theory : Course Summary Author: Joshua Volkmann Abstract There are a wide range of problems which involve making decisions over time in the face of uncertainty. Control theory draws from the fields

More information

Appendix: Simple Methods for Shift Scheduling in Multi-Skill Call Centers

Appendix: Simple Methods for Shift Scheduling in Multi-Skill Call Centers Appendix: Simple Methods for Shift Scheduling in Multi-Skill Call Centers Sandjai Bhulai, Ger Koole & Auke Pot Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands Supplementary Material

More information

Contextually found Mathematics: Organizing Goods within a Distribution Center

Contextually found Mathematics: Organizing Goods within a Distribution Center fall '00 Pre-calculus (note: could be adapted for Algebra or Advanced Algebra) Background Name: Date: Contextually found Mathematics: Organizing Goods within a Distribution Center Slot Numbering, p. 1

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

Lecture notes for Analysis of Algorithms : Markov decision processes

Lecture notes for Analysis of Algorithms : Markov decision processes Lecture notes for Analysis of Algorithms : Markov decision processes Lecturer: Thomas Dueholm Hansen June 6, 013 Abstract We give an introduction to infinite-horizon Markov decision processes (MDPs) with

More information

Unit 1A: Computational Complexity

Unit 1A: Computational Complexity Unit 1A: Computational Complexity Course contents: Computational complexity NP-completeness Algorithmic Paradigms Readings Chapters 3, 4, and 5 Unit 1A 1 O: Upper Bounding Function Def: f(n)= O(g(n)) if

More information

Summarizing Measured Data

Summarizing Measured Data Summarizing Measured Data 12-1 Overview Basic Probability and Statistics Concepts: CDF, PDF, PMF, Mean, Variance, CoV, Normal Distribution Summarizing Data by a Single Number: Mean, Median, and Mode, Arithmetic,

More information

Proxel-Based Simulation of Stochastic Petri Nets Containing Immediate Transitions

Proxel-Based Simulation of Stochastic Petri Nets Containing Immediate Transitions Electronic Notes in Theoretical Computer Science Vol. 85 No. 4 (2003) URL: http://www.elsevier.nl/locate/entsc/volume85.html Proxel-Based Simulation of Stochastic Petri Nets Containing Immediate Transitions

More information

Traffic Modelling for Moving-Block Train Control System

Traffic Modelling for Moving-Block Train Control System Commun. Theor. Phys. (Beijing, China) 47 (2007) pp. 601 606 c International Academic Publishers Vol. 47, No. 4, April 15, 2007 Traffic Modelling for Moving-Block Train Control System TANG Tao and LI Ke-Ping

More information

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman Olin Business School, Washington University, St. Louis, MO 63130, USA

More information

MDP Preliminaries. Nan Jiang. February 10, 2019

MDP Preliminaries. Nan Jiang. February 10, 2019 MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning Model-Based Reinforcement Learning Model-based, PAC-MDP, sample complexity, exploration/exploitation, RMAX, E3, Bayes-optimal, Bayesian RL, model learning Vien Ngo MLR, University

More information

Stochastic Optimization

Stochastic Optimization Chapter 27 Page 1 Stochastic Optimization Operations research has been particularly successful in two areas of decision analysis: (i) optimization of problems involving many variables when the outcome

More information

A Polynomial-Time Algorithm to Find Shortest Paths with Recourse

A Polynomial-Time Algorithm to Find Shortest Paths with Recourse A Polynomial-Time Algorithm to Find Shortest Paths with Recourse J. Scott Provan Department of Operations Research University of North Carolina Chapel Hill, NC 7599-380 December, 00 Abstract The Shortest

More information

SYMBIOSIS CENTRE FOR DISTANCE LEARNING (SCDL) Subject: production and operations management

SYMBIOSIS CENTRE FOR DISTANCE LEARNING (SCDL) Subject: production and operations management Sample Questions: Section I: Subjective Questions 1. What are the inputs required to plan a master production schedule? 2. What are the different operations schedule types based on time and applications?

More information

The prediction of passenger flow under transport disturbance using accumulated passenger data

The prediction of passenger flow under transport disturbance using accumulated passenger data Computers in Railways XIV 623 The prediction of passenger flow under transport disturbance using accumulated passenger data T. Kunimatsu & C. Hirai Signalling and Transport Information Technology Division,

More information

Chapter 3: The Reinforcement Learning Problem

Chapter 3: The Reinforcement Learning Problem Chapter 3: The Reinforcement Learning Problem Objectives of this chapter: describe the RL problem we will be studying for the remainder of the course present idealized form of the RL problem for which

More information

Logistical and Transportation Planning. QUIZ 1 Solutions

Logistical and Transportation Planning. QUIZ 1 Solutions QUIZ 1 Solutions Problem 1. Patrolling Police Car. A patrolling police car is assigned to the rectangular sector shown in the figure. The sector is bounded on all four sides by a roadway that requires

More information

A New Dynamic Programming Decomposition Method for the Network Revenue Management Problem with Customer Choice Behavior

A New Dynamic Programming Decomposition Method for the Network Revenue Management Problem with Customer Choice Behavior A New Dynamic Programming Decomposition Method for the Network Revenue Management Problem with Customer Choice Behavior Sumit Kunnumkal Indian School of Business, Gachibowli, Hyderabad, 500032, India sumit

More information

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018 Section Notes 9 Midterm 2 Review Applied Math / Engineering Sciences 121 Week of December 3, 2018 The following list of topics is an overview of the material that was covered in the lectures and sections

More information

Data Structures in Java

Data Structures in Java Data Structures in Java Lecture 21: Introduction to NP-Completeness 12/9/2015 Daniel Bauer Algorithms and Problem Solving Purpose of algorithms: find solutions to problems. Data Structures provide ways

More information

Operations and Supply Chain Management Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Operations and Supply Chain Management Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Operations and Supply Chain Management Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Lecture - 27 Flow Shop Scheduling - Heuristics - Palmer, Campbell Dudek

More information

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a

More information

Simple Techniques for Improving SGD. CS6787 Lecture 2 Fall 2017

Simple Techniques for Improving SGD. CS6787 Lecture 2 Fall 2017 Simple Techniques for Improving SGD CS6787 Lecture 2 Fall 2017 Step Sizes and Convergence Where we left off Stochastic gradient descent x t+1 = x t rf(x t ; yĩt ) Much faster per iteration than gradient

More information