This lecture is expanded from:
HIGH VOLUME JOB SHOP SCHEDULING AND MULTICLASS QUEUING NETWORKS WITH INFINITE VIRTUAL BUFFERS
INFORMS, Miami, Nov 2, 2001
Gideon Weiss, Haifa University (visiting MS&E, Stanford)
Joint work with: Jim Dai, Michal Penn, and students: Tammy Boudoukh, Yoni Nazarathy, Anat Kopzon.
Job Shop Scheduling
Machines: $i = 1, \ldots, I$
Routes: $r = 1, \ldots, R$
Steps: $(r, o)$, $o = 1, \ldots, K$, on machines $\sigma_{r,1}, \ldots, \sigma_{r,K}$
Processing times: $X_{r,1}, \ldots, X_{r,K}$
Classical Job Shop Problem: ONE job on each route. Schedule to minimize MAKESPAN: $T^{OPT}$ = earliest time to complete all jobs.
NP-hard: the 10x10 problem took 10 years to solve.
Machine lower bound: $T_i^* = \sum_{(r,o):\, \sigma_{r,o} = i} X_{r,o}$, $T^* = C^* = \max_i T_i^*$
Job lower bound: $X_r = \sum_{o=1}^{K} X_{r,o}$, $X^* = \max_r X_r$
MS&E324, Stanford University, Spring 2002 8-2 Gideon Weiss manufacturing & control
The famous 10x10 job-shop problem (MT10)
First appeared in the 1963 book of Muth and Thompson. Solution found by Carlier and Pinson, 1988. The optimal makespan is 930; the machine lower bound is 631.
Each entry is "machine time" for that step of the job. The "mach bound" columns give the total processing time required on each machine.

Step#   job 1   job 2   job 3   job 4   job 5
1        1 29    1 43    2 91    2 81    3 14
2        2 78    3 90    1 85    3 95    1  6
3        3  9    5 75    4 39    1 71    2 22
4        4 36   10 11    3 74    5 99    6 61
5        5 49    4 69    9 90    7  9    4 26
6        6 11    2 28    6 10    9 52    5 69
7        7 62    7 46    8 12    8 85    9 21
8        8 56    6 46    7 89    4 98    8 49
9        9 44    8 72   10 45   10 22   10 72
10      10 21    9 30    5 33    6 43    7 53
sum       395     510     568     655     393

Step#   job 6   job 7   job 8   job 9   job 10   mach  bound
1        3 84    2 46    3 31    1 76    2 85       1    493
2        2  2    1 37    1 86    2 69    1 13       2    548
3        6 52    4 61    2 46    4 76    3 81       3    576
4        4 95    3 13    6 74    6 51    7  7       4    631
5        9 48    7 32    5 32    3 85    9 64       5    534
6       10 72    6 21    7 88   10 11   10 76       6    416
7        1 47   10 32    9 19    7 40    6 47       7    491
8        7 65    9 89   10 48    8 89    4 52       8    499
9        5  6    8 30    8 36    5 26    5 90       9    531
10       8 25    5 55    4 79    9 74    8 45      10    410
sum       496     416     539     597     560

Machine lower bound is 631 (Machine 4). Job lower bound is 655 (Job 4).
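The two lower bounds quoted for the 10x10 problem can be recomputed directly from the table. The following is a minimal Python sketch: the data are typed in from the slide, and the names `MT10`, `machine_loads`, and `job_lengths` are illustrative helpers, not part of the lecture.

```python
# MT10 data as listed on the slide: for each job, the (machine, time) pairs of steps 1..10.
MT10 = [
    [(1, 29), (2, 78), (3, 9), (4, 36), (5, 49), (6, 11), (7, 62), (8, 56), (9, 44), (10, 21)],
    [(1, 43), (3, 90), (5, 75), (10, 11), (4, 69), (2, 28), (7, 46), (6, 46), (8, 72), (9, 30)],
    [(2, 91), (1, 85), (4, 39), (3, 74), (9, 90), (6, 10), (8, 12), (7, 89), (10, 45), (5, 33)],
    [(2, 81), (3, 95), (1, 71), (5, 99), (7, 9), (9, 52), (8, 85), (4, 98), (10, 22), (6, 43)],
    [(3, 14), (1, 6), (2, 22), (6, 61), (4, 26), (5, 69), (9, 21), (8, 49), (10, 72), (7, 53)],
    [(3, 84), (2, 2), (6, 52), (4, 95), (9, 48), (10, 72), (1, 47), (7, 65), (5, 6), (8, 25)],
    [(2, 46), (1, 37), (4, 61), (3, 13), (7, 32), (6, 21), (10, 32), (9, 89), (8, 30), (5, 55)],
    [(3, 31), (1, 86), (2, 46), (6, 74), (5, 32), (7, 88), (9, 19), (10, 48), (8, 36), (4, 79)],
    [(1, 76), (2, 69), (4, 76), (6, 51), (3, 85), (10, 11), (7, 40), (8, 89), (5, 26), (9, 74)],
    [(2, 85), (1, 13), (3, 81), (7, 7), (9, 64), (10, 76), (6, 47), (4, 52), (5, 90), (8, 45)],
]


def machine_loads(jobs):
    """Total processing time required on each machine (T_i^* with one job per route)."""
    loads = {}
    for route in jobs:
        for machine, time in route:
            loads[machine] = loads.get(machine, 0) + time
    return loads


def job_lengths(jobs):
    """Total processing time of each job (X_r)."""
    return [sum(time for _, time in route) for route in jobs]


loads = machine_loads(MT10)
lengths = job_lengths(MT10)
bottleneck = max(loads, key=loads.get)   # machine with the largest load
machine_bound = loads[bottleneck]        # T^* = max_i T_i^*
job_bound = max(lengths)                 # X^* = max_r X_r
print("machine bound:", machine_bound, "on machine", bottleneck)
print("job bound:", job_bound)
```

Either bound is a valid lower bound on the makespan; for this instance the job bound is the larger of the two, and both are well below the optimal value of 930.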
More About Job-Shops:
The problem is NP-hard. In fact, Williamson et al. showed that unless P=NP there is no polynomial time algorithm that can construct a schedule within a factor 5/4 of the optimum.
Heuristics can solve job shops approximately. A popular one is the shifting bottleneck heuristic, which can deal with the 10x10 problem in a few seconds (it finds the optimum in 20 seconds, but without proof of optimality). These heuristics fail to function for more than a few dozen jobs on 20 machines.
Sevastyanov (see also Baranyi) has a polynomial heuristic which guarantees a schedule no longer than
$T^* + (K-1)(IK^2 + 2K - 1) \max_{r,k,j} X_{r,k}(j)$
For the 10x10 problem this gives a gap of ~800,000.
Recent heuristics exploit a division of operations into long/short/tiny.
High Volume Job Shops - Identical
N IDENTICAL JOBS on each route, processing times $X_{r,1}, \ldots, X_{r,K}$
Machine lower bound: $T_i^* = N \sum_{(r,o):\, \sigma_{r,o} = i} X_{r,o}$, $T^* = N C^* = \max_i T_i^*$
Job lower bound: for job $j$ on route $r$: $X_r = \sum_{o=1}^{K} X_{r,o}$, $X^* = \max_r X_r$
Pipelining: (Tammy Boudoukh, Bertsimas & Gamarnik)
Cycle 1: Do operation 1 on job 1 of each route.
Cycle 2: Do operation 2 on job 1 and operation 1 on job 2 of each route.
Cycle 3: Do operation 3 on job 1, operation 2 on job 2, and operation 1 on job 3 of each route.
...
Cycle K-1: Do operation K-1 on job 1, operation K-2 on job 2, ..., and operation 1 on job K-1 of each route.
Pipeline is full: Do N-K+1 full cycles.
Runout: K-1 partial cycles to complete all jobs.
$N C^* = T^* \le T^{OPT} \le T^H \le N C^* + (K-1) C^*$
Gap: $T^H - T^* = O(1)$
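The cycle structure above can be enumerated explicitly. The sketch below is a hedged Python illustration for a single route, assuming N >= K; the function names are hypothetical. It lists which (job, operation) pairs fall in each cycle and evaluates the bound $(N + K - 1) C^*$, which uses the fact that a cycle contains at most one operation per step and therefore needs at most $C^*$ time.

```python
def pipeline_cycles(n_jobs, n_steps):
    """List the cycles of the pipelining schedule for one route.

    Cycle c (1-based) contains operation o of job c - o + 1, for every o
    such that the job index lies in 1..n_jobs.
    """
    cycles = []
    for c in range(1, n_jobs + n_steps):  # N + K - 1 cycles in total
        cycle = [(c - o + 1, o) for o in range(1, n_steps + 1)
                 if 1 <= c - o + 1 <= n_jobs]
        cycles.append(cycle)
    return cycles


def makespan_upper_bound(n_jobs, n_steps, cycle_time):
    """Upper bound N*C + (K-1)*C, where each cycle takes at most cycle_time = C^*."""
    return (n_jobs + n_steps - 1) * cycle_time


cycles = pipeline_cycles(5, 3)
full = [c for c in cycles if len(c) == 3]  # cycles in which the pipeline is full
for index, cycle in enumerate(cycles, start=1):
    print("cycle", index, cycle)
```

With N = 5 and K = 3 this gives two build-up cycles, three full cycles, and two runout cycles, matching the counts K-1, N-K+1, and K-1 on the slide.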
High Volume Job Shops - Similar
N SIMILAR JOBS on each route, processing times $X_{r,1}(j), \ldots, X_{r,K}(j)$, $j = 1, \ldots, N$.
Average: $m_{r,o} = E(X_{r,o}(j))$ or $m_{r,o} = \frac{1}{N} \sum_{j=1}^{N} X_{r,o}(j)$
$T_i^* = \sum_{(r,o):\, \sigma_{r,o} = i} \sum_{j=1}^{N} X_{r,o}(j)$, $T^* = \max_i T_i^*$
Bottleneck machine: $i^* = \arg\max_i \sum_{(r,o):\, \sigma_{r,o} = i} m_{r,o}$
Cyclic Scheduling with Safety Stocks
To keep the bottleneck machine busy throughout full cycles: build up SAFETY STOCKS, S, in each buffer. Then perform cycles, initiated by the bottleneck machine. Bottleneck idles if maximal queue length exceeds S.
Theorem (Dai & W): Assume the $X_{r,o}(j)$ are i.i.d. with means $m_{r,o}$, and
(1) $i^*$ is a unique, single bottleneck;
(2) the $X_{r,o}(j)$ possess exponential moments.
Let $T^*$, $T^H$ be the (random) lower bound and heuristic makespans. There exist $C_1$, $C_2$ such that for all N, with safety stocks $S = C_1 \log N$,
$P(T^H - T^* > C_2 \log N) < 1/N$
Multi-Class Queueing/Fluid Networks:
Nodes (machines): $i = 1, \ldots, I$
Classes, queues, buffers, steps: $k = 1, \ldots, K$
Constituencies: $i = \sigma(k)$, $k \in C_i$
Jobs: $j = 1, \ldots, N$
Arrivals: $A(t)$, renewal process, rate $\alpha$
Processing: $X_k(j)$, i.i.d., mean $m_k$
Routing: $k \to l$, Bernoulli, $P_{k,l}$
$Q(t) = Q(0) + A(t) - (I - P') S(T(t)) \ge 0$

FLUID LIMIT UNDER FLUID SCALING OF MCQN
Rescale time and space equally by the initial state:
$q(t) = \lim_{n \to \infty} \frac{1}{n} Q(nt)$, $n = |Q(0)|$

FLUID RELAXATION of JOB-SHOP or MCQN
Integrality of jobs: $q_k(t)$ need not be integer.
Integrality of machines: flow rate $0 \le u_k(t) \le 1/m_k$.
Integrality of work: part of a job completed at buffer k can already be processed at buffer k+1.
$q(t) = q(0) + \alpha t - (I - P') \int_0^t u(s)\, ds \ge 0$
$\sum_{k \in C_i} m_k u_k(t) \le 1$, $u(t) \ge 0$, $t \ge 0$
Minimum Fluid Job Shop Makespan
The cyclic policy with safety stocks S serves $N - S(K-1)$ full cycles, in which one job is processed in each buffer, i.e. each $(r, o)$.
This corresponds to $u_{r,o}(t) \approx N / T^*$ for all buffers.
[Figure: fluid versus time for Route 1, Route 2, ..., Route R; each route starts at N and is empty at time $T^*$.]
MCFN Minimum Makespan
$Q_k(0) = q_k(0)$, $k = 1, \ldots, K$
$T_i^* = \{ C M (I - P')^{-1} (q(0) + \alpha T^*) \}_i$, $T^* = \max_i T_i^*$
$Q^+(0) = (I - P')^{-1} (Q(0) + \alpha T^*)$
$u_k(t) = Q_k^+(0) / T^*$
$q_k^+(t) = Q_k^+(0) (1 - t/T^*)$, $q_k(t) = Q_k(0) (1 - t/T^*)$
[Figure: fluid in buffers K, K-1, ..., 1 versus time; all buffers are empty at time $T^*$.]
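For a re-entrant line with no exogenous arrivals the quantities on this slide reduce to simple sums. The Python sketch below works under those assumptions ($\alpha = 0$, a single route through buffers 1, ..., K, integer inputs); the function name and the example numbers are hypothetical. In that case $Q_k^+(0)$ is the fluid initially in buffers 1 through k, and the work for machine i is $\sum_{k \in C_i} m_k Q_k^+(0)$.

```python
from fractions import Fraction


def reentrant_fluid_makespan(q0, m, sigma):
    """Fluid makespan for a re-entrant line without arrivals (alpha = 0).

    q0[k]    initial fluid in buffer k (buffers listed in route order)
    m[k]     mean processing time at buffer k
    sigma[k] machine that serves buffer k
    """
    # Q_k^+(0): everything that must still flow through buffer k.
    q_plus, total = [], 0
    for level in q0:
        total += level
        q_plus.append(total)
    # Work of machine i: sum over its buffers of m_k * Q_k^+(0).
    work = {}
    for k, machine in enumerate(sigma):
        work[machine] = work.get(machine, 0) + m[k] * q_plus[k]
    t_star = max(work.values())                         # T^* = max_i T_i
    rates = [Fraction(q, t_star) for q in q_plus]       # u_k = Q_k^+(0) / T^*
    return q_plus, work, t_star, rates


# Hypothetical example: three buffers, machine 1 serves buffers 1 and 3.
q0, m, sigma = [6, 3, 1], [2, 3, 1], [1, 2, 1]
q_plus, work, t_star, rates = reentrant_fluid_makespan(q0, m, sigma)
print("Q+(0) =", q_plus, " T* =", t_star, " u =", rates)
```

With these constant rates the level of buffer k changes at rate $u_{k-1} - u_k = -Q_k(0)/T^*$, so every buffer empties exactly at $T^*$, in line with $q_k(t) = Q_k(0)(1 - t/T^*)$.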
Fluid Imitation Algorithms
Base decisions on the gap between fluid and actual queues.
Compare the nominal $q_k^+(t)$ to the actual queue $Q_k^+(t)$. Give priority to k for which $Q_k^+(t) - q_k^+(t)$ is large.
Equivalently: let $q_k^-(t)$ and $Q_k^-(t)$ be the cumulative amounts of fluid and of actual jobs that flowed out of buffer k during $[0, t]$. Give priority to k for which $q_k^-(t) - Q_k^-(t)$ is large. $q_k^-(t)$, $Q_k^-(t)$ are easy to keep track of.
Priority measure: $\rho_k(t) = \dfrac{q_k^-(t) - Q_k^-(t)}{q_k^-(t)}$
Cyclic Fluid Algorithm (CFA): Machine i works on buffer $k^* = \arg\max \{ \rho_k(t) : k \in C_i \}$
Greedy Fluid Algorithm (GFA): Machine i works on buffer $k^* = \arg\max \{ \rho_k(t) : k \in C_i,\ Q_k(t) > 0 \}$
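The two selection rules differ only in whether empty buffers are eligible. Here is a minimal Python sketch of the decision for one machine, assuming the cumulative outflows are already tracked and that $q_k^-(t) > 0$ (that is, $t > 0$). The function names and the dictionary layout are illustrative; what CFA does when its chosen buffer happens to be empty is not specified on the slide and is left open here.

```python
def priority(q_minus, big_q_minus):
    """Priority measure rho_k(t) = (q_k^-(t) - Q_k^-(t)) / q_k^-(t)."""
    return (q_minus - big_q_minus) / q_minus


def choose_buffer(buffers, q_minus, big_q_minus, queue, greedy):
    """Pick the buffer of one machine with the largest priority.

    buffers      buffers in the machine's constituency C_i
    q_minus      fluid cumulative outflow per buffer (must be positive)
    big_q_minus  actual cumulative outflow per buffer
    queue        current actual queue length per buffer
    greedy       True for GFA (non-empty buffers only), False for CFA
    """
    candidates = [k for k in buffers if not greedy or queue[k] > 0]
    if not candidates:
        return None  # GFA with all buffers empty: nothing to work on
    return max(candidates, key=lambda k: priority(q_minus[k], big_q_minus[k]))


# Hypothetical snapshot for a machine serving buffers 0, 1, 2.
q_minus = {0: 10.0, 1: 10.0, 2: 10.0}
big_q_minus = {0: 9, 1: 6, 2: 8}
queue = {0: 5, 1: 0, 2: 2}
cfa_choice = choose_buffer([0, 1, 2], q_minus, big_q_minus, queue, greedy=False)
gfa_choice = choose_buffer([0, 1, 2], q_minus, big_q_minus, queue, greedy=True)
print("CFA picks", cfa_choice, "GFA picks", gfa_choice)
```

In this snapshot buffer 1 lags the fluid the most but is empty, so CFA selects it while GFA moves on to the non-empty buffer with the next largest lag.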
Comparing the policies:
It can be shown that, with safety stocks S, GFA is at least as good as CFA, which is at least as good as the Dai & W cyclic algorithm.
GFA can function with no safety stocks.
For identical jobs:
For similar jobs, GFA with no safety stocks does not show a logarithmic gap!
Job Shop Simulation Study (Yoni Nazarathy)
http://rstat.haifa.ac.il/~yonin/thesis/jobshopsim/shopsim.html
Empirical Conclusions:
(1) The gap does not grow like log(N), but remains O(1).
(2) The distribution of the processing times has little effect.
(3) A gap of order O(1) can occur with multiple bottlenecks.
Experiments with MT10x10 and other examples. N ranging from 1 to 22 17. Enough replicates for clear evidence.
Distributions: CV=0, CV=0.25, CV=1.0, Weibull 1/2, Pareto 3, Pareto 2.
MT10 Re-entrant: N jobs on a single, 100 step route.
MT10 Balanced: All machines require 631 time per cycle.
MT10 Round-Bal: Each route starts on a different machine.
Differing $N_r$, $r = 1, \ldots, R$.
Policies of Random buffer, Random job, LBFS, FBFS performed much worse, with gaps up to O(N).
MT10:
MT10 (cont1):
MT10 (cont2):
MT10 Round-Balanced:
O(1) Gap for Re-entrant Lines:
Surprising: an O(1) gap says:
The bottleneck machine never idles between some random time $T_{starve}$, independent of N, and the completion of its work.
When the bottleneck machine completes its work, the other machines have runout time $T_{runout}$, independent of N.
In a re-entrant line, all N jobs follow the same route. In that case,
$q_k^+(t) = N(1 - t/T^*)$, $q_k^-(t) = N t / T^*$
If $Q_k(t) > 0$ then $Q_k^-(t) < Q_{k-1}^-(t)$, so k has priority over k-1.
Hence, GFA is actually simply LBFS.
We know that LBFS is stable for input rate $\rho < 1$.
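The step from the priority measure to LBFS can be spelled out. The short derivation below is a sketch under one added assumption, namely that all N jobs start in buffer 1, so that the queue in buffer $k \ge 2$ equals what has left buffer $k-1$ minus what has left buffer $k$.

```latex
\begin{align*}
Q_k(t) &= Q_{k-1}^-(t) - Q_k^-(t), \qquad k \ge 2, \\
\rho_k(t) - \rho_l(t)
  &= \frac{Q_l^-(t) - Q_k^-(t)}{N t / T^*}
   = \frac{T^*}{N t} \sum_{j=l+1}^{k} Q_j(t)
   \;\ge\; \frac{T^*}{N t}\, Q_k(t), \qquad l < k .
\end{align*}
```

The second line uses that $q_k^-(t) = N t / T^*$ is the same for every buffer. So whenever buffer $k$ is non-empty it has strictly higher priority than every earlier buffer $l < k$, and among the non-empty buffers of a machine GFA always picks the last one on the route, which is the LBFS rule.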
GFA = LBFS Fluid Solution
Look at the fluid solution under the cyclic policy, and compare it to that under LBFS:
[Figure: two panels, "Cyclic Fluid Solution" and "GFA = LBFS Solution"; the second panel marks buffers $a^{(L)}, \ldots, a^{(1)}$.]
Buffer $a^{(1)}$ is the first buffer of the bottleneck machine; the slope is $N/T^*$.
The flow of $N/T^*$ keeps buffers $a^{(1)}+1, \ldots, K$ empty.
Fluid out of buffers $1, \ldots, a^{(1)}-1$ is faster, which defines a secondary bottleneck machine.
$a^{(2)}$ is the first buffer of the secondary bottleneck machine.
Continue until $a^{(L)} = 1$.
Re-entrant QN with infinite 1st buffer
Assume the first buffer has an unlimited supply of jobs.
Consider the system under the LBFS policy: buffer 1 is served only if all the other buffers of machine $\sigma(1)$ are empty.
The fluid picture now is:
Queue in buffer 1 is $\infty$.
Buffers $a^{(L-1)}, \ldots, a^{(1)}$: $\rho > 1$.
All other queues are stable.
No starvation after $T_{starve}$.
Stable time for $T_{runout}$.
[Figure: fluid picture with level N and buffers $a^{(L)}, \ldots, a^{(1)}$ marked.]
Multiple Bottlenecks:
We found that high volume job shops with more than one bottleneck may also be scheduled with a gap of O(1).
We also found that high volume job shop problems with random routes, where each route has K processing steps and the processing times of all steps are i.i.d. (i.e. all machines are bottlenecks), can also be scheduled with a gap of O(1).
How is that possible?
We now look at MCQN with Infinite Virtual Buffers.
Queueing Systems v. Manufacturing Systems
The Single Server Queue: As long as $\lambda < \mu$, $\rho = \lambda/\mu < 1$, the system will be stable. The server is idle $1 - \rho$ of the time.
[Diagram: arrivals at rate $\lambda$, server with rate $\mu$, departures at rate $\lambda$.]
A Single Machine Manufacturing System: Infinite supply of work. There is a virtual queue. The machine works all the time.
[Diagram: virtual queue, machine with rate $\mu$, output at rate $\mu$.]
A Two Machine Manufacturing System: The first machine works all the time. The second machine is a single server queue. The system is stable if $\lambda < \mu$, $\rho = \lambda/\mu < 1$, and the second machine idles $1 - \rho$ of the time.
[Diagram: virtual queue, first machine with rate $\lambda$, second machine with rate $\mu$, output at rate $\lambda$.]
Queueing Systems v. Manufacturing Systems
Efficient Operation of the Two Machine Manufacturing System: To operate the system efficiently we add work for the second, faster machine:
[Diagram: top stream from a virtual queue through machine 1 (rate $\lambda_1$) and then machine 2 (rate $\mu_1$); bottom stream from a virtual queue through machine 2 (rate $\lambda_2$).]
The second machine is a single server queue with vacations: stable, non-idling, with production from the bottom stream at rate $\lambda_2 (1 - \lambda_1/\mu_1)$.
A Two Machine Manufacturing System with Two Production Routes: Consider
[Diagram: machines 1 and 2 with two routes, each fed from a virtual queue; the top route is labeled $\lambda_1$, $\mu_1$ with output $\nu_1$, the bottom route $\lambda_2$, $\mu_2$ with output $\nu_2$.]
We can operate this system to be stable and non-idling.
The Push Pull System (Anat Kopzon):
Consider the following Two Machine, Two Routes System:
[Diagram: machines 1 and 2; the top route has rates $\lambda_1$ (machine 1) and $\mu_2$ (machine 2) with output $\nu_1$; the bottom route has rates $\lambda_2$ (machine 2) and $\mu_1$ (machine 1) with output $\nu_2$.]
Top route goes from machine 1 to machine 2. Bottom route goes from machine 2 to machine 1.
Machine i feeds machine 3-i at rate $\lambda_i$. Machine i serves its own queue at rate $\mu_i$.
Flow balance: To keep the machines busy all the time and be stable, let $\alpha_i$ be the proportion of time machine i is feeding; $1 - \alpha_i$ of the time it is serving.
The flow rates satisfy:
$\nu_1 = \alpha_1 \lambda_1 = (1 - \alpha_2) \mu_2$
$\nu_2 = \alpha_2 \lambda_2 = (1 - \alpha_1) \mu_1$
Hence:
$\nu_1 = \dfrac{\lambda_1 \mu_2 (\mu_1 - \lambda_2)}{\mu_1 \mu_2 - \lambda_1 \lambda_2}$
$\nu_2 = \dfrac{\lambda_2 \mu_1 (\mu_2 - \lambda_1)}{\mu_1 \mu_2 - \lambda_1 \lambda_2}$
This can be achieved by a stable policy!
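The two flow-balance equations are linear in $\alpha_1$, $\alpha_2$ and can be solved in closed form. The Python sketch below evaluates that solution with exact rational arithmetic and checks both equations; the function name and the example rates are hypothetical, and the rates are chosen so that $\lambda_1 < \mu_2$ and $\lambda_2 < \mu_1$.

```python
from fractions import Fraction


def push_pull_rates(lam1, lam2, mu1, mu2):
    """Solve nu1 = a1*lam1 = (1-a2)*mu2 and nu2 = a2*lam2 = (1-a1)*mu1."""
    lam1, lam2, mu1, mu2 = map(Fraction, (lam1, lam2, mu1, mu2))
    denom = mu1 * mu2 - lam1 * lam2
    alpha1 = mu2 * (mu1 - lam2) / denom  # fraction of time machine 1 is feeding
    alpha2 = mu1 * (mu2 - lam1) / denom  # fraction of time machine 2 is feeding
    nu1 = alpha1 * lam1                  # throughput of the top route
    nu2 = alpha2 * lam2                  # throughput of the bottom route
    return alpha1, alpha2, nu1, nu2


# Hypothetical rates: lam1 = 1, lam2 = 2, mu1 = 3, mu2 = 4.
alpha1, alpha2, nu1, nu2 = push_pull_rates(1, 2, 3, 4)
print("alpha:", alpha1, alpha2, " nu:", nu1, nu2)
```

For these rates both machines are busy all the time, machine 1 feeding 2/5 of the time and machine 2 feeding 9/10 of the time, and the throughputs agree with the closed-form expressions for $\nu_1$ and $\nu_2$.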
The Push Pull System Operating Rules:
Assume: $\lambda_1 < \mu_2$, $\lambda_2 < \mu_1$
The policy for machine i is:
[Flowchart: produce for 3-i (rate $\lambda_i$), then ask "is queue i empty?". Empty: send part to queue 3-i. Not empty: keep part produced for 3-i and serve queue i (rate $\mu_i$); part leaves system.]
Lemma: After an initial period, exactly one of the queues is non-empty at any time.
Steady State of M/M/ System:
Assume: production and service times are memoryless. The system is Markovian.
Define the following states:
(i,a): both machines feeding, machine 1 has i in queue
(i,A): machine 1 is serving its queue, and has i in queue
(i,b): both machines feeding, machine 2 has i in queue
(i,B): machine 2 is serving its queue, and has i in queue
[Diagram: state transition rates among the states (i,a), (i,A), (i,b), (i,B), i = 1, 2, 3, ..., with rates $\lambda_1$, $\lambda_2$, $\mu_1$, $\mu_2$.]
These are solved by:
$p_{i,a} = \Pi_a P(X = i-1)$, $X \sim \mathrm{Geom}(\lambda_1/(\lambda_1+\lambda_2))$
$p_{i,A} = \Pi_A P(X + Z = i-1)$, $Z \sim \mathrm{Geom}((\mu_1 - \lambda_2)/\lambda_2)$
$p_{i,b} = \Pi_b P(Y = i-1)$, $Y \sim \mathrm{Geom}(\lambda_2/(\lambda_1+\lambda_2))$
$p_{i,B} = \Pi_B P(Y + W = i-1)$, $W \sim \mathrm{Geom}((\mu_2 - \lambda_1)/\lambda_1)$
Here $\Pi_a$, $\Pi_A$, $\Pi_b$, $\Pi_B$ are the time proportions of states a, A, b, B.
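Cycles in the Operations of the System
The sample path of the push pull system will cycle through 4 periods:
[Diagram: one cycle a, A, b, B. Queue 1 is nonempty during a and A; queue 2 is nonempty during b and B. Period a: exp($\lambda_1$); period A: M/M/1 with $\lambda_2$, $\mu_1$; period b: exp($\lambda_2$); period B: M/M/1 with $\lambda_1$, $\mu_2$.]
The time proportions of states a, A, b, B:
$\Pi_a = \dfrac{\lambda_2 (\mu_1 - \lambda_2)(\mu_2 - \lambda_1)}{(\lambda_1 + \lambda_2)(\mu_1 \mu_2 - \lambda_1 \lambda_2)}$
$\Pi_b = \dfrac{\lambda_1 (\mu_1 - \lambda_2)(\mu_2 - \lambda_1)}{(\lambda_1 + \lambda_2)(\mu_1 \mu_2 - \lambda_1 \lambda_2)}$
$\Pi_A = \dfrac{\lambda_2 (\mu_2 - \lambda_1)}{\mu_1 \mu_2 - \lambda_1 \lambda_2}$
$\Pi_B = \dfrac{\lambda_1 (\mu_1 - \lambda_2)}{\mu_1 \mu_2 - \lambda_1 \lambda_2}$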
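A quick sanity check on these four expressions is that they sum to one and reproduce the route throughputs: machine 1 completes bottom-route parts only during period A, at rate $\mu_1$, and machine 2 completes top-route parts only during period B, at rate $\mu_2$. The Python sketch below does this check with exact fractions; the function name and example rates are hypothetical.

```python
from fractions import Fraction


def time_proportions(lam1, lam2, mu1, mu2):
    """Time proportions of the four periods a, A, b, B of the push pull cycle."""
    lam1, lam2, mu1, mu2 = map(Fraction, (lam1, lam2, mu1, mu2))
    denom = mu1 * mu2 - lam1 * lam2
    common = (mu1 - lam2) * (mu2 - lam1) / ((lam1 + lam2) * denom)
    return {
        "a": lam2 * common,
        "b": lam1 * common,
        "A": lam2 * (mu2 - lam1) / denom,
        "B": lam1 * (mu1 - lam2) / denom,
    }


# Hypothetical rates: lam1 = 1, lam2 = 2, mu1 = 3, mu2 = 4.
props = time_proportions(1, 2, 3, 4)
total = sum(props.values())
nu_top = 4 * props["B"]     # machine 2 serves the top route at rate mu2 during B
nu_bottom = 3 * props["A"]  # machine 1 serves the bottom route at rate mu1 during A
print(props, total, nu_top, nu_bottom)
```

The ratio $\Pi_a / \Pi_b = \lambda_2 / \lambda_1$ also matches the mean lengths $1/\lambda_1$ and $1/\lambda_2$ of the two exponential periods.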
Generalization to M/G/
Assume now that the service times are general i.i.d. with rates $\mu_1$, $\mu_2$, while the production times are still memoryless with rates $\lambda_1$, $\lambda_2$.
The cycles of operation at each machine can be regarded as an M/G/1 queue with vacation.
[Diagram: cycle a, A, b, B with a: exp($\lambda_1$), A: M/G/1 with $\lambda_2$, $\mu_1$, b: exp($\lambda_2$), B: M/G/1 with $\lambda_1$, $\mu_2$. Labels: "$\lambda_2$ arrivals", "Server 1 is working", "Server 1 vacation - no arrivals".]
A small trick is needed: replace the b, B period, where the queue of server 1 is zero, by a single exp($\lambda_2$) time, which is the wait for the first customer in the M/G/1 queue after the end of a busy period.
[Diagram: "Vacation" of length exp($\lambda_2$), followed by "Server 1 is working" over a: exp($\lambda_1$) and A: M/G/1 with $\lambda_2$, $\mu_1$.]
We obtain the explicit steady state from that.
Generalization J machines
Assume there are J machines. We assume that at any time at most one machine i is "serving", i.e. performing a second operation on a job, at rate $\mu_i$. All the other machines are feeding machine i, at a total rate $\Lambda_i = \sum_{j \ne i} \lambda_j$.
When the queue of machine i is empty, it sends the job that it produced and kept since the start of the busy period to machine j with probability $p_{ij}$. Machine j then becomes the serving machine.
If $\Lambda_i < \mu_i$ this system is again stable.
[Diagram: state transitions for three machines, with states (i,a), (i,A), (i,b), (i,B), (i,c), (i,C), rates $\lambda_j$, $\Lambda_j$, $\mu_j$, and switching probabilities $p_{12}, p_{13}, p_{21}, p_{23}, p_{31}, p_{32}$.]
Conclusion - What have we done?
We have analyzed a multiclass queueing system.
The policy under which we operated the system broke the sample paths into very simple periods.
As a result we were able to obtain explicit expressions for the system in steady state.
Surprise: This system seems to operate at $\rho_i = 1$ for all the machines, and yet it is stable!
5 machine 20 buffer re-entrant example
[Diagram: machines I, II, III, IV, V and the route through buffers 1 to 20.]
Starting from some initial fluid level, and minimizing total holding costs until the system is empty is an SCLP.
Solution:
In this solution we use piecewise constant flow rates out of the buffers.
In each interval several machines are fully utilized, so this has multiple bottlenecks.
We hope that this can be operated by a stable fluid imitation heuristic.
Message:
Manufacturing Systems, with Ample Supply of Work, under Fluid Control can:
Work with Full Utilization ($\rho = 1$)
Be Stable (no heavy traffic)
Optimize Costs.