CHAPTER 4. Networks of queues

1. Open networks

Suppose that we have a network of queues as given in Figure 4.1.

Figure 4.1. An open network

Arrivals can occur from outside of the network to any subset of nodes. Arrivals from outside to node $i$ are assumed to follow a Poisson process with rate $\gamma_i$. There is a single server at each node with an infinite queue. The processing times at node $i$ are assumed to be exponentially distributed with rate $\mu_i$. Routing is random, i.e. jobs departing from node $i$ proceed to node $j$ with probability $r_{ij}$. We assume that the routing is such that all arriving jobs eventually leave the system.

We first write down a set of equations that determines the arrival rate $\lambda_i$ to each node $i$, if the system were stable:

$$\lambda_i = \gamma_i + \sum_{j=1}^{N} r_{ji}\lambda_j, \qquad i = 1, 2, \ldots, N,$$

where $N$ is the number of nodes (queues) in the network. This system of equations is known as the traffic equations.
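The traffic equations are a linear system, $(I - R^{T})\lambda = \gamma$, so they can be solved mechanically. The following is a minimal sketch in plain Python, not a definitive implementation: the function name `solve_traffic_equations` and the use of Gauss-Jordan elimination are illustrative choices, and the data are those of the three-node example worked later in this section, for which the text reports $\lambda = (20, 8, 8)$.

```python
def solve_traffic_equations(gamma, r):
    """Solve lambda_i = gamma_i + sum_j r[j][i] * lambda_j for all i.

    gamma -- external arrival rates gamma_i
    r     -- routing matrix, r[i][j] = probability of moving from node i to node j
    """
    n = len(gamma)
    # Augmented matrix for (I - R^T) * lambda = gamma.
    a = [[(1.0 if i == j else 0.0) - r[j][i] for j in range(n)] + [gamma[i]]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system: routing may trap jobs in the network")
        a[col], a[pivot] = a[pivot], a[col]
        # Gauss-Jordan step: clear this column in every other row.
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                for k in range(col, n + 1):
                    a[row][k] -= factor * a[col][k]
    return [a[i][n] / a[i][i] for i in range(n)]


# Data of the three-node open-network example in this section.
gamma = [4.0, 0.0, 0.0]
routing = [[0.2, 0.4, 0.4],
           [1.0, 0.0, 0.0],
           [0.5, 0.0, 0.0]]
arrival_rates = solve_traffic_equations(gamma, routing)
print([round(x, 6) for x in arrival_rates])
```

Once the $\lambda_i$ are known, the loads $\rho_i = \lambda_i/\mu_i$ follow directly, and the network is stable only if every $\rho_i < 1$.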
If we let the state of the system be an $N$-dimensional vector consisting of the number of jobs at each node, we could write the balance equations for each state. This is somewhat cumbersome, but doable. First, let $\bar{e}_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$ be an all-zero vector, except for a 1 in the $i$th position. Also, let $\vec{n} = (n_1, n_2, \ldots, n_N)$. Suppose that all $n_i > 0$; then the balance equation for state $\vec{n}$ is

$$\sum_{i=1}^{N} \bigl(\gamma_i + \mu_i(1 - r_{ii})\bigr)\, p(\vec{n}) = \sum_{i=1}^{N} \gamma_i\, p(\vec{n} - \bar{e}_i) + \sum_{i=1}^{N} \mu_i r_{i0}\, p(\vec{n} + \bar{e}_i) + \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mu_j r_{ji}\, p(\vec{n} + \bar{e}_j - \bar{e}_i),$$

where $p(\vec{n})$ is the steady-state probability that there are $n_i$ jobs at node $i$, $i = 1, \ldots, N$, and

$$r_{i0} = 1 - \sum_{j=1}^{N} r_{ij}$$

is the probability that a job departing node $i$ leaves the network. In addition to this equation, there are all of the boundary states (all combinations of $n_i = 0$) to take into account. It turns out that we have the following solution:

$$p(\vec{n}) = \prod_{i=1}^{N} p_i(n_i), \qquad p_i(n_i) = \left(1 - \frac{\lambda_i}{\mu_i}\right)\left(\frac{\lambda_i}{\mu_i}\right)^{n_i}.$$

This is a so-called product form solution: the steady-state probabilities are a product of quantities at one queue only (of course, there is dependence through the traffic equations). Note that if there are $c_i$ servers at node $i$, it turns out that $p_i(n_i)$ is the corresponding steady-state distribution for an M/M/$c_i$ system.

This suggests that the queues are independent of each other, but that is a bit misleading. It turns out that if you take a snapshot of such a network at a particular instant of time, the numbers of jobs found in each queue are independent of one another. However, if you took snapshots of the network at two close time instants, you would find strong correlations between the number of jobs in the queues between those two time instants.

Example. $N = 3$ nodes, mean processing times $1/\mu_1 = 0.04$ seconds, $1/\mu_2 = 0.06$ seconds, $1/\mu_3 = 0.04$ seconds. From outside, jobs arrive exclusively to the first node, according to a Poisson process with rate $\gamma_1 = 4$ jobs per second.
The routing probabilities are $r_{11} = 0.2$, $r_{12} = 0.4$, $r_{13} = 0.4$, $r_{21} = 1$, $r_{31} = 0.5$.

(a) Find the steady-state probability that there are exactly 4 jobs at node 1, exactly 3 jobs at node 2, and exactly 2 jobs at node 3.
We first solve the traffic equations:

$$\lambda_1 = 4 + 0.2\lambda_1 + \lambda_2 + 0.5\lambda_3, \qquad \lambda_2 = 0.4\lambda_1, \qquad \lambda_3 = 0.4\lambda_1.$$

The solution of this is $\lambda_1 = 20$, $\lambda_2 = 8$, $\lambda_3 = 8$, from which we can compute $\rho_1 = 0.8$, $\rho_2 = 0.48$, $\rho_3 = 0.32$. Thus, the required probability is

$$p((4, 3, 2)) = (1 - \rho_1)\rho_1^4\,(1 - \rho_2)\rho_2^3\,(1 - \rho_3)\rho_3^2 = 3.28 \times 10^{-4}.$$

(b) Find the expected number of jobs in the network:

$$\frac{\rho_1}{1 - \rho_1} + \frac{\rho_2}{1 - \rho_2} + \frac{\rho_3}{1 - \rho_3} = 4 + 0.92 + 0.47 = 5.39.$$

What is the bottleneck in this network? Here, it is the node with the highest load, which is node 1. A general rule of thumb is: improving the situation at the bottleneck maximizes performance improvement. For example, suppose that we can only improve one processing rate, by 10 percent, i.e. we choose one of $\mu_1 = 27.5$, $\mu_2 = 18.33$, $\mu_3 = 27.5$. If we improve $\mu_3$, we get $L = 5.33$; if we improve $\mu_2$, we get $L = 5.24$; and if we improve $\mu_1$, we get $L = 4.06$. Improving the non-bottleneck nodes yields negligible improvement, whereas there is a dramatic improvement if we improve the bottleneck.

2. Closed networks

Suppose that we start with $M$ jobs in the system, none of them leave and no new jobs enter. In other words, we assume that for all $i$, $\gamma_i = 0$ and $\sum_{j=1}^{N} r_{ij} = 1$. Here, the traffic equations are

$$\lambda_i = \sum_{j=1}^{N} r_{ji}\lambda_j.$$

There is a problem with this set of equations: there is no unique solution. It turns out that we still have a product form solution, using any non-zero solution to the traffic equations:

$$p(\vec{n}) = \frac{1}{G(M)} \prod_{i=1}^{N} \left(\frac{\lambda_i}{\mu_i}\right)^{n_i}.$$

We need to calculate $G(M)$. It is maybe best to see how to do this through an example.

Example. $N = 3$ nodes, $M = 3$ jobs. There is a single processor at each node. The routing probabilities are $r_{11} = 0.6$, $r_{21} = 0.5$, $r_{31} = 0.4$, $r_{12} = 0.3$, $r_{32} = 0.6$, $r_{13} = 0.1$, $r_{23} = 0.5$. Processing times are exponentially distributed with rates $\mu_1 = 0.8$ per second, $\mu_2 = 0.6$ per second, and $\mu_3 = 0.3$ per second.
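For a network this small, $G(M)$ can simply be obtained by enumerating every state with $M$ jobs. The sketch below does this for the routing probabilities and service rates of the example just stated. It is an illustration under assumptions rather than a prescribed method: the function names are hypothetical, and the relative arrival rates are found by repeated substitution normalized to $\lambda_1 = 1$, which presumes the routing chain is irreducible and aperiodic.

```python
from itertools import product


def relative_arrival_rates(r, iterations=500):
    """Non-zero solution of lambda_i = sum_j r[j][i] * lambda_j, scaled so lambda_1 = 1.

    Repeated substitution; assumes an irreducible, aperiodic routing chain
    so that the iteration converges.
    """
    n = len(r)
    lam = [1.0] * n
    for _ in range(iterations):
        new = [sum(lam[j] * r[j][i] for j in range(n)) for i in range(n)]
        lam = [v / new[0] for v in new]  # renormalize so the first entry is 1
    return lam


def closed_network(lam, mu, M):
    """Return (G(M), {state: probability}) by enumerating all states with M jobs."""
    x = [l / m for l, m in zip(lam, mu)]  # relative loads lambda_i / mu_i
    weights = {}
    for state in product(range(M + 1), repeat=len(mu)):
        if sum(state) == M:
            w = 1.0
            for x_i, n_i in zip(x, state):
                w *= x_i ** n_i
            weights[state] = w
    G = sum(weights.values())
    return G, {s: w / G for s, w in weights.items()}


routing = [[0.6, 0.3, 0.1],
           [0.5, 0.0, 0.5],
           [0.4, 0.6, 0.0]]
mu = [0.8, 0.6, 0.3]
lam = relative_arrival_rates(routing)
G, p = closed_network(lam, mu, M=3)

# Probability of exactly two jobs at node 2, and throughput of node 1.
p_two_at_node_2 = sum(q for s, q in p.items() if s[1] == 2)
throughput_node_1 = mu[0] * (1.0 - sum(q for s, q in p.items() if s[0] == 0))
print(lam, G, p_two_at_node_2, throughput_node_1)
```

Enumeration is only practical for small $M$ and $N$; the number of states grows combinatorially, which is why later sections avoid computing the normalizing constant directly.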
(a) What is the steady-state probability that there are exactly two jobs at node two? We first solve the traffic equations:

$$\lambda_1 = 0.6\lambda_1 + 0.5\lambda_2 + 0.4\lambda_3, \qquad \lambda_2 = 0.3\lambda_1 + 0.6\lambda_3, \qquad \lambda_3 = 0.1\lambda_1 + 0.5\lambda_2.$$

There is no unique solution, so arbitrarily set $\lambda_1 = 1$. Substituting the third equation into the second gives $0.7\lambda_3 = 0.25$, so the remaining arrival rates are $\lambda_3 = 0.357$, $\lambda_2 = 0.514$. To solve for $G(3)$, we need to calculate

$$G(3) = \sum_{n_1 + n_2 + n_3 = 3} \left(\frac{1}{0.8}\right)^{n_1} \left(\frac{0.514}{0.6}\right)^{n_2} \left(\frac{0.357}{0.3}\right)^{n_3}.$$

Note that the summation is a summation over ten terms, of all possible combinations of 3 jobs over 3 nodes: $(3, 0, 0), (2, 1, 0), \ldots$. In this case, doing the summation gives $G(3) = 13.52$, so

$$p((n_1, n_2, n_3)) = \frac{1}{13.52} \left(\frac{1}{0.8}\right)^{n_1} \left(\frac{0.514}{0.6}\right)^{n_2} \left(\frac{0.357}{0.3}\right)^{n_3}.$$

The required probability is

$$p((1, 2, 0)) + p((0, 2, 1)) = \frac{1}{13.52}(0.918 + 0.875) = 0.133.$$

(b) Determine the throughput at node 1. This is the processing rate at node 1, multiplied by the probability that it is busy, i.e.

$$\mu_1 \bigl(1 - (p((0, 3, 0)) + p((0, 0, 3)) + p((0, 2, 1)) + p((0, 1, 2)))\bigr) = 0.539.$$

3. BCMP Networks

The furthest that we can push the product form solution is a class of networks known as BCMP networks. There are $N$ nodes and $R$ classes (or types) of job. For each class, we must specify routing probabilities through the network (these can be class dependent). A class can either be open (jobs enter from outside and eventually leave) or closed (jobs never leave). As discussed in class, the nodes are allowed to be one of four types:

1: FCFS. Here, jobs are served in a first come, first served order. Multiple classes may visit a node, but in this case the service time distributions must be the same (and exponentially distributed) for all classes. The service rates may be load-dependent.

2: PS. Here jobs are served using processor sharing, with each waiting job getting an equal share of capacity. Jobs of different classes may have different service requirements and the service rates (for each class) may depend on the queue length at the node.
The service distributions must be so-called Coxian type (essentially a combination of exponential distributions), but only the expected value needs to be determined.
3: IS or delay. Here an infinite number of servers is available, or equivalently, each job is served by its own server. Jobs of different classes may have different service requirements and the service rates (for each class) may depend on the queue length at the node. The service distributions must be so-called Coxian type (essentially a combination of exponential distributions), but only the expected value needs to be determined.

4: LCFS-PR. Here jobs are served on a last come, first served basis, with preemption (also, work done on preempted jobs is not lost). Further restrictions are the same as in the previous two cases.

At this point, let me indicate that the results below depend only on the mean service times. This is why I have not discussed Coxian distributions. It will suffice at this point to note that we can approximate most distributions by a Coxian distribution and thus when the nodes are of the last 3 types, there is no practical limitation on the service time distributions.

Upon leaving node $i$, a job of class $r$ goes to node $j$ and becomes a job of class $s$ with probability $r_{i,r;j,s}$. A job will leave the network with probability $r_{i,r;0}$.

Of course, there can only be arrivals from outside of the system for classes that are open. In this case, there are two possibilities which are allowed. The first possibility is that there is a single Poisson process with rate $\lambda(k)$, where $k$ is the total population in the network. Upon arrival to the system, a job goes to node $i$ as a class $r$ job with probability $r_{0;i,r}$. The second possibility is that each routing chain has its own arrival stream, with a rate that depends only on the population of that chain (which we will give by $\lambda_c(k_c)$, with $c \in C$, where $C$ is the set of routing chains and $k_c$ is the population in routing chain $c$). For each stream, with probability $r_{0;i,c}$ an arrival joins node $i$.

For each routing chain $c$, we want to write an equation for the net arrival rate to node $i$ of class $r$ jobs.
This can be written as

$$\lambda_{i,r} = \lambda^{0}_{i,r} + \sum_{(j,s)} \lambda_{j,s}\, r_{j,s;i,r}.$$

Here, $\lambda^{0}_{i,r}$ is the arrival rate of jobs from outside of the system. For closed networks it is 0; for open networks it equals $\lambda r_{0;i,c}$ (one arrival process) or $\lambda_r r_{0;i,r}$ (arrivals per chain/class). This equation has a very simple intuitive explanation. The left side is the arrival rate to $(i, r)$, the first term on the right hand side is the arrival rate to $(i, r)$ from outside, and the final term is the sum of the arrival rates to $(i, r)$ from all other (node, class) pairs in the network. Using this equation, assuming the system is stable, one can calculate the throughputs $\lambda_{i,r}$ for open chains, and the visit ratios $V_{i,r}$ for closed chains. Of course, it is not uncommon that for closed systems, the visit ratios are given directly (think of what we have done in class and that you have done in assignments).
The main result is now stated (the proof is really beyond the scope of the course, but is not particularly difficult... if anybody is interested, just ask). We need a couple of definitions, to define what the state of the queueing network is. Let the vector $\vec{N}_i = (N_{i,1}, N_{i,2}, \ldots, N_{i,R})$ denote the state of node $i$, where $N_{i,r}$ gives the number of class $r$ jobs at node $i$. The state of the system is the vector $\vec{N} = (\vec{N}_1, \vec{N}_2, \ldots, \vec{N}_N)$ and the total number of jobs in the system is $K$.

BCMP Theorem. The steady-state probability distribution in a BCMP network has the following product form:

$$P(\vec{N} = \vec{n}) = \frac{1}{G}\, A(\vec{n}) \prod_{i=1}^{N} p_i(\vec{n}_i),$$

where $G$ is a normalizing constant (it assures that the probabilities sum to one), $A(\vec{n})$ is a function of the external arrival processes only, and the functions $p_i(\vec{n}_i)$ are the per-node steady-state distributions.

The important point of this result is that there are explicit expressions for the $p_i$ functions. They are as follows (note that $n_i$ is $\sum_{r=1}^{R} n_{i,r}$).

When node $i$ is of type FCFS, we have in the load-independent case

$$p_i(\vec{n}_i) = n_i! \left(\prod_{r=1}^{R} \frac{V_{i,r}^{n_{i,r}}}{n_{i,r}!}\right) \left(\frac{1}{\mu_i}\right)^{n_i},$$

and in the load-dependent case

$$p_i(\vec{n}_i) = n_i! \left(\prod_{r=1}^{R} \frac{V_{i,r}^{n_{i,r}}}{n_{i,r}!}\right) \prod_{j=1}^{n_i} \frac{1}{\mu_i(j)}.$$

When node $i$ is of type PS or LCFS-PR, we have in the load-independent case

$$p_i(\vec{n}_i) = n_i! \prod_{r=1}^{R} \frac{1}{n_{i,r}!} \left(\frac{V_{i,r}}{\mu_{i,r}}\right)^{n_{i,r}},$$

and in the load-dependent case

$$p_i(\vec{n}_i) = n_i! \prod_{r=1}^{R} \frac{1}{n_{i,r}!}\, \frac{V_{i,r}^{n_{i,r}}}{\prod_{j=1}^{n_i} \mu_{i,r}(j)}.$$

When node $i$ is of type IS, we have in the load-independent case

$$p_i(\vec{n}_i) = \prod_{r=1}^{R} \frac{1}{n_{i,r}!} \left(\frac{V_{i,r}}{\mu_{i,r}}\right)^{n_{i,r}},$$

and in the load-dependent case

$$p_i(\vec{n}_i) = \prod_{r=1}^{R} \frac{1}{n_{i,r}!}\, \frac{V_{i,r}^{n_{i,r}}}{\prod_{j=1}^{n_i} \mu_{i,r}(j)}.$$
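To get used to the notation, it can help to evaluate the load-independent expressions above for a concrete node state. The following is a small sketch under assumptions: the function name `node_factor` and its argument layout are hypothetical, only the load-independent cases are covered, and for an FCFS node the single rate $\mu_i$ is passed as a list with equal entries.

```python
from math import factorial


def node_factor(node_type, counts, visits, rates):
    """Per-node term p_i(n_i) for a load-independent BCMP node.

    node_type -- "FCFS", "PS", "LCFS-PR" or "IS"
    counts    -- per-class job counts n_{i,r}
    visits    -- per-class visit ratios V_{i,r}
    rates     -- per-class service rates mu_{i,r}; for FCFS all entries equal mu_i
    """
    if node_type not in ("FCFS", "PS", "LCFS-PR", "IS"):
        raise ValueError("unknown node type: " + node_type)
    if node_type == "FCFS" and len(set(rates)) > 1:
        raise ValueError("FCFS nodes need the same service rate for every class")
    term = 1.0
    for n_r, v_r, mu_r in zip(counts, visits, rates):
        # Factor common to all node types: (V/mu)^n / n! for each class.
        term *= (v_r / mu_r) ** n_r / factorial(n_r)
    if node_type != "IS":
        # Single-server disciplines carry the extra n_i! factor.
        term *= factorial(sum(counts))
    return term


# One class, three jobs, V = 1, mu = 2.
print(node_factor("FCFS", [3], [1.0], [2.0]))
print(node_factor("IS", [3], [1.0], [2.0]))
```

With a single class the FCFS, PS and LCFS-PR terms reduce to $(V_i/\mu_i)^{n_i}$, while the IS term keeps the $1/n_i!$ factor.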
Finally, the term $A(\vec{n})$ is determined by the arrival processes in the following manner. If all chains are closed, then $A(\vec{n}) = 1$. If the arrivals depend on the total system population, then it is equal to

$$A(\vec{n}) = \prod_{j=0}^{k-1} \lambda(j),$$

where $k$ is the network population. If the arrivals are per chain, then

$$A(\vec{n}) = \prod_{c=1}^{N_C} \prod_{j=0}^{k_c - 1} \lambda_c(j),$$

where $N_C$ is the number of routing chains and $k_c$ is the population in routing chain $c$.

At this point, all of this notation may seem a bit much, so two examples are given here which are special cases of the BCMP theorem and are of great practical interest.

Example. Single-class, load-independent open networks. Here, the arrival process is Poisson of constant rate $\lambda$ (there is no load dependence for the arrivals). Also, the service rates are fixed. If the node is FCFS, PS or LCFS-PR, there is only one server. Then

$$P(\vec{N} = \vec{n}) = \prod_{i=1}^{N} p_i(n_i),$$

where

$$p_i(n_i) = \begin{cases} (1 - \rho_i)\rho_i^{n_i}, & \text{FCFS, PS, LCFS-PR type}, \\ e^{-\rho_i}\, \dfrac{\rho_i^{n_i}}{n_i!}, & \text{IS type}, \end{cases}$$

where $\rho_i$ is defined as

$$\rho_i = \begin{cases} \displaystyle\sum_{r \in R_i} \frac{\lambda V_{i,r}}{\mu_i}, & \text{FCFS type}, \\ \displaystyle\sum_{r \in R_i} \frac{\lambda V_{i,r}}{\mu_{i,r}}, & \text{IS, PS, LCFS-PR type}, \end{cases}$$

where $R_i$ is the set of classes that require service at node $i$. You should be able to verify this result yourself; it is a decent exercise to get used to all of the notation. Note that $A(\vec{n})$ has been absorbed into the definition of $\rho_i$. This result should be somewhat intuitive. It says that the system decomposes into M/M/1 (or M/M/$\infty$) queues with the appropriate arrival rates.

Example. Closed, multi-class, load-independent BCMP networks. A lot of computer systems examples have load-independent servers, multiple customer classes (but no class changes) and fixed populations per class. Here,

$$P(\vec{N} = \vec{n}) = \frac{1}{G} \prod_{i=1}^{N} p_i(\vec{n}_i),$$
with

$$p_i(\vec{n}_i) = \begin{cases} n_i! \left(\displaystyle\prod_{r=1}^{R} \frac{V_{i,r}^{n_{i,r}}}{n_{i,r}!}\right) \left(\dfrac{1}{\mu_i}\right)^{n_i}, & \text{FCFS type}, \\ n_i! \displaystyle\prod_{r=1}^{R} \frac{1}{n_{i,r}!} \left(\frac{V_{i,r}}{\mu_{i,r}}\right)^{n_{i,r}}, & \text{PS, LCFS-PR type}, \\ \displaystyle\prod_{r=1}^{R} \frac{1}{n_{i,r}!} \left(\frac{V_{i,r}}{\mu_{i,r}}\right)^{n_{i,r}}, & \text{IS type}. \end{cases}$$

Note that $n_i = \sum_{r=1}^{R} n_{i,r}$.

Example. Client Server System with Ethernet connection. Here we consider a client server system with a fixed number $m$ of client workstations that are connected by an Ethernet network to a server. The server consists of a single disk and a single CPU. The Ethernet connection between the terminals and the server can be modelled as a server with the load dependent service rate

$$\mu(k) = \begin{cases} \left[ N_p \left( \dfrac{L_p}{B} + S\, C(1) \right) \right]^{-1}, & k = 1, \\ \left[ N_p \left( \dfrac{L_p}{B} + S\, C(k + 1) \right) \right]^{-1}, & k > 1, \end{cases}$$

where $C(k) = (1 - A(k))/A(k)$ is the average number of collisions per request, $A(k) = (1 - 1/k)^{k-1}$ is the probability of a successful transmission, and $k$ is the number of workstations that desire the use of the network. The other parameters in the expression for $\mu(k)$ are: $N_p$, the average number of packets generated per request; $B$, the network bandwidth in bits per second; $S$, the slot duration (in other words, the time for collision detection); and $L_p$, the average packet length in bits.

4. Numerical Solutions for Product Form Networks

For a closed, product form (i.e. closed BCMP) queueing network, we know that

$$p(\vec{n}) = \frac{1}{G(M)}\, f_1(n_1) f_2(n_2) \cdots f_N(n_N),$$

where either

$$f_i(n_i) = \left(\frac{\theta_i}{\mu_i}\right)^{n_i}$$

if the node has state independent processing, or

$$f_i(n_i) = \frac{\theta_i^{n_i}}{\prod_{k=1}^{n_i} \mu_i(k)},$$

and the set $\{\theta_i\}$ is a solution to the traffic equations

$$\theta_i = \sum_{k=1}^{N} \theta_k r_{ki}.$$
(Here we have assumed a single class network.) There is a difficulty in evaluating $G(M)$, as we have

$$G(M) = \sum_{\vec{n}} \prod_{i=1}^{N} f_i(n_i)$$

and this expression does not simplify. So, for a network with $M$ jobs and $N$ nodes, the sum is over

$$\binom{M + N - 1}{N - 1}$$

possibilities. For example, for 10 nodes and 25 jobs, one gets

$$\frac{34!}{25!\, 9!} = 52{,}451{,}256$$

terms. The issues here are that there are a large number of multiplications, and that the summation of a large number of terms can be problematic, due to either overflow or underflow errors. There are many ways to circumvent this. We will do one, which allows the calculation of some performance measures without explicitly calculating the normalization constant.

5. Mean Value Analysis

We start with a simple model, a closed, cyclic queueing network with $N$ nodes and $M$ jobs. The $i$th node has a processing rate of $\mu_i$. We can decompose the mean waiting time at node $i$, $W_i$, as follows:

$$W_i = \frac{1}{\mu_i} + \frac{1}{\mu_i}\,(\text{mean number of jobs present upon arrival}).$$

The memoryless property is crucial here; otherwise we need to know something about the elapsed time of a job in service. We need the following key result, presented here without proof. Define $W_i(m)$ to be the mean waiting time at node $i$ for a network with $m$ jobs. Similarly, $L_i(m)$ is the mean number of jobs at node $i$ for a network with $m$ jobs. We then have

$$W_i(m) = \frac{1}{\mu_i} + \frac{1}{\mu_i} L_i(m - 1).$$

Now, for the entire network, if $Y(m)$ is the throughput, then by Little's Law

$$Y(m) \sum_{i=1}^{N} W_i(m) = m.$$

For each node, Little's Law also yields

$$Y(m) W_i(m) = L_i(m).$$

Putting these all together yields the basic MVA algorithm:

(1) $L_i(0) = 0$, $i = 1, 2, \ldots, N$
(2) $W_i(m) = \dfrac{1}{\mu_i} + \dfrac{1}{\mu_i} L_i(m - 1)$, $i = 1, 2, \ldots, N$
(3) $Y(m) = m \big/ \sum_{i=1}^{N} W_i(m)$
(4) $L_i(m) = Y(m) W_i(m)$, $i = 1, 2, \ldots, N$

Steps 2-4 are repeated for $m = 1, 2, \ldots, M$.

Example. We look at the network in Figure 4.2.

Figure 4.2. Example for basic MVA

Since $W_i(1) = 1/\mu_i$, the values below correspond to $\mu_1 = 2$, $\mu_2 = 3$, $\mu_3 = 6$.

1. $L_i(0) = 0$, $i = 1, 2, 3$

For $m = 1$:
2. $W_1(1) = 0.5$, $W_2(1) = 0.333$, $W_3(1) = 0.166$
3. $Y(1) = 1/(0.5 + 0.333 + 0.1666) = 1.0$
4. $L_1(1) = (1.0)(0.500) = 0.500$, $L_2(1) = (1.0)(0.333) = 0.333$, $L_3(1) = (1.0)(0.166) = 0.166$

For $m = 2$:
2. $W_1(2) = 0.500 + (0.500)(0.500) = 0.750$, $W_2(2) = 0.333 + (0.333)(0.333) = 0.444$, $W_3(2) = 0.166 + (0.166)(0.166) = 0.194$
3. $Y(2) = 2/(0.75 + 0.444 + 0.194) = 1.44$
4. $L_1(2) = (1.44)(0.75) = 1.08$, $L_2(2) = (1.44)(0.444) = 0.639$, $L_3(2) = (1.44)(0.194) = 0.279$

For $m = 3$:
2. $W_1(3) = 0.500 + (0.500)(1.08) = 1.04$, $W_2(3) = 0.333 + (0.333)(0.639) = 0.546$, $W_3(3) = 0.166 + (0.166)(0.279) = 0.212$
3. $Y(3) = 3/(1.04 + 0.546 + 0.212) = 1.67$
4. $L_1(3) = (1.67)(1.04) = 1.74$, $L_2(3) = (1.67)(0.546) = 0.912$, $L_3(3) = (1.67)(0.212) = 0.354$

For $m = 4$:
2. $W_1(4) = 0.500 + (0.500)(1.74) = 1.37$, $W_2(4) = 0.333 + (0.333)(0.912) = 0.637$, $W_3(4) = 0.166 + (0.166)(0.354) = 0.225$
3. $Y(4) = 4/(1.37 + 0.637 + 0.225) = 1.79$
4. $L_1(4) = (1.79)(1.37) = 2.45$, $L_2(4) = (1.79)(0.637) = 1.14$, $L_3(4) = (1.79)(0.225) = 0.40$

There are several ways to generalize this. One is of particular interest, but we will defer this for a bit, as it requires some concepts to be introduced first. We would like to see how much we can get from just knowing averages (from data). This leads us to study what is known as operational analysis.

6. Operational Analysis

Operational analysis is the use of a number of laws that various averages satisfy. These can often be used to make very quick calculations of various performance measures. Suppose that we have the following data: $T$, the length of a time interval over which observations are made, $A_i$, the number
of arrivals to device $i$, $C_i$, the number of completions at device $i$, and $B_i$, the amount of time that device $i$ was busy. From these we can quickly derive the following quantities:

(1) Arrival rate at device $i$: $\lambda_i = A_i/T$
(2) Throughput at device $i$: $X_i = C_i/T$
(3) Utilization of device $i$: $U_i = B_i/T$
(4) Average processing time at device $i$: $S_i = B_i/C_i$

We can also derive the following laws.

Utilization Law. $U_i = X_i S_i$. This follows from

$$\frac{B_i}{T} = \frac{C_i}{T} \cdot \frac{B_i}{C_i}.$$

Forced Flow Law. Suppose that $A_i = C_i$ (or at least approximately). Let $V_i$ be the visit ratio to device $i$, i.e. the average number of visits to node $i$ for every visit to a reference device (device 0), where $V_0 = 1$ by definition. So, we have $C_i = C_0 V_i$, which implies

$$V_i = \frac{C_i}{C_0}.$$

So, if the system throughput $X$ is measured at node 0, then the throughput of node $i$ is

$$X_i = \frac{C_i}{T} = \frac{C_i}{C_0} \cdot \frac{C_0}{T} = V_i X.$$

Now, we can combine the Utilization Law and the Forced Flow Law to get

$$U_i = X_i S_i = X V_i S_i = X D_i,$$

where $D_i = V_i S_i$ is the total processing demand on device $i$ for all visits of a job. So, the bottleneck is equivalently the device with highest $D_i$.

Little's Law. $Q_i = X_i R_i$. This has already been shown, where $Q_i$ is the average number of jobs at device $i$ and $R_i$ is the average response time at device $i$.

The General Response Time Law refers to the system in Figure 4.3.

Figure 4.3. Network for General Response Time Law

Inside the box, we have an average number of jobs $Q$ satisfying
$$Q = Q_1 + Q_2 + \cdots + Q_N.$$

Applying Little's Law to the whole system ($Q = XR$) and to each device ($Q_i = X_i R_i$) gives

$$XR = X_1 R_1 + X_2 R_2 + \cdots + X_N R_N,$$

and dividing by $X$, with $X_i = V_i X$ from the Forced Flow Law, yields

$$R = V_1 R_1 + V_2 R_2 + \cdots + V_N R_N = \sum_{i=1}^{N} V_i R_i.$$

Next, we have the Interactive Response Time Law. Suppose there are $M$ clients, each with an average think time $Z$ between requests. The total average cycle time of a client request is $R + Z$. So, each client generates $T/(R + Z)$ requests, on average, in time $T$. So,

$$X = \frac{M\,(T/(R + Z))}{T} = \frac{M}{R + Z}, \qquad R = \frac{M}{X} - Z.$$

Finally, we can get bounds on performance by doing a Bottleneck Analysis. Let $D = D_1 + D_2 + \cdots + D_N$ be the average total demand and $D_{\max}$ be the bottleneck device demand. The main bounds that we have are, with $M$ clients,

$$X(M) \le \min\left\{ \frac{1}{D_{\max}},\ \frac{M}{D + Z} \right\}, \qquad R(M) \ge \max\{ D,\ M D_{\max} - Z \}.$$

It is straightforward to prove these. First, at the bottleneck, $U_{\max} = X D_{\max} \le 1$, or

$$X \le \frac{1}{D_{\max}}.$$

Also, $R(1) = D_1 + D_2 + \cdots + D_N = D$, which implies $R(M) \ge D$. Finally, from the Interactive Response Time Law,

$$R(M) = \frac{M}{X(M)} - Z \ge M D_{\max} - Z, \qquad X(M) = \frac{M}{R(M) + Z} \le \frac{M}{D + Z}.$$

Example. Suppose we have the system in Figure 4.4. Suppose $D_a = 2$ seconds, $D_b = 3$ seconds, $D_c = 1$ second, $Z = 30$ seconds. Then we can use the bottleneck analysis bounds to derive the curves in Figures 4.5 and 4.6. The dotted lines are the bounds given by the bottleneck analysis. The solid line is the actual value (we can determine this by MVA, see the next section). Note that the breakpoint for both of the curves is the same; this
value is often known as the number of clients that the system can support. Below this number, performance scales well; above this number, performance degrades considerably. The breakpoint is where the two bounds meet, $M^* = (D + Z)/D_{\max}$, which is $36/3 = 12$ clients for the demands above.

Figure 4.4. Network for Operational Analysis Example

Figure 4.5. Bounds on throughput

Figure 4.6. Bounds on response time

Suppose that we had the choice to allow $D_a$, $D_b$, $D_c$ to be any (nonnegative) values that add to six. What would be a good choice? If we let $D_a = D_b = D_c = 2$, we get a balanced load on each device, and the maximum number of supported clients becomes 18.

In Figures 4.5 and 4.6, the dotted lines were easy to get. To obtain the solid lines, we need to do a little more work: generalize the MVA algorithm. If we are interested in the system in Figure 4.3, we first convert it to a system of the form of that in Figure 4.4, by replacing each node 1 through $N$ by a node with visit ratio 1 but average processing time $D_i$. Note that
this preserves the demand at each node. Now, the MVA algorithm simply becomes

(1) $L_i(0) = 0$, $i = 0, 1, 2, \ldots, N$
(2) $W_i(m) = D_i + D_i L_i(m - 1)$, $i = 1, 2, \ldots, N$
(3) $Y(m) = m \big/ \bigl(Z + \sum_{i=1}^{N} W_i(m)\bigr)$
(4) $L_i(m) = Y(m) W_i(m)$, $i = 1, 2, \ldots, N$, and $L_0(m) = Y(m) Z$.

Suppose that we have a system that we cannot analyze using any of the techniques to this point. A possibility is to simulate the system, which is the subject of the next chapter.
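The four steps of the MVA recursion translate almost line for line into code. The following is a minimal sketch, not a definitive implementation: the names `mva` and `bottleneck_bounds` are hypothetical, and setting the think time to zero with demands $D_i = 1/\mu_i$ is assumed to recover the basic algorithm for the cyclic network. It is run on the demands $D_a = 2$, $D_b = 3$, $D_c = 1$ and $Z = 30$ of the operational analysis example, printing the exact throughput and response time next to the bottleneck bounds.

```python
def mva(demands, M, think_time=0.0):
    """Exact MVA for a closed network of single-server queueing nodes.

    demands    -- service demands D_i (use 1/mu_i for the basic cyclic network)
    M          -- number of jobs (clients)
    think_time -- think time Z at the delay node; 0 gives the basic algorithm
    Returns a list of (m, Y(m), R(m), [L_i(m)]) for m = 1..M,
    where R(m) is the sum of the W_i(m).
    """
    L = [0.0] * len(demands)                              # step 1
    results = []
    for m in range(1, M + 1):
        W = [d * (1.0 + l) for d, l in zip(demands, L)]   # step 2
        R = sum(W)
        Y = m / (think_time + R)                          # step 3
        L = [Y * w for w in W]                            # step 4
        results.append((m, Y, R, L))
    return results


def bottleneck_bounds(demands, M, think_time):
    """Upper bound on X(M) and lower bound on R(M) from bottleneck analysis."""
    D, D_max = sum(demands), max(demands)
    return min(1.0 / D_max, M / (D + think_time)), max(D, M * D_max - think_time)


demands, Z = [2.0, 3.0, 1.0], 30.0
for m, Y, R, _ in mva(demands, M=20, think_time=Z):
    x_bound, r_bound = bottleneck_bounds(demands, m, Z)
    print(m, round(Y, 4), round(x_bound, 4), round(R, 2), round(r_bound, 2))
```

Each population level reuses only the queue lengths from the previous level, so the cost is proportional to $M \times N$ and no normalizing constant is ever formed.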