Reducing Markov Chains for Performance Evaluation
Y. Pribadi, J.P.M. Voeten and B.D. Theelen
Information and Communication Systems Group, Faculty of Electrical Engineering / Eindhoven Embedded Systems Institute, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands

Abstract — A major problem in performance evaluation of real-life industrial systems is the enormous number of states generated by the system's model during the simulation process. An example is a Parallel Object-Oriented Specification Language (POOSL) model, which can be interpreted as a discrete-time Markov chain enabling the evaluation of long-run average performance metrics. In general this Markov chain is huge, making performance simulation expensive, since all performance estimations must be updated in each state. In this paper, we introduce a method for performance evaluation based on Markov chain theory, which aims at reducing the original Markov chain into a smaller chain. We prove that the reduction yields another Markov chain and that it preserves several important properties. We further give a method to effectively construct the reduced chain. The reduction method yields an efficient way to estimate performance metrics during model simulation, since in general it requires estimations to be updated only in a relatively small part of the state space.

Keywords — Markov Chains; Performance Modelling; Ergodicity; Long-run Averages

I. INTRODUCTION

The POOSL language is a system-level modelling language for complex real-time hardware/software systems [6], [3]. This language has proven to be very useful for describing real-life industrial systems and for verifying their qualitative (correctness) and quantitative (performance) properties. In [6], [10] for instance, POOSL has been applied to design and analyze a distributed control system for a new generation of mailing machines. In [5], a protection-switching protocol for SDH has been modelled, simulated and verified.
A case study on the application of the POOSL formalism to analyze the performance properties of an industrial-sized Internet router has been presented in [9]. The POOSL language is equipped with a Plotkin-style structural operational semantics defining a labelled transition system. By resolving non-determinism with an external scheduler, this labelled transition system can be transformed into a discrete-time Markov chain [11]. Performance metrics can be expressed as long-run average rewards. A reward is a real-valued function on the state space of the chain, and the idea is that each time a certain state is visited, a reward as specified by the reward function is obtained. Long-run average rewards can be computed using the ergodic theorem for Markov chains. However, the state space of Markov chains is often too large to perform such an exhaustive analysis. In case of simulation, the size of the state space also poses a problem, since every performance estimation must be updated in each state, even if this state has no direct impact on the estimation. For this reason, we develop a method to reduce the Markov chain by leaving out all such states. (This research is supported by PROGRESS, the embedded systems research program of the Dutch organization for Scientific Research NWO, the Dutch Ministry of Economic Affairs and the Technology Foundation STW.)

The plan of the remainder of this paper is as follows. In Section II, we explain the formalism of Markov chains in terms of traces and their probabilities. In Section III, long-run average rewards and the ergodic theorem are discussed. Section IV forms the core of this paper. In this section, we discuss the construction method and prove that it preserves several important properties. In particular, we prove that the conditional long-run average reward of the original chain and the long-run average reward of the reduced chain are equal. We also show an example in this section. Finally, in Section V the conclusions are given.

II.
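The simulation-based estimation described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation; the function and parameter names are chosen here for clarity. Each simulation step samples the next state from the transition matrix and updates the running reward total, which is exactly the per-step update cost the reduction method aims to shrink.

```python
import random

def estimate_long_run_average(P, reward, start, steps, seed=0):
    """Estimate the long-run average reward of a discrete-time Markov chain
    by simulation: each transition into a state earns that state's reward,
    and the running total is divided by the number of transitions."""
    rng = random.Random(seed)
    state, total = start, 0.0
    for _ in range(steps):
        # Sample the next state from row P[state] of the transition matrix.
        u, acc = rng.random(), 0.0
        for nxt, p in enumerate(P[state]):
            acc += p
            if u < acc:
                state = nxt
                break
        total += reward[state]
    return total / steps

# Two-state example: from either state, move to state 1 with probability 0.5,
# so the long-run average of this reward is 0.5.
P = [[0.5, 0.5],
     [0.5, 0.5]]
reward = [0.0, 1.0]
print(estimate_long_run_average(P, reward, start=0, steps=100_000))
```

Note that the estimate must be updated at every simulated step, regardless of whether the visited state actually contributes to the metric.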
FORMALISM

Assume a POOSL model defines a discrete-time Markov chain $\{X_n\}$ with a countable state space $S$, an initial distribution $\mu \in D(S)$ (where $D(S)$ is the set of probability distributions on $S$) and a probability matrix $P : S \times S \to [0,1]$. For $s, t \in S$, $P(s, t)$ denotes the probability that the chain transits from state $s$ to state $t$. The definition of such a chain will be given in the form of the ensemble $(S, \mu, P)$. Further, we assume that the probability space of this chain is given by the probability triple $(\Omega, \mathcal{F}, \Pr)$, where $\Omega$ is the collection of all infinite state sequences, $\mathcal{F}$ is the $\sigma$-algebra generated by the collection of cylindrical sets and $\Pr$ is the corresponding probability measure (see [1] for more details). A finite state sequence $\sigma = s_0 s_1 \ldots s_n$ or infinite state sequence $\omega = s_0 s_1 \ldots$ is called a trace if $P(s_i, s_{i+1}) > 0$ for each $i < n$, respectively each $i \geq 0$. For any infinite trace $\omega$, we let $\omega|_n$ denote the finite prefix of length $n$, and for any finite trace $\sigma$ we define $C(\sigma)$ as the thin cylinder of all infinite traces having $\sigma$ as prefix. The probabilities of the occurrences of finite trace $\sigma$ and infinite trace $\omega$ are denoted by $\Pr(\sigma)$ and $\Pr(\omega)$ respectively. These probabilities are given by

$$\Pr(\sigma) = \mu(s_0)\, P(s_0, s_1) \cdots P(s_{n-1}, s_n) \quad (1)$$

$$\Pr(\omega) = \lim_{n \to \infty} \Pr(\omega|_n) \quad (2)$$

where $\mu(s_0)$ is the probability that the chain starts in state $s_0$. Next to the probabilities $\Pr(\sigma)$ and $\Pr(\omega)$, we also need the probabilities of the occurrence of finite or infinite traces, conditional on the initial state of these traces. These conditional probabilities are given by $\Pr(\sigma \mid s_0) = \Pr(\sigma)/\mu(s_0)$ and $\Pr(\omega \mid s_0)$ respectively.

Let $T$ be a set of finite traces. Then $T$ is called proper if no trace in $T$ is a proper prefix of another trace in $T$. A set $T$ is called initial if all traces in $T$ have the same initial state, and final if all traces in $T$ have the same final state. For a proper set $T$, we define $\Pr(T) = \sum_{\sigma \in T} \Pr(\sigma)$. If $T$ is also initial with common initial state $s$, we define the conditional probability $\Pr(T \mid s)$ analogously. Let $T_1$ be a proper final collection of finite traces and let $T_2$ be a proper initial collection of finite traces such that the final state of the traces in $T_1$ is the same as the initial state of the traces in $T_2$. Then the concatenation $T_1 \cdot T_2$ is defined by $T_1 \cdot T_2 = \{\sigma_1 \cdot \sigma_2 : \sigma_1 \in T_1 \text{ and } \sigma_2 \in T_2\}$. Notice that $T_1 \cdot T_2$ is proper, and if $T_1$ is initial, so is $T_1 \cdot T_2$.

III. REWARDS AND PERFORMANCE EVALUATION

For performance evaluation, the information contained in the states is used to define a number of rewards. A reward is a real-valued function on the state space, and the idea is that each time a certain state is visited, a reward as specified by the reward function is obtained [7].

Definition 1: A reward is a function $r$ from state space $S$ to $\mathbb{R}$, and for $s \in S$, $r(s)$ denotes the reward earned each time the Markov chain makes a transition into state $s$.

Definition 2: A reward function $c$ is called a conditional reward if $c(s) = 0$ or $c(s) = 1$ for all $s \in S$.
A state $s$ is called relevant (with respect to $c$) if $c(s) = 1$ and irrelevant (with respect to $c$) if $c(s) = 0$. Next we define a reward structure.

Definition 3: Let $\{X_n\}$ be a Markov chain defined by $(S, \mu, P)$ and let $R$ be a set of reward functions. Then $(S, \mu, P, R)$ is called a reward structure.

Many performance metrics can be expressed in terms of long-run average rewards or combinations thereof. A reward structure defines precisely those rewards required to express these performance metrics. For reward $r$ and Markov chain $\{X_n\}$, the long-run average $r$-reward is defined by the limiting behaviour of the random variable

$$\frac{1}{n} \sum_{k=1}^{n} r(X_k) \quad (3)$$

By the limit of a random sequence we mean almost-sure convergence [4] in this paper. Next, we define when the above limit exists.

Definition 4: A Markov chain is called ergodic if it has a positive recurrent state that can be reached from any state with probability one.

The notion of ergodicity considered in this paper is taken from [2], [7]. The definition of ergodicity implies that the chain $(S, \mu, P)$ has no two disjoint closed sets of states. Further, the (non-empty) collection $C$ of recurrent states is irreducible, each state in $C$ is positive recurrent and each state in $S \setminus C$ is transient. It can be shown that an ergodic chain has a unique equilibrium distribution $\pi = \{\pi(s) : s \in S\}$ with $\pi \in D(S)$. It further satisfies the following important ergodic theorem.

Theorem 1 (Ergodic Theorem): Let $\{X_n\}$ be an ergodic Markov chain defined by a reward structure $(S, \mu, P, R)$ and let $r \in R$ be a reward function. Then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} r(X_k) = \sum_{s \in S} \pi(s)\, r(s) \quad (4)$$

provided that $\sum_{s \in S} \pi(s)\, r(s)$ converges absolutely.
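For a finite ergodic chain, the right-hand side of (4) can be computed directly by solving the balance equations for the equilibrium distribution. The sketch below is illustrative only (assuming a finite state space and NumPy as the linear-algebra backend); it replaces one balance equation with the normalization constraint and solves the resulting system in the least-squares sense.

```python
import numpy as np

def equilibrium_distribution(P):
    """Equilibrium distribution of an ergodic finite chain: solve
    pi P = pi together with the normalization constraint sum(pi) = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # balance eqs + normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Small example: equilibrium is (0.8, 0.2), so by the ergodic theorem the
# long-run average reward is 0.8 * 1 + 0.2 * 6 = 2.0.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
reward = np.array([1.0, 6.0])
pi = equilibrium_distribution(P)
print(pi, pi @ reward)
```

For the huge state spaces of industrial models, such an exhaustive computation is exactly what is infeasible, which is why the paper turns to simulation.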
Proof: The proof is a generalization of the proof given in [2].

Definition 5: A reward structure $(S, \mu, P, R)$ is called complete if $(S, \mu, P)$ is ergodic and if for each $r \in R$, $\sum_{s \in S} \pi(s)\, r(s)$ converges absolutely, where $\pi$ is the unique equilibrium distribution of the chain.

Hence, in a complete reward structure, all long-run average rewards exist and can be computed using the ergodic theorem. We found that many performance metrics in practice can be expressed as conditional long-run average rewards of the form (see also [8])

$$\lim_{n \to \infty} \frac{\sum_{k=1}^{n} r(X_k)\, c(X_k)}{\sum_{k=1}^{n} c(X_k)} \quad (5)$$

where $r$ is any reward and $c$ is a conditional reward. The idea of the performance metric above is that reward $r$ is averaged precisely over those states that are relevant with respect to $c$. It is easy to prove that if $r \cdot c \in R$ and $c \in R$, then the performance metric in (5) converges almost surely to $\sum_{s} \pi(s)\, r(s)\, c(s) \,/\, \sum_{s} \pi(s)\, c(s)$ (if the denominator is different from 0).

IV. REDUCING MARKOV CHAINS

The conditional reward defined in (5) can be computed from the original Markov chain obtained from the POOSL model. In case of simulation, the estimation of (5) has to be updated in each simulation step, which is computationally expensive. Now, since the conditional long-run average reward (5) only depends on the relevant states in the initial chain, it might seem plausible that we only have to consider these relevant states in order to compute or estimate the metric. In this paper, we show that this is indeed the case. To this end, we reduce the original chain by leaving out all irrelevant states (with respect to $c$) and prove that the conditional long-run average $r$-reward in the original chain is the same as the long-run average $r$-reward in the reduced chain.

We now fix a complete reward structure $(S, \mu, P, R)$ and assume that it defines Markov chain $\{X_n\}$ based on probability triple $(\Omega, \mathcal{F}, \Pr)$. We further assume that $r \in R$ is a reward and $c \in R$ is a conditional reward. In this section we will define a reduced reward structure (with respect to $c$), which will be denoted by $(\hat{S}, \hat{\mu}, \hat{P}, \hat{R})$.
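A metric of the form (5) is estimated over a simulated trace by maintaining a running numerator and denominator, both updated at every step even though only the relevant states contribute. The following Python sketch (names illustrative, not from the paper) makes this concrete:

```python
import random

def conditional_long_run_average(P, r, c, start, steps, seed=1):
    """Estimate the conditional long-run average r-reward of metric (5):
    r is averaged only over the relevant states, i.e. those with c = 1."""
    rng = random.Random(seed)
    state, num, den = start, 0.0, 0.0
    for _ in range(steps):
        # Sample the next state from row P[state] of the transition matrix.
        u, acc = rng.random(), 0.0
        for nxt, p in enumerate(P[state]):
            acc += p
            if u < acc:
                state = nxt
                break
        if c[state]:
            num += r[state]
            den += 1.0
    return num / den

# Three-state chain; only states 0 and 2 are relevant (c = 1). From state 1
# the chain moves to 0 or 2 with equal probability, so the relevant visits
# are uniform over {0, 2} and the metric converges to (1 + 3) / 2 = 2.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
r = [1.0, 99.0, 3.0]  # the reward in the irrelevant state 1 never matters
c = [1, 0, 1]
print(conditional_long_run_average(P, r, c, start=0, steps=100_000))
```

Notice that the reward attached to the irrelevant state is immaterial to the result, which is what makes discarding such states plausible in the first place.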
The reduced state space $\hat{S}$ is defined by $\hat{S} = \{s \in S : c(s) = 1\}$, and $\hat{R}$ is the collection of rewards in $R$ restricted to the states in $\hat{S}$. To define the components $\hat{\mu}$ and $\hat{P}$, we first introduce the reduced Markov chain, denoted by $\{\hat{X}_n\}$. This reduced chain will be defined on the same probability triple $(\Omega, \mathcal{F}, \Pr)$ as the original chain. For each $n$ and infinite trace $\omega$, $\hat{X}_n(\omega)$ is defined as the $(n+1)$-th relevant state in $\omega$ if it exists, and is undefined otherwise.

Notice that $\{\hat{X}_n\}$ defined this way is not always a stochastic process. In general, $\hat{X}_n$ is not even a function. However, the following lemma states that if the original Markov chain has at least one positive recurrent relevant state, then $\{\hat{X}_n\}$ defines a proper stochastic process.

Lemma 1: Let the original Markov chain $\{X_n\}$ have at least one positive recurrent relevant state. Then $\{\hat{X}_n\}$ is a stochastic process.

Proof: Assume state $s$ is a positive recurrent relevant state. Then each infinite trace has infinitely many occurrences of state $s$ with probability one. Hence, if we discard a set of probability zero [2], each $\hat{X}_n$ is a complete $\hat{S}$-valued function. We further have to show that each $\hat{X}_n$ is a random variable. But this follows easily from the fact that each set $\{\hat{X}_n = s\}$ is a countable union of thin cylinders and hence measurable.

Next, we want to prove that the stochastic process $\{\hat{X}_n\}$ is in fact a Markov chain.

Theorem 2: Let the original Markov chain $\{X_n\}$ have at least one positive recurrent relevant state. Then $\{\hat{X}_n\}$ is a Markov chain.

To prove this theorem, we first introduce the following collections of finite traces. For $s, t \in \hat{S}$, we define:

- $I(s)$: the set of finite traces that end in $s$ and for which all preceding states are irrelevant;
- $T(s, t)$: the set of finite traces that start in $s$, end in $t$ and have no relevant states in between.

Notice that $T(s, t)$ is an initial and final proper set of traces and that $I(s)$ is a final proper set of traces. We are now able to prove Theorem 2.

Proof: We will first prove the Markovian property.
Similarly, it is not hard to show that conditioning on the full history of the reduced chain yields the same probability as conditioning on the last state only, and hence the Markovian property holds. The property of time homogeneity is proven in an analogous way.

From the previous proof, it follows immediately that the probability that the reduced chain transits from $s \in \hat{S}$ to $t \in \hat{S}$ is given by $\Pr(T(s, t) \mid s)$. Hence the reduced probability matrix $\hat{P}$ is defined by

$$\hat{P}(s, t) = \Pr(T(s, t) \mid s) \quad \text{for all } s, t \in \hat{S}$$

Further, the initial distribution $\hat{\mu}$ of the reduced chain is defined by

$$\hat{\mu}(s) = \Pr(I(s)) \quad \text{for all } s \in \hat{S}$$

So now we have defined the reduced reward structure $(\hat{S}, \hat{\mu}, \hat{P}, \hat{R})$, where $(\hat{S}, \hat{\mu}, \hat{P})$ defines the reduced Markov chain $\{\hat{X}_n\}$. In Subsection IV-A, we will show that the reduced chain is ergodic if the original chain has at least one positive recurrent relevant state. We further prove that the reduced reward structure is in fact complete. In Subsection IV-B, we prove the preservation of long-run averages. In Subsection IV-C, a method to effectively construct the reduced chain is given. Finally, an example is given in Subsection IV-D.

A. Preservation of Ergodicity and Completeness

For the preservation of ergodicity, we need to prove the following theorem.

Theorem 3: If the original chain $\{X_n\}$ is ergodic and has a positive recurrent relevant state, then the reduced Markov chain $\{\hat{X}_n\}$ is ergodic.

Proof: Assume that the original chain is ergodic and let $s$ be a positive recurrent relevant state. We will show that $s$ is a positive recurrent state that is reached from any initial state in the reduced chain with probability one. Let $t \in \hat{S}$ be any initial state of the reduced chain and define $A$ as the set of finite traces that start in $t$, end in $s$ and have no other occurrences of $s$ than the final state. We have to prove that $\Pr(A \mid t) = 1$. But this follows immediately from the fact that $s$ is reached with probability one in the original chain. To show that $s$ is a recurrent state, define $B$ as the set of finite traces, consisting of at least two states, that start in $s$, end in $s$ and have no other occurrences of state $s$ in between.
Since $s$ is a recurrent state in the original chain, $\Pr(B \mid s) = 1$, and hence $s$ is a recurrent state in the reduced chain. To prove that $s$ is positive recurrent, we need to show that the mean return time to $s$ is finite. To this end, define for trace $\sigma \in B$, $l(\sigma)$ as the number of states in $\sigma$ minus one and $\hat{l}(\sigma)$ as the number of relevant states in $\sigma$ minus one. We have to show that

$$\sum_{\sigma \in B} \hat{l}(\sigma)\, \Pr(\sigma \mid s) < \infty \quad (6)$$

But since $s$ is a positive recurrent state in the original chain, we clearly have

$$\sum_{\sigma \in B} \hat{l}(\sigma)\, \Pr(\sigma \mid s) \;\leq\; \sum_{\sigma \in B} l(\sigma)\, \Pr(\sigma \mid s) \;<\; \infty \quad (7)$$

from which the result follows.

For the preservation of completeness, we need to prove the following theorem.

Theorem 4: If the original complete reward structure $(S, \mu, P, R)$ has a positive recurrent relevant state, then the reduced reward structure $(\hat{S}, \hat{\mu}, \hat{P}, \hat{R})$ is complete.

Proof: By the previous theorem we already know that the reduced chain is ergodic, so that the unique equilibrium distribution $\hat{\pi} \in D(\hat{S})$ exists. We further have to show that for each $r \in \hat{R}$ the following series converges absolutely:

$$\sum_{s \in \hat{S}} \hat{\pi}(s)\, r(s) \quad (8)$$

By [2] this is equivalent to saying that the expected cumulative absolute reward earned during one cycle through a positive recurrent relevant state is finite. By an argument similar to that in (7) it can be verified that this condition is indeed satisfied.

B. Preservation of Long-run Averages

For the preservation of long-run averages, we need to prove the following main result.

Theorem 5: Assume the original reward structure $(S, \mu, P, R)$ is complete and has a positive recurrent relevant state. Then

$$\lim_{n \to \infty} \frac{\sum_{k=1}^{n} r(X_k)\, c(X_k)}{\sum_{k=1}^{n} c(X_k)} \;=\; \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} r(\hat{X}_k)$$

Proof: For any constant $\alpha$, define $E_\alpha$ as the set of infinite traces for which the conditional average on the left converges to $\alpha$, and $\hat{E}_\alpha$ as the set of infinite traces for which the average on the right converges to $\alpha$. For the only-if part we have to prove that
membership in $E_\alpha$ implies membership in $\hat{E}_\alpha$, up to a set of probability zero. To this end, assume $\omega \in E_\alpha$. By the definition of $\hat{X}$ (the successive relevant states in $\omega$), the partial averages over the relevant states of $\omega$ coincide with the partial averages of the reduced chain, so $\omega \in \hat{E}_\alpha$. Hence $\Pr(E_\alpha) \leq \Pr(\hat{E}_\alpha)$. The if part is proven in a similar way.

Notice that in case the condition of the theorem is satisfied, both limits converge almost surely, where the denominator in the left-hand limit is different from 0. The next corollary establishes the relation between the equilibrium distribution of the original chain and that of the reduced one.

Corollary 1: Assume that the original Markov chain has a positive recurrent relevant state. Let $\pi$ and $\hat{\pi}$ denote the equilibrium distributions of $\{X_n\}$ and $\{\hat{X}_n\}$ respectively. Then for each $s \in \hat{S}$

$$\hat{\pi}(s) = \frac{\pi(s)}{\sum_{t \in \hat{S}} \pi(t)} \quad (9)$$

Proof: Let $s \in \hat{S}$ be any relevant state. We define a reward $r_s$ as the indicator function of state $s$, i.e. yielding 1 in state $s$ and 0 in any other state in $S$. Now although $r_s$ is not necessarily in $R$, it is easy to see that Theorem 5 also holds for this reward. Hence, by the ergodic theorem, the long-run average $r_s$-reward in the reduced chain equals $\hat{\pi}(s)$. Now by observing that the corresponding conditional long-run average in the original chain equals $\pi(s) / \sum_{t \in \hat{S}} \pi(t)$, the result follows straightforwardly.

C. Construction Method

Although the reduced reward structure has been defined already in the beginning of Section IV, this definition is not constructive. Computing the transition probability from relevant state $s$ to relevant state $t$ as the conditional probability of set $T(s, t)$, or computing the initial probability of relevant state $s$ as the probability of set $I(s)$, is a laborious exercise. In this subsection, we introduce a method for computing the transition probabilities by solving a set of linear equations. The initial probabilities can then be computed immediately from the initial distribution of the original chain and the probability matrix of the reduced chain.

Theorem 6: Assume the original complete reward structure has a positive recurrent relevant state.
Extend the reduced probability matrix $\hat{P}$ to $S \times \hat{S}$ by defining, for all $u \in S$ and $t \in \hat{S}$, $\hat{P}(u, t)$ as the probability that the first relevant state visited after leaving $u$ is $t$. Then the elements of $\hat{P}$ satisfy the following set of linear equations:

$$\hat{P}(u, t) = P(u, t) + \sum_{v \in S \setminus \hat{S}} P(u, v)\, \hat{P}(v, t)$$

Proof: It is easily verified that for $u \in S$ and $t \in \hat{S}$, the set of traces defining $\hat{P}(u, t)$ decomposes into the direct transitions into $t$ and the transitions into some irrelevant state $v \in S \setminus \hat{S}$ followed by the traces from $v$ whose first relevant state is $t$. By working out the conditional probability of this decomposition, the result follows.

We have strong evidence that the system of linear equations has a unique solution, but more research is necessary to establish a formal proof.

Theorem 7: Assume the original complete reward structure has a positive recurrent relevant state. Then the elements of $\hat{\mu}$ satisfy the following equations:

$$\hat{\mu}(t) = \mu(t) + \sum_{v \in S \setminus \hat{S}} \mu(v)\, \hat{P}(v, t)$$

Proof: The proof is similar to the proof of Theorem 6.

D. Example

We give an example to illustrate what we have discussed so far. Suppose a POOSL model defines a Markov chain with transition probabilities as given in Fig. 1(a). The real-valued reward is indicated in each state. Relevant states are indicated with a black dot. We assume the initial distribution shown in the figure. Using the ergodic theorem, we can calculate the equilibrium distribution. Using the construction method, the reduced Markov chain shown in Fig. 1(b) is obtained: by Theorem 6 the reduced transition probabilities are computed, and by Theorem 7 the reduced initial distribution is calculated. Using the ergodic theorem, or by applying Corollary 1, the reduced equilibrium distribution can then be obtained easily. The reader may want to check that the conditional long-run average $r$-reward in the original Markov chain is the same as the long-run average $r$-reward in the reduced Markov chain, as is stated by Theorem 5. Both long-run averages equal 2. It is clear that in case of simulation, the reduced Markov chain is preferred above the original Markov chain to estimate the metric.
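For a finite chain, the linear equations of Theorem 6 can be solved in one shot by blocking the transition matrix by relevant and irrelevant states. The sketch below is an illustrative implementation under that finiteness assumption (the names `reduce_chain` and `equilibrium` are ours, not the paper's); it also checks Corollary 1 numerically on a small chain.

```python
import numpy as np

def reduce_chain(P, relevant):
    """Reduced transition matrix over the relevant states, in the spirit of
    Theorem 6. For the irrelevant states, the probabilities H of first
    hitting each relevant state satisfy the linear system (I - Q) H = R,
    where Q and R are the irrelevant-to-irrelevant and irrelevant-to-relevant
    blocks of P."""
    P = np.asarray(P, dtype=float)
    rel = [s for s in range(len(P)) if relevant[s]]
    irr = [s for s in range(len(P)) if not relevant[s]]
    Q = P[np.ix_(irr, irr)]
    R = P[np.ix_(irr, rel)]
    H = np.linalg.solve(np.eye(len(irr)) - Q, R)
    # Reduced probability: a direct jump to a relevant state, or a jump into
    # the irrelevant part followed by first hitting that relevant state.
    return P[np.ix_(rel, rel)] + P[np.ix_(rel, irr)] @ H

def equilibrium(P):
    """Equilibrium distribution: solve pi P = pi with sum(pi) = 1."""
    n = len(P)
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Three-state chain; states 0 and 2 are relevant, state 1 is irrelevant.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
relevant = [True, False, True]
P_hat = reduce_chain(P, relevant)       # [[0.5, 0.5], [0.5, 0.5]]
pi = equilibrium(np.asarray(P, float))  # [0.25, 0.5, 0.25]
pi_hat = equilibrium(P_hat)             # [0.5, 0.5]
# Corollary 1: pi_hat equals pi restricted to the relevant states, renormalized.
print(P_hat)
print(pi_hat, pi[[0, 2]] / pi[[0, 2]].sum())
```

For the huge chains the paper targets, this dense solve is of course only conceptual; the point of the paper is that the reduced chain can also be observed directly along a simulated trace.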
Fig. 1. (a) Original Markov chain and (b) reduced Markov chain.

Notice that in this case the original chain can be used to generate a trace for analysis; it is not necessary to first construct the reduced chain explicitly. (Notice that the reduced chain is defined using the $\sigma$-algebra of the original chain.) During the analysis of this trace, only the relevant states are observed. Since in practice most states will be irrelevant for the performance metric in question, the reduction technique is expected to result in a huge increase in simulation performance. To be able to apply simulation of the reduced chain, the central limit theorem for Markov chains should hold. It is expected that the construction method preserves the conditions for this limit theorem to hold, but this will be a topic for future research.

V. CONCLUSIONS

In this paper, we introduced a reduction method for Markov chains. The reduction consists of leaving out all states not satisfying the update condition specified by some conditional reward. We have proved that the reduction yields another Markov chain and that it preserves the property of ergodicity. We further showed that every conditional long-run average reward in the original chain is equal to the long-run average reward in the reduced chain. We gave a method to effectively construct the reduced Markov chain.

The presented reduction method is especially useful in the area of simulation and performance estimation. It ensures that performance estimations only have to be updated in those states that satisfy the update condition specified by some conditional reward. Since in practice most states will not satisfy this condition, the reduction technique is expected to result in a huge increase in simulation performance.

REFERENCES

[1] P. Billingsley. Probability and Measure. Wiley, New York.
[2] K.L. Chung. Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin.
[3] M.C.W. Geilen and J.P.M. Voeten.
Real-time Concepts for a Formal Specification Language for Software/Hardware Systems. In: J.P. Veen, Ed., Proceedings of ProRISC 97. Utrecht (The Netherlands): STW Technology Foundation.
[4] A.F. Karr. Probability. Springer-Verlag, New York.
[5] G. Lopez. Modelisation, Simulation et Verification d'un Protocole de Telecommunication. Graduation Report. Eindhoven University of Technology, Eindhoven (The Netherlands).
[6] P.H.A. van der Putten and J.P.M. Voeten. Specification of Reactive Hardware/Software Systems. Ph.D. Thesis. Eindhoven University of Technology, Eindhoven (The Netherlands).
[7] H.C. Tijms. Stochastic Models: An Algorithmic Approach. John Wiley & Sons, Chichester, England.
[8] B.D. Theelen, J.P.M. Voeten and Y. Pribadi. Accuracy Analysis of Long-Run Average Performance Metrics. In: Proceedings of PROGRESS 01. Utrecht (The Netherlands): STW Technology Foundation.
[9] B.D. Theelen, J.P.M. Voeten, L.J. van Bokhoven, G.G. de Jong, A.M.M. Niemegeers, P.H.A. van der Putten, M.P.J. Stevens and J.C.M. Baeten. System-level Modeling and Performance Analysis. In: J.P. Veen, Ed., Proceedings of PROGRESS 00. Utrecht (The Netherlands): STW Technology Foundation.
[10] J.P.M. Voeten, P.H.A. van der Putten and M.P.J. Stevens. Systematic Development of Industrial Control Systems Using Software/Hardware Engineering. In: P. Milligan and P. Corr, Eds., Proc. of EUROMICRO 97. Los Alamitos, California (U.S.A.): IEEE.
[11] J.P.M. Voeten. Performance Evaluation with Temporal Rewards. To be published in: Journal of Performance Evaluation.
Markov Decision Processes with Multiple Long-run Average Objectives Krishnendu Chatterjee UC Berkeley c krish@eecs.berkeley.edu Abstract. We consider Markov decision processes (MDPs) with multiple long-run
More informationMarkov Chains. X(t) is a Markov Process if, for arbitrary times t 1 < t 2 <... < t k < t k+1. If X(t) is discrete-valued. If X(t) is continuous-valued
Markov Chains X(t) is a Markov Process if, for arbitrary times t 1 < t 2
More informationEmbedded systems specification and design
Embedded systems specification and design David Kendall David Kendall Embedded systems specification and design 1 / 21 Introduction Finite state machines (FSM) FSMs and Labelled Transition Systems FSMs
More informationMarkov Chains (Part 3)
Markov Chains (Part 3) State Classification Markov Chains - State Classification Accessibility State j is accessible from state i if p ij (n) > for some n>=, meaning that starting at state i, there is
More informationStochastic Models: Markov Chains and their Generalizations
Scuola di Dottorato in Scienza ed Alta Tecnologia Dottorato in Informatica Universita di Torino Stochastic Models: Markov Chains and their Generalizations Gianfranco Balbo e Andras Horvath Outline Introduction
More informationSIMILAR MARKOV CHAINS
SIMILAR MARKOV CHAINS by Phil Pollett The University of Queensland MAIN REFERENCES Convergence of Markov transition probabilities and their spectral properties 1. Vere-Jones, D. Geometric ergodicity in
More informationThe matrix approach for abstract argumentation frameworks
The matrix approach for abstract argumentation frameworks Claudette CAYROL, Yuming XU IRIT Report RR- -2015-01- -FR February 2015 Abstract The matrices and the operation of dual interchange are introduced
More informationProbabilistic Juggling
Probabilistic Juggling Arvind Ayyer (joint work with Jeremie Bouttier, Sylvie Corteel and Francois Nunzi) arxiv:1402:3752 Conference on STOCHASTIC SYSTEMS AND APPLICATIONS (aka Borkarfest) Indian Institute
More informationA Scenario-Aware Data Flow Model for Combined Long-Run Average and Worst-Case Performance Analysis
A Scenario-Aware Data Flow Model for Combined Long-Run Average and Worst-Case Performance Analysis B.D. Theelen 1, M.C.W. Geilen 1, T. Basten 1, J.P.M. Voeten 1,2, S.V. Gheorghita 1 and S. Stuijk 1 1 Eindhoven
More informationPropositional and Predicate Logic - V
Propositional and Predicate Logic - V Petr Gregor KTIML MFF UK WS 2016/2017 Petr Gregor (KTIML MFF UK) Propositional and Predicate Logic - V WS 2016/2017 1 / 21 Formal proof systems Hilbert s calculus
More informationSimilarities between Timing Constraints
Similarities between Timing Constraints Towards Interchangeable Constraint Models for Real-World Software Systems yyu8@iit.edu Similarities between Timing Constraints p. 1/11 Background Software for real-world
More informationUsing Markov Chains To Model Human Migration in a Network Equilibrium Framework
Using Markov Chains To Model Human Migration in a Network Equilibrium Framework Jie Pan Department of Mathematics and Computer Science Saint Joseph s University Philadelphia, PA 19131 Anna Nagurney School
More information8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains
8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8.1 Review 8.2 Statistical Equilibrium 8.3 Two-State Markov Chain 8.4 Existence of P ( ) 8.5 Classification of States
More informationStochastic Processes (Week 6)
Stochastic Processes (Week 6) October 30th, 2014 1 Discrete-time Finite Markov Chains 2 Countable Markov Chains 3 Continuous-Time Markov Chains 3.1 Poisson Process 3.2 Finite State Space 3.2.1 Kolmogrov
More informationComputing Consecutive-Type Reliabilities Non-Recursively
IEEE TRANSACTIONS ON RELIABILITY, VOL. 52, NO. 3, SEPTEMBER 2003 367 Computing Consecutive-Type Reliabilities Non-Recursively Galit Shmueli Abstract The reliability of consecutive-type systems has been
More informationOn Equilibria of Distributed Message-Passing Games
On Equilibria of Distributed Message-Passing Games Concetta Pilotto and K. Mani Chandy California Institute of Technology, Computer Science Department 1200 E. California Blvd. MC 256-80 Pasadena, US {pilotto,mani}@cs.caltech.edu
More informationMATH 56A SPRING 2008 STOCHASTIC PROCESSES 65
MATH 56A SPRING 2008 STOCHASTIC PROCESSES 65 2.2.5. proof of extinction lemma. The proof of Lemma 2.3 is just like the proof of the lemma I did on Wednesday. It goes like this. Suppose that â is the smallest
More information5 Set Operations, Functions, and Counting
5 Set Operations, Functions, and Counting Let N denote the positive integers, N 0 := N {0} be the non-negative integers and Z = N 0 ( N) the positive and negative integers including 0, Q the rational numbers,
More information1 Completeness Theorem for First Order Logic
1 Completeness Theorem for First Order Logic There are many proofs of the Completeness Theorem for First Order Logic. We follow here a version of Henkin s proof, as presented in the Handbook of Mathematical
More informationNonparametric inference for ergodic, stationary time series.
G. Morvai, S. Yakowitz, and L. Györfi: Nonparametric inference for ergodic, stationary time series. Ann. Statist. 24 (1996), no. 1, 370 379. Abstract The setting is a stationary, ergodic time series. The
More informationModel Checking. Temporal Logic. Fifth International Symposium in Programming, volume. of concurrent systems in CESAR. In Proceedings of the
Sérgio Campos, Edmund Why? Advantages: No proofs Fast Counter-examples No problem with partial specifications can easily express many concurrency properties Main Disadvantage: State Explosion Problem Too
More informationLimiting Behavior of Markov Chains with Eager Attractors
Limiting Behavior of Markov Chains with Eager Attractors Parosh Aziz Abdulla Uppsala University, Sweden. parosh@it.uu.se Noomene Ben Henda Uppsala University, Sweden. Noomene.BenHenda@it.uu.se Sven Sandberg
More informationQuantitative Verification
Quantitative Verification Chapter 3: Markov chains Jan Křetínský Technical University of Munich Winter 207/8 / 84 Motivation 2 / 84 Example: Simulation of a die by coins Knuth & Yao die Simulating a Fair
More informationComputer-Aided Program Design
Computer-Aided Program Design Spring 2015, Rice University Unit 3 Swarat Chaudhuri February 5, 2015 Temporal logic Propositional logic is a good language for describing properties of program states. However,
More informationAN EXPLORATION OF THE METRIZABILITY OF TOPOLOGICAL SPACES
AN EXPLORATION OF THE METRIZABILITY OF TOPOLOGICAL SPACES DUSTIN HEDMARK Abstract. A study of the conditions under which a topological space is metrizable, concluding with a proof of the Nagata Smirnov
More informationPredictable real-time software synthesis
Real-Time Syst (2007) 36: 159 198 DOI 10.1007/s11241-007-9013-6 Predictable real-time software synthesis Jinfeng Huang Jeroen Voeten Henk Corporaal Published online: 28 March 2007 Springer Science+Business
More informationSTA 624 Practice Exam 2 Applied Stochastic Processes Spring, 2008
Name STA 624 Practice Exam 2 Applied Stochastic Processes Spring, 2008 There are five questions on this test. DO use calculators if you need them. And then a miracle occurs is not a valid answer. There
More informationBias-Variance Error Bounds for Temporal Difference Updates
Bias-Variance Bounds for Temporal Difference Updates Michael Kearns AT&T Labs mkearns@research.att.com Satinder Singh AT&T Labs baveja@research.att.com Abstract We give the first rigorous upper bounds
More informationNon-homogeneous random walks on a semi-infinite strip
Non-homogeneous random walks on a semi-infinite strip Chak Hei Lo Joint work with Andrew R. Wade World Congress in Probability and Statistics 11th July, 2016 Outline Motivation: Lamperti s problem Our
More informationDefinition A finite Markov chain is a memoryless homogeneous discrete stochastic process with a finite number of states.
Chapter 8 Finite Markov Chains A discrete system is characterized by a set V of states and transitions between the states. V is referred to as the state space. We think of the transitions as occurring
More informationEstimation and Approximation Bounds for Gradient-Based Reinforcement Learning
Estimation and pproximation Bounds for radient-based Reinforcement Learning eter L. Bartlett and onathan Baxter Research School of Information Sciences and Engineering ustralian National University Canberra
More informationOn Finding Optimal Policies for Markovian Decision Processes Using Simulation
On Finding Optimal Policies for Markovian Decision Processes Using Simulation Apostolos N. Burnetas Case Western Reserve University Michael N. Katehakis Rutgers University February 1995 Abstract A simulation
More informationVerification of Probabilistic Systems with Faulty Communication
Verification of Probabilistic Systems with Faulty Communication P. A. Abdulla 1, N. Bertrand 2, A. Rabinovich 3, and Ph. Schnoebelen 2 1 Uppsala University, Sweden 2 LSV, ENS de Cachan, France 3 Tel Aviv
More informationMarkovian Description of Irreversible Processes and the Time Randomization (*).
Markovian Description of Irreversible Processes and the Time Randomization (*). A. TRZĘSOWSKI and S. PIEKARSKI Institute of Fundamental Technological Research, Polish Academy of Sciences ul. Świętokrzyska
More informationDiscrete time Markov chains. Discrete Time Markov Chains, Limiting. Limiting Distribution and Classification. Regular Transition Probability Matrices
Discrete time Markov chains Discrete Time Markov Chains, Limiting Distribution and Classification DTU Informatics 02407 Stochastic Processes 3, September 9 207 Today: Discrete time Markov chains - invariant
More informationNew Complexity Results for Some Linear Counting Problems Using Minimal Solutions to Linear Diophantine Equations
New Complexity Results for Some Linear Counting Problems Using Minimal Solutions to Linear Diophantine Equations (Extended Abstract) Gaoyan Xie, Cheng Li and Zhe Dang School of Electrical Engineering and
More informationProbabilistic Model Checking and Strategy Synthesis for Robot Navigation
Probabilistic Model Checking and Strategy Synthesis for Robot Navigation Dave Parker University of Birmingham (joint work with Bruno Lacerda, Nick Hawes) AIMS CDT, Oxford, May 2015 Overview Probabilistic
More informationRandom Times and Their Properties
Chapter 6 Random Times and Their Properties Section 6.1 recalls the definition of a filtration (a growing collection of σ-fields) and of stopping times (basically, measurable random times). Section 6.2
More informationA comment on Boucherie product-form results
A comment on Boucherie product-form results Andrea Marin Dipartimento di Informatica Università Ca Foscari di Venezia Via Torino 155, 30172 Venezia Mestre, Italy {balsamo,marin}@dsi.unive.it Abstract.
More informationInformation Theory and Statistics Lecture 3: Stationary ergodic processes
Information Theory and Statistics Lecture 3: Stationary ergodic processes Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Measurable space Definition (measurable space) Measurable space
More informationLIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974
LIMITS FOR QUEUES AS THE WAITING ROOM GROWS by Daniel P. Heyman Ward Whitt Bell Communications Research AT&T Bell Laboratories Red Bank, NJ 07701 Murray Hill, NJ 07974 May 11, 1988 ABSTRACT We study the
More informationJónsson posets and unary Jónsson algebras
Jónsson posets and unary Jónsson algebras Keith A. Kearnes and Greg Oman Abstract. We show that if P is an infinite poset whose proper order ideals have cardinality strictly less than P, and κ is a cardinal
More informationMarkov Chains Handout for Stat 110
Markov Chains Handout for Stat 0 Prof. Joe Blitzstein (Harvard Statistics Department) Introduction Markov chains were first introduced in 906 by Andrey Markov, with the goal of showing that the Law of
More informationPermutation Excess Entropy and Mutual Information between the Past and Future
Permutation Excess Entropy and Mutual Information between the Past and Future Taichi Haruna 1, Kohei Nakajima 2 1 Department of Earth & Planetary Sciences, Graduate School of Science, Kobe University,
More informationStochastic Processes. Theory for Applications. Robert G. Gallager CAMBRIDGE UNIVERSITY PRESS
Stochastic Processes Theory for Applications Robert G. Gallager CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv Swgg&sfzoMj ybr zmjfr%cforj owf fmdy xix Acknowledgements xxi 1 Introduction and review
More information1 Basic concepts from probability theory
Basic concepts from probability theory This chapter is devoted to some basic concepts from probability theory.. Random variable Random variables are denoted by capitals, X, Y, etc. The expected value or
More informationBasic Definitions: Indexed Collections and Random Functions
Chapter 1 Basic Definitions: Indexed Collections and Random Functions Section 1.1 introduces stochastic processes as indexed collections of random variables. Section 1.2 builds the necessary machinery
More informationS-adic sequences A bridge between dynamics, arithmetic, and geometry
S-adic sequences A bridge between dynamics, arithmetic, and geometry J. M. Thuswaldner (joint work with P. Arnoux, V. Berthé, M. Minervino, and W. Steiner) Marseille, November 2017 PART 3 S-adic Rauzy
More informationStability of the two queue system
Stability of the two queue system Iain M. MacPhee and Lisa J. Müller University of Durham Department of Mathematical Science Durham, DH1 3LE, UK (e-mail: i.m.macphee@durham.ac.uk, l.j.muller@durham.ac.uk)
More informationFuzzy Limits of Functions
Fuzzy Limits of Functions Mark Burgin Department of Mathematics University of California, Los Angeles 405 Hilgard Ave. Los Angeles, CA 90095 Abstract The goal of this work is to introduce and study fuzzy
More informationMath 117: Topology of the Real Numbers
Math 117: Topology of the Real Numbers John Douglas Moore November 10, 2008 The goal of these notes is to highlight the most important topics presented in Chapter 3 of the text [1] and to provide a few
More informationLecture 7. We can regard (p(i, j)) as defining a (maybe infinite) matrix P. Then a basic fact is
MARKOV CHAINS What I will talk about in class is pretty close to Durrett Chapter 5 sections 1-5. We stick to the countable state case, except where otherwise mentioned. Lecture 7. We can regard (p(i, j))
More informationarxiv: v1 [math.pr] 26 Mar 2008
arxiv:0803.3679v1 [math.pr] 26 Mar 2008 The game-theoretic martingales behind the zero-one laws Akimichi Takemura 1 takemura@stat.t.u-tokyo.ac.jp, http://www.e.u-tokyo.ac.jp/ takemura Vladimir Vovk 2 vovk@cs.rhul.ac.uk,
More informationDISCRETE STOCHASTIC PROCESSES Draft of 2nd Edition
DISCRETE STOCHASTIC PROCESSES Draft of 2nd Edition R. G. Gallager January 31, 2011 i ii Preface These notes are a draft of a major rewrite of a text [9] of the same name. The notes and the text are outgrowths
More informationPRISM: Probabilistic Model Checking for Performance and Reliability Analysis
PRISM: Probabilistic Model Checking for Performance and Reliability Analysis Marta Kwiatkowska, Gethin Norman and David Parker Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford,
More informationOutlines. Discrete Time Markov Chain (DTMC) Continuous Time Markov Chain (CTMC)
Markov Chains (2) Outlines Discrete Time Markov Chain (DTMC) Continuous Time Markov Chain (CTMC) 2 pj ( n) denotes the pmf of the random variable p ( n) P( X j) j We will only be concerned with homogenous
More informationTreball final de grau GRAU DE MATEMÀTIQUES Facultat de Matemàtiques Universitat de Barcelona MARKOV CHAINS
Treball final de grau GRAU DE MATEMÀTIQUES Facultat de Matemàtiques Universitat de Barcelona MARKOV CHAINS Autor: Anna Areny Satorra Director: Dr. David Márquez Carreras Realitzat a: Departament de probabilitat,
More informationConstruction of a general measure structure
Chapter 4 Construction of a general measure structure We turn to the development of general measure theory. The ingredients are a set describing the universe of points, a class of measurable subsets along
More informationBounds on Convergence of Entropy Rate Approximations in Hidden Markov Processes
Bounds on Convergence of Entropy Rate Approximations in Hidden Markov Processes By Nicholas F. Travers B.S. Haverford College 2005 DISSERTATION Submitted in partial satisfaction of the requirements for
More information