
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 57, NO. 2, FEBRUARY 2009

An EM Algorithm for Markov Modulated Markov Processes

Yariv Ephraim, Fellow, IEEE, and William J. J. Roberts, Senior Member, IEEE

Abstract—An expectation-maximization (EM) algorithm for estimating the parameter of a Markov modulated Markov process in the maximum-likelihood sense is developed. This is a doubly stochastic random process with an underlying continuous-time finite-state homogeneous Markov chain. Conditioned on that chain, the observable process is a continuous-time finite-state nonhomogeneous Markov chain. The generator of the observable process at any given time is determined by the state of the underlying Markov chain at that time. The parameter of the process comprises the set of generators for the underlying and conditional Markov chains. The proposed approach generalizes an earlier approach by Rydén for estimating the parameter of a Markov modulated Poisson process.

Index Terms—Expectation-maximization (EM) algorithm, Markov modulated Markov processes, Markov modulated Poisson processes.

I. INTRODUCTION

A Markov modulated Markov process (MMMP) is a doubly stochastic random process comprising an observable process and an underlying hidden process. The underlying process is a continuous-time finite-state homogeneous Markov chain. Conditioned on that chain, the observable process is a continuous-time finite-state nonhomogeneous Markov chain. The generator of the observable process at any given time is determined by the state of the underlying Markov chain at that time. Thus, the number of generators characterizing the observable process equals the number of states of the underlying Markov chain. The parameter of the MMMP comprises the set of generators of the underlying and conditional Markov chains.
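For concreteness, the process just defined can be simulated by competing exponential clocks: while the pair (X, Y) sits in a joint state, X leaves at its generator rate and Y leaves at the rate of the generator selected by the current state of X. The sketch below is ours, not from the paper; all function and variable names are illustrative, and generators are plain nested lists.

```python
import random

def _next_state(row, cur, rng):
    """Draw the next state from the off-diagonal rates of one generator row."""
    rate = -row[cur]
    r, acc = rng.random() * rate, 0.0
    for j, v in enumerate(row):
        if j != cur:
            acc += v
            if r < acc:
                return j
    return cur  # numerical guard for accumulated round-off

def simulate_mmmp(Q, G, x0, y0, t_end, seed=0):
    """Simulate a sample path of the bivariate Markov process (X, Y):
    X is the hidden chain with generator Q, and Y is the observable chain
    whose generator G[i] is selected by the current state i of X.
    Returns the jump records (time, x, y)."""
    rng = random.Random(seed)
    t, x, y = 0.0, x0, y0
    path = [(t, x, y)]
    while True:
        # competing exponential clocks: X leaves x at rate -Q[x][x],
        # Y leaves y at rate -G[x][y][y]
        qx, gy = -Q[x][x], -G[x][y][y]
        t += rng.expovariate(qx + gy)
        if t >= t_end:
            return path
        if rng.random() < qx / (qx + gy):
            x = _next_state(Q[x], x, rng)     # hidden chain jumps
        else:
            y = _next_state(G[x][y], y, rng)  # observable chain jumps
        path.append((t, x, y))
```

Because only Y is recorded in practice, dropping the x component of each record yields exactly the kind of observable sample path the estimation problem starts from.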
The goal of this paper is to develop an expectation-maximization (EM) algorithm for maximum-likelihood (ML) estimation of the parameter of an MMMP from a given sample path of the observable process. MMMPs generalize the more familiar Markov modulated Poisson processes. A Markov modulated Poisson process is a nonhomogeneous conditional Poisson process whose rate at any given time is determined by the state of an underlying continuous-time finite-state homogeneous Markov chain [11]. The generalization is in the sense that increments of the conditional Markov chain are not independent, as is the case with increments of the conditional Poisson process. Similarly to Markov modulated Poisson processes [22], [23], an MMMP may also be interpreted as a hidden Markov process comprising a discrete-time Markov chain and a sequence of conditionally independent observations. A survey of hidden Markov processes may be found in [10]. An MMMP may be viewed as a Markov renewal process whose dwell time in each state is not necessarily exponentially distributed [6]. In fact, that dwell time is a finite mixture of exponential densities, as is demonstrated in Section II by a simple example. Such finite mixtures may be used to approximate arbitrarily closely, in the Kullback-Leibler sense, any desirable dwell-time density in the family of continuous mixture densities [17]. Hence, MMMPs may be useful in hidden Markov modeling where explicit durational models are necessary.

Manuscript received November 21, 2007; revised September 16, 2008. First published October 31, 2008; current version published January 30, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Subhrakanti Dey. Y. Ephraim is with the Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA (yephraim@gmu.edu). W. J. J. Roberts is with Atlantic Coast Technologies, Inc., Silver Spring, MD, USA (robertsb@acoast.com).
Examples include modeling of speech signals in automatic speech recognition applications [16], and hidden Markov modeling of hydrothermal flow episodes [27]. See also Yu and Kobayashi [31] and the references therein. MMMPs have been used in ion channel modeling as early as 1994; see, e.g., Ball, Milne, and Yeo [3], and subsequent papers by some of these authors. MMMPs have also been proposed for evaluating the likelihood of DNA sequences in molecular phylogenetics (see, e.g., [5] and the references therein). In both cases, no efficient algorithm for parameter estimation was proposed. We hope that the approach presented here will be useful for these important applications.

Our approach for estimating the parameter of an MMMP was inspired by an EM algorithm developed by Rydén [24] for Markov modulated Poisson processes, and by an earlier EM approach developed by Asmussen and Nerman for estimating the parameter of a phase-type distribution; see Asmussen, Nerman, and Olsson [2]. This distribution is that of a Markov chain with transient states and a single absorbing state; of interest is the absorption time of the chain. The approach developed here uses the classical EM formulation of Dempster, Laird, and Rubin [8] for estimating the parameter of a process from a sequence of observations. This becomes possible, for example, when a continuous-time Markov chain is seen as a sequence of conditionally independent exponentially distributed random variables representing the dwell times of the chain in each state, together with a discrete-time Markov chain representing state transitions at the jump times of the continuous-time chain. The likelihood function of a Markov chain represented in this way was derived independently by Billingsley [4] and Albert [1]. The complete data in this approach includes the entire sample path of the hidden chain, and the resulting E-step and M-step of the algorithm are explicit. The E-step is implemented using forward-backward

recursions similar to those associated with Markov modulated Poisson processes [24], [20]. An alternative approach to the same parameter estimation problem was taken by Elliott, Aggoun, and Moore [9]. They used the EM approach for continuous-time processes of Dembo and Zeitouni [7], [32], and developed the recursive estimators required to implement the E-step. These estimators were derived using transformation of measures and the generalized Bayes rule, or the Kallianpur-Striebel formula [18, Lemma 7.4]. The recursions are given in terms of stochastic integrals, as is usually the case with nonlinear estimation of continuous-time signals. Contrary to the EM approach of Rydén [24], the approach in [9] uses only forward recursions. Implementation of the approach in [9], however, requires solution of stochastic differential equations. We are not aware of any implementation of the approach in [9]. In Ledoux [15], the two approaches were developed and compared for the related Markovian arrival processes (MAPs) with hidden states; the filter-based approach of [9] was shown to require substantially more computation than the forward-backward approach of [24]. The application of Rydén's EM approach to MAPs and their batch extension was originally developed by Klemm, Lindemann, and Lohmann [14]. In yet another related work [29], the forward-backward EM algorithm was applied to estimate the generator of a vector of two continuous-time Markov chains, of which one is hidden. In [29], the transition matrix of the joint process was first estimated from sampled data. Then, the generator was estimated as the principal logarithm of the transition matrix estimate normalized by the sampling period. A thorough discussion of this approach was recently given in [21].
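The complete-data representation described above, dwell times plus an embedded jump chain, makes the likelihood of a fully observed continuous-time chain explicit in the Billingsley/Albert form: a log-rate term for each transition, minus the leaving rate times the time spent in each state. A minimal sketch (interface and names ours; the initial-distribution term is omitted for brevity):

```python
import math

def ctmc_loglik(Q, path, t_end):
    """Log-likelihood of a fully observed continuous-time Markov chain on
    [0, t_end], given as jump records (time, state).  For each interval the
    dwell-time term Q[i][i] * tau = -q_i * tau is accumulated (the final,
    censored dwell contributes only its survival term), and each jump i -> j
    contributes log Q[i][j]."""
    ll = 0.0
    for k, (t, i) in enumerate(path):
        t_next = path[k + 1][0] if k + 1 < len(path) else t_end
        ll += Q[i][i] * (t_next - t)              # dwell-time / survival term
        if k + 1 < len(path):
            ll += math.log(Q[i][path[k + 1][1]])  # transition-rate term
    return ll
```

In the EM algorithm this quantity is never evaluated on an observed hidden path; its conditional expectation given the observable process is what the E-step computes.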
Due to the close relationship between Markov modulated Poisson processes and MMMPs, it is worthwhile mentioning that randomly modulated Poisson processes have attracted significant research interest since the 1970s (see, for example, Snyder [26] and Segall, Davis, and Kailath [25] and the references therein). In these works, the underlying process is rather general and need not be a finite-state Markov chain, or even a diffusion process. For this very general case, the modulated Poisson process can be represented as an increasing predictable process plus an additive martingale noise process w.r.t. an appropriately defined sigma field. This representation is reminiscent of the classic representation of a signal observed in additive white noise. General finite-dimensional recursive filtering and smoothing formulas were developed, see, e.g., [25], for estimating the underlying process from the observed counts of the conditionally Poisson process.

The plan for the remainder of this paper is as follows. In Section II, we derive some likelihood functions associated with the MMMP. In Section III, we develop the EM algorithm. In Section IV, we provide a numerical example.

II. DENSITIES OF MMMP

In this section, we set our notation, define the MMMP, and derive some densities associated with this process. In general, we use upper case letters to denote random variables and lower case letters to denote their realizations. We will not make this distinction when referring to sample paths of the process. Thus, for example, we may use the same symbol to refer to the process itself or to a particular sample path of that process. We also use the same symbol to denote either a probability measure or a density, as appropriate. These conventions greatly simplify the notation. We trust that the exact meaning of these symbols will always be clear from the context. Let denote the underlying process of the MMMP.
It is assumed to be a separable continuous-time finite-state homogeneous irreducible Markov chain with state space , say. The order of the chain is assumed known. With probability one, all sample paths are right-continuous step functions with a finite number of jumps in each finite interval [1, Theorem 2.1]. A sample path of the chain in a given interval is characterized by the number of jumps, the sequence of visited states, and the time spent in each state within that interval. Assume that the chain jumps times in at , and define and . Let the initial state at time be denoted by , and let the state visited during be denoted by for . The time spent in state is . Let denote the initial distribution of the chain. Let denote the generator of the chain, where for we denote (1). Let denote the rate at which leaves state . We have (2). When , is the stationary distribution of the chain [13, p. 261].

Similarly, let denote the observable process of the MMMP. Conditioned on , it is a separable continuous-time finite-state nonhomogeneous irreducible Markov chain with state space , say. The order is also assumed known. Assume that the process starts from some state at time , and that it jumps times in at . When , we denote . For , let denote the state of the process in the interval , and let denote the dwell time of the process in that state. The last state reached by the process is . When , the dwell time of the process in is . We use , , and to denote realizations of , , and , respectively. The initial distribution and generator of are controlled by in the following manner. When , the distribution of is given by . Similarly, when , the observable process transits at time according to the generator . Let denote the rate at which leaves state while the underlying chain is in state . The set of generators of the two Markov chains constitutes the parameter of interest of the MMMP. Note that is a two-dimensional Markov process.
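The stationary distribution mentioned above can be obtained from the generator alone. One hedged sketch (our own interface) uses uniformization: for any rate bound at least the largest leaving rate, P = I + Q/lam is a stochastic matrix with the same stationary vector, which power iteration then recovers.

```python
def stationary_distribution(Q, iters=5000):
    """Stationary distribution of an irreducible generator Q (nested lists)
    via uniformization and power iteration.  Illustrative only; for large
    chains one would instead solve pi Q = 0 with sum(pi) = 1 directly."""
    n = len(Q)
    lam = 1.1 * max(-Q[i][i] for i in range(n))  # any lam >= max leaving rate
    P = [[float(i == j) + Q[i][j] / lam for j in range(n)] for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi
```

For example, Q = [[-1, 1], [2, -2]] yields the vector proportional to (2, 1), since balance requires pi[0] * 1 = pi[1] * 2.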
The density of the observable process on the space of sample functions may be evaluated as follows. Assume that the process

jumps times in . Suppose first that the last jump occurs at . The desired density can then be obtained from the joint density of the random variables. This density, as well as other densities of interest, will follow from the joint density of and the sequence of random variables ς. The latter random variables signify the states of the underlying chain at and at the jump times of the observable process. Let the augmented collection of random variables be denoted as in (3)–(4). Let and . The density of follows from the Markov property, which implies conditional independence of and ς given ς. This density is therefore given by (5), where for denotes a realization of ς, and . Let ς for . Rewriting (5) in terms of the Markov chain shows that is a Markov renewal process [6].

A. Transition Density Matrix

An expression for the density in (5) can be obtained from an argument similar to that used for Markov modulated Poisson processes; see, e.g., Freed and Shepp [12]. Specifically, suppose that at time the underlying Markov chain is in state and the observable process is in state . Recall that denotes the time of the first jump of the observable process. We are interested in the conditional probability of the event given for , and we denote the corresponding density by . Following an argument similar to that given in [12, (2.5)], we obtain (6), where denotes the Kronecker delta. To interpret this expression, note that the first jump of the observable process from to at time may occur while the underlying Markov chain first jumps to some intermediate state at some random time , before it eventually reaches its final state (which may be equal to ) at time . The jump of the observable process may also occur while the underlying chain remains in during . In that case, . The density corresponding to
This term comprises the product of the probability that the observable process does not jump in while the underlying chain remains in state, the exponential density for the dwell time of the underlying chain in state during, the probability of the jump from to at time, and the density corresponding to the jump time of the observable process from to and the jump of the underlying chain from to. This product is summed over all possible and integrated over. The first term on the right-hand side of (6) is relevant only when the underlying chain does not jump in. This term comprises the product of the probability that the underlying chain remains in state in, the exponential density of whose parameter is, and the probability of the jump of the observable process from to at time under the generator. Next, for, define the transition density matrix and the matrix The transition density matrix (7) is obtained by differentiating (6) w.r.t.. We find that Since, we have from (9) (7) (8) (9) (10) This expression is similar to [12, (2.6)] for Markov modulated Poisson processes. The density of can now be obtained from (5) and (10). Define.Wehave (11) where denotes an vector whose components are all equal to one. Note that evaluation of requires products of matrices. Furthermore, this approach does not provide an explicit form for since we only have the exponential matrix form (10). When, the density in (5) should be multiplied by an additional transition matrix which is derived as follows. Let (12) denote the transition probability of the underlying chain from state at time 0 to state at time while the observable process remains in state for at least seconds. Let denote the corresponding transition matrix. It follows that (5) must be multiplied by. The transition probability satisfies an equation similar to (6) with the exception that the first term on the right-hand side is replaced by. This change

only affects the initial condition of the differential equation. Since , we have (13), which is analogous to a result by Neuts [19] that was used for Markov modulated Poisson processes in [24]. Finally, the transition matrix of ς for given can be obtained from (10) and is given by (14).

B. An Example

To illustrate the evaluation and form of the transition density matrix (10), we consider a relatively simple MMMP with . This example is similar to that given by Rydén [22, Sec. 2.3] for switched Poisson processes, which are Markov modulated Poisson processes for which the order of the underlying Markov chain is equal to two. For this case, let (15), and for . To evaluate the transition density matrix (10), we note that is a simple matrix if its eigenvalues are distinct. As such, it can be written as , where and denote, respectively, a diagonal matrix of the eigenvalues and a nonsingular matrix of corresponding eigenvectors of . We express the eigenvalues in terms of (16). We have and . The eigenvectors are given by for , and the determinant of is . Using these results in (10), we obtain (17), which is analogous to the result of Rydén [22]. This example demonstrates nicely the nonexponential nature of the density of the dwell time of the observable process in each state for a given transition of the underlying chain. In this example, each density is a mixture of two exponential densities. In general, we would have a mixture of exponential densities.

III. THE EM ALGORITHM

In this section, we describe an EM algorithm for ML estimation of the true parameter of the MMMP, which we denote by , from a sample path of the observable process.
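Before specializing to the MMMP, the general EM scheme can be illustrated on a toy dwell-time model: a two-component exponential mixture, the hyperexponential form that appeared in the example of Section II. The sketch below is an illustration only, not the MMMP algorithm; the data, initial guesses, and names are ours. It shows the E-step (posterior responsibilities), an explicit M-step, and the nondecreasing log-likelihood that characterizes EM.

```python
import math

def em_exp_mixture(x, w, lam, iters=30):
    """Toy EM for the two-component exponential mixture
    f(t) = w0*lam0*exp(-lam0*t) + w1*lam1*exp(-lam1*t).
    Returns updated weights, rates, and the per-iteration log-likelihoods."""
    w, lam = list(w), list(lam)
    logliks = []
    for _ in range(iters):
        resp, ll = [], 0.0
        for xi in x:
            # E-step: posterior probability of each component given xi
            p = [w[k] * lam[k] * math.exp(-lam[k] * xi) for k in range(2)]
            s = p[0] + p[1]
            ll += math.log(s)
            resp.append([p[0] / s, p[1] / s])
        logliks.append(ll)
        for k in range(2):
            # M-step: closed-form weight and rate updates
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(x)
            lam[k] = nk / sum(r[k] * xi for r, xi in zip(resp, x))
    return w, lam, logliks
```

The monotone log-likelihood sequence mirrors the fixed-point property discussed next; the closed-form M-step is possible here, as in the MMMP algorithm, because the complete-data log-likelihood separates over the hidden assignments.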
In the EM approach, a new parameter estimate, say , is obtained from a given parameter estimate, say , as in (18), where is related to the underlying Markov chain and the expectation is performed over given and the current parameter estimate. Starting from an initial estimate, say , the procedure generates a sequence of estimates with nondecreasing likelihood values. It is well known that every fixed point of the algorithm is a stationary point of the likelihood function [30]. Under certain conditions, there exists a subsequence of which converges to a stationary point of the likelihood function [30]. The expectation and maximization in (18) represent the E-step and M-step of the algorithm, respectively.

The exact form of is not specified by the EM algorithm. This quantity can be conveniently chosen to simplify the estimation procedure. One choice could be the random variables ς used in (4). Unfortunately, this choice does not lead to explicit reestimation formulas in each iteration, for two reasons. First, we only have the transition density matrix in its exponential form, rather than the individual densities for all possible values of . Second, the latter densities are mixtures of exponentials, as was demonstrated in the example in Section II. The logarithm of a mixture density does not simplify as in the case of densities from the exponential family. An alternative form of , proposed by Asmussen and Nerman (see [2]) and used by Rydén in the context of estimating the parameter of a Markov modulated Poisson process [24], is the entire sample path of the underlying chain in . This choice is useful in our case as well. Consider , where is the entire sample path of the underlying chain. The density of was derived by Albert [1]. Since it also forms the basis for the density of , we begin with its description.

A. Parameter of Underlying Chain

Let be a random variable which represents the number of jumps of in , with the last jump occurring prior to .
Let denote a natural number which represents a value that may take. Recall that for denotes the time of the th jump of the chain. Using and , the state of the chain during is for . The process is characterized by (19).
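For a chain characterized this way that is itself fully observed, the ML generator estimate is the ratio of jump counts to dwell times; this is Albert's estimator, and the EM algorithm below replaces both statistics by their conditional means given the observations. A sketch with our own interface, taking jump records (time, state):

```python
def albert_estimate(path, t_end, n_states):
    """Albert-style generator estimate from a fully observed chain on
    [0, t_end]: q_ij = (number of i -> j jumps) / (total time in state i)."""
    jumps = [[0] * n_states for _ in range(n_states)]
    dwell = [0.0] * n_states
    for k, (t, i) in enumerate(path):
        t_next = path[k + 1][0] if k + 1 < len(path) else t_end
        dwell[i] += t_next - t               # time spent in state i
        if k + 1 < len(path):
            jumps[i][path[k + 1][1]] += 1    # observed transition i -> j
    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if dwell[i] == 0.0:
            continue                          # state never visited
        for j in range(n_states):
            if j != i:
                Q[i][j] = jumps[i][j] / dwell[i]
        Q[i][i] = -sum(Q[i])
    return Q
```

The hidden-chain reestimation formula of this section has exactly this form, with the counts and dwell times replaced by conditional expectations computed in the E-step.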

Albert [1] showed that the density corresponding to the probability measure is given by (20)–(21), where for , and . Hence, . An expression for was given in [1] and [24], using the indicator function and the notation (22)–(24).

The reestimation formula for the parameter of the underlying chain is obtained from maximization of over . Let , , and denote, respectively, the conditional mean estimates of , , and given and . We have (25)–(27), where denotes the conditional density of a jump from to at time . The expression for follows from a limit argument similar to that developed by Asmussen, Nerman, and Olsson [2, Appendix] for phase-type distributions. The idea behind the proof is to consider a uniform partition of the time interval between consecutive jumps of the observable process into subintervals of length each. Then (28). Now, can be evaluated by i) applying Lebesgue's dominated convergence theorem, which turns the indicator function in (28) into the conditional probability ; ii) using Bayes rule along with (31)–(33) given below; and iii) using the identity . The proof also clarifies that , a result which will be required for evaluating (30) below.

Maximization of over and , using , results in the following intuitive estimates in the next iteration of the EM algorithm, given in (29). Thus, is the ratio of the conditional mean estimates of the number of jumps from to and of the dwell time of the chain in state . Note that is not necessarily the stationary distribution of the chain. Albert [1] provided a similar estimate for the case in which the chain is observable.

The conditional probability in (25)–(26), and the conditional density in (27), may be evaluated using forward-backward recursions in a similar way to that proposed by Rydén [24] for Markov modulated Poisson processes. To simplify the notation, we suppress the dependency on . We begin with (27). From Bayes rule, we obtain (30). Using the Markov property of conditional independence, we obtain (31). Now, if , where is the time of the th jump of the observable process, and , then the last term of (31), multiplied by , follows from the forward density given by (32), where is a vector whose th component equals one and whose remaining components are all equal to zero, and the transition density matrices on the right-hand side of (32) are given in (10) and (13), respectively. The expression on the right-hand side of (32) corresponds to rather than . As indicated in [24, p. 437] for Markov-modulated Poisson processes, this replacement has no effect on the density, since the underlying Markov chain is continuous in probability. Next, when the last jump of the observable process , the first term of (31) follows from the backward density, which is given by (33). When or when , these two expressions can be modified in a straightforward manner.
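The forward and backward densities above are computed recursively, and in practice the recursions must be scaled to avoid numerical underflow over long sample paths. The following discrete-time HMM sketch (our own notation; the paper's recursions are continuous-time) shows the scaling idea: normalize the forward vector at every step and accumulate the logarithms of the normalizers, which sum to the log-likelihood.

```python
import math

def scaled_forward(pi, A, B, obs):
    """Scaled forward recursion for a discrete-time HMM with initial
    distribution pi, transition matrix A, and emission matrix B.
    Returns the log-likelihood of obs and the final filtered posterior."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(1, len(obs)):
        c = sum(alpha)                       # normalizer; its logs add up
        loglik += math.log(c)
        alpha = [a / c for a in alpha]
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
    c = sum(alpha)
    loglik += math.log(c)
    alpha = [a / c for a in alpha]           # filtered state posterior
    return loglik, alpha
```

The backward recursion is scaled with the same normalizers, so products of forward and backward terms remain well-conditioned; this is the same device used for the continuous-time recursions referenced in the text.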

The forward recursion is used to evaluate the bracketed term in (32). This recursion is given by (34). The backward recursion is used to evaluate the bracketed term in (33). This recursion is given by (35). A scaling procedure similar to that developed in [20] for Markov modulated Poisson processes may be applied here. Recursion (34) can be used to evaluate the likelihood of a sample path of the observable process when as . The conditional probability in (25) and (26) may be similarly evaluated as follows. From Bayes rule, the Markov property, and the continuity in probability of , we have (36). The two last terms can be evaluated using (32) and (33), respectively, and .

B. Parameter of Observable Process

We next derive the conditional density and the reestimation formula for the parameter of the observable process. We express the conditional density of the observable process for a sample path with jumps at . We denote and . Suppose that the underlying Markov chain jumped at , and let and . For , let and let . During the th subinterval, the underlying chain resides in state , and is a Markov chain with a fixed generator. Using the Markov property of , we have (37), where for notational convenience we have used .

Using the density of a Markov chain on the space of sample functions derived by Albert [1], it is straightforward to verify that the conditional density of the observable process in the th subinterval is given by (38), where (39) denotes the time that the observable process spends in state , and (40) is the number of jumps of the observable process from to in . The last term in (38) is due to the inclusion of the probability of a possible jump of the observable process at from to say . Substituting (38)–(40) into (37) and rearranging terms, we obtain (41), where (42) represents the total time that the observable process spends in state while the underlying chain resides in state in .
The conditional density of given and is essentially given by an expression similar to (21). Here, however, the observable process may jump simultaneously with the underlying Markov chain at the beginning of each subinterval; hence, we cannot exclude such jumps, and the corresponding contribution appears in (43).

In (43), denotes the Dirac function, and denotes the total number of jumps of the observable process from state to state while the underlying chain is in state in . In (44)–(45), denotes the total number of jumps of the observable process from state to any other state while the underlying chain jumps from any state to state in .

The conditional mean estimate is determined by the conditional mean estimates of the quantities defined in (42)–(45). From (42) we have (46). Similarly, from (43), we have (47). From (45), we have (48), where, contrary to (27), now represents the conditional probability of a jump from to at the known time , and that probability equals zero. Hence, (49). The conditional mean is now readily available from (41) and (46)–(49). Maximizing this conditional mean over the generators of the observable process, taking into account that , we obtain (50), which can be interpreted similarly to (29). The conditional probabilities in (46)–(47) may be efficiently calculated as in (36), using the forward-backward recursions (34)–(35).

The algorithm presented here generalizes the EM algorithm for Markov modulated Poisson processes of Rydén [24] when the conditional Poisson process is seen as a conditional Markov process. To see that, it suffices to show that the likelihood of the conditional Markov process coincides with the likelihood of the conditional Poisson process in each subinterval. The result then follows from the fact that the conditional Poisson process has independent increments. Let denote the generator of the conditional Markov process in a given subinterval, and assume that the rate of the conditional Poisson process in the same subinterval is . We have . Now, for every jump of the conditional Poisson process, it holds for the representing conditional Markov process that (51), independently of and for all . Suppose that there are Poisson events in . Substituting (51) in a density similar to that given in (21), but for the observable process, we obtain the likelihood of the conditional Markov process. This likelihood coincides with the corresponding th term in the second line of [24, (6)].

IV. NUMERICAL EXAMPLE

TABLE I: MMMP EXAMPLE. TRUE PARAMETER, INITIAL ESTIMATE, AND FINAL ESTIMATE.

The EM algorithm presented in Section III was implemented using Matlab. The forward-backward recursions (34)–(35) were scaled to provide numerical stability using the approach developed in Roberts, Ephraim, and Dieguez [20] for EM estimation of the parameter of a Markov modulated Poisson process. The integrals of matrix exponentials that appear in (26)–(27) and in (46) were evaluated using the approach due to Van Loan [28], as was done in [20]. Matrix exponentials were calculated using the Matlab function expm, which relies upon the Padé approximation with repeated squaring.

The algorithm was terminated when the relative difference between log-likelihood values of successive iterations was sufficiently small. Our setup was motivated by Rydén's experiments on Markov modulated Poisson process estimation [24]. Two different MMMPs were considered, using two different orders for the underlying Markov chain and a fixed order for the observable process. For the first case, the generator of the underlying Markov chain and its initial estimate were identical to those of [24, case 2A]. As in [24, case 2A], 5000 inter-event times were used. The rates of the observable process and their initial estimates were similar to those used for the Poisson intensities in [24, case 2A]. For the second case, the MMMP setup was similarly fashioned on [24, case 3A]. All Markov chains had uniform initial state distributions, which were assumed known by the algorithm. The MMMP parameter estimates were consistent with the Markov modulated Poisson process estimates obtained in [24]. We demonstrate the results in Table I for one of the two cases only.

ACKNOWLEDGMENT

The authors would like to thank the perceptive referees for their useful comments, which greatly improved the presentation.

REFERENCES

[1] A. Albert, "Estimating the infinitesimal generator of a continuous time, finite state Markov process," Ann. Math. Statist., vol. 23, no. 2, Jun.
[2] S. Asmussen, O. Nerman, and M. Olsson, "Fitting phase-type distributions via the EM algorithm," Scand. J. Statist., vol. 23, no. 4.
[3] F. Ball, R. K. Milne, and G. F. Yeo, "Continuous-time Markov chains in a random environment, with applications to ion channel modelling," Adv. Appl. Probab., vol. 26, no. 4, Dec.
[4] P. Billingsley, Statistical Inference for Markov Processes. Chicago, IL: Univ. of Chicago Press.
[5] D. Bryant, N. Galtier, and M. A. Poursat, "Likelihood calculation in molecular phylogenetics," in Mathematics of Evolution and Phylogeny, O. Gascuel, Ed. New York: Oxford.
[6] E. Çinlar, "Markov renewal theory: A survey," Manage. Sci., vol. 21, no. 7, Mar.
[7] A. Dembo and O. Zeitouni, "Parameter estimation of partially observed continuous time stochastic processes via the EM algorithm," Stoch. Process. Appl., vol. 23, no. 1.
[8] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statist. Soc. B, vol. 39, pp. 1-38.
[9] R. J. Elliott, L. Aggoun, and J. B. Moore, Hidden Markov Models: Estimation and Control. New York: Springer-Verlag.
[10] Y. Ephraim and N. Merhav, "Hidden Markov processes," IEEE Trans. Inf. Theory, vol. 48, Jun.
[11] W. Fischer and K. Meier-Hellstern, "The Markov-modulated Poisson process (MMPP) cookbook," Perform. Eval., vol. 18.
[12] D. S. Freed and L. A. Shepp, "A Poisson process whose rate is a hidden Markov chain," Adv. Appl. Probab., vol. 14.
[13] G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes. Oxford, U.K.: Oxford Science.
[14] A. Klemm, C. Lindemann, and M. Lohmann, "Modeling IP traffic using the batch Markovian arrival process," Perform. Eval., vol. 54.
[15] J. Ledoux, "Filtering and the EM-algorithm for the Markovian arrival process," Commun. Statist. Theory Methods, vol. 36, no. 14, Jan.
[16] S. E. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol. 1.
[17] J. Q. Li and A. R. Barron, "Mixture density estimation," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K.-R. Mueller, Eds. Cambridge, MA: MIT Press, 2000, vol. 12.
[18] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes, pt. I. New York: Springer-Verlag, 2004.
[19] M. F. Neuts, Structured Stochastic Matrices of M/G/1 Type and Their Applications. New York: Marcel Dekker.
[20] W. J. J. Roberts, Y. Ephraim, and E. Dieguez, "On Rydén's EM algorithm for estimating MMPPs," IEEE Signal Process. Lett., vol. 13, no. 6, Jun.
[21] W. J. J. Roberts and Y. Ephraim, "An EM algorithm for ion-channel current estimation," IEEE Trans. Signal Process., vol. 56, no. 1, Jan.
[22] T. Rydén, "Parameter estimation for Markov modulated Poisson processes," Commun. Statist. Stoch. Models, vol. 10, no. 4.
[23] T. Rydén, "Consistent and asymptotically normal parameter estimates for Markov modulated Poisson processes," Scand. J. Statist., vol. 22.
[24] T. Rydén, "An EM algorithm for estimation in Markov-modulated Poisson processes," Comput. Statist. Data Anal., vol. 21.
[25] A. Segall, M. H. A. Davis, and T. Kailath, "Nonlinear filtering with counting observations," IEEE Trans. Inf. Theory, vol. IT-21, Mar.
[26] D. L. Snyder, "Filtering and detection for doubly stochastic Poisson processes," IEEE Trans. Inf. Theory, vol. IT-18, Jan.
[27] R. A. Sohn, "Stochastic analysis of exit fluid temperature records from the active TAG hydrothermal mound (Mid-Atlantic Ridge 26°N): 2. Hidden Markov models of flow episodes," J. Geophys. Res., vol. 112, 2007.
[28] C. F. Van Loan, "Computing integrals involving the matrix exponential," IEEE Trans. Autom. Control, vol. 23, no. 3.
[29] W. Wei, B. Wang, and D. Towsley, "Continuous-time hidden Markov models for network performance evaluation," Perform. Eval., vol. 49.
[30] C. F. J. Wu, "On the convergence properties of the EM algorithm," Ann. Statist., vol. 11, no. 1.
[31] S.-Z. Yu and H. Kobayashi, "Practical implementation of an efficient forward-backward algorithm for an explicit-duration hidden Markov model," IEEE Trans. Signal Process., vol. 54, no. 5, May.
[32] O. Zeitouni and A. Dembo, "Exact filters for the estimation of the number of transitions of finite-state continuous-time Markov processes," IEEE Trans. Inf. Theory, vol. 34, Jul.

Yariv Ephraim (S'82-M'84-SM'90-F'94) received the D.Sc. degree in electrical engineering from the Technion-Israel Institute of Technology, Haifa, Israel. He was a Rothschild Postdoctoral Fellow at the Information Systems Laboratory, Stanford University, Stanford, CA. From 1985 to 1993, he was a Member of Technical Staff at the Information Principles Research Laboratory, AT&T Bell Laboratories, Murray Hill, NJ. In 1991, he joined George Mason University, Fairfax, VA, where he is currently a Professor of Electrical and Computer Engineering. His research interests are in statistical signal processing.

William J. J. Roberts (S'89-M'90-SM'06) received the Ph.D. degree in information technology from George Mason University, Fairfax, VA. From 1990 to 2000, he was with the Defence Science and Technology Organisation, Salisbury, South Australia. From 1998 to 1999, he held a postdoctoral position at the Tokyo Institute of Technology, Tokyo, Japan. Since 2000, he has been with Atlantic Coast Technologies, Inc., Silver Spring, MD. His interests are in statistical signal processing.


MIXTURE OF EXPERTS ARCHITECTURES FOR NEURAL NETWORKS AS A SPECIAL CASE OF CONDITIONAL EXPECTATION FORMULA

MIXTURE OF EXPERTS ARCHITECTURES FOR NEURAL NETWORKS AS A SPECIAL CASE OF CONDITIONAL EXPECTATION FORMULA MIXTURE OF EXPERTS ARCHITECTURES FOR NEURAL NETWORKS AS A SPECIAL CASE OF CONDITIONAL EXPECTATION FORMULA Jiří Grim Department of Pattern Recognition Institute of Information Theory and Automation Academy

More information

Stochastic process. X, a series of random variables indexed by t

Stochastic process. X, a series of random variables indexed by t Stochastic process X, a series of random variables indexed by t X={X(t), t 0} is a continuous time stochastic process X={X(t), t=0,1, } is a discrete time stochastic process X(t) is the state at time t,

More information

IEOR 6711: Stochastic Models I Professor Whitt, Thursday, November 29, Weirdness in CTMC s

IEOR 6711: Stochastic Models I Professor Whitt, Thursday, November 29, Weirdness in CTMC s IEOR 6711: Stochastic Models I Professor Whitt, Thursday, November 29, 2012 Weirdness in CTMC s Where s your will to be weird? Jim Morrison, The Doors We are all a little weird. And life is a little weird.

More information

Least Mean Square Algorithms With Markov Regime-Switching Limit

Least Mean Square Algorithms With Markov Regime-Switching Limit IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 50, NO. 5, MAY 2005 577 Least Mean Square Algorithms With Markov Regime-Switching Limit G. George Yin, Fellow, IEEE, and Vikram Krishnamurthy, Fellow, IEEE

More information

A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients

A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL 13, NO 5, SEPTEMBER 2002 1053 A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients Yunong Zhang, Danchi Jiang, Jun Wang, Senior

More information

WE start with a general discussion. Suppose we have

WE start with a general discussion. Suppose we have 646 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 43, NO. 2, MARCH 1997 Minimax Redundancy for the Class of Memoryless Sources Qun Xie and Andrew R. Barron, Member, IEEE Abstract Let X n = (X 1 ; 111;Xn)be

More information

PITMAN S 2M X THEOREM FOR SKIP-FREE RANDOM WALKS WITH MARKOVIAN INCREMENTS

PITMAN S 2M X THEOREM FOR SKIP-FREE RANDOM WALKS WITH MARKOVIAN INCREMENTS Elect. Comm. in Probab. 6 (2001) 73 77 ELECTRONIC COMMUNICATIONS in PROBABILITY PITMAN S 2M X THEOREM FOR SKIP-FREE RANDOM WALKS WITH MARKOVIAN INCREMENTS B.M. HAMBLY Mathematical Institute, University

More information

Received: 20 December 2011; in revised form: 4 February 2012 / Accepted: 7 February 2012 / Published: 2 March 2012

Received: 20 December 2011; in revised form: 4 February 2012 / Accepted: 7 February 2012 / Published: 2 March 2012 Entropy 2012, 14, 480-490; doi:10.3390/e14030480 Article OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Interval Entropy and Informative Distance Fakhroddin Misagh 1, * and Gholamhossein

More information

EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER

EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER Zhen Zhen 1, Jun Young Lee 2, and Abdus Saboor 3 1 Mingde College, Guizhou University, China zhenz2000@21cn.com 2 Department

More information

Hybrid HMM/MLP models for time series prediction

Hybrid HMM/MLP models for time series prediction Bruges (Belgium), 2-23 April 999, D-Facto public., ISBN 2-649-9-X, pp. 455-462 Hybrid HMM/MLP models for time series prediction Joseph Rynkiewicz SAMOS, Université Paris I - Panthéon Sorbonne Paris, France

More information