HIDDEN MARKOV CHANGE POINT ESTIMATION
Communications on Stochastic Analysis, Vol. 9, No. 3, Serials Publications.

HIDDEN MARKOV CHANGE POINT ESTIMATION

ROBERT J. ELLIOTT* AND SEBASTIAN ELLIOTT

Abstract. A hidden Markov model is considered where the dynamics of the hidden process change at a random change point. In principle this gives rise to a non-linear filter, but closed form recursive estimates are obtained for the conditional distribution of the hidden process and of the change point.

1. Introduction

Hidden Markov models have found many applications, from speech processing to regime switching dynamics in financial models. An early description is given in the paper [4] by Rabiner and Juang, and a fuller treatment can be found in the book Hidden Markov Models: Estimation and Control by Elliott, Aggoun and Moore. Recent results can be found in [3].

In this paper we consider the situation where a discrete time Markov chain $X$ is observed indirectly through a second Markov chain $Y$. However, the dynamics of $X$ undergo a change at a random time. Given the observed process $Y$, filtered recursive estimates for the conditional distribution of $X$ and of the change point are derived. This paper uses the expectation maximization (EM) algorithm as discussed in [1] and [2].

2. Chain Dynamics

Suppose $X = \{X_k,\ k = 0, 1, \dots\}$ is a discrete time finite state Markov chain defined on a probability space $(\Omega, \mathcal{F}, P)$. Without loss of generality, the state space can be taken to be the set of unit vectors $\{e_1, e_2, \dots, e_N\}$ in $\mathbb{R}^N$, where $N$ is the number of elements of the state space and $e_i = (0, \dots, 0, 1, 0, \dots, 0)'$ is a standard basis vector in $\mathbb{R}^N$.

Suppose $\tau \in \{1, 2, \dots\}$ is a random time with $P(\tau = k) = p_k \ge 0$. Write
$$F_k = P(\tau > k) = \sum_{l=k+1}^{\infty} p_l.$$
The random time $\tau$ represents a random time at which there is a change in the dynamics of the chain $X$.

Received; Communicated by the editors. Mathematics Subject Classification: 60J10, 60J22. Key words and phrases: hidden Markov chain, change points, estimation.
* Professor Robert J. Elliott wishes to thank the ARO for support under project W911NF.
In fact, if $k < \tau$, suppose
$$a^1_{ji} = P(X_{k+1} = e_j \mid X_k = e_i) = P(X_1 = e_j \mid X_0 = e_i).$$
Write $A_1$ for the matrix $(a^1_{ji})$, $1 \le i, j \le N$. If $k \ge \tau$, suppose $a^2_{ji} = P(X_{k+1} = e_j \mid X_k = e_i)$, and write $A_2$ for the matrix $(a^2_{ji})$, $1 \le i, j \le N$.

Write $Z_k = I_{\tau \le k}$. The state space of $Z$ will be mapped onto the unit vectors $\bar e_1 = (1, 0)'$, $\bar e_2 = (0, 1)'$ by considering the process
$$Z_k = (1 - I_{\tau \le k})\,\bar e_1 + I_{\tau \le k}\,\bar e_2 \in \mathbb{R}^2.$$
Define the filtrations
$$\mathcal{F}^X_k = \sigma\{X_i,\ i \le k\}, \qquad \mathcal{F}^Z_k = \sigma\{Z_i,\ i \le k\}, \qquad \mathcal{F}_k = \mathcal{F}^X_k \vee \mathcal{F}^Z_k.$$

Lemma 2.1. Suppose $A_k = A_1 \langle Z_k, \bar e_1 \rangle + A_2 \langle Z_k, \bar e_2 \rangle$. Then $X_{k+1} = A_k X_k + M_{k+1} \in \mathbb{R}^N$, where $M$ is a sequence of martingale increments, that is, $E[M_{k+1} \mid \mathcal{F}_k] = 0 \in \mathbb{R}^N$.

Proof.
$$\begin{aligned}
E[M_{k+1} \mid \mathcal{F}_k] &= E[X_{k+1} - A_k X_k \mid \mathcal{F}^X_k \vee \mathcal{F}^Z_k] \\
&= E[X_{k+1} \mid X_k, Z_k] - (A_1 \langle Z_k, \bar e_1 \rangle + A_2 \langle Z_k, \bar e_2 \rangle) X_k \\
&= E[X_{k+1} (\langle Z_k, \bar e_1 \rangle + \langle Z_k, \bar e_2 \rangle) \mid X_k, Z_k] - (A_1 \langle Z_k, \bar e_1 \rangle + A_2 \langle Z_k, \bar e_2 \rangle) X_k \\
&= (A_1 \langle Z_k, \bar e_1 \rangle + A_2 \langle Z_k, \bar e_2 \rangle) X_k - (A_1 \langle Z_k, \bar e_1 \rangle + A_2 \langle Z_k, \bar e_2 \rangle) X_k = 0. \qquad \square
\end{aligned}$$

Transitions of $Z$: note $P(Z_{k+1} = \bar e_1 \mid \mathcal{F}^Z_k) = P(Z_{k+1} = \bar e_1 \mid Z_k)$. Now
$$\begin{aligned}
P(Z_{k+1} = \bar e_1 \mid Z_k = \bar e_1) &= P(\tau > k+1 \mid \tau > k) = F_{k+1}/F_k, \\
P(Z_{k+1} = \bar e_2 \mid Z_k = \bar e_1) &= P(\tau = k+1 \mid \tau > k) = p_{k+1}/F_k, \\
P(Z_{k+1} = \bar e_1 \mid Z_k = \bar e_2) &= P(\tau > k+1 \mid \tau \le k) = 0, \\
P(Z_{k+1} = \bar e_2 \mid Z_k = \bar e_2) &= P(\tau \le k+1 \mid \tau \le k) = 1.
\end{aligned}$$
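As a concrete illustration of these dynamics, the following sketch simulates a chain whose transition matrix switches from $A_1$ to $A_2$ at the change point. The specific matrices, the geometric prior for $\tau$, and all function names are illustrative assumptions for this sketch, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative column-stochastic matrices: A[j, i] = a_{ji} = P(X_{k+1} = e_j | X_k = e_i),
# so each column sums to one, matching the paper's (a_ji) convention.
A1 = np.array([[0.9, 0.1],
               [0.1, 0.9]])   # pre-change dynamics
A2 = np.array([[0.4, 0.6],
               [0.6, 0.4]])   # post-change dynamics

# Illustrative change-point prior: geometric, p_k = rho * (1 - rho)**(k - 1),
# so the survival function is F_k = P(tau > k) = (1 - rho)**k.
rho = 0.05

def simulate_chain(T, x0=0):
    """Simulate X_0, ..., X_T (states stored as indices 0..N-1) with a change at tau."""
    tau = int(rng.geometric(rho))         # tau takes values in {1, 2, ...}
    x = [x0]
    for k in range(T):
        A = A1 if k < tau else A2         # the transition k -> k+1 uses A1 while tau > k
        x.append(int(rng.choice(A.shape[0], p=A[:, x[-1]])))
    return tau, x

tau, x = simulate_chain(200)
```

The one-way structure of the transitions of $Z$ appears here as the fact that once `k < tau` fails it fails forever, matching $P(Z_{k+1} = \bar e_2 \mid Z_k = \bar e_2) = 1$.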
Lemma 2.2. Write
$$B_k = \begin{pmatrix} F_{k+1}/F_k & 0 \\ p_{k+1}/F_k & 1 \end{pmatrix}.$$
Then $Z_{k+1} = B_k Z_k + N_{k+1} \in \mathbb{R}^2$, where $E[N_{k+1} \mid \mathcal{F}^Z_k] = 0 \in \mathbb{R}^2$.

3. Observations

The chain $X$ is not observed directly. Rather, there is another finite state process $Y$ which is a noisy observation of $X$. Suppose the finite state space of $Y$ is identified with the unit vectors $\{f_1, f_2, \dots, f_M\}$ of $\mathbb{R}^M$, where $f_j = (0, \dots, 0, 1, 0, \dots, 0)' \in \mathbb{R}^M$. We can have $M > N$, $M < N$ or $M = N$. Suppose $c_{ji} = P(Y_k = f_j \mid X_k = e_i) \ge 0$. Note $\sum_{j=1}^M c_{ji} = 1$. Write $C = (c_{ji})$, $1 \le i \le N$, $1 \le j \le M$, and $\mathcal{F}^Y_k = \sigma\{Y_j,\ 0 \le j \le k\}$.

Lemma 3.1. $Y_k = C X_k + L_k \in \mathbb{R}^M$, where $E[L_k \mid \mathcal{F}^X_k] = 0 \in \mathbb{R}^M$.

Proof.
$$\begin{aligned}
E[L_k \mid \mathcal{F}^X_k] &= E[Y_k - C X_k \mid \mathcal{F}^X_k] = E[Y_k - C X_k \mid X_k] \\
&= E[Y_k \mid X_k] - C X_k \\
&= \sum_{j=1}^M E[\langle Y_k, f_j \rangle \mid X_k]\, f_j - C X_k \\
&= \sum_{j=1}^M \sum_{i=1}^N \langle X_k, e_i \rangle\, c_{ji}\, f_j - C X_k \\
&= 0 \in \mathbb{R}^M. \qquad \square
\end{aligned}$$

Joint distributions:

Lemma 3.2.
$$E[X_{k+1} Z_{k+1}' \mid X_k, Z_k] = A_1 X_k \left( \frac{F_{k+1}}{F_k},\ \frac{p_{k+1}}{F_k} \right) \langle Z_k, \bar e_1 \rangle + A_2 X_k\, (0,\ 1)\, \langle Z_k, \bar e_2 \rangle. \tag{3.1}$$
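The observation channel of Lemma 3.1 can be sketched numerically as follows. The matrix $C$ below and the helper names are illustrative assumptions; the only structural requirements from the text are $c_{ji} \ge 0$, columns of $C$ summing to one, and $M$ free to differ from $N$.

```python
import numpy as np

rng = np.random.default_rng(11)

# Illustrative emission matrix with M = 3 observation symbols and N = 2 states:
# C[j, i] = c_{ji} = P(Y_k = f_j | X_k = e_i); each column sums to one.
C = np.array([[0.7, 0.1],
              [0.2, 0.2],
              [0.1, 0.7]])

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def emit(x_idx):
    """Sample an observation index j with probability c_{j, x_idx}."""
    return int(rng.choice(C.shape[0], p=C[:, x_idx]))

# Lemma 3.1 in vector form: for one-hot X_k, E[Y_k | X_k] = C X_k, so the
# conditional mean of the one-hot observation is the corresponding column of C.
mean_y = C @ one_hot(1, 2)
```

The decomposition $Y_k = C X_k + L_k$ simply says that the sampled one-hot observation differs from this conditional mean by a zero-mean noise term.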
Proof.
$$\begin{aligned}
E[X_{k+1} Z_{k+1}' \mid X_k, Z_k] &= E\big[E[X_{k+1} Z_{k+1}' \mid X_{k+1}, X_k, Z_k] \mid X_k, Z_k\big] \\
&= E\big[X_{k+1} E[Z_{k+1} \mid Z_k]' \mid X_k, Z_k\big] \\
&= E\big[X_{k+1} E[Z_{k+1} \mid Z_k]' (\langle Z_k, \bar e_1 \rangle + \langle Z_k, \bar e_2 \rangle) \mid X_k, Z_k\big] \\
&= E\Big[X_{k+1} \Big( \tfrac{F_{k+1}}{F_k},\ \tfrac{p_{k+1}}{F_k} \Big) \langle Z_k, \bar e_1 \rangle \,\Big|\, X_k, Z_k\Big] + E\big[X_{k+1} (0,\ 1) \langle Z_k, \bar e_2 \rangle \mid X_k, Z_k\big] \\
&= A_1 X_k \Big( \tfrac{F_{k+1}}{F_k},\ \tfrac{p_{k+1}}{F_k} \Big) \langle Z_k, \bar e_1 \rangle + A_2 X_k (0,\ 1) \langle Z_k, \bar e_2 \rangle. \qquad \square
\end{aligned}$$

4. Reference Probability

The above dynamics of $X$, $Z$, $Y$ are under the real world probability $P$. Suppose we have another probability $\bar P$ under which $X$ and $Z$ have the same dynamics as above, but under which $Y$ is a process independent of $X$ and $Z$ and is a sequence of independent, identically distributed random variables with
$$\bar P(Y_k = f_j \mid \mathcal{F}^X_k \vee \mathcal{F}^Z_k \vee \mathcal{F}^Y_{k-1}) = \bar P(Y_k = f_j) = 1/M.$$

Lemma 4.1. Let
$$\lambda_k = M \sum_{j=1}^M \langle C X_k, f_j \rangle \langle Y_k, f_j \rangle, \tag{4.1}$$
$$\Lambda_k = \prod_{i=0}^k \lambda_i. \tag{4.2}$$
The probability $P$ can be defined by setting $\dfrac{dP}{d\bar P}\Big|_{\mathcal{G}_k} = \Lambda_k$, where $\mathcal{G}_k = \mathcal{F}^X_k \vee \mathcal{F}^Y_k \vee \mathcal{F}^Z_k$. Then under $P$ the dynamics of $X$ and $Z$ are unchanged and
$$P(Y_k = f_j \mid X_k = e_i) = c_{ji}.$$
That is, under $P$,
$$Y_k = C X_k + L_k,$$
where $E[L_k \mid \mathcal{F}^X_k] = 0 \in \mathbb{R}^M$.
Proof. Consider first
$$\begin{aligned}
\bar E[\lambda_k \mid \mathcal{G}_{k-1} \vee \sigma(X_k)] &= M \sum_{j=1}^M \langle C X_k, f_j \rangle\, \bar E[\langle Y_k, f_j \rangle] \\
&= \sum_{j=1}^M \sum_{i=1}^N c_{ji} \langle X_k, e_i \rangle = 1.
\end{aligned}$$
From Bayes' theorem (see [1]), with $\dfrac{dP}{d\bar P}\Big|_{\mathcal{G}_k} = \Lambda_k$,
$$\begin{aligned}
E[\langle Y_k, f_j \rangle \mid \mathcal{G}_{k-1} \vee \sigma(X_k)] &= \frac{\bar E[\Lambda_k \langle Y_k, f_j \rangle \mid \mathcal{G}_{k-1} \vee \sigma(X_k)]}{\bar E[\Lambda_k \mid \mathcal{G}_{k-1} \vee \sigma(X_k)]} \\
&= \frac{\Lambda_{k-1} \bar E[\lambda_k \langle Y_k, f_j \rangle \mid \mathcal{G}_{k-1} \vee \sigma(X_k)]}{\Lambda_{k-1} \bar E[\lambda_k \mid \mathcal{G}_{k-1} \vee \sigma(X_k)]} \\
&= M\, \bar E\Big[\Big(\sum_{r=1}^M \langle C X_k, f_r \rangle \langle Y_k, f_r \rangle\Big) \langle Y_k, f_j \rangle \,\Big|\, \mathcal{G}_{k-1} \vee \sigma(X_k)\Big] \\
&= M\, \bar E[\langle C X_k, f_j \rangle \langle Y_k, f_j \rangle \mid \mathcal{G}_{k-1} \vee \sigma(X_k)] \\
&= \langle C X_k, f_j \rangle.
\end{aligned}$$
Therefore, if $X_k = e_i$,
$$P(Y_k = f_j \mid X_k = e_i) = E[\langle Y_k, f_j \rangle \mid X_k = e_i] = c_{ji}. \qquad \square$$

5. Filters

We wish to estimate both $X$ and $Z$, given the noisy observations $Y$. This is required under the real world probability $P$. That is, we wish to determine $E[X_k Z_k' \mid \mathcal{F}^Y_k] \in \mathbb{R}^{N \times 2}$. However, it is easier to work under the reference probability $\bar P$, for which the dynamics of $X$ and $Z$ remain unchanged but the $Y$ sequence is i.i.d. with $\bar P(Y_k = f_j) = 1/M$ for all $k$ and $j$. Using Bayes' rule,
$$E[X_k Z_k' \mid \mathcal{F}^Y_k] = \frac{\bar E[\Lambda_k X_k Z_k' \mid \mathcal{F}^Y_k]}{\bar E[\Lambda_k \mid \mathcal{F}^Y_k]}.$$
Write
$$q_k := \bar E[\Lambda_k X_k Z_k' \mid \mathcal{F}^Y_k] \in \mathbb{R}^{N \times 2},$$
$$q^1_k := \bar E[\Lambda_k X_k \langle Z_k, \bar e_1 \rangle \mid \mathcal{F}^Y_k], \qquad q^2_k := \bar E[\Lambda_k X_k \langle Z_k, \bar e_2 \rangle \mid \mathcal{F}^Y_k]$$
for unnormalized conditional expectations given the observations $\mathcal{F}^Y_k$ to time $k$. Then
$$q_k = (q^1_k,\ q^2_k).$$
Note that, with $\mathbf{1}$ denoting the vector of $1$s in $\mathbb{R}^N$ or $\mathbb{R}^2$, $\mathbf{1}' X_k Z_k' \mathbf{1} = 1$. Therefore,
$$\mathbf{1}' q_k \mathbf{1} = \bar E[\mathbf{1}' \Lambda_k X_k Z_k' \mathbf{1} \mid \mathcal{F}^Y_k] = \bar E[\Lambda_k \mid \mathcal{F}^Y_k].$$
Consequently, if we know $q_k$, the denominator $\bar E[\Lambda_k \mid \mathcal{F}^Y_k]$ is just the sum of all components of $q_k$. The filter is now a recursive estimate for $q_k$ given the observations $Y$.

Theorem 5.1.
$$q^1_{k+1} = M \sum_{j=1}^M \sum_{i=1}^N \langle A_1 q^1_k, e_i \rangle \frac{F_{k+1}}{F_k}\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle, \tag{5.1}$$
$$q^2_{k+1} = M \sum_{j=1}^M \sum_{i=1}^N \Big[ \langle A_1 q^1_k, e_i \rangle \frac{p_{k+1}}{F_k} + \langle A_2 q^2_k, e_i \rangle \Big] c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle. \tag{5.2}$$

Proof.
$$\begin{aligned}
q^1_{k+1} &= \bar E[\Lambda_k \lambda_{k+1} X_{k+1} \langle Z_{k+1}, \bar e_1 \rangle \mid \mathcal{F}^Y_{k+1}] \\
&= M \sum_{j=1}^M \bar E[\Lambda_k \langle C X_{k+1}, f_j \rangle X_{k+1} \langle Z_{k+1}, \bar e_1 \rangle \mid \mathcal{F}^Y_{k+1}]\, \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \bar E[\Lambda_k \langle X_{k+1}, e_i \rangle \langle Z_{k+1}, \bar e_1 \rangle \mid \mathcal{F}^Y_{k+1}]\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \sum_{r=1}^2 \bar E[\Lambda_k \langle X_{k+1}, e_i \rangle \langle Z_{k+1}, \bar e_1 \rangle \langle Z_k, \bar e_r \rangle \mid \mathcal{F}^Y_k]\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \langle A_1 q^1_k, e_i \rangle \frac{F_{k+1}}{F_k}\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle,
\end{aligned}$$
using Lemma 3.2, since only the $r = 1$ term contributes to $\langle Z_{k+1}, \bar e_1 \rangle$.
For $q^2$:
$$\begin{aligned}
q^2_{k+1} &= \bar E[\Lambda_k \lambda_{k+1} X_{k+1} \langle Z_{k+1}, \bar e_2 \rangle \mid \mathcal{F}^Y_{k+1}] \\
&= M \sum_{j=1}^M \bar E[\Lambda_k \langle C X_{k+1}, f_j \rangle X_{k+1} \langle Z_{k+1}, \bar e_2 \rangle \mid \mathcal{F}^Y_{k+1}]\, \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \bar E[\Lambda_k \langle X_{k+1}, e_i \rangle \langle Z_{k+1}, \bar e_2 \rangle \mid \mathcal{F}^Y_{k+1}]\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \sum_{r=1}^2 \bar E[\Lambda_k \langle X_{k+1}, e_i \rangle \langle Z_{k+1}, \bar e_2 \rangle \langle Z_k, \bar e_r \rangle \mid \mathcal{F}^Y_k]\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle \\
&= M \sum_{j=1}^M \sum_{i=1}^N \Big[ \langle A_1 q^1_k, e_i \rangle \frac{p_{k+1}}{F_k} + \langle A_2 q^2_k, e_i \rangle \Big] c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle. \qquad \square
\end{aligned}$$

Remark 5.2. The filter is initialized by assuming the change point $\tau$ has not occurred, so $Z_0 = \bar e_1$, and by taking an initial distribution $q^1_0 = p_0$ for $X_0$, with $q^2_0 = 0$.

Corollary 5.3. The normalized conditional distributions of $X_k$ and $Z_k$ are then given by
$$p^1_k = E[X_k \langle Z_k, \bar e_1 \rangle \mid \mathcal{F}^Y_k] = \frac{q^1_k}{\mathbf{1}' q_k \mathbf{1}}, \tag{5.3}$$
$$p^2_k = E[X_k \langle Z_k, \bar e_2 \rangle \mid \mathcal{F}^Y_k] = \frac{q^2_k}{\mathbf{1}' q_k \mathbf{1}}. \tag{5.4}$$
Given the observations, the conditional probability of the change point $\tau$ is then
$$P(\tau > k \mid \mathcal{F}^Y_k) = E[\langle Z_k, \bar e_1 \rangle \mid \mathcal{F}^Y_k] = \frac{\mathbf{1}' q^1_k}{\mathbf{1}' q_k \mathbf{1}}. \tag{5.5}$$

Remark 5.4. Recall $M$ is the number of elements in the state space of the observation process $Y$. The right hand sides of the recursions (5.1) and (5.2) of Theorem 5.1 involve a factor $M$. However, this will cancel when we consider the normalized forms (5.3) and (5.4). Consequently, equations (5.1) and (5.2) can be modified to give recursions for unnormalized quantities $q^1_k$, $q^2_k$ as:
$$q^1_{k+1} = \sum_{j=1}^M \sum_{i=1}^N \langle A_1 q^1_k, e_i \rangle \frac{F_{k+1}}{F_k}\, c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle, \tag{5.6}$$
$$q^2_{k+1} = \sum_{j=1}^M \sum_{i=1}^N \Big[ \langle A_1 q^1_k, e_i \rangle \frac{p_{k+1}}{F_k} + \langle A_2 q^2_k, e_i \rangle \Big] c_{ji}\, e_i \langle Y_{k+1}, f_j \rangle. \tag{5.7}$$
Again the initialization is $Z_0 = \bar e_1$ and $q^1_0 = p_0 \in \mathbb{R}^N$, $q^2_0 = 0$.
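The unnormalized recursions (5.6) and (5.7) translate directly into code. The sketch below is an illustrative implementation, not the paper's: it assumes a geometric change-point prior (so $F_{k+1}/F_k = 1 - \rho$ and $p_{k+1}/F_k = \rho$ for all $k$), stores states and observations as indices, and rescales $q^1_k, q^2_k$ each step for numerical stability, which leaves the normalized quantities of Corollary 5.3 unchanged. All names and the example matrices are assumptions.

```python
import numpy as np

def filter_step(q1, q2, y, A1, A2, C, F_ratio, p_ratio):
    """One step of (5.6)-(5.7). With Y_{k+1} = f_y one-hot, the sum over j
    collapses to the observed symbol y:
        q1_{k+1}(i) = <A1 q1, e_i> * (F_{k+1}/F_k) * c_{y i},
        q2_{k+1}(i) = [<A1 q1, e_i> * (p_{k+1}/F_k) + <A2 q2, e_i>] * c_{y i}."""
    c = C[y, :]
    q1_new = (A1 @ q1) * F_ratio * c
    q2_new = ((A1 @ q1) * p_ratio + A2 @ q2) * c
    return q1_new, q2_new

def run_filter(ys, A1, A2, C, p0, rho):
    """Filter observation indices ys; returns P(tau > k | F^Y_k) per step, eq. (5.5)."""
    q1, q2 = np.array(p0, float), np.zeros(len(p0))   # Z_0 = e1-bar, q^1_0 = p_0, q^2_0 = 0
    prob_no_change = []
    for y in ys:
        # Geometric prior: F_{k+1}/F_k = 1 - rho and p_{k+1}/F_k = rho for every k.
        q1, q2 = filter_step(q1, q2, y, A1, A2, C, 1.0 - rho, rho)
        total = q1.sum() + q2.sum()                   # = 1' q_{k+1} 1, the normalizer
        prob_no_change.append(q1.sum() / total)
        q1, q2 = q1 / total, q2 / total               # rescale; normalized ratios unaffected
    return q1, q2, prob_no_change

# Illustrative model: after the change, the chain is driven toward state 1.
A1m = np.array([[0.95, 0.05], [0.05, 0.95]])
A2m = np.array([[0.05, 0.05], [0.95, 0.95]])
Cm  = np.array([[0.9, 0.1], [0.1, 0.9]])
q1, q2, p_no = run_filter([0, 0, 0, 1, 1, 1, 1, 1], A1m, A2m, Cm,
                          p0=[0.5, 0.5], rho=0.1)
```

Because mass flows one way from the no-change column $q^1$ into $q^2$, the estimated probability that no change has occurred can only be eroded over time, with the rate governed by how well each regime explains the observations.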
6. A Viterbi Recursion

As noted in our earlier papers, the Viterbi filter is given by replacing the expected value by a maximum likelihood. That is, the Viterbi estimation is given by a sequence of vectors
$$q^1_k = [q^1_k(1), q^1_k(2), \dots, q^1_k(N)], \qquad q^2_k = [q^2_k(1), q^2_k(2), \dots, q^2_k(N)],$$
defined recursively by $Z_0 = \bar e_1$, $q^1_0 = p_0$, $q^2_0 = 0 \in \mathbb{R}^N$, and
$$q^1_{k+1}(i) := \langle A_1 q^1_k, e_i \rangle \frac{F_{k+1}}{F_k} \max_j \big( c_{ji} \langle Y_{k+1}, f_j \rangle \big), \tag{6.1}$$
$$q^2_{k+1}(i) := \Big[ \langle A_1 q^1_k, e_i \rangle \frac{p_{k+1}}{F_k} + \langle A_2 q^2_k, e_i \rangle \Big] \max_j \big( c_{ji} \langle Y_{k+1}, f_j \rangle \big). \tag{6.2}$$

7. Conclusion

Hidden Markov chains, that is, Markov chains observed indirectly through the observations of a second Markov chain, have been extensively studied. For a bibliography see the book [1] by Aggoun, Elliott and Moore. In this paper we have considered a hidden Markov chain $X$ whose dynamics undergo a change at a random time $\tau$. Given an observed process $Y$, filtered estimates for the conditional distribution of $X$ and of the change point time $\tau$ are derived.

References

1. Aggoun, L., Elliott, R. J., and Moore, J. B.: Hidden Markov Models: Estimation and Control, Springer-Verlag, Berlin-Heidelberg-New York.
2. Dembo, A. and Zeitouni, O.: Parameter estimation of partially observed continuous time stochastic processes via the EM algorithm, Stoch. Proc. Appl. 23 (1986), no. 1.
3. Elliott, R. J. and Malcolm, W. P.: Data-recursive smoother formulae for partially observed discrete-time Markov chains, Stoch. Anal. Appl. 24 (2006), no. 3.
4. Rabiner, L. R. and Juang, B. H.: An introduction to hidden Markov models, IEEE ASSP Magazine (January 1986).

Robert J. Elliott: Haskayne School of Business, University of Calgary, Calgary, AB T2N 1N4, Canada
E-mail address: relliott@ucalgary.edu

Sebastian Elliott: Elliott Stochastics LLC, 31 Varsity Estates Park NW, Calgary, AB, Canada
E-mail address: ellio02@gmail.com
More informationRandom Walks Conditioned to Stay Positive
1 Random Walks Conditioned to Stay Positive Bob Keener Let S n be a random walk formed by summing i.i.d. integer valued random variables X i, i 1: S n = X 1 + + X n. If the drift EX i is negative, then
More informationA gentle introduction to Hidden Markov Models
A gentle introduction to Hidden Markov Models Mark Johnson Brown University November 2009 1 / 27 Outline What is sequence labeling? Markov models Hidden Markov models Finding the most likely state sequence
More informationHidden Markov Modelling
Hidden Markov Modelling Introduction Problem formulation Forward-Backward algorithm Viterbi search Baum-Welch parameter estimation Other considerations Multiple observation sequences Phone-based models
More informationPlan Martingales cont d. 0. Questions for Exam 2. More Examples 3. Overview of Results. Reading: study Next Time: first exam
Plan Martingales cont d 0. Questions for Exam 2. More Examples 3. Overview of Results Reading: study Next Time: first exam Midterm Exam: Tuesday 28 March in class Sample exam problems ( Homework 5 and
More informationTheory and Applications of Stochastic Systems Lecture Exponential Martingale for Random Walk
Instructor: Victor F. Araman December 4, 2003 Theory and Applications of Stochastic Systems Lecture 0 B60.432.0 Exponential Martingale for Random Walk Let (S n : n 0) be a random walk with i.i.d. increments
More informationTowards inference for skewed alpha stable Levy processes
Towards inference for skewed alpha stable Levy processes Simon Godsill and Tatjana Lemke Signal Processing and Communications Lab. University of Cambridge www-sigproc.eng.cam.ac.uk/~sjg Overview Motivation
More informationSurvival Analysis: Counting Process and Martingale. Lu Tian and Richard Olshen Stanford University
Survival Analysis: Counting Process and Martingale Lu Tian and Richard Olshen Stanford University 1 Lebesgue-Stieltjes Integrals G( ) is a right-continuous step function having jumps at x 1, x 2,.. b f(x)dg(x)
More informationTools of stochastic calculus
slides for the course Interest rate theory, University of Ljubljana, 212-13/I, part III József Gáll University of Debrecen Nov. 212 Jan. 213, Ljubljana Itô integral, summary of main facts Notations, basic
More informationMarkov Chains and Hidden Markov Models
Chapter 1 Markov Chains and Hidden Markov Models In this chapter, we will introduce the concept of Markov chains, and show how Markov chains can be used to model signals using structures such as hidden
More information7. Shortest Path Problems and Deterministic Finite State Systems
7. Shortest Path Problems and Deterministic Finite State Systems In the next two lectures we will look at shortest path problems, where the objective is to find the shortest path from a start node to an
More informationarxiv: v2 [math.pr] 4 Feb 2009
Optimal detection of homogeneous segment of observations in stochastic sequence arxiv:0812.3632v2 [math.pr] 4 Feb 2009 Abstract Wojciech Sarnowski a, Krzysztof Szajowski b,a a Wroc law University of Technology,
More informationMi-Hwa Ko. t=1 Z t is true. j=0
Commun. Korean Math. Soc. 21 (2006), No. 4, pp. 779 786 FUNCTIONAL CENTRAL LIMIT THEOREMS FOR MULTIVARIATE LINEAR PROCESSES GENERATED BY DEPENDENT RANDOM VECTORS Mi-Hwa Ko Abstract. Let X t be an m-dimensional
More informationMartingale Theory for Finance
Martingale Theory for Finance Tusheng Zhang October 27, 2015 1 Introduction 2 Probability spaces and σ-fields 3 Integration with respect to a probability measure. 4 Conditional expectation. 5 Martingales.
More informationMath 6810 (Probability) Fall Lecture notes
Math 6810 (Probability) Fall 2012 Lecture notes Pieter Allaart University of North Texas September 23, 2012 2 Text: Introduction to Stochastic Calculus with Applications, by Fima C. Klebaner (3rd edition),
More informationDensity Propagation for Continuous Temporal Chains Generative and Discriminative Models
$ Technical Report, University of Toronto, CSRG-501, October 2004 Density Propagation for Continuous Temporal Chains Generative and Discriminative Models Cristian Sminchisescu and Allan Jepson Department
More informationLarge Deviations Techniques and Applications
Amir Dembo Ofer Zeitouni Large Deviations Techniques and Applications Second Edition With 29 Figures Springer Contents Preface to the Second Edition Preface to the First Edition vii ix 1 Introduction 1
More information