On an Extension of the Stochastic Approximation EM Algorithm for Incomplete Data Problems

Vahid Tadayon¹

¹ Department of Statistics, Tarbiat Modares University, Tehran, Iran.

Abstract: The Stochastic Approximation EM (SAEM) algorithm, a stochastic approximation variant of EM, is a versatile tool for inference in incomplete data models. In this paper, we review the fundamental EM algorithm and then focus on its stochastic version. To construct the SAEM, the algorithm combines EM with a variant of stochastic approximation that uses Markov chain Monte Carlo to deal with the missing data. The algorithm is introduced in general form and can be applied to a wide range of problems.

Keywords: Stochastic Approximation; EM Algorithm; Incomplete Data; Markov Chain Monte Carlo; Maximum Likelihood.

1. Introduction

A standard method for handling incomplete data problems is the EM algorithm (Dempster et al., 1977; Tadayon and Torabi, 2018), an iterative method for finding maximum likelihood estimates from incomplete data. The algorithm has been applied to a wide variety of problems, but its convergence can be slow, and in situations where the data are dependent as well as incomplete (for example, spatial incomplete data problems) it can be highly inefficient. To resolve some of these difficulties, we explore the Stochastic Approximation EM (SAEM) algorithm. As a prelude, we first consider Stochastic Approximation (SA), which was introduced by Robbins and Monro (1951) and extended by Gu and Kong (1998) to situations where data are incomplete. We combine SA with MCMC to construct SAEM; the SAEM algorithm itself originates with Delyon et al. (1999). In this paper, we propose an extension of SAEM based on SA with MCMC (Gu and Kong, 1998). In the next section the EM algorithm is reviewed; SAEM is then introduced.

2. EM algorithm

We assume that $x$ is the observed (or incomplete) data, generated by some distribution, and let $z$ denote the unobserved (or missing) data, so that in the EM algorithm the pair $(x, z)$ is regarded as the complete data. Let $f(x, z; \theta)$ denote the joint distribution of the complete data, depending on the parameter vector $\theta$. With this density function we can define a new likelihood function

$$L(\theta) = L(\theta; x) = \int f(x, z; \theta)\, dz,$$

which is referred to as the incomplete-data likelihood function. The goal is to find $\hat{\theta}$, the maximizer of the marginal likelihood $L(\theta)$.
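To make this integral concrete, the following minimal Python sketch evaluates $L(\theta; x)$ by Monte Carlo for a toy model of our own choosing (not from the paper): $z \sim N(\theta, 1)$ and $x \mid z \sim N(z, 1)$, picked because the marginal $x \sim N(\theta, 2)$ is available in closed form as a check.

```python
# Minimal sketch: Monte Carlo evaluation of the incomplete-data likelihood
# L(theta; x) = integral of f(x, z; theta) dz for an assumed toy model (not
# from the paper): z ~ N(theta, 1) and x | z ~ N(z, 1), so x ~ N(theta, 2).
import numpy as np
from scipy import stats

def incomplete_data_likelihood(theta, x, n_draws=200_000, seed=0):
    """Approximate L(theta; x) = E_{z ~ N(theta, 1)}[ f(x | z) ] by averaging."""
    rng = np.random.default_rng(seed)
    z = rng.normal(theta, 1.0, size=n_draws)           # draws of the missing data z
    return stats.norm.pdf(x, loc=z, scale=1.0).mean()  # average of f(x | z)

x_obs, theta = 1.3, 0.5
mc = incomplete_data_likelihood(theta, x_obs)
exact = stats.norm.pdf(x_obs, loc=theta, scale=np.sqrt(2.0))  # closed-form marginal
print(f"Monte Carlo: {mc:.5f}   exact: {exact:.5f}")          # the two should agree
```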

The EM algorithm consists of two steps. The first computes the expected value of the complete-data log-likelihood $\log f(x, z; \theta)$ with respect to the unknown data $z$, given the observed data $x$ and the current parameter estimates; that is, we define

$$Q(\theta, \theta^{(t-1)}) = E[\log f(x, z; \theta) \mid x, \theta^{(t-1)}], \qquad (1)$$

where $\theta^{(t-1)}$ denotes the current parameter estimates. The second step maximizes this expectation; this is the M-step. Given an initial value $\theta^{(0)}$, the EM algorithm produces a sequence $\{\theta^{(0)}, \theta^{(1)}, \theta^{(2)}, \ldots\}$ that, under mild regularity conditions (Boyles, 1983), converges to $\hat{\theta}$ (a closed-form instance of this iteration for the toy model above is sketched at the end of this section).

To end this section, we mention some challenges for the EM algorithm. One of the biggest is that it only guarantees convergence to a local solution (Jank, 2006; Tadayon and Rasekh, 2018; Tadayon, 2017; Tadayon, 2015; Tadayon, 2018): EM is a greedy method in the sense that it is attracted to the solution closest to its starting value, which makes the choice of starting values a further problem. In addition, in some cases the likelihood function is computationally intractable and it is infeasible to maximize the observed-data likelihood directly. To avoid these problems, SAEM is introduced in the next section.
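As promised, here is the EM iteration (1) for the illustrative toy model, where everything is available in closed form: the conditional distribution is $z \mid x, \theta^{(t-1)} \sim N\big((x + \theta^{(t-1)})/2,\, 1/2\big)$, so maximizing $Q$ reduces to setting $\theta^{(t)}$ equal to the posterior mean of $z$, and the iterates converge to the marginal MLE $\hat{\theta} = x$.

```python
# Minimal sketch of the EM iteration (1) for the assumed toy model
# (z ~ N(theta, 1), x | z ~ N(z, 1)); each step is theta <- (x + theta_old)/2.
def em_toy(x, theta=0.0, n_iter=30):
    for _ in range(n_iter):
        e_z = 0.5 * (x + theta)  # E-step: E[z | x, theta_old], in closed form
        theta = e_z              # M-step: argmax over theta of Q(theta, theta_old)
    return theta

print(em_toy(x=1.3))  # -> 1.3, the maximizer of the marginal likelihood N(x; theta, 2)
```

Each iteration halves the distance to $\hat{\theta}$; in richer models this contraction can be far slower, which is the convergence concern raised in the Introduction.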

3. Stochastic Approximation EM algorithm

Using the likelihood function $L(\theta)$, the maximum likelihood estimate of $\theta$, denoted by $\hat{\theta}$, is defined by

$$L(\hat{\theta}; x) = \max_{\theta} L(\theta; x). \qquad (2)$$

Because $L(\theta)$ is computationally intractable, we consider the first-order and second-order partial derivatives of the log-likelihood function, in order to use gradient-type algorithms such as the Newton-Raphson and Gauss-Newton algorithms (Ortega, 1990).

3.1 Derivatives of the log-likelihood function

The first-order and second-order derivatives of the log-likelihood function can be derived from the complete-data log-likelihood, denoted by $l_c(\theta; x, z)$. From the missing information principle, the first-order derivative of $\log L(\theta; x)$, called the score function, can be written as

$$s(\theta; x) = \partial_\theta \log L(\theta; x) = E[S(\theta; z) \mid x, \theta], \qquad (3)$$

where $S(\theta; z) = \partial_\theta\, l_c(\theta; x, z)$ and $E[\,\cdot \mid x, \theta]$ denotes expectation with respect to the conditional distribution $f(z \mid X = x, \theta)$. Here we write $\partial_\theta a(\theta) = \partial a(\theta)/\partial\theta$ and $\partial^2_\theta a(\theta) = \partial^2 a(\theta)/\partial\theta\,\partial\theta^T$ for the first-order and second-order derivatives with respect to the parameter vector $\theta$. To calculate the second-order derivative of the log-likelihood function, we apply Louis's (1982) formula to obtain

$$-\partial^2_\theta \log L(\theta; x) = E[\, I(\theta; z) - S(\theta; z)^{\otimes} \mid x, \theta \,] + s(\theta; x)^{\otimes}, \qquad (4)$$

where $a^{\otimes} = a a^T$ for a vector $a$, and $I(\theta; z) = -\partial^2_\theta\, l_c(\theta; x, z)$ denotes the information matrix for the complete data.

3.2 Steps of the SAEM algorithm

At the $k$-th iteration, $\theta^{(k)}$ is the current estimate of $\hat{\theta}$; $h^{(k)}$ is the current estimate of $s(\hat{\theta}; x)$; and $\Gamma^{(k)}(t)$, with $t \in [0, 1]$, is the current estimate of $E[\, I(\hat{\theta}; z) - t\, S(\hat{\theta}; z)^{\otimes} \mid x, \hat{\theta}\,] + s(\hat{\theta}; x)^{\otimes}$. We assume that $P_{x,\theta}(\cdot, \cdot)$ is the transition probability of the Metropolis-Hastings algorithm used to simulate from the conditional distribution of $z$ given $x$ and $\theta$.

Step 1. At the $k$-th iteration, set $z^{(k,0)} = z^{(k-1,N)}$ and generate $z^{(k)} = (z^{(k,1)}, \ldots, z^{(k,N)})$ from the transition probability $P_{x,\theta^{(k-1)}}(z^{(k,j-1)}, \cdot)$, $j = 1, \ldots, N$.

Step 2. Update the estimates as follows:

$$\theta^{(k)} = \theta^{(k-1)} + \gamma_k\, [\Gamma^{(k-1)}(t)]^{-1} H_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}),$$
$$h^{(k)} = h^{(k-1)} + \gamma_k \big( H_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}) - h^{(k-1)} \big),$$
$$\Gamma^{(k)} = \Gamma^{(k-1)} + \gamma_k \big( I_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}) - \Gamma^{(k-1)} \big),$$

where

$$H_{x,\theta}(\theta; z^{(k)}) = \frac{1}{N} \sum_{j=1}^{N} S(\theta; z^{(k,j)}), \qquad I_{x,\theta}(\theta; z^{(k)}) = \frac{1}{N} \sum_{j=1}^{N} \big[ I(\theta; z^{(k,j)}) - t\, S(\theta; z^{(k,j)})\, S(\theta; z^{(k,j)})^T \big].$$

Finally, the sequence of constants $\{\gamma_k\}$, with $0 \le \gamma_k \le 1$ for all $k$, satisfies the conditions

$$\sum_{k=1}^{\infty} \gamma_k = \infty \qquad \text{and} \qquad \sum_{k=1}^{\infty} \gamma_k^2 < \infty.$$

An important feature of the SAEM algorithm is that it uses the constants $\{\gamma_k\}$ to handle the noise in approximating $\partial_\theta \log L(\theta; x)$ and $\partial^2_\theta \log L(\theta; x)$ in Step 2 (Robbins and Monro, 1951; Lai, 2003).
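The sketch below specializes Steps 1-2 to the same scalar toy model used earlier, for which $S(\theta; z) = z - \theta$ and $I(\theta; z) = 1$; the random-walk Metropolis kernel, the gains $\gamma_k = 1/k$, and the lower bound on $\Gamma^{(k)}$ are our own illustrative choices, not prescriptions of the algorithm.

```python
# Minimal sketch of SAEM Steps 1-2 for the assumed scalar toy model
# (z ~ N(theta, 1), x | z ~ N(z, 1)): S(theta; z) = z - theta, I(theta; z) = 1.
import numpy as np

def mh_kernel(z, x, theta, n_steps, rng, step=1.0):
    """Random-walk Metropolis chain targeting f(z | x, theta)."""
    log_target = lambda u: -0.5 * ((x - u) ** 2 + (u - theta) ** 2)
    zs = np.empty(n_steps)
    for j in range(n_steps):
        prop = z + step * rng.normal()  # propose a move
        if np.log(rng.uniform()) < log_target(prop) - log_target(z):
            z = prop                    # accept; otherwise keep the current state
        zs[j] = z
    return zs

def saem(x, theta=0.0, n_iter=2000, N=10, t=1.0, seed=1):
    rng = np.random.default_rng(seed)
    h, Gamma, z = 0.0, 1.0, theta
    for k in range(1, n_iter + 1):
        gamma_k = 1.0 / k                        # sum = infinity, sum of squares finite
        zs = mh_kernel(z, x, theta, N, rng)      # Step 1: z^{(k,1)}, ..., z^{(k,N)}
        z = zs[-1]                               # warm-start the chain next iteration
        S = zs - theta                           # complete-data scores S(theta; z^{(k,j)})
        H = S.mean()                             # H_{x,theta}(theta; z^{(k)})
        I = np.mean(1.0 - t * S ** 2)            # I_{x,theta}(theta; z^{(k)}), t in [0, 1]
        theta += gamma_k * H / max(Gamma, 1e-2)  # Step 2: Newton-type update (guarded)
        h += gamma_k * (H - h)                   # running estimate of the score
        Gamma += gamma_k * (I - Gamma)           # running estimate of the information
    return theta

print(saem(x=1.3))  # settles near the MLE theta_hat = x = 1.3
```

With $t = 1$, $\Gamma^{(k)}$ tracks the observed information of Louis's formula (4), which for this toy model is $1/2$ (the information of the $N(\theta, 2)$ marginal); with $t = 0$ it tracks only the conditional expectation of the complete-data information.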

Now we consider the convergence of the algorithm. It can be shown that the sequence of parameter estimates returned by the SAEM algorithm approximates the solution to the differential equation

$$\frac{d\theta(s)}{ds} = \Gamma^{-1} E[H(\theta, z) \mid x, \theta],$$

with the corresponding equations for $h$ and $\Gamma(t)$. For more details and the conditions of this convergence, see Theorem 3.1 of Benveniste et al. (1990) and Gu and Kong (1998); it suffices to check that the distribution under study satisfies the conditions of that theorem.

4. Conclusion

In this study, we presented a stochastic approximation interpretation of the EM algorithm. This stochastic approximation viewpoint provides some conveniences for EM, and it suggests a more flexible approach to the maximization step of the EM algorithm through the use of MCMC. It should be emphasized that the main goal of the current paper is to concentrate on the role of stochastic approximation in the expectation stage of the EM algorithm.

References

1. Benveniste, A., Métivier, M., and Priouret, P. (1990). Adaptive Algorithms and Stochastic Approximation. New York: Springer.
2. Boyles, R. A. (1983). On the convergence of the EM algorithm. Journal of the Royal Statistical Society B, 45: 47-50.
3. Delyon, B., Lavielle, M., and Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. The Annals of Statistics, 27: 94-128.
4. Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39: 1-38.
5. Gu, M. G. and Kong, F. H. (1998). A stochastic approximation algorithm with Markov chain Monte Carlo method for incomplete data estimation problems. Proceedings of the National Academy of Sciences of the USA, 95: 7270-7274.
6. Jank, W. (2006). The EM algorithm, its stochastic implementation and global optimization: some challenges and opportunities for OR. In Alt, Fu, and Golden (Eds.), Topics in Modeling, Optimization, and Decision Technologies: Honoring Saul Gass's Contributions to Operations Research. New York: Springer, 367-392.
7. Lai, T. L. (2003). Stochastic approximation. The Annals of Statistics, 31: 391-406.
8. Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society B, 44: 226-233.
9. Ortega, J. M. (1990). Numerical Analysis: A Second Course. Philadelphia: Society for Industrial and Applied Mathematics.
10. Robbins, H. and Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22: 400-407.
11. Tadayon, V. and Torabi, M. (2018). Spatial models for non-Gaussian data with covariate measurement error. Environmetrics. doi:10.1002/env.2545.
12. Tadayon, V. and Rasekh, A. (2018). Non-Gaussian covariate-dependent spatial measurement error model for analyzing big spatial data. Journal of Agricultural, Biological and Environmental Statistics. doi:10.1007/s13253-018-0034-3.

13. Tadayon, V. (2017). Bayesian analysis of censored spatial data based on a non-Gaussian model. Journal of Statistical Research of Iran, 13: 155-180. doi:10.18869/acadpub.jsr.13.2.155.
14. Tadayon, V. (2015). Bayesian analysis of skew Gaussian spatial models based on censored data. Communications in Statistics - Simulation and Computation, 44. doi:10.1080/03610918.2013.839036.
15. Tadayon, V. (2018). Analysis of Gaussian spatial models with covariate measurement error. arXiv preprint arXiv:1812.05648.
16. Tadayon, V. (2017). Bayesian analysis of censored spatial data based on a non-Gaussian model. arXiv preprint arXiv:1706.05717.