On an Extension of the Stochastic Approximation EM Algorithm for Incomplete Data Problems

Vahid Tadayon¹

¹ Department of Statistics, Tarbiat Modares University, Tehran, Iran.

Abstract: The Stochastic Approximation EM (SAEM) algorithm, a stochastic approximation variant of EM, is a versatile tool for inference in incomplete data models. In this paper, we review the fundamental EM algorithm and then focus on its stochastic version. To construct the SAEM, the algorithm combines EM with a variant of stochastic approximation that uses Markov chain Monte Carlo to deal with the missing data. The algorithm is introduced in general form and can be applied to a wide range of problems.

Keywords: Stochastic Approximation; EM Algorithm; Incomplete Data; Markov Chain Monte Carlo; Maximum Likelihood.

1. Introduction

A standard method for handling incomplete data problems is the EM algorithm (Dempster et al., 1977; Tadayon and Torabi, 2018), an iterative method for finding maximum likelihood estimates from incomplete data. The algorithm has been applied to a wide variety of problems, but its convergence can be slow, and in situations where the data are dependent as well as incomplete (for example, spatial incomplete data problems) it can be highly inefficient. To resolve some of these difficulties, we explore the Stochastic Approximation EM (SAEM) algorithm. As a prelude, we first consider Stochastic Approximation (SA), which was introduced by Robbins and Monro (1951) and extended by Gu and Kong (1998) to situations where data are incomplete. We combine SA with MCMC to construct SAEM; the SAEM algorithm itself originates with Delyon et al. (1999). In this paper, we propose an extension of SAEM based on SA with MCMC (Gu and Kong, 1998). In the next section the EM algorithm is reviewed; SAEM is then introduced.

2. EM algorithm

We assume that $x$ is the observed (or incomplete) data, generated by some distribution, and let $z$ denote the unobserved (or missing) data, so that in the EM algorithm the pair $(x, z)$ is regarded as the complete data. Let $f(x, z; \theta)$ denote the joint distribution of the complete data, depending on the parameter vector $\theta$. With this density function we can define a new likelihood function

$$L(\theta) = L(\theta; x) = \int f(x, z; \theta)\, dz,$$

which is referred to as the incomplete-data likelihood function. The goal is to find $\hat{\theta}$, the maximizer of the marginal likelihood $L(\theta)$.
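To make this integral concrete, the following minimal Python sketch evaluates $L(\theta; x)$ by Monte Carlo for a toy model of our own choosing (not from the paper): $z \sim N(\theta, 1)$ and $x \mid z \sim N(z, 1)$, picked because the marginal $x \sim N(\theta, 2)$ is available in closed form as a check.

```python
# Minimal sketch: Monte Carlo evaluation of the incomplete-data likelihood
# L(theta; x) = integral of f(x, z; theta) dz for an assumed toy model (not
# from the paper): z ~ N(theta, 1) and x | z ~ N(z, 1), so x ~ N(theta, 2).
import numpy as np
from scipy import stats

def incomplete_data_likelihood(theta, x, n_draws=200_000, seed=0):
    """Approximate L(theta; x) = E_{z ~ N(theta, 1)}[ f(x | z) ] by averaging."""
    rng = np.random.default_rng(seed)
    z = rng.normal(theta, 1.0, size=n_draws)           # draws of the missing data z
    return stats.norm.pdf(x, loc=z, scale=1.0).mean()  # average of f(x | z)

x_obs, theta = 1.3, 0.5
mc = incomplete_data_likelihood(theta, x_obs)
exact = stats.norm.pdf(x_obs, loc=theta, scale=np.sqrt(2.0))  # closed-form marginal
print(f"Monte Carlo: {mc:.5f}   exact: {exact:.5f}")          # the two should agree
```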

The EM algorithm consists of two steps. The first computes the expected value of the complete-data log-likelihood $\log f(x, z; \theta)$ with respect to the unknown data $z$, given the observed data $x$ and the current parameter estimates; that is, we define

$$Q(\theta, \theta^{(t-1)}) = E[\log f(x, z; \theta) \mid x, \theta^{(t-1)}], \qquad (1)$$

where $\theta^{(t-1)}$ denotes the current parameter estimates. The second step maximizes this expectation; this is the M-step. Given an initial value $\theta^{(0)}$, the EM algorithm produces a sequence $\{\theta^{(0)}, \theta^{(1)}, \theta^{(2)}, \ldots\}$ that, under mild regularity conditions (Boyles, 1983), converges to $\hat{\theta}$ (a closed-form instance of this iteration for the toy model above is sketched at the end of this section).

To end this section, we mention some challenges for the EM algorithm. One of the biggest is that it only guarantees convergence to a local solution (Jank, 2006; Tadayon and Rasekh, 2018; Tadayon, 2017; Tadayon, 2015; Tadayon, 2018): EM is a greedy method in the sense that it is attracted to the solution closest to its starting value, which makes the choice of starting values a further problem. In addition, in some cases the likelihood function is computationally intractable and it is infeasible to maximize the observed-data likelihood directly. To avoid these problems, SAEM is introduced in the next section.
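As promised, here is the EM iteration (1) for the illustrative toy model, where everything is available in closed form: the conditional distribution is $z \mid x, \theta^{(t-1)} \sim N\big((x + \theta^{(t-1)})/2,\, 1/2\big)$, so maximizing $Q$ reduces to setting $\theta^{(t)}$ equal to the posterior mean of $z$, and the iterates converge to the marginal MLE $\hat{\theta} = x$.

```python
# Minimal sketch of the EM iteration (1) for the assumed toy model
# (z ~ N(theta, 1), x | z ~ N(z, 1)); each step is theta <- (x + theta_old)/2.
def em_toy(x, theta=0.0, n_iter=30):
    for _ in range(n_iter):
        e_z = 0.5 * (x + theta)  # E-step: E[z | x, theta_old], in closed form
        theta = e_z              # M-step: argmax over theta of Q(theta, theta_old)
    return theta

print(em_toy(x=1.3))  # -> 1.3, the maximizer of the marginal likelihood N(x; theta, 2)
```

Each iteration halves the distance to $\hat{\theta}$; in richer models this contraction can be far slower, which is the convergence concern raised in the Introduction.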

3. Stochastic Approximation EM algorithm

Using the likelihood function $L(\theta)$, the maximum likelihood estimate of $\theta$, denoted by $\hat{\theta}$, is defined by

$$L(\hat{\theta}; x) = \max_{\theta} L(\theta; x). \qquad (2)$$

Because $L(\theta)$ is computationally intractable, we consider the first-order and second-order partial derivatives of the log-likelihood function, in order to use gradient-type algorithms such as the Newton-Raphson and Gauss-Newton algorithms (Ortega, 1990).

3.1 Derivatives of the log-likelihood function

The first-order and second-order derivatives of the log-likelihood function can be derived from the complete-data log-likelihood, denoted by $l_c(\theta; x, z)$. From the missing information principle, the first-order derivative of $\log L(\theta; x)$, called the score function, can be written as

$$s(\theta; x) = \partial_\theta \log L(\theta; x) = E[S(\theta; z) \mid x, \theta], \qquad (3)$$

where $S(\theta; z) = \partial_\theta\, l_c(\theta; x, z)$ and $E[\,\cdot \mid x, \theta]$ denotes expectation with respect to the conditional distribution $f(z \mid X = x, \theta)$. Here we write $\partial_\theta a(\theta) = \partial a(\theta)/\partial\theta$ and $\partial^2_\theta a(\theta) = \partial^2 a(\theta)/\partial\theta\,\partial\theta^T$ for the first-order and second-order derivatives with respect to the parameter vector $\theta$. To calculate the second-order derivative of the log-likelihood function, we apply Louis's (1982) formula to obtain

$$-\partial^2_\theta \log L(\theta; x) = E[\, I(\theta; z) - S(\theta; z)^{\otimes} \mid x, \theta \,] + s(\theta; x)^{\otimes}, \qquad (4)$$

where $a^{\otimes} = a a^T$ for a vector $a$, and $I(\theta; z) = -\partial^2_\theta\, l_c(\theta; x, z)$ denotes the information matrix for the complete data.

3.2 Steps of the SAEM algorithm

At the $k$-th iteration, $\theta^{(k)}$ is the current estimate of $\hat{\theta}$; $h^{(k)}$ is the current estimate of $s(\hat{\theta}; x)$; and $\Gamma^{(k)}(t)$, with $t \in [0, 1]$, is the current estimate of $E[\, I(\hat{\theta}; z) - t\, S(\hat{\theta}; z)^{\otimes} \mid x, \hat{\theta}\,] + s(\hat{\theta}; x)^{\otimes}$. We assume that $P_{x,\theta}(\cdot, \cdot)$ is the transition probability of the Metropolis-Hastings algorithm used to simulate from the conditional distribution of $z$ given $x$ and $\theta$.

Step 1. At the $k$-th iteration, set $z^{(k,0)} = z^{(k-1,N)}$ and generate $z^{(k)} = (z^{(k,1)}, \ldots, z^{(k,N)})$ from the transition probability $P_{x,\theta^{(k-1)}}(z^{(k,j-1)}, \cdot)$, $j = 1, \ldots, N$.

Step 2. Update the estimates as follows:

$$\theta^{(k)} = \theta^{(k-1)} + \gamma_k\, [\Gamma^{(k-1)}(t)]^{-1} H_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}),$$
$$h^{(k)} = h^{(k-1)} + \gamma_k \big( H_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}) - h^{(k-1)} \big),$$
$$\Gamma^{(k)} = \Gamma^{(k-1)} + \gamma_k \big( I_{x,\theta^{(k-1)}}(\theta^{(k-1)}; z^{(k)}) - \Gamma^{(k-1)} \big),$$

where

$$H_{x,\theta}(\theta; z^{(k)}) = \frac{1}{N} \sum_{j=1}^{N} S(\theta; z^{(k,j)}), \qquad I_{x,\theta}(\theta; z^{(k)}) = \frac{1}{N} \sum_{j=1}^{N} \big[ I(\theta; z^{(k,j)}) - t\, S(\theta; z^{(k,j)})\, S(\theta; z^{(k,j)})^T \big].$$

Finally, the sequence of constants $\{\gamma_k\}$, with $0 \le \gamma_k \le 1$ for all $k$, satisfies the conditions

$$\sum_{k=1}^{\infty} \gamma_k = \infty \qquad \text{and} \qquad \sum_{k=1}^{\infty} \gamma_k^2 < \infty.$$

An important feature of the SAEM algorithm is that it uses the constants $\{\gamma_k\}$ to handle the noise in approximating $\partial_\theta \log L(\theta; x)$ and $\partial^2_\theta \log L(\theta; x)$ in Step 2 (Robbins and Monro, 1951; Lai, 2003).
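The sketch below specializes Steps 1-2 to the same scalar toy model used earlier, for which $S(\theta; z) = z - \theta$ and $I(\theta; z) = 1$; the random-walk Metropolis kernel, the gains $\gamma_k = 1/k$, and the lower bound on $\Gamma^{(k)}$ are our own illustrative choices, not prescriptions of the algorithm.

```python
# Minimal sketch of SAEM Steps 1-2 for the assumed scalar toy model
# (z ~ N(theta, 1), x | z ~ N(z, 1)): S(theta; z) = z - theta, I(theta; z) = 1.
import numpy as np

def mh_kernel(z, x, theta, n_steps, rng, step=1.0):
    """Random-walk Metropolis chain targeting f(z | x, theta)."""
    log_target = lambda u: -0.5 * ((x - u) ** 2 + (u - theta) ** 2)
    zs = np.empty(n_steps)
    for j in range(n_steps):
        prop = z + step * rng.normal()  # propose a move
        if np.log(rng.uniform()) < log_target(prop) - log_target(z):
            z = prop                    # accept; otherwise keep the current state
        zs[j] = z
    return zs

def saem(x, theta=0.0, n_iter=2000, N=10, t=1.0, seed=1):
    rng = np.random.default_rng(seed)
    h, Gamma, z = 0.0, 1.0, theta
    for k in range(1, n_iter + 1):
        gamma_k = 1.0 / k                        # sum = infinity, sum of squares finite
        zs = mh_kernel(z, x, theta, N, rng)      # Step 1: z^{(k,1)}, ..., z^{(k,N)}
        z = zs[-1]                               # warm-start the chain next iteration
        S = zs - theta                           # complete-data scores S(theta; z^{(k,j)})
        H = S.mean()                             # H_{x,theta}(theta; z^{(k)})
        I = np.mean(1.0 - t * S ** 2)            # I_{x,theta}(theta; z^{(k)}), t in [0, 1]
        theta += gamma_k * H / max(Gamma, 1e-2)  # Step 2: Newton-type update (guarded)
        h += gamma_k * (H - h)                   # running estimate of the score
        Gamma += gamma_k * (I - Gamma)           # running estimate of the information
    return theta

print(saem(x=1.3))  # settles near the MLE theta_hat = x = 1.3
```

With $t = 1$, $\Gamma^{(k)}$ tracks the observed information of Louis's formula (4), which for this toy model is $1/2$ (the information of the $N(\theta, 2)$ marginal); with $t = 0$ it tracks only the conditional expectation of the complete-data information.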

Now we consider the convergence of the algorithm. It can be shown that the sequence of parameter estimates returned by the SAEM algorithm approximates the solution to the differential equation

$$\frac{d\theta(s)}{ds} = \Gamma^{-1} E[H(\theta, z) \mid x, \theta],$$

with the corresponding equations for $h$ and $\Gamma(t)$. For more details and the conditions of this convergence, see Theorem 3.1 of Benveniste et al. (1990) and Gu and Kong (1998); it suffices to check that the distribution under study satisfies the conditions of that theorem.

4. Conclusion

In this study, we presented a stochastic approximation interpretation of the EM algorithm. This stochastic approximation viewpoint provides some conveniences for EM, and it suggests a more flexible approach to the maximization step of the EM algorithm through the use of MCMC. It should be emphasized that the main goal of the current paper is to concentrate on the role of stochastic approximation in the expectation stage of the EM algorithm.

References

1. Benveniste, A., Métivier, M., and Priouret, P. (1990). Adaptive Algorithms and Stochastic Approximation. New York: Springer.
2. Boyles, R. A. (1983). On the convergence of the EM algorithm. Journal of the Royal Statistical Society B, 45: 47-50.
3. Delyon, B., Lavielle, M., and Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. The Annals of Statistics, 27: 94-128.
4. Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39: 1-38.
5. Gu, M. G. and Kong, F. H. (1998). A stochastic approximation algorithm with Markov chain Monte Carlo method for incomplete data estimation problems. Proceedings of the National Academy of Sciences of the USA, 95: 7270-7274.
6. Jank, W. (2006). The EM algorithm, its stochastic implementation and global optimization: some challenges and opportunities for OR. In Alt, Fu, and Golden (Eds.), Topics in Modeling, Optimization, and Decision Technologies: Honoring Saul Gass's Contributions to Operations Research. New York: Springer, 367-392.
7. Lai, T. L. (2003). Stochastic approximation. The Annals of Statistics, 31: 391-406.
8. Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society B, 44: 226-233.
9. Ortega, J. M. (1990). Numerical Analysis: A Second Course. Philadelphia: Society for Industrial and Applied Mathematics.
10. Robbins, H. and Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22: 400-407.
11. Tadayon, V. and Torabi, M. (2018). Spatial models for non-Gaussian data with covariate measurement error. Environmetrics. doi:10.1002/env.2545.
12. Tadayon, V. and Rasekh, A. (2018). Non-Gaussian covariate-dependent spatial measurement error model for analyzing big spatial data. Journal of Agricultural, Biological and Environmental Statistics. doi:10.1007/s13253-018-0034-3.

13. Tadayon, V. (2017). Bayesian analysis of censored spatial data based on a non-Gaussian model. Journal of Statistical Research of Iran, 13: 155-180. doi:10.18869/acadpub.jsr.13.2.155.
14. Tadayon, V. (2015). Bayesian analysis of skew Gaussian spatial models based on censored data. Communications in Statistics - Simulation and Computation, 44. doi:10.1080/03610918.2013.839036.
15. Tadayon, V. (2018). Analysis of Gaussian spatial models with covariate measurement error. arXiv preprint arXiv:1812.05648.
16. Tadayon, V. (2017). Bayesian analysis of censored spatial data based on a non-Gaussian model. arXiv preprint arXiv:1706.05717.