International Journal of Pure and Applied Mathematics, Volume 118 (2018)
url: http://www.ijpam.eu

MODELING OF VEHICLE INSURANCE CLAIM USING EXPONENTIAL HIDDEN MARKOV

Irfani Azis, Berlian Setiawaty, I Gusti Putu Purnaba
Department of Mathematics, Bogor Agricultural University, Meranti Street, Bogor, 16680, INDONESIA

Abstract: The purpose of this research is to model vehicle insurance claims using an exponential hidden Markov model (EHMM). The maximum likelihood method, the forward-backward algorithm and the expectation maximization (EM) algorithm are used to determine the EHMM parameters. The best model is selected based on the Bayesian information criterion (BIC).

AMS Subject Classification: 60J22, 60J28
Key Words: claim, exponential, hidden Markov, vehicle insurance

Published: March 17, 2018. (c) 2018 Academic Publications, Ltd.

1 Introduction

Insurance is an agreement in which the insurer binds itself to the insured, receiving a premium in exchange for providing compensation for loss, damage or loss of expected profit that may occur due to an uncertain event. Vehicle insurance is a type of insurance that protects a vehicle against loss. The cause of a loss is unpredictable, so the amount that an insurance company will have to pay out for claims is not known in advance.

Figure 1: Histogram of the frequency of vehicle insurance claims of PT XX in 2014.

The greater the loss, the greater the claim that has to be paid. A loss can be caused by several factors, such as weather conditions, traffic conditions, the driver's condition and other factors. Figure 1 shows that small losses occur often while huge losses occur rarely, so the distribution of the claims can be seen as an exponential distribution or a mixture of exponential distributions. If the cause of loss is assumed to form a Markov chain that is not observed directly, and only the claim data are observed, then the pair of claim and cause of loss can be modeled by a hidden Markov model. The purpose of this research is to model vehicle insurance claims using an exponential hidden Markov model.

2 Exponential Hidden Markov Model

An exponential hidden Markov model (EHMM) is a model consisting of a pair of stochastic processes $\{X_t, Y_t\}_{t \in \mathbb{N}}$: $\{X_t\}_{t \in \mathbb{N}}$ is unobserved and forms a Markov chain, while $\{Y_t\}_{t \in \mathbb{N}}$ is an observation process which depends on $\{X_t\}_{t \in \mathbb{N}}$, and $Y_t \mid X_t$ has an exponential distribution for all $t$.

Let $\{X_t\}_{t \in \mathbb{N}}$ be the Markov chain with state space $S_X = \{1, 2, 3, \ldots, m\}$ and transition probability matrix $A = [a_{ij}]_{m \times m}$, where
$$a_{ij} = P(X_t = j \mid X_{t-1} = i) = P(X_2 = j \mid X_1 = i), \qquad a_{ij} \geq 0, \qquad \sum_{j=1}^{m} a_{ij} = 1.$$
Let $\varphi = [\varphi_i]_{m \times 1}$ be the initial state probability vector, where $\varphi_i = P(X_1 = i)$ and $\sum_{i=1}^{m} \varphi_i = 1$. The distribution of $Y_t \mid X_t = i$, for $i \in S_X$ and $t \in \mathbb{N}$, is exponential with parameter $\lambda_i$, so the probability density function of $Y_t \mid X_t = i$ is
$$\gamma_{y_t i} = f(y) = \lambda_i e^{-\lambda_i y}, \qquad y > 0.$$
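To make the definition concrete, the following short Python sketch simulates claims from an EHMM. It is only an illustration: the function name and the 2-state parameter values below are invented for the example and are not taken from the paper.

```python
import numpy as np

def simulate_ehmm(A, phi, lam, T, seed=None):
    """Simulate T days of an exponential hidden Markov model.

    A   : (m, m) transition probability matrix, rows sum to 1
    phi : (m,) initial state probability vector
    lam : (m,) exponential rate parameter of each hidden state
    """
    rng = np.random.default_rng(seed)
    m = len(phi)
    states = np.empty(T, dtype=int)
    claims = np.empty(T)
    states[0] = rng.choice(m, p=phi)
    claims[0] = rng.exponential(scale=1.0 / lam[states[0]])
    for t in range(1, T):
        states[t] = rng.choice(m, p=A[states[t - 1]])             # hidden cause of loss
        claims[t] = rng.exponential(scale=1.0 / lam[states[t]])   # observed claim size
    return states, claims

# Hypothetical 2-state example (illustrative values only).
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
phi = np.array([0.5, 0.5])
lam = np.array([0.05, 0.01])   # mean claim sizes of 20 and 100 units
states, claims = simulate_ehmm(A, phi, lam, T=298, seed=0)
```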

Suppose that $\lambda = [\lambda_i]_{m \times 1}$. The EHMM can then be identified by the parameter $\phi = (m, A, \lambda)$.

3 Parameter Estimation of Exponential Hidden Markov Model

The parameter $\phi = (m, A, \lambda)$ is estimated using the maximum likelihood method (see [1]). Suppose that $T$ is the number of observations, $y = (y_1, y_2, \ldots, y_T)$ is the observed data of $\{Y_t\}_{t \in \mathbb{N}}$, $x = (i_1, i_2, \ldots, i_T)$ is the unobserved data of $\{X_t\}_{t \in \mathbb{N}}$, $(x, y) = (i_1, y_1, i_2, y_2, \ldots, i_T, y_T)$ is the complete data of $\{X_t, Y_t\}_{t \in \mathbb{N}}$, and
$$\Phi_m = \{\phi = (m, A, \lambda) : A \in [0,1]^{m^2}, \; \lambda \in [\epsilon, 1/\epsilon]^m\}$$
is the parameter space of the EHMM. Assume that $a_{ij}$, $\lambda_i$ and $\varphi_i$ are continuous functions from $\mathbb{R}^{m^2+m}$ to $\mathbb{R}$ with $a_{ij}(\phi) = a_{ij}$, $\lambda_i(\phi) = \lambda_i$ and $\varphi_i(\phi) = \varphi_i$, for all $i, j \in S_X$.

Let $L_T^{(m)}(\phi)$ be the incomplete likelihood function, that is, the probability density function of $y_1, y_2, \ldots, y_T$:
$$L_T^{(m)}(\phi) = f_\phi(y) = f_\phi(y_1, \ldots, y_T) = \sum_{i_1=1}^{m} \cdots \sum_{i_T=1}^{m} \varphi_{i_1} \gamma_{y_1 i_1} \prod_{t=2}^{T} a_{i_{t-1} i_t} \gamma_{y_t i_t}.$$
Let $L_T^{C(m)}(\phi)$ be the complete likelihood function, where
$$L_T^{C(m)}(\phi) = f_\phi(x, y) = f_\phi(i_1, y_1, \ldots, i_T, y_T) = \varphi_{i_1} \gamma_{y_1 i_1} \prod_{t=2}^{T} a_{i_{t-1} i_t} \gamma_{y_t i_t}.$$
We obtain
$$L_T^{(m)}(\phi) = \sum_{x} L_T^{C(m)}(\phi).$$

Calculating $L_T^{(m)}(\phi)$ directly when $m$ and $T$ are large is a complex process. Because of that, the forward-backward algorithm is used to calculate $L_T^{(m)}(\phi)$. The forward-backward algorithm consists of two parts, the forward algorithm and the backward algorithm. Define the forward variable $\alpha_t(i, \phi, T)$ for $t = 1, 2, \ldots, T$ and $i \in S_X$ as
$$\alpha_t(i, \phi, T) = P_\phi(Y_1 = y_1, Y_2 = y_2, \ldots, Y_t = y_t, X_t = i).$$
Define the backward variable $\beta_t(i, \phi, T)$ for $t = T, T-1, \ldots, 1$ and $i \in S_X$ as
$$\beta_t(i, \phi, T) = P_\phi(Y_{t+1} = y_{t+1}, Y_{t+2} = y_{t+2}, \ldots, Y_T = y_T \mid X_t = i).$$
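The paper does not spell out the recursions for these variables, but they are the standard HMM ones: $\alpha_1(i) = \varphi_i \gamma_{y_1 i}$, $\alpha_{t+1}(j) = \big(\sum_i \alpha_t(i)\, a_{ij}\big) \gamma_{y_{t+1} j}$, $\beta_T(i) = 1$ and $\beta_t(i) = \sum_j a_{ij}\, \gamma_{y_{t+1} j}\, \beta_{t+1}(j)$. A minimal, unscaled Python sketch follows (the function names are my own; a practical implementation would scale or work in the log domain to avoid underflow):

```python
import numpy as np

def exp_density(y_t, lam):
    """Emission densities gamma_{y_t, i} = lam_i * exp(-lam_i * y_t) for all states i."""
    return lam * np.exp(-lam * y_t)

def forward_backward(y, A, phi, lam):
    """Return alpha[t, i] = P(y_1..y_t, X_t = i) and beta[t, i] = P(y_{t+1}..y_T | X_t = i)."""
    T, m = len(y), len(phi)
    alpha = np.zeros((T, m))
    beta = np.ones((T, m))
    alpha[0] = phi * exp_density(y[0], lam)
    for t in range(1, T):                      # forward recursion
        alpha[t] = (alpha[t - 1] @ A) * exp_density(y[t], lam)
    for t in range(T - 2, -1, -1):             # backward recursion
        beta[t] = A @ (exp_density(y[t + 1], lam) * beta[t + 1])
    return alpha, beta

# Theorem 1: for any t, the likelihood is sum_i alpha_t(i) * beta_t(i), e.g.
# alpha, beta = forward_backward(claims, A, phi, lam)
# L = (alpha[-1] * beta[-1]).sum()
```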

Theorem 1 (see [5]). For any $t = 1, 2, \ldots, T$,
$$L_T^{(m)}(\phi) = \sum_{i=1}^{m} \alpha_t(i, \phi, T)\, \beta_t(i, \phi, T).$$

Now the problem is to find $\phi^* \in \arg\max_{\phi \in \Phi_m} \ln(L_T^{(m)}(\phi))$. For this, the expectation maximization (EM) algorithm is used. The EM algorithm is an iterative algorithm which consists of two steps in every iteration, the E step and the M step. The EM algorithm is as follows.

1. For $k = 0$, the initial value $\phi^{(k)}$ is given.

2. (E-step): Calculate
$$Q(\phi, \phi^{(k)}) = E_{\phi^{(k)}}\left(\ln(L_T^{C(m)}(\phi)) \mid y\right).$$
Theorem 2 is used for calculating $Q(\phi, \phi^{(k)})$.

Theorem 2 (see [2]).
$$
Q(\phi, \phi^{(k)}) = E_{\phi^{(k)}}\left(\ln(L_T^{C(m)}(\phi)) \mid y\right)
= \sum_{i=1}^{m} \frac{\alpha_1(i, \phi^{(k)}, T)\, \beta_1(i, \phi^{(k)}, T)}{\sum_{l=1}^{m} \alpha_1(l, \phi^{(k)}, T)\, \beta_1(l, \phi^{(k)}, T)} \ln(\varphi_i)
+ \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\sum_{t=1}^{T-1} \alpha_t(i, \phi^{(k)}, T)\, a^{(k)}_{ij}\, \gamma^{(k)}_{y_{t+1}, j}\, \beta_{t+1}(j, \phi^{(k)}, T)}{\sum_{l=1}^{m} \alpha_t(l, \phi^{(k)}, T)\, \beta_t(l, \phi^{(k)}, T)} \ln(a_{ij})
+ \sum_{i=1}^{m} \frac{\sum_{t=1}^{T} \alpha_t(i, \phi^{(k)}, T)\, \beta_t(i, \phi^{(k)}, T)}{\sum_{l=1}^{m} \alpha_t(l, \phi^{(k)}, T)\, \beta_t(l, \phi^{(k)}, T)} \ln(\gamma_{y_t i}),
$$
where, by Theorem 1, each denominator $\sum_{l=1}^{m} \alpha_t(l, \phi^{(k)}, T)\, \beta_t(l, \phi^{(k)}, T)$ equals $L_T^{(m)}(\phi^{(k)})$ for every $t$.

Theorem 3 (see [6]). Suppose that $\Phi_m = \{\phi = (m, A, \lambda) : A \in [0,1]^{m^2}, \; \lambda \in [\epsilon, 1/\epsilon]^m\}$, where $\epsilon > 0$ is very small (near zero), and that the continuity assumption on the EHMM parameters is fulfilled. Then

(a) $\Phi_m$ is a compact set in $\mathbb{R}^{m^2+m}$;

(b) $\ln(L_T^{(m)}(\phi))$ is a continuous function on $\Phi_m$ and is differentiable in the interior of $\Phi_m$;

(c) $Q(\hat{\phi}, \phi)$ is a continuous function on $\Phi_m \times \Phi_m$.

3. (M-step): Find $\phi^{(k+1)}$ which maximizes $Q(\phi, \phi^{(k)})$, that is,
$$Q(\phi^{(k+1)}, \phi^{(k)}) \geq Q(\phi, \phi^{(k)}), \qquad \forall \phi \in \Phi_m.$$
The existence of $\phi^{(k+1)}$ is guaranteed by Theorem 3. Furthermore, Theorem 4 gives the parameter $\phi^{(k+1)}$ explicitly.

Theorem 4 (see [2]). Suppose that $\phi^{(k)} \in \Phi_m$ with $\phi^{(k)} = (m, A^{(k)}, \lambda^{(k)})$. Then the parameter $\phi^{(k+1)} = (m, A^{(k+1)}, \lambda^{(k+1)})$ which maximizes $Q(\phi, \phi^{(k)})$ on $\Phi_m$ can be written as
$$a^{(k+1)}_{ij} = \frac{\sum_{t=1}^{T-1} \alpha_t(i, \phi^{(k)}, T)\, a^{(k)}_{ij}\, \gamma^{(k)}_{y_{t+1}, j}\, \beta_{t+1}(j, \phi^{(k)}, T)}{\sum_{t=1}^{T-1} \alpha_t(i, \phi^{(k)}, T)\, \beta_t(i, \phi^{(k)}, T)}$$
and
$$\lambda^{(k+1)}_i = \left(\frac{\sum_{t=1}^{T} \alpha_t(i, \phi^{(k)}, T)\, \beta_t(i, \phi^{(k)}, T)\, y_t}{\sum_{t=1}^{T} \alpha_t(i, \phi^{(k)}, T)\, \beta_t(i, \phi^{(k)}, T)}\right)^{-1}.$$

4. Replace $k$ by $k+1$; if $\left|\ln(L_T^{(m)}(\phi^{(k+1)})) - \ln(L_T^{(m)}(\phi^{(k)}))\right| > \epsilon$, go back to steps 2 to 4. The iteration is stopped when $\left|\ln(L_T^{(m)}(\phi^{(k+1)})) - \ln(L_T^{(m)}(\phi^{(k)}))\right| < \epsilon$; in other words, $\{\ln(L_T^{(m)}(\phi^{(k)}))\}$ converges to $\ln(L_T^{(m)}(\phi^*))$. Its existence is guaranteed by Theorem 5.

Theorem 5 (see [6]). Suppose that $Q(\hat{\phi}, \phi)$ is a continuous function on $\Phi_m \times \Phi_m$ and $\{\phi^{(k)}\}$ is the sequence of EHMM estimators obtained from the EM algorithm. Then

(a) $\{\ln(L_T^{(m)}(\phi^{(k)}))\}$ is a monotonically increasing sequence;

(b) there exists $\phi^* \in \Phi_m$ such that $\lim_{k \to \infty} \ln(L_T^{(m)}(\phi^{(k)})) = \ln(L_T^{(m)}(\phi^*))$;

(c) if $\lim_{k \to \infty} \phi^{(k)} = \phi^*$, then $\phi^*$ is a stationary point of $L_T^{(m)}(\phi)$.
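A minimal sketch of one EM iteration, combining Theorems 1, 2 and 4 and reusing the hypothetical exp_density and forward_backward helpers above. It is an unscaled illustration of the update formulas, not the authors' code; in practice scaled or log-domain quantities would be used, and the paper (Section 4) takes the initial distribution as the stationary distribution of A rather than the usual Baum-Welch update used here.

```python
import numpy as np

def em_step(y, A, phi, lam):
    """One EM iteration for the exponential HMM; returns updated (A, phi, lam, log-likelihood)."""
    T, m = len(y), len(phi)
    alpha, beta = forward_backward(y, A, phi, lam)
    L = (alpha[-1] * beta[-1]).sum()                 # Theorem 1: likelihood of y

    # Posterior state probabilities P(X_t = i | y).
    post = alpha * beta / L

    # Posterior transition terms xi[t, i, j] = alpha_t(i) a_ij gamma_{y_{t+1}, j} beta_{t+1}(j) / L.
    emis_next = lam * np.exp(-np.outer(y[1:], lam))  # gamma_{y_{t+1}, j} for t = 1..T-1
    xi = alpha[:-1, :, None] * A[None, :, :] * emis_next[:, None, :] * beta[1:, None, :] / L

    # Theorem 4: M-step updates.
    A_new = xi.sum(axis=0) / post[:-1].sum(axis=0)[:, None]
    lam_new = post.sum(axis=0) / (post * y[:, None]).sum(axis=0)
    phi_new = post[0]    # the paper instead uses the stationary distribution of A_new
    return A_new, phi_new, lam_new, np.log(L)
```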

Suppose that $\hat{\phi} = \phi^{(k+1)}$ is obtained in the last iteration of the EM algorithm; then $\hat{\phi}$ is the EHMM parameter estimate, where $\hat{\phi} = (\hat{m}, \hat{A}, \hat{\lambda})$. The Bayesian information criterion (BIC) is used for choosing the number of states $\hat{m}$ (see [4]). The selected $\hat{m}$ is the one which maximizes $\ln(L_T^{(m)}(\hat{\phi})) - (\ln T)\, d_m / 2$, where $\ln(L_T^{(m)}(\hat{\phi}))$ is the value of the log-likelihood function and $d_m$ is the number of estimated parameters, so
$$\mathrm{BIC} = \ln L_T^{(m)}(\hat{\phi}) - \frac{(\ln T)\, d_m}{2}.$$
The number of estimated parameters in the EHMM is $m^2 + m$, so the BIC formula becomes
$$\mathrm{BIC} = \ln L_T^{(m)}(\hat{\phi}) - \frac{(\ln T)(m^2 + m)}{2}.$$

Furthermore, to validate the model, a set of estimated data $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_T$ is generated based on $\hat{\phi}$, where
$$\hat{y}_t = E_{\hat{\phi}}\left(Y_t \mid Y_1^{t-1} = y_1^{t-1}\right) = \frac{1}{L_{t-1}^{(m)}(\hat{\phi})} \int_0^{\infty} y_t\, L_t^{(m)}(\hat{\phi})\, dy_t,$$
and the error is calculated by the mean absolute percentage error (MAPE), where
$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \frac{|Y_t - \hat{Y}_t|}{Y_t} \times 100\%.$$
The interpretation of MAPE is given in Table 1.

Table 1: Interpretation of MAPE [3]

  MAPE (%)   Interpretation
  <= 10      Highly accurate forecasting
  10 - 20    Good forecasting
  20 - 50    Reasonable forecasting
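The two criteria above are straightforward to compute. A small sketch, assuming `log_likelihood` is the converged log-likelihood returned by the EM routine and `y_hat` holds one-step-ahead estimates already generated from the fitted model:

```python
import numpy as np

def bic(log_likelihood, m, T):
    """BIC = ln L - (ln T) * d_m / 2 with d_m = m^2 + m estimated parameters."""
    d_m = m ** 2 + m
    return log_likelihood - np.log(T) * d_m / 2.0

def mape(y, y_hat):
    """Mean absolute percentage error (in %) between observed and estimated claims."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat) / y) * 100.0
```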

4 Programming Algorithm of Exponential Hidden Markov Model

Suppose that $y = \{y_1, y_2, \ldots, y_T\}$ is the observation set and the EHMM parameter $\phi = (m, A, \lambda)$ is to be estimated. The algorithm used is as follows (a sketch of the main loop is given after the list).

1. i) Input the number of data $T$ and the data $y_1, y_2, \ldots, y_T$.
   ii) Set $m = 2$, $k = 0$, $l = 1$.

2. i) For $k = 0$, choose the initial value $\phi^{(k)} = (m, A^{(k)}, \lambda^{(k)})$, where $A^{(k)} = [a^{(k)}_{ij}]_{m \times m}$ and $\lambda^{(k)} = [\lambda^{(k)}_i]_{m \times 1}$ are generated randomly and satisfy $\sum_{j=1}^{m} a^{(k)}_{ij} = 1$ and $\lambda^{(k)}_i \geq 0$ for $i = 1, 2, \ldots, m$.
   ii) Generate the value $\varphi^{(k)}$ which satisfies $(\varphi^{(k)})' = (\varphi^{(k)})' A^{(k)}$.
   iii) Generate $\gamma^{(k)}_{y_l i_l}$ from $Y_l \mid X_l = i_l$.

3. Determine $\epsilon = 10^{-1}$ as the bound of the likelihood iteration, that is, $\left|\ln(L_T^{(m)}(\phi^{(k+1)})) - \ln(L_T^{(m)}(\phi^{(k)}))\right| < \epsilon$.

4. Calculate the forward and backward functions.

5. (E-step) Calculate $L_l(\phi^{(k)})$:
$$L_l(\phi^{(k)}) = \sum_{i=1}^{m} \alpha_l(i, \phi^{(k)}, l)\, \beta_l(i, \phi^{(k)}, l).$$

6. (M-step) Calculate the parameter estimate $\phi^{(k+1)} = (m, A^{(k+1)}, \lambda^{(k+1)})$, where $a^{(k+1)}_{ij}$ and $\lambda^{(k+1)}_i$ are given by the update formulas of Theorem 4.

7. i) If $\left|\ln(L_T^{(m)}(\phi^{(k+1)})) - \ln(L_T^{(m)}(\phi^{(k)}))\right| > \epsilon$, go back to steps 4 to 6 and replace $k$ by $k+1$; otherwise go to the next step, replace $l$ by $l+1$ and return to step 2, where $l = 1, 2, \ldots, T-1$.
   ii) Generate the set of estimated data $\hat{Y}_l$.
   iii) Verify that $|\hat{Y}_l - Y_l| / Y_l < 7\%$; otherwise replace $l$ by $l+1$, where $l = 1, 2, \ldots, T-1$, and return to step 1.
   iv) Calculate the BIC value.
   v) Calculate the error of the estimated data by MAPE. If MAPE > 30%, return to step 2.

8. i) To determine the optimum $m$, do steps 1 to 8 for every number of states $m = 2, 3, \ldots, 6$ and with different $T$.
   ii) Choose the $m$ which maximizes the BIC value.
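A compact sketch of the driver loop described above, built on the hypothetical forward_backward, em_step and bic helpers from the earlier sketches. Initialisation, the stopping rule and the grid of candidate state counts follow the listed steps; the validation sub-steps 7 ii)-v) are omitted, and the random initialisation scheme is an assumption.

```python
import numpy as np

def fit_ehmm(y, m, tol=1e-1, max_iter=500, seed=None):
    """Fit an m-state exponential HMM to the claim series y with the EM loop above."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    A = rng.dirichlet(np.ones(m), size=m)                 # random row-stochastic A^(0)
    lam = 1.0 / rng.uniform(y.min(), y.max(), size=m)     # random initial rates (y > 0 assumed)
    phi = np.full(m, 1.0 / m)
    loglik_old = -np.inf
    for _ in range(max_iter):
        A, phi, lam, loglik = em_step(y, A, phi, lam)
        if abs(loglik - loglik_old) < tol:                # steps 3 and 7 i): stopping rule
            break
        loglik_old = loglik
    return A, phi, lam, loglik

# Step 8: fit m = 2,...,6 states and keep the one with the largest BIC.
# fits = {m: fit_ehmm(claims, m, seed=0) for m in range(2, 7)}
# best_m = max(fits, key=lambda m: bic(fits[m][3], m, len(claims)))
```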

Figure 2: Vehicle insurance claims for each day in 2014.

5 Application of EHMM on Vehicle Insurance Claims

This research uses vehicle insurance claim data from an Indonesian insurance company in 2014. The data can be seen in Figure 2. The claim data are called the observation data. Suppose that $Y_t$ is the observation data, $X_t$ is the unobserved data, and $t$ is the $t$-th day, where $t = 1, 2, \ldots, T$ and $T = 298$ days. Parameter estimation of the model is carried out for 2 states up to 6 states. The last 10 data points are held out for testing the estimated (prediction) data. Based on the programming results, the log-likelihood value and the BIC value obtained for every number of states with 288 days of observation data can be seen in Table 2.

Table 2: Log-likelihood value and BIC value for every number of states ($m = 2, \ldots, 6$) with 288 days of observation data.

Based on Table 2, as the number of states increases, the BIC value decreases.

Afterwards, different observation intervals are considered, starting from a week, a month, and 3 months, where a week is assumed to be 6 days and a month 26 days. Based on the programming results, the log-likelihood value and the BIC value obtained for every observation interval and number of states can be seen in Table 3 and Table 4.

Table 3: Log-likelihood value for every number of states ($m = 2, \ldots, 6$) and observation interval (1 week, 1 month, 3 months).

Table 4: BIC value for every number of states ($m = 2, \ldots, 6$) and observation interval (1 week, 1 month, 3 months).

The best model is chosen by the highest BIC value. Based on Table 2, the EHMM with $m = 2$ states has the highest BIC value, so the optimum number of states is $m = 2$. Based on Table 3 and Table 4, as the observation interval increases, the log-likelihood value and the BIC value decrease. So the optimum choice is $m = 2$ states with a 1-week observation interval. The parameter estimates obtained from the 1-week observation interval with $m = 2$ states are the transition probability matrix $\hat{A}$, the initial state probability vector $\hat{\varphi}$, and the exponential parameter vector $\hat{\lambda}$, where the 1-week observation data are the 283rd until the 288th day of the claim data.

The estimated transition probability matrix has the form
$$\hat{A} = \begin{pmatrix} \hat{a}_{11} & \hat{a}_{12} \\ \hat{a}_{21} & \hat{a}_{22} \end{pmatrix}.$$
Based on matrix $\hat{A}$: if the previous claim is caused by state 1, then the next claim is again caused by state 1 with probability $\hat{a}_{11}$; if the previous claim is caused by state 2, then the next claim is caused by state 1 with probability $\hat{a}_{21}$; if the previous claim is caused by state 1, then the next claim is caused by state 2 with probability $\hat{a}_{12}$; and if the previous claim is caused by state 2, then the next claim is again caused by state 2 with probability $\hat{a}_{22}$.

The estimated initial state probability vector is $\hat{\varphi} = (\hat{\varphi}_1, \hat{\varphi}_2)'$, which shows that a claim caused by state 1 has probability $\hat{\varphi}_1$ and a claim caused by state 2 has probability $\hat{\varphi}_2$.

The estimated exponential parameter vector is $\hat{\lambda} = (\hat{\lambda}_1, \hat{\lambda}_2)'$, which shows that a claim caused by state 1 has rate $\hat{\lambda}_1$ million rupiah per day, while a claim caused by state 2 has rate $\hat{\lambda}_2$ million rupiah per day.

Furthermore, the set of estimated data is generated from the best obtained parameters. The estimated data and the observation data can be seen in Figure 3. The MAPE value obtained is 19.55%, which indicates that the estimation is a good forecast. The observation data and estimated data of Figure 3 are presented in Table 5.

Figure 3: Comparison of observation data and estimation data.

Table 5: Comparison of observation data and estimation data (in million rupiah) for every single day.

References

[1] A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, No. 1 (1977), 1-38.

[2] M. Firmansyah, B. Setiawaty, I.G.P. Purnaba, Parameter estimation of exponential hidden Markov model and convergence of its parameter estimator sequence, St. Petersburg Polytechnic University Journal (2017), submitted.

[3] Y. Hu, W.M. Ho, Travel time prediction for urban networks: the comparisons of simulation-based and time-series models, In: Proceedings of the 17th ITS World Congress, Busan, South Korea (2010).

[4] D. Lam, S. Li, D.C. Wunsch, Hidden Markov model with information criteria clustering and extreme learning machine regression for wind forecasting, Journal of Computer Science and Cybernetics, 30, No. 4 (2014).

[5] L.R. Rabiner, B.H. Juang, An introduction to hidden Markov models, IEEE ASSP Magazine (1986).

[6] C.F.J. Wu, On the convergence properties of the EM algorithm, The Annals of Statistics, 11, No. 1 (1983).
