MODELING OF VEHICLE INSURANCE CLAIM USING EXPONENTIAL HIDDEN MARKOV. Bogor Agricultural University Meranti Street, Bogor, 16680, INDONESIA
International Journal of Pure and Applied Mathematics
Volume 118, ISSN: (printed version); ISSN: (on-line version)
url: http://www.ijpam.eu
doi: 10.12732/ijpam.v118i2.15

MODELING OF VEHICLE INSURANCE CLAIM USING EXPONENTIAL HIDDEN MARKOV

Irfani Azis 1, Berlian Setiawaty 2, I Gusti Putu Purnaba 3
1,2,3 Department of Mathematics, Bogor Agricultural University, Meranti Street, Bogor, 16680, INDONESIA

Abstract: The purpose of this research is to model vehicle insurance claims using an exponential hidden Markov model (EHMM). The maximum likelihood method, the forward-backward algorithm, and the expectation maximization (EM) algorithm are used to determine the EHMM parameters. The best model is selected based on the Bayesian information criterion (BIC).

AMS Subject Classification: 60J22, 60J28
Key Words: claim, exponential, hidden Markov, vehicle insurance

1 Introduction

Insurance is an agreement in which the insurer binds itself to an insured party by receiving a premium in exchange for compensating loss, damage, or loss of expected profit that may arise from an uncertain event. Vehicle insurance is a type of insurance that protects a vehicle from loss. The cause of loss is unpredictable, so the amount an insurance company will have to pay out in claims is not known in advance.

Received: Revised: Published: March 17, 2018
© 2018 Academic Publications, Ltd. url: www.acadpubl.eu
Corresponding author
Figure 1: Histogram of the frequency of vehicle insurance claims of PT XX in 2014

The greater the loss, the greater the claim that must be paid. A loss can be caused by several factors, such as weather conditions, traffic conditions, driver condition, and other factors. Figure 1 shows that small losses occurred often while huge losses occurred rarely. The distribution of claims can be seen as an exponential distribution or a mixture of exponential distributions. If the cause of loss is assumed to form a Markov chain and is not observed directly, while only the claim data are observed, then the pair (cause of loss, claim) can be modeled by a hidden Markov model. The purpose of this research is to model vehicle insurance claims using an exponential hidden Markov model.

2 Exponential Hidden Markov Model

An exponential hidden Markov model (EHMM) is a model consisting of a pair of stochastic processes {X_t, Y_t}_{t∈N}: {X_t}_{t∈N} is unobserved and forms a Markov chain, while {Y_t}_{t∈N} is an observation process which depends on {X_t}_{t∈N}, and Y_t | X_t has an exponential distribution for all t.

Let {X_t}_{t∈N} be the Markov chain with state space S_X = {1, 2, ..., m} and transition probability matrix A = [a_{ij}]_{m×m}, where

a_{ij} = P(X_t = j | X_{t-1} = i) = P(X_2 = j | X_1 = i),  a_{ij} ≥ 0,  ∑_{j=1}^m a_{ij} = 1.

Let ϕ = [ϕ_i]_{m×1} be the initial state probability vector, where ϕ_i = P(X_1 = i) and ∑_{i=1}^m ϕ_i = 1. The distribution of Y_t | X_t = i, i ∈ S_X, t ∈ N, is exponential with parameter λ_i. The probability density function of Y_t | X_t = i is
γ_{y,i} = f(y) = λ_i e^{−λ_i y},  y > 0.

Suppose that λ = [λ_i]_{m×1}. So the EHMM can be identified by the parameter φ = (m, A, λ).

3 Parameter Estimation of Exponential Hidden Markov Model

The parameter φ = (m, A, λ) is estimated using the maximum likelihood method (see [1]). Suppose that T is the number of observations, y = (y_1, y_2, ..., y_T) is the observed data of {Y_t}_{t∈N}, x = (i_1, i_2, ..., i_T) is the unobserved data of {X_t}_{t∈N}, (x, y) = (i_1, y_1, i_2, y_2, ..., i_T, y_T) is the data of {X_t, Y_t}_{t∈N}, and

Φ_m = {φ = (m, A, λ) : A ∈ [0,1]^{m²}, λ ∈ [ε, 1/ε]^m}

is the parameter space of the EHMM. Assume that a_{ij}, λ_i, and ϕ_i are continuous functions of φ, with a_{ij}(φ) = a_{ij}, λ_i(φ) = λ_i, and ϕ_i(φ) = ϕ_i, for all i, j ∈ S_X.

Let L_T^{(m)}(φ) be the incomplete likelihood function, that is, the probability density function of y_1, y_2, ..., y_T:

L_T^{(m)}(φ) = f_φ(y) = f_φ(y_1, ..., y_T) = ∑_{i_1=1}^m ⋯ ∑_{i_T=1}^m ϕ_{i_1} γ_{y_1 i_1} ∏_{t=2}^T a_{i_{t−1} i_t} γ_{y_t i_t}.

Let L_T^{C(m)}(φ) be the complete likelihood function, where

L_T^{C(m)}(φ) = f_φ(x, y) = f_φ(i_1, y_1, ..., i_T, y_T) = ϕ_{i_1} γ_{y_1 i_1} ∏_{t=2}^T a_{i_{t−1} i_t} γ_{y_t i_t}.

We obtain L_T^{(m)}(φ) = ∑_x L_T^{C(m)}(φ).

Calculating L_T^{(m)}(φ) directly when m and T are big is a complex process. Because of that, the forward-backward algorithm is used to calculate L_T^{(m)}(φ). The forward-backward algorithm consists of two parts, the forward algorithm and the backward algorithm. Define the forward variable α_t(i, φ, T) for t = 1, 2, ..., T and i ∈ S_X as follows:

α_t(i, φ, T) = P_φ(Y_1 = y_1, Y_2 = y_2, ..., Y_t = y_t, X_t = i).

Define the backward variable β_t(i, φ, T) for t = T, T−1, ..., 1 and i ∈ S_X as follows:

β_t(i, φ, T) = P_φ(Y_{t+1} = y_{t+1}, Y_{t+2} = y_{t+2}, ..., Y_T = y_T | X_t = i).
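As a concrete illustration (not part of the original paper), the forward and backward recursions for the exponential emission density γ_{y,i} = λ_i e^{−λ_i y} can be sketched in Python with numpy; the function name and array layout are our own choices:

```python
import numpy as np

def forward_backward(y, A, phi, lam):
    """Forward-backward recursions for an exponential hidden Markov model.

    y   : observations y_1..y_T, shape (T,), each y_t > 0
    A   : transition matrix [a_ij], shape (m, m), rows summing to 1
    phi : initial state probabilities, shape (m,)
    lam : exponential rates lambda_i, shape (m,)
    Returns (alpha, beta), 0-based in t, matching the definitions above.
    """
    T, m = len(y), len(lam)
    # emission densities gamma[t, i] = lam_i * exp(-lam_i * y_t)
    gamma = lam * np.exp(-np.outer(y, lam))
    alpha = np.zeros((T, m))
    beta = np.ones((T, m))
    alpha[0] = phi * gamma[0]                       # alpha_1(i) = phi_i * gamma_{y_1, i}
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * gamma[t]    # forward recursion
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (gamma[t + 1] * beta[t + 1])  # backward recursion
    return alpha, beta
```

By Theorem 1 below, ∑_i alpha[t, i] * beta[t, i] equals the likelihood L_T^{(m)}(φ) for every t, which gives a handy consistency check. For long series the recursions should be rescaled at each step to avoid numerical underflow.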
Theorem 1 (see [5]). For any t = 1, 2, ..., T,

L_T^{(m)}(φ) = ∑_{i=1}^m α_t(i, φ, T) β_t(i, φ, T).

Now the problem is to find φ* ∈ argmax_φ ln(L_T^{(m)}(φ)). In this case, the expectation maximization (EM) algorithm is used. The EM algorithm is an iterative algorithm which consists of two steps in every iteration, the E-step and the M-step. The EM algorithm is as follows.

1. For k = 0, the initial value φ^(0) is given.

2. (E-step): Calculate Q(φ, φ^(k)) = E_{φ^(k)}(ln(L_T^{C(m)}(φ)) | y). Theorem 2 is used for calculating Q(φ, φ^(k)).

Theorem 2 (see [2]).

Q(φ, φ^(k)) = E_{φ^(k)}(ln(L_T^{C(m)}(φ)) | y)
= ∑_{i=1}^m [α_1(i, φ^(k), T) β_1(i, φ^(k), T) / ∑_{l=1}^m α_1(l, φ^(k), T) β_1(l, φ^(k), T)] ln(ϕ_i)
+ ∑_{i=1}^m ∑_{j=1}^m ∑_{t=1}^{T−1} [α_t(i, φ^(k), T) a_{ij}^(k) γ^(k)_{y_{t+1}, j} β_{t+1}(j, φ^(k), T) / ∑_{l=1}^m α_t(l, φ^(k), T) β_t(l, φ^(k), T)] ln(a_{ij})
+ ∑_{i=1}^m ∑_{t=1}^T [α_t(i, φ^(k), T) β_t(i, φ^(k), T) / ∑_{l=1}^m α_t(l, φ^(k), T) β_t(l, φ^(k), T)] ln(γ_{y_t i}).

Theorem 3 (see [6]). Suppose that Φ_m = {φ = (m, A, λ) : A ∈ [0,1]^{m²}, λ ∈ [ε, 1/ε]^m}, where ε > 0 is very small, and the EHMM parameter continuity assumption is fulfilled. Then:

(a) Φ_m is a compact set in R^{m²+m};
(b) ln(L_T^{(m)}(φ)) is a continuous function on Φ_m and differentiable in the interior of Φ_m;

(c) Q(φ̂, φ) is a continuous function on Φ_m × Φ_m.

3. (M-step): Find φ^(k+1) which maximizes Q(φ, φ^(k)), that is,

Q(φ^(k+1), φ^(k)) ≥ Q(φ, φ^(k)),  ∀φ ∈ Φ_m.

The existence of φ^(k+1) is guaranteed by Theorem 3. Furthermore, using Theorem 4 we obtain the parameter φ^(k+1).

Theorem 4 (see [2]). Suppose that φ^(k) ∈ Φ_m with φ^(k) = (m, A^(k), λ^(k)). Then the parameter φ^(k+1) = (m, A^(k+1), λ^(k+1)) which maximizes Q(φ, φ^(k)) on Φ_m can be written as

a_{ij}^(k+1) = [∑_{t=1}^{T−1} α_t(i, φ^(k), T) a_{ij}^(k) γ^(k)_{y_{t+1}, j} β_{t+1}(j, φ^(k), T)] / [∑_{t=1}^{T−1} α_t(i, φ^(k), T) β_t(i, φ^(k), T)]

and

λ_i^(k+1) = ( [∑_{t=1}^T α_t(i, φ^(k), T) β_t(i, φ^(k), T) y_t] / [∑_{t=1}^T α_t(i, φ^(k), T) β_t(i, φ^(k), T)] )^{−1}.

4. Replace k with k + 1. If |ln(L_T^{(m)}(φ^(k+1))) − ln(L_T^{(m)}(φ^(k)))| > ε, then go back to steps 2 to 4. The iteration is stopped when |ln(L_T^{(m)}(φ^(k+1))) − ln(L_T^{(m)}(φ^(k)))| < ε. In other words, {ln(L_T^{(m)}(φ^(k)))} converges to ln(L_T^{(m)}(φ*)). Its existence is guaranteed by Theorem 5.

Theorem 5 (see [6]). Suppose that Q(φ̂, φ) is a continuous function on Φ_m × Φ_m and {φ^(k)} is the sequence of EHMM estimators obtained from the EM algorithm. Then:

(a) {ln(L_T^{(m)}(φ^(k)))} is a monotonically increasing sequence;

(b) there exists φ* ∈ Φ_m such that lim_{k→∞} ln(L_T^{(m)}(φ^(k))) = ln(L_T^{(m)}(φ*));

(c) if lim_{k→∞} φ^(k) = φ*, then φ* is a stationary point of L_T^{(m)}(φ).

Suppose that φ̂ = φ^(k+1) is obtained in the last iteration of the EM algorithm; then φ̂ is the EHMM parameter estimate, where φ̂ = (m̂, Â, λ̂). The Bayesian information criterion (BIC) is used for finding the number of states m̂ (see [4]). The selection of m̂ is based
on the m̂ which maximizes ln(L_T^{(m)}(φ)) − (ln T) d_m / 2, where ln(L_T^{(m)}(φ)) is the value of the log-likelihood function and d_m is the number of estimated parameters, so

BIC = ln L_T^{(m)}(φ) − (ln T) d_m / 2.

The number of estimated parameters in the EHMM is m² + m, so the formula of BIC is as follows:

BIC = ln L_T^{(m)}(φ) − (ln T)(m² + m) / 2.

Furthermore, to validate the model, a set of estimated data ŷ_1, ŷ_2, ..., ŷ_T is generated based on φ̂, where

ŷ_t = E_φ̂(Y_t | Y_1^{t−1}) = (1 / L_{t−1}^{(m)}(φ̂)) ∫_0^∞ y_t L_t^{(m)}(φ̂) dy_t,

and the error is calculated by the mean absolute percentage error (MAPE), where

MAPE = (1/T) ∑_{t=1}^T |Y_t − Ŷ_t| / Y_t × 100%.

The interpretation of MAPE is as follows.

Table 1: Interpretation of MAPE [3]

MAPE (%) | Interpretation
≤ 10     | Highly accurate forecasting
10–20    | Good forecasting
20–50    | Reasonable forecasting

4 Programming Algorithm of Exponential Hidden Markov Model

Suppose that y = {y_1, y_2, ..., y_T} is the observation set and the EHMM parameter φ = (m, A, λ) is to be estimated. The algorithm which is used is as follows.

1. i) Input the number of data T and the data y_1, y_2, ..., y_T.
   ii) Set m = 2, k = 0, l = 1.
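The two criteria above, BIC for model selection and MAPE for validation, can be sketched directly (helper names are our own, assuming numpy):

```python
import numpy as np

def bic(loglik, m, T):
    """BIC = ln L_T^(m)(phi) - (ln T) * d_m / 2, with d_m = m^2 + m parameters."""
    return loglik - np.log(T) * (m ** 2 + m) / 2

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat) / y) * 100)
```

For example, mape([100, 200], [90, 220]) evaluates to 10.0; and for a fixed log-likelihood, BIC penalizes larger m, so models with more states must improve the fit enough to compensate.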
2. i) For k = 0, choose the initial value φ^(0) = (m, A^(0), λ^(0)), where A^(0) and λ^(0) are generated randomly, A^(0) = [a_{ij}^(0)]_{m×m}, λ^(0) = [λ_i^(0)]_{m×1}, fulfilling ∑_{j=1}^m a_{ij}^(0) = 1 and λ_i^(0) ≥ 0 for i = 1, 2, ..., m.
   ii) Generate the value ϕ^(0) fulfilling (ϕ^(0))ᵀ = (ϕ^(0))ᵀ A^(0).
   iii) Generate γ^(0)_{y_l i_l} from Y_l | X_l = i_l.

3. Determine ε = 10^(−1) as the bound of the likelihood iteration, that is, |ln(L_T^{(m)}(φ^(k+1))) − ln(L_T^{(m)}(φ^(k)))| < ε.

4. Calculate the forward and backward functions.

5. (E-step): Calculate L_l(φ^(k)):

L_l(φ^(k)) = ∑_{i=1}^m α_l(i, φ^(k), l) β_l(i, φ^(k), l).

6. (M-step): Calculate the parameter estimate φ^(k+1) = (m, A^(k+1), λ^(k+1)), where a_{ij}^(k+1) and λ_i^(k+1) are given by the formulas of Theorem 4.

7. i) If |ln(L_T^{(m)}(φ^(k+1))) − ln(L_T^{(m)}(φ^(k)))| > ε, go back to steps 4 to 6 and replace k by k + 1. Otherwise go to the next step: replace l by l + 1 and return to step 2, where l = 1, 2, ..., T − 1.
   ii) Generate the set of estimated data Ŷ_l.
   iii) Verify that |Ŷ_l − Y_l| / Y_l < 7%; otherwise replace l by l + 1, where l = 1, 2, ..., T − 1, and return to step 1.
   iv) Calculate the BIC value.
   v) Calculate the error of the estimated data by MAPE. If MAPE > 30%, then return to step 2.

8. i) To determine the optimum m, do steps 1 to 8 for every state m = 2, 3, ..., 6 with different T.
   ii) Choose the m which maximizes BIC.
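The core of the algorithm above (step 2.ii's stationary initial vector, the forward-backward pass, and the Theorem 4 M-step) can be sketched as one EM pass in Python. This is our own illustrative, unscaled implementation, suitable only for short series; function names are assumptions, not the paper's code:

```python
import numpy as np

def stationary_distribution(A):
    """Step 2.ii: solve phi^T = phi^T A with sum(phi) = 1, via least squares on
    the system (A^T - I) phi = 0 plus a normalization row."""
    m = len(A)
    M = np.vstack([A.T - np.eye(m), np.ones(m)])
    b = np.zeros(m + 1)
    b[-1] = 1.0
    phi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return phi

def em_step(y, A, lam):
    """Steps 4-6: forward-backward pass plus the Theorem 4 re-estimates.
    Returns updated (A, lam) and the log-likelihood of the current parameters."""
    T, m = len(y), len(lam)
    phi = stationary_distribution(A)
    gamma = lam * np.exp(-np.outer(y, lam))          # emission densities
    alpha = np.zeros((T, m))
    beta = np.ones((T, m))
    alpha[0] = phi * gamma[0]
    for t in range(1, T):                            # forward recursion
        alpha[t] = (alpha[t - 1] @ A) * gamma[t]
    for t in range(T - 2, -1, -1):                   # backward recursion
        beta[t] = A @ (gamma[t + 1] * beta[t + 1])
    L = (alpha * beta).sum(axis=1)                   # likelihood, constant in t
    w = alpha * beta / L[:, None]                    # w[t, i] = P(X_t = i | y)
    # xi[t, i, j] = P(X_t = i, X_{t+1} = j | y), the weight of ln(a_ij) in Q
    xi = alpha[:-1, :, None] * A * (gamma[1:] * beta[1:])[:, None, :] / L[0]
    A_new = xi.sum(axis=0) / w[:-1].sum(axis=0)[:, None]
    # lambda update: inverse of the posterior-weighted mean of y (Theorem 4)
    lam_new = w.sum(axis=0) / (w * y[:, None]).sum(axis=0)
    return A_new, lam_new, np.log(L[0])
```

Iterating em_step until the change in log-likelihood drops below ε reproduces the stopping rule of step 7.i.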
Figure 2: Vehicle insurance claims for each day in 2014

5 Application of EHMM on Vehicle Insurance Claim

This research uses data of vehicle insurance claims from an Indonesian insurance company in 2014. The data can be seen in Figure 2. The claim data are called the observation data. Suppose that Y_t is the observation data, X_t is the unobserved data, and t is the t-th day, where t = 1, 2, ..., T and T = 298 days. Parameter estimation of the model is exercised for 2 states up to 6 states. The last 10 data are used for testing the estimated (prediction) data. Based on the programming results, the log-likelihood value and BIC value obtained for every state with 288 days of observation data can be seen in Table 2.

Table 2: Log-likelihood value and BIC value for every state with 288 days of observation data

m-state | Log-likelihood value | BIC value
2-state |                      |
3-state |                      |
4-state |                      |
5-state |                      |
6-state |                      |

Based on Table 2, if the number of states increases, then the BIC value decreases. Afterwards, observation intervals are determined, starting from a
week, a month, and 3 months, where a week is assumed to be 6 days and a month 26 days. Based on the programming, the log-likelihood value and BIC value obtained for every observation interval and state can be seen in Table 3 and Table 4.

Table 3: Log-likelihood value for every state and observation interval

m-state | 1 week | 1 month | 3 months
2-state |        |         |
3-state |        |         |
4-state |        |         |
5-state |        |         |
6-state |        |         |

Table 4: BIC value for every state and observation interval

m-state | 1 week | 1 month | 3 months
2-state |        |         |
3-state |        |         |
4-state |        |         |
5-state |        |         |
6-state |        |         |

The best model is chosen by the highest BIC value. Based on Table 2, the EHMM with m = 2 states has the highest BIC value, so the optimum parameter estimate uses m = 2 states. Based on Table 3 and Table 4, if the observation interval increases, then the log-likelihood value and BIC value decrease. So the optimum number of states and observation interval are m = 2 states and a 1-week observation interval. The parameter estimates obtained from the 1-week observation interval with m = 2 states are the transition probability matrix A, the initial probability vector ϕ, and the exponential parameter vector λ, where the 1-week observation data are the 283rd until 288th day of the claim data.
The transition probability matrix A is obtained as

A = [a_11 a_12; a_21 a_22].

Based on matrix A, if the previous claim is caused by state 1, then the next claim is caused by state 1 with probability a_11 and by state 2 with probability a_12. If the previous claim is caused by state 2, then the next claim is caused by state 1 with probability a_21 and by state 2 with probability a_22.

The initial probability vector is obtained as ϕ = (ϕ_1, ϕ_2). ϕ shows that the claim is caused by state 1 with probability ϕ_1 and by state 2 with probability ϕ_2.

The exponential parameter estimate vector is λ = (λ_1, λ_2). λ shows that a claim caused by state 1 occurs with rate λ_1 million rupiahs per day, while a claim caused by state 2 occurs with rate λ_2 million rupiahs per day.

Furthermore, the set of estimated data is generated using the best obtained parameters. The estimated data and observation data can be seen in Figure 3. The obtained MAPE value is 19.55%, which indicates that the estimation is a good forecasting.
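Since the fitted numeric values of A, ϕ, and λ did not survive in this copy of the paper, generating claim data from an estimated 2-state EHMM can only be sketched with placeholder parameters (all numeric values below are hypothetical, not the paper's estimates):

```python
import numpy as np

def simulate_ehmm(A, phi, lam, T, seed=0):
    """Simulate T days of (hidden cause, claim amount) from an EHMM."""
    rng = np.random.default_rng(seed)
    m = len(lam)
    states = np.zeros(T, dtype=int)
    states[0] = rng.choice(m, p=phi)
    for t in range(1, T):
        states[t] = rng.choice(m, p=A[states[t - 1]])
    # claim | state i ~ Exponential(lam_i); numpy's scale argument is the mean 1/lam_i
    claims = rng.exponential(1.0 / lam[states])
    return states, claims

# hypothetical placeholder parameters, for illustration only
A = np.array([[0.9, 0.1], [0.3, 0.7]])
phi = np.array([0.75, 0.25])
lam = np.array([1.0, 0.2])
states, claims = simulate_ehmm(A, phi, lam, T=298)
```

A simulation like this is one way to check a fitted model informally: the histogram of simulated claims should resemble the mixture-of-exponentials shape of Figure 1.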
Figure 3: Comparison of observation data and estimation data

Table 5: Comparison of observation data and estimation data for every single day

Day | Observation data (Million rupiah) | Estimation data (Million rupiah)

References

[1] A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, No. 1 (1977), 1-38.

[2] M. Firmansyah, B. Setiawaty, I.G.P. Purnaba, Parameter estimation of exponential hidden Markov model and convergence of its parameter estimator sequence, St. Petersburg Polytechnic University Journal (2017), Submitted.

[3] Y. Hu, W.M. Ho, Travel time prediction for urban networks: The comparisons of simulation-based and time-series models, In: Proceedings of the 17th ITS World Congress, Busan, South Korea, 2010.

[4] D. Lam, S. Li, D.C. Wunsch, Hidden Markov model with information criteria clustering and extreme learning machine regression for wind forecasting, Journal of Computer Science and Cybernetics, 30, No. 4 (2014).

[5] L.R. Rabiner, B.H. Juang, An introduction to hidden Markov models, IEEE ASSP Magazine (1986).
[6] C.F.J. Wu, On the convergence properties of the EM algorithm, The Annals of Statistics, 11, No. 1 (1983).
More informationMaximum A Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains
Maximum A Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains Jean-Luc Gauvain 1 and Chin-Hui Lee Speech Research Department AT&T Bell Laboratories Murray Hill, NJ 07974
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Expectation Maximization Mark Schmidt University of British Columbia Winter 2018 Last Time: Learning with MAR Values We discussed learning with missing at random values in data:
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 23&27 January 2014 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationO 3 O 4 O 5. q 3. q 4. Transition
Hidden Markov Models Hidden Markov models (HMM) were developed in the early part of the 1970 s and at that time mostly applied in the area of computerized speech recognition. They are first described in
More informationSequential Supervised Learning
Sequential Supervised Learning Many Application Problems Require Sequential Learning Part-of of-speech Tagging Information Extraction from the Web Text-to to-speech Mapping Part-of of-speech Tagging Given
More informationorder is number of previous outputs
Markov Models Lecture : Markov and Hidden Markov Models PSfrag Use past replacements as state. Next output depends on previous output(s): y t = f[y t, y t,...] order is number of previous outputs y t y
More informationIntroduction to Hidden Markov Modeling (HMM) Daniel S. Terry Scott Blanchard and Harel Weinstein labs
Introduction to Hidden Markov Modeling (HMM) Daniel S. Terry Scott Blanchard and Harel Weinstein labs 1 HMM is useful for many, many problems. Speech Recognition and Translation Weather Modeling Sequence
More informationSOLVING A MINIMIZATION PROBLEM FOR A CLASS OF CONSTRAINED MAXIMUM EIGENVALUE FUNCTION
International Journal of Pure and Applied Mathematics Volume 91 No. 3 2014, 291-303 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu doi: http://dx.doi.org/10.12732/ijpam.v91i3.2
More informationKernel Logistic Regression and the Import Vector Machine
Kernel Logistic Regression and the Import Vector Machine Ji Zhu and Trevor Hastie Journal of Computational and Graphical Statistics, 2005 Presented by Mingtao Ding Duke University December 8, 2011 Mingtao
More informationVariable selection for model-based clustering
Variable selection for model-based clustering Matthieu Marbac (Ensai - Crest) Joint works with: M. Sedki (Univ. Paris-sud) and V. Vandewalle (Univ. Lille 2) The problem Objective: Estimation of a partition
More informationSeries 7, May 22, 2018 (EM Convergence)
Exercises Introduction to Machine Learning SS 2018 Series 7, May 22, 2018 (EM Convergence) Institute for Machine Learning Dept. of Computer Science, ETH Zürich Prof. Dr. Andreas Krause Web: https://las.inf.ethz.ch/teaching/introml-s18
More informationAN IMPROVED POISSON TO APPROXIMATE THE NEGATIVE BINOMIAL DISTRIBUTION
International Journal of Pure and Applied Mathematics Volume 9 No. 3 204, 369-373 ISSN: 3-8080 printed version); ISSN: 34-3395 on-line version) url: http://www.ijpam.eu doi: http://dx.doi.org/0.2732/ijpam.v9i3.9
More informationLecture 10. Announcement. Mixture Models II. Topics of This Lecture. This Lecture: Advanced Machine Learning. Recap: GMMs as Latent Variable Models
Advanced Machine Learning Lecture 10 Mixture Models II 30.11.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ Announcement Exercise sheet 2 online Sampling Rejection Sampling Importance
More informationExpectation Maximisation (EM) CS 486/686: Introduction to Artificial Intelligence University of Waterloo
Expectation Maximisation (EM) CS 486/686: Introduction to Artificial Intelligence University of Waterloo 1 Incomplete Data So far we have seen problems where - Values of all attributes are known - Learning
More informationMixture Models and EM
Mixture Models and EM Goal: Introduction to probabilistic mixture models and the expectationmaximization (EM) algorithm. Motivation: simultaneous fitting of multiple model instances unsupervised clustering
More informationChapter 08: Direct Maximum Likelihood/MAP Estimation and Incomplete Data Problems
LEARNING AND INFERENCE IN GRAPHICAL MODELS Chapter 08: Direct Maximum Likelihood/MAP Estimation and Incomplete Data Problems Dr. Martin Lauer University of Freiburg Machine Learning Lab Karlsruhe Institute
More informationHidden Markov Models Part 2: Algorithms
Hidden Markov Models Part 2: Algorithms CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Hidden Markov Model An HMM consists of:
More informationTopics in Probability Theory and Stochastic Processes Steven R. Dunbar. Notation and Problems of Hidden Markov Models
Steven R. Dunbar Department of Mathematics 203 Avery Hall University of Nebraska-Lincoln Lincoln, NE 68588-0130 http://www.math.unl.edu Voice: 402-472-3731 Fax: 402-472-8466 Topics in Probability Theory
More informationAnalysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates
Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a
More information10701/15781 Machine Learning, Spring 2007: Homework 2
070/578 Machine Learning, Spring 2007: Homework 2 Due: Wednesday, February 2, beginning of the class Instructions There are 4 questions on this assignment The second question involves coding Do not attach
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 18: Time Series Jan-Willem van de Meent (credit: Aggarwal Chapter 14.3) Time Series Data http://www.capitalhubs.com/2012/08/the-correlation-between-apple-product.html
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationExpectation-Maximization (EM) algorithm
I529: Machine Learning in Bioinformatics (Spring 2017) Expectation-Maximization (EM) algorithm Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2017 Contents Introduce
More informationMachine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall
Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume
More informationEM Algorithm. Expectation-maximization (EM) algorithm.
EM Algorithm Outline: Expectation-maximization (EM) algorithm. Examples. Reading: A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc.,
More informationLecture 6: Graphical Models: Learning
Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)
More informationGear Health Monitoring and Prognosis
Gear Health Monitoring and Prognosis Matej Gas perin, Pavle Bos koski, -Dani Juiric ic Department of Systems and Control Joz ef Stefan Institute Ljubljana, Slovenia matej.gasperin@ijs.si Abstract Many
More informationLogistic Regression. Machine Learning Fall 2018
Logistic Regression Machine Learning Fall 2018 1 Where are e? We have seen the folloing ideas Linear models Learning as loss minimization Bayesian learning criteria (MAP and MLE estimation) The Naïve Bayes
More informationConvergence Rate of Expectation-Maximization
Convergence Rate of Expectation-Maximiation Raunak Kumar University of British Columbia Mark Schmidt University of British Columbia Abstract raunakkumar17@outlookcom schmidtm@csubcca Expectation-maximiation
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationWe Prediction of Geological Characteristic Using Gaussian Mixture Model
We-07-06 Prediction of Geological Characteristic Using Gaussian Mixture Model L. Li* (BGP,CNPC), Z.H. Wan (BGP,CNPC), S.F. Zhan (BGP,CNPC), C.F. Tao (BGP,CNPC) & X.H. Ran (BGP,CNPC) SUMMARY The multi-attribute
More informationConditional Random Fields: An Introduction
University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science 2-24-2004 Conditional Random Fields: An Introduction Hanna M. Wallach University of Pennsylvania
More informationAccelerating the EM Algorithm for Mixture Density Estimation
Accelerating the EM Algorithm ICERM Workshop September 4, 2015 Slide 1/18 Accelerating the EM Algorithm for Mixture Density Estimation Homer Walker Mathematical Sciences Department Worcester Polytechnic
More information