Hidden Markov Models. Adapted from Dr Catherine Sweeney-Reed's slides.



Summary: Introduction, Description, Central problems in HMM modelling, Extensions, Demonstration.

Description – Specification of an HMM: $N$ – number of states; $Q = \{q_1, q_2, \dots, q_T\}$ – set of states; $M$ – the number of symbols (observables); $O = \{o_1, o_2, \dots, o_T\}$ – set of symbols.

Description – Specification of an HMM: $A$ – the state transition probability matrix, $a_{ij} = P(q_{t+1} = j \mid q_t = i)$; $B$ – the observation probability distribution, $b_j(k) = P(o_t = k \mid q_t = j)$, $1 \le k \le M$; $\pi$ – the initial state distribution.

Description – Specification of an HMM: the full HMM is thus specified as a triple $\lambda = (A, B, \pi)$.
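As an illustration only (not part of the original slides), the triple $\lambda = (A, B, \pi)$ can be written down directly as NumPy arrays; the state/symbol counts and all probability values below are made-up numbers.

```python
# A minimal sketch (illustrative values): a 2-state, 3-symbol HMM as numpy arrays.
import numpy as np

N, M = 2, 3                      # N hidden states, M observable symbols
A = np.array([[0.7, 0.3],        # A[i, j] = P(q_{t+1} = j | q_t = i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],   # B[j, k] = b_j(k) = P(o_t = k | q_t = j)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # pi[i] = P(q_1 = i)

lam = (A, B, pi)                 # the triple lambda = (A, B, pi)
```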

Central problems in HMM modelling – Problem 1, Evaluation: the probability of occurrence of a particular observation sequence $O = \{o_1, \dots, o_k\}$, given the model: $P(O \mid \lambda)$. Complicated by the hidden states. Useful in sequence classification.

Central problems in HMM modelling – Problem 2, Decoding: the optimal state sequence to produce the given observations $O = \{o_1, \dots, o_k\}$, given the model, under an optimality criterion. Useful in recognition.

Central problems in HMM modelling – Problem 3, Learning: determine the optimum model, given a training set of observations; find $\lambda$ such that $P(O \mid \lambda)$ is maximal.

Problem 1: Naïve solution. State sequence $Q = (q_1, \dots, q_T)$. Assume independent observations: $P(O \mid q, \lambda) = \prod_{t=1}^{T} P(o_t \mid q_t, \lambda) = b_{q_1}(o_1)\, b_{q_2}(o_2) \cdots b_{q_T}(o_T)$. NB: observations are mutually independent, given the hidden states (the joint distribution of independent variables factorises into the marginal distributions of the independent variables).

Problem 1: Naïve solution. Observe that $P(q \mid \lambda) = \pi_{q_1}\, a_{q_1 q_2}\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}$, and that $P(O \mid \lambda) = \sum_{q} P(O \mid q, \lambda)\, P(q \mid \lambda)$.

Problem 1: Naïve solution. Finally we get $P(O \mid \lambda) = \sum_{q} P(O \mid q, \lambda)\, P(q \mid \lambda)$. NB: the sum is over all state paths; there are $N^T$ state paths, each costing $O(T)$ calculations, leading to $O(T N^T)$ time complexity.
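For illustration (not from the slides), the naive evaluation can be sketched directly by enumerating every state path; it assumes the `A`, `B`, `pi` arrays above and an observation sequence given as symbol indices, and is only feasible for tiny $N$ and $T$.

```python
# A minimal sketch of the naive O(T N^T) evaluation: enumerate all state paths,
# multiply path probability by the emission probabilities, and sum.
import itertools

def naive_evaluation(obs, A, B, pi):
    N, T = A.shape[0], len(obs)
    total = 0.0
    for path in itertools.product(range(N), repeat=T):   # all N**T state paths
        p = pi[path[0]] * B[path[0], obs[0]]              # pi_{q1} * b_{q1}(o1)
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total
```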

Problem 1: Efficient solution. Forward algorithm: define the auxiliary forward variable $\alpha$: $\alpha_t(i) = P(o_1, \dots, o_t, q_t = i \mid \lambda)$. $\alpha_t(i)$ is the probability of observing the partial sequence of observables $o_1, \dots, o_t$ such that at time $t$ the state is $q_t = i$.

Problem 1: Efficient solution. Recursive algorithm. Initialise: $\alpha_1(i) = \pi_i\, b_i(o_1)$. Calculate: $\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1})$ – the sum arises because state $j$ at time $t+1$ can be reached from any preceding state $i$: (partial observation sequence up to $t$ AND state $i$ at $t$) × (transition to $j$ at $t+1$) × (sensor probability). Obtain: $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$ – the sum of the different ways of getting the observation sequence; $\alpha_T(i)$ incorporates the whole partial observation sequence $o_1, \dots, o_T$. Complexity is $O(N^2 T)$.
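The forward recursion above can be sketched in a few lines of Python (illustrative only, without the scaling mentioned later), under the same array conventions as before.

```python
# A minimal sketch of the forward algorithm: alpha[t, i] = alpha_{t+1}(i) (0-indexed).
import numpy as np

def forward(obs, A, B, pi):
    N, T = A.shape[0], len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                  # alpha_1(i) = pi_i b_i(o_1)
    for t in range(T - 1):
        # alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
        alpha[t + 1] = (alpha[t] @ A) * B[:, obs[t + 1]]
    return alpha, alpha[-1].sum()                 # P(O | lambda) = sum_i alpha_T(i)
```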

Problem 1: Alternative solution. Backward algorithm: define the auxiliary backward variable $\beta$: $\beta_t(i) = P(o_{t+1}, o_{t+2}, \dots, o_T \mid q_t = i, \lambda)$. $\beta_t(i)$ is the probability of observing the sequence of observables $o_{t+1}, \dots, o_T$ given state $q_t = i$ at time $t$, and $\lambda$.

Problem 1: Alternative solution. Recursive algorithm. Initialise: $\beta_T(j) = 1$. Calculate: $\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)$, for $t = T-1, \dots, 1$. Terminate: $p(O \mid \lambda) = \sum_{i=1}^{N} \pi_i\, b_i(o_1)\, \beta_1(i)$. Complexity is $O(N^2 T)$.
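A matching illustrative sketch of the backward recursion, with the same conventions as the forward pass above.

```python
# A minimal sketch of the backward algorithm: beta[t, i] = beta_{t+1}(i) (0-indexed).
import numpy as np

def backward(obs, A, B, pi):
    N, T = A.shape[0], len(obs)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                          # beta_T(j) = 1
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta, (pi * B[:, obs[0]] * beta[0]).sum()        # P(O | lambda)
```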

Problem 2: Decoding. Choose the state sequence that maximises the probability of the observation sequence. The Viterbi algorithm is an inductive algorithm that keeps the best state sequence at each instant.

Problem 2: Decoding. Viterbi algorithm: find the state sequence that maximises $P(O, Q \mid \lambda)$. Define the auxiliary variable $\delta$: $\delta_t(i) = \max_{q_1, q_2, \dots, q_{t-1}} P(q_1, q_2, \dots, q_{t-1}, q_t = i, o_1, o_2, \dots, o_t \mid \lambda)$. $\delta_t(i)$ is the probability of the most probable path ending in state $q_t = i$.

Problem 2: Decoding. Recurrent property: $\delta_{t+1}(j) = \max_i\!\left(\delta_t(i)\, a_{ij}\right) b_j(o_{t+1})$. Algorithm: 1. Initialise: $\delta_1(i) = \pi_i\, b_i(o_1)$, $1 \le i \le N$; $\psi_1(i) = 0$. To get the state sequence we need to keep track of the argument that maximises this, for each $t$ and $j$; this is done via the array $\psi_t(j)$.

Problem 2: Decoding. 2. Recursion: $\delta_t(j) = \max_{1 \le i \le N}\!\left(\delta_{t-1}(i)\, a_{ij}\right) b_j(o_t)$ and $\psi_t(j) = \arg\max_{1 \le i \le N}\!\left(\delta_{t-1}(i)\, a_{ij}\right)$, for $2 \le t \le T$, $1 \le j \le N$. 3. Terminate: $P^* = \max_{1 \le i \le N} \delta_T(i)$ and $q_T^* = \arg\max_{1 \le i \le N} \delta_T(i)$. $P^*$ gives the state-optimised probability; $Q^*$ is the optimal state sequence ($Q^* = \{q_1^*, q_2^*, \dots, q_T^*\}$).

Problem 2: Decoding. 4. Backtrack the state sequence: $q_t^* = \psi_{t+1}(q_{t+1}^*)$, for $t = T-1, T-2, \dots, 1$. $O(N^2 T)$ time complexity.
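An illustrative Python sketch of the full Viterbi procedure (initialisation, recursion, termination and backtracking), using the same array conventions as the earlier sketches.

```python
# A minimal sketch of the Viterbi algorithm; returns the optimal state path and P*.
import numpy as np

def viterbi(obs, A, B, pi):
    N, T = A.shape[0], len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                          # delta_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        trans = delta[t - 1, :, None] * A                 # delta_{t-1}(i) * a_ij
        psi[t] = trans.argmax(axis=0)                     # psi_t(j)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]       # delta_t(j)
    p_star = delta[-1].max()                              # P*
    q = np.zeros(T, dtype=int)
    q[-1] = delta[-1].argmax()                            # q_T*
    for t in range(T - 2, -1, -1):
        q[t] = psi[t + 1, q[t + 1]]                       # q_t* = psi_{t+1}(q_{t+1}*)
    return q, p_star
```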

Problem 3: Learning. Train the HMM to encode an observation sequence such that the HMM should identify a similar observation sequence in future. Find $\lambda = (A, B, \pi)$ maximising $P(O \mid \lambda)$. General algorithm: 1. Initialise $\lambda_0$. 2. Compute the new model $\lambda$, using $\lambda_0$ and the observed sequence $O$. 3. Set $\lambda_0 \leftarrow \lambda$. 4. Repeat steps 2 and 3 until $\log P(O \mid \lambda) - \log P(O \mid \lambda_0) < d$.

Problem 3: Learning. Step 1 of the Baum-Welch algorithm: let $\xi_t(i,j)$ be the probability of being in state $i$ at time $t$ and in state $j$ at time $t+1$, given $\lambda$ and the observation sequence $O$: $\xi_t(i,j) = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)} = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}$.

Problem 3: Learning. [Figure: operations required for the computation of the joint event that the system is in state $S_i$ at time $t$ and in state $S_j$ at time $t+1$.]

Problem 3: Learning. Let $\gamma_t(i)$ be the probability of being in state $i$ at time $t$, given $O$: $\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i,j)$. Then $\sum_{t=1}^{T-1} \gamma_t(i)$ is the expected number of transitions from state $i$, and $\sum_{t=1}^{T-1} \xi_t(i,j)$ is the expected number of transitions from state $i$ to state $j$.

Problem 3: Learning. Step 2 of the Baum-Welch algorithm: $\hat{\pi}_i = \gamma_1(i)$, the expected frequency of state $i$ at time $t = 1$; $\hat{a}_{ij} = \dfrac{\sum_t \xi_t(i,j)}{\sum_t \gamma_t(i)}$, the ratio of the expected number of transitions from state $i$ to state $j$ over the expected number of transitions from state $i$; $\hat{b}_j(k) = \dfrac{\sum_{t:\, o_t = k} \gamma_t(j)}{\sum_t \gamma_t(j)}$, the ratio of the expected number of times in state $j$ observing symbol $k$ over the expected number of times in state $j$.
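Combining the $\xi$ and $\gamma$ definitions with the forward and backward sketches above gives one re-estimation step; the following is an illustrative single-sequence version without scaling (the `forward` and `backward` functions are the sketches defined earlier, not part of the slides).

```python
# A minimal sketch of one Baum-Welch (EM) re-estimation step for a single sequence.
import numpy as np

def baum_welch_step(obs, A, B, pi):
    N, M, T = A.shape[0], B.shape[1], len(obs)
    alpha, p_obs = forward(obs, A, B, pi)
    beta, _ = backward(obs, A, B, pi)

    # xi_t(i, j) for t = 1 .. T-1, normalised so each slice sums to 1
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t, :, None] * A * B[:, obs[t + 1]] * beta[t + 1]
        xi[t] /= xi[t].sum()
    gamma = alpha * beta / p_obs                      # gamma_t(i) for all t

    pi_new = gamma[0]                                 # pi_hat_i = gamma_1(i)
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]   # a_hat_ij
    B_new = np.zeros((N, M))
    for k in range(M):                                # b_hat_j(k)
        B_new[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return A_new, B_new, pi_new
```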

Problem 3: Learning. The Baum-Welch algorithm uses the forward and backward algorithms to calculate the auxiliary variables $\alpha$ and $\beta$. The B-W algorithm is a special case of the EM algorithm: E-step: calculation of $\xi$ and $\gamma$; M-step: iterative calculation of $\hat{\pi}_i$, $\hat{a}_{ij}$, $\hat{b}_j(k)$. Practical issues: can get stuck in local maxima; numerical problems, handled with logs and scaling.
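The general training loop from the learning slide can then be sketched as follows (illustrative only); `d` is the convergence threshold from the slide, and `max_iter` is an added safeguard not mentioned there.

```python
# A minimal sketch of the overall training loop, iterating baum_welch_step until
# the log-likelihood improvement falls below the threshold d.
import numpy as np

def train(obs, A, B, pi, d=1e-4, max_iter=100):
    old_ll = np.log(forward(obs, A, B, pi)[1])
    for _ in range(max_iter):
        A, B, pi = baum_welch_step(obs, A, B, pi)
        new_ll = np.log(forward(obs, A, B, pi)[1])
        if new_ll - old_ll < d:          # stop once log P(O | lambda) barely improves
            break
        old_ll = new_ll
    return A, B, pi
```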

Extensions – Problem-specific: left-to-right HMM (speech recognition); profile HMM (bioinformatics).

Extensions – General machine learning: factorial HMM; coupled HMM; hierarchical HMM; input-output HMM; switching state systems; hybrid HMM (HMM + NN). HMMs are a special case of graphical models: Bayesian nets, dynamical Bayesian nets.

Extensions – Examples: [Figures: coupled HMM; factorial HMM.]

Demonstrations – HMMs for sleep staging. Flexer, Sykacek, Rezek, and Dorffner (2000). Observation sequence: EEG data. Fit the model to the data according to 3 sleep stages to produce continuous probabilities: P(wake), P(deep), and P(REM). The hidden states correspond with recognised sleep stages, giving 3 continuous probability plots, with the probability of each stage at every second.

Demonstrations – HMMs for sleep staging. [Figure: manual scoring of sleep stages, staging by the HMM, and probability plots for the 3 stages.]

Demonstrations – Excel: demonstration of a working HMM implemented in Excel.

Further Reading:
L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, pp. 257-286, 1989.
R. Dugad and U. B. Desai, "A tutorial on hidden Markov models," Signal Processing and Artificial Neural Networks Laboratory, Dept. of Electrical Engineering, Indian Institute of Technology, Bombay, Technical Report No. SPANN-96.1, 1996.
W. H. Laverty, M. J. Miketinac, and I. W. Kelly, "Simulation of hidden Markov models with EXCEL," The Statistician, vol. 51, part 1, pp. 31-40, 2002.