Discrete Markov Process. Introduction. Example: Balls and Urns. Stochastic Automaton. INTRODUCTION TO Machine Learning 3rd Edition


ETHEM ALPAYDIN © The MIT Press, 2014
Lecture Slides for INTRODUCTION TO Machine Learning 3rd Edition
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
Slides from the textbook resource page, slightly edited and with additional examples by Roni Khardon for COMP135, Fall 2017, Tufts University.

CHAPTER 15: Hidden Markov Models

Introduction

Modeling dependencies in the input; the data are no longer iid.
Sequences:
- Temporal: in speech, phonemes in a word (dictionary), words in a sentence (syntax, semantics of the language); in handwriting, pen movements.
- Spatial: in a DNA sequence, base pairs.

Discrete Markov Process

N states: S_1, S_2, ..., S_N
State at time t: q_t = S_i
First-order Markov property:
  P(q_{t+1} = S_j | q_t = S_i, q_{t-1} = S_k, ...) = P(q_{t+1} = S_j | q_t = S_i)
Transition probabilities:
  a_ij ≡ P(q_{t+1} = S_j | q_t = S_i), with a_ij ≥ 0 and Σ_j a_ij = 1
Initial probabilities:
  π_i ≡ P(q_1 = S_i), with Σ_i π_i = 1

Stochastic Automaton

  P(O = Q | A, Π) = P(q_1) Π_{t=2..T} P(q_t | q_{t-1}) = π_{q_1} · a_{q_1 q_2} · ... · a_{q_{T-1} q_T}

Example: Balls and Urns

Three urns, each full of balls of one color: S_1: red, S_2: blue, S_3: green.

  Π = [0.5, 0.2, 0.3]

  A = | 0.4  0.3  0.3 |
      | 0.2  0.6  0.2 |
      | 0.1  0.1  0.8 |

For the observed sequence O = {S_1, S_1, S_3, S_3}:

  P(O | A, Π) = P(S_1) P(S_1 | S_1) P(S_3 | S_1) P(S_3 | S_3)
              = π_1 · a_11 · a_13 · a_33
              = 0.5 · 0.4 · 0.3 · 0.8 = 0.048
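The balls-and-urns probability can be checked numerically. A minimal sketch in Python (the function name sequence_prob is ours, not from the slides; state indices are 0-based):

```python
# Probability of an observed state sequence under a discrete Markov chain.
# Parameters are those of the balls-and-urns example (0-based indices).
pi = [0.5, 0.2, 0.3]                  # initial state probabilities
A = [[0.4, 0.3, 0.3],                 # A[i][j] = P(q_{t+1} = S_j | q_t = S_i)
     [0.2, 0.6, 0.2],
     [0.1, 0.1, 0.8]]

def sequence_prob(states, pi, A):
    """P(q_1, ..., q_T | A, Pi) = pi[q_1] * product of A[q_t][q_{t+1}]."""
    p = pi[states[0]]
    for s, s_next in zip(states, states[1:]):
        p *= A[s][s_next]
    return p

# O = {S1, S1, S3, S3} -> indices (0, 0, 2, 2)
print(sequence_prob([0, 0, 2, 2], pi, A))   # 0.5 * 0.4 * 0.3 * 0.8 = 0.048
```

This is the whole content of the "stochastic automaton" formula: one initial-probability factor followed by one transition factor per step.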

Balls and Urns: Learning

Given K example sequences of length T:

  π̂_i = #{sequences starting with S_i} / #{sequences}

  â_ij = #{transitions from S_i to S_j} / #{transitions from S_i}
       = Σ_k Σ_{t=1..T-1} 1(q_t^k = S_i and q_{t+1}^k = S_j) / Σ_k Σ_{t=1..T-1} 1(q_t^k = S_i)

The maximum likelihood estimate naturally separates into individual components.

Example. Given the four sequences

  O1 = {S_1, S_1, S_2, S_1}
  O2 = {S_1, S_3, S_3, S_3}
  O3 = {S_1, S_2, S_2, S_1}
  O4 = {S_1, S_2, S_2, S_3}

the estimates are:

  π̂_1 = 1, π̂_2 = 0, π̂_3 = 0
  â_{1,1} = 1/5, â_{1,2} = 3/5, â_{1,3} = 1/5
  â_{2,1} = 2/5, â_{2,2} = 2/5, â_{2,3} = 1/5
  â_{3,1} = 0/2, â_{3,2} = 0/2, â_{3,3} = 2/2

Learning the parameters is easy!

Hidden Markov Models

States are not observable.
Discrete observations {v_1, v_2, ..., v_M} are recorded; each is a probabilistic function of the state.
Emission probabilities:
  b_j(m) ≡ P(O_t = v_m | q_t = S_j)
Example: in each urn there are balls of different colors, but with different probabilities per urn.
NLP: states are parts of speech; observations are words.
For each observation sequence, there are multiple possible state sequences.

HMM Unfolded in Time
[figure: the HMM trellis unfolded over time]

Elements of an HMM

- N: number of states
- M: number of observation symbols
- A = [a_ij]: N × N state transition probability matrix
- B = [b_j(m)]: N × M observation probability matrix
- Π = [π_i]: N × 1 initial state probability vector

λ = (A, B, Π) is the parameter set of the HMM.

Three Basic Problems of HMMs

1. Evaluation: given λ and O, calculate P(O | λ).
2. State sequence: given λ and O, find Q* such that P(Q* | O, λ) = max_Q P(Q | O, λ).
3. Learning: given X = {O^k}, find λ* such that P(X | λ*) = max_λ P(X | λ).
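The counting estimates can be reproduced in a few lines. A sketch, assuming the four training sequences above (encoded with 0-based state indices); exact fractions are used so the output matches the slide:

```python
from collections import Counter
from fractions import Fraction

# The four training sequences of the learning example, 0-based:
# O1={S1,S1,S2,S1}, O2={S1,S3,S3,S3}, O3={S1,S2,S2,S1}, O4={S1,S2,S2,S3}
seqs = [[0, 0, 1, 0], [0, 2, 2, 2], [0, 1, 1, 0], [0, 1, 1, 2]]
N = 3

starts = Counter(s[0] for s in seqs)                       # first states
trans = Counter((a, b) for s in seqs for a, b in zip(s, s[1:]))
from_i = Counter(a for s in seqs for a in s[:-1])          # transitions out of each state

pi_hat = [Fraction(starts[i], len(seqs)) for i in range(N)]
a_hat = [[Fraction(trans[(i, j)], from_i[i]) for j in range(N)] for i in range(N)]

print(pi_hat)     # [1, 0, 0]
print(a_hat[0])   # [1/5, 3/5, 1/5]
print(a_hat[1])   # [2/5, 2/5, 1/5]
print(a_hat[2])   # [0, 0, 1]
```

Counting starts and transitions is exactly the maximum-likelihood estimate, because the likelihood factorizes into one multinomial per state.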

The running example HMM (used on the following slides) has the parameters learned above:

  π_1 = 1, π_2 = 0, π_3 = 0
  a_{1,1} = 1/5, a_{1,2} = 3/5, a_{1,3} = 1/5
  a_{2,1} = 2/5, a_{2,2} = 2/5, a_{2,3} = 1/5
  a_{3,1} = 0/2, a_{3,2} = 0/2, a_{3,3} = 2/2

and emission probabilities p(R, G, B) per state:

  b_1 = (0.8, 0.1, 0.1)
  b_2 = (0.1, 0.8, 0.1)
  b_3 = (0.1, 0.1, 0.8)

Problem 1: Evaluation

What is the probability of producing the observed sequence? This can be solved by forward computation or backward computation.

Forward variable:
  α_t(i) ≡ P(O_1 ... O_t, q_t = S_i | λ)
i.e., α_t(i) is the probability that we produce O_1 ... O_t and end up at q_t = S_i.

Initialization:
  α_1(i) = π_i b_i(O_1)
Recursion:
  α_{t+1}(j) = [Σ_i α_t(i) a_ij] b_j(O_{t+1})
Result:
  P(O | λ) = Σ_i α_T(i)

For the observation sequence O = R, R, G, B the forward trellis begins:

  t:        1     2      3        4
  α_t(1):  0.8   0.128  0.00448  ...
  α_t(2):  0     0.048  0.0768   ...
  α_t(3):  0     0.016  ...      ...

Sample computations:

  α_2(1) = (0.8·0.2 + 0·0.4 + 0·0) · 0.8 = 0.128
  α_3(1) = (0.128·0.2 + 0.048·0.4 + 0.016·0) · 0.1 = 0.00448
  α_3(2) = (0.128·0.6 + 0.048·0.4 + 0.016·0) · 0.8 = 0.0768
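The forward recursion above can be sketched directly in Python (observation symbols R, G, B are mapped to indices 0, 1, 2; parameters are those of the running example):

```python
# Forward algorithm for the running example HMM.
pi = [1.0, 0.0, 0.0]
A = [[0.2, 0.6, 0.2],      # a_ij as decimals (1/5 = 0.2, etc.)
     [0.4, 0.4, 0.2],
     [0.0, 0.0, 1.0]]
B = [[0.8, 0.1, 0.1],      # B[j] = (b_j(R), b_j(G), b_j(B))
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def forward(obs, pi, A, B):
    """Return the trellis alpha[t][i] = P(O_1..O_{t+1}, q_{t+1} = S_{i+1} | lambda)."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]        # initialization
    for o in obs[1:]:                                         # recursion
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

obs = [0, 0, 1, 2]                 # O = R, R, G, B
alpha = forward(obs, pi, A, B)
print(alpha[1])                    # ~ [0.128, 0.048, 0.016]
print(sum(alpha[-1]))              # P(O | lambda)
```

Each column of the trellis costs O(N^2), so evaluation is O(N^2 T) rather than the O(N^T) of summing over all state sequences.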

Backward variable:
  β_t(i) ≡ P(O_{t+1} ... O_T | q_t = S_i, λ)
i.e., β_t(i) is the probability of producing O_{t+1} ... O_T starting from q_t = S_i.

Initialization:
  β_T(i) = 1
Recursion:
  β_t(i) = Σ_j a_ij b_j(O_{t+1}) β_{t+1}(j)

Problem 2: Most Likely Path

What is the path of highest probability which produces the sequence?

Finding the State Sequence. Naive idea: choose the state that has the highest probability for each time step,
  q_t* = argmax_i γ_t(i)
where
  γ_t(i) ≡ P(q_t = S_i | O, λ) = α_t(i) β_t(i) / Σ_j α_t(j) β_t(j)
The numerator is p(O and q_t = S_i | λ); the denominator is p(O | λ).
The naive solution is incorrect: it picks each state independently, so the resulting sequence may contain transitions of probability zero.

Viterbi's Algorithm

  δ_t(i) ≡ max_{q_1 q_2 ... q_{t-1}} p(q_1 q_2 ... q_{t-1}, q_t = S_i, O_1 ... O_t | λ)
i.e., the probability of the maximum-probability path producing O_1 ... O_t and ending at q_t = S_i.

Initialization:
  δ_1(i) = π_i b_i(O_1), ψ_1(i) = 0
Recursion:
  δ_t(j) = [max_i δ_{t-1}(i) a_ij] b_j(O_t)
  ψ_t(j) = argmax_i δ_{t-1}(i) a_ij   (the parent of j in such a sequence)
Termination:
  p* = max_i δ_T(i), q_T* = argmax_i δ_T(i)
Path backtracking:
  q_t* = ψ_{t+1}(q_{t+1}*), t = T-1, T-2, ..., 1

For O = R, R, G, B the Viterbi trellis begins:

  t:        1     2      3        4
  δ_t(1):  0.8   0.128  0.00256  ...
  δ_t(2):  0     0.048  0.06144  ...
  δ_t(3):  0     0.016  ...      ...

Sample computations:

  δ_2(1) = max(0.8·0.2, 0·0.4, 0·0) · 0.8 = 0.128
  δ_3(1) = max(0.128·0.2, 0.048·0.4, 0.016·0) · 0.1 = 0.00256
  δ_3(2) = max(0.128·0.6, 0.048·0.4, 0.016·0) · 0.8 = 0.06144

Problem 3: Learning the Parameters of the HMM

The HMM λ is not known. We observe multiple sequences: RRGB, RBRBBG, GBGBR, RRRBGR, ...
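The Viterbi recursion, termination, and backtracking steps can be sketched as follows (same running-example parameters and R, G, B → 0, 1, 2 encoding as before):

```python
# Viterbi algorithm for the running example HMM.
pi = [1.0, 0.0, 0.0]
A = [[0.2, 0.6, 0.2],
     [0.4, 0.4, 0.2],
     [0.0, 0.0, 1.0]]
B = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def viterbi(obs, pi, A, B):
    """Return (p_star, path): probability and 0-based states of the best path."""
    N = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]   # delta_1(i)
    psi = []                                           # backpointers per step
    for o in obs[1:]:
        new_delta, back = [], []
        for j in range(N):
            scores = [delta[i] * A[i][j] for i in range(N)]
            best = max(range(N), key=lambda i: scores[i])
            back.append(best)                          # psi_t(j)
            new_delta.append(scores[best] * B[j][o])   # delta_t(j)
        psi.append(back)
        delta = new_delta
    path = [max(range(N), key=lambda j: delta[j])]     # q_T*
    for back in reversed(psi):                         # backtracking
        path.append(back[path[-1]])
    path.reverse()
    return max(delta), path

p_star, path = viterbi([0, 0, 1, 2], pi, A, B)
print(path)    # [0, 0, 1, 2], i.e. S1, S1, S2, S3
```

Note the structure is identical to the forward algorithm with the sum over predecessors replaced by a max, plus backpointers so the maximizing path can be recovered.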

Learning: Baum-Welch (EM)

Define the probability of producing the entire sequence O and going through the S_i to S_j transition at time t:

  ξ_t(i, j) ≡ P(q_t = S_i, q_{t+1} = S_j | O, λ)
            = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / Σ_k Σ_l α_t(k) a_kl b_l(O_{t+1}) β_{t+1}(l)

Baum-Welch algorithm (EM): if only we knew the visited states, then learning would be as easy as before. But we do not, so we introduce indicator variables

  z_i^t = 1 if q_t = S_i, 0 otherwise
  z_ij^t = 1 if q_t = S_i and q_{t+1} = S_j, 0 otherwise

E-step:
  E[z_i^t] = γ_t(i)
  E[z_ij^t] = ξ_t(i, j)

M-step (over K sequences):
  π̂_i = Σ_k γ_1^k(i) / K
  â_ij = Σ_k Σ_{t=1..T-1} ξ_t^k(i, j) / Σ_k Σ_{t=1..T-1} γ_t^k(i)
  b̂_j(m) = Σ_k Σ_t γ_t^k(j) 1(O_t^k = v_m) / Σ_k Σ_t γ_t^k(j)

HMM Recap

Algorithms for all three problems:
1. Evaluation: given λ and O, calculate P(O | λ).
2. State sequence: given λ and O, find Q* such that P(Q* | O, λ) = max_Q P(Q | O, λ).
3. Learning: given X = {O^k}, find λ* such that P(X | λ*) = max_λ P(X | λ).
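One full Baum-Welch iteration can be sketched for a single observation sequence (K = 1). This is a minimal sketch, not a production implementation: real implementations rescale or work in log space to avoid underflow, iterate to convergence, and guard against states with zero occupancy. Variable names follow the slides (alpha, beta, gamma, xi):

```python
# One Baum-Welch (EM) iteration for a single observation sequence.
def forward(obs, pi, A, B):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

def backward(obs, A, B):
    N = len(A)
    beta = [[1.0] * N]                       # beta_T(i) = 1
    for o in reversed(obs[1:]):              # o = O_{t+1} when filling beta_t
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta

def baum_welch_step(obs, pi, A, B):
    N, T, M = len(pi), len(obs), len(B[0])
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    pO = sum(alpha[-1])                      # P(O | lambda)
    # E-step: expected state and transition indicators.
    gamma = [[alpha[t][i] * beta[t][i] / pO for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / pO
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # M-step: re-estimate parameters from the expected counts.
    new_pi = gamma[0]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == m) /
              sum(gamma[t][j] for t in range(T))
              for m in range(M)] for j in range(N)]
    return new_pi, new_A, new_B

pi = [1.0, 0.0, 0.0]
A = [[0.2, 0.6, 0.2], [0.4, 0.4, 0.2], [0.0, 0.0, 1.0]]
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
new_pi, new_A, new_B = baum_welch_step([0, 0, 1, 2], pi, A, B)
```

Because Σ_j ξ_t(i, j) = γ_t(i) and Σ_m 1(O_t = v_m) = 1, the re-estimated rows of A and B and the vector π̂ are guaranteed to remain valid probability distributions.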