CS 188: Artificial Intelligence
Advanced HMMs
Dan Klein, Pieter Abbeel -- University of California, Berkeley

Demo Bonanza!

Today
- HMMs: demo bonanza!
- Most likely explanation queries
- Speech recognition: a massive HMM! (details of this section not required)
- Start machine learning

Recap: Reasoning Over Time [demo: stationary]
- Markov models: a chain X_1 -> X_2 -> X_3 -> X_4 over states {rain, sun}, with transitions P(stay) = 0.7 and P(switch) = 0.3
- Hidden Markov models: hidden states X_1..X_5 emit observations E_1..E_5
  Emission model P(E | X):
    rain, umbrella:    0.9     rain, no umbrella: 0.1
    sun,  umbrella:    0.2     sun,  no umbrella: 0.8
Recap: Filtering
- Elapse time: compute P(X_t | e_{1:t-1})
- Observe: compute P(X_t | e_{1:t})
  (a code sketch of these two updates follows below)
- Example beliefs <P(rain), P(sun)>: the prior on X_1 is <0.5, 0.5>; after observing E_1, <0.82, 0.18>; after elapsing time, <0.63, 0.37>; after observing E_2, <0.88, 0.12>
[demo: exact filtering]

Recap: Particle Filtering
- Particles track samples of states rather than an explicit distribution
- Each step: elapse time, weight, resample
[figure: a population of grid-position particles such as (1,2), (3,1), (1,3), (2,2) is moved forward, weighted by the evidence (e.g. w=.9, .4, .2, .1), and resampled in proportion to weight into a new population such as (2,2), (1,3)]
[demo: particle filtering]

Particle Filtering: Robot Localization
- In robot localization, we know the map, but not the robot's position
- Observations may be vectors of range finder readings
- State space and readings are typically continuous (works basically like a very fine grid), so we cannot store B(X)
- Particle filtering is a main technique
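The filtering recap above can be made concrete in a few lines. Below is a minimal sketch using the transition and emission numbers from the rain/umbrella recap; the function names elapse_time and observe are illustrative, not from any library. Running it reproduces the belief sequence <0.82, 0.18>, <0.63, 0.37>, <0.88, 0.12>.

```python
# Exact HMM filtering on the rain/umbrella example (a sketch, not course code).
TRANS = {'rain': {'rain': 0.7, 'sun': 0.3},               # P(X_t | X_{t-1})
         'sun':  {'rain': 0.3, 'sun': 0.7}}
EMIT  = {'rain': {'umbrella': 0.9, 'no umbrella': 0.1},   # P(E_t | X_t)
         'sun':  {'umbrella': 0.2, 'no umbrella': 0.8}}

def elapse_time(belief):
    """P(X_t | e_{1:t-1}) = sum over x of P(X_t | x) * P(x | e_{1:t-1})."""
    return {x2: sum(belief[x1] * TRANS[x1][x2] for x1 in belief)
            for x2 in TRANS}

def observe(belief, evidence):
    """P(X_t | e_{1:t}) is proportional to P(e_t | X_t) * P(X_t | e_{1:t-1})."""
    unnorm = {x: belief[x] * EMIT[x][evidence] for x in belief}
    total = sum(unnorm.values())
    return {x: p / total for x, p in unnorm.items()}

belief = {'rain': 0.5, 'sun': 0.5}     # prior on X_1
belief = observe(belief, 'umbrella')   # -> about <0.82, 0.18>
belief = elapse_time(belief)           # -> about <0.63, 0.37>
belief = observe(belief, 'umbrella')   # -> about <0.88, 0.12>
```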
Robot Mapping
- SLAM: Simultaneous Localization And Mapping
- We do not know the map or our location
- State consists of position AND map!
- Main techniques: Kalman filtering (Gaussian HMMs) and particle methods
[figure: DP SLAM, Ron Parr]

Dynamic Bayes Nets (DBNs)
- We want to track multiple variables over time, using multiple sources of evidence
- Idea: repeat a fixed Bayes net structure at each time step
- Variables from time t can condition on those from t-1
- Example: at t = 1, 2, 3, hidden variables G_t^a, G_t^b with evidence E_t^a, E_t^b
- Dynamic Bayes nets are a generalization of HMMs

DBN Particle Filters
- A particle is a complete sample for a time step
- Initialize: generate prior samples for the t=1 Bayes net
  Example particle: G_1^a = (3,3), G_1^b = (5,3)
- Elapse time: sample a successor for each particle
  Example successor: G_2^a = (2,3), G_2^b = (6,3)
- Observe: weight each entire sample by the likelihood of the evidence conditioned on the sample
  Likelihood: P(E_1^a | G_1^a) * P(E_1^b | G_1^b)
- Resample: select prior samples (tuples of values) in proportion to their likelihood
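The elapse / weight / resample cycle above is the same for HMM and DBN particle filters. Here is a minimal one-step sketch; sample_successor and evidence_likelihood stand in for a model-specific transition sampler and observation model, so both names are assumptions for illustration.

```python
import random

def particle_filter_step(particles, evidence,
                         sample_successor, evidence_likelihood):
    # Elapse time: move each particle by sampling from the transition model.
    particles = [sample_successor(p) for p in particles]
    # Observe: weight each entire sample by the likelihood of the evidence.
    weights = [evidence_likelihood(evidence, p) for p in particles]
    # Resample: draw a new population in proportion to the weights.
    return random.choices(particles, weights=weights, k=len(particles))
```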
Most Likely Explanation

State Trellis
- A state trellis is a graph of states and transitions over time (here, sun/rain at each of four steps)
- Each arc represents some transition, and each arc has a weight
- Each path is a sequence of states
- The product of weights on a path is that sequence's probability along with the evidence
- The forward algorithm computes sums over paths; Viterbi computes best paths

HMMs: MLE Queries
- HMMs are defined by:
  - States X
  - Observations E
  - Initial distribution: P(X_1)
  - Transitions: P(X_t | X_{t-1})
  - Emissions: P(E_t | X_t)
- New query: most likely explanation: argmax over x_{1:t} of P(x_{1:t} | e_{1:t})
- New method: the Viterbi algorithm

Forward / Viterbi Algorithms
- Forward algorithm (sum):
    f_t[x_t] = P(x_t, e_{1:t}) = P(e_t | x_t) * sum over x_{t-1} of P(x_t | x_{t-1}) * f_{t-1}[x_{t-1}]
- Viterbi algorithm (max):
    m_t[x_t] = max over x_{1:t-1} of P(x_{1:t-1}, x_t, e_{1:t}) = P(e_t | x_t) * max over x_{t-1} of P(x_t | x_{t-1}) * m_{t-1}[x_{t-1}]
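A minimal Viterbi sketch over this trellis, reusing the TRANS and EMIT tables from the filtering sketch above: it is identical in shape to the forward algorithm, with max in place of sum, plus back-pointers to recover the best path.

```python
def viterbi(observations, prior, trans, emit):
    states = list(prior)
    # m[x]: probability of the best path ending in state x at the current step.
    m = {x: prior[x] * emit[x][observations[0]] for x in states}
    backpointers = []
    for e in observations[1:]:
        prev_m, ptr, m = m, {}, {}
        for x2 in states:
            best = max(states, key=lambda x1: prev_m[x1] * trans[x1][x2])
            m[x2] = prev_m[best] * trans[best][x2] * emit[x2][e]
            ptr[x2] = best
        backpointers.append(ptr)
    # Walk back-pointers from the best final state to read off the path.
    path = [max(m, key=m.get)]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# e.g. viterbi(['umbrella', 'umbrella', 'no umbrella'],
#              {'rain': 0.5, 'sun': 0.5}, TRANS, EMIT) -> most likely x_{1:3}
```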
Natural Language
- Speech technologies (e.g. Siri):
  - Automatic speech recognition (ASR)
  - Text to speech synthesis (TTS)
  - Dialog systems
- Language processing technologies:
  - Question answering
  - Machine translation
  - Web search
  - Text classification, spam filtering, etc.

Digitizing Speech

Speech Recognition

Speech in an Hour
- Speech input is an acoustic waveform
[figure: waveform of "s p ee ch l a b", with the "l" to "a" transition marked; Simon Arnfield, http://www.psyc.leeds.ac.uk/research/cogn/speech/tutorial/]
Spectral Analysis
- Frequency gives pitch; amplitude gives volume
- Sampling at ~8 kHz (phone), ~16 kHz (mic) (kHz = 1000 cycles/sec)
- Fourier transform of the wave is displayed as a spectrogram: darkness indicates energy at each frequency
[figure: spectrogram of "s p ee ch l a b"; human ear figure: depion.blogspot.com]

Why These Peaks?
- Articulator process:
  - Vocal cord vibrations create harmonics
  - The mouth is an amplifier
  - Depending on the shape of the mouth, some harmonics are amplified more than others

Part of [ae] from "lab"
- A complex wave repeating nine times
- Plus a smaller wave that repeats 4x for every large cycle
- Large wave: frequency of 250 Hz (9 times in .036 seconds)
- Small wave: roughly 4 times this, or roughly 1000 Hz

Resonances of the Vocal Tract
- The human vocal tract as an open tube: open at the lip end, closed at the glottal end, length ~17.5 cm
- Air in a tube of a given length will tend to vibrate at the resonance frequency of the tube
- Constraint: the pressure differential should be maximal at the (closed) glottal end and minimal at the (open) lip end
[figure: W. Barry, Speech Science slides]
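To make the spectral analysis above concrete, here is a minimal sketch of computing and displaying a spectrogram. The synthetic waveform (a 250 Hz fundamental plus a smaller 1000 Hz component, echoing the [ae] example) and the 16 kHz rate are illustrative assumptions, standing in for real recorded speech.

```python
import numpy as np
import matplotlib.pyplot as plt

rate = 16000                                  # ~16 kHz, mic-quality sampling
t = np.arange(0, 1.0, 1 / rate)
# Stand-in for speech: a large 250 Hz wave plus a smaller 1000 Hz wave.
wave = np.sin(2 * np.pi * 250 * t) + 0.3 * np.sin(2 * np.pi * 1000 * t)

# Short-time Fourier transform displayed as a spectrogram;
# intensity indicates energy at each frequency over time.
plt.specgram(wave, NFFT=512, Fs=rate)
plt.xlabel('time (s)')
plt.ylabel('frequency (Hz)')
plt.show()
```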
Spectrum Shapes [demo]
[figure: Mark Liberman]

Acoustic Feature Sequence
- Time slices are translated into acoustic feature vectors (~39 real numbers per slice): ... e_12 e_13 e_14 e_15 e_16 ...
- These are the observations E; now we need the hidden states X
[figure: vowel [i] sung at successively higher pitches: F#2 A2 C3 F#3 A3 C4 (middle C) A4; graphs: Ratree Wayland]

Speech State Space
- HMM specification:
  - P(E | X) encodes which acoustic vectors are appropriate for each phoneme (each kind of sound)
  - P(X | X') encodes how sounds can be strung together
- State space:
  - We will have one state for each sound in each word
  - Mostly, states advance sound by sound
  - Build a little state graph for each word and chain them together to form the state space X (a toy sketch follows below)
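A toy sketch of that per-word state graph: one left-to-right HMM state per sound, with a self-loop (stay in the current sound) and an advance transition. The phoneme lists and the 0.6 self-loop probability are illustrative assumptions, not a real pronunciation lexicon or trained model.

```python
# Hypothetical mini-lexicon mapping words to sound sequences.
LEXICON = {'speech': ['s', 'p', 'iy', 'ch'],
           'lab':    ['l', 'ae', 'b']}

def word_state_graph(word, p_stay=0.6):
    """Return {state: {next_state: prob}} for a left-to-right word HMM."""
    sounds = LEXICON[word]
    states = [(word, i, s) for i, s in enumerate(sounds)]
    graph = {}
    for i, state in enumerate(states):
        nxt = states[i + 1] if i + 1 < len(states) else ('end', word)
        # Mostly stay in the current sound; sometimes advance to the next.
        graph[state] = {state: p_stay, nxt: 1 - p_stay}
    return graph

# Chaining the graphs for all words together forms the full state space X.
state_space = {}
for w in LEXICON:
    state_space.update(word_state_graph(w))
```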
Transitions with a Bigram Model
- Training counts (a sketch of the resulting probabilities follows below):
    198015222   the first
    194623024   the same
    168504105   the following
    158562063   the world
     14112454   the door
    -----------------------------------
    23135851162 the *

States in Word Decoding
- Finding the words given the acoustics is an HMM inference problem
- Which state sequence x_{1:T} is most likely given the evidence e_{1:T}?
- From the sequence x, we can simply read off the words
[figure: Huang et al, p. 618]
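The training counts above directly give maximum-likelihood bigram transition probabilities, P(w2 | w1) = count(w1 w2) / count(w1 *). A minimal sketch using the listed counts:

```python
# Bigram counts from the table above; "the *" is the total count of
# bigrams starting with "the".
BIGRAM_COUNTS = {('the', 'first'):     198015222,
                 ('the', 'same'):      194623024,
                 ('the', 'following'): 168504105,
                 ('the', 'world'):     158562063,
                 ('the', 'door'):       14112454}
TOTAL = {'the': 23135851162}

def bigram_prob(w1, w2):
    """Maximum-likelihood estimate of P(w2 | w1)."""
    return BIGRAM_COUNTS.get((w1, w2), 0) / TOTAL[w1]

print(bigram_prob('the', 'first'))   # about 0.0086
```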
End of Part II!
- Now we are done with our unit on probabilistic reasoning
- Last part of class: machine learning

Machine Learning
- Up until now: how to use a model to make optimal decisions
- Machine learning: how to acquire a model from data / experience
  - Learning parameters (e.g. probabilities)
  - Learning structure (e.g. BN graphs)
  - Learning hidden concepts (e.g. clustering)

Parameter Estimation
- Estimating the distribution of a random variable
- Elicitation: ask a human (why is this hard?)
- Empirically: use training data (learning!)
- E.g., for each outcome x, look at the empirical rate of that value:
    P_ML(x) = count(x) / total samples
  so from the samples r, r, b we get P_ML(r) = 2/3
- This is the estimate that maximizes the likelihood of the data

Estimation: Smoothing
- Relative frequencies are the maximum likelihood estimates:
    theta_ML = argmax over theta of P(data | theta)
- Another option is to consider the most likely parameter value given the data:
    theta_MAP = argmax over theta of P(theta | data)
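A minimal sketch of the relative-frequency (maximum likelihood) estimate just described:

```python
from collections import Counter

def ml_estimate(samples):
    """Empirical rate of each outcome: count(x) / total samples."""
    counts = Counter(samples)
    return {x: c / len(samples) for x, c in counts.items()}

print(ml_estimate(['r', 'r', 'b']))   # {'r': 0.667, 'b': 0.333}
```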
Smoothing

Estimation: Laplace Smoothing
- Laplace's estimate: pretend you saw every outcome once more than you actually did:
    P_LAP(x) = (count(x) + 1) / (N + |X|)
  e.g. from the samples r, r, b: P_LAP(r) = 3/5, P_LAP(b) = 2/5
- Can derive this estimate with Dirichlet priors (see cs281a)

Estimation: Laplace Smoothing (extended)
- Laplace's estimate (extended): pretend you saw every outcome k extra times:
    P_LAP,k(x) = (count(x) + k) / (N + k|X|)
- What's Laplace with k = 0? (just the maximum likelihood estimate)
- k is the strength of the prior
- Laplace for conditionals: smooth each condition independently:
    P_LAP,k(x | y) = (count(x, y) + k) / (count(y) + k|X|)
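A minimal sketch of Laplace smoothing with strength k, matching the formulas above:

```python
from collections import Counter

def laplace_estimate(samples, outcomes, k=1):
    """P_LAP,k(x) = (count(x) + k) / (N + k * |X|)."""
    counts = Counter(samples)
    total = len(samples) + k * len(outcomes)
    return {x: (counts[x] + k) / total for x in outcomes}

print(laplace_estimate(['r', 'r', 'b'], outcomes=['r', 'b'], k=1))
# {'r': 0.6, 'b': 0.4}; with k = 0 this reduces to the ML estimate.
```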