Lecture 5: Isolated Word Recognition

- Hidden Markov Models of speech
- State transition and alignment probabilities
- Searching all possible alignments
- Dynamic Programming
- Viterbi Alignment
- Isolated Word Recognition

8.3 Isolated Word Recogniser

[Figure: block diagram. Speech -> Preprocessor -> Hypothesis Generator ("cat"), consulting a Language Model pr(w) and an Acoustic Model pd(s|w).]

- Preprocessor: identifies the start/end of each word and converts the speech into a sequence of feature vectors at intervals of around 10 ms.
- Hypothesis Generator: tries each possible word in turn.
- Language Model: estimates the probability of the word; this can depend on the preceding words.
- Acoustic Model: calculates the probability density that an observed sequence of feature vectors corresponds to the chosen word.

8.4 Scores
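The decision rule implied by this block diagram can be sketched in a few lines. This is a minimal illustration, not the lecture's code: `lexicon`, `log_prior` and `acoustic_log_density` are hypothetical stand-ins for the language model and acoustic model described above.

```python
import math

# Hypothetical sketch of the hypothesis generator's decision rule:
# pick the word w maximising log pr(w) + log pd(s|w).
def recognise(frames, lexicon, log_prior, acoustic_log_density):
    best_word, best_score = None, -math.inf
    for w in lexicon:
        # Language-model score plus acoustic-model score, both in log space.
        score = log_prior(w) + acoustic_log_density(frames, w)
        if score > best_score:
            best_word, best_score = w, score
    return best_word
```

Working in log probabilities turns the product pr(w) * pd(s|w) into a sum, which matters later when the acoustic score is itself a product of many per-frame densities.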
8.5 Speech Production Model

[Figure: the word "cat" modelled as a chain of states c c c a a a t t t, several states per phoneme.]

Each phoneme in a word corresponds to a number of model states. Each model state represents a distinct sound with its own acoustic spectrum. For each state, we store the mean value and variance of each of F features.

When saying a word, the speaker stays in each state for one or more frames and then goes on to the next state. The time in each state will vary according to how fast he/she is speaking: some speech sounds last longer than others.

8.6 State Transitions

Assume that the probability of proceeding to the next state is p and the probability of staying in the same state is 1 − p. The probability of being in any state for the next frame depends only on the current state and not on any previous history: this is a Markov process.

[Figure: three-state model for "cat" with self-loop probabilities 0.6, 0.8, 0.9 and exit probabilities 0.4, 0.2, 0.1.]

More realistic assumptions increase computation without improving performance.

[Figure: plot of probability versus state duration in frames, showing a negative exponential decay.]

The length of time, D, spent in any state follows a negative exponential (geometric) distribution with an average duration of 1/p frames:

    pr(D = n) = p (1 − p)^(n−1),  n = 1, 2, 3, …

    E(D) = Σ_{n≥1} n p (1 − p)^(n−1) = p / (1 − (1 − p))^2 = 1/p

since, by differentiating Σ_{n≥0} x^n = 1/(1 − x), we obtain Σ_{n≥1} n x^(n−1) = 1/(1 − x)^2.
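The duration distribution and its mean can be checked numerically. A small sketch, assuming nothing beyond the formula above:

```python
# State-duration distribution: pr(D = n) = p * (1 - p)**(n - 1).
def duration_pmf(p, n):
    return p * (1 - p) ** (n - 1)

def mean_duration(p, n_max=10_000):
    # Truncated expectation; the tail beyond n_max is negligible for moderate p.
    return sum(n * duration_pmf(p, n) for n in range(1, n_max + 1))
```

With p = 0.2 (i.e. a self-loop probability of 0.8), the average duration comes out as 1/0.2 = 5 frames, as the formula predicts.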
8.7 Alignment Probabilities

[Figure: three-state model with self-loop probabilities 0.6, 0.8, 0.9 and exit probabilities 0.4, 0.2, 0.1.]

We can calculate the probability of having, say, 3 frames in the first state, 2 in the second state, and 6 in the third state:

    0.6 × 0.6 × 0.4 × 0.8 × 0.2 × 0.9^5 ≈ 0.014

A complete specification of which frames are in each state is an alignment. As before, we can use log probabilities and add them instead of multiplying, which avoids dynamic range problems.

8.8 Hidden Markov Models

[Figure: frames of "cat" aligned to the states c c c a a a t t t.]

To calculate the probability density (pd) that an observation matches a particular word with a given alignment, we multiply together:

- the probability of the alignment
- the output pd of each frame

Try this for every possible alignment of every possible word sequence and choose the one with the highest probability.

Hidden Markov Model: the correct alignment is hidden, so we cannot observe it directly. We talk of the probability density of the model generating the observed frame sequence.
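The product above generalises to any left-to-right alignment: each state contributes one self-loop per extra frame and one exit transition. A small sketch, assuming the self-loop probabilities 0.6, 0.8, 0.9 from the three-state example:

```python
# Probability of one complete alignment of frames to states.
def alignment_probability(durations, stay):
    """durations[i]: frames spent in state i; stay[i]: self-loop probability."""
    prob = 1.0
    for i, d in enumerate(durations):
        prob *= stay[i] ** (d - 1)      # d - 1 self-loop transitions
        if i < len(durations) - 1:
            prob *= 1.0 - stay[i]       # one transition on to the next state
    return prob
```

`alignment_probability([3, 2, 6], [0.6, 0.8, 0.9])` reproduces the worked product 0.6^2 × 0.4 × 0.8 × 0.2 × 0.9^5.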
8.9 Hidden Markov Model Parameters

[Figure: frames of "cat" aligned to the states c c c a a a t t t.]

A Hidden Markov Model for a word must specify the following parameters for each state s:

- The mean and variance of each of the F elements of the parameter vector: μ_{s,f} and σ²_{s,f}. These allow us to calculate d_s(x): the output probability density of input frame x in state s.
- The transition probabilities a_{s,j} to every possible successor state. a_{s,j} is often zero for all j except j = s and j = s + 1; it is then called a left-to-right, no-skip model.

For a Hidden Markov Model with S states we therefore have around (2F + 2)S parameters. A typical word might have S = 15 and F = 9, giving 300 parameters in all.

8.10 Minimum Cost Path

Suppose we want to find the cheapest path through a toll road system:

[Figure: layered graph of circles from Start to Finish, with a cost marked on each road segment.]

In each circle we will enter the lowest cost of a journey from Start:

- Begin by putting 0 in the Start circle.
- Fill in the 2nd-column circles with the lowest cost of reaching each, and mark all three chosen segments in bold. This shows the lowest cost to each of these circles and the route by which you got it.
- Fill in the 3rd-column circles and, in each case, mark the best segment from the previous column in bold.
- Do the same for the 4th column, and then put the overall lowest cost, 7, in the Finish circle.

We can trace bold segments backwards from Finish to find the best overall path.
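The column-by-column procedure above can be written directly as code. A minimal sketch: the example graph and its costs are made up for illustration (the figure's numbers are not reproduced here); the procedure is the point.

```python
# Dynamic-programming pass over a layered graph like the toll-road figure.
def min_cost_path(layers, cost):
    """layers: list of columns of node names, first and last of length 1.
    cost[(u, v)]: cost of the segment u -> v. Returns (total cost, best path)."""
    best = {layers[0][0]: (0, [layers[0][0]])}   # put 0 in the Start circle
    for prev_col, col in zip(layers, layers[1:]):
        new = {}
        for v in col:
            # Keep only the cheapest way of reaching v from the previous column
            # (the "bold segment" in the figure).
            c, path = min((best[u][0] + cost[(u, v)], best[u][1])
                          for u in prev_col if (u, v) in cost)
            new[v] = (c, path + [v])
        best = new
    return best[layers[-1][0]]
```

Because only the best cost and route into each circle are kept, the work per column is proportional to the number of segments, not to the number of complete paths.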
8.11 Dynamic Programming

This technique for finding the minimum cost path through a graph is known as dynamic programming. Three conditions must be true:

- All paths through the graph must go from left to right.
- The cost of each segment of a path must be fixed in advance: it must not depend on which of the other segments are included in the route.
- The total cost of a path must just be the sum of its segments' costs.

Dynamic programming is guaranteed to find the path with minimum cost. We can also find the maximum cost path in the same way: in this case the costs are usually called utilities instead.

We can use Dynamic Programming to find the best alignment of a sequence of feature vectors with a word model, where "best" means the alignment with the highest production probability density.

8.12 Alignment Graph

[Figure: lattice of circles from Start to Finish; columns are speech frames, rows are model states.]

We can draw an alignment graph (or lattice):

- The columns correspond to speech frames.
- The rows correspond to model states.
- Each possible path from Start to Finish corresponds to an alignment of the speech frames with the model.

All valid paths pass through each column in turn. In going to the next column, a path is restricted to the state transitions allowed by the state diagram. In the above example a path must either remain in the same state or else go on to the next state.
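To see why searching all possible alignments explicitly is hopeless, it helps to count them. For a left-to-right, no-skip model, an alignment is fixed by choosing which S − 1 of the T − 1 frame boundaries carry a state change. This counting argument is an illustration added here, not part of the lecture:

```python
from math import comb

# Number of distinct alignments of T frames to S states in a
# left-to-right, no-skip model: C(T - 1, S - 1).
def num_alignments(T, S):
    return comb(T - 1, S - 1)
```

For the small lattice above, `num_alignments(11, 3)` is only 45, but for a realistic word (say T = 100 frames, S = 15 states) the count exceeds 10^16, which is why the dynamic-programming recursion of the next section is essential.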
8.13 Segment Utilities

[Figure: alignment lattice with one segment highlighted, ending at frame x_10; segment prob density = a_{i,j} · d_j(x_10).]

The probability density of a path segment going from state i to state j is the product of:

- the probability of the transition: a_{i,j}
- the output probability density of the corresponding input frame in state j: d_j(x)

The probability density of the entire alignment is the product of its constituent segments' pds. This equals the pd that the model generates the observed sequence of feature vectors with this particular alignment.

8.14 Dynamic Programming Step

[Figure: lattice showing B(1,9), B(2,9) and B(3,9) feeding into B(3,10).]

Define B(s,t) to be the probability density of the best partial alignment beginning at Start and ending with frame t in state s. Any alignment going through (3,10) must go through either (1,9), (2,9) or (3,9). Hence:

    B(3,10) = max_s ( B(s,9) · a_{s,3} ) · d_3(x_10)

In general, we can calculate B(*,t) from B(*,t−1):

    B(k,t) = max_s ( B(s,t−1) · a_{s,k} ) · d_k(x_t)

Initialise this recursion by setting B(1,1) = d_1(x_1).
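A single step of this recursion is easy to write down. A minimal sketch in log space, where the products become sums (avoiding the dynamic-range problem noted earlier); the toy values used to exercise it are invented:

```python
import math

# One dynamic-programming step, mirroring
# B(k,t) = max_s B(s,t-1) * a[s][k] * d_k(x_t), but in log space.
def dp_step(logB_prev, log_a, log_d_t):
    """logB_prev[s]: log B(s, t-1); log_a[s][k]: log transition probability
    (use -inf for forbidden transitions); log_d_t[k]: log output density of
    the current frame in state k. Returns the list of log B(k, t)."""
    S = len(logB_prev)
    return [max(logB_prev[s] + log_a[s][k] for s in range(S)) + log_d_t[k]
            for k in range(S)]
```

Applying `dp_step` once per frame visits every segment of the lattice exactly once, so the full pass costs O(T · S²) operations rather than one per alignment.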
8.15 Viterbi Alignment

[Figure: lattice from Start to Finish, with exit probability a_{S,S+1} leaving the final state.]

This procedure is called Viterbi Alignment:

    B(1,1) = d_1(x_1);  B(s,1) = 0 for s > 1
    for t = 2:T
        for k = 1:S
            z(k,t) = j = argmax_s ( B(s,t−1) · a_{s,k} )  over all s with a_{s,k} ≠ 0
            B(k,t) = B(j,t−1) · a_{j,k} · d_k(x_t)
        end
    end

The best path has prob density B(S,T) · a_{S,S+1}, where a_{S,S+1} is the exit probability from the final state. The z(k,t) array stores the information needed to trace back the best path.

8.16 Isolated Word Recognition

Requires the speaker to insert a gap between each word. Used for budget systems with little CPU power.

Recognition:

- Extract a word-long segment of speech, s, from the input signal. Convert it into a sequence of frames.
- Calculate pr(w|s) ∝ pr(w) · pr(s|w) for each possible word, w, in the recognition vocabulary.
  - pr(w) is the prior probability of the word: get this from word frequencies or word-pair frequencies (e.g. "minister" often follows "prime").
  - pr(s|w) is obtained by using the Viterbi alignment algorithm to find the log probability density of the best alignment of s with the model for w.
- Choose the word with the highest probability.

We need to create a separate Hidden Markov Model for each word in the vocabulary.
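The pseudocode above translates directly into a runnable sketch. One simplification: the final multiplication by the exit probability a_{S,S+1} is omitted, since it only scales the score by a model-dependent constant; states are numbered from 0 as is usual in code.

```python
import math

# Runnable sketch of the Viterbi alignment pseudocode, in log space.
def viterbi(log_a, log_d):
    """log_a[s][k]: log transition probability (use -inf where a = 0);
    log_d[t][k]: log output density of frame t in state k.
    Returns (log B(S,T), best state sequence)."""
    T, S = len(log_d), len(log_a)
    B = [[-math.inf] * S for _ in range(T)]
    z = [[0] * S for _ in range(T)]
    B[0][0] = log_d[0][0]                  # alignments must start in state 0
    for t in range(1, T):
        for k in range(S):
            # Best predecessor state j for (k, t); -inf entries in log_a
            # automatically exclude forbidden transitions.
            j = max(range(S), key=lambda s: B[t - 1][s] + log_a[s][k])
            z[t][k] = j
            B[t][k] = B[t - 1][j] + log_a[j][k] + log_d[t][k]
    # Trace the best path back from the final state using z.
    states = [S - 1]
    for t in range(T - 1, 0, -1):
        states.append(z[t][states[-1]])
    return B[T - 1][S - 1], states[::-1]
```

For recognition, this function would be called once per vocabulary word, its log score added to log pr(w), and the highest-scoring word chosen, exactly as section 8.16 describes.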