MACHINE LEARNING. Learning Bayesian networks


1 MACHINE LEARNING. Vasant Honavar. Bioinformatics and Computational Biology Program, Center for Computational Intelligence, Learning, & Discovery, Iowa State University.

Learning Bayesian networks: Data + Prior information yield a Bayesian network (structure plus CPTs). [Figure: example network over variables E, B, R, A, L, C with a conditional probability table P(A | E, B), one row per combination of e, b.]

2 The Learning Problem

- Known structure, complete data: statistical parametric estimation (closed-form equations)
- Known structure, incomplete data: parametric optimization (EM, gradient descent, ...)
- Unknown structure, complete data: discrete optimization over structures (discrete search)
- Unknown structure, incomplete data: combined methods (Structural EM, mixture models, ...)

Known structure, complete data: given complete records over E, B, A such as <Y,N,N>, <Y,Y,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y> and the structure E -> A <- B, fill in the unknown CPT entries P(A | E, B) by parametric estimation.

3 Known structure, complete data (recap): estimate the CPT P(A | E, B) from fully observed records over E, B, A.

Known structure, incomplete data: some records contain missing values, e.g. <Y,N,?>, <Y,?,Y>, <N,N,Y>, <?,Y,Y>, ..., <N,?,Y>; parameter estimation now requires parametric optimization (EM, gradient descent, ...).

4 Learning Bayesian Networks: roadmap over the four cases (known vs. unknown structure, complete vs. incomplete data).
» Parameter learning, complete data (review): statistical parametric fitting; maximum likelihood estimation; Bayesian inference
» Parameter learning, incomplete data
» Structure learning, complete data; application: classification
» Structure learning, incomplete data

5 Estimating probabilities from data (discrete case): maximum likelihood estimation; Bayesian estimation; maximum a posteriori estimation.

Bayesian estimation: treat the unknown parameters as random variables; assume a prior distribution for the unknown parameters; update the distribution of the parameters based on data; use Bayes rule to make predictions.

6 Bayesian Networks and Bayesian Prediction. [Plate-notation figure: parameters theta_X and theta_Y|X; observed instances X[1..M], Y[1..M]; query X[M+1], Y[M+1].] Priors for each parameter group are independent; data instances are independent given the unknown parameters.

From the same network we can read off: given complete data, the posteriors on the parameters are independent.

7 Bayesian Prediction (cont.). Since posteriors on parameters for each node are independent, we can compute them separately. Posteriors for parameters within a node are also independent: in the refined model, given complete data, the posteriors on $\theta_{Y|X=0}$ and $\theta_{Y|X=1}$ are independent.

Given these observations, we can compute the posterior for each multinomial $\theta_{X_i \mid pa_i}$ independently. The posterior is Dirichlet with parameters $\alpha(x_i^1, pa_i) + N(x_i^1, pa_i), \ldots, \alpha(x_i^k, pa_i) + N(x_i^k, pa_i)$. The predictive distribution is then represented by the parameters

$$\tilde{\theta}_{x_i \mid pa_i} = \frac{\alpha(x_i, pa_i) + N(x_i, pa_i)}{\alpha(pa_i) + N(pa_i)}$$

8 Assigning Priors for Bayesian Networks. We need the $\alpha(x_i, pa_i)$ for each node $X_i$. We can use initial parameters $\Theta_0$ as prior information, together with an equivalent sample size parameter $M_0$. Then we let $\alpha(x_i, pa_i) = M_0 \cdot P(x_i, pa_i \mid \Theta_0)$. This allows updating a network in response to new data.

Learning Parameters. Comparing two distributions, P(x) (true model) vs. Q(x) (learned distribution), we measure their KL divergence:

$$KL(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}, \qquad KL(P \| Q) \ge 0, \qquad KL(P \| Q) = 0 \text{ iff } P \text{ and } Q \text{ are equal}$$

9 Learning Parameters: Summary. Estimation relies on sufficient statistics; for multinomials these are counts of the form $N(x_i, pa_i)$. Parameter estimation:

$$\hat{\theta}_{x_i \mid pa_i} = \frac{N(x_i, pa_i)}{N(pa_i)} \;\text{(MLE)} \qquad \tilde{\theta}_{x_i \mid pa_i} = \frac{\alpha(x_i, pa_i) + N(x_i, pa_i)}{\alpha(pa_i) + N(pa_i)} \;\text{(Bayesian, Dirichlet)}$$

Bayesian methods also require a choice of priors. MLE and Bayesian estimates are asymptotically equivalent and consistent, but the latter work better with small samples. Both can be implemented in an on-line manner by accumulating sufficient statistics; see the sketch below.
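To make the two estimators concrete, here is a minimal sketch for a single CPT. The toy records and the `estimate_cpt` helper are illustrative, not from the slides; the Dirichlet prior is taken to be uniform with pseudocount `alpha` per value.

```python
from collections import Counter

# Toy complete-data records over (E, B, A); values are illustrative.
data = [("Y","N","N"), ("Y","Y","Y"), ("N","N","Y"), ("N","Y","Y"), ("N","Y","Y")]

def estimate_cpt(data, alpha=1.0):
    """Estimate P(A | E, B) by MLE (alpha=0) or Dirichlet smoothing (alpha>0)."""
    joint = Counter((e, b, a) for e, b, a in data)   # N(a, pa)
    parent = Counter((e, b) for e, b, _ in data)     # N(pa)
    values = {"Y", "N"}
    cpt = {}
    for (e, b), n_pa in parent.items():
        for a in values:
            n = joint[(e, b, a)]
            # MLE: n / n_pa ; Bayesian: (alpha + n) / (len(values)*alpha + n_pa)
            cpt[(a, e, b)] = (alpha + n) / (len(values) * alpha + n_pa)
    return cpt

print(estimate_cpt(data, alpha=0.0))  # maximum likelihood
print(estimate_cpt(data, alpha=1.0))  # Dirichlet(1, ..., 1) prior
```

Because both estimators depend on the data only through the counts, they can be updated on-line by incrementing the counters as new records arrive.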

10 Why do we need accurate structure? Consider the network Earthquake -> Alarm Set <- Burglary, Alarm Set -> Sound. Missing an arc: cannot be compensated for by fitting parameters; incorrect independence assumptions. Extraneous arc: increases the number of parameters to be estimated; incorrect independence assumptions.

Approaches to BN Structure Learning. Score-based methods: assign a score to each candidate BN structure using a suitable scoring function, and search the space of candidate network structures for a BN structure with the maximum score. Independence-testing-based methods: use independence tests to determine the structure of the network.

11 Score-based BN Structure Learning. Define a scoring function that evaluates how well a structure matches the data (e.g., records over E, B, A such as <Y,N,N>, <Y,Y,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>), then search for a structure that maximizes the score. [Figure: three candidate structures over E, A, B.]

Need for parsimony. [Figure only.]

12 Basic idea: the Minimum Description Length (MDL) principle.

$$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$$

$$h_{MDL} = \arg\min_{h \in H} \left( -\log P(D \mid h) - \log P(h) \right) = \arg\min_{h \in H} \left( C(D \mid h) + C(h) \right)$$

We need to design a scoring function that minimizes the description length of the hypothesis and the description length of the data given the hypothesis. In this case, the hypothesis is a Bayesian network, which represents a joint probability distribution.

Scoring function. A BN scoring function consists of: a term that corresponds to the number of bits needed to encode the BN structure and parameters, and a term that corresponds to the number of bits needed to encode the data given the BN. We proceed to specify each of these terms.

13 Encoding a Bayesian Network. It suffices to list the parents of each node and record the conditional probabilities associated with each node. Consider a BN with n variables, and a node with k parents: we need $k \log_2 n$ bits to list its parents. Suppose the node (variable $X_i$) takes $s_i$ distinct values, the j-th parent takes $s_j$ distinct values, and we use d bits to store each conditional probability.

Under the encoding scheme described, the description length of a particular Bayesian network is given by

$$\sum_{i=1}^{n} \left( k_i \log_2 n + d\,(s_i - 1) \prod_{X_j \in Parents(X_i)} s_j \right)$$

14 Encoding the Data. Suppose we have M independent observations (instantiations) of the random variables $X_1 \ldots X_n$. Let $V_i$ be the domain of random variable $X_i$. Each observation corresponds to an atomic event $e \in V_1 \times V_2 \times \cdots \times V_n$. Let $p_e$ be the probability of e. When M is large, we expect $M p_e$ occurrences of e among the M observations. Under optimal encoding, the number of bits needed to encode the data is

$$-M \sum_{e \in V_1 \times \cdots \times V_n} p_e \log_2 p_e$$

Encoding the data using a Bayesian network. But we do not know $p_e$, the probability of e! What we have instead is a Bayesian network, which assigns a probability $q_e$ to e. When we use the learned network to encode the data, the number of bits needed to encode the data using the network is

$$-M \sum_{e \in V_1 \times \cdots \times V_n} p_e \log_2 q_e$$

15 Encoding the data using a Bayesian network. Theorem (Gibbs):

$$-\sum_{e \in V_1 \times \cdots \times V_n} p_e \log_2 p_e \;\le\; -\sum_{e \in V_1 \times \cdots \times V_n} p_e \log_2 q_e$$

with equality holding if and only if $p_e = q_e$ for all e. That is, the number of bits needed to encode the data if the true probabilities of each atomic event are known is less than or equal to the number of bits needed using a code based on the estimated probabilities.

Putting the two together. The MDL principle recommends minimizing the sum of the encoding length of the model (Bayes network) and the encoding length of the data using the model:

$$\sum_{i=1}^{n} \left( k_i \log_2 n + d\,(s_i - 1) \prod_{X_j \in Parents(X_i)} s_j \right) \;-\; M \sum_{e \in V_1 \times \cdots \times V_n} p_e \log_2 q_e$$

Problems with evaluating the second term: we do not know the probabilities $p_e$, and the second term requires summation over all atomic events (all instantiations of the n random variables).

16 Kullback-Leibler divergence to the rescue! Let P and Q be two probability distributions over the same event space, such that an event e is assigned probability $p_e$ by P and $q_e$ by Q:

$$KL(P \| Q) = \sum_e p_e \left( \log p_e - \log q_e \right), \qquad KL(P \| Q) \ge 0, \qquad KL(P \| Q) = 0 \text{ iff } P = Q$$

Theorem: the encoding length of the data is a monotonically increasing function of the KL divergence between the distribution Q defined by the model and the true distribution P. Hence, we can use the estimated KL divergence as a proxy for the encoding length of the data (using the model) to score a model. We can use local computations over a Bayes network to evaluate $KL(P \| Q)$.

17 Applying the MDL Principle. Exhaustive search over the space of all networks is infeasible, and evaluating the KL divergence directly is infeasible. Hence we need to resort to a heuristic search to find a network with a near-minimal description length, and develop a more efficient method of evaluating the KL divergence of a candidate network.

Evaluating KL divergence for a network. Theorem (Chow and Liu, 1968). Suppose we define the mutual information between any two nodes $X_i$ and $X_j$ as

$$W(X_i, X_j) = \sum_{(x_i, x_j)} P(x_i, x_j) \log_2 \frac{P(x_i, x_j)}{P(x_i)\,P(x_j)}$$

Then the cross entropy $KL(P \| Q)$ over all tree-structured distributions is minimized when the graph representing $Q(X_1 \ldots X_n)$ is a maximum-weight spanning tree of the graph in which the edge between nodes $X_i$ and $X_j$ is assigned the weight $W(X_i, X_j)$. The resulting tree-structured model can be shown to correspond to the maximum likelihood model among all tree-structured models.
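A sketch of the Chow-Liu construction under this theorem's assumptions, using empirical probabilities from complete discrete data in place of the true P; the function names and the Prim-style spanning-tree loop are illustrative choices.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information W(X_i, X_j) in bits."""
    n = len(data)
    pij = Counter((row[i], row[j]) for row in data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    w = 0.0
    for (xi, xj), c in pij.items():
        p = c / n
        w += p * math.log2(p / ((pi[xi] / n) * (pj[xj] / n)))
    return w

def chow_liu_tree(data, n_vars):
    """Edges of a maximum-weight spanning tree under MI edge weights (Prim's algorithm)."""
    w = {(i, j): mutual_information(data, i, j)
         for i, j in combinations(range(n_vars), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < n_vars:
        # Pick the heaviest edge with exactly one endpoint already in the tree.
        i, j = max(((i, j) for (i, j) in w
                    if (i in in_tree) != (j in in_tree)), key=lambda e: w[e])
        edges.append((i, j))
        in_tree |= {i, j}
    return edges
```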

18 Evaluating KL divergence for a network. Theorem (Lam and Bacchus, 1994). Suppose we define a weight measure between a node $X_i$ and an arbitrary parent set $Parents(X_i)$:

$$W(X_i, Parents(X_i)) = \sum P(X_i, Parents(X_i)) \log_2 \frac{P(X_i, Parents(X_i))}{P(X_i)\,P(Parents(X_i))}$$

Then the cross entropy $KL(P \| Q)$ for a Bayesian network representing $Q(X_1 \ldots X_n)$ is a monotonically decreasing function of $\sum_{i=1}^{n} W(X_i, Parents(X_i))$. Hence, $KL(P \| Q)$ is minimized if and only if this sum of weights is maximized.

In words: if we find a Bayes network that maximizes $\sum_{i=1}^{n} W(X_i, Parents(X_i))$, then the probability distribution Q modeled by the network will be closest, with respect to $KL(P \| Q)$, to the underlying distribution P from which the data have been sampled. Theorem: it is always possible to decrease $KL(P \| Q)$ by adding arcs to the network. Hence the need for MDL!

19 In summary, we need to find a Bayes network that maximizes

$$\sum_{i=1}^{n} W(X_i, Parents(X_i))$$

while minimizing

$$\sum_{i=1}^{n} \left( k_i \log_2 n + d\,(s_i - 1) \prod_{X_j \in Parents(X_i)} s_j \right)$$

Alternative Scoring Functions - Notation. Each $X_i$ takes $r_i$ distinct values; $\theta_{ijk}$ is the probability that $X_i$ takes the k-th value in its domain given the j-th instantiation of its parent set $Parents(X_i)$; $N_{ijk}$ are the observed counts for the corresponding instantiation; $\eta_{ijk}$ are the pseudocounts (from the Dirichlet prior); and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$, $\eta_{ij} = \sum_{k=1}^{r_i} \eta_{ijk}$.

20 Bayesian scoring function. Let $B = (G, P_\theta)$ be a Bayesian network with graph structure G and probability distribution $P(\theta)$ over a set of n random variables, with a prior probability distribution p(B) over the networks. The posterior probability given data D is

$$p(G, \theta \mid D) = \frac{p(G, \theta, D)}{p(D)} \propto p(G, \theta, D) = p(G)\,p(\theta \mid G)\,p(D \mid G, \theta)$$

$$p(D \mid G, \theta) = \prod_{i=1}^{n} \prod_{j=1}^{s_i} \prod_{k=1}^{r_i} \theta_{ijk}^{N_{ijk}}, \qquad \hat{\theta}_{ijk} = \frac{N_{ijk} + \eta_{ijk}}{N_{ij} + \eta_{ij}}$$

where n is the number of random variables, $r_i$ is the number of distinct values of node i, $s_i$ is the number of instantiations of the parents of node i, $N_{ijk}$ are the corresponding counts estimated from D, and $\eta_{ijk}$ are the corresponding pseudocounts.

21 Geiger-Heckerman Scoring Function. The Geiger-Heckerman measure for a BN with graph G and parameters $\Theta$:

$$Q_{GH}(G, D) = \log p(G) + \log \int p(D \mid G, \Theta)\,p(\Theta \mid G)\,d\Theta = \log p(G) + \sum_{i=1}^{n} \sum_{j=1}^{s_i} \left( \log \frac{\Gamma(\eta_{ij})}{\Gamma(\eta_{ij} + N_{ij})} + \sum_{k=1}^{r_i} \log \frac{\Gamma(\eta_{ijk} + N_{ijk})}{\Gamma(\eta_{ijk})} \right)$$

Drawback: does not explicitly penalize complex networks.

Cooper-Herskovits Scoring Function. The Cooper-Herskovits measure for a BN with graph G and parameters $\Theta$:

$$Q_{CH}(G, D) = \log p(G) + \sum_{i=1}^{n} \sum_{j=1}^{s_i} \left( \log \frac{\Gamma(r_i)}{\Gamma(r_i + N_{ij})} + \sum_{k=1}^{r_i} \log \Gamma(1 + N_{ijk}) \right)$$

Drawback: does not explicitly penalize complex networks.

22 Standard Bayesian Measure. The standard Bayesian measure for a BN with graph G and parameters $\Theta$:

$$Q_{Bayes}(G, D) = \log p(G) + \sum_{i=1}^{n} \sum_{j=1}^{s_i} \sum_{k=1}^{r_i} (N_{ijk} + \eta_{ijk}) \log \frac{N_{ijk} + \eta_{ijk}}{N_{ij} + \eta_{ij}} - \frac{Dim(G)}{2} \log N$$

where Dim(G) is the number of parameters in the BN and N is the sample size; $\frac{1}{2}\log N$ is the average number of bits needed to store a number between 1 and N.

Standard Bayesian Measure, asymptotic version. The asymptotic version of the standard Bayesian measure for a BN with graph G and parameters $\Theta$:

$$Q_{AsymBayes}(G, D) = Q_{MDL}(G, D) = \log p(G) + \sum_{i=1}^{n} \sum_{j=1}^{s_i} \sum_{k=1}^{r_i} N_{ijk} \log \frac{N_{ijk}}{N_{ij}} - \frac{Dim(G)}{2} \log N$$

23 Asymptotic Information Measures:

$$Q_I(B, D) = \log p(G) + \sum_{i=1}^{n} \sum_{j=1}^{s_i} \sum_{k=1}^{r_i} N_{ijk} \log \frac{N_{ijk}}{N_{ij}} - \dim(B)\,f(N)$$

where f(N) is a non-negative penalty function: f(N) = 0 gives the maximum likelihood information criterion; f(N) = 1 gives the Akaike information criterion; f(N) = (1/2) log N gives the Schwarz information criterion. Note: MDL is a special case of this measure.

Structure Search as Optimization. Input: training data, a scoring function, and a set of possible structures. Output: a network that maximizes the score. Key computational property, decomposability: score(G) = sum over i of score(family of $X_i$ in G).

24 Tree-Structured Networks. Trees: at most one parent per variable. Why trees? Elegant mathematics: we can exactly and efficiently solve the optimization problem. Sparse parameterization: avoids overfitting. [Figure: the ALARM monitoring network (MINVOLSET, PCWP, TPR, HYPOVOLEMIA, LVEDVOLUME, CVP, PULMEMBOLUS, PAP, CO, SHUNT, ..., BP) as an example of a large structure.]

Learning Trees. Let p(i) denote the parent of $X_i$. We can write the Bayesian score as

$$Score(G : D) = \sum_i Score(X_i : Pa_i) = \sum_i \left( Score(X_i : X_{p(i)}) - Score(X_i) \right) + \sum_i Score(X_i)$$

Score = sum of edge scores + constant, where each edge score is the improvement over the empty network and the constant is the score of the empty network.

25 Learning Trees (cont.). Set $w(j \to i) = Score(X_i \mid X_j) - Score(X_i)$. Find the tree (or forest) with maximal weight using a standard maximum spanning tree algorithm, in $O(n^2 \log n)$ time. Theorem: this procedure finds the tree with the maximum score.

Beyond Trees. When we consider more complex networks, the problem is not as easy. Suppose we allow at most two parents per node: a greedy algorithm is no longer guaranteed to find the optimal network; in fact, no efficient algorithm exists. Theorem: finding the maximal scoring structure with at most k parents per node is NP-hard for k > 1.

26 Heuristic Search. Define a search space: search states are possible structures; operators make small changes to structure. Traverse the space looking for high-scoring structures. Search techniques: greedy hill-climbing, best-first search, simulated annealing, ...

K2 Algorithm (Cooper and Herskovits). Start with an ordered list of random variables. For each variable $X_i$, add to its parent set a node that is lower-numbered than $X_i$ and yields the maximum improvement in score. Repeat until the score does not improve or a complete network is obtained. Disadvantage: requires an ordered list of nodes. A sketch of the loop follows.
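A minimal sketch of K2, assuming some decomposable family scoring function `family_score(i, parents, data)` is supplied (for example, the Cooper-Herskovits family term above); that helper name and the `max_parents` cap are illustrative.

```python
def k2(order, data, family_score, max_parents=3):
    """K2: greedy parent selection for each variable under a fixed ordering."""
    parents = {i: set() for i in order}
    for pos, i in enumerate(order):
        best = family_score(i, parents[i], data)
        improved = True
        while improved and len(parents[i]) < max_parents:
            improved = False
            # Only nodes earlier in the ordering are candidate parents.
            candidates = [j for j in order[:pos] if j not in parents[i]]
            scored = [(family_score(i, parents[i] | {j}, data), j)
                      for j in candidates]
            if scored:
                s, j = max(scored)
                if s > best:              # keep the single best addition
                    best, improved = s, True
                    parents[i].add(j)
    return parents
```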

27 B Algorithm (Buntine). Start with the parent set of each random variable initialized to the empty set. At each step, add a link (a node to the parent set of some node) that does not introduce a cycle and yields the maximum improvement in score. Repeat until the score does not improve or a complete network is obtained.

Local Search. Start with a given network: the empty network, the best tree, or a random network. At each iteration, evaluate all possible changes and apply a change based on score; stop when no modification improves the score.

28 Heuristic Search. Typical operations on a structure over S, C, E, D: add an arc (e.g., C -> D), delete an arc (e.g., C -> E), reverse an arc (e.g., C -> E becomes E -> C). To update the score after a local change, only re-score the families that changed, e.g. $\Delta score = S(\{C, E\} \to D) - S(\{E\} \to D)$; see the sketch after this slide.

Learning in Practice: Alarm network. [Plot: KL divergence from the true distribution vs. number of samples, comparing "structure known, fit parameters" against "learn both structure and parameters".]
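Decomposability is what makes the Δscore of a local move cheap: only the family that changed is re-scored. A minimal sketch, reusing the hypothetical `family_score` helper from the K2 sketch above.

```python
def delta_add_edge(c, d, parents, data, family_score):
    """Score change from adding arc c -> d: only X_d's family is re-scored."""
    return (family_score(d, parents[d] | {c}, data)
            - family_score(d, parents[d], data))
```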

29 Local Search: Possible Pitfalls. Local search can get stuck in: local maxima (all one-edge changes reduce the score) and plateaux (some one-edge changes leave the score unchanged). Standard heuristics can escape both: random restarts, TABU search, simulated annealing.

Independence-Based Methods. Rely on independence tests to decide whether to add links between nodes in the structure search phase. Need to penalize complex structures: it is hard to beat a fully connected network! In the most general setting, there are too many independence tests to consider. Sometimes it is possible to infer additional independences based on known (or inferred) independences (see Bromberg et al., 2006 and references cited therein).

30 Structure Search: Summary. A discrete optimization problem. In some cases the optimization problem is easy, for example learning trees; in general it is NP-hard. We need to resort to heuristic search, or restrict connectivity (each node assumed to have no more than l parents, where l is much smaller than n), or use stochastic search, e.g. simulated annealing or genetic algorithms.

Structure Discovery. Task: discover structural properties. Is there a direct connection between X and Y? Does X separate two subsystems? Does X causally affect Y? Example: scientific data mining; disease properties and symptoms; interactions between the expression of genes.

31 Discovering Structure: P(G | D). Model selection: pick a single high-scoring model and use that model to infer domain structure. [Figure: one network over E, B, R, A, C.]

Discovering Structure (cont.). Problem: with small sample size there are many high-scoring models, so an answer based on one model is often useless; we want features common to many models. [Figure: five alternative high-scoring networks over E, B, R, A, C.]

32 Bayesian Approach. Use the posterior distribution over structures to estimate the probability of features such as an edge X -> Y or a path X -> ... -> Y:

$$P(f \mid D) = \sum_G f(G)\,P(G \mid D)$$

where $P(G \mid D)$ is the Bayesian score for G and f(G) is the indicator function for feature f.

MCMC over Networks. We cannot enumerate structures, so we sample structures. MCMC sampling: define a Markov chain over BNs and run the chain to get samples from the posterior P(G | D); then $P(f(G) \mid D) \approx \frac{1}{n} \sum_{i=1}^{n} f(G_i)$. Possible pitfalls: a huge (super-exponential) number of networks; the time for the chain to converge to the posterior is unknown; islands of high posterior, connected by low bridges.

33 Fixed Ordering. Suppose that we know the ordering of the variables, say $X_1 > X_2 > X_3 > X_4 > \cdots > X_n$, so the parents of $X_i$ must be in $X_1, \ldots, X_{i-1}$, and we limit the number of parents per node to k; this leaves at most $2^{k n \log n}$ networks. Intuition: the order decouples the choice of parents; the choice of $Pa(X_7)$ does not restrict the choice of $Pa(X_{12})$. Upshot: we can compute efficiently, in closed form, the likelihood P(D | p) and the feature probability P(f | D, p) for an ordering p.

Sample Orderings. We can write

$$P(f \mid D) = \sum_p P(f \mid p, D)\,P(p \mid D)$$

Sample orderings and approximate $P(f \mid D) \approx \frac{1}{n} \sum_{i=1}^{n} P(f \mid p_i, D)$. MCMC sampling: define a Markov chain over orderings and run the chain to get samples from the posterior P(p | D).

34 Application: Gene Expression Data Analysis (Friedman et al., 2001). Input: measurements of gene expression under different conditions (thousands of genes, hundreds of experiments). Output: models of gene interaction; uncover pathways.

Mating response substructure: SST2, KAR4, TEC1, NDJ1, KSS1, FUS1, PRM1, AGA1, YLR343W, AGA2, TOM6, FIG1, FUS3, YLR334C, MFA1, STE6, YEL059W. An automatically constructed sub-network of high-confidence edges; an almost exact reconstruction of the yeast mating pathway.

35 [The learning-problem overview slide repeats here, now with the CPT filled in: P(A | E, B) with rows (e, b): 0.9 / 0.1; (e, ~b): 0.7 / 0.3; (~e, b): 0.8 / 0.2; (~e, ~b): 0.99 / 0.01.]

Incomplete Data. Data are often incomplete: some variables of interest are not assigned values. This phenomenon occurs when we have missing values (some variables unobserved in some instances) and hidden variables (some variables are never observed; we might not even know they exist).

36 Hidden (Latent) Variables. Why should we care about hidden variables? [Figure: X1, X2, X3 -> Y1, Y2, Y3, with and without an intermediate hidden node H.] With the hidden node: 17 parameters; without it: 59 parameters.

Incomplete Data. In the presence of incomplete data, the likelihood can have multiple maxima. Example: in a network H -> Y, if H has two values, the likelihood has two maxima. In practice, there are many local maxima.

37 Expectation Maximization (EM). A general-purpose method for learning from incomplete data. Intuition: if we had true counts, we could estimate parameters; but with missing values, the counts are unknown. We "complete" the counts using probabilistic inference based on the current parameter assignment, and use the completed counts as if they were real to re-estimate the parameters.

Expectation Maximization (EM), example. Current model: P(Y=H | X=H, Z=T, Theta) = 0.3 and P(Y=H | X=T, Theta) = 0.4. Data (X, Y, Z): (H, ?, T), (T, ?, ?), (H, H, ?), (H, T, T), (T, T, H). Expected counts N(X, Y): (H, H): 1.3; (T, H): 0.4; (H, T): 1.7; (T, T): 1.6.
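The expected-counts table above can be reproduced directly from the two conditional probabilities of the current model; a small sketch (the value encoding and variable names are ours).

```python
from collections import defaultdict

# Toy E-step from the slide: complete fractional counts N(X, Y) for rows
# where Y is missing, using the current model's posteriors for Y.
posterior_Y_H = {("H", "T"): 0.3,   # P(Y=H | X=H, Z=T, theta)
                 ("T", None): 0.4}  # P(Y=H | X=T, theta), Z missing too

rows = [("H", None, "T"), ("T", None, None),
        ("H", "H", None), ("H", "T", "T"), ("T", "T", "H")]

counts = defaultdict(float)
for x, y, z in rows:
    if y is not None:
        counts[(x, y)] += 1.0            # observed: a whole count
    else:
        p_h = posterior_Y_H[(x, z)]      # inferred from current parameters
        counts[(x, "H")] += p_h          # fractional counts
        counts[(x, "T")] += 1.0 - p_h

print(dict(counts))  # {(H,H): 1.3, (H,T): 1.7, (T,H): 0.4, (T,T): 1.6}
```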

38 Expectation Maximization (EM), iteration. Start from an initial network $(G, \Theta_0)$ over X1, X2, X3, H, Y1, Y2, Y3 plus training data. E-step: compute expected counts N(X1), N(X2), N(X3), N(H, X1, X2, X3), N(Y1, H), N(Y2, H), N(Y3, H). M-step: reparameterize to obtain the updated network $(G, \Theta_1)$. Iterate.

Formal Guarantees: $L(\Theta_1 : D) \ge L(\Theta_0 : D)$; each iteration improves the likelihood. If $\Theta_1 = \Theta_0$, then $\Theta_0$ is a stationary point of $L(\Theta : D)$; usually, this means a local maximum.

39 Expectation Maximization (EM), cost. Computational bottleneck: computation of expected counts in the E-step. We need to compute a posterior for each unobserved variable in each instance of the training set; all posteriors for an instance can be derived from one pass of standard BN inference.

Summary of Parameter Learning from Incomplete Data. Incomplete data makes parameter estimation hard: the likelihood function does not have a closed form and is multimodal. Finding max likelihood parameters: EM or gradient ascent; both exploit inference procedures for Bayesian networks to compute expected sufficient statistics.

40 [The learning-problem overview slide repeats here, with example incomplete records (E, B, A) such as <Y,N,N>, <Y,?,Y>, <N,N,Y>, <?,Y,Y>, ..., <N,Y,?> and the filled-in CPT P(A | E, B).]

Incomplete Data: Structure Scores. Recall the Bayesian score:

$$P(G \mid D) \propto P(G)\,P(D \mid G) = P(G) \int P(D \mid G, \Theta)\,P(\Theta \mid G)\,d\Theta$$

With incomplete data we cannot evaluate the marginal likelihood in closed form; we have to resort to approximations, evaluating the score around MAP parameters, which requires finding the MAP parameters (e.g., by EM).

41 Structural EM. Recall that with complete data we had decomposition, hence efficient search. Idea: instead of optimizing the real score, find a decomposable alternative score such that maximizing the new score implies an improvement in the real score.

Structural EM (cont.). Idea: use the current model to help evaluate new structures. Outline: perform search in (structure, parameters) space; at each iteration, use the current model for finding either better scoring parameters (a parametric EM step) or a better scoring structure (a structural EM step).

42 Structural EM, iteration. From the current network over X1, X2, X3, H, Y1, Y2, Y3 and the training data, compute expected counts for the current structure (N(X1), N(X2), N(X3), N(H, X1, X2, X3), N(Y1, H), N(Y2, H), N(Y3, H)) and also for candidate structures (N(X2, X1), N(H, X1, X3), N(Y1, X2), N(Y2, Y1, H)); then score and parameterize, and iterate.

Some Additional Graphical Models: (finite) mixture models; graphical models for sequence data (Markov models and hidden Markov models); undirected graphical models (Markov networks, Markov random fields).

43 Finite Mixture Models:

$$p(x) = \sum_{k=1}^{K} p(x, c = k) = \sum_{k=1}^{K} p(x \mid c = k)\,p(c = k) = \sum_{k=1}^{K} p(x \mid c = k, \theta_k)\,\alpha_k$$

with component models $p(x \mid c = k, \theta_k)$, weights $\alpha_k$, and parameters $\theta_k$.

Example: Mixture of Gaussians. $p(x) = \sum_{k=1}^{K} p(x \mid c = k, \theta_k)\,\alpha_k$, where each mixture component is a multidimensional Gaussian with its own mean $\mu_k$ and covariance shape $\Sigma_k$. E.g., for K = 2 in one dimension: $\{\theta, \alpha\} = \{\mu_1, \sigma_1, \mu_2, \sigma_2, \alpha_1\}$.

44 [Plot: the component densities p(x | c = k) and the resulting mixture model p(x).]

Example: Mixture of Naive Bayes.

$$p(x) = \sum_{k=1}^{K} \alpha_k \prod_{j=1}^{d} p(x_j \mid c = k, \theta_{jk})$$

a conditional independence model for each component (often quite useful as a first-order approximation).

45 Interpretation of Mixtures. C has a direct (physical) interpretation: e.g., C = {age of fish}, C = {male, female}. C might have an interpretation: e.g., clusters of Web surfers. C is just a convenient latent variable: e.g., flexible density estimation.

Graphical Models for Mixtures. E.g., mixtures of Naive Bayes: C (discrete, hidden) with arrows to X1, X2, X3 (observed).

46 Sequential Mixtures. [Figure: hidden states C at times t-1, t, t+1, each emitting X1, X2, X3.] Markov mixtures: C has Markov dependence; this is a hidden Markov model (here with naive Bayes emissions); C is a discrete state that couples the observables.

Mixture density:

$$P(x \mid \theta) = \sum_{j=1}^{c} P(x \mid \omega_j, \theta_j)\,P(\omega_j), \qquad \theta = (\theta_1, \theta_2, \ldots, \theta_c)$$

with component densities $P(x \mid \omega_j, \theta_j)$ and mixing parameters $P(\omega_j)$. Task: use samples drawn from this mixture density to estimate the unknown parameter vector $\theta$. Once $\theta$ is known, we can decompose the mixture into its components.

47 Identifiability of a mixture density. A density $P(x \mid \theta)$ is said to be identifiable if $\theta \ne \theta'$ implies that there exists an x such that $P(x \mid \theta) \ne P(x \mid \theta')$.

Example: consider the case where x is binary and $P(x \mid \theta)$ is the mixture

$$P(x \mid \theta) = \frac{1}{2}\theta_1^x (1 - \theta_1)^{1 - x} + \frac{1}{2}\theta_2^x (1 - \theta_2)^{1 - x} = \begin{cases} \frac{1}{2}(\theta_1 + \theta_2) & \text{if } x = 1 \\ 1 - \frac{1}{2}(\theta_1 + \theta_2) & \text{if } x = 0 \end{cases}$$

Assume that $P(x = 1 \mid \theta) = 0.6$ and $P(x = 0 \mid \theta) = 0.4$, which implies $\theta_1 + \theta_2 = 1.2$. But we cannot determine the mixture (why? we learn only the sum $\theta_1 + \theta_2$, not $\theta_1$ and $\theta_2$ individually).

Identifying mixture distributions. Unidentifiability of the mixture distribution suggests impossibility of unsupervised learning. Mixtures of many commonly encountered density functions (e.g., Gaussians) are usually identifiable. Discrete distributions, especially when there are many components in the mixture, often result in more unknowns than there are independent equations, making identifiability impossible unless other additional information is available. While it can be shown that mixtures of normal densities are usually identifiable, there are scenarios where this is not the case.

48 Identifying mixture distributions (cont.). While mixtures of normal densities are usually identifiable, there are scenarios where this is not the case:

$$P(x \mid \theta) = \frac{P(\omega_1)}{\sqrt{2\pi}} \exp\left( -\frac{1}{2}(x - \theta_1)^2 \right) + \frac{P(\omega_2)}{\sqrt{2\pi}} \exp\left( -\frac{1}{2}(x - \theta_2)^2 \right)$$

cannot be uniquely identified if $P(\omega_1) = P(\omega_2)$, because $\theta = (\theta_1, \theta_2)$ and $\theta' = (\theta_2, \theta_1)$ are two possible vectors that can be interchanged without affecting $P(x \mid \theta)$: we cannot recover a unique $\theta$ even from an infinite amount of data! We focus on those cases in which the mixture distributions are identifiable.

Learning Mixtures from Data. Consider fixed K, e.g. unknown parameters $\Theta = \{\mu_1, \sigma_1, \mu_2, \sigma_2, \alpha_1, \alpha_2\}$. Given data $D = \{x_1, \ldots, x_N\}$, we want to find the parameters $\Theta$ that best fit the data.

49 Maximum Likelihood Principle. Assume a probabilistic model; the likelihood is p(data | parameters, model); find the parameters that make the data most likely:

$$L(\Theta) = p(D \mid \Theta) = \prod_{i=1}^{N} p(x_i \mid \Theta)$$

which in the case of a mixture model reduces to $\prod_{i=1}^{N} \sum_{k=1}^{K} p(x_i \mid c = k, \theta_k)\,\alpha_k$.

The EM Algorithm (Dempster, Laird, and Rubin, 1977). A general framework for likelihood-based parameter estimation with missing data: start with initial guesses of the parameters; E-step: estimate memberships given the parameters; M-step: estimate the parameters given the memberships; repeat until convergence. It converges to a (local) maximum of the likelihood; the E-step and M-step are often computationally simple; it generalizes to maximum a posteriori estimation (with priors).
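A sketch of this E-step/M-step alternation for a one-dimensional Gaussian mixture; the initialization scheme and the fixed iteration count are arbitrary choices, not part of the algorithm.

```python
import numpy as np

def em_gmm_1d(x, K=2, iters=100, seed=0):
    """EM for a 1-D Gaussian mixture: E-step (memberships), M-step (parameters)."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, K)
    sigma = np.full(K, x.std())
    alpha = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: posterior membership r[i, k] = P(c = k | x_i, theta)
        dens = alpha * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the weighted counts
        nk = r.sum(axis=0)
        alpha = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return alpha, mu, sigma

# Example: data resembling the two-component mixture used later in the slides.
x = np.concatenate([np.random.default_rng(1).normal(-2, 1, 300),
                    np.random.default_rng(2).normal(2, 1, 600)])
print(em_gmm_1d(x))
```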

50 The EM Algorithm for Learning the Components of a Mixture Distribution. Similar to the application of EM for handling missing attribute values in Bayesian networks: it is the class label that is missing, in the entire data set!

EM Method for Mixture Estimation. Suppose that we have a set $D = \{x_1, \ldots, x_N\}$ of unlabeled samples drawn independently from the mixture density

$$p(x \mid \Theta) = \sum_{k=1}^{K} \alpha_k\,p(x \mid \omega_k, \theta_k), \qquad \Theta = \{\theta_1, \ldots, \theta_K, \alpha_1, \ldots, \alpha_K\}$$

$$L(\Theta) = p(D \mid \Theta) = \prod_{i=1}^{N} p(x_i \mid \Theta) = \prod_{i=1}^{N} \sum_{k=1}^{K} \alpha_k\,p(x_i \mid \omega_k, \theta_k)$$

51 EM Method for Mixture Estimation (cont.). The maximum likelihood estimate is $\hat{\Theta} = \arg\max_\Theta p(D \mid \Theta)$ with $p(D \mid \Theta) = \prod_{i=1}^{N} p(x_i \mid \Theta)$. The log likelihood is

$$l = \sum_{i=1}^{N} \ln p(x_i \mid \Theta) = \sum_{i=1}^{N} \ln \sum_{k=1}^{K} \alpha_k\,p(x_i \mid \omega_k, \theta_k)$$

Because the component labels $\omega$ are unknown, we model them as a set of hidden random variables and take the expectation over their possible values $\Omega$. Unfortunately, estimating the distribution of the labels requires knowledge of $\Theta$. To break this cycle, we start with a guess $\hat{\Theta}$:

$$E[l \mid \Omega] = \sum_{i=1}^{N} \sum_{k=1}^{K} P(\omega_k \mid x_i, \hat{\Theta}) \ln \left( \alpha_k\,p(x_i \mid \omega_k, \theta_k) \right)$$

52 EM Method for Mixture Estimation (cont.). With a bit of algebra, this expression can be simplified. We pick the next guess for $\Theta$ so as to maximize the above expectation subject to the constraint $\sum_{k=1}^{K} \alpha_k = 1$, using the standard approach of Lagrange multipliers.

Update equations for $\Theta$ (maximum likelihood mixture identification):

$$\hat{\alpha}_k = \frac{1}{N} \sum_{i=1}^{N} P(\omega_k \mid x_i, \hat{\Theta}), \qquad P(\omega_k \mid x_i, \hat{\Theta}) = \frac{\hat{\alpha}_k\,p(x_i \mid \omega_k, \hat{\theta}_k)}{\sum_{j=1}^{K} \hat{\alpha}_j\,p(x_i \mid \omega_j, \hat{\theta}_j)}$$

53 Example: Mixtures of Normals. $p(x \mid \omega_k, \theta_k) \sim N(\mu_k, \Sigma_k)$. Possible cases: Case 1: $\mu_k$ unknown, with $\Sigma_k$, $P(\omega_k)$, and c known. Case 2: $\mu_k$, $\Sigma_k$, $P(\omega_k)$ unknown, c known. Case 3: $\mu_k$, $\Sigma_k$, $P(\omega_k)$, and c all unknown.

Case 1: unknown mean vectors, $\theta_k = \mu_k$, $k = 1, \ldots, c$:

$$\ln p(x \mid \omega_k, \mu_k) = -\ln\left( (2\pi)^{d/2} |\Sigma_k|^{1/2} \right) - \frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)$$

$$\hat{\mu}_k = \frac{\sum_{i=1}^{n} P(\omega_k \mid x_i, \hat{\mu})\,x_i}{\sum_{i=1}^{n} P(\omega_k \mid x_i, \hat{\mu})} \qquad (1)$$

$P(\omega_k \mid x_i, \hat{\mu})$ is the fraction of sample $x_i$ credited to the k-th class, and $\hat{\mu}_k$ is the (weighted) average of the samples coming from the k-th class.

54 Maximum likelihood mixture identification (cont.). Unfortunately, equation (1) does not give $\hat{\mu}_k$ explicitly. However, if we have some way of obtaining good initial estimates $\hat{\mu}_k(0)$ for the unknown means, equation (1) provides a way to apply the EM algorithm:

$$\hat{\mu}_k(j + 1) = \frac{\sum_{i=1}^{n} P(\omega_k \mid x_i, \hat{\mu}(j))\,x_i}{\sum_{i=1}^{n} P(\omega_k \mid x_i, \hat{\mu}(j))}$$

Gradient-based ML identification of mixtures, example. Consider the simple two-component, one-dimensional normal mixture

$$p(x \mid \mu_1, \mu_2) = \frac{1}{3\sqrt{2\pi}} \exp\left( -\frac{1}{2}(x - \mu_1)^2 \right) + \frac{2}{3\sqrt{2\pi}} \exp\left( -\frac{1}{2}(x - \mu_2)^2 \right)$$

(2 clusters!). Set $\mu_1 = -2$, $\mu_2 = 2$ and draw 25 samples sequentially from this mixture. The log likelihood function is

$$l(\mu_1, \mu_2) = \sum_{i=1}^{25} \ln p(x_i \mid \mu_1, \mu_2)$$

55 Gradient-based ML identification of mixtures (cont.). The maximum value of l occurs at estimates $\hat{\mu}_1$ and $\hat{\mu}_2$ which are not far from the true values $\mu_1 = -2$ and $\mu_2 = +2$.

Identifying mixtures of normals when all parameters are unknown. If no constraints are placed on the covariance matrix, the ML principle results in useless singular solutions, because it is possible to make the likelihood arbitrarily large. In practice, we get useful results by focusing on the largest of the finite local maxima of the likelihood function, or by applying the minimum description length principle.

56 Maximum likelihood estimation of mixtures of normals, the general case:

$$\hat{P}(\omega_k) = \frac{1}{n} \sum_{i=1}^{n} P(\omega_k \mid x_i, \hat{\theta}), \qquad \hat{\mu}_k = \frac{\sum_i P(\omega_k \mid x_i, \hat{\theta})\,x_i}{\sum_i P(\omega_k \mid x_i, \hat{\theta})}, \qquad \hat{\Sigma}_k = \frac{\sum_i P(\omega_k \mid x_i, \hat{\theta})\,(x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^T}{\sum_i P(\omega_k \mid x_i, \hat{\theta})}$$

$$P(\omega_k \mid x_i, \hat{\theta}) = \frac{|\hat{\Sigma}_k|^{-1/2} \exp\left( -\frac{1}{2}(x_i - \hat{\mu}_k)^T \hat{\Sigma}_k^{-1} (x_i - \hat{\mu}_k) \right) \hat{P}(\omega_k)}{\sum_{j=1}^{c} |\hat{\Sigma}_j|^{-1/2} \exp\left( -\frac{1}{2}(x_i - \hat{\mu}_j)^T \hat{\Sigma}_j^{-1} (x_i - \hat{\mu}_j) \right) \hat{P}(\omega_j)}$$

Markov Models, Hidden Markov Models. Outline: bag of words, n-grams, and related models; Markov models; hidden Markov models; higher-order Markov models; variations on hidden Markov models; applications.

57 Applications of Sequence Classifiers: speech recognition; natural language processing; text processing; gesture recognition; biological sequence analysis (gene identification, protein classification).

Bag of words, n-grams, and related models. Map arbitrary-length sequences to fixed-length feature representations. Bag of words: represent sequences by feature vectors with as many components as there are words in the vocabulary. n-grams: short subsequences of n letters. Both ignore the relative ordering of words or n-grams along the sequence: "cat chased the mouse" and "mouse chased the cat" have identical bag-of-words representations.

58 Bag of words, n-grams, and related models (cont.). Fixed-length feature representations make it possible to apply machine learning methods that work with feature-based representations. Features may be given (as in the case of words in an English vocabulary) or discovered from data (statistics of occurrence of n-grams in the data). If variable-length n-grams are allowed, we need to take into account possible overlaps. Computation of n-gram frequencies can be made efficient using dynamic programming: if a string appears k times in a piece of text, any substring of the string appears at least k times in the text.

Markov models (Markov chains). A Markov model is a probabilistic model of symbol sequences in which the probability of the current event depends only on the immediately preceding event. Consider a sequence of random variables $X_1, X_2, \ldots, X_N$; think of the subscripts as indicating word position in a sentence or letter position in a sequence. Recall that a random variable is a function. In the case of sentences made of words, the range of the random variables is the vocabulary of the language; in the case of DNA sequences, the random variables take on values from the 4-letter alphabet {A, C, G, T}.

59 Simple Model - Markov Chains. Markov property: the state of the system at time t+1 depends only on the state of the system at time t:

$$P[X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_1 = x_1, X_0 = x_0] = P[X_{t+1} = x_{t+1} \mid X_t = x_t]$$

[Figure: chain X1 -> X2 -> X3 -> X4 -> X5.]

Markov chains, notation. The fact that the subscript t appears on both the X and the x in $X_t = x_t$ is a bit abusive of notation. It might be better to write $P(X_1 = s_{j_1}, X_2 = s_{j_2}, \ldots, X_t = s_{j_t})$ where each $s_j \in \{v_1, \ldots, v_L\} = Range(X)$. In what follows, we will abuse notation.

60 Markov Chains, stationarity. The probabilities are independent of t:

$$P[X_{t+1} = j \mid X_t = i] = a_{ij}$$

This means that if the system is in state i, the probability that the system will transition to state j is $a_{ij}$ regardless of the value of t.

Describing a Markov Chain. A Markov chain can be described by the transition matrix A and initial state probabilities Q:

$$a_{ij} = P(X_{t+1} = j \mid X_t = i), \qquad q_i = P(X_1 = i)$$

$$P(X_1, \ldots, X_T) = P(X_1)\,P(X_2 \mid X_1) \cdots P(X_T \mid X_{T-1}) = q_{X_1} \prod_{t=1}^{T-1} A(X_t, X_{t+1})$$

61 Two ways to represent the conditional probability table of a first-order Markov process: a state-transition diagram over the symbols A, B, C (with transition probabilities such as 0.7 and 0.5 on its arcs), or a table with the current symbol indexing the rows and the next symbol indexing the columns. Sample string: CCBBAAAAABAABACBABAAA.

The probability of generating a string: a product of probabilities, one for each term in the sequence,

$$p(\{X_t\}_{t=1}^{T}) = p(X_1) \prod_{t=2}^{T} p(X_t \mid X_{t-1})$$

where $\{X_t\}_{t=1}^{T}$ means a sequence of symbols from time 1 to time T, $p(X_1)$ comes from the table of initial probabilities, and each $p(X_t \mid X_{t-1})$ is a transition probability.

62 The fundamental questions. Likelihood: given a model $\mu = (A, Q)$, how can we efficiently compute the likelihood of an observation $P(X \mid \mu)$? For any state sequence $(X_1, \ldots, X_T)$:

$$P(X_1, \ldots, X_T) = q_{X_1}\,a_{X_1 X_2}\,a_{X_2 X_3} \cdots a_{X_{T-1} X_T}$$

Learning: given a set of observation sequences X and a generic model, how can we estimate the parameters that define the best model to describe the data? Use the standard estimation methods (maximum likelihood or Bayesian estimates) discussed earlier in the course.

Simple example of a Markov model: weather. Raining today -> rain tomorrow: $a_{rr} = 0.4$. Raining today -> no rain tomorrow: $a_{rn} = 0.6$. Not raining today -> rain tomorrow: $a_{nr} = 0.2$. Not raining today -> no rain tomorrow: $a_{nn} = 0.8$.

63 Simple example of a Markov model (cont.):

$$Q = (0.3,\; 0.7), \qquad A = \begin{pmatrix} 0.4 & 0.6 \\ 0.2 & 0.8 \end{pmatrix}$$

Note that both the transition matrix and the initial state vector are stochastic (rows sum to 1). In general, the transition probabilities between two states need not be symmetric ($a_{ij} \ne a_{ji}$), and the probability of transition from a state to itself ($a_{ii}$) need not be zero.

Types of Markov models: ergodic models. Ergodic model: there is a directed path with positive probabilities from each state i to each state j (strong connectivity), but not necessarily a complete directed graph.

64 Types of Models: LR models. Left-to-Right (LR) model: the index of the state is non-decreasing with time.

Markov models with absorbing states. At each play, the gambler wins $1 with probability p or loses $1 with probability 1-p. The game ends when the gambler goes broke, or gains a fortune of $100; both $0 and $100 are absorbing states. [Figure: chain of states 0, 1, 2, ..., N-1, N with rightward transitions p and leftward transitions 1-p; start at $10.]

65 Coke vs. Pepsi. Given that a person's last cola purchase was Coke, there is a 90% chance that her next cola purchase will also be Coke. If a person's last cola purchase was Pepsi, there is an 80% chance that her next cola purchase will also be Pepsi. [Figure: two-state chain with self-loops 0.9 (coke) and 0.8 (pepsi), and cross-transitions 0.1 and 0.2.]

Coke vs. Pepsi (cont.). Given that a person is currently a Pepsi purchaser, what is the probability that she will purchase Coke two purchases from now? The transition matrix is

$$A = \begin{pmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{pmatrix} \;\text{(corresponding to one purchase ahead)}, \qquad A^2 = \begin{pmatrix} 0.83 & 0.17 \\ 0.34 & 0.66 \end{pmatrix}$$

so the answer is $(A^2)_{Pepsi,\,Coke} = 0.34$.

66 Coke vs. Pepsi (cont.). Given that a person is currently a Coke drinker, what is the probability that she will purchase Pepsi three purchases from now?

$$A^3 = \begin{pmatrix} 0.781 & 0.219 \\ 0.438 & 0.562 \end{pmatrix}, \qquad (A^3)_{Coke,\,Pepsi} = 0.219$$

Coke vs. Pepsi (cont.). Assume each person makes one cola purchase per week. Suppose 60% of all people now drink Coke, and 40% drink Pepsi. What fraction of people will be drinking Coke three weeks from now? Let $(q_0, q_1) = (0.6, 0.4)$ be the initial probabilities, and denote Coke by 0 and Pepsi by 1. We want $P(X_3 = 0)$:

$$P(X_3 = 0) = q_0\,a_{00}^{(3)} + q_1\,a_{10}^{(3)} = (0.6)(0.781) + (0.4)(0.438) = 0.6438$$
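These purchase calculations are easy to verify numerically; a small sketch.

```python
import numpy as np

A = np.array([[0.9, 0.1],    # state 0 = Coke, state 1 = Pepsi
              [0.2, 0.8]])
q = np.array([0.6, 0.4])     # initial fractions of Coke and Pepsi drinkers

A3 = np.linalg.matrix_power(A, 3)
print(A3[0, 1])              # Coke -> Pepsi in three steps: 0.219
print(q @ A3)                # population three weeks out: P(Coke) = 0.6438
```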

67 Learning the conditional probability table. Naive: just observe a lot of strings and set the conditional probabilities equal to the observed probabilities:

$$p(B \mid A) = \frac{\#(\text{occurrences of } AB \text{ in the strings})}{\#(\text{occurrences of } A \text{ in the strings})}$$

Better: add 1 to the top and the number of symbols to the bottom, i.e. a weak uniform prior over the transition probabilities (see the sketch after this slide):

$$p(B \mid A) = \frac{\#AB + 1}{\#A + N_{symbols}}$$

Hidden Markov Models. In many scenarios states cannot be directly observed; we need an extension: hidden Markov models. [Figure: four states with self-transitions $a_{11}, a_{22}, a_{33}, a_{44}$, forward transitions $a_{12}, a_{23}, a_{34}$, and observation probabilities with $b_{11} + b_{12} + b_{13} + b_{14} = 1$, $b_{21} + b_{22} + b_{23} + b_{24} = 1$, etc.] The $a_{ij}$ are state transition probabilities; the $b_{ik}$ are observation (output) probabilities.
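Returning to the transition-table recipe above: a minimal sketch of the Laplace-smoothed counts, run on the sample string from the earlier slide.

```python
from collections import Counter

def transition_probs(strings, alphabet):
    """Laplace-smoothed first-order transition probabilities p(b | a)."""
    pair = Counter()
    single = Counter()
    for s in strings:
        for a, b in zip(s, s[1:]):
            pair[(a, b)] += 1
            single[a] += 1
    return {(a, b): (pair[(a, b)] + 1) / (single[a] + len(alphabet))
            for a in alphabet for b in alphabet}

print(transition_probs(["CCBBAAAAABAABACBABAAA"], "ABC"))
```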

68 Hidden Markov Models (cont.). We introduce hidden states to get a hidden Markov model: the next hidden state depends only on the current hidden state, but hidden states can carry along information from more than one time-step in the past. The current symbol depends only on the current hidden state.

Example: Dishonest Casino. What is hidden in this model? The state sequences. You are allowed to see the outcome of a die roll; you do not know which outcomes were obtained by a fair die and which outcomes were obtained by a loaded die.

69 What is an HMM? Green circles are hidden states. Each hidden state depends only on the previous state (a Markov process): "the past is independent of the future given the present."

What is an HMM? (cont.) Purple nodes are observed states. Each observed state depends only on the corresponding hidden state.

70 Specifying an HMM. [Figure: hidden chain X1 -> X2 -> ... with transition matrix A, and emission matrix B to observations O1, O2, ...] An HMM is specified by $\{X, O, \Pi, A, B\}$: $\Pi = \{\pi_i\}$ are the initial state probabilities, $A = \{a_{ij}\}$ are the state transition probabilities, and $B = \{b_{ik}\}$ are the observation state probabilities.

A hidden Markov model. [Figure: hidden nodes i and j, each with a vector of transition probabilities and a vector of output probabilities over the symbols A, B, C.] Each hidden node has a vector of transition probabilities and a vector of output probabilities.

71 Coin-Tossing Example. [Figure: two hidden states, Fair and Loaded, each a start state with probability 1/2; self-transitions 0.9 and cross-transitions 0.1; Fair emits head/tail with probability 1/2 each; Loaded emits head with probability 3/4 and tail with probability 1/4.] Over L tosses, the hidden Fair/Loaded sequence is $X_1, X_2, \ldots, X_{L-1}, X_L$ and the observed Head/Tail sequence is $O_1, O_2, \ldots, O_{L-1}, O_L$. Query: what are the most likely values in the X-nodes to generate the given data?

Fundamental problems. Likelihood: compute the probability of a given observation sequence given a model. Decoding: given an observation sequence and a model, compute the most likely hidden state sequence. Learning: given an observation sequence and a set of possible models, which model most closely fits the data?

72 Generating a string from an HMM. It is easy to generate strings if we know the parameters of the model. At each time step, make two random choices: use the transition probabilities from the current hidden node to pick the next hidden node, and use the output probabilities from the current hidden node to pick the current symbol to output.

Generating a string from an HMM (cont.). Equivalently, we can first produce a complete hidden sequence and then allow each hidden node in the sequence to produce one symbol. Hidden nodes only depend on previous hidden nodes: the probability of generating a hidden sequence does not depend on the visible sequence it generates.
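A sketch of this two-choice sampler, parameterized with the coin-tossing HMM from the earlier slide; the dictionary-based representation is just one convenient encoding.

```python
import random

def sample_hmm(pi, A, B, T):
    """Generate (hidden, visible) sequences: next state from A, symbol from B."""
    states, symbols = list(A), list(next(iter(B.values())))
    x = random.choices(states, weights=[pi[s] for s in states])[0]
    hidden, visible = [], []
    for _ in range(T):
        hidden.append(x)
        visible.append(random.choices(symbols,
                                      weights=[B[x][o] for o in symbols])[0])
        x = random.choices(states, weights=[A[x][s] for s in states])[0]
    return hidden, visible

# The coin-tossing HMM: Fair (F) vs. Loaded (L), Heads (H) vs. Tails (T).
pi = {"F": 0.5, "L": 0.5}
A = {"F": {"F": 0.9, "L": 0.1}, "L": {"F": 0.1, "L": 0.9}}
B = {"F": {"H": 0.5, "T": 0.5}, "L": {"H": 0.75, "T": 0.25}}
print(sample_hmm(pi, A, B, 10))
```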

73 The probability of generating a hidden sequence: a product of probabilities, one for each term in the sequence,

$$p(\{X_t\}_{t=1}^{T}) = p(X_1) \prod_{t=2}^{T} p(X_t \mid X_{t-1})$$

where $p(X_1)$ comes from the table of initial probabilities of hidden nodes and $p(X_t = j \mid X_{t-1} = i) = a_{ij}$ is a transition probability between hidden nodes.

The joint probability of generating a hidden sequence and a visible sequence:

$$p(\{X_t, O_t\}_{t=1}^{T}) = p(X_1)\,p(O_1 \mid X_1) \prod_{t=2}^{T} p(X_t \mid X_{t-1})\,p(O_t \mid X_t)$$

over a sequence of hidden states and output symbols, where $p(O_t \mid X_t)$ is the probability of outputting symbol $O_t$ from state $X_t$.

74 The probability of generating a visible sequence from an HMM:

$$p(\{O_t\}_{t=1}^{T}) = \sum_{\text{paths through hidden states}} p(\{O_t\}_{t=1}^{T} \mid \{X_t\})\,p(\{X_t\})$$

The same visible sequence can be produced by many different hidden sequences, and there are exponentially many possible hidden sequences. How can we calculate $p(\{O_t\}_{t=1}^{T})$?

Fundamental problems (recap): likelihood, decoding, and learning, as above.

75 The HMM dynamic programming trick. Dynamic programming offers an efficient way to compute a sum that has exponentially many terms: at each time $\tau$ we combine everything we need to know about the paths up to that time. Define $\lambda_i(\tau) = p(\{O_t\}_{t=1}^{\tau}, X_\tau = i)$, the probability of having produced the sequence up to time $\tau$ and being in state i at time $\tau$. This quantity can be computed recursively:

$$\lambda_i(\tau + 1) = p(O_{\tau+1} \mid X_{\tau+1} = i) \sum_j \lambda_j(\tau)\,p(X_{\tau+1} = i \mid X_\tau = j)$$

Probability of an Observation Sequence. Given an observation sequence $O = (o_1, \ldots, o_T)$ and a model $\mu = (A, B, \Pi)$, compute $P(O \mid \mu)$.

76 Probability of an observation sequence:

$$P(O \mid X, \mu) = b_{X_1 o_1}\,b_{X_2 o_2} \cdots b_{X_T o_T}, \qquad P(X \mid \mu) = \pi_{X_1}\,a_{X_1 X_2}\,a_{X_2 X_3} \cdots a_{X_{T-1} X_T}$$

$$P(O, X \mid \mu) = P(O \mid X, \mu)\,P(X \mid \mu), \qquad P(O \mid \mu) = \sum_X P(O \mid X, \mu)\,P(X \mid \mu)$$

Probability of an Observation Sequence (cont.):

$$P(O \mid \mu) = \sum_{\{X_1 \ldots X_T\}} \pi_{X_1} \prod_{t=1}^{T} a_{X_t X_{t+1}}\,b_{X_t o_t}$$

77 Probability of an observation sequence (cont.). The special structure gives us an efficient solution using dynamic programming. Intuition: we need not enumerate every (t+1)-length state sequence; the probability of the first t observations can be accumulated per state at time t+1. Define the forward variable

$$\alpha_i(t) = P(o_1 \cdots o_{t-1}, X_t = i \mid \mu)$$

Forward Procedure:

$$\alpha_j(t + 1) = P(o_1 \cdots o_t, X_{t+1} = j) = \sum_i P(o_1 \cdots o_{t-1}, X_t = i)\,P(o_t \mid X_t = i)\,P(X_{t+1} = j \mid X_t = i)$$

78 Forward Procedure (cont.):

$$\alpha_j(t + 1) = \sum_{i=1}^{N} \alpha_i(t)\,a_{ij}\,b_{i o_t}$$

Backward Procedure: the probability of the rest of the observations given the state at time t,

$$\beta_i(t) = P(o_t \cdots o_T \mid X_t = i), \qquad \beta_i(T + 1) = 1, \qquad \beta_i(t) = \sum_{j=1}^{N} a_{ij}\,b_{i o_t}\,\beta_j(t + 1)$$
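A sketch of the forward pass using the common variant $\alpha_j(t) = P(o_1 \cdots o_t, X_t = j)$, which shifts the slides' indexing by one but yields the same $P(O \mid \mu)$; the matrices follow the coin-tossing example.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: alpha[t, j] = P(o_1..o_t, X_t = j)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

# Coin-tossing HMM: states (Fair, Loaded), symbols (H = 0, T = 1).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.5, 0.5], [0.75, 0.25]])
alpha = forward(pi, A, B, [0, 0, 1, 0])
print(alpha[-1].sum())   # P(O | mu)
```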

79 Sequence probability:

$$P(O \mid \mu) = \sum_{i=1}^{N} \alpha_i(T + 1) \qquad \text{(forward procedure)}$$

$$P(O \mid \mu) = \sum_{i=1}^{N} \pi_i\,\beta_i(1) \qquad \text{(backward procedure)}$$

$$P(O \mid \mu) = \sum_{i=1}^{N} \alpha_i(t)\,\beta_i(t) \;\text{ for any } t \qquad \text{(combination)}$$

Fundamental problems (recap): likelihood, decoding, and learning.

80 The most probable state sequence. Find the state sequence that best explains the observations:

$$\arg\max_X P(X \mid O)$$

Viterbi Algorithm. Define

$$\delta_j(t) = \max_{X_1 \cdots X_{t-1}} P(X_1 \cdots X_{t-1},\,o_1 \cdots o_{t-1},\,X_t = j,\,o_t)$$

the probability of the state sequence which maximizes the probability of seeing the observations to time t-1, landing in state j, and seeing the observation at time t.

81 Viterbi Algorithm (cont.), recursive computation:

$$\delta_j(t + 1) = \max_i \delta_i(t)\,a_{ij}\,b_{j o_{t+1}}, \qquad \psi_j(t + 1) = \arg\max_i \delta_i(t)\,a_{ij}\,b_{j o_{t+1}}$$

Viterbi Algorithm (cont.). Compute the most likely state sequence by working backwards:

$$\hat{X}_T = \arg\max_i \delta_i(T), \qquad \hat{X}_t = \psi_{\hat{X}_{t+1}}(t + 1)$$
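A sketch of the $\delta/\psi$ recursions, computed in log space to avoid underflow; it reuses the coin-tossing parameters and the same state-emission convention as the forward sketch above.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden state sequence via the delta/psi recursions."""
    T, N = len(obs), len(pi)
    logA, logB = np.log(A), np.log(B)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = np.log(pi) + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA          # scores[i, j]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

print(viterbi(np.array([0.5, 0.5]),
              np.array([[0.9, 0.1], [0.1, 0.9]]),
              np.array([[0.5, 0.5], [0.75, 0.25]]),
              [0, 0, 0, 1, 0, 0]))   # 0 = Fair, 1 = Loaded
```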

82 Fundamental problems (recap): likelihood, decoding, and learning.

Learning the parameters of an HMM. It is easy to learn the parameters if, for each observed sequence of symbols, we can infer the posterior distribution across the sequences of hidden states. We can infer which hidden state sequence gave rise to an observed sequence by using the dynamic programming trick.

83 Learning: HMM Parameter Estimation. Given an observation sequence, find the model that is most likely to produce that sequence. Given a model and an observation sequence, update the model parameters to better fit the observations.

The probability of generating a visible sequence from an HMM:

$$p(O) = \sum_{\text{hidden paths } X} p(O \mid X)\,p(X)$$

The same visible sequence can be produced by many different hidden sequences. [Figure: output symbols A, B, C, D.]

84 The posterior probability of a hidden path given a visible sequence:

$$p(X \mid O) = \frac{p(X)\,p(O \mid X)}{\sum_{\text{hidden paths } Y} p(Y)\,p(O \mid Y)}$$

The sum in the denominator can be computed efficiently using the dynamic programming trick. But for learning we do not need to know about entire hidden paths.

Learning the parameters of an HMM. It is easy to learn the parameters if, for each observed sequence of symbols, we can infer the posterior probability for each hidden node at each time step. We can infer these posterior probabilities by using the dynamic programming trick.

85 The HMM dynamic programming trick (forward):

$$\alpha_j(t) = p(O_1 \cdots O_t,\,X_t = j), \qquad \alpha_j(t) = \sum_i \alpha_i(t - 1)\,a_{ij}\,b_{j, O_t}$$

The dynamic programming trick again (backward):

$$\beta_j(t) = p(O_{t+1} \cdots O_T \mid X_t = j), \qquad \beta_i(t) = \sum_j a_{ij}\,b_{j, O_{t+1}}\,\beta_j(t + 1)$$

86 The forward-backward algorithm (Baum-Welch). We do a forward pass along the observed string to compute the alphas at each time step for each node, and a backward pass along the observed string to compute the betas at each time step for each node. Once we have the alphas and betas at each time step, it is easy to re-estimate the output probabilities and transition probabilities.

Learning the parameters of the HMM. To learn the transition matrix we need to know the expected number of times that each transition between two hidden nodes was used when generating the observed sequence. To learn the output probabilities we need to know the expected number of times each node was used to generate each symbol. Because of the hidden states, we use the expectation maximization (EM) algorithm.

87 The re-estimation equations (the M-step of the EM procedure). For the transition probability from node i to node j:

$$a_{ij}^{new} = \frac{Count(i \to j \text{ transitions in the data})}{\sum_{k \in \text{hidden states}} Count(i \to k \text{ transitions in the data})}$$

For the probability that node i generates symbol A:

$$b_i(A) = \frac{Count(\text{state } i \text{ produces symbol } A \text{ in the data})}{\sum_{B \in \text{symbols}} Count(\text{state } i \text{ produces symbol } B \text{ in the data})}$$

Summing the expectations over time. The expected number of times that node i produces symbol A requires a summation over all the different times in the sequence when there was an A:

$$Count(\text{state } i \text{ produces symbol } A) = \sum_{t:\,O_t = A} p(X_t = i \mid O)$$

The expected number of times that the transition from i to j occurred requires a summation over all pairs of adjacent times in the sequence:

$$Count(\text{transitions from state } i \text{ to state } j) = \sum_{t=1}^{T-1} p(X_t = i,\,X_{t+1} = j \mid O)$$

88 Combine the past and the future to get the full posterior. To re-estimate the output probabilities, we need to compute the posterior probability of being at a particular hidden node at a particular time. This requires a summation of the posterior probabilities of all the paths that go through that node at that time. [Figure: trellis of hidden states over time.]

Combining past and future:

$$p(X_t = i \mid O) = \frac{p(O,\,X_t = i)}{p(O)} = \frac{\alpha_i(t)\,\beta_i(t)}{p(O)}, \qquad p(X_t = i,\,X_{t+1} = j \mid O) = \frac{\alpha_i(t)\,a_{ij}\,b_{j, O_{t+1}}\,\beta_j(t + 1)}{p(O)}$$

89 Parameter Estimation: Baum-Welch (Forward-Backward). Define the probability of traversing an arc and the probability of being in a state:

$$p_t(i, j) = \frac{\alpha_i(t)\,a_{ij}\,b_{j, o_{t+1}}\,\beta_j(t + 1)}{\sum_{m=1}^{N} \alpha_m(t)\,\beta_m(t)}, \qquad \gamma_i(t) = \sum_{j=1}^{N} p_t(i, j)$$

Parameter Estimation: Baum-Welch Algorithm (cont.). Now we can compute the new estimates of the model parameters:

$$\hat{\pi}_i = \gamma_i(1), \qquad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} p_t(i, j)}{\sum_{t=1}^{T-1} \gamma_i(t)}, \qquad \hat{b}_{ik} = \frac{\sum_{t:\,o_t = k} \gamma_i(t)}{\sum_{t=1}^{T} \gamma_i(t)}$$
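A sketch of one full re-estimation pass under the $\alpha_j(t) = P(o_1 \cdots o_t, X_t = j)$ convention of the forward sketch above, assuming integer-coded observations; variable names are ours.

```python
import numpy as np

def forward(pi, A, B, obs):
    """alpha[t, j] = P(o_1..o_t, X_t = j)."""
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    """beta[t, i] = P(o_{t+1}..o_T | X_t = i)."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(pi, A, B, obs):
    """One EM pass: arc/state posteriors from forward-backward, then re-estimation."""
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    obs = np.asarray(obs)
    gamma = alpha * beta                             # ~ P(X_t = i, O)
    gamma /= gamma.sum(axis=1, keepdims=True)        # P(X_t = i | O)
    xi = (alpha[:-1, :, None] * A[None, :, :] *
          (B[:, obs[1:]].T * beta[1:])[:, None, :])  # ~ P(X_t = i, X_t+1 = j, O)
    xi /= xi.sum(axis=(1, 2), keepdims=True)         # p_t(i, j)
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])],
                     axis=1) / gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B
```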

90 HMM parameter estimation in practice. Sparseness of data requires smoothing of estimates using Laplace estimates (as in Naive Bayes) to give suitable nonzero probability to unseen observations. Domain-specific tricks: feature decomposition (capitalized?, number?, etc. in text processing) gives a better estimate; shrinkage allows pooling of estimates over multiple states of the same type; a well-designed HMM topology helps.


More information

January Examinations 2012

January Examinations 2012 Page of 5 EC79 January Examnaons No. of Pages: 5 No. of Quesons: 8 Subjec ECONOMICS (POSTGRADUATE) Tle of Paper EC79 QUANTITATIVE METHODS FOR BUSINESS AND FINANCE Tme Allowed Two Hours ( hours) Insrucons

More information

Machine Learning 2nd Edition

Machine Learning 2nd Edition INTRODUCTION TO Lecure Sldes for Machne Learnng nd Edon ETHEM ALPAYDIN, modfed by Leonardo Bobadlla and some pars from hp://www.cs.au.ac.l/~aparzn/machnelearnng/ The MIT Press, 00 alpaydn@boun.edu.r hp://www.cmpe.boun.edu.r/~ehem/mle

More information

J i-1 i. J i i+1. Numerical integration of the diffusion equation (I) Finite difference method. Spatial Discretization. Internal nodes.

J i-1 i. J i i+1. Numerical integration of the diffusion equation (I) Finite difference method. Spatial Discretization. Internal nodes. umercal negraon of he dffuson equaon (I) Fne dfference mehod. Spaal screaon. Inernal nodes. R L V For hermal conducon le s dscree he spaal doman no small fne spans, =,,: Balance of parcles for an nernal

More information

Clustering with Gaussian Mixtures

Clustering with Gaussian Mixtures Noe o oher eachers and users of hese sldes. Andrew would be delghed f you found hs source maeral useful n gvng your own lecures. Feel free o use hese sldes verbam, or o modfy hem o f your own needs. PowerPon

More information

Comb Filters. Comb Filters

Comb Filters. Comb Filters The smple flers dscussed so far are characered eher by a sngle passband and/or a sngle sopband There are applcaons where flers wh mulple passbands and sopbands are requred Thecomb fler s an example of

More information

Math 128b Project. Jude Yuen

Math 128b Project. Jude Yuen Mah 8b Proec Jude Yuen . Inroducon Le { Z } be a sequence of observed ndependen vecor varables. If he elemens of Z have a on normal dsrbuon hen { Z } has a mean vecor Z and a varancecovarance marx z. Geomercally

More information

Volatility Interpolation

Volatility Interpolation Volaly Inerpolaon Prelmnary Verson March 00 Jesper Andreasen and Bran Huge Danse Mares, Copenhagen wan.daddy@danseban.com brno@danseban.com Elecronc copy avalable a: hp://ssrn.com/absrac=69497 Inro Local

More information

. The geometric multiplicity is dim[ker( λi. number of linearly independent eigenvectors associated with this eigenvalue.

. The geometric multiplicity is dim[ker( λi. number of linearly independent eigenvectors associated with this eigenvalue. Lnear Algebra Lecure # Noes We connue wh he dscusson of egenvalues, egenvecors, and dagonalzably of marces We wan o know, n parcular wha condons wll assure ha a marx can be dagonalzed and wha he obsrucons

More information

Mechanics Physics 151

Mechanics Physics 151 Mechancs Physcs 5 Lecure 0 Canoncal Transformaons (Chaper 9) Wha We Dd Las Tme Hamlon s Prncple n he Hamlonan formalsm Dervaon was smple δi δ Addonal end-pon consrans pq H( q, p, ) d 0 δ q ( ) δq ( ) δ

More information

DEEP UNFOLDING FOR MULTICHANNEL SOURCE SEPARATION SUPPLEMENTARY MATERIAL

DEEP UNFOLDING FOR MULTICHANNEL SOURCE SEPARATION SUPPLEMENTARY MATERIAL DEEP UNFOLDING FOR MULTICHANNEL SOURCE SEPARATION SUPPLEMENTARY MATERIAL Sco Wsdom, John Hershey 2, Jonahan Le Roux 2, and Shnj Waanabe 2 Deparmen o Elecrcal Engneerng, Unversy o Washngon, Seale, WA, USA

More information

CH.3. COMPATIBILITY EQUATIONS. Continuum Mechanics Course (MMC) - ETSECCPB - UPC

CH.3. COMPATIBILITY EQUATIONS. Continuum Mechanics Course (MMC) - ETSECCPB - UPC CH.3. COMPATIBILITY EQUATIONS Connuum Mechancs Course (MMC) - ETSECCPB - UPC Overvew Compably Condons Compably Equaons of a Poenal Vecor Feld Compably Condons for Infnesmal Srans Inegraon of he Infnesmal

More information

CS 536: Machine Learning. Nonparametric Density Estimation Unsupervised Learning - Clustering

CS 536: Machine Learning. Nonparametric Density Estimation Unsupervised Learning - Clustering CS 536: Machne Learnng Nonparamerc Densy Esmaon Unsupervsed Learnng - Cluserng Fall 2005 Ahmed Elgammal Dep of Compuer Scence Rugers Unversy CS 536 Densy Esmaon - Cluserng - 1 Oulnes Densy esmaon Nonparamerc

More information

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore. Ths documen s downloaded from DR-NTU, Nanyang Technologcal Unversy Lbrary, Sngapore. Tle A smplfed verb machng algorhm for word paron n vsual speech processng( Acceped verson ) Auhor(s) Foo, Say We; Yong,

More information

Computing Relevance, Similarity: The Vector Space Model

Computing Relevance, Similarity: The Vector Space Model Compung Relevance, Smlary: The Vecor Space Model Based on Larson and Hears s sldes a UC-Bereley hp://.sms.bereley.edu/courses/s0/f00/ aabase Managemen Sysems, R. Ramarshnan ocumen Vecors v ocumens are

More information

Graduate Macroeconomics 2 Problem set 5. - Solutions

Graduate Macroeconomics 2 Problem set 5. - Solutions Graduae Macroeconomcs 2 Problem se. - Soluons Queson 1 To answer hs queson we need he frms frs order condons and he equaon ha deermnes he number of frms n equlbrum. The frms frs order condons are: F K

More information

Let s treat the problem of the response of a system to an applied external force. Again,

Let s treat the problem of the response of a system to an applied external force. Again, Page 33 QUANTUM LNEAR RESPONSE FUNCTON Le s rea he problem of he response of a sysem o an appled exernal force. Agan, H() H f () A H + V () Exernal agen acng on nernal varable Hamlonan for equlbrum sysem

More information

Anomaly Detection. Lecture Notes for Chapter 9. Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar

Anomaly Detection. Lecture Notes for Chapter 9. Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar Anomaly eecon Lecure Noes for Chaper 9 Inroducon o aa Mnng, 2 nd Edon by Tan, Senbach, Karpane, Kumar 2/14/18 Inroducon o aa Mnng, 2nd Edon 1 Anomaly/Ouler eecon Wha are anomales/oulers? The se of daa

More information

Dishonest casino as an HMM

Dishonest casino as an HMM Dshnes casn as an HMM N = 2, ={F,L} M=2, O = {h,} A = F B= [. F L F L 0.95 0.0 0] h 0.5 0. L 0.05 0.90 0.5 0.9 c Deva ubramanan, 2009 63 A generave mdel fr CpG slands There are w hdden saes: CpG and nn-cpg.

More information

Including the ordinary differential of distance with time as velocity makes a system of ordinary differential equations.

Including the ordinary differential of distance with time as velocity makes a system of ordinary differential equations. Soluons o Ordnary Derenal Equaons An ordnary derenal equaon has only one ndependen varable. A sysem o ordnary derenal equaons consss o several derenal equaons each wh he same ndependen varable. An eample

More information

EEL 6266 Power System Operation and Control. Chapter 5 Unit Commitment

EEL 6266 Power System Operation and Control. Chapter 5 Unit Commitment EEL 6266 Power Sysem Operaon and Conrol Chaper 5 Un Commmen Dynamc programmng chef advanage over enumeraon schemes s he reducon n he dmensonaly of he problem n a src prory order scheme, here are only N

More information

[ ] 2. [ ]3 + (Δx i + Δx i 1 ) / 2. Δx i-1 Δx i Δx i+1. TPG4160 Reservoir Simulation 2018 Lecture note 3. page 1 of 5

[ ] 2. [ ]3 + (Δx i + Δx i 1 ) / 2. Δx i-1 Δx i Δx i+1. TPG4160 Reservoir Simulation 2018 Lecture note 3. page 1 of 5 TPG460 Reservor Smulaon 08 page of 5 DISCRETIZATIO OF THE FOW EQUATIOS As we already have seen, fne dfference appromaons of he paral dervaves appearng n he flow equaons may be obaned from Taylor seres

More information

Chapter 6: AC Circuits

Chapter 6: AC Circuits Chaper 6: AC Crcus Chaper 6: Oulne Phasors and he AC Seady Sae AC Crcus A sable, lnear crcu operang n he seady sae wh snusodal excaon (.e., snusodal seady sae. Complee response forced response naural response.

More information

Mechanics Physics 151

Mechanics Physics 151 Mechancs Physcs 5 Lecure 9 Hamlonan Equaons of Moon (Chaper 8) Wha We Dd Las Tme Consruced Hamlonan formalsm H ( q, p, ) = q p L( q, q, ) H p = q H q = p H = L Equvalen o Lagrangan formalsm Smpler, bu

More information

. The geometric multiplicity is dim[ker( λi. A )], i.e. the number of linearly independent eigenvectors associated with this eigenvalue.

. The geometric multiplicity is dim[ker( λi. A )], i.e. the number of linearly independent eigenvectors associated with this eigenvalue. Mah E-b Lecure #0 Noes We connue wh he dscusson of egenvalues, egenvecors, and dagonalzably of marces We wan o know, n parcular wha condons wll assure ha a marx can be dagonalzed and wha he obsrucons are

More information

THE PREDICTION OF COMPETITIVE ENVIRONMENT IN BUSINESS

THE PREDICTION OF COMPETITIVE ENVIRONMENT IN BUSINESS THE PREICTION OF COMPETITIVE ENVIRONMENT IN BUSINESS INTROUCTION The wo dmensonal paral dfferenal equaons of second order can be used for he smulaon of compeve envronmen n busness The arcle presens he

More information

Normal Random Variable and its discriminant functions

Normal Random Variable and its discriminant functions Noral Rando Varable and s dscrnan funcons Oulne Noral Rando Varable Properes Dscrnan funcons Why Noral Rando Varables? Analycally racable Works well when observaon coes for a corruped snle prooype 3 The

More information

Mechanics Physics 151

Mechanics Physics 151 Mechancs Physcs 5 Lecure 9 Hamlonan Equaons of Moon (Chaper 8) Wha We Dd Las Tme Consruced Hamlonan formalsm Hqp (,,) = qp Lqq (,,) H p = q H q = p H L = Equvalen o Lagrangan formalsm Smpler, bu wce as

More information

Natural Language Processing NLP Hidden Markov Models. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing NLP Hidden Markov Models. Razvan C. Bunescu School of Electrical Engineering and Computer Science Naural Language rcessng NL 6840 Hdden Markv Mdels Razvan C. Bunescu Schl f Elecrcal Engneerng and Cmpuer Scence bunescu@h.edu Srucured Daa Fr many applcans he..d. assumpn des n hld: pels n mages f real

More information

Econ107 Applied Econometrics Topic 5: Specification: Choosing Independent Variables (Studenmund, Chapter 6)

Econ107 Applied Econometrics Topic 5: Specification: Choosing Independent Variables (Studenmund, Chapter 6) Econ7 Appled Economercs Topc 5: Specfcaon: Choosng Independen Varables (Sudenmund, Chaper 6 Specfcaon errors ha we wll deal wh: wrong ndependen varable; wrong funconal form. Ths lecure deals wh wrong ndependen

More information

2/20/2013. EE 101 Midterm 2 Review

2/20/2013. EE 101 Midterm 2 Review //3 EE Mderm eew //3 Volage-mplfer Model The npu ressance s he equalen ressance see when lookng no he npu ermnals of he amplfer. o s he oupu ressance. I causes he oupu olage o decrease as he load ressance

More information

Lecture 2 M/G/1 queues. M/G/1-queue

Lecture 2 M/G/1 queues. M/G/1-queue Lecure M/G/ queues M/G/-queue Posson arrval process Arbrary servce me dsrbuon Sngle server To deermne he sae of he sysem a me, we mus now The number of cusomers n he sysems N() Tme ha he cusomer currenly

More information

Testing a new idea to solve the P = NP problem with mathematical induction

Testing a new idea to solve the P = NP problem with mathematical induction Tesng a new dea o solve he P = NP problem wh mahemacal nducon Bacground P and NP are wo classes (ses) of languages n Compuer Scence An open problem s wheher P = NP Ths paper ess a new dea o compare he

More information

Appendix to Online Clustering with Experts

Appendix to Online Clustering with Experts A Appendx o Onlne Cluserng wh Expers Furher dscusson of expermens. Here we furher dscuss expermenal resuls repored n he paper. Ineresngly, we observe ha OCE (and n parcular Learn- ) racks he bes exper

More information

Introduction to Boosting

Introduction to Boosting Inroducon o Boosng Cynha Rudn PACM, Prnceon Unversy Advsors Ingrd Daubeches and Rober Schapre Say you have a daabase of news arcles, +, +, -, -, +, +, -, -, +, +, -, -, +, +, -, + where arcles are labeled

More information

Advanced time-series analysis (University of Lund, Economic History Department)

Advanced time-series analysis (University of Lund, Economic History Department) Advanced me-seres analss (Unvers of Lund, Economc Hsor Dearmen) 3 Jan-3 Februar and 6-3 March Lecure 4 Economerc echnues for saonar seres : Unvarae sochasc models wh Box- Jenns mehodolog, smle forecasng

More information

GMM parameter estimation. Xiaoye Lu CMPS290c Final Project

GMM parameter estimation. Xiaoye Lu CMPS290c Final Project GMM paraeer esaon Xaoye Lu M290c Fnal rojec GMM nroducon Gaussan ure Model obnaon of several gaussan coponens Noaon: For each Gaussan dsrbuon:, s he ean and covarance ar. A GMM h ures(coponens): p ( 2π

More information

Fitting a Conditional Linear Gaussian Distribution

Fitting a Conditional Linear Gaussian Distribution Fng a Condonal Lnear Gaussan Dsrbuon Kevn P. Murphy 28 Ocober 1998 Revsed 29 January 2003 1 Inroducon We consder he problem of fndng he maxmum lkelhood ML esmaes of he parameers of a condonal Gaussan varable

More information

Scattering at an Interface: Oblique Incidence

Scattering at an Interface: Oblique Incidence Course Insrucor Dr. Raymond C. Rumpf Offce: A 337 Phone: (915) 747 6958 E Mal: rcrumpf@uep.edu EE 4347 Appled Elecromagnecs Topc 3g Scaerng a an Inerface: Oblque Incdence Scaerng These Oblque noes may

More information

Consider processes where state transitions are time independent, i.e., System of distinct states,

Consider processes where state transitions are time independent, i.e., System of distinct states, Dgal Speech Processng Lecure 0 he Hdden Marov Model (HMM) Lecure Oulne heory of Marov Models dscree Marov processes hdden Marov processes Soluons o he hree Basc Problems of HMM s compuaon of observaon

More information

Cubic Bezier Homotopy Function for Solving Exponential Equations

Cubic Bezier Homotopy Function for Solving Exponential Equations Penerb Journal of Advanced Research n Compung and Applcaons ISSN (onlne: 46-97 Vol. 4, No.. Pages -8, 6 omoopy Funcon for Solvng Eponenal Equaons S. S. Raml *,,. Mohamad Nor,a, N. S. Saharzan,b and M.

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Lnear Response Theory: The connecon beween QFT and expermens 3.1. Basc conceps and deas Q: ow do we measure he conducvy of a meal? A: we frs nroduce a weak elecrc feld E, and hen measure

More information

Single-loop System Reliability-Based Design & Topology Optimization (SRBDO/SRBTO): A Matrix-based System Reliability (MSR) Method

Single-loop System Reliability-Based Design & Topology Optimization (SRBDO/SRBTO): A Matrix-based System Reliability (MSR) Method 10 h US Naonal Congress on Compuaonal Mechancs Columbus, Oho 16-19, 2009 Sngle-loop Sysem Relably-Based Desgn & Topology Opmzaon (SRBDO/SRBTO): A Marx-based Sysem Relably (MSR) Mehod Tam Nguyen, Junho

More information

WiH Wei He

WiH Wei He Sysem Idenfcaon of onlnear Sae-Space Space Baery odels WH We He wehe@calce.umd.edu Advsor: Dr. Chaochao Chen Deparmen of echancal Engneerng Unversy of aryland, College Par 1 Unversy of aryland Bacground

More information

[Link to MIT-Lab 6P.1 goes here.] After completing the lab, fill in the following blanks: Numerical. Simulation s Calculations

[Link to MIT-Lab 6P.1 goes here.] After completing the lab, fill in the following blanks: Numerical. Simulation s Calculations Chaper 6: Ordnary Leas Squares Esmaon Procedure he Properes Chaper 6 Oulne Cln s Assgnmen: Assess he Effec of Sudyng on Quz Scores Revew o Regresson Model o Ordnary Leas Squares () Esmaon Procedure o he

More information

Density Matrix Description of NMR BCMB/CHEM 8190

Density Matrix Description of NMR BCMB/CHEM 8190 Densy Marx Descrpon of NMR BCMBCHEM 89 Operaors n Marx Noaon Alernae approach o second order specra: ask abou x magnezaon nsead of energes and ranson probables. If we say wh one bass se, properes vary

More information

(,,, ) (,,, ). In addition, there are three other consumers, -2, -1, and 0. Consumer -2 has the utility function

(,,, ) (,,, ). In addition, there are three other consumers, -2, -1, and 0. Consumer -2 has the utility function MACROECONOMIC THEORY T J KEHOE ECON 87 SPRING 5 PROBLEM SET # Conder an overlappng generaon economy le ha n queon 5 on problem e n whch conumer lve for perod The uly funcon of he conumer born n perod,

More information

2.1 Constitutive Theory

2.1 Constitutive Theory Secon.. Consuve Theory.. Consuve Equaons Governng Equaons The equaons governng he behavour of maerals are (n he spaal form) dρ v & ρ + ρdv v = + ρ = Conservaon of Mass (..a) d x σ j dv dvσ + b = ρ v& +

More information

Lecture 18: The Laplace Transform (See Sections and 14.7 in Boas)

Lecture 18: The Laplace Transform (See Sections and 14.7 in Boas) Lecure 8: The Lalace Transform (See Secons 88- and 47 n Boas) Recall ha our bg-cure goal s he analyss of he dfferenal equaon, ax bx cx F, where we emloy varous exansons for he drvng funcon F deendng on

More information

Lecture 11 SVM cont

Lecture 11 SVM cont Lecure SVM con. 0 008 Wha we have done so far We have esalshed ha we wan o fnd a lnear decson oundary whose margn s he larges We know how o measure he margn of a lnear decson oundary Tha s: he mnmum geomerc

More information

Pattern Classification (III) & Pattern Verification

Pattern Classification (III) & Pattern Verification Preare by Prof. Hu Jang CSE638 --4 CSE638 3. Seech & Language Processng o.5 Paern Classfcaon III & Paern Verfcaon Prof. Hu Jang Dearmen of Comuer Scence an Engneerng York Unversy Moel Parameer Esmaon Maxmum

More information

F-Tests and Analysis of Variance (ANOVA) in the Simple Linear Regression Model. 1. Introduction

F-Tests and Analysis of Variance (ANOVA) in the Simple Linear Regression Model. 1. Introduction ECOOMICS 35* -- OTE 9 ECO 35* -- OTE 9 F-Tess and Analyss of Varance (AOVA n he Smple Lnear Regresson Model Inroducon The smple lnear regresson model s gven by he followng populaon regresson equaon, or

More information

TSS = SST + SSE An orthogonal partition of the total SS

TSS = SST + SSE An orthogonal partition of the total SS ANOVA: Topc 4. Orhogonal conrass [ST&D p. 183] H 0 : µ 1 = µ =... = µ H 1 : The mean of a leas one reamen group s dfferen To es hs hypohess, a basc ANOVA allocaes he varaon among reamen means (SST) equally

More information

On One Analytic Method of. Constructing Program Controls

On One Analytic Method of. Constructing Program Controls Appled Mahemacal Scences, Vol. 9, 05, no. 8, 409-407 HIKARI Ld, www.m-hkar.com hp://dx.do.org/0.988/ams.05.54349 On One Analyc Mehod of Consrucng Program Conrols A. N. Kvko, S. V. Chsyakov and Yu. E. Balyna

More information

Relative controllability of nonlinear systems with delays in control

Relative controllability of nonlinear systems with delays in control Relave conrollably o nonlnear sysems wh delays n conrol Jerzy Klamka Insue o Conrol Engneerng, Slesan Techncal Unversy, 44- Glwce, Poland. phone/ax : 48 32 37227, {jklamka}@a.polsl.glwce.pl Keywor: Conrollably.

More information

Genetic Algorithm in Parameter Estimation of Nonlinear Dynamic Systems

Genetic Algorithm in Parameter Estimation of Nonlinear Dynamic Systems Genec Algorhm n Parameer Esmaon of Nonlnear Dynamc Sysems E. Paeraks manos@egnaa.ee.auh.gr V. Perds perds@vergna.eng.auh.gr Ah. ehagas kehagas@egnaa.ee.auh.gr hp://skron.conrol.ee.auh.gr/kehagas/ndex.hm

More information

Digital Speech Processing Lecture 20. The Hidden Markov Model (HMM)

Digital Speech Processing Lecture 20. The Hidden Markov Model (HMM) Dgal Speech Processng Lecure 20 The Hdden Markov Model (HMM) Lecure Oulne Theory of Markov Models dscree Markov processes hdden Markov processes Soluons o he Three Basc Problems of HMM s compuaon of observaon

More information

FI 3103 Quantum Physics

FI 3103 Quantum Physics /9/4 FI 33 Quanum Physcs Aleander A. Iskandar Physcs of Magnesm and Phooncs Research Grou Insu Teknolog Bandung Basc Conces n Quanum Physcs Probably and Eecaon Value Hesenberg Uncerany Prncle Wave Funcon

More information

Dual Approximate Dynamic Programming for Large Scale Hydro Valleys

Dual Approximate Dynamic Programming for Large Scale Hydro Valleys Dual Approxmae Dynamc Programmng for Large Scale Hydro Valleys Perre Carpener and Jean-Phlppe Chanceler 1 ENSTA ParsTech and ENPC ParsTech CMM Workshop, January 2016 1 Jon work wh J.-C. Alas, suppored

More information

Filtrage particulaire et suivi multi-pistes Carine Hue Jean-Pierre Le Cadre and Patrick Pérez

Filtrage particulaire et suivi multi-pistes Carine Hue Jean-Pierre Le Cadre and Patrick Pérez Chaînes de Markov cachées e flrage parculare 2-22 anver 2002 Flrage parculare e suv mul-pses Carne Hue Jean-Perre Le Cadre and Parck Pérez Conex Applcaons: Sgnal processng: arge rackng bearngs-onl rackng

More information

More belief propaga+on (sum- product)

More belief propaga+on (sum- product) Notes for Sec+on 5 Today More mo+va+on for graphical models A review of belief propaga+on Special- case: forward- backward algorithm From variable elimina+on to junc+on tree (mainly just intui+on) More

More information

e-journal Reliability: Theory& Applications No 2 (Vol.2) Vyacheslav Abramov

e-journal Reliability: Theory& Applications No 2 (Vol.2) Vyacheslav Abramov June 7 e-ournal Relably: Theory& Applcaons No (Vol. CONFIDENCE INTERVALS ASSOCIATED WITH PERFORMANCE ANALYSIS OF SYMMETRIC LARGE CLOSED CLIENT/SERVER COMPUTER NETWORKS Absrac Vyacheslav Abramov School

More information

arxiv: v1 [math.oc] 11 Dec 2014

arxiv: v1 [math.oc] 11 Dec 2014 Nework Newon Aryan Mokhar, Qng Lng and Alejandro Rbero Dep. of Elecrcal and Sysems Engneerng, Unversy of Pennsylvana Dep. of Auomaon, Unversy of Scence and Technology of Chna arxv:1412.374v1 [mah.oc] 11

More information

Part II CONTINUOUS TIME STOCHASTIC PROCESSES

Part II CONTINUOUS TIME STOCHASTIC PROCESSES Par II CONTINUOUS TIME STOCHASTIC PROCESSES 4 Chaper 4 For an advanced analyss of he properes of he Wener process, see: Revus D and Yor M: Connuous marngales and Brownan Moon Karazas I and Shreve S E:

More information