MODIFIED K-MEANS CLUSTERING METHOD OF HMM STATES FOR INITIALIZATION OF BAUM-WELCH TRAINING ALGORITHM

Pauline Larue 1, Pierre Jallon 1, Bertrand Rivet 2

1 CEA LETI - MINATEC Campus, Grenoble, France, email: pierre.jallon@cea.fr
2 GIPSA-lab, CNRS-UMR5216, Grenoble University, Grenoble Cedex, France, email: bertrand.rivet@gipsa-lab.grenoble-inp.fr

ABSTRACT

Hidden Markov models are widely used for recognition algorithms (speech, writing, gesture, ...). In this paper, a classical set of models is considered: the state space of the hidden variable is discrete and the observation probabilities are modeled as Gaussian distributions. The model parameters are generally estimated from training sequences with the Baum-Welch algorithm, i.e. an expectation-maximization algorithm. However, this kind of algorithm is well known to be sensitive to its initialization point. The choice of this initialization point is addressed in this paper: a model with a very large number of states, which describes the training sequences accurately, is first constructed. The number of states is then reduced using a k-means algorithm on the states. This algorithm is compared, with numerical simulations, to other methods based on a k-means algorithm on the data.

1. INTRODUCTION

Hidden Markov models (HMM) are widely used for the statistical modeling of signals having a temporal structure. They are based on two stochastic processes: the state process and the observation process. The first one, X, is the so-called state variable and is a Markov chain, assumed discrete in this paper. It is not observed, but it is fully characterized by its transition and initialization probabilities. In many recognition algorithms (e.g., speech [1], writing recognition [2], gesture recognition [3], ...), this variable is used to describe the temporal structure of the signals to model. In particular, if these signals can be described as sequences of (shorter) stationary signals, a very particular set of models can be used: the left-right models, i.e. models whose transition probabilities satisfy the constraints

$$\forall (n_1, n_2),\; n_2 < n_1: \quad p(X_t = n_2 \mid X_{t-1} = n_1) = 0,$$

which means that the state variable can only be a non-decreasing sequence. The second process, Y, is the so-called observation process and is assumed to be independent conditionally on X [1]. For each state n in {0, ..., N-1} a probability density function (p.d.f.) p(Y | X = n) is defined, where N is the number of different values taken by X, i.e. the number of hidden states. Y is assumed to be a continuous variable and its p.d.f. is modeled for each state as a Gaussian distribution. The following notations are used in the rest of this paper: the initial probability of state n is denoted π_n = p(X_0 = n), n in {0, ..., N-1}, and the transition probability a_{n1,n2} between states n1 and n2 is defined by a_{n1,n2} = p(X_t = n2 | X_{t-1} = n1), (n1, n2) in {0, ..., N-1} x {0, ..., N-1}. The observation probabilities are described, for state #n, by two variables: µ_n (mean vector) and Σ_n (covariance matrix). Finally, the whole set of parameters of an HMM with N states is denoted λ:

$$\lambda = \left( \{\pi_n\}_n,\; \{a_{n_1,n_2}\}_{n_1,n_2},\; \{\mu_n, \Sigma_n\}_n \right).$$

In general, λ is estimated using training sequences. Given K observation sequences of length T_k, Y_{k,0:T_k} = {Y_{k,0}, ..., Y_{k,T_k}}, the optimal set of parameters is defined as

$$\hat{\lambda} = \underset{\lambda}{\mathrm{argmax}} \prod_{k=1}^{K} p\left(Y_{k,0:T_k} \mid \lambda\right).$$

Without additional assumptions, this problem cannot be solved analytically. The most widely used technique to estimate λ is the expectation-maximization (EM) algorithm [4], through the forward-backward method: the so-called Baum-Welch algorithm [5]. Starting from a preliminary set of parameters λ^(0), the algorithm estimates the set of parameters iteratively, denoted λ^(s) at step #s, such that the likelihood increases with respect to s. λ̂ is then estimated as

$$\hat{\lambda} = \lim_{s \to \infty} \lambda^{(s)}.$$
However, convergence to a global maximum is not ensured, and this algorithm is well known to be very sensitive to its initialization point λ^(0) [6]. Several initialization methods have hence been proposed in the literature. A k-means algorithm on the data can be used to cluster the observations Y_{k,0:T_k} [1, 7], or several sets of parameters can be used to perform the training [8]. For left-right models, constrained clustering techniques can also be used with a k-means algorithm on the observation data [9].
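To make the sensitivity to λ^(0) concrete, the following minimal sketch (not from the paper) runs the same Baum-Welch training from a user-supplied initialization and returns the reached log-likelihood; it assumes the third-party hmmlearn package, and all function names and parameter values are ours, for illustration only.

```python
# Minimal sketch (assumes hmmlearn): run Baum-Welch (EM) from a supplied
# lambda^(0) and report the log-likelihood reached at convergence.
import numpy as np
from hmmlearn import hmm

def fit_from(init_means, X, lengths, n_states=4, n_iter=50):
    """X: concatenated sequences, shape (sum(lengths), d)."""
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="full",
                            n_iter=n_iter, init_params="")  # keep our lambda^(0)
    # Unconstrained-model initialization used in the paper (Section 2):
    # 0.8 on the diagonal, 0.2/(N-1) elsewhere, uniform initial probabilities.
    model.startprob_ = np.full(n_states, 1.0 / n_states)
    model.transmat_ = np.full((n_states, n_states), 0.2 / (n_states - 1))
    np.fill_diagonal(model.transmat_, 0.8)
    model.means_ = np.asarray(init_means, dtype=float)        # (N, d)
    model.covars_ = np.tile(np.eye(X.shape[1]), (n_states, 1, 1))
    model.fit(X, lengths)            # Baum-Welch / EM iterations
    return model.score(X, lengths)   # final log-likelihood
```

Calling fit_from with two different sets of initial mean vectors on the same data typically reaches two different local maxima, which is exactly the behavior the proposed initialization aims to mitigate.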

In this paper, an alternative approach is proposed to estimate a first set λ^(0). A set of parameters is first estimated with a very large number of states (M >> N) to describe all training sequences in a very accurate way. However, this set of parameters cannot be used in practice, due to over-learning problems and computational time issues. As a consequence, the number of states has to be reduced to a much smaller number N to overcome these difficulties. An unsupervised clustering algorithm (based on the k-means algorithm) on the observation p.d.f.s is hence proposed to perform this reduction while keeping the accuracy of the training signals' description. Transition and initialization probabilities remain to be estimated with the Baum-Welch algorithm. The proposed method is derived in two cases: unconstrained and left-right models.

The paper is structured as follows. Section 2 describes the main lines of the algorithm. The k-means algorithm operations are detailed in Section 3. The performance of the proposed method is compared to that of other methods through simulations in Section 4. Finally, Section 5 concludes this paper.

2. INITIALIZATION ALGORITHM DESCRIPTION

The aim of the algorithm is to provide an initialization set of parameters λ^(0) close enough to λ̂ to ensure a good convergence of the learning algorithm. It only focuses on the observation probability parameters, i.e. it aims at finding a set of observation probability parameters {µ^(c)_n, Σ^(c)_n}, n in {1, ..., N}, i.e. Gaussian distributions, that describes the training sequences accurately. It is worth noting that the transition probabilities a_{n1,n2} and initialization probabilities π_n are not estimated (although constrained for the left-right models), this operation being done by the Baum-Welch algorithm. The following values are hence used:

1. for unconstrained models, transition probabilities are set to a_{n,n} = 0.8, n in {0, ..., N-1}, and, for n1 != n2, a_{n1,n2} = 0.2/(N-1); initialization probabilities are π_n = 1/N, n in {0, ..., N-1};
2. for left-right models, transition probabilities are set to a_{n,n} = 0.8 for n in {0, ..., N-2}, a_{N-1,N-1} = 1, a_{n,n+1} = 0.2, and all other values are set to 0; initialization probabilities are π_0 = 1 and π_n = 0 for n in {1, ..., N-1}.

Concerning the p.d.f.s of the hidden states, the estimation is performed in two steps: first, M p.d.f.s which accurately describe the signal are estimated (Subsection 2.1). These M distributions are then reduced to N (Subsection 2.2).

2.1 Initialization step

Given a set of K training sequences Y_{k,0:T_k}, k in {1, ..., K}, M Gaussian distributions which accurately describe the data can be estimated as follows. Each training sequence Y_k is split into several segments of length P, overlapping or not, covering its time support. Each segment is then modeled as a Gaussian signal whose parameters µ_{k,j}, Σ_{k,j} are estimated using classical methods: µ_{k,j} is the mean of the related signal segment and Σ_{k,j} its covariance matrix. Without loss of generality, it is assumed that the distributions are sorted with respect to the delay of the Y_k segments. For instance, with non-overlapping segments, each couple of values (k in {1, ..., K}, 0 <= j < T_k/P) can be estimated as

$$\mu_{k,j} = \frac{1}{P} \sum_{p=0}^{P-1} Y_{k,jP+p}, \qquad \Sigma_{k,j} = \frac{1}{P} \sum_{p=0}^{P-1} \left(Y_{k,jP+p} - \mu_{k,j}\right) \left(Y_{k,jP+p} - \mu_{k,j}\right)^T,$$

where (.)^T is the transpose operator. P should be chosen small enough so that each signal segment can be considered stationary, and large enough to ensure a good estimation of the mean vector and covariance matrix. After this step, all these values can be collected to build the initial set of parameters λ_M^(0). It is worth noting that this initial set λ_M^(0) describes the training sequences in a very accurate way: indeed, each training signal segment is described by one of the M states. However, as already mentioned in the introduction, this model cannot be used in practice because of over-learning problems and computational time issues.
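As a concrete illustration of Subsection 2.1, the sketch below (plain NumPy; function and variable names are ours, not the paper's) computes the per-segment Gaussian parameters for one training sequence, using non-overlapping segments of length P and the biased 1/P covariance estimate given above.

```python
# Initialization step (Subsection 2.1): cut one training sequence into
# non-overlapping segments of length P and fit one Gaussian per segment.
import numpy as np

def segment_gaussians(Y, P):
    """Y: one training sequence, array of shape (T, d).
    Returns the lists of mu_{k,j} and Sigma_{k,j} for j = 0 .. T//P - 1."""
    means, covs = [], []
    for j in range(len(Y) // P):
        seg = Y[j * P:(j + 1) * P]               # samples Y_{k,jP+p}, p = 0..P-1
        mu = seg.mean(axis=0)
        centered = seg - mu
        means.append(mu)
        covs.append(centered.T @ centered / P)   # biased 1/P estimate, as above
    return means, covs
```

Applying this to each of the K training sequences and pooling the results yields the M Gaussian distributions collected in λ_M^(0).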
2.2 Clustering step

The second step of the algorithm is hence to approximate these M Gaussian distributions with N ones (N << M). This operation is performed using a k-means algorithm on these Gaussian distributions. Let N(µ_{k,j}, Σ_{k,j}) denote the Gaussian p.d.f. with mean vector µ_{k,j} and covariance matrix Σ_{k,j}. Moreover, let us refer to the distribution which characterizes a cluster as its center, whose parameters are labelled by (c): µ^(c)_n, Σ^(c)_n. The k-means algorithm works iteratively as follows. Given an initial set of centers {µ^(c)_{n,0}, Σ^(c)_{n,0}}_n, the following two steps are performed at each iteration #s:

1. Association: each distribution N(µ_{k,j}, Σ_{k,j}) is associated with a center in the set {µ^(c)_{n,s}, Σ^(c)_{n,s}}_n;
2. Center update: based on the previous association, updated centers {µ^(c)_{n,s+1}, Σ^(c)_{n,s+1}}_n are estimated.

The centers are finally estimated as

$$\left(\mu_n^{(c)}, \Sigma_n^{(c)}\right) = \lim_{s \to \infty} \left(\mu_{n,s}^{(c)}, \Sigma_{n,s}^{(c)}\right).$$

The initial set of centers depends on the type of HMM (unconstrained or left-right model). For unconstrained models, the M distributions are sorted according to the first component of their mean vector; the set of M distributions is split into N consecutive subsets, and a center is estimated for each subset according to the second step of the k-means algorithm, as described above. For left-right models, the distributions of each training sequence are split into N disjoint segments: the first segment is then associated with center #0, the second one with center #1, and so on. Initialization centers are again estimated using the second step of the k-means algorithm. It is worth noting that the association step requires adapting the Euclidean distance to associate each distribution with a cluster. This point is discussed in Section 3.1 for unconstrained models and in Section 3.2 for left-right models. The center update step is described in Section 3.3.
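The alternation of the two steps can be written compactly. The sketch below is our own illustration: `distance` and `update_center` stand for the Section 3 operations defined next, and distributions and centers are (mu, Sigma) pairs.

```python
# Clustering step (Subsection 2.2): alternate association and center update
# until the assignment of distributions to centers no longer changes.
def kmeans_on_gaussians(dists, centers, distance, update_center, max_iter=100):
    assign = None
    for _ in range(max_iter):
        # 1. Association: each distribution goes to its closest center.
        new_assign = [min(range(len(centers)),
                          key=lambda n: distance(d, centers[n]))
                      for d in dists]
        if new_assign == assign:   # association unchanged: converged
            break
        assign = new_assign
        # 2. Center update: re-estimate each center from its members
        #    (an empty cluster keeps its previous center).
        centers = [update_center([d for d, a in zip(dists, assign) if a == n]
                                 or [centers[n]])
                   for n in range(len(centers))]
    return centers, assign
```

Convergence is declared when the association step leaves the assignment unchanged, which in turn leaves the centers fixed.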

3. MODIFIED K-MEANS ALGORITHM

This section details the practical considerations of the proposed modified k-means clustering algorithm.

3.1 State-to-center distance computation for unconstrained models

The unconstrained model case is considered first: each Gaussian distribution is associated with a center in an independent way. As both the distribution and the center are Gaussian distributions, the Kullback-Leibler divergence is proposed to estimate the distance between the two p.d.f.s. This divergence is defined as

$$D\left(p_1 \,\|\, p_2\right) = \int p_1(u) \log \frac{p_1(u)}{p_2(u)} \, du.$$

For Gaussian distributions, this latter expression is proportional to

$$d\left(\left(\mu_{k,j}, \Sigma_{k,j}\right), \left(\mu_n^{(c)}, \Sigma_n^{(c)}\right)\right) = \ln \frac{\det \Sigma_n^{(c)}}{\det \Sigma_{k,j}} + \mathrm{Tr}\left(\left(\Sigma_n^{(c)}\right)^{-1} \Sigma_{k,j}\right) + \left(\mu_n^{(c)} - \mu_{k,j}\right)^T \left(\Sigma_n^{(c)}\right)^{-1} \left(\mu_n^{(c)} - \mu_{k,j}\right), \quad (1)$$

where Tr is the trace operator and det is, classically, the determinant. Finally, for each (k, j), the related distribution (µ_{k,j}, Σ_{k,j}) is associated with the closest center #n̂, which satisfies

$$\hat{n} = \underset{n}{\mathrm{argmin}} \; d\left(\left(\mu_{k,j}, \Sigma_{k,j}\right), \left(\mu_n^{(c)}, \Sigma_n^{(c)}\right)\right).$$

3.2 State-to-center distance computation for left-right models

Compared to the previous unconstrained model, the left-right assumption sets an additional constraint. For all k, if distribution (µ_{k,j0}, Σ_{k,j0}) is associated with center #n0, then for all j1 > j0, (µ_{k,j1}, Σ_{k,j1}) cannot be associated with any center #n, n < n0. In other words, the distribution-center association has to be done jointly for each training sequence (rather than independently). For sequence #k, consider therefore the set of N+1 break indices I_k(0), ..., I_k(N) defining the states. Due to the left-right constraint, this set must satisfy I_k(0) = 0, I_k(N) = M_k (the number of distributions of sequence #k), and, for two states n1 < n2, I_k(n1) < I_k(n2). Finally, for each (k, j), the related distribution (µ_{k,j}, Σ_{k,j}) is associated with the center #n which satisfies I_k(n) <= j < I_k(n+1). The set of break indices is previously estimated as

$$\left(\hat{I}_k(1), \ldots, \hat{I}_k(N-1)\right) = \underset{I_k(1), \ldots, I_k(N-1)}{\mathrm{argmin}} \; D\left(I_k(1), \ldots, I_k(N-1)\right),$$

where

$$D\left(I_k(1), \ldots, I_k(N-1)\right) = \sum_{n=0}^{N-1} \; \sum_{j \in \mathcal{I}_k(n)} d\left(\left(\mu_{k,j}, \Sigma_{k,j}\right), \left(\mu_n^{(c)}, \Sigma_n^{(c)}\right)\right),$$

with the index sets 𝓘_k(n) = {I_k(n), ..., I_k(n+1) - 1} and d(., .) defined by (1). Note that the two extrema I_k(0) and I_k(N) do not need to be optimized, since I_k(0) = 0 and I_k(N) = M_k.

3.3 Center estimation

The third and last issue to solve is the computation of the updated center parameters once each distribution has been associated with one cluster. Consider therefore center #n and 𝓘^(c)_n the associated set of distributions. A random variable Z in this set can be written as

$$Z = \sum_{(k,j)} \delta\big(H - (k,j)\big) \, Z_{k,j},$$

where Z_{k,j} is a Gaussian random variable with mean and covariance {µ_{k,j}, Σ_{k,j}}, δ(u) is the Dirac delta function, and H is a hidden variable equal to (k, j) if Z shares the distribution of Z_{k,j}. The n-th center parameters µ^(c)_n, Σ^(c)_n are estimated as the mean and covariance of Z. It is straightforward to check that

$$\mu_n^{(c)} = \frac{1}{\mathrm{card}\, \mathcal{I}_n^{(c)}} \sum_{(k,j) \in \mathcal{I}_n^{(c)}} \mu_{k,j}$$

and

$$\Sigma_n^{(c)} = \frac{1}{\mathrm{card}\, \mathcal{I}_n^{(c)}} \sum_{(k,j) \in \mathcal{I}_n^{(c)}} \left(\Sigma_{k,j} + \mu_{k,j} \, \mu_{k,j}^T\right) - \mu_n^{(c)} \left(\mu_n^{(c)}\right)^T,$$

where card 𝓘^(c)_n is the number of elements in the set 𝓘^(c)_n.
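The two Section 3 building blocks translate directly into code. The sketch below (plain NumPy; function names are ours) implements the distance of Eq. (1), dropping the constant terms that do not affect the argmin, and the moment-matching center update of Subsection 3.3; these are the `distance` and `update_center` callables assumed by the loop sketched in Section 2.2.

```python
# Section 3 building blocks: KL-based distance (Eq. (1), constants dropped)
# and moment-matching center update (Subsection 3.3).
import numpy as np

def kl_distance(dist, center):
    """d((mu, Sigma), (mu_c, Sigma_c)) of Eq. (1)."""
    (mu, S), (mu_c, S_c) = dist, center
    S_c_inv = np.linalg.inv(S_c)
    diff = mu_c - mu
    return (np.log(np.linalg.det(S_c) / np.linalg.det(S))
            + np.trace(S_c_inv @ S)
            + diff @ S_c_inv @ diff)

def update_center(members):
    """Mean and covariance of Z, the mixture of the cluster's Gaussians:
    members is a list of (mu, Sigma) pairs."""
    mu_c = np.mean([mu for mu, _ in members], axis=0)
    second = sum(S + np.outer(mu, mu) for mu, S in members) / len(members)
    return mu_c, second - np.outer(mu_c, mu_c)
```

For left-right models, the same kl_distance is accumulated over each candidate segmentation when searching the break indices Î_k(.), e.g. by enumerating the admissible index sets.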

Table 1: Simulation scenarios — mean vectors µ_0, ..., µ_3 and covariance matrices Σ_0, ..., Σ_3 of the four states for scenarios #1, #2 and #3. [Numeric entries lost in transcription.]

4.1 Evaluation method

The performance of the proposed algorithm (denoted "k-means on Gaussian laws") has been estimated using numerical simulations and is compared to the performance of the classical k-means algorithm applied directly to the data (denoted "k-means on data"). Moreover, the influence of the left-right constraint is investigated (referred to as "unconstrained" or "left-right"). Nevertheless, the constrained k-means on data for left-right models has not been tested, because the number of sets of break indices to test is prohibitive, leading to an excessive computational time. As mentioned in the introduction, the aim of these algorithms is to estimate a set of parameters λ̂^(0) such that the convergence of the Baum-Welch algorithm is improved. This latter algorithm computes, in an iterative manner, the set of parameters λ^(s) at step #s such that the likelihood increases with respect to s. For a given set of training sequences Y_{k,0:T_k}, the sets of parameters λ̂^(0) estimated with the different initialization algorithms are hence compared through the reached value and the convergence rate of the log-likelihood function

$$\mathcal{L}(s) = \log \prod_{k=1}^{K} p\left(Y_{k,0:T_k} \mid \lambda^{(s)}\right).$$

4.2 Toy example

Signals have been generated as the concatenation of 4 signal segments, so that Y_{0:T} = {Y^(0)_{0:T_0}, Y^(1)_{0:T_1}, Y^(2)_{0:T_2}, Y^(3)_{0:T_3}}. Each of these four signals is generated using a Gaussian distribution: for the p-th signal, with mean vector µ_p and covariance matrix Σ_p. The time lengths T_p are randomly chosen using a truncated Gaussian distribution (keeping only positive values). Three sets of parameters {µ_p, Σ_p}_p have been tested, corresponding to the 3 scenarios summarized in Table 1. These three scenarios differ in difficulty. Indeed, scenario #1 is the easiest, since the 4 states are well separated. Scenario #2 is a little more complex than scenario #1, since states 1 and 2, and states 3 and 4, overlap. Finally, scenario #3 is the most confusing model, since all the states largely overlap. For each scenario a set of realizations has been generated, and for each realization 5 sequences have been used for the initialization and training algorithms. P = 20 has been used for the estimation of the M Gaussian distributions prior to the k-means algorithm.
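To reproduce the flavour of this toy setup, a generator along the following lines can be used. This is our own sketch: the truncated-Gaussian segment-length parameters MEAN_T and STD_T are hypothetical placeholders, the paper's exact values being lost in this transcription, as are the entries of Table 1.

```python
# Toy-signal generator (Subsection 4.2): one Gaussian segment per state,
# with a random positive length per segment.
import numpy as np

MEAN_T, STD_T = 150, 50  # hypothetical segment-length parameters, not the paper's

def toy_sequence(means, covs, rng=None):
    """Concatenate one Gaussian segment per state: Y_{0:T} = {Y^(0), ..., Y^(3)}."""
    rng = rng or np.random.default_rng()
    parts = []
    for mu, Sigma in zip(means, covs):
        T = 0
        while T <= 0:  # truncated Gaussian: keep only positive lengths
            T = int(rng.normal(MEAN_T, STD_T))
        parts.append(rng.multivariate_normal(mu, Sigma, size=T))
    return np.concatenate(parts)
```

Each call returns one realization Y_{0:T}; stacking five such sequences per realization gives the training material described above.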

[Figure 1: Performance evaluation achieved on the toy example — averaged log-likelihood values versus number of iterations for (a) Scenario #1, (b) Scenario #2 and (c) Scenario #3, comparing "unconstrained k-means on Gaussian laws", "left-right k-means on Gaussian laws" and "unconstrained k-means on data".]

Fig. 1 presents the achieved results as the values of L(s) averaged over the realizations. For the first scenario (Fig. 1(a)), the 3 tested algorithms ("left-right k-means on Gaussian distributions", "unconstrained k-means on Gaussian distributions" and "k-means on data") converge to the same value, but the k-means on Gaussian distributions algorithms converge much faster than the classical k-means initialization. For the second scenario (Fig. 1(b)), the two proposed k-means on Gaussian distributions algorithms (left-right and unconstrained) converge to the same point within a few iterations, whereas the classical k-means on data algorithm falls into a local maximum. Finally, for the most complex scenario (Fig. 1(c)), the constrained (left-right) proposed k-means algorithm on Gaussian distributions performs better than both other algorithms. The proposed method hence improves the estimation of the set of parameters used to initiate the Baum-Welch algorithm: a better and faster convergence is shown by the simulations.

4.3 Recognition problem

In a second set of numerical experiments (Tab. 2), the classification accuracy (CA) of the proposed method is compared to the CA achieved by the reference dynamic time warping (DTW) method [10]. Two configurations are compared, and for each of them bi-dimensional observations (Y_k in R^2) are generated from 25 different models. Each model is composed of the concatenation of 2 to 4 states (Configuration #1) or of 4 to 8 states (Configuration #2). In the whole experiment (configurations #1 and #2), a 5-state left-right HMM with the proposed initialization procedure is considered (referred to as HMM).

Table 2: Classification accuracy in percentage [%] of DTW and HMM for configurations #1 and #2. [Numeric entries lost in transcription.]

As one can see, the proposed method outperforms the classical DTW in classification accuracy, computed with a 10-fold cross-validation procedure: i.e., the observation sequences are partitioned into 10 sets, and each of them is sequentially used as the test database while the other 9 sets are used to train the HMM parameters. It is worth noting that this good behavior of the proposed method is achieved without optimization of the number of states. As a consequence, it is not surprising that the best CA is obtained on the simplest configuration (Conf. #1); however, the gap between the two detection methods remains even with the more complex configuration (Conf. #2).

5. CONCLUSION

In this study, an initialization of the Baum-Welch training algorithm based on a modified k-means clustering of HMM states is presented. The proposed procedure differs from classical implementations by clustering the states rather than the training data. The simulation results have shown that the proposed method improves the initialization of the Baum-Welch algorithm, since the value of the log-likelihood achieved by our method is higher than the value achieved by the classical initialization. Moreover, in a recognition problem, the proposed method outperforms the reference dynamic time warping method, showing its good behavior. Future work will include a deeper analysis of this method, as well as an automatic procedure to jointly adjust the number of hidden states.

REFERENCES

[1] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, February 1989.
[2] Jianying Hu, M. K. Brown, and W. Turin, "HMM based on-line handwriting recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 10, pp. 1039-1045, October 1996.
[3] M. Elmezain, A. Al-Hamadi, J. Appenrodt, and B. Michaelis, "A hidden Markov model-based continuous gesture recognition system for hand motion trajectory," in Proc. 19th International Conference on Pattern Recognition (ICPR), December 2008.
[4] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statist. Soc. Ser. B, vol. 39, pp. 1-38, 1977.
[5] Olivier Cappé, Eric Moulines, and Tobias Rydén, Eds., Inference in Hidden Markov Models, Springer Series in Statistics, 2005.
[6] L. R. Rabiner, B. H. Juang, S. E. Levinson, and M. M. Sondhi, "Some properties of continuous hidden Markov model representations," AT&T Technical Journal, 1985.
[7] K. Nathan, A. Senior, and J. Subrahmonia, "Initialization of hidden Markov models for unconstrained on-line handwriting recognition," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), May 1996, vol. 6.
[8] Md. Huda, Ranadhir Ghosh, and John Yearwood, "A variable initialization approach to the EM algorithm for better estimation of the parameters of hidden Markov model based acoustic modeling of speech signals," in Advances in Data Mining, Lecture Notes in Computer Science.
[9] S. Huda, J. Yearwood, and R. Togneri, "A constraint-based evolutionary learning approach to the expectation maximization for optimal estimation of the hidden Markov model for speech signal modeling," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 1, February 2009.
[10] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978.
