
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
ARTIFICIAL INTELLIGENCE LABORATORY
and
CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING
DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES

A.I. Memo, January 1996
C.B.C.L. Paper No. 130

Factorial Hidden Markov Models

Zoubin Ghahramani and Michael I. Jordan

This publication can be retrieved by anonymous ftp to publications.ai.mit.edu.

Abstract

We present a framework for learning in hidden Markov models with distributed state representations. Within this framework, we derive a learning algorithm based on the Expectation-Maximization (EM) procedure for maximum likelihood estimation. Analogous to the standard Baum-Welch update rules, the M-step of our algorithm is exact and can be solved analytically. However, due to the combinatorial nature of the hidden state representation, the exact E-step is intractable. A simple and tractable mean field approximation is derived. Empirical results on a set of problems suggest that both the mean field approximation and Gibbs sampling are viable alternatives to the computationally expensive exact algorithm.

Copyright (c) Massachusetts Institute of Technology, 1994

This report describes research done at the Center for Biological and Computational Learning and the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the Center is provided in part by a grant from the National Science Foundation under contract ASC. This project was supported in part by a grant from the McDonnell-Pew Foundation, by a grant from ATR Human Information Processing Research Laboratories, by a grant from Siemens Corporation, and by grant N from the Office of Naval Research.

1 Introduction

A problem of fundamental interest to machine learning is time series modeling. Due to the simplicity and efficiency of its parameter estimation algorithm, the hidden Markov model (HMM) has emerged as one of the basic statistical tools for modeling discrete time series, finding widespread application in the areas of speech recognition (Rabiner and Juang, 1986) and computational molecular biology (Baldi et al., 1994). An HMM is essentially a mixture model, encoding information about the history of a time series in the value of a single multinomial variable (the hidden state). This multinomial assumption allows an efficient parameter estimation algorithm to be derived (the Baum-Welch algorithm). However, it also severely limits the representational capacity of HMMs. For example, to represent 30 bits of information about the history of a time sequence, an HMM would need 2^30 distinct states. On the other hand, an HMM with a distributed state representation could achieve the same task with 30 binary units (Williams and Hinton, 1991). This paper addresses the problem of deriving efficient learning algorithms for hidden Markov models with distributed state representations.

The need for distributed state representations in HMMs can be motivated in two ways. First, such representations allow the state space to be decomposed into features that naturally decouple the dynamics of a single process generating the time series. Second, distributed state representations simplify the task of modeling time series generated by the interaction of multiple independent processes. For example, a speech signal generated by the superposition of multiple simultaneous speakers can potentially be modeled with such an architecture.

Williams and Hinton (1991) first formulated the problem of learning in HMMs with distributed state representations and proposed a solution based on deterministic Boltzmann learning. The approach presented in this paper is similar to Williams and Hinton's in that it is also based on a statistical mechanical formulation of hidden Markov models. However, our learning algorithm is quite different in that it makes use of the special structure of HMMs with distributed state representations, resulting in a more efficient learning procedure. Anticipating the results in section 2, this learning algorithm both obviates the need for the two-phase procedure of Boltzmann machines and has an exact M-step. A different approach comes from Saul and Jordan (1995), who derived a set of rules for computing the gradients required for learning in HMMs with distributed state spaces. However, their methods can only be applied to a limited class of architectures.

2 Factorial hidden Markov models

Hidden Markov models are a generalization of mixture models. At any time step, the probability density over the observables defined by an HMM is a mixture of the densities defined by each state in the underlying Markov model. Temporal dependencies are introduced by specifying that the prior probability of the state at time t depends on the state at time t-1 through a transition matrix, P (Figure 1a). Another generalization of mixture models, the cooperative vector quantizer (CVQ; Hinton and Zemel, 1994), provides a natural formalism for distributed state representations in HMMs. Whereas in simple mixture models each data point must be accounted for by a single mixture component, in CVQs each data point is accounted for by the combination of contributions from many mixture components, one from each separate vector quantizer. The total probability density modeled by a CVQ is also a mixture model; however, this mixture density is assumed to factorize into a product of densities, each density associated with one of the vector quantizers. Thus, the CVQ is a mixture model with distributed representations for the mixture components.

Factorial hidden Markov models[1] combine the state transition structure of HMMs with the distributed representations of CVQs (Figure 1b). Each of the d underlying Markov models has a discrete state s_t^{(i)} at time t and transition probability matrix P^{(i)}. As in the CVQ, the states are mutually exclusive within each vector quantizer and we assume real-valued outputs. The sequence of observable output vectors is generated from a normal distribution with mean given by the weighted combination of the states of the underlying Markov models:

    y_t \sim \mathcal{N}\left( \sum_{i=1}^{d} W^{(i)} s_t^{(i)},\ C \right),

where C is a common covariance matrix. The k-valued states s^{(i)} are represented as discrete column vectors with a 1 in one position and 0 everywhere else; the mean of the observable is therefore a combination of columns from each of the W^{(i)} matrices.

Figure 1. a) Hidden Markov model. b) Factorial hidden Markov model.

We capture the above probability model by defining the energy of a sequence of T states and observations, {(s_t, y_t)}_{t=1}^{T}, which we abbreviate to {s, y}, as:

    H(\{s, y\}) = \frac{1}{2} \sum_{t=1}^{T} \left[ y_t - \sum_{i=1}^{d} W^{(i)} s_t^{(i)} \right]' C^{-1} \left[ y_t - \sum_{i=1}^{d} W^{(i)} s_t^{(i)} \right] - \sum_{t=1}^{T} \sum_{i=1}^{d} s_t^{(i)'} A^{(i)} s_{t-1}^{(i)},   (1)

where [A^{(i)}]_{jl} = \log P(s_{t,j}^{(i)} \mid s_{t-1,l}^{(i)}) such that \sum_{j=1}^{k} e^{[A^{(i)}]_{jl}} = 1, and ' denotes matrix transpose. Priors \pi^{(i)} for the initial state, s_1^{(i)}, are introduced by setting the second term in (1) to \sum_{i=1}^{d} s_1^{(i)'} \log \pi^{(i)}. The probability model is defined from this energy by the Boltzmann distribution

    P(\{s, y\}) = \frac{1}{Z} \exp\{ -H(\{s, y\}) \}.   (2)

[1] We refer to HMMs with distributed state as factorial HMMs as the features of the distributed state factorize the total state representation.
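As a concrete illustration of this generative process, the following is a minimal Python sketch, not part of the original memo; the function name sample_fhmm and the array conventions (lists W, P, pi of per-chain parameters) are assumptions made for illustration only.

import numpy as np

def sample_fhmm(W, P, pi, C, T, rng=np.random.default_rng(0)):
    """Sample a length-T observation sequence from a factorial HMM.

    W  : list of d arrays, each (p, k)  -- output weights W^(i)
    P  : list of d arrays, each (k, k)  -- column-stochastic transitions, P[i][j, l] = P(s_t = j | s_{t-1} = l)
    pi : list of d arrays, each (k,)    -- initial state priors
    C  : (p, p) shared output covariance
    """
    d, p = len(W), W[0].shape[0]
    states = np.zeros((T, d), dtype=int)
    Y = np.zeros((T, p))
    for t in range(T):
        for i in range(d):
            probs = pi[i] if t == 0 else P[i][:, states[t - 1, i]]
            states[t, i] = rng.choice(len(probs), p=probs)
        # mean of y_t is one column of each W^(i), selected by that chain's state
        mean = sum(W[i][:, states[t, i]] for i in range(d))
        Y[t] = rng.multivariate_normal(mean, C)
    return states, Y

Called with randomly sampled and appropriately normalized parameters and T = 20, this mirrors the kind of synthetic data generation described in section 3.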

Note that as in the CVQ (Ghahramani, 1995), the unclamped partition function

    Z = \int d\{y\} \sum_{\{s\}} \exp\{ -H(\{s, y\}) \}

evaluates to a constant, independent of the parameters. This can be shown by first integrating out the Gaussian variables, removing all dependency on {y}, and then summing over the states using the constraint \sum_{j=1}^{k} e^{[A^{(i)}]_{jl}} = 1.

The EM algorithm for factorial HMMs

As in HMMs, the parameters of a factorial HMM can be estimated via the EM (Baum-Welch) algorithm. This procedure iterates between using the current parameters to compute probabilities over the hidden states (E-step), and using these probabilities to maximize the expected log likelihood of the parameters (M-step). Using the likelihood (2), the expected log likelihood of the parameters is

    Q(\phi^{new} \mid \phi) = \langle -H(\{s, y\}) - \log Z \rangle_c,   (3)

where \phi = \{W^{(i)}, P^{(i)}, C\}_{i=1}^{d} denotes the current parameters, and \langle \cdot \rangle_c denotes expectation given the clamped observation sequence and \phi. Given the observation sequence, the only random variables are the hidden states. Expanding equation (3) and limiting the expectation to these random variables, we find that the statistics that need to be computed for the E-step are \langle s_t^{(i)} \rangle_c, \langle s_t^{(i)} s_t^{(j)'} \rangle_c, and \langle s_t^{(i)} s_{t-1}^{(i)'} \rangle_c. Note that in standard HMM notation (Rabiner and Juang, 1986), \langle s_t^{(i)} \rangle_c corresponds to \gamma_t and \langle s_t^{(i)} s_{t-1}^{(i)'} \rangle_c corresponds to \xi_t, whereas \langle s_t^{(i)} s_t^{(j)'} \rangle_c has no analogue when there is only a single underlying Markov model. The M-step uses these expectations to maximize Q with respect to the parameters. The constant partition function allowed us to drop the second term in (3). Therefore, unlike the Boltzmann machine, the expected log likelihood does not depend on statistics collected in an unclamped phase of learning, resulting in much faster learning than the traditional Boltzmann machine (Neal, 1992).

M-step

Setting the derivatives of Q with respect to the output weights to zero, we obtain a linear system of equations for W:

    W^{new} = \left[ \sum_{N,t} y_t \langle s_t \rangle_c' \right] \left[ \sum_{N,t} \langle s_t s_t' \rangle_c \right]^{\dagger},

where s_t and W are the vector and matrix of concatenated s_t^{(i)} and W^{(i)}, respectively, \sum_N denotes summation over a data set of N sequences, and \dagger is the Moore-Penrose pseudo-inverse. To estimate the log transition probabilities we solve \partial Q / \partial [A^{(i)}]_{jl} = 0 subject to the constraint \sum_j e^{[A^{(i)}]_{jl}} = 1, obtaining

    [A^{(i)}]^{new}_{jl} = \log \left( \frac{ \sum_{N,t} \langle s_{t,j}^{(i)} s_{t-1,l}^{(i)} \rangle_c }{ \sum_{N,t,j} \langle s_{t,j}^{(i)} s_{t-1,l}^{(i)} \rangle_c } \right).   (4)

The covariance matrix can be similarly estimated:

    C^{new} = \frac{1}{NT} \left[ \sum_{N,t} y_t y_t' - \sum_{N,t} y_t \langle s_t' \rangle_c \left[ \sum_{N,t} \langle s_t s_t' \rangle_c \right]^{\dagger} \sum_{N,t} \langle s_t \rangle_c y_t' \right].

The M-step equations can therefore be solved analytically; furthermore, for a single underlying Markov chain, they reduce to the traditional Baum-Welch re-estimation equations.
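To show how these closed-form updates might look in practice, here is a minimal sketch, assuming the E-step statistics have already been accumulated over the data set; the variable names (Eyy, Ey_s, Ess, Etrans) and the stacking of the d chains into one long state vector are conventions introduced here, not the memo's.

import numpy as np

def m_step(Eyy, Ey_s, Ess, Etrans, NT):
    """Closed-form M-step from accumulated expected sufficient statistics.

    Eyy    : (p, p)   sum over sequences and time of y_t y_t'
    Ey_s   : (p, kd)  sum of y_t <s_t>'  (s_t is the concatenation of the d chain states)
    Ess    : (kd, kd) sum of <s_t s_t'>
    Etrans : list of d (k, k) arrays, sum of <s_t^(i) s_{t-1}^(i)'>
    NT     : total number of (sequence, time) terms in the sums
    """
    W_new = Ey_s @ np.linalg.pinv(Ess)                          # output weights via the pseudo-inverse
    C_new = (Eyy - W_new @ Ey_s.T) / NT                         # shared output covariance
    P_new = [E / E.sum(axis=0, keepdims=True) for E in Etrans]  # transition matrices, columns sum to one
    return W_new, C_new, P_new

For simplicity the sketch returns the transition probabilities P^{(i)} themselves rather than their logarithms [A^{(i)}]; exponentiating equation (4) gives exactly this column normalization.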

E-step

Unfortunately, as in the simpler CVQ, the exact E-step for factorial HMMs is computationally intractable. For example, the expectation of the j-th unit in vector i at time step t, given {y}, is:

    \langle s_{t,j}^{(i)} \rangle_c = P(s_{t,j}^{(i)} = 1 \mid \{y\}, \phi) = \sum_{j_1, \ldots, j_{i-1}, j_{i+1}, \ldots, j_d = 1}^{k} P(s_{t,j_1}^{(1)} = 1, \ldots, s_{t,j}^{(i)} = 1, \ldots, s_{t,j_d}^{(d)} = 1 \mid \{y\}, \phi).

Although the Markov property can be used to obtain a forward-backward-like factorization of this expectation across time steps, the sum over all possible configurations of the other hidden units within each time step is unavoidable. For a data set of N sequences of length T, the full E-step calculated through the forward-backward procedure has time complexity O(NTk^{2d}). Although more careful bookkeeping can reduce the complexity to O(NTdk^{d+1}), the exponential time cannot be avoided. This intractability of the exact E-step is due inherently to the cooperative nature of the model: the setting of one vector only determines the mean of the observable if all the other vectors are fixed.

Rather than summing over all possible hidden state patterns to compute the exact expectations, a natural approach is to approximate them through a Monte Carlo method such as Gibbs sampling. The procedure starts with a clamped observable sequence {y} and a random setting of the hidden states {s_{t,j}^{(i)}}. At each time step, each state vector is updated stochastically according to its probability distribution conditioned on the setting of all the other state vectors:

    s_t^{(i)} \sim P(s_t^{(i)} \mid \{y\}, \{ s_\tau^{(j)} : j \neq i \text{ or } \tau \neq t \}, \phi).

These conditional distributions are straightforward to compute and a full pass of Gibbs sampling requires O(NTkd) operations. The first- and second-order statistics needed to estimate \langle s_t^{(i)} \rangle_c, \langle s_t^{(i)} s_t^{(j)'} \rangle_c and \langle s_t^{(i)} s_{t-1}^{(i)'} \rangle_c are collected using the s_{t,j}^{(i)}'s visited and the probabilities estimated during this sampling process.
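The conditional update above lends itself to a short sketch. The following is a minimal illustration of one Gibbs sweep, written for the Gaussian output model of section 2; it is not taken from the memo, and names such as gibbs_sweep, log_P and C_inv are assumptions made for this example.

import numpy as np

def gibbs_sweep(states, Y, W, log_P, log_pi, C_inv, rng):
    """One full Gibbs pass: resample each chain's state at each time step,
    conditioned on the observation and all other currently assigned states.

    states : (T, d) int array of current state indices
    Y      : (T, p) observations; W: list of d (p, k) output weights
    log_P  : list of d (k, k) log transitions, log_P[i][j, l] = log P(s_t = j | s_{t-1} = l)
    log_pi : list of d (k,) log initial priors; C_inv: (p, p) inverse covariance
    """
    T, d = states.shape
    for t in range(T):
        for i in range(d):
            k = W[i].shape[1]
            # mean contribution of all chains except chain i at time t
            other = sum(W[l][:, states[t, l]] for l in range(d) if l != i)
            logp = np.empty(k)
            for j in range(k):
                resid = Y[t] - (other + W[i][:, j])
                logp[j] = -0.5 * resid @ C_inv @ resid           # Gaussian output term
                logp[j] += log_pi[i][j] if t == 0 else log_P[i][j, states[t - 1, i]]
                if t + 1 < T:
                    logp[j] += log_P[i][states[t + 1, i], j]     # link to the next time step
            probs = np.exp(logp - logp.max())
            states[t, i] = rng.choice(k, p=probs / probs.sum())
    return states

The running averages of the sampled states and of these conditional probabilities, needed to estimate the first- and second-order expectations, would be accumulated across sweeps; that bookkeeping is omitted here.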

Mean field approximation

A different approach to computing the expectations in an intractable system is given by mean field theory. A mean field approximation for factorial HMMs can be obtained by defining the energy function

    \tilde{H}(\{s, y\}) = \frac{1}{2} \sum_t [ y_t - \mu_t ]' C^{-1} [ y_t - \mu_t ] - \sum_{t,i} s_t^{(i)'} \log m_t^{(i)},

which results in a completely factorized approximation to the probability density (2):

    \tilde{P}(\{s, y\}) \propto \prod_t \exp\left\{ -\frac{1}{2} [ y_t - \mu_t ]' C^{-1} [ y_t - \mu_t ] \right\} \prod_{t,i,j} ( m_{t,j}^{(i)} )^{s_{t,j}^{(i)}}.   (5)

In this approximation, the observables are independently Gaussian distributed with mean \mu_t and each hidden state vector is multinomially distributed with mean m_t^{(i)}. This approximation is made as tight as possible by choosing the mean field parameters \mu_t and m_t^{(i)} that minimize the Kullback-Leibler divergence

    KL(\tilde{P} \| P) = \langle \log \tilde{P} \rangle_{\tilde{P}} - \langle \log P \rangle_{\tilde{P}},

where \langle \cdot \rangle_{\tilde{P}} denotes expectation over the mean field distribution (5). With the observables clamped, \mu_t can be set equal to the observable y_t. Minimizing KL(\tilde{P} \| P) with respect to the mean field parameters for the states results in a fixed-point equation which can be iterated until convergence:

    m_t^{(i)\,new} = \sigma\left\{ W^{(i)'} C^{-1} [ y_t - \hat{y}_t ] + W^{(i)'} C^{-1} W^{(i)} m_t^{(i)} - \frac{1}{2} \mathrm{diag}\{ W^{(i)'} C^{-1} W^{(i)} \} + A^{(i)} m_{t-1}^{(i)} + A^{(i)'} m_{t+1}^{(i)} \right\},   (6)

where \hat{y}_t \equiv \sum_i W^{(i)} m_t^{(i)} and \sigma\{\cdot\} is the softmax exponential, normalized over each hidden state vector. The first term is the projection of the error in the observable onto the weights of state vector i: the more a hidden unit can reduce this error, the larger its mean field parameter. The next three terms arise from the fact that \langle (s_j^{(i)})^2 \rangle_{\tilde{P}} is equal to m_j^{(i)} and not (m_j^{(i)})^2. The last two terms introduce dependencies forward and backward in time. Each state vector is asynchronously updated using (6), at a time cost of O(NTkd) per iteration. Convergence is diagnosed by monitoring the KL divergence in the mean field distribution between successive time steps; in practice convergence is very rapid (about 2 to 10 iterations of (6)).
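For concreteness, here is a minimal sketch of one asynchronous pass of the fixed-point updates in equation (6), under the reconstruction of that equation given above; it is illustrative rather than the memo's implementation, and names such as mean_field_sweep are assumptions.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mean_field_sweep(m, Y, W, A, C_inv):
    """One pass of the fixed-point updates (6) over all chains and time steps.

    m : list of d arrays, each (T, k)  -- mean field parameters m_t^(i)
    Y : (T, p) observations; W: list of d (p, k) output weights
    A : list of d (k, k) log transition matrices; C_inv: (p, p) inverse covariance
    """
    T, d = Y.shape[0], len(W)
    for t in range(T):
        for i in range(d):
            y_hat = sum(W[l] @ m[l][t] for l in range(d))   # current reconstruction of y_t
            WC = W[i].T @ C_inv                              # (k, p)
            field = WC @ (Y[t] - y_hat) + WC @ W[i] @ m[i][t] \
                    - 0.5 * np.diag(WC @ W[i])
            if t > 0:
                field += A[i] @ m[i][t - 1]                  # dependence backward in time
            if t + 1 < T:
                field += A[i].T @ m[i][t + 1]                # dependence forward in time
            m[i][t] = softmax(field)                         # normalize over the k states of chain i
    return m

In practice one would iterate mean_field_sweep until the KL divergence (or simply the change in the parameters) between successive passes falls below a threshold, matching the convergence check described above.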

3 Empirical Results

We compared three EM algorithms for learning in factorial HMMs, using Gibbs sampling, the mean field approximation, and the exact (exponential) E-step, on the basis of performance and speed on randomly generated problems. Problems were generated from a factorial HMM structure, the parameters of which were sampled from a uniform [0, 1] distribution and appropriately normalized to satisfy the sum-to-one constraints of the transition matrices and priors. Also included in the comparison was a traditional HMM with as many states (k^d) as the factorial HMM.

Table 1 summarizes the results. Even for moderately large state spaces (d >= 3 and k >= 3) the standard HMM with k^d states suffers from severe overfitting. Furthermore, both the standard HMM and the exact E-step factorial HMM are extremely slow on the larger problems. The Gibbs sampling and mean field approximations offer roughly comparable performance at a great increase in speed.

4 Discussion

The basic contribution of this paper is a learning algorithm for hidden Markov models with distributed state representations. The standard Baum-Welch procedure is intractable for such architectures, as the size of the state space generated from the cross product of d k-valued features is O(k^d), and the time complexity of Baum-Welch is quadratic in this size. More importantly, unless special constraints are applied to this cross-product HMM architecture, the number of parameters also grows as O(k^{2d}), which can result in severe overfitting.

The architecture for factorial HMMs presented in this paper did not include any coupling between the underlying Markov chains. It is possible to extend the algorithm presented here to architectures which incorporate such couplings. However, these couplings must be introduced with caution, as they may result either in an exponential growth in parameters or in a loss of the constant partition function property.

The learning algorithm derived in this paper assumed real-valued observables. The algorithm can also be derived for HMMs with discrete observables, an architecture closely related to sigmoid belief networks (Neal, 1992). However, the nonlinearities induced by discrete observables make both the E-step and M-step of the algorithm more difficult.

Table 1: Comparison of factorial HMMs on four problems of varying size. Each block of rows corresponds to one problem size, (d, k) in {(3, 2), (3, 3), (5, 2), (5, 3)}, with one row per algorithm (HMM, Exact, Gibbs, MF); the columns are d, k, Alg, #, Train, Test, Cycles, and Time/Cycle.

Table 1. Data was generated from a factorial HMM with d underlying Markov models of k states each. The training set was 10 sequences of length 20 where the observable was a 4-dimensional vector; the test set was 20 such sequences. HMM indicates a hidden Markov model with k^d states; the other algorithms are factorial HMMs with d underlying k-state models. Gibbs sampling used 10 samples of each state. The algorithms were run until convergence, as monitored by relative change in the likelihood, or a maximum of 100 cycles. The # column indicates the number of runs. The Train and Test columns show the log likelihood plus or minus one standard deviation on the two data sets. The last column indicates approximate time per cycle in seconds on a Silicon Graphics R4400 processor running Matlab.

In conclusion, we have presented Gibbs sampling and mean field learning algorithms for factorial hidden Markov models. Such models incorporate the time series modeling capabilities of hidden Markov models and the advantages of distributed representations for the state space. Future work will concentrate on a more efficient mean field approximation in which the forward-backward algorithm is used to compute the E-step exactly within each Markov chain, and mean field theory is used to handle interactions between chains (Saul and Jordan, 1996).

References

Baldi, P., Chauvin, Y., Hunkapiller, T., and McClure, M. (1994). Hidden Markov models of biological primary sequence information. Proc. Nat. Acad. Sci. (USA), 91(3):1059-1063.

Hinton, G. and Zemel, R. (1994). Autoencoders, minimum description length, and Helmholtz free energy. In Cowan, J., Tesauro, G., and Alspector, J., editors, Advances in Neural Information Processing Systems 6. Morgan Kaufmann Publishers, San Francisco, CA.

Neal, R. (1992). Connectionist learning of belief networks. Artificial Intelligence, 56:71-113.

Rabiner, L. and Juang, B. (1986). An introduction to hidden Markov models. IEEE Acoustics, Speech & Signal Processing Magazine, 3:4-16.

Saul, L. and Jordan, M. (1995). Boltzmann chains and hidden Markov models. In Tesauro, G., Touretzky, D., and Leen, T., editors, Advances in Neural Information Processing Systems 7. MIT Press, Cambridge, MA.

Saul, L. and Jordan, M. (1996). Exploiting tractable substructures in intractable networks. In Touretzky, D., Mozer, M., and Hasselmo, M., editors, Advances in Neural Information Processing Systems 8. MIT Press.

Williams, C. and Hinton, G. (1991). Mean field networks that learn to discriminate temporally distorted strings. In Touretzky, D., Elman, J., Sejnowski, T., and Hinton, G., editors, Connectionist Models: Proceedings of the 1990 Summer School, pages 18-22. Morgan Kaufmann Publishers, San Mateo, CA.
