Variational Learning for Switching State-Space Models

LETTER  Communicated by Volker Tresp

Variational Learning for Switching State-Space Models

Zoubin Ghahramani
Geoffrey E. Hinton
Gatsby Computational Neuroscience Unit, University College London, London WC1N 3AR, U.K.

We introduce a new statistical model for time series that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time-series models, hidden Markov models and linear dynamical systems, and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs, Jordan, Nowlan, & Hinton, 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact expectation-maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log-likelihood and makes use of both the forward and backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems. We tested the algorithm on artificial data sets and a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching state-space models.

1 Introduction

Most commonly used probabilistic models of time series are descendants of either hidden Markov models (HMMs) or stochastic linear dynamical systems, also known as state-space models (SSMs). HMMs represent information about the past of a sequence through a single discrete random variable, the hidden state. The prior probability distribution of this state is derived from the previous hidden state using a stochastic transition matrix. Knowing the state at any time makes the past, present, and future observations statistically independent. This is the Markov independence property that gives the model its name.

SSMs represent information about the past through a real-valued hidden state vector. Again, conditioned on this state vector, the past, present, and future observations are statistically independent. The dependency between

Neural Computation 12 © 2000 Massachusetts Institute of Technology

the present state vector and the previous state vector is specified through the dynamic equations of the system and the noise model. When these equations are linear and the noise model is gaussian, the SSM is also known as a linear dynamical system or Kalman filter model.

Unfortunately, most real-world processes cannot be characterized by either purely discrete or purely linear-gaussian dynamics. For example, an industrial plant may have multiple discrete modes of behavior, each with approximately linear dynamics. Similarly, the pixel intensities in an image of a translating object vary according to approximately linear dynamics for subpixel translations, but as the image moves over a larger range, the dynamics change significantly and nonlinearly.

This article addresses models of dynamical phenomena that are characterized by a combination of discrete and continuous dynamics. We introduce a probabilistic model called the switching SSM inspired by the divide-and-conquer principle underlying the mixture-of-experts neural network (Jacobs, Jordan, Nowlan, & Hinton, 1991). Switching SSMs are a natural generalization of HMMs and SSMs in which the dynamics can transition in a discrete manner from one linear operating regime to another. There is a large literature on models of this kind in econometrics, signal processing, and other fields (Harrison & Stevens, 1976; Chang & Athans, 1978; Hamilton, 1989; Shumway & Stoffer, 1991; Bar-Shalom & Li, 1993). Here we extend these models to allow for multiple real-valued state vectors, draw connections between these fields and the relevant literature on neural computation and probabilistic graphical models, and derive a learning algorithm for all the parameters of the model based on a structured variational approximation that rigorously maximizes a lower bound on the log-likelihood.

In the following section we review the background material on SSMs, HMMs, and hybrids of the two. In section 3, we describe the generative model, the probability distribution defined over the observation sequences, for switching SSMs.
In section 4, we describe a learning algorithm for switching state-space models that is based on a structured variational approximation to the expectation-maximization algorithm. In section 5 we present simulation results in both an artificial domain, to assess the quality of the approximate inference method, and a natural domain. We conclude with section 6.

2 Background

2.1 State-Space Models. An SSM defines a probability density over time series of real-valued observation vectors {Y_t} by assuming that the observations were generated from a sequence of hidden state vectors {X_t}. (Appendix A describes the variables and notation used throughout this article.) In particular, the SSM specifies that given the hidden state vector at one time step, the observation vector at that time step is statistically independent from all other observation vectors, and that the hidden state vectors

Figure 1: A directed acyclic graph (DAG) specifying conditional independence relations for a state-space model. Each node is conditionally independent from its nondescendants given its parents. The output Y_t is conditionally independent from all other variables given the state X_t; and X_t is conditionally independent from X_1, …, X_{t-2} given X_{t-1}. In this and the following figures, shaded nodes represent observed variables, and unshaded nodes represent hidden variables.

obey the Markov independence property. The joint probability for the sequences of states X_t and observations Y_t can therefore be factored as:

P({X_t, Y_t}) = P(X_1) P(Y_1 | X_1) ∏_{t=2}^{T} P(X_t | X_{t-1}) P(Y_t | X_t).   (2.1)

The conditional independencies specified by equation 2.1 can be expressed graphically in the form of Figure 1. The simplest and most commonly used models of this kind assume that the transition and output functions are linear and time invariant and the distributions of the state and observation variables are multivariate gaussian. We use the term state-space model to refer to this simple form of the model. For such models, the state transition function is

X_t = A X_{t-1} + w_t,   (2.2)

where A is the state transition matrix and w_t is zero-mean gaussian noise in the dynamics, with covariance matrix Q. P(X_1) is assumed to be gaussian. Equation 2.2 ensures that if P(X_{t-1}) is gaussian, so is P(X_t). The output function is

Y_t = C X_t + v_t,   (2.3)

where C is the output matrix and v_t is zero-mean gaussian output noise with covariance matrix R; P(Y_t | X_t) is therefore also gaussian:

P(Y_t | X_t) = (2π)^{-D/2} |R|^{-1/2} exp{ -(1/2) (Y_t - C X_t)' R^{-1} (Y_t - C X_t) },   (2.4)

where D is the dimensionality of the Y vectors.
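Equations 2.2 through 2.4 define a generative process that is straightforward to simulate. The following is a minimal sketch assuming NumPy; the function name and the rotating-state example are ours, not the article's.

```python
import numpy as np

def sample_lds(A, C, Q, R, x1, T, rng=None):
    """Sample a trajectory from the linear-gaussian state-space model
    X_t = A X_{t-1} + w_t,  Y_t = C X_t + v_t  (equations 2.2 and 2.3)."""
    rng = np.random.default_rng(rng)
    K, D = A.shape[0], C.shape[0]
    X = np.zeros((T, K))
    Y = np.zeros((T, D))
    X[0] = x1
    for t in range(T):
        if t > 0:
            X[t] = A @ X[t - 1] + rng.multivariate_normal(np.zeros(K), Q)
        Y[t] = C @ X[t] + rng.multivariate_normal(np.zeros(D), R)
    return X, Y

# Example: a slowly rotating two-dimensional state observed in one dimension.
A = np.array([[np.cos(0.1), -np.sin(0.1)],
              [np.sin(0.1),  np.cos(0.1)]])
C = np.array([[1.0, 0.0]])
X, Y = sample_lds(A, C, Q=0.01 * np.eye(2), R=np.array([[0.1]]),
                  x1=np.array([1.0, 0.0]), T=100)
```

Because both noise terms are gaussian and the maps are linear, every marginal and conditional of this process is gaussian, which is what makes the exact inference recursions below tractable.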

Often the observation vector can be divided into input (or predictor) variables and output (or response) variables. To model the input-output behavior of such a system, the conditional probability of output sequences given input sequences, the linear gaussian SSM can be modified to have a state-transition function,

X_t = A X_{t-1} + B U_t + w_t,   (2.5)

where U_t is the input observation vector and B is the (fixed) input matrix.¹

The problem of inference or state estimation for an SSM with known parameters consists of estimating the posterior probabilities of the hidden variables given a sequence of observed variables. Since the local likelihood functions for the observations are gaussian and the priors for the hidden states are gaussian, the resulting posterior is also gaussian. Three special cases of the inference problem are often considered: filtering, smoothing, and prediction (Anderson & Moore, 1979; Goodwin & Sin, 1984). The goal of filtering is to compute the probability of the current hidden state X_t given the sequence of inputs and outputs up to time t, P(X_t | {Y}_1^t, {U}_1^t).² The recursive algorithm used to perform this computation is known as the Kalman filter (Kalman & Bucy, 1961). The goal of smoothing is to compute the probability of X_t given the sequence of inputs and outputs up to time T, where T > t. The Kalman filter is used in the forward direction to compute the probability of X_t given {Y}_1^t and {U}_1^t. A similar set of backward recursions from T to t completes the computation by accounting for the observations after time t (Rauch, 1963). We will refer to the combined forward and backward recursions for smoothing as the Kalman smoothing recursions (also known as the RTS, or Rauch-Tung-Striebel smoother). Finally, the goal of prediction is to compute the probability of future states and observations given observations up to time t. Given P(X_t | {Y}_1^t, {U}_1^t) computed as before, the model is simulated in the forward direction using equations 2.2 (or 2.5 if there are inputs) and 2.3 to compute the probability density of the state or output at future time t + τ.
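The filtering recursions just described can be sketched in a textbook predict/update form. This is our own NumPy sketch (without the input term of equation 2.5), not code from the article.

```python
import numpy as np

def kalman_filter(Y, A, C, Q, R, mu1, V1):
    """Forward (filtering) recursions: returns means and covariances of
    P(X_t | Y_1, ..., Y_t) for the model of equations 2.2 and 2.3."""
    T = Y.shape[0]
    K = A.shape[0]
    mus, Vs = np.zeros((T, K)), np.zeros((T, K, K))
    mu_pred, V_pred = mu1, V1                     # prior on X_1
    for t in range(T):
        # Update: condition the predicted gaussian on the observation Y_t.
        S = C @ V_pred @ C.T + R                  # innovation covariance
        Kt = V_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
        mus[t] = mu_pred + Kt @ (Y[t] - C @ mu_pred)
        Vs[t] = V_pred - Kt @ C @ V_pred
        # Predict: propagate the filtered estimate through the dynamics.
        mu_pred = A @ mus[t]
        V_pred = A @ Vs[t] @ A.T + Q
    return mus, Vs
```

The smoothing recursions add a backward pass over these filtered estimates; we omit that pass here for brevity.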
The problem of learning the parameters of an SSM is known in engineering as the system identification problem; in its most general form it assumes access only to sequences of input and output observations. We focus on maximum likelihood learning in which a single (locally optimal) value of the parameters is estimated, rather than Bayesian approaches that treat the parameters as random variables and compute or approximate the posterior distribution of the parameters given the data. One can also distinguish between on-line and off-line approaches to learning. On-line recursive algorithms, favored in real-time adaptive control applications, can be obtained by computing the gradient or the second derivatives of the log-likelihood (Ljung

¹ One can also define the state such that X_{t+1} = A X_t + B U_t + w_t.
² The notation {Y}_1^t is shorthand for the sequence Y_1, …, Y_t.

& Söderström, 1983). Similar gradient-based methods can be obtained for off-line methods. An alternative method for off-line learning makes use of the expectation-maximization (EM) algorithm (Dempster, Laird, & Rubin, 1977). This procedure iterates between an E-step that fixes the current parameters and computes posterior probabilities over the hidden states given the observations, and an M-step that maximizes the expected log-likelihood of the parameters using the posterior distribution computed in the E-step. For linear gaussian state-space models, the E-step is exactly the Kalman smoothing problem as defined above, and the M-step simplifies to a linear regression problem (Shumway & Stoffer, 1982; Digalakis, Rohlicek, & Ostendorf, 1993). Details on the EM algorithm for SSMs can be found in Ghahramani and Hinton (1996a), as well as in the original Shumway and Stoffer (1982) article.

2.2 Hidden Markov Models. Hidden Markov models also define probability distributions over sequences of observations {Y_t}. The distribution over sequences is obtained by specifying a distribution over observations at each time step t given a discrete hidden state S_t, and the probability of transitioning from one hidden state to another. Using the Markov property, the joint probability for the sequences of states S_t and observations Y_t can be factored in exactly the same manner as equation 2.1, with S_t taking the place of X_t:

P({S_t, Y_t}) = P(S_1) P(Y_1 | S_1) ∏_{t=2}^{T} P(S_t | S_{t-1}) P(Y_t | S_t).   (2.6)

Similarly, the conditional independencies in an HMM can be expressed graphically in the same form as Figure 1. The state is represented by a single multinomial variable that can take one of K discrete values, S_t ∈ {1, …, K}. The state transition probabilities, P(S_t | S_{t-1}), are specified by a K × K transition matrix. If the observables are discrete symbols taking on one of L values, the observation probabilities P(Y_t | S_t) can be fully specified as a K × L observation matrix. For a continuous observation vector, P(Y_t | S_t) can be modeled in many different forms, such as a gaussian, mixture of gaussians, or neural network.
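As an illustration, the likelihood of a sequence under equation 2.6, marginalized over the hidden states, can be computed with the standard forward recursion. This sketch assumes NumPy and a precomputed matrix of observation probabilities; the function name and argument layout are ours.

```python
import numpy as np

def hmm_loglik(pi, Phi, obs):
    """Log-likelihood of a sequence under an HMM via the forward recursion.
    pi[k] = P(S_1 = k); Phi[i, j] = P(S_t = j | S_{t-1} = i);
    obs[t, k] = P(Y_t | S_t = k).  Normalizing alpha at each step keeps the
    recursion numerically stable for long sequences."""
    T, K = obs.shape
    alpha = pi * obs[0]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, T):
        alpha = obs[t] * (alpha @ Phi)
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll
```

For a single-state HMM this reduces to summing the log observation probabilities, which is a convenient sanity check.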
HMMs have been applied extensively to problems in speech recognition (Juang & Rabiner, 1991), computational biology (Baldi, Chauvin, Hunkapiller, & McClure, 1994), and fault detection (Smyth, 1994). Given an HMM with known parameters and a sequence of observations, two algorithms are commonly used to solve two different forms of the inference problem (Rabiner & Juang, 1986). The first computes the posterior probabilities of the hidden states using a recursive algorithm known as the forward-backward algorithm. The computations in the forward pass are exactly analogous to the Kalman filter for SSMs, and the computations in the backward pass are analogous to the backward pass of the Kalman smoothing

equations. As noted by Bridle (pers. comm., 1985) and Smyth, Heckerman, and Jordan (1997), the forward-backward algorithm is a special case of exact inference algorithms for more general graphical probabilistic models (Lauritzen & Spiegelhalter, 1988; Pearl, 1988). The same observation holds true for the Kalman smoothing recursions. The other inference problem commonly posed for HMMs is to compute the single most likely sequence of hidden states. The solution to this problem is given by the Viterbi algorithm, which also consists of a forward and backward pass through the model.

To learn maximum likelihood parameters for an HMM given sequences of observations, one can use the well-known Baum-Welch algorithm (Baum, Petrie, Soules, & Weiss, 1970). This algorithm is a special case of EM that uses the forward-backward algorithm to infer the posterior probabilities of the hidden states in the E-step. The M-step uses expected counts of transitions and observations to reestimate the transition and output matrices (or linear regression equations in the case where the observations are gaussian distributed). Like SSMs, HMMs can be augmented to allow for input variables, such that they model the conditional distribution of sequences of output observations given sequences of inputs (Cacciatore & Nowlan, 1994; Bengio & Frasconi, 1995; Meila & Jordan, 1996).

2.3 Hybrids. A burgeoning literature on models that combine the discrete transition structure of HMMs with the linear dynamics of SSMs has developed in fields ranging from econometrics to control engineering (Harrison & Stevens, 1976; Chang & Athans, 1978; Hamilton, 1989; Shumway & Stoffer, 1991; Bar-Shalom & Li, 1993; Deng, 1993; Kadirkamanathan & Kadirkamanathan, 1996; Chaer, Bishop, & Ghosh, 1997). These models are known alternately as hybrid models, SSMs with switching, and jump-linear systems. We briefly review some of this literature, including some related neural network models.³
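The Viterbi algorithm mentioned above admits a compact dynamic-programming sketch. We work in the log domain and assume NumPy; the function name and argument layout are ours, with `log_obs[t, k] = log P(Y_t | S_t = k)`.

```python
import numpy as np

def viterbi(log_pi, log_Phi, log_obs):
    """Most likely hidden state sequence for an HMM (log domain):
    a forward max-product pass followed by a backward trace."""
    T, K = log_obs.shape
    delta = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    delta[0] = log_pi + log_obs[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_Phi   # scores[i, j]: from i to j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_obs[t]
    states = np.zeros(T, dtype=int)
    states[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                 # backward trace
        states[t] = back[t + 1, states[t + 1]]
    return states
```

Unlike forward-backward, which yields per-time-step marginals, this returns a single jointly most probable path; the two can disagree.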
Shortly after Kalman and Bucy solved the problem of state estimation for linear gaussian SSMs, attention turned to the analogous problem for switching models (Ackerson & Fu, 1970). Chang and Athans (1978) derive the equations for computing the conditional mean and variance of the state when the parameters of a linear SSM switch according to arbitrary and Markovian dynamics. The prior and transition probabilities of the switching process are assumed to be known. They note that for M models (sets of parameters) and an observation length T, the exact conditional distribution of the state is a gaussian mixture with M^T components. The conditional mean and variance, which require far less computation, are therefore only summary statistics.

³ A review of how SSMs and HMMs are related to simpler statistical models such as principal components analysis, factor analysis, mixture of gaussians, vector quantization, and independent components analysis (ICA) can be found in Roweis and Ghahramani (1999).

Figure 2: Directed acyclic graphs specifying conditional independence relations for various switching state-space models. (a) Shumway and Stoffer (1991): the output matrix (C in equation 2.3) switches independently between a fixed number of choices at each time step. Its setting is represented by the discrete hidden variable S_t; (b) Bar-Shalom and Li (1993): both the output equation and the dynamic equation can switch, and the switches are Markov; (c) Kim (1994); (d) Fraser and Dimitriadis (1993): outputs and states are observed. Here we have shown a simple case where the output depends directly on the current state, previous state, and previous output.

Shumway and Stoffer (1991) consider the problem of learning the parameters of SSMs with a single real-valued hidden state vector and switching output matrices. The probability of choosing a particular output matrix is a prespecified time-varying function, independent of previous choices (see Figure 2a). A pseudo-EM algorithm is derived in which the E-step, which

in its exact form would require computing a gaussian mixture with M^T components, is approximated by a single gaussian at each time step.

Bar-Shalom and Li (1993; sec. 11.6) review models in which both the state dynamics and the output matrices switch, and where the switching follows Markovian dynamics (see Figure 2b). They present several methods for approximately solving the state-estimation problem in switching models (they do not discuss parameter estimation for such models). These methods, referred to as generalized pseudo-Bayesian (GPB) and interacting multiple models (IMM), are all based on the idea of collapsing into one gaussian the mixture of M gaussians that results from considering all the settings of the switch state at a given time step. This avoids the exponential growth of mixture components at the cost of providing an approximate solution. More sophisticated but computationally expensive methods that collapse M² gaussians into M gaussians are also derived. Kim (1994) derives a similar approximation for a closely related model, which also includes observed input variables (see Figure 2c). Furthermore, Kim discusses parameter estimation for this model, although without making reference to the EM algorithm. Other authors have used Markov chain Monte Carlo methods for state and parameter estimation in switching models (Carter & Kohn, 1994; Athaide, 1995) and in other related dynamic probabilistic networks (Dean & Kanazawa, 1989; Kanazawa, Koller, & Russell, 1995).

Hamilton (1989, 1994, sec. 22.4) describes a class of switching models in which the real-valued observation at time t, Y_t, depends on both the observations at times t-1 to t-r and the discrete states at time t to t-r. More precisely, Y_t is gaussian with mean that is a linear function of Y_{t-1}, …, Y_{t-r} and of binary indicator variables for the discrete states, S_t, …, S_{t-r}. The system can therefore be seen as an (r + 1)th order HMM driving an rth order autoregressive process, and is tractable for small r and number of discrete states in S.

Hamilton's models are closely related to hidden filter HMMs (HFHMM; Fraser & Dimitriadis, 1993).
HFHMMs have both discrete and real-valued states. However, the real-valued states are assumed to be either observed or a known, deterministic function of the past observations (i.e., an embedding). The outputs depend on the states and previous outputs, and the form of this dependence can switch randomly (see Figure 2d). Because at any time step the only hidden variable is the switch state, S_t, exact inference in this model can be carried out tractably. The resulting algorithm is a variant of the forward-backward procedure for HMMs. Kehagias and Petridis (1997) and Pawelzik, Kohlmorgen, and Müller (1996) present other variants of this model.

Elliott, Aggoun, and Moore (1995; sec. 12.5) present an inference algorithm for hybrid (Markov switching) systems for which there is a separate observable from which the switch state can be estimated. The true switch states, S_t, are represented as unit vectors in R^M, and the estimated switch state is a vector in the unit square with elements corresponding to the estimated probability of being in each switch state. The real-valued state, X_t, is approximated as a gaussian given the estimated switch state by forming a linear combination of the transition and observation matrices for the different SSMs weighted by the estimated switch state. Elliott et al. also derive control equations for such hybrid systems and discuss applications of the change-of-measures whitening procedure to a large family of models.

With regard to the literature on neural computation, the model presented in this article is a generalization of both the mixture-of-experts neural network (Jacobs et al., 1991; Jordan & Jacobs, 1994) and the related mixture of factor analyzers (Hinton, Dayan, & Revow, 1997; Ghahramani & Hinton, 1996b). Previous dynamical generalizations of the mixture-of-experts architecture consider the case in which the gating network has Markovian dynamics (Cacciatore & Nowlan, 1994; Kadirkamanathan & Kadirkamanathan, 1996; Meila & Jordan, 1996). One limitation of this generalization is that the entire past sequence is summarized in the value of a single discrete variable (the gating activation), which for a system with M experts can convey on average at most log M bits of information about the past. In the models we consider here, both the experts and the gating network have Markovian dynamics. The past is therefore summarized by a state composed of the cross-product of the discrete variable and the combined real-valued state-space of all the experts. This provides a much wider information channel from the past. One advantage of this representation is that the real-valued state can contain componential structure. Thus, attributes such as the position, orientation, and scale of an object in an image, which are most naturally encoded as independent real-valued variables, can be accommodated in the state without the exponential growth required of a discretized HMM-like representation.

It is important to place the work in this article in the context of the literature we have just reviewed.
The hybrid models, state-space models with switching, and jump-linear systems we have described all assume a single real-valued state vector. The model considered in this article generalizes this to multiple real-valued state vectors.⁴ Unlike the models described in Hamilton (1994), Fraser and Dimitriadis (1993), and the current dynamical extensions of mixtures of experts, in the model we present, the real-valued state vectors are hidden. The inference algorithm we derive, which is based on making a structured variational approximation, is entirely novel in the context of switching SSMs. Specifically, our method is unlike all the approximate methods we have reviewed in that it is not based on fitting a single gaussian to a mixture of gaussians by computing the mean and covariance of the mixture.⁵ We derive a learning algorithm for all of the parameters of the model, including the Markov switching parameters. This algorithm maximizes a lower bound on the log-likelihood of the data rather than a heuristically motivated approximation to the likelihood. The algorithm has a simple and intuitive flavor: It decouples into forward-backward recursions on an HMM and Kalman smoothing recursions on each SSM. The states of the HMM determine the soft assignment of each observation to an SSM; the prediction errors of the SSMs determine the observation probabilities for the HMM.

⁴ Note that the state vectors could be concatenated into one large state vector with factorized (block-diagonal) transition matrices (cf. factorial hidden Markov model; Ghahramani & Jordan, 1997). However, this obscures the decoupled structure of the model.
⁵ Both classes of methods can be seen as minimizing Kullback-Leibler (KL) divergences. However, the KL divergence is asymmetrical, and whereas the variational methods minimize it in one direction, the methods that merge gaussians minimize it in the other direction. We return to this point in section 4.2.

3 The Generative Model

In switching SSMs, the sequence of observations {Y_t} is modeled by specifying a probabilistic relation between the observations and a hidden state space comprising M real-valued state vectors, X_t^(m), and one discrete state vector S_t. The discrete state, S_t, is modeled as a multinomial variable that can take on M values: S_t ∈ {1, …, M}; for reasons that will become obvious we refer to it as the switch variable. The joint probability of observations and hidden states can be factored as

P({S_t, X_t^(1), …, X_t^(M), Y_t}) = P(S_1) ∏_{t=2}^{T} P(S_t | S_{t-1}) ∏_{m=1}^{M} P(X_1^(m)) ∏_{t=2}^{T} P(X_t^(m) | X_{t-1}^(m)) ∏_{t=1}^{T} P(Y_t | X_t^(1), …, X_t^(M), S_t),   (3.1)

which corresponds graphically to the conditional independencies represented by Figure 3. Conditioned on a setting of the switch state, S_t = m, the observable is multivariate gaussian with output equation given by state-space model m. Notice that m is used as both an index for the real-valued state variables and a value for the switch state. The probability of the observation vector Y_t is therefore

P(Y_t | X_t^(1), …, X_t^(M), S_t = m) = (2π)^{-D/2} |R|^{-1/2} exp{ -(1/2) (Y_t - C^(m) X_t^(m))' R^{-1} (Y_t - C^(m) X_t^(m)) },   (3.2)

where R is the observation noise covariance matrix and C^(m) is the output matrix for SSM m (cf. equation 2.4 for a single linear-gaussian SSM). Each

Figure 3: (a) Graphical model representation for switching state-space models. S_t is the discrete switch variable, and X_t^(m) are the real-valued state vectors. (b) Switching state-space model depicted as a generalization of the mixture of experts. The dashed arrows correspond to the connections in a mixture of experts. In a switching state-space model, the states of the experts and the gating network also depend on their previous states (solid arrows).

real-valued state vector evolves according to the linear gaussian dynamics of an SSM with differing initial state, transition matrix, and state noise (see equation 2.2). For simplicity we assume that all state vectors have identical dimensionality; the generalization of the algorithms we present to models with different-sized state spaces is immediate. The switch state itself evolves according to the discrete Markov transition structure specified by the initial state probabilities P(S_1) and the state transition matrix P(S_t | S_{t-1}).

An exact analogy can be made to the mixture-of-experts architecture for modular learning in neural networks (see Figure 3b; Jacobs et al., 1991). Each SSM is a linear expert with gaussian output noise model and linear-gaussian dynamics. The switch state gates the outputs of the M SSMs, and therefore plays the role of a gating network with Markovian dynamics.

There are many possible extensions of the model; we shall consider three obvious and straightforward ones:

Ex1) Differing output covariances, R^(m), for each SSM;
Ex2) Differing output means, μ_Y^(m), for each SSM, such that each model is allowed to capture observations in a different operating range;
Ex3) Conditioning on a sequence of observed input vectors, {U_t}.
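The generative process of equations 3.1 and 3.2 (without the three extensions) is easy to simulate: the M state chains evolve independently, and the switch chain picks which output equation generates each observation. A minimal NumPy sketch, with names of our own choosing:

```python
import numpy as np

def sample_switching_ssm(As, Cs, Qs, R, pi, Phi, T, rng=None):
    """Draw one sequence from the switching state-space generative model.
    As, Cs, Qs are length-M lists of per-model dynamics, output, and state
    noise matrices; pi = P(S_1); Phi[i, j] = P(S_t = j | S_{t-1} = i)."""
    rng = np.random.default_rng(rng)
    M = len(As)
    K = As[0].shape[0]
    D = Cs[0].shape[0]
    X = np.zeros((T, M, K))
    Y = np.zeros((T, D))
    S = np.zeros(T, dtype=int)
    S[0] = rng.choice(M, p=pi)
    for t in range(T):
        if t > 0:
            S[t] = rng.choice(M, p=Phi[S[t - 1]])     # Markov switch
        for m in range(M):                            # all M chains evolve
            mean = As[m] @ X[t - 1, m] if t > 0 else np.zeros(K)
            X[t, m] = rng.multivariate_normal(mean, Qs[m])
        m = S[t]                                      # active model emits Y_t
        Y[t] = rng.multivariate_normal(Cs[m] @ X[t, m], R)
    return S, X, Y
```

Note that, as in the model, every chain keeps evolving even when it is not generating the observation; only the output equation is gated by S_t.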

4 Learning

An efficient learning algorithm for the parameters of a switching SSM can be derived by generalizing the EM algorithm (Baum et al., 1970; Dempster et al., 1977). EM alternates between optimizing a distribution over the hidden states (the E-step) and optimizing the parameters given the distribution over hidden states (the M-step). Any distribution over the hidden states, Q({S_t, X_t}), where X_t = [X_t^(1), …, X_t^(M)] is the combined state of the M SSMs, can be used to define a lower bound, B, on the log probability of the observed data:

log P({Y_t} | θ) = log Σ_{{S_t}} ∫ P({S_t, X_t, Y_t} | θ) d{X_t}   (4.1)

= log Σ_{{S_t}} ∫ Q({S_t, X_t}) [ P({S_t, X_t, Y_t} | θ) / Q({S_t, X_t}) ] d{X_t}   (4.2)

≥ Σ_{{S_t}} ∫ Q({S_t, X_t}) log [ P({S_t, X_t, Y_t} | θ) / Q({S_t, X_t}) ] d{X_t} = B(Q, θ),   (4.3)

where θ denotes the parameters of the model and we have made use of Jensen's inequality (Cover & Thomas, 1991) to establish equation 4.3. Both steps of EM increase the lower bound on the log probability of the observed data. The E-step holds the parameters fixed and sets Q to be the posterior distribution over the hidden states given the parameters,

Q({S_t, X_t}) = P({S_t, X_t} | {Y_t}, θ).   (4.4)

This maximizes B with respect to the distribution, turning the lower bound into an equality, which can be easily seen by substitution. The M-step holds the distribution fixed and computes the parameters that maximize B for that distribution. Since B = log P({Y_t} | θ) at the start of the M-step and since the E-step does not affect log P, the two steps combined can never decrease log P. Given the change in the parameters produced by the M-step, the distribution produced by the previous E-step is typically no longer optimal, so the whole procedure must be iterated.

Unfortunately, the exact E-step for switching SSMs is intractable. Like the related hybrid models described in section 2.3, the posterior probability of the real-valued states is a gaussian mixture with M^T terms. This can be seen by using the semantics of directed graphs, in particular the d-separation criterion (Pearl, 1988), which implies that the hidden state variables in Figure 3, while marginally independent, become conditionally dependent given the observation sequence.
This induced dependency effectively couples all of the real-valued hidden state variables to the discrete switch variable, as a

consequence of which the exact posteriors become gaussian mixtures with an exponential number of terms.⁶

In order to derive an efficient learning algorithm for this system, we relax the EM algorithm by approximating the posterior probability of the hidden states. The basic idea is that since expectations with respect to P are intractable, rather than setting Q({S_t, X_t}) = P({S_t, X_t} | {Y_t}) in the E-step, a tractable distribution Q is used to approximate P. This results in an EM learning algorithm that maximizes a lower bound on the log-likelihood. The difference between the bound B and the log-likelihood is given by the Kullback-Leibler (KL) divergence between Q and P:

KL(Q ‖ P) = Σ_{{S_t}} ∫ Q({S_t, X_t}) log [ Q({S_t, X_t}) / P({S_t, X_t} | {Y_t}) ] d{X_t}.   (4.5)

Since the complexity of exact inference in the approximation given by Q is determined by its conditional independence relations, not by its parameters, we can choose Q to have a tractable structure: a graphical representation that eliminates some of the dependencies in P. Given this structure, the parameters of Q are varied to obtain the tightest possible bound by minimizing equation 4.5. Therefore, the algorithm alternates between optimizing the parameters of the distribution Q to minimize equation 4.5 (the E-step) and optimizing the parameters of P given the distribution over the hidden states (the M-step). As in exact EM, both steps increase the lower bound B on the log-likelihood; however, equality is not reached in the E-step.

We will refer to the general strategy of using a parameterized approximating distribution as a variational approximation and refer to the free parameters of the distribution as variational parameters. A completely factorized approximation is often used in statistical physics, where it provides the basis for simple yet powerful mean-field approximations to statistical mechanical systems (Parisi, 1988). Theoretical arguments motivating approximate E-steps are presented in Neal and Hinton (1998; originally in a technical report in 1993).
Saul and Jordan (1996) showed that approximate E-steps could be used to maximize a lower bound on the log-likelihood, and proposed the powerful technique of structured variational approximations to intractable probabilistic networks. The key insight of their work, which this article makes use of, is that by judicious use of an approximation Q, exact inference algorithms can be used on the tractable substructures in an intractable network. A general tutorial on variational approximations can be found in Jordan, Ghahramani, Jaakkola, and Saul (1998).

⁶ The intractability of the E-step or smoothing problem in the simpler single-state switching model has been noted by Ackerson and Fu (1970), Chang and Athans (1978), Bar-Shalom and Li (1993), and others.

Figure 4: Graphical model representation for the structured variational approximation to the posterior distribution of the hidden states of a switching state-space model.

The parameters of the switching SSM are θ = {A^(m), C^(m), Q^(m), μ_X^(m), Q_1^(m), R, π, Φ}, where A^(m) is the state dynamics matrix for model m, C^(m) is its output matrix, Q^(m) is its state noise covariance, μ_X^(m) is the mean of the initial state, Q_1^(m) is the covariance of the initial state, R is the (tied) output noise covariance, π = P(S_1) is the prior for the discrete Markov process, and Φ = P(S_t | S_{t-1}) is the discrete transition matrix. Extensions (Ex1) through (Ex3) can be readily implemented by substituting R^(m) for R, adding means μ_Y^(m), and adding input matrices B^(m).

Although there are many possible approximations to the posterior distribution of the hidden variables that one could use for learning and inference in switching SSMs, we focus on the following:

Q({S_t, X_t}) = (1/Z_Q) [ ψ(S_1) ∏_{t=2}^{T} ψ(S_{t-1}, S_t) ] ∏_{m=1}^{M} ψ(X_1^(m)) ∏_{t=2}^{T} ψ(X_{t-1}^(m), X_t^(m)),   (4.6)

where the ψ are unnormalized probabilities, which we will call potential functions and define soon, and Z_Q is a normalization constant ensuring that Q integrates to one. Although Q has been written in terms of potential functions rather than conditional probabilities, it corresponds to the simple graphical model shown in Figure 4. The terms involving the switch variables S_t define a discrete Markov chain, and the terms involving the state vectors X_t^(m) define M uncoupled SSMs. As in mean-field approximations, we have approximated the stochastically coupled system by removing some of the

couplings of the original system. Specifically, we have removed the stochastic coupling between the chains that results from the fact that the observation at time t depends on all the hidden variables at time t. However, we retain the coupling between the hidden variables at successive time steps since these couplings can be handled exactly using the forward-backward and Kalman smoothing recursions. This approximation is therefore structured, in the sense that not all variables are uncoupled.

The discrete switching process is defined by

ψ(S_1 = m) = P(S_1 = m) q_1^(m)   (4.7)

ψ(S_{t-1}, S_t = m) = P(S_t = m | S_{t-1}) q_t^(m),   (4.8)

where the q_t^(m) are variational parameters of the Q distribution. These parameters scale the probabilities of each of the states of the switch variable at each time step, so that q_t^(m) plays exactly the same role that the observation probability P(Y_t | S_t = m) would play in a regular HMM. We will soon see that minimizing KL(Q ‖ P) results in an equation for q_t^(m) that supports this intuition.

The uncoupled SSMs in the approximation Q are also defined by potential functions that are related to probabilities in the original system. These potentials are the prior and transition probabilities for X^(m) multiplied by a factor that changes these potentials to try to account for the data:

ψ(X_1^(m)) = P(X_1^(m)) [ P(Y_1 | X_1^(m), S_1 = m) ]^{h_1^(m)}   (4.9)

ψ(X_{t-1}^(m), X_t^(m)) = P(X_t^(m) | X_{t-1}^(m)) [ P(Y_t | X_t^(m), S_t = m) ]^{h_t^(m)},   (4.10)

where the h_t^(m) are variational parameters of Q. The vector h_t plays a role very similar to the switch variable S_t. Each component h_t^(m) can range between 0 and 1. When h_t^(m) = 0, the posterior probability of X_t^(m) under Q does not depend on the observation at time t, Y_t. When h_t^(m) = 1, the posterior probability of X_t^(m) under Q includes a term that assumes that SSM m generated Y_t. We call h_t^(m) the responsibility assigned to SSM m for the observation vector Y_t. The difference between h_t^(m) and S_t^(m) is that h_t^(m) is a deterministic parameter, while S_t^(m) is a stochastic random variable.

To maximize the lower bound on the log-likelihood, KL(Q ‖ P) is minimized with respect to the variational parameters h_t^(m) and q_t^(m) separately for each sequence of observations.
Using the definition of P for the switching state-space model (equations 3.1 and 3.2) and the approximating distribution Q, the minimum of KL satisfies the following fixed-point equations for the variational parameters (see appendix B):

h_t^(m) = Q(S_t = m)  (4.11)

q_t^(m) = exp{ −(1/2) ⟨(Y_t − C^(m) X_t^(m))′ R^{−1} (Y_t − C^(m) X_t^(m))⟩ },  (4.12)

where ⟨·⟩ denotes expectation over the Q distribution. Intuitively, the responsibility h_t^(m) is equal to the probability under Q that SSM m generated observation vector Y_t, and q_t^(m) is an unnormalized gaussian function of the expected squared error if SSM m generated Y_t.

To compute h_t^(m), it is necessary to sum Q over all the S_τ variables not including S_t. This can be done efficiently using the forward-backward algorithm on the switch state variables, with q_t^(m) playing exactly the same role as an observation probability associated with each setting of the switch variable. Since q_t^(m) is related to the prediction error of model m on data Y_t, this has the intuitive interpretation that the switch state associated with models with smaller expected prediction error on a particular observation will be favored at that time step. However, the forward-backward algorithm ensures that the final responsibilities for the models are obtained after considering the entire sequence of observations.

To compute q_t^(m), it is necessary to calculate the expectations of X_t^(m) and X_t^(m) X_t^(m)′ under Q. We see this by expanding equation 4.12:

q_t^(m) = exp{ −(1/2) Y_t′ R^{−1} Y_t + Y_t′ R^{−1} C^(m) ⟨X_t^(m)⟩ − (1/2) tr[ C^(m)′ R^{−1} C^(m) ⟨X_t^(m) X_t^(m)′⟩ ] },  (4.13)

where tr is the matrix trace operator, and we have used tr(AB) = tr(BA). The expectations of X_t^(m) and X_t^(m) X_t^(m)′ can be computed efficiently using the Kalman smoothing algorithm on each SSM, where for model m at time t, the data are weighted by the responsibilities h_t^(m).^7 Since the h parameters depend on the q parameters, and vice versa, the whole process has to be iterated, where each iteration involves calls to the forward-backward and Kalman smoothing algorithms. Once the iterations have converged, the E-step outputs the expected values of the hidden variables under the final Q. The M-step computes the model parameters that optimize the expectation of the log-likelihood (see equation B.7), which is a function of the expectations of the hidden variables. For switching SSMs, all the parameter reestimates can be computed analytically.
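As a concrete illustration of these two fixed-point computations, here is a minimal numpy sketch (not from the paper; array shapes, function names, and the scaled forward-backward implementation are our assumptions). The first function evaluates the potentials q_t^(m) of equation 4.13 from Kalman-smoothed moments; the second computes the responsibilities h_t^(m) = Q(S_t = m) of equation 4.11 by running the forward-backward algorithm with q_t^(m) standing in for the observation probabilities:

```python
import numpy as np

def q_potentials(Y, C, R, Ex, Exx):
    """Eq. 4.13 (sketch): unnormalized gaussian potentials q_t^(m) for one
    SSM m, from smoothed moments <X_t> (Ex, shape T x K) and
    <X_t X_t'> (Exx, shape T x K x K). Y: T x D, C: D x K, R: D x D."""
    Rinv = np.linalg.inv(R)
    T = Y.shape[0]
    log_q = np.empty(T)
    for t in range(T):
        log_q[t] = (-0.5 * Y[t] @ Rinv @ Y[t]
                    + Y[t] @ Rinv @ C @ Ex[t]
                    - 0.5 * np.trace(C.T @ Rinv @ C @ Exx[t]))
    return np.exp(log_q)

def responsibilities(q, pi, Phi):
    """Eq. 4.11 (sketch): h_t^(m) = Q(S_t = m) via forward-backward,
    with q (T x M) playing the role of the observation probabilities.
    pi: (M,) prior; Phi[i, j] = P(S_t = j | S_{t-1} = i)."""
    T, M = q.shape
    alpha = np.empty((T, M))
    beta = np.empty((T, M))
    alpha[0] = pi * q[0]
    alpha[0] /= alpha[0].sum()          # scaled forward pass
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ Phi) * q[t]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0                      # scaled backward pass
    for t in range(T - 2, -1, -1):
        beta[t] = Phi @ (q[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    h = alpha * beta
    return h / h.sum(axis=1, keepdims=True)
```

In an E-step sweep, one would alternate these with per-SSM Kalman smoothing until the h and q values stop changing.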
For example, taking derivatives of the expectation of equation B.7 with respect to C^(m) and setting them to zero,

^7 Weighting the data by h_t^(m) is equivalent to running the Kalman smoother on the unweighted data using a time-varying observation noise covariance matrix R_t^(m) = R / h_t^(m).
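The equivalence stated in footnote 7 is easy to see in a single Kalman measurement update. The following sketch (ours, not the paper's; variable names are assumptions) inflates the observation noise to R/h: with h = 1 the observation is weighted fully, and as h → 0 the gain vanishes and the observation is ignored:

```python
import numpy as np

def measurement_update(x_pred, P_pred, y, C, R, h):
    """One Kalman measurement update with responsibility-weighted data,
    implemented as a time-varying noise covariance R_eff = R / h."""
    R_eff = R / max(h, 1e-12)               # footnote 7: R_t = R / h_t
    S = C @ P_pred @ C.T + R_eff            # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)     # Kalman gain
    x = x_pred + K @ (y - C @ x_pred)       # corrected mean
    P = P_pred - K @ C @ P_pred             # corrected covariance
    return x, P
```

With a tiny responsibility the posterior mean stays at the prediction, which is exactly the behavior of the exponent h_t^(m) in equations 4.9 and 4.10.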

Figure 5: Learning algorithm for switching state-space models.

we get

Ĉ^(m) = [ ∑_{t=1}^T ⟨S_t^(m)⟩ Y_t ⟨X_t^(m)⟩′ ] [ ∑_{t=1}^T ⟨S_t^(m)⟩ ⟨X_t^(m) X_t^(m)′⟩ ]^{−1},  (4.14)

which is a weighted version of the reestimation equations for SSMs. Similarly, the reestimation equations for the switch process are analogous to the Baum-Welch update rules for HMMs. The learning algorithm for switching state-space models using the above structured variational approximation is summarized in Figure 5.

4.1 Deterministic Annealing. The KL divergence minimized in the E-step of the variational EM algorithm can have multiple minima in general. One way to visualize these minima is to consider the space of all possible segmentations of an observation sequence of length T, where by segmentation we mean a discrete partition of the sequence between the SSMs. If there are M SSMs, then there are M^T possible segmentations of the sequence. Given one such segmentation, inferring the optimal distribution for the real-valued states of the SSMs is a convex optimization problem, since these real-valued states are conditionally gaussian. So the difficulty in the KL minimization lies in trying to find the best (soft) partition of the data.
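The weighted reestimate of equation 4.14 is a one-liner given the smoothed statistics. Here is a minimal sketch (ours; shapes and names are assumptions) for a single SSM m:

```python
import numpy as np

def reestimate_C(Y, h, Ex, Exx):
    """Eq. 4.14 (sketch): responsibility-weighted reestimate of the
    output matrix C^(m). Y: (T, D) observations; h: (T,) responsibilities
    <S_t^(m)>; Ex: (T, K) smoothed means <X_t^(m)>; Exx: (T, K, K)
    smoothed second moments <X_t^(m) X_t^(m)'>."""
    num = np.einsum('t,td,tk->dk', h, Y, Ex)    # sum_t <S_t> Y_t <X_t>'
    den = np.einsum('t,tjk->jk', h, Exx)        # sum_t <S_t> <X_t X_t'>
    return num @ np.linalg.inv(den)
```

When the data are exactly linear in the smoothed means, the update recovers the true output matrix, which is a quick sanity check on the weighted normal equations.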

As in other combinatorial optimization problems, the possibility of getting trapped in local minima can be reduced by gradually annealing the cost function. We can employ a deterministic variant of the annealing idea by making the following simple modifications to the variational fixed-point equations 4.11 and 4.12:

h_t^(m) = Q(S_t = m)  (4.15)
q_t^(m) = exp{ −(1/2T) ⟨(Y_t − C^(m) X_t^(m))′ R^{−1} (Y_t − C^(m) X_t^(m))⟩ }.  (4.16)

Here T is a temperature parameter, which is initialized to a large value and gradually reduced to 1. The above equations maximize a modified form of the bound B in equation 4.3, where the entropy of Q has been multiplied by T (Ueda & Nakano, 1995).

4.2 Merging Gaussians. Almost all the approximate inference methods that are described in the literature for switching SSMs are based on the idea of merging, at each time step, a mixture of gaussians into one gaussian. The merged gaussian is obtained by setting its mean and covariance equal to the mean and covariance of the mixture. Here we briefly describe, as an alternative to the variational approximation methods we have derived, how this more traditional gaussian merging procedure can be applied to the model we have defined.

In the switching state-space models described in section 3, there are M different SSMs, with possibly different state-space dimensionalities, so it would be inappropriate to merge their states into one gaussian. However, it is still possible to apply a gaussian merging technique by considering each SSM separately. In each SSM m, the hidden state density produces at each time step a mixture of two gaussians: one for the case S_t = m and one for S_t ≠ m. We merge these two gaussians, weighted by the current estimates of P(S_t = m | Y_1, …, Y_t) and P(S_t ≠ m | Y_1, …, Y_t), respectively. This merged gaussian is used to obtain the gaussian prior for X_{t+1}^(m) for the next time step. We implemented a forward-pass version of this approximate inference scheme, which is analogous to the IMM procedure described in Bar-Shalom and Li (1993). This procedure finds at each time step the best gaussian fit to the current mixture of gaussians for each SSM.
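The annealed update only rescales the exponent of equation 4.12 by the temperature, which flattens the distinction between the SSMs early on. A small sketch of the tempered potential and of a geometric-style temperature schedule (our reading of the schedule used later in section 5.1; names are assumptions):

```python
import numpy as np

def annealed_log_q(sq_err, temperature):
    """Eq. 4.16 (sketch): log q_t^(m) = -sq_err / (2 * temperature), where
    sq_err is the expected squared prediction error
    <(Y_t - C X_t)' R^{-1} (Y_t - C X_t)> for each t, m."""
    return -0.5 * np.asarray(sq_err) / temperature

def temperature_schedule(T0=100.0, n_iter=12):
    """Decay T_{i+1} = T_i / 2 + 1/2, taking the temperature from T0
    toward its fixed point at 1 (as in the experiments of section 5.1)."""
    temps = [float(T0)]
    for _ in range(n_iter - 1):
        temps.append(0.5 * temps[-1] + 0.5)
    return temps
```

At high temperature the potentials for models with very different prediction errors are nearly equal, so the segmentation stays soft; as the temperature approaches 1, equation 4.16 reduces to equation 4.12.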
If we denote the approximating gaussian by Q and the mixture being approximated by P, best is defined here as minimizing KL(P‖Q). Furthermore, gaussian merging techniques are greedy, in that the best gaussian is computed at every time step and used immediately for the next time step. For a gaussian Q, KL(P‖Q) has no local minima, and it is very easy to find the optimal Q by computing the first two moments of P. Inaccuracies in this greedy procedure arise because the estimates of P(S_t | Y_1, …, Y_t) are based on this single merged gaussian, not on the real mixture.
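Computing those first two moments is the whole merge step. A minimal sketch (ours; names and shapes are assumptions) of moment-matching a gaussian mixture, which is the KL(P‖Q)-optimal single gaussian:

```python
import numpy as np

def merge_gaussians(w, means, covs):
    """Moment-matching merge of a gaussian mixture into one gaussian.
    w: (n,) mixture weights summing to 1; means: (n, K); covs: (n, K, K).
    Returns the mixture mean and covariance (within + between terms)."""
    w = np.asarray(w, dtype=float)
    mu = np.einsum('i,ik->k', w, means)               # mixture mean
    diff = np.asarray(means) - mu
    cov = (np.einsum('i,ijk->jk', w, covs)            # average covariance
           + np.einsum('i,ij,ik->jk', w, diff, diff)) # spread of the means
    return mu, cov
```

Note how the merged covariance inflates by the spread of the component means, which is what lets a single gaussian summarize a two-component posterior without discarding the uncertainty about the switch.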

In contrast, variational methods seek to minimize KL(Q‖P), which can have many local minima. Moreover, these methods are not greedy in the same sense: they iterate forward and backward in time until obtaining a locally optimal Q.

5 Simulations

5.1 Experiment 1: Variational Segmentation and Deterministic Annealing. The goal of this experiment was to assess the quality of solutions found by the variational inference algorithm and the effect of using deterministic annealing on these solutions. We generated 200 sequences of length 200 from a simple model that switched between two SSMs. These SSMs and the switching process were defined by:

X_t^(1) = 0.99 X_{t−1}^(1) + w_t^(1),  w_t^(1) ~ N(0, 1)  (5.1)
X_t^(2) = 0.9 X_{t−1}^(2) + w_t^(2),  w_t^(2) ~ N(0, 10)  (5.2)
Y_t = X_t^(m) + v_t,  v_t ~ N(0, 0.1),  (5.3)

where the switch state m was chosen using priors π^(1) = π^(2) = 1/2 and transition probabilities Φ_11 = Φ_22 = 0.95; Φ_12 = Φ_21 = 0.05. Five sequences from this data set are shown in Figure 6, along with the true state of the switch variable.

We compared three different inference algorithms: variational inference, variational inference with deterministic annealing (section 4.1), and inference by gaussian merging (section 4.2). For each sequence, we initialized the variational inference algorithms with equal responsibilities for the two SSMs and ran them for 12 iterations. The nonannealed inference algorithm ran at a fixed temperature of T = 1, while the annealed algorithm was initialized to a temperature of T = 100, which decayed down to 1 over the 12 iterations, using the decay function T_{i+1} = (1/2) T_i + 1/2. To eliminate the effect of model inaccuracies, we gave all three inference algorithms the true parameters of the generative model.

The segmentations found by the nonannealed variational inference algorithm showed little similarity to the true segmentations of the data (see Figure 7). Furthermore, the nonannealed algorithm generally underestimated the number of switches, often converging on solutions with no switches at all. Both the annealed variational algorithm and the gaussian merging method found segmentations that were more similar to the true segmentations of the data.
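A generator for this synthetic model is a few lines of numpy; the sketch below (ours, not the paper's code; 0/1 indices stand for SSMs 1/2) samples one sequence from equations 5.1 through 5.3. Both state vectors evolve at every step, and the switch variable selects which one drives the observation:

```python
import numpy as np

def generate(T=200, seed=0):
    """Sample one sequence from the two-SSM switching model of
    eqs. 5.1-5.3 (sketch): returns observations Y and true switches S."""
    rng = np.random.default_rng(seed)
    Phi = np.array([[0.95, 0.05],     # Phi_11 = Phi_22 = 0.95
                    [0.05, 0.95]])    # Phi_12 = Phi_21 = 0.05
    a = [0.99, 0.9]                   # state dynamics of the two SSMs
    q = [1.0, 10.0]                   # state noise variances
    s = rng.choice(2)                 # pi(1) = pi(2) = 1/2
    x = [0.0, 0.0]
    S = np.empty(T, dtype=int)
    Y = np.empty(T)
    for t in range(T):
        # both chains evolve regardless of the switch state
        x = [a[m] * x[m] + rng.normal(0.0, np.sqrt(q[m])) for m in range(2)]
        Y[t] = x[s] + rng.normal(0.0, np.sqrt(0.1))   # v_t ~ N(0, 0.1)
        S[t] = s
        s = rng.choice(2, p=Phi[s])   # Markov switch transition
    return Y, S
```

Because the two processes overlap heavily in value, segmenting such sequences from the dynamics alone is genuinely hard, which is the point of the experiment.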
Comparing percentage correct segmentations, we see that annealing substantially improves the variational inference method and that the gaussian merging and annealed variational methods perform comparably (see Figure 8). The average performance of the annealed variational method is only about 1.3% better than gaussian merging.
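The percentage-correct measure reported in Figure 8 simply counts agreeing time steps; a one-line sketch (ours):

```python
import numpy as np

def percent_correct(true_seg, est_seg):
    """Percentage of time steps at which the true and estimated
    segmentations agree (the measure histogrammed in Figure 8)."""
    true_seg = np.asarray(true_seg)
    est_seg = np.asarray(est_seg)
    return 100.0 * np.mean(true_seg == est_seg)
```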

Figure 6: Five data sequences of length 200, with their true segmentations below them. In the segmentations, switch states 1 and 2 are represented by the presence and absence of dots, respectively. Notice that it is difficult to segment the sequences correctly based only on knowing the dynamics of the two processes.

5.2 Experiment 2: Modeling Respiration in a Patient with Sleep Apnea. Switching state-space models should prove useful in modeling time series that have dynamics characterized by several different regimes. To illustrate this point, we examined a physiological data set from a patient tentatively diagnosed with sleep apnea, a medical condition in which an individual intermittently stops breathing during sleep. The data were obtained from the repository of time-series data sets associated with the Santa Fe Time Series Analysis and Prediction Competition (Weigend & Gershenfeld, 1993) and are described in detail in Rigney et al. (1993).^8 The respiration pattern in sleep apnea is characterized by at least two regimes: no breathing and gasping breathing induced by a reflex arousal. Furthermore, in this patient there also seem to be periods of normal rhythmic breathing (see Figure 9).

^8 The data are available online at http://…aweigend/Time-Series/SantaFe.html#setB. We used 1000 samples for training and 1000 for testing.

Figure 7: For 10 different sequences of length 200, segmentations are shown with presence and absence of dots corresponding to the two SSMs generating these data. The rows are the segmentations found using the variational method with no annealing, the variational method with deterministic annealing, the gaussian merging method, and the true segmentation. All three inference algorithms give real-valued h_t^(m); hard segmentations were obtained by thresholding the final h_t^(m) values at 0.5. The first five sequences are the ones shown in Figure 6.

Figure 8: Histograms of percentage correct segmentations: (a) control, using random segmentation; (b) variational inference without annealing; (c) variational inference with annealing; (d) gaussian merging. Percentage correct segmentation was computed by counting the number of time steps for which the true and estimated segmentations agree.

We trained switching SSMs, varying the random seed, the number of components in the mixture (M = 2 to 5), and the dimensionality of the state space in each component (K = 1 to 10), on a data set consisting of 1000 consecutive measurements of the chest volume. As controls, we also trained simple SSMs (i.e., M = 1), varying the dimension of the state space from K = 1 to 10, and simple HMMs (i.e., K = 0), varying the number of discrete hidden states from M = 2 to M = 50. Simulations were run until convergence or for 200 iterations, whichever came first; convergence was assessed by measuring the change in the likelihood (or bound on the likelihood) over consecutive steps of EM. The likelihood of the simple SSMs and the HMMs was calculated on a test set consisting of 1000 consecutive measurements of the chest volume. For the switching SSMs, the likelihood is intractable, so we calculated the lower bound on the likelihood, B. The simple SSMs modeled the data very poorly for K = 1, and the performance was flat for values of K = 2 to 10 (see

Figure 9: Chest volume (respiration force) of a patient with sleep apnea during two noncontinuous time segments of the same night (measurements sampled at 2 Hz). (a) Training data. Apnea is characterized by extended periods of small variability in chest volume, followed by bursts (gasping). Here we see such behavior around t = 250, followed by normal rhythmic breathing. (b) Test data. In this segment we find several instances of apnea and an approximately rhythmic region. (The thick lines at the bottom of each plot are explained in the main text.)

Figure 10a). The large majority of runs of the switching state-space model resulted in models with higher likelihood than those of the simple SSMs (see Figures 10b–10e). One consistent exception should be noted: for values of M = 2 and K = 6 to 10, the switching SSM performed almost identically to the simple SSM. Exploratory experiments suggest that in these cases, a single component takes responsibility for all the data, so the model has M = 1 effectively. This may be a local minimum problem or a result of poor initialization heuristics. Looking at the learning curves for simple and switching SSMs, it is easy to see that there are plateaus at the solutions found by the simple one-component SSMs that the switching SSM can get caught in (see Figure 11).

Figure 10: Log likelihood (nats per observation) on the test data from a total of almost 400 runs of simple state-space models (a), switching state-space models with differing numbers of components (b–e), and hidden Markov models (f).

The likelihoods for HMMs with around 10 to 20 states were comparable to those of the best switching SSMs (see Figure 10f). Purely in terms of coding efficiency, switching SSMs have little advantage over HMMs on these data. However, it is useful to contrast the solutions learned by HMMs with the solutions learned by the switching SSMs. The thick dots at the bottom of Figures 9a and 9b show the responsibility assigned to one of the two components in a fairly typical switching SSM with M = 2 components of state size K = 2. This component has clearly specialized to modeling the data during periods of apnea, while the other component models the gasps and periods of rhythmic breathing. These two switching components provide a much more intuitive model of the data than the 10 to 20 discrete components needed in an HMM with comparable coding efficiency.^9

^9 By using further assumptions to constrain the model, such as continuity of the real-valued hidden state at switch times, it should be possible to obtain even better performance on these data.

Figure 11: Learning curves for a state-space model (K = 4) and a switching state-space model (M = 2, K = 2).

6 Discussion

The main conclusion we can draw from the first series of experiments is that even when given the correct model parameters, the problem of segmenting a switching time series into its components is difficult. There are combinatorially many alternatives to be considered, and the energy surface suffers from many local minima, so local optimization approaches like the variational method we used are limited by the quality of the initial conditions. Deterministic annealing can be thought of as a sophisticated initialization procedure for the hidden states: the final solution at each temperature provides the initial conditions at the next. We found that annealing substantially improved the quality of the segmentations found.

The first experiment also indicates that the much simpler gaussian merging method performs comparably to annealed variational inference. The gaussian merging methods have the advantage that at each time step, the cost function minimized has no local minima. This may account for how well they perform relative to the nonannealed variational method. On the other hand, the variational methods have the advantage that they iteratively improve their approximation to the posterior, and they define a lower bound

26 856 Zoubin Ghahramani and Geoffrey E. Hinon on he likelihood. Our resuls sugges ha i may be very fruiful o use he gaussian merging mehod o iniialize he variaional inference procedure. Furhermore, i is possible o derive variaional approximaions for oher swiching models described in he lieraure, and a combinaion of gaussian merging and variaional approximaion may provide a fas and robus mehod for learning and inference in hose models. he second series of experimens suggess ha on a real daa se believed o have swiching dynamics, he swiching SS can indeed uncover muliple regimes. When i capures hese regimes, i generalizes o he es se much beer han he simple linear dynamical model. Similar coding efficiency can be obained by using Hs, which due o he discree naure of he sae-space, can model nonlinear dynamics. However, in doing so, he Hs had o use 0 o 20 discree saes, which makes heir soluions less inerpreable. Variaional approximaions provide a powerful ool for inference and learning in complex probabilisic models. We have seen ha when applied o he swiching SS, hey can incorporae wihin a single framework wellknown exac inference mehods like Kalman smoohing and he forwardbackward algorihm. Variaional mehods can be applied o many of he oher classes of inracable swiching models described in secion 2.3. However, raining more complex models also makes apparen he imporance of good mehods for model selecion and iniializaion. o summarize, swiching SSs are a dynamical generalizaion of mixure-of-expers neural neworks, are closely relaed o well-known models in economerics and conrol, and combine he represenaions underlying Hs and linear dynamical sysems. For domains in which we have some a priori belief ha here are muliple, approximaely linear dynamical regimes, swiching SSs provide a naural modeling ool. Variaional approximaions provide a mehod o overcome he mos difficul problem in learning swiching SSs: ha he inference sep is inracable. 
Deterministic annealing further improves on the solutions found by the variational method.

Appendix A: Notation

Symbol     Size       Description
Variables
Y_t        D × 1      observation vector at time t
{Y_t}      D × T      sequence of observation vectors [Y_1, Y_2, …, Y_T]
X_t^(m)    K × 1      state vector of state-space model (SSM) m at time t
X_t        MK × 1     entire real-valued hidden state at time t: X_t = (X_t^(1), …, X_t^(M))


More information

Air Traffic Forecast Empirical Research Based on the MCMC Method

Air Traffic Forecast Empirical Research Based on the MCMC Method Compuer and Informaion Science; Vol. 5, No. 5; 0 ISSN 93-8989 E-ISSN 93-8997 Published by Canadian Cener of Science and Educaion Air Traffic Forecas Empirical Research Based on he MCMC Mehod Jian-bo Wang,

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

Object tracking: Using HMMs to estimate the geographical location of fish

Object tracking: Using HMMs to estimate the geographical location of fish Objec racking: Using HMMs o esimae he geographical locaion of fish 02433 - Hidden Markov Models Marin Wæver Pedersen, Henrik Madsen Course week 13 MWP, compiled June 8, 2011 Objecive: Locae fish from agging

More information

ACE 562 Fall Lecture 4: Simple Linear Regression Model: Specification and Estimation. by Professor Scott H. Irwin

ACE 562 Fall Lecture 4: Simple Linear Regression Model: Specification and Estimation. by Professor Scott H. Irwin ACE 56 Fall 005 Lecure 4: Simple Linear Regression Model: Specificaion and Esimaion by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Simple Regression: Economic and Saisical Model

More information

Let us start with a two dimensional case. We consider a vector ( x,

Let us start with a two dimensional case. We consider a vector ( x, Roaion marices We consider now roaion marices in wo and hree dimensions. We sar wih wo dimensions since wo dimensions are easier han hree o undersand, and one dimension is a lile oo simple. However, our

More information

Time series model fitting via Kalman smoothing and EM estimation in TimeModels.jl

Time series model fitting via Kalman smoothing and EM estimation in TimeModels.jl Time series model fiing via Kalman smoohing and EM esimaion in TimeModels.jl Gord Sephen Las updaed: January 206 Conens Inroducion 2. Moivaion and Acknowledgemens....................... 2.2 Noaion......................................

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks -

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks - Deep Learning: Theory, Techniques & Applicaions - Recurren Neural Neworks - Prof. Maeo Maeucci maeo.maeucci@polimi.i Deparmen of Elecronics, Informaion and Bioengineering Arificial Inelligence and Roboics

More information

Inferring Dynamic Dependency with Applications to Link Analysis

Inferring Dynamic Dependency with Applications to Link Analysis Inferring Dynamic Dependency wih Applicaions o Link Analysis Michael R. Siracusa Massachuses Insiue of Technology 77 Massachuses Ave. Cambridge, MA 239 John W. Fisher III Massachuses Insiue of Technology

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H.

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H. ACE 56 Fall 5 Lecure 8: The Simple Linear Regression Model: R, Reporing he Resuls and Predicion by Professor Sco H. Irwin Required Readings: Griffihs, Hill and Judge. "Explaining Variaion in he Dependen

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION DOI: 0.038/NCLIMATE893 Temporal resoluion and DICE * Supplemenal Informaion Alex L. Maren and Sephen C. Newbold Naional Cener for Environmenal Economics, US Environmenal Proecion

More information

Block Diagram of a DCS in 411

Block Diagram of a DCS in 411 Informaion source Forma A/D From oher sources Pulse modu. Muliplex Bandpass modu. X M h: channel impulse response m i g i s i Digial inpu Digial oupu iming and synchronizaion Digial baseband/ bandpass

More information

Nature Neuroscience: doi: /nn Supplementary Figure 1. Spike-count autocorrelations in time.

Nature Neuroscience: doi: /nn Supplementary Figure 1. Spike-count autocorrelations in time. Supplemenary Figure 1 Spike-coun auocorrelaions in ime. Normalized auocorrelaion marices are shown for each area in a daase. The marix shows he mean correlaion of he spike coun in each ime bin wih he spike

More information

14 Autoregressive Moving Average Models

14 Autoregressive Moving Average Models 14 Auoregressive Moving Average Models In his chaper an imporan parameric family of saionary ime series is inroduced, he family of he auoregressive moving average, or ARMA, processes. For a large class

More information

Random Walk with Anti-Correlated Steps

Random Walk with Anti-Correlated Steps Random Walk wih Ani-Correlaed Seps John Noga Dirk Wagner 2 Absrac We conjecure he expeced value of random walks wih ani-correlaed seps o be exacly. We suppor his conjecure wih 2 plausibiliy argumens and

More information

References are appeared in the last slide. Last update: (1393/08/19)

References are appeared in the last slide. Last update: (1393/08/19) SYSEM IDEIFICAIO Ali Karimpour Associae Professor Ferdowsi Universi of Mashhad References are appeared in he las slide. Las updae: 0..204 393/08/9 Lecure 5 lecure 5 Parameer Esimaion Mehods opics o be

More information

m = 41 members n = 27 (nonfounders), f = 14 (founders) 8 markers from chromosome 19

m = 41 members n = 27 (nonfounders), f = 14 (founders) 8 markers from chromosome 19 Sequenial Imporance Sampling (SIS) AKA Paricle Filering, Sequenial Impuaion (Kong, Liu, Wong, 994) For many problems, sampling direcly from he arge disribuion is difficul or impossible. One reason possible

More information

Západočeská Univerzita v Plzni, Czech Republic and Groupe ESIEE Paris, France

Západočeská Univerzita v Plzni, Czech Republic and Groupe ESIEE Paris, France ADAPTIVE SIGNAL PROCESSING USING MAXIMUM ENTROPY ON THE MEAN METHOD AND MONTE CARLO ANALYSIS Pavla Holejšovsá, Ing. *), Z. Peroua, Ing. **), J.-F. Bercher, Prof. Assis. ***) Západočesá Univerzia v Plzni,

More information

Two Coupled Oscillators / Normal Modes

Two Coupled Oscillators / Normal Modes Lecure 3 Phys 3750 Two Coupled Oscillaors / Normal Modes Overview and Moivaion: Today we ake a small, bu significan, sep owards wave moion. We will no ye observe waves, bu his sep is imporan in is own

More information

Isolated-word speech recognition using hidden Markov models

Isolated-word speech recognition using hidden Markov models Isolaed-word speech recogniion using hidden Markov models Håkon Sandsmark December 18, 21 1 Inroducion Speech recogniion is a challenging problem on which much work has been done he las decades. Some of

More information

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power Alpaydin Chaper, Michell Chaper 7 Alpaydin slides are in urquoise. Ehem Alpaydin, copyrigh: The MIT Press, 010. alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/ ehem/imle All oher slides are based on Michell.

More information

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still. Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in

More information

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LDA, logisic

More information

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power Alpaydin Chaper, Michell Chaper 7 Alpaydin slides are in urquoise. Ehem Alpaydin, copyrigh: The MIT Press, 010. alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/ ehem/imle All oher slides are based on Michell.

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar CONROL OF SOCHASIC SYSEMS P.R. Kumar Deparmen of Elecrical and Compuer Engineering, and Coordinaed Science Laboraory, Universiy of Illinois, Urbana-Champaign, USA. Keywords: Markov chains, ransiion probabiliies,

More information

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data Chaper 2 Models, Censoring, and Likelihood for Failure-Time Daa William Q. Meeker and Luis A. Escobar Iowa Sae Universiy and Louisiana Sae Universiy Copyrigh 1998-2008 W. Q. Meeker and L. A. Escobar. Based

More information

2.3 SCHRÖDINGER AND HEISENBERG REPRESENTATIONS

2.3 SCHRÖDINGER AND HEISENBERG REPRESENTATIONS Andrei Tokmakoff, MIT Deparmen of Chemisry, 2/22/2007 2-17 2.3 SCHRÖDINGER AND HEISENBERG REPRESENTATIONS The mahemaical formulaion of he dynamics of a quanum sysem is no unique. So far we have described

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Noes for EE7C Spring 018: Convex Opimizaion and Approximaion Insrucor: Moriz Hard Email: hard+ee7c@berkeley.edu Graduae Insrucor: Max Simchowiz Email: msimchow+ee7c@berkeley.edu Ocober 15, 018 3

More information

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004 Augmened Realiy II Kalman Filers Gudrun Klinker May 25, 2004 Ouline Moivaion Discree Kalman Filer Modeled Process Compuing Model Parameers Algorihm Exended Kalman Filer Kalman Filer for Sensor Fusion Lieraure

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

Anno accademico 2006/2007. Davide Migliore

Anno accademico 2006/2007. Davide Migliore Roboica Anno accademico 2006/2007 Davide Migliore migliore@ele.polimi.i Today Eercise session: An Off-side roblem Robo Vision Task Measuring NBA layers erformance robabilisic Roboics Inroducion The Bayesian

More information

Approximation Algorithms for Unique Games via Orthogonal Separators

Approximation Algorithms for Unique Games via Orthogonal Separators Approximaion Algorihms for Unique Games via Orhogonal Separaors Lecure noes by Konsanin Makarychev. Lecure noes are based on he papers [CMM06a, CMM06b, LM4]. Unique Games In hese lecure noes, we define

More information

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check

More information

In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should

In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should Cambridge Universiy Press 978--36-60033-7 Cambridge Inernaional AS and A Level Mahemaics: Mechanics Coursebook Excerp More Informaion Chaper The moion of projeciles In his chaper he model of free moion

More information

hen found from Bayes rule. Specically, he prior disribuion is given by p( ) = N( ; ^ ; r ) (.3) where r is he prior variance (we add on he random drif

hen found from Bayes rule. Specically, he prior disribuion is given by p( ) = N( ; ^ ; r ) (.3) where r is he prior variance (we add on he random drif Chaper Kalman Filers. Inroducion We describe Bayesian Learning for sequenial esimaion of parameers (eg. means, AR coeciens). The updae procedures are known as Kalman Filers. We show how Dynamic Linear

More information

Tracking. Announcements

Tracking. Announcements Tracking Tuesday, Nov 24 Krisen Grauman UT Ausin Announcemens Pse 5 ou onigh, due 12/4 Shorer assignmen Auo exension il 12/8 I will no hold office hours omorrow 5 6 pm due o Thanksgiving 1 Las ime: Moion

More information

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LTU, decision

More information

Lecture 2 October ε-approximation of 2-player zero-sum games

Lecture 2 October ε-approximation of 2-player zero-sum games Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion

More information

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17 EES 16A Designing Informaion Devices and Sysems I Spring 019 Lecure Noes Noe 17 17.1 apaciive ouchscreen In he las noe, we saw ha a capacior consiss of wo pieces on conducive maerial separaed by a nonconducive

More information

Hidden Markov Models

Hidden Markov Models Hidden Markov Models Probabilisic reasoning over ime So far, we ve mosly deal wih episodic environmens Excepions: games wih muliple moves, planning In paricular, he Bayesian neworks we ve seen so far describe

More information

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi Creep in Viscoelasic Subsances Numerical mehods o calculae he coefficiens of he Prony equaion using creep es daa and Herediary Inegrals Mehod Navnee Saini, Mayank Goyal, Vishal Bansal (23); Term Projec

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

Probabilistic Robotics

Probabilistic Robotics Probabilisic Roboics Bayes Filer Implemenaions Gaussian filers Bayes Filer Reminder Predicion bel p u bel d Correcion bel η p z bel Gaussians : ~ π e p N p - Univariae / / : ~ μ μ μ e p Ν p d π Mulivariae

More information

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006 2.160 Sysem Idenificaion, Esimaion, and Learning Lecure Noes No. 8 March 6, 2006 4.9 Eended Kalman Filer In many pracical problems, he process dynamics are nonlinear. w Process Dynamics v y u Model (Linearized)

More information

Lecture 33: November 29

Lecture 33: November 29 36-705: Inermediae Saisics Fall 2017 Lecurer: Siva Balakrishnan Lecure 33: November 29 Today we will coninue discussing he boosrap, and hen ry o undersand why i works in a simple case. In he las lecure

More information

WATER LEVEL TRACKING WITH CONDENSATION ALGORITHM

WATER LEVEL TRACKING WITH CONDENSATION ALGORITHM WATER LEVEL TRACKING WITH CONDENSATION ALGORITHM Shinsuke KOBAYASHI, Shogo MURAMATSU, Hisakazu KIKUCHI, Masahiro IWAHASHI Dep. of Elecrical and Elecronic Eng., Niigaa Universiy, 8050 2-no-cho Igarashi,

More information

Failure of the work-hamiltonian connection for free energy calculations. Abstract

Failure of the work-hamiltonian connection for free energy calculations. Abstract Failure of he work-hamilonian connecion for free energy calculaions Jose M. G. Vilar 1 and J. Miguel Rubi 1 Compuaional Biology Program, Memorial Sloan-Keering Cancer Cener, 175 York Avenue, New York,

More information

RC, RL and RLC circuits

RC, RL and RLC circuits Name Dae Time o Complee h m Parner Course/ Secion / Grade RC, RL and RLC circuis Inroducion In his experimen we will invesigae he behavior of circuis conaining combinaions of resisors, capaciors, and inducors.

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Linear Gaussian State Space Models

Linear Gaussian State Space Models Linear Gaussian Sae Space Models Srucural Time Series Models Level and Trend Models Basic Srucural Model (BSM Dynamic Linear Models Sae Space Model Represenaion Level, Trend, and Seasonal Models Time Varying

More information