Combining Statistical and Knowledge-based Spoken Language Understanding in Conditional Models


COLING/ACL06, Association for Computational Linguistics, Sydney, Australia, 2006

Combining Statistical and Knowledge-based Spoken Language Understanding in Conditional Models

Ye-Yi Wang, Alex Acero, Milind Mahajan
Microsoft Research
One Microsoft Way
Redmond, WA 98052, USA

John Lee
Spoken Language Systems
MIT CSAIL
Cambridge, MA 02139, USA

Abstract

Spoken Language Understanding (SLU) addresses the problem of extracting semantic meaning conveyed in an utterance. The traditional knowledge-based approach to this problem is very expensive -- it requires joint expertise in natural language processing and speech recognition, and best practices in language engineering for every new domain. On the other hand, a statistical learning approach needs a large amount of annotated data for model training, which is seldom available in practical applications outside of large research labs. A generative HMM/CFG composite model, which integrates easy-to-obtain domain knowledge into a data-driven statistical learning framework, has previously been introduced to reduce the data requirement. The major contribution of this paper is the investigation of integrating prior knowledge and statistical learning in a conditional model framework. We also study and compare conditional random fields (CRFs) with perceptron learning for SLU. Experimental results show that the conditional models achieve more than 20% relative reduction in slot error rate over the HMM/CFG model, which had already achieved an SLU accuracy at the same level as the best results reported on the ATIS data.

1 Introduction

Spoken Language Understanding (SLU) addresses the problem of extracting meaning conveyed in an utterance. Traditionally, the problem is solved with a knowledge-based approach, which requires joint expertise in natural language processing and speech recognition, and best practices in language engineering for every new domain. In the past decade many statistical learning approaches have been proposed, most of which exploit generative models, as surveyed in (Wang, Deng et al., 2005). While the data-driven approach addresses the difficulties in knowledge engineering, it requires a large amount of labeled data for model training, which is seldom available in practical applications outside of large research labs.

To alleviate the problem, a generative HMM/CFG composite model has previously been introduced (Wang, Deng et al., 2005). It integrates a knowledge-based approach into a statistical learning framework, utilizing prior knowledge to compensate for the dearth of training data. In the ATIS evaluation (Price, 1990), this model achieves the same level of understanding accuracy (5.3% error rate on the standard ATIS evaluation) as the best system (5.5% error rate), which is a semantic parsing system based on a manually developed grammar.

Discriminative training has been widely used for acoustic modeling in speech recognition (Bahl, Brown et al., 1986; Juang, Chou et al., 1997; Povey and Woodland, 2002). Most of the methods use the same generative model framework, exploit the same features, and apply discriminative training for parameter optimization. Along the same lines, we have recently exploited conditional models by directly porting the HMM/CFG model to Hidden Conditional Random Fields (HCRFs) (Gunawardana, Mahajan et al., 2005), but failed to obtain any improvement. This is mainly due to the vast parameter space, with the parameters settling at local optima. We then simplified the original model structure by removing the hidden variables, and introduced a number of important overlapping and non-homogeneous features. The resulting Conditional Random Fields (CRFs) (Lafferty, McCallum et al., 2001) yielded a 21% relative improvement in SLU accuracy.
We also applied a much simpler perceptron learning algorithm to the conditional model and observed improved SLU accuracy as well. In this paper, we will first introduce the generative HMM/CFG composite model, then discuss the problem of directly porting the model to HCRFs, and finally introduce the CRFs and the features that obtain the best SLU result on the ATIS test data. We compare the CRF and perceptron training performances on the task.

2 Generative Models

The HMM/CFG composite model (Wang, Deng et al., 2005) adopts a pattern recognition approach to SLU. Given a word sequence W, an SLU component needs to find the semantic representation of the meaning M that has the maximum a posteriori probability Pr(M | W):

\[ \hat{M} = \arg\max_{M} \Pr(M \mid W) = \arg\max_{M} \Pr(W \mid M) \Pr(M) \]

The composite model integrates domain knowledge by setting the topology of the prior model, Pr(M), according to the domain semantics, and by using PCFG rules as part of the lexicalization model Pr(W | M).

The domain semantics define an application's semantic structure with semantic frames. Figure 1 shows a simplified example of three semantic frames in the ATIS domain. The two frames with the "toplevel" attribute are also known as commands. The filler attribute of a slot specifies the semantic object that can fill it. Each slot may be associated with a CFG rule, and the filler semantic object must be instantiated by a word string that is covered by that rule. For example, the string "Seattle" is covered by the City rule in a CFG. It can therefore fill the ACity (ArrivalCity) or the DCity (DepartureCity) slot, and instantiate a Flight frame. This frame can then fill the Flight slot of a ShowFlight frame. Figure 2 shows a semantic representation according to these frames.

    <frame name="ShowFlight" toplevel="1">
      <slot name="Flight" filler="Flight"/>
    </frame>
    <frame name="GroundTrans" toplevel="1">
      <slot name="City" filler="City"/>
    </frame>
    <frame name="Flight">
      <slot name="DCity" filler="City"/>
      <slot name="ACity" filler="City"/>
    </frame>

Figure 1. Simplified domain semantics for the ATIS domain.

The semantic prior model comprises the HMM topology and state transition probabilities. The topology is determined by the domain semantics, and the transition probabilities can be estimated from training data. Figure 3 shows the topology of the underlying states in the statistical model for the semantic frames in Figure 1. On top is the transition network for the two top-level commands. At the bottom is a zoomed-in view of the Flight sub-network. State 1 and state 4 are called precommands. State 3 and state 6 are called postcommands. States 2, 5, 8 and 9 represent slots. A slot is actually a three-state sequence: the slot state is preceded by a preamble state and followed by a postamble state, both represented by black circles. They provide contextual clues for the slot's identity.

    <ShowFlight>
      <Flight>
        <DCity filler="City">Seattle</DCity>
        <ACity filler="City">Boston</ACity>
      </Flight>
    </ShowFlight>

Figure 2. The semantic representation for "Show me the flights departing from Seattle arriving at Boston" is an instantiation of the semantic frames in Figure 1.

Figure 3. The HMM/CFG model's state topology, as determined by the semantic frames in Figure 1.

The lexicalization model, Pr(W | M), depicts the process of sentence generation from the topology by estimating the distribution of words emitted by a state. It uses state-dependent n-grams to model the precommands, postcommands, preambles and postambles, and uses knowledge-based CFG rules to model the slot fillers. These rules help compensate for the dearth of domain-specific data. In the remainder of this paper we will say a string is covered by a CFG non-terminal (NT), or equivalently, is CFG-covered for s, if the string can be parsed by the CFG rule corresponding to the slot s.
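To make the decomposition Pr(W | M) Pr(M) concrete, the following minimal Python sketch scores one fully aligned (word, state) sequence under a composite model. All state names, tables, and probabilities here are hypothetical toy values, not the paper's parameters; a real implementation would use smoothed per-state n-grams and a PCFG for the fillers, and would sum over hidden alignments with EM as discussed below.

    import math

    # Toy parameters (hypothetical, for illustration only).
    TRANSITIONS = {("<s>", "PreDCity"): 0.5, ("PreDCity", "DCity"): 1.0,
                   ("DCity", "PreACity"): 0.9, ("PreACity", "ACity"): 1.0}
    UNIGRAMS = {"PreDCity": {"from": 0.6, "departing": 0.4},
                "PreACity": {"to": 0.7, "arriving": 0.3}}
    CFG_COVERS = {"City": {"seattle", "boston"}}        # CFG rule: City covers these strings
    SLOT_FILLER = {"DCity": "City", "ACity": "City"}    # slot -> CFG non-terminal of its filler

    def log_score(words, states):
        """log Pr(M) + log Pr(W | M) for one aligned (word, state) sequence."""
        logp, prev = 0.0, "<s>"
        for w, s in zip(words, states):
            if s != prev:                               # semantic prior: state transitions
                logp += math.log(TRANSITIONS.get((prev, s), 1e-9))
            if s in SLOT_FILLER:                        # lexicalization: CFG-covered filler
                logp += 0.0 if w in CFG_COVERS[SLOT_FILLER[s]] else math.log(1e-9)
            else:                                       # lexicalization: state-dependent unigram
                logp += math.log(UNIGRAMS.get(s, {}).get(w, 1e-9))
            prev = s
        return logp

    print(log_score(["from", "seattle", "to", "boston"],
                    ["PreDCity", "DCity", "PreACity", "ACity"]))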

Given the semantic representation in Figure 2, the state sequence through the model topology in Figure 3 is deterministic, as shown in Figure 4. However, the words are not aligned to the states in the shaded boxes. The parameters in their corresponding n-gram models can be estimated with an EM algorithm that treats the alignments as hidden variables.

Figure 4. Word/state alignments. The segmentation of the word sequences in the shaded region is hidden.

The HMM/CFG composite model was evaluated in the ATIS domain (Price, 1990). The model was trained with ATIS3 category A training data (~1700 annotated sentences) and tested with the 1993 ATIS3 category A test sentences (470 sentences with 1702 reference slots). The slot insertion-deletion-substitution error rate (SER) on the test set is 5.0%, leading to a 5.3% semantic error rate in the standard end-to-end ATIS evaluation, which is slightly better than the best manually developed system (5.5%). Moreover, a steep drop in the error rate is observed after training with only the first two hundred sentences. This demonstrates that the inclusion of prior knowledge in the statistical model helps alleviate the data sparseness problem.

3 Conditional Models

We investigated the application of conditional models to SLU. The problem is formulated as assigning a label l to each element in an observation o. Here, o consists of a word sequence and a list of CFG non-terminals (NTs) that cover its subsequences, as illustrated in Figure 5. The task is to label "two" as the Num-of-tickets slot of the ShowFlight command, and "Washington D.C." as the ArrivalCity slot of the same command. To do so, the model must be able to resolve several kinds of ambiguities:

1. Filler/non-filler ambiguity, e.g., "two" can either fill a Num-of-tickets slot, or its homonym "to" can form part of the preamble of an ArrivalCity slot.
2. CFG ambiguity, e.g., "Washington" can be CFG-covered as either City or State.
3. Segmentation ambiguity, e.g., [Washington] [D.C.] vs. [Washington D.C.].
4. Semantic label ambiguity, e.g., "Washington D.C." can fill either an ArrivalCity or a DepartureCity slot.

Figure 5. The observation includes a word sequence and the subsequences covered by CFG non-terminals.

3.1 CRFs and HCRFs

Conditional Random Fields (CRFs) (Lafferty, McCallum et al., 2001) are undirected conditional graphical models that assign the conditional probability of a state (label) sequence s with respect to a vector of features f(s, o). They are of the following form:

\[ p(s \mid o; \lambda) = \frac{1}{z(o; \lambda)} \exp\left( \lambda \cdot f(s, o) \right) \qquad (1) \]

Here \( z(o; \lambda) = \sum_{s} \exp\left( \lambda \cdot f(s, o) \right) \) normalizes the distribution over all possible state sequences. The parameter vector λ is trained conditionally (discriminatively). If we assume that s is a Markov chain given o and the feature functions only depend on two adjacent states, then

\[ p(s \mid o; \lambda) = \frac{1}{z(o; \lambda)} \exp\left( \sum_{k} \sum_{t} \lambda_k f_k(s_{t-1}, s_t, o, t) \right) \qquad (2) \]
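As a concrete illustration of Eqs. (1) and (2), the sketch below computes p(s | o; λ) for a linear-chain model in plain Python. The label set, feature templates, and weights are toy assumptions rather than the paper's model, and the partition function z(o; λ) is computed by brute-force enumeration for clarity:

    import math
    from itertools import product

    STATES = ["PDC", "DCity", "PAC", "ACity"]           # toy label set

    def feats(prev_s, s, words, t):
        """Features on adjacent states and the current word, as in Eq. (2)."""
        return [("TR", prev_s, s), ("UG", s, words[t])]

    def seq_score(states, words, lam):
        return sum(lam.get(f, 0.0)
                   for t in range(len(words))
                   for f in feats(states[t - 1] if t else "<s>", states[t], words, t))

    def log_z(words, lam):
        """Partition function over all label sequences (brute force for clarity)."""
        return math.log(sum(math.exp(seq_score(list(ss), words, lam))
                            for ss in product(STATES, repeat=len(words))))

    def log_p(states, words, lam):
        """log p(s | o; lambda), Eq. (2)."""
        return seq_score(states, words, lam) - log_z(words, lam)

    lam = {("UG", "PAC", "to"): 2.0, ("TR", "PAC", "ACity"): 1.5}   # toy weights
    print(math.exp(log_p(["PAC", "ACity"], ["to", "boston"], lam)))

For real sentences the enumeration over all label sequences is intractable, which is exactly why the Markov assumption in Eq. (2) matters: it allows z(o; λ) to be computed with a forward recursion instead.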

In some cases, it may be natural to exploit features on variables that are not directly observed. For example, a feature for the Flight preamble may be defined in terms of an observed word and an unobserved state in the shaded region in Figure 4:

\[ f_{\text{FlightInit},\text{flights}}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_t = \text{FlightInit} \wedge o_t = \text{``flights''} \\ 0 & \text{otherwise} \end{cases} \qquad (3) \]

In this case, the state sequence s is only partially observed in the meaning representation M: M(s_5) = "DCity" and M(s_8) = "ACity" for the words "Seattle" and "Boston". The states for the remaining words are hidden. Let Γ(M) represent the set of all state sequences that satisfy the constraints imposed by M. To obtain the conditional probability of M, we need to sum over all possible labels for the hidden states:

\[ p(M \mid o; \lambda) = \frac{1}{z(o; \lambda)} \sum_{s \in \Gamma(M)} \exp\left( \sum_{k} \sum_{t} \lambda_k f_k(s_{t-1}, s_t, o, t) \right) \qquad (4) \]

CRFs with features dependent on hidden state variables are called Hidden Conditional Random Fields (HCRFs). They have been applied to tasks such as phonetic classification (Gunawardana, Mahajan et al., 2005) and object recognition (Quattoni, Collins et al., 2004).

3.2 Conditional Model Training

We train CRFs and HCRFs with gradient-based optimization algorithms that maximize the log posterior. The gradient of the objective function is

\[ \nabla_{\lambda} L(\lambda) = E_{P(s \mid o, l)}\left[ f(s, o) \right] - E_{P(s \mid o)}\left[ f(s, o) \right] \]

which is the difference between the conditional expectation of the feature vector given the observation sequence and the label sequence, and the conditional expectation given the observation sequence alone. With the Markov assumption in Eq. (2), these expectations can be computed using a forward-backward-like dynamic programming algorithm. For CRFs, whose features do not depend on hidden state sequences, the first expectation is simply the feature counts given the observation and label sequences.

In this work, we applied stochastic gradient descent (SGD) (Kushner and Yin, 1997) for parameter optimization. In our experiments on several different tasks, it is faster than L-BFGS (Nocedal and Wright, 1999), a quasi-Newton optimization algorithm.

3.3 CRFs and Perceptron Learning

Perceptron training for conditional models (Collins, 2002) is an approximation to the SGD algorithm, using feature counts from the Viterbi label sequence in lieu of expected feature counts. It eliminates the need for a forward-backward algorithm to collect the expected counts, and hence greatly speeds up model training. This algorithm can be viewed as using the minimum margin of a training example (i.e., the difference between the log conditional probability of the reference label sequence and that of the Viterbi label sequence) as the objective function instead of the conditional probability:

\[ L'(\lambda) = \log P(l \mid o; \lambda) - \max_{l'} \log P(l' \mid o; \lambda) \]

Here again, o is the observation and l is its reference label sequence. In perceptron training, the parameter updating stops when the Viterbi label sequence is the same as the reference label sequence. In contrast, the optimization based on the log posterior probability objective function keeps pulling probability mass from all incorrect label sequences to the reference label sequence until convergence. In both perceptron and CRF training, we average the parameters over training iterations (Collins, 2002).
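The contrast between the two updates can be sketched in a few lines of Python. The following toy implementation of the perceptron update (after Collins, 2002) uses Viterbi feature counts in place of the expected counts that CRF training would obtain from forward-backward; the label set and feature templates are illustrative assumptions, not the paper's:

    from collections import Counter, defaultdict

    STATES = ["PDC", "DCity", "PAC", "ACity"]           # toy label set

    def feats(prev_s, s, words, t):
        return [("TR", prev_s, s), ("UG", s, words[t])]

    def feat_counts(words, labels):
        c = Counter()
        for t in range(len(words)):
            c.update(feats(labels[t - 1] if t else "<s>", labels[t], words, t))
        return c

    def viterbi(words, lam):
        """Highest-scoring label sequence under the current weights."""
        paths = {"<s>": (0.0, [])}
        for t in range(len(words)):
            paths = {s: max((p + sum(lam[f] for f in feats(q, s, words, t)), path + [s])
                            for q, (p, path) in paths.items())
                     for s in STATES}
        return max(paths.values())[1]

    def perceptron_step(words, gold, lam):
        """Add reference feature counts, subtract Viterbi feature counts."""
        guess = viterbi(words, lam)
        if guess != gold:          # no update once the decode matches the reference
            delta = feat_counts(words, gold)
            delta.subtract(feat_counts(words, guess))
            for f, v in delta.items():
                lam[f] += v

    lam = defaultdict(float)
    for _ in range(3):             # a few passes over one toy example
        perceptron_step(["to", "boston"], ["PAC", "ACity"], lam)
    print(viterbi(["to", "boston"], lam))

The parameter averaging over training iterations that the paper applies to both algorithms is omitted from this sketch.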

4 Porting the HMM/CFG Model to HCRFs

In our first experiment, we would like to exploit the discriminative training capability of a conditional model without changing the HMM/CFG model's topology and feature set. Since the state sequence is only partially labeled, an HCRF is used to model the conditional distribution of the labels.

4.1 Features

We used the same state topology and features as those in the HMM/CFG composite model. The following indicator features are included:

Command prior features capture the a priori likelihood of different top-level commands:

\[ f^{PR}_{c}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } t = 0 \wedge C(s_t) = c \\ 0 & \text{otherwise} \end{cases}, \quad c \in \text{CommandSet} \]

Here C(s) stands for the name of the command that corresponds to the transition network containing state s.

State transition features capture the likelihood of transition from one state to another:

\[ f^{TR}_{s_1,s_2}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_{t-1} = s_1 \wedge s_t = s_2 \\ 0 & \text{otherwise} \end{cases} \]

where s_1 → s_2 is a legal transition according to the state topology.

Unigram and bigram features capture the likelihoods of words emitted by a state:

\[ f^{UG}_{s,w}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_t = s \wedge o_t = w \\ 0 & \text{otherwise} \end{cases} \]

\[ f^{BG}_{s,w_1 w_2}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_{t-1} = s \wedge s_t = s \wedge o_{t-1} = w_1 \wedge o_t = w_2 \\ 0 & \text{otherwise} \end{cases} \]

for all s such that ¬isFiller(s), and all w, w_1 w_2 in the training data. The condition ¬isFiller(s) restricts s to a pre- or postamble state rather than a slot filler state.
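Expressed as code, these templates are just indicator functions over (s_{t-1}, s_t, o, t). The sketch below is a hypothetical rendering, with stand-in implementations of C(s) and isFiller(s); the real model instantiates one such feature per command, per legal transition, and per observed n-gram:

    def command_of(state):                  # C(s): the command network containing s
        return "GroundTrans" if state.startswith("GT") else "ShowFlight"

    def is_filler(state):                   # isFiller(s): true for slot filler states
        return state in {"ACity", "DCity"}

    def f_pr(c):                            # command prior feature: fires at t = 0
        return lambda prev_s, s, o, t: int(t == 0 and command_of(s) == c)

    def f_tr(s1, s2):                       # state transition feature
        return lambda prev_s, s, o, t: int((prev_s, s) == (s1, s2))

    def f_ug(state, w):                     # unigram feature, only for non-filler states
        return lambda prev_s, s, o, t: int(not is_filler(state) and s == state and o[t] == w)

    def f_bg(state, w1, w2):                # bigram feature within one non-filler state
        def f(prev_s, s, o, t):
            return int(not is_filler(state) and (prev_s, s) == (state, state)
                       and t > 0 and (o[t - 1], o[t]) == (w1, w2))
        return f

    f = f_ug("PreACity", "to")              # example instantiation
    print(f("PDC", "PreACity", ["to", "boston"], 0))   # -> 1

The huge number of parameters produced by instantiating such indicators over a smoothed vocabulary is one of the issues discussed in the next subsection.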

4.2 Experiments

The model is trained with SGD, with the parameters initialized in two ways. The flat start initialization sets all parameters to 0. The generative model initialization uses the parameters trained by the HMM/CFG model. Figure 6 shows the test set slot error rates (SER) at different training iterations. With the flat start initialization (top curve), the error rate never comes close to the 5% baseline error rate of the HMM/CFG model. With the generative model initialization, the error rate is reduced to 4.8% at the second iteration, but the model quickly gets over-trained afterwards.

Figure 6. Test set slot error rates (in %) at different training iterations. The top curve is for the flat start initialization, the bottom for the generative model initialization.

The failure of the direct porting of the generative model to the conditional model can be attributed to the following reasons:

1. The conditional log-likelihood function is no longer a convex function due to the summation over hidden variables. This makes the model highly likely to settle on a local optimum. The fact that the flat start initialization failed to achieve the accuracy of the generative model initialization is a clear indication of the problem.

2. In order to account for words in the test data, the n-grams in the generative model are properly smoothed with back-offs to the uniform distribution over the vocabulary. This results in a huge number of parameters, many of which cannot be estimated reliably in the conditional model, given that model regularization is not as well studied as for n-grams.

3. The hidden variables make parameter estimation less reliable, given only a small amount of training data.

5 CRFs for SLU

An important lesson we have learned from the previous experiment is that we should not think generatively when applying conditional models. While it is important to find cues that help identify the slots, there is no need to exhaustively model the generation of every word in a sentence. Hence, the distinctions between pre- and postcommands, and between pre- and postambles, are no longer necessary. Every word that appears between two slots is labeled as the preamble state of the second slot, as illustrated in Figure 7. This labeling scheme effectively removes the hidden variables and simplifies the model to a CRF. It not only expedites model training, but also prevents parameters from settling at a local optimum, because the log conditional probability is now a convex function.

Figure 7. Once the slots are marked in the simplified model topology, the state sequence is fully marked, leaving no hidden variables and resulting in a CRF. Here, PAC stands for the preamble for arrival city, and PDC for the preamble for departure city.

The command prior and state transition features (with fewer states) are the same as in the HCRF model. For unigrams and bigrams, only those that occur in front of a CFG-covered string are considered. If the string is CFG-covered for slot s, then the unigram and bigram features for the preamble state of s are included. Suppose the words "that departs" occur at positions t-1 and t, in front of the word "Seattle", which is CFG-covered by the non-terminal City. Since City can fill a DepartureCity or an ArrivalCity slot, the four following features are introduced: \( f^{UG}_{\text{PDC,that}}, f^{UG}_{\text{PAC,that}}, f^{BG}_{\text{PDC,that departs}} \) and \( f^{BG}_{\text{PAC,that departs}} \). Formally,

\[ f^{UG}_{s,w}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_t = s \wedge o_t = w \\ 0 & \text{otherwise} \end{cases} \]

\[ f^{BG}_{s,w_1 w_2}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_{t-1} = s \wedge s_t = s \wedge o_{t-1} = w_1 \wedge o_t = w_2 \\ 0 & \text{otherwise} \end{cases} \]

for all s such that ¬isFiller(s), and all w, w_1 w_2 such that, in the training data, w and w_1 w_2 appear in front of a sequence that is CFG-covered for s.

5.1 Additional Features

One advantage of CRFs over generative models is the ease with which overlapping features can be incorporated. In this section, we describe three additional feature sets.

The first set addresses a side effect of not modeling the generation of every word in a sentence. Suppose a preamble state has never occurred in a position that is confusable with a slot state s, and a word that is CFG-covered for s has never occurred as part of the preamble state in the training data. Then the unigram feature of that word for the preamble state has weight 0, and there is thus no penalty for mislabeling the word as the preamble. This is one of the most common errors observed in the development set. The chunk coverage for preamble words feature is introduced to model the likelihood of a CFG-covered word being labeled as a preamble:

\[ f^{CC}_{c,NT}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } C(s_t) = c \wedge \text{covers}(NT, o_t) \wedge \text{isPre}(s_t) \\ 0 & \text{otherwise} \end{cases} \]

where isPre(s) indicates that s is a preamble state.

Often, the identity of a slot depends on the preamble of the previous slot. For example, "at two PM" is a DepartureTime in "flight from Seattle to Boston at two PM", but it is an ArrivalTime in "flight departing from Seattle arriving in Boston at two PM". In both cases, the previous slot is ArrivalCity, so the state transition features are not helpful for disambiguation. The identity of the time slot depends not on the ArrivalCity slot, but on its preamble. Our second feature set, previous-slot context, introduces this dependency to the model:

\[ f^{PC}_{s_1,s_2,w}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } s_{t-1} = s_1 \wedge s_t = s_2 \wedge w \in \Theta(s_{t-1}, o, t) \wedge \text{isFiller}(s_t) \wedge \text{Slot}(s_1) \neq \text{Slot}(s_2) \\ 0 & \text{otherwise} \end{cases} \]

Here Slot(s) stands for the slot associated with the state s, which can be a filler state or a preamble state, as shown in Figure 7. Θ(s, o, t) is the set of k words (where k is an adjustable window size) in front of the longest sequence that ends at position t-1 and that is CFG-covered by Slot(s).

The third feature set is intended to penalize erroneous segmentation, such as segmenting "Washington D.C." into two separate City slots. The chunk coverage for slot boundary feature is activated when a slot boundary is covered by a CFG non-terminal NT, i.e., when the words in two consecutive slots ("Washington" and "D.C.") can also be covered by one single slot:

\[ f^{SB}_{c,NT}(s_{t-1}, s_t, o, t) = \begin{cases} 1 & \text{if } C(s_t) = c \wedge \text{covers}(NT, o_{t-1} o_t) \wedge \text{isFiller}(s_{t-1}) \wedge \text{isFiller}(s_t) \wedge s_{t-1} \neq s_t \\ 0 & \text{otherwise} \end{cases} \]

This feature set shares its weights with the chunk coverage features for preamble words, and does not introduce any new parameters.

    Features                       # of Param.    SER
    Command Prior                  6
    +State Transition
    +Unigrams
    +Bigrams
    +Chunk Cov. Preamble Word
    +Previous-Slot Context
    +Chunk Cov. Slot Boundaries

Table 1. Number of additional parameters and the slot error rate after each new feature set is introduced.
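As one illustration of how such overlapping features can be computed in practice, the sketch below implements a hypothetical chunk coverage indicator; covers() is backed here by a precomputed set of CFG spans, standing in for the chart of a real CFG parser, and the state names are illustrative:

    # Hypothetical CFG chart: non-terminal "City" covers words o[2:4], "Washington D.C."
    CFG_SPANS = {("City", 2, 4)}

    def covers(nt, t):
        """True if position t falls inside a span covered by non-terminal nt."""
        return any(n == nt and i <= t < j for n, i, j in CFG_SPANS)

    def is_pre(state):                      # preamble states, e.g. "PAC", "PDC"
        return state.startswith("P")

    def command_of(state):                  # C(s), as in Section 4.1
        return "ShowFlight"

    def f_cc(c, nt):
        """Chunk coverage for preamble words: a CFG-covered word labeled as a preamble."""
        def f(prev_s, s, o, t):
            return int(command_of(s) == c and covers(nt, t) and is_pre(s))
        return f

    f = f_cc("ShowFlight", "City")
    o = ["flight", "to", "washington", "d.c."]
    print(f("PAC", "PAC", o, 2))            # mislabeling "washington" as a preamble -> 1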

5.2 Experiments

Since the objective function is convex, the choice of optimization algorithm does not make any significant difference in SLU accuracy. We trained the model with SGD. Other optimization algorithms, like Stochastic Meta-Descent (Vishwanathan, Schraudolph et al., 2006), can be used to speed up convergence. The training stopping criterion is cross-validated on the development set (a minimal sketch of this criterion appears at the end of this subsection).

Table 1 shows the number of new parameters and the slot error rate (SER) on the test data after each new feature set is introduced. The new features improve the prediction of slot identities and reduce the SER by 21% relative to the generative HMM/CFG composite model.

The figures below show in detail the impact of the n-gram, previous-slot context and chunk coverage features. The chunk coverage feature has three settings: 0 stands for no chunk coverage features; 1 for chunk coverage features for preamble words only; and 2 for both preamble words and slot boundaries.

Figure 8 shows the impact of the order of the n-gram features. Zero-order means that no lexical features for preamble states are included. As the figure illustrates, the inclusion of CFG rules for slot filler states and domain-specific knowledge about command priors and slot transitions has already produced a reasonable SER under 15%. Unigram features for preamble states cut the error by more than 50%, while the impact of bigram features is not consistent -- it yields a small positive or negative difference depending on the other experimental parameter settings.

[Figure 8: slot error rate vs. n-gram order (0, 1, 2), one curve per chunk coverage setting.]

Figure 8. Effects of the order of n-grams on SER. The window size for the previous-slot context features is 2.

Figure 9 shows the impact of the CFG chunk coverage feature. Coverage for both preamble words and slot boundaries helps improve the SLU accuracy.

Figure 9. Effects of the chunk coverage feature. The window size for the previous-slot context feature is 2. The three lines correspond to different n-gram orders, where 0-gram indicates that no preamble lexical features are used.

Figure 10 shows the impact of the window size for the previous-slot context feature. Here, 0 means that the previous-slot context feature is not used. When the window size is k, the k words in front of the longest previous CFG-covered word sequence are included as the previous-slot unigram context features. As the figure illustrates, this feature significantly reduces SER, while the window size does not make any significant difference.

Figure 10. Effects of the window size of the previous-slot context feature. The three lines represent different orders of n-grams (0, 1, and 2). Chunk coverage features for both preamble words and slot boundaries are used.

It is important to note that overlapping features like f^CC, f^SB and f^PC could not be easily incorporated into a generative model.
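As referenced above, training stops by cross-validation on the development set. A minimal sketch of such an early-stopping loop, with hypothetical train_pass and dev_slot_error_rate callbacks (assumptions for illustration, not the paper's code):

    import copy

    def train_early_stopping(weights, train_pass, dev_slot_error_rate, max_epochs=20):
        """Return the weights from the epoch with the lowest dev-set SER."""
        best_err, best_w = float("inf"), copy.deepcopy(weights)
        for _ in range(max_epochs):
            train_pass(weights)                   # one SGD or perceptron pass over the data
            err = dev_slot_error_rate(weights)    # insertion-deletion-substitution rate
            if err < best_err:
                best_err, best_w = err, copy.deepcopy(weights)
        return best_w, best_err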

5.3 CRFs vs. Perceptrons

Table 2 compares the perceptron and CRF training algorithms, using chunk coverage features for both preamble words and slot boundaries, with which the best accuracy results are achieved. Both improve upon the 5% baseline SER of the generative HMM/CFG model. CRF training outperforms the perceptron in most settings, except for the one with unigram features for preamble states and with window size 1 -- the model with the fewest parameters. One possible explanation is as follows. The objective function in CRF training is a convex function, and so SGD can find its single global optimum. In contrast, the objective function for the perceptron, which is the difference between two convex functions, is not convex. The gradient ascent approach in perceptron training is hence more likely to settle on a local optimum as the model becomes more complicated.

            PSWSize=1                PSWSize=2
            Perceptron    CRFs       Perceptron    CRFs
    n=1     3.76%         4.11%      4.23%         3.94%
    n=2     4.76%         4.14%      4.58%         3.94%

Table 2. Perceptron vs. CRF training. Chunk coverage features are used for both preamble words and slot boundaries. PSWSize stands for the window size of the previous-slot context feature; n is the order of the n-gram features.

The biggest advantage of perceptron learning is its speed. It directly counts the occurrences of features given an observation with its reference label sequence and its Viterbi label sequence, with no need to collect expected feature counts with a forward-backward-like algorithm. Not only is each iteration faster, but fewer iterations are required when using SLU accuracy on a cross-validation set as the stopping criterion. Overall, perceptron training is 5 to 8 times faster than CRF training.

6 Conclusions

This paper has introduced a conditional model framework that integrates statistical learning with a knowledge-based approach to SLU. We have shown that a conditional model reduces the SLU slot error rate by more than 20% over the generative HMM/CFG composite model. The improvement was mostly due to the introduction of new overlapping features into the model. We have also discussed our experience in directly porting a generative model to a conditional model, and demonstrated that it may not be beneficial at all if we still think generatively in conditional modeling; more specifically, replicating the feature set of a generative model in a conditional model may not help much. The key benefit of conditional models is the ease with which they can incorporate overlapping and non-homogeneous features. This is consistent with the findings in the application of conditional models to POS tagging (Lafferty, McCallum et al., 2001).

The paper has also compared different training algorithms for conditional models. In most cases, CRF training is more accurate; however, perceptron training is much faster.

References

Bahl, L., P. Brown, et al. 1986. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing.

Collins, M. 2002. Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms. EMNLP, Philadelphia, PA.

Gunawardana, A., M. Mahajan, et al. 2005. Hidden conditional random fields for phone classification. Eurospeech, Lisbon, Portugal.

Juang, B.-H., W. Chou, et al. 1997. Minimum classification error rate methods for speech recognition. IEEE Transactions on Speech and Audio Processing 5(3): 257-265.

Kushner, H. J. and G. G. Yin. 1997. Stochastic Approximation Algorithms and Applications. Springer-Verlag.

Lafferty, J., A. McCallum, et al. 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data. ICML.

Nocedal, J. and S. J. Wright. 1999. Numerical Optimization. Springer-Verlag.

Povey, D. and P. C. Woodland. 2002. Minimum phone error and I-smoothing for improved discriminative training. IEEE International Conference on Acoustics, Speech, and Signal Processing.

Price, P. 1990. Evaluation of spoken language systems: the ATIS domain. DARPA Speech and Natural Language Workshop, Hidden Valley, PA.
Quattoni, A., M. Collins and T. Darrell. 2004. Conditional random fields for object recognition. NIPS.

Vishwanathan, S. V. N., N. N. Schraudolph, et al. 2006. Accelerated training of conditional random fields with stochastic meta-descent. The Learning Workshop, Snowbird, Utah.

Wang, Y.-Y., L. Deng, et al. 2005. Spoken language understanding --- an introduction to the statistical framework. IEEE Signal Processing Magazine 22(5): 16-31.


In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should

In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should Cambridge Universiy Press 978--36-60033-7 Cambridge Inernaional AS and A Level Mahemaics: Mechanics Coursebook Excerp More Informaion Chaper The moion of projeciles In his chaper he model of free moion

More information

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LDA, logisic

More information

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015 Explaining Toal Facor Produciviy Ulrich Kohli Universiy of Geneva December 2015 Needed: A Theory of Toal Facor Produciviy Edward C. Presco (1998) 2 1. Inroducion Toal Facor Produciviy (TFP) has become

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

14 Autoregressive Moving Average Models

14 Autoregressive Moving Average Models 14 Auoregressive Moving Average Models In his chaper an imporan parameric family of saionary ime series is inroduced, he family of he auoregressive moving average, or ARMA, processes. For a large class

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

Self assessment due: Monday 4/29/2019 at 11:59pm (submit via Gradescope)

Self assessment due: Monday 4/29/2019 at 11:59pm (submit via Gradescope) CS 188 Spring 2019 Inroducion o Arificial Inelligence Wrien HW 10 Due: Monday 4/22/2019 a 11:59pm (submi via Gradescope). Leave self assessmen boxes blank for his due dae. Self assessmen due: Monday 4/29/2019

More information

Lab #2: Kinematics in 1-Dimension

Lab #2: Kinematics in 1-Dimension Reading Assignmen: Chaper 2, Secions 2-1 hrough 2-8 Lab #2: Kinemaics in 1-Dimension Inroducion: The sudy of moion is broken ino wo main areas of sudy kinemaics and dynamics. Kinemaics is he descripion

More information

A Video Vehicle Detection Algorithm Based on Improved Adaboost Algorithm Weiguang Liu and Qian Zhang*

A Video Vehicle Detection Algorithm Based on Improved Adaboost Algorithm Weiguang Liu and Qian Zhang* A Video Vehicle Deecion Algorihm Based on Improved Adaboos Algorihm Weiguang Liu and Qian Zhang* Zhongyuan Universiy of Technology, Zhengzhou 450000, China lwg66123@163.com, 2817343431@qq.com *The corresponding

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION DOI: 0.038/NCLIMATE893 Temporal resoluion and DICE * Supplemenal Informaion Alex L. Maren and Sephen C. Newbold Naional Cener for Environmenal Economics, US Environmenal Proecion

More information