Non-Informative Dirichlet Score for learning Bayesian networks


Maomi Ueno, University of Electro-Communications, Japan
Masaki Uto, University of Electro-Communications, Japan

Abstract

Learning Bayesian networks is known to be highly sensitive to the chosen equivalent sample size (ESS) in the Bayesian Dirichlet equivalence uniform (BDeu). This sensitivity often engenders unstable or undesired results because the prior of BDeu does not represent ignorance of prior knowledge, but rather a user's prior belief in the uniformity of the conditional distribution. This paper presents a proposal for a non-informative Dirichlet score obtained by marginalizing over the possible hypothetical structures as the user's prior belief. Some numerical experiments demonstrate that the proposed score improves learning accuracy. The results also suggest that the proposed score might be effective especially for small samples.

1 Introduction

Marginal likelihood (ML) (using a Dirichlet prior) that ensures likelihood equivalence, the most popular learning score for Bayesian networks, finds the maximum a posteriori (MAP) structure (Buntine, 1991; Heckerman et al., 1995). This score is known as Bayesian Dirichlet equivalence (BDe) (Heckerman et al., 1995). Given no prior knowledge, the Bayesian Dirichlet equivalence uniform (BDeu), as proposed earlier by Buntine (1991), is often used. Actually, BDe(u) requires an equivalent sample size (ESS), which reflects the degree of a user's prior belief. Moreover, recent studies have demonstrated that learning Bayesian networks is highly sensitive to the chosen ESS (Steck and Jaakkola, 2002; Silander, Kontkanen and Myllymäki, 2007).

To clarify the BDe(u) mechanism, Ueno (2010, 2011) analyzed log-BDe(u) asymptotically, obtaining the result that it decomposes into the log-likelihood and a complexity penalty term that reflects the difference between the structure learned from data and the hypothetical structure from the user's knowledge. As the two structures become equivalent, the penalty term is minimized for a fixed ESS. Conversely, the term increases to the degree that the two structures differ. Furthermore, the result suggests that a tradeoff exists between the role of ESS in the log-likelihood (which helps to block extra arcs) and its role in the penalty term (which helps to add extra arcs). That tradeoff might cause the BDeu score to be highly sensitive to ESS, and might make it more difficult to determine an appropriate ESS. Moreover, Ueno (2011) showed that the prior of BDeu does not represent ignorance of prior knowledge, but rather a user's prior belief in the uniformity of the conditional distribution. This fact is particularly surprising because it had been believed that BDeu has a non-informative prior. The results further imply that the optimal ESS becomes large/small when the empirical conditional distribution becomes uniform/skewed. This is the main factor underpinning the sensitivity of BDeu to ESS.

To solve this problem, Silander, Kontkanen and Myllymäki (2007) proposed a learning method that marginalizes the ESS of BDeu.

They averaged BDeu values with increasing ESS from 1 to 100 by 1.0. Moreover, to decrease the computation costs of the marginalization, Cano et al. (2011) proposed averaging over a wide range of differently chosen ESSs: ESS > 1.0, ESS >> 1.0, ESS < 1.0, ESS << 1.0. They reported that their method performed robustly and efficiently. However, such approximated marginalization does not always guarantee robust results when the optimal ESS is extremely small or large. In addition, the exact marginalization of ESS is difficult because ESS is a continuous variable on the domain (0, ∞).

Our proposal in this paper is a fully non-informative Dirichlet score. We assume all possible hypothetical structures as the prior belief of BDe, because the problem of BDeu is that it assumes only a uniform distribution as the prior belief. This paper therefore presents a non-informative Dirichlet score obtained by averaging over the hypothetical structures as the user's prior belief. The optimal ESS of BDeu becomes large/small when the empirical conditional distribution becomes uniform/skewed because its hypothetical structure assumes a uniform conditional probability distribution and the ESS adjusts the magnitude of the user's belief in that hypothetical structure. In contrast, the ESS of a fully non-informative prior is expected to work effectively as actual pseudo-samples that augment the data, especially when the sample size is small, regardless of the uniformity of the empirical distribution. This is a unique feature of the proposed method, because the previous non-informative methods exclude the ESS from the score (averaged BDeu (Silander, Kontkanen and Myllymäki, 2007; Cano et al., 2011); Normalized Maximum Likelihood (NML) (Silander, Roos, and Myllymäki, 2010)). Some numerical experiments demonstrate that the proposed score improves learning accuracy. The results also suggest that the proposed score might be effective especially for small samples.

2 Learning Bayesian networks

Let {x_1, x_2, \ldots, x_N} be a set of N discrete variables; each x_i can take a value in the set of states {1, \ldots, r_i}. Here, x_i = k means that x_i is in state k. According to the Bayesian network structure g \in G, the joint probability distribution is given as

p(x_1, x_2, \ldots, x_N \mid g) = \prod_{i=1}^{N} p(x_i \mid \Pi_i, g),    (1)

where G signifies the possible set of Bayesian network structures and \Pi_i denotes the parent variable set of x_i.

Next, we introduce the problem of learning a Bayesian network. Let \theta_{ijk} be the conditional probability parameter of x_i = k when the j-th instance of the parents of x_i is observed (we write \Pi_i = j). Buntine (1991) assumed a Dirichlet prior and used the expected a posteriori (EAP) estimator as the parameter estimator \hat{\Theta} = (\hat{\theta}_{ijk}), (i = 1, \ldots, N, j = 1, \ldots, q_i, k = 1, \ldots, r_i):

\hat{\theta}_{ijk} = \frac{\alpha_{ijk} + n_{ijk}}{\alpha_{ij} + n_{ij}},    (k = 1, \ldots, r_i - 1),    (2)

where n_{ijk} represents the number of samples with x_i = k when \Pi_i = j, n_{ij} = \sum_{k=1}^{r_i} n_{ijk}, and \alpha_{ijk} denotes the hyperparameters of the Dirichlet prior distributions (\alpha_{ijk} is a pseudo-sample count corresponding to n_{ijk}), with \alpha_{ij} = \sum_{k=1}^{r_i} \alpha_{ijk} and \hat{\theta}_{ij r_i} = 1 - \sum_{k=1}^{r_i - 1} \hat{\theta}_{ijk}. The marginal likelihood is obtained as

p(X \mid \alpha, g) = \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha_{ij})}{\Gamma(\alpha_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha_{ijk} + n_{ijk})}{\Gamma(\alpha_{ijk})}.    (3)

Here, q_i signifies the number of instances of \Pi_i, where q_i = \prod_{x_l \in \Pi_i} r_l, and X is a dataset. The problem of learning a Bayesian network is to find the MAP structure that maximizes the score (3). In particular, Heckerman et al. (1995) presented a sufficient condition for satisfying the likelihood equivalence assumption, which says that data should not help to discriminate between network structures that represent the same assertions of conditional independence, in the form of the following constraint on the hyperparameters of (3):

\sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \alpha_{ijk} = \alpha \ (\text{const.}),    (i = 1, \ldots, N).    (4)
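To make the score concrete, here is a minimal sketch of how one factor of (3) could be evaluated in log space with BDeu hyperparameters, which satisfy (4). The function names and the q_i x r_i count layout are our own illustration, not code from the paper; gammaln from SciPy replaces the Gamma ratios to avoid overflow.

    import numpy as np
    from scipy.special import gammaln

    def log_ml_family(counts, alpha_ijk):
        # Log of the i-th factor of (3) for one family (x_i, Pi_i), where
        # counts[j, k] = n_ijk and alpha_ijk[j, k] is the hyperparameter.
        a_ij = alpha_ijk.sum(axis=1)   # alpha_ij = sum_k alpha_ijk
        n_ij = counts.sum(axis=1)      # n_ij = sum_k n_ijk
        term_j = gammaln(a_ij) - gammaln(a_ij + n_ij)
        term_jk = gammaln(alpha_ijk + counts) - gammaln(alpha_ijk)
        return term_j.sum() + term_jk.sum()

    def bdeu_hyperparameters(q_i, r_i, alpha):
        # BDeu: alpha_ijk = alpha / (r_i * q_i); summing over j and k
        # gives alpha, so constraint (4) holds.
        return np.full((q_i, r_i), alpha / (r_i * q_i))

    # Example: binary x_i with one binary parent (q_i = r_i = 2), ESS alpha = 1.0.
    counts = np.array([[8, 2], [1, 9]])
    print(log_ml_family(counts, bdeu_hyperparameters(2, 2, 1.0)))

The full log score of a structure is the sum of such family terms over i = 1, ..., N.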

Furthermore, they proposed a marginal likelihood that reflects a user's prior knowledge, as shown below:

p(X \mid \alpha, g, g^h) = \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})},    (5)

\alpha^{g^h}_{ijk} = \alpha \, p(x_i = k, \Pi^g_i = j \mid g^h).    (6)

Here, \alpha is the user-determined equivalent sample size (ESS), \Pi^g_i denotes the parent variable set of x_i according to g, and g^h is the hypothetical Bayesian network structure that reflects the user's prior knowledge. This metric was designated the Bayesian Dirichlet equivalence (BDe) score metric. As Buntine (1991) described, \alpha_{ijk} = \alpha / (r_i q_i) is regarded as a special case of the BDe metric. Heckerman et al. (1995) designated this special case as BDeu. For cases in which we have no prior knowledge, BDeu is often used in practice. Heckerman et al. (1995) reported, as a result of their comparative analyses of BDeu and BDe, that BDeu is better than BDe unless the user's beliefs are close to the true model.

BDeu requires an equivalent sample size (ESS), which is the value of a free parameter specified by the user. Recent reports have described that the ESS in BDeu plays an important role in learning Bayesian networks (Steck and Jaakkola, 2002; Silander, Kontkanen and Myllymäki, 2007). In particular, Ueno (2011) reported that the prior of BDeu does not represent ignorance of prior knowledge but rather a user's prior belief in the uniformity of the conditional distribution. This result is particularly surprising because it had been believed that BDeu has a non-informative prior. In addition, he explained the mechanism by which the optimal ESS becomes large/small when the empirical conditional distribution becomes uniform/skewed: the ESS determines the magnitude of the user's prior belief in a hypothetical structure. This mechanism causes the BDeu score to be highly sensitive to ESS.

3 Non-informative Dirichlet Score

This section presents an alternative non-informative prior Dirichlet score, because BDeu has no non-informative prior. To clarify the mechanism of BDe, Ueno (2011) analyzed log-BDe asymptotically and derived the following theorem.

Theorem 1 (Ueno, 2011). When \alpha + n is sufficiently large, log-BDe converges to

\log p(X \mid \alpha, g, g^h) = \log p(\hat{\Theta}^g \mid X, \alpha, g, g^h) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \frac{r_i - 1}{r_i} \log(\alpha^{g^h}_{ijk} + n_{ijk}) + \text{const},    (7)

where

\log p(\hat{\Theta}^g \mid X, \alpha, g, g^h) = \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} (\alpha^{g^h}_{ijk} + n_{ijk}) \log \frac{\alpha^{g^h}_{ijk} + n_{ijk}}{\alpha^{g^h}_{ij} + n_{ij}},

const collects terms independent of the number of parameters, and \hat{\Theta}^g = \{\hat{\theta}^g_{ijk}\}, (i = 1, \ldots, N, j = 1, \ldots, q_i, k = 1, \ldots, r_i), with

\hat{\theta}^g_{ijk} = \frac{\alpha^{g^h}_{ijk} + n_{ijk}}{\alpha^{g^h}_{ij} + n_{ij}}.    (8)

From (7), log-BDe can be decomposed into two factors: 1. a log-posterior term \log p(\hat{\Theta}^g \mid X, \alpha, g, g^h), and 2. a penalty term \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \frac{r_i - 1}{r_i} \log(\alpha^{g^h}_{ijk} + n_{ijk}). Note that \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} (r_i - 1)/r_i = \sum_{i=1}^{N} q_i (r_i - 1) is the number of parameters. This well-known model selection formula is generally interpreted as 1. reflecting the fit to the data and 2. signifying the penalty that blocks extra arcs from being added.

Ueno (2011) described that the term \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \log(\alpha^{g^h}_{ijk} + n_{ijk}) in (7) reflects the difference between the structure learned from data and the hypothetical structure g^h from the user's knowledge in BDe. To the degree that the two structures are equivalent, the penalty term is minimized for a fixed ESS. Conversely, to the degree that the two structures differ, the term is larger. Moreover, from (7), \alpha determines the magnitude of the user's prior belief in a hypothetical structure g^h. On the other hand, BDeu, whose prior distribution assumes a uniform distribution of conditional probabilities, had been believed to employ a non-informative prior.
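To connect the theorem back to (6): the hyperparameters \alpha^{g^h}_{ijk} that it analyzes can be assembled from any routine returning the joint probability p(x_i = k, \Pi^g_i = j \mid g^h). The sketch below assumes such a routine is given; the name joint_prob and the table layout are our own, not the paper's.

    import numpy as np

    def bde_hyperparameters(joint_prob, q_i, r_i, alpha):
        # Equation (6): alpha_ijk^{g^h} = alpha * p(x_i = k, Pi_i^g = j | g^h).
        a = np.empty((q_i, r_i))
        for j in range(q_i):
            for k in range(r_i):
                a[j, k] = alpha * joint_prob(k, j)
        return a

    # With a uniform joint, p = 1/(r_i * q_i), this reduces to the BDeu case;
    # feeding the table to log_ml_family from the earlier sketch evaluates the
    # corresponding factor of the BDe score (5).
    uniform = lambda k, j: 1.0 / (2 * 2)
    print(bde_hyperparameters(uniform, 2, 2, 1.0))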

However, from (7), the optimal ESS of BDeu becomes large/small when the empirical conditional distribution becomes uniform/skewed, because its hypothetical structure g^h assumes a uniform conditional probability distribution and the ESS adjusts the magnitude of the user's belief in that hypothetical structure. Namely, the prior of BDeu does not represent ignorance of prior knowledge but rather a user's prior belief in the uniformity of the conditional distribution.

The main purpose of this paper is to develop a Dirichlet score that has a fully non-informative prior. For this purpose, we assume all possible hypothetical structures as the prior belief of BDe, because a problem of BDeu is that it assumes only a uniform distribution as the prior belief. That is, we marginalize the ML over the hypothetical structures g^h \in G as follows:

Definition 1. NIP-BDe (Non-informative prior Bayesian Dirichlet equivalence) is defined as

p(X \mid g, \alpha) = \sum_{g^h \in G} p(g^h) \, p(X \mid \alpha, g, g^h)    (9)
                   = \sum_{g^h \in G} p(g^h) \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})},

where \sum_{g^h \in G} is the summation over the possible hypothetical structures and p(g^h) is a uniform distribution.

To estimate the marginal ML, it is necessary to calculate p(x_i = k, \Pi^g_i = j \mid g^h) automatically for all the structures. For this purpose, we use an empirical estimate p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}) computed from data: the joint probability of x_i and its parent variables in g, obtained from the conditional probability parameters \hat{\Theta}^{g^h} estimated given g^h. Our method has the following two steps:

1. Estimate the conditional probability parameter set \hat{\Theta}^{g^h} = \{\hat{\theta}^{g^h}_{ijk}\}, (i = 1, \ldots, N, j = 1, \ldots, q^{g^h}_i, k = 1, \ldots, r_i), given g^h from data.
2. Estimate the joint probability p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}).

First, we estimate the conditional probability parameter set given g^h as

\hat{\theta}^{g^h}_{ijk} = \frac{1/(r_i q^{g^h}_i) + n^{g^h}_{ijk}}{1/q^{g^h}_i + n^{g^h}_{ij}},    (10)

where n^{g^h}_{ijk} represents the number of samples with x_i = k when \Pi^{g^h}_i = j (\Pi^{g^h}_i is the parent variable set of x_i given g^h), n^{g^h}_{ij} = \sum_{k=1}^{r_i} n^{g^h}_{ijk}, and q^{g^h}_i denotes the number of instances of \Pi^{g^h}_i. Next, we calculate the estimated joint probability p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}) as shown below:

p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}) = \sum_{x_l \notin \{x_i\} \cup \Pi^g_i} p(x_1, \ldots, x_i, \ldots, x_N \mid g^h, \hat{\Theta}^{g^h}).    (11)

For the computation of (11), we can employ various marginalization algorithms (e.g., variable elimination, factor elimination, random sampling; see, for example, Darwiche, 2009). In practice, however, NIP-BDe is difficult to calculate because the product of multiple probabilities suffers serious computational problems. To avoid this, we propose an alternative method, Expected log-BDe, as described below.

Definition 2. Expected log-BDe is defined as

E_{g^h \in G}[\log p(X \mid \alpha, g, g^h)] = \sum_{g^h \in G} p(g^h) \log p(X \mid \alpha, g, g^h)    (12)
  = \sum_{g^h \in G} p(g^h) \sum_{i=1}^{N} \sum_{j=1}^{q_i} \left[ \log \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} + \sum_{k=1}^{r_i} \log \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})} \right].

Compared to NIP-BDe, the Expected log-BDe is practical for computation because it can be calculated as a sum of log-BDe values.
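As a rough sketch of the local contribution to (12), reusing log_ml_family and bde_hyperparameters from the earlier sketches: each hypothetical structure g^h contributes one hyperparameter table (built via (6) from the joint estimates (10)-(11)), and the uniform p(g^h) reduces the expectation to a plain mean. The helper names and the toy joint below are ours, not the paper's.

    import numpy as np

    def expected_log_bde_family(counts, hyper_tables):
        # One alpha_ijk^{g^h} table per hypothetical structure g^h.
        scores = [log_ml_family(counts, a) for a in hyper_tables]
        return float(np.mean(scores))   # uniform p(g^h): plain average

    # Example with two stand-in hypothetical structures for one binary family.
    counts = np.array([[8, 2], [1, 9]])
    skewed = lambda k, j: [[0.4, 0.1], [0.1, 0.4]][j][k]   # toy joint, sums to 1
    tables = [bdeu_hyperparameters(2, 2, 1.0),
              bde_hyperparameters(skewed, 2, 2, 1.0)]
    print(expected_log_bde_family(counts, tables))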

Although a similar Bayesian model averaging criterion has already been proposed (Chickering and Heckerman, 2000; Tian, He, and Ram, 2010), its purpose is not to predict the true structure but to seek the optimal structure which maximizes the inference prediction for a new data point. Therefore, their purpose is not the same as that of this study.

4 Learning algorithm using dynamic programming

Learning Bayesian networks is known to be an NP-complete problem (Chickering, 1996). Recently, however, exact solution methods can produce results in reasonable computation time if the variables are not prohibitively numerous (e.g., Silander and Myllymäki, 2006; Malone et al., 2011). We employ the learning algorithm of Silander and Myllymäki (2006) using dynamic programming. Our algorithm comprises four steps:

1. Compute the local Expected log-BDe scores for all possible N 2^{N-1} (x_i, \Pi^g_i) pairs.
2. For each variable x_i \in x, find the best parent set in the parent candidate sets \{\Pi^g_i\} for all \Pi^g_i \subseteq x \setminus \{x_i\}, (g \in G).
3. Find the best sink for all N variables and the best ordering.
4. Find the optimal Bayesian network.

Only Step 1 differs from the procedure described by Silander and Myllymäki (2006). First, to obtain the local Expected log-BDe score for a variable and its parent variable set pair, we should average log-BDe over all possible hypothetical structures. To compute the local Expected log-BDe score for each pair (x_i, \Pi^g_i) in Step 1, conditional frequency tables cft(x_i, \Pi^{g^h}_i) for all possible hypothetical parent sets \{\Pi^{g^h}_i\}, (g^h \in G), are needed.

Algorithm 1 gives pseudo-code for computing the local Expected log-BDe scores, LS[x_i][\Pi^g_i], for all possible (x_i, \Pi^g_i) pairs. The pseudo-code assumes some helper functions: getCft(x_i, \Pi^g_i) produces the conditional frequency table cft(x_i, \Pi^g_i); getTl(\Pi^g_i, cft[x_i][\Pi^{g^h}_i]) translates cft[x_i][\Pi^{g^h}_i] into the joint probability estimates p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}); and score(\Pi^g_i, Tl[x_i][\Pi^g_i][\Pi^{g^h}_i]) calculates the local log-BDe score of g given a hypothetical structure g^h.

First, getCft(x_i, \Pi^g_i) produces the conditional frequency tables cft(x_i, \Pi^g_i) for all possible \Pi^g_i. They are stored in memory or on disk as soon as they are produced. This procedure is the same as that of Silander and Myllymäki (2006). Next, getTl(\Pi^g_i, cft[x_i][\Pi^{g^h}_i]) calculates the joint probability estimates p(x_i = k, \Pi^g_i = j \mid g^h, \hat{\Theta}^{g^h}) using the stored conditional frequency tables, by marginalizing the product of the factors in \hat{\Theta}^{g^h} that include \{x_i\} \cup \Pi^g_i. This procedure runs in O(|\{x_i\} \cup \Pi^g_i| exp(w)) given an elimination order of width w. The local Expected log-BDe, LS[x_i][\Pi^g_i], can then be calculated as the average of the local log-BDe values over all possible local hypothetical parent sets \Pi^{g^h}_i.

Algorithm 1 getLocalScore(x)
  for all x_i \in x do
    for all \Pi^g_i \subseteq x \setminus \{x_i\} do
      if cft[x_i][\Pi^g_i] = null then
        cft[x_i][\Pi^g_i] \gets getCft(x_i, \Pi^g_i)
      end if
    end for
    for all \Pi^g_i \subseteq x \setminus \{x_i\} do
      for all \Pi^{g^h}_i \subseteq x \setminus \{x_i\} do
        Tl[x_i][\Pi^g_i][\Pi^{g^h}_i] \gets getTl(\Pi^g_i, cft[x_i][\Pi^{g^h}_i])
        LS[x_i][\Pi^g_i] \gets LS[x_i][\Pi^g_i] + score(\Pi^g_i, Tl[x_i][\Pi^g_i][\Pi^{g^h}_i])
      end for
      LS[x_i][\Pi^g_i] \gets LS[x_i][\Pi^g_i] / |\{\Pi^{g^h}_i \subseteq x \setminus \{x_i\}\}|
    end for
  end for
  if |x| > 1 then getLocalScore(x \setminus \{x_i\}) end if

The time complexity of Algorithm 1 is O(N 2^{2(N-1)} exp(w)). After computing the local scores, we find the optimal Bayesian network in Steps 2, 3, and 4 of our algorithm. These steps are the same as those proposed by Silander and Myllymäki (2006). Although the traditional scores (BDeu, AIC, BIC, and so on) run in O(N 2^{N-1}), the proposed method requires greater computation costs. However, the required memory for the proposed computation is equal to that of the traditional scores.
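For orientation, the following compact sketch shows Steps 2-4 as dynamic programming over subsets of variables, in the spirit of Silander and Myllymäki (2006). It assumes the Step-1 scores are available through a local_score(v, parents) routine; the placeholder below and all names are our own illustration, not the paper's implementation.

    from itertools import combinations

    N = 4                      # toy problem size
    VARS = list(range(N))

    def local_score(v, parents):
        # Placeholder for the Step-1 local Expected log-BDe scores.
        return -float(len(parents)) - 0.1 * v

    def subsets(iterable):
        s = list(iterable)
        for r in range(len(s) + 1):
            yield from combinations(s, r)

    # Step 2: best parent set for v inside every candidate set C.
    best_ps, best_ps_score = {}, {}
    for v in VARS:
        others = [u for u in VARS if u != v]
        for C in subsets(others):
            cand = [(local_score(v, C), C)]
            for c in C:
                sub = tuple(u for u in C if u != c)
                cand.append((best_ps_score[(v, sub)], best_ps[(v, sub)]))
            best_ps_score[(v, C)], best_ps[(v, C)] = max(cand)

    # Step 3: best sink of every variable subset W.
    best_net_score, best_sink = {(): 0.0}, {}
    for W in subsets(VARS):
        if not W:
            continue
        cand = []
        for s in W:
            rest = tuple(u for u in W if u != s)
            cand.append((best_net_score[rest] + best_ps_score[(s, rest)], s))
        best_net_score[W], best_sink[W] = max(cand)

    # Step 4: backtrack the sinks to recover the ordering and the network.
    W, network = tuple(VARS), {}
    while W:
        s = best_sink[W]
        W = tuple(u for u in W if u != s)
        network[s] = best_ps[(s, W)]
    print(network)   # maps each variable to its chosen parent set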

Figure 1: g1: Strongly skewed distribution (conditional probability tables omitted; conditional probabilities take the values 0.05 and 0.95, root probability 0.5).

Figure 2: g2: Skewed distribution (conditional probabilities 0.2 and 0.8).

Figure 3: g3: Uniform distribution (conditional probabilities 0.3 and 0.7).

Figure 4: g4: Strongly uniform distribution (conditional probabilities 0.35 and 0.65).

5 Simulation experiments

We conducted simulation experiments to compare Expected log-BDe with BDeu according to the procedure described by Ueno (2011). We used small network structures with binary variables, shown in Figs. 1, 2, 3, and 4, in which the distributions change from skewed to uniform. Figure 1 presents a structure in which the conditional probabilities differ greatly according to the parent variable states (g1: Strongly skewed distribution). By gradually reducing the difference of the conditional probabilities from Fig. 1, we generated Fig. 2 (g2: Skewed distribution), Fig. 3 (g3: Uniform distribution), and Fig. 4 (g4: Strongly uniform distribution).

Table 1: Learning performance of BDeu (results for g1-g4 with \alpha = 0.1, 1.0, and 10; entries omitted).

Procedures used for the simulation experiments are described below.

1. We generated 100, 500, and 1,000 samples from the networks in the four figures.
2. Using BDeu and Expected log-BDe with \alpha changed over 0.1, 1.0, and 10, Bayesian network structures were estimated based on the 100, 500, and 1,000 samples, respectively. We searched for the exactly true structure.
3. The times the estimated structure was the true structure (when the Structural Hamming Distance (SHD; Tsamardinos, Brown, and Aliferis, 2006) is zero) were counted by repeating procedures 1 and 2 for 100 iterations.

We employed the variable elimination algorithm (Darwiche, 2009) for the marginalization in (11) because our experiments use only a five-node network; its computational cost is not burdensome.

Table 2: Learning performance of Expected log-BDe (results for g1-g4 with \alpha = 0.1, 1.0, and 10; entries omitted).

Table 1 presents the results for BDeu; Table 2 presents the results for Expected log-BDe. The first column shows the number of correct-structure estimates in the 100 trials. Column SHD shows the Structural Hamming Distance (SHD). Column ME shows the total number of missing arcs, column EE the total number of extra arcs, column WO the total number of arcs with wrong orientation, column MO the total number of arcs with missing orientation, and column EO the total number of arcs with extra orientation in the case of likelihood equivalence (a simplified sketch of these counts appears at the end of this section).

The results presented in Table 1 reveal that BDeu is highly sensitive to \alpha. In Table 1, the optimum value of \alpha becomes small as the conditional distribution becomes skewed. In contrast, the optimum value of \alpha becomes large as the conditional distribution becomes uniform. As shown in Table 2, the performance of Expected log-BDe is better than that of BDeu in almost all cases. The results also show that the BDeu performances are highly sensitive to \alpha, but those of Expected log-BDe are robust to \alpha. Additionally, it is noteworthy that the performance of BDeu is extremely worse than that of Expected log-BDe for g4 with \alpha = 0.1 and for g1 with \alpha = 10. The reason is that the optimal \alpha becomes large/small for uniform/skewed conditional probabilities because BDeu assumes a uniform conditional prior. Although the optimal \alpha for BDeu is highly sensitive to the uniformity of the conditional probability distribution, the optimal \alpha of the proposed method is robust to the uniformity.
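As an aside on the error measures above, the following simplified sketch illustrates the bookkeeping behind the ME, EE, and WO columns. The SHD proper is computed between the equivalence classes (CPDAGs) of the two structures (Tsamardinos, Brown, and Aliferis, 2006); this version compares plain directed edge sets and is meant only as an illustration under that simplification.

    def arc_differences(true_edges, learned_edges):
        true_skel = {frozenset(e) for e in true_edges}
        learn_skel = {frozenset(e) for e in learned_edges}
        me = len(true_skel - learn_skel)    # ME: missing arcs
        ee = len(learn_skel - true_skel)    # EE: extra arcs
        wo = sum(1 for (a, b) in learned_edges
                 if (b, a) in true_edges)   # WO: wrong orientation
        return me, ee, wo

    # A true arc x1 -> x2 learned as x2 -> x1 counts as one wrong orientation.
    print(arc_differences({(1, 2), (1, 3)}, {(2, 1), (1, 3)}))   # (0, 0, 1)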

Especially for small samples, the large \alpha setting for the proposed method works effectively: the learning performances with \alpha = 10 for n = 100 are better than those with \alpha = 0.1, which means that the ESS of Expected log-BDe might be affected only by the sample size. The results also suggest that the optimal ESS increases as the sample size becomes small. Therefore, the ESS of the proposed method works effectively as actual pseudo-samples that augment the data, especially when the sample size is small.

6 Conclusions

This paper presented a proposal for a non-informative Dirichlet score. The results suggest that the proposed method is effective especially for a small sample size. The future tasks are the following:

1. Analysis of the proposed method using various simulation data.
2. Improvement of the learning algorithm.
3. Determination of the optimal \alpha for a given data size.

Furthermore, NML (Silander, Roos, and Myllymäki, 2010) is known as an alternative information-theoretic approach for circumventing the problem with ESS. It is also an important future task to compare the performances of these non-informative learning scores.

References

W.L. Buntine. 1991. Theory refinement on Bayesian networks. In B. D'Ambrosio, P. Smets and P. Bonissone (eds.), Proc. of the Seventh Conference on Uncertainty in Artificial Intelligence, pages 52-60.

E. Castillo, A.S. Hadi and C. Solares. 1997. Learning and updating of uncertainty in Dirichlet models. Machine Learning, 26, pages 43-63.

D. Chickering. 1996. Learning Bayesian networks is NP-complete. In Learning from Data: Artificial Intelligence and Statistics V, pages 121-130.

D.M. Chickering and D. Heckerman. 2000. A comparison of scientific and engineering criteria for Bayesian model selection. Statistics and Computing, 10, pages 55-62.

A. Cano, M. Gomez-Olmedo, A.R. Masegosa, and S. Moral. 2011. Locally averaged Bayesian Dirichlet metrics. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2011).

D. Heckerman, D. Geiger and D. Chickering. 1995. Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning, 20, pages 197-243.

B.M. Malone, C. Yuan, E.A. Hansen, and S. Bridges. 2011. Improving the scalability of optimal Bayesian network learning with external-memory frontier breadth-first branch and bound search. In A. Pfeffer and F.G. Cozman (eds.), Proc. of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence.

T. Silander and P. Myllymäki. 2006. A simple approach for finding the globally optimal Bayesian network structure. In R. Dechter and T. Richardson (eds.), Proc. of the 22nd Conference on Uncertainty in Artificial Intelligence, pages 445-452.

T. Silander, P. Kontkanen and P. Myllymäki. 2007. On sensitivity of the MAP Bayesian network structure to the equivalent sample size parameter. In K.B. Laskey, S.M. Mahoney and J. Goldsmith (eds.), Proc. of the 23rd Conference on Uncertainty in Artificial Intelligence.

T. Silander, T. Roos and P. Myllymäki. 2010. Learning locally minimax optimal Bayesian networks. International Journal of Approximate Reasoning, 51(5), pages 544-557.

H. Steck and T.S. Jaakkola. 2002. On the Dirichlet prior and Bayesian regularization. In S. Becker, S. Thrun, and K. Obermayer (eds.), Advances in Neural Information Processing Systems (NIPS).

H. Steck. 2008. Learning the Bayesian network structure: Dirichlet prior versus data. In D.A. McAllester and P. Myllymäki (eds.), Proc. of the 24th Conference on Uncertainty in Artificial Intelligence.

J. Tian, R. He and L. Ram. 2010. Bayesian model averaging using the k-best Bayesian network structures. In P. Grunwald and P. Spirtes (eds.), Proc. of the 26th Conference on Uncertainty in Artificial Intelligence.

I. Tsamardinos, L.E. Brown, and C.F. Aliferis. 2006. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1), pages 31-78.
M. Ueno. 2010. Learning networks determined by the ratio of prior and data. In P. Grunwald and P. Spirtes (eds.), Proc. of the 26th Conference on Uncertainty in Artificial Intelligence.

M. Ueno. 2011. Robust learning Bayesian networks for prior belief. In A. Pfeffer and F.G. Cozman (eds.), Proc. of the 27th Conference on Uncertainty in Artificial Intelligence.

A. Darwiche. 2009. Modeling and reasoning with Bayesian networks. Cambridge University Press.
