Non-Informative Dirichlet Score for learning Bayesian networks
Maomi Ueno, University of Electro-Communications, Japan
Masaki Uto, University of Electro-Communications, Japan

Abstract

Learning Bayesian networks is known to be highly sensitive to the chosen equivalent sample size (ESS) in the Bayesian Dirichlet equivalence uniform (BDeu). This sensitivity often engenders unstable or undesired results because the prior of BDeu does not represent ignorance of prior knowledge, but rather a user's prior belief in the uniformity of the conditional distribution. This paper presents a proposal for a non-informative Dirichlet score obtained by marginalizing over the possible hypothetical structures serving as the user's prior belief. Some numerical experiments demonstrate that the proposed score improves learning accuracy. The results also suggest that the proposed score might be especially effective for small samples.

1 Introduction

Marginal likelihood (ML) (using a Dirichlet prior) that ensures likelihood equivalence, the most popular learning score for Bayesian networks, finds the maximum a posteriori (MAP) structure (Buntine, 1991; Heckerman et al., 1995). This score is known as Bayesian Dirichlet equivalence (BDe) (Heckerman et al., 1995). Given no prior knowledge, the Bayesian Dirichlet equivalence uniform (BDeu), as proposed earlier by Buntine (1991), is often used. Actually, BDe(u) requires an equivalent sample size (ESS), which reflects the degree of a user's prior belief. Moreover, recent studies have demonstrated that learning Bayesian networks is highly sensitive to the chosen ESS (Steck and Jaakkola, 2002; Silander, Kontkanen and Myllymäki, 2007). To clarify the BDe(u) mechanism, Ueno (2011) analyzed log-BDe(u) asymptotically, obtaining the result that it decomposes into the log-likelihood and a penalty term for complexity, which reflects the difference between the structure learned from data and the hypothetical structure from the user's knowledge. As the two structures become equivalent, the penalty term is minimized with the fixed ESS.
Conversely, the term increases to the degree that the two structures become different. Furthermore, the result suggests that a tradeoff exists between the role of the ESS in the log-likelihood (which helps to block extra arcs) and its role in the penalty term (which helps to add extra arcs). That tradeoff might cause the BDeu score to be highly sensitive to the ESS, and it might make it more difficult to determine an appropriate ESS. Moreover, Ueno (2011) showed that the prior of BDeu does not represent ignorance of prior knowledge, but rather a user's prior belief in the uniformity of the conditional distribution. This fact is particularly surprising because it had been believed that BDeu has a non-informative prior. The results further imply that the optimal ESS becomes large/small when the empirical conditional distribution becomes uniform/skewed. This main factor underpins the sensitivity of BDeu to the ESS. To solve this problem, Silander, Kontkanen and Myllymäki (2007) proposed a learning method that marginalizes the ESS of BDeu. They averaged BDeu values with increasing ESS
from 1 to 100 by 0.1. Moreover, to decrease the computation costs of the marginalization, Cano et al. (2011) proposed averaging a wide range of differently chosen ESSs: ESS > 1, ESS >> 1, ESS < 1, ESS << 1. They reported that the proposed method performed robustly and efficiently. However, such approximated marginalization does not always guarantee robust results when the optimal ESS is extremely small or large. In addition, the exact marginalization of the ESS is difficult because the ESS is a continuous variable in the domain (0, ∞). Our proposal in this paper is a fully non-informative Dirichlet score. We assume all possible hypothetical structures as the prior belief of BDe, because the problem of BDeu is that it assumes only a uniform distribution as a prior belief. This paper presents a proposal for a non-informative Dirichlet score obtained by averaging over the hypothetical structures serving as a user's prior belief. The optimal ESS of BDeu becomes large/small when the empirical conditional distribution becomes uniform/skewed, because its hypothetical structure assumes a uniform conditional probability distribution and the ESS adjusts the magnitude of the user's belief in a hypothetical structure. However, the ESS of a fully non-informative prior is expected to work effectively as actual pseudo-samples that augment the data, especially when the sample size is small, regardless of the uniformity of the empirical distribution. This is a unique feature of the proposed method, because the previous non-informative methods exclude the ESS from the score (averaged BDeu (Silander, Kontkanen and Myllymäki, 2007; Cano et al., 2011), Normalized Maximum Likelihood (NML) (Silander, Roos, and Myllymäki, 2010)). Some numerical experiments demonstrate that the proposed score improves learning accuracy. The results also suggest that the proposed score might be especially effective for small samples.

2 Learning Bayesian networks

Let {x_1, x_2, ..., x_N} be a set of N discrete variables; each can take a value in the set of states {1, ..., r_i}. Here, x_i = k means that x_i is in state k.
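All of the scores discussed below depend on the data only through the frequencies n_ijk: the number of samples in which x_i takes state k while its parent set takes its j-th configuration. As a minimal illustrative sketch (not the authors' code; variables and states are assumed 0-indexed, and samples are represented as dicts from variable to state):

```python
from collections import Counter

def counts(data, i, parents, r_i):
    """Return one row [n_ij1, ..., n_ijr] per observed parent configuration j,
    where n_ijk counts samples with x_i = k and the parents in configuration j.
    'data' is a list of dicts mapping variable index -> state."""
    tally = Counter((tuple(row[p] for p in parents), row[i]) for row in data)
    configs = sorted({cfg for cfg, _ in tally})
    return [[tally[(cfg, k)] for k in range(r_i)] for cfg in configs]
```

With an empty parent set there is a single (empty) configuration, so the function returns one row of plain state counts.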
According to the Bayesian network structure g ∈ G, the joint probability distribution is given as

p(x_1, x_2, ..., x_N | g) = \prod_{i=1}^{N} p(x_i | \Pi_i, g),    (1)

where G signifies the possible set of Bayesian network structures, and where \Pi_i denotes the parent variable set of x_i. Next, we introduce the problem of learning a Bayesian network. Let \theta_{ijk} be a conditional probability parameter of x_i = k when the j-th instance of the parents of x_i is observed (we write \Pi_i = j). Buntine (1991) assumed the Dirichlet prior and used an expected a posteriori (EAP) estimator as the parameter estimator \hat{\Theta} = (\hat{\theta}_{ijk}), (i = 1, ..., N, j = 1, ..., q_i, k = 1, ..., r_i):

\hat{\theta}_{ijk} = \frac{\alpha_{ijk} + n_{ijk}}{\alpha_{ij} + n_{ij}},  (k = 1, ..., r_i − 1),    (2)

where n_{ijk} represents the number of samples of x_i = k when \Pi_i = j and n_{ij} = \sum_{k=1}^{r_i} n_{ijk}, and where \alpha_{ijk} denotes the hyperparameters of the Dirichlet prior distributions (\alpha_{ijk} is a pseudo-sample count corresponding to n_{ijk}); \alpha_{ij} = \sum_{k=1}^{r_i} \alpha_{ijk}, and \hat{\theta}_{ijr_i} = 1 − \sum_{k=1}^{r_i − 1} \hat{\theta}_{ijk}. The marginal likelihood is obtained as

p(X | \alpha, g) = \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha_{ij})}{\Gamma(\alpha_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha_{ijk} + n_{ijk})}{\Gamma(\alpha_{ijk})}.    (3)

Here, q_i signifies the number of instances of \Pi_i, in which q_i = \prod_{x_l \in \Pi_i} r_l. In addition, X is a dataset. The problem of learning a Bayesian network is to find the MAP structure that maximizes the score (3). Particularly, Heckerman et al. (1995) presented a sufficient condition for satisfying the likelihood equivalence assumption, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence, in the form of the following constraint on the hyperparameters of (3):

\sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \alpha_{ijk} = const.    (4)

Furthermore, they proposed a marginal likelihood that reflects a user's prior knowledge, as
shown below:

p(X | \alpha, g, g^h) = \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})},    (5)

\alpha^{g^h}_{ijk} = \alpha \, p(x_i = k, \Pi^g_i = j | g^h).    (6)

Here, \alpha is the user-determined equivalent sample size (ESS), \Pi^g_i denotes the parent variable set of x_i according to g, and g^h is the hypothetical Bayesian network structure that reflects a user's prior knowledge. This metric was designated as the Bayesian Dirichlet equivalence (BDe) score metric. As Buntine (1991) described, \alpha_{ijk} = \alpha / (r_i q_i) is regarded as a special case of the BDe metric. Heckerman et al. (1995) designated this special case as BDeu. For cases in which we have no prior knowledge, BDeu is often used in practice. Heckerman et al. (1995) reported, as a result of their comparative analyses of BDeu and BDe, that BDeu is better than BDe unless the user's beliefs are close to the true model. BDeu requires an ESS, which is the value of a free parameter specified by the user. Recent reports have described that the ESS in BDeu plays an important role in learning Bayesian networks (Steck and Jaakkola, 2002; Silander, Kontkanen and Myllymäki, 2007). Especially, Ueno (2011) reported that the prior of BDeu does not represent ignorance of prior knowledge but rather a user's prior belief in the uniformity of the conditional distribution. This result is particularly surprising because it had been believed that BDeu has a non-informative prior. In addition, he explained the mechanism by which the optimal ESS becomes large/small when the empirical conditional distribution becomes uniform/skewed, because the ESS determines the magnitude of the user's prior belief in a hypothetical structure. This mechanism causes the BDeu score to be highly sensitive to the ESS.

3 Non-informative Dirichlet Score

This section presents an alternative non-informative prior Dirichlet score, because BDeu has no non-informative prior. To clarify the mechanism of BDe, Ueno (2011) analyzed log-BDe asymptotically and derived the following theorem.

Theorem 1.
(Ueno, 2011) When \alpha + n is sufficiently large, log-BDe converges to

log p(X | \alpha, g, g^h) = log p(\hat{\Theta}^g | X, \alpha, g, g^h) − \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \frac{r_i − 1}{r_i} \log\left(1 + \frac{n_{ijk}}{\alpha^{g^h}_{ijk}}\right) + const,    (7)

where

log p(\hat{\Theta}^g | X, \alpha, g, g^h) = \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} (\alpha^{g^h}_{ijk} + n_{ijk}) \log \frac{\alpha^{g^h}_{ijk} + n_{ijk}}{\alpha^{g^h}_{ij} + n_{ij}},

const collects terms that are independent of the number of parameters, and \hat{\Theta}^g = \{\hat{\theta}^g_{ijk}\}, (i = 1, 2, ..., N, j = 1, ..., q_i, k = 1, ..., r_i), with

\hat{\theta}^g_{ijk} = \frac{\alpha^{g^h}_{ijk} + n_{ijk}}{\alpha^{g^h}_{ij} + n_{ij}}.    (8)

From (7), log-BDe can be decomposed into two factors: 1. a log-posterior term log p(\hat{\Theta}^g | X, \alpha, g, g^h), and 2. a penalty term \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} ((r_i − 1)/r_i) log(1 + n_{ijk}/\alpha^{g^h}_{ijk}); here \sum_{i=1}^{N} q_i (r_i − 1) is the number of parameters. This well-known model selection formula is generally interpreted as follows: 1. reflects the fit to the data, and 2. signifies the penalty that blocks extra arcs from being added. Ueno (2011) described that the term \sum_{i=1}^{N} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} log(1 + n_{ijk}/\alpha^{g^h}_{ijk}) in (7) reflects the difference between the structure learned from data and the hypothetical structure g^h from the user's knowledge in BDe. To the degree that the two structures are equivalent, the penalty term is minimized with the fixed ESS. Conversely, to the degree that the two structures differ, the term is larger. Moreover, from (7), \alpha determines the magnitude of the user's prior belief in the hypothetical structure g^h. On the other hand, BDeu, whose prior distribution assumes a uniform distribution of conditional probabilities, had been believed to employ a non-informative prior. However, from (7), the optimal ESS of BDeu becomes
large/small when the empirical conditional distribution becomes uniform/skewed, because its hypothetical structure g^h assumes a uniform conditional probability distribution and the ESS adjusts the magnitude of the user's belief in the hypothetical structure. Namely, the prior of BDeu does not represent ignorance of prior knowledge but rather a user's prior belief in the uniformity of the conditional distribution. The main purpose of this paper is to develop a Dirichlet score that has a fully non-informative prior. For this purpose, we assume all possible hypothetical structures as the prior belief of BDe, because a problem of BDeu is that it assumes only a uniform distribution as a prior belief. That is, we marginalize the ML over the hypothetical structures g^h ∈ G as follows:

Definition 1. NIP-BDe (Non-informative prior Bayesian Dirichlet equivalence) is defined as

p(X | g, \alpha) = \sum_{g^h \in G} p(g^h) p(X | \alpha, g, g^h)    (9)
= \sum_{g^h \in G} p(g^h) \prod_{i=1}^{N} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})},

where \sum_{g^h \in G} denotes summation over the possible hypothetical structures and p(g^h) is a uniform distribution. To estimate the marginal ML, it is necessary to calculate p(x_i = k, \Pi^g_i = j | g^h) automatically for all the structures. For this purpose, we use an empirical estimate of p(x_i = k, \Pi^g_i = j | g^h, \hat{\Theta}^{g^h}) from data: we estimate the joint probability of x_i and its parent variables in g from the conditional probability parameters \hat{\Theta}^{g^h} estimated given g^h. Our method has the following two steps: 1. Estimate the conditional probability parameter set \hat{\Theta}^{g^h} = \{\hat{\theta}^{g^h}_{ijk}\}, (i = 1, 2, ..., N, j = 1, ..., q_i, k = 1, ..., r_i), given g^h from data. 2. Estimate the joint probability p(x_i = k, \Pi^g_i = j | g^h, \hat{\Theta}^{g^h}). First, we estimate the conditional probability parameters given g^h as

\hat{\theta}^{g^h}_{ijk} = \frac{1/(r_i q^{g^h}_i) + n^{g^h}_{ijk}}{1/q^{g^h}_i + n^{g^h}_{ij}},    (10)

where n^{g^h}_{ijk} represents the number of samples of x_i = k when \Pi^{g^h}_i = j (the parent variable set of x_i given g^h), n^{g^h}_{ij} = \sum_{k=1}^{r_i} n^{g^h}_{ijk}, and q^{g^h}_i denotes the number of instances of \Pi^{g^h}_i.
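The smoothed estimator of eq. (10) yields proper conditional distributions: each row sums to one, and with no data it falls back to the uniform 1/r_i. A hedged sketch, assuming the counts are given per parent configuration of g^h (illustrative only, not the authors' implementation):

```python
def theta_hat(counts_gh, r_i):
    """Eq. (10): theta_ijk = (1/(r_i*q) + n_ijk) / (1/q + n_ij), where q is
    the number of parent configurations of x_i under g^h."""
    q = len(counts_gh)
    table = []
    for n_j in counts_gh:
        denom = 1.0 / q + sum(n_j)
        table.append([(1.0 / (r_i * q) + n) / denom for n in n_j])
    return table
```

For example, with two parent configurations and no observed samples, every entry is 0.5 for a binary variable, while observed counts pull the estimate toward the empirical frequencies.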
Next, we calculate the estimated joint probability p(x_i = k, \Pi^g_i = j | g^h, \hat{\Theta}^{g^h}) as shown below:

p(x_i = k, \Pi^g_i = j | g^h, \hat{\Theta}^{g^h}) = \sum_{x_l \notin \{x_i\} \cup \Pi^g_i} p(x_1, ..., x_i, ..., x_N | g^h, \hat{\Theta}^{g^h}).    (11)

For the computation of (11), we can employ various marginalization algorithms (e.g. variable elimination, factor elimination, random sampling elimination; see, for example, (Darwiche, 2009)). In practice, however, NIP-BDe is difficult to calculate because the product of multiple probabilities suffers serious computational problems. To avoid this, we propose an alternative method, Expected log-BDe, as described below.

Definition 2. Expected log-BDe is defined as

E_{g^h \in G}[log p(X | \alpha, g, g^h)] = \sum_{g^h \in G} p(g^h) \log p(X | \alpha, g, g^h)    (12)
= \sum_{g^h \in G} p(g^h) \sum_{i=1}^{N} \sum_{j=1}^{q_i} \left( \log \frac{\Gamma(\alpha^{g^h}_{ij})}{\Gamma(\alpha^{g^h}_{ij} + n_{ij})} + \sum_{k=1}^{r_i} \log \frac{\Gamma(\alpha^{g^h}_{ijk} + n_{ijk})}{\Gamma(\alpha^{g^h}_{ijk})} \right).

Compared with NIP-BDe, Expected log-BDe is practical to compute because it can be calculated as a sum of log-BDe values. Although similar Bayesian model averaging criteria have already been proposed (Chickering and Heckerman, 2000; Tian, He, and Ram, 2010), their purpose is not to predict the true structure but to seek the optimal structure
which maximizes the inference prediction for new data. Therefore, their purpose is not the same as that of this study.

4 Learning algorithm using dynamic programming

Learning Bayesian networks is known to be an NP-complete problem (Chickering, 1996). Recently, however, exact solution methods can produce results in reasonable computation time if the variables are not prohibitively numerous (e.g. (Silander and Myllymäki, 2006; Malone et al., 2011)). We employ the learning algorithm of Silander and Myllymäki (2006) using dynamic programming. Our algorithm comprises four steps:

1. Compute the local Expected log-BDe scores for all possible N 2^{N−1} (x_i, \Pi^g_i) pairs.
2. For each variable x_i ∈ x, find the best parent set in the parent candidate sets \{\Pi^g_i\} for all \Pi^g_i ⊆ x \ {x_i}, (g ∈ G).
3. Find the best sink for all 2^N variable subsets and the best ordering.
4. Find the optimal Bayesian network.

Only Step 1 differs from the procedure described by Silander and Myllymäki (2006). First, to obtain the local Expected log-BDe score for a variable and its parent-set pair, we should average log-BDe over all possible hypothetical structures. To compute the local Expected log-BDe score for each pair (x_i, \Pi^g_i) in Step 1, conditional frequency tables cft(x_i, \Pi^{g^h}_i) for all possible hypothetical parent sets \{\Pi^{g^h}_i\}, (g^h ∈ G), are needed. Algorithm 1 gives pseudo-code for computing the local Expected log-BDe scores, LS[x_i][\Pi^g_i], for all possible (x_i, \Pi^g_i) pairs. The pseudo-code assumes some helper functions: getCft(x_i, \Pi^g_i) produces the conditional frequency table cft(x_i, \Pi^g_i); getTl(\Pi^g_i, cft[x_i][\Pi^{g^h}_i]) translates the joint probability estimate p(x_i = k, \Pi^g_i = j | g^h, \hat{\Theta}^{g^h}) from cft[x_i][\Pi^{g^h}_i]; and score(\Pi^g_i, Tl[x_i][\Pi^g_i][\Pi^{g^h}_i]) calculates the local log-BDe score of g given a hypothetical structure g^h. First, getCft(x_i, \Pi^g_i) produces the conditional frequency tables cft(x_i, \Pi^g_i) for all possible \Pi^g_i. They are stored in memory or on disk as soon as they are produced. This procedure is the same as that of Silander and Myllymäki (2006).
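The averaging in Step 1 follows directly from the Expected log-BDe definition above: compute the local log-BDe of (x_i, \Pi^g_i) under each hypothetical hyperparameter table and take the mean. The following is a hedged, self-contained illustration under a uniform p(g^h); the construction of the α tables from each g^h via eq. (6) is elided, and the function names are ours, not the paper's:

```python
import math

def local_log_bde(counts, alphas):
    """Local log-BDe contribution of one variable (cf. eq. (5)):
    counts[j][k] = n_ijk, alphas[j][k] = alpha_ijk for parent config j."""
    s = 0.0
    for n_j, a_j in zip(counts, alphas):
        s += math.lgamma(sum(a_j)) - math.lgamma(sum(a_j) + sum(n_j))
        for n, a in zip(n_j, a_j):
            s += math.lgamma(a + n) - math.lgamma(a)
    return s

def expected_log_bde(counts, alpha_tables):
    """Expected log-BDe under a uniform p(g^h): the arithmetic mean of the
    log-BDe scores over the hypothetical hyperparameter tables."""
    return sum(local_log_bde(counts, a) for a in alpha_tables) / len(alpha_tables)
```

As a sanity check, for a single binary variable with no parents, counts (1, 1), and hyperparameters (0.5, 0.5), the marginal likelihood is Γ(1)/Γ(3) · (Γ(1.5)/Γ(0.5))² = 1/8.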
Next, gett l(π g, cft[x ][Π gh ]) calculates the jont probablty estmate x = k, Π g = j gh, Θ gh ) usng the stored condtonal frequency tables by margnalzng the product of factors n Θ gh ncludng {x, Π g }. Ths procedure runs n O( {x Π g } exw)) gven an elmnaton order of wdth w. The local Expected log-bde, LS[x ][Π g ], can be calculated as the summaton of local log-bde values for all local possble structures Π gh. Algorthm getlocalscore(x). for all x x do for all {Π g } x \ {x } do f cft[x ][Π g ] = null then cft[x ][Π g ] getcft(x, Πg ) end f end for for all {Π g } x \ {x} do for all {Π gh } x \ {x } do T l[x ][Π g ][Πgh ] gett l(π g, cft[x][πgh ]) LS[x ][Π g ] LS[x ][Π g ] + score (Π g, T l[x ][Π g ][Πgh ]) end for LS[x ][Π g ] LS[x ][Π g ]/ {Πg x \ {x }} end for end for f x > then getlocalscore(x \ {x }) end f The tme complexty of the Algorthm s O(N (N ) exw)). After computng the local scores, we fnd the optmal Bayesan network n Steps, and of our algorthm. Steps are the same as those proposed by Slander et al. (6). Although the tradtonal scores (BDeu, AIC, BIC, and so on) run n O(N (N ) ), the proposed method requres greater computaton costs. However, the requred memory for the proposed computaton s equal to that of the tradtonal scores.
[Figure 1: g1: Strongly skewed distribution (root probability 0.5; conditional probabilities 0.05 and 0.95).]
[Figure 2: g2: Skewed distribution (root probability 0.5; conditional probabilities 0.2 and 0.8).]
[Figure 3: g3: Uniform distribution (root probability 0.5; conditional probabilities 0.3 and 0.7).]
[Figure 4: g4: Strongly uniform distribution (root probability 0.5; conditional probabilities 0.35 and 0.65).]

5 Simulation experiments

We conducted simulation experiments to compare Expected log-BDe with BDeu according to the procedure described by Ueno (2011). We used small network structures with binary variables in Figs. 1-4, in which the distributions are changed from skewed to uniform. Figure 1 presents a structure in which the conditional probabilities differ greatly according to the parent variable states (g1: strongly skewed distribution). By gradually reducing the difference of the conditional probabilities from Fig. 1, we generated Fig. 2 (g2: skewed distribution), Fig. 3 (g3: uniform distribution), and Fig. 4 (g4: strongly uniform distribution).

[Table 1: Learning performance of BDeu (structures g1-g4; α = 0.1, 1.0, 10).]

Procedures used for the simulation experiments are described below.

1. We generated 100, 500, and 1,000 samples
from the four figures.

2. Using BDeu and Expected log-BDe with varying α (0.1, 1.0, 10), Bayesian network structures were estimated based on the 100, 500, and 1,000 samples. We searched for the exactly true structure.

3. The number of times the estimated structure was the true structure (when the Structural Hamming Distance (SHD) is zero; Tsamardinos, Brown, and Aliferis, 2006) was counted by repeating procedures 1 and 2.

[Table 2: Learning performance of Expected log-BDe (structures g1-g4; α = 0.1, 1.0, 10).]

We employed the variable elimination algorithm (Darwiche, 2009) for the marginalization in (11); because our experiments use only a five-node network, its computational cost is not burdensome. Table 1 presents the results for BDeu, and Table 2 presents the results for Expected log-BDe. The first column shows the number of correct-structure estimates in the trials. Column SHD shows the Structural Hamming Distance. Column ME shows the total number of missing arcs, column EE the total number of extra arcs, column WO the total number of arcs with wrong orientation, column MO the total number of arcs with missing orientation, and column EO the total number of arcs with extra orientation in the case of likelihood equivalence. The results presented in Table 1 reveal that BDeu is highly sensitive to α. In Table 1, the optimum value of α becomes small as the conditional distribution becomes skewed, and conversely becomes large as the conditional distribution becomes uniform. As shown in Table 2, the performances of Expected log-BDe are better than those of BDeu in almost all cases. The results also show that the BDeu performances are highly sensitive to α, whereas those of Expected log-BDe are robust to α. Additionally, it is noteworthy that the performance of BDeu is far worse than that of Expected log-BDe for g4 with α = 0.1 and for g1 with α = 10. The reason is that the optimal α becomes large/small for uniform/skewed conditional probabilities because BDeu assumes a uniform conditional prior.
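The sample-generation step of the procedure can be sketched by ancestral (forward) sampling: draw each variable in topological order from its conditional distribution given the already-sampled parents. The two-variable network and probabilities below are illustrative placeholders in the spirit of the strongly skewed structure g1, not the exact distributions of Figs. 1-4:

```python
import random

def forward_sample(order, parents, cpt, rng):
    """Draw one joint sample: 'order' is a topological order, parents[i]
    lists i's parents, and cpt[i] maps a tuple of parent values to
    p(x_i = 1 | parents)."""
    x = {}
    for i in order:
        p1 = cpt[i][tuple(x[p] for p in parents[i])]
        x[i] = 1 if rng.random() < p1 else 0
    return x

rng = random.Random(0)
parents = {0: [], 1: [0]}
# Strongly skewed conditional distribution, as in g1's style.
cpt = {0: {(): 0.5}, 1: {(0,): 0.05, (1,): 0.95}}
samples = [forward_sample([0, 1], parents, cpt, rng) for _ in range(2000)]
```

With a skewed conditional distribution like this one, the child agrees with its parent in roughly 95% of the generated samples.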
Although the optimal α for BDeu is highly sensitive to the uniformity of the conditional probability distribution, the optimal α of the proposed method is robust to that uniformity. Especially for small samples, a large α setting for the proposed method works effectively, because the learning performances of α = 10 for n = 100 are better than those of α = 0.1, which means that the ESS of Expected log-BDe might be affected only by the sample size. The results also suggest that the optimal ESS increases as the sample size becomes small. Therefore, the ESS of the proposed method works effectively as actual pseudo-samples that augment the data, especially when the sample size is small.

6 Conclusions

This paper presented a proposal for a non-informative Dirichlet score. The results suggest that the proposed method is effective especially for a small sample size. The future tasks are the following:

1. Analysis of the proposed method using various simulation data.
2. Improvement of the learning algorithm.
3. Determination of the optimal α for a given data size.

Furthermore, NML (Silander, Roos, and Myllymäki, 2010) is known as an alternative information-theoretic approach for circumventing the problem with the ESS. It is also an important future task to compare the performances of these non-informative learning scores.

References

W.L. Buntine. 1991. Theory refinement on Bayesian networks. In B. D'Ambrosio, P. Smets and P. Bonissone (eds.), Proc. of the Seventh Conference on Uncertainty in Artificial Intelligence, pages 52-60.

E. Castillo, A.S. Hadi and C. Solares. 1997. Learning and updating of uncertainty in Dirichlet models. Machine Learning, 26, pages 43-63.

D. Chickering. 1996. Learning Bayesian networks is NP-complete. In Learning from Data: Artificial Intelligence and Statistics V, pages 121-130.

D.M. Chickering and D. Heckerman. 2000. A comparison of scientific and engineering criteria for Bayesian model selection. Statistics and Computing, 10.

A. Cano, M. Gomez-Olmedo, A.R. Masegosa, and S. Moral. 2011. Locally Averaged Bayesian Dirichlet Metrics. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2011), pages 217-228.

D. Heckerman, D. Geiger and D. Chickering. 1995. Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning, 20, pages 197-243.

B.M. Malone, C.
Yuan, E.A. Hansen, and S. Bridges. 2011. Improving the Scalability of Optimal Bayesian Network Learning with External-Memory Frontier Breadth-First Branch and Bound Search. In A. Pfeffer and F.G. Cozman (eds.), Proc. of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence.

T. Silander and P. Myllymäki. 2006. A simple approach for finding the globally optimal Bayesian network structure. In R. Dechter and T. Richardson (eds.), Proc. of the 22nd Conference on Uncertainty in Artificial Intelligence, pages 445-452.

T. Silander, P. Kontkanen and P. Myllymäki. 2007. On sensitivity of the MAP Bayesian network structure to the equivalent sample size parameter. In K.B. Laskey, S.M. Mahoney and J. Goldsmith (eds.), Proc. of the 23rd Conference on Uncertainty in Artificial Intelligence.

T. Silander, T. Roos and P. Myllymäki. 2010. Learning Locally Minimax Optimal Bayesian Networks. International Journal of Approximate Reasoning, 51(5).

H. Steck and T.S. Jaakkola. 2002. On the Dirichlet Prior and Bayesian Regularization. In S. Becker, S. Thrun, and K. Obermayer (eds.), Advances in Neural Information Processing Systems (NIPS).

H. Steck. 2008. Learning the Bayesian network structure: Dirichlet Prior versus Data. In D.A. McAllester and P. Myllymäki (eds.), Proc. of the 24th Conference on Uncertainty in Artificial Intelligence.

J. Tian, R. He and L. Ram. 2010. Bayesian Model Averaging Using the k-best Bayesian Network Structures. In P. Grünwald and P. Spirtes (eds.), Proc. of the 26th Conference on Uncertainty in Artificial Intelligence.

I. Tsamardinos, L.E. Brown, and C.F. Aliferis. 2006. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1), pages 31-78.

M. Ueno. 2010. Learning networks determined by the ratio of prior and data. In P. Grünwald and P. Spirtes (eds.), Proc. of the 26th Conference on Uncertainty in Artificial Intelligence.

M. Ueno. 2011. Robust learning Bayesian networks for prior belief. In A. Pfeffer and F.G. Cozman (eds.), Proc. of the 27th Conference on Uncertainty in Artificial Intelligence.

A. Darwiche. 2009. Modeling and Reasoning with Bayesian Networks. Cambridge University Press.
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationIntroduction to Hidden Markov Models
Introducton to Hdden Markov Models Alperen Degrmenc Ths document contans dervatons and algorthms for mplementng Hdden Markov Models. The content presented here s a collecton of my notes and personal nsghts
More informationLecture 12: Classification
Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors
Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference
More informationOne-sided finite-difference approximations suitable for use with Richardson extrapolation
Journal of Computatonal Physcs 219 (2006) 13 20 Short note One-sded fnte-dfference approxmatons sutable for use wth Rchardson extrapolaton Kumar Rahul, S.N. Bhattacharyya * Department of Mechancal Engneerng,
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationStatistical inference for generalized Pareto distribution based on progressive Type-II censored data with random removals
Internatonal Journal of Scentfc World, 2 1) 2014) 1-9 c Scence Publshng Corporaton www.scencepubco.com/ndex.php/ijsw do: 10.14419/jsw.v21.1780 Research Paper Statstcal nference for generalzed Pareto dstrbuton
More informationWhy BP Works STAT 232B
Why BP Works STAT 232B Free Energes Helmholz & Gbbs Free Energes 1 Dstance between Probablstc Models - K-L dvergence b{ KL b{ p{ = b{ ln { } p{ Here, p{ s the eact ont prob. b{ s the appromaton, called
More informationBayesian Planning of Hit-Miss Inspection Tests
Bayesan Plannng of Ht-Mss Inspecton Tests Yew-Meng Koh a and Wllam Q Meeker a a Center for Nondestructve Evaluaton, Department of Statstcs, Iowa State Unversty, Ames, Iowa 5000 Abstract Although some useful
More information18.1 Introduction and Recap
CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng
More informationEstimation: Part 2. Chapter GREG estimation
Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the
More informationGlobal Sensitivity. Tuesday 20 th February, 2018
Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationChapter 11: Simple Linear Regression and Correlation
Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests
More informationPredictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore
Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.
More informationThe Study of Teaching-learning-based Optimization Algorithm
Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute
More informationDepartment of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6
Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.
More informationFinite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin
Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More informationNegative Binomial Regression
STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...
More informationBayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County
Smart Home Health Analytcs Sprng 2018 Bayesan Learnng Nrmalya Roy Department of Informaton Systems Unversty of Maryland Baltmore ounty www.umbc.edu Bayesan Learnng ombnes pror knowledge wth evdence to
More informationCOMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationDETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH
Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationStatistics for Economics & Business
Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable
More informationEnsemble Methods: Boosting
Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement
More informationExpectation Maximization Mixture Models HMMs
-755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood
More informationProbability Theory (revisited)
Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationParametric fractional imputation for missing data analysis
Secton on Survey Research Methods JSM 2008 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Wayne Fuller Abstract Under a parametrc model for mssng data, the EM algorthm s a popular tool
More informationThe Second Anti-Mathima on Game Theory
The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player
More informationThis column is a continuation of our previous column
Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard
More informationLimited Dependent Variables
Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages
More informationMACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression
11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING
More informationUsing the estimated penetrances to determine the range of the underlying genetic model in casecontrol
Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable
More informationBAYESIAN networks are a popular class of probabilistic
Explotng Experts Knowledge for Structure Learnng of Bayesan Networks Hossen Amrkhan, Mohammad Rahmat, Peter J.F. Lucas, and Arjen Hommersom 1 Abstract Learnng Bayesan network structures from data s known
More informationSpace of ML Problems. CSE 473: Artificial Intelligence. Parameter Estimation and Bayesian Networks. Learning Topics
/7/7 CSE 73: Artfcal Intellgence Bayesan - Learnng Deter Fox Sldes adapted from Dan Weld, Jack Breese, Dan Klen, Daphne Koller, Stuart Russell, Andrew Moore & Luke Zettlemoyer What s Beng Learned? Space
More informationOn the Multicriteria Integer Network Flow Problem
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationComparison of Regression Lines
STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence
More informationMAXIMUM A POSTERIORI TRANSDUCTION
MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,
More informationSpeech and Language Processing
Speech and Language rocessng Lecture 3 ayesan network and ayesan nference Informaton and ommuncatons Engneerng ourse Takahro Shnozak 08//5 Lecture lan (Shnozak s part) I gves the frst 6 lectures about
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationMotion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong
Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal
More informationBOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu
BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com
More information3.1 ML and Empirical Distribution
67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum
More informationMachine learning: Density estimation
CS 70 Foundatons of AI Lecture 3 Machne learnng: ensty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square ata: ensty estmaton {.. n} x a vector of attrbute values Objectve: estmate the model of
More informationChapter 15 - Multiple Regression
Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term
More informationOn the Repeating Group Finding Problem
The 9th Workshop on Combnatoral Mathematcs and Computaton Theory On the Repeatng Group Fndng Problem Bo-Ren Kung, Wen-Hsen Chen, R.C.T Lee Graduate Insttute of Informaton Technology and Management Takmng
More informationA quantum-statistical-mechanical extension of Gaussian mixture model
A quantum-statstcal-mechancal extenson of Gaussan mxture model Kazuyuk Tanaka, and Koj Tsuda 2 Graduate School of Informaton Scences, Tohoku Unversty, 6-3-09 Aramak-aza-aoba, Aoba-ku, Senda 980-8579, Japan
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More informationRelevance Vector Machines Explained
October 19, 2010 Relevance Vector Machnes Explaned Trstan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introducton Ths document has been wrtten n an attempt to make Tppng s [1] Relevance Vector Machnes
More informationAdditional File 1 - Detailed explanation of the expression level CPD
Addtonal Fle - Detaled explanaton of the expreon level CPD A mentoned n the man text, the man CPD for the uterng model cont of two ndvdual factor: P( level gen P( level gen P ( level gen 2 (.).. CPD factor
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation
Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear
More informationMATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)
1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons
More informationChapter - 2. Distribution System Power Flow Analysis
Chapter - 2 Dstrbuton System Power Flow Analyss CHAPTER - 2 Radal Dstrbuton System Load Flow 2.1 Introducton Load flow s an mportant tool [66] for analyzng electrcal power system network performance. Load
More informationTurbulence classification of load data by the frequency and severity of wind gusts. Oscar Moñux, DEWI GmbH Kevin Bleibler, DEWI GmbH
Turbulence classfcaton of load data by the frequency and severty of wnd gusts Introducton Oscar Moñux, DEWI GmbH Kevn Blebler, DEWI GmbH Durng the wnd turbne developng process, one of the most mportant
More informationSingular Value Decomposition: Theory and Applications
Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More informationOutline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1]
DYNAMIC SHORTEST PATH SEARCH AND SYNCHRONIZED TASK SWITCHING Jay Wagenpfel, Adran Trachte 2 Outlne Shortest Communcaton Path Searchng Bellmann Ford algorthm Algorthm for dynamc case Modfcatons to our algorthm
More informationReal-Time Systems. Multiprocessor scheduling. Multiprocessor scheduling. Multiprocessor scheduling
Real-Tme Systems Multprocessor schedulng Specfcaton Implementaton Verfcaton Multprocessor schedulng -- -- Global schedulng How are tasks assgned to processors? Statc assgnment The processor(s) used for
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More information