Density Propagation and Improved Bounds on the Partition Function
Stefano Ermon, Carla P. Gomes — Dept. of Computer Science, Cornell University, Ithaca, NY 14853, U.S.A.
Ashish Sabharwal — IBM Watson Research Ctr., Yorktown Heights, NY 10598, U.S.A.
Bart Selman — Dept. of Computer Science, Cornell University, Ithaca, NY 14853, U.S.A.

Abstract

Given a probabilistic graphical model, its density of states is a function that, for any likelihood value, gives the number of configurations with that probability. We introduce a novel message-passing algorithm called Density Propagation (DP) for estimating this function. We show that DP is exact for tree-structured graphical models and is, in general, a strict generalization of both sum-product and max-product algorithms. Further, we use density of states and tree decomposition to introduce a new family of upper and lower bounds on the partition function. For any tree decomposition, the new upper bound based on finer-grained density of state information is provably at least as tight as previously known bounds based on convexity of the log-partition function, and strictly stronger if a general condition holds. We conclude with empirical evidence of improvement over convex relaxations and mean-field based bounds.

1 Introduction

Associated with any undirected graphical model [1] is the so-called density of states, a term borrowed from statistical physics indicating a function that, for any likelihood value, gives the number of configurations with that probability. The density of states plays an important role in statistical physics because it provides a fine-grained description of the system, and can be used to efficiently compute many properties of interest, such as the partition function and its parameterized version [2, 3]. It can be seen that computing the density of states is computationally intractable in the worst case, since it subsumes a #P-complete problem (computing the partition function) and an NP-hard one (MAP inference).
All current approximate techniques estimating the density of states are based on sampling, the most prominent being the Wang-Landau algorithm [3] and its improved variants [2]. These methods have been shown to be very effective in practice. However, they do not provide any guarantee on the quality of the results. Furthermore, they ignore the structure of the underlying graphical model, effectively treating the energy function (which gives the log-likelihood of a configuration) as a black box. As a first step towards exploiting the structure of the graphical model when computing the density of states, we propose an algorithm called DENSITYPROPAGATION (DP). The algorithm is based on dynamic programming and can be conveniently expressed in terms of message passing on the graphical model. We show that DENSITYPROPAGATION computes the density of states exactly for any tree-structured graphical model. It is closely related to the popular Sum-Product (Belief Propagation, BP) and Max-Product (MP) algorithms, and can be seen as a generalization of both. However, it computes something much richer, namely the density of states, which contains information such as the partition function and variable marginals. Although we do not work at the level of individual configurations, DENSITYPROPAGATION allows us to reason in terms of groups of configurations with the same probability (energy). Being able to solve inference tasks for certain tractable classes of problems (e.g., trees) is important because one can often decompose a complex problem into tractable subproblems (such as spanning
trees) [4], and the solutions to these simpler problems can be combined to recover useful properties of the original graphical model [5, 6]. In this paper we show that by combining the additional information given by the density of states, we can obtain a new family of upper and lower bounds on the partition function. We prove that the new upper bound is always at least as tight as the one based on the convexity of the log-partition function [4], and we provide a general condition under which the new bound is strictly tighter. Further, we illustrate empirically that the new upper bound improves upon the convexity-based one on Ising grid and clique models, and that the new lower bound is empirically slightly stronger than the one given by mean-field theory [4, 7].

2 Problem definition and setup

We consider a graphical model specified as a factor graph with N = |V| discrete random variables x_i, i ∈ V, where x_i ∈ X_i. The global random vector x = {x_s, s ∈ V} takes values in the Cartesian product X = X_1 × X_2 × ... × X_N, with cardinality D = |X| = ∏_{i=1}^N |X_i|. We consider a probability distribution over elements x ∈ X (called configurations)

    p(x) = (1/Z) ∏_{α∈I} ψ_α({x}_α)    (1)

that factors into potentials or factors ψ_α : {x}_α → R+, where I is an index set and {x}_α ⊆ V is the subset of variables that factor ψ_α depends on. The corresponding factor graph is a bipartite graph with vertex set V ∪ I. In the factor graph, each variable node i ∈ V is connected with all the factors α ∈ I that depend on i. Similarly, each factor node α ∈ I is connected with all the variable nodes i ∈ {x}_α. We denote the neighbors of i and α by N(i) and N(α), respectively. We will also make use of the related exponential representation [8]. Let φ be a collection of potential functions {φ_α, α ∈ I}, defined over the index set I.
Given an exponential parameter vector Θ = {Θ_α, α ∈ I}, the exponential family defined by φ is the family of probability distributions over X defined as follows:

    p(x, Θ) = (1/Z(Θ)) exp(Θ · φ(x)) = (1/Z(Θ)) exp( Σ_{α∈I} Θ_α φ_α({x}_α) )    (2)

Given an exponential family, we define the density of states [2] as the function

    n : (E, Θ) ↦ |{x ∈ X : Θ · φ(x) = E}|

where for any exponential parameter Θ it holds that ∫ n(E, Θ) dE = |X|. We will refer to the quantity Σ_{α∈I} Θ_α φ_α({x}_α) as the energy of a configuration x. The density n(E, Θ) is the partition function for the microcanonical ensemble (isolated system at equilibrium with constant energy, volume, and number of particles), while Z(Θ) is the partition function for the traditional canonical ensemble. We will denote with n(E) (omitting the parameter) the density of states of the original factor graph.

3 Density Propagation

Since any propositional Satisfiability (SAT) instance can be efficiently encoded as a factor graph (e.g., by defining a uniform probability measure over satisfying assignments), it is clear that computing the density of states is computationally intractable in the worst case, as a generalization of an NP-complete problem (satisfiability testing) and a #P-complete problem (model counting). We show that the density of states can be computed efficiently¹ for acyclic graphical models. We provide a dynamic programming algorithm, which can also be interpreted as a message passing algorithm on the factor graph, called DENSITYPROPAGATION (DP), which computes the density of states exactly for acyclic graphical models.

¹ Polynomial in the cardinality of the function's support, which could be exponential in N in the worst case.
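To make the definition concrete, here is a minimal brute-force sketch (not part of the paper's algorithm; the two-factor chain over three binary variables is a hypothetical model chosen only for illustration) that tabulates n(E) by enumeration and recovers Z = Σ_E n(E) exp(E):

```python
from collections import Counter
from itertools import product
from math import exp, isclose

def density_of_states(num_vars, factors):
    """Tabulate n(E): the number of configurations at each energy level.

    `factors` maps a scope (tuple of variable indices) to a function
    returning that factor's energy contribution log psi_alpha.
    """
    n = Counter()
    for x in product([0, 1], repeat=num_vars):
        E = sum(f(tuple(x[i] for i in scope)) for scope, f in factors.items())
        n[E] += 1
    return dict(n)

# Hypothetical 3-variable chain with two "agreement" factors of weight 1:
# energy(x) = 1{x0 = x1} + 1{x1 = x2}.
factors = {(0, 1): lambda v: float(v[0] == v[1]),
           (1, 2): lambda v: float(v[0] == v[1])}
n = density_of_states(3, factors)          # {2.0: 2, 1.0: 4, 0.0: 2}
Z = sum(c * exp(E) for E, c in n.items())  # canonical partition function
```

The density sums to |X| = 8 as required, and Z is recovered exactly from the bucketed counts, which illustrates why the density of states is a strictly richer object than Z alone.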
3.1 Density propagation equations

DENSITYPROPAGATION works by exchanging messages from variable to factor nodes and vice versa. Unlike traditional message passing algorithms, where messages represent marginal probabilities (vectors of real numbers), for every x_i ∈ X_i a DENSITYPROPAGATION message m_{i→a}(x_i) represents an unnormalized discrete probability distribution with a finite alphabet (a marginal density of states). We use the notation m_{i→a}(x_i)(E) to denote the value of the function m_{i→a}(x_i) evaluated at point E. At every iteration, messages are updated according to the following rules. The message from variable node i to factor node a is updated as follows:

    m_{i→a}(x_i) = ⊛_{b∈N(i)\a} m_{b→i}(x_i)    (3)

where ⊛ is the convolution operator (commutative, associative, and distributive). Intuitively, the convolution operation corresponds to working with the sum of (conditionally) independent random variables, such as the ones corresponding to different subtrees in a tree-structured graphical model. The message from factor a to variable i is updated as follows:

    m_{a→i}(x_i) = Σ_{{x}_α \ x_i} ( ⊛_{j∈N(a)\i} m_{j→a}(x_j) ) ⊛ δ_{E_α({x}_α)}    (4)

where δ_{E_α({x}_α)} is a Dirac delta function centered at E_α({x}_α) = log ψ_α({x}_α). For tree-structured graphical models, DENSITYPROPAGATION converges after a finite number of iterations, independent of the initial condition, to the true density of states. Formally,

Theorem 1. For any variable s ∈ V and any E, for any initial condition, after a finite number of iterations

    Σ_{q∈X_s} ( ⊛_{b∈N(s)} m_{b→s}(q) )(E) = |{x ∈ X : Σ_α log ψ_α({x}_α) = E}|.

The proof is by induction on the size of the tree (omitted due to lack of space). The most efficient message update schedule for tree-structured models is a two-pass procedure where messages are first sent from the leaves to the root node, and then propagated backwards from the root to the leaves. However, as with other message-passing algorithms, for tree-structured problems the algorithm will converge with either a sequential or a parallel update schedule, with any initial condition for the messages.
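As a sketch of the update in equation (3), a message can be represented as a dictionary mapping energy to configuration count, and the product over incoming messages becomes a convolution of these bucketed densities (the bucket values below are hypothetical, not taken from the paper):

```python
from collections import Counter

def convolve(m1, m2):
    """Convolution of two bucketed densities (energy -> count): the density
    of the sum of two independent energy contributions, as in equation (3)."""
    out = Counter()
    for e1, c1 in m1.items():
        for e2, c2 in m2.items():
            out[e1 + e2] += c1 * c2
    return dict(out)

# Two incoming messages m_{b->i}(x_i) for a fixed value x_i (illustrative):
m_b = {0.0: 1, 1.0: 1}
m_c = {0.0: 1, 2.0: 1}
m_i_to_a = convolve(m_b, m_c)  # {0.0: 1, 1.0: 1, 2.0: 1, 3.0: 1}
```

Since convolution is commutative and associative, the order in which incoming messages are combined does not matter; the cost of one update is the product of the bucket counts, which is why DP updates are more expensive than BP updates.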
Although DP requires the same number of message updates as BP and MP, DP updates are more expensive because they require the computation of convolutions. In the worst case, the density of states can have an exponential number of non-zero entries (i.e., values E such that n(E) > 0, which we will also refer to as "buckets"), for instance when potentials are set to logarithms of prime numbers, so that every x ∈ X has a different probability. However, in many practical problems of interest (e.g., Ising models, grounded Markov Logic Networks [9]), the number of energy buckets is limited. Another key property of equations (4) and (3) is that, unlike in the Belief Propagation and Max-Product algorithms, the message update operator is linear, although in a higher dimensional space of probability distributions.

3.2 Relationship with sum and max product algorithms

DENSITYPROPAGATION is closely related to traditional message passing algorithms such as BP (Belief Propagation, Sum-Product) and MP (Max-Product), since it is based on the same (conditional) independence assumptions. Specifically, as shown by the next theorem, both BP and MP can be seen as simplified versions of DENSITYPROPAGATION that only consider certain global statistics of the distributions represented by DENSITYPROPAGATION messages.

Theorem 2. Assuming the same initial condition and message update schedule, at every iteration k we can recover Belief Propagation and Max-Product marginals from DENSITYPROPAGATION messages.

Proof. The Max-Product algorithm corresponds to considering only the entry associated with the highest probability, i.e., max{E : m_{i→j}(x_j)(E) > 0}. For compactness, let us define this quantity γ_{i→j}(x_j) = max_E {E : m_{i→j}(x_j)(E) > 0}. According to the DP update in equation (3), the quantities γ_{i→a}(x_i) are updated as follows:

    γ_{i→a}(x_i) = max{E : ( ⊛_{b∈N(i)\a} m_{b→i}(x_i) )(E) > 0} = Σ_{b∈N(i)\a} γ_{b→i}(x_i)
Using equation (4),

    γ_{a→i}(x_i) = max{E : Σ_{{x}_α\x_i} ( ⊛_j m_{j→a}(x_j) ⊛ δ_{E_α({x}_α)} )(E) > 0} = max_{{x}_α\x_i} ( Σ_{j∈N(a)\i} γ_{j→a}(x_j) + E_α({x}_α) )

These results show that the quantities γ_{i→j}(x_j) are updated according to the Max-Product algorithm (with messages in log-scale). To see the relationship with BP, for every DP message m_{i→j}(x_j), let us define

    µ_{i→j}(x_j) = ⟨m_{i→j}(x_j), exp(E)⟩ = Σ_E m_{i→j}(x_j)(E) exp(E)

Notice that µ_{i→j}(x_j) would correspond to an unnormalized marginal probability, assuming that m_{i→j}(x_j) is the density of states of the problem when variable j is clamped to value x_j. According to the DP update in equation (3), the quantities µ_{i→a}(x_i) are updated as follows:

    µ_{i→a}(x_i) = ⟨m_{i→a}(x_i), exp(E)⟩ = ⟨⊛_{b∈N(i)\a} m_{b→i}(x_i), exp(E)⟩ = ∏_{b∈N(i)\a} µ_{b→i}(x_i)

that is, we recover the BP updates of messages from variable to factor nodes. Similarly, using (4),

    µ_{a→i}(x_i) = ⟨m_{a→i}(x_i), exp(E)⟩ = Σ_{{x}_α\x_i} ⟨( ⊛_j m_{j→a}(x_j) ) ⊛ δ_{E_α({x}_α)}, exp(E)⟩ = Σ_{{x}_α\x_i} ψ_α({x}_α) ∏_{j∈N(a)\i} µ_{j→a}(x_j)

and we recover the BP updates from factors to variable nodes for the µ quantities (which correspond to marginals computed according to the estimated densities). Similarly, if we define temperature versions of the marginals, µ^T_{i→j}(x_j) = ⟨m_{i→j}(x_j), exp(E/T)⟩, we recover the temperature versions of the Belief Propagation updates, similar to [10] and [11].

As with other message passing algorithms, DENSITYPROPAGATION updates are well defined also for loopy graphical models, even though there is no guarantee of convergence or correctness [12]. The correspondence with BP and MP (Theorem 2) however still holds: if loopy BP converges, then the corresponding quantities µ_{i→j} computed from DP messages will converge as well, and to the same value (assuming the same initial condition and update schedule). Notice however that the convergence of the µ_{i→j} does not imply the convergence of the DENSITYPROPAGATION messages (e.g., in probability, law, or L_p). In fact, we have observed empirically that the situation where the µ_{i→j} converge but the m_{i→j} do not converge (not even in distribution) is fairly common.
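The two statistics used in the proof can be sketched directly (the message below is a hypothetical bucketed density, not one computed in the paper): the max-product quantity γ is the largest energy with a non-empty bucket, while the sum-product quantity µ is the inner product of the message with exp(E):

```python
from math import exp, isclose

m = {0.0: 2, 1.0: 4, 2.0: 2}  # a toy DP message: energy -> count

# Max-Product statistic (log-scale MP message): highest non-empty bucket.
gamma = max(E for E, c in m.items() if c > 0)

# Sum-Product statistic: unnormalized BP marginal mu = <m, exp(E)>.
mu = sum(c * exp(E) for E, c in m.items())

# Temperature version mu_T = <m, exp(E/T)>, recovering tempered BP at T != 1.
def mu_temp(m, T):
    return sum(c * exp(E / T) for E, c in m.items())
```

Both statistics discard most of the information carried by m, which is the sense in which BP and MP are simplified versions of DP.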
It would be interesting to see if there is a variational interpretation for the DENSITYPROPAGATION equations, as in [13]. Notice also that Junction Tree style algorithms could be used in conjunction with DP updates for the messages.

4 Bounding the density of states using tractable families

Using techniques like DENSITYPROPAGATION, we can compute the density of states exactly for tractable families such as tree-structured graphical models. Let p(x, Θ*) be a general (intractable) probabilistic model of interest, and let Θ_1, ..., Θ_n be a family of tractable parameters (e.g., corresponding to trees) such that Θ* is a convex combination of the Θ_i, as defined formally below and used previously by Wainwright et al. [5, 6]. See below (Figure 1) for an example of a possible decomposition of an Ising model into tractable distributions. By computing the partition function or MAP estimates for the tree-structured subproblems, Wainwright et al. showed that one can recover useful information about the original intractable problem, for instance by exploiting convexity of the log-partition function log Z(Θ).
We present a way to exploit the decomposition idea to derive an upper bound on the density of states n(E, Θ*) of the original intractable model, despite the fact that the density of states is not a convex function. The result below gives a point-by-point upper bound which, to the best of our knowledge, is the first bound of this kind for the density of states.

Theorem 3. Let Θ* = Σ_{i=1}^n γ_i Θ_i with Σ_{i=1}^n γ_i = 1, and let y_n = E − Σ_{i=1}^{n−1} y_i. Then

    n(E, Θ*) ≤ ∫ ... ∫ min_{i=1,...,n} { n(y_i, γ_i Θ_i) } dy_1 dy_2 ... dy_{n−1}

Proof. From the definition of the density of states and using 1{·} to denote the 0-1 indicator function,

    n(E, Θ*) = Σ_x 1{Θ*·φ(x) = E} = Σ_x 1{(Σ_i γ_i Θ_i)·φ(x) = E}
             = Σ_x ∫ ... ∫ ∏_{i=1}^n 1{γ_i Θ_i·φ(x) = y_i} dy_1 ... dy_{n−1}
             = ∫ ... ∫ Σ_x ∏_{i=1}^n 1{γ_i Θ_i·φ(x) = y_i} dy_1 ... dy_{n−1}    (exchanging the finite sum and the integrals)
             ≤ ∫ ... ∫ min_{i=1,...,n} { Σ_x 1{γ_i Θ_i·φ(x) = y_i} } dy_1 ... dy_{n−1}

where y_n = E − Σ_{i=1}^{n−1} y_i. Observing that Σ_x 1{γ_i Θ_i·φ(x) = y_i} is precisely n(y_i, γ_i Θ_i) finishes the proof.

5 New bounds on the partition function

The density of states n(E, Θ*) can be used to compute the partition function, since by definition Z(Θ*) = Σ_E n(E, Θ*) exp(E). We can therefore get an upper bound on Z(Θ*) by integrating the point-by-point upper bound on n(E, Θ*) from Theorem 3. This bound can be tighter than the known bound [6] obtained by applying Jensen's inequality to the log-partition function (which is convex), given by log Z(Θ*) ≤ Σ_i γ_i log Z(Θ_i). For instance, consider a graphical model with weights large enough that the density-of-states based sum defining Z(Θ*) is dominated by the contribution of the highest-energy bucket. As a concrete example, consider the decomposition in Figure 1, where the original edge weight is w (w = 1 in the figure, so the tractable distributions have edge weight 2w = 2). As w grows, the convexity-based bound will approximately equal the geometric average of 2 exp(6w) and 8 exp(2w), which is 4 exp(4w). On the other hand, the bound based on Theorem 3 will approximately equal min{2, 8} exp((2 + 6)w/2) = 2 exp(4w). In general, the latter bound will always be strictly better for large enough w unless the highest-energy bucket counts are identical across all Θ_i.
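In the common case of bucketed (discrete) densities and n = 2 components, the integral in Theorem 3 becomes a sum, and the bound reads n(E, Θ*) ≤ Σ_y min{n(y, γ_1Θ_1), n(E − y, γ_2Θ_2)}. A small sketch with made-up component densities (chosen only for illustration):

```python
def pointwise_upper_bound(n1, n2):
    """Discrete Theorem 3 for n = 2: bound(E) = sum_y min{n1(y), n2(E - y)},
    where n1, n2 are the bucketed densities of gamma_i * Theta_i."""
    support = {y1 + y2 for y1 in n1 for y2 in n2}
    return {E: sum(min(c1, n2.get(E - y1, 0)) for y1, c1 in n1.items())
            for E in support}

# Illustrative component densities (energy -> count), each summing to |X| = 3:
n1 = {0.0: 1, 1.0: 2}
n2 = {0.0: 1, 1.0: 2}
bound = pointwise_upper_bound(n1, n2)  # {0.0: 1, 1.0: 2, 2.0: 2}
```

Note that the bound sums to 5, exceeding |X| = 3; the matching-based bounds of Section 5 tighten this by enforcing the total configuration count.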
While this is already promising, we can, in fact, obtain a much tighter bound by taking into account the interactions between different energy levels across any parameter decomposition, e.g., by enforcing the fact that there are a total of |X| configurations. For compactness, in the following let us define y_i(x) = exp(Θ_i · φ(x)) for any x ∈ X and i = 1, ..., n. Then,

    Z(Θ*) = Σ_x exp(Θ* · φ(x)) = Σ_x ∏_i y_i(x)^{γ_i}

Theorem 4. Let Π be the (finite) set of all possible permutations of X. Given σ = (σ_1, ..., σ_n) ∈ Π^n, let Z(Θ*, σ) = Σ_x ∏_i y_i(σ_i(x))^{γ_i}. Then,

    min_{σ∈Π^n} Z(Θ*, σ) ≤ Z(Θ*) ≤ max_{σ∈Π^n} Z(Θ*, σ)    (5)

Proof. Let σ_I ∈ Π^n denote a collection of n identity permutations. Then we have Z(Θ*) = Z(Θ*, σ_I), which proves the upper and lower bounds in equation (5).
Algorithm 1 Greedy algorithm for the maximum matching (upper bound).
1: while there exists E such that n(E, Θ_i) > 0 do
2:   E_max(Θ_i) ← max_E {E : n(E, Θ_i) > 0}, for i = 1, ..., n
3:   c ← min {n(E_max(Θ_1), Θ_1), ..., n(E_max(Θ_n), Θ_n)}
4:   u_b(γ_1 E_max(Θ_1) + ... + γ_n E_max(Θ_n)) ← c
5:   n(E_max(Θ_i), Θ_i) ← n(E_max(Θ_i), Θ_i) − c, for i = 1, ..., n
6: end while

We can think of σ ∈ Π^n as an n-dimensional matching. For any i, j, σ_i(x) matches with σ_j(x), and σ(x) gives the corresponding hyper-edge. If we define the weight of each hyper-edge in the matching graph as w(σ(x)) = ∏_i y_i(σ_i(x))^{γ_i}, then Z(Θ*, σ) = Σ_x w(σ(x)) corresponds to the weight of the matching represented by σ. We can therefore think of the bounds in equation (5) as given by a maximum and a minimum matching, respectively. Intuitively, the maximum matching corresponds to the case where the configurations in the high energy buckets of the densities happen to be the same configuration (matching), so that their energies are summed up.

5.1 Upper bound

The maximum matching max_σ Z(Θ*, σ) (i.e., the upper bound on the partition function) can be computed using Algorithm 1. Algorithm 1 returns a distribution u_b such that Σ_E u_b(E) = |X| and Σ_E u_b(E) exp(E) = max_σ Z(Θ*, σ). Notice however that u_b is not a valid point-by-point upper bound on the density n(E, Θ*) of the original model.

Proposition 1. Algorithm 1 computes the maximum matching and its runtime is bounded by the total number of non-empty buckets Σ_i |{E : n(E, Θ_i) > 0}|.

Proof. The correctness of Algorithm 1 follows from observing that exp(E_1 + E_2) + exp(E'_1 + E'_2) ≥ exp(E_1 + E'_2) + exp(E'_1 + E_2) when E_1 ≥ E'_1 and E_2 ≥ E'_2. Intuitively, this means that for n = 2 parameters it is always optimal to connect the highest energy configurations, therefore the greedy method is optimal. This result can be generalized to n > 2 by induction. The runtime is proportional to the total number of buckets because we remove one bucket from at least one density at every iteration.
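Algorithm 1 can be sketched as follows, representing each density n(·, Θ_i) as a dictionary from energy to count (the helper names are ours, and the example densities are our reconstruction of the Figure 1 decomposition described in the text: a 3-edge chain and a single edge, each with weight 2):

```python
from math import exp, isclose

def max_matching_bound(densities, gammas):
    """Greedy Algorithm 1: repeatedly match the highest non-empty energy
    buckets across all component densities. Returns u_b as {energy: count}."""
    ds = [dict(d) for d in densities]  # work on copies
    u_b = {}
    while all(ds):                     # all densities still have buckets
        tops = [max(d) for d in ds]                    # line 2: E_max(Theta_i)
        c = min(d[t] for d, t in zip(ds, tops))        # line 3
        E = sum(g * t for g, t in zip(gammas, tops))   # line 4
        u_b[E] = u_b.get(E, 0) + c
        for d, t in zip(ds, tops):                     # line 5: shrink buckets
            d[t] -= c
            if d[t] == 0:
                del d[t]
    return u_b

def bound_value(dist):
    """Partition function value of a bucketed distribution."""
    return sum(c * exp(E) for E, c in dist.items())

# Densities of the Figure 1 decomposition (gamma_1 = gamma_2 = 1/2):
n1 = {0: 2, 2: 6, 4: 6, 6: 2}   # chain with edge weight 2
n2 = {0: 8, 2: 8}               # single edge with weight 2
u_b = max_matching_bound([n1, n2], [0.5, 0.5])  # {4.0: 2, 3.0: 6, 1.0: 6, 0.0: 2}
```

Each iteration empties at least one bucket, so the runtime is bounded by the total number of buckets, matching Proposition 1.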
The key property of Algorithm 1 is that even though it defines a matching over an exponential number of configurations |X|, its runtime is proportional to the total number of buckets, because it matches configurations in groups at the bucket level. We can show that the value of the maximum matching is at least as tight as the bound provided by the convexity of the log-partition function, which is used for example by Tree-Reweighted Belief Propagation (TRW-BP) [6].

Theorem 5. For any parameter decomposition Σ_{i=1}^n γ_i Θ_i = Θ*, the upper bound given by the maximum matching in (5) and computed using Algorithm 1 is always at least as tight as the bound obtained using the convexity of the log-partition function.

Proof. The bound obtained by applying Jensen's inequality to the log-partition function (which is convex), given by log Z(Θ*) ≤ Σ_i γ_i log Z(Θ_i) [6], leads to the following geometric average bound: Z(Θ*) ≤ ∏_i (Σ_x y_i(x))^{γ_i}. Given any n permutations of the configurations σ_i : X → X for i = 1, ..., n (in particular, the ones attaining the maximum matching value), we have

    Σ_x ∏_i y_i(σ_i(x))^{γ_i} ≤ ∏_i ‖ y_i(σ_i(·))^{γ_i} ‖_{1/γ_i} = ∏_i ( Σ_x y_i(σ_i(x)) )^{γ_i} = ∏_i ( Σ_x y_i(x) )^{γ_i}

where we used the generalized Hölder inequality and the norm ‖·‖ indicates a sum over X.

5.2 Lower bound

We also provide Algorithm 2 to compute the minimum matching when there are n = 2 parameters. The proof of correctness is similar to that for Proposition 1.

Proposition 2. For n = 2, Algorithm 2 computes the minimum matching and its runtime is bounded by the total number of non-empty buckets Σ_i |{E : n(E, Θ_i) > 0}|.
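The n = 2 greedy minimum matching of Algorithm 2 can be sketched in the same bucketed representation (helper names are ours; the example densities are our reconstruction of the Figure 1 decomposition, a chain and a single edge with weight 2):

```python
from math import exp, isclose

def min_matching_bound(n1, n2, g1=0.5, g2=0.5):
    """Greedy Algorithm 2 (n = 2): match the highest bucket of the first
    density with the lowest bucket of the second. Returns l_b."""
    d1, d2 = dict(n1), dict(n2)
    l_b = {}
    while d1 and d2:
        e1, e2 = max(d1), min(d2)       # E_max(Theta_1), E_min(Theta_2)
        c = min(d1[e1], d2[e2])
        E = g1 * e1 + g2 * e2
        l_b[E] = l_b.get(E, 0) + c
        for d, e in ((d1, e1), (d2, e2)):
            d[e] -= c
            if d[e] == 0:
                del d[e]
    return l_b

# Figure 1 example densities (chain and single edge, weight 2):
l_b = min_matching_bound({0: 2, 2: 6, 4: 6, 6: 2}, {0: 8, 2: 8})
Z_lb = sum(c * exp(E) for E, c in l_b.items())  # 2e^3 + 12e^2 + 2e
```

As in Algorithm 1, each iteration empties at least one bucket, giving the runtime stated in Proposition 2.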
Algorithm 2 Greedy algorithm for the minimum matching with n = 2 parameters (lower bound).
1: while there exists E such that n(E, Θ_i) > 0 do
2:   E_max(Θ_1) ← max_E {E : n(E, Θ_1) > 0};  E_min(Θ_2) ← min_E {E : n(E, Θ_2) > 0}
3:   c ← min {n(E_max(Θ_1), Θ_1), n(E_min(Θ_2), Θ_2)}
4:   l_b(γ_1 E_max(Θ_1) + γ_2 E_min(Θ_2)) ← c
5:   n(E_max(Θ_1), Θ_1) ← n(E_max(Θ_1), Θ_1) − c;  n(E_min(Θ_2), Θ_2) ← n(E_min(Θ_2), Θ_2) − c
6: end while

For the minimum matching case, the induction argument does not apply and the result cannot be extended to the case n > 2. For that case, we can obtain a weaker lower bound by applying the reverse generalized Hölder inequality [14]. Specifically, let s_1, ..., s_{n−1} < 0 and s_n be such that Σ_i 1/s_i = 1. We then have

    min_σ Z(Θ*, σ) = Σ_x ∏_i y_i(σ_min,i(x))^{γ_i}    (6)
                   ≥ ∏_i ( Σ_x y_i(σ_min,i(x))^{s_i γ_i} )^{1/s_i} = ∏_i ( Σ_x y_i(x)^{s_i γ_i} )^{1/s_i}    (7)

Notice this result cannot be applied if y_i(x) = 0 for some i and x, i.e., if there are factors assigning probability zero (hard constraints) in the probabilistic model.

6 Empirical Evaluation

To evaluate the quality of the bounds, we consider an Ising model from statistical physics, where given a graph (V, E), single node variables x_s, s ∈ V are Bernoulli distributed (x_s ∈ {0, 1}), and the global random vector is distributed according to

    p(x, Θ) = (1/Z(Θ)) exp( Σ_{s∈V} Θ_s x_s + Σ_{(i,j)∈E} Θ_ij 1{x_i = x_j} )

Figure 1 shows a simple 2 × 2 grid Ising model with exponential parameter Θ* = [0, 0, 0, 0, 1, 1, 1, 1] (Θ_s = 0 and Θ_ij = 1) decomposed as the convex sum of two parameters Θ_1 and Θ_2 corresponding to tractable distributions, i.e., Θ* = (1/2)Θ_1 + (1/2)Θ_2. The corresponding partition function is Z(Θ*) = 2 + 12 exp(2) + 2 exp(4) ≈ 199.9. In panels (a) and (b) we report the corresponding densities of states n(E, Θ_1) and n(E, Θ_2) as histograms. For instance, for the model corresponding to Θ_1 there are only two global configurations (all variables positive and all negative) that give an energy of 6.
It can be seen from the densities reported that Z(Θ_1) = 2 + 6 exp(2) + 6 exp(4) + 2 exp(6) ≈ 1180.8, while Z(Θ_2) = 8 + 8 exp(2) ≈ 67.1. The corresponding geometric average (obtained from the convexity of the log-partition function) is (Z(Θ_1))^{1/2} (Z(Θ_2))^{1/2} ≈ 281.5. In panels (c1) and (c2) we show u_b and l_b computed using Algorithms 1 and 2, i.e., the solutions to the maximum and minimum matching problems, respectively. For instance, for the maximum matching case the 2 configurations with energy 6 from n(E, Θ_1) are matched with 2 of the 8 with energy 2 from n(E, Θ_2), giving an energy 6/2 + 2/2 = 4. Notice that u_b and l_b are not valid bounds on individual densities of states themselves, but they nonetheless provide upper and lower bounds on the partition function as shown in the figure: ≈ 248.0 and ≈ 134.3, respectively. The bound (7) given by the reverse Hölder inequality with s_1 = −1, s_2 = 1/2 and the mean-field lower bound [4, 7] are both weaker in this case. In this case, the additional information provided by the density leads to tighter upper and lower bounds on the partition function.

In Figure 2 we report the upper bounds obtained for several types of Ising models (in all cases, Θ_s = 0, i.e., there is no external field). In the two left plots, we consider a 5 × 5 square Ising model, once with attractive interactions (Θ_ij ∈ [0.1w, 0.2w]) and once with mixed interactions (Θ_ij ∈ [−0.1w, 0.1w]). In the two right plots, we use a complete graph (a clique) with N = 9 vertices. For each model, we compute the upper bound given by TRW-BP (with edge appearance probabilities µ_e based on a subset of randomly selected spanning trees) and the mean-field bound using the implementations in libDAI [15]. We then compute the bound based on the maximum matching using the same set of spanning trees. For the grid case, we also use a combination of 2 spanning trees and compute the corresponding lower bound based on the minimum matching (notice
it is not possible to cover all the edges in a clique with only 2 spanning trees). For each bound, we report the relative error, defined as (log(bound) − log(Z)) / log(Z), where Z is the true partition function, computed using the junction tree method.

[Figure 1: Decomposition of a 2 × 2 Ising model, density of states solutions obtained with the maximum and minimum matching algorithms, and the corresponding upper and lower bounds on Z(Θ*). Panels (a) and (b): histograms of n(E, Θ_1) and n(E, Θ_2); panels (c1) and (c2): the matching solutions u_b (upper bound Z_ub = 2 + 6e + 6e³ + 2e⁴) and l_b (lower bound Z_lb = 2e + 12e² + 2e³).]

[Figure 2: Relative error of the upper bounds (Convexity vs. MaxMatching) as a function of edge strength. Panels: (a) 5 × 5 grid, attractive interactions; (b) 5 × 5 grid, mixed interactions; (c) 9-clique, attractive; (d) 9-clique, mixed.]

In these experiments, both our upper and lower bounds improve over the ones obtained with TRW-BP [6] and mean-field, respectively. The lower bound based on minimum matching visually overlaps with the mean-field bound and is thus omitted from Figure 2. It is, however, strictly better, even if by a small amount. Notice that we might be able to get a better bound by choosing a different set of parameters Θ_i (which may be suboptimal for TRW-BP). We also used numerical optimization (BFGS and BOBYQA [16]) to select the values of s_i in the reverse Hölder bound (7) (notice that once we have computed the densities n(E, Θ_i), evaluating the bound is cheap, i.e., it does not require solving an inference task). We found that both optimization strategies are very sensitive to the initial condition; however, by optimizing the parameters we were always able to obtain a lower bound at least as good as the one given by mean field.
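The 2 × 2 example of Figure 1 is small enough to verify by brute force. The sketch below (our reconstruction of the decomposition described in the text: a 3-edge chain plus the remaining single edge, each with weight 2) enumerates all 16 configurations and checks the component partition functions and the convexity-based geometric-average bound:

```python
from itertools import product
from math import exp, isclose, sqrt

def ising_Z(edges, weights, n_vars):
    """Partition function of p(x) proportional to exp(sum_ij Theta_ij 1{x_i = x_j})."""
    return sum(exp(sum(w for (i, j), w in zip(edges, weights) if x[i] == x[j]))
               for x in product([0, 1], repeat=n_vars))

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # the 2x2 grid is a 4-cycle
Z_star = ising_Z(cycle, [1, 1, 1, 1], 4)   # 2 + 12 e^2 + 2 e^4
Z1 = ising_Z(cycle[:3], [2, 2, 2], 4)      # chain: 2 + 6e^2 + 6e^4 + 2e^6
Z2 = ising_Z(cycle[3:], [2], 4)            # edge:  8 + 8 e^2
geo_avg = sqrt(Z1 * Z2)                    # convexity-based upper bound
```

Numerically Z* ≈ 199.9 and geo_avg ≈ 281.5, leaving the gap that the matching-based bounds of Section 5 partially close.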
7 Conclusions

We presented DENSITYPROPAGATION, a novel message passing algorithm to compute the density of states while exploiting the structure of the underlying graphical model. We showed that DENSITYPROPAGATION computes the exact density for tree-structured graphical models, is closely related to the Belief Propagation and Max-Product algorithms, and is in fact a generalization of both. We introduced a new family of bounds on the partition function based on tree decomposition but without relying on convexity. We showed both theoretically and empirically that the additional information provided by the density of states leads to better bounds than standard convexity-based ones. This work opens up several interesting directions. These include an exploration of the convergence properties of (loopy) DENSITYPROPAGATION, specifically in relation to BP and MP [17, 18, 19, 20], an investigation of the existence of a variational interpretation for the updates, and devising an efficient strategy to select the parameters Θ_i of the tree decomposition such that the proposed bound is further optimized.
References

[1] M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008.
[2] S. Ermon, C. Gomes, A. Sabharwal, and B. Selman. Accelerated adaptive Markov chain for partition function computation. In Neural Information Processing Systems, 2011.
[3] F. Wang and D.P. Landau. Efficient, multiple-range random walk algorithm to calculate the density of states. Physical Review Letters, 86(10), 2001.
[4] M.J. Wainwright. Stochastic processes on graphs with cycles: geometric and variational approaches. PhD thesis, Massachusetts Institute of Technology, 2002.
[5] M. Wainwright, T. Jaakkola, and A. Willsky. Exact MAP estimates by (hyper)tree agreement. In Advances in Neural Information Processing Systems, 2003.
[6] M.J. Wainwright. Tree-reweighted belief propagation algorithms and approximate ML estimation via pseudo-moment matching. In AISTATS, 2003.
[7] G. Parisi and R. Shankar. Statistical field theory. Physics Today, 41:110, 1988.
[8] L.D. Brown. Fundamentals of Statistical Exponential Families: with Applications in Statistical Decision Theory. Institute of Mathematical Statistics, 1986.
[9] M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1), 2006.
[10] Y. Weiss, C. Yanover, and T. Meltzer. MAP estimation, linear programming and belief propagation with convex free energies. In Uncertainty in Artificial Intelligence, 2007.
[11] T. Hazan and A. Shashua. Norm-product belief propagation: Primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 56(12), 2010.
[12] K.P. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1999.
[13] J.S. Yedidia, W.T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. Exploring Artificial Intelligence in the New Millennium, 2003.
[14] W.S. Cheung. Generalizations of Hölder's inequality. International Journal of Mathematics and Mathematical Sciences, 26:7-10, 2001.
[15] J.M. Mooij. libDAI: A free and open source C++ library for discrete approximate inference in graphical models. The Journal of Machine Learning Research, 11, 2010.
[16] M.J.D. Powell. The BOBYQA algorithm for bound constrained optimization without derivatives. University of Cambridge Technical Report, 2009.
[17] D. Sontag, T. Meltzer, A. Globerson, T. Jaakkola, and Y. Weiss. Tightening LP relaxations for MAP using message passing. In Uncertainty in Artificial Intelligence, 2008.
[18] T. Meltzer, A. Globerson, and Y. Weiss. Convergent message passing algorithms: a unifying view. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.
[19] A.T. Ihler, J.W. Fisher, and A.S. Willsky. Loopy belief propagation: Convergence and effects of message errors. Journal of Machine Learning Research, 6(1):905, 2006.
[20] J.M. Mooij and H.J. Kappen. Sufficient conditions for convergence of the sum-product algorithm. IEEE Transactions on Information Theory, 53(12), 2007.
More informationGlobal Sensitivity. Tuesday 20 th February, 2018
Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values
More information8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF
10-708: Probablstc Graphcal Models 10-708, Sprng 2014 8 : Learnng n Fully Observed Markov Networks Lecturer: Erc P. Xng Scrbes: Meng Song, L Zhou 1 Why We Need to Learn Undrected Graphcal Models In the
More information4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA
4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected
More informationP exp(tx) = 1 + t 2k M 2k. k N
1. Subgaussan tals Defnton. Say that a random varable X has a subgaussan dstrbuton wth scale factor σ< f P exp(tx) exp(σ 2 t 2 /2) for all real t. For example, f X s dstrbuted N(,σ 2 ) then t s subgaussan.
More informationOutline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline
Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number
More informationAPPENDIX A Some Linear Algebra
APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationErrors for Linear Systems
Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.
More informationComparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method
Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method
More informationThe Expectation-Maximization Algorithm
The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.
More informationDepartment of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING
MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/
More informationCourse 395: Machine Learning - Lectures
Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng
More informationThe Study of Teaching-learning-based Optimization Algorithm
Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationTHE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens
THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More informationj) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1
Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons
More informationFinding Dense Subgraphs in G(n, 1/2)
Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng
More informationPhysical Fluctuomatics Applied Stochastic Process 9th Belief propagation
Physcal luctuomatcs ppled Stochastc Process 9th elef propagaton Kazuyuk Tanaka Graduate School of Informaton Scences Tohoku Unversty kazu@smapp.s.tohoku.ac.jp http://www.smapp.s.tohoku.ac.jp/~kazu/ Stochastc
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationProbability-Theoretic Junction Trees
Probablty-Theoretc Juncton Trees Payam Pakzad, (wth Venkat Anantharam, EECS Dept, U.C. Berkeley EPFL, ALGO/LMA Semnar 2/2/2004 Margnalzaton Problem Gven an arbtrary functon of many varables, fnd (some
More informationA Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach
A Bayes Algorthm for the Multtask Pattern Recognton Problem Drect Approach Edward Puchala Wroclaw Unversty of Technology, Char of Systems and Computer etworks, Wybrzeze Wyspanskego 7, 50-370 Wroclaw, Poland
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors
Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationEM and Structure Learning
EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationThe Order Relation and Trace Inequalities for. Hermitian Operators
Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence
More informationComputation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models
Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,
More informationOn the Multicriteria Integer Network Flow Problem
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of
More informationLecture 12: Discrete Laplacian
Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly
More informationMMA and GCMMA two methods for nonlinear optimization
MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons
More informationOpen Systems: Chemical Potential and Partial Molar Quantities Chemical Potential
Open Systems: Chemcal Potental and Partal Molar Quanttes Chemcal Potental For closed systems, we have derved the followng relatonshps: du = TdS pdv dh = TdS + Vdp da = SdT pdv dg = VdP SdT For open systems,
More informationCalculation of time complexity (3%)
Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add
More informationCS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016
CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng
More informationLecture 17: Lee-Sidford Barrier
CSE 599: Interplay between Convex Optmzaton and Geometry Wnter 2018 Lecturer: Yn Tat Lee Lecture 17: Lee-Sdford Barrer Dsclamer: Please tell me any mstake you notced. In ths lecture, we talk about the
More informationTree Block Coordinate Descent for MAP in Graphical Models
ree Block Coordnate Descent for MAP n Graphcal Models Davd Sontag omm Jaakkola Computer Scence and Artfcal Intellgence Laboratory Massachusetts Insttute of echnology Cambrdge, MA 02139 Abstract A number
More informationMarkov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement
Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs
More informationCOS 521: Advanced Algorithms Game Theory and Linear Programming
COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton
More informationLecture 12: Classification
Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna
More informationTAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES
TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES SVANTE JANSON Abstract. We gve explct bounds for the tal probabltes for sums of ndependent geometrc or exponental varables, possbly wth dfferent
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationInductance Calculation for Conductors of Arbitrary Shape
CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationBézier curves. Michael S. Floater. September 10, These notes provide an introduction to Bézier curves. i=0
Bézer curves Mchael S. Floater September 1, 215 These notes provde an ntroducton to Bézer curves. 1 Bernsten polynomals Recall that a real polynomal of a real varable x R, wth degree n, s a functon of
More informationBoostrapaggregating (Bagging)
Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod
More informationComputing MLE Bias Empirically
Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.
More informationVARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES
VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES BÂRZĂ, Slvu Faculty of Mathematcs-Informatcs Spru Haret Unversty barza_slvu@yahoo.com Abstract Ths paper wants to contnue
More informationMaximizing the number of nonnegative subsets
Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum
More informationSome modelling aspects for the Matlab implementation of MMA
Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton
More informationDynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence)
/24/27 Prevew Fbonacc Sequence Longest Common Subsequence Dynamc programmng s a method for solvng complex problems by breakng them down nto smpler sub-problems. It s applcable to problems exhbtng the propertes
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationThe Geometry of Logit and Probit
The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017
U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that
More informationTime-Varying Systems and Computations Lecture 6
Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy
More informationThe Second Anti-Mathima on Game Theory
The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player
More information4DVAR, according to the name, is a four-dimensional variational method.
4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationHidden Markov Models
Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,
More informationMore metrics on cartesian products
More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of
More information= z 20 z n. (k 20) + 4 z k = 4
Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5
More informationarxiv:cs.cv/ Jun 2000
Correlaton over Decomposed Sgnals: A Non-Lnear Approach to Fast and Effectve Sequences Comparson Lucano da Fontoura Costa arxv:cs.cv/0006040 28 Jun 2000 Cybernetc Vson Research Group IFSC Unversty of São
More informationVapnik-Chervonenkis theory
Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationInner Product. Euclidean Space. Orthonormal Basis. Orthogonal
Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,
More informationNatural Language Processing and Information Retrieval
Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support
More informationU.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016
U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and
More information1 Motivation and Introduction
Instructor: Dr. Volkan Cevher EXPECTATION PROPAGATION September 30, 2008 Rce Unversty STAT 63 / ELEC 633: Graphcal Models Scrbes: Ahmad Beram Andrew Waters Matthew Nokleby Index terms: Approxmate nference,
More informationTHE SUMMATION NOTATION Ʃ
Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the
More informationSimulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests
Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth
More informationPredictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore
Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.
More informationLecture 7: Boltzmann distribution & Thermodynamics of mixing
Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters
More informationBayesian predictive Configural Frequency Analysis
Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse
More informationA new construction of 3-separable matrices via an improved decoding of Macula s construction
Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula
More informationCSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing
CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann
More informationUniversity of Washington Department of Chemistry Chemistry 453 Winter Quarter 2015
Lecture 2. 1/07/15-1/09/15 Unversty of Washngton Department of Chemstry Chemstry 453 Wnter Quarter 2015 We are not talkng about truth. We are talkng about somethng that seems lke truth. The truth we want
More informationCredit Card Pricing and Impact of Adverse Selection
Credt Card Prcng and Impact of Adverse Selecton Bo Huang and Lyn C. Thomas Unversty of Southampton Contents Background Aucton model of credt card solctaton - Errors n probablty of beng Good - Errors n
More informationNumerical Heat and Mass Transfer
Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and
More information