Density Propagation and Improved Bounds on the Partition Function


Stefano Ermon, Carla P. Gomes
Dept. of Computer Science, Cornell University, Ithaca NY 14853, U.S.A.

Ashish Sabharwal
IBM Watson Research Ctr., Yorktown Heights NY 10598, U.S.A.

Bart Selman
Dept. of Computer Science, Cornell University, Ithaca NY 14853, U.S.A.

Abstract

Given a probabilistic graphical model, its density of states is a function that, for any likelihood value, gives the number of configurations with that probability. We introduce a novel message-passing algorithm called Density Propagation (DP) for estimating this function. We show that DP is exact for tree-structured graphical models and is, in general, a strict generalization of both sum-product and max-product algorithms. Further, we use density of states and tree decomposition to introduce a new family of upper and lower bounds on the partition function. For any tree decomposition, the new upper bound based on finer-grained density of state information is provably at least as tight as previously known bounds based on convexity of the log-partition function, and strictly stronger if a general condition holds. We conclude with empirical evidence of improvement over convex relaxations and mean-field based bounds.

1 Introduction

Associated with any undirected graphical model [1] is the so-called density of states, a term borrowed from statistical physics indicating a function that, for any likelihood value, gives the number of configurations with that probability. The density of states plays an important role in statistical physics because it provides a fine-grained description of the system, and can be used to efficiently compute many properties of interest, such as the partition function and its parameterized version [2, 3]. It can be seen that computing the density of states is computationally intractable in the worst case, since it subsumes a #P-complete problem (computing the partition function) and an NP-hard one (MAP inference).

All current approximate techniques for estimating the density of states are based on sampling, the most prominent being the Wang-Landau algorithm [3] and its improved variants [2]. These methods have been shown to be very effective in practice. However, they do not provide any guarantee on the quality of the results. Furthermore, they ignore the structure of the underlying graphical model, effectively treating the energy function (which gives the log-likelihood of a configuration) as a black box. As a first step towards exploiting the structure of the graphical model when computing the density of states, we propose an algorithm called DENSITYPROPAGATION (DP). The algorithm is based on dynamic programming and can be conveniently expressed in terms of message passing on the graphical model. We show that DENSITYPROPAGATION computes the density of states exactly for any tree-structured graphical model. It is closely related to the popular Sum-Product (Belief Propagation, BP) and Max-Product (MP) algorithms, and can be seen as a generalization of both. However, it computes something much richer, namely the density of states, which contains information such as the partition function and variable marginals. Although we do not work at the level of individual configurations, DENSITYPROPAGATION allows us to reason in terms of groups of configurations with the same probability (energy).

Being able to solve inference tasks for certain tractable classes of problems (e.g., trees) is important because one can often decompose a complex problem into tractable subproblems (such as spanning trees) [4], and the solutions to these simpler problems can be combined to recover useful properties of the original graphical model [5, 6]. In this paper we show that by combining the additional information given by the density of states, we can obtain a new family of upper and lower bounds on the partition function. We prove that the new upper bound is always at least as tight as the one based on the convexity of the log-partition function [4], and we provide a general condition under which the new bound is strictly tighter. Further, we illustrate empirically that the new upper bound improves upon the convexity-based one on Ising grid and clique models, and that the new lower bound is empirically slightly stronger than the one given by mean-field theory [4, 7].

2 Problem definition and setup

We consider a graphical model specified as a factor graph with N = |V| discrete random variables x_i, i ∈ V, where x_i ∈ X_i. The global random vector x = {x_s, s ∈ V} takes values in the Cartesian product X = X_1 × X_2 × ⋯ × X_N, with cardinality D = |X| = ∏_{i=1}^N |X_i|. We consider a probability distribution over elements x ∈ X (called configurations)

    p(x) = (1/Z) ∏_{α ∈ I} ψ_α({x}_α)        (1)

that factors into potentials or factors ψ_α : {x}_α → ℝ⁺, where I is an index set and {x}_α ⊆ V is the subset of variables the factor ψ_α depends on. The corresponding factor graph is a bipartite graph with vertex set V ∪ I. In the factor graph, each variable node i ∈ V is connected with all the factors α ∈ I that depend on i. Similarly, each factor node α ∈ I is connected with all the variable nodes i ∈ {x}_α. We denote the neighbours of i and α by N(i) and N(α), respectively.

We will also make use of the related exponential representation [8]. Let φ be a collection of potential functions {φ_α, α ∈ I}, defined over the index set I.
Given an exponential parameter vector Θ = {Θ_α, α ∈ I}, the exponential family defined by φ is the family of probability distributions over X defined as follows:

    p(x, Θ) = (1/Z(Θ)) exp(Θ · φ(x)) = (1/Z(Θ)) exp( Σ_{α ∈ I} Θ_α φ_α({x}_α) )        (2)

Given an exponential family, we define the density of states [2] as the function

    n : (E, Θ) ↦ |{x ∈ X : Θ · φ(x) = E}|

where for any exponential parameter Θ it holds that ∫_{−∞}^{+∞} n(E, Θ) dE = |X|. We will refer to the quantity Σ_{α ∈ I} Θ_α φ_α({x}_α) as the energy of a configuration x. The density n(E, Θ) is the partition function for the microcanonical ensemble (isolated system at equilibrium with constant energy, volume, and number of particles), while Z(Θ) is the partition function for the traditional canonical ensemble. We will denote by n(E) (omitting the parameter) the density of states of the original factor graph.

3 Density Propagation

Since any propositional Satisfiability (SAT) instance can be efficiently encoded as a factor graph (e.g., by defining a uniform probability measure over satisfying assignments), it is clear that computing the density of states is computationally intractable in the worst case, as it generalizes both an NP-complete problem (satisfiability testing) and a #P-complete problem (model counting). We show that the density of states can be computed efficiently¹ for acyclic graphical models. We provide a Dynamic Programming algorithm, which can also be interpreted as a message passing algorithm on the factor graph, called DENSITYPROPAGATION (DP), which computes the density of states exactly for acyclic graphical models.

¹ Polynomial in the cardinality of the function's support, which could be exponential in N in the worst case.
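The definition of n(E, Θ) can be made concrete with a brute-force computation on a tiny model. The sketch below is ours, not from the paper: it enumerates all configurations of a hypothetical three-variable chain with two pairwise "agreement" factors and tallies the density of states as a dictionary from energy to count.

```python
from itertools import product
from collections import Counter

def density_of_states(num_vars, factors, domain=(0, 1)):
    """Brute-force n(E): count configurations at each total energy.

    `factors` is a list of (scope, energy_fn) pairs, where energy_fn
    returns log psi_alpha for the sub-assignment on `scope`.
    """
    n = Counter()
    for x in product(domain, repeat=num_vars):
        energy = sum(f(tuple(x[i] for i in scope)) for scope, f in factors)
        n[energy] += 1
    return dict(n)

# A 3-variable chain x0 - x1 - x2 with agreement factors of weight 1:
# E_alpha = 1 if the two endpoints agree, else 0.
agree = lambda xs: 1 if xs[0] == xs[1] else 0
chain = [((0, 1), agree), ((1, 2), agree)]
dos = density_of_states(3, chain)
# 8 configurations total: 2 with both edges satisfied (E=2),
# 4 with exactly one satisfied (E=1), 2 with none (E=0).
print(dos)  # {2: 2, 1: 4, 0: 2}
```

As the definition requires, the bucket counts sum to |X| = 2³ = 8.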

3.1 Density propagation equations

DENSITYPROPAGATION works by exchanging messages from variable to factor nodes and vice versa. Unlike traditional message passing algorithms, where messages represent marginal probabilities (vectors of real numbers), for every x_i ∈ X_i a DENSITYPROPAGATION message m_{a→i}(x_i) represents an unnormalized discrete probability distribution with a finite alphabet (a marginal density of states). We use the notation m_{a→i}(x_i)(E) to denote the value of the function m_{a→i}(x_i) evaluated at point E. At every iteration, messages are updated according to the following rules. The message from variable node i to factor node a is updated as follows:

    m_{i→a}(x_i) = ⊛_{b ∈ N(i)\{a}} m_{b→i}(x_i)        (3)

where ⊛ is the convolution operator (commutative, associative and distributive). Intuitively, the convolution operation corresponds to working with the sum of (conditionally) independent random variables, such as the ones corresponding to different subtrees in a tree-structured graphical model. The message from factor a to variable i is updated as follows:

    m_{a→i}(x_i) = Σ_{{x}_α \ x_i} ( ( ⊛_{j ∈ N(a)\{i}} m_{j→a}(x_j) ) ⊛ δ_{E_α({x}_α)} )        (4)

where δ_{E_α({x}_α)} is a Dirac delta function centered at E_α({x}_α) = log ψ_α({x}_α).

For tree-structured graphical models, DENSITYPROPAGATION converges after a finite number of iterations, independent of the initial condition, to the true density of states. Formally,

Theorem 1. For any variable x_s, s ∈ V, any E, and any initial condition, after a finite number of iterations

    Σ_{q ∈ X_s} ( ⊛_{b ∈ N(s)} m_{b→s}(q) )(E) = |{x ∈ X : Σ_α log ψ_α({x}_α) = E}|.

The proof is by induction on the size of the tree (omitted due to lack of space). The most efficient message update schedule for tree-structured models is a two-pass procedure where messages are first sent from the leaves to the root node, and then propagated backwards from the root to the leaves. However, as with other message-passing algorithms, for tree-structured problems the algorithm will converge with either a sequential or a parallel update schedule, with any initial condition for the messages.

Although DP requires the same number of message updates as BP and MP, DP updates are more expensive because they require the computation of convolutions. In the worst case, the density of states can have an exponential number of non-zero entries (i.e., values E such that n(E) > 0, which we will also refer to as "buckets"), for instance when potentials are set to logarithms of prime numbers, so that every x ∈ X has a different probability. However, in many practical problems of interest (e.g., Ising models, grounded Markov Logic Networks [9]), the number of energy buckets is limited. Another key property of equations (3) and (4) is that, unlike for the Belief Propagation and Max-Product algorithms, the message update operator is linear, although in a higher dimensional space of probability distributions.

3.2 Relationship with sum and max product algorithms

DENSITYPROPAGATION is closely related to traditional message passing algorithms such as BP (Belief Propagation, Sum-Product) and MP (Max-Product), since it is based on the same (conditional) independence assumptions. Specifically, as shown by the next theorem, both BP and MP can be seen as simplified versions of DENSITYPROPAGATION that only consider certain global statistics of the distributions represented by DENSITYPROPAGATION messages.

Theorem 2. Assuming the same initial condition and message update schedule, at every iteration k we can recover Belief Propagation and Max-Product marginals from DENSITYPROPAGATION messages.

Proof. The Max-Product algorithm corresponds to considering only the entry associated with the highest probability, i.e., max{E : m_{i→j}(x_j)(E) > 0}. For compactness, let us define this quantity γ_{i→j}(x_j) ≜ max{E : m_{i→j}(x_j)(E) > 0}. According to the DP update in equation (3), the quantities γ_{i→a}(x_i) are updated as follows:

    γ_{i→a}(x_i) = max{E : ( ⊛_{b ∈ N(i)\{a}} m_{b→i}(x_i) )(E) > 0} = Σ_{b ∈ N(i)\{a}} γ_{b→i}(x_i)

Using equation (4),

    γ_{a→i}(x_i) = max{E : ( Σ_{{x}_α \ x_i} ( ⊛_{j ∈ N(a)\{i}} m_{j→a}(x_j) ) ⊛ δ_{E_α({x}_α)} )(E) > 0}
                 = max_{{x}_α \ x_i} ( Σ_{j ∈ N(a)\{i}} γ_{j→a}(x_j) + E_α({x}_α) )

These results show that the quantities γ_{i→j}(x_j) are updated according to the Max-Product algorithm (with messages in log-scale).

To see the relationship with BP, for every DP message m_{i→j}(x_j) let us define

    µ_{i→j}(x_j) ≜ Σ_E m_{i→j}(x_j)(E) exp(E)

Notice that µ_{i→j}(x_j) would correspond to an unnormalized marginal probability, assuming that m_{i→j}(x_j) is the density of states of the problem when variable j is clamped to value x_j. According to the DP update in equation (3), the quantities µ_{i→a}(x_i) are updated as follows:

    µ_{i→a}(x_i) = Σ_E m_{i→a}(x_i)(E) exp(E) = Σ_E ( ⊛_{b ∈ N(i)\{a}} m_{b→i}(x_i) )(E) exp(E) = ∏_{b ∈ N(i)\{a}} µ_{b→i}(x_i)

that is, we recover the BP updates of messages from variable to factor nodes. Similarly, using (4),

    µ_{a→i}(x_i) = Σ_E m_{a→i}(x_i)(E) exp(E) = Σ_{{x}_α \ x_i} Σ_E ( ( ⊛_j m_{j→a}(x_j) ) ⊛ δ_{E_α({x}_α)} )(E) exp(E) = Σ_{{x}_α \ x_i} ψ_α({x}_α) ∏_{j ∈ N(a)\{i}} µ_{j→a}(x_j)

so we recover the BP updates from factors to variable nodes for the µ quantities (which correspond to marginals computed according to the estimated densities). Similarly, if we define temperature versions of the marginals µ^T_{i→j}(x_j) ≜ Σ_E m_{i→j}(x_j)(E) exp(E/T), we recover the temperature versions of the Belief Propagation updates, similar to [10] and [11].

As with other message passing algorithms, DENSITYPROPAGATION updates are well defined also for loopy graphical models, even though there is no guarantee of convergence or correctness [12]. The correspondence with BP and MP (Theorem 2), however, still holds: if loopy BP converges, then the corresponding quantities µ_{i→j} computed from DP messages will converge as well, and to the same value (assuming the same initial condition and update schedule). Notice however that the convergence of the µ_{i→j} does not imply the convergence of the DENSITYPROPAGATION messages (e.g., in probability, law, or L_p). In fact, we have observed empirically that the situation where the µ_{i→j} converge but the m_{i→j} do not converge (not even in distribution) is fairly common.
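Since DP messages are finite energy-to-count tables, the variable-to-factor update (3) is just a discrete convolution, and the Max-Product and Sum-Product statistics of Theorem 2 can be read off directly. A minimal sketch with toy messages of our own (not from the paper):

```python
from collections import defaultdict
from math import exp, isclose

def convolve(m1, m2):
    """Convolution of two discrete densities (dicts: energy -> count)."""
    out = defaultdict(float)
    for e1, c1 in m1.items():
        for e2, c2 in m2.items():
            out[e1 + e2] += c1 * c2
    return dict(out)

# Two incoming messages for a fixed value x_i (toy numbers).
m_b = {0.0: 1, 1.0: 2}   # one configuration at energy 0, two at energy 1
m_c = {0.0: 3, 2.0: 1}
m_out = convolve(m_b, m_c)

# Max-Product statistic: the highest non-empty bucket adds under
# convolution, i.e. log-scale max-product messages.
gamma = lambda m: max(e for e, c in m.items() if c > 0)
assert gamma(m_out) == gamma(m_b) + gamma(m_c)

# Sum-Product statistic: mu = sum_E m(E) exp(E) multiplies under convolution.
mu = lambda m: sum(c * exp(e) for e, c in m.items())
assert isclose(mu(m_out), mu(m_b) * mu(m_c))
```

The two assertions mirror the proof of Theorem 2: tracking only the top bucket of each message yields max-product, and tracking only the exponentially weighted sum yields sum-product.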
It would be interesting to see if there is a variational interpretation for the DENSITYPROPAGATION equations, as in [13]. Notice also that Junction Tree style algorithms could be used in conjunction with DP updates for the messages.

4 Bounding the density of states using tractable families

Using techniques like DENSITYPROPAGATION, we can compute the density of states exactly for tractable families such as tree-structured graphical models. Let p(x, Θ*) be a general (intractable) probabilistic model of interest, and let Θ₁, …, Θₙ be a family of tractable parameters (e.g., corresponding to trees) such that Θ* is a convex combination of the Θᵢ, as defined formally below and used previously by Wainwright et al. [5, 6]. See below (Figure 1) for an example of a possible decomposition of an Ising model into tractable distributions. By computing the partition function or MAP estimates for the tree-structured subproblems, Wainwright et al. showed that one can recover useful information about the original intractable problem, for instance by exploiting convexity of the log-partition function log Z(Θ).
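Concretely, the requirement is just Θ* = Σᵢ γᵢΘᵢ with the γᵢ summing to one. The sketch below (our own encoding, mirroring the kind of decomposition used in Figure 1) splits the unit edge weights of a 2 × 2 grid across a spanning tree and the leftover edge, each with doubled weights so that the average recovers Θ*:

```python
# Edges of a 2x2 grid over nodes {0, 1, 2, 3}, all with weight 1 in Theta*.
theta_star = {(0, 1): 1.0, (1, 3): 1.0, (2, 3): 1.0, (0, 2): 1.0}

# Two tractable parameter vectors: a spanning tree and the remaining edge.
theta1 = {(0, 1): 2.0, (1, 3): 0.0, (2, 3): 2.0, (0, 2): 2.0}  # acyclic
theta2 = {(0, 1): 0.0, (1, 3): 2.0, (2, 3): 0.0, (0, 2): 0.0}  # single edge
gammas = (0.5, 0.5)

# Check the convex-combination property edge by edge.
for e, w in theta_star.items():
    assert gammas[0] * theta1[e] + gammas[1] * theta2[e] == w
```

Any collection of trees whose weighted average reproduces Θ* works; the choice affects how tight the resulting bounds are.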

We present a way to exploit the decomposition idea to derive an upper bound on the density of states n(E, Θ*) of the original intractable model, despite the fact that the density of states is not a convex function. The result below gives a point-by-point upper bound which, to the best of our knowledge, is the first bound of this kind for the density of states.

Theorem 3. Let Θ* = Σᵢ₌₁ⁿ γᵢΘᵢ with Σᵢ₌₁ⁿ γᵢ = 1, and let yₙ = E − Σᵢ₌₁ⁿ⁻¹ yᵢ. Then

    n(E, Θ*) ≤ ∫ ⋯ ∫ min_{i=1,…,n} { n(yᵢ, γᵢΘᵢ) } dy₁ dy₂ ⋯ dyₙ₋₁

Proof. From the definition of density of states and using 1{·} to denote the 0-1 indicator function,

    n(E, Θ*) = Σ_{x∈X} 1{Θ*·φ(x) = E} = Σ_{x∈X} 1{(Σᵢ γᵢΘᵢ)·φ(x) = E}
             = Σ_{x∈X} ∫ ⋯ ∫ ∏ᵢ₌₁ⁿ 1{γᵢΘᵢ·φ(x) = yᵢ} dy₁ dy₂ ⋯ dyₙ₋₁
             = ∫ ⋯ ∫ Σ_{x∈X} ∏ᵢ₌₁ⁿ 1{γᵢΘᵢ·φ(x) = yᵢ} dy₁ dy₂ ⋯ dyₙ₋₁
             ≤ ∫ ⋯ ∫ min_{i=1,…,n} { Σ_{x∈X} 1{γᵢΘᵢ·φ(x) = yᵢ} } dy₁ dy₂ ⋯ dyₙ₋₁

where yₙ = E − Σᵢ₌₁ⁿ⁻¹ yᵢ and we exchanged the finite sum and the integrals. Observing that Σ_{x∈X} 1{γᵢΘᵢ·φ(x) = yᵢ} is precisely n(yᵢ, γᵢΘᵢ) finishes the proof.

5 New bounds on the partition function

The density of states n(E, Θ*) can be used to compute the partition function, since by definition Z(Θ*) = Σ_E n(E, Θ*) exp(E). We can therefore get an upper bound on Z(Θ*) by integrating the point-by-point upper bound on n(E, Θ*) from Theorem 3. This bound can be tighter than the known bound [6] obtained by applying Jensen's inequality to the log-partition function (which is convex), given by log Z(Θ*) ≤ Σᵢ γᵢ log Z(Θᵢ). For instance, consider a graphical model with weights that are large enough that the density-of-states sum defining Z(Θ*) is dominated by the contribution of the highest-energy bucket. As a concrete example, consider the decomposition in Figure 1. As the edge weight w (w = 1 in the figure) grows, the convexity-based bound will approximately equal the geometric average of 2 exp(6w) and 8 exp(2w), which is 4 exp(4w). On the other hand, the bound based on Theorem 3 will approximately equal min{2, 8} exp((6 + 2)w/2) = 2 exp(4w). In general, the latter bound will always be strictly better for large enough w unless the highest-energy bucket counts are identical across all the Θᵢ.
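When the densities live on discrete buckets, the integral in Theorem 3 becomes a finite sum: for n = 2 and y₂ = E − y₁ it is a sum of point-wise minima over the buckets of the first density. A sketch of this discrete analogue (the discretization and the bucket dictionaries are ours; the densities are those of the 2 × 2 Ising decomposition, with energies already rescaled by γᵢ = 1/2):

```python
def pointwise_upper_bound(n1, n2, E):
    """Discrete analogue of Theorem 3 for n = 2:
    n(E, Theta*) <= sum over y1 of min(n1(y1), n2(E - y1))."""
    return sum(min(c1, n2.get(E - y1, 0)) for y1, c1 in n1.items())

# Densities of the two trees, rescaled by gamma_i = 1/2 (energies halved).
n1 = {0.0: 2, 1.0: 6, 2.0: 6, 3.0: 2}   # n(y, (1/2) Theta1)
n2 = {0.0: 8, 1.0: 8}                   # n(y, (1/2) Theta2)

# True density of the 2x2 grid with unit edge weights.
true_n = {0.0: 2, 2.0: 12, 4.0: 2}
for E, count in true_n.items():
    assert count <= pointwise_upper_bound(n1, n2, E)
```

On this example the bound is actually tight at every energy level, but in general it only upper-bounds each bucket count.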
While this is already promising, we can in fact obtain a much tighter bound by taking into account the interactions between different energy levels across any parameter decomposition, e.g., by enforcing the fact that there is a total of |X| configurations. For compactness, in the following let us define yᵢ(x) = exp(Θᵢ · φ(x)) for any x ∈ X and i = 1, …, n. Then,

    Z(Θ*) = Σ_{x∈X} exp(Θ* · φ(x)) = Σ_{x∈X} ∏ᵢ₌₁ⁿ yᵢ(x)^{γᵢ}

Theorem 4. Let Π be the (finite) set of all possible permutations of X. Given σ = (σ₁, …, σₙ) ∈ Πⁿ, let Z(Θ*, σ) = Σ_{x∈X} ∏ᵢ₌₁ⁿ yᵢ(σᵢ(x))^{γᵢ}. Then

    min_{σ ∈ Πⁿ} Z(Θ*, σ) ≤ Z(Θ*) ≤ max_{σ ∈ Πⁿ} Z(Θ*, σ)        (5)

Proof. Let σ_I ∈ Πⁿ denote the collection of n identity permutations. Then Z(Θ*) = Z(Θ*, σ_I), which proves both the upper and the lower bound in equation (5).
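Theorem 4 can be sanity-checked by brute force on a tiny configuration space. For n = 2, relabeling x jointly leaves the sum unchanged, so we may fix σ₁ to the identity and sweep σ₂ over all permutations of X (the yᵢ values below are toy numbers of our own):

```python
from itertools import permutations

y1 = [1.0, 2.0, 4.0, 8.0]     # y_1(x) = exp(Theta_1 . phi(x)), |X| = 4
y2 = [3.0, 1.0, 5.0, 2.0]     # y_2(x)
g1, g2 = 0.5, 0.5             # gamma weights, g1 + g2 = 1

def Z_sigma(sigma2):
    # Z(Theta*, sigma) with sigma_1 fixed to the identity permutation.
    return sum((y1[x] ** g1) * (y2[sigma2[x]] ** g2) for x in range(4))

values = [Z_sigma(s) for s in permutations(range(4))]
Z_true = Z_sigma((0, 1, 2, 3))          # identity recovers Z(Theta*)
assert min(values) <= Z_true <= max(values)
```

The identity permutation is one of the |X|! candidates, which is exactly why the minimum and maximum over permutations sandwich the true partition function.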

Algorithm 1 Greedy algorithm for the maximum matching (upper bound).
1: while there exists E such that n(E, Θᵢ) > 0 do
2:   E_max(Θᵢ) ← max_E {E : n(E, Θᵢ) > 0}, for i = 1, …, n
3:   c ← min {n(E_max(Θ₁), Θ₁), …, n(E_max(Θₙ), Θₙ)}
4:   u_b(γ₁ E_max(Θ₁) + ⋯ + γₙ E_max(Θₙ), Θ₁, …, Θₙ) ← c
5:   n(E_max(Θᵢ), Θᵢ) ← n(E_max(Θᵢ), Θᵢ) − c, for i = 1, …, n
6: end while

We can think of σ ∈ Πⁿ as an n-dimensional matching. For any i, j, σᵢ(x) is matched with σⱼ(x), and σ(x) gives the corresponding hyper-edge. If we define the weight of each hyper-edge in the matching graph as w(σ(x)) = ∏ᵢ yᵢ(σᵢ(x))^{γᵢ}, then Z(Θ*, σ) = Σ_x w(σ(x)) corresponds to the weight of the matching represented by σ. We can therefore think of the bounds in equation (5) as given by a maximum and a minimum matching, respectively. Intuitively, the maximum matching corresponds to the case where the configurations in the high-energy buckets of the densities happen to be the same configuration (matching), so that their energies are summed up.

5.1 Upper bound

The maximum matching max_σ Z(Θ*, σ) (i.e., the upper bound on the partition function) can be computed using Algorithm 1. Algorithm 1 returns a distribution u_b such that Σ_E u_b(E) = |X| and Σ_E u_b(E) exp(E) = max_σ Z(Θ*, σ). Notice however that u_b is not a valid point-by-point upper bound on the density n(E, Θ*) of the original model.

Proposition 1. Algorithm 1 computes the maximum matching and its runtime is bounded by the total number of non-empty buckets Σᵢ |{E : n(E, Θᵢ) > 0}|.

Proof. The correctness of Algorithm 1 follows from observing that exp(E₁ + E₂) + exp(E′₁ + E′₂) ≥ exp(E₁ + E′₂) + exp(E′₁ + E₂) when E₁ ≥ E′₁ and E₂ ≥ E′₂. Intuitively, this means that for n = 2 parameters it is always optimal to connect the highest-energy configurations, therefore the greedy method is optimal. This result can be generalized to n > 2 by induction. The runtime is proportional to the total number of buckets because we remove one bucket from at least one density at every iteration.
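Algorithm 1 translates directly into a few lines over bucket dictionaries. The sketch below is our implementation, with the two tree densities of the paper's 2 × 2 Ising example hard-coded as assumed inputs; it also checks numerically that the matching bound is no looser than the geometric-average (convexity) bound, as Theorem 5 below guarantees.

```python
from math import exp, isclose

def max_matching(densities, gammas):
    """Algorithm 1: greedy maximum matching over energy buckets.
    Each density is a dict E -> count; returns u_b as such a dict."""
    dens = [dict(d) for d in densities]              # work on copies
    ub = {}
    while all(dens):                                 # buckets left everywhere
        tops = [max(d) for d in dens]                # E_max(Theta_i)
        c = min(d[t] for d, t in zip(dens, tops))
        e = sum(g * t for g, t in zip(gammas, tops))
        ub[e] = ub.get(e, 0) + c
        for d, t in zip(dens, tops):                 # consume c from each top
            d[t] -= c
            if d[t] == 0:
                del d[t]
    return ub

z = lambda n: sum(c * exp(e) for e, c in n.items())  # Z = sum_E n(E) e^E

n1 = {0: 2, 2: 6, 4: 6, 6: 2}        # spanning tree Theta1 (edge weight 2)
n2 = {0: 8, 2: 8}                    # remaining edge Theta2 (weight 2)
ub = max_matching([n1, n2], [0.5, 0.5])
Z_ub = z(ub)                         # 2e^4 + 6e^3 + 6e + 2, about 248.0
assert isclose(Z_ub, 2*exp(4) + 6*exp(3) + 6*exp(1) + 2)

# Theorem 5: the matching bound is at least as tight as the Jensen bound.
Z_jensen = z(n1) ** 0.5 * z(n2) ** 0.5               # about 281.5
assert Z_ub <= Z_jensen
```

Because each iteration empties at least one bucket, the loop runs at most once per non-empty bucket, matching the runtime claim of Proposition 1.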
The key property of Algorithm 1 is that even though it defines a matching over an exponential number of configurations |X|, its runtime is proportional to the total number of buckets, because it matches configurations in groups at the bucket level. We can show that the value of the maximum matching is at least as tight as the bound provided by the convexity of the log-partition function, which is used for example by Tree-Reweighted Belief Propagation (TRW-BP) [6].

Theorem 5. For any parameter decomposition Σᵢ₌₁ⁿ γᵢΘᵢ = Θ*, the upper bound given by the maximum matching in (5) and computed using Algorithm 1 is always at least as tight as the bound obtained using the convexity of the log-partition function.

Proof. The bound obtained by applying Jensen's inequality to the log-partition function (which is convex), given by log Z(Θ*) ≤ Σᵢ γᵢ log Z(Θᵢ) [6], leads to the geometric average bound Z(Θ*) ≤ ∏ᵢ (Σ_x yᵢ(x))^{γᵢ}. Given any n permutations of the configurations σᵢ : X → X for i = 1, …, n (in particular, the ones attaining the maximum matching value), we have

    Σ_x ∏ᵢ yᵢ(σᵢ(x))^{γᵢ} = ‖ ∏ᵢ yᵢ(σᵢ(·))^{γᵢ} ‖₁ ≤ ∏ᵢ ‖ yᵢ(σᵢ(·))^{γᵢ} ‖_{1/γᵢ} = ∏ᵢ ( Σ_x yᵢ(σᵢ(x)) )^{γᵢ} = ∏ᵢ ( Σ_x yᵢ(x) )^{γᵢ}

where we used the generalized Hölder inequality and the ℓ_p norms indicate sums over X.

5.2 Lower bound

We also provide Algorithm 2 to compute the minimum matching when there are n = 2 parameters. The proof of correctness is similar to that of Proposition 1.

Proposition 2. For n = 2, Algorithm 2 computes the minimum matching and its runtime is bounded by the total number of non-empty buckets Σᵢ |{E : n(E, Θᵢ) > 0}|.

Algorithm 2 Greedy algorithm for the minimum matching with n = 2 parameters (lower bound).
1: while there exists E such that n(E, Θᵢ) > 0 do
2:   E_max(Θ₁) ← max_E {E : n(E, Θ₁) > 0}; E_min(Θ₂) ← min_E {E : n(E, Θ₂) > 0}
3:   c ← min {n(E_max(Θ₁), Θ₁), n(E_min(Θ₂), Θ₂)}
4:   l_b(γ₁ E_max(Θ₁) + γ₂ E_min(Θ₂), Θ₁, Θ₂) ← c
5:   n(E_max(Θ₁), Θ₁) ← n(E_max(Θ₁), Θ₁) − c; n(E_min(Θ₂), Θ₂) ← n(E_min(Θ₂), Θ₂) − c
6: end while

For the minimum matching case, the induction argument does not apply and the result cannot be extended to the case n > 2. For that case, we can obtain a weaker lower bound by applying the reverse generalized Hölder inequality [14]. Specifically, let s₁, …, sₙ₋₁ < 0 and sₙ be such that Σᵢ 1/sᵢ = 1. We then have

    min_σ Z(Θ*, σ) = Σ_x ∏ᵢ yᵢ(σ_{min,i}(x))^{γᵢ} = ‖ ∏ᵢ yᵢ(σ_{min,i}(·))^{γᵢ} ‖₁        (6)
    ≥ ∏ᵢ ( Σ_x yᵢ(σ_{min,i}(x))^{sᵢγᵢ} )^{1/sᵢ} = ∏ᵢ ( Σ_x yᵢ(x)^{sᵢγᵢ} )^{1/sᵢ}        (7)

Notice that this result cannot be applied if yᵢ(x) = 0 for some i and x, i.e., if there are factors assigning probability zero (hard constraints) in the probabilistic model.

6 Empirical Evaluation

To evaluate the quality of the bounds, we consider an Ising model from statistical physics, where given a graph (V, E), single-node variables x_s, s ∈ V are Bernoulli distributed (x_s ∈ {0, 1}), and the global random vector is distributed according to

    p(x, Θ) = (1/Z(Θ)) exp( Σ_{s∈V} Θ_s x_s + Σ_{(i,j)∈E} Θ_{ij} 1{x_i = x_j} )

Figure 1 shows a simple 2 × 2 grid Ising model with exponential parameter Θ* = [0, 0, 0, 0, 1, 1, 1, 1] (Θ_s = 0 and Θ_{ij} = 1) decomposed as the convex sum of two parameters Θ¹ and Θ² corresponding to tractable distributions, i.e., Θ* = (1/2)Θ¹ + (1/2)Θ². The corresponding partition function is Z(Θ*) = 2 + 12 exp(2) + 2 exp(4) ≈ 199.9. In panels (a) and (b) we report the corresponding densities of states n(E, Θ¹) and n(E, Θ²) as histograms. For instance, for the model corresponding to Θ¹ there are only two global configurations (all variables positive and all negative) that give an energy of 6.
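The same bucket-level representation implements Algorithm 2: repeatedly pair the top bucket of the first density with the bottom bucket of the second. A sketch (our code), run on the two tree densities of the Figure 1 example:

```python
from math import exp, isclose

def min_matching(n1, n2, g1=0.5, g2=0.5):
    """Algorithm 2: greedy minimum matching for n = 2 parameters.
    Pairs the highest bucket of n1 with the lowest bucket of n2."""
    n1, n2 = dict(n1), dict(n2)                      # work on copies
    lb = {}
    while n1 and n2:
        e_hi, e_lo = max(n1), min(n2)                # E_max(T1), E_min(T2)
        c = min(n1[e_hi], n2[e_lo])
        e = g1 * e_hi + g2 * e_lo
        lb[e] = lb.get(e, 0) + c
        for n, key in ((n1, e_hi), (n2, e_lo)):      # consume c from both
            n[key] -= c
            if n[key] == 0:
                del n[key]
    return lb

lb = min_matching({0: 2, 2: 6, 4: 6, 6: 2}, {0: 8, 2: 8})
assert lb == {3.0: 2, 2.0: 12, 1.0: 2}
Z_lb = sum(c * exp(e) for e, c in lb.items())
assert isclose(Z_lb, 2*exp(3) + 12*exp(2) + 2*exp(1))   # about 134.3
```

The resulting l_b pairs high-energy configurations of one density with low-energy ones of the other, which is what makes the total weight of the matching a lower bound.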
It can be seen from the densities reported that Z(Θ¹) = 2 + 6 exp(2) + 6 exp(4) + 2 exp(6) ≈ 1180.8, while Z(Θ²) = 8 + 8 exp(2) ≈ 67.1. The corresponding geometric average (obtained from the convexity of the log-partition function) is (Z(Θ¹))^{1/2} (Z(Θ²))^{1/2} ≈ 281.5. In panels (c1) and (c2) we show u_b and l_b computed using Algorithms 1 and 2, i.e., the solutions to the maximum and minimum matching problems, respectively. For instance, for the maximum matching case the 2 configurations with energy 6 from n(E, Θ¹) are matched with 2 of the 8 with energy 2 from n(E, Θ²), giving an energy 6/2 + 2/2 = 4. Notice that u_b and l_b are not valid bounds on the individual densities of states themselves, but they nonetheless provide upper and lower bounds on the partition function as shown in the figure: 248.0 and 134.3, respectively. The bound (7) given by the inverse Hölder inequality with s₁ = −1, s₂ = 1/2 is about 16, while the mean-field lower bound [4, 7] is weaker than the minimum-matching bound as well. In this case, the additional information provided by the density leads to tighter upper and lower bounds on the partition function.

In Figure 2 we report the upper bounds obtained for several types of Ising models (in all cases, Θ_s = 0, i.e., there is no external field). In the two left plots, we consider a 5 × 5 square Ising model, once with attractive interactions (Θ_{ij} ∈ [0.1w, 0.2w]) and once with mixed interactions (Θ_{ij} ∈ [−0.1w, 0.1w]). In the two right plots, we use a complete graph (a clique) with N = 9 vertices. For each model, we compute the upper bound given by TRW-BP (with edge appearance probabilities µ_e based on a subset of randomly selected spanning trees) and the mean-field bound using the implementations in libDAI [15]. We then compute the bound based on the maximum matching using the same set of spanning trees. For the grid case, we also use a combination of 2 spanning trees and compute the corresponding lower bound based on the minimum matching (notice that it is not possible to cover all the edges in a clique with only 2 spanning trees).

[Figure 1: Decomposition of a 2 × 2 Ising model. Panels (a), (b): histograms of the densities of states n(E, Θ¹) and n(E, Θ²) (configurations vs. energy). Panels (c1), (c2): matching problem solutions u_b and l_b, with upper bound Z_ub = 2 + 6e + 6e³ + 2e⁴ and lower bound Z_lb = 2e + 12e² + 2e³.]

[Figure 2: Relative error of the upper bounds (Convexity vs. MaxMatching) as a function of edge strength. (a) 5 × 5 grid, attractive interactions; (b) 5 × 5 grid, mixed interactions; (c) 9-clique, attractive; (d) 9-clique, mixed.]

For each bound, we report the relative error, defined as (log(bound) − log Z) / log Z, where Z is the true partition function, computed using the junction tree method. In these experiments, both our upper and lower bounds improve over the ones obtained with TRW-BP [6] and mean field, respectively. The lower bound based on the minimum matching visually overlaps with the mean-field bound and is thus omitted from Figure 2. It is, however, strictly better, even if by a small amount. Notice that we might be able to get a better bound by choosing a different set of parameters Θᵢ (which may be suboptimal for TRW-BP). We also used numerical optimization (BFGS and BOBYQA [16]) to select the values of sᵢ in the inverse Hölder bound (7) (notice that once we have computed the densities n(E, Θᵢ), evaluating the bound is cheap, i.e., it does not require solving an inference task). We found that both optimization strategies are very sensitive to the initial condition; however, by optimizing the parameters we were always able to obtain a lower bound at least as good as the one given by mean field.
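The numbers in the worked example can be reproduced end to end by brute force: enumerate the 16 configurations of the 2 × 2 grid for the exact Z and compare it against the closed-form bound values. A self-contained check (the bound expressions are copied from the example above; everything else is ours):

```python
from itertools import product
from math import exp

edges = [(0, 1), (2, 3), (0, 2), (1, 3)]      # 2x2 grid, Theta_ij = 1
Z = sum(exp(sum(x[i] == x[j] for i, j in edges))
        for x in product((0, 1), repeat=4))
# Exact partition function: Z = 2 + 12 e^2 + 2 e^4, about 199.9

Z_lb = 2*exp(3) + 12*exp(2) + 2*exp(1)                 # min matching, ~134.3
Z_ub = 2*exp(4) + 6*exp(3) + 6*exp(1) + 2              # max matching, ~248.0
Z_jensen = ((2 + 6*exp(2) + 6*exp(4) + 2*exp(6))       # Z(Theta1) ...
            * (8 + 8*exp(2))) ** 0.5                   # ... x Z(Theta2), ~281.5

# The matchings sandwich Z, and the upper bound beats the convexity bound.
assert Z_lb < Z < Z_ub < Z_jensen
```

The final assertion is exactly the qualitative claim of this section: the matching-based bounds bracket the true partition function more tightly than the convexity-based geometric average.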
7 Conclusions

We presented DENSITYPROPAGATION, a novel message passing algorithm to compute the density of states while exploiting the structure of the underlying graphical model. We showed that DENSITYPROPAGATION computes the exact density for tree-structured graphical models, is closely related to the Belief Propagation and Max-Product algorithms, and is in fact a generalization of both. We introduced a new family of bounds on the partition function based on tree decomposition but without relying on convexity. We showed both theoretically and empirically that the additional information provided by the density of states leads to better bounds than standard convexity-based ones. This work opens up several interesting directions. These include an exploration of the convergence properties of (loopy) DENSITYPROPAGATION, specifically in relation to BP and MP [17, 18, 19, 20], an investigation of the existence of a variational interpretation for the updates, and devising an efficient strategy to select the parameters Θᵢ of the tree decomposition so that the proposed bound is further optimized.

References

[1] M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008.
[2] S. Ermon, C. Gomes, A. Sabharwal, and B. Selman. Accelerated adaptive Markov chain for partition function computation. In Neural Information Processing Systems, 2011.
[3] F. Wang and D.P. Landau. Efficient, multiple-range random walk algorithm to calculate the density of states. Physical Review Letters, 86(10):2050-2053, 2001.
[4] M.J. Wainwright. Stochastic processes on graphs with cycles: geometric and variational approaches. PhD thesis, Massachusetts Institute of Technology, 2002.
[5] M. Wainwright, T. Jaakkola, and A. Willsky. Exact MAP estimates by (hyper)tree agreement. In Advances in Neural Information Processing Systems, 2003.
[6] M.J. Wainwright. Tree-reweighted belief propagation algorithms and approximate ML estimation via pseudo-moment matching. In AISTATS, 2003.
[7] G. Parisi and R. Shankar. Statistical field theory. Physics Today, 41:110, 1988.
[8] L.D. Brown. Fundamentals of statistical exponential families: with applications in statistical decision theory. Institute of Mathematical Statistics, 1986.
[9] M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1):107-136, 2006.
[10] Y. Weiss, C. Yanover, and T. Meltzer. MAP estimation, linear programming and belief propagation with convex free energies. In Uncertainty in Artificial Intelligence, 2007.
[11] T. Hazan and A. Shashua. Norm-product belief propagation: Primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 56(12):6294-6316, 2010.
[12] K.P. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pages 467-475. Morgan Kaufmann Publishers Inc., 1999.
[13] J.S. Yedidia, W.T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. Exploring Artificial Intelligence in the New Millennium, 8:236-239, 2003.
[14] W.S. Cheung. Generalizations of Hölder's inequality. International Journal of Mathematics and Mathematical Sciences, 26:7-10, 2001.
[15] J.M. Mooij. libDAI: A free and open source C++ library for discrete approximate inference in graphical models. The Journal of Machine Learning Research, 11:2169-2173, 2010.
[16] M.J.D. Powell. The BOBYQA algorithm for bound constrained optimization without derivatives. University of Cambridge Technical Report, 2009.
[17] David Sontag, Talya Meltzer, Amir Globerson, Tommi Jaakkola, and Yair Weiss. Tightening LP relaxations for MAP using message passing. In Uncertainty in Artificial Intelligence, 2008.
[18] T. Meltzer, A. Globerson, and Y. Weiss. Convergent message passing algorithms: a unifying view. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.
[19] A.T. Ihler, J.W. Fisher, and A.S. Willsky. Loopy belief propagation: Convergence and effects of message errors. Journal of Machine Learning Research, 6:905-936, 2005.
[20] J.M. Mooij and H.J. Kappen. Sufficient conditions for convergence of the sum-product algorithm. IEEE Transactions on Information Theory, 53(12):4422-4437, 2007.

Density Propagation and Improved Bounds on the Partition Function

Density Propagation and Improved Bounds on the Partition Function Densty Propagaton and Improved Bounds on the Partton Functon Stefano Ermon, Carla P. Gomes Dept. of Computer Scence Cornell Unversty Ithaca NY 1853, U.S.A. Ashsh Sabharwal IBM Watson esearch Ctr. Yorktown

More information

Probabilistic & Unsupervised Learning

Probabilistic & Unsupervised Learning Probablstc & Unsupervsed Learnng Convex Algorthms n Approxmate Inference Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computatonal Neuroscence Unt Unversty College London Term 1, Autumn 2008 Convexty A convex

More information

Why BP Works STAT 232B

Why BP Works STAT 232B Why BP Works STAT 232B Free Energes Helmholz & Gbbs Free Energes 1 Dstance between Probablstc Models - K-L dvergence b{ KL b{ p{ = b{ ln { } p{ Here, p{ s the eact ont prob. b{ s the appromaton, called

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Generalized Linear Methods

1 Introduction: In ensemble methods the general idea is that by using a combination of several weak learners one can build a better learner. More formally, assume that we have a set

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

1 Outline: Introduction; Proposed method; Fractional imputation; Approximation; Variance estimation; Multiple imputation

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

IOSR Journal of Mathematics (IOSR-JM), ISSN: 2278-5728, Volume 3, Issue 3 (Sep-Oct. 2012), PP 44-48, www.iosrjournals.org. Jubran

Lecture 10 Support Vector Machines II

22 February 2016. Taylor B. Arnold, Yale Statistics, STAT 365/665. Notes: Problem 3 is posted and due this upcoming Friday. There was an early bug in the fake-test data; fixed

Power law and dimension of the maximum value for belief distribution with the max Deng entropy

Bingyi Kang, College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, 712100, China. Abstract: Deng

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

Neural Networks: Derivation, compiled by Alvin Wan from Professor Jitendra Malik's lecture. This type of computation is called deep learning and is the most popular method for many problems, such as computer vision

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

Dominique Guillot, Department of Mathematical Sciences, University of Delaware, April 20, 2016. Recall: We are given independent observations

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

Lecture Notes on Linear Regression

Feng Li, fli@sdu.edu.cn, Shandong University, China. 1 Linear Regression Problem: In a regression problem, we aim to predict a continuous target value given an input feature vector. We assume

Approximate inference using conditional entropy decompositions

Amir Globerson, Tommi Jaakkola. Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139. Abstract: We introduce a novel method for estimating

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF

10-708: Probabilistic Graphical Models, Spring 2014. Lecturer: Eric P. Xing. Scribes: Meng Song, Li Zhou. 1 Why We Need to Learn Undirected Graphical Models: In the

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

P exp(tX) = 1 + Σ_{k ∈ N} t^{2k} M_{2k}

1. Subgaussian tails. Definition. Say that a random variable X has a subgaussian distribution with scale factor σ < ∞ if P exp(tX) ≤ exp(σ² t² / 2) for all real t. For example, if X is distributed N(0, σ²) then it is subgaussian.
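The closing example can be checked directly against the definition: for X distributed N(0, σ²), completing the square in the Gaussian integral gives (in the note's notation, with P denoting expectation)

```latex
P\exp(tX)
  = \int_{-\infty}^{\infty} e^{tx}\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^{2}/(2\sigma^{2})}\,dx
  = e^{\sigma^{2}t^{2}/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\sigma^{2}t)^{2}/(2\sigma^{2})}\,dx
  = e^{\sigma^{2}t^{2}/2},
```

so the subgaussian bound holds with equality in this case.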

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Huizhen Yu, janey.yu@cs.helsinki.fi, Dept. Computer Science, Univ. of Helsinki. Probabilistic Models, Spring 2010. Notices: I corrected a number

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information
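The sensitivity being alluded to is easy to demonstrate on a small ill-conditioned system. A sketch with an invented 2×2 example, solved by Cramer's rule (this code is mine, not from the excerpted notes):

```python
def solve2(a, b, c, d, e, f):
    """Solve the 2x2 system [[a, b], [c, d]] @ [x, y] = [e, f] by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

# exact system: x + y = 2, x + 1.01y = 2.01  ->  solution (1, 1)
x, y = solve2(1, 1, 1, 1.01, 2, 2.01)
# perturb the right-hand side by 0.005: a tiny data error moves the solution a lot
xp, yp = solve2(1, 1, 1, 1.01, 2, 2.005)
print((round(x, 6), round(y, 6)))    # → (1.0, 1.0)
print((round(xp, 6), round(yp, 6)))  # → (1.5, 0.5)
```

The nearly parallel rows make the matrix ill-conditioned, so solving the approximate system exactly can still land far from the true solution.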

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng

More information

The Study of Teaching-learning-based Optimization Algorithm

The Study of Teaching-learning-based Optimization Algorithm Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of

More information
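For the pair-of-congruences case the theorem states, a minimal computational sketch (the helper name `crt_pair` is mine, not Conrad's):

```python
from math import gcd

def crt_pair(a1, n1, a2, n2):
    """Solve x = a1 (mod n1), x = a2 (mod n2) for coprime moduli n1, n2."""
    assert gcd(n1, n2) == 1, "moduli must be coprime"
    # inverse of n1 modulo n2, via Python's built-in modular inverse (3.8+)
    inv = pow(n1, -1, n2)
    # shift a1 by the multiple of n1 that lands on a2 modulo n2
    x = a1 + n1 * ((a2 - a1) * inv % n2)
    return x % (n1 * n2)

# x = 2 (mod 3) and x = 3 (mod 5): the unique solution modulo 15 is 8
print(crt_pair(2, 3, 3, 5))  # → 8
```

The theorem guarantees the answer is unique modulo n1 * n2, which is why the final reduction is safe.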

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Σ_j p(x_j) = 1 (note sigma notation); ii. Continuous random variable (e.g. Normal distribution): 1. density function: f(x) ≥ 0 and ∫ f(x) dx = 1

Random variables; measures of central tendency and variability (means and variances); joint density functions and independence; measures of association (covariance and correlation); interesting result; conditional distributions

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

Physical Fluctuomatics Applied Stochastic Process 9th Belief propagation

Physical Fluctuomatics Applied Stochastic Process 9th Belief propagation Physcal luctuomatcs ppled Stochastc Process 9th elef propagaton Kazuyuk Tanaka Graduate School of Informaton Scences Tohoku Unversty kazu@smapp.s.tohoku.ac.jp http://www.smapp.s.tohoku.ac.jp/~kazu/ Stochastc

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Probability-Theoretic Junction Trees

Probability-Theoretic Junction Trees Probablty-Theoretc Juncton Trees Payam Pakzad, (wth Venkat Anantharam, EECS Dept, U.C. Berkeley EPFL, ALGO/LMA Semnar 2/2/2004 Margnalzaton Problem Gven an arbtrary functon of many varables, fnd (some

More information

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach A Bayes Algorthm for the Multtask Pattern Recognton Problem Drect Approach Edward Puchala Wroclaw Unversty of Technology, Char of Systems and Computer etworks, Wybrzeze Wyspanskego 7, 50-370 Wroclaw, Poland

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

3.1 Expectation of Functions of Several Random Variables. Let (X_1, ..., X_k)' be a k-dimensional discrete or continuous random vector, with joint PMF p(x_1, ..., x_k).

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

MMA and GCMMA two methods for nonlinear optimization

Krister Svanberg, Optimization and Systems Theory, KTH, Stockholm, Sweden. krille@math.kth.se. This note describes the algorithms used in the author's 2007 implementations

Open Systems: Chemical Potential and Partial Molar Quantities Chemical Potential

Open Systems: Chemical Potential and Partial Molar Quantities Chemical Potential Open Systems: Chemcal Potental and Partal Molar Quanttes Chemcal Potental For closed systems, we have derved the followng relatonshps: du = TdS pdv dh = TdS + Vdp da = SdT pdv dg = VdP SdT For open systems,

More information
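The last relation can be rechecked from the first via the Legendre transform G = U + pV − TS:

```latex
dG = dU + p\,dV + V\,dp - T\,dS - S\,dT
   = (T\,dS - p\,dV) + p\,dV + V\,dp - T\,dS - S\,dT
   = V\,dp - S\,dT.
```

The dH and dA relations follow the same way from H = U + pV and A = U − TS.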

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information
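The claimed blow-up for 40 cities is easy to quantify; a quick illustrative sketch (the throughput figure of 1e9 tours per second is my own assumption, not part of the original problem set):

```python
import math

# 40 cities: 40! distinct tours; even at 10^9 tours per second this is astronomical
tours = math.factorial(40)
years = tours / 1e9 / (3600 * 24 * 365)
print(f"40! has {len(str(tours))} digits; ~{years:.1e} years at 1e9 tours/s")
```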

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

Lecture 17: Lee-Sidford Barrier

Lecture 17: Lee-Sidford Barrier CSE 599: Interplay between Convex Optmzaton and Geometry Wnter 2018 Lecturer: Yn Tat Lee Lecture 17: Lee-Sdford Barrer Dsclamer: Please tell me any mstake you notced. In ths lecture, we talk about the

More information

Tree Block Coordinate Descent for MAP in Graphical Models

Tree Block Coordinate Descent for MAP in Graphical Models ree Block Coordnate Descent for MAP n Graphcal Models Davd Sontag omm Jaakkola Computer Scence and Artfcal Intellgence Laboratory Massachusetts Insttute of echnology Cambrdge, MA 02139 Abstract A number

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information
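As a minimal illustration of the Metropolis algorithm listed in the outline (this sketch and its parameter choices are mine, not from the SNU course material):

```python
import math
import random

def metropolis(log_p, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis: propose x' = x + U(-step, step), accept w.p. min(1, p(x')/p(x))."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        # acceptance test done in log space to avoid overflow
        log_alpha = log_p(x_new) - log_p(x)
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = x_new
        samples.append(x)
    return samples

# target: standard normal, known only up to a constant via log p(x) = -x^2/2
samples = metropolis(lambda x: -x * x / 2, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
print(round(mean, 2))  # close to 0 for a long enough chain
```

Only the ratio p(x')/p(x) is ever needed, which is why the unnormalized density suffices.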

COS 521: Advanced Algorithms Game Theory and Linear Programming

Moses Charikar, February 27, 2013. In these notes, we introduce some basic concepts in game theory and linear programming (LP). We show a connection

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES

SVANTE JANSON. Abstract. We give explicit bounds for the tail probabilities for sums of independent geometric or exponential variables, possibly with different

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Inductance Calculation for Conductors of Arbitrary Shape

Inductance Calculation for Conductors of Arbitrary Shape CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Bézier curves. Michael S. Floater. September 10, 2015. These notes provide an introduction to Bézier curves.

Bézier curves. Michael S. Floater. September 10, These notes provide an introduction to Bézier curves. i=0 Bézer curves Mchael S. Floater September 1, 215 These notes provde an ntroducton to Bézer curves. 1 Bernsten polynomals Recall that a real polynomal of a real varable x R, wth degree n, s a functon of

More information

Bootstrap aggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information
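A minimal sketch of the idea, using an invented 1-nearest-neighbour base learner on toy data (none of this comes from the excerpted slides; it only illustrates "fit on bootstrap resamples, then average"):

```python
import random

def one_nn_predict(train, x):
    """1-nearest-neighbour regression: return the y of the closest training x."""
    return min(train, key=lambda pt: abs(pt[0] - x))[1]

def bagged_predict(train, x, n_models=25, seed=0):
    """Bagging: fit the base learner on bootstrap resamples and average the predictions."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]  # n draws with replacement
        preds.append(one_nn_predict(boot, x))
    return sum(preds) / len(preds)

# noisy samples around y = x; bagging averages away some single-point noise
train = [(i / 10, i / 10 + (-1) ** i * 0.5) for i in range(11)]
print(one_nn_predict(train, 0.5), round(bagged_predict(train, 0.5), 3))
```

Each bootstrap model sees a slightly different sample, so the averaged prediction is less sensitive to any one noisy point, which is the variance reduction the slide describes.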

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information
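The rate-parameter bias is easy to reproduce by simulation; a sketch with parameter choices of my own, not from Lim's note. For an exponential with rate λ, theory gives E[λ̂] = nλ/(n−1) for the MLE λ̂ = 1/x̄, so with λ = 2 and n = 5 the estimate should average about 2.5:

```python
import random

def mle_rate_mean(rate=2.0, n=5, trials=20000, seed=1):
    """Monte Carlo estimate of E[lambda_hat] for the exponential MLE lambda_hat = 1/xbar."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xbar = sum(rng.expovariate(rate) for _ in range(n)) / n
        total += 1.0 / xbar  # the MLE of the rate on this sample
    return total / trials

print(round(mle_rate_mean(), 2))  # near 2.5, clearly above the true rate 2.0
```

By contrast, x̄ itself is an unbiased estimate of the mean 1/λ; the bias appears only after the nonlinear inversion.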

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES

BÂRZĂ, Silviu. Faculty of Mathematics-Informatics, Spiru Haret University. barza_silviu@yahoo.com. Abstract: This paper aims to continue

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence)

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence) /24/27 Prevew Fbonacc Sequence Longest Common Subsequence Dynamc programmng s a method for solvng complex problems by breakng them down nto smpler sub-problems. It s applcable to problems exhbtng the propertes

More information
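The two preview topics map onto the two standard dynamic-programming styles, top-down with memoization and bottom-up tabulation; a minimal sketch (not from the slides themselves):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Top-down DP: each overlapping subproblem is computed once and cached."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def lcs_len(a, b):
    """Bottom-up DP table for the length of the longest common subsequence."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

print(fib(30))                        # → 832040
print(lcs_len("AGGTAB", "GXTXAYB"))   # → 4 ("GTAB")
```

Both rely on the same two properties the slide names: overlapping subproblems and optimal substructure.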

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

= z^20 Σ_n C(n+4, 4) z^n, so [z^k] = C((k−20)+4, 4)

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information
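The binomial-series answer can be cross-checked by brute-force polynomial multiplication, assuming (from the z^20 factor in the answer) that the product is (z^4 + z^5 + z^6 + ...)^5:

```python
from math import comb

def poly_mul(p, q, max_deg):
    """Multiply coefficient lists p and q, truncating at degree max_deg."""
    out = [0] * (max_deg + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= max_deg:
                out[i + j] += a * b
    return out

max_deg = 30
base = [0, 0, 0, 0] + [1] * (max_deg - 3)  # z^4 + z^5 + ... truncated at degree 30
poly = [1]
for _ in range(5):
    poly = poly_mul(poly, base, max_deg)

k = 25
# closed form from z^20 * sum_n C(n+4, 4) z^n: [z^k] = C((k-20)+4, 4)
print(poly[k], comb(k - 20 + 4, 4))  # → 126 126
```

Truncating at degree 30 is safe here because no exponent combination above 30 can contribute to the z^25 coefficient.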

arXiv:cs.CV/0006040, 28 Jun 2000

arxiv:cs.cv/ Jun 2000 Correlaton over Decomposed Sgnals: A Non-Lnear Approach to Fast and Effectve Sequences Comparson Lucano da Fontoura Costa arxv:cs.cv/0006040 28 Jun 2000 Cybernetc Vson Research Group IFSC Unversty of São

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

1 Motivation and Introduction

1 Motivation and Introduction Instructor: Dr. Volkan Cevher EXPECTATION PROPAGATION September 30, 2008 Rce Unversty STAT 63 / ELEC 633: Graphcal Models Scrbes: Ahmad Beram Andrew Waters Matthew Nokleby Index terms: Approxmate nference,

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Lecture 7: Boltzmann distribution & Thermodynamics of mixing

Lecture 7: Boltzmann distribution & Thermodynamics of mixing Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

A new construction of 3-separable matrices via an improved decoding of Macula s construction

A new construction of 3-separable matrices via an improved decoding of Macula s construction Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula

More information

CSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing

CSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann

More information

University of Washington Department of Chemistry Chemistry 453 Winter Quarter 2015

University of Washington Department of Chemistry Chemistry 453 Winter Quarter 2015 Lecture 2. 1/07/15-1/09/15 Unversty of Washngton Department of Chemstry Chemstry 453 Wnter Quarter 2015 We are not talkng about truth. We are talkng about somethng that seems lke truth. The truth we want

More information

Credit Card Pricing and Impact of Adverse Selection

Credit Card Pricing and Impact of Adverse Selection Credt Card Prcng and Impact of Adverse Selecton Bo Huang and Lyn C. Thomas Unversty of Southampton Contents Background Aucton model of credt card solctaton - Errors n probablty of beng Good - Errors n

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information