Duality and Auxiliary Functions for Bregman Distances

Size: px
Start display at page:

Download "Duality and Auxiliary Functions for Bregman Distances"

Transcription

1 Dualty and Auxlary Functons for Bregman Dstances Stephen Della Petra, Vncent Della Petra, and John Lafferty October 8, 2001 CMU-CS School of Computer Scence Carnege Mellon Unversty Pttsburgh, PA Abstract We formulate and prove a convex dualty theorem for Bregman dstances and present a technque based on auxlary functons for dervng and provng convergence of teratve algorthms to mnmze Bregman dstance subject to lnear constrants. Ths research was partally supported by the Advanced Research and Development Actvty n Informaton Technology (ARDA), contract number MDA C-2106, and by the Natonal Scence Foundaton (NSF), grant CCR The vews and conclusons contaned n ths document are those of the authors and should not be nterpreted as representng the offcal polces, ether expressed or mpled, of ARDA, NSF, or the U.S. government.

2 Keywords: Bregman dstance, convex dualty, Legendre functons, auxlary functons

3 I. Introducton Convexty plays a central role n a wde varety of machne learnng and statstcal nference problems. A standard paradgm s to dstngush a preferred member from a set of canddates based upon a convex mpurty measure or loss functon talored to the specfc problem to be solved. Examples nclude least squares regresson, decson trees, boostng, onlne learnng, maxmum lkelhood for exponental models, logstc regresson, maxmum entropy, and support vector machnes. Such problems can often be naturally cast as convex optmzaton problems nvolvng a Bregman dstance, whch can lead to new algorthms, analytcal tools, and nsghts derved from the powerful methods of convex analyss. In ths paper we formulate and prove a convex dualty theorem for mnmzng a general class of Bregman dstances subject to lnear constrants. The dualty result s then used to derve teratve algorthms for solvng the assocated optmzaton problem. Our presentaton s motvated by the recent work of Collns, Schapre, and Snger (2001), who showed how certan boostng algorthms and maxmum lkelhood logstc regresson can be unfed wthn the framework of Bregman dstances. In partcular, specfc nstances of the results gven here are used by Collns et al. (2001) to show the convergence of a famly teratve algorthms for mnmzng the exponental or logstc loss. Whle nvokng methods from convex analyss can unfy and clarfy the relatonshp between dfferent methods, the hgher level of abstracton often comes at a prce, snce there can be consderable techncaltes. For example, n some treatments the assumptons on the convex functons that can be used to defne Bregman dstances are very techncal and dffcult to verfy. Here we trade off generalty for relatve smplcty by workng wth a restrcted class of Bregman dstances, whch however ncludes many of the examples that arse n machne learnng. Our treatment of dualty and auxlary functons for Bregman dstances closely parallels the results presented by Della Petra et al. (1997) for the Kullback-Lebler dvergence. In partcular, the statement and proof of the dualty theorem gven n (Della Petra et al., 1997) carres over wth only a few changes to the class of Bregman dstances we consder. Our approach dffers from much of the lterature n convex analyss n several ways. Frst, we work prmarly wth the argument at whch a convex conjugate takes on ts value, rather than the value of the functon tself. The reason for ths s that the argument corresponds to a statstcal model, whch s the man object of nterest n statstcal or machne learnng applcatons, whle the value corresponds to a lkelhood or loss functon. Second, whle Bregman dstances are typcally defned only on the nteror of the doman of the underlyng convex functon, we assume that there s a contnuous extenson to the entre doman. Ths makes t possble to formulate a very natural dualty theorem that also ncludes many cases requred n practce, when the desred model may le on the boundary of the doman. The followng secton recalls the standard defntons from convex analyss that wll be requred, and presents the techncal assumptons made on the class of Bregman dstances that we work wth. We also ntroduce some new termnology, usng the terms Legendre-Bregman conjugate and Legendre- Bregman projecton to extend the classcal noton of the Legendre conjugate and transform to Bregman dstances. Secton 3 contans the statement and proof of the dualty theorem that connects the prmal problem wth ts dual, showng that the soluton s characterzed n geometrcal terms 1

4 by a Pythagorean equalty. Secton 4 defnes the noton of an auxlary functon, whch s used to construct teratve algorthms for solvng constraned optmzaton problems. Ths secton shows how convexty can be used to derve an auxlary functon for Bregman dstances based on separable functons. The last secton summarzes the man results of the paper. II. Bregman Dstances and Legendre-Bregman Projectons In ths secton we begn by establshng our notaton and recallng the relevant notons from convex analyss that we requre; the classc text (Rockafellar, 1970) remans one of the best references for ths materal. We then defne Bregman dstances and ther assocated conjugate functons and projectons, and derve varous relatons between these that wll be mportant n provng the dualty theorem. Next we state our assumptons on the underlyng convex functon that enable us to derve these propertes for the contnous extenson of the Bregman dstance to the entre doman. A. Notaton and Basc Defntons We wll use notaton that s suggestve of our man applcatons: rather than φ(x) we wll wrte φ(p) or φ(q), havng n mnd probablty dstrbutons p or q. A convex functon φ : S R m [, + ] s proper f there s no q S wth φ(q) = and there s some q wth φ(q). The effectve doman of φ, denoted φ, s the set of ponts where φ s fnte: φ = {q S φ(q) < }; for brevty we usually refer to φ as smply the doman of φ. A proper convex functon s closed f t s lower sem-contnuous. The conjugate φ of φ s gven by φ (v) = sup ( q, v φ(q)) (2.1) q S A proper convex functon φ s sad to be essentally smooth or steep f t s dfferentable on the nteror of ts doman nt( φ ), and f lm n φ(q n ) = + whenever q n s a sequence n nt( φ ) convergng to a pont on the boundary of nt( φ ). The functon φ s sad to be coercve n case the level set {q S φ(q) c} s bounded for every c R. Defnton 2.1. Let φ be a closed, convex and proper functon defned on S R m, such that φ s dfferentable on nt( φ ). The Bregman dstance D φ : φ nt( φ ) [0, ) s defned by D φ (p, q) = φ(p) φ(q) φ(q), p q (2.2) Bregman dstance can be nterpreted as a measure of the convexty of φ. Ths s easy to vsualze n the one-dmensonal case: drawng a tangent lne to the graph of φ at q, the Bregman dstance D φ (p, q) s seen as the vertcal dstance between ths lne and the pont φ(p). Legendre functons are a very well behaved famly of convex functons that wll make workng wth Bregman dstances much easer. Defnton 2.2. A closed convex functon φ s Legendre, or a convex functon of Legendre type, n case nt( φ ) s convex and φ s essentally smooth and strctly convex on nt( φ ). 2

5 The prmary propertes that make workng wth Legendre functons convenent are summarzed n the followng results quoted from (Rockafellar, 1970). Proposton 2.3. (Rockafellar, 1970; Theorem 26.5) If φ s a convex functon of Legendre type then φ : nt( φ ) nt( φ ) s a bjecton, contnuous n both drectons, and φ = ( φ) 1. Proposton 2.4. Suppose that φ s Legendre, and ψ s a proper closed, convex functon that s also essentally smooth. Then φ + ψ s Legendre. In partcular, snce for fxed q nt( φ ) the mappng p φ(q), p q + φ(q) s affne lnear, the functon p D φ (p, q) s Legendre wth doman φ, and wth conjugate doman φ φ(q). Defnton 2.5. For φ a convex functon of Legendre type we defne the Legendre-Bregman conjugate l φ : nt( φ ) R m R { } as l φ (q, v) = sup p φ ( v, p D φ (p, q)) (2.3) We defne the Legendre-Bregman projecton L φ : nt( φ ) R m φ as whenever ths s well defned. L φ (q, v) = arg max p φ ( v, p D φ (p, q)) (2.4) Let us explan our choce of termnology n the above defnton, whch s nonstandard. When h s a convex functon of Legendre type, the Legendre conjugate, as defned n (Rockafellar 1970; Chapter 26), corresponds to the convex conjugate h. For fxed q, our defnton of the Legendre- Bregman conjugate s smply the classcal Legendre conjugate for the convex functon h(p) = D φ (p, q). Note that Rockafellar defnes the Legendre transform as the mappng from the orgnal convex functon (and doman) to ts Legendre conjugate (and assocated doman). The Legendre- Bregman projecton L φ (q, v) s the actual argument at whch the maxmum s attaned. As shown by the followng result, our use of the term projecton accords wth the standard termnology of Bregman projectons. Proposton 2.6. Let φ be Legendre. Then for q nt( φ ) and v nt( φ ) φ(q), the Legendre-Bregman projecton s gven explctly by Moreover, t can be wrtten as a Bregman projecton L φ (q, v) = ( φ ) ( φ(q) + v) (2.5) L φ (q, v) = arg mn D φ (p, q) (2.6) p φ H for the hyperplane H = {p R m p, v = b} wth b = L φ (q, v), v. 3

6 Proof. satsfy: To prove the frst statement, note that a statonary pont p of p, v D φ (p, q) must Snce φ s Legendre, for φ(q) + v nt( φ ), we then have that φ(p ) = φ(q) + v (2.7) p = ( φ) 1 ( φ(q) + v) (2.8) = ( φ )( φ(q) + v) (2.9) usng the fact that ( φ) 1 s well defned and equal to φ from Proposton 2.3. The second statement follows from, for example, the results n Secton 2.2 of Censor and Zenos (1997) on projectons onto hyperplanes, notng that every Legendre functon s zone consstent. In our formulaton of the dualty theorem, t s the Legendre-Bregman projecton L φ (q, v) rather than the conjugate functon l φ (q, v) that plays a central role, leadng to a natural and smple statement of convex dualty for Bregman dstances. Ths projecton corresponds to a statstcal model, whch s the prmary object of nterest for machne learnng problems. B. Basc Relatons We now derve some basc algebrac relatons between D φ, L φ, and l φ. These relatons wll be mportant n establshng the geometrcal aspects of the dualty theorem, as well as for dervng auxlary functons. In order to free us from havng to specfy the doman of l φ and L φ, we wll n the followng assume that φ = R m. Proposton 2.7. Let φ be Legendre, wth φ = R m. For fxed p φ, D φ (p, L φ (q, v)) s contnuous n q nt( φ ) and convex n v. Together, the Legendre-Bregman conjugate and projecton satsfy D φ (p, q) D φ (p, L φ (q, v)) = v, p l φ (q, v) (2.10) for all p φ, q nt( φ ) and v R m. = D(L φ (q, v), q) + v, p L φ (q, v) (2.11) Proof. Usng the defnton of Bregman dstance and Proposton 2.6, we have for q nt( φ ) that D φ (p, q) D φ (p, L φ (q, v)) = φ(l φ (q, v)) φ(q) + φ(l φ (q, v)), p L φ (q, v) φ(q), p q (2.12) = φ(l φ (q, v)) φ(q) + φ(q) + v, p L φ (q, v) φ(q), p q (2.13) = v, p v, L φ (q, v) + φ(l φ (q, v)) φ(q) φ(q), L φ (q, v) q (2.14) = v, p v, L φ (q, v) + D φ (L φ (q, v), q) (2.15) = v, p l φ (q, v) (2.16) 4

7 Therefore (2.10) holds for p φ and q nt( φ ). From the defnton of the Legendre-Bregman conjugate we have for q nt( φ ), that l φ (q, v) = v, L φ (q, v) D φ (L φ (q, v), q) (2.17) Equaton (2.24) results from combnng ths wth (2.10). The convexty of D φ (p, L φ (q, v)) n v follows from (2.10), whch expresses D φ (p, L φ (q, v)) as a sum of the convex functons l φ (q, v) and D φ (p, q) v, p. The next result shows how D φ (p, L φ (q, v)) vares wth v, and wll be an mportant ngredent n the dualty result of the next secton. Proposton 2.8. Let φ be Legendre, wth p φ and q nt( φ ). Then for v R m, the mappng t D φ (p, L φ (q, tv)) s dfferentable at t = 0, wth dervatve d D φ (p, L φ (q, tv)) = v, q v, p. (2.18) dt t=0 Proof. Snce φ s Legendre, f q nt( φ ) then L φ (q, tv) nt( φ ); see for example Theorem 3.12 of (Bauschke & Borwen, 1997). From Proposton 2.7 we have that d dt D φ (p, L φ (q, tv)) = d dt ( tv, L φ (q, tv) tv, p + D φ (L φ (q, tv), q)) (2.19) t=0 t=0 = v, q v, p + d d dt φ(l φ (q, tv)) φ(q), dt L φ (q, tv) (2.20) t=0 t=0 = v, q v, p (2.21) whch proves the result for p φ and q nt( φ ). C. The Contnuous Extenson The results above are gven n terms of the Bregman dstance usng ts standard defnton as a functon on φ nt( φ ). We now make assumptons that allow us to work wth D φ as an extended real-valued functon on φ φ. Ths enables us to formlate a very natural and general dualty result, presented n the followng secton. Informally, we assume that D φ extends contnously from φ nt( φ ) to φ φ, and that L φ extends contnously from nt( φ ) R m to φ R m. In addton, we requre a form of compactness to guarantee the exstence of certan mnmzers. As before, n order to smplfy the presentaton we assume that the range of φ s all of R m. 5

8 Thus, we make the followng assumptons on φ: A1. φ s of Legendre type; A2. φ = R m ; A3. D φ extends to a functon D φ : φ φ [0, ] such that D φ (p, q) s jontly contnuous n p and q, and satsfes D φ (p, q) = 0 f and only f p = q. A4. L φ extends to a functon L φ : φ R m φ such that L φ (q, v) s jontly contnuous n q and v, and satsfes L φ (q, 0) = q. A5. D φ (p, ) s coercve for every p φ \nt( φ ); Note that snce for a Legendre functon φ s contnous on nt( φ ) (Proposton 2.3), t follows from Defnton 2.1 that D φ (p, q) s jontly contnous on φ nt( φ ) and from Proposton 2.6 that L φ (q, v) s jontly contnous on nt( φ ) R m. We also note that snce we assume φ = R m, D φ (p, ) s automatcally coercve for p nt( φ ). Together, condtons A1 A5 mply that φ s a Bregman-Legendre functon as defned by Bauschke and Borwen (1997). Now, from the defnton of the Legendre-Bregman conjugate we have l φ (q, v) = v, L φ (q, v) D φ (L φ (q, v), q) (2.22) for q nt( φ ). Propertes A4 and A5 allow us to defne l φ : φ R m R as the contnuous extenson of l φ : nt( φ ) R m R, satsfyng the same dentty. Thus, the Legendre-Bregman conjugate l φ (q, v) s contnuous n q, contnuous and convex n v, and satsfes l φ (q, 0) = 0. Proposton 2.7 now generalzes to the contnuous extenson. Proposton 2.9. Let φ satsfy A1 A4. For fxed p φ, D φ (p, L φ (q, v)) s contnuous n q and convex n v. Together, the Legendre-Bregman conjugate and projecton satsfy D φ (p, q) D φ (p, L φ (q, v)) = v, p l φ (q, v) (2.23) = D(L φ (q, v), q) + v, p L φ (q, v) (2.24) for all p, q φ and v R m. Proof. Ths follows drectly from the jont contnuty of D φ, L φ, and l φ. The dfferental dentty n Proposton 2.8 also extends. Proposton Let φ satsfy A1 A4, and let p, q φ wth D φ (p, q) <. Then for v R m, the mappng t D φ (p, L φ (q, tv)) s dfferentable at t = 0, wth dervatve d D φ (p, L φ (q, tv)) = v, q v, p. (2.25) dt t=0 6

9 Proof. Let q nt( φ ). From Proposton 2.8 we know that d D φ (p, L φ (q, tv)) = v, q v, p (2.26) dt t=0 Thus, snce L φ (q, (t + s)v) = L φ (L φ (q, tv), sv), we also have that d dt D(p, L φ(q, tv)) = v, L φ (q, tv) v, p (2.27) To show that the result holds when q φ \nt( φ ), we ll use a fact from elementary analyss: f f n f, and f n s contnuous wth f n(t) g(t) unformly for t [a, b], then g s contnous and f (t) = g(t). Frst, let q nt( φ ) and p / nt( φ ). Suppose p n nt( φ ) wth p n p. Snce L φ (q, tv) nt( φ ), we have from the above calculaton that d dt D(p n, L φ (q, tv)) = v, L φ (q, tv) v, p n v, L φ (q, tv) v, p (2.28) where the convergence s unform on every nterval [a, b] around zero; property (2.25) follows. Now suppose that p φ, q φ \nt( φ ), and q n nt( φ ) wth q n q. Because L φ (q, v) s jontly contnous n q and v, t s unformly contnuous on every compact set of (q, v). In partcular, v, L φ (q n, tv) v, p converges unformly n t to v, L φ (q, tv) v, p on every nterval [a, b]. Thus property (2.25) holds for all p, q φ. Proposton 2.9 and 2.10 are the man computatons that we wll requre n the followng secton. III. Dualty In ths secton we derve the man dualty result. The setup s that we have a set of features f (j) R m, j = 1, 2,..., n and denote by F the m n matrx wth columns gven by the f (j). These features correspond to the weak learners n boostng, or to the suffcent statstcs n an exponental model. The prmal problem constrans the values p, f (j), and these constrants carry over to Lagrange multplers n a famly of Legendre-Bregman projectons L(q, Fλ) n the dual problem. Defnton 3.1. For a gven element p 0 φ, the feasble set for p 0 and F s defned by { } P(p 0, F ) = p φ p, f (j) = p 0, f (j), j = 1,... n (3.1) For a gven q 0 φ, the Legendre-Bregman projecton famly for q 0 and F s defned by Q(q 0, F ) = {q φ q = L φ (q 0, Fλ) for some λ R n } (3.2) Trvally, both sets are non-empty snce p 0 P(p 0, F ) and q 0 Q(q 0, F ). Snce p 0, q 0, and F wll be fxed, we wll use abbrevated notaton and refer to these sets as P and Q. We use Q to denote the closure of Q(q 0, F ) as a subset of R m. Dualty relates the projecton onto P to the projecton onto Q. 7

10 Proposton 3.2. Let φ satsfy A1 A5, and suppose that p 0, q 0 φ wth D φ (p 0, q 0 ) <. Then there exsts a unque q φ satsfyng the followng four propertes: (1) q P Q (2) D φ (p, q) = D φ (p, q ) + D φ (q, q) for any p P and q Q (3) q = arg mn p P (4) q = arg mn q Q D φ (p, q 0 ) D φ (p 0, q) Moreover, any one of these four propertes determnes q unquely. In order to prove Proposton 3.2, we frst prove two lemmas. The frst shows that there s at least one member n common between P and Q; the second shows that the Pythagorean equalty (2) holds for any such member. Lemma 3.3. If D φ (p 0, q 0 ) < then P Q s nonempty. Proof. Note that snce D φ (p 0, q 0 ) <, D φ (p 0, q) s not dentcally on Q. Also, the mappng λ D φ (p 0, L φ (q 0, Fλ)) s contnuous and convex. Let R be the level set R = {q φ D φ (p 0, q) D(p 0, q 0 )} (3.3) We know from Assumpton A5 that R s bounded. Thus D φ (p 0, q) attans ts mnmum at a (not necessarly unque) pont q Q R Q. We wll show that q s also n P. Let q Q, and let µ j R n be such that q = lm j L φ (q 0, F µ j ). Then by the contnuty of L φ (, ), L φ (q, Fλ) = lm j L φ (L φ (q 0, F µ j ), Fλ) (3.4) = lm j L φ (q 0, F (µ j + λ)) Q (3.5) Thus Q s closed under the mappng q L φ (q, Fλ) for λ R m, and L φ (q, Fλ) s n Q for any λ. By the defnton of q, t follows that λ = 0 s a mnmum of the functon λ D φ (p 0, L φ (q, Fλ)). Takng dervatves wth respect to λ and usng Proposton 2.10 we conclude that q, f = p 0, f ; thus q P. Lemma 3.4. If q P Q then the Pythagorean equalty D φ (p, q ) = D φ (p, q ) + D φ (q, q ) holds for any p P and q Q. Proof. that Suppose that p 1, p 2, q 1, q 2 φ wth q 2 = L φ (q 1, Fλ). From Proposton 2.9 we have D φ (p 1, q 1 ) D φ (p 1, q 2 ) = p 1, Fλ l φ (p 1, Fλ) (3.6) 8

11 and smlarly Therefore, D φ (p 2, q 1 ) D φ (p 2, q 2 ) = p 2, Fλ l φ (p 2, Fλ) (3.7) D φ (p 1, q 1 ) D φ (p 1, q 2 ) D φ (p 2, q 1 ) + D φ (p 2, q 2 ) = p 1, Fλ p 2, Fλ (3.8) n ) = λ j ( p 1, f (j) p 2, f (j) It follows from ths dentty and the contnuty of D φ that D φ (p 1, q 1 ) D φ (p 1, q 2 ) D φ (p 2, q 1 ) + D φ (p 2, q 2 ) = 0 (3.9) f p 1, p 2 P and q 1, q 2 Q. The lemma follows by takng p 1 = q 1 = q. Proof of Proposton 3.2. Choose q to be any pont n P Q. Such a q exsts by Lemma 3.3. It satsfes property (1) by defnton, and t satsfes property (2) by Lemma 3.4. As a consequence of property (2), t also satsfes propertes (3) and (4). To check property (3), note that f q s any pont n Q, then D φ (p 0, q ) = D φ (p 0, q )+D φ (q, q ) D φ (p 0, q ). Smlarly, property (4) must hold snce f p s any pont n P, then D φ (p, q 0 ) = D φ (p, q ) + D φ (q, q 0 ) D(q, q 0 ). It remans to prove that each of the four propertes (1) (4) determnes q unquely. In other words, we need to show that f m s a pont n φ satsfyng any of the four propertes (1) (4), then m = q. Suppose that m satsfes property (1). Then property (2) wth p = q = m mples that D φ (m, m) = D φ (m, q )+D φ (q, m). Snce D φ (m, m) = 0, t follows that D φ (m, q ) = 0 so m = q. If m satsfes property (2), then the same argument wth q and m reversed proves that m = q. Suppose that m satsfes property (3). Then D φ (p 0, q ) D φ (p 0, m) = D φ (p 0, q ) + D φ (q, m) (3.10) where the second equalty follows from property (2) for q. Thus D φ (q, m) 0 so m = q. If m satsfes property (4), then showng once agan that m = q. D φ (q, q 0 ) D φ (m, q 0 ) = D φ (m, q ) + D φ (q, q 0 ) (3.11) In the followng secton we outlne the auxlary functon method for buldng teratve algorthms to compute q, and show how to use convexty to derve an auxlary functon for separable Bregman dstances. IV. Auxlary Functons The auxlary functon approach s conceptually smple: bound the change n Bregman dstance from below usng a functon that s easy to compute and that decouples the constrants. Maxmzng 9

12 ths auxlary functon we obtan new parameters λ = λ + λ and a new model q λ+ λ gven by q λ+ λ = L φ (q λ, F λ) (4.1) = L φ (L φ (q 0, Fλ), F λ) (4.2) = L φ (q 0, F (λ + λ)) (4.3) We then use the dualty theorem to show that when λ = 0, we must have that q = q. The strategy s very smlar to EM. In an EM algorthm, the Q-functon Q(λ, λ) s computed as a lower bound to the change n log-lkelhood: p 0 (x) log q (x λ ) = h p 0 (x) log ) q (x λ) q (x λ) x x (4.4) = p 0 (x) log q (h x, λ) q (x, h λ ) q (x, h λ) x h (4.5) q (h x, λ) log q (x, h λ ) q (x, h λ) (4.6) h def = Q(λ, λ) (4.7) where (x, h) s the complete data, x s the ncomplete data, and the nequalty follows from the concavty of the logarthm. After computng Q n the E-step, t s then maxmzed over λ n the M-step. In the same way, for Bregman dstances the am s to derve an auxlary functon A(λ, λ) whose calculaton can be carred out effcently n somethng lke an E-step, and such that t can be easly maxmzed over λ n an M-step. However, just as for EM, ths s a general strategy more than t s a precse algorthm. A partcular Bregman dstance problem may requre some ngenuty n order to come up wth an approprate auxlary functon. Ths s the general motvaton behnd the followng two defntons. Defnton 4.1. A functon A : φ R n R s called an auxlary functon for p 0 and F n case 1. A(q, λ) s contnuous n q and A(q, 0) = 0 2. D φ (p 0, q ) D φ (p 0, L φ (q, Fλ)) A(q, λ) 3. If λ = 0 s a maxmum of A(q, λ), then q, f (j) = p 0, f (j) for j = 1,..., n. Defnton 4.2. Let A be an auxlary functon and q 0 φ. The update sequence for q 0 wth respect to A s defned by q (0) = q 0 and q (t+1) def = L φ (q (t), Fλ (t) ) where λ (t) = arg max λ A(q (t), λ) (4.8) The reason for defnng auxlary functons n ths way s the followng result, whch can be proved n a smlar way to Proposton 5 n (Della Petra et al., 1997) or Lemma 2 n (Collns et al., 2001). The compactness assumpton wll n general follow from coercvty n Assumpton A5. 10

13 Proposton 4.3. Suppose that the sequence q (t) les n a compact set. Then lm q (t) = arg mn D φ (p 0, q ) (4.9) t q Q As we now explan, auxlary functons can be convenently constructed by usng the relaton D φ (p, q ) D φ (p, L φ (q, v)) = p, v l φ (q, v) (4.10) from Proposton 2.9 and explotng the convexty of l φ (q, v). For smplcty, we wll assume that φ s separable, so that φ(p) = φ (p ) wth each φ : R R satsfyng propertes A1 A5 (wth m = 1). Auxlary functons n the general case can be derved usng smlar arguments. For the separable case, clearly D φ (p, q ) = l φ (q, v) = D φ (p, q ) (4.11) l φ (q, v ) (4.12) where l φ (q, v) = sup p φ (pv D φ (p, q)) s the Legendre-Bregman conjugate of φ. Proposton 4.4. sgn(f (j) ). Then A(q, λ) For each = 1,..., m, select N so that n (j) f N, and set s j = def = n λ j p 0, f (j) 1 n N f (j) l φ (q, s j N λ j ) (4.13) s an auxlary functon for p 0 and F, and the correspondng update scheme s gven by q (t+1) = L φ (q (t), F λ (t) ) (4.14) where λ (t) j satsfes f (j) L φ (q (t), s j N λ j ) = p 0, f (j) (4.15) The dea of factorng out the sgns s j s taken from Collns et al. (2001). If we take N = 1 for all we obtan the algorthms presented n that paper. If we take N = (j) j f then we obtan algorthms analogous to the IIS algorthm of Della Petra et al. (1997). Proof. We verfy that the functon defned n (4.13) satfes the three propertes of Defnton 4.1. Property (1) of the defnton holds snce l φ (q, 0) = 0. Property (2) follows from the convexty of l φ. In partcular, we have the nequalty l φ (q, F λ) = l φ (q, (F λ) ) (4.16) 11

14 = = l φ (q, n s j F j λ j ) (4.17) n F j l φ (q, s j N λ j ) + 1 N n n F j l φ (q, 0) (4.18) N F j N l φ (q, s j N λ j ) (4.19) whch together wth Proposton 2.9 says that D φ (p, q ) D φ (p, L φ (q, F λ)) n λ j p, f (j) n F j N l φ (q, s j N λ j ) (4.20) Now, usng Propostons 2.9 and 2.10, t can be shown that v l φ (q, v) = L φ (q, v) (4.21) whch shows that (4.15) s the correct update. Therefore, at a maxmum λ of A(q, λ) we have that p 0, f (j) = f (j) L φ (q, s j N λ j ) for each j (4.22) If λ = 0, then p 0, f (j) = f (j) L φ (q, 0) = q, f (j) (4.23) showng that Property (3) holds. Thus (4.13) defnes an auxlary functon. Whle the auxlary functon (4.13) looks a bt messy, both the E-step and M-step for ths type of auxlary functon are generally qute practcal and easy to mplement. V. Concluson Ths paper has presented a convex dualty theorem for constraned optmzaton usng Bregman dstances. The man result, Proposton 3.2, dffers from results presented n the convex analyss lterature n that the Bregman dstance s defned on the entre essental doman, rather than only on the nteror. Ths generalty s needed n many applcatons. Though the assumptons A1 A5 that we make on the underlyng convex functon are farly restrctve, t may well be possble to relax these assumptons to cover a broader class of examples. In partcular, assumpton A2, whch states that the conjugate doman s all of R m, may not be essental n our approach. The auxlary functon technque presented n Secton 4 s a general and practcal method for dervng algorthms for solvng the dual problem. Although the specfc auxlary functon we derve assumes the Bregman dstance s separable, smlar arguments can be used for non-separable 12

15 Bregman dstances. The analyss gven here makes clear the role of convexty, as the bounds are derved usng only the propertes of the underlyng Legendre-Bregman conjugate. Acknowledgments We are grateful to Rob Schapre for encouragng us to wrte ths paper, and for helpful questons and suggestons. We also thank Jonathan Borwen and Henz Bauschke for comments on the paper. References Bauschke, H., & Borwen, J. (1997). Legendre functons and the method of random Bregman projectons. Journal of Convex Analyss, 4, Censor, Y., & Zenos, S. A. (1997). Parallel optmzaton: Theory, algorthms and applcatons. Oxford Unversty Press. Collns, M., Schapre, R., & Snger, Y. (2001). Logstc regresson, AdaBoost, and Bregman dstances. Machne Learnng. In press. Della Petra, S., Della Petra, V., & Lafferty, J. (1997). Inducng features of random felds. IEEE Transactons on Pattern Analyss and Machne Intellgence, 19, Rockafellar, R. T. (1970). Convex analyss. Prnceton Unversty Press. 13

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

3.1 ML and Empirical Distribution

3.1 ML and Empirical Distribution 67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpenCourseWare http://ocw.mt.edu 6.854J / 18.415J Advanced Algorthms Fall 2008 For nformaton about ctng these materals or our Terms of Use, vst: http://ocw.mt.edu/terms. 18.415/6.854 Advanced Algorthms

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution.

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution. Solutons HW #2 Dual of general LP. Fnd the dual functon of the LP mnmze subject to c T x Gx h Ax = b. Gve the dual problem, and make the mplct equalty constrants explct. Soluton. 1. The Lagrangan s L(x,

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space. Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Perfect Competition and the Nash Bargaining Solution

Perfect Competition and the Nash Bargaining Solution Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

Lecture 17: Lee-Sidford Barrier

Lecture 17: Lee-Sidford Barrier CSE 599: Interplay between Convex Optmzaton and Geometry Wnter 2018 Lecturer: Yn Tat Lee Lecture 17: Lee-Sdford Barrer Dsclamer: Please tell me any mstake you notced. In ths lecture, we talk about the

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Spectral Graph Theory and its Applications September 16, Lecture 5

Spectral Graph Theory and its Applications September 16, Lecture 5 Spectral Graph Theory and ts Applcatons September 16, 2004 Lecturer: Danel A. Spelman Lecture 5 5.1 Introducton In ths lecture, we wll prove the followng theorem: Theorem 5.1.1. Let G be a planar graph

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Perron Vectors of an Irreducible Nonnegative Interval Matrix

Perron Vectors of an Irreducible Nonnegative Interval Matrix Perron Vectors of an Irreducble Nonnegatve Interval Matrx Jr Rohn August 4 2005 Abstract As s well known an rreducble nonnegatve matrx possesses a unquely determned Perron vector. As the man result of

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Affine and Riemannian Connections

Affine and Riemannian Connections Affne and Remannan Connectons Semnar Remannan Geometry Summer Term 2015 Prof Dr Anna Wenhard and Dr Gye-Seon Lee Jakob Ullmann Notaton: X(M) space of smooth vector felds on M D(M) space of smooth functons

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Support Vector Machines. Vibhav Gogate The University of Texas at dallas

Support Vector Machines. Vibhav Gogate The University of Texas at dallas Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0 MODULE 2 Topcs: Lnear ndependence, bass and dmenson We have seen that f n a set of vectors one vector s a lnear combnaton of the remanng vectors n the set then the span of the set s unchanged f that vector

More information

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Convexity preserving interpolation by splines of arbitrary degree

Convexity preserving interpolation by splines of arbitrary degree Computer Scence Journal of Moldova, vol.18, no.1(52), 2010 Convexty preservng nterpolaton by splnes of arbtrary degree Igor Verlan Abstract In the present paper an algorthm of C 2 nterpolaton of dscrete

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

DIFFERENTIAL FORMS BRIAN OSSERMAN

DIFFERENTIAL FORMS BRIAN OSSERMAN DIFFERENTIAL FORMS BRIAN OSSERMAN Dfferentals are an mportant topc n algebrac geometry, allowng the use of some classcal geometrc arguments n the context of varetes over any feld. We wll use them to defne

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem. prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Appendix B. Criterion of Riemann-Stieltjes Integrability

Appendix B. Criterion of Riemann-Stieltjes Integrability Appendx B. Crteron of Remann-Steltes Integrablty Ths note s complementary to [R, Ch. 6] and [T, Sec. 3.5]. The man result of ths note s Theorem B.3, whch provdes the necessary and suffcent condtons for

More information

Deriving the X-Z Identity from Auxiliary Space Method

Deriving the X-Z Identity from Auxiliary Space Method Dervng the X-Z Identty from Auxlary Space Method Long Chen Department of Mathematcs, Unversty of Calforna at Irvne, Irvne, CA 92697 chenlong@math.uc.edu 1 Iteratve Methods In ths paper we dscuss teratve

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

On a direct solver for linear least squares problems

On a direct solver for linear least squares problems ISSN 2066-6594 Ann. Acad. Rom. Sc. Ser. Math. Appl. Vol. 8, No. 2/2016 On a drect solver for lnear least squares problems Constantn Popa Abstract The Null Space (NS) algorthm s a drect solver for lnear

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

The Second Eigenvalue of Planar Graphs

The Second Eigenvalue of Planar Graphs Spectral Graph Theory Lecture 20 The Second Egenvalue of Planar Graphs Danel A. Spelman November 11, 2015 Dsclamer These notes are not necessarly an accurate representaton of what happened n class. The

More information

The proximal average for saddle functions and its symmetry properties with respect to partial and saddle conjugacy

The proximal average for saddle functions and its symmetry properties with respect to partial and saddle conjugacy The proxmal average for saddle functons and ts symmetry propertes wth respect to partal and saddle conjugacy Rafal Goebel December 3, 2009 Abstract The concept of the proxmal average for convex functons

More information

12 MATH 101A: ALGEBRA I, PART C: MULTILINEAR ALGEBRA. 4. Tensor product

12 MATH 101A: ALGEBRA I, PART C: MULTILINEAR ALGEBRA. 4. Tensor product 12 MATH 101A: ALGEBRA I, PART C: MULTILINEAR ALGEBRA Here s an outlne of what I dd: (1) categorcal defnton (2) constructon (3) lst of basc propertes (4) dstrbutve property (5) rght exactness (6) localzaton

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

Solutions to exam in SF1811 Optimization, Jan 14, 2015

Solutions to exam in SF1811 Optimization, Jan 14, 2015 Solutons to exam n SF8 Optmzaton, Jan 4, 25 3 3 O------O -4 \ / \ / The network: \/ where all lnks go from left to rght. /\ / \ / \ 6 O------O -5 2 4.(a) Let x = ( x 3, x 4, x 23, x 24 ) T, where the varable

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 493 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces you have studed thus far n the text are real vector spaces because the scalars

More information

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis A Appendx for Causal Interacton n Factoral Experments: Applcaton to Conjont Analyss Mathematcal Appendx: Proofs of Theorems A. Lemmas Below, we descrbe all the lemmas, whch are used to prove the man theorems

More information

SELECTED SOLUTIONS, SECTION (Weak duality) Prove that the primal and dual values p and d defined by equations (4.3.2) and (4.3.3) satisfy p d.

SELECTED SOLUTIONS, SECTION (Weak duality) Prove that the primal and dual values p and d defined by equations (4.3.2) and (4.3.3) satisfy p d. SELECTED SOLUTIONS, SECTION 4.3 1. Weak dualty Prove that the prmal and dual values p and d defned by equatons 4.3. and 4.3.3 satsfy p d. We consder an optmzaton problem of the form The Lagrangan for ths

More information

CS : Algorithms and Uncertainty Lecture 14 Date: October 17, 2016

CS : Algorithms and Uncertainty Lecture 14 Date: October 17, 2016 CS 294-128: Algorthms and Uncertanty Lecture 14 Date: October 17, 2016 Instructor: Nkhl Bansal Scrbe: Antares Chen 1 Introducton In ths lecture, we revew results regardng follow the regularzed leader (FTRL.

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Physics 5153 Classical Mechanics. Principle of Virtual Work-1

Physics 5153 Classical Mechanics. Principle of Virtual Work-1 P. Guterrez 1 Introducton Physcs 5153 Classcal Mechancs Prncple of Vrtual Work The frst varatonal prncple we encounter n mechancs s the prncple of vrtual work. It establshes the equlbrum condton of a mechancal

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

MA 323 Geometric Modelling Course Notes: Day 13 Bezier Curves & Bernstein Polynomials

MA 323 Geometric Modelling Course Notes: Day 13 Bezier Curves & Bernstein Polynomials MA 323 Geometrc Modellng Course Notes: Day 13 Bezer Curves & Bernsten Polynomals Davd L. Fnn Over the past few days, we have looked at de Casteljau s algorthm for generatng a polynomal curve, and we have

More information

Section 8.3 Polar Form of Complex Numbers

Section 8.3 Polar Form of Complex Numbers 80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the

More information

Online Classification: Perceptron and Winnow

Online Classification: Perceptron and Winnow E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College Unverst at Alban PAD 705 Handout: Maxmum Lkelhood Estmaton Orgnal b Davd A. Wse John F. Kenned School of Government, Harvard Unverst Modfcatons b R. Karl Rethemeer Up to ths pont n

More information

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION

More information

Ballot Paths Avoiding Depth Zero Patterns

Ballot Paths Avoiding Depth Zero Patterns Ballot Paths Avodng Depth Zero Patterns Henrch Nederhausen and Shaun Sullvan Florda Atlantc Unversty, Boca Raton, Florda nederha@fauedu, ssull21@fauedu 1 Introducton In a paper by Sapounaks, Tasoulas,

More information

( ) 2 ( ) ( ) Problem Set 4 Suggested Solutions. Problem 1

( ) 2 ( ) ( ) Problem Set 4 Suggested Solutions. Problem 1 Problem Set 4 Suggested Solutons Problem (A) The market demand functon s the soluton to the followng utlty-maxmzaton roblem (UMP): The Lagrangean: ( x, x, x ) = + max U x, x, x x x x st.. x + x + x y x,

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

Modelli Clamfim Equazione del Calore Lezione ottobre 2014

Modelli Clamfim Equazione del Calore Lezione ottobre 2014 CLAMFIM Bologna Modell 1 @ Clamfm Equazone del Calore Lezone 17 15 ottobre 2014 professor Danele Rtell danele.rtell@unbo.t 1/24? Convoluton The convoluton of two functons g(t) and f(t) s the functon (g

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

20. Mon, Oct. 13 What we have done so far corresponds roughly to Chapters 2 & 3 of Lee. Now we turn to Chapter 4. The first idea is connectedness.

20. Mon, Oct. 13 What we have done so far corresponds roughly to Chapters 2 & 3 of Lee. Now we turn to Chapter 4. The first idea is connectedness. 20. Mon, Oct. 13 What we have done so far corresponds roughly to Chapters 2 & 3 of Lee. Now we turn to Chapter 4. The frst dea s connectedness. Essentally, we want to say that a space cannot be decomposed

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information