Efficient Bregman Projections onto the Simplex


2015 IEEE 54th Annual Conference on Decision and Control (CDC), December 15-18, 2015. Osaka, Japan

Walid Krichene    Syrine Krichene    Alexandre Bayen

Walid Krichene is with the department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA. walid@eecs.berkeley.edu
Syrine Krichene is with the ENSIMAG school of Computer Science and Applied Mathematics of Grenoble, France. syrine.krichene@ensimag.grenoble-inp.fr
Alexandre Bayen is with the department of Electrical Engineering and Computer Sciences, and the department of Civil and Environmental Engineering, University of California, Berkeley, USA. bayen@berkeley.edu

Abstract: We consider the problem of projecting a vector onto the simplex Δ = {x ∈ R^d_+ : Σ_{i=1}^d x_i = 1}, using a Bregman projection. This is a common problem in first-order methods for convex optimization and online-learning algorithms, such as mirror descent. We derive the KKT conditions of the projection problem, and show that for Bregman divergences induced by ω-potentials, one can efficiently compute the solution using a bisection method. More precisely, an ε-approximate projection can be obtained in O(d log(1/ε)). We also consider a class of exponential potentials for which the exact solution can be computed efficiently, and give an O(d log d) deterministic algorithm and an O(d) randomized algorithm to compute the projection. In particular, we show that one can generalize the KL divergence to a Bregman divergence which is bounded on the simplex (unlike the KL divergence), strongly convex with respect to the ℓ1 norm, and for which one can still solve the projection in expected linear time.

I. INTRODUCTION

Many first-order methods for convex optimization and online learning can be formulated as iterative projections of a vector onto a feasible set. Consider for example the constrained convex problem

minimize_{x ∈ X} f(x),

where X is a convex set and f : X → R is convex. This problem can be solved using the mirror descent algorithm, a first-order method proposed by Nemirovski and Yudin in [21] (see also [4]), which generalizes the projected gradient descent method by replacing the Euclidean projection step with a generalized Bregman projection. This method is summarized in Algorithm 1.

Algorithm 1 Mirror descent method with learning rates (η_τ) and Bregman divergence D_ψ.
1: for τ ∈ N do
2:   Query a sub-gradient vector g^(τ) ∈ ∂f(x^(τ))
3:   Update
       x^(τ+1) = argmin_{x ∈ X} D_ψ(x, (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ g^(τ)))    (1)
4: end for

Here, D_ψ is the Bregman divergence induced by a distance generating function ψ. The definition and properties of Bregman divergences will be reviewed in Section II. Some important instances of the mirror descent method include projected gradient descent, obtained by taking the Bregman divergence to be the squared Euclidean distance, and exponentiated gradient descent [18] (also called the Hedge algorithm or multiplicative weights algorithm [1]), obtained by taking the Bregman divergence to be the KL divergence.

In this article, we focus specifically on simplex-constrained convex problems. That is, we suppose that X is the simplex Δ_d = {x ∈ R^d_+ : Σ_{i=1}^d x_i = 1}, or more generally, a product of scaled simplexes, X = α_1 Δ_{d_1} × ... × α_I Δ_{d_I}. Simplex-constrained problems include nonparametric statistical estimation, see for example Section 7.2 in [8], multi-commodity flow problems, see [10], tomography image reconstruction [5], and learning dynamics in repeated games [20]. Other variants of the mirror descent method have been studied as well, such as stochastic mirror descent [17], [19].
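As an illustration, one iteration of the update (1) can be sketched in Python as follows; grad_psi, grad_psi_inv and project are placeholders supplied by the caller (project performs the Bregman projection onto X), so this is a generic template rather than the paper's implementation.

def mirror_descent_step(x, grad, eta, grad_psi, grad_psi_inv, project):
    # Map the iterate to the dual space, take a gradient step, map back,
    # then Bregman-project onto the feasible set, as in equation (1).
    y = grad_psi_inv(grad_psi(x) - eta * grad)
    return project(y)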
Besides its applications to convex optimization, simplex-constrained mirror descent plays an important role in online learning problems [9], in which a decision maker chooses, at each iteration τ, a distribution x^(τ) over a finite action set A with |A| = d. Then, a bounded loss vector l^(τ) ∈ [0, 1]^d is revealed, and the decision maker incurs the expected loss ⟨l^(τ), x^(τ)⟩ = Σ_{i=1}^d x_i^(τ) l_i^(τ). This sequential decision problem is also called prediction with expert advice [11], and has a long history which dates back to Hannan [15] and Blackwell [6], who studied this problem in the context of repeated games. In (adversarial) online learning problems, one seeks to design an algorithm which has a guarantee on the worst-case regret, defined as follows: if the algorithm is presented with a sequence of losses (l^(τ))_{τ ≤ T}, and it generates a sequence of decisions (x^(τ))_{τ ≤ T}, then the cumulative regret of the algorithm up to iteration T is

R((l^(τ))_{0 ≤ τ ≤ T}) = Σ_{τ=1}^T ⟨l^(τ), x^(τ)⟩ − min_{x ∈ Δ_d} Σ_{τ=1}^T ⟨l^(τ), x⟩,

and the worst-case regret is the maximum such regret over admissible sequences of losses, max_{(l^(τ))_{0 ≤ τ ≤ T}} R((l^(τ))_{0 ≤ τ ≤ T}). An algorithm is said to have sublinear regret if its worst-case regret grows sub-linearly in T, that is,

lim sup_{T → ∞} (1/T) max_{(l^(τ))_{0 ≤ τ ≤ T}} R((l^(τ))_{0 ≤ τ ≤ T}) ≤ 0.

The online mirror descent method, obtained simply by replacing the subgradient vector g^(τ) in Algorithm 1 with the loss vector l^(τ), defines a large class of online learning algorithms with sub-linear regret, see for example the survey of Bubeck and Cesa-Bianchi [9]. The online mirror descent method is summarized in Algorithm 2.
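As a side note, the cumulative regret defined above is straightforward to evaluate for a recorded run; a small sketch with hypothetical arrays of shape (T, d):

import numpy as np

def cumulative_regret(losses, decisions):
    # losses[t] and decisions[t] are the loss vector and the distribution
    # played at iteration t; regret compares the incurred expected loss
    # to the best fixed action (equivalently, distribution) in hindsight.
    losses = np.asarray(losses)
    decisions = np.asarray(decisions)
    incurred = np.sum(losses * decisions)
    best_fixed = losses.sum(axis=0).min()
    return incurred - best_fixed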

Algorithm 2 Online mirror descent method with learning rates (η_τ) and Bregman divergence D_ψ.
1: for τ ∈ N do
2:   Play action a^(τ) ∼ x^(τ)
3:   Discover loss vector l^(τ) ∈ [0, 1]^d
4:   Incur expected loss ⟨l^(τ), x^(τ)⟩
5:   Update
       x^(τ+1) = argmin_{x ∈ X} D_ψ(x, (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ l^(τ)))    (2)
6: end for

Online mirror descent, and its stochastic variant, have been applied to several problems including multi-armed bandits [9], [2], machine learning [12] and repeated games [20], to cite a few. In all the variants of simplex-constrained mirror descent, one needs to solve, at each iteration τ, the Bregman projection step given in equation (1) or (2). Some instances of Bregman projections are known to have an exact solution which can be computed efficiently. For example, the solution of the KL divergence projection on the simplex is given by the exponential weights update [1], [3], and the Euclidean projection on the simplex can be computed efficiently either by sorting and thresholding in O(d log d), or by using a randomized pivot method in O(d), see [13].

In this article, we start by deriving the KKT conditions of the Bregman projection problem in Section II, then consider, in Section III, a general class of Bregman divergences, induced by ω-potentials, as defined by Audibert et al. [2]. We show that for this class, the solution can be approximated efficiently: an ε-approximate solution can be computed in O(d log(1/ε)) operations. In Section IV, we consider a class of exponential potentials, and study the resulting Bregman projection, a generalization of the KL-divergence projection. We show that for this class, the exact solution can be computed using a deterministic algorithm with O(d log d) complexity, or a randomized algorithm with expected linear complexity. We also study the properties of the resulting Bregman divergence. In particular, we emphasize a tradeoff between strong convexity and boundedness, two properties which affect the convergence rates of the mirror descent method.

II. BREGMAN PROJECTION AND OPTIMALITY CONDITIONS

Let ψ be a convex function defined on a convex set 𝒳, and let X be the subset of 𝒳 on which ψ is differentiable. Let ∇ψ : X → R^d be the gradient of ψ, and R its range. The Bregman divergence induced by ψ is defined as follows:

D_ψ : 𝒳 × X → R_+,  (x, y) ↦ D_ψ(x, y) = ψ(x) − ψ(y) − ⟨∇ψ(y), x − y⟩.    (3)

By convexity of ψ, the Bregman divergence is non-negative, and x ↦ D_ψ(x, y) is convex. We will refer to ψ as the distance-generating function. We say that ψ is l_ψ-strongly convex with respect to a reference norm ‖·‖ if D_ψ(x, y) ≥ (l_ψ/2) ‖x − y‖² for all (x, y) ∈ 𝒳 × X.

In order for the Bregman projection (1) to be well-defined, the gradient vector (or loss vector) at iteration τ must satisfy the following consistency condition:

∇ψ(x^(τ)) − η_τ g^(τ) ∈ R.    (4)

A. Interpretations of the Bregman projection

The Bregman projection, given in equation (1), can be interpreted as projecting onto X the vector (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ g^(τ)), obtained by mapping the current iterate x^(τ) to the set R through ∇ψ, taking a step in the opposite direction of the gradient, then mapping the new vector back through (∇ψ)⁻¹, see Nemirovski and Yudin [21].
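For the entropy distance-generating function ψ(x) = Σ_i x_i ln x_i, these two steps compose into the exponential weights update mentioned above, since the KL projection of the mapped-back point onto the simplex reduces to a normalization. A minimal sketch, assuming the iterate x lies in the relative interior of the simplex:

import numpy as np

def exponential_weights_update(x, loss, eta):
    # Dual step 1 + log(x) - eta * loss, inverse map exp(. - 1),
    # then KL projection onto the simplex, which amounts to normalizing.
    w = x * np.exp(-eta * loss)
    return w / w.sum()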
A second interpretation can be obtained, as observed by Beck and Teboulle [3], by rewriting the objective function as follows: denoting the vector (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ g^(τ)) by x̃^(τ), we have by definition of D_ψ

x^(τ+1) = argmin_{x ∈ X} D_ψ(x, x̃^(τ))
        = argmin_{x ∈ X} ψ(x) − ψ(x̃^(τ)) − ⟨∇ψ(x̃^(τ)), x − x̃^(τ)⟩
        = argmin_{x ∈ X} ψ(x) − ⟨∇ψ(x^(τ)) − η_τ g^(τ), x⟩,

which is equivalent to minimizing

x^(τ+1) = argmin_{x ∈ X} η_τ (f(x^(τ)) + ⟨g^(τ), x − x^(τ)⟩) + D_ψ(x, x^(τ)),

which can be interpreted as follows: the first term f(x^(τ)) + ⟨g^(τ), x − x^(τ)⟩ is the linear approximation of f around the current iterate x^(τ), and the second term D_ψ(x, x^(τ)) is a non-negative function which penalizes deviations from x^(τ). The step size (or learning rate) η_τ controls the relative weight of both terms.

B. Simplex-constrained Bregman projection

In the remainder of the paper, we will assume, to simplify the discussion, that the feasible set is the simplex Δ_d = {x ∈ R^d_+ : Σ_{i=1}^d x_i = 1}. We observe that all the results can be readily extended to the case in which X is a product of scaled simplexes, as follows: suppose X = α_1 Δ_{d_1} × ... × α_K Δ_{d_K}, with α_k > 0, and let ψ_k be a distance generating function on Δ_{d_k}. Then consider the function

ψ : α_1 Δ_{d_1} × ... × α_K Δ_{d_K} → R,  (α_1 x_1, ..., α_K x_K) ↦ Σ_{k=1}^K α_k ψ_k(x_k).

The gradient of ψ is simply

∇ψ : α_1 Δ_{d_1} × ... × α_K Δ_{d_K} → R_1 × ... × R_K,  (α_1 x_1, ..., α_K x_K) ↦ (∇ψ_1(x_1), ..., ∇ψ_K(x_K)),

and its inverse is given by

(∇ψ)⁻¹ : R_1 × ... × R_K → α_1 Δ_{d_1} × ... × α_K Δ_{d_K},  (y_1, ..., y_K) ↦ (α_1 (∇ψ_1)⁻¹(y_1), ..., α_K (∇ψ_K)⁻¹(y_K)).

Finally, the Bregman divergence decomposes as follows:

D_ψ((α_k x_k)_k, (α_k y_k)_k) = Σ_k α_k ψ_k(x_k) − Σ_k α_k ψ_k(y_k) − Σ_k ⟨∇ψ_k(y_k), α_k (x_k − y_k)⟩ = Σ_k α_k D_ψ_k(x_k, y_k).

Therefore, the projection onto X with Bregman divergence D_ψ can be decomposed into K projections onto Δ_{d_k} with Bregman divergences D_ψ_k, as follows:

argmin_{x_k ∈ Δ_{d_k}} D_ψ(x, (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ g^(τ))) = argmin_{x_k ∈ Δ_{d_k}} α_k D_ψ_k(x_k, (∇ψ_k)⁻¹(∇ψ_k(x_k^(τ)) − η_τ g_k^(τ))),

assuming the consistency condition holds for each k.

Example 1 (Euclidean projection): Consider the function ψ(x) = ½ ‖x‖₂². Then ∇ψ(x) = x, and the Bregman divergence is simply D_ψ(x, y) = ½ ‖x − y‖₂². As a consequence, the Bregman projection step reduces to

argmin_{x ∈ Δ_d} D_ψ(x, (∇ψ)⁻¹(∇ψ(x^(τ)) − η_τ g^(τ))) = argmin_{x ∈ Δ_d} ½ ‖x − (x^(τ) − η_τ g^(τ))‖₂²,

which corresponds to a projected gradient descent update with step size η_τ.

C. Optimality conditions

We now derive the KKT conditions for the Bregman projection problem given by

minimize_{x ∈ R^d} D_ψ(x, (∇ψ)⁻¹(∇ψ(x̄) − ḡ))   subject to x ∈ Δ_d,    (5)

where x̄ ∈ Δ_d and ḡ ∈ R^d are given. Note that we combine η_τ g^(τ) into a single vector ḡ, to simplify notation. By strong convexity, the solution is unique.

Proposition 1: Consider the Bregman projection problem (5). Then x* ∈ R^d is optimal if and only if there exist λ* ∈ R^d_+ and ν* ∈ R such that

x* = (∇ψ)⁻¹(∇ψ(x̄) − ḡ + λ* + ν* 1),
Σ_{i=1}^d x*_i = 1,
∀i, x*_i ≥ 0, λ*_i x*_i = 0,

where ν* 1 is the vector whose entries are all equal to ν*.

Proof: Define the Lagrangian, for x ∈ R^d, λ ∈ R^d_+, and ν ∈ R,

L(x, λ, ν) = D_ψ(x, (∇ψ)⁻¹(∇ψ(x̄) − ḡ)) − ⟨λ, x⟩ + ν(1 − Σ_{i=1}^d x_i).    (6)

For all x, y ∈ X, the gradient of the Bregman divergence is given by ∇_x D_ψ(x, y) = ∇ψ(x) − ∇ψ(y). Thus the gradient of L is given by

∇_x L(x, λ, ν) = ∇ψ(x) − ∇ψ(x̄) + ḡ − λ − ν 1.

Writing the KKT conditions of problem (5), we have that (x*, λ*, ν*) is optimal if and only if

∇ψ(x*) − ∇ψ(x̄) + ḡ − λ* − ν* 1 = 0,   Σ_{i=1}^d x*_i = 1,   ∀i, x*_i ≥ 0, λ*_i ≥ 0, λ*_i x*_i = 0,

and the first equation can be rearranged as x* = (∇ψ)⁻¹(∇ψ(x̄) − ḡ + λ* + ν* 1), which proves the claim.

In the next section, we will derive an efficient algorithm to compute an approximate solution for the class of Bregman divergences induced by ω-potentials, by solving the KKT system given in Proposition 1.

III. EFFICIENT APPROXIMATE PROJECTION WITH ω-POTENTIALS

Definition 1: Let a ∈ (−∞, +∞] and ω ≤ 0. An increasing, C¹-diffeomorphism φ : (−∞, a) → (ω, +∞) is called an ω-potential if

lim_{u → −∞} φ(u) = ω,   lim_{u → a} φ(u) = +∞,   and ∫_0^1 φ⁻¹(u) du is finite.

[Fig. 1. Illustration of an ω-potential: φ(u) increases from ω to +∞ as u ranges over (−∞, a).]

We associate, to an ω-potential φ, the distance-generating function ψ defined as follows:

ψ : (ω, +∞)^d → R,   x ↦ Σ_{i=1}^d ∫_1^{x_i} φ⁻¹(u) du.

By definition, ψ is finite (in particular, the third condition on the potential ensures that ψ is finite on the boundary of the simplex, since ∫_0^1 φ⁻¹(u) du is finite), differentiable on (ω, +∞)^d, and its gradient is given by

∇ψ : (ω, +∞)^d → R = (−∞, a)^d,   x ↦ ∇ψ(x) = (φ⁻¹(x_i))_{i=1,...,d},

and since φ is increasing, ψ is convex. Similarly, the inverse of its gradient is

(∇ψ)⁻¹ : (−∞, a)^d → (ω, +∞)^d,   y ↦ (φ(y_i))_{i=1,...,d}.
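As a concrete illustration of this construction, the following sketch builds ∇ψ and (∇ψ)⁻¹ from a potential (assumed to act elementwise on NumPy arrays); the example φ(u) = exp(u − 1) is the 0-potential inducing the negative entropy, which recovers the KL setup.

import numpy as np

def gradient_maps(phi, phi_inv):
    # For an omega-potential, grad_psi(x)_i = phi^{-1}(x_i) and
    # (grad_psi)^{-1}(y)_i = phi(y_i), applied componentwise.
    return (lambda x: phi_inv(x)), (lambda y: phi(y))

# Example: phi(u) = exp(u - 1), a 0-potential; then grad_psi(x) = 1 + log(x),
# the gradient of the negative entropy sum_i x_i log x_i.
grad_psi, grad_psi_inv = gradient_maps(
    phi=lambda u: np.exp(u - 1.0),
    phi_inv=lambda v: 1.0 + np.log(v),
)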

Proposition 2: Consider the Bregman projection onto the simplex given in Problem (5), and assume that ψ is induced by an ω-potential φ. Then x* is a solution if and only if there exists ν* ∈ R such that

∀i, x*_i = (φ(φ⁻¹(x̄_i) − ḡ_i + ν*))₊,   Σ_{i=1}^d x*_i = 1,

where (x)₊ denotes the positive part of x, (x)₊ = max(x, 0).

Proof: Combining the expressions of ∇ψ and (∇ψ)⁻¹ with Proposition 1, we have that x* is optimal if and only if there exist ν* ∈ R and λ* ∈ R^d_+ such that

∀i, x*_i = φ(φ⁻¹(x̄_i) − ḡ_i + ν* + λ*_i),   Σ_{i=1}^d x*_i = 1,   ∀i, x*_i ≥ 0, x*_i λ*_i = 0.

Let I = {i : x*_i > 0} be the support of x*. Then by the complementary slackness condition, we have for all i ∈ I, λ*_i = 0, thus x*_i = φ(φ⁻¹(x̄_i) − ḡ_i + ν*), and for all i ∉ I,

φ(φ⁻¹(x̄_i) − ḡ_i + ν*) ≤ φ(φ⁻¹(x̄_i) − ḡ_i + ν* + λ*_i) = x*_i = 0,

since φ is increasing. Therefore x* can be simply written x*_i = (φ(φ⁻¹(x̄_i) − ḡ_i + ν*))₊, which proves the claim.

Next, we make the following observation regarding the support of the solution:

Proposition 3: Let x* be the solution to the projection problem (5), and let I be its support. Then for all i, j, if i ∈ I and φ⁻¹(x̄_j) − ḡ_j ≥ φ⁻¹(x̄_i) − ḡ_i, then j ∈ I.

Proof: Follows from Proposition 2 and the fact that φ is increasing.

As a consequence of the previous propositions, computing the projection reduces to computing the optimal dual variable ν*, and since the potential is increasing, one can iteratively approximate ν* using a bisection method, given in Algorithm 3: we start by defining a bound on the optimal ν*, ν ≤ ν* ≤ ν̄, then we iteratively halve the size of the interval by inspecting the value of a carefully defined criterion function.

Algorithm 3 Bisection method to compute the projection x* with precision ε.
1: Input: x̄, ḡ, ε.
2: Initialize ν̄ = φ⁻¹(1) − max_i (φ⁻¹(x̄_i) − ḡ_i) and ν = φ⁻¹(1/d) − max_i (φ⁻¹(x̄_i) − ḡ_i)
3: Define x(ν) = ((φ(φ⁻¹(x̄_i) − ḡ_i + ν))₊)_{i=1,...,d}
4: while ‖x(ν̄) − x(ν)‖₁ > ε do
5:   Let ν⁺ ← (ν + ν̄)/2
6:   if Σ_i x_i(ν⁺) > 1 then
7:     ν̄ ← ν⁺
8:   else
9:     ν ← ν⁺
10:  end if
11: end while
12: Return x(ν̄)

Theorem 1: Consider the Bregman projection onto the simplex given in Problem (5), and assume that ψ is induced by an ω-potential φ. Let ε > 0, and consider the bisection method given in Algorithm 3. Then the algorithm terminates after T = O(log(1/ε)) steps, and its output x(ν̄^(T)) is such that ‖x(ν̄^(T)) − x*‖₁ ≤ ε. Each step of the algorithm has complexity O(d), thus the total complexity is O(d log(1/ε)).

Proof: Define, as in Algorithm 3, the function x(ν) = ((φ(φ⁻¹(x̄_i) − ḡ_i + ν))₊)_{i=1,...,d}. Since φ is, by assumption, increasing, so is ν ↦ x_i(ν), which is the key fact that allows us to use a bisection. We will denote by a superscript (t) the value of each variable at iteration t of the loop. To prove the claim, we show the following invariant for all t:

(i) 0 ≤ ν̄^(t) − ν^(t) ≤ (ν̄^(0) − ν^(0)) / 2^t,
(ii) ∀i, 0 ≤ x_i(ν^(t)) ≤ x_i(ν̄^(t)) ≤ 1,
(iii) Σ_{i=1}^d x_i(ν^(t)) ≤ 1 ≤ Σ_{i=1}^d x_i(ν̄^(t)).

We first prove the invariant for t = 0. Let i₀ = argmax_i φ⁻¹(x̄_i) − ḡ_i. By definition of ν^(0) and ν̄^(0), we have

φ⁻¹(1/d) − ν^(0) = φ⁻¹(x̄_{i₀}) − ḡ_{i₀} = φ⁻¹(1) − ν̄^(0),    (7)

and it follows that x_{i₀}(ν^(0)) = 1/d and x_{i₀}(ν̄^(0)) = 1. By (7), ν̄^(0) − ν^(0) = φ⁻¹(1) − φ⁻¹(1/d) ≥ 0 (since φ⁻¹ is increasing), which proves (i). Next, since ν ↦ x_i(ν) is increasing, we have 0 ≤ x_i(ν^(0)) ≤ x_i(ν̄^(0)) ≤ x_{i₀}(ν̄^(0)) = 1, which proves (ii). Finally, we have Σ_{i=1}^d x_i(ν^(0)) ≤ d x_{i₀}(ν^(0)) = 1 and Σ_{i=1}^d x_i(ν̄^(0)) ≥ x_{i₀}(ν̄^(0)) = 1, which proves (iii). This proves the invariant for t = 0. Now suppose it holds at iteration t, and let us prove it still holds at t + 1. By definition of the bisection (lines 5-10), we immediately have ν̄^(t+1) − ν^(t+1) = (ν̄^(t) − ν^(t))/2 ≤ (ν̄^(0) − ν^(0))/2^{t+1}, which proves (i). We also have that ν^(t) ≤ ν^(t+1) ≤ ν̄^(t+1) ≤ ν̄^(t), which proves (ii) since ν ↦ x_i(ν) is increasing. Finally, (iii) follows from the condition of the bisection (line 6). To conclude the proof, we simply observe that since the distance ν̄ − ν decreases exponentially, the algorithm will terminate after a number of steps logarithmic in 1/ε. Indeed, since φ is C¹ on (−∞, a), it is Lipschitz-continuous on [φ⁻¹(0), φ⁻¹(1)]. Let L be its Lipschitz constant; then

‖x(ν^(t)) − x(ν̄^(t))‖₁ = Σ_i |x_i(ν^(t)) − x_i(ν̄^(t))| ≤ dL |ν^(t) − ν̄^(t)| ≤ dL (ν̄^(0) − ν^(0)) / 2^t

by (i), thus the algorithm terminates after T = log₂((ν̄^(0) − ν^(0)) dL / ε) iterations, and the last iterate satisfies

‖x(ν*) − x(ν̄^(T))‖₁ ≤ ‖x(ν^(T)) − x(ν̄^(T))‖₁ ≤ ε,

by (iii) and since the x_i(·) are increasing, which concludes the proof.
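The following is a minimal Python sketch of Algorithm 3, under the assumption that phi and phi_inv are vectorized callables defined on the whole real line (as is the case for the exponential potentials of Section IV).

import numpy as np

def bisection_projection(x_bar, g_bar, phi, phi_inv, eps=1e-6):
    # base_i = phi^{-1}(x_bar_i) - g_bar_i, so that x_i(nu) = (phi(base_i + nu))_+.
    d = x_bar.size
    base = phi_inv(x_bar) - g_bar
    x_of = lambda nu: np.maximum(phi(base + nu), 0.0)
    nu_lo = phi_inv(1.0 / d) - base.max()   # x_{i0}(nu_lo) = 1/d
    nu_hi = phi_inv(1.0) - base.max()       # x_{i0}(nu_hi) = 1
    while np.abs(x_of(nu_hi) - x_of(nu_lo)).sum() > eps:
        nu_mid = 0.5 * (nu_lo + nu_hi)
        if x_of(nu_mid).sum() > 1.0:        # too much mass: shrink from above
            nu_hi = nu_mid
        else:
            nu_lo = nu_mid
    return x_of(nu_hi)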

IV. EFFICIENT EXACT PROJECTION WITH EXPONENTIAL POTENTIALS

We now consider a subclass of ω-potentials, for which we derive the exact solution.

Definition 2 (Exponential potential): Let ε ≥ 0. The function

φ_ε : (−∞, +∞) → (−ε, +∞),   u ↦ e^{u−1} − ε,

is called the exponential potential with parameter ε. It is a (−ε)-potential.

The distance generating function induced by this class of potentials is given by

ψ_ε(x) = Σ_{i=1}^d ∫_1^{x_i} φ_ε⁻¹(u) du = Σ_{i=1}^d ∫_1^{x_i} (1 + ln(u + ε)) du = Σ_{i=1}^d [(x_i + ε) ln(x_i + ε) − (1 + ε) ln(1 + ε)] = H(x + ε1) − H(1 + ε1),

where ε1 is the vector whose entries are all equal to ε, and H is the generalized negative entropy function, defined on R^d_+ by H(x) = Σ_{i=1}^d x_i ln x_i. The corresponding Bregman divergence is, for x, y ∈ Δ_d,

D_ψ_ε(x, y) = H(x + ε1) − H(y + ε1) − ⟨∇H(y + ε1), x − y⟩ = D_KL(x + ε1, y + ε1) = Σ_{i=1}^d (x_i + ε) ln((x_i + ε)/(y_i + ε)),

and will be denoted D_KL,ε(x, y). In particular, when ε = 0, D_KL,ε(x, y) is the KL divergence between the distribution vectors x and y. When ε > 0, the Bregman divergence is the KL divergence between x + ε1 and y + ε1. In particular, as we will see in Proposition 6, D_KL,ε(x, y) is bounded whenever ε > 0, while the KL divergence (ε = 0) can be unbounded.

[Fig. 2. Illustration of the distance generating function induced by exponential potentials with parameter ε, for d = 2: H(x) = x_1 ln(x_1) + (1 − x_1) ln(1 − x_1) and H_ε(x) = H(x + ε1), shown for ε = .1.]

As mentioned in the introduction, projecting on the simplex with the KL divergence plays a central role in many applications such as online learning. In particular, the projection problem can be solved exactly in O(d) operations, which makes this projection efficient. However, some variants of mirror descent, such as stochastic mirror descent, require the Bregman divergence to be bounded on the simplex in order to have guarantees on the convergence rate, see for example [14]. In the remainder of this section, we will show that projecting with the generalized KL divergence D_KL,ε enjoys many desirable properties (strong convexity with respect to the ℓ1 norm, boundedness), and that the projection can still be computed efficiently.

A. A sorting algorithm to compute the exact projection

We first apply the optimality conditions of Proposition 2 to this special class, and show that the solution is entirely determined by its support.

Proposition 4: Consider the Bregman projection onto the simplex given in Problem (5), with Bregman divergence D_KL,ε. Let x* be the solution and I = {i : x*_i > 0} its support. Then

∀i ∈ I, x*_i = −ε + (x̄_i + ε) e^{−ḡ_i} / Z_I,   where   Z_I = Σ_{i∈I} (x̄_i + ε) e^{−ḡ_i} / (1 + |I| ε).    (8)

Proof: Applying Proposition 2 with the expressions φ_ε(u) = e^{u−1} − ε and φ_ε⁻¹(u) = 1 + ln(u + ε), x* is a solution if and only if there exists ν* ∈ R such that

∀i, x*_i = (−ε + (x̄_i + ε) e^{−ḡ_i} e^{ν*})₊,   and   Σ_i x*_i = 1.

Thus, if I is the support of x*, then these optimality conditions are equivalent to

∀i ∈ I, x*_i = −ε + (x̄_i + ε) e^{−ḡ_i} e^{ν*},   and   Σ_{i∈I} (−ε + (x̄_i + ε) e^{−ḡ_i} e^{ν*}) = 1,

and the second equation can be rewritten as 1 + ε|I| = e^{ν*} Σ_{i∈I} (x̄_i + ε) e^{−ḡ_i}, which proves the claim, with Z_I = e^{−ν*}.

Proposition 4 shows that solving the Bregman projection with the generalized KL divergence reduces to finding the support of the solution. Next, we show that the support has a simple characterization. To this end, we associate to (x̄, ḡ) the vector ȳ defined as follows, ȳ_i = (x̄_i + ε) e^{−ḡ_i}, and we denote by ȳ_{σ(i)} the i-th smallest element of ȳ.
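For reference, a short sketch of how the generalized KL divergence above can be evaluated for points of the simplex (assuming ε > 0, or strictly positive entries when ε = 0):

import numpy as np

def generalized_kl(x, y, eps):
    # D_{KL,eps}(x, y) = sum_i (x_i + eps) * log((x_i + eps) / (y_i + eps)),
    # valid for x, y in the simplex: the linear terms of the Bregman
    # divergence cancel because x and y have the same total mass.
    xe, ye = x + eps, y + eps
    return np.sum(xe * np.log(xe / ye))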

Algorithm 4 Sorting method to compute the Bregman projection with D_ψ_ε.
1: Input: x̄, ḡ
2: Output: x*
3: Form the vector ȳ_i = (x̄_i + ε) e^{−ḡ_i}
4: Sort ȳ, and let ȳ_{σ(j)} be the j-th smallest element of ȳ
5: Let j* be the smallest index j for which c(j) := (1 + ε(d − j + 1)) ȳ_{σ(j)} − ε Σ_{i≥j} ȳ_{σ(i)} > 0
6: Set Z = Σ_{i≥j*} ȳ_{σ(i)} / (1 + ε(d − j* + 1))
7: Set x*_i = (−ε + ȳ_i / Z)₊

Proposition 5: The function

c(j) = (1 + ε(d − j + 1)) ȳ_{σ(j)} − ε Σ_{i≥j} ȳ_{σ(i)}

is increasing, and the support of x* is {σ(j*), ..., σ(d)}, where j* = min{j : c(j) > 0}.

Proof: First, straightforward algebra shows that

c(j + 1) − c(j) = (1 + ε(d − j)) (ȳ_{σ(j+1)} − ȳ_{σ(j)}) ≥ 0.

Thus c is increasing. To prove the second part of the claim, we know by Proposition 3 that the support is {σ(i*), ..., σ(d)} for some i*, and to show that i* = j* = min{j : c(j) > 0}, it suffices to show that c(i*) > 0 and c(j) ≤ 0 for all j < i*. First, by the expression (8) of x*, we have

x*_{σ(i*)} = −ε + ȳ_{σ(i*)} (1 + ε(d − i* + 1)) / Σ_{i≥i*} ȳ_{σ(i)} > 0,

which is equivalent to c(i*) > 0. And if j < i* (i.e. σ(j) is outside the support), then by the expression (8) again,

0 = x*_{σ(j)} ≥ −ε + ȳ_{σ(j)} (1 + ε(d − i* + 1)) / Σ_{i≥i*} ȳ_{σ(i)},

which is equivalent to

(1 + ε(d − i* + 1)) ȳ_{σ(j)} − ε Σ_{i≥i*} ȳ_{σ(i)} ≤ 0,

but c(j) is smaller than the left-hand side, since

c(j) − [(1 + ε(d − i* + 1)) ȳ_{σ(j)} − ε Σ_{i≥i*} ȳ_{σ(i)}] = ε Σ_{j≤i<i*} (ȳ_{σ(j)} − ȳ_{σ(i)}) ≤ 0,

which concludes the proof.

Theorem 2: Algorithm 4 solves the Bregman projection problem with exponential potential φ_ε in O(d log d) operations.

Proof: Correctness of the algorithm follows from the characterization of the support of x* in Proposition 5 and the expression of x* in Proposition 4. The complexity of the sort operation (step 4) is O(d log d), and finding j* (step 5) can be done in linear time since the criterion function c(·) satisfies c(j + 1) − c(j) = (1 + ε(d − j))(ȳ_{σ(j+1)} − ȳ_{σ(j)}), so each criterion evaluation costs O(1). Therefore, the overall complexity of Algorithm 4 is O(d log d).
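A compact NumPy sketch of Algorithm 4 follows (an illustrative reimplementation, not the released code); the suffix sums needed by the criterion c are computed in one pass after sorting.

import numpy as np

def sort_projection(x_bar, g_bar, eps):
    d = x_bar.size
    y = (x_bar + eps) * np.exp(-g_bar)
    y_sorted = np.sort(y)                        # increasing order
    suffix = np.cumsum(y_sorted[::-1])[::-1]     # suffix[j] = sum of y_sorted[j:]
    sizes = d - np.arange(d)                     # d - j + 1 in 1-based indexing
    c = (1.0 + eps * sizes) * y_sorted - eps * suffix
    j_star = np.argmax(c > 0)                    # smallest index with c > 0
    Z = suffix[j_star] / (1.0 + eps * sizes[j_star])
    return np.maximum(y / Z - eps, 0.0)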

B. A randomized pivot algorithm to compute the exact solution

We now propose a randomized version of Algorithm 4, which selects a random pivot at each iteration, instead of sorting the full vector. The resulting algorithm, which we call QuickProject, is an extension of the QuickSelect algorithm due to Hoare [16]. A similar idea is used in the randomized version of the ℓ2 projection on the simplex in [13].

Algorithm 5 QuickProject algorithm to compute the Bregman projection with D_ψ_ε.
1: Input: x̄, ḡ
2: Output: x*
3: Form the vector ȳ_i = (x̄_i + ε) e^{−ḡ_i}
4: Initialize J = {1, ..., d}, S = 0, C = 0, s* = d + 1
5: while J ≠ ∅ do
6:   Select a random pivot index j ∈ J
7:   Partition J: J⁺ = {i ∈ J : ȳ_i ≥ ȳ_j}, J⁻ = {i ∈ J : ȳ_i < ȳ_j}, and compute S⁺ = Σ_{i∈J⁺} ȳ_i, C⁺ = |J⁺|
8:   Let γ = (1 + ε(C + C⁺)) ȳ_j − ε(S + S⁺)
9:   if γ > 0 then
10:    J ← J⁻, s* ← j
11:    S ← S + S⁺, C ← C + C⁺
12:  else
13:    J ← J⁺ \ {j}
14:  end if
15: end while
16: Set Z = S / (1 + εC)
17: Set x*_i = (−ε + ȳ_i / Z)₊

Theorem 3: In expectation, the QuickProject algorithm terminates after O(d) operations, and outputs the solution x* of the Bregman projection problem (5) with the Bregman divergence D_KL,ε.

Proof: First, we prove that the algorithm has expected linear complexity. Let T(n) be the expected complexity of the while loop when |J| = n. The partition and compute step (line 7) takes 3n operations, then we recursively apply the loop to J⁻ or J⁺, which have sizes (m, n − m) for some m ∈ {1, ..., n}, with uniform probability. Thus we can bound T(n) as follows:

T(n) ≤ 3n + (1/n) Σ_{m=1}^n T(max(m, n − m)) ≤ 3n + (2/n) Σ_{m=⌈n/2⌉}^n T(m),

and we can show by induction that T(n) ≤ 12n, since T(0) = 0 and 3n + (2/n) Σ_{m=⌈n/2⌉}^n 12m ≤ 3n + 12 · (3n/4) = 12n.

To prove the correctness of the algorithm, we will prove that once the while loop terminates, s* = σ(j*), and S, C are respectively the sum and the cardinality of {ȳ_{σ(i)} : i ≥ j*}; then by Proposition 4, we have the correct expression of x*. We start by showing the following invariants:
(i) if ȳ_{σ(m_t)} is the largest element in J^(t), then σ(m_t + 1) = (s*)^(t);
(ii) J^(t) contains σ(j*) or σ(j* − 1);
(iii) S and C are the sum and the cardinality of {i : ȳ_i ≥ ȳ_{(s*)^(t)}};
(iv) γ^(t) = c(r_t), where r_t is the rank of the pivot j^(t) in the sorted order and c is the criterion function defined in Proposition 5.

The invariant holds for the first iteration since J^(1) = {1, ..., d}, m_1 = d, and S^(1) = C^(1) = 0. Suppose the invariant is true at iteration t of the loop. Then two cases are possible:
1) If γ^(t) ≤ 0, then J^(t+1) = (J^(t))⁺ \ {j^(t)} and m_{t+1} = m_t, and the invariant still holds.
2) If γ^(t) > 0, then J^(t+1) = (J^(t))⁻ and (s*)^(t+1) = j^(t), thus

{i : ȳ_i ≥ ȳ_{(s*)^(t+1)}} = {i : ȳ_i ≥ ȳ_{(s*)^(t)}} ∪ {i : ȳ_{(s*)^(t+1)} ≤ ȳ_i < ȳ_{(s*)^(t)}} = {i : ȳ_i ≥ ȳ_{(s*)^(t)}} ∪ (J^(t))⁺,

and by the update step (lines 10-11), the invariant still holds.

To finish the proof, suppose the while loop terminates after T iterations, i.e. J^(T+1) = ∅. We claim that (s*)^(T+1) = σ(j*). During the last update, two cases are possible:
1) If γ^(T) > 0, then ȳ_{j^(T)} is the smallest element of J^(T). In this case, since c(i) ≤ 0 for i < j*, and J^(T) contains σ(j*) or σ(j* − 1), it must be that j^(T) = σ(j*), thus (s*)^(T+1) = j^(T) = σ(j*).
2) If γ^(T) ≤ 0, then ȳ_{j^(T)} is the largest element of J^(T); in this case, since c(j*) > 0, it must be that j^(T) = σ(j* − 1), so m_T = j* − 1 and (s*)^(T+1) = (s*)^(T) = σ(m_T + 1) = σ(j*).
This concludes the proof.

C. Properties of the generalized KL divergence

Algorithms 4 and 5 give efficient methods for computing the projection with the generalized KL divergence D_KL,ε. In this section, we show that this family of Bregman divergences enjoys additional properties, given below.

Proposition 6: For all ε > 0, D_KL,ε is l_ε-strongly convex and L_ε-smooth w.r.t. ‖·‖₁, and bounded by D_ε on Δ_d, with

l_ε ≥ 1/(1 + dε),   L_ε ≤ 1/ε,   D_ε ≤ ln((1 + ε)/ε).

Proof: First, we show strong convexity. Let x, y ∈ Δ_d. By Taylor's theorem, there exists z on the segment (x + ε1, y + ε1) such that

D_KL,ε(x, y) = H(x + ε1) − H(y + ε1) − ⟨∇H(y + ε1), x − y⟩ = ½ ⟨x − y, ∇²H(z)(x − y)⟩ = ½ Σ_i (x_i − y_i)² / z_i,

where we used the fact that the Hessian of the negative entropy function is ∇²H(z) = diag(1/z_i). And since z_i ≥ ε for all i (z belongs to the segment (x + ε1, y + ε1)), it follows that

D_KL,ε(x, y) ≤ ½ Σ_i (x_i − y_i)² / ε ≤ (1/(2ε)) ‖x − y‖₁².

Furthermore, by the Cauchy-Schwarz inequality, (Σ_i |x_i − y_i|)² ≤ (Σ_i z_i)(Σ_i (x_i − y_i)²/z_i), thus

D_KL,ε(x, y) ≥ ½ ‖x − y‖₁² / Σ_i z_i ≥ (1/(2(1 + dε))) ‖x − y‖₁²,

since Σ_i z_i ≤ 1 + dε. To compute the upper bound on D_KL,ε, we observe that D_KL,ε(x, y) is jointly convex in (x, y) (by joint convexity of the KL divergence); therefore, its maximum on Δ_d × Δ_d is attained on a vertex of the feasible set, that is, for (x, y) = (δ_{i₀}, δ_{j₀}), for some (i₀, j₀), where δ_{i₀} is the Dirac distribution on i₀. Finally, a simple calculation shows that

D_KL,ε(δ_{i₀}, δ_{j₀}) = 0 if i₀ = j₀, and ln((1 + ε)/ε) otherwise.

[Fig. 3. Illustration of Proposition 6, when d = 2. The distributions x and y are parameterized as x = (p, 1 − p) and y = (q, 1 − q). The surface plot (left) shows the generalized KL divergence for ε = .1, with, in dashed lines, the quadratic upper and lower bounds (l_ε/2)‖x − y‖² and (L_ε/2)‖x − y‖². The second plot (right) compares D_KL,.1(x, y₀) and D_KL(x, y₀) for a fixed y₀ = (.35, .65).]
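As a quick sanity check, assuming the sketches sort_projection and bisection_projection given earlier are in scope, one can verify feasibility of the exact solution and its agreement with the bisection output for the exponential potential:

import numpy as np

rng = np.random.default_rng(0)
d, eps_pot = 1000, 0.1
x_bar = rng.random(d)
x_bar /= x_bar.sum()                                  # a point in the simplex
g_bar = rng.normal(size=d)                            # plays the role of eta * g

x_exact = sort_projection(x_bar, g_bar, eps_pot)
print(x_exact.sum(), x_exact.min())                   # expect roughly 1.0 and a value >= 0

phi = lambda u: np.exp(u - 1.0) - eps_pot             # exponential potential
phi_inv = lambda v: 1.0 + np.log(v + eps_pot)
x_approx = bisection_projection(x_bar, g_bar, phi, phi_inv, eps=1e-8)
print(np.abs(x_exact - x_approx).sum())               # expected to be within the tolerance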

D. Numerical experiments

We provide a simple Python implementation of the projection algorithms at github.com/walidk/BregmanProjection. The implementation of Algorithm 3 is generic and can be instantiated for any ω-potential by providing the function φ and its inverse. The implementations of Algorithm 4 and QuickProject are specific to the generalized exponential potential. Finally, we report in Figure 4 the run times of both algorithms as the dimension d grows, averaged over 50 runs, for randomly generated, normally distributed vectors x̄ and ḡ. The numerical simulations are also available on the same repository.

[Fig. 4. Execution time (average run time in seconds of SortProjection and QuickProjection) as a function of the dimension d, with ε = .1, in log-log scale (left). The highlighted region is zoomed in, in linear scale, on the right. The simulation confirms that the QuickProject algorithm is, on average, faster than the sorting algorithm, especially for large d.]

V. CONCLUSION

We studied the Bregman projection problem on the simplex with ω-potentials, and derived optimality conditions for the solution, which motivated a simple bisection algorithm to compute ε-approximate solutions in O(d log(1/ε)) time. Then we focused on the projection problem with exponential potentials, resulting in a Bregman divergence which generalizes the KL divergence. We showed that in this case, the solution can be computed exactly in O(d log d) time using a sorting algorithm, or in expected O(d) time using a randomized pivot algorithm. This class of divergences is of particular interest because it has quadratic upper and lower bounds (i.e. its distance generating function is both strongly convex and smooth), a property which is essential to obtain convergence guarantees in some settings, such as stochastic mirror descent. A question which remains open is whether one can project in O(d) time using a deterministic algorithm, akin to the median-of-medians algorithm due to Blum et al. [7] which solves the selection problem in deterministic linear time. The fact that one can efficiently compute the exact solution hinges on the existence of a closed-form expression for the dual variable ν* given the support of the solution (Proposition 4). This is also the case for the Euclidean projection, i.e. when D_ψ is induced by the squared Euclidean norm, see [13]. This suggests that one may derive efficient projection algorithms for other classes of Bregman divergences, which would, in turn, lead to new efficient instances of the mirror descent method.

REFERENCES

[1] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121-164, 2012.
[2] Jean-Yves Audibert, Sébastien Bubeck, and Gábor Lugosi. Regret in online combinatorial optimization. Mathematics of Operations Research, 39(1):31-45, 2014.
[3] Amir Beck and Marc Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett., 31(3):167-175, May 2003.
[4] A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Optimization. Society for Industrial and Applied Mathematics, 2001.
[5] Aharon Ben-Tal, Tamar Margalit, and Arkadi Nemirovski. The ordered subsets mirror descent optimization method with applications to tomography. SIAM J. on Optimization, 12(1):79-108, January 2001.
[6] David Blackwell. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6(1):1-8, 1956.
[7] Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan. Time bounds for selection. J. Comput. Syst. Sci., 7(4):448-461, August 1973.
[8] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[9] Sébastien Bubeck and Nicolò Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1):1-122, 2012.
[10] Yair Censor and Stavros Zenios. Parallel Optimization: Theory, Algorithms and Applications. Oxford University Press, 1997.
[11] Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
[12] Ofer Dekel, Ran Gilad-Bachrach, Ohad Shamir, and Lin Xiao. Optimal distributed online prediction. In Proceedings of the 28th International Conference on Machine Learning (ICML), June 2011.
[13] John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. Efficient projections onto the l1-ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, ICML '08, pages 272-279, New York, NY, USA, 2008. ACM.
[14] John C. Duchi, Alekh Agarwal, Mikael Johansson, and Michael Jordan. Ergodic mirror descent. SIAM Journal on Optimization (SIOPT), 22(4):1549-1578, 2012.
[15] James Hannan. Approximation to Bayes risk in repeated plays. Contributions to the Theory of Games, 3:97-139, 1957.
[16] C. A. R. Hoare. Algorithm 65: Find. Commun. ACM, 4(7):321-322, July 1961.
[17] Anatoli Juditsky, Arkadi Nemirovski, and Claire Tauvel. Solving variational inequalities with stochastic mirror-prox algorithm. Stoch. Syst., 1(1):17-58, 2011.
[18] Jyrki Kivinen and Manfred K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1-63, 1997.
[19] Syrine Krichene, Walid Krichene, Roy Dong, and Alexandre Bayen. Convergence of heterogeneous distributed learning in the stochastic routing game. In Proceedings of the 53rd Annual Allerton Conference on Communication, Control, and Computing, 2015.
[20] Walid Krichene, Syrine Krichene, and Alexandre Bayen. Convergence of mirror descent dynamics in the routing game. In European Control Conference (ECC), 2015.
[21] A. S. Nemirovsky and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics. Wiley, 1983.


More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Perfect Competition and the Nash Bargaining Solution

Perfect Competition and the Nash Bargaining Solution Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z ) C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z

More information

SL n (F ) Equals its Own Derived Group

SL n (F ) Equals its Own Derived Group Internatonal Journal of Algebra, Vol. 2, 2008, no. 12, 585-594 SL n (F ) Equals ts Own Derved Group Jorge Macel BMCC-The Cty Unversty of New York, CUNY 199 Chambers street, New York, NY 10007, USA macel@cms.nyu.edu

More information

Approximate Smallest Enclosing Balls

Approximate Smallest Enclosing Balls Chapter 5 Approxmate Smallest Enclosng Balls 5. Boundng Volumes A boundng volume for a set S R d s a superset of S wth a smple shape, for example a box, a ball, or an ellpsod. Fgure 5.: Boundng boxes Q(P

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Remarks on the Properties of a Quasi-Fibonacci-like Polynomial Sequence

Remarks on the Properties of a Quasi-Fibonacci-like Polynomial Sequence Remarks on the Propertes of a Quas-Fbonacc-lke Polynomal Sequence Brce Merwne LIU Brooklyn Ilan Wenschelbaum Wesleyan Unversty Abstract Consder the Quas-Fbonacc-lke Polynomal Sequence gven by F 0 = 1,

More information

Research Article. Almost Sure Convergence of Random Projected Proximal and Subgradient Algorithms for Distributed Nonsmooth Convex Optimization

Research Article. Almost Sure Convergence of Random Projected Proximal and Subgradient Algorithms for Distributed Nonsmooth Convex Optimization To appear n Optmzaton Vol. 00, No. 00, Month 20XX, 1 27 Research Artcle Almost Sure Convergence of Random Projected Proxmal and Subgradent Algorthms for Dstrbuted Nonsmooth Convex Optmzaton Hdea Idua a

More information

Random Projection Algorithms for Convex Set Intersection Problems

Random Projection Algorithms for Convex Set Intersection Problems Random Projecton Algorthms for Convex Set Intersecton Problems A. Nedć Department of Industral and Enterprse Systems Engneerng Unversty of Illnos, Urbana, IL 61801 angela@llnos.edu Abstract The focus of

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information