arXiv: v3 [math.NA] 1 Jul 2017


Accelerated Alternating Direction Method of Multipliers: an Optimal O(1/K) Nonergodic Analysis

Huan Li, Zhouchen Lin

July 2017

Abstract

The Alternating Direction Method of Multipliers (ADMM) is widely used for linearly constrained convex problems. It is proven to have an $o(1/\sqrt{K})$ nonergodic convergence rate and a faster $O(1/K)$ ergodic rate after ergodic averaging, where $K$ is the number of iterations. The averaging, however, may destroy the sparsity and low-rankness in sparse and low-rank learning. In this paper, we modify the accelerated ADMM proposed in [Y. Ouyang, Y. Chen, G. Lan, and E. Pasiliao, An Accelerated Linearized Alternating Direction Method of Multipliers, SIAM J. on Imaging Sciences, 2015] and give an $O(1/K)$ nonergodic convergence rate analysis, which satisfies $|F(x^K)-F(x^*)|\le O(1/K)$ and $\|Ax^K-b\|\le O(1/K)$, and in which $x^K$ has a more favorable sparseness and low-rankness than the ergodic result. As far as we know, this is the first $O(1/K)$ nonergodic convergent ADMM type method for general linearly constrained convex problems. Moreover, we show that the lower complexity bound of ADMM type methods for separable linearly constrained nonsmooth convex problems is $O(1/K)$, which means that our method is optimal.

1 Introduction

We consider the following general linearly constrained convex problem:

$$\min_{x_i\in\mathbb{R}^{n_i}} \sum_{i=1}^{2}\big(f_i(x_i)+h_i(x_i)\big),\quad \text{s.t.}\ \sum_{i=1}^{2}A_i x_i=b, \qquad (1)$$

where both $f_i$ and $h_i$ are convex, $f_i$ is $L_i$-Lipschitz differentiable and $h_i$ can be nonsmooth. In particular, $f_i$ can vanish in problem (1). Problems like (1) arise from diverse applications in machine learning, imaging and computer vision; see, e.g., [1, 2, 3] and references therein. In machine learning, $f_i$ is often the loss function that fits the data and $h_i$ is the regularizer that promotes some prior information on the desired solution, such as sparseness and low-rankness.

We say $f$ is $L$-Lipschitz differentiable if it satisfies $\|\nabla f(x)-\nabla f(y)\|\le L\|x-y\|$ for all $x, y$, and $L$-continuous if $|f(x)-f(y)|\le L\|x-y\|$ for all $x, y$. We denote $F_i(x_i)=f_i(x_i)+h_i(x_i)$, $x=(x_1,x_2)$, $F(x)=\sum_{i=1}^{2}F_i(x_i)$ and $Ax=\sum_{i=1}^{2}A_i x_i$. The discussion in this paper also applies to the general constraint $\sum_{i=1}^{2}\mathcal{A}_i(x_i)=b$, where $\mathcal{A}_i:\mathbb{R}^{n_i}\to\mathbb{R}^{m}$ is a linear mapping. For simplicity we focus on $\sum_{i=1}^{2}A_i x_i=b$. We denote $\|x\|$ as $\|x\|_2$ for a vector $x$.
Huan Li: Peking University. Email: lihuanss@pku.edu.cn
Zhouchen Lin: Peking University. Email: zlin@pku.edu.cn

ADMM [1] is widely used in imaging and vision to solve problem (1) since the separable structure can be exploited. ADMM consists of three steps:

$$x_1^{k+1}=\arg\min_{x_1} L(x_1,x_2^{k},\lambda^{k}), \qquad (2a)$$
$$x_2^{k+1}=\arg\min_{x_2} L(x_1^{k+1},x_2,\lambda^{k}), \qquad (2b)$$
$$\lambda^{k+1}=\lambda^{k}+\rho\Big(\sum_{i=1}^{2}A_i x_i^{k+1}-b\Big), \qquad (2c)$$

where

$$L(x_1,x_2,\lambda)=\sum_{i=1}^{2}\big(f_i(x_i)+h_i(x_i)\big)+\Big\langle\lambda,\sum_{i=1}^{2}A_i x_i-b\Big\rangle+\frac{\rho}{2}\Big\|\sum_{i=1}^{2}A_i x_i-b\Big\|^{2}$$

is the augmented Lagrangian function and $\lambda$ is the Lagrange multiplier. When $F_i$ is not simple and $A_i$ is non-unitary, the cost of solving the subproblems may be high. Thus the Linearized ADMM (LADMM) is proposed, which linearizes the augmented term $\|Ax-b\|^{2}$ and the complex $f_i$ [4, 5, 6] so that the subproblems may even have closed form solutions.

Traditional convergence rate analysis on ADMM is difficult due to its serial update of $x_1$ and $x_2$, which means that $(x_1^{k+1},x_2^{k+1})$ is not the solution to $\min_{x_1,x_2}L(x_1,x_2,\lambda^{k})$. Thus some alternative criteria are used instead. The most popular criterion is the ergodic convergence rate.

Definition 1 Let $\{x^{1},\dots,x^{K}\}$ be a sequence produced by the algorithm such that each $x^{k}$ has the property promoted by the regularizer $h(x)$, for instance, sparseness or low-rankness. We say a convergence rate is nonergodic if it measures the optimality at $x^{K}$ directly. A convergence rate is ergodic if it considers the optimality at the point $\sum_{k=1}^{K}c_k x^{k}$ with $c_k>0$ and $\sum_{k=1}^{K}c_k=1$.

The most commonly used ergodic criterion for ADMM is the average $\frac{1}{K}\sum_{k=1}^{K}x^{k}$. It is proved in [7] that ADMM converges with an $O(1/K)$ ergodic rate. A critical disadvantage of the ergodic result is that the point at which the convergence rate is measured may not have the property promoted by $h(x)$, since this property may be destroyed by the ergodic averaging. For example, in sparse learning, $x^{1},\dots,x^{K}$ are sparse, but their average may not be sparse any more.

So a nonergodic analysis is strongly required for ADMM. He and Yuan [8] proved that $\|w^{K}-w^{K-1}\|^{2}$ is bounded at the order of $1/K$, with $w^{K}=(x_2^{K},\lambda^{K})$. However, this criterion does not directly measure how far $F(x^{K})$ is from $F(x^{*})$ or how large the constraint error $\|Ax^{K}-b\|$ is, where $x^{*}$ is an optimal solution to problem (1). Recently, Davis and Yin [9] proved that the Douglas-Rachford (DR) splitting [10] converges with an $O(1/K)$ ergodic rate and an $o(1/\sqrt{K})$ nonergodic rate.
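As a rough illustration of the three ADMM steps described above, the following sketch runs ADMM on a toy instance, $\min \tfrac12\|x_1-a\|^2+\mu\|x_2\|_1$ subject to $x_1-x_2=0$, whose solution is the soft-thresholding of $a$. The instance, the penalty `rho = 1`, and the names `admm_toy`, `soft_threshold` and `nnz` are assumptions chosen for illustration and do not come from the paper. The last lines show, on two hand-picked vectors, how averaging sparse points with different supports produces a less sparse point.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_toy(a, mu, rho=1.0, iters=200):
    """ADMM for min 0.5*||x1 - a||^2 + mu*||x2||_1  s.t.  x1 - x2 = 0."""
    n = len(a)
    x2 = [0.0] * n
    lam = [0.0] * n
    x1 = [0.0] * n
    for _ in range(iters):
        # x1-step: minimize the augmented Lagrangian over x1 (closed form).
        x1 = [(a[i] - lam[i] + rho * x2[i]) / (1.0 + rho) for i in range(n)]
        # x2-step: minimize over x2, which is a soft-thresholding.
        x2 = [soft_threshold(x1[i] + lam[i] / rho, mu / rho) for i in range(n)]
        # Multiplier step: ascent on the constraint residual x1 - x2.
        lam = [lam[i] + rho * (x1[i] - x2[i]) for i in range(n)]
    return x1, x2, lam


def nnz(v, tol=1e-12):
    """Number of entries whose magnitude exceeds tol."""
    return sum(1 for t in v if abs(t) > tol)


# Nonergodic output: the last iterate x2 is itself a soft-thresholded point.
x1, x2, lam = admm_toy([3.0, 0.5, -2.0, 0.2], mu=1.0)

# Averaging two sparse vectors with different supports loses sparsity.
u, w = [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]
avg = [(p + q) / 2 for p, q in zip(u, w)]
```

Here the nonergodic point `x2` keeps exact zeros because it is the direct output of the proximal step, which is the property the nonergodic analysis is meant to preserve.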
Moreover, they constructed some examples showing that this rate is tight. As is known, ADMM is a special case of DR splitting [11]. So for ADMM, Davis and Yin [9] established $|F(x^{K})-F(x^{*})|\le o(1/\sqrt{K})$ and $\|Ax^{K}-b\|\le o(1/\sqrt{K})$ in a nonergodic sense. Thus in sparse and low-rank learning, the nonergodic measurement point $x^{K}$ of ADMM is sparse or low-rank but has the slow $o(1/\sqrt{K})$ theoretical convergence rate, while the ergodic measurement point $\sum_{k=1}^{K}c_k x^{k}$ has the faster $O(1/K)$ theoretical convergence rate but may not be sparse or low-rank. We want to combine the advantages of these two aspects, i.e., a faster $O(1/K)$ convergence rate that still holds in the nonergodic sense.

Our technique is to use Nesterov's acceleration scheme for ADMM. Beck and Teboulle [12] extended Nesterov's accelerated gradient method [13] to the nonsmooth unconstrained problem $\min_x f(x)+h(x)$. Their method consists of two steps: first extrapolate a point $y^{k}=x^{k}+\frac{\theta_k(1-\theta_{k-1})}{\theta_{k-1}}(x^{k}-x^{k-1})$ and then compute $x^{k+1}=\mathrm{Prox}_{\alpha h}\big(y^{k}-\alpha\nabla f(y^{k})\big)$, where $\mathrm{Prox}_{\alpha h}(z)=\arg\min_x h(x)+\frac{1}{2\alpha}\|x-z\|^{2}$. On the other hand, Nesterov [14] proposed another accelerated gradient method, which consists of three steps: two steps that form the auxiliary points $z^{k}$ and $y^{k}$ as combinations of earlier iterates, followed by a proximal step that uses the gradient $\nabla f(y^{k})$. We follow [15] to name these two schemes Nesterov's first and second acceleration scheme, respectively.
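The extrapolate-then-prox structure of the first acceleration scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the weight sequence $\theta_k=2/(k+2)$ is one common choice, the test problem (a separable quadratic plus an $\ell_1$ term), the step size `alpha = 1/4`, and the names `accelerated_prox_grad`, `grad_f` and `prox_h` are all hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def accelerated_prox_grad(grad_f, prox_h, x0, alpha, iters):
    """Extrapolate y from the last two iterates, then take a proximal gradient step at y."""
    x_prev = list(x0)
    x = list(x0)
    for k in range(iters):
        theta_k = 2.0 / (k + 2)                       # assumed weight sequence, theta_0 = 1
        theta_prev = 2.0 / (k + 1) if k > 0 else 1.0  # theta_{k-1}; set to 1 at k = 0 so y = x
        beta = theta_k * (1.0 - theta_prev) / theta_prev
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad_f(y)
        x_next = prox_h([yi - alpha * gi for yi, gi in zip(y, g)], alpha)
        x_prev, x = x, x_next
    return x


MU = 1.0  # weight of the l1 regularizer h(x) = MU * ||x||_1


def grad_f(x):
    # f(x) = 0.5*(x[0] - 3)^2 + 0.5*(2*x[1] - 1)^2, gradient Lipschitz constant 4
    return [x[0] - 3.0, 2.0 * (2.0 * x[1] - 1.0)]


def prox_h(z, alpha):
    # Prox of alpha*MU*||.||_1 is entrywise soft-thresholding.
    return [soft_threshold(zi, alpha * MU) for zi in z]


x_hat = accelerated_prox_grad(grad_f, prox_h, [0.0, 0.0], alpha=0.25, iters=2000)
```

For this toy problem the minimizer can be worked out by hand as $(2, 0.25)$, so `x_hat` can be compared against it directly.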

Chen et al. [16] proposed an inertial proximal ADMM which uses the same idea as Nesterov's first scheme: first extrapolate a point $(\hat{x}_1^{k},\hat{x}_2^{k},\hat{\lambda}^{k})$ and then perform the steps (2a)-(2c) at this point. However, they only established the $o(1/\sqrt{K})$ convergence rate in the sense of $\min_{k=1,\dots,K}|F(x^{k})-F(x^{*})|\le o(1/\sqrt{K})$ and $\min_{k=1,\dots,K}\|Ax^{k}-b\|\le o(1/\sqrt{K})$. Lorenz and Pock [17] analyzed the inertial forward-backward algorithm for general monotone inclusions, which include problem (1) as a special case. However, no convergence rate is established in [17]. Ouyang et al. [18] proposed an accelerated ADMM via Nesterov's second acceleration scheme. Its convergence rate is better than that of LADMM in terms of the dependence on the Lipschitz constant of the smooth component. However, the overall convergence rate remains $O(1/K)$ in an ergodic sense. Nesterov's second scheme only influences the linearization of $f_i$ in steps (2a)-(2b). It cannot improve the nonergodic rate of ADMM. Thus, the nonergodic rate of the accelerated ADMM in [18] cannot be better than $o(1/\sqrt{K})$. Please see Section 2 for detailed explanations.

When strong convexity is assumed, Goldstein et al. [19] proposed an $O(1/K^{2})$ fast ADMM for the dual problem. When even more assumptions are made, such as the objective being strongly convex with continuous gradient, or the subdifferentials of the underlying functions being piecewise linear multifunctions, linear convergence can be obtained [20, 21, 22, 23, 24].

Some researchers studied the first-order primal-dual algorithm for the saddle-point problem, which includes problem (1) as a special case. For example, Chambolle and Pock [2] established the $O(1/K)$ ergodic convergence rate for general convex problems, the accelerated $O(1/K^{2})$ convergence rate when the primal or the dual objective is uniformly convex, and the linear convergence rate when both are uniformly convex. Chen et al. [25] combined Nesterov's second scheme with the primal-dual algorithm and also established the $O(1/K)$ ergodic convergence rate.
1.1 Contributions

Although the $O(1/K)$ convergence rate of ADMM and its accelerated versions is widely studied in the literature, all these results need an ergodic averaging [7, 16, 18, 2, 25], which may destroy the sparsity and low-rankness in sparse and low-rank learning. As far as we know, no work establishes the $O(1/K)$ nonergodic convergence rate of ADMM type methods for the general convex problem (1). Moreover, as proved in [9], the nonergodic convergence rate of the traditional ADMM is $o(1/\sqrt{K})$, and it will be shown in Section 4 that this rate is tight.

In this paper, we aim to give the first $O(1/K)$ nonergodic convergent ADMM type method. We modify the accelerated ADMM proposed in [18] and give an $O(1/K)$ nonergodic analysis satisfying $|F(x^{K})-F(x^{*})|\le O(1/K)$ and $\|Ax^{K}-b\|\le O(1/K)$. Compared with the $O(1/K)$ ergodic rate in [18] and of the traditional ADMM, our result holds in a nonergodic sense and thus enjoys the sparseness and low-rankness directly in applications of sparse and low-rank learning. Compared with the nonergodic rate in [18] and of the traditional ADMM, we improve it from $o(1/\sqrt{K})$ to $O(1/K)$.

We also show that the lower complexity bound of ADMM type methods for separable linearly constrained convex problems is $O(1/K)$ when each $F_i$ is nonsmooth and not strongly convex, which means that the convergence rate of ADMM type methods cannot be better than $O(1/K)$ no matter how they are accelerated. Thus our method is optimal.

4 Revew of the Accelerated ADMM n [8] In ths secton, we frst revew the accelerated ADMM n [8] for problem, whch conssts of the followng steps: y = x z, =,, 3a z = argmn f z f y, z z L z z z h z λ, A z A T A z A z b, z z A z z, 3b z = argmn f z f y, z z L z z z h z λ, A z A T A z A z b, z z A z z, 3c x = x z, 3d x = x z, 3e λ = λ A z A z b, 3f where satsfes θ. Snce the regularzer hx acts drectly on z n 3b-3c and thus z has the property promoted by hx, the convergence measured at z K, z K s n the nonergodc sense. Accordngly, x K s a convex combnaton of z,, z K : x K K = K = z and so t = s an ergodc result measured at x K, x K. It s proved n [8] that 3a-3f has the O/K ergodc convergence rate measured at x K, x K. We can see that the accelerated ADMM n [8] s a drect combnaton of Nesterov s second acceleraton scheme and the tradtonal LADMM. Nesterov s acceleraton scheme only nfluences the lnearzaton of f and cannot mprove the convergence rate of the tradtonal ADMM. In fact, we can consder the specal case of f = 0 correspondngly, L = 0 and omt the lnearzaton of the augmented term. In ths case, procedure 3a-3f reduces to: z = argmn z z = argmn z h z λ, A z A z A z b, 4a h z λ, A z A z A z b, 4b x = x z, 4c x = x z, 4d λ = λ A z A z b. 4e We can see that procedure 4a-4e reduces to the tradtonal ADMM and 4c-4d has no nfluence on the teratons of the tradtonal ADMM. Thus the nonergodc convergence rate of procedure 4a- 4e measured at z K, z K remans o/ K. Snce 4a-4e s a specal case of 3a-3f, we can have that the nonergodc rate of 3a-3f measured at z K, z K should not be better than o/ K. 3 ALADMM-NE wth O/K Nonergodc Convergence Rate In ths secton, we gve our Accelerated LADMM wth NonErgodc convergence rate ALADMM- NE. We frst provde an equvalent descrpton of 3a-3f for the smooth case of problem We smplfy some parameter settngs, but the algorthm framewor remans the same wth [8]. 4

5 n Secton 3., whch motvates our nonergodc algorthm for the nonsmooth case n Secton 3.. Then we gve the convergence rate analyss n Secton 3.3 and at last, we dscuss the advantage and dsadvantage of the accelerated ADMM n Secton An Equvalent Algorthm for the Smooth Problem In ths secton, we gve an equvalent descrpton of 3a-3f for the smooth case of problem wth h x = 0, =, : y = x θ x x, =,, 5a x = argmn f y f y L, x y x x y ˆλ, A x A T θ A y A y b, x y A x y θ, 5b x = argmn f y f y L, x y x x y ˆλ, A x A T θ A x A y b, x y A x y θ, 5c ˆλ = ˆλ τax b. 5d for some > τ > 0.5, θ 0 = and =, whch leads to θ and thus concdes τ wth the requrement for 3b-3f. It can be observed that f we set τ =, then =, y = x,, and 5a-5d reduces to the tradtonal LADMM. At frst glance, 3a-3f combnes ADMM wth Nesterov s second acceleraton scheme whle 5a-5d uses Nesterov s frst acceleraton scheme. Proposton The sequence x, x produced n 3a-3f and 5a-5d are equvalent when h x = 0, =,. Proof We derve each step of 5a-5d from 3a-3f. From 3a, 3d and 3e, we have y = x z = x θ x x = x θ x x, 5

6 whch s 5a. From the optmalty condton of 3b, we have 0 = f y L z z A T λ A T A z A z b A z z 3a,3d = f y L x y A T λ A T A z A z b A x y 3a = f y L x y A T λ A T A y A x y A x A y θ A x b = f y L x y A T λ A T A x A x b A T A y A y b A x y θ = f y L x y A T ˆλ A T A x y, A y A y b where we defne ˆλ = λ A x A x b. It s exactly the optmalty condton of 5b. Smlarly, from the optmalty condton of 3c, we also have 0 = f y L z z A T λ A T A z A z b A z z = f y L x y A T λ A T A z A z b 3d,3a A x y = f y L x y A T λ A T A x A x A y A x y = f y L x y A T ˆλ A T A x y, A x b A x A y b 6

7 whch s the optmalty condton of 5c. From the defnton of ˆλ, we have ˆλ ˆλ = λ λ Ax b Ax b 3f = Az b Ax b Ax b 3d,3e Ax Ax = b Ax b Ax b Ax b Ax b = Ax b Ax b θ = θ Ax b = τax b, where we defne τ = and t s the same wth 5d. 3. The Nonergodc Algorthm for the Nonsmooth Problem From the dscusson n Secton, we now that the accelerated ADMM proposed n [8] has the o/ K nonergodc convergence rate measured at z K, z and the O/K ergodc convergence rate measured at x, x. We want to have an algorthm wth the faster O/K nonergodc convergence rate. After establshng the equvalence between 3a-3f and 5a-5d, an easy ntuton s to add the nonsmooth term h x n steps 5b and 5c drectly: x = argmn f y f y L, x y x x y h x ˆλ, A x A T θ A y A y b, x y A x y θ, 0a x = argmn f y f y L, x y x x y h x ˆλ, A x A T θ A x A y b, x y A x y θ. 0b We descrbe the new method n Algorthm. Due to the dfferent postons of the term h x, Algorthm and 3a-3f are no longer equvalent for the nonsmooth problem. In Algorthm, h x acts on x drectly and thus t has the property promoted by hx, such as the sparseness or low-ranness f h x s a sparse or low ran regularzer. So the convergence rate measured at x K n Algorthm s n the nonergodc sense. As comparson, 3a-3f promotes the sparseness and low-ranness on z and x K s a convex combnaton of z,, z K. The zeros may le n dfferent postons of z,, z K or n dfferent postons of ther sngular values for low-ranness and thus x K may not be sparse or low-ran any more. It should be noted that for the smooth case, snce hx vanshes, we do not dstngush the ergodc and nonergodc rate between 3a-3f and 5a-5d. 7

8 Algorthm Accelerated LADMM wth NonErgodc convergence rate ALADMM-NE Intalze λ 0, x 0 = x, =,, > τ > 0.5, > 0, θ 0 =, θ = /τ. for = 0,,, do Update y, =, usng 5a, Update x and x serally, usng 0a and 0b, respectvely, Update ˆλ usng 5d, = τ. end for 3.3 The Convergence Rate Analyss In ths secton, we prove the O/K convergence rate measured at x K for Algorthm. Due to the dfferent postons of the nonsmooth term h x, the proof technque of 3a-3f n [8] cannot be extended to Algorthm and more efforts are needed for the analyss on Algorthm. Moreover, Ouyang et al. [8] need the assumpton that the prmal and dual varables are bounded n order to accomplsh the proof. As comparson, we do not need ths assumpton. Ths verfes that our proof s totally dfferent from [8]. ALADMM-NE s an extenson of Nesterov s frst acceleraton scheme from unconstraned problems to constraned ones. For unconstraned problems, a crucal property of Nesterov s frst acceleraton scheme s F x F x F x F x δ z x z x. The man step n the convergence rate proof of ALADMM-NE s to construct a counterpart of for both the objectve and the constrant n problem. Proposton plays such a role for the objectve. As comparson, the tradtonal ADMM [6] can prove a smlar result n the form of F x F x λ, Ax b δ x x x x κ λ λ λ λ, whch can only lead to the result of ergodc averagng after telescopng. Proposton Assume that f x s convex wth contnuous gradent, L s the Lpschtz constant, and h x s convex. Let θ = τ wth 0 < τ < and θ 0 =. For Algorthm, we have F x F x λ, Ax b F x F x λ, Ax b τ F x F x λ, Ax b ˆλ λ ˆλ λ η D x η D x η D x A D A x η D x A D A x, where η = L A, ˆλ = λ θ θ Ax b, D {x, λ } s any KKT pont. = x θ x, D0 = x0 and Before provng Proposton, we frst prove the followng Lemma. 8

9 Lemma Let λ = λ A x A y b, then we have Proof From ˆλ = λ θ we have b Ax Ax = θ ˆλ ˆλ, ˆλ λ A x y, ˆλ K ˆλ K [ 0 = Ax b Ax b τ Ax b ]. =0 Ax b, θ = τ and λ = λ τ ˆλ =λ A x b =λ τ =λ τ A x b A x b τ A x b =λ A x b =ˆλ A x b A x b =ˆλ b A x A x. On the other hand, from 4a and the defnton of λ we have ˆλ λ = θ From 4b and θ = τ we have = = = ˆλ K ˆλ 0 K =0 =0 ˆλ ˆλ [ K A x [ K A x =0 A x y A x y. ] b θ A x b b A x ] b τ A x b. A x b 4a 4b Then we can prove Proposton usng Lemma. 9

10 Proof 3 Let λ = λ A y. b From the optmalty condton of 0a and 0b, we have 0 f y h x A T λ L A x y, From the convexty of h x we have h x h x f y A T λ L A x y, x x. On the other hand, from the L -Lpschtz dfferentable and convex of of f, we have So f x f y f y, x y L x y =f y f y, x y f y, x L x x y f x f y, x x L x y. F x F x [ A T λ, x x L A x y ]. L A x y, x y Let x = x and x = x respectvely, we have F x F x [ A T λ, x x L A x y ], L A x y, x y and F x F x [ A T λ, x x L A x y ]. L A x y, x y 0

11 Multply the frst nequalty by, multply the second by and add them together, we have So we have F x F x F x [ λ, A x A x A x L A L A x y, x x y x y ]. F x F x λ, Ax b F x F x λ, Ax b =F x F x F x λ, A x A x b = [ λ λ, A x A x A x L A L A x y, x x y x y ] λ λ, A x A x A x [ λ λ, A x A x A x L A L A where we use A x we have Snce x y, x x y x y ], = b. Let D = x θ x, D = y θ x = x y = x θ x x. λ λ, A x A x A x = A y A x, A x A x A x θ = [ A x A x A x A x A x A y ] θ A y A x [ A D A x A D A x ] A y x, x,

12 and L A L = = θ η A L A x y, x x y [ x x y x x x x y. [ D x D x ] L A x y, ] where η = L A. So from Lemma we have F x F x λ, Ax b F x F x λ, Ax b θ λ λ, ˆλ ˆλ θ [ A D A x A D A x ] η [ D x D x ] A y x = θ ˆλ λ ˆλ λ λ ˆλ ˆλ λ θ [ A D A x A D A x ] η [ D x D x ] A y x θ ˆλ λ ˆλ λ λ ˆλ θ [ A D A x A D A x ] η [ D x D x ].

13 Dvde both sdes by and use θ = τ, we have F x F x λ, Ax b F x F x λ, Ax b τ F x F x λ, Ax b ˆλ λ ˆλ λ λ ˆλ η D x A D A x η D x A D A x η [ D x D x ] ˆλ λ ˆλ λ λ ˆλ η D x A D A x η D x A D A x η D x η D x, where we use and η η, whch can be derved from = τ and 0 < τ <. A good property of Proposton s that we can sum the nequalty over = 0,, K and then bound F x K F x λ, Ax K b by a constant, whch leads to F x K F x θ K λ, Ax K b Oθ K. Proposton manly handles the objectve. For the constrant we have a smlar proposton, but wth an addtonal summaton over = 0,, K. Proposton 3 If the condtons n Proposton hold, then for Algorthm we have K Ax b Ax b τ Ax b C λ ˆλ 0, =0 where C = λ0 λ L A x 0 x A x 0 A x L A x 0 x. Proof 4 We contnue from Proposton. Summng over = 0,,, K, we have F x K θ K F x λ, Ax K b K τ F x F x λ, Ax b C ˆλ K λ, = 30 where we use θ 0 =, 0 = θ0 θ 0 = θ τ, η K D K x A D K A x 0, 3

14 and C λ0 λ L A x 0 x A x 0 A x L A = ˆλ 0 λ x 0 x η 0 D0 x A D 0 A x η0 D0 x. The last relaton comes from D 0 = x0, ˆλ 0 = λ 0 θ0 θ 0 A x 0 b = λ 0 and η 0 = L θ 0 A. Snce {x, λ } s any KKT pont, we have So Thus we have whch leads to F x = F x x = argmn F x x λ, λ, A x b A x b F x λ, ˆλ K λ C, ˆλ K ˆλ 0 ˆλ K λ λ ˆλ 0 C λ ˆλ 0.. A x b, x. 34 From Lemma, we have [ K A x b A x b τ A x b] =0 C λ ˆλ 0. Both Proposton and Proposton 3 have a smlar form to. Thus we have extended Nesterov s frst acceleraton scheme from unconstraned problem to deal wth both the objectve and the constrant. Moreover, from Proposton 3 we can see that Nesterov s acceleraton scheme s crtcal to accelerate not only the decrease of the objectve, but also the constrant error. In Proposton 3, the summaton les nsde. Thus t s more dffcult to bound AxK b θ K than boundng θ K F x K F x λ, Ax K b from Proposton. We dscover the followng crtcal Lemma whch can overcome ths dffculty. Lemma Consder a sequence {a, a, } of vectors, f {a } satsfes K /τ K/τ ak a c, K = 0,,,. = where > τ > 0. Then K = a < c for all K =,,. 4

15 Proof 5 For each K 0, there exsts c K wth every entry c K 0 such that and c K = c. Let S K c K /τ K/τ a K K = = K = a, K and S0 = 0, then c K a c K, S K /τ K/τ S K ak, K 0, /τ K/τ c K where we use /τ > and /τ K/τ > 0. So K 0 we have S K =a K S K ck S K /τ K/τ SK c K = /τ K/τ c K /τ K/τ K /τ /τ K/τ c K K /τ /τ K/τ SK c K /τ K /τ K/τ /τ K /τ SK K /τ c K /τ K/τ /τ K/τ /τ K /τ K /τ K/τ /τ K/τ /τ K /τ K /τ /τ K /τ SK c K /τ K/τ K /τ c K /τ K/τ /τ K /τ c K /τ K /τ K /τ K/τ /τ K/τ /τ K /τ /τ K /τ c K K j/τ c /τ j /τ /τ 0/τ /τ /τ 0/τ S0 j= K = c K j/τ, /τ /τ /τ j /τ = j= where we defne K j/τ j=k /τj /τ =. Let r = /τ /τ K j= j/τ, =,,, K. /τ j /τ 5

16 Then we have r > 0 and S K K = r c. Smlarly, we have S K K = r c. Thus Defne R K = K = K S K r c. = /τ /τ K j= j/τ /τ j /τ, and Then we have R K = R = = K = /τ /τ /τ /τ K j= j= j/τ /τ j /τ, j/τ /τ j /τ = τ. R K K = /τ K/τ /τ /τ = /τ K/τ K /τ /τ K/τ = /τ K/τ = K = /τ /τ K /τ /τ K/τ RK. K j= K j= j/τ /τ j /τ j/τ /τ j /τ Then we wll prove R K <, K by nducton. It can be easly checed that R = τ <. Assume R K < holds, then S K R K < So by nducton we can have R K <, K. So K 0, we have K = K /τ /τ K/τ /τ K/τ =. K r = r c K K = r = K r = r c K K < = r where we use K = r = R K < and the convexty of x. So we have S K = K S K < r = c = K = r c < c, = r c, where we use c = c,. So K = a = S K < c, K 0. 6

17 . Based on Propostons and 3, we can have the O/K nonergodc convergence rate n Theorems Theorem If the condtons n Proposton hold, then for Algorthm we have and where C = τc λ K τ F xk F x C τc λ K τ, Ax K b τc K τ, C λ λ 0 τ and C s defned n Proposton 3. Proof 6 We contnue from Proposton 3. From 30, 34 and Proposton 3 we can have F x K F x λ, A x K b Cθ K, and C λ ˆλ 0 { K A x b A x b τ A x b} =0 = A x K b θ K A x 0 b K θ τ A x b =0 = A x K b K θ K τ A x b, K = 0,,,. = where we use θ τ = θ0 θ = 0. Snce = τ = 0 θ τ, we have = 0 τ. For smplcty, let a = A x b, then K C λ /τ K/τ ak a ˆλ 0 τ = C, K = 0,,. θ 0 τ = From Lemma we have K = a C, K =,,. So a K,,. Moreover, a C τc /τ0/τ. So A x K b τc, K = 0,,, K τ Thus F x K F x Cθ K λ A x K b C K τ τc λ K τ, C /τk/τ, K = 7

18 Algorthm Accelerated LADMM wth NonErgodc convergence rate and RestartALADMM- NER Intalze λ 0, x 0 = x, =,, > τ > 0.5, > 0, θ 0 =, > ɛ > 0, θ = /τ. for = 0,,, do Update y, =, usng 5a, Update x and x serally usng 0a and 0b, Update ˆλ usng 5d, =. τ f A x =, = end f end for b A x b and θ < ɛ then and F x K F x λ A x K b τc λ K τ, whch s from 34. From Theorem we can see that the O/K nonergodc convergence rate exsts only f τ <. In fact, only when τ <, = τ s n the order of O/ and Nesterov s acceleraton scheme s effectve. As dscussed n Secton 3., ALADMM-NE reduces to the tradtonal LADMM when τ =. 3.4 Tps on the Choce of the Algorthms In applcatons where the practcal performance of LADMM concdes wth ts theoretcal convergence rate, t s guaranteed that ALADMM-NE practcally outperforms LADMM. However, n the cases where LADMM converges much faster than ts theoretcal rate, e.g., n applcatons of Robust PCA [7] that LADMM almost lnearly converges, we emprcally observe that the superorty of ALADMM-NE and the accelerated ADMM n [8] s not obvous. In fact, due to the specal settng of whch dependents on, ALADMM-NE and the method n [8] have exactly the O/K convergence rate measured at {x K, x K } even for the strongly convex problems. So n practce, we suggest that when the problem s complex and does not satsfy the lnear convergence condtons [0,,, 3, 4], ALADMM-NE and the accelerated ADMM n [8] are better choces than the tradtonal LADMM. When sparsty or low-ranness s requred, ALADMM-NE s better than the accelerated ADMM n [8]. Donoghue and Candès [8] proposed a restart strategy for Nesterov s frst acceleraton scheme when mnmzng the unconstraned problems, n whch the algorthm s restarted after some teratons by settng = and y = x. Then the lnear convergence s guaranteed even for the sublnear settng of [9]. A smlar technque s dscussed for Nesterov s second scheme n [30]. So we can apply the restart scheme for the accelerated ADMM n [8] and ALADMM-NE. The latter s descrbed n Algorthm. 
We restart ALADMM-NE as soon as $\|Ax-b\|$ increases. In the if-clause, $\theta$ is reset to 1 so that $y=x$ when the algorithm is restarted. We use the criterion $\theta_k<\epsilon$ to prevent frequent restarts, so that a restart happens only when $\theta_k$ has become small.

19 4 Tghtness of the o/ K Nonergodc Rate for the Tradtonal ADMM In ths secton we show that the o K rate s tght for ADMM, at least for the constrant, by studyng a specal problem [3, 9], on whch Alternatng Projecton Method APM and DR splttng perform slowly: arbtrarly slow on the measure of x x and tght o convergence rate on the measure of fx fx. The dscusson n ths secton also suts for LADMM and the accelerated ADMM n [8] measured at z, z snce they are equvalent to ADMM on ths specal problem. Let θ be a sequence of angles n 0, π/ wth cosθ =. Let e 0 =, 0, e π/ = 0,, e θ = cosθ e 0 snθ e π/. Defne two lnes U = span{e 0 } and V = span{e θ }, then U V = {0}. Consder the Hlbert space H = R R. Defne We consder problem U = R e 0 R e 0, V = R e θ0 R e θ, mn x fx = hx gx, 59 where hx = I U x s the ndcator functon of U, gx = a d V x, d V x = mn v V x v and a can be any constant satsfyng a > 0.5. Ths problem can be solved by ADMM and ALADMM-NE by transformng t to mn hx gz s.t. z x = x,z Proposton 4 says that the o K rate s tght for ADMM. Ths means that the slow o K nonergodc convergence rate of ADMM s not due to the weaness of the proof, but that of ADMM tself. It s dffcult to establsh the lower complexty bound of hx gz hx gz, so we only measure fx fx for smplcty. It should be noted that Proposton 4 s ADMM specfed and t does not sut for ALADMM-NE. As comparson, we can establsh z x O/ and fx fx O/ for ALADMM-NE, whch establshes the superorty of ALADMM-NE wth theoretcal guarantee. We lst the comparsons n Table. [ ] Proposton 4 Let x 0 = a, λ 0 = 0, a > 0.5, then for ADMM wth teratons 0 a-c we have z x Ω and fx fx Ω. a a a In Proposton 4 we specalze the ntalzaton of x 0 and λ 0, where x 0 x s bounded and ndependent on. Ths s a standard trc n the analyss of lower bound. Proposton 4 can be proved usng the same proof framewor n [9], so we omt the detals. One may thn that the ncreasng penalty n ALADMM-NE s the decdng factor of the mproved convergence rate. However, ths s ncorrect. 
Emprcally, large penalty speeds up the decrease of the constrant error n ADMM [6]. But ths s not guaranteed n theory. In fact, From Proposton 4 we can see that the constrant error s ndependent of, whch means that the decrease of the constrant error cannot be faster than o K no matter how large s. There are two reasons for ths result:. It s equvalent to mnmzng the sum of two ndcator functons when usng ADMM to solve problem 60 and has no nfluence on the projecton operaton;. x and ALADMM-NE can be appled to Hlbert spaces. Snce gz s contnuous [9], we have fx fx hx gz hx gz gz gx O/ OL/ = O/. 9

20 Table : Theoretcal complexty comparsons among ALADMM-NE, ADMM, DR and APM on problem 59. a s any constant satsfyng a > 0.5. Theoretcal complexty bound APM fx fx Ω a DR fx fx Ω a ADMM fx fx Ω, z x Ω a a ALADMM-NE fx fx O/, z x O/ z are updated serally, not parallel. Thus although the gradually ncreasng penalty n ALADMM- NE plays an mportant role to cooperate wth Nesterov s acceleraton scheme, Nesterov s scheme s ndeed the crtcal factor to mprove the convergence rate n theory. Large penalty cannot mprove the convergence rate of ADMM even for the constrant. 5 Lower Complexty Bound Recently, Woodworth and Srebro [3] establshed the O/K lower complexty bound of the s- m tochastc gradent methods for optmzng the fnte sum problem: mn x m f x, where each f s nonsmooth and not strongly convex. In ths secton we frst use Woodworth and Srebro s result to analyze the general splttng scheme, and then extend t to the general ADMM type methods, whch deal wth the addtonal lnear constrant. 5. Splttng Scheme We consder the followng problem: mn F x F x. x X We call a method belongng to the general splttng scheme f t has the form of Generate z t n any way, x t = Prox F/ tzt, Generate z t n any way, x t = Prox F/ tzt, 6 n the t-th teraton and t s arbtrary. In ths general scheme, two proxmal subproblems are solved alternatvely. z t and z t can be generated n any way. For example, z t Span{x,, x t, x,, x t } and z t Span{x,, x t, x,, x t }. Ths general splttng scheme ncludes many famous s- plttng algorthms, such as DR splttng, whch conssts of the followng steps: x t = Prox F/z t, x t = Prox F/x t z t, z t = z t x t x t. For ths general splttng scheme, we can have the followng proposton by the same analyss n [3]: 0

21 Proposton 5 There exst functons F and F defned over X = {x R 65 : x B}, whch are convex and L-Lpschtz contnuous, such that for the general splttng scheme 6 we can have F ˆx F ˆx LB 8, where ˆx = α x α x, α and α, =,,. 5. General ADMM Type Methods Now we use Proposton 5 to establsh the lower complexty bound of ADMM type methods. Consder the followng specal case of problem : Defne the general ADMM type methods as mn F x F x, s.t. x x = x,x X Generate λ t and y t n any way, x t = argmn Lx, y, t λ t, t = Prox F/ t y t λt t, z Generate λ t and y t n any way, x t = argmn x Ly t, x, λ t, t = Prox F/ t y t λt t, 66 where t can be any value. The dfference between ths general scheme and the tradtonal ADMM s that we replace λ t, x t, x t and ρ n a-c wth λ t, λ t y, t y t and t. These fve varables can be generated n any way. For nstance, λ t, λ t Span{λ,, λ t }, λ t = λ t α t t x t x t wth arbtrary α t, y t Span{x,, x t, x,, x t } and y t Span{x,, x t, x,, x t }. It can be checed that the tradtonal ADMM and ALADMM-NE wth f = 0 and A = I belongs to ths general scheme. We can see that procedure 66 belongs to 6 by lettng z t = y t λt and z t t = y t λt. t Let ˆx = α x and ˆx = α x. Then from Proposton 5 we now that there exsts convex and L-contnuous F and F such that F ˆx F ˆx F x F x LB 8. Snce F s L-contnuous: F ˆx F ˆx L ˆx ˆx, we can have F ˆx F ˆx L ˆx ˆx and LB 8 F ˆx F ˆx F x F x L ˆx ˆx F ˆx F ˆx F x F x L ˆx ˆx F ˆx F ˆx F x F x where x = x = x. Thus we have the followng lower complexty bound proposton for the general ADMM type methods for both the ergodc and nonergodc case, where the nonergodc bound can be obtaned by lettng α = α = 0, =,,, and α = α =. Proposton 6 There exsts functons F and F defned over X = {x R 65 : x B}, whch are convex and L-contnuous, such that for the general ADMM type methods 66 we can have L ˆx ˆx F ˆx F ˆx F x F x where ˆx = α x and ˆx = α x, α and α, =,,. LB 8.

22 Snce problem 65 s a specal case of, we can have that O/K s the optmal convergence rate of the general ADMM type methods 66 for problem. There s no better ADMM type algorthm whch converges faster than the O/K rate f t belongs to the framewor n 66. Moreover, 66 s general enough for the separable problem whle stll eepng the property of ADMM that alternately mnmzes the augmented Lagrangan functon. Thus our result s general enough. Snce we can easly construct some algorthms whch may dverge such that they can easly mae one of Ax b and F x F x small but dffcult to eep both small, ths s why we use the summaton n Proposton Experments on the Group Sparse Logstc Regresson wth Overlap In ths secton we test the performance of ALADMM-NE and ALADMM-NER on the Group Sparse Logstc Regresson wth Overlap. Ths problem can be deemed as a combnaton of the Group Sparse Logstc Regresson [33] and the Group LASSO wth Overlap [34]. Its mathematcal model s as follows: mn w,b s s log exp y w T x b ν t S j w, where x and y are the tranng samples and labels. w and b are the parameters for the classfer. s s the sample sze and t s the group sze. S j, j =,, t are the selecton matrces wth only one at each row and 0 for the rest entres. We consder the case that the groups of entres may overlap each other. We can transform the problem to a lnearly constraned one by ntroducng S j = S j ; 0, S = S. S t, w = mn w,z w b s x, x =, z j = S j w and z = s log exp y w T x ν z. z t j= : t z j, s.t. z = Sw. 70 We carry out the experment on the breast cancer gene expresson data set. 350 genes n 95 breast cancer tumors are consdered n our experment, whch appear n 637 gene groups. Gene selecton s a ey purpose n ths problem. The group sparsty regularzaton helps to decde whch groups of Genes play a central role n the cancer predcton. Thus the group sparsty s strongly requred. We compare ALADMM-NE and ALADMM-NER wth LADMM and the accelerated LADMM ALADMM [8]. 
We set the initializer at 0 and run all the methods for 000 iterations. We set $\tau = 0.8$ for ALADMM-NE and ALADMM-NER and $\epsilon = 0.0$ for ALADMM-NER. For ALADMM, we set the parameters following the theoretical assumptions in [18]. We set the tuning parameter to 0.3 for LADMM, 0.06 for erg-ALADMM, 0.4 for nerg-ALADMM and 0.08 for ALADMM-NE and ALADMM-NER, chosen for the best performance of each algorithm, respectively. Here erg-ALADMM (erg-LADMM) means that we use the ergodic solution of ALADMM (LADMM), that is, an average of the iterates over $k = 1, \dots, K$ (weighted through $\theta_k$ for ALADMM, and the plain average $\frac{1}{K}\sum_{k=1}^{K} z^k$ for LADMM), and nerg-ALADMM (nerg-LADMM) means that we use the nonergodic solution of ALADMM (LADMM), that is, the last iterate $z^K$, directly. The figure below plots the objective function value, the constraint error, the sparsity and the group sparsity vs. time. We run LADMM and use its nonergodic output as the optimal $F^*$.

Figure: Comparison of ALADMM-NE and ALADMM-NER with LADMM and ALADMM on the Group Sparse Logistic Regression problem. Panels, all plotted against time: (a) function value, $\log_{10}(F(\bar w, z) - F^*)$; (b) constraint error, $\log_{10}\|z - \bar S \bar w\|$; (c) sparsity (percent of selected genes); (d) group sparsity (number of non-empty groups). Curves: erg-LADMM, nerg-LADMM, erg-ALADMM, nerg-ALADMM, ALADMM-NE (ours), ALADMM-NER (ours).

We can see that both erg-LADMM and erg-ALADMM have a less favorable sparsity and group sparsity than their nonergodic counterparts, which verifies that the nonergodic measurement is required. However, nerg-ALADMM decreases the objective function more slowly than erg-ALADMM. In some practical applications, ADMM can perform better than the theoretical bound. Thus it is
not strange that nerg-LADMM converges faster than erg-LADMM. In comparison, ALADMM-NE and ALADMM-NER not only run faster than the compared methods but also attain a sparsity and group sparsity as good as those of nerg-LADMM and nerg-ALADMM. In ADMM type methods, the monotonicity of the objective function and the constraint error cannot be guaranteed in theory. This leads to the oscillation seen in the figure.

6 Conclusions

In this paper, we modify the accelerated ADMM proposed in [18] and give an $O(1/K)$ nonergodic analysis in the sense of $|F(x^K) - F(x^*)| \le O(1/K)$ and $\|Ax^K - b\| \le O(1/K)$, where the nonergodic result has a more favorable sparsity and low-rankness than the ergodic one. This is the first $O(1/K)$ nonergodic convergent ADMM type method and it surpasses the $o(1/\sqrt{K})$ nonergodic rate of the traditional ADMM. Moreover, we show that the lower complexity bound of ADMM type methods is $O(1/K)$ when each $F_i$ is nonsmooth and not strongly convex, which means that our method is optimal.

References

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. In Foundations and Trends in Machine Learning, 2010.
[2] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120-145, 2011.
[3] E. Esser, X. Zhang, and T. F. Chan. A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. on Imaging Sciences, 3(4):1015-1046, 2010.
[4] B. He, L. Liao, D. Han, and H. Yang. A new inexact alternating directions method for monotone variational inequalities. Mathematical Programming, 92(1):103-118, 2002.
[5] R. Shefi and M. Teboulle. Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. on Optimization, 24(1):269-297, 2014.
[6] X.F. Wang and X.M. Yuan. The linearized alternating direction method for Dantzig selector. SIAM J. on Scientific Computing, 34(5):A2792-A2811, 2012.
[7] B.S. He and X.M. Yuan. On the O(1/t) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. on Numerical Analysis, 50, 2012.
[8] B.S. He and X.M. Yuan. On non-ergodic convergence rate of Douglas-Rachford alternating directions method of multipliers. Numerische Mathematik, 130, 2015.
[9] D. Davis and W.T. Yin. Convergence rate analysis of several splitting schemes. Technical report, UCLA CAM Report, 2014.
[10] J. Douglas and H.H. Rachford. On the numerical solution of heat conduction problems in two and three space variables. Transactions of the American Mathematical Society, pages 421-439, 1956.
[11] D. Gabay. Applications of the method of multipliers to variational inequalities. Studies in Mathematics and its Applications, 15:299-331, 1983.
[12] A. Beck and M. Teboulle. A fast iterative shrinkage thresholding algorithm for linear inverse problems. SIAM J. Imaging Sciences, 2(1):183-202, 2009.
[13] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence $O(1/k^2)$. Soviet Mathematics Doklady, 27(2):372-376, 1983.
[14] Y. Nesterov. On an approach to the construction of optimal methods of minimization of smooth convex functions. Èkonom. i. Mat. Metody, 1988.
[15] P. Tseng. On accelerated proximal gradient methods for convex-concave optimization. Technical report, University of Washington, Seattle.
[16] C.H. Chen, R.H. Chan, S.Q. Ma, and J.F. Yang. Inertial proximal ADMM for linearly constrained separable convex optimization. SIAM J. on Imaging Sciences, 8(4):2239-2267, 2015.
[17] D. A. Lorenz and T. Pock. An inertial forward-backward algorithm for monotone inclusions. Journal of Mathematical Imaging and Vision, 51(2):311-325, 2015.
[18] Y.Y. Ouyang, Y.M. Chen, G.H. Lan, and E. Pasiliao. An accelerated linearized alternating direction method of multipliers. SIAM J. on Imaging Sciences, 2015.
[19] T. Goldstein, B. O'Donoghue, S. Setzer, and R. Baraniuk. Fast alternating direction optimization methods. SIAM J. on Imaging Sciences, 7(3):1588-1623, 2014.
[20] W. Deng and W.T. Yin. On the global and linear convergence of the generalized alternating direction method of multipliers. Journal of Scientific Computing, 2016.
[21] M.Y. Hong and Z.Q. Luo. On the linear convergence of the alternating direction method of multipliers. Mathematical Programming, 162(1-2):165-199, 2017.
[22] P. Giselsson and S. Boyd. Linear convergence and metric selection in Douglas-Rachford splitting and ADMM. IEEE Transactions on Automatic Control, 62(2):532-544, 2017.
[23] W. Yang and D. Han. Linear convergence of the alternating direction method of multipliers for a class of convex optimization problems. SIAM J. on Numerical Analysis, 54(2):625-640, 2016.
[24] D. Boley. Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM J. on Optimization, 23(4):2183-2207, 2013.
[25] Y. Chen, G. Lan, and Y. Ouyang. Optimal primal-dual methods for a class of saddle point problems. SIAM J. on Optimization, 24(4):1779-1814, 2014.
[26] Z.C. Lin, R.S. Liu, and H. Li. Linearized alternating direction method with parallel splitting and adaptive penalty for separable convex programs in machine learning. Machine Learning, 99(2):287-325, 2015.
[27] Z. Lin, M. Chen, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint, 2010.
[28] B. O'Donoghue and E. J. Candès. Adaptive restart for accelerated gradient schemes. Foundations of Computational Mathematics, 15(3):715-732, 2015.
[29] I. Necoara, Yu. Nesterov, and F. Glineur. Linear convergence of first order methods for non-strongly convex optimization. arXiv preprint, 2016.
[30] H. Li and Z. Lin. Provable accelerated gradient method for nonconvex low rank optimization. arXiv preprint, 2017.
[31] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan, and X. Wang. The rate of linear convergence of the Douglas-Rachford algorithm for subspaces is the cosine of the Friedrichs angle. Journal of Approximation Theory, 185:63-79, 2014.
[32] B. Woodworth and N. Srebro. Tight complexity bounds for optimizing composite objectives. In NIPS, 2016.
[33] L. Meier, S. van de Geer, and P. Bühlmann. The group LASSO for logistic regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1):53-71, 2008.
[34] L. Jacob, G. Obozinski, and J.P. Vert. Group LASSO with overlap and graph LASSO. In ICML.


More information

Convexity preserving interpolation by splines of arbitrary degree

Convexity preserving interpolation by splines of arbitrary degree Computer Scence Journal of Moldova, vol.18, no.1(52), 2010 Convexty preservng nterpolaton by splnes of arbtrary degree Igor Verlan Abstract In the present paper an algorthm of C 2 nterpolaton of dscrete

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

The Study of Teaching-learning-based Optimization Algorithm

The Study of Teaching-learning-based Optimization Algorithm Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

arxiv: v1 [quant-ph] 6 Sep 2007

arxiv: v1 [quant-ph] 6 Sep 2007 An Explct Constructon of Quantum Expanders Avraham Ben-Aroya Oded Schwartz Amnon Ta-Shma arxv:0709.0911v1 [quant-ph] 6 Sep 2007 Abstract Quantum expanders are a natural generalzaton of classcal expanders.

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Lecture 10: May 6, 2013

Lecture 10: May 6, 2013 TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Communication-Efficient Algorithms for Decentralized and Stochastic Optimization

Communication-Efficient Algorithms for Decentralized and Stochastic Optimization oname manuscrpt o. (wll be nserted by the edtor) Communcaton-Effcent Algorthms for Decentralzed and Stochastc Optmzaton Guanghu Lan Soomn Lee Y Zhou the date of recept and acceptance should be nserted

More information

Math 217 Fall 2013 Homework 2 Solutions

Math 217 Fall 2013 Homework 2 Solutions Math 17 Fall 013 Homework Solutons Due Thursday Sept. 6, 013 5pm Ths homework conssts of 6 problems of 5 ponts each. The total s 30. You need to fully justfy your answer prove that your functon ndeed has

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Distributed and Stochastic Machine Learning on Big Data

Distributed and Stochastic Machine Learning on Big Data Dstrbuted and Stochastc Machne Learnng on Bg Data Department of Computer Scence and Engneerng Hong Kong Unversty of Scence and Technology Hong Kong Introducton Synchronous ADMM Asynchronous ADMM Stochastc

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

Lecture 17: Lee-Sidford Barrier

Lecture 17: Lee-Sidford Barrier CSE 599: Interplay between Convex Optmzaton and Geometry Wnter 2018 Lecturer: Yn Tat Lee Lecture 17: Lee-Sdford Barrer Dsclamer: Please tell me any mstake you notced. In ths lecture, we talk about the

More information

18.1 Introduction and Recap

18.1 Introduction and Recap CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng

More information

A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning

A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning A Delay-tolerant Proxmal-Gradent Algorthm for Dstrbuted Learnng Konstantn Mshchenko Franck Iutzeler Jérôme Malck Massh Amn KAUST Unv. Grenoble Alpes CNRS and Unv. Grenoble Alpes Unv. Grenoble Alpes ICML

More information

Support Vector Machines CS434

Support Vector Machines CS434 Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? + + + + + + + + + Intuton of Margn Consder ponts

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information