Revisiting Projection-Free Optimization for Strongly Convex Constraint Sets


Jarrid Rector-Brooks
University of Michigan, Ann Arbor
2260 Hayward St, Ann Arbor, MI 48104
jrectorb@umich.edu

Jun-Kun Wang
Georgia Institute of Technology
226 Ferst Drive NW, Atlanta, GA
jimwang@gatech.edu *

Barzan Mozafari
University of Michigan, Ann Arbor
2260 Hayward St, Ann Arbor, MI 48104
mozafari@umich.edu

Abstract

We revisit Frank-Wolfe (FW) optimization under strongly convex constraint sets. We provide a faster convergence rate for FW without line search, showing that a previously overlooked variant of FW is indeed faster than the standard variant. With line search, we show that FW can converge to the global optimum, even for smooth functions that are not convex, but are quasi-convex and locally-Lipschitz. We also show that, for the general case of (smooth) non-convex functions, FW with line search converges with high probability to a stationary point at a rate of O(1/t), as long as the constraint set is strongly convex -- one of the fastest convergence rates in non-convex optimization.

Introduction

A popular family of optimization algorithms are the so-called gradient descent algorithms: iterative algorithms comprised of a gradient descent step at each iteration, followed by a projection step when there is a feasibility constraint. The purpose of the projection is to ensure that the update vector remains within the feasible set. In many cases, however, the projection step may have no closed form and thus requires solving another optimization problem itself (e.g., for l_1.5 norm balls or matroid polytopes (Hazan and others 2016; Hazan and Kale 2012)), the closed form may exist but involve an expensive computation (e.g., the SVD of the model matrix for Schatten-1, Schatten-2, and Schatten-∞ norm balls (Hazan and others 2016)), or there may simply be no method available for computing the projection in general (e.g., the convex hull of rotation matrices (Hazan, Kale, and Warmuth 2010), which arises as a constraint set in online learning settings (Hazan, Kale, and Warmuth 2010)). In these scenarios, each iteration of gradient descent may require many inner iterations to compute the projection (Jaggi, Sulovský, and others 2010; Lacoste-Julien and Jaggi 2015; Hazan and Kale 2012). This makes the projection step quite costly, and it can account for much of the execution time of each iteration (e.g., see our technical report (Rector-Brooks, Wang, and Mozafari 2018)).

* This work was performed while a student at the University of Michigan, Ann Arbor.
Copyright 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Frank-Wolfe (FW) optimization. In this paper, we focus on FW approaches, also known as projection-free or conditional gradient algorithms (Frank and Wolfe 1956). Unlike gradient descent, these algorithms avoid the projection step altogether by ensuring that the update vector always lies within the feasible set. At each iteration, FW solves a linear program over the constraint set. Since linear programs have closed-form solutions for most constraint sets, each iteration of FW is, in many cases, more cost effective than conducting a gradient descent step and then projecting it back onto the constraint set (Jaggi 2013; Hazan and Kale 2012; Hazan and others 2016). Another main advantage of FW is the sparsity of its solution. Since the solution of a linear program is always a vertex (i.e., extreme point) of the feasible set (when the set itself is convex), each iteration of FW can add, at most, one new vertex to the solution vector. Thus, at iteration t, the solution is a combination of, at most, t + 1 vertices of the feasible set, thereby guaranteeing the sparsity of the eventual solution (Clarkson 2010; Jaggi 2013; 2011).
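To make the linear-oracle step concrete, here is a minimal sketch (our own illustration, not code from the paper) of the closed-form solutions of the FW linear program for two common norm balls; the function names lmo_l1_ball and lmo_l2_ball are ours. For the l_1 ball the minimizer is a single signed, scaled coordinate vector, which is exactly why each FW iteration adds at most one new vertex to the solution.

    import numpy as np

    def lmo_l1_ball(grad, r=1.0):
        # argmin_{||v||_1 <= r} <v, grad>: a single signed vertex of the l1 ball.
        i = int(np.argmax(np.abs(grad)))   # coordinate with the largest gradient magnitude
        v = np.zeros_like(grad)
        v[i] = -r * np.sign(grad[i])       # move opposite to the gradient on that coordinate
        return v

    def lmo_l2_ball(grad, r=1.0):
        # argmin_{||v||_2 <= r} <v, grad>: the boundary point opposite the gradient.
        return -r * grad / (np.linalg.norm(grad) + 1e-12)

By contrast, projecting an arbitrary point onto, say, an l_1.5 or Schatten norm ball has no comparably cheap closed form, which is the cost the projection-free approach avoids.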
For these reasons, FW optimization has drawn growing interest in recent years, especially in matrix completion, structural SVM, computer vision, sparse PCA, metric learning, and many other settings (Jaggi, Sulovský, and others 2010; Lacoste-Julien et al. 2013; Osokin et al. 2016; Wang et al. 2016; Chari et al. 2015; Harchaoui et al. 2012; Hazan and Kale 2012; Shalev-Shwartz, Gonen, and Shamir 2011). Unfortunately, while faster in each iteration, standard FW requires many more iterations to converge than gradient descent, and is therefore slower overall. This is because FW's convergence rate is typically O(1/t), while that of (accelerated) gradient descent is O(1/t^2), where t is the number of iterations (Jaggi 2013).

We make several contributions (summarized in Table 1):

1. We revisit a non-conventional variant of FW optimization, called Primal Averaging (PA) (Lan 2013), which has been largely neglected in the past, as it was believed to have the same convergence rate as FW without line search while incurring extra computations (i.e., a matrix averaging step) at each iteration. However, we discover that, when the constraint set is strongly convex, this non-conventional variant enjoys a much faster convergence rate with high probability, O(1/t^2) versus O(1/t), which more than compensates for its slightly more expensive iterations. This surprising result has important ramifications in practice, as many classification, regression, multi-task learning, and collaborative filtering tasks rely on norm constraints that are strongly convex, e.g., generalized linear models with l_p norms, squared loss regression with l_p norms, multi-task learning with group matrix norms, and matrix completion with Schatten norms (Kim and Xing 2010; Garber and Hazan 2015; Hazan and others 2016).

2. While previous work on FW optimization has generally focused on convex functions, we show that FW with line search can converge to the global optimum even for smooth functions that are not convex, but are quasi-convex and locally-Lipschitz.

3. We also study the general case of (smooth) non-convex functions, showing that FW with line search can converge to a stationary point at a rate of O(1/t) with high probability, as long as the constraint set is strongly convex. To the best of our knowledge, such a fast convergence rate does not appear elsewhere in the non-convex optimization literature. (Without any assumptions, converging to local optima for continuous non-convex functions is NP-hard (Carmon et al. 2017; Agarwal et al. 2016).)

4. Finally, we conduct extensive experiments on various benchmark datasets, empirically validating our theoretical results and comparing the actual performance of various FW variants in practice.

Table 1: Our contributions compared to the state-of-the-art results for projection-free optimization. Here, t is the number of iterations. For non-convex functions, convergence is defined in terms of a stationary point instead of a global minimum. Note that although our bound is probabilistic for convex loss functions, we use no additional assumptions on the loss function and do not require line search, which can be a costly operation for big data (see Section 2).

Convex loss function:
- This paper: no additional assumption on the loss; strongly convex constraint set; O(1/t^2) with high probability; no line search.
- State of the art (Jaggi 2013): no additional assumption; convex set; O(1/t); no line search.
- (Garber and Hazan 2015): strongly convex loss; strongly convex set; O(1/t^2); requires line search.
- (Lacoste-Julien and Jaggi 2015): strongly convex loss; polytope; O(exp(-t)); requires line search.
- (Levitin and Polyak 1966; Demyanov and Rubinov 1970; Dunn 1979): norm of the gradient lower bounded; strongly convex set; O(exp(-t)); no line search.
- (Beck and Teboulle 2004): f(x) = ||Ax - b||_2^2; convex set; O(exp(-t)); no line search.

Quasi-convex loss function:
- This paper: locally-Lipschitz, norm of the gradient lower bounded; strongly convex set; O(min(1/t^{1/3}, 1/t^{1/2})); requires line search.
- State of the art: does not exist.

Non-convex loss function:
- This paper: no additional assumption; strongly convex set; O(1/t) with high probability; requires line search.
- State of the art (Lacoste-Julien 2016): no additional assumption; convex set; O(1/t^{1/2}); no line search.

2 Related Work

Table 1 compares the state of the art on projection-free optimization to our contributions.

Convex optimization. Garber and Hazan (2015) show that for strongly convex and smooth loss functions, FW with line search achieves a convergence rate of O(1/t^2) over strongly convex sets. In contrast, we do not need the loss function to be strongly convex. Further, they require an exact line search at each iteration to achieve this convergence rate. Line search, however, comes with significant downsides. An exact line search solves the problem min_{γ ∈ [0,1]} f(x + γv) for loss function f, solution vector x ∈ R^n, and descent direction v ∈ R^n. There are several methods for solving this optimization, and choosing the best method is often difficult for practitioners (e.g., bracketing line searches versus interpolation ones).
Moreover, at best, these methods converge to the minimum at a rate of O(1/t^2) (Sun and Yuan 2006). Approximate line searches require fewer iterations. However, in using them, one loses most of the theoretical guarantees provided in previous work, including that of (Garber and Hazan 2015). Nonetheless, both exact and inexact line searches involve at least one evaluation of the loss function or one of its derivatives, which can be quite prohibitive for large datasets (see Section 7.2). This is because the underlying

function for data modeling is typically in the form of a finite sum (e.g., a regression loss) over all the data. In comparison, Primal Averaging, which we study and promote, does not require a line search and works with a predefined step size. Notably, this allows PA to considerably outperform FW with line search (see Section 7.2).

Prior work (Levitin and Polyak 1966; Demyanov and Rubinov 1970; Dunn 1979) shows that standard FW without line search for smooth functions can achieve an exponential convergence rate, by making the strict assumption that the gradient is lower bounded everywhere in the feasible set. In our analysis of PA, however, we do not assume the gradient is lower bounded everywhere, allowing our result to be more widely applicable.

Quasi-convex optimization. Hazan et al. study quasi-convex and locally-Lipschitz loss functions that admit some saddle points (Hazan, Levy, and Shalev-Shwartz 2015). One of the optimization algorithms for this class of functions is the so-called normalized gradient descent, which converges to an ε-neighborhood of the global minimum. The analysis in (Hazan, Levy, and Shalev-Shwartz 2015) is for unconstrained optimization. In this paper, we analyze FW for the same class of functions, but with strongly convex constraint sets. Interestingly, when the constraint set is an l_2 ball, FW becomes equivalent to normalized gradient descent. In this paper, we both 1) show that FW can converge to a neighborhood of a global minimum, and 2) derive a convergence rate. (Dunn 1979) extends the analysis of FW to a class of quasi-convex functions of the form f(w) := g(h(w)), where h is differentiable and monotonically increasing, and g is a smooth function. Such functions are quite rare in machine learning. In contrast, we study a much more general class of quasi-convex functions, including several popular models (e.g., generalized linear models with a sigmoid loss).

Non-convex optimization. While there has been a surge of research on non-convex optimization in recent years (Carmon et al. 2017; Ge et al. 2015; Agarwal et al. 2016; Lee et al. 2016; Lacoste-Julien 2016), nearly all of it has focused on unconstrained optimization. To our knowledge, there are only a few exceptions (Lacoste-Julien 2016; Ghadimi and Lan 2016; Ge et al. 2015; Reddi et al. 2016). (Lacoste-Julien 2016) proves that FW for smooth non-convex functions converges to a stationary point at a rate of O(1/t^{1/2}), which matches the rate of projected gradient descent. (Reddi et al. 2016) extends this and considers a stochastic version of FW for smooth non-convex functions. Furthermore, Theorem 7 of (Yu, Zhang, and Schuurmans 2014) provides a convergence rate for non-convex optimization using FW, which is slower than O(1/t^{1/2}). We show in this paper that, for strongly convex sets, FW converges to a stationary point with high probability much faster: O(1/t).

3 Background

3.1 Preliminaries

Strongly convex constraint sets are quite common in machine learning. For example, when p ∈ (1, 2], l_p balls {u ∈ R^n : ||u||_p ≤ r} and Schatten-p balls {X ∈ R^{m×n} : ||X||_{S_p} ≤ r} are all strongly convex (Garber and Hazan 2015), where ||X||_{S_p} = (Σ_{i=1}^{min(m,n)} σ_i(X)^p)^{1/p} is the Schatten-p norm and σ_i(X) is the i-th largest singular value of X. Group l_{p,q} balls, used in multi-task learning (Garber and Hazan 2015; Kim and Xing 2010), are also strongly convex when p, q ∈ (1, 2]. In this paper, we use the following definitions.

Definition 1 (Strongly convex set). A convex set Ω ⊆ R^d is an α-strongly convex set with respect to a norm ||·|| if, for any u, v ∈ Ω and any θ ∈ [0, 1], the ball induced by ||·|| that is centered at θu + (1 − θ)v with radius θ(1 − θ)(α/2)||u − v||^2 is also included in Ω.

Definition 2 (Quasi-convex functions).
A function f : R^d → R is quasi-convex if, for all u, v ∈ R^d such that f(u) ≤ f(v), it follows that ⟨∇f(v), u − v⟩ ≤ 0, where ⟨·, ·⟩ is the standard inner product.

Definition 3 (Strictly-quasi-convex functions). A function f : R^d → R is strictly-quasi-convex if it is quasi-convex and its gradients only vanish at the global minimum. That is, for all u ∈ R^d, it follows that f(u) > f(u*) implies ∇f(u) ≠ 0, where u* is the global minimum.

Definition 4 (Strictly-locally-quasi-convex functions). Let u, v ∈ R^d and κ, ε > 0. Further, write B_r(x) for the Euclidean norm ball centered at x of radius r, where x ∈ R^d and r ∈ R. We say f : R^d → R is (ε, κ, v)-strictly-locally-quasi-convex in u if at least one of the following applies:
1. f(u) − f(v) ≤ ε.
2. ||∇f(u)|| > 0, and for every y ∈ B_{ε/κ}(v) it holds that ⟨∇f(u), y − u⟩ ≤ 0.

3.2 A Brief Overview of Frank-Wolfe (FW)

The Frank-Wolfe (FW) algorithm (Algorithm 1) attempts to solve the constrained optimization problem min_{x ∈ Ω} f(x) for some convex constraint set Ω (a.k.a. feasible set) and some function f : Ω → R. FW begins with an initial solution w_0 ∈ Ω. Then, at each iteration, it computes a search direction v_t by minimizing the linear approximation of f at w_t: v_t = argmin_{v ∈ Ω} ⟨v, ∇f(w_t)⟩, where ∇f(w_t) is the gradient of f at w_t. Next, FW produces a convex combination of the current iterate w_t and the search direction v_t to find the next iterate w_{t+1} = (1 − γ_t)w_t + γ_t v_t, where γ_t ∈ [0, 1] is the learning rate for the current iteration. There are a number of ways to choose the learning rate γ_t. Chief among these are setting γ_t = 2/(t + 1) (Algorithm 1, option A) or finding γ_t via line search (Algorithm 1, option B).

4 Faster Convergence Rate for Smooth Convex Functions

4.1 Primal Averaging (PA)

PA (Lan 2013) (Algorithm 2) is a variant of FW that operates in a style similar to Nesterov's acceleration method. PA maintains three sequences, (z_t)_{t=1,2,...}, (v_t)_{t=1,2,...}, and (w_t)_{t=1,2,...}. The first is the accelerating sequence (as in Nesterov acceleration), the second is the sequence of search directions, and the third is the sequence of solution vectors. At each iteration, PA updates its sequences by computing two

convex combinations and consulting the linear oracle, such that

z_t = (1 − γ_t)w_{t−1} + γ_t v_{t−1},
v_t = argmin_{v ∈ Ω} ⟨(1/Θ_t) Σ_{i=1}^{t} θ_i ∇f(z_i), v⟩,
w_t = (1 − γ_t)w_{t−1} + γ_t v_t,

where Θ_t = Σ_{i=1}^{t} θ_i and the θ_i are chosen such that γ_t = θ_t/Θ_t. Note that choosing θ_t does not require significant computation, as setting θ_t = t satisfies the requirement γ_t = θ_t/Θ_t for all t. (If θ_t = t, then θ_t/Θ_t = t / Σ_{i=1}^{t} i = t / (t(t + 1)/2) = 2/(t + 1) = γ_t.) Since z_t and w_t are convex combinations of elements of the constraint set Ω, z_t and w_t are themselves in Ω. While the input to the linear oracle is a single gradient vector in standard FW, PA uses an average of the gradients seen in iterations 1, 2, ..., t as the input to the linear oracle.

Algorithm 1 Standard Frank-Wolfe algorithm
1: Input: loss f : Ω → R.
2: Input: linear optimization oracle O(·) for Ω.
3: Initialize: any w_1 ∈ Ω.
4: for t = 1, 2, 3, ... do
5:   v_t ← O(∇f(w_t)) = argmin_{v ∈ Ω} ⟨v, ∇f(w_t)⟩.
6:   Option (A): predefined decaying learning rate {γ_t ∈ [0, 1]}_{t=1,2,...}.
7:   Option (B): γ_t = argmin_{γ ∈ [0,1]} γ⟨v_t − w_t, ∇f(w_t)⟩ + (γ^2 L / 2)||v_t − w_t||^2.
8:   w_{t+1} ← (1 − γ_t)w_t + γ_t v_t.
9: end for

Algorithm 2 Primal Averaging
1: Initialize: any v_0 ∈ Ω ⊆ R^d. Set w_0 = v_0.
2: for t = 1, 2, 3, ... do
3:   γ_t = 2/(t + 1).
4:   z_t = (1 − γ_t)w_{t−1} + γ_t v_{t−1}.
5:   Option (A): p_t = Σ_{i=1}^{t} (θ_i/Θ_t) ∇f(z_i), where Θ_t = Σ_{i=1}^{t} θ_i, θ_t = t, and θ_t/Θ_t = γ_t.
6:   Option (B): p_t = ∇f(z_t).
7:   v_t = argmin_{v ∈ Ω} ⟨v, p_t⟩.
8:   w_t = (1 − γ_t)w_{t−1} + γ_t v_t.
9: end for

In standard FW, the sequence (w_t)_{t=1,2,...} has the following property (Jaggi 2013; Lan 2013; Hazan and others 2016):

f(w_t) − f(w*) ≤ (2L / (t(t + 1))) Σ_{i=1}^{t} ||v_i − w_i||^2,    (1)

where w* is an optimal point and L is the smoothness parameter of f. We observe that the (1/t) Σ_{i=1}^{t} ||v_i − w_i||^2 factor of (1) is the average squared distance between the search direction and solution vector pairs. Denote the diameter D of Ω as D = sup_{u,v ∈ Ω} ||u − v||. Then, since w_i and v_i are both in Ω, we find that (1/t) Σ_{i=1}^{t} ||v_i − w_i||^2 ≤ D^2. That is, the average squared distance between v_i and w_i is upper bounded by the diameter D of Ω. Combining this with (1) yields standard FW's convergence rate:

f(w_t) − f(w*) ≤ (2L / (t(t + 1))) Σ_{i=1}^{t} ||v_i − w_i||^2 ≤ 2LD^2 / (t + 1) = O(1/t).    (2)

PA has a similar guarantee for the sequence (w_t)_{t=1,2,...} (Lan 2013). Namely,

f(w_t) − f(w*) ≤ (2L / (t(t + 1))) Σ_{i=1}^{t} ||v_i − v_{i−1}||^2.    (3)

While the inability to guarantee an arbitrarily small distance between v_i and w_i in Equation 1 causes standard FW to converge as O(1/t), this is not the case for the distance between v_i and v_{i−1} in Equation 3. Should we be able to bound the distance ||v_i − v_{i−1}|| to be arbitrarily small, we can show that PA converges as O(1/t^2) with high probability. We observe that the sequence (v_t)_{t=1,2,...} exhibits this behavior when the constraint set is strongly convex. We have the following theorem. (All omitted proofs can be found in our technical report (Rector-Brooks, Wang, and Mozafari 2018).)

Theorem 1. Assume the convex function f is smooth with parameter L. Further, define the function h as h(w) = f(w) + θξ^T w, where θ ∈ (0, ε/(4D)], ξ ∈ R^d, w ∈ Ω, Ω is an α-strongly convex set, D is the diameter of Ω, and ξ is uniform on the unit sphere. Applying PA to h yields the following convergence rate for f with probability 1 − δ:

f(w_t) − f(w*) = O( dL / (α^2 δ^2 t^2) ).
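To make Algorithm 2 and the perturbation used in Theorem 1 concrete, here is a minimal Python sketch (our own illustration, not code from the paper). It assumes the constraint set is an l_2 ball of radius r, so the linear oracle has the simple closed form below; the function names primal_averaging and lmo_l2_ball and the default arguments are illustrative assumptions.

    import numpy as np

    def lmo_l2_ball(p, r=1.0):
        # argmin_{||v||_2 <= r} <v, p>
        return -r * p / (np.linalg.norm(p) + 1e-12)

    def primal_averaging(grad_f, d, T, r=1.0, theta_pert=0.0, rng=None):
        # Algorithm 2, option (A), over an l_2 ball of radius r.
        # grad_f: callable returning the gradient of f at a point.
        # theta_pert > 0 adds the random linear perturbation h(w) = f(w) + theta * <xi, w>.
        rng = np.random.default_rng() if rng is None else rng
        xi = rng.normal(size=d)
        xi /= np.linalg.norm(xi)                  # xi uniform on the unit sphere
        w = np.zeros(d)                           # w_0 = v_0, both feasible
        v = np.zeros(d)
        grad_sum = np.zeros(d)                    # running sum of theta_i * grad h(z_i)
        Theta = 0.0
        for t in range(1, T + 1):
            gamma = 2.0 / (t + 1)
            z = (1 - gamma) * w + gamma * v       # accelerating sequence z_t
            g = grad_f(z) + theta_pert * xi       # gradient of the perturbed function h
            theta_t = float(t)                    # theta_t = t gives theta_t / Theta_t = gamma_t
            Theta += theta_t
            grad_sum += theta_t * g
            v = lmo_l2_ball(grad_sum / Theta, r)  # linear oracle on the averaged gradient p_t
            w = (1 - gamma) * w + gamma * v       # solution sequence w_t stays in the ball
        return w

    # Example (illustrative): least squares over the unit l_2 ball.
    # A = np.random.randn(200, 20); b = np.random.randn(200)
    # w = primal_averaging(lambda w: A.T @ (A @ w - b), d=20, T=1000, theta_pert=1e-3)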

Theorem 1 states that applying PA to a perturbed function h over an α-strongly convex constraint set allows any smooth, convex function f to converge as O(1/t^2) with probability 1 − δ, albeit with a dependence on δ and d. However, as t grows, the t^2 term in the convergence rate's denominator quickly dominates the rate's δ and d terms. This, combined with PA's non-reliance on line search, allows it to outperform the method proposed in (Garber and Hazan 2015). We note that, although Theorem 1 requires us to run PA on the perturbed function h, f itself still converges as O(1/t^2) with high probability. That is, the iterates w_t produced by running PA on h themselves satisfy f(w_t) − f(w*) = O( dL / (α^2 δ^2 t^2) ) for w* = argmin_{w ∈ Ω} f(w), with probability 1 − δ. We also empirically investigate this result in Section 7.1.

4.2 Stochastic Primal Averaging (SPA)

Here we provide a stochastic version of Primal Averaging. While in the previous section we studied PA with Option (A) of Algorithm 2, we now consider PA with Option (B) of Algorithm 2, providing an analysis of its stochastic version. That is, p_t = ∇̃f(z_t), where ∇̃f represents the aggregated stochastic gradient constructed as ∇̃f(z_t) = (1/|S_t|) Σ_{i ∈ S_t} ∇̂f_i(z_t). Further, ∇̂f_i(·) is the stochastic gradient computed with the i-th item of a dataset of size N, while S_t is the set of indices sampled without replacement from {1, 2, ..., N} at iteration t. We note that |S_t| = min(t^4, N).

Theorem 2. Assume the convex function f is smooth with parameter L. Denote by σ the variance of a stochastic gradient. Suppose p_t = ∇̃f(z_t) and the number of samples used to obtain p_t is n_t = O(t^4). Further, define the function h as h(w) = f(w) + θξ^T w, where θ ∈ (0, ε/(4D)], ξ ∈ R^d, w ∈ Ω, Ω is an α-strongly convex set, D is the diameter of Ω, and ξ is uniform on the unit sphere. Then applying PA to h yields the following convergence rate for f with probability 1 − δ:

E[f(w_t)] − f(w*) = O( dL^2 (D^2 + σ) log t / (α^2 δ^2 t^2) ).

Theorem 2 states that the stochastic version of PA maintains an O(log t / t^2) convergence rate with high probability, using h in a manner similar to Theorem 1. Note that n_t grows as O(t^4) until the algorithm begins to use all the data points to compute the gradient. Thus, for earlier iterations of SPA, the algorithm requires far less computation than its deterministic counterpart. However, the number of samples required in each iteration grows quickly, causing later iterations of SPA to share the same computational cost as deterministic Primal Averaging.

5 Strictly-Locally-Quasi-Convex Functions

In this section we show that FW with line search can converge to within an ε-neighborhood of the global minimum for strictly-locally-quasi-convex functions. Furthermore, if it is assumed that the norm of the gradient is lower bounded, then FW with line search can converge to within an ε-neighborhood of the global minimum in O(max(1/ε^2, 1/ε^3)) iterations.

Theorem 3. Assume that the function f is smooth with parameter L, and that f is (ε, κ, w*)-strictly-locally-quasi-convex, where w* is a global minimum. Then, the standard FW algorithm with line search (Algorithm 1, option (B)) can converge to within an ε-neighborhood of the global minimum when the constraint set is strongly convex. Furthermore, if one assumes that f(w) − f(w*) ≥ ε implies that the norm of the gradient is lower bounded as ||∇f(w)|| ≥ θε for some θ ∈ R, then the algorithm needs t = O(max(2κ/(θε)^2, 8Lκ^2/(θε)^3)) iterations to produce an iterate that is within an ε-neighborhood of the global minimum.

Hazan et al. (Hazan, Levy, and Shalev-Shwartz 2015) provide several examples of strictly-locally-quasi-convex functions. First, if ε ∈ (0, 1] and x = (x_1, x_2) ∈ [−10, 10]^2, then the function g(x) = (1 + e^{−x_1})^{−1} + (1 + e^{−x_2})^{−1} is (ε, 1, x*)-strictly-locally-quasi-convex in x.
Second, if ε ∈ (0, 1) and w ∈ R^d, then the function h(w) = (1/m) Σ_{i=1}^{m} (y_i − φ(⟨w, x_i⟩))^2 is (ε, 2/γ, w*)-strictly-locally-quasi-convex in w. Here, φ(z) = (1 + e^{−z})^{−1} is the sigmoid function, γ > 0 is the margin of a perceptron, and we have m samples {(x_i, y_i)}_{i=1}^{m} ⊆ B_1(0) × {0, 1}, where B_1(0) ⊆ R^d.

6 Smooth Non-Convex Functions

In this section, we show that, with high probability, FW with line search converges as O(1/t) to a stationary point when the loss function is non-convex and the constraint set is strongly convex. To our knowledge, a rate this rapid does not exist in the non-convex optimization literature. To help demonstrate our theoretical guarantee, we introduce a measure called the FW gap. The FW gap of f at a point w_t ∈ Ω is defined as k_t := max_{v ∈ Ω} ⟨v − w_t, −∇f(w_t)⟩. This measure is adopted in (Lacoste-Julien 2016), which is the first work to show that, for smooth non-convex functions, FW has an O(1/t^{1/2}) convergence rate to a stationary point over arbitrary convex sets. The O(1/t^{1/2}) rate matches the rate of projected gradient descent when the loss function is smooth and non-convex. It has been shown (Lacoste-Julien 2016) that a point w_t is a stationary point for the constrained optimization problem if and only if k_t = 0.

Theorem 4. Assume that the non-convex function f is smooth with parameter L and the constraint set Ω is α-strongly convex and has dimensionality d. Further, define the function h as h(w) = f(w) + θξ^T w, where θ ∈ (0, ε/(4D)], ξ ∈ R^d, w ∈ Ω, D is the diameter of Ω, and ξ is uniform on the unit sphere. Let l = f(w_1) − f(w*) and C = (αδ/(8L)) sqrt(π/(2d)). Then applying FW with line search to h yields the following guarantee for the FW gap of f with probability 1 − δ:

min_{1 ≤ s ≤ t} k_s ≤ l / (t · min{1/2, C}) = O(1/t).
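Both Theorem 3 and Theorem 4 rely on the line-search step of Algorithm 1, option (B), which minimizes the quadratic upper bound on f along the FW direction and therefore has a closed form; the FW gap k_t falls out of the same linear-oracle call. The snippet below is our own hedged sketch of one such iteration (assuming a known smoothness constant L and a linear oracle lmo such as the l_2-ball oracle sketched earlier), not code from the paper.

    import numpy as np

    def fw_line_search_step(w, grad, L, lmo, r=1.0):
        # One FW iteration with the option (B) step size; also returns the FW gap k_t.
        v = lmo(grad, r)                    # v_t = argmin_{v in Omega} <v, grad f(w_t)>
        d = v - w
        gap = float(-grad @ d)              # k_t = <v_t - w_t, -grad f(w_t)>; k_t = 0 iff w_t is stationary
        # Minimizing gamma*<d, grad> + (gamma^2 * L / 2)*||d||^2 over gamma in [0, 1]:
        gamma = 0.0 if gap <= 0 else min(1.0, gap / (L * float(d @ d) + 1e-12))
        return w + gamma * d, gap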

Table 2: Various loss functions and constraint sets used in our experiments.
- Convex: quadratic loss; l_p norm constraint; regression.
- Convex: observed quadratic loss; Schatten-p norm constraint; matrix completion.
- Strictly-locally-quasi-convex: squared sigmoid loss; l_p norm constraint; classification.
- Non-convex: bi-weight loss; l_p norm constraint; robust regression.

Figure 1: Convergence rates of FW variants for convex loss without line search and non-convex loss with line search. (a) Matrix completion with convex (observed quadratic) loss, Schatten-2 norm constraint. (b) Classification with quasi-convex (squared sigmoid) loss, l_2 norm constraint. (c) Regression with non-convex (bi-weight) loss, l_2 norm constraint.

We now discuss the result stated in the theorem further. In the non-convex optimization literature, Nesterov and Polyak (2006) show that cubic regularization of Newton's method can find a stationary point in O(ε^{−3/2}) iterations and Hessian evaluations. First-order methods, such as gradient descent, typically require O(ε^{−2}) iterations (Carmon et al. 2017) to converge to a stationary point. Recent progress on first-order methods, however, assumes some mild conditions and shows that an improved rate of O(ε^{−7/4}) is possible (Carmon et al. 2017; Agarwal et al. 2016). Here, we show that when the constraint set is strongly convex, FW with line search needs only O(ε^{−1}) iterations to arrive within an ε-neighborhood of a stationary point. It is important to note that, although the O(ε^{−1}) convergence rate holds probabilistically, it is quite fast compared to the known rates in the non-convex optimization literature.

7 Experiments

We have conducted extensive experiments on different combinations of loss functions, constraint sets, and real-life datasets (Table 2). Here, we only report two main sets of experiments: the empirical validation of our theoretical results in terms of convergence rates (Section 7.1) and the comparison of various optimization algorithms in terms of actual run times (Section 7.2). We refer the interested reader to our technical report for additional experiments (Rector-Brooks, Wang, and Mozafari 2018).

For classification and regression, we used the logistic and quadratic loss functions. For matrix completion, we used the observed quadratic loss (Freund, Grigas, and Mazumder 2017), defined as f(X) = Σ_{(i,j) ∈ P(M)} (X_{i,j} − M_{i,j})^2, where X is the estimated matrix, M is the observed matrix, and P(M) = {(i, j) : M_{i,j} is observed}. As a non-convex, but strictly-locally-quasi-convex loss, we also used the squared sigmoid loss with ϕ(z) = (1 + exp(−z))^{−1} (Hazan, Levy, and Shalev-Shwartz 2015) for classification. For robust regression, we used the bi-weight loss (Belagiannis et al. 2015) as a non-convex (but smooth) loss: ψ(f(x_i), y_i) = (f(x_i) − y_i)^2 / (1 + (f(x_i) − y_i)^2).

For regression, we used the YearPredictionMSD dataset (500K observations, 90 features) (Lichman 2013). For classification, we used the Adult dataset (49K observations, 14 features) (Lichman 2013). For matrix completion, we used the MovieLens dataset (1M movie ratings from 6,040 users on 3,900 movies) (Harper and Konstan 2016).

7.1 Empirical Validation of Convergence Rates

We ran several experiments to empirically validate our convergence results. In particular, we studied the performance of Primal Averaging (PA) and standard FW with line search (FWLS), with both l_2 and Schatten-2 norm balls as our strongly convex constraint sets. Theorem 1 guarantees a convergence rate of O(1/t^2) for PA when the constraint set is strongly convex and the loss function is convex. We experimented with both l_2 (logistic classifier) and Schatten-2 norm (matrix completion) balls, measuring the loss value at each iteration.
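As a concrete reference for the losses in Table 2, the following is a small sketch of the observed quadratic, bi-weight, and squared sigmoid losses as we read them from the text; the code and argument conventions (e.g., passing the observation set P(M) as a boolean mask) are our own illustrative assumptions, not from the paper.

    import numpy as np

    def observed_quadratic_loss(X, M, observed_mask):
        # f(X) = sum over observed (i, j) of (X_ij - M_ij)^2, with P(M) given as a boolean mask.
        diff = (X - M)[observed_mask]
        return float(np.sum(diff ** 2))

    def biweight_loss(pred, y):
        # Smooth, non-convex robust loss psi(z) = z^2 / (1 + z^2) with z = pred - y, summed over samples.
        z2 = (pred - y) ** 2
        return float(np.sum(z2 / (1.0 + z2)))

    def squared_sigmoid_loss(scores, y):
        # Strictly-locally-quasi-convex loss (1/m) * sum_i (y_i - sigmoid(score_i))^2.
        phi = 1.0 / (1.0 + np.exp(-scores))
        return float(np.mean((y - phi) ** 2))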
As shown in Figure 1a, a slope of −2.4 confirms Theorem 1's guarantee, which predicts a slope of −2 or steeper. Theorem 3 shows that FWLS converges to the global minimum at the rate of O(min(1/t^{1/3}, 1/t^{1/2})) when the constraint set is strongly convex and the loss function is strictly-locally-quasi-convex.

Figure 2: PA versus (a) other FW variants, (b) gradient descent, and (c) stochastic gradient descent.

We investigated this result with the squared sigmoid loss and an l_2 norm constraint. Figure 1b exhibits our results, showing a slope of −2.2, a finding better than the worst-case bound given by Theorem 3, i.e., a slope of −0.5 (see our technical report (Rector-Brooks, Wang, and Mozafari 2018) for a detailed discussion).

From Theorem 4, we expect FWLS to converge to a stationary point of a (smooth) non-convex function at a rate of O(1/t) when constrained to a strongly convex set. Using the bi-weight loss and an l_2 norm constraint, we measured the loss value at each iteration. As shown in Figure 1c, the results confirmed our theoretical results, showing an even steeper slope (−1.46 instead of −1, since Theorem 4 only provides a worst-case upper bound).

7.2 Comparison of Different Optimization Algorithms

To compare the actual performance of various optimization algorithms, we measure the run times, instead of the number of iterations to convergence, in order to account for the time spent in each iteration. In Figure 2, dotted vertical lines mark the convergence points of the various algorithms. First, we compared all three variants of FW: PA, standard FW with predefined learning rate (FWPLR), defined in Algorithm 1 with option A, and standard FW with line search (FWLS), defined in Algorithm 1 with option B. All methods were tested on a regression task (quadratic loss) with an l_2 norm ball constraint. As shown in Figure 2a, PA converged 3.7× and 5.6× faster than FWPLR and FWLS, respectively. This considerable speedup has significant ramifications in practice. Traditionally, practitioners have shied away from PA due to its slower iterations, while its convergence rate was believed to be the same as that of the more efficient variants (Lan 2013). However, as proven in Section 4, PA does converge in fewer iterations.

We also compared the run time of PA versus projected gradient descent (a regression task with a quadratic loss). We compared their deterministic versions in Figure 2b, where PA converged significantly faster (7.7×), as expected. For a fair comparison of their stochastic versions, Stochastic Primal Averaging (SPA) and Stochastic Gradient Descent (SGD), we considered two cases: an l_2 constraint (which has an efficient projection) and an l_1.5 constraint (which has a costly projection). As expected, for the efficient projection, SGD converged 4.6× faster than SPA (Figure 2c), and when the projection was costly, SPA converged 25.1× faster (see (Rector-Brooks, Wang, and Mozafari 2018) for detailed plots).

8 Conclusion

In this paper, we revisited an important class of optimization techniques, FW methods, and offered new insight into their convergence properties for strongly convex constraint sets, which are quite common in machine learning. Specifically, we discovered that, for convex functions, a non-conventional variant of FW (i.e., Primal Averaging) converges significantly faster than the commonly used variants of FW with high probability. We also showed that PA's O(1/t^2) convergence rate more than compensates for its slightly more expensive computational cost at each iteration. We further proved that for strictly-locally-quasi-convex functions, FW can converge to within an ε-neighborhood of the global minimum in O(max(1/ε^2, 1/ε^3)) iterations. Even for non-convex functions, we proved that FW's convergence rate is better, with high probability, than the previously known results in the literature.
These new convergence rates have significant ramifications for practitioners, due to the widespread application of strongly convex norm constraints in classification, regression, matrix completion, and collaborative filtering settings. Finally, we conducted extensive experiments on real-world datasets to validate our theoretical results and investigate our improvement over existing methods. In summary, we showed that PA reduces optimization time by 3.7-5.6× compared to standard FW variants, and by 7.7× compared to projected gradient descent. Our plan is to integrate PA into machine learning libraries, including our BlinkML project (Park et al. 2018).

9 Acknowledgments

This work is in part supported by the National Science Foundation (grants and 55369).

References

Agarwal, N.; Allen-Zhu, Z.; Bullins, B.; Hazan, E.; and Ma, T. 2016. Finding approximate local minima for nonconvex optimization in linear time. arXiv preprint.
Beck, A., and Teboulle, M. 2004. A conditional gradient method with linear rate of convergence for solving convex linear systems. Mathematical Methods of Operations Research 59(2).
Belagiannis, V.; Rupprecht, C.; Carneiro, G.; and Navab, N. 2015. Robust optimization for deep regression. In Proceedings of the IEEE International Conference on Computer Vision.
Carmon, Y.; Duchi, J.; Hinder, O.; and Sidford, A. 2017. Accelerated methods for non-convex optimization. arXiv preprint.
Chari, V.; Lacoste-Julien, S.; Laptev, I.; and Sivic, J. 2015. On pairwise costs for network flow multi-object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Clarkson, K. L. 2010. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Transactions on Algorithms (TALG) 6(4):63.
Demyanov, V. F., and Rubinov, A. M. 1970. Approximate Methods in Optimization Problems. Elsevier Publishing Company.
Dunn, J. C. 1979. Rates of convergence for conditional gradient algorithms near singular and nonsingular extremals. SIAM Journal on Control and Optimization.
Frank, M., and Wolfe, P. 1956. An algorithm for quadratic programming. Naval Research Logistics Quarterly 3(1-2):95-110.
Freund, R. M.; Grigas, P.; and Mazumder, R. 2017. An extended Frank-Wolfe method with in-face directions, and its application to low-rank matrix completion. SIAM Journal on Optimization 27(1).
Garber, D., and Hazan, E. 2015. Faster rates for the Frank-Wolfe method over strongly-convex sets. In International Conference on Machine Learning.
Ge, R.; Huang, F.; Jin, C.; and Yuan, Y. 2015. Escaping from saddle points -- online stochastic gradient for tensor decomposition. CoRR.
Ghadimi, S., and Lan, G. 2016. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Mathematical Programming 156(1-2).
Harchaoui, Z.; Douze, M.; Paulin, M.; Dudik, M.; and Malick, J. 2012. Large-scale image classification with trace-norm regularization. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE.
Harper, F. M., and Konstan, J. A. 2016. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5(4):19.
Hazan, E., and Kale, S. 2012. Projection-free online learning. arXiv preprint.
Hazan, E., et al. 2016. Introduction to online convex optimization. Foundations and Trends in Optimization 2(3-4).
Hazan, E.; Kale, S.; and Warmuth, M. K. 2010. Learning rotations with little regret. In COLT.
Hazan, E.; Levy, K.; and Shalev-Shwartz, S. 2015. Beyond convexity: Stochastic quasi-convex optimization. In Advances in Neural Information Processing Systems.
Jaggi, M.; Sulovský, M.; et al. 2010. A simple algorithm for nuclear norm regularized problems. In Proceedings of the 27th International Conference on Machine Learning (ICML-10).
Jaggi, M. 2011. Sparse convex optimization methods for machine learning. Technical report, ETH Zürich.
Jaggi, M. 2013. Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In ICML (1).
Kim, S., and Xing, E. P. 2010. Tree-guided group lasso for multi-task regression with structured sparsity. In ICML.
Lacoste-Julien, S., and Jaggi, M. 2015. On the global linear convergence of Frank-Wolfe optimization variants. In Advances in Neural Information Processing Systems.
Lacoste-Julien, S.; Jaggi, M.; Schmidt, M.; and Pletscher, P. 2013. Block-coordinate Frank-Wolfe optimization for structural SVMs. In ICML.
Lacoste-Julien, S. 2016. Convergence rate of Frank-Wolfe for non-convex objectives. arXiv preprint.
Lan, G. 2013. The complexity of large-scale convex programming under a linear optimization oracle. arXiv preprint.
Lee, J. D.; Simchowitz, M.; Jordan, M. I.; and Recht, B. 2016. Gradient descent converges to minimizers. In COLT.
Levitin, E. S., and Polyak, B. T. 1966. Constrained minimization methods. USSR Computational Mathematics and Mathematical Physics.
Lichman, M. 2013. UCI machine learning repository.
Nesterov, Y., and Polyak, B. T. 2006. Cubic regularization of Newton method and its global performance. Mathematical Programming 108(1).
Osokin, A.; Alayrac, J.-B.; Lukasewitz, I.; Dokania, P.; and Lacoste-Julien, S. 2016. Minding the gaps for block Frank-Wolfe optimization of structured SVMs. In International Conference on Machine Learning.
Park, Y.; Qing, J.; Shen, X.; and Mozafari, B. 2018. BlinkML: Approximate machine learning with probabilistic guarantees. Technical report, http://web.eecs.umich.edu/~mozafari/php/data/uploads/blinkml_report.pdf.
Rector-Brooks, J.; Wang, J.-K.; and Mozafari, B. 2018. Revisiting projection-free optimization for strongly convex constraint sets. Technical report, http://web.eecs.umich.edu/~mozafari/php/data/uploads/fw_report.pdf.
Reddi, S. J.; Sra, S.; Poczos, B.; and Smola, A. 2016. Stochastic Frank-Wolfe methods for nonconvex optimization. In Allerton.
Shalev-Shwartz, S.; Gonen, A.; and Shamir, O. 2011. Large-scale convex minimization with a low-rank constraint. arXiv preprint.
Sun, W., and Yuan, Y.-X. 2006. Optimization Theory and Methods: Nonlinear Programming, volume 1. Springer Science & Business Media.
Wang, Y.-X.; Sadhanala, V.; Dai, W.; Neiswanger, W.; Sra, S.; and Xing, E. 2016. Parallel and distributed block-coordinate Frank-Wolfe algorithms. In International Conference on Machine Learning.
Yu, Y.; Zhang, X.; and Schuurmans, D. 2014. Generalized conditional gradient for structured estimation. arXiv preprint.


More information

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems MATHEMATICS OF OPERATIONS RESEARCH Vol. 38, No. 2, May 2013, pp. 209 227 ISSN 0364-765X (prin) ISSN 1526-5471 (online) hp://dx.doi.org/10.1287/moor.1120.0562 2013 INFORMS On Boundedness of Q-Learning Ieraes

More information

Robust estimation based on the first- and third-moment restrictions of the power transformation model

Robust estimation based on the first- and third-moment restrictions of the power transformation model h Inernaional Congress on Modelling and Simulaion, Adelaide, Ausralia, 6 December 3 www.mssanz.org.au/modsim3 Robus esimaion based on he firs- and hird-momen resricions of he power ransformaion Nawaa,

More information

Air Traffic Forecast Empirical Research Based on the MCMC Method

Air Traffic Forecast Empirical Research Based on the MCMC Method Compuer and Informaion Science; Vol. 5, No. 5; 0 ISSN 93-8989 E-ISSN 93-8997 Published by Canadian Cener of Science and Educaion Air Traffic Forecas Empirical Research Based on he MCMC Mehod Jian-bo Wang,

More information

Problem Set 5. Graduate Macro II, Spring 2017 The University of Notre Dame Professor Sims

Problem Set 5. Graduate Macro II, Spring 2017 The University of Notre Dame Professor Sims Problem Se 5 Graduae Macro II, Spring 2017 The Universiy of Nore Dame Professor Sims Insrucions: You may consul wih oher members of he class, bu please make sure o urn in your own work. Where applicable,

More information

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017 Two Popular Bayesian Esimaors: Paricle and Kalman Filers McGill COMP 765 Sep 14 h, 2017 1 1 1, dx x Bel x u x P x z P Recall: Bayes Filers,,,,,,, 1 1 1 1 u z u x P u z u x z P Bayes z = observaion u =

More information

SUFFICIENT CONDITIONS FOR EXISTENCE SOLUTION OF LINEAR TWO-POINT BOUNDARY PROBLEM IN MINIMIZATION OF QUADRATIC FUNCTIONAL

SUFFICIENT CONDITIONS FOR EXISTENCE SOLUTION OF LINEAR TWO-POINT BOUNDARY PROBLEM IN MINIMIZATION OF QUADRATIC FUNCTIONAL HE PUBLISHING HOUSE PROCEEDINGS OF HE ROMANIAN ACADEMY, Series A, OF HE ROMANIAN ACADEMY Volume, Number 4/200, pp 287 293 SUFFICIEN CONDIIONS FOR EXISENCE SOLUION OF LINEAR WO-POIN BOUNDARY PROBLEM IN

More information

Class Meeting # 10: Introduction to the Wave Equation

Class Meeting # 10: Introduction to the Wave Equation MATH 8.5 COURSE NOTES - CLASS MEETING # 0 8.5 Inroducion o PDEs, Fall 0 Professor: Jared Speck Class Meeing # 0: Inroducion o he Wave Equaion. Wha is he wave equaion? The sandard wave equaion for a funcion

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

EKF SLAM vs. FastSLAM A Comparison

EKF SLAM vs. FastSLAM A Comparison vs. A Comparison Michael Calonder, Compuer Vision Lab Swiss Federal Insiue of Technology, Lausanne EPFL) michael.calonder@epfl.ch The wo algorihms are described wih a planar robo applicaion in mind. Generalizaion

More information

Estimation of Poses with Particle Filters

Estimation of Poses with Particle Filters Esimaion of Poses wih Paricle Filers Dr.-Ing. Bernd Ludwig Chair for Arificial Inelligence Deparmen of Compuer Science Friedrich-Alexander-Universiä Erlangen-Nürnberg 12/05/2008 Dr.-Ing. Bernd Ludwig (FAU

More information

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8)

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8) I. Definiions and Problems A. Perfec Mulicollineariy Econ7 Applied Economerics Topic 7: Mulicollineariy (Sudenmund, Chaper 8) Definiion: Perfec mulicollineariy exiss in a following K-variable regression

More information

CS376 Computer Vision Lecture 6: Optical Flow

CS376 Computer Vision Lecture 6: Optical Flow CS376 Compuer Vision Lecure 6: Opical Flow Qiing Huang Feb. 11 h 2019 Slides Credi: Krisen Grauman and Sebasian Thrun, Michael Black, Marc Pollefeys Opical Flow mage racking 3D compuaion mage sequence

More information

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic

More information

ON THE BEAT PHENOMENON IN COUPLED SYSTEMS

ON THE BEAT PHENOMENON IN COUPLED SYSTEMS 8 h ASCE Specialy Conference on Probabilisic Mechanics and Srucural Reliabiliy PMC-38 ON THE BEAT PHENOMENON IN COUPLED SYSTEMS S. K. Yalla, Suden Member ASCE and A. Kareem, M. ASCE NaHaz Modeling Laboraory,

More information

Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization

Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization Dual Averaging Mehods for Regularized Sochasic Learning and Online Opimizaion Lin Xiao Microsof Research Microsof Way Redmond, WA 985, USA lin.xiao@microsof.com Revised March 8, Absrac We consider regularized

More information

Information Geometry of Contrastive Divergence

Information Geometry of Contrastive Divergence Informaion Geomery of Conrasive Divergence hoaro Akaho AIT Tsukuba, 305-8568 Japan Kazuya Takabaake AIT Tsukuba, 305-8568 Japan Absrac The conrasive divergence(cd) mehod proposed by Hinon finds an approximae

More information

Tom Heskes and Onno Zoeter. Presented by Mark Buller

Tom Heskes and Onno Zoeter. Presented by Mark Buller Tom Heskes and Onno Zoeer Presened by Mark Buller Dynamic Bayesian Neworks Direced graphical models of sochasic processes Represen hidden and observed variables wih differen dependencies Generalize Hidden

More information

Mean-square Stability Control for Networked Systems with Stochastic Time Delay

Mean-square Stability Control for Networked Systems with Stochastic Time Delay JOURNAL OF SIMULAION VOL. 5 NO. May 7 Mean-square Sabiliy Conrol for Newored Sysems wih Sochasic ime Delay YAO Hejun YUAN Fushun School of Mahemaics and Saisics Anyang Normal Universiy Anyang Henan. 455

More information

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance.

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance. 1 An Inroducion o Backward Sochasic Differenial Equaions (BSDEs) PIMS Summer School 2016 in Mahemaical Finance June 25, 2016 Chrisoph Frei cfrei@ualbera.ca This inroducion is based on Touzi [14], Bouchard

More information

Vectorautoregressive Model and Cointegration Analysis. Time Series Analysis Dr. Sevtap Kestel 1

Vectorautoregressive Model and Cointegration Analysis. Time Series Analysis Dr. Sevtap Kestel 1 Vecorauoregressive Model and Coinegraion Analysis Par V Time Series Analysis Dr. Sevap Kesel 1 Vecorauoregression Vecor auoregression (VAR) is an economeric model used o capure he evoluion and he inerdependencies

More information

Model Reduction for Dynamical Systems Lecture 6

Model Reduction for Dynamical Systems Lecture 6 Oo-von-Guericke Universiä Magdeburg Faculy of Mahemaics Summer erm 07 Model Reducion for Dynamical Sysems ecure 6 v eer enner and ihong Feng Max lanck Insiue for Dynamics of Complex echnical Sysems Compuaional

More information

An Introduction to Stochastic Programming: The Recourse Problem

An Introduction to Stochastic Programming: The Recourse Problem An Inroducion o Sochasic Programming: he Recourse Problem George Danzig and Phil Wolfe Ellis Johnson, Roger Wes, Dick Cole, and Me John Birge Where o look in he ex pp. 6-7, Secion.2.: Inroducion o sochasic

More information

arxiv: v4 [stat.ml] 14 Jun 2018

arxiv: v4 [stat.ml] 14 Jun 2018 Projecion-Free Online Opimizaion wih Sochasic Gradien: From Convexiy o Submodulariy Lin Chen Chrisopher Harshaw 3 Hamed Hassani 4 Amin Karbasi arxiv:80.0883v4 [sa.ml 4 Jun 08 Absrac Online opimizaion has

More information

Accelerated Distributed Nesterov Gradient Descent for Convex and Smooth Functions

Accelerated Distributed Nesterov Gradient Descent for Convex and Smooth Functions 07 IEEE 56h Annual Conference on Decision and Conrol (CDC) December -5, 07, Melbourne, Ausralia Acceleraed Disribued Neserov Gradien Descen for Convex and Smooh Funcions Guannan Qu, Na Li Absrac This paper

More information

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011 Mainenance Models Prof Rober C Leachman IEOR 3, Mehods of Manufacuring Improvemen Spring, Inroducion The mainenance of complex equipmen ofen accouns for a large porion of he coss associaed wih ha equipmen

More information

On-line Adaptive Optimal Timing Control of Switched Systems

On-line Adaptive Optimal Timing Control of Switched Systems On-line Adapive Opimal Timing Conrol of Swiched Sysems X.C. Ding, Y. Wardi and M. Egersed Absrac In his paper we consider he problem of opimizing over he swiching imes for a muli-modal dynamic sysem when

More information

Mean Square Projection Error Gradient-based Variable Forgetting Factor FAPI

Mean Square Projection Error Gradient-based Variable Forgetting Factor FAPI 3rd Inernaional Conference on Advances in Elecrical and Elecronics Engineering (ICAEE'4) Feb. -, 4 Singapore Mean Square Projecion Error Gradien-based Variable Forgeing Facor FAPI Young-Kwang Seo, Jong-Woo

More information

2. Nonlinear Conservation Law Equations

2. Nonlinear Conservation Law Equations . Nonlinear Conservaion Law Equaions One of he clear lessons learned over recen years in sudying nonlinear parial differenial equaions is ha i is generally no wise o ry o aack a general class of nonlinear

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information