A Decentralized Second-Order Method with Exact Linear Convergence Rate for Consensus Optimization
Aryan Mokhtari, Wei Shi, Qing Ling, and Alejandro Ribeiro

Abstract—This paper considers decentralized consensus optimization problems where different summands of a global objective function are available at nodes of a network that can communicate with neighbors only. The proximal method of multipliers is considered as a powerful tool that relies on proximal primal descent and dual ascent updates on a suitably defined augmented Lagrangian. The structure of the augmented Lagrangian makes this problem non-decomposable, which precludes distributed implementations. This problem is regularly addressed by the use of the alternating direction method of multipliers. The exact second order method (ESOM) is introduced here as an alternative that relies on: (i) The use of a separable quadratic approximation of the augmented Lagrangian. (ii) A truncated Taylor's series to estimate the solution of the first order condition imposed on the minimization of the quadratic approximation of the augmented Lagrangian. The sequences of primal and dual variables generated by ESOM are shown to converge linearly to their optimal arguments when the aggregate cost function is strongly convex and its gradients are Lipschitz continuous. Numerical results demonstrate advantages of ESOM relative to decentralized alternatives in solving least squares and logistic regression problems.

Index Terms—Multi-agent networks, decentralized optimization, method of multipliers, linear convergence, second-order methods

I. INTRODUCTION

In decentralized consensus optimization problems, components of a global objective function that is to be minimized are available at different nodes of a network. Formally, consider a decision variable x̃ ∈ R^p and a connected network containing n nodes where each node i has access to a local objective function f_i : R^p → R. Nodes can exchange information with neighbors only and try to minimize the global cost function Σ_{i=1}^n f_i(x̃),

    x̃* := argmin_{x̃ ∈ R^p} Σ_{i=1}^n f_i(x̃).    (1)
We assume that the local objective functions f_i(x̃) are strongly convex. The global objective function Σ_{i=1}^n f_i(x̃), which is the sum of a set of strongly convex functions, is also strongly convex. Problems like (1) arise in decentralized control [1]–[3], wireless communication [4], [5], sensor networks [6]–[8], and large scale machine learning [9]–[11]. Decentralized methods for solving (1) can be divided into two classes: primal domain methods and dual domain methods.

(Work supported by NSF CAREER CCF, ONR N, and NSFC. A. Mokhtari and A. Ribeiro are with the Dept. of Electrical and Systems Engineering, University of Pennsylvania, 200 S 33rd St., Philadelphia, PA. {aryanm, aribeiro}@seas.upenn.edu. W. Shi is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 W Main St, Urbana, IL. wilburs@illinois.edu. Q. Ling is with the Dept. of Automation, University of Science and Technology of China, 96 Jinzhao Rd., Hefei, Anhui, China. qingling@mail.ustc.edu.cn.)

Decentralized gradient descent (DGD) is a well-established primal method that implements gradient descent on a penalized version of (1) whose gradient can be separated into per-node components. Network Newton (NN) is a more recent alternative that accelerates convergence of DGD by incorporating second order information of the penalized objective [12], [13]. Both DGD and NN converge to a neighborhood of the optimal argument x̃* when using a constant stepsize and converge sublinearly to the exact optimal argument if using a diminishing stepsize. Dual domain methods build on the fact that the dual function of (1) has a gradient with separable structure. The use of plain dual gradient descent is possible but generally slow to converge [14]–[16]. In centralized optimization, better convergence speeds are attained by the method of multipliers (MM) that adds a quadratic augmentation term to the Lagrangian [17], [18], or the proximal (P)MM that adds an additional term to keep iterates close.
In either case, the quadratic term that is added to construct the augmented Lagrangian makes distributed computation of primal gradients impossible. This issue is most often overcome with the use of decentralized (D) versions of the alternating direction method of multipliers (ADMM) [6], [19], [20]. Besides the ADMM, other methods that use different alternatives to approximate the gradients of the dual function have also been proposed [21]–[27]. The convergence rates of these methods have not been studied except for the DADMM and its variants, which are known to converge linearly to the optimal argument when the local functions are strongly convex and their gradients are Lipschitz continuous [20], [28], [29]. An important observation here is that while all of these methods try to approximate the MM or the PMM, the performance penalty entailed by the approximation has not been studied.

This paper introduces the exact second order method (ESOM), which uses quadratic approximations of the augmented Lagrangians of (1) and leads to a set of separable subproblems. Similar to other second order methods, implementation of ESOM requires computation of Hessian inverses. Distributed implementation of this operation is infeasible because while the Hessian of the proximal augmented Lagrangian is neighbor sparse, its inverse is not. ESOM resolves this issue by using the Hessian inverse approximation technique introduced in [12], [13], [30]. This technique consists of truncating the Taylor's series of the Hessian inverse at order K to obtain the family of methods ESOM-K. Implementation of this expansion in terms of local operations is possible. A remarkable property of all ESOM-K methods is that they can be shown to pay a performance penalty relative to (centralized) PMM that vanishes with increasing iterations.

We begin the paper by reformulating (1) in a form more suitable for decentralized implementation (Proposition 1) and
proceed to describe the PMM (Section II). ESOM is a variation of PMM that substitutes the proximal augmented Lagrangian with its quadratic approximation (Section III). Implementation of ESOM requires computing the inverse of the Hessian of the proximal augmented Lagrangian. Since this inversion cannot be computed using local and neighboring information, ESOM-K approximates the Hessian inverse with the K-th order truncation of the Taylor's series expansion of the Hessian inverse. This expansion can be carried out using an inner loop of local operations. This and other details required for decentralized implementation of ESOM-K are discussed in Section III-A, along with a discussion of how ESOM can be interpreted as a saddle point generalization of the Network Newton methods proposed in [12], [13] (Remark 1) or a second order version of the EXTRA method proposed in [31] (Remark 2). Convergence analyses of PMM and ESOM are then presented (Section IV). Linear convergence of PMM is established (Section IV-A) and linear convergence factors are explicitly derived to use as benchmarks (Theorem 1). In the ESOM analysis (Section IV-B) we provide an upper bound for the error of the proximal augmented Lagrangian approximation (Lemma 3). We leverage this result to prove linear convergence of ESOM (Theorem 2) and to show that ESOM's linear convergence factor approaches the corresponding PMM factor as time grows (Section IV-C). This indicates that the convergence paths of (distributed) ESOM-K and (centralized) PMM are very close. We also study the dependency of the convergence constant on the algorithm's order K. ESOM tradeoffs and comparisons with other decentralized methods for solving consensus optimization problems are illustrated in numerical experiments (Section V) for a decentralized least squares problem (Section V-A) and a decentralized logistic regression classification problem (Section V-B). Numerical results in both settings verify that larger K leads to faster convergence in terms of number of iterations.
However, we observe that all versions of ESOM-K exhibit similar convergence rates in terms of the number of communication exchanges. This implies that ESOM-0 is preferable with respect to the latter metric and that larger K is justified when computational cost is of interest. Faster convergence relative to EXTRA, Network Newton, and DADMM is observed. We close the paper with concluding remarks (Section VI).

Notation. Vectors are written as x ∈ R^n and matrices as A ∈ R^{n×n}. Given n vectors x_i, the vector x = [x_1; ...; x_n] represents a stacking of the elements of each individual x_i. We use ‖x‖ and ‖A‖ to denote the Euclidean norm of vector x and matrix A, respectively. The norm of vector x with respect to a positive definite matrix A is ‖x‖_A := (x^T A x)^{1/2}. Given a function f, its gradient at x is denoted as ∇f(x) and its Hessian as ∇²f(x).

II. PROXIMAL METHOD OF MULTIPLIERS

Let x_i ∈ R^p be a copy of the decision variable x̃ kept at node i and define N_i as the neighborhood of node i. Assuming the network is bidirectionally connected, the optimization problem in (1) is equivalent to the program

    {x_i*}_{i=1}^n := argmin_{ {x_i}_{i=1}^n } Σ_{i=1}^n f_i(x_i),   s.t.  x_i = x_j, for all i, j ∈ N_i.    (2)

Indeed, the constraint in (2) enforces the consensus condition x_1 = ... = x_n for any feasible point of (2). With this condition satisfied, the objective in (2) is equal to the objective function in (1), from where it follows that the optimal local variables x_i* are all equal to the optimal argument x̃* of (1), i.e., x_1* = ... = x_n* = x̃*.

To derive ESOM, define x := [x_1; ...; x_n] ∈ R^{np} as the concatenation of the local decision variables x_i and the aggregate function f : R^{np} → R as f(x) = f(x_1, ..., x_n) := Σ_{i=1}^n f_i(x_i), the sum of all the local functions f_i(x_i). Introduce the matrix W ∈ R^{n×n} with elements w_ij ≥ 0 representing a weight that node i assigns to variables of node j. The weight w_ij = 0 if and only if j ∉ N_i ∪ {i}. The matrix W is further required to satisfy

    W^T = W,   W1 = 1,   null(I − W) = span(1).    (3)

The first condition implies that the weights are symmetric, i.e., w_ij = w_ji.
The second condition ensures that the weights of a given node sum up to 1, i.e., Σ_{j=1}^n w_ij = 1 for all i. Since W1 = 1 we have that I − W is rank deficient. The last condition null(I − W) = span(1) makes the rank of I − W exactly equal to n − 1 [32]. The matrix W can be used to reformulate (2) as we show in the following proposition.

Proposition 1. Define the matrix Z := W ⊗ I_p ∈ R^{np×np} as the Kronecker product of the weight matrix W and the identity matrix I_p, and consider the definitions of the global vector x := [x_1; ...; x_n] and aggregate function f(x) := Σ_{i=1}^n f_i(x_i). The optimization problem in (2) is equivalent to

    x* = argmin_{x ∈ R^{np}} f(x)   s.t.   (I − Z)^{1/2} x = 0.    (4)

I.e., x* = [x_1*; ...; x_n*] with {x_i*}_{i=1}^n the solution of (2).

Proof: We just show that the constraint ((I_n − W) ⊗ I_p)x = (I_{np} − Z)x = 0 is also a consensus constraint. To do so, begin by noticing that since I − W is positive semidefinite, I − Z = (I − W) ⊗ I_p is also positive semidefinite. Therefore, the null space of the square root matrix (I − Z)^{1/2} is equal to the null space of I − Z and we conclude that satisfying the condition (I − Z)^{1/2} x = 0 is equivalent to the consensus condition x_1 = ... = x_n. This observation in conjunction with the definition of the aggregate function f(x) = Σ_{i=1}^n f_i(x_i) shows that the programs in (4) and (2) are equivalent. In particular, the optimal solution of (4) is x* = [x_1*; ...; x_n*] with {x_i*}_{i=1}^n the solution of (2). ∎

The formulation in (4) is used to define the proximal method of multipliers (PMM) that we consider in this paper. To do so, introduce dual variables v ∈ R^{np} to define the augmented Lagrangian L(x, v) of (4) as

    L(x, v) = f(x) + v^T (I − Z)^{1/2} x + (α/2) x^T (I − Z) x,    (5)

where α is a positive constant. Given the properties of the matrix Z, the augmentation term (α/2) x^T (I − Z) x is null when the
variable x is a feasible solution of (4). Otherwise, the inner product is positive and behaves as a penalty for the violation of the consensus constraint.

Introduce now a time index t ∈ N and define x_t and v_t as primal and dual iterates at step t. The primal variable x_{t+1} is updated by minimizing the sum of the augmented Lagrangian in (5) and the proximal term (ε/2)‖x − x_t‖². We then have that

    x_{t+1} = argmin_{x ∈ R^{np}} { L(x, v_t) + (ε/2) ‖x − x_t‖² },    (6)

where the proximal coefficient ε > 0 is a strictly positive constant. The dual variable v is updated by ascending through the gradient of the augmented Lagrangian with respect to the dual variable, ∇_v L(x_{t+1}, v_t), with stepsize α,

    v_{t+1} = v_t + α (I − Z)^{1/2} x_{t+1}.    (7)

The updates in (6) and (7) for PMM can be considered as a generalization of the method of multipliers (MM), because setting the proximal coefficient ε = 0 recovers the updates of MM. The proximal term (ε/2)‖x − x_t‖² is added to keep the updated variable x_{t+1} close to the previous iterate x_t. This does not affect convergence guarantees but improves computational stability.

The primal update in (6) may be computationally costly because it requires solving a convex program, and it cannot be implemented in a decentralized manner because the augmentation term (α/2) x^T (I − Z) x in (5) is not separable. In the following section we propose an approximation of PMM that makes the minimization in (6) computationally economic and separable over nodes of the network. This leads to the set of decentralized updates that define the ESOM algorithm.

III. ESOM: EXACT SECOND-ORDER METHOD

To reduce the computational complexity of (6) and obtain a separable update we introduce a second order approximation of the augmented Lagrangian in (5). Consider then the second order Taylor's expansion L(x, v_t) ≈ L(x_t, v_t) + ∇_x L(x_t, v_t)^T (x − x_t) + (1/2)(x − x_t)^T ∇²_x L(x_t, v_t)(x − x_t) of the augmented Lagrangian with respect to x centered around (x_t, v_t). Using this approximation in lieu of L(x, v_t) in (6) leads to the primal update

    x_{t+1} = argmin_{x ∈ R^{np}} { L(x_t, v_t) + ∇_x L(x_t, v_t)^T (x − x_t) + (1/2)(x − x_t)^T ( ∇²_x L(x_t, v_t) + εI ) (x − x_t) }.    (8)
The minimization in the right hand side of (8) is of a positive definite quadratic form. Thus, upon defining the Hessian matrix H_t ∈ R^{np×np} as

    H_t := ∇²f(x_t) + α(I − Z) + εI,    (9)

and considering the explicit form of the augmented Lagrangian gradient ∇_x L(x_t, v_t) [cf. (5)], it follows that the variable x_{t+1} in (8) is given by

    x_{t+1} = x_t − H_t^{-1} [ ∇f(x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_t ].    (10)

A fundamental observation here is that the matrix H_t, which is the Hessian of the objective function in (8), is block neighbor sparse. By block neighbor sparse we mean that the (i, j)-th block is non-zero if and only if j ∈ N_i or j = i. To confirm this claim, observe that ∇²f(x_t) ∈ R^{np×np} is a block diagonal matrix where its i-th diagonal block is the Hessian of the i-th local function, ∇²f_i(x_{i,t}) ∈ R^{p×p}. Additionally, the matrix εI_{np} is a diagonal matrix, which implies that the term ∇²f(x_t) + εI_{np} is a block diagonal matrix with blocks ∇²f_i(x_{i,t}) + εI_p. Further, it follows from the definition of the matrix Z that the matrix I − Z is neighbor sparse. Therefore, the Hessian H_t is also neighbor sparse.

Although the Hessian H_t is neighbor sparse, its inverse H_t^{-1} is not. This observation leads to the conclusion that the update in (10) is not implementable in a decentralized manner, i.e., nodes cannot implement (10) by exchanging information only with their neighbors. To resolve this issue, we use a Hessian inverse approximation that is built on truncating the Taylor's series of the Hessian inverse H_t^{-1} as in [12]. To do so, we decompose the Hessian as H_t = D_t − B, where D_t is a block diagonal positive definite matrix and B is a neighbor sparse positive semidefinite matrix. In particular, define D_t as

    D_t := ∇²f(x_t) + εI + 2α(I − Z_d),    (11)

where Z_d := diag(Z). Observing the definitions of the matrices H_t and D_t and considering the relation B = D_t − H_t, we conclude that B is given by

    B := α (I − 2Z_d + Z).    (12)

Notice that using the decomposition H_t = D_t − B and by factoring D_t^{1/2}, the Hessian inverse can be written as H_t^{-1} = D_t^{-1/2} ( I − D_t^{-1/2} B D_t^{-1/2} )^{-1} D_t^{-1/2}.
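As a sanity check on the splitting above, the identity H_t^{-1} = D_t^{-1/2}(I − D_t^{-1/2} B D_t^{-1/2})^{-1} D_t^{-1/2} can be verified numerically. The sketch below is not from the paper's experiments: the 4-node cycle, lazy Metropolis weights, and the values of α and ε are illustrative assumptions, with p = 1 and quadratic local costs so that ∇²f = I.

```python
import numpy as np

# Hypothetical 4-node cycle with lazy Metropolis weights:
# w_ii = 1/2 and w_ij = 1/4 for each of the two neighbors of node i.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
W = 0.5 * np.eye(4) + 0.25 * A             # symmetric, doubly stochastic, satisfies (3)
Z, Zd = W, np.diag(np.diag(W))             # p = 1, so Z = W

alpha, eps = 1.0, 0.1                      # illustrative parameter choices
hess_f = np.eye(4)                         # Hessian of f for f_i(x) = (x - a_i)^2 / 2

H = hess_f + alpha * (np.eye(4) - Z) + eps * np.eye(4)         # Eq. (9)
D = hess_f + eps * np.eye(4) + 2 * alpha * (np.eye(4) - Zd)    # Eq. (11)
B = alpha * (np.eye(4) - 2 * Zd + Z)                           # Eq. (12)
assert np.allclose(H, D - B)               # the splitting H_t = D_t - B

# D is (block) diagonal, so D^{-1/2} is cheap and locally computable.
D_is = np.diag(1.0 / np.sqrt(np.diag(D)))
E = D_is @ B @ D_is
assert np.linalg.norm(E, 2) < 1            # so the Neumann series of (I - E)^{-1} converges
assert np.allclose(np.linalg.inv(H),
                   D_is @ np.linalg.inv(np.eye(4) - E) @ D_is)
```

Because ‖D_t^{-1/2} B D_t^{-1/2}‖ < 1 in this setup, the inverse in the middle factor admits a convergent series expansion, which is what the truncation used by ESOM exploits.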
Observe that the inverse matrix (I − D_t^{-1/2} B D_t^{-1/2})^{-1} can be substituted by its Taylor's series Σ_{u=0}^∞ (D_t^{-1/2} B D_t^{-1/2})^u; however, computation of the series requires global communication, which is not affordable in decentralized settings. Thus, we approximate the Hessian inverse H_t^{-1} by truncating to the first K + 1 terms of its Taylor's series, which leads to the Hessian inverse approximation H̃_t^{-1}(K),

    H̃_t^{-1}(K) := D_t^{-1/2} Σ_{u=0}^{K} ( D_t^{-1/2} B D_t^{-1/2} )^u D_t^{-1/2}.    (13)

Notice that the approximate Hessian inverse H̃_t^{-1}(K) is K-hop block neighbor sparse, i.e., the (i, j)-th block is nonzero if and only if there is at least one path between nodes i and j with length K or smaller.

We introduce the Exact Second-Order Method (ESOM) as a second order method for solving decentralized optimization problems which substitutes the Hessian inverse in update (10) by its K-hop block neighbor sparse approximation H̃_t^{-1}(K) defined in (13). Therefore, the primal update of ESOM is

    x_{t+1} = x_t − H̃_t^{-1}(K) [ ∇f(x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_t ].    (14)

The ESOM dual update is identical to the update in (7),

    v_{t+1} = v_t + α (I − Z)^{1/2} x_{t+1}.    (15)

Notice that ESOM differs from PMM in approximating the augmented Lagrangian in the primal update of PMM by a second order approximation. Further, ESOM approximates the Hessian
inverse of the augmented Lagrangian by truncating the Taylor's series of the Hessian inverse, which is not necessarily neighbor sparse. In the following subsection we study the implementation details of the updates in (14) and (15).

A. Decentralized implementation of ESOM

The updates in (14) and (15) show that ESOM is a second order approximation of PMM. Although these updates are necessary for understanding the rationale behind ESOM, they are not implementable in a decentralized fashion since the matrix (I − Z)^{1/2} is not neighbor sparse. To resolve this issue, define the sequence of variables q_t as q_t := (I − Z)^{1/2} v_t. Considering the definition of q_t, the primal update in (14) can be written as

    x_{t+1} = x_t − H̃_t^{-1}(K) ( ∇f(x_t) + q_t + α(I − Z) x_t ).    (16)

By multiplying the dual update in (15) by (I − Z)^{1/2} from the left hand side and using the definition q_t := (I − Z)^{1/2} v_t, we obtain that

    q_{t+1} = q_t + α (I − Z) x_{t+1}.    (17)

Notice that the system of updates in (16) and (17) is equivalent to the updates in (14) and (15), i.e., the sequences of variables x_t generated by them are identical. Nodes can implement the primal-dual updates in (16) and (17) in a decentralized manner, since the square root matrix (I − Z)^{1/2} is eliminated from the updates and nodes can compute the products (I − Z) x_t and (I − Z) x_{t+1} by exchanging information with their neighbors.

To characterize the local update of each node for implementing the updates in (16) and (17), define

    g_t := ∇_x L(x_t, v_t) = ∇f(x_t) + q_t + α(I − Z) x_t,    (18)

as the gradient of the augmented Lagrangian in (5). Further, define the primal descent direction d_t(K) with K levels of approximation as

    d_t(K) := −H̃_t^{-1}(K) g_t,    (19)

which implies that the update in (16) can be written as x_{t+1} = x_t + d_t(K). Based on the mechanism of the Hessian inverse approximation in (13), the descent directions d_t(k) and d_t(k+1) satisfy the condition

    d_t(k+1) = D_t^{-1} B d_t(k) − D_t^{-1} g_t.    (20)

Define d_{i,t}(k) as the descent direction of node i at step t, which is the i-th element of the global descent direction d_t(k) = [d_{1,t}(k); ...; d_{n,t}(k)].
Therefore, the localized version of the relation in (20) at node i is given by

    d_{i,t}(k+1) = D_{ii,t}^{-1} [ Σ_{j ∈ N_i ∪ {i}} B_ij d_{j,t}(k) − g_{i,t} ].    (21)

The update in (21) shows that node i can compute its (k+1)-th descent direction d_{i,t}(k+1) if it has access to the k-th descent direction d_{i,t}(k) of itself and its neighbors' directions d_{j,t}(k) for j ∈ N_i. Thus, if nodes initialize with the ESOM-0 descent direction d_{i,t}(0) = −D_{ii,t}^{-1} g_{i,t}, exchange their descent directions with their neighbors for K rounds, and use the update in (21), they can compute their local ESOM-K descent direction d_{i,t}(K).

Algorithm 1 ESOM-K method at node i
Require: Initial iterates x_{i,0} = x_{j,0} = 0 for all j ∈ N_i and q_{i,0} = 0.
 1: B blocks: B_ii = α(1 − w_ii) I and B_ij = α w_ij I
 2: for t = 0, 1, 2, ... do
 3:   D block: D_{ii,t} = ∇²f_i(x_{i,t}) + (2α(1 − w_ii) + ε) I
 4:   Compute g_{i,t} = ∇f_i(x_{i,t}) + q_{i,t} + α(1 − w_ii) x_{i,t} − α Σ_{j ∈ N_i} w_ij x_{j,t}
 5:   Compute ESOM-0 descent direction d_{i,t}(0) = −D_{ii,t}^{-1} g_{i,t}
 6:   for k = 0, ..., K − 1 do
 7:     Exchange d_{i,t}(k) with neighbors j ∈ N_i
 8:     Compute d_{i,t}(k+1) = D_{ii,t}^{-1} [ Σ_{j ∈ N_i ∪ {i}} B_ij d_{j,t}(k) − g_{i,t} ]
 9:   end for
10:   Update primal iterate: x_{i,t+1} = x_{i,t} + d_{i,t}(K).
11:   Exchange iterates x_{i,t+1} with neighbors j ∈ N_i.
12:   Update dual iterate: q_{i,t+1} = q_{i,t} + α(1 − w_ii) x_{i,t+1} − α Σ_{j ∈ N_i} w_ij x_{j,t+1}.
13: end for

Notice that the i-th diagonal block of D_t is given by D_{ii,t} := ∇²f_i(x_{i,t}) + (2α(1 − w_ii) + ε) I, where x_{i,t} is the primal variable of node i at step t. Thus, the block D_{ii,t} is locally available at node i. Moreover, node i can evaluate the blocks B_ii = α(1 − w_ii) I and B_ij = α w_ij I without extra communication. In addition, nodes can compute the gradient g_t by communicating with their neighbors. To confirm this claim, observe that the i-th element of g_t = [g_{1,t}; ...; g_{n,t}] associated with node i is given by

    g_{i,t} := ∇f_i(x_{i,t}) + q_{i,t} + α(1 − w_ii) x_{i,t} − α Σ_{j ∈ N_i} w_ij x_{j,t},    (22)

where q_{i,t} ∈ R^p is the i-th element of q_t = [q_{1,t}; ...; q_{n,t}] and x_{i,t} is the primal variable of node i at step t, and they are both available at node i. Hence, the update in (16) can be implemented in a decentralized manner.
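The recursion in (21) is nothing more than a local evaluation of the truncated series in (13): starting from d_t(0) = −D_t^{-1} g_t, K applications of (20) reproduce d_t(K) = −H̃_t^{-1}(K) g_t. The following minimal check works at the network level; the 4-node cycle, lazy Metropolis weights, parameter values, and the stand-in gradient vector are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Illustrative setup: 4-node cycle, lazy Metropolis weights, p = 1,
# quadratic local costs so the aggregate Hessian is the identity.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
W = 0.5 * np.eye(4) + 0.25 * A
Zd = np.diag(np.diag(W))
alpha, eps, K = 1.0, 0.1, 3

D = np.eye(4) + eps * np.eye(4) + 2 * alpha * (np.eye(4) - Zd)   # Eq. (11)
B = alpha * (np.eye(4) - 2 * Zd + W)                             # Eq. (12)
g = np.array([1.0, -2.0, 0.5, 0.5])       # stand-in for the gradient g_t of Eq. (18)

# Truncated series (13): H~_t^{-1}(K) = D^{-1/2} sum_{u<=K} (D^{-1/2} B D^{-1/2})^u D^{-1/2}
D_is = np.diag(1.0 / np.sqrt(np.diag(D)))
E = D_is @ B @ D_is
Hinv_K = D_is @ sum(np.linalg.matrix_power(E, u) for u in range(K + 1)) @ D_is

# Recursion (20)/(21): d(0) = -D^{-1} g,  d(k+1) = D^{-1} B d(k) - D^{-1} g
D_inv = np.diag(1.0 / np.diag(D))
d = -D_inv @ g
for _ in range(K):
    d = D_inv @ (B @ d) - D_inv @ g

assert np.allclose(d, -Hinv_K @ g)        # K exchanges reproduce the ESOM-K direction
```

In a真 distributed run each node would only evaluate its own row of the recursion, which requires the d_{j,t}(k) of its neighbors; the dense matrix products here simply simulate all nodes at once.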
Likewise, nodes can implement the dual update in (17) using the local update

    q_{i,t+1} = q_{i,t} + α(1 − w_ii) x_{i,t+1} − α Σ_{j ∈ N_i} w_ij x_{j,t+1},    (23)

which requires access to the local primal variables x_{j,t+1} of the neighboring nodes j ∈ N_i. The steps of ESOM-K are summarized in Algorithm 1. The core steps are Steps 5-9, which correspond to computing the ESOM-K primal descent direction d_{i,t}(K). In Step 5, each node computes its initial descent direction d_{i,t}(0) using the block D_{ii,t} and the local gradient g_{i,t} computed in Steps 3 and 4, respectively. Steps 7 and 8 correspond to the recursion in (21). In Step 7, nodes exchange their k-th level descent direction d_{i,t}(k) with their neighboring nodes to compute the (k+1)-th descent direction d_{i,t}(k+1) in Step 8. The outcome of this recursion is the K-th level descent direction d_{i,t}(K), which is required for the update of the primal variable x_{i,t} in Step 10. Notice that the blocks of the neighbor sparse matrix B, which are required for Step 8, are computed and stored in Step 1. After updating the primal variables in Step 10, nodes exchange their updated variables x_{i,t+1} with their neighbors j ∈ N_i in Step 11. By having access to the decision variables of neighboring nodes, nodes update their local dual variables q_{i,t} in Step 12.
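To see the updates (16)-(17) in action end to end, the sketch below simulates them at the network level for a toy decentralized averaging problem with f_i(x) = (x − a_i)²/2, whose exact solution is the mean of the a_i. The network, weights, and parameter choices are illustrative assumptions and not the paper's experimental setup.

```python
import numpy as np

# Toy problem: f_i(x) = (x - a_i)^2 / 2 on a 4-node cycle, so x* = mean(a) = 2.5.
a = np.array([1.0, 2.0, 3.0, 4.0])
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
W = 0.5 * np.eye(4) + 0.25 * A            # lazy Metropolis weights, satisfies (3)
Zd = np.diag(np.diag(W))
alpha, eps, K = 1.0, 0.1, 3               # illustrative parameter choices

I = np.eye(4)
D = I + eps * I + 2 * alpha * (I - Zd)    # Eq. (11); here grad^2 f_i = 1, so D is constant
B = alpha * (I - 2 * Zd + W)              # Eq. (12)
D_inv = np.diag(1.0 / np.diag(D))

x = np.zeros(4)                           # local copies x_i (Algorithm 1 initialization)
q = np.zeros(4)                           # local dual variables q_i

for t in range(300):
    g = (x - a) + q + alpha * (I - W) @ x       # Eq. (18): grad f_i(x_i) = x_i - a_i
    d = -D_inv @ g                              # ESOM-0 direction (Step 5)
    for _ in range(K):                          # K neighbor exchanges (Steps 6-9)
        d = D_inv @ (B @ d) - D_inv @ g
    x = x + d                                   # primal update (Step 10), cf. Eq. (16)
    q = q + alpha * (I - W) @ x                 # dual update (Step 12), cf. Eq. (17)

print(x)   # all entries converge to 2.5, the exact minimizer
```

Unlike DGD with a constant stepsize, the iterates here reach the exact optimizer rather than a neighborhood of it, reflecting the exact convergence property established in Section IV.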
Remark 1. The proposed ESOM algorithm solves problem (4) in the dual domain by defining the proximal augmented Lagrangian. It is also possible to solve problem (4) in the primal domain by solving a penalty version of (4). In particular, by using the quadratic penalty function (1/2)‖·‖² for the constraint (I − Z)^{1/2} x = 0 with penalty coefficient α, we obtain the penalized version of (4),

    x̂* := argmin_{x ∈ R^{np}} f(x) + (α/2) x^T (I − Z) x,    (24)

where x̂* is the optimal argument of the penalized objective function. Notice that x̂* is not equal to the optimal argument x* and the distance ‖x* − x̂*‖ is of the order of O(1/α). The objective function in (24) can be minimized by descending through the gradient descent direction, which leads to the update of decentralized gradient descent (DGD) [33]. The convergence of DGD can be improved by using Newton's method. Notice that the Hessian of the objective function in (24) is given by

    Ĥ := ∇²f(x) + α(I − Z).    (25)

The Hessian Ĥ in (25) is identical to the Hessian H_t in (9) except for the term εI. Therefore, the same technique for approximating the Hessian inverse Ĥ^{-1} can be used to approximate the Newton direction of the penalized objective function in (24), which leads to the update of the Network Newton (NN) methods [12], [13]. Thus, ESOM and NN use an approximate decentralized variation of Newton's method for solving two different problems. In other words, ESOM uses the approximate Newton direction for minimizing the augmented Lagrangian of (4), while NN solves a penalized version of (4) using this approximation. This difference justifies why the sequence of iterates generated by ESOM converges to the optimal argument x* (Section IV), while NN converges to a neighborhood of x*.

Remark 2. ESOM approximates the augmented Lagrangian L(x, v) in (6) by its second order approximation. If we substitute the augmented Lagrangian by its first order approximation we can recover the update of EXTRA proposed in [31]. To be more precise, we can substitute L(x, v_t) by its first order approximation L(x_t, v_t) + ∇_x L(x_t, v_t)^T (x − x_t) near the point (x_t, v_t) to update the primal variable x.
Considering this substitution and the definition of the augmented Lagrangian in (5), it follows that the update for the primal variable x can be written as

    x_{t+1} = x_t − (1/ε) [ ∇f(x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_t ].    (26)

By subtracting the update at step t − 1 from the update at step t and using the dual variables relation v_t = v_{t−1} + α(I − Z)^{1/2} x_t, we obtain the update

    x_{t+1} = ( 2I − (2α/ε)(I − Z) ) x_t − ( I − (α/ε)(I − Z) ) x_{t−1} − (1/ε) ( ∇f(x_t) − ∇f(x_{t−1}) ).    (27)

The update in (27) shows a first-order approximation of the PMM. It is not hard to show that for specific choices of α and ε, the update in (27) is equivalent to the update of EXTRA in [31]. Thus, we expect to observe faster convergence for ESOM relative to EXTRA as it incorporates second-order information. This advantage is studied in Section V.

IV. CONVERGENCE ANALYSIS

In this section we study convergence rates of PMM and ESOM. First, we show that the sequence of iterates x_t generated by PMM converges linearly to the optimal argument x*. Although PMM cannot be implemented in a decentralized fashion, its convergence rate can be used as a benchmark for evaluating the performance of ESOM. We then follow the section by analyzing convergence properties of ESOM. We show that ESOM exhibits a linear convergence rate and compare its factor of linear convergence with the linear convergence factor of PMM. In proving these results we consider the following assumptions.

Assumption 1. The local objective functions f_i(x) are twice differentiable and the eigenvalues of the local objective function Hessians ∇²f_i(x_i) are bounded by positive constants 0 < m ≤ M < ∞, i.e.,

    mI ⪯ ∇²f_i(x_i) ⪯ MI,    (28)

for all x_i ∈ R^p and i = 1, ..., n.

The lower bound in (28) is equivalent to the condition that the local objective functions f_i are strongly convex with constant m > 0. The upper bound for the eigenvalues of the Hessians ∇²f_i implies that the gradients of the local objective functions ∇f_i are Lipschitz continuous with constant M. Notice that the global objective function Hessian ∇²f(x) is a block diagonal matrix where its i-th diagonal block is ∇²f_i(x_i).
Therefore, the bounds on the eigenvalues of the local Hessians ∇²f_i(x_i) in (28) also hold for the global objective function Hessian ∇²f(x). I.e.,

    mI ⪯ ∇²f(x) ⪯ MI,    (29)

for all x ∈ R^{np}. Thus, the global objective function f is also strongly convex with constant m and its gradients ∇f are Lipschitz continuous with constant M.

A. Convergence of Proximal Method of Multipliers (PMM)

The convergence rate of PMM can be considered as a benchmark for the convergence rate of ESOM. To establish linear convergence of PMM, we first study the relationship between the primal x_t and dual v_t iterates generated by PMM and the optimal arguments x* and v* in the following lemma.

Lemma 1. Consider the updates for the proximal method of multipliers in (6) and (7). The sequences of primal and dual iterates generated by PMM satisfy

    v_{t+1} − v_t − α(I − Z)^{1/2} (x_{t+1} − x*) = 0,    (30)

and

    ∇f(x_{t+1}) − ∇f(x*) + (I − Z)^{1/2} (v_{t+1} − v*) + ε(x_{t+1} − x_t) = 0.    (31)

Proof: See Appendix A. ∎

Considering the preliminary results in (30) and (31), we can state convergence results of PMM. To do so, we prove linear convergence of a Lyapunov function of the primal ‖x_t − x*‖²
and dual ‖v_t − v*‖² errors. To be more precise, we define the vector u ∈ R^{2np} and matrix G ∈ R^{2np×2np} as

    u = [ v ; x ],    G = [ I  0 ; 0  αεI ].    (32)

Notice that the sequence u_t is the concatenation of the dual variable v_t and primal variable x_t. Likewise, we can define u* as the concatenation of the optimal arguments v* and x*. We proceed to prove that the sequence ‖u_t − u*‖²_G converges linearly to zero. Observe that ‖u_t − u*‖²_G can be simplified as ‖v_t − v*‖² + αε ‖x_t − x*‖². This observation shows that ‖u_t − u*‖²_G is a Lyapunov function of the primal ‖x_t − x*‖² and dual ‖v_t − v*‖² errors. Therefore, linear convergence of the sequence ‖u_t − u*‖²_G implies linear convergence of the sequence ‖x_t − x*‖². In the following theorem, we show that the sequence ‖u_t − u*‖²_G converges to zero at a linear rate.

Theorem 1. Consider the proximal method of multipliers as introduced in (6) and (7). Consider β > 1 as an arbitrary constant strictly larger than 1 and define λ̂_min(I − Z) as the smallest nonzero eigenvalue of the matrix I − Z. Further, recall the definitions of the vector u and matrix G in (32). If Assumption 1 holds, then the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by PMM satisfies

    ‖u_{t+1} − u*‖²_G ≤ (1/(1 + δ)) ‖u_t − u*‖²_G,    (33)

where the constant δ is given by

    δ = min { λ̂_min(I − Z) / (β(m + M)) , 2mM / (ε(m + M)) , (β − 1) α λ̂_min(I − Z) / (βε) }.    (34)

Proof: See Appendix B. ∎

The result in Theorem 1 shows linear convergence of the sequence ‖u_t − u*‖²_G generated by PMM, where the factor of linear convergence is 1/(1 + δ). Observe that larger δ implies a smaller linear convergence factor 1/(1 + δ) and faster convergence. Notice that all the terms in the minimization in (34) are positive and therefore the constant δ is strictly larger than 0. In addition, the result in Theorem 1 holds for any feasible set of parameters β > 1, ε > 0, and α > 0; however, maximizing the parameter δ requires properly choosing the set of parameters β, ε, and α. Observe that when the first positive eigenvalue λ̂_min(I − Z) of the matrix I − Z, which is the second smallest eigenvalue of I − Z, is small, the constant δ becomes close to zero and convergence becomes slow.
Notice that a small λ̂_min(I − Z) indicates that the graph is not highly connected. This observation matches the intuition that when the graph has fewer edges the speed of convergence is slower. Additionally, the upper bounds in (34) show that when the condition number M/m of the global objective function f is large, δ becomes small and the linear convergence becomes slow. Although PMM enjoys a fast linear convergence rate, each iteration of PMM requires infinite rounds of communication, which makes it infeasible. In the following section, we study convergence properties of ESOM as a second order approximation of PMM that is implementable in decentralized settings.

B. Convergence of ESOM

We proceed to show that the sequence of iterates x_t generated by ESOM converges linearly to the optimal argument x* = [x̃*; ...; x̃*]. To do so, we first prove linear convergence of the Lyapunov function ‖u_t − u*‖²_G as defined in (32). Moreover, we show that by increasing the Hessian inverse approximation accuracy, ESOM's factor of linear convergence can be arbitrarily close to the linear convergence factor of PMM in Theorem 1.

Notice that ESOM is built on a second order approximation of the proximal augmented Lagrangian used in the update of PMM. To guarantee that the second order approximation suggested in ESOM is feasible, the local objective functions f_i are required to be twice differentiable as assumed in Assumption 1. The twice differentiability of the local objective functions f_i implies that the aggregate function f, which is the sum of a set of twice differentiable functions, is also twice differentiable. This observation shows that the global objective function Hessian ∇²f(x) is definable. Considering this observation, we prove some preliminary results for the iterates generated by ESOM in the following lemma.

Lemma 2. Consider the updates of ESOM in (14) and (15). Recall the definitions of the augmented Lagrangian Hessian H_t in (9) and the approximate Hessian inverse H̃_t^{-1}(K) in (13). If Assumption 1 holds, then the primal and dual iterates generated by ESOM satisfy

    v_{t+1} − v_t − α(I − Z)^{1/2} (x_{t+1} − x*) = 0.
(35)

Moreover, we can show that

    ∇f(x_{t+1}) − ∇f(x*) + (I − Z)^{1/2} (v_{t+1} − v*) + ε(x_{t+1} − x_t) + e_t = 0,    (36)

where the error vector e_t is defined as

    e_t := ∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) − ∇f(x_{t+1}) + ( H̃_t(K) − H_t ) (x_{t+1} − x_t),    (37)

with H̃_t(K) := ( H̃_t^{-1}(K) )^{-1}.

Proof: See Appendix C. ∎

The results in Lemma 2 show the relationships between the primal x_t and dual v_t iterates generated by ESOM and the optimal arguments x* and v*. The first result in (35) is identical to the convergence property of PMM in (30), while the second result in (36) differs from (31) in having the extra summand e_t. The vector e_t can be interpreted as the error of the second order approximation of ESOM at step t. To be more precise, the optimality condition of the primal update of PMM is given by ∇f(x_{t+1}) + (I − Z)^{1/2} v_t + α(I − Z) x_{t+1} + ε(x_{t+1} − x_t) = 0, as shown in (31). Notice that the second order approximation of this condition is equivalent to ∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_{t+1} + ε(x_{t+1} − x_t) = 0. However, the exact Hessian inverse H_t^{-1} = ( ∇²f(x_t) + εI + α(I − Z) )^{-1} cannot be computed in a distributed manner to solve this optimality condition. Thus, it is approximated by the approximate Hessian inverse matrix H̃_t^{-1}(K) as introduced in (13). This shows that the approximate optimality condition in ESOM is ∇f(x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_t + H̃_t(K)(x_{t+1} − x_t) = 0. Hence, the difference between the optimality conditions of PMM and ESOM is e_t = ∇f(x_t) − ∇f(x_{t+1}) + α(I − Z)(x_t − x_{t+1})
+ H̃_t(K)(x_{t+1} − x_t) − ε(x_{t+1} − x_t). By adding and subtracting the term H_t(x_{t+1} − x_t), the definition of the error vector e_t in (37) follows.

The observation that the vector e_t characterizes the error of the second order approximation in ESOM motivates analyzing an upper bound for the error vector norm ‖e_t‖. To prove that the norm ‖e_t‖ is bounded above, we assume the following condition is satisfied.

Assumption 2. The global objective function Hessian ∇²f(x) is Lipschitz continuous with constant L, i.e.,

    ‖∇²f(x) − ∇²f(x̂)‖ ≤ L ‖x − x̂‖.    (38)

The condition imposed by Assumption 2 is customary in the analysis of second order methods; see, e.g., [29]. In the following lemma we use the assumption in (38) to prove an upper bound for the error norm ‖e_t‖ in terms of the distance ‖x_{t+1} − x_t‖.

Lemma 3. Consider ESOM as introduced in (8)-(15) and recall the definition of the error vector e_t in (37). Further, define c > 0 as a lower bound for the local weights w_ii. If Assumptions 1-2 hold, then the error vector norm ‖e_t‖ is bounded above by

    ‖e_t‖ ≤ Γ_t ‖x_{t+1} − x_t‖,    (39)

where Γ_t is defined as

    Γ_t := min { 2M , (L/2) ‖x_{t+1} − x_t‖ } + ( M + ε + 2α(1 − c) ) ρ^{K+1},    (40)

and ρ := 2α(1 − c) / ( 2α(1 − c) + m + ε ).

Proof: See Appendix D. ∎

First, note that the lower bound c > 0 on the local weights w_ii is implied by the fact that all the local weights are positive. In particular, we can define the lower bound c as c := min_i w_ii. The result in (39) shows that the error of the second order approximation in ESOM vanishes as the sequence of iterates x_t approaches the optimal argument x*. We will show in Theorem 2 that ‖x_t − x*‖ converges to zero, which implies that the limit of the sequence ‖x_{t+1} − x_t‖ is zero. To understand the definition of Γ_t in (40), we have to decompose the error vector e_t in (37) into two parts. The first part is ∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) − ∇f(x_{t+1}), which comes from the fact that ESOM minimizes a second order approximation of the proximal augmented Lagrangian instead of the exact proximal augmented Lagrangian. This term can be bounded by min{2M, (L/2)‖x_{t+1} − x_t‖} ‖x_{t+1} − x_t‖, as shown in Lemma 3. The second part of the error vector e_t is (H̃_t(K) − H_t)(x_{t+1} − x_t), which shows the error of the Hessian inverse approximation.
Notice that computation of the exact Hessian inverse H_t^{−1} is not possible in a distributed manner, and ESOM approximates the exact Hessian inverse by the approximation H̃_t^{(K)}. According to the results in [12], the difference ‖(H̃_t^{(K)})^{−1} − H_t‖ can be upper bounded by (M + ε + 2α(1 − c))ρ^{K+1}, which justifies the second term of the expression for Γ_t in (40).

In the following theorem, we use the result in Lemma 3 to show that the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by ESOM converges to zero linearly.

Theorem 2. Consider ESOM as introduced in (8)-(15). Consider β > 1 and φ > 1 as arbitrary constants that are strictly larger than 1, and ζ as a positive constant that is chosen from the interval ζ ∈ ((m + M)/(2mM), ε/Γ_t²). Further, recall the definitions of the vector u and matrix G in (32), and consider λ̂_min(I − Z) as the smallest non-zero eigenvalue of the matrix I − Z. If Assumptions 1-2 hold, then the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by ESOM satisfies

‖u_{t+1} − u*‖²_G ≤ (1/(1 + δ'_t)) ‖u_t − u*‖²_G, (41)

where the sequence δ'_t is given by

δ'_t = min{ 2αλ̂_min(I − Z)/(φβ(m + M)), [2mM/(ε(m + M)) − 1/(ζε)][1 − φΓ_t²(β − 1)/((φ − 1)ε²)], [(β − 1)αλ̂_min(I − Z)/(βε)][1 − ζΓ_t²/ε] }. (42)

Proof: See Appendix E.

The result in Theorem 2 shows linear convergence of the sequence ‖u_t − u*‖²_G generated by ESOM, where the factor of linear convergence is 1/(1 + δ'_t). Notice that the positive constant ζ is chosen from the interval ((m + M)/(2mM), ε/Γ_t²). This interval is non-empty if and only if the proximal parameter ε satisfies the condition ε > Γ_t²(m + M)/(2mM). It follows from the result in Theorem 2 that the sequence of primal variables x_t converges to the optimal argument x* defined in (4).

Corollary 1. Under the assumptions in Theorem 2, the sequence of squared errors ‖x_t − x*‖² generated by ESOM converges to zero at a linear rate, i.e.,

‖x_t − x*‖² ≤ (1/(1 + min_s{δ'_s}))^t (1/(αε)) ‖u_0 − u*‖²_G. (43)

Proof: According to the definitions of the sequence u_t and matrix G, we can write ‖u_t − u*‖²_G = ‖v_t − v*‖² + αε‖x_t − x*‖², which implies that ‖x_t − x*‖² ≤ (1/(αε))‖u_t − u*‖²_G. Considering this result and the linear convergence of the sequence ‖u_t − u*‖²_G in (41), the claim in (43) follows.

C. Convergence rates comparison

The expression for δ'_t in (42) verifies the intuition that the convergence rate of ESOM is slower than that of PMM. This is true since the upper bounds for δ in PMM are larger than their equivalent upper bounds for δ'_t in ESOM. We obtain that δ'_t is smaller than δ, which implies that the linear convergence factor 1/(1 + δ) of PMM is smaller than 1/(1 + δ'_t) for ESOM. Therefore, at every step, the linear convergence of PMM is faster than that of ESOM. Although the linear convergence factor 1/(1 + δ'_t) of ESOM is larger than 1/(1 + δ) for PMM, as time passes the gap between these two constants becomes smaller. In particular, notice that after a number of iterations (L/2)‖x_{t+1} − x_t‖ becomes smaller than 2M, and Γ_t can be simplified as

Γ_t = (L/2)‖x_{t+1} − x_t‖ + (M + ε + 2α(1 − c)) ρ^{K+1}. (44)
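Under the reconstruction of Γ_t in (40), the Hessian-approximation summand decays geometrically in K. A small numeric illustration with placeholder constants (the values of M, m, ε, α, and c below are arbitrary choices for illustration, not values taken from the paper):

```python
# Illustrative only: placeholder values for M, m, epsilon, alpha, c.
M, m, eps, alpha, c = 10.0, 1.0, 1.0, 0.1, 0.25
rho = 2 * alpha * (1 - c) / (2 * alpha * (1 - c) + m + eps)   # always in (0, 1)
# second summand of Gamma_t in (40) for K = 0, 1, ..., 4
terms = [(M + eps + 2 * alpha * (1 - c)) * rho ** (K + 1) for K in range(5)]
assert 0 < rho < 1
assert all(a > b for a, b in zip(terms, terms[1:]))  # geometric decay in K
```

Since ρ < 1 regardless of the constants, increasing K makes this term, and hence Γ_t, arbitrarily small, which is what drives δ'_t toward the PMM constant δ.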
The term (L/2)‖x_{t+1} − x_t‖ eventually approaches zero, while the second term (M + ε + 2α(1 − c))ρ^{K+1} is constant. Although the second term does not approach zero, by proper choice of ρ and K this term can be made arbitrarily close to zero. Notice that when Γ_t approaches zero, if we set ζ = 1/Γ_t the upper bounds in (42) for δ'_t approach the upper bounds for δ of PMM in (34). Therefore, as time passes, Γ_t becomes smaller and the factor of linear convergence 1/(1 + δ'_t) for ESOM becomes closer to the linear convergence factor 1/(1 + δ) of PMM.

V. NUMERICAL EXPERIMENTS

In this section we compare the performances of ESOM, EXTRA, Decentralized ADMM (DADMM), and Network Newton (NN). First we consider a linear least squares problem and then we use the mentioned methods to solve a logistic regression problem.

A. Decentralized linear least squares

Consider a decentralized linear least squares problem where each agent i ∈ {1, ..., n} holds its private measurement equation, y_i = M_i x̃ + ν_i, where y_i ∈ R^{m_i} and M_i ∈ R^{m_i × p} are measured data, x̃ ∈ R^p is the unknown variable, and ν_i ∈ R^{m_i} is some unknown noise. The decentralized linear least squares problem estimates x̃ by solving the optimization problem

x̃* = argmin_{x̃} Σ_{i=1}^n ‖M_i x̃ − y_i‖₂². (45)

The network in this experiment is randomly generated with connectivity ratio r = 3/n, where r is defined as the number of edges divided by the number of all possible ones, n(n − 1)/2. We set n = 20, p = 5, and m_i = 5 for all i = 1, ..., n. The vectors y_i and matrices M_i, as well as the noise vectors ν_i, are generated following the standard normal distribution. We precondition the aggregated data matrices M_i so that the condition number of the problem is 10. The decision variables x_i are initialized as x_{i,0} = 0 for all nodes i = 1, ..., n, and the initial distance to the optimum is ‖x_{i,0} − x̃*‖ = 100. We use the Metropolis constant edge weight matrix as the mixing matrix W in all experiments. We run PMM, EXTRA, and ESOM-K with fixed hand-optimized stepsizes α. The best choices of α for ESOM-0, ESOM-1, and ESOM-2 are α = 0.03, α = 0.04, and α = 0.05, respectively.
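As a reproducibility aid, the following sketch builds a connected random network and a Metropolis edge weight matrix of the kind used as the mixing matrix W. The graph here is a random spanning tree for simplicity (the paper's connectivity ratio r = 3/n adds extra edges), and the rule w_ij = 1/(1 + max(d_i, d_j)) is the standard Metropolis weight choice, assumed rather than quoted from the paper.

```python
import numpy as np

def metropolis_weights(adj):
    # Metropolis weights: w_ij = 1 / (1 + max(d_i, d_j)) for each edge (i, j),
    # w_ii = 1 - sum_{j != i} w_ij. The result is symmetric and doubly stochastic
    # with strictly positive diagonal entries w_ii.
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

rng = np.random.default_rng(1)
n = 20
adj = np.zeros((n, n), dtype=bool)
for i in range(1, n):                     # random spanning tree keeps the graph connected
    j = rng.integers(0, i)
    adj[i, j] = adj[j, i] = True
W = metropolis_weights(adj)
assert np.allclose(W, W.T)                # symmetric
assert np.allclose(W.sum(axis=1), 1.0)    # rows sum to one
assert (np.diag(W) > 0).all()             # positive local weights w_ii, so c > 0 exists
```

The positive diagonal entries are exactly the property that guarantees the lower bound c > 0 used in Lemma 3.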
The stepsize α = 0.1 leads to the best performance for EXTRA, and this value is used in the numerical experiments. Notice that for the variations of NN-K there is no optimal choice of stepsize: a smaller stepsize leads to more accurate but slower convergence, while a larger stepsize accelerates the convergence but to a less accurate neighborhood of the optimal solution. Therefore, for NN-0, NN-1, and NN-2 we set α = 0.001, α = 0.008, and α = 0.02, respectively. Although the PMM algorithm is not implementable in a decentralized fashion, we use its convergence path, which is generated in a centralized manner, as our benchmark. The choice of stepsize for PMM is α = 2.

Fig. 1 illustrates the relative error ‖x_t − x*‖/‖x_0 − x*‖ versus the number of iterations. Notice that the vector x_t is the concatenation of the local vectors x_{i,t}, and the optimal vector x* is defined as x* = [x̃*; ...; x̃*] ∈ R^{np}. Observe that all the variations of NN-K fail to converge to the optimal argument; they converge linearly to a neighborhood of the optimal solution x*.

Fig. 1: Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, NN-K, and PMM versus number of iterations for the least squares problem. Using a larger value of K for ESOM-K leads to faster convergence and makes the convergence path closer to the one for PMM.

Fig. 2: Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, NN-K, and PMM versus rounds of communications with neighboring nodes for the least squares problem. ESOM-0 is the most efficient algorithm in terms of communication cost among all the methods.

Among the decentralized algorithms with exact linear convergence rate, EXTRA has the worst performance and all the variations of ESOM-K outperform EXTRA. Recall that the problem condition number is 10 in our experiment; the difference between EXTRA and ESOM-K is more significant for problems with larger condition numbers.
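The relative error plotted in Figs. 1 and 2 can be computed as follows. The helper is illustrative (the inputs are synthetic stand-ins); it stacks n copies of the optimal local vector x̃* to form x*, as described above.

```python
import numpy as np

def relative_error(x_t, x_0, x_opt_local, n):
    # x* is the concatenation of n copies of the optimal local vector x~*
    x_star = np.tile(x_opt_local, n)
    return np.linalg.norm(x_t - x_star) / np.linalg.norm(x_0 - x_star)

x_tilde = np.array([1.0, -2.0])           # hypothetical optimal local vector
x0 = np.zeros(3 * 2)                      # n = 3 nodes initialized at zero
assert np.isclose(relative_error(x0, x0, x_tilde, 3), 1.0)   # error is 1 at t = 0
x_mid = 0.5 * np.tile(x_tilde, 3)         # iterate halfway to consensus optimum
assert np.isclose(relative_error(x_mid, x0, x_tilde, 3), 0.5)
```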
Further, choosing a larger value of K for ESOM-K leads to faster convergence, and as we increase K the convergence path of ESOM-K approaches the convergence path of PMM. EXTRA requires one round of communications per iteration, while NN-K and ESOM-K require K + 1 rounds of local communications per iteration. Thus, the convergence paths of these methods in terms of rounds of communications might be different from the ones in Fig. 1. The convergence paths of NN, ESOM, and EXTRA in terms of rounds of local communications are shown in Fig. 2. In this plot we ignore PMM, since it requires infinite rounds of communications per iteration. The main difference between Figs. 1 and 2 is in the performances of ESOM-0, ESOM-1, and ESOM-2. All of the variations of ESOM outperform EXTRA in terms of rounds of communications, while the best performance belongs to ESOM-0. This observation shows that increasing the approximation level K does not necessarily improve the performance of ESOM-K in terms of communication cost.

B. Decentralized logistic regression

We consider the application of ESOM to solving a logistic regression problem of the form

x̃* := argmin_{x̃ ∈ R^p} (λ/2)‖x̃‖² + Σ_{i=1}^n Σ_{j=1}^{m_i} ln(1 + exp(−(s_ij^T x̃) y_ij)), (46)

where every agent i has access to m_i training samples (s_ij, y_ij) ∈ R^p × {−1, 1}, j = 1, ..., m_i, including explanatory/feature variables s_ij and binary outputs/outcomes y_ij. The regularization term (λ/2)‖x̃‖² is added to avoid overfitting, where λ is a positive constant. Hence, in the decentralized setting the local objective function f_i of node i is given by

f_i(x̃) = (λ/(2n))‖x̃‖² + Σ_{j=1}^{m_i} ln(1 + exp(−(s_ij^T x̃) y_ij)). (47)

The settings are as follows. The connected network is randomly generated with n = 20 agents and connectivity ratio r = 3/n. Each agent holds 3 samples, i.e., m_i = 3 for all i. The dimension of the sample vectors s_ij is p = 3. The samples are randomly generated, and the optimal logistic classifier x̃* is pre-computed through a centralized adaptive gradient method. We use the Metropolis constant edge weight matrix as the mixing matrix W in ESOM-K. The stepsizes α for ESOM-0, ESOM-1, ESOM-2, EXTRA, and DADMM are hand-optimized and the best of each is used for the comparison.

Figs. 3 and 4 showcase the convergence paths of ESOM-0, ESOM-1, ESOM-2, EXTRA, and DADMM versus number of iterations and rounds of communications, respectively. The results match the observations for the least squares problem in Figs. 1 and 2. Different versions of ESOM-K converge faster than EXTRA both in terms of communication cost and number of iterations. Moreover, ESOM-2 converges faster than ESOM-1 and ESOM-0 in terms of number of iterations, while ESOM-0 has the best performance in terms of communication cost for achieving a target accuracy.
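The local objective (47) and its gradient can be sketched as below. This is an illustrative stand-in for node i's local computation (the sample values are synthetic placeholders), with a finite-difference check of the gradient formula.

```python
import numpy as np

def f_i(x, S, y, lam, n):
    # local objective (47): (lam / 2n) ||x||^2 + sum_j ln(1 + exp(-y_j s_j^T x))
    z = y * (S @ x)
    return 0.5 * lam / n * (x @ x) + np.log1p(np.exp(-z)).sum()

def grad_f_i(x, S, y, lam, n):
    # gradient: (lam / n) x - sum_j y_j s_j * sigma(-y_j s_j^T x)
    z = y * (S @ x)
    sig = 1.0 / (1.0 + np.exp(z))          # sigma(-z)
    return lam / n * x - S.T @ (y * sig)

rng = np.random.default_rng(2)
m_i, p, lam, n = 3, 3, 1.0, 20             # matches the experiment's dimensions
S = rng.standard_normal((m_i, p))          # rows are the feature vectors s_ij
y = rng.choice([-1.0, 1.0], size=m_i)
x = rng.standard_normal(p)
g = grad_f_i(x, S, y, lam, n)
eps = 1e-6                                 # central finite-difference check
g_fd = np.array([(f_i(x + eps * e, S, y, lam, n) - f_i(x - eps * e, S, y, lam, n)) / (2 * eps)
                 for e in np.eye(p)])
assert np.allclose(g, g_fd, atol=1e-5)
```

Note the (λ/2n) scaling in f_i: summing the n local regularizers recovers the global (λ/2)‖x̃‖² term of (46).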
Comparing the convergence paths of ESOM-0, ESOM-1, and ESOM-2 with DADMM shows that the number of iterations required for the convergence of DADMM is larger than the number required for ESOM-0, ESOM-1, and ESOM-2. In terms of communication cost, DADMM has a better performance relative to ESOM-1 and ESOM-2, while ESOM-0 is the most efficient algorithm.

Fig. 3: Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, and DADMM versus number of iterations for the logistic regression problem. EXTRA is significantly slower than the ESOM methods. The proposed methods (ESOM-K) outperform DADMM.

Fig. 4: Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, and DADMM versus rounds of communications for the logistic regression problem. ESOM-0 has the best performance in terms of rounds of communications and it outperforms DADMM.

VI. CONCLUSIONS

We studied the consensus optimization problem where the components of a global objective function are available at different nodes of a network. We proposed an Exact Second-Order Method (ESOM) that converges to the optimal argument of the global objective function at a linear rate. We developed the update of ESOM by substituting the primal update of the Proximal Method of Multipliers (PMM) with its second-order approximation. Moreover, we approximated the Hessian inverse of the proximal augmented Lagrangian by truncating its Taylor series. This approximation leads to a class of algorithms ESOM-K, where K + 1 indicates the number of Taylor series terms that are used for the Hessian inverse approximation. Convergence analysis of ESOM-K shows that the sequence of iterates converges to the optimal argument linearly irrespective of the choice of K. We showed that the linear convergence factor of ESOM-K is a function of time and of the choice of K. The linear convergence factor of ESOM approaches the linear convergence factor of PMM as time passes. Moreover, a larger choice of K makes the factor of linear convergence for ESOM closer to the one for PMM.
Numerical results verify the theoretical linear convergence and the relation between the linear convergence factors of ESOM-K and PMM. Further, we observed that a larger choice of K for ESOM-K leads to faster convergence in terms of number of iterations, while the most efficient version of ESOM-K in terms of communication cost is ESOM-0.
APPENDIX A
PROOF OF LEMMA 1

Consider the updates of PMM in (6) and (7). According to (4), the optimal argument x* satisfies the condition (I − Z)^{1/2} x* = 0. This observation in conjunction with the dual variable update in (7) yields the claim in (30). To prove the claim in (31), note that the optimality condition of (6) implies ∇_x L(x_{t+1}, v_t) + ε(x_{t+1} − x_t) = 0. Based on the definition of the Lagrangian L(x, v) in (5), the optimality condition for the primal update of PMM can be written as

∇f(x_{t+1}) + (I − Z)^{1/2} v_t + α(I − Z) x_{t+1} + ε(x_{t+1} − x_t) = 0. (48)

Further, notice that one of the KKT conditions of the optimization problem in (4) is

∇f(x*) + (I − Z)^{1/2} v* = 0. (49)

Moreover, the optimal solution x* = [x̃*; ...; x̃*] of (4) lies in null{I − Z}. Therefore, we obtain

α(I − Z) x* = 0. (50)

Subtracting the equalities in (49) and (50) from (48) yields

∇f(x_{t+1}) − ∇f(x*) + (I − Z)^{1/2}(v_t − v*) + α(I − Z)(x_{t+1} − x*) + ε(x_{t+1} − x_t) = 0. (51)

Regrouping the terms in (30) implies that v_t is equivalent to

v_t = v_{t+1} − α(I − Z)^{1/2}(x_{t+1} − x*). (52)

Substituting v_t in (51) by the expression on the right-hand side of (52) yields the claim in (31).

APPENDIX B
PROOF OF THEOREM 1

According to Assumption 1, the global objective function f is strongly convex with constant m and its gradients ∇f are Lipschitz continuous with constant M. Considering these assumptions, we obtain that the inner product (x_{t+1} − x*)^T(∇f(x_{t+1}) − ∇f(x*)) is lower bounded by

(mM/(m + M))‖x_{t+1} − x*‖² + (1/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² ≤ (x_{t+1} − x*)^T(∇f(x_{t+1}) − ∇f(x*)). (53)

The result in (31) shows that the difference ∇f(x_{t+1}) − ∇f(x*) is equal to −(I − Z)^{1/2}(v_{t+1} − v*) − ε(x_{t+1} − x_t). Apply this substitution into (53) and multiply both sides of the resulting inequality by 2 to obtain

(2mM/(m + M))‖x_{t+1} − x*‖² + (2/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² ≤ −2(x_{t+1} − x*)^T(I − Z)^{1/2}(v_{t+1} − v*) − 2ε(x_{t+1} − x*)^T(x_{t+1} − x_t). (54)

Based on the result in (30), we can substitute (x_{t+1} − x*)^T(I − Z)^{1/2} by (1/α)(v_{t+1} − v_t)^T. Thus, we can rewrite (54) as

(2mM/(m + M))‖x_{t+1} − x*‖² + (2/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² ≤ −(2/α)(v_{t+1} − v_t)^T(v_{t+1} − v*) − 2ε(x_{t+1} − x_t)^T(x_{t+1} − x*). (55)

Notice that for any vectors a, b, and c we can write

2(a − b)^T(a − c) = ‖a − b‖² + ‖a − c‖² − ‖b − c‖². (56)

By setting a = v_{t+1}, b = v_t, and c = v* we obtain that the inner product 2(v_{t+1} − v_t)^T(v_{t+1} − v*) in (55) can be written as ‖v_{t+1} − v_t‖² + ‖v_{t+1} − v*‖² − ‖v_t − v*‖². Likewise, setting a = x_{t+1}, b = x_t, and c = x* implies that the inner product 2(x_{t+1} − x_t)^T(x_{t+1} − x*) in (55) is equal to ‖x_{t+1} − x_t‖² + ‖x_{t+1} − x*‖² − ‖x_t − x*‖². Applying these simplifications to (55), and multiplying both sides by α, yields

(2αmM/(m + M))‖x_{t+1} − x*‖² + (2α/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² ≤ ‖v_t − v*‖² − ‖v_{t+1} − v*‖² − ‖v_{t+1} − v_t‖² + αε‖x_t − x*‖² − αε‖x_{t+1} − x*‖² − αε‖x_{t+1} − x_t‖². (57)

Now using the definitions of the variable u and matrix G in (32), we can substitute ‖v_t − v*‖² − ‖v_{t+1} − v*‖² + αε‖x_t − x*‖² − αε‖x_{t+1} − x*‖² by ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G. Moreover, the squared norm ‖v_{t+1} − v_t‖² is equivalent to ‖x_{t+1} − x*‖²_{α²(I−Z)} based on the result in (30). By applying these substitutions we can rewrite (57) as

(2αmM/(m + M))‖x_{t+1} − x*‖² + (2α/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² ≤ ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G − αε‖x_{t+1} − x_t‖² − ‖x_{t+1} − x*‖²_{α²(I−Z)}. (58)

Regrouping the terms in (58) leads to the following lower bound for the difference ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G:

‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G ≥ (2α/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² + αε‖x_{t+1} − x_t‖² + ‖x_{t+1} − x*‖²_{(2αmM/(m+M))I + α²(I−Z)}. (59)

Observe that the result in (59) provides a lower bound for the decrement ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G. To prove the claim in (33), we need to show that for a positive constant δ we have ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G ≥ δ‖u_{t+1} − u*‖²_G. Therefore, the inequality in (33) is satisfied if we can show that the lower bound in (59) is greater than δ‖u_{t+1} − u*‖²_G, or equivalently

δ‖v_{t+1} − v*‖² + δαε‖x_{t+1} − x*‖² ≤ (2α/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² + αε‖x_{t+1} − x_t‖² + ‖x_{t+1} − x*‖²_{(2αmM/(m+M))I + α²(I−Z)}. (60)

To prove that the inequality in (60) holds for some δ > 0, we first find an upper bound for the squared norm ‖v_{t+1} − v*‖² in terms of the summands on the right-hand side of (60). To do so, consider the relation (31) along with the fact that v_{t+1} and v* both lie in the column space of (I − Z)^{1/2}. It follows that ‖v_{t+1} − v*‖² is bounded above by

‖v_{t+1} − v*‖² ≤ (βε²/((β − 1)λ̂_min(I − Z)))‖x_{t+1} − x_t‖² + (β/λ̂_min(I − Z))‖∇f(x_{t+1}) − ∇f(x*)‖², (61)
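The vector identity (56) used above can be spot-checked numerically (the vectors are arbitrary random draws):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, c = rng.standard_normal((3, 5))     # three arbitrary vectors in R^5
lhs = 2 * (a - b) @ (a - c)
rhs = np.dot(a - b, a - b) + np.dot(a - c, a - c) - np.dot(b - c, b - c)
assert np.isclose(lhs, rhs)               # identity (56) holds for any a, b, c
```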
where β > 1 is a tunable free parameter and λ̂_min(I − Z) is the smallest non-zero eigenvalue of I − Z. Considering the result in (61), to satisfy the inequality in (60), which is a sufficient condition for the claim in (33), it remains to show that

(2α/(m + M))‖∇f(x_{t+1}) − ∇f(x*)‖² + αε‖x_{t+1} − x_t‖² + ‖x_{t+1} − x*‖²_{(2αmM/(m+M))I + α²(I−Z)} ≥ (δβε²/((β − 1)λ̂_min(I − Z)))‖x_{t+1} − x_t‖² + δεα‖x_{t+1} − x*‖² + (δβ/λ̂_min(I − Z))‖∇f(x_{t+1}) − ∇f(x*)‖². (62)

To enable (62), and consequently (60), we only need to verify that there exists δ > 0 such that

(2αmM/(m + M))I + α²(I − Z) ⪰ δαεI,  2α/(m + M) ≥ δβ/λ̂_min(I − Z),  αε ≥ δβε²/((β − 1)λ̂_min(I − Z)). (63)

The conditions in (63) are satisfied if the constant δ is chosen as in (34). Therefore, for δ in (34) the claim in (60) holds, which implies the claim in (33).

APPENDIX C
PROOF OF LEMMA 2

Consider the primal update of ESOM in (14). By regrouping the terms we obtain that

∇f(x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_t + (H̃_t^{(K)})^{−1}(x_{t+1} − x_t) = 0, (64)

where (H̃_t^{(K)})^{−1} is the inverse of the Hessian inverse approximation H̃_t^{(K)}. Recall the definition of the exact Hessian H_t in (9). Adding and subtracting the term H_t(x_{t+1} − x_t) in (64) yields

∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) + (I − Z)^{1/2} v_t + α(I − Z) x_{t+1} + ε(x_{t+1} − x_t) + ((H̃_t^{(K)})^{−1} − H_t)(x_{t+1} − x_t) = 0. (65)

Now using the definition of the error vector e_t in (37), we can rewrite (65) as

∇f(x_{t+1}) + (I − Z)^{1/2} v_t + α(I − Z) x_{t+1} + ε(x_{t+1} − x_t) + e_t = 0. (66)

Notice that the result in (66) is identical to the expression for PMM in (48) except for the error term e_t. To prove the claim in (36) from (66), it remains to follow the steps in (49)-(52).

APPENDIX D
PROOF OF LEMMA 3

To prove the result in (39), we first use the result in Proposition 2 of [29]. It shows that when the eigenvalues of the Hessian ∇²f(x) are bounded above by M and the Hessian is Lipschitz continuous with constant L, we can write

‖∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) − ∇f(x_{t+1})‖ ≤ min{ 2M, (L/2)‖x_{t+1} − x_t‖ } ‖x_{t+1} − x_t‖. (67)

Considering the result in (67), it remains to find an upper bound for the second term of the error vector e_t, which is ((H̃_t^{(K)})^{−1} − H_t)(x_{t+1} − x_t). To do so, we first develop an upper bound for the norm ‖(H̃_t^{(K)})^{−1} − H_t‖.
Notice that by factoring the term (H̃_t^{(K)})^{−1} and using the Cauchy-Schwarz inequality we obtain that

‖(H̃_t^{(K)})^{−1} − H_t‖ ≤ ‖(H̃_t^{(K)})^{−1}‖ ‖I − H̃_t^{(K)} H_t‖. (68)

According to Lemma 3 in [12], we can simplify I − H̃_t^{(K)} H_t as (B D_t^{−1})^{K+1}. This simplification implies that

‖I − H̃_t^{(K)} H_t‖ = ‖(B D_t^{−1})^{K+1}‖. (69)

Observe that the matrices B and D_t in this paper are different from the ones in [12], but the analyses of them are very similar. Similar to the proof of Proposition 2 in [12], we define D̂ := 2α(I − Z_d). Notice that the matrix D̂ is block diagonal, where its ith diagonal block is 2α(1 − w_ii)I_p. Thus, D̂ is positive definite and invertible. Hence, we are allowed to write the product B D_t^{−1} as

B D_t^{−1} = (B D̂^{−1})(D̂ D_t^{−1}). (70)

The next step is to find an upper bound for the eigenvalues of B D̂^{−1} in (70). Based on the definitions of the matrices B and D̂, the product B D̂^{−1} is given by

B D̂^{−1} = (I − 2Z_d + Z)(2(I − Z_d))^{−1}. (71)

According to the result in Proposition 2 of [12], the eigenvalues of the matrix (I − 2Z_d + Z)(2(I − Z_d))^{−1} are uniformly bounded by 0 and 1. Thus, we obtain that

‖B D̂^{−1}‖ ≤ 1. (72)

According to the definitions of the matrices D̂ and D_t, the product D̂^{−1/2} D_t D̂^{−1/2} is block diagonal and its ith diagonal block is given by

[D̂^{−1/2} D_t D̂^{−1/2}]_{ii} = (∇²f_i(x_{i,t}) + εI)/(2α(1 − w_ii)) + I. (73)

Based on Assumption 1, the eigenvalues of the local Hessians ∇²f_i(x_i) are bounded by m and M. Further, notice that the diagonal elements w_ii of the weight matrix W are bounded below by c. Considering these bounds, we can show that the eigenvalues of the matrices (1/(2α(1 − w_ii)))(∇²f_i(x_{i,t}) + εI) + I for all i = 1, ..., n are bounded below by

[(m + ε)/(2α(1 − c)) + 1] I ⪯ (∇²f_i(x_{i,t}) + εI)/(2α(1 − w_ii)) + I. (74)

By considering the bounds in (74), the eigenvalues of each diagonal block of the matrix D̂^{1/2} D_t^{−1} D̂^{1/2}, introduced via (73), are bounded above as

((∇²f_i(x_{i,t}) + εI)/(2α(1 − w_ii)) + I)^{−1} ⪯ [(m + ε)/(2α(1 − c)) + 1]^{−1} I. (75)

The upper bound in (75) for the eigenvalues of each diagonal block implies that the matrix norm ‖D̂^{1/2} D_t^{−1} D̂^{1/2}‖ is bounded above by

‖D̂^{1/2} D_t^{−1} D̂^{1/2}‖ ≤ ρ := 2α(1 − c)/(2α(1 − c) + m + ε). (76)

Considering the upper bounds in (72) and (76) and the relation
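The block bound (73)-(76), as reconstructed above, can be spot-checked numerically for a single diagonal block. The values of m, M, ε, α, w_ii, and c below are synthetic placeholders (chosen so that the local Hessian spectrum lies in [m, M] and w_ii ≥ c), not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
p, m, M, eps, alpha, w_ii, c = 4, 1.0, 10.0, 0.5, 0.2, 0.6, 0.5   # w_ii >= c
# random symmetric local Hessian with spectrum inside [m, M]
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
H_i = Q @ np.diag(rng.uniform(m, M, p)) @ Q.T
# ith diagonal block of D^{-1/2} D_t D^{-1/2} as in (73)
block = (H_i + eps * np.eye(p)) / (2 * alpha * (1 - w_ii)) + np.eye(p)
rho = 2 * alpha * (1 - c) / (2 * alpha * (1 - c) + m + eps)        # (76)
eig_inv = np.linalg.eigvalsh(np.linalg.inv(block))
assert eig_inv.max() <= rho + 1e-12        # inverse-block eigenvalues below rho
```

Since every eigenvalue of the block in (73) is at least (m + ε)/(2α(1 − c)) + 1, every eigenvalue of its inverse is at most ρ, which is the bound in (76).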
IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, VOL. 2, NO. 4, DECEMBER 2016
SZG Macro 2011 Lecure 3: Dynamic Programming SZG macro 2011 lecure 3 1 Background Our previous discussion of opimal consumpion over ime and of opimal capial accumulaion sugges sudying he general decision
More informationZürich. ETH Master Course: L Autonomous Mobile Robots Localization II
Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),
More informationMath 10B: Mock Mid II. April 13, 2016
Name: Soluions Mah 10B: Mock Mid II April 13, 016 1. ( poins) Sae, wih jusificaion, wheher he following saemens are rue or false. (a) If a 3 3 marix A saisfies A 3 A = 0, hen i canno be inverible. True.
More informationTwo Coupled Oscillators / Normal Modes
Lecure 3 Phys 3750 Two Coupled Oscillaors / Normal Modes Overview and Moivaion: Today we ake a small, bu significan, sep owards wave moion. We will no ye observe waves, bu his sep is imporan in is own
More informationOnline Appendix to Solution Methods for Models with Rare Disasters
Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,
More informationLecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.
Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in
More informationTHE BERNOULLI NUMBERS. t k. = lim. = lim = 1, d t B 1 = lim. 1+e t te t = lim t 0 (e t 1) 2. = lim = 1 2.
THE BERNOULLI NUMBERS The Bernoulli numbers are defined here by he exponenial generaing funcion ( e The firs one is easy o compue: (2 and (3 B 0 lim 0 e lim, 0 e ( d B lim 0 d e +e e lim 0 (e 2 lim 0 2(e
More informationOscillation of an Euler Cauchy Dynamic Equation S. Huff, G. Olumolode, N. Pennington, and A. Peterson
PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DYNAMICAL SYSTEMS AND DIFFERENTIAL EQUATIONS May 4 7, 00, Wilmingon, NC, USA pp 0 Oscillaion of an Euler Cauchy Dynamic Equaion S Huff, G Olumolode,
More informationMean-square Stability Control for Networked Systems with Stochastic Time Delay
JOURNAL OF SIMULAION VOL. 5 NO. May 7 Mean-square Sabiliy Conrol for Newored Sysems wih Sochasic ime Delay YAO Hejun YUAN Fushun School of Mahemaics and Saisics Anyang Normal Universiy Anyang Henan. 455
More information20. Applications of the Genetic-Drift Model
0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0
More informationHamilton- J acobi Equation: Weak S olution We continue the study of the Hamilton-Jacobi equation:
M ah 5 7 Fall 9 L ecure O c. 4, 9 ) Hamilon- J acobi Equaion: Weak S oluion We coninue he sudy of he Hamilon-Jacobi equaion: We have shown ha u + H D u) = R n, ) ; u = g R n { = }. ). In general we canno
More informationEXERCISES FOR SECTION 1.5
1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler
More informationLogic in computer science
Logic in compuer science Logic plays an imporan role in compuer science Logic is ofen called he calculus of compuer science Logic plays a similar role in compuer science o ha played by calculus in he physical
More informationArticle from. Predictive Analytics and Futurism. July 2016 Issue 13
Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning
More informationACE 562 Fall Lecture 4: Simple Linear Regression Model: Specification and Estimation. by Professor Scott H. Irwin
ACE 56 Fall 005 Lecure 4: Simple Linear Regression Model: Specificaion and Esimaion by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Simple Regression: Economic and Saisical Model
More informationGlobal Convergence of Online Limited Memory BFGS
Journal of Machine Learning Research 16 (2015 3151-3181 Submied 9/14; Revised 7/15; Published 12/15 Global Convergence of Online Limied Memory BFGS Aryan Mokhari Alejandro Ribeiro Deparmen of Elecrical
More informationEssential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems
Essenial Microeconomics -- 6.5: OPIMAL CONROL Consider he following class of opimizaion problems Max{ U( k, x) + U+ ( k+ ) k+ k F( k, x)}. { x, k+ } = In he language of conrol heory, he vecor k is he vecor
More informationOn Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature
On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check
More informationSupplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence
Supplemen for Sochasic Convex Opimizaion: Faser Local Growh Implies Faser Global Convergence Yi Xu Qihang Lin ianbao Yang Proof of heorem heorem Suppose Assumpion holds and F (w) obeys he LGC (6) Given
More informationELE 538B: Large-Scale Optimization for Data Science. Quasi-Newton methods. Yuxin Chen Princeton University, Spring 2018
ELE 538B: Large-Scale Opimizaion for Daa Science Quasi-Newon mehods Yuxin Chen Princeon Universiy, Spring 208 00 op ff(x (x)(k)) f p 2 L µ f 05 k f (xk ) k f (xk ) =) f op ieraions converges in only 5
More informationChristos Papadimitriou & Luca Trevisan November 22, 2016
U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream
More informationEcon107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8)
I. Definiions and Problems A. Perfec Mulicollineariy Econ7 Applied Economerics Topic 7: Mulicollineariy (Sudenmund, Chaper 8) Definiion: Perfec mulicollineariy exiss in a following K-variable regression
More informationMacroeconomic Theory Ph.D. Qualifying Examination Fall 2005 ANSWER EACH PART IN A SEPARATE BLUE BOOK. PART ONE: ANSWER IN BOOK 1 WEIGHT 1/3
Macroeconomic Theory Ph.D. Qualifying Examinaion Fall 2005 Comprehensive Examinaion UCLA Dep. of Economics You have 4 hours o complee he exam. There are hree pars o he exam. Answer all pars. Each par has
More information15. Vector Valued Functions
1. Vecor Valued Funcions Up o his poin, we have presened vecors wih consan componens, for example, 1, and,,4. However, we can allow he componens of a vecor o be funcions of a common variable. For example,
More informationFinal Spring 2007
.615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o
More information2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006
2.160 Sysem Idenificaion, Esimaion, and Learning Lecure Noes No. 8 March 6, 2006 4.9 Eended Kalman Filer In many pracical problems, he process dynamics are nonlinear. w Process Dynamics v y u Model (Linearized)
More informationMath 334 Fall 2011 Homework 11 Solutions
Dec. 2, 2 Mah 334 Fall 2 Homework Soluions Basic Problem. Transform he following iniial value problem ino an iniial value problem for a sysem: u + p()u + q() u g(), u() u, u () v. () Soluion. Le v u. Then
More informationSPECTRAL EVOLUTION OF A ONE PARAMETER EXTENSION OF A REAL SYMMETRIC TOEPLITZ MATRIX* William F. Trench. SIAM J. Matrix Anal. Appl. 11 (1990),
SPECTRAL EVOLUTION OF A ONE PARAMETER EXTENSION OF A REAL SYMMETRIC TOEPLITZ MATRIX* William F Trench SIAM J Marix Anal Appl 11 (1990), 601-611 Absrac Le T n = ( i j ) n i,j=1 (n 3) be a real symmeric
More informationLinear Response Theory: The connection between QFT and experiments
Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and
More informationSome Ramsey results for the n-cube
Some Ramsey resuls for he n-cube Ron Graham Universiy of California, San Diego Jozsef Solymosi Universiy of Briish Columbia, Vancouver, Canada Absrac In his noe we esablish a Ramsey-ype resul for cerain
More informationDistributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information
Disribued Ficiious Play for Opimal Behavior of Muli-Agen Sysems wih Incomplee Informaion Ceyhun Eksin and Alejandro Ribeiro arxiv:602.02066v [cs.g] 5 Feb 206 Absrac A muli-agen sysem operaes in an uncerain
More informationUnsteady Flow Problems
School of Mechanical Aerospace and Civil Engineering Unseady Flow Problems T. J. Craf George Begg Building, C41 TPFE MSc CFD-1 Reading: J. Ferziger, M. Peric, Compuaional Mehods for Fluid Dynamics H.K.
More informationTime series model fitting via Kalman smoothing and EM estimation in TimeModels.jl
Time series model fiing via Kalman smoohing and EM esimaion in TimeModels.jl Gord Sephen Las updaed: January 206 Conens Inroducion 2. Moivaion and Acknowledgemens....................... 2.2 Noaion......................................
More informationNavneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi
Creep in Viscoelasic Subsances Numerical mehods o calculae he coefficiens of he Prony equaion using creep es daa and Herediary Inegrals Mehod Navnee Saini, Mayank Goyal, Vishal Bansal (23); Term Projec
More informationLecture 2 October ε-approximation of 2-player zero-sum games
Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion
More informationdy dx = xey (a) y(0) = 2 (b) y(1) = 2.5 SOLUTION: See next page
Assignmen 1 MATH 2270 SOLUTION Please wrie ou complee soluions for each of he following 6 problems (one more will sill be added). You may, of course, consul wih your classmaes, he exbook or oher resources,
More informationOptimal Path Planning for Flexible Redundant Robot Manipulators
25 WSEAS In. Conf. on DYNAMICAL SYSEMS and CONROL, Venice, Ialy, November 2-4, 25 (pp363-368) Opimal Pah Planning for Flexible Redundan Robo Manipulaors H. HOMAEI, M. KESHMIRI Deparmen of Mechanical Engineering
More informationGuest Lectures for Dr. MacFarlane s EE3350 Part Deux
Gues Lecures for Dr. MacFarlane s EE3350 Par Deux Michael Plane Mon., 08-30-2010 Wrie name in corner. Poin ou his is a review, so I will go faser. Remind hem o go lisen o online lecure abou geing an A
More informationExponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits
DOI: 0.545/mjis.07.5009 Exponenial Weighed Moving Average (EWMA) Char Under The Assumpion of Moderaeness And Is 3 Conrol Limis KALPESH S TAILOR Assisan Professor, Deparmen of Saisics, M. K. Bhavnagar Universiy,
More informationAccelerated Distributed Nesterov Gradient Descent for Convex and Smooth Functions
07 IEEE 56h Annual Conference on Decision and Conrol (CDC) December -5, 07, Melbourne, Ausralia Acceleraed Disribued Neserov Gradien Descen for Convex and Smooh Funcions Guannan Qu, Na Li Absrac This paper
More informationODEs II, Lecture 1: Homogeneous Linear Systems - I. Mike Raugh 1. March 8, 2004
ODEs II, Lecure : Homogeneous Linear Sysems - I Mike Raugh March 8, 4 Inroducion. In he firs lecure we discussed a sysem of linear ODEs for modeling he excreion of lead from he human body, saw how o ransform
More informationA primal-dual Laplacian gradient flow dynamics for distributed resource allocation problems
2018 Annual American Conrol Conference (ACC) June 27 29, 2018. Wisconsin Cener, Milwaukee, USA A primal-dual Laplacian gradien flow dynamics for disribued resource allocaion problems Dongsheng Ding and
More informationStationary Distribution. Design and Analysis of Algorithms Andrei Bulatov
Saionary Disribuion Design and Analysis of Algorihms Andrei Bulaov Algorihms Markov Chains 34-2 Classificaion of Saes k By P we denoe he (i,j)-enry of i, j Sae is accessible from sae if 0 for some k 0
More informationAverage Number of Lattice Points in a Disk
Average Number of Laice Poins in a Disk Sujay Jayakar Rober S. Sricharz Absrac The difference beween he number of laice poins in a disk of radius /π and he area of he disk /4π is equal o he error in he
More informationOn the probabilistic stability of the monomial functional equation
Available online a www.jnsa.com J. Nonlinear Sci. Appl. 6 (013), 51 59 Research Aricle On he probabilisic sabiliy of he monomial funcional equaion Claudia Zaharia Wes Universiy of Timişoara, Deparmen of
More informationt dt t SCLP Bellman (1953) CLP (Dantzig, Tyndall, Grinold, Perold, Anstreicher 60's-80's) Anderson (1978) SCLP
Coninuous Linear Programming. Separaed Coninuous Linear Programming Bellman (1953) max c () u() d H () u () + Gsusds (,) () a () u (), < < CLP (Danzig, yndall, Grinold, Perold, Ansreicher 6's-8's) Anderson
More informationSolutions from Chapter 9.1 and 9.2
Soluions from Chaper 9 and 92 Secion 9 Problem # This basically boils down o an exercise in he chain rule from calculus We are looking for soluions of he form: u( x) = f( k x c) where k x R 3 and k is
More information