IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, VOL. 2, NO. 4, DECEMBER 2016

A Decentralized Second-Order Method with Exact Linear Convergence Rate for Consensus Optimization

Aryan Mokhtari, Wei Shi, Qing Ling, and Alejandro Ribeiro

Abstract: This paper considers decentralized consensus optimization problems where different summands of a global objective function are available at nodes of a network that can communicate with neighbors only. The proximal method of multipliers is considered as a powerful tool that relies on proximal primal descent and dual ascent updates on a suitably defined augmented Lagrangian. The structure of the augmented Lagrangian makes this problem non-decomposable, which precludes distributed implementations. This problem is regularly addressed by the use of the alternating direction method of multipliers. The exact second-order method (ESOM) is introduced here as an alternative that relies on: first, the use of a separable quadratic approximation of the augmented Lagrangian, and second, a truncated Taylor series to estimate the solution of the first-order condition imposed on the minimization of the quadratic approximation of the augmented Lagrangian. The sequences of primal and dual variables generated by ESOM are shown to converge linearly to their optimal arguments when the aggregate cost function is strongly convex and its gradients are Lipschitz continuous. Numerical results demonstrate the advantages of ESOM relative to decentralized alternatives in solving least-squares and logistic regression problems.

Index Terms: Decentralized optimization, method of multipliers, multi-agent networks, second-order methods.

Manuscript received February 1, 2016; revised July 30, 2016; accepted September 19, 2016. Date of publication September 26, 2016; date of current version November 4, 2016. This work was supported in part by the National Science Foundation (CAREER program), in part by the ONR, in part by the National Science Foundation of China, and in part by the National Science Foundation of Anhui under Grant QF130. The guest editor coordinating the review of this manuscript and approving it for publication was Prof. Vincenzo Matta. A. Mokhtari and A. Ribeiro are with the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA (e-mail: aryanm@seas.upenn.edu; aribeiro@seas.upenn.edu). W. Shi is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, USA (e-mail: wilburs@illinois.edu). Q. Ling is with the Department of Automation, University of Science and Technology of China, Anhui, China (e-mail: qingling@mail.ustc.edu.cn).

I. INTRODUCTION

IN DECENTRALIZED consensus optimization problems, components of a global objective function that is to be minimized are available at different nodes of a network. Formally, consider a decision variable x̃ ∈ R^p and a connected network containing n nodes, where each node i has access to a local objective function f_i : R^p → R. Nodes can exchange information with neighbors only and try to minimize the global cost function Σ_{i=1}^n f_i(x̃),

$$\tilde{x}^* := \operatorname*{argmin}_{\tilde{x} \in \mathbb{R}^p} \sum_{i=1}^{n} f_i(\tilde{x}). \tag{1}$$

We assume that the local objective functions f_i(x̃) are strongly convex. The global objective function Σ_{i=1}^n f_i(x̃), which is the sum of a set of strongly convex functions, is also strongly convex. Problems like (1) arise in decentralized control [1]–[3], wireless communication [4], [5], sensor networks [6]–[8], and large-scale machine learning [9]–[11]. Decentralized methods for solving (1) can be divided into two classes: primal domain methods and dual domain methods.
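As a concrete, purely illustrative instance of (1), the following sketch sets up quadratic local costs f_i(x̃) = (1/2)‖A_i x̃ − b_i‖² and computes the global minimizer centrally; all data, dimensions, and names are hypothetical and serve only as a reference point for the decentralized methods discussed below.

```python
import numpy as np

# Hypothetical instance of problem (1): n quadratic local costs
# f_i(x) = 0.5 * ||A_i x - b_i||^2, chosen so the global minimizer
# has a closed form that decentralized iterates can be checked against.
rng = np.random.default_rng(0)
n, p = 20, 5
A = [rng.standard_normal((5, p)) for _ in range(n)]
b = [rng.standard_normal(5) for _ in range(n)]

def global_objective(x):
    """Sum of the local objectives f_i evaluated at a common point x."""
    return sum(0.5 * np.linalg.norm(Ai @ x - bi) ** 2 for Ai, bi in zip(A, b))

# Centralized solution of (1): the first-order condition sums the
# local gradients A_i^T (A_i x - b_i) and sets the total to zero.
H = sum(Ai.T @ Ai for Ai in A)
g = sum(Ai.T @ bi for Ai, bi in zip(A, b))
x_star = np.linalg.solve(H, g)
```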
Decentralized gradient descent (DGD) is a well-established primal method that implements gradient descent on a penalized version of (1) whose gradient can be separated into per-node components. Network Newton (NN) is a more recent alternative that accelerates the convergence of DGD by incorporating second-order information of the penalized objective [12], [13]. Both DGD and NN converge to a neighborhood of the optimal argument x̃* when using a constant stepsize, and converge sublinearly to the exact optimal argument if using a diminishing stepsize.

Dual domain methods build on the fact that the dual function of (1) has a gradient with separable structure. The use of plain dual gradient descent is possible but generally slow to converge [14]–[16]. In centralized optimization, better convergence speeds are attained by the method of multipliers (MM), which adds a quadratic augmentation term to the Lagrangian [17], [18], or the proximal (P)MM, which adds an additional term to keep iterates close. In either case, the quadratic term that is added to construct the augmented Lagrangian makes distributed computation of primal gradients impossible. This issue is most often overcome with the use of decentralized (D) versions of the alternating direction method of multipliers (ADMM) [6], [19], [20]. Besides the ADMM, other methods that use different alternatives to approximate the gradients of the dual function have also been proposed [21]–[27]. The convergence rates of these methods have not been studied, except for the DADMM and its variants, which are known to converge linearly to the optimal argument when the local functions are strongly convex and their gradients are Lipschitz continuous [20], [28], [29]. An important observation here is that while all of these methods try to approximate the MM or the PMM, the performance penalty entailed by the approximation has not been studied.

This paper introduces the exact second-order method (ESOM), which uses quadratic approximations of the augmented Lagrangians of (1) and leads to a set of separable subproblems.

Similar to other second-order methods, implementation of ESOM requires computation of Hessian inverses. Distributed implementation of this operation is infeasible because, while the Hessian of the proximal augmented Lagrangian is neighbor sparse, its inverse is not. ESOM resolves this issue by using the Hessian inverse approximation technique introduced in [12], [13], [30]. This technique consists of truncating the Taylor series of the Hessian inverse at order K to obtain the family of methods ESOM-K. Implementation of this expansion in terms of local operations is possible. A remarkable property of all ESOM-K methods is that they can be shown to pay a performance penalty relative to (centralized) PMM that vanishes with increasing iterations.

We begin the paper by reformulating (1) in a form more suitable for decentralized implementation (Proposition 1) and proceed to describe the PMM (Section II). ESOM is a variation of PMM that substitutes the proximal augmented Lagrangian with its quadratic approximation (Section III). Implementation of ESOM requires computing the inverse of the Hessian of the proximal augmented Lagrangian. Since this inversion cannot be computed using local and neighboring information, ESOM-K approximates the Hessian inverse with the K-order truncation of the Taylor series expansion of the Hessian inverse. This expansion can be carried out using an inner loop of local operations. This and other details required for decentralized implementation of ESOM-K are discussed in Section III-A, along with a discussion of how ESOM can be interpreted as a saddle point generalization of the Network Newton methods proposed in [31] (Remark 2) or a second-order version of the EXTRA method in [32] (Remark 3). Convergence analyses of PMM and ESOM are then presented (Section IV). Linear convergence of PMM is established (Section IV-A) and linear convergence factors are explicitly derived to use as benchmarks (Theorem 1). In the ESOM analysis (Section IV-B) we provide an upper bound for the error of the proximal augmented Lagrangian approximation (Lemma 3). We leverage this result to prove linear convergence of ESOM (Theorem 2) and to show that ESOM's linear convergence factor approaches the corresponding PMM factor as time grows (Section IV-C). This indicates that the convergence paths of (distributed) ESOM-K and (centralized) PMM are very close. We also study the dependency of the convergence constant on the algorithm's order K. ESOM tradeoffs and comparisons with other decentralized methods for solving consensus optimization problems are illustrated in numerical experiments (Section V) for a decentralized least squares problem (Section V-A) and a decentralized logistic regression classification problem (Section V-B). Numerical results in both settings verify that larger K leads to faster convergence in terms of number of iterations. However, we observe that all versions of ESOM-K exhibit similar convergence rates in terms of the number of communication exchanges. This implies that ESOM-0 is preferable with respect to the latter metric and that larger K is justified when computational cost is of interest. Faster convergence relative to EXTRA, Network Newton, and DQM is observed. We close the paper with concluding remarks (Section VI).

Notation: Vectors are written as x ∈ R^n and matrices as A ∈ R^{n×n}. Given n vectors x_i, the vector x = [x_1; ...; x_n] represents a stacking of the elements of each individual x_i. We use ‖x‖ and ‖A‖ to denote the Euclidean norm of vector x and matrix A, respectively. The norm of vector x with respect to a positive definite matrix A is ‖x‖_A := (x^T A x)^{1/2}. Given a function f, its gradient at x is denoted as ∇f(x) and its Hessian as ∇²f(x).

II. PROXIMAL METHOD OF MULTIPLIERS

Let x_i ∈ R^p be a copy of the decision variable x̃ kept at node i and define N_i as the neighborhood of node i. Assuming the network is bidirectionally connected, the optimization problem in (1) is equivalent to the program

$$\{x_i^*\}_{i=1}^{n} := \operatorname*{argmin}_{\{x_i\}_{i=1}^{n}} \sum_{i=1}^{n} f_i(x_i), \quad \text{s.t.} \quad x_i = x_j \ \text{for all } i,\ j \in N_i. \tag{2}$$

Indeed, the constraint in (2) enforces the consensus condition x_1 = ... = x_n for any feasible point of (2). With this condition satisfied, the objective in (2) is equal to the objective function in (1), from where it follows that the optimal local variables x_i* are all equal to the optimal argument x̃* of (1), i.e., x_1* = ... = x_n* = x̃*.

To derive ESOM, define x := [x_1; ...; x_n] ∈ R^{np} as the concatenation of the local decision variables x_i and the aggregate function f : R^{np} → R as f(x) = f(x_1, ..., x_n) := Σ_{i=1}^n f_i(x_i), the sum of all the local functions f_i(x_i). Introduce the matrix W ∈ R^{n×n} with elements w_ij ≥ 0 representing a weight that node i assigns to variables of node j. The weight w_ij = 0 if and only if j ∉ N_i ∪ {i}. The matrix W is further required to satisfy

$$W^T = W, \qquad W\mathbf{1} = \mathbf{1}, \qquad \operatorname{null}(I - W) = \operatorname{span}(\mathbf{1}). \tag{3}$$

The first condition implies that the weights are symmetric, i.e., w_ij = w_ji. The second condition ensures that the weights of a given node sum up to 1, i.e., Σ_{j=1}^n w_ij = 1 for all i. Since W1 = 1, we have that I − W is rank deficient. The last condition null(I − W) = span(1) makes the rank of I − W exactly equal to n − 1 [33].
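One common choice satisfying (3), and the mixing matrix used in the numerical experiments of Section V, is the Metropolis constant edge weight matrix. The following is a minimal construction sketch; the ring graph is only an example topology.

```python
import numpy as np

def metropolis_weights(adj):
    """Metropolis weights for a connected undirected graph.

    The resulting W satisfies the conditions in (3): W = W^T, W 1 = 1,
    and null(I - W) = span(1) when the graph is connected.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()   # weights of node i sum to one
    return W

# Example: a ring of n = 6 nodes (each node communicates with 2 neighbors).
n = 6
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

W = metropolis_weights(adj)
assert np.allclose(W, W.T) and np.allclose(W.sum(axis=1), 1.0)
assert np.linalg.matrix_rank(np.eye(n) - W) == n - 1   # rank(I - W) = n - 1
```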

The matrix W can be used to reformulate (2), as we show in the following proposition.

Proposition 1: Define the matrix Z := W ⊗ I_p ∈ R^{np×np} as the Kronecker product of the weight matrix W and the identity matrix I_p, and consider the definitions of the global vector x := [x_1; ...; x_n] and the aggregate function f(x) := Σ_{i=1}^n f_i(x_i). The optimization problem in (2) is equivalent to

$$x^* = \operatorname*{argmin}_{x \in \mathbb{R}^{np}} f(x) \quad \text{s.t.} \quad (I - Z)^{1/2} x = 0. \tag{4}$$

I.e., x* = [x_1*; ...; x_n*] with {x_i*}_{i=1}^n the solution of (2).

Proof: We just show that the constraint ((I_n − W) ⊗ I_p)x = (I_{np} − Z)x = 0 is also a consensus constraint. To do so, begin by noticing that since I − W is positive semidefinite, I − Z = (I − W) ⊗ I_p is also positive semidefinite. Therefore, the null space of the square root matrix (I − Z)^{1/2} is equal to the null space of I − Z, and we conclude that satisfying the condition (I − Z)^{1/2}x = 0 is equivalent to the consensus condition x_1 = ... = x_n. This observation, in conjunction with the definition of the aggregate function f(x) = Σ_{i=1}^n f_i(x_i), shows that the programs in (4) and (2) are equivalent. In particular, the optimal solution of (4) is x* = [x_1*; ...; x_n*] with {x_i*}_{i=1}^n the solution of (2).

The formulation in (4) is used to define the proximal method of multipliers (PMM) that we consider in this paper. To do so, introduce dual variables v ∈ R^{np} to define the augmented Lagrangian L(x, v) of (4) as

$$\mathcal{L}(x, v) = f(x) + v^T (I - Z)^{1/2} x + \frac{\alpha}{2}\, x^T (I - Z)\, x, \tag{5}$$

where α is a positive constant. Given the properties of the matrix Z, the augmentation term (α/2)x^T(I − Z)x is null when the variable x is a feasible solution of (4). Otherwise, the inner product is positive and behaves as a penalty for the violation of the consensus constraint.

Introduce a time index t ∈ N and define x_t and v_t as the primal and dual iterates at step t. The primal variable x_{t+1} is updated by minimizing the sum of the augmented Lagrangian in (5) and the proximal term (ε/2)‖x − x_t‖²,

$$x_{t+1} = \operatorname*{argmin}_{x \in \mathbb{R}^{np}} \left\{ \mathcal{L}(x, v_t) + \frac{\epsilon}{2}\|x - x_t\|^2 \right\}, \tag{6}$$

where the proximal coefficient ε > 0 is a strictly positive constant. The dual variable v_t is updated by ascending through the gradient of the augmented Lagrangian with respect to the dual variable, ∇_v L(x_{t+1}, v_t), with stepsize α,

$$v_{t+1} = v_t + \alpha (I - Z)^{1/2} x_{t+1}. \tag{7}$$

The updates in (6) and (7) for PMM can be considered as a generalization of the method of multipliers (MM), because setting the proximal coefficient ε = 0 recovers the updates of MM. The proximal term (ε/2)‖x − x_t‖² is added to keep the updated variable x_{t+1} close to the previous iterate x_t. This does not affect convergence guarantees but improves computational stability. The primal update in (6) may be computationally costly because it requires solving a convex program, and it cannot be implemented in a decentralized manner because the augmentation term (α/2)x^T(I − Z)x in (5) is not separable. In the following section, we propose an approximation of PMM that makes the minimization in (6) computationally economic and separable over the nodes of the network. This leads to the set of decentralized updates that define the ESOM algorithm.

III. ESOM: EXACT SECOND-ORDER METHOD

To reduce the computational complexity of (6) and obtain a separable update, we introduce a second-order approximation of the augmented Lagrangian in (5). Consider the second-order Taylor expansion L(x, v_t) ≈ L(x_t, v_t) + ∇_x L(x_t, v_t)^T(x − x_t) + (1/2)(x − x_t)^T ∇²_x L(x_t, v_t)(x − x_t) of the augmented Lagrangian with respect to x, centered around (x_t, v_t). Using this approximation in lieu of L(x, v_t) in (6) leads to the primal update

$$x_{t+1} = \operatorname*{argmin}_{x \in \mathbb{R}^{np}} \Big\{ \mathcal{L}(x_t, v_t) + \nabla_x \mathcal{L}(x_t, v_t)^T (x - x_t) + \frac{1}{2}(x - x_t)^T \big(\nabla_x^2 \mathcal{L}(x_t, v_t) + \epsilon I\big)(x - x_t) \Big\}. \tag{8}$$

The minimization on the right-hand side of (8) is of a positive definite quadratic form. Thus, upon defining the Hessian matrix H_t ∈ R^{np×np} as

$$H_t := \nabla^2 f(x_t) + \alpha(I - Z) + \epsilon I, \tag{9}$$

and considering the explicit form of the augmented Lagrangian gradient ∇_x L(x_t, v_t) [cf. (5)], it follows that the variable x_{t+1} in (8) is given by

$$x_{t+1} = x_t - H_t^{-1}\left[\nabla f(x_t) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_t\right]. \tag{10}$$

A fundamental observation here is that the matrix H_t, which is the Hessian of the objective function in (8), is block neighbor sparse. By block neighbor sparse we mean that the (i, j)-th block is non-zero if and only if j ∈ N_i or j = i. To confirm this claim, observe that ∇²f(x_t) ∈ R^{np×np} is a block diagonal matrix whose i-th diagonal block is the Hessian of the i-th local function, ∇²f_i(x_{i,t}) ∈ R^{p×p}. Additionally, the matrix εI_{np} is diagonal, which implies that the term ∇²f(x_t) + εI_{np} is a block diagonal matrix with blocks ∇²f_i(x_{i,t}) + εI_p. Further, it follows from the definition of the matrix Z that the matrix I − Z is neighbor sparse. Therefore, the Hessian H_t is also neighbor sparse.

Although the Hessian H_t is neighbor sparse, its inverse H_t^{-1} is not. This observation leads to the conclusion that the update in (10) is not implementable in a decentralized manner, i.e., nodes cannot implement (10) by exchanging information only with their neighbors. To resolve this issue, we use a Hessian inverse approximation that is built on truncating the Taylor series of the Hessian inverse H_t^{-1}, as in [12], [30]. To do so, we decompose the Hessian as H_t = D_t − B, where D_t is a block diagonal positive definite matrix and B is a neighbor sparse positive semidefinite matrix. In particular, define D_t as

$$D_t := \nabla^2 f(x_t) + \epsilon I + 2\alpha(I - Z_d), \tag{11}$$

where Z_d := diag(Z). Observing the definitions of the matrices H_t and D_t and considering the relation B = D_t − H_t, we conclude that B is given by

$$B := \alpha(I - 2Z_d + Z). \tag{12}$$

Notice that, using the decomposition H_t = D_t − B and by factoring D_t^{1/2}, the Hessian inverse can be written as H_t^{-1} = D_t^{-1/2}(I − D_t^{-1/2} B D_t^{-1/2})^{-1} D_t^{-1/2}. Observe that the inverse matrix (I − D_t^{-1/2} B D_t^{-1/2})^{-1} can be substituted by its Taylor series Σ_{u=0}^∞ (D_t^{-1/2} B D_t^{-1/2})^u. Note that this is true if the eigenvalues of the matrix D_t^{-1/2} B D_t^{-1/2} are smaller than 1; we prove in Appendix D that this condition is satisfied.
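The truncation of this series is formalized in (13) below. As a sanity check on the idea, the following sketch compares the truncated series against the exact inverse on a toy pair of matrices; D and B here are synthetic stand-ins (with D = B + cI so the eigenvalue condition holds by construction), not the ESOM matrices of (11) and (12).

```python
import numpy as np

def inv_sqrt(D):
    """D^{-1/2} for a symmetric positive definite matrix D."""
    vals, vecs = np.linalg.eigh(D)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def truncated_inverse(D, B, K):
    """Sum of the first K + 1 terms of the series for (D - B)^{-1}:
    D^{-1/2} * sum_{u=0}^{K} (D^{-1/2} B D^{-1/2})^u * D^{-1/2}."""
    S = inv_sqrt(D)
    E = S @ B @ S                      # its eigenvalues must lie in [0, 1)
    acc = np.eye(D.shape[0])
    term = np.eye(D.shape[0])
    for _ in range(K):
        term = term @ E
        acc = acc + term
    return S @ acc @ S

# Synthetic D and B: B is positive semidefinite and D = B + cI with c > 0,
# so the eigenvalues of E are strictly below 1 and the series converges.
rng = np.random.default_rng(0)
G = rng.standard_normal((8, 8))
B = G @ G.T
D = B + np.trace(B) * np.eye(8)
H_inv = np.linalg.inv(D - B)
for K in (0, 1, 2, 5, 20):
    print(K, np.linalg.norm(truncated_inverse(D, B, K) - H_inv))
# The approximation error decays geometrically in K, which is the
# behavior the convergence analysis of Section IV quantifies.
```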

However, computation of the full series requires global communication, which is not affordable in decentralized settings. Thus, we approximate the Hessian inverse H_t^{-1} by truncating its Taylor series at its first K + 1 terms, which leads to the Hessian inverse approximation H_t^{-1}(K),

$$H_t^{-1}(K) := D_t^{-1/2} \sum_{u=0}^{K} \left( D_t^{-1/2} B D_t^{-1/2} \right)^u D_t^{-1/2}. \tag{13}$$

Notice that the approximate Hessian inverse H_t^{-1}(K) is K-hop block neighbor sparse, i.e., its (i, j)-th block is nonzero if and only if there is at least one path between nodes i and j of length K or smaller.

We introduce the exact second-order method (ESOM) as a second-order method for solving decentralized optimization problems that substitutes the Hessian inverse in update (10) with its K-hop block neighbor sparse approximation H_t^{-1}(K) defined in (13). Therefore, the primal update of ESOM is

$$x_{t+1} = x_t - H_t^{-1}(K)\left[\nabla f(x_t) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_t\right]. \tag{14}$$

The ESOM dual update is identical to the update in (7),

$$v_{t+1} = v_t + \alpha (I - Z)^{1/2} x_{t+1}. \tag{15}$$

Notice that ESOM differs from PMM in approximating the augmented Lagrangian in the primal update of PMM by a second-order approximation. Further, ESOM approximates the Hessian inverse of the augmented Lagrangian, which is not necessarily neighbor sparse, by truncating its Taylor series. In the following subsection we study the implementation details of the updates in (14) and (15).

Remark 1: The Hessian decomposition H_t = D_t − B with the matrices D_t and B in (11) and (12), respectively, is not the only valid decomposition. All decompositions of the form H_t = D_t ± B_t are valid if D_t is positive definite and the eigenvalues of the matrix D_t^{-1/2} B_t D_t^{-1/2} are in the interval (−1, 1). The suggested framework guarantees that the matrix B is positive semidefinite, which is helpful in the analysis of the proposed ESOM method. A more comprehensive study of alternative decompositions is provided in [34].

A. Decentralized Implementation of ESOM

The updates in (14) and (15) show that ESOM is a second-order approximation of PMM. Although these updates are necessary for understanding the rationale behind ESOM, they are not implementable in a decentralized fashion, since the matrix (I − Z)^{1/2} is not neighbor sparse. To resolve this issue, define the sequence of variables q_t := (I − Z)^{1/2} v_t. Considering the definition of q_t, the primal update in (14) can be written as

$$x_{t+1} = x_t - H_t^{-1}(K)\left(\nabla f(x_t) + q_t + \alpha(I - Z)x_t\right). \tag{16}$$

Multiplying the dual update in (15) by (I − Z)^{1/2} from the left-hand side and using the definition q_t := (I − Z)^{1/2} v_t yields

$$q_{t+1} = q_t + \alpha(I - Z)x_{t+1}. \tag{17}$$

Notice that the system of updates in (16) and (17) is equivalent to the updates in (14) and (15), i.e., the sequences of variables x_t generated by them are identical. Nodes can implement the primal-dual updates in (16) and (17) in a decentralized manner, since the square root matrix (I − Z)^{1/2} is eliminated from the updates and nodes can compute the products (I − Z)x_t and (I − Z)x_{t+1} by exchanging information with their neighbors.

To characterize the local update of each node for implementing the updates in (16) and (17), define

$$g_t := \nabla_x \mathcal{L}(x_t, v_t) = \nabla f(x_t) + q_t + \alpha(I - Z)x_t \tag{18}$$

as the gradient of the augmented Lagrangian in (5). Further, define the primal descent direction d_t(K) with K levels of approximation as

$$d_t(K) := -H_t^{-1}(K)\, g_t, \tag{19}$$

which implies that the update in (16) can be written as x_{t+1} = x_t + d_t(K). According to the definition of the Hessian inverse approximation in (13), the explicit expression for the descent direction is d_t(K) = −D_t^{-1/2} Σ_{u=0}^{K} (D_t^{-1/2} B D_t^{-1/2})^u D_t^{-1/2} g_t. Considering this definition, we can simplify the expression for the descent direction d_t(k+1) as

$$d_t(k+1) = -D_t^{-1/2} \sum_{u=1}^{k+1} \left(D_t^{-1/2} B D_t^{-1/2}\right)^u D_t^{-1/2} g_t - D_t^{-1} g_t, \tag{20}$$

where we have separated the first term of the sum from the rest. Factorize D_t^{-1} B from the summands in (20) to obtain

$$d_t(k+1) = D_t^{-1} B \left[ -D_t^{-1/2} \sum_{u=0}^{k} \left(D_t^{-1/2} B D_t^{-1/2}\right)^u D_t^{-1/2} g_t \right] - D_t^{-1} g_t. \tag{21}$$

Based on the definition of the descent direction d_t(k), the first term on the right-hand side of (21) can be simplified as D_t^{-1} B d_t(k). Therefore, the descent directions d_t(k) and d_t(k+1) satisfy the condition

$$d_t(k+1) = D_t^{-1} B\, d_t(k) - D_t^{-1} g_t. \tag{22}$$

Define d_{i,t}(k) as the descent direction of node i at step t, which is the i-th element of the global descent direction d_t(k) = [d_{1,t}(k); ...; d_{n,t}(k)]. Therefore, the localized version of the relation in (22) at node i is given by

$$d_{i,t}(k+1) = D_{ii,t}^{-1}\left[ \sum_{j \in N_i \cup \{i\}} B_{ij}\, d_{j,t}(k) - g_{i,t} \right]. \tag{23}$$

The update in (23) shows that node i can compute its (k+1)-th descent direction d_{i,t}(k+1) if it has access to the k-th descent direction d_{i,t}(k) of itself and the directions d_{j,t}(k) of its neighbors j ∈ N_i. Thus, if nodes initialize with the ESOM-0 descent direction d_{i,t}(0) = −D_{ii,t}^{-1} g_{i,t}, exchange their descent directions with their neighbors for K rounds, and use the update in (23), they can compute their local ESOM-K descent direction d_{i,t}(K). Notice that the i-th diagonal block of D_t is given by D_{ii,t} := ∇²f_i(x_{i,t}) + (2α(1 − w_ii) + ε)I, where x_{i,t} is the primal variable of node i at step t. Thus, the block D_{ii,t} is locally available at node i.

Algorithm 1: ESOM-K Method at Node i.

Require: Initial iterates x_{i,0} = x_{j,0} = 0 for j ∈ N_i and q_{i,0} = 0.
1: B blocks: B_ii = α(1 − w_ii)I and B_ij = α w_ij I
2: for t = 0, 1, 2, ... do
3:   D_t block: D_{ii,t} = ∇²f_i(x_{i,t}) + (2α(1 − w_ii) + ε)I
4:   Compute g_{i,t} = ∇f_i(x_{i,t}) + q_{i,t} + α(1 − w_ii)x_{i,t} − α Σ_{j∈N_i} w_ij x_{j,t}
5:   Compute the ESOM-0 descent direction d_{i,t}(0) = −D_{ii,t}^{-1} g_{i,t}
6:   for k = 0, ..., K − 1 do
7:     Exchange d_{i,t}(k) with neighbors j ∈ N_i
8:     Compute d_{i,t}(k+1) = D_{ii,t}^{-1} [ Σ_{j∈N_i∪{i}} B_ij d_{j,t}(k) − g_{i,t} ]
9:   end for
10:  Update primal iterate: x_{i,t+1} = x_{i,t} + d_{i,t}(K).
11:  Exchange iterates x_{i,t+1} with neighbors j ∈ N_i.
12:  Update dual iterate: q_{i,t+1} = q_{i,t} + α(1 − w_ii)x_{i,t+1} − α Σ_{j∈N_i} w_ij x_{j,t+1}.
13: end for

Moreover, node i can evaluate the blocks B_ii = α(1 − w_ii)I and B_ij = α w_ij I without extra communication. In addition, nodes can compute the gradient g_t by communicating with their neighbors. To confirm this claim, observe that the i-th element of g_t = [g_{1,t}; ...; g_{n,t}], associated with node i, is given by

$$g_{i,t} := \nabla f_i(x_{i,t}) + q_{i,t} + \alpha(1 - w_{ii})x_{i,t} - \alpha \sum_{j \in N_i} w_{ij}\, x_{j,t}, \tag{24}$$

where q_{i,t} ∈ R^p is the i-th element of q_t = [q_{1,t}; ...; q_{n,t}] and x_{i,t} is the primal variable of node i at step t; both are available at node i. Hence, the update in (16) can be implemented in a decentralized manner. Likewise, nodes can implement the dual update in (17) using the local update

$$q_{i,t+1} = q_{i,t} + \alpha(1 - w_{ii})x_{i,t+1} - \alpha \sum_{j \in N_i} w_{ij}\, x_{j,t+1}, \tag{25}$$

which requires access to the local primal variables x_{j,t+1} of the neighboring nodes j ∈ N_i.

The steps of ESOM-K are summarized in Algorithm 1. The core steps are Steps 5–9, which correspond to computing the ESOM-K primal descent direction d_{i,t}(K). In Step 5, each node computes its initial descent direction d_{i,t}(0) using the block D_{ii,t} and the local gradient g_{i,t} computed in Steps 3 and 4, respectively. Steps 7 and 8 correspond to the recursion in (23). In Step 7, nodes exchange their k-th level descent directions d_{i,t}(k) with their neighboring nodes to compute the (k+1)-th descent direction d_{i,t}(k+1) in Step 8. The outcome of this recursion is the K-th level descent direction d_{i,t}(K), which is required for the update of the primal variable x_{i,t} in Step 10. Notice that the blocks of the neighbor sparse matrix B, which are required for Step 8, are computed and stored in Step 1. After updating the primal variables in Step 10, nodes exchange their updated variables x_{i,t+1} with their neighbors j ∈ N_i in Step 11. By having access to the decision variables of neighboring nodes, nodes update their local dual variables q_{i,t} in Step 12.
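A compact simulation of Algorithm 1 is sketched below for quadratic local costs f_i(x_i) = (1/2)‖A_i x_i − b_i‖². The quadratic costs and the parameter values are illustrative assumptions (not the tuned settings of Section V), and the matrix products are written so that row i combines only information from node i and its neighbors.

```python
import numpy as np

def esom_k(A, b, W, alpha=0.05, eps=1.0, K=1, iters=300):
    """Sketch of Algorithm 1 for local costs f_i(x) = 0.5*||A[i] @ x - b[i]||^2.

    W is a mixing matrix satisfying (3); row i of (x - W @ x) equals
    (1 - w_ii) x_i - sum_{j in N_i} w_ij x_j, so every step below uses
    only node-local and neighbor information.
    """
    n, p = W.shape[0], A[0].shape[1]
    x = np.zeros((n, p))                   # row i holds x_{i,t}
    q = np.zeros((n, p))                   # row i holds q_{i,t}
    for _ in range(iters):
        # Steps 3-4: D blocks and local gradients g_{i,t} as in (24).
        g = np.stack([A[i].T @ (A[i] @ x[i] - b[i]) for i in range(n)])
        g += q + alpha * (x - W @ x)
        D = [A[i].T @ A[i] + (2 * alpha * (1 - W[i, i]) + eps) * np.eye(p)
             for i in range(n)]
        # Step 5: ESOM-0 direction d_{i,t}(0) = -D_{ii,t}^{-1} g_{i,t}.
        d = np.stack([-np.linalg.solve(D[i], g[i]) for i in range(n)])
        # Steps 6-9: K rounds of recursion (23), one exchange per round.
        for _ in range(K):
            Bd = np.zeros_like(d)
            for i in range(n):
                for j in np.nonzero(W[i])[0]:          # j in N_i or j = i
                    w = (1 - W[i, i]) if j == i else W[i, j]
                    Bd[i] += alpha * w * d[j]          # B_ij d_{j,t}(k)
            d = np.stack([np.linalg.solve(D[i], Bd[i] - g[i])
                          for i in range(n)])
        x = x + d                                      # Step 10
        q = q + alpha * (x - W @ x)                    # Step 12, cf. (25)
    return x
```

With a mixing matrix such as the Metropolis construction sketched earlier and suitably tuned α and ε, the rows of the returned x should approach a common value near the minimizer of (1), at the linear rate established in Section IV.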
Remark 2: The proposed ESOM algorithm solves problem (4) in the dual domain by defining the proximal augmented Lagrangian. It is also possible to solve problem (4) in the primal domain by solving a penalized version of (4). In particular, by using the quadratic penalty function (1/2)‖·‖² for the constraint (I − Z)^{1/2}x = 0 with penalty coefficient α, we obtain the penalized version of (4),

$$\hat{x} := \operatorname*{argmin}_{x \in \mathbb{R}^{np}} f(x) + \frac{\alpha}{2}\, x^T (I - Z)\, x, \tag{26}$$

where x̂ is the optimal argument of the penalized objective function. Notice that x̂ is not equal to the optimal argument x*, and the distance ‖x* − x̂‖ depends on the choice of α. The objective function in (26) can be minimized by descending along the gradient direction, which leads to the update of decentralized gradient descent (DGD) [35]. The convergence of DGD can be improved by using Newton's method. Notice that the Hessian of the objective function in (26) is given by

$$\hat{H} := \nabla^2 f(x) + \alpha(I - Z). \tag{27}$$

The Hessian Ĥ in (27) is identical to the Hessian H_t in (9) except for the term εI. Therefore, the same technique for approximating the Hessian inverse Ĥ^{-1} can be used to approximate the Newton direction of the penalized objective function in (26), which leads to the update of the Network Newton (NN) methods [12], [13]. Thus, ESOM and NN use an approximate decentralized variation of Newton's method for solving two different problems. In other words, ESOM uses the approximate Newton direction for minimizing the augmented Lagrangian of (4), while NN solves a penalized version of (4) using this approximation. This difference explains why the sequence of iterates generated by ESOM converges to the optimal argument x* (Section IV), while NN converges to a neighborhood of x*.

Remark 3: ESOM approximates the augmented Lagrangian L(x, v_t) in (6) by its second-order approximation. If we substitute the augmented Lagrangian by its first-order approximation, we recover the update of EXTRA proposed in [32]. To be more precise, we can substitute L(x, v_t) in (6) by its first-order approximation L(x_t, v_t) + ∇L(x_t, v_t)^T(x − x_t) near the point (x_t, v_t) to update the primal variable x. Considering this substitution, the update of x_{t+1} is given by

$$x_{t+1} = \operatorname*{argmin}_{x \in \mathbb{R}^{np}} \left\{ \mathcal{L}(x_t, v_t) + \nabla \mathcal{L}(x_t, v_t)^T (x - x_t) + \frac{\epsilon}{2}\|x - x_t\|^2 \right\}. \tag{28}$$

Thus, considering the definition of the augmented Lagrangian in (5), the updated variable x_{t+1} can be explicitly written as

$$x_{t+1} = x_t - \frac{1}{\epsilon}\left[\nabla f(x_t) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_t\right]. \tag{29}$$

By subtracting the update at step t from the update at step t+1 and using the dual variable relation v_t = v_{t−1} + α(I − Z)^{1/2} x_t, we obtain the update

$$x_{t+1} = \left(2I - \frac{2\alpha}{\epsilon}(I - Z)\right)x_t - \left(I - \frac{\alpha}{\epsilon}(I - Z)\right)x_{t-1} - \frac{1}{\epsilon}\big(\nabla f(x_t) - \nabla f(x_{t-1})\big). \tag{30}$$

The update in (30) is a first-order approximation of the PMM. It is not hard to show that for specific choices of α and ε, the update in (30) is equivalent to the update of EXTRA in [32]. Thus, we expect to observe faster convergence for ESOM relative to EXTRA, as ESOM incorporates second-order information. This advantage is studied in Section V.

IV. CONVERGENCE ANALYSIS

In this section, we study the convergence rates of PMM and ESOM. First, we show that the sequence of iterates x_t generated by PMM converges linearly to the optimal argument x*. Although PMM cannot be implemented in a decentralized fashion, its convergence rate can be used as a benchmark for evaluating the performance of ESOM. We then analyze the convergence properties of ESOM: we show that ESOM exhibits a linear convergence rate and compare its linear convergence factor with that of PMM. In proving these results we consider the following assumptions.

Assumption 1: The local objective functions f_i(x) are twice differentiable and the eigenvalues of the local objective function Hessians ∇²f_i(x) are bounded by positive constants 0 < m ≤ M < ∞, i.e.,

$$m I \preceq \nabla^2 f_i(x_i) \preceq M I, \tag{31}$$

for all x_i ∈ R^p and i = 1, ..., n.

The lower bound in (31) implies that the local objective functions f_i are strongly convex with constant m > 0. The upper bound on the eigenvalues of the Hessians ∇²f_i implies that the gradients of the local objective functions f_i are Lipschitz continuous with constant M. Notice that the global objective function Hessian ∇²f(x) is a block diagonal matrix whose i-th diagonal block is ∇²f_i(x_i). Therefore, the bounds on the eigenvalues of the local Hessians ∇²f_i(x_i) in (31) also hold for the global objective function Hessian ∇²f(x), i.e.,

$$m I \preceq \nabla^2 f(x) \preceq M I, \tag{32}$$

for all x ∈ R^{np}. Thus, the global objective function f is also strongly convex with constant m and its gradients ∇f are Lipschitz continuous with constant M.

A. Convergence of the Proximal Method of Multipliers (PMM)

The convergence rate of PMM can be considered as a benchmark for the convergence rate of ESOM. To establish linear convergence of PMM, we first study the relationship between the primal x_t and dual v_t iterates generated by PMM and the optimal arguments x* and v* in the following lemma.

Lemma 1: Consider the updates of the proximal method of multipliers in (6) and (7). The sequences of primal and dual iterates generated by PMM satisfy

$$v_{t+1} - v_t - \alpha(I - Z)^{1/2}(x_{t+1} - x^*) = 0, \tag{33}$$

and

$$\nabla f(x_{t+1}) - \nabla f(x^*) + (I - Z)^{1/2}(v_{t+1} - v^*) + \epsilon(x_{t+1} - x_t) = 0. \tag{34}$$

Proof: See Appendix A.

Considering the preliminary results in (33) and (34), we can state the convergence results for PMM. To do so, we prove linear convergence of a Lyapunov function of the primal ‖x_t − x*‖² and dual ‖v_t − v*‖² errors. To be more precise, we define the vector u ∈ R^{2np} and matrix G ∈ R^{2np×2np} as

$$u = \begin{bmatrix} v \\ x \end{bmatrix}, \qquad G = \begin{bmatrix} I & 0 \\ 0 & \alpha\epsilon I \end{bmatrix}. \tag{35}$$

Notice that the sequence u_t is the concatenation of the dual variable v_t and the primal variable x_t. Likewise, we define u* as the concatenation of the optimal arguments v* and x*. We proceed to prove that the sequence ‖u_t − u*‖²_G converges linearly to null. Observe that ‖u_t − u*‖²_G can be simplified as ‖v_t − v*‖² + αε‖x_t − x*‖². This observation shows that ‖u_t − u*‖²_G is a Lyapunov function of the primal ‖x_t − x*‖² and dual ‖v_t − v*‖² errors. Therefore, linear convergence of the sequence ‖u_t − u*‖²_G implies linear convergence of the sequence ‖x_t − x*‖². In the following theorem, we show that the sequence ‖u_t − u*‖²_G converges to zero at a linear rate.

Theorem 1: Consider the proximal method of multipliers as introduced in (6) and (7). Let β > 1 be an arbitrary constant strictly larger than 1 and define λ̂_min(I − Z) as the smallest non-zero eigenvalue of the matrix I − Z. Further, recall the definitions of the vector u and matrix G in (35). If Assumption 1 holds, then the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by PMM satisfies

$$\|u_{t+1} - u^*\|_G^2 \le \frac{1}{1+\delta}\,\|u_t - u^*\|_G^2, \tag{36}$$

where the constant δ is given by

$$\delta = \min\left\{ \frac{2\alpha\hat{\lambda}_{\min}(I-Z)}{\beta(m+M)},\ \frac{2mM}{\epsilon(m+M)},\ \frac{(\beta-1)\alpha\hat{\lambda}_{\min}(I-Z)}{\beta\epsilon} \right\}. \tag{37}$$

Proof: See Appendix B.

The result in Theorem 1 shows linear convergence of the sequence ‖u_t − u*‖²_G generated by PMM, where the factor of linear convergence is 1/(1 + δ). Observe that a larger δ implies a smaller linear convergence factor 1/(1 + δ) and faster convergence. Notice that all the terms in the minimization in (37) are positive, and therefore the constant δ is strictly larger than 0. In addition, the result in Theorem 1 holds for any feasible set of parameters β > 1, ε > 0, and α > 0; however, maximizing the parameter δ requires properly choosing the set of parameters β, ε, and α.
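To make the rate in (36) and (37) concrete, the following sketch evaluates δ and the per-iteration contraction factor 1/(1 + δ) for one made-up parameter setting; none of these numbers come from the experiments of Section V.

```python
# Illustrative evaluation of the PMM contraction factor in (36)-(37).
# All constants below are assumed values, not settings from the paper.
m, M = 1.0, 10.0            # strong convexity / Lipschitz constants (32)
alpha, eps, beta = 1.0, 1.0, 2.0
lam_min = 0.2               # smallest non-zero eigenvalue of I - Z

delta = min(2 * alpha * lam_min / (beta * (m + M)),
            2 * m * M / (eps * (m + M)),
            (beta - 1) * alpha * lam_min / (beta * eps))
factor = 1.0 / (1.0 + delta)
print(delta, factor)        # delta ~ 0.018, factor ~ 0.982 per iteration
```

Consistent with the discussion below, shrinking lam_min (a poorly connected graph) or increasing M/m (an ill-conditioned objective) drives δ toward zero and the contraction factor toward one.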

Observe that when the first positive eigenvalue λ̂_min(I − Z) of the matrix I − Z, which is the second smallest eigenvalue of I − Z, is small, the constant δ becomes close to zero and convergence becomes slow. Notice that a small λ̂_min(I − Z) indicates that the graph is not highly connected. This observation matches the intuition that when the graph has fewer edges, the speed of convergence is slower. Additionally, the upper bounds in (37) show that when the condition number M/m of the global objective function f is large, δ becomes small and the linear convergence becomes slow. Although PMM enjoys a fast linear convergence rate, each iteration of PMM requires infinite rounds of communications, which makes it infeasible. In the following section, we study the convergence properties of ESOM as a second-order approximation of PMM that is implementable in decentralized settings.

B. Convergence of ESOM

We proceed to show that the sequence of iterates x_t generated by ESOM converges linearly to the optimal argument x* = [x̃*; ...; x̃*]. To do so, we first prove linear convergence of the Lyapunov function ‖u_t − u*‖²_G as defined in (35). Moreover, we show that by increasing the Hessian inverse approximation accuracy, the ESOM factor of linear convergence can be arbitrarily close to the linear convergence factor of PMM in Theorem 1.

Notice that ESOM is built on a second-order approximation of the proximal augmented Lagrangian used in the update of PMM. To guarantee that the second-order approximation suggested in ESOM is feasible, the local objective functions f_i are required to be twice differentiable, as assumed in Assumption 1. The twice differentiability of the local objective functions f_i implies that the aggregate function f, which is the sum of a set of twice differentiable functions, is also twice differentiable. This observation shows that the global objective function Hessian ∇²f(x) is well defined. Considering this observation, we prove some preliminary results for the iterates generated by ESOM in the following lemma.

Lemma 2: Consider the updates of ESOM in (14) and (15). Recall the definitions of the augmented Lagrangian Hessian H_t in (9) and the approximate Hessian inverse H_t^{-1}(K) in (13), and let H_t(K) := (H_t^{-1}(K))^{-1} denote the inverse of the approximate Hessian inverse. If Assumption 1 holds, then the primal and dual iterates generated by ESOM satisfy

$$v_{t+1} - v_t - \alpha(I - Z)^{1/2}(x_{t+1} - x^*) = 0. \tag{38}$$

Moreover,

$$\nabla f(x_{t+1}) - \nabla f(x^*) + (I - Z)^{1/2}(v_{t+1} - v^*) + \epsilon(x_{t+1} - x_t) + e_t = 0, \tag{39}$$

where the error vector e_t is defined as

$$e_t := \nabla f(x_t) + \nabla^2 f(x_t)(x_{t+1} - x_t) - \nabla f(x_{t+1}) + \big(H_t(K) - H_t\big)(x_{t+1} - x_t). \tag{40}$$

Proof: See Appendix C.

The results in Lemma 2 show the relationships between the primal x_t and dual v_t iterates generated by ESOM and the optimal arguments x* and v*. The first result in (38) is identical to the convergence property of PMM in (33), while the second result in (39) differs from (34) in having the extra summand e_t. The vector e_t can be interpreted as the error of the second-order approximation of ESOM at step t. To be more precise, the optimality condition of the primal update of PMM is ∇f(x_{t+1}) + (I − Z)^{1/2}v_t + α(I − Z)x_{t+1} + ε(x_{t+1} − x_t) = 0, as in (51). Notice that the second-order approximation of this condition is equivalent to ∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) + (I − Z)^{1/2}v_t + α(I − Z)x_{t+1} + ε(x_{t+1} − x_t) = 0. However, the exact Hessian inverse H_t^{-1} = (∇²f(x_t) + εI + α(I − Z))^{-1} cannot be computed in a distributed manner to solve this optimality condition. Thus, it is approximated by the approximate Hessian inverse matrix H_t^{-1}(K), as introduced in (13). This shows that the approximate optimality condition in ESOM is ∇f(x_t) + (I − Z)^{1/2}v_t + α(I − Z)x_t + H_t(K)(x_{t+1} − x_t) = 0. Hence, the difference between the optimality conditions of PMM and ESOM is e_t = ∇f(x_t) − ∇f(x_{t+1}) + α(I − Z)(x_t − x_{t+1}) + H_t(K)(x_{t+1} − x_t) − ε(x_{t+1} − x_t). By adding and subtracting the term H_t(x_{t+1} − x_t), the definition of the error vector e_t in (40) follows.

The observation that the vector e_t characterizes the error of the second-order approximation in ESOM motivates analyzing an upper bound for the error vector norm ‖e_t‖. To prove that the norm ‖e_t‖ is bounded above, we assume the following condition is satisfied.

Assumption 2: The global objective function Hessian ∇²f(x) is Lipschitz continuous with constant L, i.e.,

$$\|\nabla^2 f(x) - \nabla^2 f(\hat{x})\| \le L\,\|x - \hat{x}\|. \tag{41}$$

The conditions imposed by Assumption 2 are customary in the analysis of second-order methods; see, e.g., [29]. In the following lemma, we use the assumption in (41) to prove an upper bound for the error norm ‖e_t‖ in terms of ‖x_{t+1} − x_t‖.

Lemma 3: Consider ESOM as introduced in (8)–(15) and recall the definition of the error vector e_t in (40). Further, define c > 0 as a lower bound for the local weights w_ii. If Assumptions 1 and 2 hold, then the error vector norm ‖e_t‖ is bounded above by

$$\|e_t\| \le \Gamma_t\,\|x_{t+1} - x_t\|, \tag{42}$$

where Γ_t is defined as

$$\Gamma_t := \min\left\{2M,\ \frac{L}{2}\|x_{t+1} - x_t\|\right\} + \big(M + \epsilon + 2\alpha(1 - c)\big)\,\rho^{K+1}, \tag{43}$$

and ρ := 2α(1 − c)/(2α(1 − c) + m + ε).

Proof: See Appendix D.

First, note that the lower bound c > 0 on the local weights w_ii follows from the fact that all the local weights are positive. In particular, we can define the lower bound c as c := min_i w_ii. The result in (42) shows that the error of the second-order approximation in ESOM vanishes as the sequence of iterates x_t approaches the optimal argument x*. We will show in Theorem 2 that ‖x_t − x*‖ converges to zero, which implies that the limit of the sequence ‖x_{t+1} − x_t‖ is zero.

To understand the definition of Γ_t in (43), we have to decompose the error vector e_t in (40) into two parts. The first part is ∇f(x_t) + ∇²f(x_t)(x_{t+1} − x_t) − ∇f(x_{t+1}), which comes from the fact that ESOM minimizes a second-order approximation of the proximal augmented Lagrangian instead of the exact proximal augmented Lagrangian.

This term can be bounded by min{2M, (L/2)‖x_{t+1} − x_t‖} ‖x_{t+1} − x_t‖, as shown in Lemma 3. The second part of the error vector e_t is (H_t(K) − H_t)(x_{t+1} − x_t), which captures the error of the Hessian inverse approximation. Notice that computation of the exact Hessian inverse H_t^{-1} is not possible, and ESOM approximates the exact Hessian by the approximation H_t(K). According to the results in [12], the difference ‖H_t(K) − H_t‖ can be upper bounded by (M + ε + 2α(1 − c))ρ^{K+1}, which justifies the second term of the expression for Γ_t in (43). In the following theorem, we use the result in Lemma 3 to show that the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by ESOM converges to zero linearly.

Theorem 2: Consider ESOM as introduced in (8)–(15). Let β > 1 and φ > 1 be arbitrary constants strictly larger than 1, and let ζ be a positive constant chosen from the interval ζ ∈ ((m + M)/(2mM), ε/Γ_t²). Further, recall the definitions of the vector u and matrix G in (35), and consider λ̂_min(I − Z) as the smallest non-zero eigenvalue of the matrix I − Z. If Assumptions 1 and 2 hold, then the sequence of Lyapunov functions ‖u_t − u*‖²_G generated by ESOM satisfies

$$\|u_{t+1} - u^*\|_G^2 \le \frac{1}{1+\delta_t}\,\|u_t - u^*\|_G^2, \tag{44}$$

where the sequence δ_t is given by

$$\delta_t = \min\left\{ \frac{2\alpha\hat{\lambda}_{\min}(I-Z)}{\phi\beta(m+M)},\ \frac{2mM}{\epsilon(m+M)} - \frac{1}{\zeta\epsilon},\ \frac{(\beta-1)\alpha\hat{\lambda}_{\min}(I-Z)}{\beta\epsilon}\left[1 - \frac{\zeta\Gamma_t^2}{\epsilon}\right]\left[1 - \frac{\phi\Gamma_t^2(\beta-1)}{(\phi-1)\epsilon^2}\right] \right\}. \tag{45}$$

Proof: See Appendix E.

The result in Theorem 2 shows linear convergence of the sequence ‖u_t − u*‖²_G generated by ESOM, where the factor of linear convergence is 1/(1 + δ_t). Notice that the positive constant ζ is chosen from the interval ((m + M)/(2mM), ε/Γ_t²). This interval is non-empty if and only if the proximal parameter ε satisfies the condition ε > Γ_t²(m + M)/(2mM). However, Γ_t also depends on ε, which makes it unclear whether there always exists a choice of ε that satisfies the inequality ε > Γ_t²(m + M)/(2mM). In Appendix F, we provide a condition on ε that guarantees ε > Γ_t²(m + M)/(2mM) holds.

It follows from the result in Theorem 2 that the sequence of primal variables x_t converges to the optimal argument x* defined in (4).

Corollary 1: Under the assumptions in Theorem 2, the sequence of squared errors ‖x_t − x*‖² generated by ESOM converges to zero at a linear rate, i.e.,

$$\|x_t - x^*\|^2 \le \frac{1}{\alpha\epsilon}\left(\frac{1}{1 + \min_t\{\delta_t\}}\right)^{t}\,\|u_0 - u^*\|_G^2. \tag{46}$$

Proof: According to the definitions of the sequence u_t and matrix G, we can write ‖u_t − u*‖²_G = αε‖x_t − x*‖² + ‖v_t − v*‖², which implies that ‖x_t − x*‖² ≤ (1/(αε))‖u_t − u*‖²_G. Considering this result and the linear convergence of the sequence ‖u_t − u*‖²_G in (44), the claim in (46) follows.

C. Convergence Rate Comparison

The expression for δ_t in (45) verifies the intuition that the convergence rate of ESOM is slower than that of PMM. This is true since the upper bounds for δ in PMM are larger than the equivalent upper bounds for δ_t in ESOM. We obtain that δ_t is smaller than δ, which implies that the linear convergence factor 1/(1 + δ) of PMM is smaller than 1/(1 + δ_t) for ESOM. Therefore, for all steps t, the linear convergence of PMM is faster than that of ESOM.

Although the linear convergence factor of ESOM, 1/(1 + δ_t), is larger than 1/(1 + δ) for PMM, as time passes the gap between these two constants becomes smaller. In particular, after a number of iterations, (L/2)‖x_{t+1} − x_t‖ becomes smaller than 2M, and Γ_t can be simplified as

$$\Gamma_t = \frac{L}{2}\|x_{t+1} - x_t\| + \big(M + \epsilon + 2\alpha(1 - c)\big)\,\rho^{K+1}. \tag{47}$$

The term (L/2)‖x_{t+1} − x_t‖ eventually approaches zero, while the second term (M + ε + 2α(1 − c))ρ^{K+1} is constant. Although the second term does not approach zero, by a proper choice of ρ and K this term can become arbitrarily close to zero. Notice that when Γ_t approaches zero, if we set ζ = 1/Γ_t, the upper bounds in (45) for δ_t approach the upper bounds for δ of PMM in (37). Therefore, as time passes, Γ_t becomes smaller and the factor of linear convergence for ESOM, 1/(1 + δ_t), becomes closer to the linear convergence factor of PMM, 1/(1 + δ).

V. NUMERICAL EXPERIMENTS

In this section, we compare the performances of ESOM, EXTRA, decentralized quadratically approximated ADMM (DQM), and Network Newton (NN). First, we consider a linear least squares problem, and then we use the mentioned methods to solve a logistic regression problem.

A. Decentralized Linear Least Squares

Consider a decentralized linear least squares problem where each agent i ∈ {1, ..., n} holds its private measurement equation y_i = M_i x̃ + ν_i, where y_i ∈ R^{m_i} and M_i ∈ R^{m_i×p} are measured data, x̃ ∈ R^p is the unknown variable, and ν_i ∈ R^{m_i} is some unknown noise. Decentralized linear least squares estimates x̃ by solving the optimization problem

$$\tilde{x}^* = \operatorname*{argmin}_{\tilde{x}} \sum_{i=1}^{n} \|M_i \tilde{x} - y_i\|_2^2. \tag{48}$$

The network in this experiment is randomly generated with connectivity ratio r = 3/n, where r is defined as the number of edges divided by the number of all possible ones, n(n − 1)/2. We set n = 20, p = 5, and m_i = 5 for all i = 1, ..., n. The vectors y_i and matrices M_i, as well as the noise vectors ν_i, are generated following the standard normal distribution. We precondition the aggregated data matrices M_i so that the condition number of the problem is 10. The decision variables x_i are initialized as x_{i,0} = 0 for all nodes i = 1, ..., n, and the initial distance to the optimum is ‖x_{i,0} − x̃*‖ = 100.
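A sketch of this synthetic setup follows. The preconditioning that fixes the condition number at 10 is only indicated by a comment, since its exact form is not spelled out above, and the connectivity draw may need to be repeated until the graph is connected.

```python
import numpy as np

# Synthetic data for (48): n = 20 nodes, p = 5, m_i = 5, standard normal
# M_i and noise, with a standard normal ground truth generating y_i.
rng = np.random.default_rng(3)
n, p, mi = 20, 5, 5
x_true = rng.standard_normal(p)
M = [rng.standard_normal((mi, p)) for _ in range(n)]
y = [Mi @ x_true + rng.standard_normal(mi) for Mi in M]
# (A preconditioning of the aggregated M_i to reach condition number 10
#  would go here; it is omitted in this sketch.)

# Random graph with connectivity ratio r = 3/n: each of the n(n-1)/2
# possible edges is kept independently with probability r.
r = 3.0 / n
upper = np.triu(rng.random((n, n)) < r, k=1)
adj = (upper | upper.T).astype(int)   # redraw if the graph is disconnected

# Centralized least-squares solution of (48), the benchmark for the
# relative-error curves in Figs. 1 and 2.
H = sum(Mi.T @ Mi for Mi in M)
x_star = np.linalg.solve(H, sum(Mi.T @ yi for Mi, yi in zip(M, y)))
```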

Fig. 1. Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, NN-K, and PMM versus number of iterations for the least squares problem. Using a larger K for ESOM-K leads to faster convergence and makes the convergence path closer to the one for PMM.

We use the Metropolis constant edge weight matrix as the mixing matrix W in all experiments. We run PMM, EXTRA, and ESOM-K with fixed hand-optimized stepsizes α. The best choices of α for ESOM-0, ESOM-1, and ESOM-2 are α = 0.03, α = 0.04, and α = 0.05, respectively. The stepsize α = 0.1 leads to the best performance for EXTRA and is the value used in the numerical experiments. Notice that for the variations of NN-K there is no optimal choice of stepsize: a smaller stepsize leads to more accurate but slower convergence, while a larger stepsize accelerates convergence but toward a less accurate neighborhood of the optimal solution. Therefore, for NN-0, NN-1, and NN-2 we set α = 0.001, α = 0.008, and α = 0.02, respectively. Although the PMM algorithm is not implementable in a decentralized fashion, we use its convergence path, generated in a centralized manner, as our benchmark. The choice of stepsize for PMM is α = 2.

Fig. 1 illustrates the relative error ‖x_t − x*‖/‖x_0 − x*‖ versus the number of iterations. Notice that the vector x_t is the concatenation of the local vectors x_{i,t}, and the optimal vector x* is defined as x* = [x̃*; ...; x̃*] ∈ R^{np}. Observe that all the variations of NN-K fail to converge to the optimal argument; they converge linearly to a neighborhood of the optimal solution x*. Among the decentralized algorithms with exact linear convergence rate, EXTRA has the worst performance, and all the variations of ESOM-K outperform EXTRA. Recall that the problem condition number is 10 in our experiment; the difference between EXTRA and ESOM-K is more significant for problems with larger condition numbers. Further, choosing a larger value of K for ESOM-K leads to faster convergence, and as we increase K, the convergence path of ESOM-K approaches the convergence path of PMM.

EXTRA requires one round of communications per iteration, while NN-K and ESOM-K require K + 1 rounds of local communications per iteration. Thus, the convergence paths of these methods in terms of rounds of communications might be different from the ones in Fig. 1. The convergence paths of NN, ESOM, and EXTRA in terms of rounds of local communications are shown in Fig. 2. In this plot we ignore PMM, since it requires infinite rounds of communications per iteration.

Fig. 2. Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, NN-K, and PMM versus rounds of communications with neighboring nodes for the least squares problem. ESOM-0 is the most efficient algorithm in terms of communication cost among all the methods.

The main difference between Figs. 1 and 2 is in the performances of ESOM-0, ESOM-1, and ESOM-2. All of the variations of ESOM outperform EXTRA in terms of rounds of communications, while the best performance belongs to ESOM-0. This observation shows that increasing the approximation level K does not necessarily improve the performance of ESOM-K in terms of communication cost.

B. Decentralized Logistic Regression

We consider the application of ESOM to solving a logistic regression problem of the form

$$\tilde{x}^* := \operatorname*{argmin}_{\tilde{x} \in \mathbb{R}^p} \frac{\lambda}{2}\|\tilde{x}\|^2 + \sum_{i=1}^{n}\sum_{j=1}^{m_i} \ln\left(1 + \exp\left(-y_{ij}\, s_{ij}^T \tilde{x}\right)\right), \tag{49}$$

where every agent i has access to m_i training samples (s_ij, y_ij) ∈ R^p × {−1, +1}, j = 1, ..., m_i, including explanatory/feature variables s_ij and binary outputs/outcomes y_ij. The regularization term (λ/2)‖x̃‖² is added to avoid overfitting, where λ > 0. Hence, in the decentralized setting the local objective function f_i of node i is given by

$$f_i(\tilde{x}) = \frac{\lambda}{2n}\|\tilde{x}\|^2 + \sum_{j=1}^{m_i} \ln\left(1 + \exp\left(-y_{ij}\, s_{ij}^T \tilde{x}\right)\right). \tag{50}$$

The settings are as follows. The connected network is randomly generated with n = 20 agents and connectivity ratio r = 3/n. Each agent holds 3 samples, i.e., m_i = 3 for all i. The dimension of the sample vectors s_ij is p = 3. The samples are randomly generated, and the optimal logistic classifier x̃* is pre-computed through a centralized adaptive gradient method. We use the Metropolis constant edge weight matrix as the mixing matrix W in ESOM-K. The stepsizes α for ESOM-0, ESOM-1, ESOM-2, EXTRA, and DQM are hand-optimized, and the best of each is used for the comparison.

Figs. 3 and 4 showcase the convergence paths of ESOM-0, ESOM-1, ESOM-2, EXTRA, and DQM versus the number of iterations and rounds of communications, respectively. The results match the observations for the least squares problem in Figs. 1 and 2.

Fig. 3. Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, and DQM versus number of iterations for the logistic regression problem. EXTRA is significantly slower than the ESOM methods. The proposed methods (ESOM-K) outperform DQM.

Fig. 4. Relative error ‖x_t − x*‖/‖x_0 − x*‖ of EXTRA, ESOM-K, and DQM versus rounds of communications for the logistic regression problem. ESOM-0 has the best performance in terms of rounds of communications, and it outperforms DQM.

Different versions of ESOM-K converge faster than EXTRA both in terms of communication cost and number of iterations. Moreover, ESOM-2 converges faster than ESOM-1 and ESOM-0 in terms of number of iterations, while ESOM-0 has the best performance in terms of communication cost for achieving a target accuracy. Comparing the convergence paths of ESOM-0, ESOM-1, and ESOM-2 with DQM shows that the number of iterations required for the convergence of DQM is larger than the number required for ESOM-0, ESOM-1, and ESOM-2. In terms of communication cost, DQM has a better performance relative to ESOM-1 and ESOM-2, while ESOM-0 is the most efficient algorithm.

VI. CONCLUSION

We studied the consensus optimization problem where the components of a global objective function are available at different nodes of a network. We proposed an exact second-order method (ESOM) that converges to the optimal argument of the global objective function at a linear rate. We developed the update of ESOM by substituting the primal update of the proximal method of multipliers (PMM) with its second-order approximation. Moreover, we approximated the Hessian inverse of the proximal augmented Lagrangian by truncating its Taylor series. This approximation leads to a class of algorithms ESOM-K, where K + 1 indicates the number of Taylor series terms that are used for the Hessian inverse approximation. Convergence analysis of ESOM-K shows that the sequence of iterates converges to the optimal argument linearly irrespective of the choice of K. We showed that the linear convergence factor of ESOM-K is a function of time and of the choice of K. The linear convergence factor of ESOM approaches the linear convergence factor of PMM as time passes. Moreover, a larger choice of K makes the factor of linear convergence for ESOM closer to the one for PMM. Numerical results verify the theoretical linear convergence and the relation between the linear convergence factors of ESOM-K and PMM. Further, we observed that a larger choice of K for ESOM-K leads to faster convergence in terms of number of iterations, while the most efficient version of ESOM-K in terms of communication cost is ESOM-0.

APPENDIX A: PROOF OF LEMMA 1

Consider the updates of PMM in (6) and (7). According to (4), the optimal argument x* satisfies the condition (I − Z)^{1/2}x* = 0. This observation, in conjunction with the dual variable update in (7), yields the claim in (33). To prove the claim in (34), note that the optimality condition of (6) implies ∇_x L(x_{t+1}, v_t) + ε(x_{t+1} − x_t) = 0. Based on the definition of the Lagrangian L(x, v) in (5), the optimality condition for the primal update of PMM can be written as

$$\nabla f(x_{t+1}) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_{t+1} + \epsilon(x_{t+1} - x_t) = 0. \tag{51}$$

Further, notice that one of the KKT conditions of the optimization problem in (4) is

$$\nabla f(x^*) + (I - Z)^{1/2} v^* = 0. \tag{52}$$

Moreover, the optimal solution x* = [x̃*; ...; x̃*] of (4) lies in null{I − Z}. Therefore, we obtain

$$\alpha(I - Z)x^* = 0. \tag{53}$$

Subtracting the equalities in (52) and (53) from (51) yields

$$\nabla f(x_{t+1}) - \nabla f(x^*) + (I - Z)^{1/2}(v_t - v^*) + \alpha(I - Z)(x_{t+1} - x^*) + \epsilon(x_{t+1} - x_t) = 0. \tag{54}$$

Regrouping the terms in (33) implies that v_t is equivalent to

$$v_t = v_{t+1} - \alpha(I - Z)^{1/2}(x_{t+1} - x^*). \tag{55}$$

Substituting v_t in (54) by the expression on the right-hand side of (55) leads to the claim in (34).

APPENDIX B: PROOF OF THEOREM 1

According to Assumption 1, the global objective function f is strongly convex with constant m and its gradients ∇f are Lipschitz continuous with constant M.

Considering these assumptions, we obtain that the inner product (x_{t+1} − x*)^T(∇f(x_{t+1}) − ∇f(x*)) is lower bounded by

$$\frac{mM}{m+M}\|x_{t+1} - x^*\|^2 + \frac{1}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \le (x_{t+1} - x^*)^T\big(\nabla f(x_{t+1}) - \nabla f(x^*)\big). \tag{56}$$

The result in (34) shows that the difference ∇f(x_{t+1}) − ∇f(x*) is equal to −(I − Z)^{1/2}(v_{t+1} − v*) − ε(x_{t+1} − x_t). Apply this substitution into (56) and multiply both sides of the resulting inequality by 2α to obtain

$$\frac{2\alpha mM}{m+M}\|x_{t+1} - x^*\|^2 + \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \le -2\alpha(x_{t+1} - x^*)^T(I - Z)^{1/2}(v_{t+1} - v^*) - 2\alpha\epsilon(x_{t+1} - x^*)^T(x_{t+1} - x_t). \tag{57}$$

Based on the result in (33), we can substitute α(x_{t+1} − x*)^T(I − Z)^{1/2} by (v_{t+1} − v_t)^T. Thus, we can rewrite (57) as

$$\frac{2\alpha mM}{m+M}\|x_{t+1} - x^*\|^2 + \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \le -2(v_{t+1} - v_t)^T(v_{t+1} - v^*) - 2\alpha\epsilon(x_{t+1} - x_t)^T(x_{t+1} - x^*). \tag{58}$$

Notice that for any vectors a, b, and c we can write 2(a − b)^T(a − c) = ‖a − b‖² + ‖a − c‖² − ‖b − c‖². By setting a = v_{t+1}, b = v_t, and c = v*, we obtain that the inner product 2(v_{t+1} − v_t)^T(v_{t+1} − v*) in (58) can be written as ‖v_{t+1} − v_t‖² + ‖v_{t+1} − v*‖² − ‖v_t − v*‖². Likewise, setting a = x_{t+1}, b = x_t, and c = x* implies that the inner product 2(x_{t+1} − x_t)^T(x_{t+1} − x*) in (58) is equal to ‖x_{t+1} − x_t‖² + ‖x_{t+1} − x*‖² − ‖x_t − x*‖². Hence, (58) can be simplified as

$$\frac{2\alpha mM}{m+M}\|x_{t+1} - x^*\|^2 + \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \le \|v_t - v^*\|^2 - \|v_{t+1} - v^*\|^2 - \|v_{t+1} - v_t\|^2 + \alpha\epsilon\|x_t - x^*\|^2 - \alpha\epsilon\|x_{t+1} - x^*\|^2 - \alpha\epsilon\|x_{t+1} - x_t\|^2. \tag{59}$$

Now, using the definitions of the variable u and matrix G in (35), we can substitute ‖v_t − v*‖² − ‖v_{t+1} − v*‖² + αε‖x_t − x*‖² − αε‖x_{t+1} − x*‖² by ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G. Moreover, the squared norm ‖v_{t+1} − v_t‖² is equivalent to ‖x_{t+1} − x*‖²_{α²(I−Z)} based on the result in (33). By applying these substitutions we can rewrite (59) as

$$\frac{2\alpha mM}{m+M}\|x_{t+1} - x^*\|^2 + \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \le \|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2 - \alpha\epsilon\|x_{t+1} - x_t\|^2 - \|x_{t+1} - x^*\|_{\alpha^2(I-Z)}^2. \tag{60}$$

Regrouping the terms in (60) leads to the following lower bound for the difference ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G:

$$\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2 \ge \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \alpha\epsilon\|x_{t+1} - x_t\|^2 + \|x_{t+1} - x^*\|^2_{\frac{2\alpha mM}{m+M}I + \alpha^2(I-Z)}. \tag{61}$$

Observe that the result in (61) provides a lower bound for the decrement ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G. To prove the claim in (36), we need to show that for a positive constant δ we have ‖u_t − u*‖²_G − ‖u_{t+1} − u*‖²_G ≥ δ‖u_{t+1} − u*‖²_G. Therefore, the inequality in (36) is satisfied if we can show that the lower bound in (61) is greater than δ‖u_{t+1} − u*‖²_G, or equivalently,

$$\delta\|v_{t+1} - v^*\|^2 + \delta\alpha\epsilon\|x_{t+1} - x^*\|^2 \le \frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \alpha\epsilon\|x_{t+1} - x_t\|^2 + \|x_{t+1} - x^*\|^2_{\frac{2\alpha mM}{m+M}I + \alpha^2(I-Z)}. \tag{62}$$

To prove that the inequality in (62) holds for some δ > 0, we first find an upper bound for the squared norm ‖v_{t+1} − v*‖² in terms of the summands on the right-hand side of (62). To do so, consider the relation (34) along with the fact that v_{t+1} and v* both lie in the column space of (I − Z)^{1/2}. Note that there always exists a unique v* that lies in the column space of (I − Z)^{1/2}; see Lemma 1 in [28]. Since we know that both v_{t+1} and v* lie in the column space of (I − Z)^{1/2}, there exists a vector r ∈ R^{np} such that v_{t+1} − v* = (I − Z)^{1/2} r. This relation implies that ‖(I − Z)^{1/2}(v_{t+1} − v*)‖² can be written as ‖(I − Z)r‖² = r^T(I − Z)²r. The eigenvalues of the matrix (I − Z)² are the squares of the eigenvalues of the matrix (I − Z). Thus, we can write r^T(I − Z)²r ≥ λ̂_min(I − Z)r^T(I − Z)r, where λ̂_min(I − Z) is the smallest non-zero eigenvalue of the matrix I − Z. Observing this inequality and the definition v_{t+1} − v* = (I − Z)^{1/2}r, we can write

$$\|(I - Z)^{1/2}(v_{t+1} - v^*)\|^2 \ge \hat{\lambda}_{\min}(I - Z)\,\|v_{t+1} - v^*\|^2. \tag{63}$$

Moreover, from the equality in (34) we obtain that ‖(I − Z)^{1/2}(v_{t+1} − v*)‖² is bounded above by

$$\|(I - Z)^{1/2}(v_{t+1} - v^*)\|^2 \le \frac{\beta\epsilon^2}{\beta - 1}\|x_{t+1} - x_t\|^2 + \beta\,\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2, \tag{64}$$

where β > 1 is a tunable free parameter. Replacing the norm ‖(I − Z)^{1/2}(v_{t+1} − v*)‖² in (64) by its lower bound in (63) implies that ‖v_{t+1} − v*‖² is bounded above by

$$\|v_{t+1} - v^*\|^2 \le \frac{\beta\epsilon^2}{(\beta - 1)\hat{\lambda}_{\min}(I - Z)}\|x_{t+1} - x_t\|^2 + \frac{\beta}{\hat{\lambda}_{\min}(I - Z)}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2. \tag{65}$$

Considering the result in (65), to satisfy the inequality in (62), which is a sufficient condition for the claim in (36), it remains to show that

$$\frac{2\alpha}{m+M}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \alpha\epsilon\|x_{t+1} - x_t\|^2 + \|x_{t+1} - x^*\|^2_{\frac{2\alpha mM}{m+M}I + \alpha^2(I-Z)} \ge \frac{\delta\beta\epsilon^2}{(\beta-1)\hat{\lambda}_{\min}(I-Z)}\|x_{t+1} - x_t\|^2 + \delta\epsilon\alpha\|x_{t+1} - x^*\|^2 + \frac{\delta\beta}{\hat{\lambda}_{\min}(I-Z)}\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2. \tag{66}$$

To enable (66), and consequently (62), we only need to verify that there exists δ > 0 such that

$$\frac{2\alpha mM}{m+M}I + \alpha^2(I - Z) \succeq \delta\alpha\epsilon I, \qquad \frac{2\alpha}{m+M} \ge \frac{\delta\beta}{\hat{\lambda}_{\min}(I-Z)}, \qquad \frac{\delta\beta\epsilon^2}{(\beta-1)\hat{\lambda}_{\min}(I-Z)} \le \alpha\epsilon. \tag{67}$$

The conditions in (67) are satisfied if the constant δ is chosen as in (37). Therefore, for δ as in (37) the claim in (62) holds, which implies the claim in (36).

APPENDIX C: PROOF OF LEMMA 2

Consider the primal update of ESOM in (14). By regrouping the terms we obtain that

$$\nabla f(x_t) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_t + H_t(K)(x_{t+1} - x_t) = 0, \tag{68}$$

where H_t(K) is the inverse of the Hessian inverse approximation H_t^{-1}(K). Recall the definition of the exact Hessian H_t in (9). Adding and subtracting the term H_t(x_{t+1} − x_t) in (68) yields

$$\nabla f(x_t) + \nabla^2 f(x_t)(x_{t+1} - x_t) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_{t+1} + \epsilon(x_{t+1} - x_t) + \big(H_t(K) - H_t\big)(x_{t+1} - x_t) = 0. \tag{69}$$

Now, using the definition of the error vector e_t in (40), we can rewrite (69) as

$$\nabla f(x_{t+1}) + (I - Z)^{1/2} v_t + \alpha(I - Z)x_{t+1} + \epsilon(x_{t+1} - x_t) + e_t = 0. \tag{70}$$

Notice that the result in (70) is identical to the expression for PMM in (51), except for the error term e_t. To prove the claim in (39) from (70), it remains to follow the steps in (52)–(55).

APPENDIX D: PROOF OF LEMMA 3

To prove the result in (42), we first use the result in Proposition 2 of [29], which shows that when the eigenvalues of the Hessian ∇²f(x) are bounded above by M and the Hessian is Lipschitz continuous with constant L, we can write

$$\|\nabla f(x_t) + \nabla^2 f(x_t)(x_{t+1} - x_t) - \nabla f(x_{t+1})\| \le \|x_{t+1} - x_t\|\,\min\left\{2M,\ \frac{L}{2}\|x_{t+1} - x_t\|\right\}. \tag{71}$$

Considering the result in (71), it remains to find an upper bound for the second term of the error vector e_t, which is (H_t(K) − H_t)(x_{t+1} − x_t). To do so, we first develop an upper bound for the norm ‖H_t(K) − H_t‖. Notice that by factoring the term H_t(K)^{1/2} from left and right and using the Cauchy–Schwarz inequality, we obtain

$$\|H_t(K) - H_t\| \le \|H_t(K)\|\,\left\|I - H_t(K)^{-1/2} H_t\, H_t(K)^{-1/2}\right\|. \tag{72}$$

Note that the eigenvalues of the matrices I − H_t H_t^{-1}(K) and I − H_t(K)^{-1/2} H_t H_t(K)^{-1/2} are the same, since these two matrices are similar. In linear algebra, two matrices A and Ã are called similar if Ã = P^{-1}AP for an invertible matrix P. Thus, we proceed to find bounds for the eigenvalues of I − H_t H_t^{-1}(K) in order to bound the norm in (72). According to Lemma 3 in [12], we can simplify I − H_t H_t^{-1}(K) as

$$I - H_t H_t^{-1}(K) = \left(B D_t^{-1}\right)^{K+1}. \tag{73}$$

Note that the matrices B and D_t in this paper are different from the ones in [12], but their analyses are very similar. Following the proof of Proposition 2 in [12], we define D̂ := 2α(I − Z_d). Notice that the matrix D̂ is block diagonal, where its i-th diagonal block is 2α(1 − w_ii)I_p. Thus, D̂ is positive definite and invertible. Instead of studying an upper bound for the eigenvalues of BD_t^{-1}, we find an upper bound for the eigenvalues of its similar matrix D_t^{-1/2} B D_t^{-1/2}, which is symmetric. We are allowed to write the product D_t^{-1/2} B D_t^{-1/2} as

$$D_t^{-1/2} B D_t^{-1/2} = \left(D_t^{-1/2}\hat{D}^{1/2}\right)\left(\hat{D}^{-1/2} B \hat{D}^{-1/2}\right)\left(\hat{D}^{1/2} D_t^{-1/2}\right). \tag{74}$$

The next step is to find an upper bound for the eigenvalues of D̂^{-1/2} B D̂^{-1/2} in (74). Based on the definitions of the matrices B and D̂, the product BD̂^{-1} is given by

$$B\hat{D}^{-1} = (I - 2Z_d + Z)\,\big(2(I - Z_d)\big)^{-1}. \tag{75}$$

According to the result in Proposition 2 of [12], the eigenvalues of the matrix (I − 2Z_d + Z)(2(I − Z_d))^{-1} are uniformly bounded by 0 and 1. Thus, we obtain that the eigenvalues of D̂^{-1/2} B D̂^{-1/2} are bounded by 0 and 1, and we can write

$$0 \preceq \hat{D}^{-1/2} B \hat{D}^{-1/2} \preceq I. \tag{76}$$

According to the definitions of the matrices D̂ and D_t, the product D̂^{1/2} D_t^{-1} D̂^{1/2} is block diagonal and its i-th diagonal block is given by

$$\left[\hat{D}^{1/2} D_t^{-1} \hat{D}^{1/2}\right]_{ii} = \left(\frac{\nabla^2 f_i(x_{i,t}) + \epsilon I}{2\alpha(1 - w_{ii})} + I\right)^{-1}. \tag{77}$$

Based on Assumption 1, the eigenvalues of the local Hessians ∇²f_i(x_i) are bounded by m and M. Further, notice that the diagonal elements w_ii of the weight matrix W are bounded below

13 MOKHTARI e al.: DECENTRALIZED SECOND-ORDER METHOD WITH EXACT LINEAR CONVERGENCE RATE 519 by c. Considering hese bounds, we can show ha he eigenvalues of he marices (1/(1 w ii ))( 2 f i (x i, )ɛi)i for all i =1,...,nare bounded below by [ m ɛ (1 c) 1 ] I 2 f i (x i, )ɛi (1 w ii ) I. (78) By considering he bounds in (78), he eigenvalues of each block of he marix ˆDD, inroduced in (77), are bounded above as ( 2 [ ] f i (x i, )ɛi m ɛ I) (1 w ii ) (1 c) 1 I. (79) The upper bound in (79) for he eigenvalues of each diagonal block of he marix ˆDD implies ha he marix norm is bounded above by ˆDD (1 c) ˆDD ρ := (1 c)m ɛ. (80) Considering he upper bounds in (76) and (80) and he relaion in (74) we obain ha D 1 2 BD 1 2 ρ. (81) Thus, he eigenvalues of he posiive definie symmeric marix D /2 BD /2 are bounded by ρ. Hence, he eigenvalues of is similar marix BD are bounded by ρ. This bound along wih he resul in (73) shows ha he eigenvalues of he marix I H H (K) are uniformly bounded by 0 and ρ K 1. Therefore, he eigenvalues of is similar symmeric marix I H (K)H H/2 (K) are beween 0 and ρ K which /2 /2 implies ha I H (K)H H/2 (K) ρ K 1. This resul in conjuncion wih he inequaliy in (72) yields H (K) H ρ K 1 H (K) (82) To bound he norm H (K), we firs find a lower bound for he eigenvalues of he approximae Hessian inverse H (K). Noice ha according o he definiion of he approximae Hessian inverse in (13), we can wrie K H (K) :=D D (D /2 BD /2 ) u D /2. u=1 (83) Noice ha according o he resul in Proposiion 1 of [12], he marix (I 2Z d Z) is posiive semidefinie which implies ha B = α (I 2Z d Z) is also posiive semidefinie. Thus, all he K summands in (83) are posiive semidefinie and as a resul we obain ha D H (K). (84) The eigenvalues of I Z d are bounded above by 1 c, since all he local weighs w ii are larger han c. This observaion in conjuncion wih he srong convexiy of he global objecive funcion f implies ha he eigenvalues of D = 2 f(x ) ɛi (I Z d ) are bounded above by M ɛ (1 c). Therefore, he eigenvalues of D are bounded below as 1 M ɛ (1 c) I D. (85) The resuls in (84) and (85) imply ha he eigenvalues of he approximae Hessian inverse H (K) are greaer han 1/(M ɛ (1 c)). Therefore, he eigenvalues of he posiive definie marix H (K) are smaller han M ɛ (1 c) and we can wrie H (K) M ɛ (1 c). (86) Considering he inequaliies in (82) and (86) and using he Cauchy-Schwarz inequaliy we can show ha he norm ( H (K) H )(x 1 x ) is bounded above by ( H (K) H )(x 1 x ) (M ɛ (1 c)) ρ K 1 x 1 x. (87) Observing he inequaliies in (71) and (87) and using he riangle inequaliy he claim in (42) follows. APPENDIX E PROOF OF THEOREM 2 Noice ha in proving he claim in (44) we use some of he seps in he proof of Theorem 1 o avoid rewriing similar equaions. Firs, noe ha according o he resul in (39), he difference f(x 1 ) f(x ) for he ESOM mehod can be wrien as f(x 1 ) f(x )= (I Z) 1/2 (v 1 v ) ɛ(x 1 x ) e. (88) Now recall he he inequaliy in (56) and subsiue he gradiens difference f(x 1 ) f(x ) in he inner produc (x 1 x ) T ( f(x 1 ) f(x )) by he expression in he righ hand side of (88). Applying his subsiuion and muliplying boh sides of he implied inequaliy by follows mm m M x 1 x 2 m M f(x 1) f(x ) 2 ɛ(x 1 x ) T (x 1 x ) (x 1 x ) T e (x 1 x ) T (I Z) 1/2 (v 1 v ). (89) By following he seps in (57) (61), he resul in (89) leads o a lower bound for u u 2 G u 1 u 2 G as u u 2 G u 1 u 2 G m M f(x 1) f(x ) 2 αɛ x 1 x 2 x 1 x 2 2 αmm m M Iα2 (I Z) (x 1 x ) T e. 
APPENDIX E
PROOF OF THEOREM 2

In proving the claim in (44) we reuse some of the steps of the proof of Theorem 1 to avoid rewriting similar equations. First, note that according to the result in (39), the gradient difference $\nabla f(x_{t+1}) - \nabla f(x^*)$ for the ESOM method can be written as

$$\nabla f(x_{t+1}) - \nabla f(x^*) = -(I - Z)^{1/2}(v_{t+1} - v^*) - \epsilon(x_{t+1} - x_t) + e_t. \qquad (88)$$

Now recall the inequality in (56) and substitute the gradient difference $\nabla f(x_{t+1}) - \nabla f(x^*)$ in the inner product $(x_{t+1} - x^*)^T (\nabla f(x_{t+1}) - \nabla f(x^*))$ by the expression on the right-hand side of (88). Applying this substitution and multiplying both sides of the implied inequality by $2$, it follows that

$$\frac{2mM}{m+M} \|x_{t+1} - x^*\|^2 + \frac{2}{m+M} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 \leq -2\epsilon (x_{t+1} - x^*)^T (x_{t+1} - x_t) + 2 (x_{t+1} - x^*)^T e_t - 2 (x_{t+1} - x^*)^T (I - Z)^{1/2} (v_{t+1} - v^*). \qquad (89)$$

By following the steps in (57)-(61), the result in (89) leads to a lower bound for the difference $\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2$ as

$$\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2 \geq \frac{2\alpha}{m+M} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \alpha\epsilon \|x_{t+1} - x_t\|^2 + \|x_{t+1} - x^*\|^2_{\frac{2\alpha mM}{m+M} I + \alpha^2 (I - Z)} + 2\alpha (x_{t+1} - x^*)^T e_t. \qquad (90)$$

Note that the inner product $2 (x_{t+1} - x^*)^T e_t$ is bounded below by $-(1/\zeta)\|x_{t+1} - x^*\|^2 - \zeta \|e_t\|^2$ for any positive constant $\zeta > 0$. Thus, the lower bound in (90) can be updated as

$$\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2 \geq \|x_{t+1} - x^*\|^2_{\left(\frac{2\alpha mM}{m+M} - \frac{\alpha}{\zeta}\right) I + \alpha^2 (I - Z)} + \alpha\epsilon \|x_{t+1} - x_t\|^2 + \frac{2\alpha}{m+M} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 - \alpha\zeta \|e_t\|^2. \qquad (91)$$

To establish (44), we need to show that the difference $\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2$ is bounded below by $\delta \|u_{t+1} - u^*\|_G^2$. To do so, we show that the lower bound for $\|u_t - u^*\|_G^2 - \|u_{t+1} - u^*\|_G^2$ in (91) is larger than $\delta \|u_{t+1} - u^*\|_G^2$, i.e.,

$$\delta \|v_{t+1} - v^*\|^2 + \delta\alpha\epsilon \|x_{t+1} - x^*\|^2 \leq \|x_{t+1} - x^*\|^2_{\left(\frac{2\alpha mM}{m+M} - \frac{\alpha}{\zeta}\right) I + \alpha^2 (I - Z)} + \alpha\epsilon \|x_{t+1} - x_t\|^2 + \frac{2\alpha}{m+M} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 - \alpha\zeta \|e_t\|^2. \qquad (92)$$

We proceed to find an upper bound for the squared norm $\|v_{t+1} - v^*\|^2$ in terms of the summands on the right-hand side of (92). Consider the relation (70) as well as the fact that $v_{t+1}$ and $v^*$ both lie in the column space of $(I - Z)^{1/2}$. It follows that $\|v_{t+1} - v^*\|^2$ is bounded above by

$$\|v_{t+1} - v^*\|^2 \leq \frac{\beta \epsilon^2}{(\beta - 1)\hat{\lambda}} \|x_{t+1} - x_t\|^2 + \frac{\beta\phi}{(\phi - 1)\hat{\lambda}} \|e_t\|^2 + \frac{\phi\beta}{\hat{\lambda}} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2, \qquad (93)$$

where we have used $\hat{\lambda}$ instead of $\hat{\lambda}_{\min}(I - Z)$ to simplify notation. By substituting the upper bound in (93) for the squared norm $\|v_{t+1} - v^*\|^2$ in (92), we obtain a sufficient condition for the result in (92), which is given by

$$\delta\alpha\epsilon \|x_{t+1} - x^*\|^2 + \frac{\delta\beta\epsilon^2}{(\beta - 1)\hat{\lambda}} \|x_{t+1} - x_t\|^2 + \frac{\delta\phi\beta}{\hat{\lambda}} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \frac{\delta\beta\phi}{(\phi - 1)\hat{\lambda}} \|e_t\|^2 \leq \|x_{t+1} - x^*\|^2_{\left(\frac{2\alpha mM}{m+M} - \frac{\alpha}{\zeta}\right) I + \alpha^2 (I - Z)} + \alpha\epsilon \|x_{t+1} - x_t\|^2 + \frac{2\alpha}{m+M} \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 - \alpha\zeta \|e_t\|^2. \qquad (94)$$

Substitute the squared norm $\|e_t\|^2$ terms in (94) by the upper bound in (42). It follows from this substitution, and from regrouping the terms, that

$$0 \leq \|x_{t+1} - x^*\|^2_{\left(\frac{2\alpha mM}{m+M} - \frac{\alpha}{\zeta} - \delta\alpha\epsilon\right) I + \alpha^2 (I - Z)} + \left( \frac{2\alpha}{m+M} - \frac{\delta\phi\beta}{\hat{\lambda}} \right) \|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2 + \left[ \alpha\epsilon - \frac{\delta\beta\epsilon^2}{(\beta - 1)\hat{\lambda}} - \frac{\delta\beta\phi\Gamma^2}{(\phi - 1)\hat{\lambda}} - \alpha\zeta\Gamma^2 \right] \|x_{t+1} - x_t\|^2. \qquad (95)$$

Notice that if the inequality in (95) is satisfied, then the result in (94) holds, which implies the result in (92) and the linear convergence claim in (44). To satisfy the inequality in (95), we need to make sure that the coefficients of the terms $\|x_{t+1} - x^*\|^2$, $\|x_{t+1} - x_t\|^2$, and $\|\nabla f(x_{t+1}) - \nabla f(x^*)\|^2$ are non-negative. Therefore, the inequality in (95) holds if $\delta$ satisfies

$$\frac{2\alpha mM}{m+M} - \frac{\alpha}{\zeta} - \delta\alpha\epsilon \geq 0, \qquad \frac{2\alpha}{m+M} - \frac{\delta\phi\beta}{\hat{\lambda}} \geq 0, \qquad \alpha\epsilon \geq \frac{\delta\beta\epsilon^2}{(\beta - 1)\hat{\lambda}} + \frac{\delta\beta\phi\Gamma^2}{(\phi - 1)\hat{\lambda}} + \alpha\zeta\Gamma^2. \qquad (96)$$

The conditions in (96) are satisfied if $\delta$ is chosen as in (45); thus, the linear convergence claim in (44) holds.
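The explicit choice of $\delta$ in (45) is not reproduced in this appendix; as a rough illustration of how the three inequalities in (96) pin it down, the sketch below simply solves each inequality for $\delta$ and returns the minimum. The function name and all numerical inputs (including the free parameters $\beta, \phi > 1$ and the midpoint choice of $\zeta$) are assumptions made for the example, not values from the paper.

```python
# Illustrative sketch: largest delta compatible with the three conditions
# in (96). This is not the paper's closed-form expression (45).
def esom_delta(m, M, alpha, eps, lam_hat, Gamma, beta=2.0, phi=2.0, zeta=None):
    if zeta is None:
        # zeta must lie in ((m+M)/(2mM), eps/Gamma^2); take the midpoint.
        lo, hi = (m + M) / (2 * m * M), eps / Gamma**2
        assert lo < hi, "need eps > Gamma^2 (m+M)/(2mM), cf. Proposition 2"
        zeta = 0.5 * (lo + hi)
    d1 = (2 * m * M / (m + M) - 1 / zeta) / eps              # first condition
    d2 = 2 * alpha * lam_hat / ((m + M) * phi * beta)        # second condition
    d3 = (alpha * eps - alpha * zeta * Gamma**2) / (
        beta * eps**2 / ((beta - 1) * lam_hat)
        + beta * phi * Gamma**2 / ((phi - 1) * lam_hat))     # third condition
    return min(d1, d2, d3)

# Example with assumed constants: a well-conditioned problem, small Gamma.
print(esom_delta(m=1.0, M=4.0, alpha=1.0, eps=5.0, lam_hat=0.5, Gamma=0.8))
```

With these inputs all three candidate values are positive, so a strictly positive contraction factor $\delta$ exists, which is exactly what the feasibility discussion in Appendix F establishes in general.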

APPENDIX F

The result in Theorem 2 holds if the interval $((m+M)/(2mM), \, \epsilon/\Gamma^2)$ for the free parameter $\zeta$ is non-empty, or equivalently, if the inequality $\epsilon > \Gamma^2 (m+M)/(2mM)$ holds. However, $\Gamma$ depends on $\epsilon$, which makes it unclear whether there exists a choice of $\epsilon$ that satisfies the inequality $\epsilon > \Gamma^2 (m+M)/(2mM)$. In the following proposition, we prove that the interval $((m+M)/(2mM), \, \epsilon/\Gamma^2)$ is non-empty for a proper choice of $\epsilon$.

Proposition 2: Consider ESOM as introduced in (8)-(15). Recall the definition of $\Gamma$ in (43). If the constant $\epsilon$ is chosen such that

$$\epsilon > \frac{m+M}{2mM} \left( 2M + 2\alpha(1 - c)\frac{M}{m} \right)^2, \qquad (97)$$

then the inequality $\epsilon > \Gamma^2 (m+M)/(2mM)$ holds and the set $((m+M)/(2mM), \, \epsilon/\Gamma^2)$ is non-empty.

Proof: Note that the condition $\epsilon > \Gamma^2 (m+M)/(2mM)$ is equivalent to

$$\Gamma < \sqrt{\frac{2\epsilon m M}{m+M}}. \qquad (98)$$

According to the definition of $\Gamma$, the expression $\rho := 2\alpha(1-c)/(2\alpha(1-c)+m+\epsilon)$, and the fact that $2M \geq \min\{2M, (L/2)\|x_{t+1} - x_t\|\}$, we can write

$$\Gamma \leq 2M + (M + \epsilon + 2\alpha(1 - c)) \left( \frac{2\alpha(1-c)}{2\alpha(1-c)+m+\epsilon} \right)^{K+1}. \qquad (99)$$

The results in (98) and (99) show that the inequality $\epsilon > \Gamma^2 (m+M)/(2mM)$ holds if the following inequality holds:

$$2M + (M + \epsilon + 2\alpha(1 - c)) \left( \frac{2\alpha(1-c)}{2\alpha(1-c)+m+\epsilon} \right)^{K+1} < \sqrt{\frac{2\epsilon m M}{m+M}}. \qquad (100)$$

Thus, if the condition in (100) holds, then we have $\epsilon > \Gamma^2 (m+M)/(2mM)$. Note that $(2\alpha(1-c)/(2\alpha(1-c)+m+\epsilon))^{K+1} \leq 2\alpha(1-c)/(2\alpha(1-c)+m+\epsilon)$ for any $K \geq 0$. Thus, if the following inequality is satisfied, the inequality in (100) is also valid:

$$2M + (M + \epsilon + 2\alpha(1 - c)) \, \frac{2\alpha(1-c)}{2\alpha(1-c)+m+\epsilon} < \sqrt{\frac{2\epsilon m M}{m+M}}. \qquad (101)$$

Considering that $m < M$ and $\epsilon + 2\alpha(1-c) > 0$, we obtain that $(M + \epsilon + 2\alpha(1-c))/(m + \epsilon + 2\alpha(1-c)) \leq M/m$. Since the second summand on the left-hand side of (101) can be written as $2\alpha(1-c)\,(M + \epsilon + 2\alpha(1-c))/(m + \epsilon + 2\alpha(1-c))$, replacing this ratio by the upper bound $M/m$ yields the condition

$$2M + 2\alpha(1 - c)\frac{M}{m} < \sqrt{\frac{2\epsilon m M}{m+M}}. \qquad (102)$$

Note that if the condition in (102) holds, then the condition in (101) is satisfied. The result in (102) shows that if $\epsilon$ satisfies

$$\epsilon > \frac{m+M}{2mM} \left( 2M + 2\alpha(1 - c)\frac{M}{m} \right)^2, \qquad (103)$$

then the inequality in (102), and consequently the inequalities in (101) and (100), hold true, which implies that the condition $\epsilon > \Gamma^2 (m+M)/(2mM)$ is satisfied.
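The chain (98)-(103) can be checked numerically. In the minimal sketch below, all constants ($m$, $M$, $\alpha$, $c$, $K$) are assumed values chosen for illustration: any $\epsilon$ above the threshold of (97) makes the upper bound (99) on $\Gamma$ fall below the right-hand side of (98), so the interval for $\zeta$ is non-empty.

```python
# Numerical sanity check of Proposition 2 with assumed constants.
import math

m, M, alpha, c, K = 1.0, 4.0, 1.0, 0.5, 2

# Threshold of (97); any eps strictly above it should work.
eps_min = (m + M) / (2 * m * M) * (2 * M + 2 * alpha * (1 - c) * M / m) ** 2
eps = 1.01 * eps_min

rho = 2 * alpha * (1 - c) / (2 * alpha * (1 - c) + m + eps)
gamma_bound = 2 * M + (M + eps + 2 * alpha * (1 - c)) * rho ** (K + 1)  # (99)
assert gamma_bound < math.sqrt(2 * eps * m * M / (m + M))               # (98)
print(f"eps_min = {eps_min:.2f}, Gamma bound = {gamma_bound:.2f}")
```

Because $\rho$ shrinks as $\epsilon$ grows, larger values of $\epsilon$ (or of $K$) only loosen the requirement; the threshold in (97) is a conservative, $K$-independent sufficient condition.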
REFERENCES

[1] F. Bullo, J. Cortés, and S. Martinez, Distributed Control of Robotic Networks: A Mathematical Approach to Motion Coordination Algorithms. Princeton, NJ, USA: Princeton Univ. Press, 2009.
[2] Y. Cao, W. Yu, W. Ren, and G. Chen, "An overview of recent progress in the study of distributed multi-agent coordination," IEEE Trans. Ind. Informat., vol. 9, no. 1, pp. 427–438, Feb. 2013.
[3] C. G. Lopes and A. H. Sayed, "Diffusion least-mean squares over adaptive networks: Formulation and performance analysis," IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3122–3136, Jul. 2008.
[4] A. Ribeiro, "Ergodic stochastic optimization algorithms for wireless communication and networking," IEEE Trans. Signal Process., vol. 58, no. 12, Dec. 2010.
[5] A. Ribeiro, "Optimal resource allocation in wireless communication and networking," EURASIP J. Wireless Commun. Netw., vol. 2012, no. 1, pp. 1–19, 2012.
[6] I. D. Schizas, A. Ribeiro, and G. B. Giannakis, "Consensus in ad hoc WSNs with noisy links—Part I: Distributed estimation of deterministic signals," IEEE Trans. Signal Process., vol. 56, no. 1, pp. 350–364, Jan. 2008.
[7] U. A. Khan, S. Kar, and J. M. Moura, "DILAND: An algorithm for distributed sensor localization with noisy distance measurements," IEEE Trans. Signal Process., vol. 58, no. 3, Mar. 2010.
[8] M. Rabbat and R. Nowak, "Distributed optimization in sensor networks," in Proc. ACM 3rd Int. Symp. Inf. Process. Sensor Netw., 2004, pp. 20–27.
[9] R. Bekkerman, M. Bilenko, and J. Langford, Scaling Up Machine Learning: Parallel and Distributed Approaches. Cambridge, U.K.: Cambridge Univ. Press, 2011.
[10] K. I. Tsianos, S. Lawlor, and M. G. Rabbat, "Consensus-based distributed optimization: Practical issues and applications in large-scale machine learning," in Proc. 50th Annu. Allerton Conf. Commun. Control Comput., 2012.
[11] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: Scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Process. Mag., vol. 31, no. 5, pp. 32–43, Sep. 2014.
[12] A. Mokhtari, Q. Ling, and A. Ribeiro, "Network Newton—Part I: Algorithm and convergence," arXiv preprint, 2015.
[13] A. Mokhtari, Q. Ling, and A. Ribeiro, "Network Newton—Part II: Convergence rate and implementation," arXiv preprint, 2015.
[14] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Englewood Cliffs, NJ, USA: Prentice-Hall, 1989.
[15] A. P. Ruszczyński, Nonlinear Optimization, vol. 13. Princeton, NJ, USA: Princeton Univ. Press, 2006.
[16] M. G. Rabbat, R. D. Nowak, and J. Bucklew, "Generalized consensus computation in networked systems with erasure links," in Proc. IEEE 6th Workshop Signal Process. Adv. Wireless Commun., 2005, pp. 1088–1092.
[17] M. R. Hestenes, "Multiplier and gradient methods," J. Optim. Theory Appl., vol. 4, no. 5, pp. 303–320, 1969.
[18] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. New York, NY, USA: Academic, 1982.
[19] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2011.
[20] W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin, "On the linear convergence of the ADMM in decentralized consensus optimization," IEEE Trans. Signal Process., vol. 62, no. 7, pp. 1750–1761, 2014.
[21] N. Watanabe, Y. Nishimura, and M. Matsubara, "Decomposition in large system optimization using the method of multipliers," J. Optim. Theory Appl., vol. 25, no. 2, 1978.
[22] G. Stephanopoulos and A. W. Westerberg, "The use of Hestenes' method of multipliers to resolve dual gaps in engineering system optimization," J. Optim. Theory Appl., vol. 15, no. 3, pp. 285–309, 1975.
[23] J. M. Mulvey and A. Ruszczyński, "A diagonal quadratic approximation method for large scale linear programs," Oper. Res. Lett., vol. 12, no. 4, 1992.
[24] A. Ruszczyński, "On convergence of an augmented Lagrangian decomposition method for sparse convex optimization," Math. Oper. Res., vol. 20, no. 3, pp. 634–656, 1995.
[25] R. Tappenden, P. Richtárik, and B. Büke, "Separable approximations and decomposition methods for the augmented Lagrangian," Optim. Methods Softw., vol. 30, no. 3, 2015.
[26] N. Chatzipanagiotis, D. Dentcheva, and M. M. Zavlanos, "An augmented Lagrangian method for distributed optimization," Math. Program., vol. 152, no. 1, pp. 405–434, Aug. 2015.
[27] D. Jakovetić, J. Xavier, and J. M. Moura, "Cooperative convex optimization in networked systems: Augmented Lagrangian algorithms with directed gossip communication," IEEE Trans. Signal Process., vol. 59, no. 8, 2011.
[28] Q. Ling, W. Shi, G. Wu, and A. Ribeiro, "DLM: Decentralized linearized alternating direction method of multipliers," IEEE Trans. Signal Process., vol. 63, no. 15, pp. 4051–4064, Aug. 2015.
[29] A. Mokhtari, W. Shi, Q. Ling, and A. Ribeiro, "DQM: Decentralized quadratically approximated alternating direction method of multipliers," IEEE Trans. Signal Process., vol. 64, no. 19, Oct. 2016.
[30] M. Zargham, A. Ribeiro, A. Ozdaglar, and A. Jadbabaie, "Accelerated dual descent for network flow optimization," IEEE Trans. Autom. Control, vol. 59, no. 4, Apr. 2014.
[31] A. Mokhtari, Q. Ling, and A. Ribeiro, "An approximate Newton method for distributed optimization," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2015, pp. 2959–2963.
[32] W. Shi, Q. Ling, G. Wu, and W. Yin, "EXTRA: An exact first-order algorithm for decentralized consensus optimization," SIAM J. Optim., vol. 25, no. 2, pp. 944–966, 2015.
[33] S. Boyd, P. Diaconis, and L. Xiao, "Fastest mixing Markov chain on a graph," SIAM Rev., vol. 46, no. 4, pp. 667–689, 2004.
[34] D. Bajović, D. Jakovetić, N. Krejić, and N. K. Jerinkić, "Newton-like method with diagonal correction for distributed optimization," arXiv preprint, 2015.
[35] A. Nedić and A. Ozdaglar, "Distributed subgradient methods for multi-agent optimization," IEEE Trans. Autom. Control, vol. 54, no. 1, pp. 48–61, Jan. 2009.

Aryan Mokhtari received the B.Sc. degree in electrical engineering from Sharif University of Technology, Tehran, Iran, in 2011, and the M.S. degree in electrical engineering from the University of Pennsylvania, Philadelphia, PA, USA. Since 2012, he has been working toward the Ph.D. degree in the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA. From June to August 2010, he was an intern at the Advanced Digital Sciences Center, Singapore. He was a research intern with the Big-Data Machine Learning Group, Yahoo!, Sunnyvale, CA. His research interests include the areas of optimization, machine learning, control, and signal processing. His current research focuses on developing stochastic, distributed (parallel), and decentralized methods for large-scale optimization problems.

Wei Shi received the B.E. degree in automation and the Ph.D. degree in control science and engineering, both from the University of Science and Technology of China, Hefei, China, in 2010 and 2015, respectively. From 2015 to 2016, he was a Postdoctoral Research Associate at the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana. He is currently a Postdoctoral Research Associate at Boston University, Boston, MA, USA. His research interests include optimization and its applications in signal processing and control.

Qing Ling received the B.E. degree in automation and the Ph.D. degree in control theory and control engineering, both from the University of Science and Technology of China, Hefei, China, in 2001 and 2006, respectively. From 2006 to 2009, he was a Postdoctoral Research Fellow in the Department of Electrical and Computer Engineering, Michigan Technological University. Since 2009, he has been an Associate Professor in the Department of Automation, University of Science and Technology of China. His current research interests include the decentralized optimization of networked multiagent systems.

Alejandro Ribeiro received the B.Sc. degree in electrical engineering from the Universidad de la República Oriental del Uruguay, Montevideo, Uruguay, in 1998, and the M.Sc. and Ph.D. degrees in electrical engineering from the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, USA, in 2005 and 2007, respectively. From 1998 to 2003, he was a Member of the Technical Staff at Bellsouth Montevideo. After the M.Sc. and Ph.D. degrees, in 2008 he joined the University of Pennsylvania, Philadelphia, PA, USA, where he is currently the Rosenbluth Associate Professor in the Department of Electrical and Systems Engineering. His research interests include the applications of statistical signal processing to the study of networks and networked phenomena. His focus is on structured representations of networked data structures, graph signal processing, network optimization, robot teams, and networked control. Dr. Ribeiro received the 2014 O. Hugo Schuck Best Paper Award, the 2012 S. Reid Warren, Jr. Award presented by Penn's undergraduate student body for outstanding teaching, the NSF CAREER Award in 2010, and paper awards at the 2016 SSP Workshop, 2016 SAM Workshop, 2015 Asilomar SSC Conference, ACC 2013, ICASSP 2006, and ICASSP 2005. He is a Fulbright Scholar and a Penn Fellow.
