ADD-OPT: Accelerated Distributed Directed Optimization


ADD-OPT: Accelerated Distributed Directed Optimization

Chenguang Xi, Student Member, IEEE, and Usman A. Khan, Senior Member, IEEE

Abstract: This paper considers distributed optimization problems where agents collectively minimize the sum of locally known convex functions. We focus on the case when the communication between agents is described by a directed graph. The proposed algorithm, ADD-OPT (Accelerated Distributed Directed OPTimization), achieves the best known rate of convergence for this class of problems, O(µ^k) for 0 < µ < 1, given that the objective functions are strongly-convex, where k is the number of iterations. Moreover, ADD-OPT supports a wider and more realistic range of step-sizes compared with DEXTRA, the first algorithm to provide a linear convergence rate over directed graphs. In particular, the greatest lower bound of DEXTRA's step-size is strictly greater than zero, while that of ADD-OPT equals exactly zero. Simulation examples further illustrate the improvements.

Index Terms: Distributed optimization, directed graph, linear convergence, DEXTRA.

I. INTRODUCTION

Distributed consensus optimization problems are modeled as minimizing a global objective function by multiple agents over a network. Formally, we consider a decision variable z ∈ R^p and a strongly-connected network containing n agents, where each agent i has access only to a local objective function f_i : R^p → R. The goal is to have each agent minimize the sum of objectives, Σ_{i=1}^n f_i(z), by letting agents exchange information with their neighbors only. This formulation has gained great interest due to its widespread applications in, e.g., large-scale machine learning, [1, 2], model predictive control, [3], cognitive networks, [4, 5], source localization, [6, 7], resource scheduling, [8], and message routing, [9].
Most of the existing algorithms assume information exchange over undirected networks(1), where the communication between agents is bidirectional, i.e., if agent i can send information to agent j, then agent j can also send information to agent i. However, the distributed optimization problem over directed networks has wider applications than that over undirected graphs. For example, agents may broadcast at different power levels, implying communication capability in one direction but not the other. Moreover, an algorithm should continue to converge when slow communication links, which delay the whole network, are removed; the result is again a directed graph. In general, a directed topology is more flexible than an undirected one.

Related methods over undirected graphs are well established. A number of recent papers propose Distributed Gradient Descent (DGD), [10-13], which achieves O(ln k/√k) for general convex functions and O(ln k/k) when the objective functions are strongly-convex, where k is the number of iterations.

The authors are with the ECE Department at Tufts University, Medford, MA; chenguang.xi@tufts.edu, khan@ece.tufts.edu. This work has been partially supported by an NSF Career Award # CCF. (1) We use the terms network and graph interchangeably.

The convergence rates can be accelerated with an additional Lipschitz-continuous-gradient assumption. With the Lipschitz assumption, the work in [14] shows that DGD with a constant step-size converges at rate O(1/k) to an error neighborhood whose size depends on the step-size, and further converges linearly to an error neighborhood for strongly-convex functions. The distributed Nesterov method, [15], accelerates the convergence rate to O(ln k/k²) for general convex functions. EXTRA, [16], achieves exact convergence with rate O(1/k) for general convex functions and a linear rate for strongly-convex functions. The work in [17] improves EXTRA by relaxing the weighting matrices to be asymmetric.
Besides gradient-based methods, distributed implementations of ADMM, [18-20], also work well for this problem over undirected graphs.

We next survey the literature on directed graphs. Broadly, the following three approaches build on techniques for reaching average consensus(2) over directed graphs and extend them to distributed optimization. Gradient-Push (GP), [25-28], combines DGD, [10], with push-sum consensus, [29, 30]. Directed-Distributed Gradient Descent (D-DGD), [31, 32], extends Cai and Ishii's work, [33], by combining it with DGD. In [34], the authors apply the weight-balancing technique, [35], to DGD. These gradient-based methods, [25-28, 31, 32, 34], restricted by a diminishing step-size, converge relatively slowly, at O(ln k/√k). When the objective functions are strongly-convex, the convergence rate can be accelerated to O(ln k/k), [36]. A recent paper proposed a fast distributed algorithm, termed DEXTRA, [37, 38], to solve the distributed consensus optimization problem over directed graphs. By combining the push-sum protocol, [29, 30], and EXTRA, [16], DEXTRA achieves a linear convergence rate given that the objective functions are strongly-convex. However, one drawback of DEXTRA is its restrictive range of step-sizes. The greatest lower bound of DEXTRA's step-size is strictly greater than zero. In particular, a step-size, α, for which DEXTRA converges to the optimal solution must lie in an interval α ∈ (α̲, ᾱ), where α̲ and ᾱ denote the lower and upper bounds, respectively, and it holds that α̲ > 0. Considering that it is hard to estimate α̲ in a distributed setting, because its expression requires global knowledge, it remains an open problem how to pick a step-size, α ∈ (α̲, ᾱ), that guarantees DEXTRA's convergence. In contrast, if α̲ = 0, agents can pick an arbitrarily small value of α to ensure convergence. In this paper, we propose ADD-OPT to solve distributed optimization over directed graphs.
The algorithm achieves a linear convergence rate when the objective functions are strongly-convex. Compared to DEXTRA, ADD-OPT's step-size, α, lies in α ∈ (0, ᾱ), i.e., α̲ = 0. This guarantees ADD-OPT to be a more reliable algorithm in the distributed setting. We show that, after some derivation, ADD-OPT has a representation similar to DEXTRA's; from this point of view, it can be regarded as an extension of DEXTRA.

(2) See [21-24] for additional information on average-consensus problems.

The remainder of the paper is organized as follows. Section II describes and interprets ADD-OPT, and shows its relation to DEXTRA. We also analyze the performance of both ADD-OPT and DEXTRA when the step-size is zero. Section III presents the appropriate assumptions and states the main convergence results. In Section IV, we present some lemmas as the basis of the proof of ADD-OPT's convergence. The proof of the main results is provided in Section V. We show numerical results in Section VI, and Section VII contains the concluding remarks.

Basic Notations: We use lowercase bold letters to denote vectors and uppercase italic letters to denote matrices. For a matrix, A = {a_ij}, a_ij denotes the element of A in the ith row and jth column. The matrix I_n represents the n × n identity, and 1_n and 0_n are the n-dimensional vectors of all ones and all zeros. The inner product of two vectors, x and y, is ⟨x, y⟩, or x'y. The Euclidean norm of any vector, x, is denoted by ||x||. For any differentiable function f(x), ∇f(x) denotes the gradient of f at x. The spectral radius of a matrix, A, is represented by ρ(A), and λ(A) denotes any eigenvalue of A.

II. ADD-OPT DEVELOPMENT

In this section, we formulate the optimization problem and describe ADD-OPT. We first derive an informal but intuitive argument showing that ADD-OPT pushes the agents to achieve consensus and reach the optimal solution of Problem P1. After proposing ADD-OPT, we rewrite it in a representation similar to DEXTRA's to show the relation between the two. The analysis helps reveal how to increase the range of step-sizes in DEXTRA by adjusting the weighting matrices. We also analyze the performance of both ADD-OPT and DEXTRA when the step-size is zero. Formal convergence results are left to Section III.
Consider a strongly-connected network of n agents communicating over a directed graph, G = (V, E), where V is the set of agents and E is the collection of ordered pairs, (i, j), i, j ∈ V, such that agent j can send information to agent i. Define N_i^in to be the collection of in-neighbors, i.e., the set of agents that can send information to agent i. Similarly, N_i^out is the set of out-neighbors of agent i. We allow both N_i^in and N_i^out to include the node i itself. Note that in a directed graph, (i, j) ∈ E does not necessarily imply (j, i) ∈ E. Consequently, N_i^in ≠ N_i^out, in general. We focus on solving a convex optimization problem that is distributed over the above multi-agent network. In particular, the network of agents cooperatively solves the following optimization problem:

P1: min f(z) = Σ_{i=1}^n f_i(z),

where each local objective function, f_i : R^p → R, is convex and differentiable, and known only by agent i. Our goal is to develop a distributed iterative algorithm such that each agent converges to the global solution of Problem P1.

A. ADD-OPT Algorithm

To solve Problem P1, we describe the implementation of ADD-OPT as follows. Each agent, j ∈ V, maintains three vector variables, x_{k,j}, z_{k,j}, w_{k,j} ∈ R^p, as well as a scalar variable, y_{k,j} ∈ R, where k is the discrete-time index. At the kth iteration, agent j weights its states, a_ij x_{k,j}, a_ij y_{k,j}, and a_ij w_{k,j}, and sends these to each of its out-neighbors, i ∈ N_j^out, where the weights, a_ij, are such that

a_ij > 0 if i ∈ N_j^out, a_ij = 0 otherwise, with Σ_{i=1}^n a_ij = 1 for all j. (1)

With agent i receiving the information from its in-neighbors, j ∈ N_i^in, it updates x_{k+1,i}, y_{k+1,i}, z_{k+1,i}, and w_{k+1,i} as follows:

x_{k+1,i} = Σ_{j ∈ N_i^in} a_ij x_{k,j} − α w_{k,i}, (2a)
y_{k+1,i} = Σ_{j ∈ N_i^in} a_ij y_{k,j}, (2b)
z_{k+1,i} = x_{k+1,i} / y_{k+1,i}, (2c)
w_{k+1,i} = Σ_{j ∈ N_i^in} a_ij w_{k,j} + ∇f_i(z_{k+1,i}) − ∇f_i(z_{k,i}). (2d)

In the above, ∇f_i(z_{k,i}) is the gradient of the function f_i(z) at z = z_{k,i}, for all k ≥ 0. The step-size, α, is a positive number within a certain interval. We will explicitly show the range of α in Section III.
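The per-agent updates in Eq. (2) can be simulated directly. The sketch below implements them in matrix form for the scalar case p = 1; the 3-agent directed ring and the quadratic objectives f_i(z) = (z − b_i)²/2 are illustrative choices, not from the paper:

```python
import numpy as np

def add_opt(A, grads, z0, alpha, iters):
    """Sketch of the ADD-OPT iteration, Eq. (2), for p = 1.

    A     : n x n column-stochastic weight matrix
    grads : list of n gradient functions, grads[i](z) -> float
    z0    : length-n array of initial estimates
    """
    n = len(z0)
    x = np.asarray(z0, dtype=float)
    y = np.ones(n)                                   # y_{0,i} = 1
    z = x / y
    grad = np.array([g(zi) for g, zi in zip(grads, z)])
    w = grad.copy()                                  # w_{0,i} = grad f_i(z_{0,i})
    for _ in range(iters):
        x = A @ x - alpha * w                        # Eq. (2a)
        y = A @ y                                    # Eq. (2b)
        z = x / y                                    # Eq. (2c)
        new_grad = np.array([g(zi) for g, zi in zip(grads, z)])
        w = A @ w + new_grad - grad                  # Eq. (2d)
        grad = new_grad
    return z

# Column-stochastic weights on a directed ring with self-loops (illustrative).
A = np.array([[0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])
b = [1.0, 2.0, 6.0]
grads = [lambda z, bi=bi: z - bi for bi in b]        # grad of (z - b_i)^2 / 2
z_final = add_opt(A, grads, b, alpha=0.05, iters=600)
# the sum of the f_i is minimized at the mean of b, so every z_i should approach 3
```

With a sufficiently small constant step-size, all agents' estimates agree and settle at the minimizer of the sum, consistent with the linear convergence claimed in Section III.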
For any agent i, the algorithm is initiated with an arbitrary vector, x_{0,i}, and with w_{0,i} = ∇f_i(z_{0,i}) and y_{0,i} = 1. We note that the implementation of Eq. (2) requires each agent to at least know its out-degree, so that it can design the weights according to Eq. (1); see [25-28, 31, 32, 34, 37] for similar assumptions.

To simplify the analysis, we assume from now on that all sequences updated by Eq. (2) have only one dimension, i.e., p = 1; thus x_{k,i}, y_{k,i}, w_{k,i}, z_{k,i} ∈ R for all i, k. For x_i, w_i, z_i ∈ R^p being p-dimensional vectors, the proof is the same for every dimension by applying the results to each coordinate. Therefore, assuming p = 1 is without loss of generality.

We next write Eq. (2) in matrix form. Define x_k, y_k, w_k, z_k, ∇f_k ∈ R^n as x_k = [x_{k,1}, ..., x_{k,n}]', y_k = [y_{k,1}, ..., y_{k,n}]', w_k = [w_{k,1}, ..., w_{k,n}]', z_k = [z_{k,1}, ..., z_{k,n}]', and ∇f_k = [∇f_1(z_{k,1}), ..., ∇f_n(z_{k,n})]'. Let A = {a_ij} ∈ R^{n×n} be the collection of weights a_ij. It is clear that A is a column-stochastic matrix. Define a diagonal matrix, Y_k ∈ R^{n×n}, for each k, as

Y_k = diag(y_k). (3)

Given that the graph, G, is strongly-connected and the corresponding weighting matrix, A, is non-negative, it follows that Y_k is invertible for any k. Then, we can write Eq. (2) equivalently in matrix form as follows:

x_{k+1} = A x_k − α w_k, (4a)
y_{k+1} = A y_k, (4b)
z_{k+1} = Y_{k+1}^{-1} x_{k+1}, (4c)

w_{k+1} = A w_k + ∇f_{k+1} − ∇f_k, (4d)

where, similarly, we have the initial condition w_0 = ∇f_0.

B. Interpretation of ADD-OPT

Based on Eq. (4), we now give an intuitive interpretation of the convergence of ADD-OPT to the optimal solution. By combining Eqs. (4a) and (4d), we obtain

x_{k+1} = A x_k − α[A w_{k−1} + ∇f_k − ∇f_{k−1}]
        = A x_k − A(A x_{k−1} − x_k) − α[∇f_k − ∇f_{k−1}]
        = 2A x_k − A² x_{k−1} − α[∇f_k − ∇f_{k−1}], (5)

where the second equality uses α w_{k−1} = A x_{k−1} − x_k from Eq. (4a). Assume that the sequences generated by Eq. (4) converge to limits (not necessarily true), denoted by x_∞, y_∞, w_∞, z_∞, ∇f_∞, respectively. It follows from Eq. (5) that

x_∞ = 2A x_∞ − A² x_∞ − α[∇f_∞ − ∇f_∞], (6)

which implies that (I_n − A)² x_∞ = 0_n, or x_∞ ∈ span{y_∞}, considering that y_∞ = A y_∞. Therefore, we obtain

z_∞ = Y_∞^{-1} x_∞ ∈ span{1_n}, (7)

i.e., consensus is reached. By summing the updates in Eq. (5) over k from 0 to ∞, we obtain

x_∞ = A x_∞ + Σ_{r=1}^∞ (A − I_n) x_r − Σ_{r=1}^∞ (A² − A) x_r − α ∇f_∞.

Noting that x_∞ = A x_∞, it follows that

α ∇f_∞ = Σ_{r=1}^∞ (A − I_n) x_r − Σ_{r=1}^∞ (A² − A) x_r.

Therefore, we obtain

α 1_n' ∇f_∞ = 1_n'(A − I_n) Σ_{r=1}^∞ x_r − 1_n'(A² − A) Σ_{r=1}^∞ x_r = 0,

which is the optimality condition of Problem P1, considering that z_∞ ∈ span{1_n}. To conclude: if we assume the sequences updated by Eq. (4) have limits, x_∞, y_∞, w_∞, z_∞, ∇f_∞, then z_∞ achieves consensus and reaches the optimal solution of Problem P1. We next discuss the relation between ADD-OPT and DEXTRA.

C. ADD-OPT and DEXTRA

A recent paper proposed a fast distributed algorithm, termed DEXTRA, [37, 38], to solve the corresponding distributed optimization problem over directed graphs. It achieves a linear convergence rate given that the objective functions are strongly-convex. At the kth iteration, each agent i keeps and updates three states, x_{k,i}, y_{k,i}, and z_{k,i}. The iteration, in matrix form, is as follows:

x_{k+1} = (I_n + A) x_k − Ã x_{k−1} − α[∇f_k − ∇f_{k−1}], (8a)
y_{k+1} = A y_k, (8b)
z_{k+1} = Y_{k+1}^{-1} x_{k+1}, (8c)

where à is a column-stochastic matrix satisfying à = θ I_n + (1 − θ)A for some θ ∈ (0, 1/2], and all other notation keeps the definitions given earlier in the paper. By comparing Eqs.
(5) and (8a), (4b) and (8b), and (4c) and (8c), it follows that the only difference between ADD-OPT and DEXTRA lies in the weighting matrices used when updating x_k. From DEXTRA to ADD-OPT, we change (I_n + A) in (8a) to 2A in (5), and à to A², respectively. Mathematically, if A = I_n, the two algorithms become the same. ADD-OPT can therefore be regarded as an extended version of DEXTRA, in that, as we will show in Section III, it has a wider range of step-sizes compared to DEXTRA: the greatest lower bound, α̲, of ADD-OPT's step-size is zero, while that of DEXTRA's is positive. This also reveals why in DEXTRA we prefer constructing A to be an extremely diagonally-dominant matrix (see Assumption A2(c) in [37]). The more similar A is to I_n, the closer α̲ approaches zero. However, in DEXTRA α̲ can never reach zero, since A cannot be the identity, I_n, which would mean there is no communication between agents. Therefore, ADD-OPT cannot be regarded as a special case of DEXTRA, since A ≠ I_n. In Section V, we provide a proof of ADD-OPT's linear convergence rate that is entirely different from, and much more compact than, DEXTRA's.

Note that the implementation of Eq. (5) requires agents to communicate with the neighbors of their own neighbors (because of A²), which takes two iterations for each agent. This explains why ADD-OPT needs to keep and update four variables, x_k, y_k, w_k, and z_k, compared with DEXTRA, which only has three: x_k, y_k, and z_k. It can be considered a trade-off between increasing the step-size range and decreasing the number of variables.

D. Interpretation of Algorithms when α = 0

For any gradient-based method, setting α = 0 is mathematically equivalent to ∇f_k = 0_n for all k, which means all local objective functions are constants. The methods are thus simplified to having agents reach consensus only. We now discuss whether DEXTRA can push all agents to consensus if we force the step-size to be zero.
Assume that the sequences generated by Eq. (8) converge to limits (not necessarily true), denoted by x_∞, y_∞, z_∞, and ∇f_∞, respectively. Consider Eq. (8a) with α = 0:

x_{k+1} = (I_n + A) x_k − Ã x_{k−1}. (9)

By summing Eq. (9) over k from 0 to ∞, we obtain

x_∞ = A x_∞ − Σ_{r=1}^∞ (Ã − A) x_r. (10)

According to the previous analysis, Eqs. (6) to (7), we know that reaching consensus, i.e., z_∞ ∈ span{1_n}, is equivalent to satisfying x_∞ ∈ span{y_∞}, which requires

Σ_{r=1}^∞ (Ã − A) x_r = 0_n. (11)

Since x_0 is arbitrary, and (Ã − A) x = 0_n holds only for x satisfying x = Ax, we have that, in general,

Σ_{r=1}^∞ (Ã − A) x_r ≠ 0_n.

This is a contradiction. Therefore, DEXTRA does not push all agents to consensus when α = 0. We now consider the performance of ADD-OPT when α = 0. We again assume that the sequences generated by Eq. (4) converge to limits (not necessarily true), denoted by x_∞, y_∞, w_∞, z_∞, ∇f_∞, respectively. In fact, it is straightforward to observe that Eq. (4) with α = 0 is exactly the push-sum consensus protocol, [29, 30], which pushes agents to reach average consensus over a directed graph. Therefore, ADD-OPT converges to the optimal solution when α = 0. To better compare ADD-OPT with DEXTRA, we analyze ADD-OPT using derivations similar to those for DEXTRA above. By summing the updates in Eq. (5) over k from 0 to ∞, we obtain

x_∞ = A x_∞ + A(I_n − A) x_0 − Σ_{r=1}^∞ (I_n − A)² x_r, (12)

which, considering the condition x_∞ = A x_∞, it is sufficient to satisfy with

Σ_{r=1}^∞ (A − I_n) x_r = −A x_0. (13)

Comparing Eq. (13), from the derivation of ADD-OPT, with Eq. (11), from DEXTRA, we see that ADD-OPT works when α = 0 because of the additional term A x_0: the infinite sum Σ_{r=1}^∞ (A − I_n) x_r accumulates to compensate this initial term. In the next section, we state the convergence result with appropriate assumptions.

III. ASSUMPTIONS, NOTATIONS, AND MAIN RESULT

With appropriate assumptions, our main result states that ADD-OPT converges to the optimal solution of Problem P1 linearly. We state again that from now on we assume that the states of agents have only one dimension, i.e., p = 1, which is without loss of generality. In this paper, we assume that the agent graph, G, is strongly-connected; each local function, f_i(z), is convex and differentiable; and the optimal solution of Problem P1 and the corresponding optimal value exist. Formally, we denote the optimal solution by z* and the optimal value by f*, i.e., f* = f(z*) = min_{z ∈ R} f(z).
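The α = 0 observation above, that Eq. (4) with a zero step-size reduces to push-sum consensus, can be sketched in a few lines; the ring weights below are an illustrative choice:

```python
import numpy as np

def push_sum(A, x0, iters):
    # Eq. (4) with alpha = 0: the gradient terms drop out of the x-update,
    # leaving the push-sum ratio iteration z_k = Y_k^{-1} x_k.
    x = np.asarray(x0, dtype=float)
    y = np.ones(len(x))
    for _ in range(iters):
        x = A @ x          # Eq. (4a) with alpha = 0
        y = A @ y          # Eq. (4b)
    return x / y           # Eq. (4c)

# Column-stochastic weights on a directed ring with self-loops (illustrative).
A = np.array([[0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])
z = push_sum(A, [1.0, 2.0, 6.0], 100)
# each entry approaches the average of the initial values, here 3
```

Because A is column-stochastic, 1'x_k is conserved while the ratio x_k/y_k washes out the (generally non-uniform) stationary distribution of A, which is exactly why the directed graph poses no obstacle at α = 0.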
Besides the above, we formally present an assumption on the gradients of the objective functions, which is standard for smooth, strongly-convex functions, [14, 16, 37].

Assumption A1 (Lipschitz continuous gradients and strong convexity). Each private function, f_i, is differentiable and strongly-convex, and its gradient is Lipschitz continuous, i.e., for any i and z_1, z_2 ∈ R,
(a) there exists a positive constant l such that ||∇f_i(z_1) − ∇f_i(z_2)|| ≤ l ||z_1 − z_2||;
(b) there exists a positive constant s such that s ||z_1 − z_2||² ≤ ⟨∇f_i(z_1) − ∇f_i(z_2), z_1 − z_2⟩.

With these assumptions, we are able to present ADD-OPT's convergence result, whose statement is based on the following notation. Based on the earlier notation, x_k, w_k, and ∇f_k, we further define x̄_k, w̄_k, z*_n, g_k, h_k ∈ R^n as

x̄_k = (1/n) 1_n 1_n' x_k, (14)
w̄_k = (1/n) 1_n 1_n' w_k, (15)
z*_n = z* 1_n, (16)
g_k = (1/n) 1_n 1_n' ∇f_k, (17)
h_k = (1/n) 1_n 1_n' ∇f(x̄_k), (18)

where ∇f(x̄_k) = [∇f_1((1/n) 1_n' x_k), ..., ∇f_n((1/n) 1_n' x_k)]'. We denote the constants τ, ɛ, and η by

τ = ||A − I_n||, (19)
ɛ = ||I_n − A_∞||, (20)
η = max(|1 − αl|, |1 − αs|), (21)

where A is the column-stochastic weighting matrix used in Eq. (4), A_∞ = lim_{k→∞} A^k represents its limit, α is the step-size, and l and s are, respectively, the Lipschitz-gradient constant and the strong-convexity constant in Assumption A1. Let Y_∞ be the limit of Y_k in Eq. (3),

Y_∞ = lim_{k→∞} Y_k, (22)

and let y and ỹ be the maxima of ||Y_k|| and ||Y_k^{-1}|| over k, respectively,

y = max_k ||Y_k||, (23)
ỹ = max_k ||Y_k^{-1}||. (24)

Moreover, we define two constants, σ and γ_1, through the following two lemmas, which are related to the convergence of A^k and Y_k. Note that Lemmas 1 and 2 are reformulated with the notation introduced above to simplify the proof.

Lemma 1 (Nedic et al. [25]). Consider Y_k, generated from the column-stochastic matrix, A, and its limit Y_∞. There exist 0 < γ_1 < 1 and 0 < T < ∞ such that, for all k,

||Y_k − Y_∞|| ≤ T γ_1^k. (25)

Lemma 2 (Olshevsky et al. [39]). Consider Y_∞ in Eq. (22), and A, the column-stochastic matrix used in Eq. (4). For any a ∈ R^n, define ā = (1/n) 1_n 1_n' a.
There exists 0 < σ < 1 such that, for all a ∈ R^n,

||A a − Y_∞ ā|| ≤ σ ||a − Y_∞ ā||. (26)

Based on the above notation, we finally define t_k, s_k ∈ R³ and G, H_k ∈ R^{3×3}, for all k, as

t_k = [ ||x_k − Y_∞ x̄_k|| ; ||x̄_k − z*_n|| ; ||w_k − Y_∞ g_k|| ],  s_k = [ ||x_k|| ; 0 ; 0 ],

G = [ σ , 0 , α ;
      αlỹ , η , 0 ;
      ɛlτỹ + αɛl²yỹ² , αɛl²yỹ , σ + αɛlỹ ],

H_k = [ 0 , 0 , 0 ;
        αlỹT γ_1^k , 0 , 0 ;
        (αly + 2)ɛlỹ²T γ_1^k , 0 , 0 ]. (27)

We now state the key relation of this paper.

Theorem 1. Let Assumption A1 hold. The following inequality holds, entrywise, for all k ≥ 1:

t_k ≤ G t_{k−1} + H_{k−1} s_{k−1}. (28)

Proof. See Section V.

Eq. (28) in Theorem 1 is the key relation of this paper. We leave the complete proof to Section V, with the help of several auxiliary relations in Section IV. Note that Eq. (28) provides a linear iterative relation between t_k and t_{k−1} with the matrices G and H_{k−1}. Thus, the convergence of t_k is fully determined by G and H_k. More specifically, to prove a linear convergence rate of t_k to zero, it is sufficient to show that ρ(G) < 1, where ρ(·) denotes the spectral radius, together with the linear decay of H_k, which is straightforward since 0 < γ_1 < 1. In Lemma 3, we first show that, with an appropriate step-size, the spectral radius of G is less than 1. Following Lemma 3, we show the linear convergence of G^k and H_k in Lemma 4.

Lemma 3. Consider the matrix, G_α, defined in Eq. (27) as a function of the step-size, α. It follows that ρ(G_α) < 1 if the step-size α ∈ (0, ᾱ), where

ᾱ = [ sqrt(ɛ²s²(τ + 1 − σ)² + 4ɛy(l + s)s(1 − σ)²) − ɛs(τ + 1 − σ) ] / [ 2ɛlyỹ(l + s) ]. (29)

Proof. It is easy to verify that ᾱ < 1/l; as a result, for α ∈ (0, ᾱ) we have η = 1 − αs. When α = 0, we have

G_0 = [ σ , 0 , 0 ; 0 , 1 , 0 ; ɛlτỹ , 0 , σ ], (30)

whose eigenvalues are σ, σ, and 1. Therefore, ρ(G_0) = 1, where ρ(·) denotes the spectral radius. We now consider how the eigenvalue 1 changes as we slightly increase α from 0. Denote by P_{G_α}(q) = det(qI_n − G_α) the characteristic polynomial of G_α. Setting det(qI_n − G_α) = 0, we get the equation

((q − σ)² − αɛlỹ(q − σ))(q − 1 + αs) − α³ɛl³yỹ² − α(q − 1 + αs)(ɛlτỹ + αɛl²yỹ²) = 0. (31)

Since we have already shown that 1 is an eigenvalue of G_0, Eq. (31) holds at q = 1 and α = 0. Taking the derivative on both sides of Eq. (31) with respect to α, and evaluating at q = 1 and α = 0, we obtain

dq/dα |_{α=0, q=1} = −s < 0.

That is, when α increases slightly from 0, ρ(G_α) will first decrease.
We now calculate all possible values of α for which λ(G_α) = 1. Setting q = 1 in Eq. (31) and solving for the step-size, α, we obtain α_1 = 0, α_2 < 0, and α_3 = ᾱ, with ᾱ defined in Eq. (29). Since there is no other value of α for which λ(G_α) = 1, and eigenvalues are continuous functions of the matrix entries, all eigenvalues of G_α are less than 1 in magnitude for α ∈ (0, ᾱ). Therefore, ρ(G_α) < 1 when α ∈ (0, ᾱ).

Lemma 4. With the step-size α ∈ (0, ᾱ), where ᾱ is defined in Eq. (29), the following statements hold for all k:
(a) there exist 0 < γ_1 < 1 and 0 < Γ_1 < ∞, where γ_1 is defined in Eq. (25), such that ||H_k|| = Γ_1 γ_1^k;
(b) there exist 0 < γ_2 < 1 and 0 < Γ_2 < ∞ such that ||G^k|| ≤ Γ_2 γ_2^k;
(c) with γ = max{γ_1, γ_2} and Γ = Γ_1 Γ_2 / γ, it holds that ||G^{k−r−1}|| ||H_r|| ≤ Γ γ^k for all 0 ≤ r ≤ k − 1.

Proof. (a) This is easy to verify from Eq. (27), with Γ_1 = lỹT sqrt(α² + (αly + 2)² ɛ² ỹ²).
(b) We represent G in the Jordan canonical form, G = PJQ. According to Lemma 3, all diagonal entries of J are smaller than one. Therefore, there exist 0 < Γ_2 < ∞ and 0 < γ_2 < 1 such that

||G^k|| ≤ ||P|| ||Q|| ||J^k|| ≤ Γ_2 γ_2^k. (32)

(c) The claim follows by combining (a) and (b): ||G^{k−r−1}|| ||H_r|| ≤ Γ_2 γ_2^{k−r−1} Γ_1 γ_1^r ≤ Γ_1 Γ_2 γ^{k−1} = Γ γ^k.

We now present the main result of this paper in Theorem 2, which shows the linear convergence rate of ADD-OPT.

Theorem 2. Let Assumption A1 hold. With the step-size α ∈ (0, ᾱ), where ᾱ is defined in Eq. (29), the sequence, {z_k}, generated by ADD-OPT converges exactly to the optimal solution, z*, at a linear rate, i.e., there exist a bounded constant M > 0 and γ < µ < 1, where γ is defined in Lemma 4(c), such that for any k,

||z_k − z*_n|| ≤ M µ^k. (33)

Proof. Writing Eq. (28) recursively results in

t_k ≤ G^k t_0 + Σ_{r=0}^{k−1} G^{k−r−1} H_r s_r. (34)

Taking norms on both sides of Eq. (34) and applying Lemma 4, we obtain

||t_k|| ≤ ||G^k|| ||t_0|| + Σ_{r=0}^{k−1} ||G^{k−r−1}|| ||H_r|| ||s_r|| ≤ Γ_2 γ_2^k ||t_0|| + Γ γ^k Σ_{r=0}^{k−1} ||s_r||, (35)

in which we can bound ||s_r|| as

||s_r|| = ||x_r|| ≤ ||x_r − Y_∞ x̄_r|| + ||Y_∞ x̄_r − Y_∞ z*_n|| + ||Y_∞ z*_n|| ≤ (1 + y) ||t_r|| + y ||z*_n||. (36)

Therefore, we have, for all k,

||t_k|| ≤ ( Γ_2 ||t_0|| + Γ(1 + y) Σ_{r=0}^{k−1} ||t_r|| + Γ k y ||z*_n|| ) γ^k. (37)

Our first step is to show that ||t_k|| is bounded for all k. Indeed, there exists some bounded K > 0 such that, for all k > K,

(Γ_2 + Γ(1 + 2y) k) γ^k ≤ 1. (38)

Define Φ = max_{0 ≤ k ≤ K} (||t_k||, ||z*_n||), which is bounded since K is bounded. Then ||t_k|| ≤ Φ for 0 ≤ k ≤ K. Consider the case k = K + 1. By combining Eqs. (37) and (38), we have

||t_{K+1}|| ≤ Φ (Γ_2 + Γ(1 + 2y)(K + 1)) γ^{K+1} ≤ Φ. (39)

Repeating this procedure shows that ||t_k|| ≤ Φ for all k. The next step is to show that ||t_k|| decays linearly. For any µ satisfying γ < µ < 1, there exists a constant U < ∞ such that k (γ/µ)^k ≤ U for all k. Therefore, by bounding all ||t_r|| and ||z*_n|| by Φ in Eq. (37), we obtain, for all k,

||t_k|| ≤ Φ (Γ_2 + Γ(1 + 2y) k) γ^k ≤ Φ (Γ_2 + Γ(1 + 2y) U) µ^k. (40)

It follows that ||z_k − z*_n|| and ||t_k|| satisfy the relation

||z_k − z*_n|| ≤ ỹ(1 + y) ||t_k|| + ỹ T γ_1^k ||z*_n||, (41)

where we use the relation ||Y_k^{-1} Y_∞ − I_n|| ≤ ||Y_k^{-1}|| ||Y_∞ − Y_k|| ≤ ỹ T γ_1^k obtained from Eq. (25). By combining Eqs. (40) and (41), we obtain

||z_k − z*_n|| ≤ ỹ Φ [ (1 + y)(Γ_2 + Γ(1 + 2y) U) + T ] µ^k.

The desired result follows by letting M = ỹ Φ [ (1 + y)(Γ_2 + Γ(1 + 2y) U) + T ].

Theorem 2 shows the linear convergence rate of ADD-OPT. In Sections IV and V, we prove Theorem 1.

IV. AUXILIARY RELATIONS

We provide several basic relations in this section, which will help the proof of Theorem 1. Lemma 5 derives iterative equations that govern the average sequences x̄_k and w̄_k. Lemma 6 gives inequalities that are direct consequences of Eq. (25). Lemma 7 can be found in standard optimization literature, [40]. It states that if we perform a gradient-descent step with a fixed step-size for a strongly-convex and smooth function, then the distance to the optimizer shrinks by at least a fixed ratio.

Lemma 5. The following equations hold for all k:
(a) w̄_k = g_k;
(b) x̄_{k+1} = x̄_k − α g_k.

Proof. Since A is column-stochastic, satisfying 1_n' A = 1_n', we obtain

w̄_k = (1/n) 1_n 1_n' (A w_{k−1} + ∇f_k − ∇f_{k−1}) = w̄_{k−1} + g_k − g_{k−1}.
Applying this recursively, we have w̄_k = w̄_0 + g_k − g_0. Recall the initial condition w_0 = ∇f_0, which is equivalent to w̄_0 = g_0. Hence, we obtain the result in (a). The proof of (b) follows from the derivation

x̄_{k+1} = (1/n) 1_n 1_n' (A x_k − α w_k) = x̄_k − α w̄_k = x̄_k − α g_k,

where the last equality uses the result of (a). The proof is complete.

Lemma 6. The following inequalities hold for all k ≥ 1:
(a) ||Y_k^{-1} Y_∞ − I_n|| ≤ ỹ T γ_1^k;
(b) ||Y_{k+1}^{-1} − Y_k^{-1}|| ≤ 2 ỹ² T γ_1^k.

Proof. By considering Eq. (25), it follows that

||Y_k^{-1} Y_∞ − I_n|| ≤ ||Y_k^{-1}|| ||Y_∞ − Y_k|| ≤ ỹ T γ_1^k.

The proof of (b) follows from

||Y_{k+1}^{-1} − Y_k^{-1}|| ≤ ||Y_{k+1}^{-1} − Y_∞^{-1}|| + ||Y_∞^{-1} − Y_k^{-1}|| ≤ 2 ỹ² T γ_1^k.

This finishes the proof.

Lemma 7. Let Assumption A1 hold for the objective function, f(z), in P1, and let s and l be the strong-convexity constant and the Lipschitz-gradient constant, respectively. For any z ∈ R, define z+ = z − α ∇f(z), where 0 < α < 2/l. Then

||z+ − z*|| ≤ η ||z − z*||, where η = max(|1 − αl|, |1 − αs|).

V. CONVERGENCE ANALYSIS

The proof of Theorem 1 is provided in this section. We will bound ||x_k − Y_∞ x̄_k||, ||x̄_k − z*_n||, and ||w_k − Y_∞ g_k|| by linear combinations of their values at the previous iteration, i.e., ||x_{k−1} − Y_∞ x̄_{k−1}||, ||x̄_{k−1} − z*_n||, and ||w_{k−1} − Y_∞ g_{k−1}||, as well as ||x_{k−1}||. The coefficients will be shown to be the entries of G and H_{k−1}.

Step 1: Bound ||x_k − Y_∞ x̄_k||. According to Eq. (4a) and Lemma 5(b), we obtain

||x_k − Y_∞ x̄_k|| ≤ ||A x_{k−1} − Y_∞ x̄_{k−1}|| + α ||w_{k−1} − Y_∞ g_{k−1}||. (42)

Noting that ||A x_{k−1} − Y_∞ x̄_{k−1}|| ≤ σ ||x_{k−1} − Y_∞ x̄_{k−1}|| from Eq. (26), we have

||x_k − Y_∞ x̄_k|| ≤ σ ||x_{k−1} − Y_∞ x̄_{k−1}|| + α ||w_{k−1} − Y_∞ g_{k−1}||. (43)

Step 2: Bound ||x̄_k − z*_n||. By Lemma 5(b), we obtain

x̄_k = [x̄_{k−1} − α h_{k−1}] − α [g_{k−1} − h_{k−1}]. (44)

Let x̄+_{k−1} = x̄_{k−1} − α h_{k−1}, which performs a (centralized) gradient-descent step to minimize the objective function of Problem P1. Therefore, according to Lemma 7,

||x̄+_{k−1} − z*_n|| ≤ η ||x̄_{k−1} − z*_n||. (45)

By applying the Lipschitz-continuous-gradient assumption, Assumption A1(a), we obtain

||g_k − h_k|| ≤ ||(1/n) 1_n 1_n'|| l ||z_k − x̄_k|| ≤ l ||z_k − x̄_k||. (46)

Therefore, it follows that

||x̄_k − z*_n|| ≤ ||x̄+_{k−1} − z*_n|| + α ||g_{k−1} − h_{k−1}|| ≤ η ||x̄_{k−1} − z*_n|| + αl ||z_{k−1} − x̄_{k−1}||. (47)

Note that, by Eq. (4c) and Lemma 6(a), it follows that

||z_k − x̄_k|| ≤ ||Y_k^{-1}(x_k − Y_∞ x̄_k)|| + ||(Y_k^{-1} Y_∞ − I_n) x̄_k|| ≤ ỹ ||x_k − Y_∞ x̄_k|| + ỹ T γ_1^k ||x_k||, (48)

where in the second inequality we also use the relation ||x̄_k|| ≤ ||x_k||. By substituting Eq. (48) into Eq. (47), we obtain

||x̄_k − z*_n|| ≤ αlỹ ||x_{k−1} − Y_∞ x̄_{k−1}|| + η ||x̄_{k−1} − z*_n|| + αlỹ T γ_1^{k−1} ||x_{k−1}||. (49)

Step 3: Bound ||w_k − Y_∞ g_k||. According to Eq. (4d), we have

||w_k − Y_∞ g_k|| ≤ ||A w_{k−1} − Y_∞ g_{k−1}|| + ||(∇f_k − ∇f_{k−1}) − Y_∞ (g_k − g_{k−1})||.

With Lemma 5(a) and Eq. (26), we obtain

||A w_{k−1} − Y_∞ g_{k−1}|| = ||A w_{k−1} − Y_∞ w̄_{k−1}|| ≤ σ ||w_{k−1} − Y_∞ g_{k−1}||. (50)

It follows from the definition of g_k that

(∇f_k − ∇f_{k−1}) − Y_∞ (g_k − g_{k−1}) = (I_n − (1/n) Y_∞ 1_n 1_n')(∇f_k − ∇f_{k−1}). (51)

Since (1/n) Y_∞ 1_n 1_n' = A_∞, where A_∞ = lim_{k→∞} A^k, we obtain

||(∇f_k − ∇f_{k−1}) − Y_∞ (g_k − g_{k−1})|| ≤ ɛ l ||z_k − z_{k−1}||,

where in the preceding relation we use the Lipschitz-continuous-gradient assumption, Assumption A1(a). Therefore, we have

||w_k − Y_∞ g_k|| ≤ σ ||w_{k−1} − Y_∞ g_{k−1}|| + ɛ l ||z_k − z_{k−1}||. (52)

We now bound ||z_k − z_{k−1}||. Note that, since 1_n' ∇f(z*_n) = 0,

||h_k|| = ||(1/n) 1_n 1_n' ∇f(x̄_k)|| ≤ l ||x̄_k − z*_n||. (53)

As a result, we have

||Y_k^{-1} w_{k−1}|| ≤ ||Y_k^{-1}(w_{k−1} − Y_∞ g_{k−1})|| + ||Y_k^{-1} Y_∞ (g_{k−1} − h_{k−1})|| + ||Y_k^{-1} Y_∞ h_{k−1}||
≤ ỹ ||w_{k−1} − Y_∞ g_{k−1}|| + yỹl ||z_{k−1} − x̄_{k−1}|| + yỹl ||x̄_{k−1} − z*_n||
≤ ỹ ||w_{k−1} − Y_∞ g_{k−1}|| + yỹl ||x̄_{k−1} − z*_n|| + yỹ²l ||x_{k−1} − Y_∞ x̄_{k−1}|| + yỹ²lT γ_1^{k−1} ||x_{k−1}||, (54)

where the last inequality uses Eq. (48). With the upper bound on ||Y_k^{-1} w_{k−1}|| in the preceding relation, and noting that (A − I_n) Y_∞ x̄_{k−1} = 0_n, we can bound ||z_k − z_{k−1}|| as follows:

||z_k − z_{k−1}|| ≤ ||Y_k^{-1}(x_k − x_{k−1})|| + ||(Y_k^{-1} − Y_{k−1}^{-1}) x_{k−1}||
≤ ||Y_k^{-1}(A − I_n)(x_{k−1} − Y_∞ x̄_{k−1})|| + α ||Y_k^{-1} w_{k−1}|| + ||(Y_k^{-1} − Y_{k−1}^{-1}) x_{k−1}||
≤ (τỹ + αyỹ²l) ||x_{k−1} − Y_∞ x̄_{k−1}|| + αỹ ||w_{k−1} − Y_∞ g_{k−1}|| + αyỹl ||x̄_{k−1} − z*_n|| + (αyl + 2) ỹ² T γ_1^{k−1} ||x_{k−1}||. (55)

By substituting Eq. (55) into Eq. (52), we obtain

||w_k − Y_∞ g_k|| ≤ (ɛlτỹ + αɛl²yỹ²) ||x_{k−1} − Y_∞ x̄_{k−1}|| + αɛl²yỹ ||x̄_{k−1} − z*_n|| + (σ + αɛlỹ) ||w_{k−1} − Y_∞ g_{k−1}|| + (αyl + 2) ɛlỹ² T γ_1^{k−1} ||x_{k−1}||. (56)

Step 4: Combining Eq. (43) in Step 1, Eq. (49) in Step 2, and Eq. (56) in Step 3 completes the proof.

VI.
NUMERICAL EXPERIMENTS

In this section, we compare the performance of algorithms solving the distributed consensus optimization problem over directed graphs, including ADD-OPT, DEXTRA [37], Gradient-Push [25], Directed-Distributed Gradient Descent [31], and Weight-Balancing Distributed Gradient Descent [34]. Our numerical experiments are based on the distributed logistic regression problem over a directed graph:

z* = argmin_{z ∈ R^p} (β/2) ||z||² + Σ_{i=1}^n Σ_{j=1}^{m_i} ln[1 + exp(−b_ij c_ij' z)],

where each agent i has access to m_i training examples, (c_ij, b_ij) ∈ R^p × {−1, +1}, in which c_ij contains the p features of the jth training example of agent i, and b_ij is the corresponding label. This problem can be formulated in the form of P1 with the private objective function f_i being

f_i = (β/(2n)) ||z||² + Σ_{j=1}^{m_i} ln[1 + exp(−b_ij c_ij' z)].
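The private objective f_i above is smooth and, through the regularization term, strongly-convex, so Assumption A1 applies. A sketch of its gradient follows; the helper name `local_logistic_grad` is ours, not the paper's:

```python
import numpy as np

def local_logistic_grad(z, C, b, beta, n):
    """Gradient of the private objective
       f_i(z) = beta/(2n) ||z||^2 + sum_j ln(1 + exp(-b_ij * c_ij' z)).

    C : (m_i, p) matrix whose rows are the features c_ij
    b : (m_i,) labels in {-1, +1}
    """
    margins = b * (C @ z)
    # d/dz ln(1 + exp(-m_j)) = -b_j * c_j / (1 + exp(m_j))
    return (beta / n) * z - C.T @ (b / (1.0 + np.exp(margins)))
```

A quick finite-difference check against the scalar objective confirms the formula; plugging this gradient into the ADD-OPT iteration of Eq. (2) reproduces the experimental setup described below.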

In our setting, we have n = 10, m_i = 10 for all i, and p = 3. The network topology is described in Fig. 1.

Fig. 1: A strongly-connected directed network.

In the implementation of the algorithms, we apply the local degree weighting strategy to all of them, i.e., each agent assigns equal weights to itself and its out-neighbors according to its own out-degree. In our first experiment, we compare the convergence rates of ADD-OPT and the other methods designed for directed graphs. The step-size used in Gradient-Push, Directed-Distributed Gradient Descent, and Weight-Balancing Distributed Gradient Descent is α_k = 1/k. The constant step-size used in DEXTRA and ADD-OPT is α = 1. It can be seen that ADD-OPT and DEXTRA have a fast, linear convergence rate, while the other methods are sublinear. The convergence-rate comparison between the different algorithms is shown in Fig. 2.

Fig. 2: Convergence rates between optimization methods for directed networks.

The second experiment compares ADD-OPT and DEXTRA in terms of their step-size ranges; see Fig. 3.

Fig. 3: Comparison between ADD-OPT and DEXTRA in terms of step-size ranges.

It can be found in Fig. 4 that the practical upper bound of the step-size is much bigger. In the implementation of ADD-OPT, when trying to estimate ᾱ, we set α ≤ 1/(10l), assuming that l is not global knowledge; this is an empirical estimate. Finally, we also show that the relation between convergence speed and step-size does not simply satisfy any linear function; this can also be found in Fig. 4.

Fig. 4: The range of ADD-OPT's step-size.
We stick to the same local degree weighting strategy for both ADD-OPT and DEXTRA. It is shown in Fig. 3 that the greatest lower bound of DEXTRA's step-size is around α̲ = 0.2. Since the value of α̲ requires global knowledge, it is hard for agents to estimate it in a distributed implementation. Once agents pick some value α < 0.2, DEXTRA diverges. In contrast, agents implementing ADD-OPT can pick arbitrarily small values to ensure convergence. According to the setting, we can calculate that τ = 1.25, ɛ = 1.11, y = 1.96, ỹ = 2.2, and l = 1. It is also satisfied that σ < 1. Therefore, we can estimate ᾱ = 0.3.

VII. CONCLUSIONS

In this paper, we focus on solving the distributed consensus optimization problem over directed graphs. The proposed algorithm, termed ADD-OPT, can be viewed as an improvement of our recent work, DEXTRA. ADD-OPT converges at a linear rate, O(µ^k) for 0 < µ < 1, given that the objective functions are strongly-convex, where k is the number of iterations. Compared to DEXTRA, ADD-OPT has a wider and more realistic range of step-sizes for convergence. In particular, the greatest lower bound of DEXTRA's step-size is strictly greater than zero, while that of ADD-OPT equals exactly zero. This guarantees the convergence of ADD-OPT in a distributed implementation as long as agents pick

Therefore, ADD-OPT is more reliable in a distributed setting. We also provide a much more compact proof of the linear convergence rate of ADD-OPT compared with DEXTRA. Simulation examples further illustrate the improvements.

REFERENCES

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, Jan.
[2] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: Scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5.
[3] I. Necoara and J. A. K. Suykens, "Application of a smoothing technique to decomposition in convex optimization," IEEE Transactions on Automatic Control, vol. 53, no. 11, Dec.
[4] G. Mateos, J. A. Bazerque, and G. B. Giannakis, "Distributed sparse linear regression," IEEE Transactions on Signal Processing, vol. 58, no. 10, Oct.
[5] J. A. Bazerque and G. B. Giannakis, "Distributed spectrum sensing for cognitive radio networks by exploiting sparsity," IEEE Transactions on Signal Processing, vol. 58, no. 3, Mar.
[6] M. Rabbat and R. Nowak, "Distributed optimization in sensor networks," in 3rd International Symposium on Information Processing in Sensor Networks, Berkeley, CA, Apr. 2004.
[7] U. A. Khan, S. Kar, and J. M. F. Moura, "DILAND: An algorithm for distributed sensor localization with noisy distance measurements," IEEE Transactions on Signal Processing, vol. 58, no. 3, Mar.
[8] C. Li and L. Li, "A distributed multiple dimensional QoS constrained resource scheduling optimization policy in computational grid," Journal of Computer and System Sciences, vol. 72, no. 4.
[9] G. Neglia, G. Reina, and S. Alouf, "Distributed gradient optimization for epidemic routing: A preliminary evaluation," in 2nd IFIP IEEE Wireless Days, Paris, Dec. 2009.
[10] A. Nedic and A.
Ozdaglar, "Distributed subgradient methods for multi-agent optimization," IEEE Transactions on Automatic Control, vol. 54, no. 1, Jan.
[11] A. Nedic, A. Ozdaglar, and P. A. Parrilo, "Constrained consensus and optimization in multi-agent networks," IEEE Transactions on Automatic Control, vol. 55, no. 4, Apr.
[12] I. Lobel and A. Ozdaglar, "Distributed subgradient methods for convex optimization over random networks," IEEE Transactions on Automatic Control, vol. 56, no. 6, Jun.
[13] S. S. Ram, A. Nedic, and V. V. Veeravalli, "Distributed stochastic subgradient projection algorithms for convex optimization," Journal of Optimization Theory and Applications, vol. 147, no. 3.
[14] K. Yuan, Q. Ling, and W. Yin, "On the convergence of decentralized gradient descent," arXiv preprint arXiv:1310.7063.
[15] D. Jakovetic, J. Xavier, and J. M. F. Moura, "Fast distributed gradient methods," IEEE Transactions on Automatic Control, vol. 59, no. 5.
[16] W. Shi, Q. Ling, G. Wu, and W. Yin, "EXTRA: An exact first-order algorithm for decentralized consensus optimization," SIAM Journal on Optimization, vol. 25, no. 2, pp. 944-966, 2015.
[17] G. Qu and N. Li, "Harnessing smoothness to accelerate distributed optimization," arXiv preprint.
[18] J. F. C. Mota, J. M. F. Xavier, P. M. Q. Aguiar, and M. Puschel, "D-ADMM: A communication-efficient distributed algorithm for separable optimization," IEEE Transactions on Signal Processing, vol. 61, no. 10, May.
[19] E. Wei and A. Ozdaglar, "Distributed alternating direction method of multipliers," in 51st IEEE Annual Conference on Decision and Control, Dec. 2012.
[20] W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin, "On the linear convergence of the ADMM in decentralized consensus optimization," IEEE Transactions on Signal Processing, vol. 62, no. 7, Apr.
[21] A. Jadbabaie, J. Lin, and A. S. Morse, "Coordination of groups of mobile autonomous agents using nearest neighbor rules," IEEE Transactions on Automatic Control, vol. 48, no. 6, Jun.
[22] R.
Olfati-Saber and R. M. Murray, "Consensus problems in networks of agents with switching topology and time-delays," IEEE Transactions on Automatic Control, vol. 49, no. 9, Sep.
[23] R. Olfati-Saber and R. M. Murray, "Consensus protocols for networks of dynamic agents," in IEEE American Control Conference, Denver, Colorado, Jun. 2003, vol. 2.
[24] L. Xiao, S. Boyd, and S. J. Kim, "Distributed average consensus with least-mean-square deviation," Journal of Parallel and Distributed Computing, vol. 67, no. 1.
[25] A. Nedic and A. Olshevsky, "Distributed optimization over time-varying directed graphs," IEEE Transactions on Automatic Control, vol. PP, no. 99, pp. 1-1.
[26] K. I. Tsianos, S. Lawlor, and M. G. Rabbat, "Push-sum distributed dual averaging for convex optimization," in 51st IEEE Annual Conference on Decision and Control, Maui, Hawaii, Dec. 2012.
[27] K. I. Tsianos, The Role of the Network in Distributed Optimization Algorithms: Convergence Rates, Scalability, Communication/Computation Tradeoffs and Communication Delays, Ph.D. thesis, Dept. Elect. Comp. Eng., McGill University.
[28] K. I. Tsianos, S. Lawlor, and M. G. Rabbat, "Consensus-based distributed optimization: Practical issues and applications in large-scale machine learning," in 50th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, Oct. 2012.
[29] D. Kempe, A. Dobra, and J. Gehrke, "Gossip-based computation of aggregate information," in 44th Annual IEEE Symposium on Foundations of Computer Science, Oct. 2003.
[30] F. Benezit, V. Blondel, P. Thiran, J. Tsitsiklis, and M. Vetterli, "Weighted gossip: Distributed averaging using non-doubly stochastic matrices," in IEEE International Symposium on Information Theory, Jun. 2010.
[31] C. Xi, Q. Wu, and U. A. Khan, "Distributed gradient descent over directed graphs," arXiv preprint.
[32] C. Xi and U. A. Khan, "Distributed subgradient projection algorithm over directed graphs," arXiv preprint.
[33] K.
Cai and H. Ishii, "Average consensus on general strongly connected digraphs," Automatica, vol. 48, no. 11.
[34] A. Makhdoumi and A. Ozdaglar, "Graph balancing for distributed subgradient methods over directed graphs," to appear in 54th IEEE Annual Conference on Decision and Control.
[35] L. Hooi-Tong, "On a class of directed graphs with an application to traffic-flow problems," Operations Research, vol. 18, no. 1.
[36] A. Nedic and A. Olshevsky, "Distributed optimization of strongly convex functions on directed time-varying graphs," in IEEE Global Conference on Signal and Information Processing, Dec. 2013.
[37] C. Xi and U. A. Khan, "On the linear convergence of distributed optimization over directed graphs," arXiv preprint, 2015.

[38] J. Zeng and W. Yin, "ExtraPush for convex smooth decentralized optimization over directed networks," arXiv preprint.
[39] A. Olshevsky and J. N. Tsitsiklis, "Convergence speed in distributed consensus and averaging," SIAM Journal on Control and Optimization, vol. 48, no. 1.
[40] S. Bubeck, "Convex optimization: Algorithms and complexity," arXiv preprint, 2014.


More information

Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems

Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems 1 Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems Mauro Franceschelli, Andrea Gasparri, Alessandro Giua, and Giovanni Ulivi Abstract In this paper the formation stabilization problem

More information

Distributed Computation of Quantiles via ADMM

Distributed Computation of Quantiles via ADMM 1 Distributed Computation of Quantiles via ADMM Franck Iutzeler Abstract In this paper, we derive distributed synchronous and asynchronous algorithms for computing uantiles of the agents local values.

More information

Distributed Averaging in Sensor Networks Based on Broadcast Gossip Algorithms

Distributed Averaging in Sensor Networks Based on Broadcast Gossip Algorithms 1 Distributed Averaging in Sensor Networks Based on Broadcast Gossip Algorithms Mauro Franceschelli, Alessandro Giua and Carla Seatzu. Abstract In this paper we propose a new decentralized algorithm to

More information

Distributed Control of the Laplacian Spectral Moments of a Network

Distributed Control of the Laplacian Spectral Moments of a Network Distributed Control of the Laplacian Spectral Moments of a Networ Victor M. Preciado, Michael M. Zavlanos, Ali Jadbabaie, and George J. Pappas Abstract It is well-nown that the eigenvalue spectrum of the

More information

Average-Consensus of Multi-Agent Systems with Direct Topology Based on Event-Triggered Control

Average-Consensus of Multi-Agent Systems with Direct Topology Based on Event-Triggered Control Outline Background Preliminaries Consensus Numerical simulations Conclusions Average-Consensus of Multi-Agent Systems with Direct Topology Based on Event-Triggered Control Email: lzhx@nankai.edu.cn, chenzq@nankai.edu.cn

More information

On Quantized Consensus by Means of Gossip Algorithm Part I: Convergence Proof

On Quantized Consensus by Means of Gossip Algorithm Part I: Convergence Proof Submitted, 9 American Control Conference (ACC) http://www.cds.caltech.edu/~murray/papers/8q_lm9a-acc.html On Quantized Consensus by Means of Gossip Algorithm Part I: Convergence Proof Javad Lavaei and

More information

arxiv: v2 [eess.sp] 20 Nov 2017

arxiv: v2 [eess.sp] 20 Nov 2017 Distributed Change Detection Based on Average Consensus Qinghua Liu and Yao Xie November, 2017 arxiv:1710.10378v2 [eess.sp] 20 Nov 2017 Abstract Distributed change-point detection has been a fundamental

More information

An Optimization-based Approach to Decentralized Assignability

An Optimization-based Approach to Decentralized Assignability 2016 American Control Conference (ACC) Boston Marriott Copley Place July 6-8, 2016 Boston, MA, USA An Optimization-based Approach to Decentralized Assignability Alborz Alavian and Michael Rotkowitz Abstract

More information

IN the multiagent systems literature, the consensus problem,

IN the multiagent systems literature, the consensus problem, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 63, NO. 7, JULY 206 663 Periodic Behaviors for Discrete-Time Second-Order Multiagent Systems With Input Saturation Constraints Tao Yang,

More information

Decentralized Estimation of the Algebraic Connectivity for Strongly Connected Networks

Decentralized Estimation of the Algebraic Connectivity for Strongly Connected Networks Decentralized Estimation of the Algebraic Connectivity for Strongly Connected Networs Hasan A Poonawala and Mar W Spong Abstract The second smallest eigenvalue λ 2 (L) of the Laplacian L of a networ G

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Stability Analysis of Stochastically Varying Formations of Dynamic Agents

Stability Analysis of Stochastically Varying Formations of Dynamic Agents Stability Analysis of Stochastically Varying Formations of Dynamic Agents Vijay Gupta, Babak Hassibi and Richard M. Murray Division of Engineering and Applied Science California Institute of Technology

More information

Learning distributions and hypothesis testing via social learning

Learning distributions and hypothesis testing via social learning UMich EECS 2015 1 / 48 Learning distributions and hypothesis testing via social learning Anand D. Department of Electrical and Computer Engineering, The State University of New Jersey September 29, 2015

More information

Hegselmann-Krause Dynamics: An Upper Bound on Termination Time

Hegselmann-Krause Dynamics: An Upper Bound on Termination Time Hegselmann-Krause Dynamics: An Upper Bound on Termination Time B. Touri Coordinated Science Laboratory University of Illinois Urbana, IL 680 touri@illinois.edu A. Nedić Industrial and Enterprise Systems

More information

On Distributed Optimization using Generalized Gossip

On Distributed Optimization using Generalized Gossip For presentation in CDC 015 On Distributed Optimization using Generalized Gossip Zhanhong Jiang Soumik Sarkar Kushal Mukherjee zhjiang@iastate.edu soumiks@iastate.edu MukherK@utrc.utc.com Department of

More information

A Unified Approach to Proximal Algorithms using Bregman Distance

A Unified Approach to Proximal Algorithms using Bregman Distance A Unified Approach to Proximal Algorithms using Bregman Distance Yi Zhou a,, Yingbin Liang a, Lixin Shen b a Department of Electrical Engineering and Computer Science, Syracuse University b Department

More information

Distributed Receding Horizon Control of Cost Coupled Systems

Distributed Receding Horizon Control of Cost Coupled Systems Distributed Receding Horizon Control of Cost Coupled Systems William B. Dunbar Abstract This paper considers the problem of distributed control of dynamically decoupled systems that are subject to decoupled

More information

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16 XVI - 1 Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16 A slightly changed ADMM for convex optimization with three separable operators Bingsheng He Department of

More information

Consensus Problems on Small World Graphs: A Structural Study

Consensus Problems on Small World Graphs: A Structural Study Consensus Problems on Small World Graphs: A Structural Study Pedram Hovareshti and John S. Baras 1 Department of Electrical and Computer Engineering and the Institute for Systems Research, University of

More information

1 The independent set problem

1 The independent set problem ORF 523 Lecture 11 Spring 2016, Princeton University Instructor: A.A. Ahmadi Scribe: G. Hall Tuesday, March 29, 2016 When in doubt on the accuracy of these notes, please cross chec with the instructor

More information

Communication/Computation Tradeoffs in Consensus-Based Distributed Optimization

Communication/Computation Tradeoffs in Consensus-Based Distributed Optimization Communication/Computation radeoffs in Consensus-Based Distributed Optimization Konstantinos I. sianos, Sean Lawlor, and Michael G. Rabbat Department of Electrical and Computer Engineering McGill University,

More information

Asymmetric Information Diffusion via Gossiping on Static And Dynamic Networks

Asymmetric Information Diffusion via Gossiping on Static And Dynamic Networks Asymmetric Information Diffusion via Gossiping on Static And Dynamic Networks Ercan Yildiz 1, Anna Scaglione 2, Asuman Ozdaglar 3 Cornell University, UC Davis, MIT Abstract In this paper we consider the

More information

Algorithm-Hardware Co-Optimization of Memristor-Based Framework for Solving SOCP and Homogeneous QCQP Problems

Algorithm-Hardware Co-Optimization of Memristor-Based Framework for Solving SOCP and Homogeneous QCQP Problems L.C.Smith College of Engineering and Computer Science Algorithm-Hardware Co-Optimization of Memristor-Based Framework for Solving SOCP and Homogeneous QCQP Problems Ao Ren Sijia Liu Ruizhe Cai Wujie Wen

More information

Measurement partitioning and observational. equivalence in state estimation

Measurement partitioning and observational. equivalence in state estimation Measurement partitioning and observational 1 equivalence in state estimation Mohammadreza Doostmohammadian, Student Member, IEEE, and Usman A. Khan, Senior Member, IEEE arxiv:1412.5111v1 [cs.it] 16 Dec

More information

Continuous-time Distributed Convex Optimization with Set Constraints

Continuous-time Distributed Convex Optimization with Set Constraints Preprints of the 19th World Congress The International Federation of Automatic Control Cape Town, South Africa. August 24-29, 214 Continuous-time Distributed Convex Optimization with Set Constraints Shuai

More information