Decentralized Consensus Optimization with Asynchrony and Delay


Tianyu Wu¹, Kun Yuan², Qing Ling³, Wotao Yin¹, and Ali H. Sayed²
¹Department of Mathematics, ²Department of Electrical Engineering, University of California, Los Angeles
³Department of Automation, University of Science and Technology of China
Emails: {wuty,kunyuan,wotaoyin,sayed}@ucla.edu, qingling@mail.ustc.edu.cn
(This work was supported in part by NSF grants DMS-1317602 and ECCS-1407712, an NSF CCF grant, an NSF China grant, and a DARPA project.)

Abstract. We propose an asynchronous, decentralized algorithm for consensus optimization. The algorithm runs in a network of agents, where the agents perform local computation and communicate with neighbors. We design our algorithm so that the agents can compute and communicate independently, at different times, for different durations. This reduces the waiting time for the slowest agent or the longest communication delay, and it also eliminates the need for a global clock. Mathematically, our algorithm involves both primal and dual variables, uses fixed parameters, and has convergence guarantees under a bounded-delay assumption and a random-agent assumption. When running synchronously, its performance matches the current state-of-the-art algorithms (for example, PG-EXTRA, which, however, fails to converge without synchronization). Through simulations, we demonstrate that our asynchronous algorithm converges much faster than it does under synchronization.

Index Terms: decentralized, asynchronous, delay, consensus optimization.

I. INTRODUCTION AND RELATED WORK

This paper considers a connected network of $n$ agents that cooperatively solve the consensus optimization problem

$$\min_{x \in \mathbb{R}^p} \bar f(x) := \sum_{i=1}^n f_i(x), \quad \text{where } f_i(x) := s_i(x) + r_i(x), \; i = 1, \ldots, n. \tag{1}$$

We assume that the functions $s_i, r_i : \mathbb{R}^p \to \mathbb{R}$ are convex, with each $s_i$ differentiable and each $r_i$ possibly nondifferentiable. We call $f_i = s_i + r_i$ a composite objective function. Each $s_i$ and $r_i$ is kept private by agent $i = 1, 2, \ldots, n$, and $r_i$ often serves as a regularization term or as the indicator function of a constraint on $x$.

Decentralized algorithms rely on the agents' local computation, as well as on information exchange between agents. Such algorithms are generally robust to the failure of critical relaying agents and scale well with network size. In the decentralized setting, it is inefficient to synchronize multiple nodes and links. To see this, let $x_{(i)} \in \mathbb{R}^p$ be the local variable of agent $i$, and let $x = [x_{(1)}; \ldots; x_{(n)}] \in \mathbb{R}^{n \times p}$ stack all local variables. To perform an iteration that updates the entire $x^k$ to $x^{k+1}$, all the agents must wait for the slowest agent or the longest communication delay. Hence, the performance is determined by the worst case, not the average. In addition, a global coordinator or clock must be implemented, making it expensive to build, configure, and scale the network.

This paper proposes a new asynchronous decentralized algorithm (which also works if synchronized, of course). Consider a connected network $G = \{V, E\}$ with agents $V = \{1, 2, \ldots, n\}$ and undirected edges $E = \{1, 2, \ldots, m\}$. By convention, all edges $(i, j) \in E$ obey $i < j$. Our algorithm involves both node variables $x = [x_{(1)}; \ldots; x_{(n)}]$ and edge variables $y = [y_{(1)}; \ldots; y_{(m)}]$. We associate each row of $y$ with an edge $e = (i, j) \in E$, and for simplicity we let agent $i$ keep the variable $y_{(e)}$ (this choice is arbitrary).
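To make the composite objective in (1) concrete, here is a minimal sketch, added for illustration (not from the paper), of one agent's $f_i = s_i + r_i$ with a least-squares loss and an $\ell_1$ regularizer, the same pair used in the experiments of Section IV; `A_i`, `b_i`, and `gamma` are placeholder data and parameters:

```python
import numpy as np

# One agent's composite objective f_i = s_i + r_i from problem (1):
# s_i is smooth (least squares); r_i is convex but nondifferentiable (l1).
def s_i(x, A_i, b_i):
    return 0.5 * np.linalg.norm(A_i @ x - b_i) ** 2

def grad_s_i(x, A_i, b_i):
    return A_i.T @ (A_i @ x - b_i)

def r_i(x, gamma):
    return gamma * np.linalg.norm(x, 1)

def prox_r_i(w, gamma, alpha):
    # prox_{alpha r_i}(w): entrywise soft-thresholding for the l1 norm.
    return np.sign(w) * np.maximum(np.abs(w) - alpha * gamma, 0.0)
```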
It is easiest to first present our algorithm in its abstract, synchronous form:

$$x^{k+1} \leftarrow T_x(x^k, y^k), \tag{2a}$$
$$y^{k+1} \leftarrow T_y(x^k, y^k), \tag{2b}$$

where $T_x, T_y$ are operators suitable for decentralized implementation. (Their expressions are given in Section II-A.) The performance of (2) matches the state-of-the-art synchronous algorithms (PG-)EXTRA [1], [2] and ADMM [3]-[5].

In our asynchronous setting, agents can compute and communicate independently, at different moments, for different durations. There is no coordination whatsoever. As such, we must count iterations in a new way: $k$ is incremented whenever any agent completes a round of its computation (ties are broken arbitrarily). Suppose that this occurs at agent $i$, which we let perform the following updates to its local variables:

$$x_{(i)}^{k+1} \leftarrow T_i^x(x^{k-\tau^k}, y^{k-\delta^k}), \tag{3a}$$
$$y_{(i,j)}^{k+1} \leftarrow T_{(i,j)}^y(x^{k-\tau^k}, y^{k-\delta^k}), \quad \forall j : (i, j) \in E. \tag{3b}$$

The rows of $x, y$ not held by agent $i$ remain unchanged from $k$ to $k+1$. In (3), $T_i^x$ and $T_{(i,j)}^y$ are the sub-operators of $T_x$ and $T_y$ corresponding to agent $i$ and edge $(i, j)$, respectively. Again, their computation uses only those entries of $x^{k-\tau^k}, y^{k-\delta^k}$ held by agent $i$ and its neighbors. We let $\tau^k \in \mathbb{R}_+^n$ and $\delta^k \in \mathbb{R}_+^m$ be vectors of delays. If $j$ is a neighbor of $i$, then the $j$th row of $x^{k-\tau^k}$ is $x_{(j)}^{k-\tau_j^k}$, meaning that agent $i$ uses a copy of $x_{(j)}$ that is $\tau_j^k$ iterations out of date.

Mathematically, under uniformly bounded (but otherwise arbitrary) delays, and assuming that each update is performed by a random agent¹, we will show that the sequence $\{x^k\}$ converges to a solution of Problem (1) with probability one.

¹The index $i$ of the agent responsible for any $k$th update is random and independent of the indices responsible for the earlier updates $1, \ldots, k-1$.
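Operationally, the delayed iterate $x^{k-\tau^k}$ in (3) simply means that agent $i$ computes with the last copies it has received from each neighbor. A minimal sketch of this bookkeeping, with hypothetical class and method names of our choosing:

```python
import numpy as np

class LocalView:
    """What agent i knows: its own current row of x and the last
    received (possibly stale) copies of its neighbors' rows."""
    def __init__(self, i, x_i, neighbors):
        self.i = i
        self.x_i = x_i                             # own row, always current
        self.recv = {j: None for j in neighbors}   # last copy of x_(j)

    def on_receive(self, j, x_j):
        # Called whenever a (possibly delayed) message from neighbor j arrives.
        self.recv[j] = np.copy(x_j)

    def delayed_row(self, j):
        # Row j of x^{k - tau^k} as seen by agent i in (3): whatever copy
        # arrived last, no matter how many iterations old it is.
        return self.x_i if j == self.i else self.recv[j]
```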

[Fig. 1: Network and uncoordinated computing. A three-agent network in which agent 1 holds $x_{(1)}, y_{(1,2)}, y_{(1,3)}$, agent 2 holds $x_{(2)}, y_{(2,3)}$, and agent 3 holds $x_{(3)}$, together with a timeline of their uncoordinated updates.]

What can cause delays? Clearly, communication latency and bandwidth limits introduce delays. Besides, as agents start and finish their iterations independently, one agent may have updated its variables while its neighbors are still working on their current iterations, which still use older (i.e., delayed) copies of those variables. Hence, both computation and communication cause delays and account for positive $\tau^k$ and $\delta^k$. The two types of delay are, however, mathematically indistinguishable in (3).

Fig. 1 depicts a simple three-agent network and how the agents perform their computing. The graph has $V = \{1, 2, 3\}$ and $E = \{(1,2), (1,3), (2,3)\}$. The node variable $x_{(i)}$ and edge variable(s) $y_{(i,j)}$ are assigned to agent $i$. The iteration indices $k = 1, 2, \ldots$ are assigned at the moments when agents finish their updates. Not depicted in Fig. 1 is that updated variables may take time to arrive at neighbors. As mentioned above, both communication and uncoordinated computing can cause delays.

A. Relationship to certain synchronous algorithms

Our algorithm, if run synchronously, can be algebraically reduced to PG-EXTRA [2]; both solve Problem (1) with a fixed step-size parameter and are typically faster than algorithms using diminishing step sizes. Both algorithms also generalize EXTRA [1], which only deals with differentiable functions. However, divergence (or convergence to wrong solutions) is observed when we run EXTRA and PG-EXTRA in our asynchronous setting. Our algorithm works correctly, and for this we must introduce the variable $y$, which incurs the moderate cost of updating and communicating $y$. Our algorithm is also very different from decentralized ADMM [3]-[5], except that both algorithms can use fixed parameters. Distributed gradient descent [6], [7] and (Prox-)diffusion methods [8], [9] also use fixed step sizes, but they converge fast only to approximate solutions.

B. Related decentralized algorithms under different settings

Our setting of asynchrony differs from randomized single activation, which is assumed by the randomized gossip algorithms [10], [11]. That setting activates only one edge at a time and does not allow delay: before each activation, the computation and communication associated with previous activations must have completed, and only one edge in each neighborhood can be active at any time. Likewise, our setting differs from randomized multi-activation in papers such as [12], [13] for consensus averaging and [14]-[19] for consensus optimization, which activate multiple edges each time and still do not allow delay. These algorithms can alternatively be viewed as synchronous algorithms running on a sequence of varying subgraphs. Since each iteration waits for the slowest agent or the longest communication among those previously activated, a coordinator or global clock is still needed.

We shall also distinguish our setting from the fixed-communication-delay setting [20], [21], where the information passing through each edge takes a fixed number of iterations to arrive. (Different edges can have different such numbers, and agents compute with only the information they have, instead of waiting.) As demonstrated in [20], this setting can be transformed into one with no communication delay by replacing an edge with a chain of dummy nodes: information passing through a chain of $\tau$ dummy nodes simulates an edge with a $\tau$-iteration delay.
The computation in this setting is synchronous, so a coordinator or global clock is still needed. Other work [20], [22] considers random communication delays. However, those algorithms are only suitable for consensus averaging, not yet for the more general problem (1). Our setting is identical to the setting outlined in Section 2.6 of [23], where the introduced asynchronous decentralized ADMM allows both computation and communication delays. Our algorithm, however, handles composite functions.

C. Contributions

This paper introduces a decentralized algorithm for Problem (1) that has provable convergence when each update is performed by a random agent and when communication is subject to arbitrary but bounded delays. If run synchronously, our algorithm is as fast as the state-of-the-art algorithms, except that, to allow asynchrony, it involves updating and communicating the edge variable $y$. When our algorithm runs asynchronously, it eliminates waiting and is significantly faster. Our asynchronous setting is considerably less restrictive than the settings under which recent non-synchronous or non-deterministic decentralized algorithms were proposed. In our setting, the computation and communication of the agents are uncoordinated; a global clock is no longer needed.

Technical contributions are also made. We borrow ideas from monotone operator theory and primal-dual operator splitting to derive our algorithm in just a few steps. (The edge variable $y$ is our dual variable.) To establish convergence under information delays, we do not simply follow the existing analysis of PG-EXTRA; instead, motivated by [23], [24], a new non-Euclidean metric is introduced to absorb the delays. In machine learning, developing asynchronous algorithms for Problem (1) has become a hot topic. Unlike the existing majority, our algorithm involves both primal and dual variables; hence, our analysis is different. In particular, we cannot rely on the monotonicity of objective values. Instead, our approach establishes monotonic conditional expectations of certain distances to the solution set. We believe this new analysis can extend to a general class of primal-dual algorithms.

D. Discussion and weaknesses

As the reader will soon see, our algorithm involves a relaxation parameter that depends on the bound on the delays, which is a weakness.

As a matter of fact, requiring the delays to be bounded is itself a weakness. However, with more careful analysis, the bound can be relaxed, and the relaxation parameter can be set adaptively to the local delays (i.e., to how out of date an agent's knowledge of its neighbors' variables is). We leave this work to a forthcoming longer report. A weakness of both (PG-)EXTRA and our algorithm is that the step-size parameter depends on a global Lipschitz constant. Unless the constant is known a priori, we must apply consensus averaging to obtain it. This weakness can be overcome by assigning a parameter to each agent; however, we have to leave that to the longer report as well.

A weakness that is difficult to overcome is the random-agent assumption. It is not always practical, but if we dropped it completely, we would face the worst deterministic case, which requires impractical assumptions on the problem. The real world lies somewhere between the worst deterministic case and the ideal random case. Our results shall shed some light on how to control an asynchronous algorithm in practice. Finally, we have not extended our algorithm to handle directed graphs, as the synchronous algorithm ExtraPush [25] does.

E. Notation

Each agent $i$ holds a local variable $x_{(i)} \in \mathbb{R}^p$, whose value at iteration $k$ is denoted by $x_{(i)}^k$. We introduce the variable $x$ to stack all local variables $x_{(i)}$:

$$x := \begin{bmatrix} x_{(1)} \\ x_{(2)} \\ \vdots \\ x_{(n)} \end{bmatrix} \in \mathbb{R}^{n \times p}. \tag{4}$$

The $i$th row of $x$ is the local variable $x_{(i)} \in \mathbb{R}^p$ kept by agent $i$. Now we define

$$s(x) := \sum_{i=1}^n s_i(x_{(i)}), \qquad r(x) := \sum_{i=1}^n r_i(x_{(i)}), \tag{5}$$

as well as $f(x) := \sum_{i=1}^n f_i(x_{(i)}) = s(x) + r(x)$. We then define the gradient of $s(x)$ as

$$\nabla s(x) := \begin{bmatrix} \nabla s_1(x_{(1)}) \\ \nabla s_2(x_{(2)}) \\ \vdots \\ \nabla s_n(x_{(n)}) \end{bmatrix} \in \mathbb{R}^{n \times p}. \tag{6}$$

The inner product on $\mathbb{R}^{n \times p}$ is defined as $\langle x, \tilde x \rangle = \mathrm{tr}(x^T \tilde x) = \sum_{i=1}^n \langle x_{(i)}, \tilde x_{(i)} \rangle$, and the norm is defined as $\|x\| = \sqrt{\langle x, x \rangle}$.

II. ALGORITHMS

In our network $G = \{V, E\}$, to each edge $(i, j) \in E$ we assign a weight $w_{ij} \geq 0$, which agent $i$ uses to scale the $x_{(j)}$ it receives from agent $j$. Likewise, $w_{ji} = w_{ij}$ for agent $j$. If $(i, j) \notin E$, then $w_{ij} = w_{ji} = 0$. For each agent $i$, $\mathcal{N}_i$ denotes its neighborhood and $E_i$ denotes the set of all edges connected to it. Let $W = [w_{ij}] \in \mathbb{R}^{n \times n}$ denote the weight matrix, which is symmetric and assumed to be doubly stochastic. Such a $W$ can be generated through the maximum-degree [26] or Metropolis-Hastings rules [26]. It is easy to verify that $\mathrm{null}\{I - W\} = \mathrm{span}\{\mathbf{1}\}$.

Introduce the diagonal matrix $D \in \mathbb{R}^{m \times m}$ with diagonal entries $D_{e,e} = \sqrt{w_{ij}/2}$ for each edge $e = (i, j)$. Let $C = [c_{ei}] \in \mathbb{R}^{m \times n}$ be the incidence matrix of $G$, and define

$$V := DC \tag{7}$$

as the scaled incidence matrix. It is easy to verify:

Proposition 1 (Matrix factorization).
$$\tfrac{1}{2}(I - W) = V^T V. \tag{8}$$
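As a quick numerical check of Proposition 1, the following sketch (ours, added for illustration; the helper name `build_W_V` is not from the paper) constructs Metropolis-Hastings weights, the incidence matrix $C$, and $V = DC$ for the triangle graph of Fig. 1 and verifies the factorization (8):

```python
import numpy as np

def build_W_V(n, edges):
    """Metropolis-Hastings weight matrix W and scaled incidence V = DC."""
    deg = np.zeros(n)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0 / (1.0 + max(deg[i], deg[j]))
    W += np.diag(1.0 - W.sum(axis=1))     # self-weights: rows sum to one
    C = np.zeros((len(edges), n))         # row e of C: +1 at i, -1 at j
    d = np.zeros(len(edges))              # diagonal of D: sqrt(w_ij / 2)
    for e, (i, j) in enumerate(edges):
        C[e, i], C[e, j] = 1.0, -1.0
        d[e] = np.sqrt(W[i, j] / 2.0)
    return W, np.diag(d) @ C              # V = D C

W, Vmat = build_W_V(3, [(0, 1), (0, 2), (1, 2)])            # graph of Fig. 1
assert np.allclose(0.5 * (np.eye(3) - W), Vmat.T @ Vmat)    # Proposition 1
```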
A. Proposed primal-dual algorithm

Let us reformulate Problem (1). First, it is equivalent to

$$\min_{\{x_{(1)}, \ldots, x_{(n)}\}} \sum_{i=1}^n s_i(x_{(i)}) + r_i(x_{(i)}), \quad \text{subject to } x_{(1)} = x_{(2)} = \cdots = x_{(n)}. \tag{9}$$

Since $\mathrm{null}\{I - W\} = \mathrm{span}\{\mathbf{1}\}$, Problem (9) is equivalent to

$$\min_{x \in \mathbb{R}^{n \times p}} s(x) + r(x), \quad \text{subject to } (I - W)x = 0. \tag{10}$$

By Proposition 1, Problem (10) is further equivalent to

$$\min_{x \in \mathbb{R}^{n \times p}} s(x) + r(x), \quad \text{subject to } Vx = 0, \tag{11}$$

which can be reformulated into the saddle-point problem

$$\max_{y \in \mathbb{R}^{m \times p}} \min_{x \in \mathbb{R}^{n \times p}} s(x) + r(x) + \frac{1}{\alpha}\langle y, Vx \rangle, \tag{12}$$

where $\alpha > 0$ is a parameter and $y$ is the dual variable. Problem (12) can be solved iteratively by the primal-dual algorithm adapted from [27], [28]:

$$y^{k+1} = y^k + Vx^k, \qquad x^{k+1} = \mathrm{prox}_{\alpha r}\big[x^k - \alpha \nabla s(x^k) - V^T(2y^{k+1} - y^k)\big].^2 \tag{13}$$

Next, in the $x$-update, we eliminate $y^{k+1}$ by plugging in the $y$-update and, using $I - 2V^T V = W$, arrive at

$$y^{k+1} = y^k + Vx^k, \qquad x^{k+1} = \mathrm{prox}_{\alpha r}\big[Wx^k - \alpha \nabla s(x^k) - V^T y^k\big], \tag{14}$$

which computes $(y^{k+1}, x^{k+1})$ from $(y^k, x^k)$. Applying $W$, $V$, and $V^T$ requires communication; all other operations are local.

B. Synchronous algorithm

Alg. 1 implements the iteration (14) in the synchronous setting, which requires two synchronization barriers in each iteration $k$. The first barrier holds computing until an agent receives all necessary input; after the agent finishes updating its variables, the second barrier holds it from sending out its updates until all of its neighbors also finish computing theirs (otherwise, an update intended for iteration $k+1$ may arrive at a neighbor too early, entering its computation while it is still at iteration $k$). Note that the second barrier can be replaced by a buffer.

²Proximal operator: $\mathrm{prox}_{\alpha r}(w) := \arg\min_v\, r(v) + \frac{1}{2\alpha}\|v - w\|^2$.
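For concreteness, here is a minimal centralized simulation of iteration (14), added as a sketch; it reuses `prox_r_i` and the `W`, `Vmat` from the snippets above and assumes $\ell_1$ regularization, so $\mathrm{prox}_{\alpha r}$ is entrywise soft-thresholding. In an actual decentralized deployment, each agent would compute only its own rows.

```python
def grad_s(x, A, b):
    # Stacked gradient (6): row i is the gradient of the least-squares
    # loss s_i at x_(i); A and b are lists of per-agent data.
    return np.stack([A[i].T @ (A[i] @ x[i] - b[i]) for i in range(len(A))])

def iterate_14(x, y, A, b, alpha, gamma, W, Vmat):
    """One pass of (14): x is the (n, p) primal stack, y the (m, p) dual."""
    y_new = y + Vmat @ x                                    # dual update
    x_new = prox_r_i(W @ x - alpha * grad_s(x, A, b) - Vmat.T @ y,
                     gamma, alpha)                          # primal update
    return x_new, y_new
```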

Algorithm 1: Synchronous algorithm based on (14)

Input: starting point $x^0, y^0$. Set counter $k = 0$;
while all agents $i \in V$ in parallel do
  Wait until $x_{(j)}^k$, $j \in \mathcal{N}_i$, and $y_{(l)}^k$, $l \in E_i$, are received;
  Compute:
    $x_{(i)}^{k+1} = \mathrm{prox}_{\alpha r_i}\big(\sum_{j \in \mathcal{N}_i \cup \{i\}} w_{ij}\, x_{(j)}^k - \alpha \nabla s_i(x_{(i)}^k) - \sum_{l \in E_i} V_{li}\, y_{(l)}^k\big)$;
    $y_{(e)}^{k+1} = y_{(e)}^k + \big(V_{ei}\, x_{(i)}^k + V_{ej}\, x_{(j)}^k\big)$, $e = (i, j) \in E$;
  Wait until all neighbors finish computing;
  Set $k \leftarrow k + 1$;
  Send out $x_{(i)}^{k+1}$ and $y_{(i,j)}^{k+1}$, $(i, j) \in E$, to neighbors;

C. Asynchronous algorithm

As already discussed, every agent computes and communicates independently in this setting. Hence, no synchronization barrier is needed. We let $k$ increase by 1 whenever an agent finishes a round of updating its variables. In general, agents compute with delayed information from their neighbors. Also, relaxation is added to the abstract update (3) to ensure convergence; the relaxation parameter $\eta_i$ depends on how out of date agent $i$'s knowledge of the inputs from its neighbors is. Longer delays require a smaller $\eta_i$ and cause slower convergence.

Compute:
$$\tilde x_{(i)}^{k+1} = \mathrm{prox}_{\alpha r_i}\Big(\sum_{j \in \mathcal{N}_i \cup \{i\}} w_{ij}\, x_{(j)}^{k-\tau_j^k} - \alpha \nabla s_i(x_{(i)}^k) - \sum_{l \in E_i} V_{li}\, y_{(l)}^{k-\delta_l^k}\Big),$$
$$\tilde y_{(e)}^{k+1} = y_{(e)}^k + \big(V_{ei}\, x_{(i)}^k + V_{ej}\, x_{(j)}^{k-\tau_j^k}\big), \quad e = (i, j) \in E_i;$$

Relaxed updates:
$$x_{(i)}^{k+1} = x_{(i)}^k + \eta_i\big(\tilde x_{(i)}^{k+1} - x_{(i)}^k\big), \qquad y_{(e)}^{k+1} = y_{(e)}^k + \eta_i\big(\tilde y_{(e)}^{k+1} - y_{(e)}^k\big), \quad e = (i, j) \in E_i. \tag{15}$$

The entries of $x, y$ not held by agent $i$ remain unchanged from $k$ to $k+1$. Alg. 2 implements the asynchronous updates.

Algorithm 2: Asynchronous algorithm based on (15)

Input: starting point $x^0, y^0$;
while each agent $i$, asynchronously, do
  Compute per (15) using the information it has;
  Send the updated $x_{(i)}$ and $y_{(i,j)}$, $(i, j) \in E$, to neighbors;

III. CONVERGENCE ANALYSIS

We present our main assumptions and convergence results. As space is limited, all proofs are left to the longer report.

Assumption 1. For any $k > 0$, the index $i_k$ of the agent responsible for the $k$th update is random, with probability $q_i := P(i_k = i) > 0$. The random variables $i_1, i_2, \ldots$ are independent.

This assumption is satisfied under either of the following scenarios: (i) every agent $i$ is activated following an independent Poisson process with parameter $\lambda_i$, and its computation is instant, leading to $q_i = \lambda_i / \sum_{j=1}^n \lambda_j$; (ii) every agent $i$ runs continuously, and the duration of each round follows the exponential distribution $\exp(\beta_i)$, leading to $q_i = \beta_i / \sum_{j=1}^n \beta_j$. Scenarios (i) and (ii) appear in some of the literature as assumptions.

Assumption 2. The delays $\tau_j^k$, $j = 1, 2, \ldots, n$, and $\delta_e^k$, $e = 1, 2, \ldots, m$, $k \geq 1$, defined in (15), have an upper bound $\tau > 0$.

Assumption 3. Statistically speaking, the delays $\tau_j^k$, $j = 1, 2, \ldots, n$, and $\delta_e^k$, $e = 1, 2, \ldots, m$, at iteration $k$ are independent of the index $i_k$ of the agent responsible for the update.

We admit that this assumption is rather artificial, but it is crucial to our proof. In the worst-case scenario, when the delays $\tau_j^k$ and $\delta_e^k$ always attain their upper bound $\tau$, Assumption 3 still holds and the convergence of the algorithm remains provable. In reality, what happens lies between this worst case and the no-delay case.

Since the weight matrix $W$ associated with the network is symmetric and doubly stochastic, its eigenvalues lie in $[-1, 1]$. We further restrict its minimum eigenvalue $\lambda_{\min}(W)$:

Assumption 4. $\lambda_{\min}(W) > -1$.

Lemma 1. Under Assumption 4, we have

$$G := \begin{bmatrix} I & V^T \\ V & I \end{bmatrix} \succ 0. \tag{16}$$

Let $\rho_{\min} := \lambda_{\min}(G) > 0$ be the smallest eigenvalue of $G$ and $\kappa$ be its condition number.

Assumption 5. 1) The functions $s_i$ and $r_i$ are closed, proper, and convex; 2) the functions $s_i$ are differentiable and satisfy $\|\nabla s_i(x) - \nabla s_i(\tilde x)\| \leq L_i \|x - \tilde x\|$ for all $x, \tilde x \in \mathbb{R}^p$; 3) the parameter $\alpha$ in Eq. (14) and Alg. 2 satisfies $0 < \alpha < 2\rho_{\min}/L$, where $L = \max_i L_i$.
We have the following theorem for $z^k := [x^k; y^k]$.

Theorem 1. Let $Z^*$ be the set of primal-dual solutions to (12), let $(z^k)_{k \geq 0}$ be the sequence generated by Alg. 2, and let $\eta_i = \frac{\eta}{n q_i}$ with $\eta \in (0, \eta_{\max}]$, where $\eta_{\max} < \frac{\kappa\, q_{\min}}{2\tau \sqrt{q_{\min}} + \kappa}$ and $q_{\min} := \min_i q_i$. Then $(z^k)_{k \geq 0}$ converges to a point in $Z^*$ with probability 1.

This theorem guarantees that, if we run the asynchronous Alg. 2 from an arbitrary starting point $x^0$, then with probability 1 the produced sequence $\{x^k\}$ converges to one of the solutions of problem (12). The upper bound $\eta_{\max}$ becomes smaller as the maximum delay $\tau$ grows or as the matrix $G$ becomes more ill-conditioned. While the theorem bounds every $\eta_i$ through a uniform $\eta_{\max}$, the bound can be improved by adapting it locally to the delays pertaining only to agent $i$. We leave this and other improvements to the longer report.
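To illustrate the relaxed per-agent update (15) together with the choice $\eta_i = \eta/(nq_i)$ from Theorem 1, here is a sketch of one asynchronous round at agent $i$ (our illustration; the argument names are hypothetical, message passing is abstracted into the possibly stale local copies `x_view` and `y_view`, and $r_i$ is again taken to be an $\ell_1$ norm so `prox_r_i` from the earlier snippet applies):

```python
def async_update_agent(i, x_i, x_view, y_view, own_edges,
                       A_i, b_i, alpha, gamma, eta_i, W, Vmat, edges):
    """One relaxed asynchronous update (15) at agent i.

    x_view[j]: last received (possibly stale) copy of x_(j), j in N_i
    y_view[e]: copy of y_(e) for each edge e in E_i; for e in own_edges
               these are agent i's own, current, edge variables
    """
    # x-tilde: delayed neighbor copies; own row and gradient are current.
    mix = W[i, i] * x_i + sum(W[i, j] * x_view[j] for j in x_view)
    dual = sum(Vmat[e, i] * y_view[e] for e in y_view)
    grad = A_i.T @ (A_i @ x_i - b_i)
    x_tilde = prox_r_i(mix - alpha * grad - dual, gamma, alpha)

    # y-tilde for the edges e = (i, j) kept by agent i, then relax by eta_i.
    y_new = {}
    for e in own_edges:
        j = edges[e][1] if edges[e][0] == i else edges[e][0]
        y_tilde = y_view[e] + Vmat[e, i] * x_i + Vmat[e, j] * x_view[j]
        y_new[e] = y_view[e] + eta_i * (y_tilde - y_view[e])

    # Relaxed x update; rows held by other agents stay unchanged.
    return x_i + eta_i * (x_tilde - x_i), y_new
```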

[Fig. 2: Relative error versus time (ms) for the synchronous Alg. 1 and the asynchronous Alg. 2.]

IV. NUMERICAL EXPERIMENTS

Since there is no similar asynchronous algorithm to compare with, we compare our algorithm across its synchronous and asynchronous settings, i.e., Alg. 1 versus Alg. 2, to illustrate the advantages of asynchrony. The computation times and communication times are generated randomly. The tested problem is decentralized compressed sensing. Each agent $i \in \{1, \ldots, n\}$ holds some measurements

$$b_{(i)} = A_{(i)} x^* + e_{(i)} \in \mathbb{R}^{m_i},$$

where $A_{(i)} \in \mathbb{R}^{m_i \times p}$ is a sensing matrix, $x^* \in \mathbb{R}^p$ is the common unknown sparse signal, and $e_{(i)}$ is i.i.d. Gaussian noise. The goal is to recover $x^*$. The total number of measurements $\sum_{i=1}^n m_i$ may be less than the number of unknowns $p$, so we solve the $\ell_1$-regularized least-squares problem

$$\min_x\; \sum_{i=1}^n s_i(x) + r_i(x), \tag{17}$$

where $s_i(x) = \frac{1}{2}\|A_{(i)} x - b_{(i)}\|_2^2$, $r_i(x) = \gamma_i \|x\|_1$, and $\gamma_i$ is the regularization parameter of agent $i$.

The tested network has 10 nodes and 14 edges. We set $m_i = 3$ for $i = 1, \ldots, 10$ and $p = 50$. The entries of $A_{(i)}$ and $e_{(i)}$ are independently sampled from the standard normal distribution $N(0, 1)$, and each $A_{(i)}$ is normalized so that $\|A_{(i)}\|_2 = 1$, where $\|\cdot\|_2$ is the induced 2-norm. The signal $x^*$ is generated randomly with 20% nonzero entries.

We simulate the computation and communication times. The computation time of agent $i$ is sampled from $\exp(\mu_i)$, where $\mu_i$ is set as $2 + |X_i|$ with $X_i \sim N(0, 1)$. The communication times between agents are independently sampled from $\exp(0.6)$.

We run the synchronous Alg. 1 and the asynchronous Alg. 2 and plot the relative error $\frac{\|x^k - x^*\|_F}{\|x^0 - x^*\|_F}$ against time, as depicted in Fig. 2, where $x^*$ is the exact solution. The step sizes of both algorithms are tuned by hand and are nearly optimal. From Fig. 2 we can see that both algorithms exhibit linear convergence and that Alg. 2 converges significantly faster. Within the same period (roughly 2760 ms), the asynchronous algorithm finishes about 2 times as many rounds of computation and communication, owing to the elimination of waiting time.
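The following sketch reconstructs this setup under stated assumptions (it is our reconstruction, not the authors' code): it reuses `build_W_V`, `prox_r_i`, and `iterate_14` from the earlier snippets, and the particular 14-edge topology, `gamma`, `alpha`, and the iteration count are placeholder choices.

```python
rng = np.random.default_rng(0)
n, p, m_i, gamma = 10, 50, 3, 0.1      # gamma is a placeholder value

# A 10-node, 14-edge network: a cycle plus four chords (arbitrary choice).
edges = [(i, (i + 1) % n) for i in range(n)] + [(0, 5), (1, 6), (2, 7), (3, 8)]
W, Vmat = build_W_V(n, edges)

# Sparse ground truth (20% nonzeros) and per-agent measurements.
x_star = rng.standard_normal(p) * (rng.random(p) < 0.2)
A = [rng.standard_normal((m_i, p)) for _ in range(n)]
A = [Ai / np.linalg.norm(Ai, 2) for Ai in A]       # normalize ||A_i||_2 = 1
b = [A[i] @ x_star + rng.standard_normal(m_i) for i in range(n)]

# Run the synchronous iteration (14) and report the relative error;
# with x^0 = 0, the denominator ||x^0 - x*||_F equals sqrt(n) * ||x*||.
x, y = np.zeros((n, p)), np.zeros((len(edges), p))
alpha = 0.5                                        # hand-tuned placeholder
for k in range(500):
    x, y = iterate_14(x, y, A, b, alpha, gamma, W, Vmat)
rel_err = np.linalg.norm(x - x_star) / (np.sqrt(n) * np.linalg.norm(x_star))
print(rel_err)
```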
REFERENCES

[1] W. Shi, Q. Ling, G. Wu, and W. Yin, "EXTRA: An exact first-order algorithm for decentralized consensus optimization," SIAM Journal on Optimization, vol. 25, no. 2, pp. 944-966, 2015.
[2] W. Shi, Q. Ling, G. Wu, and W. Yin, "A proximal gradient algorithm for decentralized composite optimization," IEEE Transactions on Signal Processing, vol. 63, no. 22, 2015.
[3] I. D. Schizas, A. Ribeiro, and G. B. Giannakis, "Consensus in ad hoc WSNs with noisy links, Part I: Distributed estimation of deterministic signals," IEEE Transactions on Signal Processing, vol. 56, no. 1, 2008.
[4] W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin, "On the linear convergence of the ADMM in decentralized consensus optimization," IEEE Transactions on Signal Processing, vol. 62, no. 7, 2014.
[5] T.-H. Chang, M. Hong, and X. Wang, "Multi-agent distributed optimization via inexact consensus ADMM," IEEE Transactions on Signal Processing, vol. 63, no. 2, 2015.
[6] A. Nedić and A. Ozdaglar, "Distributed subgradient methods for multi-agent optimization," IEEE Transactions on Automatic Control, vol. 54, no. 1, pp. 48-61, 2009.
[7] K. Yuan, Q. Ling, and W. Yin, "On the convergence of decentralized gradient descent," arXiv preprint arXiv:1310.7063, 2013.
[8] A. H. Sayed, "Adaptive networks," Proceedings of the IEEE, vol. 102, no. 4, April 2014.
[9] S. Vlaski and A. H. Sayed, "Proximal diffusion for stochastic costs with non-differentiable regularizers," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
[10] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, "Randomized gossip algorithms," IEEE/ACM Transactions on Networking, vol. 14, no. SI, 2006.
[11] A. G. Dimakis, S. Kar, J. M. F. Moura, M. G. Rabbat, and A. Scaglione, "Gossip algorithms for distributed signal processing," Proceedings of the IEEE, vol. 98, no. 11, 2010.
[12] S. Kar and J. M. F. Moura, "Sensor networks with random links: Topology design for distributed consensus," IEEE Transactions on Signal Processing, vol. 56, no. 7, 2008.
[13] F. Fagnani and S. Zampieri, "Randomized consensus algorithms over large scale networks," IEEE Journal on Selected Areas in Communications, vol. 26, no. 4, 2008.
[14] A. Nedić and A. Olshevsky, "Distributed optimization over time-varying directed graphs," IEEE Transactions on Automatic Control, vol. 60, no. 3, 2015.
[15] F. Iutzeler, P. Bianchi, P. Ciblat, and W. Hachem, "Asynchronous distributed optimization using a randomized alternating direction method of multipliers," in Proc. IEEE Conference on Decision and Control (CDC), 2013.
[16] P. Di Lorenzo, S. Barbarossa, and A. H. Sayed, "Decentralized resource assignment in cognitive networks based on swarming mechanisms over random graphs," IEEE Transactions on Signal Processing, vol. 60, no. 7, 2012.
[17] X. Zhao and A. H. Sayed, "Asynchronous adaptation and learning over networks, Part I: Modeling and stability analysis," IEEE Transactions on Signal Processing, vol. 63, no. 4, 2015.
[18] E. Wei and A. Ozdaglar, "On the O(1/k) convergence of asynchronous distributed alternating direction method of multipliers," in Proc. IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2013.
[19] M. Hong and T.-H. Chang, "Stochastic proximal gradient consensus over random networks," arXiv preprint, 2015.
[20] K. I. Tsianos and M. G. Rabbat, "Distributed consensus and optimization under communication delays," in Proc. Allerton Conference on Communication, Control, and Computing, 2011.
[21] K. I. Tsianos and M. G. Rabbat, "Distributed dual averaging for convex optimization under communication delays," in Proc. IEEE American Control Conference, 2012.
[22] S. Liu, L. Xie, and H. Zhang, "Distributed consensus for multi-agent systems with delays and noises in transmission channels," Automatica, vol. 47, no. 5, 2011.
[23] Z. Peng, Y. Xu, M. Yan, and W. Yin, "ARock: An algorithmic framework for asynchronous parallel coordinate updates," arXiv e-prints, June 2015.
[24] Z. Peng, T. Wu, Y. Xu, M. Yan, and W. Yin, "Coordinate friendly structures, algorithms and applications," Annals of Mathematical Sciences and Applications, vol. 1, no. 1, pp. 57-119, 2016.

[25] J. Zeng and W. Yin, "ExtraPush for convex smooth decentralized optimization over directed networks," UCLA CAM Report 15-61, 2015.
[26] A. H. Sayed, "Adaptation, learning, and optimization over networks," Foundations and Trends in Machine Learning, vol. 7, no. 4-5, pp. 311-801, 2014.
[27] L. Condat, "A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms," Journal of Optimization Theory and Applications, vol. 158, no. 2, 2013.
[28] B. C. Vũ, "A splitting algorithm for dual monotone inclusions involving cocoercive operators," Advances in Computational Mathematics, vol. 38, no. 3, 2013.
