Distributed online optimization over jointly connected digraphs

1 Distributed online optimization over jointly connected digraphs. David Mateos-Núñez, Jorge Cortés, University of California, San Diego. Southern California Optimization Day, UC San Diego, May 23. 1 / 29

2 Overview: Distributed online optimization. Distributed optimization; online optimization; case study: medical diagnosis. 2 / 29

3 Distributed online optimization. Why distributed? Information is distributed across a group of agents, which need to interact to optimize performance. Why online? Information becomes incrementally available, so an adaptive solution is needed. 3 / 29

4 Machine learning in healthcare. Medical findings, symptoms: age factor, amnesia before impact, deterioration in GCS score, open skull fracture, loss of consciousness, vomiting. Any acute brain finding revealed on Computerized Tomography? (-1 = not present, 1 = present). The Canadian CT Head Rule for patients with minor head injury. 4 / 29

5 Binary classification. Feature vector of patient $s$: $w_s = ((w_s)_1, \ldots, (w_s)_{d-1})$; true diagnosis: $y_s \in \{-1, 1\}$; wanted weights: $x = (x_1, \ldots, x_d)$; predictor: $h(x, w_s) = x^\top (w_s, 1)$; margin: $m_s(x) = y_s h(x, w_s)$. 5 / 29

6 Binary classification. Feature vector of patient $s$: $w_s = ((w_s)_1, \ldots, (w_s)_{d-1})$; true diagnosis: $y_s \in \{-1, 1\}$; wanted weights: $x = (x_1, \ldots, x_d)$; predictor: $h(x, w_s) = x^\top (w_s, 1)$; margin: $m_s(x) = y_s h(x, w_s)$. Given the data set $\{w_s\}_{s=1}^P$, estimate $x \in \mathbb{R}^d$ by solving $\min_{x \in \mathbb{R}^d} f(x) = \min_{x \in \mathbb{R}^d} \sum_{s=1}^P \ell(m_s(x))$, where the loss function $\ell : \mathbb{R} \to \mathbb{R}$ is decreasing and convex. 5 / 29
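
A minimal sketch of this objective, assuming the logistic loss $\ell(m) = \log(1 + e^{-2m})$ used later in the simulations; the array shapes and the loss choice here are illustrative, not prescribed by the slides:

```python
import numpy as np

def margin(x, w, y):
    """Margin m_s(x) = y_s * h(x, w_s) with h(x, w_s) = x^T (w_s, 1)."""
    w_aug = np.append(w, 1.0)          # append the constant feature
    return y * np.dot(x, w_aug)

def classification_objective(x, W, Y, loss=lambda m: np.log1p(np.exp(-2 * m))):
    """f(x) = sum_s loss(m_s(x)) over a data set of P patients.

    W : (P, d-1) array of feature vectors, Y : (P,) array of labels in {-1, 1}.
    The logistic loss is an assumption; any decreasing convex loss works.
    """
    return sum(loss(margin(x, w, y)) for w, y in zip(W, Y))
```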

7 Review of distributed convex optimization 6 / 29

8 In the diagnosis example: health center $i \in \{1, \ldots, N\}$ manages a set of patients $P_i$, and $f(x) = \sum_{i=1}^N \sum_{s \in P_i} \ell(y_s h(x, w_s)) = \sum_{i=1}^N f^i(x)$. 7 / 29

9 In the diagnosis example: health center $i \in \{1, \ldots, N\}$ manages a set of patients $P_i$, and $f(x) = \sum_{i=1}^N \sum_{s \in P_i} \ell(y_s h(x, w_s)) = \sum_{i=1}^N f^i(x)$. Goal: the best predicting model $w \mapsto h(x, w)$, i.e., $\min_{x \in \mathbb{R}^d} \sum_{i=1}^N f^i(x)$, using local information. 7 / 29

10 What do we mean by using local information? Agent $i$ maintains an estimate $x^i_t$ of $x^* \in \arg\min_{x \in \mathbb{R}^d} \sum_{i=1}^N f^i(x)$; agent $i$ has access to $f^i$; agent $i$ can share its estimate $x^i_t$ with neighboring agents. [Figure: network of 5 agents with objectives $f^1, \ldots, f^5$ and adjacency matrix $A$ with nonzero entries $a_{13}, a_{14}, a_{21}, a_{25}, a_{31}, a_{32}, a_{41}, a_{52}$.] 8 / 29

11 What do we mean by using local information? Agent $i$ maintains an estimate $x^i_t$ of $x^* \in \arg\min_{x \in \mathbb{R}^d} \sum_{i=1}^N f^i(x)$; agent $i$ has access to $f^i$; agent $i$ can share its estimate $x^i_t$ with neighboring agents. [Figure: network of 5 agents with objectives $f^1, \ldots, f^5$ and adjacency matrix $A$ with nonzero entries $a_{13}, a_{14}, a_{21}, a_{25}, a_{31}, a_{32}, a_{41}, a_{52}$.] Application to distributed estimation in wireless sensor networks and beyond... a sensor is any channel for the machine to learn. 8 / 29
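
As a concrete illustration of the communication structure, here is a small sketch that builds the adjacency matrix of the 5-agent digraph sketched on the slide (the unit edge weights are an assumption; the figure only shows which entries are nonzero) and the associated Laplacian $(\mathbf{L}x)^i = \sum_j a_{ij}(x^i - x^j)$:

```python
import numpy as np

N = 5
A = np.zeros((N, N))
# Directed edges read off the slide's figure (unit weights assumed):
# a_13, a_14, a_21, a_25, a_31, a_32, a_41, a_52 (1-indexed)
for (i, j) in [(1, 3), (1, 4), (2, 1), (2, 5), (3, 1), (3, 2), (4, 1), (5, 2)]:
    A[i - 1, j - 1] = 1.0

# Out-degree Laplacian: (L x)_i = sum_j a_ij (x_i - x_j)
L = np.diag(A.sum(axis=1)) - A

# Weight-balanced means in-degree equals out-degree at every node.
print("weight-balanced:", np.allclose(A.sum(axis=0), A.sum(axis=1)))
```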

12 How do agents agree on the optimizer? Spreading of information (gossip, time-varying topologies, B-joint connectivity); relation between consensus & local minimization. 9 / 29

13 How do agents agree on the optimizer? Spreading of information (gossip, time-varying topologies, B-joint connectivity); relation between consensus & local minimization. A. Nedić and A. Ozdaglar, TAC, 09: $z^i_{k+1} = \sum_{j=1}^N a_{ij,k} z^j_k - \eta_t g$, $x^i_{k+1} = \Pi_X(z^i_{k+1})$, where $A_k = (a_{ij,k})$ is doubly stochastic and $g \in \partial f^i(x^i_k)$. J. C. Duchi, A. Agarwal, and M. J. Wainwright, TAC, 12. 9 / 29
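
A minimal sketch of one round of this consensus-based projected subgradient scheme, under illustrative assumptions (a box feasible set $X$ for the projection and a user-supplied subgradient oracle):

```python
import numpy as np

def consensus_subgradient_step(Z, X_est, A_k, subgrad, eta, box=(-10.0, 10.0)):
    """One round: z^i <- sum_j a_ij z^j - eta * g_i,  x^i <- projection of z^i onto X.

    Z, X_est : (N, d) stacked agent states; A_k : (N, N) doubly stochastic mixing matrix;
    subgrad(i, x) returns a subgradient of f^i at x. The box projection is an assumption.
    """
    G = np.stack([subgrad(i, X_est[i]) for i in range(Z.shape[0])])
    Z_next = A_k @ Z - eta * G
    X_next = np.clip(Z_next, box[0], box[1])   # Euclidean projection onto the box X
    return Z_next, X_next
```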

14 Saddle-point dynamics. The minimization problem can be regarded as $\min_{x \in \mathbb{R}^d} \sum_{i=1}^N f^i(x) = \min_{x^1, \ldots, x^N \in \mathbb{R}^d,\; x^1 = \cdots = x^N} \sum_{i=1}^N f^i(x^i) = \min_{\mathbf{x} \in (\mathbb{R}^d)^N,\; \mathbf{L}\mathbf{x} = 0} \sum_{i=1}^N f^i(x^i)$, where $(\mathbf{L}\mathbf{x})^i = \sum_{j=1}^N a_{ij}(x^i - x^j)$. 10 / 29

15 Saddle-point dynamics. The minimization problem can be regarded as $\min_{x \in \mathbb{R}^d} \sum_{i=1}^N f^i(x) = \min_{x^1, \ldots, x^N \in \mathbb{R}^d,\; x^1 = \cdots = x^N} \sum_{i=1}^N f^i(x^i) = \min_{\mathbf{x} \in (\mathbb{R}^d)^N,\; \mathbf{L}\mathbf{x} = 0} \sum_{i=1}^N f^i(x^i)$, where $(\mathbf{L}\mathbf{x})^i = \sum_{j=1}^N a_{ij}(x^i - x^j)$. The augmented Lagrangian when $\mathbf{L}$ is symmetric is $F(\mathbf{x}, \mathbf{z}) := f(\mathbf{x}) + \frac{a}{2} \mathbf{x}^\top \mathbf{L} \mathbf{x} + \mathbf{z}^\top \mathbf{L} \mathbf{x}$, which is convex-concave, and the saddle-point dynamics is $\dot{\mathbf{x}} = -\frac{\partial F}{\partial \mathbf{x}}(\mathbf{x}, \mathbf{z}) = -\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, $\dot{\mathbf{z}} = \frac{\partial F}{\partial \mathbf{z}}(\mathbf{x}, \mathbf{z}) = \mathbf{L} \mathbf{x}$. 10 / 29
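
A small sketch of these saddle-point dynamics discretized with a forward-Euler step; the step size, horizon, and gradient-oracle interface are illustrative choices, not from the slides:

```python
import numpy as np

def saddle_point_euler(grad_f, L, x0, z0, a=1.0, dt=0.01, steps=5000):
    """Forward-Euler discretization of
         xdot = -grad f(x) - a*L x - L z,   zdot = L x.

    grad_f(X) returns the stacked gradients of the local objectives at X (N, d);
    L is an (N, N) Laplacian. Step size dt and horizon are illustrative.
    """
    X, Z = x0.copy(), z0.copy()
    for _ in range(steps):
        X_dot = -grad_f(X) - a * (L @ X) - L @ Z
        Z_dot = L @ X
        X, Z = X + dt * X_dot, Z + dt * Z_dot
    return X, Z
```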

16 Weight-balanced digraphs: $\dot{\mathbf{x}} = -\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, $\dot{\mathbf{z}} = \mathbf{L} \mathbf{x}$, with Lagrangian $F(\mathbf{x}, \mathbf{z}) = f(\mathbf{x}) + \frac{a}{2} \mathbf{x}^\top \mathbf{L} \mathbf{x} + \mathbf{z}^\top \mathbf{L} \mathbf{x}$, $f(\mathbf{x}) = \sum_{i=1}^N f^i(x^i)$. 11 / 29

17 Weight-balanced digraphs: $\dot{\mathbf{x}} = -\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, $\dot{\mathbf{z}} = \mathbf{L} \mathbf{x}$, with Lagrangian $F(\mathbf{x}, \mathbf{z}) = f(\mathbf{x}) + \frac{a}{2} \mathbf{x}^\top \mathbf{L} \mathbf{x} + \mathbf{z}^\top \mathbf{L} \mathbf{x}$, $f(\mathbf{x}) = \sum_{i=1}^N f^i(x^i)$. The gradient dynamics $\dot{\mathbf{x}} = -\frac{\partial F}{\partial \mathbf{x}}(\mathbf{x}, \mathbf{z}) = -\nabla f(\mathbf{x}) - a \tfrac{1}{2}(\mathbf{L} + \mathbf{L}^\top) \mathbf{x} - \mathbf{L}^\top \mathbf{z}$ (not distributed!), $\dot{\mathbf{z}} = \frac{\partial F}{\partial \mathbf{z}}(\mathbf{x}, \mathbf{z}) = \mathbf{L} \mathbf{x}$. J. Wang and N. Elia (with $\mathbf{L} = \mathbf{L}^\top$), Allerton. 11 / 29

18 Weight-balanced digraphs: $\dot{\mathbf{x}} = -\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, $\dot{\mathbf{z}} = \mathbf{L} \mathbf{x}$, with Lagrangian $F(\mathbf{x}, \mathbf{z}) = f(\mathbf{x}) + \frac{a}{2} \mathbf{x}^\top \mathbf{L} \mathbf{x} + \mathbf{z}^\top \mathbf{L} \mathbf{x}$, $f(\mathbf{x}) = \sum_{i=1}^N f^i(x^i)$. The $\mathbf{x}$-dynamics $\dot{\mathbf{x}} = -\frac{\partial F}{\partial \mathbf{x}}(\mathbf{x}, \mathbf{z}) = -\nabla f(\mathbf{x}) - a \tfrac{1}{2}(\mathbf{L} + \mathbf{L}^\top) \mathbf{x} - \mathbf{L}^\top \mathbf{z}$ (not distributed!) is changed to $-\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, while $\dot{\mathbf{z}} = \frac{\partial F}{\partial \mathbf{z}}(\mathbf{x}, \mathbf{z}) = \mathbf{L} \mathbf{x}$. B. Gharesifard and J. Cortés, CDC, 12. Note: $a > 0$; otherwise the linear part of the saddle-point dynamics is a Hamiltonian system. 11 / 29

19 Weight-balanced digraphs: $\dot{\mathbf{x}} = -\nabla f(\mathbf{x}) - a \mathbf{L} \mathbf{x} - \mathbf{L} \mathbf{z}$, $\dot{\mathbf{z}} = \mathbf{L} \mathbf{x}$. Example: 4 agents in a directed cycle with $f^1(x_1, x_2) = \frac{1}{2}((x_1 - 4)^2 + (x_2 - 3)^2)$, $f^2(x_1, x_2) = x_1 + 3 x_2^2$, $f^3(x_1, x_2) = \log(e^{x_1 + 3} + e^{x_2 + 1})$, $f^4(x_1, x_2) = (x_1 + 2 x_2 + 5)^2 + (x_1 - x_2 - 4)^2$. [Figure: evolution of the agents' estimates $\{[x^i_1(t), x^i_2(t)]\}$ over time $t$.] Convergence to a neighborhood of the optimizer $(1.10, 2.74)$; the size of the neighborhood depends on the size of the noise [DMN-JC, 13]. 11 / 29

20 Review of online convex optimization 12 / 29

21 Different kind of optimization: sequential decision making. Resuming the diagnosis example: each round $t \in \{1, \ldots, T\}$: question (features, medical findings): $w_t$; decision (about using CT): $h(x_t, w_t)$; outcome (by CT findings/follow-up of patient): $y_t$; loss: $\ell(y_t h(x_t, w_t))$. Choose $x_t$ & incur loss $f_t(x_t) := \ell(y_t h(x_t, w_t))$. 13 / 29

22 Different kind of optimization: sequential decision making. Resuming the diagnosis example: each round $t \in \{1, \ldots, T\}$: question (features, medical findings): $w_t$; decision (about using CT): $h(x_t, w_t)$; outcome (by CT findings/follow-up of patient): $y_t$; loss: $\ell(y_t h(x_t, w_t))$. Choose $x_t$ & incur loss $f_t(x_t) := \ell(y_t h(x_t, w_t))$. Goal: sublinear regret $R(u, T) := \sum_{t=1}^T f_t(x_t) - \sum_{t=1}^T f_t(u) \in o(T)$ using historical observations. 13 / 29

23 Why regret? If the regret is sublinear, $\sum_{t=1}^T f_t(x_t) \le \sum_{t=1}^T f_t(u) + o(T)$, then $\frac{1}{T} \sum_{t=1}^T f_t(x_t) \le \frac{1}{T} \sum_{t=1}^T f_t(u) + \frac{o(T)}{T}$. In temporal average, the online decisions $\{x_t\}_{t=1}^T$ perform as well as the best fixed decision in hindsight. No regrets, my friend. 14 / 29
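
A tiny sketch of this bookkeeping: given the realized decisions, the per-round losses, and a fixed comparator $u$, the regret and its temporal average are computed as follows (the loss-function interface is illustrative):

```python
def regret(losses, decisions, u, T):
    """R(u, T) = sum_t f_t(x_t) - sum_t f_t(u); losses[t](x) evaluates f_t at x."""
    online = sum(losses[t](decisions[t]) for t in range(T))
    comparator = sum(losses[t](u) for t in range(T))
    return online - comparator

def average_regret(losses, decisions, u, T):
    """If R(u, T) is o(T), this temporal average vanishes as T grows."""
    return regret(losses, decisions, u, T) / T
```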

24 What about generalization error? Sublinear regret does not imply $x_{t+1}$ will do well with $f_{t+1}$. No assumptions are made about the sequence $\{f_t\}$; it can follow an unknown stochastic or deterministic model, or be chosen adversarially. In our example, $f_t(x) := \ell(y_t h(x, w_t))$. If some model $w \mapsto h(x, w)$ explains the data reasonably well in hindsight, then the online models $w \mapsto h(x_t, w)$ perform just as well on average. 15 / 29

25 What about generalization error? Sublinear regret does not imply $x_{t+1}$ will do well with $f_{t+1}$. No assumptions are made about the sequence $\{f_t\}$; it can follow an unknown stochastic or deterministic model, or be chosen adversarially. In our example, $f_t(x) := \ell(y_t h(x, w_t))$. If some model $w \mapsto h(x, w)$ explains the data reasonably well in hindsight, then the online models $w \mapsto h(x_t, w)$ perform just as well on average. Other applications: portfolio selection, online advertisement placement, interactive learning. 15 / 29

26 Some classical results. Projected gradient descent: $x_{t+1} = \Pi_S(x_t - \eta_t \nabla f_t(x_t))$, (1) where $\Pi_S$ is the projection onto a compact set $S \subseteq \mathbb{R}^d$, and $\|\nabla f_t\|_2 \le H$. Follow-the-Regularized-Leader: $x_{t+1} = \arg\min_{y \in S} \big( \sum_{s=1}^t f_s(y) + \psi(y) \big)$. 16 / 29

27 Some classical results. Projected gradient descent: $x_{t+1} = \Pi_S(x_t - \eta_t \nabla f_t(x_t))$, (1) where $\Pi_S$ is the projection onto a compact set $S \subseteq \mathbb{R}^d$, and $\|\nabla f_t\|_2 \le H$. Follow-the-Regularized-Leader: $x_{t+1} = \arg\min_{y \in S} \big( \sum_{s=1}^t f_s(y) + \psi(y) \big)$. Martin Zinkevich, 03: (1) achieves $O(\sqrt{T})$ regret under convexity with $\eta_t = \frac{1}{\sqrt{t}}$. Elad Hazan, Amit Agarwal, and Satyen Kale, 07: (1) achieves $O(\log T)$ regret under $p$-strong convexity with $\eta_t = \frac{1}{p t}$. Others: Online Newton Step, Follow the Regularized Leader, etc. 16 / 29
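
A compact sketch of projected online gradient descent (1) with the $\eta_t = 1/\sqrt{t}$ schedule; the Euclidean-ball feasible set and the gradient-oracle interface are illustrative assumptions:

```python
import numpy as np

def project_ball(x, radius=5.0):
    """Euclidean projection onto {x : ||x||_2 <= radius} (illustrative choice of S)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def online_gradient_descent(grad_ft, d, T):
    """Run x_{t+1} = Pi_S(x_t - eta_t * grad f_t(x_t)) with eta_t = 1/sqrt(t).

    grad_ft(t, x) returns a (sub)gradient of the round-t loss at x.
    Returns the list of online decisions x_1, ..., x_T.
    """
    x = np.zeros(d)
    decisions = []
    for t in range(1, T + 1):
        decisions.append(x.copy())
        eta = 1.0 / np.sqrt(t)
        x = project_ball(x - eta * grad_ft(t, x))
    return decisions
```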

28 Our contribution: Combining both aspects 17 / 29

29 Back again to the diagnosis example: health center $i \in \{1, \ldots, N\}$ takes care of a set of patients $P^i_t$ at time $t$, and $f^i(x) = \sum_{t=1}^T \sum_{s \in P^i_t} \ell(y_s h(x, w_s)) = \sum_{t=1}^T f^i_t(x)$. 18 / 29

30 Back again to the diagnosis example: health center $i \in \{1, \ldots, N\}$ takes care of a set of patients $P^i_t$ at time $t$, and $f^i(x) = \sum_{t=1}^T \sum_{s \in P^i_t} \ell(y_s h(x, w_s)) = \sum_{t=1}^T f^i_t(x)$. Goal: sublinear agent regret $R^j(u, T) := \sum_{t=1}^T \sum_{i=1}^N f^i_t(x^j_t) - \sum_{t=1}^T \sum_{i=1}^N f^i_t(u) \in o(T)$ using local information & historical observations. 18 / 29

31 Challenge: coordinate the hospitals. [Figure: network of 5 hospitals with time-varying local objectives $f^1_t, \ldots, f^5_t$.] Need to design distributed online algorithms. 19 / 29

32 Previous work on consensus-based online algorithms. F. Yan, S. Sundaram, S. V. N. Vishwanathan and Y. Qi, TAC: Projected Subgradient Descent; $\log(T)$ regret (local strong convexity & bounded subgradients); $\sqrt{T}$ regret (convexity & bounded subgradients); both analyses require a projection onto a compact set. S. Hosseini, A. Chapman and M. Mesbahi, CDC, 13: Dual Averaging; $\sqrt{T}$ regret (convexity & bounded subgradients); general regularized projection onto a closed convex set. K. I. Tsianos and M. G. Rabbat, arXiv, 12: Projected Subgradient Descent; empirical risk as opposed to regret analysis. The communication digraph in all cases is fixed, strongly connected & weight-balanced. 20 / 29

33 Our contributions (informally): time-varying, weight-balanced communication digraphs under B-joint connectivity; unconstrained optimization (no projection step onto a bounded set); $\log T$ regret (local strong convexity & bounded subgradients); $\sqrt{T}$ regret (convexity & bounded subgradients). 21 / 29

34 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t}$. Subgradient descent on previous local objectives, $g_{x^i_t} \in \partial f^i_t(x^i_t)$. 22 / 29

35 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t} + \sigma a \sum_{j=1, j \neq i}^N a_{ij,t} (x^j_t - x^i_t)$. Proportional (linear) feedback on disagreement with neighbors. 22 / 29

36 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t} + \sigma \big( a \sum_{j=1, j \neq i}^N a_{ij,t} (x^j_t - x^i_t) + \sum_{j=1, j \neq i}^N a_{ij,t} (z^j_t - z^i_t) \big)$; $z^i_{t+1} = z^i_t - \sigma \sum_{j=1, j \neq i}^N a_{ij,t} (x^j_t - x^i_t)$. Integral (linear) feedback on disagreement with neighbors. 22 / 29

37 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t} + \sigma \big( a \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t) + \sum_{j \neq i} a_{ij,t} (z^j_t - z^i_t) \big)$; $z^i_{t+1} = z^i_t - \sigma \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t)$. The union of the graphs over intervals of length B is strongly connected. [Figure: time-varying digraphs over the objectives $f^1_t, \ldots, f^5_t$ at times $t$, $t+1$, $t+2$, with edges such as $a_{41,t+1}$ and $a_{14,t+1}$ appearing at different times.] 22 / 29
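
A small sketch of what B-joint connectivity means computationally: take the union of the adjacency matrices over each window of length B and check that the union digraph is strongly connected. The reachability test via matrix powers and the window convention are illustrative choices:

```python
import numpy as np

def is_strongly_connected(A):
    """A digraph with adjacency matrix A is strongly connected iff
    (I + A)^(N-1) has no zero entries (every node reaches every other node)."""
    N = A.shape[0]
    reach = np.linalg.matrix_power(np.eye(N) + (A > 0), N - 1)
    return np.all(reach > 0)

def b_jointly_connected(adjacency_seq, B):
    """Check that the union of the digraphs over every window of length B
    is strongly connected (consecutive windows [0,B), [B,2B), ... assumed)."""
    for start in range(0, len(adjacency_seq), B):
        union = np.maximum.reduce(adjacency_seq[start:start + B])
        if not is_strongly_connected(union):
            return False
    return True
```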

38 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t} + \sigma \big( a \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t) + \sum_{j \neq i} a_{ij,t} (z^j_t - z^i_t) \big)$; $z^i_{t+1} = z^i_t - \sigma \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t)$. Compact representation: $\begin{bmatrix} \mathbf{x}_{t+1} \\ \mathbf{z}_{t+1} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_t \\ \mathbf{z}_t \end{bmatrix} - \sigma \begin{bmatrix} a \mathbf{L}_t & \mathbf{L}_t \\ -\mathbf{L}_t & 0 \end{bmatrix} \begin{bmatrix} \mathbf{x}_t \\ \mathbf{z}_t \end{bmatrix} - \eta_t \begin{bmatrix} g_{\mathbf{x}_t} \\ 0 \end{bmatrix}$. 22 / 29

39 Coordination algorithm: $x^i_{t+1} = x^i_t - \eta_t g_{x^i_t} + \sigma \big( a \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t) + \sum_{j \neq i} a_{ij,t} (z^j_t - z^i_t) \big)$; $z^i_{t+1} = z^i_t - \sigma \sum_{j \neq i} a_{ij,t} (x^j_t - x^i_t)$. Compact representation & generalization: $\begin{bmatrix} \mathbf{x}_{t+1} \\ \mathbf{z}_{t+1} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_t \\ \mathbf{z}_t \end{bmatrix} - \sigma \begin{bmatrix} a \mathbf{L}_t & \mathbf{L}_t \\ -\mathbf{L}_t & 0 \end{bmatrix} \begin{bmatrix} \mathbf{x}_t \\ \mathbf{z}_t \end{bmatrix} - \eta_t \begin{bmatrix} g_{\mathbf{x}_t} \\ 0 \end{bmatrix}$, i.e., $v_{t+1} = (I - \sigma\, G \otimes \mathbf{L}_t) v_t - \eta_t g_t$. 22 / 29
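
A minimal sketch of one round of this coordination algorithm in its stacked form, under illustrative interfaces for the time-varying Laplacian and the local subgradients; the parameter values are placeholders, not those used in the talk:

```python
import numpy as np

def coordination_step(X, Z, L_t, G_x, sigma=0.1, a=1.0, eta_t=0.1):
    """One round of the proportional-integral coordination algorithm:
        x_{t+1} = x_t - sigma*(a*L_t x_t + L_t z_t) - eta_t * g_{x_t}
        z_{t+1} = z_t + sigma*L_t x_t

    X, Z : (N, d) stacked agent estimates; L_t : (N, N) Laplacian of the digraph at time t;
    G_x : (N, d) stacked subgradients of the round-t local objectives at X.
    """
    X_next = X - sigma * (a * (L_t @ X) + L_t @ Z) - eta_t * G_x
    Z_next = Z + sigma * (L_t @ X)
    return X_next, Z_next
```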

40 Our contributions. Theorem: Assume that $\{f^1_t, \ldots, f^N_t\}_{t=1}^T$ are convex functions in $\mathbb{R}^d$ with $H$-bounded subgradient sets, nonempty and uniformly bounded sets of minimizers, and $p$-strongly convex in a sufficiently large neighborhood of their minimizers; the sequence of weight-balanced communication digraphs is nondegenerate and B-jointly connected; $G \in \mathbb{R}^{K \times K}$ is diagonalizable with positive real eigenvalues. Then, taking learning rates $\eta_t = \frac{1}{p t}$, for any $j \in \{1, \ldots, N\}$ and $u \in \mathbb{R}^d$, $R^j(u, T) \le C (\|u\|_2^2 + \log T)$. 23 / 29

41 Our contributions. Theorem: Assume that $\{f^1_t, \ldots, f^N_t\}_{t=1}^T$ are convex functions in $\mathbb{R}^d$ with $H$-bounded subgradient sets, nonempty and uniformly bounded sets of minimizers, and $p$-strongly convex in a sufficiently large neighborhood of their minimizers; the sequence of weight-balanced communication digraphs is nondegenerate and B-jointly connected; $G \in \mathbb{R}^{K \times K}$ is diagonalizable with positive real eigenvalues. Relaxing strong convexity to convexity and using the Doubling Trick scheme (see S. Shalev-Shwartz) for the learning rates, $R^j(u, T) \le C \|u\|_2^2 \sqrt{T}$, for any $j \in \{1, \ldots, N\}$ and $u \in \mathbb{R}^d$. 23 / 29
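
A brief sketch of the doubling-trick schedule mentioned here: the horizon is split into epochs of lengths 1, 2, 4, ..., and within each epoch a constant learning rate tuned to that epoch's length is used. The $1/\sqrt{\text{epoch length}}$ tuning shown is the standard choice, included as an illustration:

```python
import math

def doubling_trick_rates(T):
    """Learning rates eta_1, ..., eta_T from the doubling trick:
    epoch m covers 2^m rounds and uses the constant rate 1/sqrt(2^m)."""
    rates = []
    m = 0
    while len(rates) < T:
        epoch_len = 2 ** m
        rates.extend([1.0 / math.sqrt(epoch_len)] * epoch_len)
        m += 1
    return rates[:T]
```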

42 Idea of the proof: network regret $R(u, T) := \sum_{t=1}^T \sum_{i=1}^N f^i_t(x^i_t) - \sum_{t=1}^T \sum_{i=1}^N f^i_t(u)$; disagreement dynamics under B-joint connectivity; bound on the trajectories uniform in $T$. 24 / 29

43 Simulations: acute brain finding revealed on Computerized Tomography 25 / 29

44 [Figures: agents' estimates $\{x^i_{t,7}\}_{i=1}^N$ versus time $t$, and average regret $\max_j \frac{1}{T} R^j$ versus time horizon $T$, comparing the centralized method, proportional disagreement feedback, and proportional-integral disagreement feedback.] Here $f^i_t(x) = \sum_{s \in P^i_t} \ell(y_s h(x, w_s)) + \|x\|_2^2$, where $\ell(m) = \log(1 + e^{-2m})$. 26 / 29
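
To connect with the coordination algorithm above, here is a sketch of the gradient of this regularized logistic local objective, the quantity each health center would feed into its update. The array shapes and the unit regularization weight follow the formula as reconstructed above and are assumptions:

```python
import numpy as np

def local_objective_gradient(x, W_it, Y_it):
    """Gradient of f^i_t(x) = sum_{s in P^i_t} log(1 + exp(-2 y_s x^T (w_s,1))) + ||x||_2^2.

    W_it : (n_patients, d-1) features for center i at round t, Y_it : (n_patients,) labels.
    """
    grad = 2.0 * x                                   # gradient of the ||x||_2^2 regularizer
    for w, y in zip(W_it, Y_it):
        w_aug = np.append(w, 1.0)
        m = y * np.dot(x, w_aug)                     # margin
        grad += (-2.0 * y / (1.0 + np.exp(2.0 * m))) * w_aug   # d/dx log(1 + e^{-2m})
    return grad
```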

45 Conclusions: distributed online unconstrained convex optimization with sublinear regret under B-joint connectivity; relevant for regression & classification, which play a crucial role in machine learning, computer vision, etc. Future work: refine guarantees under a model for the evolution of the objective functions; enable agents to cooperatively select features that strike the balance between sensitivity and specificity; effect of noise on the performance. 27 / 29

46 Future horizons for distributed optimization in healthcare Engage & detect disease before it happens 28 / 29

47 Thank you for listening! 29 / 29
