Distributed Optimization: Analysis and Synthesis via Circuits


Stephen Boyd, EE364b, Stanford University

Outline
- canonical form for distributed convex optimization
- circuit interpretation
- primal decomposition
- dual decomposition
- prox decomposition
- momentum terms

Distributed convex optimization problem
- convex optimization problem partitioned into coupled subsystems
- divide variables, constraints, and objective terms into two groups:
  - local variables, constraints, and objective terms appear in only one subsystem
  - complicating variables, constraints, and objective terms appear in more than one subsystem
- describe by a hypergraph: subsystems are nodes; complicating variables, constraints, and objective terms are hyperedges

Conditional separability
- separable problem: can solve by solving the subsystems separately, e.g.,
  minimize f_1(x_1) + f_2(x_2)
  subject to x_1 \in C_1, x_2 \in C_2
- in a distributed problem, two subsystems are conditionally separable if they are separable when all other variables are fixed
- two subsystems not connected by a net are conditionally separable
- cf. conditional independence in a Bayes net: two variables not connected by a hyperedge are conditionally independent, given all other variables

Examples
- minimize f_1(z_1, x) + f_2(z_2, x), with variables z_1, z_2, x
  - x is the complicating variable; when it is fixed, the problem is separable
  - z_1, z_2 are private or local variables
  - x is a public (interface or boundary) variable between the two subproblems
  - hypergraph: two nodes connected by an edge
- optimal control problem
  - the state is the complicating variable between past and future
  - hypergraph: simple chain

Transformation to standard form
- introduce slack variables for complicating inequality constraints
- introduce local copies of complicating variables
- implicitly minimize over private variables (preserves convexity)
- represent local constraints in the domain of the objective term
we are left with:
- all variables are public, each associated with a single node
- all constraints are consistency constraints, i.e., equality of two or more variables

Example
minimize f_1(z_1, x) + f_2(z_2, x), with variables z_1, z_2, x
- introduce local copies of the complicating variable:
  minimize f_1(z_1, x_1) + f_2(z_2, x_2)
  subject to x_1 = x_2
- eliminate the local variables: with \tilde f_i(x_i) = \inf_{z_i} f_i(z_i, x_i),
  minimize \tilde f_1(x_1) + \tilde f_2(x_2)
  subject to x_1 = x_2
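To see that the partial minimization step can often be carried out in closed form, here is a small worked instance with a made-up quadratic objective term (my example, not from the slides):

```latex
% hypothetical term: f_1(z_1, x_1) = (z_1 - x_1)^2 + x_1^2
% minimizing over the private variable z_1 gives z_1^\star = x_1, so
\tilde f_1(x_1) = \inf_{z_1} \left( (z_1 - x_1)^2 + x_1^2 \right) = x_1^2
% which is again convex: partial minimization of a jointly convex
% function over some of its variables preserves convexity
```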

General form
- n subsystems with variables x_1, ..., x_n
- m nets with common variable values z_1, ..., z_m
  minimize \sum_{i=1}^n f_i(x_i)
  subject to x_i = E_i z, i = 1, ..., n
- the matrices E_i give the netlist or hypergraph (row k of E_i is e_p^T, where the kth entry of x_i is on net p)
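A minimal sketch of how the netlist matrices can be built for the two-subsystem example above (a toy encoding of my own, not code from the slides):

```python
import numpy as np

# One scalar net z couples the two subsystems; each subsystem has a
# single pin on that net, so E_1 = E_2 = [[1.0]] and E = [E_1; E_2].
m = 1                                   # number of nets
E1 = np.zeros((1, m)); E1[0, 0] = 1.0   # pin 1 of subsystem 1 sits on net 0
E2 = np.zeros((1, m)); E2[0, 0] = 1.0   # pin 1 of subsystem 2 sits on net 0
E = np.vstack([E1, E2])                 # stacked netlist matrix

# the consistency constraint x = E z makes both local copies equal
z = np.array([3.0])
x = E @ z
print(x)                                # -> [3. 3.]
```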

Optimality conditions
- introduce a dual variable y_i associated with x_i = E_i z
- the optimality conditions are
  \nabla f_i(x_i) = y_i (subsystem relations)
  x_i = E_i z (primal feasibility)
  \sum_{i=1}^n E_i^T y_i = 0 (dual feasibility)
  (for the nondifferentiable case, replace \nabla f_i(x_i) with g_i \in \partial f_i(x_i))
- primal condition: the (primal) variables on each net are the same
- dual condition: the dual variables on each net sum to zero
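These are the stationarity and feasibility conditions of the Lagrangian; a short derivation sketch (standard convex duality, spelled out here for completeness):

```latex
% Lagrangian with multipliers y_i for the constraints x_i = E_i z:
L(x, z, y) = \sum_{i=1}^n f_i(x_i) + \sum_{i=1}^n y_i^T (E_i z - x_i)
% stationarity in x_i:  \nabla f_i(x_i) - y_i = 0        (subsystem relations)
% stationarity in z:    \sum_{i=1}^n E_i^T y_i = 0       (dual feasibility)
% feasibility:          x_i = E_i z, \; i = 1,\ldots,n   (primal feasibility)
```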

Circuit interpretation (primal/voltages)
- subsystems are (grounded) nonlinear resistors
- nets are wires; the consistency constraint is KVL
- z_j is the voltage on net j
- x_i is the vector of pin voltages for resistor i

Circuit interpretation (dual/currents)
- y_i is the vector of currents entering resistor i
- dual feasibility is KCL: the sum of the currents leaving net j is zero
- V-I characteristic of resistor i: y_i = \nabla f_i(x_i)
- f_i(x) is the content function of resistor i
- convexity of f_i is incremental passivity of resistor i:
  (x_i - \tilde x_i)^T (y_i - \tilde y_i) \geq 0, where y_i = \nabla f_i(x_i), \tilde y_i = \nabla f_i(\tilde x_i)
- the optimality conditions are exactly the circuit equations

Decomposition methods
- solve the distributed problem iteratively
- the algorithm state is maintained in the nets
- each step consists of
  - a (parallel) update of the subsystem primal and dual variables, based only on adjacent net states
  - an update of the net states, based only on adjacent subsystems
- algorithms differ in the interface to the subsystems and in the net state and its update

Primal decomposition
repeat
  1. distribute net variables to adjacent subsystems: x_i := E_i z
  2. optimize subsystems (separately): solve the subproblems to evaluate y_i = \nabla f_i(x_i)
  3. collect and sum the dual variables for each net: w := \sum_{i=1}^n E_i^T y_i
  4. update the net variables: z := z - \alpha_k w
the step factor \alpha_k is chosen by standard gradient or subgradient rules

Primal decomposition
- the algorithm state is the net variable z (net voltages)
- w = \sum_{i=1}^n E_i^T y_i is the dual residual (net current residuals)
- primal feasibility is maintained; dual feasibility is approached in the limit
- subsystems are voltage controlled: the voltage x_i is asserted at the subsystem pins; the pin currents y_i are then found
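A minimal numerical sketch of the iteration on a toy problem of my own (two scalar quadratic subsystems coupled by one net; not an example from the slides):

```python
import numpy as np

# Toy problem: minimize f1(x1) + f2(x2) s.t. x1 = x2 = z, with
# f_i(x_i) = (a_i/2) * (x_i - b_i)^2, so grad f_i(x_i) = a_i * (x_i - b_i).
a = np.array([1.0, 3.0])
b = np.array([0.0, 4.0])
E = np.ones((2, 1))            # both pins sit on the single net

z = np.zeros(1)                # algorithm state: net voltage
alpha = 0.2                    # fixed step size (toy choice)
for k in range(100):
    x = E @ z                  # 1. distribute net variable to subsystems
    y = a * (x - b)            # 2. evaluate y_i = grad f_i(x_i)
    w = E.T @ y                # 3. sum dual variables per net (current residual)
    z = z - alpha * w          # 4. gradient step on the net variable

print(z)  # -> weighted average (a1*b1 + a2*b2)/(a1 + a2) = 3.0
```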

Circuit interpretation
- connect a capacitor to each net; the system relaxes to equilibrium
- the forward Euler update is primal decomposition
- incremental passivity implies convergence to equilibrium
[figure: three subsystems with pin voltages x_1, x_2, x_3 tied to a common net z with a grounded capacitor]

Dual decomposition
initialize y_i so that \sum_{i=1}^n E_i^T y_i = 0 (dual variables sum to zero on each net)
repeat
  1. optimize subsystems (separately): find x_i with \nabla f_i(x_i) = y_i, i.e., minimize f_i(x_i) - y_i^T x_i
  2. collect and average the primal variables over each net: z := (E^T E)^{-1} E^T x
  3. update the dual variables: y := y - \alpha_k (x - E z)

Dual decomposition
- the algorithm state is the dual variable y
- x - E z is the consistency residual
- dual feasibility is maintained; primal feasibility is approached in the limit
- subsystems are current controlled: the pin currents y_i are asserted; the pin voltages x_i are then found
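The same toy problem, run with dual decomposition (again my own sketch, assuming the closed-form subsystem solve available for quadratic f_i):

```python
import numpy as np

# Same toy problem as above; the subsystem solve now inverts grad f_i:
# minimize f_i(x_i) - y_i * x_i  =>  x_i = b_i + y_i / a_i.
a = np.array([1.0, 3.0])
b = np.array([0.0, 4.0])
E = np.ones((2, 1))

y = np.array([1.0, -1.0])      # dual variables; sum to zero on the net
alpha = 0.5
for k in range(100):
    x = b + y / a                          # 1. current-controlled subsystem solves
    z = np.linalg.solve(E.T @ E, E.T @ x)  # 2. average primal variables per net
    y = y - alpha * (x - E @ z)            # 3. dual update keeps per-net sums at zero

print(z, x)  # -> z ~ [3.], x ~ [3., 3.]
```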

Circuit interpretation
- connect an inductor to each pin; the system relaxes to equilibrium
- the forward Euler update is dual decomposition
- incremental passivity implies convergence to equilibrium
[figure: three subsystems with pin voltages x_1, x_2, x_3 and pin currents y_1, y_2, y_3, each pin connected to net z through an inductor]

Prox(imal) interface
- prox operator: P_\rho(y, \bar x) = \operatorname{argmin}_x \left( f(x) - y^T x + (\rho/2) \|x - \bar x\|_2^2 \right)
- contains the usual dual term y^T x and a proximal regularization term
- amounts to solving \nabla f(x) + \rho (x - \bar x) = y
- circuit interpretation: drive via a resistance R = 1/\rho; cf. voltage (primal) drive or current (dual) drive
[figure: nonlinear resistor with characteristic y = \nabla f(x), pin voltage x, driven from \bar x through the resistance R]

Prox decomposition
initialize y_i so that \sum_{i=1}^n E_i^T y_i = 0
repeat
  1. optimize subsystems (separately): minimize f_i(x_i) - y_i^T x_i + (\rho/2) \|x_i - E_i z\|_2^2
  2. collect and average the primal variables over each net: z := (E^T E)^{-1} E^T x
  3. update the dual variables: y := y - \rho (x - E z)
the step size \rho in the dual update is guaranteed to work

Prox decomposition
- has many other names...
- the algorithm state is the dual variable y
- the updated dual y - \rho (x - E z) is dual feasible
- primal and dual feasibility are approached in the limit
- subsystems are resistor driven; they must support the prox interface
- interpretations: regularized dual decomposition; PI feedback (as opposed to I-only feedback)
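The same toy problem once more, now through the prox interface (my own sketch; for quadratic f_i the prox subproblem has a closed-form solution):

```python
import numpy as np

# Each subsystem supports the prox interface:
# argmin f_i(x_i) - y_i x_i + (rho/2)(x_i - v_i)^2 with v = E z, which for
# quadratic f_i solves a_i (x_i - b_i) + rho (x_i - v_i) = y_i in closed form.
a = np.array([1.0, 3.0])
b = np.array([0.0, 4.0])
E = np.ones((2, 1))
rho = 1.0                      # drive resistance R = 1/rho

y = np.zeros(2)                # dual variables; sum to zero on the net
z = np.zeros(1)
for k in range(100):
    v = E @ z
    x = (a * b + y + rho * v) / (a + rho)   # 1. prox subsystem solves
    z = np.linalg.solve(E.T @ E, E.T @ x)   # 2. per-net averaging
    y = y - rho * (x - E @ z)               # 3. dual update with step rho

print(z)  # -> [3.] for this example
```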

Circuit interpretation
- connect an inductor in series with a resistor to each pin; the system relaxes to equilibrium
- the forward Euler update is prox decomposition
- incremental passivity implies convergence to equilibrium
[figure: three subsystems with pin voltages x_1, x_2, x_3 and pin currents y_1, y_2, y_3, each pin connected to net z through a series RL branch]

Momentum terms
- in an optimization method, the current search direction is the standard search direction (gradient, subgradient, prox, ...) plus the last search direction, scaled
- interpretations/examples:
  - smooth/low-pass filter/average the search directions
  - add momentum to the search algorithm ("heavy-ball" method)
  - two-term method (CG)
  - Nesterov's optimal-order method
- often dramatically improves convergence

You guessed it
- algorithm: prox decomposition with momentum
- just add a capacitor to the prox LR circuit
[figure: three subsystems with pin voltages x_1, x_2, x_3 and pin currents y_1, y_2, y_3, connected to net z through the augmented interconnect]
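One simple way to realize this in the toy sketch above is a heavy-ball term on the dual search direction (my own variant; the exact update obtained by discretizing the RLC interconnect may differ):

```python
import numpy as np

# Prox decomposition from above, with a momentum-filtered dual direction:
# d_k = (x - E z) + beta * d_{k-1}. Starting from d = 0, the per-net sums
# of y stay at zero by induction, so dual feasibility is preserved.
a = np.array([1.0, 3.0])
b = np.array([0.0, 4.0])
E = np.ones((2, 1))
rho, beta = 1.0, 0.3

y = np.zeros(2)
z = np.zeros(1)
d = np.zeros(2)                # last search direction
for k in range(100):
    v = E @ z
    x = (a * b + y + rho * v) / (a + rho)
    z = np.linalg.solve(E.T @ E, E.T @ x)
    d = (x - E @ z) + beta * d          # filtered (momentum) direction
    y = y - rho * d

print(z)  # -> [3.] for this toy example
```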

Conclusions
to get a distributed optimization algorithm:
- represent the problem as a circuit with interconnecting wires
- replace the interconnect wires with passive circuits that reduce to wires at equilibrium
- discretize the circuit dynamics
- subsystem interfaces depend on the circuit drive (current, voltage, or via a resistor)
- convergence hinges on incremental passivity