Decoupling Coupled Constraints Through Utility Design
Na Li and Jason R. Marden

Abstract: The central goal in multiagent systems is to design local control laws for the individual agents to ensure that the emergent global behavior is desirable with respect to a given system-level objective. In many systems this control design is further complicated by coupled constraints on the agents' behavior. This paper seeks to address the design of such algorithms using the field of game theory. In particular, is it possible to design local agent utility functions for an unconstrained game such that all resulting pure Nash equilibria (i) optimize the given system-level objective and (ii) satisfy the given coupled constraint? Such developments would greatly simplify the control design by eliminating the need to explicitly consider the constraint. Unfortunately, we illustrate that the standard game-theoretic framework is not suitable for this design objective. However, we demonstrate that by adding an additional state variable to the game environment, i.e., moving towards a special class of Markov games termed state based games, we can satisfy these performance criteria through utility design. The key to this realization is incorporating exterior penalty functions and barrier functions into the design of the agents' utility functions.

I. INTRODUCTION

The central goal in multiagent systems is to derive desirable collective behaviors through the design of individual agent control algorithms [1]-[7]. In many systems, the desirable collective behavior must also satisfy a given coupled constraint on the agents' actions [2]-[7]. One example is the problem of TCP control, where the users' sending rates need to satisfy link capacity constraints [4].
An alternative example is the problem of power control in MIMO interference systems, where the users' sending rates need to satisfy quality-of-service constraints [5], [6].

(This research was supported by AFOSR grant #FA. Na Li is a graduate student of Control and Dynamical Systems, California Institute of Technology, Pasadena, CA 91125, USA; nali@caltech.edu. J. R. Marden is with the Department of Electrical, Computer & Energy Engineering, University of Colorado, Boulder, CO 80309, USA; jason.marden@colorado.edu. Corresponding author.)

Regardless of the specific application domain, these coupled constraints bring additional complexity to the control algorithm design. There are two main research directions aimed at designing distributed control algorithms to satisfy performance criteria involving coupled constraints. The first direction seeks to design algorithms that ensure the coupled constraint is always satisfied, e.g., the well-studied consensus algorithm [1], [3], [8], [9]. While theoretically appealing, such algorithms lack robustness to environmental uncertainties, noisy measurements, and inconsistent clock rates amongst the agents. The second direction seeks to design algorithms that ensure the limiting behavior satisfies the coupled constraints, e.g., dual decomposition [4], [10]-[13] and subgradient methods [14], [15]. Such algorithms often require a two-time-scale solution approach by introducing pricing terms to help coordinate behavior. Depending on the application domain, these two-time-scale approaches may be prohibitive either because of the informational dependence of the pricing terms or the rigidity of the update algorithm.^1

Another key problem associated with the above algorithms is the informational dependence of the agents' control policies. From a design perspective, it is important to have a methodology for control design which guarantees that the resulting agent control policies are within an admissible space, e.g., control policies that depend only on local information. However, when using the aforementioned algorithms, the informational dependence of the resulting agents' control policies is a byproduct of the algorithm and the structure of the system-level objective function. Hence, the resulting control policies need not be admissible. Recently, the approaches in [14] and [15] seek to address this problem by adding such informational constraints to the resulting control policies.
The derived results in [14], [15] hold for a special class of optimization problems with separable objective functions and coupled feasibility constraints.^2 In contrast, we focus on an alternative class of optimization problems with non-separable objective functions and linear coupled constraints. While our solution approach is fundamentally different from the approaches in [14], [15], the approaches are similar in spirit in that informational limitations are overcome by introducing local estimation terms for the agents. Our results offer certain advantages over the results in [14], [15] by (i) providing automatic robustness guarantees and (ii) providing algorithms which ensure that the coupled constraints are always satisfied.^3

This paper focuses on using game theory for the design of agent control policies. A game-theoretic approach to cooperative control involves the following two steps: (i) Game design: define the agents as self-interested decision makers in a game-theoretic environment. (ii) Agent decision rules: define a distributed learning algorithm, i.e., a distributed control law, for each agent to coordinate group behavior. The goal is to do both steps in conjunction with one another to ensure that the emergent global behavior is desirable with respect to a given system-level objective [16], [17]. One of the central appeals of game theory for the design and control of multiagent systems is that it provides a hierarchical decomposition between the distribution of the optimization problem (game design) and the specific distributed control laws (agent decision rules) [18]. For example, if the game is designed as a potential game, then a system designer can appeal to a wide class of distributed learning algorithms which guarantee convergence to a (pure) Nash equilibrium under a variety of informational dependencies [19]-[23]. This decomposition provides the system designer with tremendous flexibility to meet the objectives and constraints inherent to a broad class of multiagent systems.

^1 For example, in [11] the authors introduce a two-time-scale algorithm for a network coding problem where the pricing terms are set, agents reach an equilibrium, and then the pricing terms are updated. This type of algorithm is difficult to implement since convergence rates for reaching an equilibrium are uncharacterized; therefore, it is hard to determine when to adapt the pricing terms. In alternative settings, such as [4], the agents can reach an equilibrium in one time step; hence, implementing such an algorithm is not necessarily prohibitive. However, the fact that agents can equilibrate in one time step is a result of the special structure of the objective function and in general does not hold for other system-level objective functions.
Furthermore, the framework of potential games is inherently robust. That is, even if the agents' control policies are heterogeneous, stemming from delays or corruption in information, varying degrees of accessible information, and variations in clock rates amongst the agents, behavior will still converge to a pure Nash equilibrium in such settings. This theme is encapsulated in [22], which proves that virtually any reasonable decision making process will converge to a pure Nash equilibrium in potential games.

The first part of this work focuses on the following question: is it possible to design a game such that (i) the agents' utility functions depend only on local information, (ii) all resulting equilibria satisfy the desired performance criterion, and (iii) the resulting game has an underlying structure that can be exploited by distributed algorithms? Our first result proves that the framework of strategic form games is not rich enough to meet these objectives. We demonstrate this impossibility result on the well-studied consensus problem, which possesses a convex system-level objective function and linear coupled constraints. Ultimately, the desired locality in each agent's utility function limits the degree to which we can achieve these objectives.

Our second result establishes a systematic methodology for game design that satisfies the above objectives by considering a broader class of games than that of strategic form games. Specifically, we consider a simplified class of Markov games [24], termed state based games, which introduces an underlying state space into the game-theoretic environment [25]. Here, local state variables are introduced as a mechanism for each agent to estimate constraint violations using only available information. The novelty of our approach stems from integrating classical optimization techniques, in particular exterior penalty methods and barrier function methods, into the design of the agents' cost functions, as shown in Sections IV-B and IV-C respectively. Both methodologies ensure that all three objectives are satisfied (Theorems 2 and 3).

^2 The paper [14] considers a distributed optimization problem where each agent $i \in \{1, \ldots, n\}$ is endowed with a local objective function $f_i(\cdot)$ and a local constraint set $X_i$. The considered system-level objective is of the form $\min_{x \in \cap_{i=1}^n X_i} \sum_{i=1}^n f_i(x)$, where $f_i : \mathbb{R}^n \to \mathbb{R}$ is a convex function and $X_i \subseteq \mathbb{R}^n$ is a closed convex set. The algorithm in [15] also focuses on distributed optimization problems with the same separable objective functions but with coupled constraints of the form $g(x) \le 0$.

^3 The algorithm presented in [15] does not ensure that the coupled constraints are always satisfied for the considered class of optimization problems with linear coupled constraints of the form $Ax \le b$. The results contained in this paper (Section IV-C) provide such algorithms.
The core difference between the two approaches is that barrier functions can also be used to ensure that the constraint is satisfied dynamically in addition to asymptotically. The developed methodologies, whether incorporating the exterior penalty method or the barrier function method, ensure that the resulting game fits into a class of games termed state based potential games [25]. A state based potential game, which represents an extension of potential games to the class of state based games, possesses an underlying structure that can be exploited in distributed learning. Our last result proves that the learning algorithm gradient play [26]-[28] asymptotically converges to a set of equilibria, termed stationary state Nash equilibria, in state based potential games. Accordingly, the two-step design approach achieves our desired performance criteria.
II. PRELIMINARIES

A. Multiagent Systems: Problem Definition

We consider a multiagent system consisting of $n$ agents denoted by the set $N \triangleq \{1, \ldots, n\}$. Each agent $i \in N$ is endowed with a set of possible decisions (or values) denoted by $V_i$, which we assume is a convex subset of $\mathbb{R}$, i.e., $V_i \subseteq \mathbb{R}$.^4 We denote a joint decision by the tuple $v \triangleq (v_1, \ldots, v_n) \in V \triangleq \prod_{i \in N} V_i$, where $V$ is referred to as the set of joint decisions. There is a global objective of the form $\phi : V \to \mathbb{R}$ that a system designer seeks to minimize. We assume throughout that $\phi$ is differentiable and convex unless otherwise noted. Lastly, we focus on multiagent systems with an inherent coupled constraint represented by a set of linear inequalities $\{\sum_{i=1}^n A_i^k v_i - C^k \le 0\}_{k \in M \triangleq \{1, \ldots, m\}}$. More formally, the optimization problem takes on the form

$$\min_{v_i \in V_i, \, i \in N} \ \phi(v_1, v_2, \ldots, v_n) \quad \text{s.t.} \ \sum_{i=1}^n A_i^k v_i - C^k \le 0, \ \forall k \in M. \quad (1)$$

We do not explicitly highlight the equality case $\sum_{i=1}^n A_i^k v_i - C^k = 0$, since it can be handled by the two inequalities $\sum_{i=1}^n A_i^k v_i - C^k \le 0$ and $-(\sum_{i=1}^n A_i^k v_i - C^k) \le 0$.

The focus of this paper is different from that of direct optimization, as our goal is to establish an interaction framework where each decision maker $i \in N$ makes a decision independently in response to local information. These informational dependencies are modeled by an undirected and connected communication graph $G = (N, E)$ with nodes $N$ and edges $E \subseteq N \times N$, where the neighbor set of agent $i$ is $N_i \triangleq \{j \in N : (i, j) \in E\}$.^5 The interaction framework produces a sequence of decisions $v(0), v(1), v(2), \ldots$, where at each iteration $t \in \{0, 1, \ldots\}$ the decision of each agent $i$ is chosen independently according to a local control law of the form

$$v_i(t) = F_i(\{\text{information about agent } j \text{ at stage } t\}_{j \in N_i}) \quad (2)$$

which designates how each agent processes available information to formulate a decision at each iteration.
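As a concrete instance of this setup, the sketch below encodes a small version of problem (1) and checks feasibility of a candidate joint decision. The three-agent objective, the matrix $A$, and the vector $C$ are all invented for illustration; nothing here comes from the paper itself.

```python
import numpy as np

# Hypothetical instance of problem (1): n = 3 agents, m = 2 linear coupled
# constraints of the form sum_i A[k, i] * v[i] - C[k] <= 0 (values invented).
A = np.array([[1.0, 1.0, 1.0],     # constraint 1: v1 + v2 + v3 <= 6
              [1.0, -1.0, 0.0]])   # constraint 2: v1 - v2 <= 1
C = np.array([6.0, 1.0])

def phi(v):
    """A convex, differentiable system-level objective (illustrative choice)."""
    return float(np.sum((v - np.array([1.0, 2.0, 3.0])) ** 2))

def feasible(v, tol=1e-9):
    """Check every coupled constraint sum_i A[k, i] * v[i] - C[k] <= 0."""
    return bool(np.all(A @ v - C <= tol))

v = np.array([1.0, 2.0, 3.0])
print(phi(v), feasible(v))   # 0.0 True (this minimizer happens to be feasible)
```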
The goal in this setting is to design the local controllers $\{F_i(\cdot)\}_{i \in N}$ within the desired informational constraints such that the collective behavior converges to a joint decision $v$ that solves the optimization problem in (1). Note that these informational constraints often limit the applicability of established distributed optimization approaches.

^4 The results in this paper also hold for values of higher dimension, i.e., $V_i \subseteq \mathbb{R}^{d_i}$ where $d_i > 1$ and $d_i$ may differ from $d_j$ for $i \neq j$. For ease of exposition, we focus purely on the case $d_i = 1$ for all $i \in N$.

^5 By convention we let $i \in N_i$ for each agent $i \in N$.

The main objective of this paper is to develop a methodology for decoupling coupled constraints through local estimation terms. Hence, we assume throughout that it is possible to design local agent control policies of the form (2) for solving the class of optimization problems in (1) without the coupled constraints. Accordingly, we assume that for each agent $i \in N$ there exists a function $M_i : \prod_{j \in N_i} V_j \to \mathbb{R}$ such that for any value profiles $v, v' \in V$ with $v_{-i} = v'_{-i}$ we have

$$\phi(v) - \phi(v') = M_i(\{v_j\}_{j \in N_i}) - M_i(\{v'_j\}_{j \in N_i}). \quad (3)$$

Here $v_{-i}$ denotes the value profile of agents other than agent $i$, i.e., $v_{-i} = (v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n)$. There are several examples of systems with such objective functions, including consensus [1], [3], [8], [9], power control [10], network coding [11], [29], sensor coverage [30], and network routing [4], [31], among many more.^6 Furthermore, recent work in [32] focuses on how to achieve this for more general objective functions by introducing auxiliary state variables.

B. Background: Finite Strategic Form Games

This paper focuses on using game theory to derive control policies of the form (2). Game theory provides a rich analytical framework for analyzing distributed systems where the decision makers respond to local and heterogeneous objective functions. The core game-theoretic framework is that of strategic form games, which consists of an agent set $N \triangleq \{1, \ldots, n\}$. Each agent $i \in N$ has an action set $A_i$ and a cost function $J_i : A \to \mathbb{R}$, where $A := \prod_{i \in N} A_i$ denotes the joint action set. For an action profile $a = (a_1, \ldots, a_n)$, let $a_{-i}$ denote the action profile of agents other than agent $i$, i.e., $a_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n)$. We will frequently represent a game by the tuple $G = \{N, \{A_i\}_{i \in N}, \{J_i\}_{i \in N}\}$.
Pure Nash equilibria represent one of the most well-known equilibrium concepts for such distributed systems. An action profile $a^* \in A$ is called a pure Nash equilibrium if for all $i \in N$,

$$J_i(a_i^*, a_{-i}^*) = \min_{a_i \in A_i} J_i(a_i, a_{-i}^*).$$

^6 Consider the consensus problem introduced in Section III-A. The objective function for consensus is of the form $\frac{1}{2} \sum_{i \in N} \sum_{j \in N_i} \|v_i - v_j\|_2^2$. It is straightforward to verify that this objective function satisfies (3) with $M_i(\{v_j\}_{j \in N_i}) = \sum_{j \in N_i} \|v_i - v_j\|_2^2$.
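The decomposition property (3) is easy to verify numerically for the consensus objective of footnote 6. In the sketch below the path graph and the values are invented for illustration; `M` implements the local function $M_i$ from the footnote, with the convention $i \in N_i$.

```python
import numpy as np

# Path graph 1-2-3-4 (0-indexed); by the footnote-5 convention, i is in N_i.
N = {0: {0, 1}, 1: {0, 1, 2}, 2: {1, 2, 3}, 3: {2, 3}}

def phi(v):
    """Consensus objective: (1/2) * sum_{i in N} sum_{j in N_i} (v_i - v_j)^2."""
    return 0.5 * sum((v[i] - v[j]) ** 2 for i in N for j in N[i])

def M(i, v):
    """Local function from footnote 6: M_i = sum_{j in N_i} (v_i - v_j)^2."""
    return sum((v[i] - v[j]) ** 2 for j in N[i])

v = np.array([0.0, 20.0, 0.0, 0.0])
w = v.copy()
w[1] = 7.0                            # a unilateral deviation by agent 1
# Condition (3): the global change in phi equals the local change in M_1.
print(np.isclose(phi(v) - phi(w), M(1, v) - M(1, w)))   # True
```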
Our motivation for studying pure Nash equilibria derives from the fact that for many game structures, e.g., potential games, there is a wide array of distributed learning algorithms which guarantee convergence to a (pure) Nash equilibrium under a variety of informational dependencies [19]-[23].

III. AN IMPOSSIBILITY RESULT FOR GAME DESIGN

We focus on game theory as a tool for obtaining distributed solutions to the optimization problem (1). Accordingly, we formulate the optimization problem as a game where the agent set is $N$, the action set of each agent is $V_i$, and each agent is assigned a cost function of the form $J_i : V \to \mathbb{R}$. This section focuses on the feasibility of developing a methodology for game design that achieves the following three objectives:

(O-1): Each agent's cost function relies solely on local information and is of the form $J_i : \prod_{j \in N_i} V_j \to \mathbb{R}$. Furthermore, the cost functions should possess a degree of scalability with regard to the size of the system and the topology of the communication graph.

(O-2): All Nash equilibria of the resulting game $G$ represent solutions to the optimization problem in (1).

(O-3): The resulting game $G$ possesses an underlying structure, e.g., that of potential games [19]-[23], that can be exploited by distributed learning algorithms.

Accomplishing these objectives would ensure that the agents' control policies resulting from the designed game plus a suitable learning algorithm would be of the local form in (2). However, in this section we demonstrate that achieving these objectives using the framework of strategic form games is impossible in general. We demonstrate this impossibility result on the well-studied consensus problem.

A. Definition: Average Consensus

We now present the average consensus problem in [1]. Consider a set of agents $N = \{1, \ldots, n\}$ where each agent $i \in N$ has an initial value $v_i(0) \in \mathbb{R}$.
The goal of the consensus problem is to establish a set of local control laws of the form (2) such that each agent's value converges to the average of the initial values, i.e., $\lim_{t \to \infty} v_i(t) = \bar{v} = \frac{1}{n} \sum_{i \in N} v_i(0)$. Given that the communication graph $G$ is connected, this can be formalized as the following optimization problem:

$$\min_v \ \frac{1}{2} \sum_{i \in N} \sum_{j \in N_i} \|v_i - v_j\|_2^2 \quad \text{s.t.} \ \sum_{i \in N} v_i = \sum_{i \in N} v_i(0). \quad (4)$$

Furthermore, the control laws $\{F_i(\cdot)\}_{i \in N}$ should provide the desired asymptotic guarantees for any initial value profile $v(0)$ and communication graph $G$. This implies that the underlying control design must be invariant to these parameters. In the case of average consensus, this also implies that the underlying control design must be invariant to the agents' indices. We refer to such control designs as scalable.

B. Average Consensus: A Game Theoretic Formulation

Consider a game-theoretic model for the consensus problem where the agent set is $N$ and the action set of each agent is $\mathbb{R}$, i.e., $A_i = \mathbb{R}$ for all agents $i \in N$. To facilitate the design of scalable agent control policies, we focus on the design of agent cost functions of the form

$$J_i(v) \triangleq J(\{v_j, v_j(0)\}_{j \in N_i}) \quad (5)$$

where the function $J(\cdot)$ is invariant to the specific indices assigned to agents. Notice that this design of $J(\cdot)$ leads to a well-defined game irrespective of the agent set $N$, the initial value profile $v(0)$, or the structure of the communication graph $\{N_i\}_{i \in N}$. The following proposition demonstrates that it is impossible to design $J(\cdot)$ such that, for any game induced by an initial value profile and communication graph, all resulting Nash equilibria solve the optimization problem in (4).

Proposition 1: There does not exist a single $J(\cdot)$ such that for any game induced by a connected communication graph $G$, an initial value profile $v(0) \in V$, and agents' cost functions of the form (5), the Nash equilibria of the induced game represent solutions to the optimization problem in (4).

Proof: Suppose that there exists a single $J(\cdot)$ that satisfies the proposition. We will now construct a counterexample to show this is impossible. Consider two average consensus problems with a communication graph and initial values (either 0 or 20) as highlighted in Figure 1.
For example, in setting (a) we have $N = \{1, 2, 3, 4\}$, $E = \{\{1, 2\}, \{1, 3\}, \{3, 4\}\}$, $v_2(0) = 20$, and $v_1(0) = v_3(0) = v_4(0) = 0$. Define a cost function for each agent of the form (5) for both settings (a) and (b) using this single $J(\cdot)$.

Fig. 1. Two consensus problems for the proof of Proposition 1.

Suppose $(v_1^*, v_2^*, v_3^*, v_4^*)$ represents a Nash equilibrium for setting (a). This implies that for any agent $i \in N$ we have

$$J(\{v_i^*, v_i(0)\}, \{v_j^*, v_j(0)\}_{j \in N_i \setminus i}) \le J(\{v_i, v_i(0)\}, \{v_j^*, v_j(0)\}_{j \in N_i \setminus i}) \quad (6)$$

for any $v_i \in V_i$. Furthermore, by assumption we know that $v_i^* = 5$ for all agents $i = 1, 2, 3, 4$. We now claim that $(v_1^*, v_2^*, v_3^*, v_4^*, v_5^*) = (5, 5, 5, 5, 5)$ must also be a Nash equilibrium for setting (b). By writing down the Nash equilibrium condition in (6) for setting (b), it is straightforward to see that agents 1, 2, 3, 4 in setting (a) have the same cost functions as agents 1, 2, 3, 4 in setting (b). This results from the fact that the agents' cost functions are invariant to the specific agent indices. Therefore, since $(v_1^*, v_2^*, v_3^*, v_4^*) = (5, 5, 5, 5)$ is a Nash equilibrium for (a), no member of this group has a unilateral incentive to deviate from $v_i^* = 5$ in setting (b). Furthermore, agent 5 in setting (b) has the same cost function as agent 3 in setting (a), again because the agents' cost functions are invariant to the specific agent indices. Therefore, agent 5 also does not have a unilateral incentive to deviate from $v_5^* = 5$ in setting (b), following the same logic as above. Therefore $(v_1^*, v_2^*, v_3^*, v_4^*, v_5^*) = (5, 5, 5, 5, 5)$ is a Nash equilibrium of (b). The impossibility comes from the fact that 5 does not represent the average of the initial values in setting (b).

IV. STATE BASED GAME DESIGNS

The previous section demonstrates that the framework of strategic form games is not suitable for handling coupled constraints through the design of agent cost functions. In this section, we
focus on achieving objectives (O-1)-(O-3) using the framework of state based games [25], which represents a simple yet tractable class of Markov games [24]. In such games, each agent $i \in N$ is assigned a local state-based cost function of the form $J_i : \prod_{j \in N_i} X_j \times \prod_{j \in N_i} A_j \to \mathbb{R}$, where $X_j$ is a set of local state variables for agent $j$ and $A_j$ is a set of actions for agent $j$. We present two separate designs in this section that achieve objectives (O-1)-(O-3) by utilizing these local state variables as the agents' local estimates of the coupled constraint violations. The first design employs exterior penalty functions in the design of the agents' cost functions. The second design employs barrier functions in the design of the agents' cost functions. The core difference between the two approaches is that barrier functions can be used to ensure that the constraint is satisfied dynamically in addition to asymptotically.

A. Preliminaries: State Based Games

State based games represent an extension of finite strategic form games in which an underlying state space is introduced to the game-theoretic environment. The class of state based games considered in this paper consists of the following elements: (i) an agent set $N$, (ii) a state space $X$, (iii) state-dependent action sets of the form $A_i(x)$ for each agent $i \in N$ and state $x \in X$, (iv) state-dependent cost functions of the form $J_i : X \times A \to \mathbb{R}$ for each agent $i \in N$, where $A \triangleq \prod_{i \in N} A_i$ and $A_i \triangleq \cup_{x \in X} A_i(x)$, and (v) a deterministic state transition function $f : X \times A \to X$. Lastly, we focus on state based games where for any state $x \in X$ there exists a null action $0 \in A(x)$ such that $x = f(x, 0)$. We denote a state based game by the tuple $G = \{N, \{A_i\}_{i \in N}, \{J_i\}_{i \in N}, X, f\}$.

In this paper we focus on the following class of games, termed state based potential games [25], which represent an extension of potential games to the framework of state based games.
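The defining elements of a state based game can be captured in a few lines of code. The sketch below is a hypothetical two-agent example, not taken from the paper: a deterministic transition $f$ over a simple numeric state, together with a check of the null-action property $x = f(x, 0)$.

```python
# Minimal state based game skeleton (hypothetical example, not from the paper).
# State: a tuple of agent values; action: a tuple of increments per agent.

def f(x, a):
    """Deterministic state transition: each agent shifts its value by its action."""
    return tuple(xi + ai for xi, ai in zip(x, a))

x = (2.0, 5.0)
null = (0.0, 0.0)
assert f(x, null) == x       # the null action leaves the state unchanged: x = f(x, 0)
print(f(x, (1.0, -2.0)))     # a non-null action moves the state: (3.0, 3.0)
```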
Definition 1 (State Based Potential Game): A game $G$ is a state based potential game if there exists a potential function $\Phi : X \times A \to \mathbb{R}$ that satisfies the following two properties for every state $x \in X$:

(D-1): For any agent $i \in N$, action profile $a \in A(x)$, and action $a_i' \in A_i(x)$,

$$J_i(x, a_i', a_{-i}) - J_i(x, a) = \Phi(x, a_i', a_{-i}) - \Phi(x, a).$$
(D-2): For any action profile $a \in A(x)$ and ensuing state $\tilde{x} = f(x, a)$, the potential function satisfies $\Phi(\tilde{x}, 0) = \Phi(x, a)$.

The first condition states that each agent's cost function is aligned with the potential function in the same fashion as in potential games [19]. The second condition concerns the evolution of the potential function along the state trajectory.^7 We focus on the class of state based potential games since dynamics can be derived which converge to the following class of equilibria (Section V):

Definition 2 (Stationary State Nash Equilibrium): A state action pair $[x^*, a^*]$ is a stationary state Nash equilibrium if

(D-1): For any agent $i \in N$ we have $a_i^* \in \arg\min_{a_i \in A_i(x^*)} J_i(x^*, a_i, a_{-i}^*)$.

(D-2): The state is a fixed point of the state transition function, i.e., $x^* = f(x^*, a^*)$.

Note that a (strict) stationary state Nash equilibrium represents a fixed point of a better reply process [23]. Proposition 3 in the appendix proves that a stationary state Nash equilibrium is guaranteed to exist in any state based potential game.

B. Design Using Exterior Penalty Functions

Our first design integrates exterior penalty functions into the agents' cost functions.^8 We now introduce the specifics of our designed game. Table I in the appendix summarizes all necessary notation.

State Space: The starting point of our design is an underlying state space $X$ where each state $x \in X$ is defined as a tuple $x = (v, e)$, where $v = (v_1, \ldots, v_n) \in \mathbb{R}^n$ is the profile of values and

^7 The definition of state based potential games differs slightly from the definition in [25] since we focus on state-dependent action sets and games where there exist null actions.

^8 Exterior penalty functions are widely used in optimization for transforming a constrained optimization problem into an equivalent uncoupled constrained optimization problem where the constraints are translated to penalties that are directly integrated into the objective functions.
More specifically, the coupled constrained optimization problem (1) is transformed into an uncoupled constrained optimization problem $\min_{v \in V} \phi(v) + \mu \alpha(v)$, where $\alpha(v)$ is a penalty function associated with the constraints $\{\sum_i A_i^k v_i - C^k \le 0\}_{k \in M}$ and $\mu > 0$ is a tradeoff parameter. The value of this transformation centers on the fact that solving this uncoupled constrained optimization problem is easier than solving the optimization problem in (1). For each $\mu > 0$ the solution to the uncoupled constrained optimization problem may represent an infeasible solution to the coupled constrained optimization problem in (1); however, as $\mu \to \infty$, the limit of the solutions to the uncoupled constrained optimization problem is an optimal solution to the optimization problem in (1). See Theorem in [33] for more details.
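The transformation described in this footnote can be illustrated on a toy one-dimensional problem, entirely invented for illustration: minimize $(v-2)^2$ subject to $v - 1 \le 0$, with quadratic exterior penalty $\alpha(v) = [\max(0, v-1)]^2$. For each finite $\mu$ the penalized minimizer $(2+\mu)/(1+\mu)$ is infeasible, but it approaches the constrained optimum $v = 1$ as $\mu$ grows.

```python
def penalized_solution(mu, steps=20000):
    """Gradient descent on phi(v) + mu * alpha(v) for the toy problem
    phi(v) = (v - 2)^2, alpha(v) = max(0, v - 1)^2 (exterior penalty)."""
    lr = 0.4 / (1.0 + mu)          # step size shrinks as the penalty stiffens
    v = 0.0
    for _ in range(steps):
        grad = 2.0 * (v - 2.0)
        if v > 1.0:                # penalty is active only outside the feasible set
            grad += 2.0 * mu * (v - 1.0)
        v -= lr * grad
    return v

for mu in (1.0, 10.0, 1000.0):
    print(mu, penalized_solution(mu))   # minimizers approach v* = 1 from outside
```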
$e = \{e_i^k\}_{i \in N, k \in M}$ is the profile of estimation terms. The term $e_i^k$ represents agent $i$'s estimate of the $k$-th constraint value $\sum_{i=1}^n A_i^k v_i - C^k$.

Actions: Each agent $i$ is assigned a state-dependent action set $A_i(x)$ that permits the agent to change its value and constraint estimates through communication with neighboring agents. Specifically, an action $a_i$ is defined as a tuple $a_i = (\hat{v}_i, \{\hat{e}_i^1, \ldots, \hat{e}_i^m\})$, where $\hat{v}_i$ indicates a change in the agent's value and $\hat{e}_i^k$ indicates a change in the agent's estimate of the $k$-th constraint. Here, the change in estimation terms for agent $i$ is represented by a tuple $\hat{e}_i^k = \{\hat{e}_{i \to j}^k\}_{j \in N_i}$. The term $\hat{e}_{i \to j}^k$ indicates the estimation value that agent $i$ exchanges (or passes) to agent $j \in N_i$ regarding the $k$-th constraint.

State Dynamics: For notational simplicity let $\hat{e}_{i,\text{in}}^k \triangleq \sum_{j \in N_i} \hat{e}_{j \to i}^k$ and $\hat{e}_{i,\text{out}}^k \triangleq \sum_{j \in N_i} \hat{e}_{i \to j}^k$ denote the total estimation passed to and from agent $i$ regarding the $k$-th constraint, respectively. The state transition function $f(x, a)$ takes on the following form: for any state $x = (v, e)$ and action $a = (\hat{v}, \hat{e})$, the ensuing state $\tilde{x} = (\tilde{v}, \tilde{e}) = f(x, a)$ is

$$\tilde{v}_i = v_i + \hat{v}_i, \qquad \tilde{e}_i^k = e_i^k + A_i^k \hat{v}_i + \hat{e}_{i,\text{in}}^k - \hat{e}_{i,\text{out}}^k, \quad k \in M. \quad (7)$$

Let $v(0)$ be the initial value profile and define the initial estimation terms $e(0)$ to satisfy $\sum_{i \in N} e_i^k(0) = \sum_{i \in N} A_i^k v_i(0) - C^k$ for all constraints $k \in M$.^9 Let $x(0) = (v(0), e(0))$ be the initial state. It is straightforward to show that for any sequence of action profiles $a(0), a(1), \ldots$, the resulting state trajectory $x(t) = (v(t), e(t)) = f(x(t-1), a(t-1))$ satisfies

$$\sum_{i \in N} e_i^k(t) = \sum_{i \in N} A_i^k v_i(t) - C^k \quad (8)$$

for all $t \ge 1$ and constraints $k \in M$. Therefore

$$\sum_{i \in N} e_i^k(t) \le 0 \iff \sum_{i \in N} A_i^k v_i(t) - C^k \le 0 \quad (9)$$

for any constraint $k \in M$. Hence, the estimation terms encode information regarding whether the constraints are violated.

Action Sets: The optimization problem in (1) requires that $v_i \in V_i$ for all agents $i \in N$.
Hence, the state-dependent action set of agent $i$ for a given state $x = (v, e)$ is defined as

^9 Note that there are several ways to define the initial estimation terms to satisfy this equality.
$$A_i(x) \triangleq \{(\hat{v}_i, \hat{e}_i) : v_i + \hat{v}_i \in V_i\}.$$

The null action $0$ takes on the form

$$\hat{v}_i = 0, \ \forall i \in N; \qquad \hat{e}_{i \to j}^k = 0, \ \forall i \in N, \ j \in N_i, \ k \in M. \quad (10)$$

Agent Cost Functions: We define the state-based cost function for each agent $i \in N$ as a summation of two terms relating to the objective $\phi$ and the constraints $Av \le C$, respectively:

$$J_i(x, a) = J_i^\phi(x, a) + \mu J_i^c(x, a) \quad (11)$$

where $\mu > 0$ is a trade-off parameter. For any state $x \in X$ and admissible action profile $a \in \prod_{i \in N} A_i(x)$, the components take on the form

$$J_i^\phi(x, a) = M_i(\{\tilde{v}_j\}_{j \in N_i}) \quad (12)$$

$$J_i^c(x, a) = \sum_{j \in N_i} \sum_{k=1}^m [\max(0, \tilde{e}_j^k)]^2 \quad (13)$$

where $\tilde{x} = (\tilde{v}, \tilde{e}) = f(x, a)$ represents the ensuing state and $M_i(\cdot)$ is defined in (3). The following two theorems demonstrate two analytical properties of the designed game. The first establishes that the designed state based game is a state based potential game. We present the proofs of both theorems in the appendix.

Theorem 1: Model the constrained optimization problem in (1) as a state based game $G$ with a fixed trade-off parameter $\mu > 0$ as depicted in Section IV-B. The state based game is a state based potential game with potential function

$$\Phi(x, a) = \phi(\tilde{v}) + \mu \sum_{i \in N} \sum_{k=1}^m [\max(0, \tilde{e}_i^k)]^2 \quad (14)$$

where $\tilde{x} = (\tilde{v}, \tilde{e}) = f(x, a)$ represents the ensuing state.

The next theorem provides a complete characterization of the set of stationary state Nash equilibria in the designed state based game.

Theorem 2: Model the constrained optimization problem in (1) as a state based game $G$ with a fixed trade-off parameter $\mu > 0$ as depicted in Section IV-B. Suppose the objective function $\phi : V \to \mathbb{R}$ is convex and differentiable and the communication graph $G$ is undirected and connected. A state action pair $[x, a] = [(v, e), (\hat{v}, \hat{e})]$ is a stationary state Nash equilibrium of the game $G$ if and only if the following four conditions are satisfied:
(i) The value profile $v$ is an optimal point of the uncoupled constrained optimization problem

$$\min_{v \in V} \ \phi(v) + \frac{\mu}{n} \sum_{k \in M} \Big[\max\Big(0, \sum_{i \in N} A_i^k v_i - C^k\Big)\Big]^2. \quad (15)$$

(ii) The estimation profile $e$ satisfies

$$\max(0, e_i^k) = \frac{1}{n} \max\Big(0, \sum_{i \in N} A_i^k v_i - C^k\Big), \quad \forall i \in N, \ k \in M.$$

(iii) The change in value profile satisfies $\hat{v}_i = 0$ for all agents $i \in N$.

(iv) The net change in estimation profile is 0, i.e., $\hat{e}_{i,\text{in}}^k - \hat{e}_{i,\text{out}}^k = 0$ for all agents $i \in N$ and constraints $k \in M$.

This characterization proves the equivalence between the stationary state Nash equilibria of the designed game and the solutions to the uncoupled constrained optimization problem in (15). Therefore, as $\mu \to \infty$, all equilibria of our designed game are solutions to the coupled constrained optimization problem in (1) [33]. Note that this equivalence is surprising given the fact that we impose a restriction on the underlying decision-making architecture. In particular, we impose a distributed decision-making architecture where each decision-making entity only has access to local information. Recall that such local information is encoded by the communication graph $G$, which is undirected and connected.

C. Design Using Barrier Functions

In this section we introduce our second design, which integrates barrier functions, as opposed to exterior penalty functions, into the agents' cost functions.^10 Both design approaches focus on solving the constrained optimization problem through a sequence of subproblems which generates a trajectory of value profiles $v(1), v(2), \ldots$, that converges to the optimal solution. The key difference between the two approaches lies in the feasibility of the intermediate solutions, as

^10 Much like exterior penalty functions, barrier functions are widely used in optimization for transforming a constrained optimization problem into an equivalent, yet easier to solve, optimization problem where the constraints are translated to penalties that are directly integrated into the objective functions.
More specifically, the constrained optimization problem in (1) is transformed into the following problem: $\min_{v \in V} \phi(v) + \mu B(v)$ subject to the constraints $\{\sum_{i=1}^{n} A_i^k v_i - C^k < 0\}_{k \in M}$, where $B(v)$ is a barrier function associated with the constraints $\{\sum_{i=1}^{n} A_i^k v_i - C^k < 0\}_{k \in M}$ and $\mu > 0$ is a given constant or trade-off parameter. The barrier function is added to the objective function and ultimately prohibits the generated points from leaving the feasible region. Hence, this method generates a feasible point for each $\mu$. When $\mu \to 0$, the limit of those feasible points is an optimal solution to the original problem. See Theorem in [33] for more details.
barrier functions can be used to ensure that the intermediate solutions are in fact feasible. We now introduce the specifics of our designed game.

State Space, Actions, State Dynamics: These three parts are identical to those in Section IV-B.

Admissible Action Sets: Let $x = (v, e)$ represent a feasible state where the value profile $v$ satisfies $Av < C$ and $v_i \in V_i$ for each agent $i \in N$, and the estimation profile $e$ satisfies $e_i^k < 0$ for each $i \in N$ and $k \in M$. Define the admissible action set for each agent $i \in N$ as
$$\mathcal{A}_i(x) \triangleq \Big\{(\hat{v}_i, \hat{e}_i) : v_i + \hat{v}_i \in V_i, \; e_i^k + A_i^k \hat{v}_i - \hat{e}_{i,\text{out}}^k < 0, \; \hat{e}_{i \to j}^k < 0, \; \forall k \in M\Big\}. \tag{16}$$
According to the state transition rule in (7), it is straightforward to verify that for any admissible action profile $a \in \mathcal{A}(x)$, the ensuing state $[\tilde{v}, \tilde{e}] = f(x, a)$ is also feasible. Accordingly, if the initial state $x(0)$ is feasible then all subsequent states $x(1), x(2), \ldots$, generated according to the process $x(t+1) = f(x(t), a(t))$ with $a(t) \in \mathcal{A}(x(t))$ for each time $t \in \{0, 1, \ldots\}$, are also feasible. This implies that the coupled constraint will always be satisfied since for each time $t > 0$ the value profile satisfies $Av(t) < C$ and $v_i(t) \in V_i$ for each agent $i \in N$, and the estimation profile satisfies $e_i^k(t) < 0$ for each $i \in N$ and $k \in M$. Furthermore, note that each agent $i$ can evaluate the admissible action set $\mathcal{A}_i(x)$ using only local information.

Agent Cost Functions: Each agent $i \in N$ is assigned a cost function of the form (11) where the sole difference is the cost function associated with the estimation terms, which now takes on the form
$$J_i^{c}(x, a) = -\sum_{j \in N_i} \sum_{k=1}^{m} \log\big(-\tilde{e}_j^k\big) \tag{17}$$
where $\tilde{x} = (\tilde{v}, \tilde{e}) = f(x, a)$ represents the ensuing state. Here, the cost function $J_i^{c}(\cdot)$ utilizes a barrier function, as opposed to a penalty function, which penalizes values close to the constraint boundary.
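To see the feasibility-preservation argument in action, the sketch below uses a toy instance of our own (a single constraint, a complete three-agent graph, made-up numbers, and $V_i = \mathbb{R}$; the additive transition form follows the agent-3 example in Section VI-A, with the constraint index suppressed). It draws actions consistent with (16) and checks that the ensuing estimation state stays strictly negative.

```python
import random

# Feasibility preservation under (16): start from a feasible estimation state
# (every e_i < 0), pick strictly negative transfers ehat_{i->j} < 0 and value
# changes vhat_i with e_i + A_i*vhat_i - ehat_out < 0, then apply the assumed
# additive transition  e~_i = e_i + A_i*vhat_i - ehat_out + ehat_in.
random.seed(0)
n = 3
A = [1.0, 2.0, 0.5]                          # hypothetical constraint coefficients
e = [-0.5, -1.0, -0.2]                       # feasible estimation state: e_i < 0

ehat = {(i, j): -random.uniform(0.0, 0.3)    # admissible transfers: strictly < 0
        for i in range(n) for j in range(n) if i != j}

vhat = []
for i in range(n):
    out = sum(ehat[(i, j)] for j in range(n) if j != i)
    vhat.append((out - e[i]) / A[i] - 0.05)  # makes e_i + A_i*vhat_i - out = -0.05*A_i < 0

e_next = []
for i in range(n):
    out = sum(ehat[(i, j)] for j in range(n) if j != i)
    inc = sum(ehat[(j, i)] for j in range(n) if j != i)
    e_next.append(e[i] + A[i] * vhat[i] - out + inc)

print([round(x, 3) for x in e_next])         # every entry remains strictly negative
assert all(x < 0 for x in e_next)
```

Since the incoming transfers are themselves negative, the ensuing estimates can only move further inside the feasible region, which is exactly the mechanism the admissible set (16) exploits.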
The following theorem demonstrates that the game design using barrier functions possesses the same analytical properties as the designed game using exterior penalty functions. We omit the proof, as it is virtually identical to the proofs of Theorems 1 and 2.

Theorem 3: Model the constrained optimization problem in (1) as a state based game $G$ with a fixed trade-off parameter $\mu > 0$ as depicted in Section IV-C. The state based game is a state based potential game with potential function
$$\Phi(x, a) = \phi(\tilde{v}) - \mu \sum_{i \in N} \sum_{k=1}^{m} \log\big(-\tilde{e}_i^k\big)$$
where $\tilde{x} = (\tilde{v}, \tilde{e}) = f(x, a)$ represents the ensuing state. Furthermore, suppose the objective function $\phi : V \to \mathbb{R}$ is convex and differentiable and the communication graph $G$ is undirected and connected. A state action pair $[x, a] = [(v, e), (\hat{v}, \hat{e})]$ is a stationary state Nash equilibrium if and only if the following four conditions are satisfied:

(i) The value profile $v$ is an optimal point of the following optimization problem:
$$\min_{v \in V} \; \phi(v) - n\mu \sum_{k \in M} \log\Big(C^k - \sum_{i \in N} A_i^k v_i\Big) \quad \text{s.t.} \; \sum_{i=1}^{n} A_i^k v_i - C^k < 0, \; \forall k \in M. \tag{18}$$

(ii) The estimation profile $e$ satisfies
$$e_i^k = \frac{1}{n}\Big(\sum_{i \in N} A_i^k v_i - C^k\Big), \quad \forall i \in N, k \in M.$$

(iii) The change in value profile satisfies $\hat{v}_i = 0$ for all agents $i \in N$.

(iv) The net change in estimation profile is $0$, i.e., $\hat{e}_{i,\text{in}}^k - \hat{e}_{i,\text{out}}^k = 0$ for all agents $i \in N$ and constraints $k \in M$.

As $\mu \to 0$ all equilibria of our designed game are solutions to the constrained optimization problem in (1) [33].

V. GRADIENT PLAY

In this section we prove that the learning algorithm gradient play [26]-[28] converges to a stationary state Nash equilibrium in any state based potential game. Since the designed games, whether with exterior penalty functions or barrier functions, are state based potential games, gradient play can be utilized to design control laws of the form (2).

A. Gradient Play for State Based Potential Games

Suppose that $\mathcal{A}_i(x)$ is a closed convex set for all $i \in N$ and $x \in X$. Let $x(t)$ represent the state at time $t$. According to the learning algorithm gradient play, each agent $i \in N$ selects an action $a_i(t) \in \mathcal{A}_i(x(t))$ according to
$$a_i(t) = \Big[-\epsilon \frac{\partial J_i(x(t), a)}{\partial a_i}\Big|_{a=0}\Big]^{+} \tag{19}$$
where $[\cdot]^{+}$ represents the projection onto the closed convex set $\mathcal{A}_i(x(t))$ and $\epsilon$ is the step size, a positive constant.
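A minimal sketch of the projected update (19) on a toy state based game of our own construction (not one of the paper's designs): the state lives in $\mathbb{R}^2$ with transition $x(t+1) = x(t) + a(t)$, the potential is $\Phi(x, a) = \phi(x + a)$ with $\phi(y) = (y_1 - 1)^2 + (y_2 + 2)^2$, each agent controls one coordinate, and the admissible action sets are the interval $[-0.1, 0.1]$. The potential evaluated at the null action decreases monotonically, as Theorem 4 below predicts.

```python
# Gradient play (19) on a hypothetical quadratic state based potential game.
def clip(z, lo, hi):                     # projection onto a closed interval
    return max(lo, min(hi, z))

target = [1.0, -2.0]
x = [5.0, 5.0]
eps = 0.4                                # phi has L = 2, so eps = 0.4 < 2/L = 1
phi = lambda y: sum((y[i] - target[i]) ** 2 for i in range(2))

values = [phi(x)]                        # track Phi(x(t), 0) along the trajectory
for _ in range(200):
    grad = [2 * (x[i] - target[i]) for i in range(2)]   # dPhi/da_i at a = 0
    a = [clip(-eps * grad[i], -0.1, 0.1) for i in range(2)]
    x = [x[i] + a[i] for i in range(2)]  # state transition x(t+1) = x(t) + a(t)
    values.append(phi(x))

# monotone decrease of the potential, and convergence to the optimum
assert all(v2 <= v1 + 1e-12 for v1, v2 in zip(values, values[1:]))
print(round(x[0], 3), round(x[1], 3))
```

The clipping step plays the role of the projection $[\cdot]^{+}$; with the small admissible sets the early iterates move at the maximum allowed rate, after which the unconstrained gradient step takes over.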
Theorem 4: Let $G$ be a state based potential game with a potential function $\Phi(x, a)$ that satisfies:

(i) $\Phi(x, a)$ is convex, differentiable, and bounded below.
(ii) There exists a constant $L$ such that for any $x \in X$ and any $a, a' \in \mathcal{A}(x)$, $\|\nabla_a \Phi(x, a) - \nabla_a \Phi(x, a')\|_2 \le L \|a - a'\|_2$, where $\nabla_a \Phi(x, a) \triangleq \big(\frac{\partial \Phi}{\partial a_1}, \ldots, \frac{\partial \Phi}{\partial a_n}\big)$.^{11}
(iii) $\mathcal{A}_i(x)$ is a closed convex set for all $i \in N$ and $x \in X$.

If the step size is smaller than $2/L$, then the state action pair $[x(t), a(t)]$ of the gradient play process in (19) asymptotically converges to a stationary state Nash equilibrium of the form $[x, 0]$.

Proof: We will prove that the potential function $\Phi(x, a) = \Phi(\tilde{x}, 0)$, where $\tilde{x} \triangleq f(x, a)$, is monotonically decreasing during the gradient play process provided that the step size is sufficiently small. The gradient play process in (19) can be expressed using the state based potential function as
$$a_i(t) = \Big[-\epsilon \frac{\partial J_i(x(t), a)}{\partial a_i}\Big|_{a=0}\Big]^{+} = \Big[-\epsilon \frac{\partial \Phi(x(t), a)}{\partial a_i}\Big|_{a=0}\Big]^{+}. \tag{20}$$
Let $x(t+1) = f(x(t), a(t))$. By Definition 1, we know that $\Phi(x(t+1), 0) = \Phi(x(t), a(t))$. Accordingly, we will focus on analyzing the sequence $\Phi(x(t), 0)$, which satisfies
$$\Phi(x(t+1), 0) - \Phi(x(t), 0) = \Phi(x(t), a(t)) - \Phi(x(t), 0) \le a(t)^T \nabla_a \Phi(x(t), a)\big|_{a=0} + \frac{L}{2}\|a(t)\|_2^2$$
where the inequality is based on Proposition A.24 in [34]. By the Projection Theorem [34], we know that
$$\Big(-\epsilon \nabla_a \Phi(x(t), a)\big|_{a=0} - a(t)\Big)^T \big(0 - a(t)\big) \le 0$$
which is equivalent to
$$a(t)^T \nabla_a \Phi(x(t), a)\big|_{a=0} \le -\frac{1}{\epsilon}\, a(t)^T a(t).$$
If the step size is smaller than $\frac{2}{L}$, we have that
$$\Phi(x(t+1), 0) - \Phi(x(t), 0) \le \Big(\frac{L}{2} - \frac{1}{\epsilon}\Big)\|a(t)\|_2^2 \le 0$$
and equality holds if and only if $a(t) = 0$. Therefore, $\Phi(x(t), 0)$ is monotonically decreasing along the trajectory $x(t)$. Since $\Phi(x(t), 0)$ is bounded below, $\Phi(x(t), 0)$ keeps decreasing until it

^{11} This says that $\nabla_a \Phi(x, a)$ is Lipschitz continuous in the variable $a$.
reaches a fixed point, which means $a(t) = 0$. By Lemma 1, we know that $[x(t), a(t)]$ converges to the stationary state Nash equilibrium $[x, 0]$.

The rate of convergence of gradient play depends on the structure of the potential function $\Phi$, the state transition function $f$, and the step size $\epsilon$. A larger $\epsilon$ generally leads to faster convergence but can also lead to instability. The bound on the step size $\epsilon$ in Theorem 4 is conservative, as a larger step size can usually be used without losing stability. Furthermore, the same proof techniques can be used to prove convergence even with heterogeneous step sizes among agents. This implies that agents can update synchronously or asynchronously without altering the asymptotic guarantees.

B. Gradient Play for Designed Game Using Exterior Penalty Functions

The state based game design using penalty functions depicted in Section IV-B is a state based potential game with a potential function that is convex and differentiable. Accordingly, we can implement the gradient play algorithm to reach a stationary state Nash equilibrium provided that $\phi$ is bounded below and $V_i$ is a closed convex set for all $i \in N$. The gradient play algorithm can be described as follows: at each time $t \ge 0$ each agent $i$ selects an action $a_i(t) \triangleq (\hat{v}_i(t), \hat{e}_i(t))$ given the state $x(t) = (v(t), e(t))$ according to
$$\hat{v}_i(t) = \Big[-\epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{v}_i}\Big|_{a=0}\Big]^{+} = \Big[-\epsilon \Big(\frac{\partial \phi(v)}{\partial v_i}\Big|_{v=v(t)} + 2\mu \sum_{k \in M} A_i^k \max\big(0, e_i^k(t)\big)\Big)\Big]^{+} \tag{21}$$
$$\hat{e}_{i \to j}^k(t) = -\epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{e}_{i \to j}^k}\Big|_{a=0} = \epsilon\, 2\mu \Big(\max\big(0, e_i^k(t)\big) - \max\big(0, e_j^k(t)\big)\Big) \tag{22}$$
where $[\cdot]^{+}$ represents the projection onto the closed convex set $\mathcal{A}_i^{\hat{v}}(x(t)) \triangleq \{\hat{v}_i : v_i(t) + \hat{v}_i \in V_i\}$.

C. Gradient Play for Designed Game Using Barrier Functions

The state based game design using barrier functions depicted in Section IV-C is a state based potential game with a potential function that is convex and differentiable. Accordingly, we can implement the gradient play algorithm to reach a stationary state Nash equilibrium.
However, in order to guarantee that the coupled constraints $\{\sum_i A_i^k v_i - C^k < 0\}_{k \in M}$ are satisfied at each time step,
we need to adjust the gradient play algorithm in (19) as follows: at each time $t \ge 0$ each agent $i$ selects an action $a_i(t) \triangleq (\hat{v}_i(t), \hat{e}_i(t))$ given the state $x(t) = (v(t), e(t))$ according to
$$\hat{v}_i = -\beta(t)\,\epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{v}_i}\Big|_{a=0} = -\beta(t)\,\epsilon \Big(\frac{\partial \phi(v)}{\partial v_i}\Big|_{v=v(t)} - \mu \sum_{k \in M} A_i^k \frac{1}{e_i^k}\Big) \tag{23}$$
$$\hat{e}_{i \to j}^k = -\beta(t)\,\epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{e}_{i \to j}^k}\Big|_{a=0} = -\beta(t)\,\epsilon\,\mu \Big(\frac{1}{e_i^k} - \frac{1}{e_j^k}\Big)$$
where $\epsilon$ is the step size, a positive constant, and $\beta(t) = \big(\frac{1}{2}\big)^{k(t)}$ where $k(t)$ is the smallest nonnegative integer $k$ for which
$$\Big(-\big(\tfrac{1}{2}\big)^{k} \epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{v}_i}\Big|_{a=0}, \; -\big(\tfrac{1}{2}\big)^{k} \epsilon \frac{\partial J_i(x(t), a)}{\partial \hat{e}_{i \to j}}\Big|_{a=0}\Big) \in \mathcal{A}_i(x(t)). \tag{24}$$

VI. ILLUSTRATIONS

In this section we illustrate our results on two separate cooperative control problems where the desired global behavior must satisfy a given coupled constraint.

A. Consensus

Consider the consensus problem described in Section III-A. Recall that it was impossible to design agent cost functions of the form (5) which ensured that all resulting Nash equilibria solved the optimization problem in (4). However, the state based game design using exterior penalty functions (Section IV-B) accomplishes this task within the framework of state based games. Note that the state based game design using barrier functions (Section IV-C) is not applicable in this setting since the constraint is an equality constraint.

1) A Simple Example: We will now demonstrate the state based game design using exterior penalty functions on the consensus problem highlighted in Figure 1. First, we express the constraint $\sum_i v_i = \sum_i v_i(0)$ as two inequality constraints $\sum_i v_i \le \sum_i v_i(0)$ and $-\sum_i v_i \le -\sum_i v_i(0)$. Each agent $i$ has a local state variable of the form $x_i \triangleq (v_i, e_i^1, e_i^2)$ where $e_i^1$ and $e_i^2$ are the estimation terms associated with the two inequality constraints. Agent $i$'s action $a_i$ is
$a_i \triangleq (\hat{v}_i, \hat{e}_i^1, \hat{e}_i^2)$ where $\hat{e}_i^k = \{\hat{e}_{i \to j}^k\}_{j \in N_i}$ for $k = 1, 2$. The state transition rule and local cost function are defined in (7) and (11) where $M_i(\{v_j\}_{j \in N_i}) \triangleq \sum_{j \in N_i} \|v_i - v_j\|_2^2$ satisfies (3).

For concreteness, consider agent 3 in the setting of Figure 1.a as an example. A state of agent 3 is of the form $x_3 \triangleq (v_3, e_3^1, e_3^2)$. An action of agent 3 is of the form $a_3 \triangleq (\hat{v}_3, \hat{e}_{3 \to 1}^1, \hat{e}_{3 \to 4}^1, \hat{e}_{3 \to 1}^2, \hat{e}_{3 \to 4}^2)$. The state transition rule is of the form $[\tilde{v}, \tilde{e}] = f([v, e], [\hat{v}, \hat{e}])$ where
$$\tilde{v}_3 = v_3 + \hat{v}_3,$$
$$\tilde{e}_3^1 = e_3^1 + \hat{v}_3 - \hat{e}_{3 \to 1}^1 - \hat{e}_{3 \to 4}^1 + \hat{e}_{1 \to 3}^1 + \hat{e}_{4 \to 3}^1,$$
$$\tilde{e}_3^2 = e_3^2 - \hat{v}_3 - \hat{e}_{3 \to 1}^2 - \hat{e}_{3 \to 4}^2 + \hat{e}_{1 \to 3}^2 + \hat{e}_{4 \to 3}^2.$$
The local cost function of agent 3 is of the form
$$J_3([v, e], [\hat{v}, \hat{e}]) = \|\tilde{v}_3 - \tilde{v}_1\|^2 + \|\tilde{v}_3 - \tilde{v}_4\|^2 + \mu \sum_{k=1,2} \Big(\big[\max\big(0, \tilde{e}_3^k\big)\big]^2 + \big[\max\big(0, \tilde{e}_1^k\big)\big]^2 + \big[\max\big(0, \tilde{e}_4^k\big)\big]^2\Big).$$
Notice that the above design methodology does not depend on the initial value profile or the communication graph, which was a requirement stated in Section III-B. Figure 2 shows the simulation results for the two different settings obtained by applying the state based game design using exterior penalty functions (Section IV-B) and the gradient play algorithm (Section V-B). In the simulation, we set $\mu = 1$ and a fixed step size $\epsilon$. The simulation demonstrates that using the state based game design we can achieve average consensus for both settings. Furthermore, the agents achieve the correct estimates of the coupled constraints.

2) Analytical Properties of Pairwise Decomposable Objective Functions: The simulation results in the previous section highlight that the control design obtained using the state based game design with exterior penalty functions, in conjunction with the gradient play learning algorithm, yields the exact solutions to the consensus problem in (4). However, Theorem 2 suggests that in order to attain such guarantees we need to let $\mu \to \infty$.
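A miniature version of this experiment can be reproduced with the updates (21)-(22). The sketch below uses a hypothetical three-agent path graph with initial values $(0, 3, 9)$ and the equality constraint split into two inequalities as above; the parameters $\mu$, $\epsilon$ and the iteration count are illustrative choices of ours, not the paper's, and $V_i = \mathbb{R}$ so the projection in (21) is trivial.

```python
# Penalty-based design (Section IV-B) with gradient play (21)-(22) on a
# 3-agent path graph; the average of v(0) = (0, 3, 9) is 4.
mu, eps, T = 1.0, 0.02, 20000
nbrs = {0: [1], 1: [0, 2], 2: [1]}
v = [0.0, 3.0, 9.0]
c = sum(v)                                    # conserved quantity: 12
A = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]     # A[k][i] for the two inequalities
C = [c, -c]
# local estimates, initialized so that sum_i e_i^k = sum_i A_i^k v_i - C^k
e = [[A[k][i] * v[i] - C[k] / 3.0 for i in range(3)] for k in range(2)]

for _ in range(T):
    # update (21), with d(phi)/d(v_i) for phi(v) = sum_i sum_{j in N_i} (v_i - v_j)^2
    vhat = [-eps * (4.0 * sum(v[i] - v[j] for j in nbrs[i])
                    + 2.0 * mu * sum(A[k][i] * max(0.0, e[k][i]) for k in range(2)))
            for i in range(3)]
    # update (22): pairwise estimate exchanges along communication links
    ehat = {(k, i, j): eps * 2.0 * mu * (max(0.0, e[k][i]) - max(0.0, e[k][j]))
            for k in range(2) for i in range(3) for j in nbrs[i]}
    for k in range(2):                        # state transition for the estimates
        for i in range(3):
            out = sum(ehat[(k, i, j)] for j in nbrs[i])
            inc = sum(ehat[(k, j, i)] for j in nbrs[i])
            e[k][i] += A[k][i] * vhat[i] - out + inc
    v = [v[i] + vhat[i] for i in range(3)]

print([round(x, 3) for x in v])               # all three values approach the average 4.0
```

Consistent with Proposition 2 below, the fixed $\mu = 1$ suffices here because the consensus objective is pairwise decomposable and the constraint is a single equality.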
The following proposition demonstrates that if the objective function has a special pairwise decomposable structure, such as the objective function in the consensus problem, we can achieve the optimal solution of the original optimization problem in (1) with any fixed $\mu > 0$. We present the proof of the following proposition in the appendix.

Proposition 2: Consider the state based game depicted in Section IV-B with a fixed trade-off parameter $\mu > 0$. Suppose the following conditions are satisfied:
Fig. 2. Convergence of gradient play for the state based game design using exterior penalty functions for consensus problems (a) and (b) in Figure 1, respectively. The left figures show the convergence of the values $v(t)$, demonstrating that all the agents asymptotically reach the average consensus point. The right figures show the convergence of the first estimation terms $e_i^1(t)$ for all agents $i \in N$. Recall that $e_i^1(t) = 0$ for all $i \in N$ implies that the coupled constraint is satisfied.

1) The value set for any agent $i \in N$ is $V_i = \mathbb{R}$.
2) The objective function $\phi$ is convex, differentiable, and pairwise decomposable, i.e.,
$$\phi(v) = \sum_{i \in N} \sum_{j \in N_i} d_{ij}(v_j - v_i) \tag{25}$$
where $d_{ij}(v_j - v_i) = d_{ji}(v_i - v_j)$ for any two agents $i, j \in N$.
3) The constraint is a single equality constraint $\sum_{i \in N} A_i v_i = c$ where $\sum_{i \in N} A_i \neq 0$.
4) The communication graph is undirected and connected.

The state action pair $[x, a] = [(v, e), (\hat{v}, \hat{e})]$ is a stationary state Nash equilibrium if and only if the following four conditions are satisfied:

(i) The value profile $v$ is an optimal point of the constrained optimization problem (1).
(ii) The estimation profile $e$ satisfies $e_i^k = 0$ for all agents $i \in N$ and constraints $k = 1, 2$.^{12}
(iii) The change in value profile $\hat{v}$ satisfies $\hat{v}_i = 0$ for all agents $i \in N$.
(iv) The net change in estimation profile $\hat{e}$ is $0$, i.e., $\hat{e}_{i,\text{in}}^k - \hat{e}_{i,\text{out}}^k = 0$ for all agents $i \in N$ and constraints $k \in M$.

3) A More Complex Example: We simulated a two-dimensional average consensus problem with 61 agents, and the results are presented in Figure 3.^{13} In the simulation, we set $\mu = 1$ and step size $\epsilon = 0.2$. The figure on the left shows the initial conditions, i.e., the agents, the communication graph, the initial values, and the average of the initial values. The figure on the right illustrates the convergence of the dynamics attained by applying the state based game design proposed in Section IV-B in conjunction with the gradient play dynamics in Section V-B.

Figure 4 demonstrates the evolution of $\Phi(x(t), a(t))$, as defined in (14), during the gradient play learning process. Note that $\Phi(x(t), a(t))$ converges to $0$ very quickly. Moreover, the evolution of the individual components of the potential function $\Phi(\cdot)$ which pertain to the value and estimation terms respectively, $\Phi^{v}(x(t), a(t)) = \phi(\tilde{v})$ and $\Phi^{e}(x(t), a(t)) = \mu \sum_{i \in N} \sum_{k=1}^{2} \big[\max\big(0, \tilde{e}_i^k\big)\big]^2$, are also plotted in Figure 4. Since the agents are initially dispersed, the magnitude of $\Phi(x(t), a(t))$ is primarily caused by the value component $\Phi^{v}(x(t), a(t))$. Gradually the agents' values move closer together and the estimation cost $\Phi^{e}(x(t), a(t))$ becomes the main contributor to $\Phi(\cdot)$. This plot demonstrates that agents are usually responding to inaccurate estimates of the value profile; however, these inaccuracies do not impact the limiting behavior of the dynamics.

B. Power Control

Consider an optical channel problem where there is a set of channels $N = \{1, \ldots, n\}$ that are transmitted across an optical fiber by wavelength-multiplexing [10].
The system level objective is to minimize the total power while ensuring that the optical signal-to-noise ratio (OSNR) is larger than a specified threshold. For any channel $i \in N$, let $v_i$ be the optical power at the input, $p_i$ be the optical power at the output, and $n_i^{\text{in}}$ and $n_i^{\text{out}}$ be the noise at the input and output. The OSNR of channel $i \in N$ at the receiver is defined as
$$\text{OSNR}_i = \frac{p_i}{n_i^{\text{out}}} = \frac{\Gamma_{ii} v_i}{n_i^{\text{in}} + \sum_{j \neq i} \Gamma_{ij} v_j}$$
where

^{12} Note that a single equality constraint is represented by two inequality constraints.
^{13} All results contained in this paper immediately extend to multi-dimensional optimization problems, i.e., $d_i \ge 1$.
Fig. 3. The left figure shows the communication graph: green nodes represent agents and their positions represent their initial values $v_i(0)$, which are two dimensional. Blue lines represent the communication links and the red node is the (weighted) average. The right figure demonstrates that the dynamics converge. The XY plane highlights the value of each agent while the Z axis shows the time step. Each curve is a trajectory of one agent's value $v_i(t)$.

Fig. 4. Dynamics of $\Phi(x(t), a(t))$, $\Phi^{v}(x(t), a(t))$, and $\Phi^{e}(x(t), a(t))$ for the two-dimensional average consensus problem.

$\Gamma = [\Gamma_{ij}]$ is a system matrix. We consider a power control problem for this channel, which is defined as:
$$\min_{v_i \ge 0} \; \sum_{i \in N} v_i \quad \text{s.t.} \quad \frac{\Gamma_{ii} v_i}{n_i^{\text{in}} + \sum_{j \neq i} \Gamma_{ij} v_j} \ge r_i, \; \forall i \in N \tag{26}$$
where $r_i$ is the threshold for channel $i$'s OSNR. It is straightforward to observe that the constraints can be transformed into the following linear constraints:
$$r_i n_i^{\text{in}} + r_i \sum_{j \neq i} \Gamma_{ij} v_j - \Gamma_{ii} v_i \le 0, \quad \forall i \in N. \tag{27}$$
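The equivalence between the OSNR constraints in (26) and the linear constraints (27) is easy to check numerically. The instance below ($\Gamma$, $r$, $n^{\text{in}}$) is made up for illustration and is not the paper's four-channel setup.

```python
# Check OSNR_i >= r_i  <=>  r_i*n_i + r_i*sum_{j!=i} Gamma_ij*v_j - Gamma_ii*v_i <= 0
# on a hypothetical 2-channel instance (the interference term is positive, so
# multiplying through by it preserves the inequality direction).
n_in = [0.5, 0.5]
r = [2.0, 3.0]
Gamma = [[4.0, 0.3],
         [0.2, 5.0]]

def osnr(v, i):
    interference = n_in[i] + sum(Gamma[i][j] * v[j] for j in range(2) if j != i)
    return Gamma[i][i] * v[i] / interference

def linear_form(v, i):          # left-hand side of the linear constraint (27)
    return (r[i] * n_in[i]
            + r[i] * sum(Gamma[i][j] * v[j] for j in range(2) if j != i)
            - Gamma[i][i] * v[i])

for v in ([1.0, 1.0], [0.1, 0.2], [2.0, 0.05]):
    for i in range(2):
        assert (osnr(v, i) >= r[i]) == (linear_form(v, i) <= 0.0)
print("equivalence holds on all sampled points")
```

The linear form makes the feasible region a polyhedron, which is what allows the linear programming comparison reported below.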
Moreover, the objective function satisfies (3) with $M_i(v_i) = v_i$. Therefore, we can use the state based game designs depicted in Section IV-B and Section IV-C to solve this channel problem. We consider the following setup with $n = 4$ channels, input noise $n_i^{\text{in}} = 0.5$, OSNR thresholds $r = (4, 5, 3, 2)$, and a given system matrix $\Gamma$ and communication graph.

Figure 5 shows the simulation of the power control problem using the state based game design incorporating exterior penalty functions and the learning algorithm gradient play. We set $\mu = 50$ and a fixed step size $\epsilon$. Figure 5 illustrates that the value profile $v$ converges to a point close to the optimal result, which was obtained using linear programming.

Fig. 5. Power control game design: trajectory of each agent's power level $v_i$ using the gradient play algorithm with the state based game design using exterior penalty functions.

Figure 6 shows the simulation of the power control problem using the state based game design incorporating barrier functions and the learning algorithm gradient play. We set $\mu = 0.05$ and a fixed step size $\epsilon$. Figure 6 illustrates that the value profile $v$ converges to $(v_1, \ldots, v_4) = (14.30, 19.77, 12.98, 6.68)$, which is close to the optimal result. Furthermore, note that the constraints are always satisfied when using the gradient play algorithm for this setting.
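The role of the backtracking factor $\beta(t)$ in (24), namely preserving strict feasibility at every time step, can be sketched on a toy scalar barrier problem of our own (not the power control setup): minimize $v$ subject to $1 - v < 0$, with barrier objective $F(v) = v - \mu \log(v - 1)$. The step size is deliberately large so that the backtracking actually engages; the check is on feasibility, not on convergence.

```python
# Backtracking rule (24) in miniature: scale the gradient step by beta = (1/2)^k
# for the smallest k that keeps the iterate strictly feasible (v > 1).
mu, eps = 0.01, 0.3
v = 5.0                                    # strictly feasible start
triggered = 0                              # counts step halvings
for _ in range(200):
    step = eps * (1.0 - mu / (v - 1.0))    # eps * F'(v)
    beta = 1.0
    while v - beta * step <= 1.0:          # halve until the candidate move is feasible
        beta *= 0.5
        triggered += 1
    v -= beta * step
    assert v > 1.0                         # the constraint holds at every time step

print("still feasible:", v > 1.0, "| backtracking engaged:", triggered > 0)
```

With this deliberately aggressive step size the iterate chatters near the boundary rather than settling, yet it never leaves the feasible region, which mirrors the observation above that the constraints are always satisfied along the barrier-based trajectory.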
More informationDecentralized Quasi-Newton Methods
1 Decentralized Quasi-Newton Methods Mark Eisen, Aryan Mokhtari, and Alejandro Ribeiro Abstract We introduce the decentralized Broyden-Fletcher- Goldfarb-Shanno (D-BFGS) method as a variation of the BFGS
More informationA Novel Iterative Convex Approximation Method
1 A Novel Iterative Convex Approximation Method Yang Yang and Marius Pesavento Abstract In this paper, we propose a novel iterative convex approximation algorithm to efficiently compute stationary points
More informationDefinitions and Proofs
Giving Advice vs. Making Decisions: Transparency, Information, and Delegation Online Appendix A Definitions and Proofs A. The Informational Environment The set of states of nature is denoted by = [, ],
More informationConsensus-Based Distributed Optimization with Malicious Nodes
Consensus-Based Distributed Optimization with Malicious Nodes Shreyas Sundaram Bahman Gharesifard Abstract We investigate the vulnerabilities of consensusbased distributed optimization protocols to nodes
More informationNash Equilibria for Combined Flow Control and Routing in Networks: Asymptotic Behavior for a Large Number of Users
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 47, NO. 6, JUNE 2002 917 Nash Equilibria for Combined Flow Control and Routing in Networks: Asymptotic Behavior for a Large Number of Users Eitan Altman, Senior
More informationCS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash
CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash Equilibrium Price of Stability Coping With NP-Hardness
More informationPrimal/Dual Decomposition Methods
Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients
More informationA Price-Based Approach for Controlling Networked Distributed Energy Resources
A Price-Based Approach for Controlling Networked Distributed Energy Resources Alejandro D. Domínguez-García (joint work with Bahman Gharesifard and Tamer Başar) Coordinated Science Laboratory Department
More informationUses of duality. Geoff Gordon & Ryan Tibshirani Optimization /
Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear
More informationExponential Moving Average Based Multiagent Reinforcement Learning Algorithms
Exponential Moving Average Based Multiagent Reinforcement Learning Algorithms Mostafa D. Awheda Department of Systems and Computer Engineering Carleton University Ottawa, Canada KS 5B6 Email: mawheda@sce.carleton.ca
More informationOn Scalable Coding in the Presence of Decoder Side Information
On Scalable Coding in the Presence of Decoder Side Information Emrah Akyol, Urbashi Mitra Dep. of Electrical Eng. USC, CA, US Email: {eakyol, ubli}@usc.edu Ertem Tuncel Dep. of Electrical Eng. UC Riverside,
More information5 Handling Constraints
5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest
More informationDecentralized Stabilization of Heterogeneous Linear Multi-Agent Systems
1 Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems Mauro Franceschelli, Andrea Gasparri, Alessandro Giua, and Giovanni Ulivi Abstract In this paper the formation stabilization problem
More informationGame Theoretic Approach to Power Control in Cellular CDMA
Game Theoretic Approach to Power Control in Cellular CDMA Sarma Gunturi Texas Instruments(India) Bangalore - 56 7, INDIA Email : gssarma@ticom Fernando Paganini Electrical Engineering Department University
More informationarxiv: v1 [cs.dc] 4 Oct 2018
Distributed Reconfiguration of Maximal Independent Sets Keren Censor-Hillel 1 and Mikael Rabie 2 1 Department of Computer Science, Technion, Israel, ckeren@cs.technion.ac.il 2 Aalto University, Helsinki,
More informationPrioritized Sweeping Converges to the Optimal Value Function
Technical Report DCS-TR-631 Prioritized Sweeping Converges to the Optimal Value Function Lihong Li and Michael L. Littman {lihong,mlittman}@cs.rutgers.edu RL 3 Laboratory Department of Computer Science
More information1 Lattices and Tarski s Theorem
MS&E 336 Lecture 8: Supermodular games Ramesh Johari April 30, 2007 In this lecture, we develop the theory of supermodular games; key references are the papers of Topkis [7], Vives [8], and Milgrom and
More informationAsymptotics, asynchrony, and asymmetry in distributed consensus
DANCES Seminar 1 / Asymptotics, asynchrony, and asymmetry in distributed consensus Anand D. Information Theory and Applications Center University of California, San Diego 9 March 011 Joint work with Alex
More informationSelf-stabilizing uncoupled dynamics
Self-stabilizing uncoupled dynamics Aaron D. Jaggard 1, Neil Lutz 2, Michael Schapira 3, and Rebecca N. Wright 4 1 U.S. Naval Research Laboratory, Washington, DC 20375, USA. aaron.jaggard@nrl.navy.mil
More informationOptimization and Stability of TCP/IP with Delay-Sensitive Utility Functions
Optimization and Stability of TCP/IP with Delay-Sensitive Utility Functions Thesis by John Pongsajapan In Partial Fulfillment of the Requirements for the Degree of Master of Science California Institute
More informationAN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN
AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN A Thesis Presented to The Academic Faculty by Bryan Larish In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy
More informationThe Multi-Agent Rendezvous Problem - The Asynchronous Case
43rd IEEE Conference on Decision and Control December 14-17, 2004 Atlantis, Paradise Island, Bahamas WeB03.3 The Multi-Agent Rendezvous Problem - The Asynchronous Case J. Lin and A.S. Morse Yale University
More informationTradable Permits for System-Optimized Networks. Anna Nagurney Isenberg School of Management University of Massachusetts Amherst, MA 01003
Tradable Permits for System-Optimized Networks Anna Nagurney Isenberg School of Management University of Massachusetts Amherst, MA 01003 c 2002 Introduction In this lecture, I return to the policy mechanism
More information1520 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 9, SEPTEMBER Reza Olfati-Saber, Member, IEEE, and Richard M. Murray, Member, IEEE
1520 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 9, SEPTEMBER 2004 Consensus Problems in Networks of Agents With Switching Topology and Time-Delays Reza Olfati-Saber, Member, IEEE, and Richard
More informationLinear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space
Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................
More informationDistributed Learning based on Entropy-Driven Game Dynamics
Distributed Learning based on Entropy-Driven Game Dynamics Bruno Gaujal joint work with Pierre Coucheney and Panayotis Mertikopoulos Inria Aug., 2014 Model Shared resource systems (network, processors)
More informationOn Backward Product of Stochastic Matrices
On Backward Product of Stochastic Matrices Behrouz Touri and Angelia Nedić 1 Abstract We study the ergodicity of backward product of stochastic and doubly stochastic matrices by introducing the concept
More informationInternet Monetization
Internet Monetization March May, 2013 Discrete time Finite A decision process (MDP) is reward process with decisions. It models an environment in which all states are and time is divided into stages. Definition
More informationStructural and Multidisciplinary Optimization. P. Duysinx and P. Tossings
Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be
More informationMicroeconomic Algorithms for Flow Control in Virtual Circuit Networks (Subset in Infocom 1989)
Microeconomic Algorithms for Flow Control in Virtual Circuit Networks (Subset in Infocom 1989) September 13th, 1995 Donald Ferguson*,** Christos Nikolaou* Yechiam Yemini** *IBM T.J. Watson Research Center
More informationSupplementary appendix to the paper Hierarchical cheap talk Not for publication
Supplementary appendix to the paper Hierarchical cheap talk Not for publication Attila Ambrus, Eduardo M. Azevedo, and Yuichiro Kamada December 3, 011 1 Monotonicity of the set of pure-strategy equilibria
More informationResearch on Consistency Problem of Network Multi-agent Car System with State Predictor
International Core Journal of Engineering Vol. No.9 06 ISSN: 44-895 Research on Consistency Problem of Network Multi-agent Car System with State Predictor Yue Duan a, Linli Zhou b and Yue Wu c Institute
More informationOn the Approximate Linear Programming Approach for Network Revenue Management Problems
On the Approximate Linear Programming Approach for Network Revenue Management Problems Chaoxu Tong School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,
More informationThis Dissertation. entitled. Event-triggered distributed algorithms for network optimization. Pu Wan
This Dissertation entitled Event-triggered distributed algorithms for network optimization typeset with nddiss2ε v3.0 (2005/07/27) on November 30, 2009 for Pu Wan This L A TEX2ε classfile conforms to the
More informationQ-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
MATHEMATICS OF OPERATIONS RESEARCH Vol. 37, No. 1, February 2012, pp. 66 94 ISSN 0364-765X (print) ISSN 1526-5471 (online) http://dx.doi.org/10.1287/moor.1110.0532 2012 INFORMS Q-Learning and Enhanced
More informationExact and Approximate Equilibria for Optimal Group Network Formation
Noname manuscript No. will be inserted by the editor) Exact and Approximate Equilibria for Optimal Group Network Formation Elliot Anshelevich Bugra Caskurlu Received: December 2009 / Accepted: Abstract
More informationIterative Encoder-Controller Design for Feedback Control Over Noisy Channels
IEEE TRANSACTIONS ON AUTOMATIC CONTROL 1 Iterative Encoder-Controller Design for Feedback Control Over Noisy Channels Lei Bao, Member, IEEE, Mikael Skoglund, Senior Member, IEEE, and Karl Henrik Johansson,
More informationRobust Power Control of Optical Networks with Time-Delays. Nemanja Stefanovic
Robust Power Control of Optical Networks with Time-Delays by Nemanja Stefanovic A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Electrical
More informationPenalty and Barrier Methods. So we again build on our unconstrained algorithms, but in a different way.
AMSC 607 / CMSC 878o Advanced Numerical Optimization Fall 2008 UNIT 3: Constrained Optimization PART 3: Penalty and Barrier Methods Dianne P. O Leary c 2008 Reference: N&S Chapter 16 Penalty and Barrier
More informationCoevolutionary Modeling in Networks 1/39
Coevolutionary Modeling in Networks Jeff S. Shamma joint work with Ibrahim Al-Shyoukh & Georgios Chasparis & IMA Workshop on Analysis and Control of Network Dynamics October 19 23, 2015 Jeff S. Shamma
More informationA Review of Linear Programming
A Review of Linear Programming Instructor: Farid Alizadeh IEOR 4600y Spring 2001 February 14, 2001 1 Overview In this note we review the basic properties of linear programming including the primal simplex
More informationMulti-Agent Learning with Policy Prediction
Multi-Agent Learning with Policy Prediction Chongjie Zhang Computer Science Department University of Massachusetts Amherst, MA 3 USA chongjie@cs.umass.edu Victor Lesser Computer Science Department University
More informationDEPARTMENT OF STATISTICS AND OPERATIONS RESEARCH OPERATIONS RESEARCH DETERMINISTIC QUALIFYING EXAMINATION. Part I: Short Questions
DEPARTMENT OF STATISTICS AND OPERATIONS RESEARCH OPERATIONS RESEARCH DETERMINISTIC QUALIFYING EXAMINATION Part I: Short Questions August 12, 2008 9:00 am - 12 pm General Instructions This examination is
More informationInteger Programming ISE 418. Lecture 8. Dr. Ted Ralphs
Integer Programming ISE 418 Lecture 8 Dr. Ted Ralphs ISE 418 Lecture 8 1 Reading for This Lecture Wolsey Chapter 2 Nemhauser and Wolsey Sections II.3.1, II.3.6, II.4.1, II.4.2, II.5.4 Duality for Mixed-Integer
More informationA Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty
2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of
More informationZeno-free, distributed event-triggered communication and control for multi-agent average consensus
Zeno-free, distributed event-triggered communication and control for multi-agent average consensus Cameron Nowzari Jorge Cortés Abstract This paper studies a distributed event-triggered communication and
More information