
Nash Equilibrium Seeking with Output Regulation

by

Andrew R. Romano

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

The Edward S. Rogers Sr. Department of Electrical & Computer Engineering
University of Toronto

© Copyright 2018 by Andrew R. Romano

Abstract

Nash Equilibrium Seeking with Output Regulation

Andrew R. Romano
Master of Applied Science
The Edward S. Rogers Sr. Department of Electrical & Computer Engineering
University of Toronto
2018

We consider the problem of Nash equilibrium (NE) seeking for games whose agents are subject to exogenous signals. Using the framework of output regulation, we design dynamic gradient-play feedback control laws that incorporate internal models of the exogenous signals. We start by considering linear agents with linear exosignals under full information. We then incorporate a Laplacian-based consensus algorithm to handle the case of partial information. Finally, we extend this framework to a class of nonlinear agents subject to nonlinear exosignals.

Acknowledgements

I would first like to thank my supervisor, Professor Lacra Pavel, for her invaluable help during this process. Her intuition and level of mathematical rigour have helped me through much of my research. I thank my fellow collaborators, Bolin Gao, Dian Gadjov, Farzad Salehisadaghiani, Peng Yi and Mathieu Sylvestre, for always being there to talk through problems when I needed it. I have honestly learned so much from you guys. I thank my friends and family for providing support over the years of my studies. My parents, Richard and Paula, are the most loving and caring people I have ever known. My sister, Amy, has always been there for me and has provided so much love and support this past year. Finally, I would like to thank my loving partner Meghan for always being there for me and being the best and most supportive partner anyone could ask for.

Table of Contents

Acknowledgements
List of Figures
Index of Symbols

1 Introduction
   1.1 Motivation
   1.2 Literature Review
   1.3 Contributions
   1.4 Organization

2 Background
   2.1 Mathematical Background
      2.1.1 Definite Functions and Matrices
      2.1.2 Convex and Monotone Functions
      2.1.3 Continuity and Differentiability of Functions
   2.2 Dynamical Systems
      2.2.1 Lyapunov Stability
      2.2.2 Stability of LTI Systems
      2.2.3 Observability of LTI Systems
      2.2.4 Input-to-State Stability
      2.2.5 Small-Gain Theorem
      2.2.6 Passivity
      2.2.7 Feedback Linearization
      2.2.8 Output Regulation Problem
   2.3 Graph Theory
   2.4 Game Theory
      2.4.1 Potential Games
      2.4.2 Gradient Dynamics
      2.4.3 Partial-Information Gradient Dynamics

3 Problem Formulation
   3.1 Plants with Additive Exosignals
      3.1.1 Disturbance on the Dynamics
      3.1.2 Reference on the Action
   3.2 Full Information Games with Gradient Type Dynamics
      3.2.1 Games with Disturbances
      3.2.2 Games with References
   3.3 General Dynamics

4 Single-Integrator Plants
   4.1 Games with Disturbances
      4.1.1 Full-Information
      4.1.2 Partial-Information
   4.2 Games with References
      4.2.1 Full-Information
      4.2.2 Partial-Information

5 NE Seeking for Double Integrator Plants
   5.1 Games without Disturbances
      5.1.1 Full-Information
      5.1.2 Partial-Information
   5.2 Games with Disturbances
      5.2.1 Full-Information
      5.2.2 Partial-Information with Disturbances

6 NE Seeking for Higher Order Integrators
   6.1 Games without Disturbances
      6.1.1 Full-Information
      6.1.2 Partial-Information
   6.2 Games with Disturbances
      6.2.1 Full-Information
      6.2.2 Partial-Information

7 Potential Games
   7.1 Games with Disturbances
   7.2 Games with References
   7.3 Incrementally Passive Internal Models

8 Nonlinear Plant Models

9 Numerical Results
   9.1 Quadratic Game
      9.1.1 Full-Information
      9.1.2 Partial-Information
   9.2 OSNR Game
      9.2.1 Full-Information
      9.2.2 Partial-Information
   9.3 Sensor Networks
      9.3.1 Velocity-Actuated Robots
      9.3.2 Force-Actuated Robots
      9.3.3 Kinematic Unicycle Agents
      9.3.4 Dynamic Unicycle Agents
      9.3.5 Sensor Network with Time-Varying Target Positions

10 Conclusions
   10.1 Future Work

List of Figures

2.1 A game as the interconnection between agent dynamics
3.1 Gradient dynamics (3.1) as feedback interconnection of an integrator and the pseudo-gradient
3.2 A game with disturbance on the agent dynamics
3.3 A game with additive signal on the action of each player
3.4 Gradient dynamics (3.7) as feedback interconnection of an integrator and the pseudo-gradient map
3.5 Agent dynamics (3.9) as an error feedback output regulation problem
3.6 Agent dynamics (3.11) as an integrator output regulation problem
3.7 Agent dynamics (3.15) as an output regulation problem
3.8 Agent dynamics (3.17) as an output regulation problem
3.9 Agent dynamics (3.19) as an integrator output regulation problem
9.1 NE found by basic gradient descent algorithm (2.31)
9.2 Comparison of gradient dynamics and the algorithm given by (7.5), showing the effectiveness of the designed algorithm
9.3 Comparison of gradient dynamics and the algorithm given by (4.5), showing the effectiveness of the designed algorithm
9.4 Comparison of gradient dynamics and the algorithm given by (7.13), showing the effectiveness of the designed algorithm
9.5 Comparison of gradient dynamics and the algorithm given by (4.22), showing the effectiveness of the designed algorithm
9.6 Random communication graph, G_c
9.7 Comparison of gradient dynamics and the algorithm given by (4.11), showing the effectiveness of the designed algorithm
9.8 Comparison of gradient dynamics and the algorithm given by (4.28), showing the effectiveness of the designed algorithm
9.9 Comparison of gradient-play dynamics and the dynamics given by (4.11) for an OSNR game, showing the effectiveness of the proposed algorithm
9.10 Random communication graph, G_c
9.11 Comparison of gradient-play dynamics and the dynamics given by (4.11) for an OSNR game, showing the effectiveness of the proposed algorithm
9.12 Random communication graph, G_c
9.13 Comparison of gradient dynamics (2.31) with proposed algorithm (4.5) for single-integrator agents
9.14 Comparison of Laplacian-based algorithm (2.34) with partial-information disturbance rejection algorithm (4.11) for single-integrator agents
9.15 Comparison of multi-integrator gradient-play (5.6) with proposed algorithm (5.23) for double-integrator agents
9.16 Comparison of Laplacian-based algorithm (5.13) with partial-information disturbance rejection algorithm (5.31) for double-integrator agents
9.17 High-gain gradient-play feedback (8.19) for a group of kinematic unicycles
9.18 High-gain gradient-play feedback (8.19) for a group of dynamic unicycles
9.19 Proposed algorithm (4.22) for single-integrator agents showing tracking of the NE trajectory
9.20 Proposed algorithm (4.28) for single-integrator agents showing tracking of the NE trajectory under partial-information

Index of Symbols

$1_N$ : The $N \times 1$ vector of ones
$0$ : The zero matrix of appropriate size
$\nabla f(x)$ : The gradient of $f(x)$
$\nabla_i f(x)$ : The partial derivative of $f(x)$ with respect to $x_i$
$\lambda_i(A)$ : The $i$-th smallest eigenvalue of the matrix $A \in \mathbb{R}^{n \times n}$
$\xi_i$ : The internal model state of agent $i$
$\xi$ : The stacked internal model states of all agents
$\Sigma_i$ : The closed-loop dynamics of agent $i$
$\Sigma$ : The closed-loop dynamics of all agents
$\mathrm{blkdiag}(A_1, \dots, A_n)$ : The block diagonal matrix with $A_i \in \mathbb{R}^{n_i \times m_i}$ as the block diagonal elements
$\mathbb{C}$ : The field of complex numbers
$\mathrm{col}(x_1, \dots, x_n)$ : The stacked vector of $x_i \in \mathbb{R}^{n_i}$
$Df(x)$ : The Jacobian of $f(x)$
$\mathcal{D}_i$ : The exomodel of agent $i$
$\mathcal{D}$ : The exomodel of all agents
$\mathrm{diag}(a_1, \dots, a_n)$ : The diagonal matrix with $a_i \in \mathbb{R}$ as the diagonal elements
$E$ : The edge set of a graph $G$
$F(x)$ : The pseudo-gradient of cost function $J(x)$
$\mathbf{F}(\mathbf{x})$ : The extended pseudo-gradient of cost function $J(x)$
$G(I, J_i, \mathbb{R}^{n_i})$ : A game defined by agent set $I$, cost functions $J_i$ and action spaces $\mathbb{R}^{n_i}$
$G(I, E)$ : A graph defined by the node set $I$ and edge set $E$
$G_c$ : A communication graph
$I$ : The set of agents of a game or nodes in a graph
$J_i(x)$ : The cost function of agent $i$
$J(x)$ : The stacked vector of all agents' cost functions
$L$ : The Laplacian matrix of graph $G$
$\mathbf{L}$ : The extended Laplacian matrix of graph $G$
$N_i$ : The neighbours of node $i$ in graph $G$
NE : Nash equilibrium
$P_i$ : The plant dynamics of agent $i$
$P$ : The stacked plant dynamics of all agents
$\mathbb{R}$ : The field of real numbers
$R_i$ : The action selection matrix of agent $i$
$R$ : The action selection matrix of all agents
$S_i$ : The estimate selection matrix of agent $i$
$S$ : The estimate selection matrix of all agents
$w_i$ : The exosystem state of agent $i$
$w$ : The stacked vector of all agents' exosystem states
$x_i$ : The action of agent $i$
$x_{-i}$ : The stacked vector of the actions of all other agents except $i$
$x$ : The stacked vector of all agents' actions
$x^i$ : The estimate of agent $i$
$x^i_i$ : The action of agent $i$
$x^i_{-i}$ : Agent $i$'s estimate of all others' actions
$\mathbf{x}$ : The stacked vector of all agents' estimates
$\bar{\mathbf{x}}$ : The consensus component of all agents' estimates
$\mathbf{x}^{\perp}$ : The orthogonal-to-consensus component of all agents' estimates
$y_i$ : The output of agent $i$
$y_{-i}$ : The stacked vector of the outputs of all other agents except $i$
$y$ : The stacked vector of all agents' outputs
$y_i^m$ : The measured outputs available to agent $i$
$z_i$ : The zero dynamics of agent $i$
$z$ : The zero dynamics of all agents

Chapter 1

Introduction

Game theory has become an increasingly active field in the past few decades. It finds many applications in the modelling of interactions between independent decision makers, in areas including economics, engineering, social interactions, board games, and many more. The relevant equilibrium concept for the games considered here is the Nash equilibrium (NE): a strategy profile at which no player can do better by unilaterally changing its strategy.

A main focus of the game-theoretic literature is learning in games: as a player repeatedly plays the game, it tries to improve its outcome over time. Usually, the goal of a learning algorithm is to converge to an NE of the game. In many cases, the players of the game may be affected by external factors, outside the framework of the game, in making their decisions. Alternatively, they may have more complicated decision dynamics. For example, in a sensor network, a group of robots each try to reach some global position while maintaining a connected communication structure. In this game, the action of each robot is its position. However, a robot cannot simply choose its position; it must move by taking into account its own dynamics, e.g., controlling wheels and steering for wheeled robots. This complicates each robot's ability to play its desired action. Additionally, the robots may be affected by a strong wind that hampers their ability to move to their desired locations. This is what we call a disturbance. It is situations like these that are addressed in this thesis.

In addition to addressing these issues, we also incorporate a communication protocol between players in the game. Many situations arise in the literature and in applications where the players may not have full knowledge of the others' actions. Therefore, we use a communication protocol where each player can exchange information with a subset of the players in the game and use this information to inform its decisions.

1.1 Motivation

Although there has been much research into Nash equilibria in games, little has addressed NE seeking with output regulation. There has been some research into NE seeking with noisy feedback [1], [2]. However, these works consider stochastic noise models and do not employ the internal model framework from control theory. Instead, we look at deterministic disturbance and reference models and employ tools based on the internal model principle.

Robustness to external disturbances has been discussed in [3]. There, a time-varying pricing function that affects the cost functions of each agent is considered and robustness is investigated; however, no formal regulation algorithm is employed.

There are many scenarios in which the players of a game are affected by noise or disturbances. Examples include demand-side management in smart grids [4], feedback control for PEV charging load allocation [3], and power control for optical signal-to-noise ratio (OSNR) in the presence of pilot tones [5], among others. Yet there have been relatively few works on Nash equilibrium seeking in such settings, in the presence of dynamic disturbances, noise or uncertainties. In this thesis, we hope to open up this field and provide some useful results addressing many of these open problems.

A relevant example is the case of pilot tones in optical networks. Game theory has recently been used extensively in the optimization of OSNR in optical networks. However, these algorithms do not generally take into account the presence of pilot tones. Pilot tones are low-modulation sinusoidal signals used to transmit monitoring data across the network. They are specific to each channel and are of known frequency but unknown amplitude and phase. Because they are of low amplitude, they are ignored in the OSNR models used in the game-theoretic literature, even though they have a negative effect on the OSNR of each of the agents. By treating the pilot tones as a disturbance on each of the players in the game, we can use control-theoretic principles to cancel out their effect on the OSNR, thereby converging to the true NE of the game.

1.2 Literature Review

Our work is related to the Nash equilibrium seeking and output regulation literatures, as well as to distributed optimization. Nash equilibrium seeking has received much attention and found many applications, including wireless communication networks [6], [7], [8], [9], optical networks [10], [11], smart-grid and PEV charging [4], [12], [3], noncooperative flow control problems [13], [14] and multi-agent formation control problems [15]. The broad applicability of the field makes it useful in studying a wide array of engineering problems.

The linear multivariable output regulation problem was addressed in [16] and [17] and extended to the nonlinear output regulation problem in [18]. Since then, there has been much research on the output regulation problem, including asymptotic internal models [19], adaptive control methods [20] and high-gain feedback methods [21], [22]. Over the past decade, multi-agent control problems have come to the forefront of the field, including many multi-agent output regulation problems. Output agreement and coordination with output regulation have been addressed in [23] and [24], with disturbance rejection and tracking using the internal model principle for classes of heterogeneous nonlinear agents in [25], [26]. Networks of linear systems with dynamic edges have been addressed in [27]. The study of output regulation in multi-agent systems gives a starting point for the study of similar problems from a game-theoretic perspective, as some multi-agent problems can be viewed as special cases of game-theoretic problems. The synchronization problem, for example, can be regarded as a game where each agent's cost function corresponds to the sum of the squared distances between it and all of its neighbours. We stress, however, that NE seeking is not a subset of the multi-agent control literature, and that the synchronization problem is a particular example of a problem found in both literatures. In general, the coupling of the cost functions in a game prevents the use of the tools readily used in the multi-agent literature, e.g., passivity.

A related field is that of distributed optimization. In a distributed optimization problem, a group of agents tries to cooperatively minimize a global cost function. The algorithms used exploit the summability of the cost functions to create distributed algorithms. Optimization schemes with output regulation have been researched for single-integrator systems [28], systems with unit relative degree [29], and systems with double-integrator dynamics [30]. This framework provides good motivation for NE seeking problems. However, there are some key differences between the distributed optimization and game-theoretic set-ups. The cooperative nature of the optimization problem and the separability of the cost functions create decoupled gradients for each agent. In the game set-up, the cost functions are not decoupled; each depends on the other players' actions, which leads to coupled gradients. This creates added difficulty in designing algorithms for NE seeking.

1.3 Contributions

In this thesis, we consider a general class of N-player, continuous-kernel games with players modelled as dynamical systems with disturbance and reference signals. We develop Nash equilibrium seeking algorithms for different dynamics and prove their convergence.

1. In Theorem 4.1, an observer-based design is applied to strongly monotone games with single-integrator plant dynamics and a linear disturbance under full information. In Theorem 4.2, the results of Theorem 4.1 are combined with an estimate-based consensus algorithm to cover the case where the agents do not have full information about all others' actions. In Theorem 4.3, an observer-based NE seeking algorithm is applied to strongly monotone games with integrator plants and linear references. In Theorem 4.4, these results are extended to cover the partial-information case using an estimate-based consensus algorithm.

2. In Theorem 5.1, we introduce an NE seeking algorithm for double-integrators under full information and strict monotonicity of the pseudo-gradient. In Theorem 5.2, these results are combined with an estimate-based consensus algorithm for the partial-information case. In Theorem 5.3, an observer-based NE seeking algorithm that builds off the results of Theorem 5.1 is introduced for strongly monotone games with double-integrator plants and linear disturbances under full information. In Theorem 5.4, the consensus-type algorithm from Theorem 5.2 is augmented with an observer to handle disturbances.

3. In Theorem 6.1, we extend the gradient-descent algorithm for double-integrators to cases where the agents are modelled as higher-order chains of integrators under full information. In Theorem 6.2, these results are extended to cover the partial-information case using an estimate-based consensus algorithm. In Theorem 6.3, disturbances are considered and an observer is used in conjunction with the algorithm from Theorem 6.1. In Theorem 6.4, the results of Theorem 6.2 are combined with an observer to handle the partial-information case with disturbances.

4. In Theorem 7.1, a passivity-based framework is applied to potential games, first considering single-integrator plants with linear disturbance models; this is extended to plants with linear references in Theorem 7.2 and to nonlinear exosystems with incrementally passive internal models in Theorem 7.3. In all cases, only strict monotonicity of the pseudo-gradient is required.

5. In Theorem 8.1, a class of first-order nonlinear systems playing strongly monotone games is considered and convergence to the NE is shown under full information.

1.4 Organization

This thesis is organized as follows.

1. In Chapter 2, we provide mathematical background on linear algebra, convex and functional analysis, dynamical systems theory, linear systems theory, nonlinear control theory, graph theory and game theory.
2. In Chapter 3, we provide the problem set-up and introduce the framework used to solve the NE seeking problems defined in this thesis.
3. In Chapter 4, we provide NE seeking for single-integrator agents with disturbance rejection and reference tracking under full information, and extend these results to the partial-information case.
4. In Chapter 5, we introduce NE seeking for double-integrators. We investigate full- and partial-information algorithms and extend them to reject disturbances.
5. In Chapter 6, we generalize the results of Chapter 5 to multi-integrator plant models, including full- and partial-information algorithms with and without disturbances.
6. In Chapter 7, we provide results on output regulation for NE seeking in potential games using a passivity-based algorithm. These results hold for games with strictly monotone pseudo-gradients.
7. In Chapter 8, we propose an NE seeking algorithm for a class of relative-degree-one MIMO systems.
8. In Chapter 9, we give simulation results.
9. In Chapter 10, we provide conclusions and future work.

Chapter 2

Background

In this chapter, we provide the relevant background for the work that follows: mathematical notation and theorems, dynamical systems and stability, background on graph theory required for the information exchange over a network in a game set-up and, finally, background on game theory.

2.1 Mathematical Background

2.1.1 Definite Functions and Matrices

The notion of positive (conversely, negative) definiteness of a function is of great importance in many stability theorems for nonlinear systems.

Definition 2.1. Let $V : \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in \mathbb{R}^n$. Then, $V$ is
(i) positive definite at $x_0$ if $V(x_0) = 0$ and $V(x) > 0$, $\forall x \neq x_0$,
(ii) positive semi-definite at $x_0$ if $V(x_0) = 0$ and $V(x) \geq 0$, $\forall x \neq x_0$,
(iii) negative definite at $x_0$ if $V(x_0) = 0$ and $V(x) < 0$, $\forall x \neq x_0$,
(iv) negative semi-definite at $x_0$ if $V(x_0) = 0$ and $V(x) \leq 0$, $\forall x \neq x_0$.

We can also specialize these definitions to the class of symmetric matrices.

Definition 2.2. A square matrix $A \in \mathbb{R}^{n \times n}$ is
(i) (symmetric) positive definite if $A = A^T$ and $x^T A x > 0$, $\forall x \neq 0 \in \mathbb{R}^n$,
(ii) (symmetric) positive semi-definite if $A = A^T$ and $x^T A x \geq 0$, $\forall x \in \mathbb{R}^n$,
(iii) (symmetric) negative definite if $A = A^T$ and $x^T A x < 0$, $\forall x \neq 0 \in \mathbb{R}^n$,
(iv) (symmetric) negative semi-definite if $A = A^T$ and $x^T A x \leq 0$, $\forall x \in \mathbb{R}^n$.

Matrices that have the above properties have specific spectra, given by the following theorem.

Theorem 2.1. (Fact 3, Section 8.4, [31]) Given a square matrix $A \in \mathbb{R}^{n \times n}$, the eigenvalues of $A$ are
(i) non-negative real if and only if $A$ is positive semi-definite,
(ii) positive real if and only if $A$ is positive definite.

2.1.2 Convex and Monotone Functions

In game theory, as in optimization problems, the structure of the cost function is of great importance for the global solvability of the problem. In particular, the so-called convexity of the cost functions gives the existence of global minima and allows the use of algorithms such as gradient descent for finding them. Here, we give some useful definitions for types of convex functions.

Definition 2.3. Let $f : \mathbb{R}^n \to \mathbb{R}$. Then, $f$ is
(i) convex if $\forall x, y \in \mathbb{R}^n$, $\forall \alpha \in [0, 1]$, $f(\alpha x + (1-\alpha)y) \leq \alpha f(x) + (1-\alpha)f(y)$,
(ii) strictly convex if $\forall x, y \in \mathbb{R}^n$, $x \neq y$, $\forall \alpha \in (0, 1)$, $f(\alpha x + (1-\alpha)y) < \alpha f(x) + (1-\alpha)f(y)$,
(iii) strongly convex if there exists $\mu > 0$ such that $\forall x, y \in \mathbb{R}^n$, $\forall \alpha \in [0, 1]$, $f(\alpha x + (1-\alpha)y) \leq \alpha f(x) + (1-\alpha)f(y) - \frac{\mu}{2}\alpha(1-\alpha)\|x - y\|^2$.

Another important class of functions is the class of monotone functions. These functions extend non-decreasing functions of one variable to higher dimensions, and are important for the computation of equilibria in a game context.

Definition 2.4. Let $f : \mathbb{R}^n \to \mathbb{R}^n$. Then, $f$ is
(i) monotone if $\forall x, y \in \mathbb{R}^n$, $(x - y)^T(f(x) - f(y)) \geq 0$,
(ii) strictly monotone if $\forall x, y \in \mathbb{R}^n$, $x \neq y$, $(x - y)^T(f(x) - f(y)) > 0$,
(iii) strongly monotone if there exists $\mu > 0$ such that $\forall x, y \in \mathbb{R}^n$, $(x - y)^T(f(x) - f(y)) \geq \mu\|x - y\|^2$.

There is an important connection between convex and monotone functions: a differentiable function is convex exactly when its gradient is monotone.

Theorem 2.2. (Example 22.4, [32]) Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function. Then, $f$ is
(i) convex if and only if $\nabla f$ is monotone,
(ii) strictly convex if and only if $\nabla f$ is strictly monotone,
(iii) strongly convex if and only if $\nabla f$ is strongly monotone.

Note that this does not mean that every monotone function is the gradient of a convex one. We will see later that in most game set-ups, we have a monotone function that is not the gradient of any function.

2.1.3 Continuity and Differentiability of Functions

The final pieces of mathematical background are the notions of continuity and differentiability of functions. First, we introduce the notion of a continuous function.

Definition 2.5. Given a function $f : \mathbb{R}^n \to \mathbb{R}^m$, we say $f$ is
(i) continuous at $x$ if for every $\epsilon > 0$ there exists $\delta > 0$ such that for all $y \in \mathbb{R}^n$ with $\|x - y\| < \delta$, we have $\|f(x) - f(y)\| < \epsilon$,
(ii) continuous if it is continuous at all $x \in \mathbb{R}^n$.

A stronger form of continuity is Lipschitz continuity, which plays an important role in the existence and uniqueness of solutions to ordinary differential equations, as we will see in the next section.

Definition 2.6. Given a function $f : \mathbb{R}^n \to \mathbb{R}^m$, we say that $f$ is
(i) Lipschitz continuous at $x$ if there exist $r, L \in \mathbb{R}$ such that $\|f(y) - f(z)\| \leq L\|y - z\|$ holds for all $y, z \in \{a \in \mathbb{R}^n : \|a - x\| < r\}$,
(ii) locally Lipschitz continuous if $f$ is Lipschitz continuous at all $x \in \mathbb{R}^n$,
(iii) globally Lipschitz continuous if there exists $L \in \mathbb{R}$ such that $\|f(x) - f(y)\| \leq L\|x - y\|$ holds for all $x, y \in \mathbb{R}^n$.

The next definition is the Fréchet derivative, which extends the notion of the derivative of a real-valued function to higher dimensions.

Definition 2.7. Given a function $f : \mathbb{R}^n \to \mathbb{R}^m$ and a point $x \in \mathbb{R}^n$, $f$ is called Fréchet differentiable at $x$ if there exists a bounded linear operator $df(x) : \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{h \to 0} \frac{\|f(x+h) - f(x) - df(x)h\|}{\|h\|} = 0.$$
For $m = 1$, we call the unique vector $\nabla f(x)$ such that $df(x)h = \nabla f(x)^T h$ holds for all $h \in \mathbb{R}^n$ the gradient of $f$ at $x$. If this holds for all $x \in \mathbb{R}^n$, we say that $f$ is Fréchet differentiable, or simply differentiable.

Definition 2.8. Given a function $f : \mathbb{R}^n \to \mathbb{R}^m$, we say $f$ is
(i) continuously differentiable, or $C^1$, if it is continuous and differentiable and its derivative is continuous,
(ii) $C^k$ if it is $k$ times differentiable and its first $k$ derivatives are continuous,
(iii) smooth, or $C^\infty$, if it is infinitely many times differentiable and all of its derivatives are continuous.

2.2 Dynamical Systems

As we will show later, the NE seeking algorithms that we investigate are formulated as a coupled set of continuous-time ordinary differential equations (ODEs) on $\mathbb{R}^n$ given by
$$\dot{x} = f(x) \qquad (2.1)$$
where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field on $\mathbb{R}^n$. As such, many theorems from nonlinear dynamical systems theory will be important for proving convergence of the algorithms.
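To connect these definitions, consider a small numerical sketch (ours, added for illustration; it is not from the thesis) of the gradient flow $\dot{x} = -\nabla f(x)$ for a strongly convex quadratic $f$. By Theorem 2.2 its gradient is strongly monotone, and the flow, an instance of (2.1), converges to the unique minimizer. The matrices and values below are assumptions chosen for illustration.

```python
# Gradient flow x' = -grad f(x) for f(x) = 0.5 x^T Q x - b^T x (illustrative).
import numpy as np
from scipy.integrate import solve_ivp

Q = np.array([[3.0, 1.0], [1.0, 2.0]])     # symmetric positive definite
b = np.array([1.0, -1.0])
x_star = np.linalg.solve(Q, b)             # unique minimizer: grad f(x*) = 0

def grad_flow(t, x):
    return -(Q @ x - b)                    # vector field of the ODE (2.1)

x0 = np.array([5.0, -4.0])
sol = solve_ivp(grad_flow, (0.0, 10.0), x0, rtol=1e-8)
print(np.linalg.norm(sol.y[:, -1] - x_star))   # close to 0: x(t) -> x*
```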

2.2.1 Lyapunov Stability

We consider an equilibrium point $\bar{x}$ of (2.1), i.e., $f(\bar{x}) = 0$. This definition is necessary because the first step in showing convergence of our algorithms to the equilibrium of a game will be to show that the equilibrium point of the agent dynamics is also the equilibrium of the game. We next characterize stability properties of the equilibrium point.

Definition 2.9. (Definition 4.1, [33]) The equilibrium point $x = \bar{x}$ of (2.1) is
(i) stable if, for each $\epsilon > 0$, there exists $\delta > 0$ such that if $\|x(0) - \bar{x}\| < \delta$, then $\|x(t) - \bar{x}\| < \epsilon$, $\forall t > 0$,
(ii) unstable if it is not stable,
(iii) asymptotically stable if it is stable and $\delta$ can be chosen such that if $\|x(0) - \bar{x}\| < \delta$, then $\lim_{t \to \infty} x(t) = \bar{x}$. Moreover, if $\delta$ can be taken arbitrarily large ($\delta = \infty$), we say that it is globally asymptotically stable.

Once the equilibrium point is determined, we will need to show that it is asymptotically stable. The following theorems give sufficient conditions for the stability of an equilibrium point of (2.1). The first gives sufficient conditions for local stability.

Theorem 2.3. (Theorem 4.1, [33]) Let $x = \bar{x}$ be an equilibrium point for (2.1) and let $D \subset \mathbb{R}^n$ be a domain containing $\bar{x}$. Let $V : D \to \mathbb{R}$ be a continuously differentiable function such that
(i) $V(x)$ is positive definite at $x = \bar{x}$,
(ii) $\dot{V}(x)$ is negative semi-definite at $x = \bar{x}$.
Then, $x = \bar{x}$ is a stable equilibrium point for (2.1). Moreover, if $\dot{V}(x)$ is negative definite at $x = \bar{x}$, then $x = \bar{x}$ is an asymptotically stable equilibrium point for (2.1).

By imposing additional properties on the Lyapunov function, we can extend these results to global ones.

Theorem 2.4. (Theorem 4.2, [33]) Let $x = \bar{x}$ be an equilibrium point for (2.1). Let $V : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function such that
(i) $V(x)$ is positive definite at $x = \bar{x}$,
(ii) $\dot{V}(x)$ is negative definite at $x = \bar{x}$,
(iii) $V(x)$ is radially unbounded in $x$.
Then, $x = \bar{x}$ is globally asymptotically stable for (2.1).

In many cases, the negative definiteness condition is not fulfilled; however, this does not mean that the equilibrium is not asymptotically stable. We can use LaSalle's invariance theorem to handle some of these cases.

Theorem 2.5. (Theorem 4.4, [33]) Let $\Omega \subset D$ be a compact set that is positively invariant for (2.1). Let $V : D \to \mathbb{R}$ be a continuously differentiable function such that $\dot{V}(x) \leq 0$ in $\Omega$. Let $E$ be the set of all points in $\Omega$ where $\dot{V}(x) = 0$, and let $M$ be the largest invariant set in $E$. Then every solution starting in $\Omega$ approaches $M$ as $t \to \infty$.
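As a concrete illustration of Theorem 2.5, consider the standard damped-pendulum example (a textbook example in the spirit of [33], added here for illustration):
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -\sin x_1 - x_2$$
with $V(x) = (1 - \cos x_1) + \tfrac{1}{2}x_2^2$. Then
$$\dot{V}(x) = \sin(x_1)\,x_2 + x_2(-\sin x_1 - x_2) = -x_2^2 \leq 0$$
is only negative semi-definite, so Theorem 2.3 gives stability but not asymptotic stability. On $E = \{x \in \Omega : x_2 = 0\}$, any solution remaining in $E$ must satisfy $\dot{x}_2 = -\sin x_1 = 0$, so for a sufficiently small sublevel set $\Omega$ of $V$ around the origin the largest invariant set is $M = \{(0, 0)\}$, and Theorem 2.5 gives asymptotic stability of the origin.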

2.2.2 Stability of LTI Systems

Consider now a linear time-invariant (LTI) dynamical system given by
$$\dot{x} = Ax \qquad (2.2)$$
All of the stability properties presented in the previous section apply to LTI systems as well. In addition, the stability of (2.2) is determined exactly by the eigenvalues of $A$.

Theorem 2.6. (Theorem 4.5, [33]) The equilibrium point $x = 0$ of $\dot{x} = Ax$ is stable if and only if all eigenvalues of $A$ satisfy $\mathrm{Re}(\lambda_i) \leq 0$ and, for every eigenvalue with $\mathrm{Re}(\lambda_i) = 0$ and algebraic multiplicity $q_i \geq 2$, $\mathrm{rank}(A - \lambda_i I) = n - q_i$, where $n$ is the dimension of $x$. The equilibrium point $x = 0$ is globally asymptotically stable if and only if all eigenvalues of $A$ satisfy $\mathrm{Re}(\lambda_i) < 0$.

Given (2.2) satisfying the requirements of Theorem 2.6, we can easily determine a Lyapunov function for the system.

Theorem 2.7. (Theorem 4.6, [33]) A matrix $A$ is Hurwitz, that is, $\mathrm{Re}(\lambda_i) < 0$ for all eigenvalues of $A$, if and only if, for any given positive definite symmetric matrix $Q$, there exists a unique positive definite symmetric matrix $P$ that satisfies
$$PA + A^T P = -Q$$

2.2.3 Observability of LTI Systems

In this section, we consider LTI systems with input $u \in \mathbb{R}^m$ and output $y \in \mathbb{R}^p$,
$$\dot{x} = Ax + Bu, \qquad y = Cx + Du \qquad (2.3)$$
A major property of (2.3) is observability: given an unknown initial condition and a known input $u$, can we uniquely determine $x(t)$ using only measurements of $y$?

Definition 2.10. We say that (2.3), or the pair $(A, C)$, is observable if for any initial condition $x(0) = x_0$ there exists a finite time $t > 0$ such that the initial state $x_0$ can be uniquely determined using only knowledge of the input $u$ and the output $y$ over the interval $[0, t]$.

This is hard to check in practice. However, a rank condition determines whether an LTI system is observable.

Theorem 2.8. (Theorem 6.O1, [34]) The pair $(A, C)$ is observable if and only if $\mathrm{rank}(Q_o) = n$, where
$$Q_o = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} \qquad (2.4)$$
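A small numerical check of Theorems 2.6-2.8 (our illustrative sketch; the matrices are assumptions, not from the thesis):

```python
# Hurwitz test, the Lyapunov equation P A + A^T P = -Q, and the
# observability rank condition (2.4), verified numerically.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

print(np.linalg.eigvals(A))              # both eigenvalues have Re < 0: Hurwitz

Q = np.eye(n)
# solve_continuous_lyapunov(a, q) solves a X + X a^H = q; with a = A^T and
# q = -Q it returns P satisfying A^T P + P A = -Q, i.e. P A + A^T P = -Q.
P = solve_continuous_lyapunov(A.T, -Q)
print(np.linalg.eigvals(P))              # P is positive definite

Qo = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
print(np.linalg.matrix_rank(Qo) == n)    # True: (A, C) is observable
```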

To construct an observer for (2.3), we create a system that has the same dynamics as our plant, but with an extra error-driving term:
$$\dot{\hat{x}} = A\hat{x} + Bu + L(y - \hat{y}), \qquad \hat{y} = C\hat{x} + Du \qquad (2.5)$$
Since the goal is to have the estimate converge to the actual value, we need $\hat{x} \to x$. If we define the error as $e := x - \hat{x}$ and take its time derivative, we get
$$\dot{e} = (A - LC)e \qquad (2.6)$$
In order for the error to go to zero, we need the eigenvalues of $A - LC$ to have negative real parts. The following theorem gives a necessary and sufficient condition for when this can be done.

Theorem 2.9. (Theorems 6.5 and 8.M3, [34]) The eigenvalues of $A - LC$ can be placed arbitrarily in the complex plane if and only if the pair $(A, C)$ is observable.

2.2.4 Input-to-State Stability

The Lyapunov stability results from the previous sections apply only to systems with no input, $u \equiv 0$, or to systems with state-feedback inputs that have been resolved into the form (2.1); they do not apply to systems with general inputs. In linear systems theory, any asymptotically stable system has a bounded state for any bounded input. However, for general nonlinear systems of the form
$$\dot{x} = f(x, u) \qquad (2.7)$$
this is not the case. In fact, (2.7) may be globally asymptotically stable with $u \equiv 0$, and yet a bounded nonzero input $u$ may make the state unbounded. It is therefore necessary to investigate properties of (2.7) that guarantee that the solutions remain bounded for any bounded input. To do this, we must first introduce the concept of a comparison function. These functions will be useful in discussing the extended stability results.

Definition 2.11. A continuous function $\alpha : [0, a) \to [0, \infty)$ is said to belong to class $\mathcal{K}$ if it is strictly increasing and $\alpha(0) = 0$.

Definition 2.12. A continuous function $\alpha : [0, a) \to [0, \infty)$ is said to belong to class $\mathcal{K}_\infty$ if it belongs to class $\mathcal{K}$, $a = \infty$ and $\lim_{r \to \infty} \alpha(r) = \infty$.

Definition 2.13. A continuous function $\beta : [0, a) \times [0, \infty) \to [0, \infty)$ is said to belong to class $\mathcal{KL}$ if, for each fixed $s$, $\beta(r, s)$ belongs to class $\mathcal{K}$ with respect to $r$ and, for each fixed $r$, $\beta(r, s)$ is decreasing with respect to $s$ and $\beta(r, s) \to 0$ as $s \to \infty$.

Using the comparison functions defined above, we can now define what we mean for a system to be input-to-state stable. We want any bounded input to lead to a bounded state and, in addition, for the system to be asymptotically stable when $u \equiv 0$.
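A classic instance of the failure mode mentioned above, added here for illustration (a standard scalar example, not taken from the thesis): the system
$$\dot{x} = -x + x^2 u$$
is globally asymptotically stable for $u \equiv 0$, but under the bounded input $u \equiv 1$ with $x(0) = 2$ the solution of $\dot{x} = x^2 - x$ is
$$x(t) = \frac{2}{2 - e^t},$$
which escapes to infinity in finite time, at $t = \ln 2$. Boundedness under bounded inputs is therefore a genuinely additional property for nonlinear systems, which the following definition captures.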

Definition 2.14. System (2.7) is input-to-state stable (ISS) if there exist $\beta \in \mathcal{KL}$ and $\gamma \in \mathcal{K}$ such that for any initial state $x(t_0)$ and any bounded input $u(t)$,
$$\|x(t)\| \leq \beta(\|x(t_0)\|, t - t_0) + \gamma(\|u\|_{[t_0,t)})$$
where $\|u\|_{[t_0,t)}$ denotes $\sup_{t_0 \leq \tau < t} \|u(\tau)\|$.

Since this definition is hard to check in practice, we have the following Lyapunov-type result to check whether a given system is ISS.

Theorem 2.10. (Theorem 4.19, [33]) Let $V(x)$ be a continuously differentiable function such that
$$\alpha_1(\|x\|) \leq V(x) \leq \alpha_2(\|x\|)$$
$$\frac{\partial V}{\partial x} f(x, u) \leq -W(x), \quad \forall \|x\| \geq \rho(\|u\|) > 0$$
for all $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$, where $\alpha_1, \alpha_2 \in \mathcal{K}_\infty$, $\rho \in \mathcal{K}$, and $W(x)$ is a continuous positive definite function. Then system (2.7) is ISS with $\gamma = \alpha_1^{-1} \circ \alpha_2 \circ \rho$.

Input-to-state stability is useful in the study of cascade systems. Consider the cascade system given by
$$\dot{x}_1 = f_1(x_1, x_2) \qquad (2.8)$$
$$\dot{x}_2 = f_2(x_2, u) \qquad (2.9)$$
where $f_1 : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}^{n_1}$ and $f_2 : \mathbb{R}^{n_2} \times \mathbb{R}^m \to \mathbb{R}^{n_2}$ are locally Lipschitz in $x = \mathrm{col}(x_1, x_2)$.

Lemma 2.1. (Lemma 4.7, [33]) If the system (2.8), with $x_2$ as an input, is ISS and the origin of (2.9) with $u \equiv 0$ is globally uniformly asymptotically stable, then the origin of the cascade system (2.8)-(2.9) with $u \equiv 0$ is globally uniformly asymptotically stable.

2.2.5 Small-Gain Theorem

Input-to-state stability is a powerful tool for analyzing the stability of a single system with a given input. However, on its own it says nothing about the stability of feedback-interconnected systems. We therefore need the following version of the small-gain theorem, from [35], to analyze the stability of feedback-interconnected ISS systems. Consider the interconnection of two systems
$$\dot{x}_1 = f_1(x_1, x_2, u_1) \qquad (2.10)$$
$$\dot{x}_2 = f_2(x_2, x_1, u_2) \qquad (2.11)$$

Theorem 2.11. (Theorem 2.1, [35]) Suppose (2.10) and (2.11) are ISS with $(x_2, u_1)$ and $(x_1, u_2)$ as inputs, respectively, i.e., there exist $\beta_1, \beta_2 \in \mathcal{KL}$ and $\gamma_1^x, \gamma_1^u, \gamma_2^x, \gamma_2^u \in \mathcal{K}$ such that
$$\|x_1(t)\| \leq \beta_1(\|x_1(0)\|, t) + \gamma_1^x(\|x_2\|_{[0,t)}) + \gamma_1^u(\|u_1\|_{[0,t)})$$
$$\|x_2(t)\| \leq \beta_2(\|x_2(0)\|, t) + \gamma_2^x(\|x_1\|_{[0,t)}) + \gamma_2^u(\|u_2\|_{[0,t)})$$
If there exist two functions $\rho_1, \rho_2 \in \mathcal{K}_\infty$ satisfying
$$(\mathrm{Id} + \rho_2) \circ \gamma_2^x \circ (\mathrm{Id} + \rho_1) \circ \gamma_1^x(s) \leq s, \qquad (\mathrm{Id} + \rho_1) \circ \gamma_1^x \circ (\mathrm{Id} + \rho_2) \circ \gamma_2^x(s) \leq s \qquad (2.12)$$
for all $s \geq 0$, then the interconnected system (2.10)-(2.11) is ISS with $(x_1, x_2)$ as state and $(u_1, u_2)$ as input.

Remark 2.1. When $\gamma_1^x$ and $\gamma_2^x$ are linear (scalar gains), the requirements in (2.12) reduce to $\gamma_1^x \gamma_2^x < 1$.

2.2.6 Passivity

In this section, we introduce the notion of passivity and its usefulness in the study of feedback-interconnected systems. Consider the nonlinear system
$$\dot{x} = f(x, u), \qquad y = h(x, u) \qquad (2.13)$$
where $x \in \mathbb{R}^n$ and $u, y \in \mathbb{R}^p$, $f$ is locally Lipschitz and $h$ is continuous.

Definition 2.15. System (2.13) is called passive if there exists a positive definite $C^1$ storage function $V : \mathbb{R}^n \to \mathbb{R}$ such that for all $u \in \mathbb{R}^p$ and $x \in \mathbb{R}^n$,
$$\dot{V}(x) \leq u^T y$$

Theorem 2.12. (Theorem 6.1, [33]) The feedback interconnection of two passive systems is passive.

Definition 2.16. System (2.13) is equilibrium-independent passive (EIP) if there exists a positive definite $C^1$ storage function $V : \mathbb{R}^n \to \mathbb{R}$ such that for all $u \in \mathbb{R}^p$ and $x \in \mathbb{R}^n$,
$$\dot{V}(x) \leq (u - \bar{u})^T(y - \bar{y})$$
where $\bar{u}$ is a constant input with corresponding equilibrium output $\bar{y}$.

Definition 2.17. System (2.13) is incrementally passive if there exists a $C^1$, regular, positive definite storage function $V : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ such that for any two inputs $u, u' \in \mathbb{R}^p$ and the corresponding solutions $x, x' \in \mathbb{R}^n$, their respective outputs $y, y'$ satisfy
$$\dot{V}(x, x') \leq (u - u')^T(y - y')$$

2.2.7 Feedback Linearization

Certain nonlinear systems, through the use of a nonlinear feedback and a coordinate transformation, can be converted into a system that has a linear input-to-output response. This process is called feedback linearization. After this transformation, we can design a linear control law as an outer-loop controller to stabilize the resulting system. This method provides a systematic way to stabilize a certain class of nonlinear systems. In this section, we consider multivariable control systems of the form

$$\dot{x} = f(x) + \sum_{i=1}^m g_i(x)u_i, \qquad y_1 = h_1(x), \;\dots,\; y_m = h_m(x) \qquad (2.14)$$
We want to find a feedback transformation $u = a_0(x) + a_1(x)v$, where $v \in \mathbb{R}^m$ is a new input, and a coordinate transformation $x \mapsto (\xi, z)$ such that the resulting closed-loop system is given by
$$\dot{z} = q(z, \xi)$$
$$\dot{\xi}_1^1 = \xi_1^2, \;\dots,\; \dot{\xi}_1^{r_1} = v_1$$
$$\vdots$$
$$\dot{\xi}_m^1 = \xi_m^2, \;\dots,\; \dot{\xi}_m^{r_m} = v_m$$
$$y_1 = \xi_1^1, \;\dots,\; y_m = \xi_m^1$$
i.e., parallel chains of integrators with additional zero dynamics. This is called the noninteracting control problem. One of the main properties a control-affine system must have in order to be feedback linearizable is a vector relative degree.

Definition 2.18. [36] A multivariable nonlinear system of the form (2.14) has a (vector) relative degree $\{r_1, \dots, r_m\}$ at a point $x_0$ if
(i) $L_{g_j} L_f^k h_i(x) = 0$ for all $1 \leq j \leq m$, all $1 \leq i \leq m$, all $k < r_i - 1$ and all $x$ in a neighbourhood of $x_0$, where $L_f h(x) = \nabla h(x)^T f(x)$ is the Lie derivative of $h$ along $f$, with subsequent Lie derivatives defined inductively,
(ii) the $m \times m$ matrix
$$A(x) = \begin{bmatrix} L_{g_1} L_f^{r_1-1} h_1(x) & \cdots & L_{g_m} L_f^{r_1-1} h_1(x) \\ L_{g_1} L_f^{r_2-1} h_2(x) & \cdots & L_{g_m} L_f^{r_2-1} h_2(x) \\ \vdots & & \vdots \\ L_{g_1} L_f^{r_m-1} h_m(x) & \cdots & L_{g_m} L_f^{r_m-1} h_m(x) \end{bmatrix}$$
is nonsingular at $x = x_0$.

Proposition 2.1. (Proposition 5.3.1, [36]) Consider a multivariable nonlinear system of the form (2.14). The noninteracting control problem is solvable at $x_0$ if and only if the system has a vector relative degree $\{r_1, \dots, r_m\}$ at $x_0$. The feedback solving this problem is given by $u = A(x)^{-1}(-b(x) + v)$, where $b(x) = \mathrm{col}(L_f^{r_1} h_1(x), \dots, L_f^{r_m} h_m(x))$.

2.2.8 Output Regulation Problem

The following is from [18]. Consider a control system with external disturbance and reference signals of the form
$$\dot{w} = s(w)$$
$$\dot{x} = f(x, w) + g(x, w)u$$
$$e = h(x, w) \qquad (2.15)$$
where $x \in \mathbb{R}^n$, $u, e \in \mathbb{R}^p$, and $w \in W \subset \mathbb{R}^q$, with $W$ a compact subset containing the origin.

Assumption 2.1. The exosystem $\dot{w} = s(w)$ is neutrally stable, i.e., $w = 0$ is a Lyapunov stable equilibrium and there exists a neighbourhood of Poisson stable points around $w = 0$.

The error-feedback output regulation problem is to find an error-feedback control law such that $\lim_{t \to +\infty} e = 0$. As such, we define a controller of the form
$$\dot{\xi} = \eta(\xi, e), \qquad u = \theta(\xi) \qquad (2.16)$$
Using the above control law, the full closed-loop system becomes
$$\dot{x} = f(x, w) + g(x, w)\theta(\xi)$$
$$\dot{\xi} = \eta(\xi, h(x, w))$$
$$\dot{w} = s(w) \qquad (2.17)$$

Lemma 2.2. (Lemma 8.4.1, [36]) Assume that, for some $\eta(\xi, e)$ and $\theta(\xi)$, the equilibrium $(x, \xi) = (\bar{x}, \bar{\xi})$ of
$$\dot{x} = f(x, 0) + g(x, 0)\theta(\xi), \qquad \dot{\xi} = \eta(\xi, h(x, 0))$$
is asymptotically stable. Then there exists a neighbourhood $V \subset U \times \Xi \times W$ of $(\bar{x}, \bar{\xi}, 0)$ such that, for each initial condition $(x(0), \xi(0), w(0)) \in V$, the solution of (2.17) satisfies $\lim_{t \to +\infty} h(x(t), w(t)) = 0$ if and only if there exist mappings $x = \pi(w)$ and $\xi = \sigma(w)$, with $\pi(0) = \bar{x}$ and $\sigma(0) = \bar{\xi}$, defined in a neighbourhood $W^o \subset W$ of the origin, satisfying the conditions
$$\frac{\partial \pi}{\partial w} s(w) = f(\pi(w), w) + g(\pi(w), w)\theta(\sigma(w)) \qquad (2.18)$$
$$\frac{\partial \sigma}{\partial w} s(w) = \eta(\sigma(w), 0) \qquad (2.19)$$
$$h(\pi(w), w) = 0 \qquad (2.20)$$
for all $w \in W^o$.

The solution of these equations is necessary to solve the output regulation problem. By solving them, one finds the zero-error manifold (2.20), a control input that makes the manifold invariant (2.18), and an internal model used to generate that control input (2.19). In order to design the internal models, we will use the following result, originally from [37]. Pick a controllable pair $(G, H) \in \mathbb{R}^{d \times d} \times \mathbb{R}^{d \times 1}$, where $G$ is a Hurwitz matrix, and let $\mathbf{G} \in \mathbb{R}^{nd \times nd}$ and $\mathbf{H} \in \mathbb{R}^{nd \times n}$ be defined as
$$\mathbf{G} = \mathrm{blkdiag}(G, \dots, G) \qquad (2.21)$$
$$\mathbf{H} = \mathrm{blkdiag}(H, \dots, H) \qquad (2.22)$$

Lemma 2.3. (Lemma 3, [22]) Let $d > 2q + 2$. There exist $b > 0$ and a subset $S \subset \mathbb{C}$ of zero Lebesgue measure such that, if the eigenvalues of $F$ are in $\{\lambda \in \mathbb{C} : \mathrm{Re}(\lambda) < -b\} \setminus S$, then there exist a differentiable function $\sigma_j : W \to \mathbb{R}^d$ and a continuous bounded function $\phi_j : \mathbb{R}^d \to \mathbb{R}$ such that for all $w \in W$
$$\frac{\partial \sigma_j}{\partial w} s(w) = F\sigma_j(w) + G[\theta(w)]_j, \qquad [\theta(w)]_j = \phi_j(\sigma_j(w))$$
where $\theta(w)$ is the solution to
$$\frac{\partial \pi}{\partial w} s(w) = f(\pi(w), w) + g(\pi(w), w)\theta(w) \qquad (2.23)$$
$$h(\pi(w), w) = 0 \qquad (2.24)$$
Thus, this internal model provides a solution to (2.18)-(2.20).
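To make the regulator equations concrete, the following is a minimal sketch of the linear special case (our illustrative stand-in, not the thesis's design): an integrator $\dot{x} = u + d$ with regulated output $e = x$, where $d$ is a sinusoid of known frequency generated by a linear exosystem. The internal model here is simply a copy of the exosystem embedded in a Luenberger observer; all gains and values are assumptions.

```python
# Reject d(t) generated by wdot = S w, d = D w, acting on an integrator
# xdot = u + d, by estimating (x, w) and cancelling the estimated d.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.signal import place_poles

w0 = 2.0                                   # known disturbance frequency
S = np.array([[0.0, w0], [-w0, 0.0]])      # exosystem: sin/cos pair
D = np.array([[1.0, 0.0]])                 # d = D w

# Augmented system with state (x, w) and measured output y = x.
Aa = np.block([[np.zeros((1, 1)), D], [np.zeros((2, 1)), S]])
Ca = np.array([[1.0, 0.0, 0.0]])
L = place_poles(Aa.T, Ca.T, [-3.0, -4.0, -5.0]).gain_matrix.T  # observer gain

k = 1.0                                    # stabilizing feedback gain
def closed_loop(t, z):
    x, w, xhat = z[0], z[1:3], z[3:6]      # plant, exosystem, observer states
    u = -k * xhat[0] - (D @ xhat[1:])[0]   # stabilize + cancel estimated d
    dx = u + (D @ w)[0]
    dw = S @ w
    dxhat = Aa @ xhat + np.array([u, 0.0, 0.0]) + L @ (np.array([x]) - Ca @ xhat)
    return np.concatenate(([dx], dw, dxhat))

z0 = np.concatenate(([1.0], [1.0, 0.0], np.zeros(3)))
sol = solve_ivp(closed_loop, (0.0, 20.0), z0, rtol=1e-8, max_step=0.01)
print(abs(sol.y[0, -1]))                   # regulated output x(t) -> 0
```

With the observer poles in the open left half-plane, the estimation error decays and $u$ asymptotically cancels $d$, so $x(t) \to 0$ despite the persistent disturbance.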

2.3 Graph Theory

In this section, we introduce some of the main definitions and results from graph theory, originally from [38]. These will be important for the case where players in the game communicate over a network; we model their information exchange using graph theory.

A graph $G$ is defined as a pair of sets, $G = (I, E)$, where $I = \{1, \dots, N\}$ is the set of nodes in the graph and $E = \{e_1, \dots, e_m\} \subset I \times I$ is the set of edges. The edges define connections between the nodes: $(j, k) \in E$ if nodes $j$ and $k$ are connected from $j$ to $k$. In this work, we assume that the edges are unweighted and undirected; therefore, $(j, k) \in E$ if and only if $(k, j) \in E$. We use tools from algebraic graph theory, which express graph-theoretic principles in terms of linear algebra, to carry out our analysis. Our first definition is the adjacency matrix.

Definition 2.19. Given a graph $G$, the adjacency matrix of $G$, $A = [a_{ij}] \in \mathbb{R}^{N \times N}$, is defined as
$$a_{ij} = \begin{cases} 1, & (i, j) \in E \\ 0, & \text{otherwise} \end{cases}$$
For an undirected graph, $A = A^T$.

Definition 2.20. Given an undirected graph $G$ and a node $i \in I$, the degree of $i$, $\deg(i) = \sum_{j \neq i} a_{ij} = |N_i|$, is the number of nodes connected to $i$. We call $D = \mathrm{diag}(\deg(1), \dots, \deg(N))$ the degree matrix of $G$.

The next definition is the Laplacian matrix, which helps characterize the connectivity of a graph.

Definition 2.21. Given an undirected graph $G$, the Laplacian matrix of $G$ is $L = D - A \in \mathbb{R}^{N \times N}$.

For an undirected and connected graph, $L$ is symmetric positive semi-definite with a simple zero eigenvalue, so that $0 < \lambda_2(L) \leq \dots \leq \lambda_N(L)$. The eigenvector corresponding to the zero eigenvalue spans what is called the consensus subspace, and $L 1_N = 0$. Moreover, for all $y \in \mathbb{R}^N$ such that $1_N^T y = 0$,
$$\lambda_2(L)\|y\|^2 \leq y^T L y \leq \lambda_N(L)\|y\|^2$$

2.4 Game Theory

In this section, we provide the necessary background on static game theory. Consider a set of players $I = \{1, \dots, N\}$. Each player $i \in I$ controls its own action $x_i \in \Omega_i \subset \mathbb{R}^{n_i}$. The overall action set of the players is $\Omega = \Omega_1 \times \dots \times \Omega_N \subset \mathbb{R}^n$, where $n = \sum_{i \in I} n_i$. Let $x = (x_i, x_{-i}) \in \Omega$ denote the overall action profile of all players, where $x_{-i} \in \Omega_{-i} = \Omega_1 \times \dots \times \Omega_{i-1} \times \Omega_{i+1} \times \dots \times \Omega_N$ is the action profile of all players except player $i$. Let $J_i : \Omega \to \mathbb{R}$ be the cost function of player $i$. Each player tries to minimize its own cost function over its own action. Denote the game $G(I, J_i, \Omega_i)$. We denote by $\Sigma_i$ the agent dynamics of player $i$, i.e., the way in which player $i$ chooses its action $x_i$ based on the information $s_i$ available to it, and by $\Sigma$ the overall agent dynamics.

Definition 2.22. Given a game $G(I, J_i, \Omega_i)$, an action profile $x^* = (x_i^*, x_{-i}^*) \in \Omega$ is a Nash equilibrium (NE) of $G$ if
$$J_i(x_i^*, x_{-i}^*) \leq J_i(z_i, x_{-i}^*) \qquad \forall i \in I, \; \forall z_i \in \Omega_i$$
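A small numerical sketch tying the last two sections together (an assumed example, not from the thesis): the Laplacian of a connected 3-node path graph, and the NE of a two-player quadratic game found from the first-order condition that each player's partial gradient vanishes (formalized as (2.25)-(2.26) below).

```python
# Laplacian of a path graph and the NE of an assumed two-player quadratic game.
import numpy as np

# Path graph on 3 nodes: edges (1,2), (2,3).
Adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
Lap = np.diag(Adj.sum(axis=1)) - Adj          # L = D - A
eigs = np.sort(np.linalg.eigvalsh(Lap))
print(eigs[0], eigs[1])                        # lambda_1 = 0, lambda_2 > 0: connected

# Game: J1(x) = (x1 - 1)^2 + x1*x2, J2(x) = (x2 + 2)^2 - x1*x2.
def pseudo_gradient(x):                        # col(dJ1/dx1, dJ2/dx2)
    return np.array([2*(x[0] - 1) + x[1], 2*(x[1] + 2) - x[0]])

# Here the stationarity condition is linear: [[2,1],[-1,2]] x = [2,-4].
x_star = np.linalg.solve(np.array([[2.0, 1.0], [-1.0, 2.0]]),
                         np.array([2.0, -4.0]))
print(x_star, pseudo_gradient(x_star))         # pseudo-gradient vanishes at x*
```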

[Figure 2.1: A game as the interconnection between agent dynamics.]

At a Nash equilibrium, no player can unilaterally decrease its cost function, and thus no player has an incentive to switch strategies on its own.

Assumption 2.2. For each $i \in I$, let $\Omega_i = \mathbb{R}^{n_i}$, and let the cost function $J_i : \Omega \to \mathbb{R}$ be $C^1$ in its arguments and strictly convex and radially unbounded in $x_i$.

Under Assumption 2.2, by Corollary 4.2 in [39], the game admits a pure Nash equilibrium $x^* \in \Omega$. In addition, an NE satisfies
$$\nabla_i J_i(x_i^*, x_{-i}^*) = 0, \quad \forall i \in I \qquad (2.25)$$
where $\nabla_i J_i(x_i, x_{-i}) = \frac{\partial J_i}{\partial x_i}(x_i, x_{-i}) \in \mathbb{R}^{n_i}$ is the gradient of player $i$'s cost function with respect to its own action. Denoting by $F(x) = \mathrm{col}(\nabla_1 J_1(x), \dots, \nabla_N J_N(x))$ the pseudo-gradient, i.e., the stacked vector of all the players' partial gradients, we can rewrite (2.25) as
$$F(x^*) = 0 \qquad (2.26)$$
We denote the set of all NE in the game as
$$\Gamma_{NE} = \{(x_i, x_{-i}) \in \mathbb{R}^n \mid \nabla_i J_i(x_i, x_{-i}) = 0, \; \forall i \in I\} \qquad (2.27)$$

Assumption 2.3. Assumptions on the pseudo-gradient.
(i) The pseudo-gradient $F : \Omega \to \mathbb{R}^n$ is strictly monotone, $(x - x')^T(F(x) - F(x')) > 0$ for all $x \neq x' \in \mathbb{R}^n$, and for each fixed $x' \in \mathbb{R}^n$ there exists $\alpha_{x'} \in \mathcal{K}$ such that $\|F(x) - F(x')\| \geq \alpha_{x'}(\|x - x'\|)$ for all $x \in \mathbb{R}^n$.
(ii) The pseudo-gradient $F : \Omega \to \mathbb{R}^n$ is strongly monotone, $(x - x')^T(F(x) - F(x')) \geq \mu\|x - x'\|^2$, $\forall x, x' \in \mathbb{R}^n$, for some $\mu > 0$, and Lipschitz continuous, $\|F(x) - F(x')\| \leq \theta\|x - x'\|$.
(iii) The pseudo-gradient $F : \Omega \to \mathbb{R}^n$ is strongly monotone with constant $\mu > 0$ and $C^1$.

These assumptions give sufficient conditions for the existence of Nash equilibria, described by the following theorem.

Theorem 2.13. (Theorem 3, [40]) Given a game $G(I, J_i, \mathbb{R}^{n_i})$, suppose that the $J_i(x_i, x_{-i})$ are $C^1$ on $\mathbb{R}^n$ and convex in $x_i$ for every fixed $x_{-i}$. Then, the following statements hold:
(i) Under Assumption 2.3(i), $G$ has at most one NE.
(ii) Under Assumption 2.3(ii) or (iii), $G$ has a unique NE.

Therefore, Assumptions 2.2 and 2.3 together give sufficient conditions for the existence of a unique NE.

2.4.1 Potential Games

A potential game is a game in which the Nash equilibria correspond to minimum points of a potential function. Games of this type were first formally described in [41].

Definition 2.23. A game $G(I, J_i, \Omega_i)$ is called a full-potential game if there exists a function $\Phi : \Omega \to \mathbb{R}$ such that, for all $i \in I$, all $x_{-i} \in \Omega_{-i}$ and all $x_i, x_i' \in \Omega_i$,
$$J_i(x_i, x_{-i}) - J_i(x_i', x_{-i}) = \Phi(x_i, x_{-i}) - \Phi(x_i', x_{-i}) \qquad (2.28)$$

Lemma 2.4. [41] Let $G$ be a game with $\Omega_i = \mathbb{R}^{n_i}$. Suppose $J_i : \Omega \to \mathbb{R}$ is $C^1$, and let $\Phi : \Omega \to \mathbb{R}$. Then $\Phi$ is a potential function for $G$ if and only if $\Phi$ is $C^1$ and
$$\frac{\partial J_i}{\partial x_i} = \frac{\partial \Phi}{\partial x_i}, \quad \forall i \in I \qquad (2.29)$$

Lemma 2.4 means that $F(x) = \nabla\Phi(x)$, i.e., the pseudo-gradient of the game is the full gradient of the potential function.

2.4.2 Gradient Dynamics

The most basic Nash equilibrium seeking algorithm is gradient descent, or gradient play. In this algorithm, each player attempts to improve its cost by changing its action along the negative gradient of its cost function. In continuous time, with full information, these dynamics are given by
$$\Sigma_i : \dot{x}_i = -\nabla_i J_i(x_i, x_{-i}), \quad i \in I \qquad (2.30)$$
or, in stacked vector form,
$$\Sigma : \dot{x} = -F(x) \qquad (2.31)$$
Under Assumption 2.2, the solutions of (2.31) exist and are unique for any initial condition $x(0)$. Under Assumption 2.3(i), the unique Nash equilibrium of the game is globally asymptotically stable.

Lemma 2.5. (Lemma 1, [42]) Consider a game $G(I, J_i, \Omega_i)$ with perfect information. Under Assumption 2.2, any equilibrium point of (2.31) is an NE of the game. Additionally, under Assumption 2.3(i), the unique NE of the game, $x = x^*$, is globally asymptotically stable.

2.4.3 Partial-Information Gradient Dynamics

While the gradient dynamics (2.30) require each agent to have full information about the others' actions, this is not always available. To handle this, we present an augmented gradient-dynamics scheme that uses a Laplacian-based consensus algorithm with estimates of the others' actions, originally from [42]. Consider a game with information shared over a communication graph $G_c$ with Laplacian $L$. Referring to the representation in Figure 2.1, in this case each agent has only partial, local decision information, $s_i = x_j$ for $j \in N_i$. Since each agent has only the information exchanged over the communication graph, each agent's dynamics are augmented with an auxiliary state that provides an estimate of all the other agents' actions. This estimate is then updated by exchanging information with its neighbours.

Assumption 2.4. The undirected graph $G_c$ is connected.

Based on local communication with its neighbours, each agent $i$ computes estimates of all other agents' actions, $x_{-i}^i = \mathrm{col}(x_1^i, \dots, x_{i-1}^i, x_{i+1}^i, \dots, x_N^i) \in \mathbb{R}^{n - n_i}$, and uses these estimates to compute its gradient, $\nabla_i J_i(x_i, x_{-i}^i)$. Denote $x^i = \mathrm{col}(x_1^i, \dots, x_{i-1}^i, x_i, x_{i+1}^i, \dots, x_N^i) \in \mathbb{R}^n$ and $\mathbf{x} = \mathrm{col}(x^1, \dots, x^N) \in \mathbb{R}^{Nn}$. Define the following selection matrices:
$$R_i := \begin{bmatrix} 0_{n_i \times n_{<i}} & I_{n_i} & 0_{n_i \times n_{>i}} \end{bmatrix}, \qquad S_i := \begin{bmatrix} I_{n_{<i}} & 0_{n_{<i} \times n_i} & 0_{n_{<i} \times n_{>i}} \\ 0_{n_{>i} \times n_{<i}} & 0_{n_{>i} \times n_i} & I_{n_{>i}} \end{bmatrix} \qquad (2.32)$$
where $n_{<i} := \sum_{j < i} n_j$ and $n_{>i} := \sum_{j > i} n_j$. It can be shown that, for all $i \in I$,
$$R_i^T R_i + S_i^T S_i = I_n \qquad (2.33)$$
Then $x_i = R_i x^i$, $x_{-i}^i = S_i x^i$, and $x^i = R_i^T x_i + S_i^T x_{-i}^i$. Consider the following agent dynamics:
$$\Sigma_i : \begin{cases} \dot{x}_{-i}^i = -S_i \sum_{j \in N_i} (x^i - x^j) \\ \dot{x}_i = -\nabla_i J_i(x_i, x_{-i}^i) - R_i \sum_{j \in N_i} (x^i - x^j) \end{cases}, \quad i \in I \qquad (2.34)$$
These dynamics have two terms: a gradient-descent term, with the estimate used to compute the gradient, and a Laplacian-based consensus term on the estimates. The vector of stacked gradients $\nabla_i J_i(x_i, x_{-i}^i)$ in (2.34), computed based on estimates, is called the extended pseudo-gradient, defined as
$$\mathbf{F}(\mathbf{x}) = \mathrm{col}(\nabla_1 J_1(x_1, x_{-1}^1), \dots, \nabla_N J_N(x_N, x_{-N}^N)) \qquad (2.35)$$
Note that $\mathbf{F}$ satisfies $\mathbf{F}(1_N \otimes x) = F(x)$ for any $x$; hence
$$\mathbf{F}(1_N \otimes x^*) = 0 \qquad (2.36)$$

Assumption 2.5. The extended pseudo-gradient is globally Lipschitz continuous: $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}')\| \leq \theta\|\mathbf{x} - \mathbf{x}'\|$ for all $\mathbf{x}, \mathbf{x}' \in \mathbb{R}^{Nn}$, for some $\theta > 0$.

Theorem 2.14. (Theorem 2, [42]) Consider a game $G(I, J_i, \Omega_i)$ with partial information communicated over a graph $G_c$ with Laplacian $L$ and agent dynamics given by $\Sigma_i$, (2.34). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if $\mu(\lambda_2(L) - \theta) > \theta^2$, then the unique NE, $x = x^*$, is globally asymptotically stable for (2.34). Moreover, each player's estimates converge globally to the NE values, $\mathbf{x} = 1_N \otimes x^*$. A simulation sketch of both the full-information dynamics (2.31) and the partial-information dynamics (2.34) on a small quadratic game is given below.
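The following sketch (our assumed quadratic game and graph, not the thesis's simulations) runs both (2.31) and (2.34). The Laplacian term in (2.34) is scaled here by a gain $c$, chosen large enough that the sufficient condition of Theorem 2.14 holds for this example; (2.34) itself corresponds to $c = 1$.

```python
# Full-information gradient play (2.31) vs. partial-information dynamics (2.34)
# on an assumed 3-player quadratic game with scalar actions (illustrative).
import numpy as np
from scipy.integrate import solve_ivp

N = 3
t = np.array([1.0, 2.0, 3.0])               # J_i(x) = (x_i-t_i)^2 + x_i*sum_{j!=i} x_j

def F(x):                                   # pseudo-gradient col(grad_i J_i)
    return 2.0*(x - t) + (x.sum() - x)

x_star = np.linalg.solve(np.eye(N) + np.ones((N, N)), 2.0*t)   # F(x*) = 0

# Full information: x' = -F(x).
sol_full = solve_ivp(lambda s, x: -F(x), (0.0, 20.0), np.zeros(N), rtol=1e-8)

# Partial information over a path graph 1-2-3, consensus gain c (see above).
Adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
Lap = np.diag(Adj.sum(axis=1)) - Adj
c = 20.0

def partial_info(s, z):
    X = z.reshape(N, N)                     # row i holds agent i's estimate x^i
    dX = -c * (Lap @ X)                     # Laplacian consensus on estimates
    for i in range(N):                      # gradient term uses own estimates
        dX[i, i] += -(2.0*(X[i, i] - t[i]) + (X[i].sum() - X[i, i]))
    return dX.ravel()

sol_part = solve_ivp(partial_info, (0.0, 40.0), np.zeros(N*N), rtol=1e-8)

print(sol_full.y[:, -1], x_star)            # gradient play reaches the NE
print(sol_part.y[:, -1].reshape(N, N))      # every estimate row converges to x*
```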

In this chapter, we presented the necessary background on mathematics, dynamical systems, control theory, graph theory and game theory. We will use this background in our problem definition and our solutions. In the next chapter, we define our problem and introduce the framework we will use to solve it.

Chapter 3

Problem Formulation

In this chapter, we present our problem formulation. We start from the game-theoretic background introduced in Section 2.4, provide a reinterpretation of the gradient dynamics in the form of dynamic agents, and use this interpretation to introduce the output regulation problem as it pertains to games. We stress that while we consider the agents in a game to have dynamics, the game is still a static one, merely played by agents who are dynamic decision makers, and thus falls under the results presented in Section 2.4.

Consider the basic full-information gradient dynamics,
$$\Sigma_i : \dot{x}_i = -\nabla_i J_i(x_i, x_{-i}), \quad i \in I \qquad (3.1)$$
Instead of viewing (3.1) as the update algorithm for an agent over time, we can view it as the closed-loop system of a plant, modelled as an integrator, with a gradient feedback. The plant dynamics are given by
$$P_i : \begin{cases} \dot{x}_i = u_i \\ y_i = x_i \end{cases}, \quad i \in I \qquad (3.2)$$
with the input generated by
$$u_i = -\nabla_i J_i(x_i, x_{-i}), \quad i \in I \qquad (3.3)$$
This reinterpretation of the gradient dynamics allows us to consider different plants $P_i$, which may have exogenous signals affecting their dynamics; it also allows us to consider higher-order plants, for example. For these different plant structures, we can view the problem as one of finding a suitable feedback $u_i$ stabilizing $\Gamma_{NE}$. In this framework, the gradient dynamics present themselves as a special case of a broader class of NE seeking dynamics, in which they can be viewed as the interconnection of an integrator with the pseudo-gradient map, Figure 3.1.
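As a minimal sketch of this decomposition (illustrative; the quadratic game is an assumption), the gradient feedback (3.3) wrapped around the integrator plant (3.2) reproduces (3.1), while leaving the plant block free to be replaced later:

```python
# Plant (3.2) with feedback (3.3): the closed loop is gradient play (3.1).
import numpy as np
from scipy.integrate import solve_ivp

t = np.array([1.0, 2.0, 3.0])
def F(x):                                   # pseudo-gradient of an assumed
    return 2.0*(x - t) + (x.sum() - x)      # quadratic game

def plant(x, u):                            # P_i: integrator, y_i = x_i  (3.2)
    return u

sol = solve_ivp(lambda s, x: plant(x, -F(x)), (0.0, 20.0), np.zeros(3), rtol=1e-8)
print(sol.y[:, -1])                         # converges to the NE, F(x*) = 0
```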

[Figure 3.1: Gradient dynamics (3.1) as the feedback interconnection of an integrator and the pseudo-gradient.]

3.1 Plants with Additive Exosignals

The main focus of this thesis is to investigate NE seeking algorithms in cases where a disturbance or reference may be present in the agent dynamics of a game. We will show that by using the framework discussed in this chapter, that is, by viewing the agent dynamics as a plant with a feedback, we can design feedbacks that converge to the NE of a game while accomplishing any output regulation requirements. We call this NE seeking with output regulation. Before we introduce the specific types of plant dynamics we will investigate, we describe how the exosignals might affect the agents in a game. We consider two types of exosignals: what we call a disturbance, which directly affects the plant dynamics of a particular agent, and what we call a reference, which affects the action that a particular agent plays in the game. It is important to note that these exosignals may affect the NE set of the game.

3.1.1 Disturbance on the Dynamics

The first case we consider is that of a disturbance acting on the agent dynamics. Each player's dynamics are affected by a disturbance $d_i$, but the actions themselves are unaffected, as shown in Figure 3.2.

[Figure 3.2: A game with disturbance on the agent dynamics.]

3.1.2 Reference on the Action

In the second case, we consider a reference signal that acts on the action each agent chooses. Each player's processing algorithm is unaffected, as is the communication between the agents; however, the actions themselves are affected by a reference $p_i$, as shown in Figure 3.3. The cost function of each agent is based upon the error between the agent's action and its reference. We can think of this as the generalization of reference tracking for a single agent, where the cost would be given by $J(x) = \|x - p\|^2$.

[Figure 3.3: A game with an additive signal on the action of each player.]

In both of these cases, the goal will be to properly design $\Sigma_i$ such that the NE set of the game is asymptotically stable irrespective of the reference or disturbance. We do this by decomposing the agent dynamics $\Sigma_i$ into a plant $P_i$ and a feedback designed to stabilize the NE set. To accomplish this, we use a combination of gradient descent and the internal model principle.

3.2 Full Information Games with Gradient Type Dynamics

The first variation on the integrator plant model (3.2) is to consider the case where the plant $P_i$ can be viewed as an integrator with a reference or disturbance generated by a linear exosystem. We consider these two cases separately. In this section, we discuss the form of these two plant models as well as how the NE set is affected by the new dynamics.

3.2.1 Games with Disturbances

First, we consider the plant model (3.2) replaced by an integrator with an additive disturbance,
$$P_i : \begin{cases} \dot{x}_i = u_i + d_i \\ y_i = x_i \end{cases} \qquad (3.4)$$
where $d_i$ is generated by a linear exosystem
$$D_i : \begin{cases} \dot{w}_i = S_i w_i \\ d_i = D_i w_i \end{cases} \qquad (3.5)$$
Since the disturbance affects the agent dynamics but not the action itself, the NE set remains unchanged in this case. It is given by
$$\Gamma_{NE} = \{x \in \mathbb{R}^n \mid \nabla_i J_i(x_i, x_{-i}) = 0, \; \forall i \in I\}$$
We now introduce ways in which we can pose the NE seeking problem for (3.4); a short simulation sketch of the effect of such a disturbance follows.
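The sketch below (our assumed game and disturbance, for illustration only) simulates (3.4) under plain gradient feedback, i.e. (3.7) with $u = 0$, and a sinusoidal $d_1$ generated as in (3.5): the actions oscillate about the NE rather than converging, which is what the designs in this thesis are meant to correct.

```python
# Gradient play with u = 0 under a sinusoidal disturbance from (3.5).
import numpy as np
from scipy.integrate import solve_ivp

t = np.array([1.0, 2.0, 3.0])
w0 = 2.0                                    # disturbance frequency (assumed)

def F(x):
    return 2.0*(x - t) + (x.sum() - x)

x_star = np.linalg.solve(np.eye(3) + np.ones((3, 3)), 2.0*t)

def dyn(s, x):
    d = np.array([np.sin(w0*s), 0.0, 0.0])  # d_1 = D_1 w_1 from exosystem (3.5)
    return -F(x) + d

sol = solve_ivp(dyn, (0.0, 30.0), np.zeros(3), rtol=1e-8, max_step=0.01)
print(np.abs(sol.y[:, -1] - x_star))        # persistent offset: no convergence
```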

Gradient Descent

First, we can view the agent dynamics as a slightly modified version of the gradient dynamics. The corresponding gradient-descent dynamics for (3.4) are given by
$$\Sigma_i : \begin{cases} \dot{x}_i = -\nabla_i J_i(x_i, x_{-i}) + u_i + d_i \\ y_i = x_i \end{cases} \qquad (3.6)$$
where we allow an extra control input $u_i$ to be designed. Rewriting this for the whole system, with $u = \mathrm{col}(u_1, \dots, u_N)$ and $d = \mathrm{col}(d_1, \dots, d_N)$, we get
$$\Sigma : \begin{cases} \dot{x} = -F(x) + u + d \\ y = x \end{cases} \qquad (3.7)$$
This can be viewed as a feedback interconnection of an integrator with the pseudo-gradient map, Figure 3.4.

[Figure 3.4: Gradient dynamics (3.7) as the feedback interconnection of an integrator and the pseudo-gradient map.]

This is not the easiest way to approach the problem, mainly because the dynamics in this form have a coupled, nonlinear component. By disconnecting parts of the feedback loop, we can simplify the problem.

Pseudo-Gradient as the Output Map

We can recast the problem of designing $u$ as an error-feedback output regulation problem by disconnecting the feedback loop and instead using the gradient as the error map. This gives a slightly different plant,
$$P_i : \begin{cases} \dot{x}_i = u_i + d_i \\ e_i = \nabla_i J_i(x_i, x_{-i}) \end{cases} \qquad (3.8)$$
Now the problem becomes one of finding a feedback $u_i$ that drives $\nabla_i J_i(x_i, x_{-i})$ to zero, equivalent to solving the output regulation problem given in Lemma 2.2. Rewriting it for the full system, we get
$$P : \begin{cases} \dot{x} = u + d \\ e = F(x) \end{cases} \qquad (3.9)$$

The problem is now a cascade of an integrator and the pseudo-gradient map, Figure 3.5. Note that by taking u = −F(x) + ũ and taking x as the output, we recover (3.7).

[Figure 3.5: Agent dynamics (3.9) as an error feedback output regulation problem.]

We will see later that this interpretation lends itself well to a passivity-based feedback for the case of potential games (Chapter 7).

Integrator Dynamics

Finally, and perhaps most useful for our purposes, is to treat the plant simply as an integrator with full measurement of x_i. This allows us to use an LTI observer-based approach to the output regulation problem. The plant is then simply given by

P_i :  ẋ_i = u_i + d_i,   y_i = x_i     (3.10)

Now the problem becomes one of finding a control input, u_i, that stabilizes the NE, x*. In stacked vector form, the dynamics become

P :  ẋ = u + d,   y = x     (3.11)

The problem is now simply an integrator, Figure 3.6. Note that by taking u = −F(x) + ũ, we recover (3.7).

[Figure 3.6: Agent dynamics (3.11) as an integrator output regulation problem.]

Each of these three approaches lends itself to different methods of controller synthesis. For the most part in this thesis, we will focus on the plant dynamics (3.10) (Chapter 4) and double- and multi-integrator extensions (Chapters 5 and 6).

3.2.2 Games with References

The next variation on the plant (3.2) is to consider that the plant can be modelled as an integrator with a reference at the output,

P_i :  ẋ_i = u_i,   y_i = x_i − p_i     (3.12)

where p_i can be generated by a linear exosystem

D_i :  ẇ_i = S_i w_i,   p_i = C_i w_i     (3.13)

Since the reference affects each agent's action, the NE set becomes a function of p. It is given by

Γ_NE = { (x, p) ∈ R^n × R^n | ∇_i J_i(x_i − p_i, x_{−i} − p_{−i}) = 0, ∀i ∈ I }

In the same way that we viewed the case from Section 3.2.1 in three different ways, we will discuss three similar ways that we can think about the problem.

Gradient-Descent

The first formulation is to view the agent dynamics as a gradient-descent scheme, Σ_i, but with additional reference-type terms in the gradient of each agent. These dynamics are given by

Σ_i :  ẋ_i = −∇_i J_i(x_i − p_i, x_{−i} − p_{−i}) + u_i,   y_i = x_i − p_i     (3.14)

We can rewrite this system in stacked vector form,

Σ :  ẋ = −F(x − p) + u,   y = x − p     (3.15)

This can be viewed as a feedback interconnection of an integrator with the pseudo-gradient, Figure 3.7.

[Figure 3.7: Agent dynamics (3.15) as an output regulation problem.]

As before, this is not the most useful formulation of the problem, as it leads to a nonlinear dynamical system. It is, however, the most immediate, highlighting the benefits of our approach.

Pseudo-Gradient as the Error Map

Similar to the disturbance case, we can reformulate (3.15) as an error feedback regulation problem by disconnecting the feedback loop and using the pseudo-gradient as the error map. This gives the following system,

P_i :  ẋ_i = u_i,   e_i = ∇_i J_i(x_i − p_i, x_{−i} − p_{−i})     (3.16)

Now the problem becomes one of finding a feedback u_i that stabilizes ∇_i J_i(x_i − p_i, x_{−i} − p_{−i}) = 0. Rewriting to get the full system gives

P :  ẋ = u,   y = F(x − p)     (3.17)

This can be viewed as a cascade of an integrator with the pseudo-gradient, Figure 3.8.

[Figure 3.8: Agent dynamics (3.17) as an output regulation problem.]

Again, this formulation will lend itself particularly well to a passivity-based approach, which we will investigate in Chapter 7.

Integrator Dynamics

Finally, we consider that each plant is simply modelled as an integrator, with input to be designed,

P_i :  ẋ_i = u_i,   y_i = x_i − p_i     (3.18)

Now the problem becomes one of finding a feedback u_i that stabilizes the output y. Rewriting to get the full system gives

P :  ẋ = u,   y = x − p     (3.19)

This can be viewed as a simple integrator, Figure 3.9.

[Figure 3.9: Agent dynamics (3.19) as an integrator output regulation problem.]

This formulation is again useful since we can design the control law as if each plant is simply an LTI system, treating any nonlinearities as an input to the plant. We will use this formulation in Chapter 4.
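To make the reference formulation concrete, the following minimal sketch (an illustration only; the two-player quadratic game, its matrices, and the constant references are assumed, not taken from the thesis) applies u = −F(x − p) to the integrator plant (3.19) for constant references (S_i = 0 in (3.13)); the output y = x − p then obeys ẏ = −F(y), so the realized actions converge to the shifted NE:

# Minimal sketch: integrator plant (3.19) with constant references.
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 2.0]])   # Jacobian of F: symmetric PD, so F is strongly monotone
b = np.array([2.0, -2.0])
F = lambda y: A @ y - b                  # pseudo-gradient of the assumed quadratic game
p = np.array([3.0, -1.0])                # constant references

x = np.zeros(2)
dt = 1e-2
for _ in range(5000):
    x = x + dt * (-F(x - p))             # u = -F(y) with y = x - p
y_star = np.linalg.solve(A, b)           # F^{-1}(0)
print(x, y_star + p)                     # realized x converges to y* + p

This matches the characterization of the NE set as an affine function of the reference: the equilibrium actions are F^{-1}(0) shifted by p.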

3.3 General Dynamics

In addition to considering that each agent is modelled as a single integrator with additive reference and disturbance, we also consider that each agent may be modelled as a more general dynamical system of the form

P_i :  ẋ_i = f_i(x_i, w_i, u_i),   y_i = h_i(x_i, w_i)     (3.20)

In this case, the control objective remains the same: to find feedbacks u_i such that the overall dynamics of the game converge to a suitably defined Γ_NE. We will address double- and multi-integrator agents in Chapters 5 and 6 and a special class of nonlinear systems in Chapter 8.

In this chapter, we introduced the problem that we will solve in this thesis, namely the problem of NE seeking with output regulation. We also introduced the framework that we will use to solve it. By decomposing the agent dynamics into a plant and finding a suitable feedback, we can use tools from control theory to stabilize the NE set of a game. In the next chapter, we will investigate NE seeking algorithms for integrator plant models with disturbances or references. In the subsequent chapters, we will consider alternative plant models.

Chapter 4

Single-Integrator Plants

In this chapter we will investigate NE seeking algorithms for the plant models introduced in Chapter 3. First, we will investigate a full-information NE seeking algorithm for single-integrator plants subject to a disturbance, (3.10). We will expand this algorithm with a Laplacian-based consensus term for the partial-information case. Next, we will introduce an NE seeking algorithm for plant models with additive references, (3.18), under full information. Similarly, we will combine this with a Laplacian-based consensus algorithm to handle the partial-information case. Each algorithm will be based on the standard gradient-descent algorithm, augmented with a reduced-order observer for the exosystem state. In all of our proofs, we will leverage the input-to-state stability of the resulting agent dynamics induced by the strong monotonicity of the pseudo-gradient.

4.1 Games with Disturbances

First, we consider that each agent can be modelled as an integrator plant with dynamics given in (3.10), or

P_i :  ẋ_i = u_i + d_i,   y_i = x_i     (4.1)

where x_i, u_i, y_i, d_i ∈ R^{n_i} and the disturbance can be generated by a linear exosystem of the form

D_i :  ẇ_i = S_i w_i,   d_i = D_i w_i     (4.2)

where w_i ∈ W_i ⊂ R^{q_i} and W_i is compact. Throughout this thesis, we will assume that (S_i, D_i) is an observable pair. Since the NE of this game is not affected by the disturbance, the NE set is the same as in the standard game set-up, given by

Γ_NE = { x ∈ R^n | ∇_i J_i(x_i, x_{−i}) = 0, ∀i ∈ I }     (4.3)

Under Assumptions 2.2 and 2.3(ii), the NE, x*, is unique. We will investigate both full-information and partial-information NE seeking algorithms for this case.

4.1.1 Full-Information

First, we consider that each agent has full information about the others' actions. The goal then is to design a feedback u_i, for all i ∈ I, using full information, such that x → x*. Our proposed u_i is dynamic and is generated by

ξ̇_i = S_i(K_i x_i + ξ_i) + K_i ∇_i J_i(x_i, x_{−i}),   ∀i ∈ I
u_i = −∇_i J_i(x_i, x_{−i}) − D_i(K_i x_i + ξ_i)     (4.4)

where K_i is chosen such that σ(S_i − K_i D_i) ⊂ C^−. It can be verified that this choice of input satisfies the regulator equations; due to its simplicity, we will not explicitly show the solution, and we do the same throughout Chapters 4-6. This gives the agent dynamics Σ_i as

Σ_i :  ẋ_i = −∇_i J_i(x_i, x_{−i}) − D_i(K_i x_i + ξ_i) + d_i,   ∀i ∈ I
       ξ̇_i = S_i(K_i x_i + ξ_i) + K_i ∇_i J_i(x_i, x_{−i})     (4.5)

which we call dynamic gradient-play for disturbance rejection. Intuitively, (4.5) is the standard gradient-play scheme, (2.31), augmented with a dynamic correction term used to estimate and cancel out the disturbance. It is important to note that this correction term is in the form of a reduced-order observer for w_i.

Theorem 4.1. Consider a game G(I, J_i, R^{n_i}) with full information and agent dynamics given by (4.1),(4.4), or Σ_i, (4.5). Under Assumptions 2.2 and 2.3(ii), the unique NE, x = x*, is globally asymptotically stable for all w(0) ∈ W.

Proof. The stacked dynamics of the closed-loop system are given by

ẇ = Sw
Σ :  ẋ = −F(x) − D(Kx + ξ) + Dw
     ξ̇ = S(Kx + ξ) + KF(x)     (4.6)

where u = col(u_1, …, u_N), S = blkdiag(S_1, …, S_N), D = blkdiag(D_1, …, D_N), K = blkdiag(K_1, …, K_N) and w = col(w_1, …, w_N) ∈ W. Consider the coordinate change ξ → ρ := w − (Kx + ξ). Then, taking the time derivative of ρ, by (4.6),

ρ̇ = Sw − K(−F(x) − D(Kx + ξ) + Dw) − S(Kx + ξ) − KF(x) = (S − KD)ρ

In the new coordinates, the closed-loop dynamics are

ẇ = Sw
ẋ = −F(x) + Dρ
ρ̇ = (S − KD)ρ     (4.7)

Consider the coordinate transformation x → x̃ := x − x*. The dynamics of the (x̃, ρ) subsystem in (4.7) are given as

x̃̇ = −F(x̃ + x*) + Dρ     (4.8)
ρ̇ = (S − KD)ρ     (4.9)

This system can be considered as a cascade of two systems, with (4.9) generating the input for (4.8). Now consider the Lyapunov candidate function for (4.8), V(x̃) = ½‖x̃‖². Taking the time derivative of V(x̃) along solutions of (4.8), using F(x*) = 0 by (2.25) and Assumption 2.3, we get

V̇ = x̃ᵀ(−F(x̃ + x*) + Dρ) = −x̃ᵀ(F(x̃ + x*) − F(x*)) + x̃ᵀDρ ≤ −µ‖x̃‖² + ‖D‖‖x̃‖‖ρ‖

Then, V̇ ≤ −(µ − a)‖x̃‖², ∀‖x̃‖ ≥ ‖D‖‖ρ‖/a > 0, for any 0 < a < µ. Therefore, (4.8) is input-to-state stable relative to the input ρ by Theorem 2.10. Since (4.8) is ISS and (4.9) is an asymptotically stable linear system by (4.4), the origin of (4.8)-(4.9) is globally asymptotically stable by Lemma 2.1, and therefore (x*, 0) of the original system (4.7) is globally asymptotically stable for all w(0) ∈ W.

Compared to distributed optimization, our algorithm cannot exploit summability of the cost functions to decompose the problem into a local minimization problem with an extra consensus term; the coupling in our algorithm is in the cost functions themselves. In addition, the agents in a game are selfish decision-makers and have no incentive to cooperate to reach a global minimum. Compared to multi-agent coordination, our agent dynamics are coupled by the gradient-descent terms. Therefore, we cannot exploit the fact that each agent has its own Lyapunov function, as is typically done in passivity-based coordination control, where the overall Lyapunov function is the sum of the individual ones. Here, we must find a Lyapunov function for the stacked system itself.
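To illustrate Theorem 4.1, the following minimal sketch (an illustration only; the quadratic game, disturbance frequencies, and observer gains K_i are assumed choices, not taken from the thesis) simulates the dynamic gradient-play (4.5) with sinusoidal disturbances generated by (4.2):

# Minimal sketch of dynamic gradient-play for disturbance rejection (4.5).
import numpy as np

# Assumed 2-player quadratic game: F(x) = A x - b, A symmetric PD => strongly monotone.
A = np.array([[2.0, 0.5], [0.5, 2.0]])
b = np.array([2.0, -2.0])
F = lambda x: A @ x - b
x_star = np.linalg.solve(A, b)

omega = [1.0, 3.0]                                 # assumed disturbance frequencies
Sx = [np.array([[0.0, o], [-o, 0.0]]) for o in omega]
Dm = [np.array([1.0, 0.0]) for _ in omega]
K  = [np.array([2.0, 0.0]) for _ in omega]         # places sigma(S_i - K_i D_i) in C^-

x  = np.zeros(2)
xi = [np.zeros(2), np.zeros(2)]
w  = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # w(0) in W
dt = 1e-3
for _ in range(int(20 / dt)):
    g = F(x)
    x_new = x.copy()
    for i in range(2):
        eta = K[i] * x[i] + xi[i]                  # observer state K_i x_i + xi_i
        d = Dm[i] @ w[i]
        x_new[i] += dt * (-g[i] - Dm[i] @ eta + d)     # first line of (4.5)
        xi[i] += dt * (Sx[i] @ eta + K[i] * g[i])      # second line of (4.5)
        w[i]  += dt * (Sx[i] @ w[i])
    x = x_new
print(x, x_star)   # x approaches x* despite the sinusoidal disturbances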

4.1.2 Partial-Information

In this section, we extend the results of the previous section to consider games with information shared over a communication graph, G_c, with Laplacian L. Referring to the representation in Figure 3.2, in this case we assume that each agent has only partial, local decision information, s_i = x_j for j ∈ N_i. Since each agent has only partial information based on the information exchanged over the communication graph, we augment each agent's dynamics with an auxiliary state that provides an estimate of all the other agents' actions. This estimate is then updated by exchanging information with its neighbours. Based on local communication with its neighbours, each agent i computes estimates of all other agents' actions, x^i_{−i} = col(x^i_1, …, x^i_{i−1}, x^i_{i+1}, …, x^i_N) ∈ R^{n−n_i}, and uses these estimates to compute its gradient, ∇_i J_i(x_i, x^i_{−i}). We denote x^i = col(x^i_1, …, x^i_{i−1}, x_i, x^i_{i+1}, …, x^i_N) ∈ R^n and x = col(x^1, …, x^N) ∈ R^{Nn}. We want to design a dynamic feedback such that, in steady state, x^i = x^j for all i, j ∈ I, and such that x converges to x* for all w(0) ∈ W.

The proposed agent dynamics build on the results from Section 4.1.1, with an additional consensus component, as in Section 2.4.3, that drives agents to reach the consensus subspace, where all decision estimates are the same. First, we consider that each agent is modelled as a single integrator. Using the selection matrices (2.32), we have that x_i = R_i x^i, x^i_{−i} = S_i x^i, and x^i = R_iᵀ x_i + S_iᵀ x^i_{−i} (the R_i, S_i in the consensus terms below are these selection matrices). Consider that for each agent (4.1), u_i is generated by

ẋ^i_{−i} = −S_i Σ_{j∈N_i} (x^i − x^j)
ξ̇_i = S_i(K_i x_i + ξ_i) + K_i ( ∇_i J_i(x_i, x^i_{−i}) + R_i Σ_{j∈N_i} (x^i − x^j) )
u_i = −∇_i J_i(x_i, x^i_{−i}) − R_i Σ_{j∈N_i} (x^i − x^j) − D_i(K_i x_i + ξ_i)     (4.10)

where K_i is chosen such that σ(S_i − K_i D_i) ⊂ C^−. Note that (4.10) has a dynamic component ξ_i, as in the full-information case, combined with another dynamic Laplacian-based estimate-consensus component ẋ^i_{−i}, which in steady state should bring all x^i to the consensus subspace, x^i = x^j. With (4.10), the agent dynamics Σ_i are given by

Σ_i :  ẋ_i = −∇_i J_i(x_i, x^i_{−i}) − R_i Σ_{j∈N_i} (x^i − x^j) − D_i(K_i x_i + ξ_i) + d_i
       ẋ^i_{−i} = −S_i Σ_{j∈N_i} (x^i − x^j)
       ξ̇_i = S_i(K_i x_i + ξ_i) + K_i ( ∇_i J_i(x_i, x^i_{−i}) + R_i Σ_{j∈N_i} (x^i − x^j) )     (4.11)

Note that in the full-information case there is no need for estimates x^i_{−i}, and (4.11) reduces to (4.5). This is similar to the graph-based gradient-descent, (2.34), with an observer to cancel out the disturbance.

Theorem 4.2. Consider a game G(I, J_i, R^{n_i}) with partial information communicated over a graph G_c with Laplacian L and agent dynamics given by (4.1),(4.10), or Σ_i, (4.11). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ_2(L) − θ) > θ², then the unique NE, x = x*, is globally asymptotically stable for (4.11) for all w(0) ∈ W. Moreover, each player's estimates converge globally to the NE values, x̄ = 1_N ⊗ x*, for all w(0) ∈ W.

Proof. In stacked form, using F, (2.35), the dynamics Σ_i, (4.11), of all agents i ∈ I, with disturbances

generated by (4.2), can be written as

ẇ = Sw
Σ :  ẋ = −F(x) − RLx − D(Kx + ξ) + Dw
     Sẋ = −SLx
     ξ̇ = S(Kx + ξ) + K(F(x) + RLx)

where R = blkdiag(R_1, …, R_N) and S = blkdiag(S_1, …, S_N) are the stacked selection matrices (not to be confused with the exosystem matrix S), L = L ⊗ I_n, x = col(x^1, …, x^N), and col(x^1_{−1}, …, x^N_{−N}) = Sx from x^i_{−i} = S_i x^i. From x^i = R_iᵀ x_i + S_iᵀ x^i_{−i} it follows that x = Rᵀx + SᵀSx. Thus, we can write the foregoing as

ẇ = Sw
Σ :  ẋ = −RᵀF(x) − Lx + Rᵀ( D(w − (Kx + ξ)) )
     ξ̇ = S(Kx + ξ) + K(F(x) + RLx)     (4.12)

where RᵀR + SᵀS = I_{Nn} was used (from (2.33)). Consider the coordinate transformation ξ → ρ := w − (Kx + ξ). Taking the time derivative of ρ, we get

ρ̇ = Sw − K(−F(x) − D(Kx + ξ) + Dw) − K(−RLx) − S(Kx + ξ) − K(F(x) + RLx) = (S − KD)ρ

In the new coordinate system, the dynamics are given by

ẇ = Sw
ẋ = −RᵀF(x) − Lx + RᵀDρ
ρ̇ = (S − KD)ρ     (4.13)

By shifting the coordinates x → x̃ := x − x̄, where x̄ = 1_N ⊗ x*, the dynamics of the (x̃, ρ) subsystem become

x̃̇ = −RᵀF(x̃ + x̄) − L(x̃ + x̄) + RᵀDρ     (4.14)
ρ̇ = (S − KD)ρ     (4.15)

Note that (4.14)-(4.15) is a cascade of two subsystems, with (4.15) generating the external input for (4.14). We then decompose R^{Nn} as R^{Nn} = C^n ⊕ E^n, where C^n = {1_N ⊗ x | x ∈ R^n} is the consensus subspace and E^n is its orthogonal complement. Any x ∈ R^{Nn} can be decomposed as x = x^∥ + x^⊥, where x^⊥ = P_E x ∈ E^n, x^∥ = P_C x ∈ C^n, and the two projection matrices are P_C = (1/N) 1_N 1_Nᵀ ⊗ I_n, P_E = I_{Nn} − (1/N) 1_N 1_Nᵀ ⊗ I_n. Thus, x^∥ = 1_N ⊗ x̂, for some x̂ ∈ R^n, and since x̄ = 1_N ⊗ x* ∈ C^n, it follows that x̃ = x − x̄ = x̃^∥ + x̃^⊥, where x̃^⊥ = x^⊥, x̃^∥ = 1_N ⊗ (x̂ − x*). Therefore Lx̃^∥ = 0 and λ_2(L)‖x̃^⊥‖² ≤ (x̃^⊥)ᵀLx̃^⊥ for all x̃^⊥ ∈ E^n, by properties of the Laplacian under Assumption 2.4.

Furthermore,

‖x̃‖² = ‖x̃^∥ + x̃^⊥‖² = ‖x̃^∥‖² + ‖x̃^⊥‖²     (4.16)

Take the Lyapunov candidate function V(x̃) = ½‖x̃‖². Taking the time derivative of V along the solutions of (4.14) and using F(x̄) = 0 by (2.36), we get

V̇ = −(x̃^∥ + x̃^⊥)ᵀRᵀ[F(x̃ + x̄) − F(x̄)] − (x̃^∥ + x̃^⊥)ᵀL(x̃^∥ + x̃^⊥ + x̄) + (x̃^∥ + x̃^⊥)ᵀRᵀDρ

Since x̄, x̃^∥ ∈ Null(L) and using F(x̃ + x̄) − F(x̄) = F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄) + F(x̃^∥ + x̄) − F(x̄), we get

V̇ = −(x̃^⊥)ᵀRᵀ[F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄)] − (x̃^⊥)ᵀRᵀ[F(x̃^∥ + x̄) − F(x̄)] − (x̃^⊥)ᵀLx̃^⊥ − (x̃^∥)ᵀRᵀ[F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄)] − (x̃^∥)ᵀRᵀ[F(x̃^∥ + x̄) − F(x̄)] + (x̃^∥ + x̃^⊥)ᵀRᵀDρ

Using (x̃^⊥)ᵀLx̃^⊥ ≥ λ_2(L)‖x̃^⊥‖², F(x̃^∥ + x̄) = F(1_N ⊗ x̂) = F(x̂), F(x̄) = F(1_N ⊗ x*) = F(x*) and Rx̃^∥ = R(1_N ⊗ (x̂ − x*)) = x̂ − x*, this yields

V̇ ≤ −(x̃^⊥)ᵀRᵀ[F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄)] − (x̃^⊥)ᵀRᵀ[F(x̂) − F(x*)] − λ_2(L)‖x̃^⊥‖² − (x̂ − x*)ᵀ[F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄)] − (x̂ − x*)ᵀ[F(x̂) − F(x*)] + (x̃^∥ + x̃^⊥)ᵀRᵀDρ

Using ‖F(x̃^⊥ + x̃^∥ + x̄) − F(x̃^∥ + x̄)‖ ≤ θ‖x̃^⊥‖ by Assumption 2.5, ‖Rx̃^⊥‖ ≤ ‖R‖‖x̃^⊥‖, ‖F(x̂) − F(x*)‖ ≤ θ‖x̂ − x*‖, and (x̂ − x*)ᵀ[F(x̂) − F(x*)] ≥ µ‖x̂ − x*‖² by Assumption 2.3(ii), together with ‖x̂ − x*‖ = (1/√N)‖x̃^∥‖, we can write

V̇ ≤ −[‖x̃^∥‖ ‖x̃^⊥‖] [ µ/N  −θ/√N ; −θ/√N  λ_2(L) − θ ] [‖x̃^∥‖ ; ‖x̃^⊥‖] + ‖x̃^∥ + x̃^⊥‖ ‖RᵀD‖ ‖ρ‖

Then, for any ‖x̃‖ ≥ ‖RᵀD‖‖ρ‖/a, where a > 0,

V̇ ≤ −[‖x̃^∥‖ ‖x̃^⊥‖] [ µ/N  −θ/√N ; −θ/√N  λ_2(L) − θ ] [‖x̃^∥‖ ; ‖x̃^⊥‖] + a‖x̃^∥ + x̃^⊥‖²

and, using (4.16), we can write

V̇ ≤ −[‖x̃^∥‖ ‖x̃^⊥‖] [ µ/N − a  −θ/√N ; −θ/√N  λ_2(L) − θ − a ] [‖x̃^∥‖ ; ‖x̃^⊥‖]

For the x̃-subsystem in (4.14) to be input-to-state stable, we need the matrix on the right-hand side to be positive definite. Thus, there needs to exist an a > 0 such that a < µ/N, a < λ_2(L) − θ, and (µ/N − a)(λ_2(L) − θ − a) − θ²/N > 0. By examining the signs of the roots of its characteristic polynomial, we can show that the matrix is positive definite when µ(λ_2(L) − θ) > θ² and a is in the intersection of the above intervals, which is guaranteed to be nonempty if µ(λ_2(L) − θ) > θ². Then V̇ ≤ −W(x̃), ∀‖x̃‖ ≥ ‖RᵀD‖‖ρ‖/a > 0, where W(x̃) is a positive definite function. Therefore, the x̃-subsystem in (4.14) is ISS with respect to ρ by Theorem 2.10. Since ρ̇ = (S − KD)ρ is asymptotically stable by (4.10), it follows that the origin of (4.14)-(4.15) is globally asymptotically stable by Lemma 2.1, and therefore (x̄, 0) of the original system, (4.13), is globally asymptotically stable.

Remark 4.1. Local results follow if Assumption 2.5 holds only locally around x̄ = 1_N ⊗ x*. We note that the class of quadratic games satisfies Assumption 2.5 globally.

For input-to-state stability, we need the above matrix to be positive definite. Thus, we need to find a > 0 such that a < µ/N, a < λ_2(L) − θ, and (µ/N − a)(λ_2(L) − θ − a) − θ²/N > 0. We will now derive conditions under which the intersection of these sets is non-empty. Examining the last inequality,

(µ/N − a)(λ_2(L) − θ − a) − θ²/N > 0
⇔ (1/N)µ(λ_2(L) − θ) − (1/N)θ² − (µ/N + λ_2(L) − θ)a + a² > 0

Solving for the roots of the polynomial,

a = ½ [ (µ/N + λ_2(L) − θ) ± √( (µ/N + λ_2(L) − θ)² − (4/N)[µ(λ_2(L) − θ) − θ²] ) ]

Consider the root associated with the minus sign, a_0, and require that it be greater than zero; if a_0 > 0, then all values 0 < a < a_0 satisfy the inequality and the intersection of all the sets will be non-empty:

(µ/N + λ_2(L) − θ) − √( (µ/N + λ_2(L) − θ)² − (4/N)[µ(λ_2(L) − θ) − θ²] ) > 0

We also want the root to be real, so we get

(µ/N + λ_2(L) − θ)² > (µ/N + λ_2(L) − θ)² − (4/N)[µ(λ_2(L) − θ) − θ²] ≥ 0

which is the same requirement as

(µ/N + λ_2(L) − θ)² ≥ (4/N)[µ(λ_2(L) − θ) − θ²] > 0

Examining the right part of the inequality, we get (4/N)[µ(λ_2(L) − θ) − θ²] > 0, which is true if µ(λ_2(L) − θ) > θ². Examining the left inequality,

(µ/N + λ_2(L) − θ)² ≥ (4/N)[µ(λ_2(L) − θ) − θ²]
⇔ (λ_2(L) − θ)² − (2/N)µ(λ_2(L) − θ) + (µ/N)² + (4/N)θ² ≥ 0
⇔ (λ_2(L) − θ − µ/N)² + (4/N)θ² ≥ 0

which is true for any choices of λ_2(L), θ, µ, and N. Therefore, if we choose

a ∈ (0, µ/N) ∩ (0, λ_2(L) − θ) ∩ (0, ½ [ (µ/N + λ_2(L) − θ) − √( (µ/N + λ_2(L) − θ)² − (4/N)[µ(λ_2(L) − θ) − θ²] ) ])

then V̇ ≤ −W(x̃), where W(x̃) is a positive definite function.

4.2 Games with References

In this section, we investigate NE seeking algorithms for the case when the agents can be modelled as a plant with an additive reference signal at the output, (3.18), or

P_i :  ẋ_i = u_i,   y_i = x_i − p_i     (4.17)

where p_i can be generated by a linear exosystem

D_i :  ẇ_i = S_i w_i,   p_i = C_i w_i     (4.18)

Again, we assume (S_i, C_i) is observable. We note that the NE set for this set-up is a function of the exosystem state w instead of the usual singleton, x*. The NE set can be described by

Γ_NE = { (x, w) ∈ R^n × W | ∇_i J_i(x_i − C_i w_i, x_{−i} − C_{−i} w_{−i}) = 0, ∀i ∈ I }     (4.19)

and therefore is not merely a singleton, x*. However, under Assumptions 2.2 and 2.3(ii), this set has an alternative characterization given by

Γ_NE = { (x, w) ∈ R^n × W | x − Cw = F^{−1}(0) }

where F^{−1}(0) is unique under Assumptions 2.2 and 2.3(ii), due to the uniqueness of the NE induced by those assumptions. This will be a more useful characterization since it expresses the NE as an affine function of w.

4.2.1 Full-Information

First, we will consider that each agent has full information about each of the others' realized actions, given by x_j − p_j for all j ∈ I, as this information is necessary to compute its partial gradient. This may seem like more information than used in Section 4.1; however, since x_j and p_j need not be communicated separately, it results in the same amount of information being passed between agents. In addition to the full-information assumption, we assume that each agent has separate measurements of its state, x_i, and reference signal, p_i, so that the measured output, y_i^m, is given by

y_i^m = col(x_i, p_i)     (4.20)

In our design we will use a reduced-order observer for w_i and thus will make use of this assumption. If this is not the case, a full-order observer can be used instead, and similar results can be shown. Our proposed u_i is dynamic and is generated by

ξ̇_i = S_i ξ_i + K_i(p_i − C_i ξ_i),   ∀i ∈ I
u_i = −∇_i J_i(x_i − p_i, x_{−i} − p_{−i}) + C_i S_i ξ_i     (4.21)

where K_i is chosen such that σ(S_i − K_i C_i) ⊂ C^−. This gives the agent dynamics Σ_i as

Σ_i :  ẋ_i = −∇_i J_i(x_i − p_i, x_{−i} − p_{−i}) + C_i S_i ξ_i,   ∀i ∈ I
       ξ̇_i = S_i ξ_i + K_i(p_i − C_i ξ_i)     (4.22)

which we call dynamic gradient-play for reference tracking. Similar to (4.5), (4.22) is a gradient-play scheme augmented with a dynamic correction term, here used to estimate the full exosystem state in order to track the reference signal. Again, this is in the form of an observer for w_i.

Theorem 4.3. Consider a game G(I, J_i, R^{n_i}) with full information and agent dynamics given by (4.17),(4.21), or Σ_i, (4.22). Under Assumptions 2.2 and 2.3(ii), the NE set (4.19) is globally asymptotically stable for all w(0) ∈ W.
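Before the proof, a minimal numerical sketch of (4.22) (an illustration only; the quadratic game, reference frequencies, and gains K_i are assumed choices, not taken from the thesis); the tracking-error coordinates x − p should converge to y* = F^{−1}(0):

# Minimal sketch of dynamic gradient-play for reference tracking (4.22).
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 2.0]]); b = np.array([2.0, -2.0])
F = lambda y: A @ y - b
y_star = np.linalg.solve(A, b)             # F^{-1}(0)

om = [1.0, 2.0]                            # assumed reference frequencies
Sx = [np.array([[0.0, o], [-o, 0.0]]) for o in om]
C  = [np.array([1.0, 0.0]) for _ in om]
K  = [np.array([2.0, 0.0]) for _ in om]    # places sigma(S_i - K_i C_i) in C^-

x  = np.zeros(2)
xi = [np.zeros(2), np.zeros(2)]            # observer states for w_i
w  = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
dt = 1e-3
for _ in range(int(40 / dt)):
    p = np.array([C[i] @ w[i] for i in range(2)])
    g = F(x - p)
    for i in range(2):
        x[i]  += dt * (-g[i] + C[i] @ (Sx[i] @ xi[i]))           # first line of (4.22)
        xi[i] += dt * (Sx[i] @ xi[i] + K[i] * (p[i] - C[i] @ xi[i]))
        w[i]  += dt * (Sx[i] @ w[i])
p = np.array([C[i] @ w[i] for i in range(2)])
print(x - p, y_star)   # the realized actions track p shifted by y*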

Proof. The stacked dynamics of the closed-loop system are given by

ẇ = Sw
Σ :  ẋ = −F(x − Cw) + CSξ
     ξ̇ = Sξ + KC(w − ξ)     (4.23)

where u = col(u_1, …, u_N), S = blkdiag(S_1, …, S_N), C = blkdiag(C_1, …, C_N), K = blkdiag(K_1, …, K_N) and w = col(w_1, …, w_N) ∈ W. Consider the coordinate change ξ → ρ := w − ξ. Then, taking the time derivative of ρ, by (4.23),

ρ̇ = Sw − Sξ − KC(w − ξ) = (S − KC)ρ

In the new coordinates, the closed-loop dynamics are

ẇ = Sw
ẋ = −F(x − Cw) − CS(ρ − w)
ρ̇ = (S − KC)ρ     (4.24)

Consider the coordinate transformation x → ỹ := x − Cw − y*, where y* = F^{−1}(0). The dynamics of the (ỹ, ρ) subsystem in (4.24) are given as

ỹ̇ = −F(ỹ + y*) − CSρ     (4.25)
ρ̇ = (S − KC)ρ     (4.26)

This system can be considered as a cascade of two systems, with (4.26) generating the input for (4.25). Now consider the Lyapunov candidate function for (4.25), V(ỹ) = ½‖ỹ‖². Taking the time derivative of V(ỹ) along solutions of (4.25), using F(y*) = 0 by (2.25) and Assumption 2.3, we get

V̇ = ỹᵀ(−F(ỹ + y*) − CSρ) = −ỹᵀ(F(ỹ + y*) − F(y*)) − ỹᵀCSρ ≤ −µ‖ỹ‖² + ‖CS‖‖ỹ‖‖ρ‖

Then, V̇ ≤ −(µ − a)‖ỹ‖², ∀‖ỹ‖ ≥ ‖CS‖‖ρ‖/a > 0, for any 0 < a < µ. Therefore, (4.25) is input-to-state stable relative to the input ρ by Theorem 2.10. Since (4.25) is ISS and (4.26) is an asymptotically stable linear system by (4.21), the origin of (4.25)-

(4.26) is globally asymptotically stable by Lemma 2.1, and therefore the set x = y* + Cw of the original system (4.24) is globally asymptotically stable for all w(0) ∈ W.

4.2.2 Partial-Information

In this section, we extend the results of Section 4.2.1 to games with information shared over a communication graph, G_c, with Laplacian L. As in Section 4.1.2, we will use a Laplacian-based consensus algorithm introduced in Section 2.4.3. We assume that each agent communicates its realized action, x_i − p_i, with its neighbours. Referring to the representation in Figure 3.3, we have that s_i = x_j − p_j for j ∈ N_i. Since each agent has only partial information based on the information exchanged over the communication graph, we augment each agent's dynamics with an auxiliary state that provides an estimate of all the other agents' realized actions. This estimate is then updated by exchanging information with its neighbours. Based on local communication with its neighbours, each agent i computes estimates of all other agents' realized actions, y^i_{−i} = col(y^i_1, …, y^i_{i−1}, y^i_{i+1}, …, y^i_N) ∈ R^{n−n_i}, and uses these estimates to compute its gradient, ∇_i J_i(x_i − p_i, y^i_{−i}). We denote y^i = col(y^i_1, …, y^i_{i−1}, x_i − p_i, y^i_{i+1}, …, y^i_N) ∈ R^n and y = col(y^1, …, y^N) ∈ R^{Nn}. We want to design a dynamic feedback such that, in steady state, y^i = y^j for all i, j ∈ I, and such that x converges to the NE set (4.19) for all exosystem states, w.

The proposed agent dynamics build upon those presented in Sections 4.1.2 and 4.2.1. Using the selection matrices (2.32), we have that x_i − p_i = R_i y^i, y^i_{−i} = S_i y^i, and y^i = R_iᵀ(x_i − p_i) + S_iᵀ y^i_{−i}. Consider that for each agent (4.17), u_i is generated by

ẏ^i_{−i} = −S_i Σ_{j∈N_i} (y^i − y^j)
ξ̇_i = S_i ξ_i + K_i(p_i − C_i ξ_i)
u_i = −∇_i J_i(x_i − p_i, y^i_{−i}) − R_i Σ_{j∈N_i} (y^i − y^j) + C_i S_i ξ_i     (4.27)

where K_i is chosen such that σ(S_i − K_i C_i) ⊂ C^−. Note that (4.27) has a dynamic component ξ_i, as in the full-information case, combined with another dynamic Laplacian-based estimate-consensus component ẏ^i_{−i}, which in steady state should bring all y^i to the consensus subspace, y^i = y^j. With (4.27), the agent dynamics Σ_i are then given by

Σ_i :  ẋ_i = −∇_i J_i(x_i − p_i, y^i_{−i}) − R_i Σ_{j∈N_i} (y^i − y^j) + C_i S_i ξ_i
       ẏ^i_{−i} = −S_i Σ_{j∈N_i} (y^i − y^j)
       ξ̇_i = S_i ξ_i + K_i(p_i − C_i ξ_i)     (4.28)

Note that in the full-information case there is no need for estimates y^i_{−i}, and (4.28) reduces to (4.22).

Theorem 4.4. Consider a game G(I, J_i, R^{n_i}) with partial information communicated over a graph G_c with Laplacian L and agent dynamics given by (4.17),(4.27), or Σ_i, (4.28). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ_2(L) − θ) > θ², then the NE set (4.19) is globally asymptotically stable for (4.28) for all w(0) ∈ W. Moreover, each player's estimates converge globally to the NE values, ȳ = 1_N ⊗ F^{−1}(0), for all w(0) ∈ W.

Proof. In stacked form, using F, (2.35), the dynamics (4.28) of all agents i ∈ I, with references generated by (4.18), can be written as

ẇ = Sw
Σ :  ẋ = −F(y) − RLy + CSξ
     Sẏ = −SLy
     ξ̇ = Sξ + KC(w − ξ)

where R = blkdiag(R_1, …, R_N) and S = blkdiag(S_1, …, S_N) are the stacked selection matrices, L = L ⊗ I_n, y = col(y^1, …, y^N), and col(y^1_{−1}, …, y^N_{−N}) = Sy from y^i_{−i} = S_i y^i. From y^i = R_iᵀ(x_i − p_i) + S_iᵀ y^i_{−i} it follows that y = Rᵀ(x − Cw) + SᵀSy. Thus, we can write the foregoing as

ẇ = Sw
Σ :  ẏ = −RᵀF(y) − Ly − RᵀCS(w − ξ)
     ξ̇ = Sξ + KC(w − ξ)     (4.29)

where RᵀR + SᵀS = I_{Nn} was used (from (2.33)). Consider the coordinate transformation ξ → ρ := w − ξ. Taking the time derivative of ρ, we get

ρ̇ = Sw − Sξ − KC(w − ξ) = (S − KC)ρ

In the new coordinate system, the dynamics are given by

ẇ = Sw
ẏ = −RᵀF(y) − Ly − RᵀCSρ
ρ̇ = (S − KC)ρ     (4.30)

By shifting the coordinates y → ỹ := y − ȳ, where ȳ = 1_N ⊗ y* and y* = F^{−1}(0), the dynamics of the (ỹ, ρ) subsystem become

ỹ̇ = −RᵀF(ỹ + ȳ) − L(ỹ + ȳ) − RᵀCSρ     (4.31)
ρ̇ = (S − KC)ρ     (4.32)

Note that (4.31)-(4.32) is a cascade of two subsystems, with (4.32) generating the external input for (4.31). We then decompose R^{Nn} as R^{Nn} = C^n ⊕ E^n, where C^n = {1_N ⊗ y | y ∈ R^n} is the consensus subspace and E^n is its orthogonal complement. Any y ∈ R^{Nn} can be decomposed as y = y^∥ + y^⊥, where y^⊥ = P_E y ∈ E^n, y^∥ = P_C y ∈ C^n, and the two projection matrices are P_C = (1/N) 1_N 1_Nᵀ ⊗ I_n, P_E = I_{Nn} − (1/N) 1_N 1_Nᵀ ⊗ I_n. Thus, y^∥ = 1_N ⊗ ŷ for some ŷ ∈ R^n, and since ȳ = 1_N ⊗ y* ∈ C^n, it follows that ỹ = y − ȳ = ỹ^∥ + ỹ^⊥, where ỹ^⊥ = y^⊥, ỹ^∥ = 1_N ⊗ (ŷ − y*). Therefore Lỹ^∥ = 0 and λ_2(L)‖ỹ^⊥‖² ≤ (ỹ^⊥)ᵀLỹ^⊥ for all ỹ^⊥ ∈ E^n, by properties of the Laplacian under Assumption 2.4. Furthermore,

‖ỹ‖² = ‖ỹ^∥ + ỹ^⊥‖² = ‖ỹ^∥‖² + ‖ỹ^⊥‖²     (4.33)

Take the Lyapunov candidate function V(ỹ) = ½‖ỹ‖². Taking the time derivative of V along the solutions of (4.31) and using F(ȳ) = 0 by (2.36), we get

V̇ = −(ỹ^∥ + ỹ^⊥)ᵀRᵀ[F(ỹ + ȳ) − F(ȳ)] − (ỹ^∥ + ỹ^⊥)ᵀL(ỹ^∥ + ỹ^⊥ + ȳ) − (ỹ^∥ + ỹ^⊥)ᵀRᵀCSρ

Since ȳ, ỹ^∥ ∈ Null(L) and using F(ỹ + ȳ) − F(ȳ) = F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ) + F(ỹ^∥ + ȳ) − F(ȳ), we get

V̇ = −(ỹ^⊥)ᵀRᵀ[F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ)] − (ỹ^⊥)ᵀRᵀ[F(ỹ^∥ + ȳ) − F(ȳ)] − (ỹ^⊥)ᵀLỹ^⊥ − (ỹ^∥)ᵀRᵀ[F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ)] − (ỹ^∥)ᵀRᵀ[F(ỹ^∥ + ȳ) − F(ȳ)] − (ỹ^∥ + ỹ^⊥)ᵀRᵀCSρ

Using (ỹ^⊥)ᵀLỹ^⊥ ≥ λ_2(L)‖ỹ^⊥‖², F(ỹ^∥ + ȳ) = F(1_N ⊗ ŷ) = F(ŷ), F(ȳ) = F(1_N ⊗ y*) = F(y*) and Rỹ^∥ = R(1_N ⊗ (ŷ − y*)) = ŷ − y*, this yields

V̇ ≤ −(ỹ^⊥)ᵀRᵀ[F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ)] − (ỹ^⊥)ᵀRᵀ[F(ŷ) − F(y*)] − λ_2(L)‖ỹ^⊥‖² − (ŷ − y*)ᵀ[F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ)] − (ŷ − y*)ᵀ[F(ŷ) − F(y*)] − (ỹ^∥ + ỹ^⊥)ᵀRᵀCSρ

Using ‖F(ỹ^⊥ + ỹ^∥ + ȳ) − F(ỹ^∥ + ȳ)‖ ≤ θ‖ỹ^⊥‖ by Assumption 2.5, ‖Rỹ^⊥‖ ≤ ‖R‖‖ỹ^⊥‖, ‖F(ŷ) − F(y*)‖ ≤ θ‖ŷ − y*‖, and (ŷ − y*)ᵀ[F(ŷ) − F(y*)] ≥ µ‖ŷ − y*‖² by Assumption 2.3, together with ‖ŷ − y*‖ = (1/√N)‖ỹ^∥‖, we can write

V̇ ≤ −[‖ỹ^∥‖ ‖ỹ^⊥‖] [ µ/N  −θ/√N ; −θ/√N  λ_2(L) − θ ] [‖ỹ^∥‖ ; ‖ỹ^⊥‖] + ‖ỹ^∥ + ỹ^⊥‖ ‖RᵀCS‖ ‖ρ‖

Then, for any ‖ỹ‖ ≥ ‖RᵀCS‖‖ρ‖/a, where a > 0,

V̇ ≤ −[‖ỹ^∥‖ ‖ỹ^⊥‖] [ µ/N  −θ/√N ; −θ/√N  λ_2(L) − θ ] [‖ỹ^∥‖ ; ‖ỹ^⊥‖] + a‖ỹ^∥ + ỹ^⊥‖²

and using (4.33), we can write

V̇ ≤ −[‖ỹ^∥‖ ‖ỹ^⊥‖] [ µ/N − a  −θ/√N ; −θ/√N  λ_2(L) − θ − a ] [‖ỹ^∥‖ ; ‖ỹ^⊥‖]

For (4.31) to be input-to-state stable, we need the matrix on the right-hand side to be positive definite. By the same argument used in the proof of Theorem 4.2, there exists a > 0 such that the above matrix is positive definite. Then V̇ ≤ −W(ỹ), ∀‖ỹ‖ ≥ ‖RᵀCS‖‖ρ‖/a ≥ 0, where W(ỹ) is a positive definite function. Therefore, the ỹ-subsystem in (4.31) is ISS with respect to ρ by Theorem 2.10. Since (4.32) is asymptotically stable by (4.27), it follows that the origin of (4.31)-(4.32) is asymptotically stable by Lemma 2.1, and therefore the set x = y* + Cw of the original system, (4.30), is asymptotically stable for all w(0) ∈ W.

In this chapter we investigated agent dynamics for NE seeking in games with disturbances or references. In both cases, full-information as well as partial-information algorithms were discussed and convergence proven. The results of this chapter show the effectiveness of the framework proposed in this thesis, that is, to decompose the agent dynamics into a plant with a suitable feedback. In the next chapter, we will see how we can continue to use this framework by considering plants with higher-order dynamics.
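As a numerical illustration of the partial-information scheme (4.11) (a sketch only; the weakly coupled quadratic game is an assumed choice so that µ(λ_2(L) − θ) > θ² holds for the two-node graph, where λ_2(L) = 2):

# Minimal sketch of partial-information dynamic gradient-play (4.11).
import numpy as np

# Assumed weakly coupled quadratic game: mu ~ 0.45, theta ~ 0.55, lambda_2 = 2,
# so mu*(lambda_2 - theta) = 0.65 > theta^2 = 0.30 and Theorem 4.2 applies.
A = np.array([[0.5, 0.05], [0.05, 0.5]]); b = np.array([0.5, -0.5])
x_star = np.linalg.solve(A, b)

om = [1.0, 3.0]
Sx = [np.array([[0.0, o], [-o, 0.0]]) for o in om]   # exosystem matrices S_i
Dm = [np.array([1.0, 0.0]) for _ in om]
K  = [np.array([2.0, 0.0]) for _ in om]              # sigma(S_i - K_i D_i) in C^-

xh = [np.zeros(2), np.zeros(2)]                      # x^1, x^2: full-profile estimates
xi = [np.zeros(2), np.zeros(2)]
w  = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
dt = 1e-3
for _ in range(int(60 / dt)):
    cons = [xh[0] - xh[1], xh[1] - xh[0]]            # Laplacian terms sum_j (x^i - x^j)
    new = [h.copy() for h in xh]
    for i in range(2):
        g   = A[i] @ xh[i] - b[i]                    # partial gradient at the estimate
        eta = K[i] * xh[i][i] + xi[i]
        d   = Dm[i] @ w[i]
        new[i][i]     += dt * (-g - cons[i][i] - Dm[i] @ eta + d)   # own action
        new[i][1 - i] += dt * (-cons[i][1 - i])                     # estimate of the other
        xi[i] += dt * (Sx[i] @ eta + K[i] * (g + cons[i][i]))
        w[i]  += dt * (Sx[i] @ w[i])
    xh = new
print(xh[0], xh[1], x_star)   # both estimate vectors approach x*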

Chapter 5

NE Seeking For Double Integrator Plants

In this chapter we continue to expand the framework presented in the previous chapters. We will consider replacing the plant model P_i of each agent with a double integrator. First we will consider the case without any disturbances, and then we consider that each agent has an additive disturbance. In both of these set-ups, we will investigate full-information algorithms as well as partial-information ones. Our motivation for considering these types of plants is two-fold. First, we might consider that the agents in the game have some inherent dynamics, such as double-integrator robots playing a game wherein the cost functions are functions of their positions. Second, we may want to consider a higher-order plant in order to obtain better convergence rates for the overall dynamics, as is done extensively in the optimization literature, e.g., the heavy-ball method.

5.1 Games without Disturbances

First, we will investigate games without disturbances. We will consider that each agent can be modelled as a double-integrator plant

P_i :  ẋ^1_i = x^2_i
       ẋ^2_i = u_i
       y_i = x^1_i     (5.1)

where x_i = col(x^1_i, x^2_i) ∈ R^{2n_i}, u_i, y_i ∈ R^{n_i}. Each agent has a cost function that depends on all the agents' actions, x^1, but not on their whole state vector; therefore, the cost of each agent is given by J_i(x^1_i, x^1_{−i}). In these cases it is important to note that, by extending the state space, we have to consider that the NE set of the game also changes. Now the NE set involves the velocities as well and is thus given by

Γ_NE = { (x^1, x^2) ∈ R^n × R^n | ∇_i J_i(x^1_i, x^1_{−i}) = 0, x^2_i = 0, ∀i ∈ I }     (5.2)

Therefore we can characterize the NE of the system as x* = (x^{1*}, 0) such that

F(x^{1*}) = 0     (5.3)
x^{2*} = 0     (5.4)

Under Assumptions 2.2 and 2.3, x^{1*} is unique. Consider the similar set

Γ_NE = { (x^1, x^2) ∈ R^n × R^n | ∇_i J_i(x^1_i + Δ_i x^2_i, x^1_{−i} + Δ_{−i} x^2_{−i}) = 0, x^2_i = 0, ∀i ∈ I }

for appropriately sized matrices Δ_i, Δ_{−i}, and note that it reduces to (5.2) since x^2 = 0 on the set. Therefore, instead of each agent minimizing over just its own action, x^1_i, it can minimize over a linear combination of its states, with the constraint that at equilibrium x^2_i = 0. We will approach the problem with this method.

5.1.1 Full-Information

First we consider the full-information case and assume that each agent can share its state with all other agents. Our design will be based upon gradient descent with an extra stabilizing term on x^2_i. We consider the following feedback

u_i = −∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) − (1/b_i) x^2_i     (5.5)

where b_i > 0. This leads to agent dynamics

Σ_i :  ẋ^1_i = x^2_i
       ẋ^2_i = −∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) − (1/b_i) x^2_i     (5.6)

The intuition behind calculating the gradient at a point that is a combination of x^1 and x^2 is that gradient descent works well for integrator systems, i.e., ones with unit relative degree. By creating a fictitious output x^1_i + b_i x^2_i, we decrease the relative degree of each agent to one. This creates a hyperplane x^1 + Bx^2 − x^{1*} = 0, where B = blkdiag(b_1 I_{n_1}, …, b_N I_{n_N}), on which the pseudo-gradient map is zero. Under Assumptions 2.2 and 2.3(i), the hyperplane is attractive for (5.6). We then stabilize x^2 = 0 using a feedback that renders this hyperplane invariant for (5.6), thereby stabilizing x = x*. Intuitively, each agent makes a prediction that is its current position plus scaled velocity and computes its partial gradient at this future point.

Although this may seem like an unnecessary step, as it adds some extra components on top of what would be a heavy-ball-type algorithm, we will see in the next chapter that this framework allows us to extend the algorithm to games where each plant is modelled as a higher-order integrator, with differing orders between the agents. We also note that the communication between agents in this case is not increased at all compared to the results in Chapter 4, as each agent need not communicate x_i fully, but only x^1_i + b_i x^2_i, which is of the same dimension as x^1_i. Future work would involve creating an algorithm for stabilizing the NE where only position information, x^1, is used in the computation of the partial-gradient feedback of each agent.

We note that this algorithm, as well as its extension in the next chapter, is similar to a passivity-

based group coordination design, e.g., [19]. If we examine each plant (5.1), together with the damping term −(1/b_i)x^2_i from (5.5), with output γ_i, and take the storage function V_i(x_i) = ½ x_iᵀ P_i x_i, where

P_i = [ (1/b_i) I   I ; I   b_i I ]

then taking the time derivative along the solutions of (5.1) yields

V̇_i = ((1/b_i)x^1_i + x^2_i)ᵀ ẋ^1_i + (x^1_i + b_i x^2_i)ᵀ (u_i − (1/b_i)x^2_i) = (x^1_i + b_i x^2_i)ᵀ u_i

Therefore, by Definition 2.15, (5.1) is passive with output γ_i = x^1_i + b_i x^2_i. However, we stress that the feedback ∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) is not necessarily the gradient of any single function, as required in [19]; this only occurs in the very special case of potential games. Therefore, each agent is not a passive system on its own once the feedback is added, and we need to use slightly different methods of analysis.

Theorem 5.1. Consider a game G(I, J_i, R^{n_i}) with agent dynamics given by (5.1),(5.5), or Σ_i, (5.6). Under Assumptions 2.2 and 2.3(i), (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable for (5.6).

Proof. From (5.6), the stacked dynamics are

Σ :  ẋ^1 = x^2
     ẋ^2 = −F(x^1 + Bx^2) − B^{−1}x^2     (5.7)

Consider the coordinate transformation x^1 → γ := x^1 + Bx^2. Taking the time derivative of γ yields

γ̇ = ẋ^1 + Bẋ^2 = x^2 + B(−F(x^1 + Bx^2) − B^{−1}x^2) = −BF(γ)

We then shift the coordinates by taking γ̃ := γ − x^{1*}. The dynamics (5.7) in the new coordinates are then given by

γ̃̇ = −BF(γ̃ + x^{1*})     (5.8)
ẋ^2 = −F(γ̃ + x^{1*}) − B^{−1}x^2     (5.9)

We decompose (5.8)-(5.9) into a cascade, with (5.8) generating the input for (5.9). Consider the Lyapunov candidate function for (5.8)

V_1(γ̃) = ½ γ̃ᵀ B^{−1} γ̃     (5.10)

Taking the derivative of (5.10) along the solutions of (5.8) yields V̇_1 = −γ̃ᵀF(γ̃ + x^{1*})

Using F(x^{1*}) = 0 from (5.3) and Assumption 2.3(i), we get

V̇_1 = −(γ̃ + x^{1*} − x^{1*})ᵀ(F(γ̃ + x^{1*}) − F(x^{1*})) < 0,   ∀γ̃ ≠ 0

Therefore V̇_1 is negative definite and, by Theorem 2.4, γ̃ = 0 is globally asymptotically stable. What is left to show is that (5.9) is ISS with respect to γ̃. Consider the Lyapunov candidate function

V_2(x^2) = ½‖x^2‖²     (5.11)

Taking the derivative of (5.11) along the solutions of (5.9) yields

V̇_2 = −(x^2)ᵀB^{−1}x^2 − (x^2)ᵀF(γ̃ + x^{1*}) ≤ −(1/b_m)‖x^2‖² + ‖x^2‖‖F(γ̃ + x^{1*})‖

where b_m = max_{i∈I} b_i. Using (5.3) and Assumption 2.3(i), we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + ‖x^2‖‖F(γ̃ + x^{1*}) − F(x^{1*})‖ ≤ −(1/b_m)‖x^2‖² + ‖x^2‖ α_x(‖γ̃‖) ≤ −((1/b_m) − a)‖x^2‖²,   ∀‖x^2‖ ≥ α_x(‖γ̃‖)/a > 0

where a < 1/b_m. Therefore, by Theorem 2.10, (5.9) is ISS with input γ̃, and since the origin of (5.8) is globally asymptotically stable, (γ̃, x^2) = (0, 0) is globally asymptotically stable and the NE (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable for Σ_i, (5.6).

5.1.2 Partial-Information

To extend the previous results to the partial-information case, we will use a similar Laplacian-based consensus scheme as presented in Section 4.1.2. However, instead of each agent estimating the others' actions, x^1_{−i}, they will estimate the combinations of states used to compute the gradient in (5.5). They will then share these estimates, as well as their own combinations, with their neighbours. Therefore, each agent i computes estimates γ^i_{−i} = col(γ^i_1, …, γ^i_{i−1}, γ^i_{i+1}, …, γ^i_N) ∈ R^{n−n_i} and uses these estimates to compute its gradient, ∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}). We denote γ^i = col(γ^i_1, …, γ^i_{i−1}, x^1_i + b_i x^2_i, γ^i_{i+1}, …, γ^i_N) ∈ R^n and γ = col(γ^1, …, γ^N) ∈ R^{Nn}. We want to design a dynamic feedback such that, in steady state, γ^i = γ^j for all i, j ∈ I, and such that x converges to x* as defined in (5.3),(5.4). We consider the following dynamic feedback

γ̇^i_{−i} = −S_i Σ_{j∈N_i} (γ^i − γ^j)
u_i = −∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) − (1/b_i) x^2_i − R_i Σ_{j∈N_i} (γ^i − γ^j)     (5.12)

where b_i > 0. This leads to agent dynamics

Σ_i :  γ̇^i_{−i} = −S_i Σ_{j∈N_i} (γ^i − γ^j),   ∀i ∈ I
       ẋ^1_i = x^2_i
       ẋ^2_i = −∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) − (1/b_i) x^2_i − R_i Σ_{j∈N_i} (γ^i − γ^j)     (5.13)

Theorem 5.2. Consider a game G(I, J_i, R^{n_i}) with partial information communicated over a graph G_c with Laplacian L and agent dynamics given by (5.1),(5.12), or Σ_i, (5.13). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ_2(L) − θ) > θ², then the unique NE, (x^1, x^2) = (x^{1*}, 0), is globally asymptotically stable for (5.13). Moreover, each player's estimates converge globally to the NE values, γ̄ = 1_N ⊗ x^{1*}.

Proof. The stacked dynamics of the system are given by

Σ :  Sγ̇ = −SLγ
     ẋ^1 = x^2
     ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ     (5.14)

Consider the coordinate transformation x^1 → Rγ. In the new coordinates, the dynamics are given by

Sγ̇ = −SLγ
Rγ̇ = −BF(γ) − BRLγ
ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ

Using the properties of R and S,

γ̇ = −RᵀBF(γ) − (RᵀBR + SᵀS)Lγ
ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ     (5.15)

Consider the coordinate transformation γ → γ̃ := γ − γ̄, where γ̄ = 1_N ⊗ x^{1*}. In the new coordinates, using Lγ̄ = 0, the dynamics are given by

γ̃̇ = −RᵀBF(γ̃ + γ̄) − (RᵀBR + SᵀS)Lγ̃     (5.16)
ẋ^2 = −B^{−1}x^2 − F(γ̃ + γ̄) − RLγ̃     (5.17)

We then decompose R^{Nn} as R^{Nn} = C^n ⊕ E^n, where C^n = {1_N ⊗ x | x ∈ R^n} is the consensus subspace and E^n is its orthogonal complement. Any γ ∈ R^{Nn} can be decomposed as γ = γ^∥ + γ^⊥, where γ^⊥ = P_E γ ∈ E^n, γ^∥ = P_C γ ∈ C^n, and the two projection matrices are P_C = (1/N) 1_N 1_Nᵀ ⊗ I_n, P_E = I_{Nn} − (1/N) 1_N 1_Nᵀ ⊗ I_n. Thus, γ^∥ = 1_N ⊗ γ̂, for some γ̂ ∈ R^n, and since γ̄ = 1_N ⊗ x^{1*} ∈ C^n, it follows that γ̃ = γ − γ̄ = γ̃^∥ + γ̃^⊥, where γ̃^⊥ = γ^⊥, γ̃^∥ = 1_N ⊗ (γ̂ − x^{1*}). Therefore Lγ̃^∥ = 0 and λ_2(L)‖γ̃^⊥‖² ≤ (γ̃^⊥)ᵀLγ̃^⊥ for all γ̃^⊥ ∈ E^n, by properties of the Laplacian under Assumption 2.4. Furthermore,

‖γ̃‖² = ‖γ̃^∥ + γ̃^⊥‖² = ‖γ̃^∥‖² + ‖γ̃^⊥‖²     (5.18)

Considering (5.16)-(5.17) as a cascade system, first take the Lyapunov candidate function for (5.16)

V(γ̃) = ½ γ̃ᵀRᵀB^{−1}Rγ̃ + ½ γ̃ᵀSᵀSγ̃

Taking the time derivative of V along the solutions of (5.16) and using F(γ̄) = 0 by (2.36), we get

V̇ = −γ̃ᵀRᵀB^{−1}(BF(γ̃ + γ̄) + BRLγ̃) − γ̃ᵀSᵀSLγ̃ = −γ̃ᵀ(RᵀF(γ̃ + γ̄) + Lγ̃)

By separating γ̃ into components parallel and orthogonal to the consensus subspace, γ̃ = γ̃^∥ + γ̃^⊥, we get

V̇ = −(γ̃^∥ + γ̃^⊥)ᵀ(RᵀF(γ̃^⊥ + γ̃^∥ + γ̄) + L(γ̃^∥ + γ̃^⊥))

Since γ̄, γ̃^∥ ∈ Null(L) and using F(γ̃ + γ̄) − F(γ̄) = F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄) + F(γ̃^∥ + γ̄) − F(γ̄), we get

V̇ = −(γ̃^⊥)ᵀRᵀ[F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄)] − (γ̃^⊥)ᵀRᵀ[F(γ̃^∥ + γ̄) − F(γ̄)] − (γ̃^⊥)ᵀLγ̃^⊥ − (γ̃^∥)ᵀRᵀ[F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄)] − (γ̃^∥)ᵀRᵀ[F(γ̃^∥ + γ̄) − F(γ̄)]

Using (γ̃^⊥)ᵀLγ̃^⊥ ≥ λ_2(L)‖γ̃^⊥‖², F(γ̃^∥ + γ̄) = F(1_N ⊗ γ̂) = F(γ̂), F(γ̄) = F(1_N ⊗ x^{1*}) = F(x^{1*}) and Rγ̃^∥ = R(1_N ⊗ (γ̂ − x^{1*})) = γ̂ − x^{1*}, this yields

V̇ ≤ −(γ̃^⊥)ᵀRᵀ[F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄)] − (γ̃^⊥)ᵀRᵀ[F(γ̂) − F(x^{1*})] − λ_2(L)‖γ̃^⊥‖² − (γ̂ − x^{1*})ᵀ[F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄)] − (γ̂ − x^{1*})ᵀ[F(γ̂) − F(x^{1*})]

Using ‖F(γ̃^⊥ + γ̃^∥ + γ̄) − F(γ̃^∥ + γ̄)‖ ≤ θ‖γ̃^⊥‖ by Assumption 2.5, ‖Rγ̃^⊥‖ ≤ ‖R‖‖γ̃^⊥‖, ‖F(γ̂) − F(x^{1*})‖ ≤ θ‖γ̂ − x^{1*}‖, and (γ̂ − x^{1*})ᵀ[F(γ̂) − F(x^{1*})] ≥ µ‖γ̂ − x^{1*}‖² by Assumption 2.3, together with ‖γ̂ − x^{1*}‖ = (1/√N)‖γ̃^∥‖, we can write

V̇ ≤ −[‖γ̃^∥‖ ‖γ̃^⊥‖] [ µ/N  −θ/√N ; −θ/√N  λ_2(L) − θ ] [‖γ̃^∥‖ ; ‖γ̃^⊥‖]

If µ(λ_2(L) − θ) > θ², then the above matrix is positive definite, and thus, by Theorem 2.4, γ̃ = 0 is globally asymptotically stable.

Now consider (5.17) with input γ̃ and Lyapunov candidate function

V_2(x^2) = ½‖x^2‖²     (5.19)

Taking the derivative of (5.19) along the solutions of (5.17) yields

V̇_2 = −(x^2)ᵀB^{−1}x^2 − (x^2)ᵀ(F(γ̃ + γ̄) + RLγ̃) ≤ −(1/b_m)‖x^2‖² + ‖x^2‖‖F(γ̃ + γ̄)‖ + ‖x^2‖‖RL‖‖γ̃‖

where b_m = max_{i∈I} b_i. Using (2.36) and Assumption 2.5, we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + ‖x^2‖(θ + ‖RL‖)‖γ̃‖

Then, we can conclude that

V̇_2 ≤ −((1/b_m) − a)‖x^2‖²,   ∀‖x^2‖ ≥ ((θ + ‖RL‖)/a)‖γ̃‖ ≥ 0

where 0 < a < 1/b_m. Therefore, by Theorem 2.10, (5.17) is ISS with input γ̃, and by Lemma 2.1 the origin of the cascade (5.16)-(5.17) is asymptotically stable. Therefore (γ̄, 0) of the original system (5.15) is globally asymptotically stable and (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable.

5.2 Games with Disturbances

In this section, we combine the results of Section 5.1 with the results from Chapter 4 to cover the case where there is an additive disturbance at the input channel of each agent's plant. Consider that each agent is modelled as a double integrator

P_i :  ẋ^1_i = x^2_i
       ẋ^2_i = u_i + d_i     (5.20)

where x_i = col(x^1_i, x^2_i) ∈ R^{2n_i}, u_i, y_i, d_i ∈ R^{n_i}. Each agent also has a cost function that depends on all the agents' actions, x^1, but none of their velocities, x^2. Therefore, the cost of each agent is given by J_i(x^1_i, x^1_{−i}). The disturbance can be generated by a linear exosystem

D_i :  ẇ_i = S_i w_i,   d_i = D_i w_i     (5.21)

where w_i ∈ W_i ⊂ R^{q_i} and (S_i, D_i) is observable. Since the disturbance d_i has no effect on the action of each agent, the NE set remains the same as in the disturbance-free case, given by (5.2). We will consider both full-information and partial-information.
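Before adding disturbances, a minimal simulation sketch of the full-information double-integrator scheme (5.6) (an illustration only; the quadratic game and the gains b_i are assumed choices, not taken from the thesis):

# Minimal sketch of the double-integrator scheme (5.6): each agent evaluates
# its partial gradient at the predicted point x1 + b*x2 and damps its velocity.
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 2.0]]); c = np.array([2.0, -2.0])
F = lambda z: A @ z - c                 # assumed strongly monotone pseudo-gradient
x_star = np.linalg.solve(A, c)

bi = np.array([0.5, 1.0])               # assumed gains b_i > 0
x1 = np.array([5.0, -5.0])              # positions
x2 = np.zeros(2)                        # velocities
dt = 1e-3
for _ in range(int(30 / dt)):
    gamma = x1 + bi * x2                # fictitious unit-relative-degree output
    u = -F(gamma) - x2 / bi             # feedback (5.5)
    x1, x2 = x1 + dt * x2, x2 + dt * u
print(x1, x2, x_star)                   # (x1, x2) -> (x*, 0)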

5.2.1 Full-Information

First, we will investigate the full-information case. We will use the results from Section 5.1.1 combined with the reduced-order observer introduced in Section 4.1.1 for disturbance rejection. We consider the following dynamic feedback

ξ̇_i = S_i(K_i x^2_i + ξ_i) + K_i ( ∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) + (1/b_i) x^2_i )
u_i = −∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) − (1/b_i) x^2_i − D_i(K_i x^2_i + ξ_i)     (5.22)

where K_i is chosen such that σ(S_i − K_i D_i) ⊂ C^− and b_i > 0. This gives agent dynamics

Σ_i :  ẋ^1_i = x^2_i
       ẋ^2_i = −∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) − (1/b_i) x^2_i − D_i(K_i x^2_i + ξ_i) + d_i
       ξ̇_i = S_i(K_i x^2_i + ξ_i) + K_i ( ∇_i J_i(x^1_i + b_i x^2_i, x^1_{−i} + b_{−i} x^2_{−i}) + (1/b_i) x^2_i )     (5.23)

Theorem 5.3. Consider a game G(I, J_i, R^{n_i}) with agent dynamics given by (5.20),(5.22), or Σ_i, (5.23). Under Assumptions 2.2 and 2.3(ii), (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable for (5.23) for all w(0) ∈ W.

Proof. From (5.23), the stacked dynamics are

ẇ = Sw
Σ :  ẋ^1 = x^2
     ẋ^2 = −F(x^1 + Bx^2) − B^{−1}x^2 − D(Kx^2 + ξ) + Dw
     ξ̇ = S(Kx^2 + ξ) + K(F(x^1 + Bx^2) + B^{−1}x^2)     (5.24)

Consider the coordinate transformations x^1 → γ := x^1 + Bx^2 and ξ → ρ := w − (Kx^2 + ξ). The closed-loop dynamics in these coordinates are

ẇ = Sw
γ̇ = −BF(γ) + BDρ
ẋ^2 = −F(γ) − B^{−1}x^2 − D(Kx^2 + ξ) + Dw
ρ̇ = (S − KD)ρ

Consider the shift in coordinates γ → γ̃ := γ − x^{1*}. In these coordinates, the (γ̃, x^2, ρ) subsystem is given by

γ̃̇ = −BF(γ̃ + x^{1*}) + BDρ     (5.25)
ẋ^2 = −F(γ̃ + x^{1*}) − B^{−1}x^2 + Dρ     (5.26)
ρ̇ = (S − KD)ρ     (5.27)

Consider the (γ̃, ρ) subsystem as a cascade. Now consider the Lyapunov candidate function for

(5.25),

V_1(γ̃) = ½ γ̃ᵀ B^{−1} γ̃     (5.28)

Taking the derivative of (5.28) along the solutions of (5.25) yields

V̇_1 = −γ̃ᵀF(γ̃ + x^{1*}) + γ̃ᵀDρ

Using F(x^{1*}) = 0 from (5.3) and Assumption 2.3(ii), we get

V̇_1 = −(γ̃ + x^{1*} − x^{1*})ᵀ(F(γ̃ + x^{1*}) − F(x^{1*})) + γ̃ᵀDρ ≤ −µ‖γ̃‖² + ‖γ̃‖‖D‖‖ρ‖ ≤ −(µ − a)‖γ̃‖²,   ∀‖γ̃‖ ≥ ‖D‖‖ρ‖/a > 0

where 0 < a < µ. Therefore, by Theorem 2.10, (5.25) is ISS with input ρ, and by Lemma 2.1 and (5.22), the origin of the (γ̃, ρ) subsystem is globally asymptotically stable. It is left to show that the x^2 subsystem is ISS with respect to (γ̃, ρ) as inputs. Consider the Lyapunov candidate function

V_2(x^2) = ½‖x^2‖²     (5.29)

Taking the derivative along the solutions of (5.26),

V̇_2 = −(x^2)ᵀB^{−1}x^2 − (x^2)ᵀ(F(γ̃ + x^{1*}) − Dρ) ≤ −(x^2)ᵀB^{−1}x^2 + ‖x^2‖(‖F(γ̃ + x^{1*})‖ + ‖D‖‖ρ‖)

Using (2.25) and Assumption 2.3(ii),

V̇_2 ≤ −(1/b_m)‖x^2‖² + ‖x^2‖(‖D‖‖ρ‖ + θ‖γ̃‖) ≤ −(1/b_m)‖x^2‖² + β‖x^2‖(‖ρ‖ + ‖γ̃‖)

where b_m = max_{i∈I} b_i and β = max{θ, ‖D‖}. By the equivalence of norms, we get

V̇_2 ≤ −(x^2)ᵀB^{−1}x^2 + β‖x^2‖(‖ρ‖_1 + ‖γ̃‖_1)

Considering ρ and γ̃ as the two components of an input v := col(γ̃, ρ), they form orthogonal components of v. By the properties of the 1-norm, we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + β‖x^2‖‖v‖_1

Using the equivalence of norms, we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + β√(n + q)‖x^2‖‖v‖ ≤ −((1/b_m) − a)‖x^2‖²,   ∀‖x^2‖ ≥ (β√(n + q)/a)‖v‖ ≥ 0

where 0 < a < 1/b_m. Therefore, by Theorem 2.10, (5.26) is ISS with input v = col(γ̃, ρ), and thus by Lemma 2.1, the origin of the cascade system (5.25)-(5.27) is globally asymptotically stable. Thus, (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable for (5.24) for all w(0) ∈ W.

5.2.2 Partial-Information with Disturbances

In this section we extend the disturbance-rejection algorithm presented in Section 5.2.1 by combining it with the estimate-consensus algorithm from Section 5.1.2. Based on local communication with its neighbours, each agent i computes estimates of all other agents' outputs, γ^i_{−i} = col(γ^i_1, …, γ^i_{i−1}, γ^i_{i+1}, …, γ^i_N) ∈ R^{n−n_i}, and uses these estimates to compute its gradient, ∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}). We denote γ^i = col(γ^i_1, …, γ^i_{i−1}, x^1_i + b_i x^2_i, γ^i_{i+1}, …, γ^i_N) ∈ R^n and γ = col(γ^1, …, γ^N) ∈ R^{Nn}. We want to design a dynamic feedback such that, in steady state, γ^i = γ^j for all i, j ∈ I, and such that x converges to x*. Consider that u_i is generated by

γ̇^i_{−i} = −S_i Σ_{j∈N_i} (γ^i − γ^j)
ξ̇_i = S_i(K_i x^2_i + ξ_i) + K_i ( ∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) + (1/b_i) x^2_i + R_i Σ_{j∈N_i} (γ^i − γ^j) )
u_i = −∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) − (1/b_i) x^2_i − R_i Σ_{j∈N_i} (γ^i − γ^j) − D_i(K_i x^2_i + ξ_i)     (5.30)

where K_i is chosen such that σ(S_i − K_i D_i) ⊂ C^− and b_i > 0. This leads to agent dynamics

Σ_i :  γ̇^i_{−i} = −S_i Σ_{j∈N_i} (γ^i − γ^j),   ∀i ∈ I
       ẋ^1_i = x^2_i
       ẋ^2_i = −∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) − (1/b_i) x^2_i − R_i Σ_{j∈N_i} (γ^i − γ^j) − D_i(K_i x^2_i + ξ_i) + d_i
       ξ̇_i = S_i(K_i x^2_i + ξ_i) + K_i ( ∇_i J_i(x^1_i + b_i x^2_i, γ^i_{−i}) + (1/b_i) x^2_i + R_i Σ_{j∈N_i} (γ^i − γ^j) )     (5.31)

Theorem 5.4. Consider a game G(I, J_i, R^{n_i}) with partial information communicated over a graph G_c with Laplacian L and agent dynamics given by (5.20),(5.30), or Σ_i, (5.31). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ_2(L) − θ) > θ², then the unique NE, (x^1, x^2) = (x^{1*}, 0), is globally asymptotically stable for (5.31) for all w(0) ∈ W. Moreover, each player's estimates converge globally to the NE values, γ̄ = 1_N ⊗ x^{1*}.

Proof. The stacked dynamics of the system are given by

ẇ = Sw
Sγ̇ = −SLγ
Σ :  ẋ^1 = x^2
     ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ − D(Kx^2 + ξ) + Dw
     ξ̇ = S(Kx^2 + ξ) + K(F(γ) + B^{−1}x^2 + RLγ)     (5.32)

Consider the coordinate transformation x^1 → Rγ. In the new coordinates, the dynamics are given by

ẇ = Sw
Sγ̇ = −SLγ
Rγ̇ = −BF(γ) − BRLγ − BD(Kx^2 + ξ − w)
ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ − D(Kx^2 + ξ) + Dw
ξ̇ = S(Kx^2 + ξ) + K(F(γ) + B^{−1}x^2 + RLγ)

Using the properties of R and S,

ẇ = Sw
γ̇ = −RᵀBF(γ) − (RᵀBR + SᵀS)Lγ − RᵀBD(Kx^2 + ξ − w)
ẋ^2 = −B^{−1}x^2 − F(γ) − RLγ − D(Kx^2 + ξ) + Dw
ξ̇ = S(Kx^2 + ξ) + K(F(γ) + B^{−1}x^2 + RLγ)

Consider the coordinate transformation ξ → ρ := w − (Kx^2 + ξ). Taking the time derivative of ρ, we get

ρ̇ = Sw − K(−B^{−1}x^2 − F(γ) − RLγ − D(Kx^2 + ξ) + Dw) − S(Kx^2 + ξ) − K(F(γ) + B^{−1}x^2 + RLγ) = (S − KD)ρ

Consider the coordinate transformation γ → γ̃ := γ − γ̄. In the new coordinates, using Lγ̄ = 0, the dynamics of (γ̃, x^2, ρ) are given by

γ̃̇ = −RᵀBF(γ̃ + γ̄) − (RᵀBR + SᵀS)Lγ̃ + RᵀBDρ     (5.33)
ẋ^2 = −B^{−1}x^2 − F(γ̃ + γ̄) − RLγ̃ + Dρ     (5.34)
ρ̇ = (S − KD)ρ     (5.35)

We then decompose R^{Nn} as R^{Nn} = C^n ⊕ E^n, where C^n = {1_N ⊗ γ | γ ∈ R^n} is the consensus subspace and E^n is its orthogonal complement. Any γ ∈ R^{Nn} can be decomposed as γ = γ^∥ + γ^⊥, where γ^⊥ = P_E γ ∈ E^n, γ^∥ = P_C γ ∈ C^n, with P_C = (1/N) 1_N 1_Nᵀ ⊗ I_n, P_E = I_{Nn} − (1/N) 1_N 1_Nᵀ ⊗ I_n. Thus, γ^∥ = 1_N ⊗ γ̂, for some γ̂ ∈ R^n, and since γ̄ = 1_N ⊗ x^{1*} ∈ C^n, it follows that γ̃ = γ − γ̄ = γ̃^∥ + γ̃^⊥, where γ̃^⊥ = γ^⊥, γ̃^∥ = 1_N ⊗ (γ̂ − x^{1*}). Therefore Lγ̃^∥ = 0

and λ_2(L)‖γ̃^⊥‖² ≤ (γ̃^⊥)ᵀLγ̃^⊥ for all γ̃^⊥ ∈ E^n, by properties of the Laplacian under Assumption 2.4. Furthermore,

‖γ̃‖² = ‖γ̃^∥ + γ̃^⊥‖² = ‖γ̃^∥‖² + ‖γ̃^⊥‖²     (5.36)

Take the Lyapunov candidate function

V(γ̃) = ½ γ̃ᵀRᵀB^{−1}Rγ̃ + ½ γ̃ᵀSᵀSγ̃

Taking the time derivative of V along the solutions of (5.33) and using F(γ̄) = 0 by (2.36), we get

V̇ = −γ̃ᵀRᵀB^{−1}(BF(γ̃ + γ̄) + BRLγ̃ − BDρ) − γ̃ᵀSᵀSLγ̃ = −γ̃ᵀ(RᵀF(γ̃ + γ̄) + Lγ̃ − RᵀDρ)

By separating γ̃ into components parallel and orthogonal to the consensus subspace, γ̃ = γ̃^∥ + γ̃^⊥, we get

V̇ = −(γ̃^∥ + γ̃^⊥)ᵀ(RᵀF(γ̃^⊥ + γ̃^∥ + γ̄) + L(γ̃^∥ + γ̃^⊥) − RᵀDρ)

Using the same analysis as in the proof of Theorem 4.2, we can conclude that

V̇ ≤ −[‖γ̃^∥‖ ‖γ̃^⊥‖] [ µ/N − a  −θ/√N ; −θ/√N  λ_2(L) − θ − a ] [‖γ̃^∥‖ ; ‖γ̃^⊥‖]

for any ‖γ̃‖ ≥ ‖RᵀD‖‖ρ‖/a. It also follows from the proof of Theorem 4.2 that there exists a > 0 such that the above matrix is positive definite. Therefore, we conclude by Theorem 2.10 that the γ̃ subsystem is ISS with input ρ. Since the origin of the ρ subsystem is globally asymptotically stable, we conclude by Lemma 2.1 that the origin of the (γ̃, ρ) subsystem is globally asymptotically stable.

Now consider the x^2 subsystem (5.34) with inputs (γ̃, ρ) and Lyapunov candidate function

V_2(x^2) = ½‖x^2‖²     (5.37)

Taking the derivative of (5.37) along the solutions of (5.34) yields

V̇_2 = −(x^2)ᵀB^{−1}x^2 − (x^2)ᵀ(F(γ̃ + γ̄) + RLγ̃ − Dρ) ≤ −(1/b_m)‖x^2‖² + ‖x^2‖‖F(γ̃ + γ̄)‖ + ‖x^2‖‖RL‖‖γ̃‖ + ‖x^2‖‖D‖‖ρ‖

where b_m = max_{i∈I} b_i. Using (2.36) and Assumption 2.5,

V̇_2 ≤ −(1/b_m)‖x^2‖² + ‖x^2‖(‖D‖‖ρ‖ + (‖RL‖ + θ)‖γ̃‖) ≤ −(1/b_m)‖x^2‖² + β‖x^2‖(‖ρ‖ + ‖γ̃‖)

where β = max{‖RL‖ + θ, ‖D‖}. By the equivalence of norms, we get

V̇_2 ≤ −(x^2)ᵀB^{−1}x^2 + β‖x^2‖(‖ρ‖_1 + ‖γ̃‖_1)

Considering ρ and γ̃ as the two components of an input v := col(γ̃, ρ), they form orthogonal components of v. By the properties of the 1-norm, we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + β‖x^2‖‖v‖_1

Using the equivalence of norms, we get

V̇_2 ≤ −(1/b_m)‖x^2‖² + β√(n + q)‖x^2‖‖v‖ ≤ −((1/b_m) − a)‖x^2‖²,   ∀‖x^2‖ ≥ (β√(n + q)/a)‖v‖ ≥ 0

Therefore, we can conclude that (5.34) is ISS with input v = col(γ̃, ρ) by Theorem 2.10. Since the origin of the (γ̃, ρ) subsystem is globally asymptotically stable, by Lemma 2.1 the origin of (5.33)-(5.35) is globally asymptotically stable, and therefore (x^1, x^2) = (x^{1*}, 0) is globally asymptotically stable for (5.32) for all w(0) ∈ W.

In this chapter, we considered NE seeking for double-integrator plants. First, we considered the case without disturbances and then expanded it to include linear disturbances at the input channel of each plant. In both cases, we used a redefined output, reducing the relative degree of each plant, with a stabilizing inner-loop feedback. We then used an outer-loop gradient-descent feedback to stabilize the NE. In the next chapter, we will extend this to cover the case where each agent is modelled as a multi-integrator.
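As a numerical illustration of the full-information scheme (5.23) (a sketch only; game data, disturbance frequencies, and gains are assumed choices, not taken from the thesis):

# Minimal sketch of (5.23): double-integrator agents with sinusoidal input
# disturbances, gradient feedback at x1 + b*x2, velocity damping, and the
# reduced-order disturbance observer.
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 2.0]]); c = np.array([2.0, -2.0])
F = lambda z: A @ z - c
x_star = np.linalg.solve(A, c)

bi = np.array([0.5, 1.0])
om = [1.0, 2.0]
Sx = [np.array([[0.0, o], [-o, 0.0]]) for o in om]
Dm = [np.array([1.0, 0.0]) for _ in om]
K  = [np.array([2.0, 0.0]) for _ in om]          # sigma(S_i - K_i D_i) in C^-

x1 = np.array([4.0, -4.0]); x2 = np.zeros(2)
xi = [np.zeros(2), np.zeros(2)]
w  = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
dt = 1e-3
for _ in range(int(40 / dt)):
    g = F(x1 + bi * x2)                          # pseudo-gradient at the predicted point
    x1n = x1 + dt * x2
    x2n = x2.copy()
    for i in range(2):
        eta = K[i] * x2[i] + xi[i]
        d = Dm[i] @ w[i]
        x2n[i] += dt * (-g[i] - x2[i] / bi[i] - Dm[i] @ eta + d)
        xi[i]  += dt * (Sx[i] @ eta + K[i] * (g[i] + x2[i] / bi[i]))
        w[i]   += dt * (Sx[i] @ w[i])
    x1, x2 = x1n, x2n
print(x1, x2, x_star)                            # (x1, x2) -> (x*, 0) despite d_i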

Chapter 6

NE Seeking for Higher Order Integrators

In this chapter, we extend the results of Chapter 5 to include agents whose plant models are given by higher-order integrators. We will use a similar method of defining a fictitious output for each agent that yields unit relative degree. We will then use this to define a hyperplane that intersects the NE of the game, use gradient descent to make this hyperplane attractive, and finally make the NE attractive with a stabilizing feedback that renders the hyperplane invariant. First, we will consider games without disturbances and then look at games with agents subject to disturbances. In both cases, we will consider full and partial information. We note that, in all cases, we do not assume that each agent is of the same order; this is one of the advantages of this method.

6.1 Games without Disturbances

In this section, we provide an extension of the results from Section 5.1.1 to the case where each agent is modelled as a multi-integrator. We consider that each agent is modelled as an r_i-th order integrator plant

P_i :  ẋ_i = A_i x_i + B_i u_i,   y_i = C_i x_i     (6.1)

where x_i = col(x^1_i, …, x^{r_i}_i) ∈ R^{n_i r_i}, u_i, y_i ∈ R^{n_i} and

A_i = [ 0 I 0 … 0 ; 0 0 I … 0 ; … ; 0 0 0 … I ; 0 0 0 … 0 ],   B_i = col(0, …, 0, I),   C_i = [ I 0 … 0 ]     (6.2)

We denote x = col(x_1, …, x_N) ∈ R^m and x^k to be the stacked vector of the x^k_i over all agents with r_i ≥ k, i.e., the stacked vector of all states of a certain order from agents who have states of that order. We denote n = Σ_{i∈I} n_i, m = Σ_{i∈I} n_i r_i, m_k = Σ_{i∈I, r_i≥k} n_i and r = max_{i∈I} r_i. Much like the double-integrator

case, the NE set of the game involves the full state and is given by

Γ_NE = { x ∈ R^m | ∇_i J_i(x^1_i, x^1_{−i}) = 0, x^2_i = … = x^{r_i}_i = 0, ∀i ∈ I }     (6.3)

Therefore we can characterize the NE of the system as x* such that

F(x^{1*}) = 0     (6.4)
x^{2*} = … = x^{r*} = 0     (6.5)

Under Assumptions 2.2 and 2.3, x^{1*} is unique. We note that a similar set is given by

Γ_NE = { x ∈ R^m | ∇_i J_i(x^1_i + Σ_{2≤k≤r_i} Δ^i_k x^k_i, x^1_{−i} + Σ_{2≤k≤r_{−i}} Δ^{−i}_k x^k_{−i}) = 0, x^2_i = … = x^{r_i}_i = 0, ∀i ∈ I }     (6.6)

for appropriately sized matrices Δ^i_k, Δ^{−i}_k, which reduces to (6.3) when x^2 = … = x^r = 0. Therefore, if each agent minimizes over some linear combination of its states, with the constraint that at equilibrium all higher-order states are zero, the NE of the game will be reached. We will use this fact to design our algorithm.

6.1.1 Full-Information

First, we consider that each agent has full information about the states of the others. We will define a fictitious output for each agent over which it will minimize, much as in the previous chapter. Define the following output for each agent,

γ_i := Σ_{k=0}^{r_i−2} c^i_k x^{k+1}_i + x^{r_i}_i     (6.7)

where the c^i_k are the ascending coefficients of an (r_i − 1)-th order Hurwitz polynomial, a_i(z), with c^i_0 = 1 and c^i_{r_i−1} = 1, such as a_i(z) = (z + 1)^{r_i−1}. Each agent attempts to minimize J_i(γ_i, γ_{−i}) over γ_i while requiring that x^2_i = … = x^{r_i}_i = 0 at equilibrium. Therefore, we consider the feedback

u_i = −∇_i J_i(γ_i, γ_{−i}) − Σ_{k=0}^{r_i−2} c^i_k x^{k+2}_i     (6.8)

which is a gradient-descent term on γ_i plus a stabilizing feedback for the higher-order states. This leads to agent dynamics of

Σ_i :  ẋ_i = Ã_i x_i − B_i ∇_i J_i(γ_i, γ_{−i})     (6.9)

where

Ã_i = [ 0 I 0 … 0 ; 0 0 I … 0 ; … ; 0 0 0 … I ; 0 −c^i_0 I −c^i_1 I … −c^i_{r_i−2} I ]     (6.10)

Remark 6.1. We note that this algorithm reduces to gradient-play for the case of single-integrator agents and to the double-integrator algorithm (Chapter 5) for double-integrator plants. Therefore, we can view gradient-play as a specific case of a broader class of NE seeking algorithms for multi-integrator agents. In fact, all algorithms presented in this chapter reduce to those presented in Chapters 4 and 5 for the special cases of single and double integrators, respectively.

Again, we will see that γ − x^{1*} = 0 defines a hyperplane that intersects the NE. This hyperplane is rendered attractive by the gradient feedback. On this hyperplane, the NE is attractive and the dynamics Ã_i render it invariant. We will leverage this in our proof by decomposing the system into dynamics orthogonal to the hyperplane and dynamics parallel to it. We will show that the hyperplane is asymptotically stable and that the dynamics on the hyperplane are ISS with respect to the orthogonal dynamics.

By shifting coordinates x^1_i → γ_i, and stacking x^s_i := col(x^2_i, …, x^{r_i}_i), we can write the dynamics (6.9) as

γ̇_i = −∇_i J_i(γ_i, γ_{−i})
ẋ^s_i = A^s_i x^s_i − B^s_i ∇_i J_i(γ_i, γ_{−i})     (6.11)

where

A^s_i = [ 0 I … 0 ; … ; 0 0 … I ; −c^i_0 I −c^i_1 I … −c^i_{r_i−2} I ],   B^s_i = col(0, …, 0, I)     (6.12)

Here we see that the γ_i dynamics become decoupled from the x^s_i dynamics.

Theorem 6.1. Consider a game G(I, J_i, R^{n_i}) with full information and agent dynamics given by (6.1),(6.8), or Σ_i, (6.9). Under Assumptions 2.2 and 2.3(i), the unique NE, x*, is globally asymptotically stable for (6.9).

Proof. Stacking the dynamics from (6.11), with γ = col(γ_1, …, γ_N) and x^s = col(x^s_1, …, x^s_N),

γ̇ = −F(γ)     (6.13)
ẋ^s = Ax^s − BF(γ)     (6.14)

where A = blkdiag(A^s_1, …, A^s_N) and B = blkdiag(B^s_1, …, B^s_N). By Lemma 2.5, under Assumptions 2.2 and 2.3(i), γ = x^{1*} of (6.13) is globally asymptotically stable. Consider the coordinate transformation

69 Chapter 6. NE Seeking for Higher Order Integrators 59 γ γ := γ x 1, then the dynamics (6.13) and (6.14) are given by γ = F ( γ + x 1 ) (6.15) ẋ s = Ax s BF ( γ + x 1 ) (6.16) Since γ = x 1 of (6.13) is globally asymptotically stable, the origin of (6.15) is globally asymptically stable. Consider the x s subsystem (6.16) with input γ and Lyapunov candidate function V 2 (x s ) = (x s ) T P x s (6.17) where P = P T > 0 is such that A T P + P A < Q, Q = Q T > 0, which is guaranteed to exist since A is Hurwitz. Taking the derivative of (6.17) along the solutions of (6.14) yields V 2 = (x s ) T (A T P + P A)x s 2(x s ) T P BF (γ) (x s ) T Qx s + 2 P B x s F ( γ + x ) Using (2.25) and by Assumption 2.3(i), we get V 2 (x s ) T Qx s + 2 P B x s α( γ ) Then, we can conclude that V 2 (x s ) T (Q ai)x s, x s 2 P B α( γ ) 0 a where 0 < ai < Q. Therefore by Theorem 2.10, (6.16) is ISS with input γ and by Lemma 2.1 the origin of the cascade (6.15)-(6.16) is asymptotically stable. Therefore (x 1, x 2,..., x r ) = (x, 0,..., 0) of the original system (6.9) is asymptotically stable Partial-Information Next, we will address the case where agents have only partial-information communicated over a graph, G c. We will extend the results of Section by combining them with a similar communication scheme, originally discussed in Section and extended in We will have that each agent, i, keeps estimates of all other agents γ j defined in (6.7) as γ i j. Then we have γi i = col(γi 1,..., γ i i 1, γi i+1,..., γi N ) R n i and uses these estimates to compute its gradient, i J i (γ i, γ i i ). We denote γi = col(γ i 1,..., γ i i 1, γ i, γ i i+1,..., γi N ) Rn and γ = col(γ 1,..., γ N ) R Nn. We want to design a dynamic feedback such that, in steady-state, γ i = γ j for all i, j I and such that x converges to x as defined in (6.4),(6.5). Our feedback is dynamic and is given by γ i i = S i j N i (γ i γ j ) r i 2 u i = i J i (γ i, γ i i) c i jx j+2 i j=0 R i j N i (γ i γ j ) (6.18)

70 Chapter 6. NE Seeking for Higher Order Integrators 60 where c i j and γ i are defined in (6.7). Which lead to agent dynamics of where γ i i = S i j N Σ i : i (γ i γ j ), i I ẋ i = Ãix i B i ( i J i (γ i, γ i i ) + R ) (6.19) i j N i (γ i γ j ) 0 I I 0... Ã i = I 0 c i 0I c i 1I c i r i 2I (6.20) Next, we consider the coordinate transformation x 1 i γ i, and stacking x s i := (x2 i,..., xri i ), we can write the dynamics (6.19) as γ i i = S i j N i (γ i γ j ), i I γ i = i J i (γ i, γ i i) R i (γ i γ j ) j N i ) ẋ s i = A s i x s i Bi ( s i J i (γ i, γ i i) + R i (γ i γ j ) j N i (6.21) where 0 I I 0 0 A s i = , B s i = I 0 c i 0I c i 1I c i 2I c i r i 2I I (6.22) Theorem 6.2. Consider a game G(I, J i, R ni ) with partial information communicated over a graph G c with Laplacian L and agent dynamics given by (6.1),(6.18), or Σ i, (6.19). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ 2 (L) θ) > θ 2 then the unique NE, x = x, is globally asymptotically stable for (6.19) for all w(0) W. γ = 1 N x 1. Moreover, each player s estimates converge globally to the NE values, Proof. By stacking the dynamics from (6.21) with x s = col(x s 1,..., x s n), the dynamics are given by S γ = SLγ R γ = F(γ) RLγ ẋ s = Ax s B(F(γ) + RLγ)

71 Chapter 6. NE Seeking for Higher Order Integrators 61 where A = blkdiag(a s 1,..., A s N ) and B = blkdiag(bs 1,..., BN s ). Using the properties of R and S, γ = R T F(γ) Lγ (6.23) ẋ s = Ax s B(F(γ) + RLγ) (6.24) By Theorem 2.14, under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ 2 (L) θ) > θ 2, then the γ = γ is globally asymptotically stable for (6.23). Consider the coordinate transformation γ γ := γ γ. In the new coordinates, using L γ = 0, the dynamics are given by γ = R T F( γ + γ) L γ (6.25) ẋ s = Ax s B(F( γ + γ) + RL γ) (6.26) Since γ was globally asymptotically stable for (6.23), the origin is globally asymptotically stable for (6.25). Now consider the x s subsystem (6.26) with input γ and Lyapunov candidate function V 2 (x 2 ) = 1 2 xs 2 (6.27) Taking the derivative of (6.27) along the solutions of (6.26) yields V 2 = x s 2 (x s ) T (F( γ + γ) + RL γ) x s 2 + x s F( γ + γ) + x s RL γ Using (2.36) and by Assumption 2.5, we get V 2 x s 2 + x s (θ γ + RL γ ) Then, we can conclude that V 2 (1 a) x s 2, x s θ + RL γ 0 a where 0 < a < 1. Therefore by Theorem 2.10, (6.26) is ISS with input γ and by Lemma 2.1 the origin of the cascade (6.25)-(6.26) is asymptically stable. Therefore ( γ, 0,..., 0) of the original system (6.21) is globally asymptotically stable and x = x is globally asymptotically stable. 6.2 Games with Disturbances In this section, we investigate NE seeking for multi-integrator systems subjected to a disturbance at the input channel. We will merge the results from Section 6.1 concerning NE seeking for multi-integrators with the reduced order observer framework introduced in Chapter 4. We will look at agents who have plant models that can be described by ẋ i = A i x i + B i (u i + d i ) P i : (6.28) y i = C i x i

72 Chapter 6. NE Seeking for Higher Order Integrators 62 where x i = col(x 1 i,..., xri i ) Rniri, u i, y i R ni and 0 I I 0 0 A i = , B i = I I [ ] C i = I (6.29) (6.30) Where the disturbance, d i, can be generated by an observable linear exosystem ẇ i = S i w i, w i (0) W i, i I D i : (6.31) d i = D i w i Each agent additionally has a cost function that depends on all the agents actions, x 1, but none of their higher-order states. Therefore, the cost of each agent is given by J i (x 1 i, x1 i ). We will note that the NE set will remain unchanged and is given by (6.3). We will the same, alternative characterization of the NE set given by (6.6) for our gradient descent algorithm. We will investigate both full and partial information Full-Information First, we consider the case where each agent has full-information about every agent s actions. We will use the same definition of γ i as used in the previous sections. Therefore, our feedback is dynamic and given by ξ i = S i (K i x ri i + ξ i ) + K i i J i (γ i, γ i ), i I r i 2 u i = i J i (γ i, γ i ) c i jx j+2 i j=0 D i (K i x ri i + ξ i ) (6.32) where γ i and c i j where as defined in (6.7) for all i I. This leads to agent dynamics given by ξ i Σ i : ẋ i = S i (K i x ri i + ξ i ) + K i i J i (γ i, γ i ) = Ãix i B i ( i J i (γ i, γ i ) D i w i + D i (K i x ri i + ξ i )) (6.33) 0 I I 0 Ã i = I 0 c i 0I c i 1I c i r i 2I (6.34)

73 Chapter 6. NE Seeking for Higher Order Integrators 63 Now, consider the coordinate transformation x 1 i γ i, and stacking x s i := (x2 i,..., xri i ), we can write the dynamics (6.33) as γ i = i J i (γ i, γ i ) + D i w i D i (K i x ri i + ξ i ) ( ) ẋ s i = A s i x s i Bi s i J i (γ i, γ i ) D i w i + D i (K i x ri i + ξ i )) ξ i = S i (K i x ri i + ξ i ) + K i i J i (γ i, γ i ) (6.35) where 0 I I 0 0 A s i = , B s i = I 0 c i 0I c i 1I c i 2I c i r i 2I I (6.36) Theorem 6.3. Consider a game G(I, J i, R ni ) with full-information, and agent dynamics given by (6.28),(6.32), or Σ i, (6.33). Under Assumptions 2.2 and 2.3(ii) the unique NE, x, is globally asymptotically stable for (6.33) for all w(0) W. Proof. Consider the coordinate transformation ξ i ρ i := w i (K i x ri i +ξ i), and denote ρ = col(ρ 1,..., ρ N ) and γ = col(γ 1,..., γ N ). Additionally, denote x s = col(x s 1,..., x s N ). In the new coordinates, the stacked dynamics are given by ẇ = Sw (6.37) ρ = (S KD)ρ (6.38) γ = F (γ) + Dρ (6.39) ẋ s = Ax s B(F (γ) Dρ) (6.40) where A = blkdiag(a s 1,..., A s N ) and B = blkdiag(bs 1,..., BN s ). By Theorem 4.1, under Assumptions 2.2 and 2.3(ii), γ = x 1 is globally asymptotically stable for (6.37)-(6.39). Consider the coordinate transformation γ γ := γ x 1. The dynamics are then given by ẇ = Sw (6.41) ρ = (S KD)ρ (6.42) γ = F ( γ + x 1 ) + Dρ (6.43) ẋ s = Ax s B(F ( γ + x 1 ) Dρ) (6.44) Since γ = x 1 is globally asymptotically stable for (6.37)-(6.39), γ = 0 is globally asymptotically stable for (6.41)-(6.43). Consider the Lyapunov candidate function for (6.44) V 2 (x s ) = (x s ) T P x s (6.45)

74 Chapter 6. NE Seeking for Higher Order Integrators 64 Taking the derivative along the solutions of (6.44) V 2 = (x s ) T (A T P + P A)x s 2(x s ) T P B(F ( γ + x 1 ) Dρ) (x s ) T Qx s + 2 P B x s ( F ( γ + x 1 ) + D ρ ) Using (2.25) and by Assumption 2.3(ii) V 2 (x s ) T Qx s + 2 P B x s ( D ρ + θ γ ) (x s ) T Qx s + 2β P B x s ( ρ + γ ) where β = max{θ, D }. By the equivalence of norms, we get V 2 (x s ) T Qx s + 2β P B x s ( ρ 1 + γ 1 ) By considering that ρ and γ are two components of an input v := col(γ, ρ), then they form orthogonal components of v. By the properties of the one-norm, we get V 2 (x s ) T Qx s + 2β P B x s v 1 Using the equivalence of norms, we get V 2 (x s ) T Qx s + 2β P B n + q x s v (x s ) T (Q bi)x s, x s 2β P B n + q v 0 b where 0 < bi < Q. Therefore by Theorem 2.10 (6.44) is ISS with v = col( γ, ρ) as an input. Since the origin of ( γ, ρ) is asymptotically stable, the origin of the cascade (6.42)-(6.44) is also asymptotically stable by Lemma 2.1. Therefore (x 1, x s ) = (x 1, 0) of the original system (6.33) is asymptotically stable Partial-Information Next we consider the case where the agents are modelled as multi-integrators affected by a disturbance at the input channel, thus have dynamics given by (6.28) and have partial-information communicated over a graph, G c. We will combine the results of the previous section with that Section Using the same definitions of the estimates γ defined in Section 6.1.2, consider the feedback γ i i = S i ξ i = S i (K i x ri i j N i (γ i γ j ) + ξ i ) + K i i J i (γ i, γ i i) + R i r i 2 u i = i J i (γ i, γ i i) c i jx j+2 i j=0 D i (K i x ri i j N i (γ i γ j ) + ξ i ) R i j N i (γ i γ j ) (6.46)

75 Chapter 6. NE Seeking for Higher Order Integrators 65 where γ i and c i j as defined in (6.7) for all i I. Which lead to agent dynamics of ξ i = S i (K i x ri i + ξ i ) + K i i J i (γ i i, γi i ) + R i j N i (γ i γ j ) Σ i : γ i i = S i j N i (γ i γ j ) ẋ i = Ãix i B i ( i J i (γ i, γ i ) + R i j N i (γ i γ j ) D i w + D i (K i x ri i + ξ i )) (6.47) where 0 I I 0 Ã i = I 0 c i 0I c i 1I c i r i 2I (6.48) Now, consider the coordinate transformation x 1 i γ i, and stacking x s i := (x2 i,..., xri i ), we can write the dynamics (6.47) as γ i i = S i j N i (γ i γ j ) γ i = i J i (γ i, γ i ) R i (γ i γ j ) + D i w D i (K i x ri i + ξ i ) j N i ẋ s i = A s i x s i B i ( i J i (γ i, γ i ) + R i (γ i γ j ) D i w + D i (K i x ri i + ξ i )) j N i ξ i = S i (K i x ri i + ξ i ) + K i i J i (γ i, γ i ) (6.49) where 0 I I 0 0 A s i = , B s i = I 0 c i 0I c i 1I c i 2I c i r i 2I I (6.50) Theorem 6.4. Consider a game G(I, J i, Ω i ) with partial information communicated over a graph G c with Laplacian L and agent dynamics given by (6.28),(6.46), or Σ i, (6.47). Under Assumptions 2.2, 2.3(ii), 2.4 and 2.5, if µ(λ 2 (L) θ) > θ 2 then the unique NE, x = x, is globally asymptotically stable for (6.47) for all w(0) W. x = 1 N x, for all w(0) W. Moreover, each player s estimates converge globally to the NE values, Proof. Consider the coordinate transformation ξ i ρ i := w i (K i x ri i +ξ i) and denote x s = col(x s 1,..., x s N )

76 Chapter 6. NE Seeking for Higher Order Integrators 66 and ρ = col(ρ 1,..., ρ N ). In the new coordinates, the stacked dynamics are given by ẇ = Sw ρ = (S KD)ρ S γ = SLγ R γ = F(γ) RLγ ẋ s = Ax s B(F(γ) + RLγ) A = blkdiag(a s 1,..., A s N ) and B = blkdiag(bs 1,..., BN s ). Using the properties of R and S, ẇ = Sw (6.51) ρ = (S KD)ρ (6.52) γ = R T F(γ) Lγ + R T Dρ (6.53) ẋ s = Ax s B(F(γ) + RLγ + Dρ) (6.54) By Theorem 4.2, under Assumptions 2.2, 2.3, 2.4 and 2.5, if µ(λ 2 (L) θ) > θ 2, x = x of the (6.52)-(6.53) subsystem is globally asymptotically stable. Consider the coordinate tranformation γ γ := γ γ. In the new coordinates, the dynamics are given by ẇ = Sw (6.55) ρ = (S KD)ρ (6.56) γ = R T F( γ + γ) L γ + R T Dρ (6.57) ẋ s = Ax s B(F( γ + γ) + RL γ + Dρ) (6.58) Now consider the x s subsystem (6.58) with Lyapunov candidate function V 2 (x s ) = (x s ) T P x s (6.59) Taking the derivative along the solutions of (6.58) V 2 = (x s ) T (A T P + P A)x s 2(x s ) T P B(F( γ + γ) + RL γ + Dρ) (x s ) T Qx s + 2 P B x s ( F( γ + γ) + RL γ + D ρ ) Using F( γ) = 0 and by Assumption 2.3(ii) V 2 (x s ) T Qx s + 2 P B x s ( D ρ + θ γ + RL γ ) (x s ) T Qx s + 2β P B x s ( ρ + γ )

77 Chapter 6. NE Seeking for Higher Order Integrators 67 where β = max{θ, D + RL }. By the equivalence of norms, we get V 2 (x s ) T Qx s + 2β P B x s ( ρ 1 + γ 1 ) By considering that ρ and x are two components of an input v := col( γ, ρ), then they form orthogonal components of v. By the properties of the one-norm, we get V 2 (x s ) T Qx s + 2β P B x s v 1 Using the equivalence of norms, we get V 2 (x s ) T Qx s + 2β P B Nn + q x s v (x s ) T (Q bi)x s, x s 2β P B Nn + q v 0 b where 0 < bi < Q.Therefore by Theorem 2.10, (6.58) is ISS with input col( γ, ρ) and by Lemma 2.1 the origin of the cascade (6.56)-(6.58) is asymptotically stable. Therefore ( γ, 0) of the original system (6.47) is asymptotically stable and thus x = x is globally asymptotically stable. In this chapter, we further extended the results of the previous two chapters to cover the case where each agent is modelled as an ri th order integrator. We covered both the disturbance free case as well as a case with an additive linear disturbance. Both of these cases featured full and partial-information NE seeking algorithms. We noted that the results of this chapter hold even for agents that have different orders and that these results reduce to those presented in Chapters 4 and 5 for the case of single and double-integrator plants respectively.

78 Chapter 7 Potential Games In this chapter, we return to the framework of the single integrator as presented in Chapter 3. Instead of viewing each plant model as purely an integrator, e.g., (3.10) and (3.18), as was used in Chapter 4, we will instead view each plant as an integrator with its partial gradient as the output map, e.g., (3.8) and (3.16). we will show that this formulation leads to a passive plant for a certain class of games, namely potential games. We will then show that we can accomplish the NE seeking problem with disturbance rejection or reference tracking using passivity based methods. In addition to merely providing an alternative way of approaching the problem, we will see that using the passivity-based approach for potential games allows us to relax the strong-monotonicity assumption on the pseudo-gradient used in Chapter 4, replacing it instead with a strict-monotonicity assumption, which is weaker. Therefore we see a trade-off between certain properties of the game, e.g., by restricting the games to be potential games, we can relax the monotonicity assumption. We will look at both games with disturbances and games with references in this chapter. Finally, we will show how we can use a similar scheme for integrator plants that have nonlinear exosystems with incrementally passive internal models. In all of these cases, we consider only full-information. This is due to the fact that the passivity framework developed here is incompatible with the Laplacian-based consensus algorithm used in the previous chapters. 7.1 Games with Disturbances First, we will consider the case where the agents in the game are subjected to a disturbance and can be modelled as integrator plants with the partial-gradient as the output map (3.8) or ẋ i = u i + d i P i : (7.1) e i = i J i (x i, x i ) where the disturbance, d i can be generated by a linear exosystem ẇ i = S i w i D i : (7.2) d i = D i w i 68

79 Chapter 7. Potential Games 69 We note that the NE set remains unchanged by the disturbace and is thus given by Γ NE = { x R n i J i (x i, x i ) = 0, i I } Under Assumptions 2.2 and 2.3(i) this is the singleton x = x. Note that an equivalent formulation is given by Γ NE = { x R n e i = 0, i I } We will use this fact that the NE corresponds to e i = 0 together with passivity induced by the potential game to design our feedbacks, u i. We will consider the stacked plant ẋ = u + d P : (7.3) e = F (x) and show that it is passive from u to e if the game is a full-potential game. It is important to note that because of the fact that e i depends on (x i, x i ) and not just x i, each individual plant is not passive on its own, rather we must look at the stacked plant. Lemma 7.1. A full potential game with dynamics given by (7.3) is passive with d 0. Proof. Consider the storage function, V (x) = Φ(x), without loss of generality assume Φ(x ) = 0 is the minimum of Φ. Taking the time derivative along the solutions of (7.3) gives V (x) = F (x) T u e T u. Therefore the system is passive. We will use a similar passivity based feedback for each agent, inspired by the results of [24], given by ξ i = S i ξ i + Q 1 i D i e i u i = D i ξ i e i (7.4) where Q i = Q T i 0 such that S T i Q i + Q i S i 0. The agent dynamics are then given by ẋ i = D i ξ i i J i (x i, x i ) + d i Σ i : ξ i = S i ξ i + Q 1 i D i i J i (x i, x i ) (7.5) Compared to (4.5), we have a similar algorithm, with a passivity-based internal model replacing the reduced-order observer. This allows us to use a passivity-based analysis and to weaken our assumption on the pseudo-gradient. Theorem 7.1. Consider a full-potential game G(I, J i, R ni ) with potential function Φ(x), agent dynamics given by (7.1),(7.4) or Σ i, (7.5). Under Assumptions 2.2 and 2.3(i), the unique NE, x, is globally asymptotically stable for all w(0) W.

80 Chapter 7. Potential Games 70 Proof. The closed loop system is then given by ẇ = Sw ẋ = Dξ F (x) + Dw Σ : ξ = Sξ + Q 1 DF (x) (7.6) where Q = diag(q 1,..., Q N ), C = diag(c 1,..., C N ), and S = diag(s 1,..., S N ). Take the Lyapunov function V (x, ξ, w) = Φ(x) (w ξ)t Q(w ξ), which under Assumptions 2.2 and 2.3(i) is radially unbounded in x, and assume without loss of generality Φ(x ) = 0 is the minimum of Φ. Taking the time derivative along the solutions of (7.6) gives V = F (x) T (C(w ξ) F (x)) + (w ξ) T Q(S(w ξ) Q 1 C T F (x)) F (x) T F (x) + F (x) T C(w ξ) (w ξ) T C T F (x) F (x) T F (x) 0 (7.7) Since V is radially unbounded in x as Φ is, and V = 0 if and only if F (x) = 0 if and only if x = x, x = x is globally asymptotically stable by Theorem Games with References Next, we will consider a full-potential game with references under full-information. We consider the case where the plant dynamics are given by (3.16) or ẋ i = u i P i : (7.8) e i = i J i (x i p i, x i p i ) where the reference, p i, can be generated by a linear exosystem of the form ẇ i = S i w i D i : (7.9) p i = C i w i We note here that the NE set is again dependent upon the exosystem state, w i, and is given by Γ NE = { (x, w) R n W i J i (x i C i w i, x i C i w i ) = 0, i I } (7.10) Under Assumptions 2.2 and 2.3(i) this is the equivalent to x + Cw = F 1 (0) where F 1 (0) is unique. Note that an equivalent formulation is given by Γ NE = { (x, w) R n W e i = 0, i I }

81 Chapter 7. Potential Games 71 Again, we will use a similar passivity based approach as used in the previous section. Therefore, we must rewrite the plant dynamics in stacked form ẋ = u P : e = F (x p) Lemma 7.2. A full potential game with dynamics given by (7.11) is passive with p 0. (7.11) Proof. Consider the storage function, V (x) = Φ(x), without loss of generality assume Φ(x ) = 0 is the minimum of Φ. Taking the time derivative along the solutions of (7.11) gives V (x) = F (x) T u e T u. Therefore the system is passive. We use a passivity based feedback for each agent, similar to the one presented in the previous section, given by ξ i = S i ξ i Q 1 i Si T Ci T e i u i = C i S i ξ i e i (7.12) where Q i = Q T i 0 such that S T i Q i + Q i S i 0. We can then write the agent dynamics as ẋ i = C i S i ξ i i J i (x i p i, x i p i ) Σ i : ξ i = S i ξ i Q 1 i Si T CT i ij i (x i p i, x i p i ) (7.13) Theorem 7.2. Consider a full-potential game G(I, J i, R ni ) with potential function Φ(x), agent dynamics given by (7.8),(7.12) or Σ i, (7.13). Under Assumptions 2.2 and 2.3(i), the NE set (7.10) is globally asymptotically stable for all w(0) W. Proof. We consider the full stacked dynamics of the system ẇ = Sw (7.14) ẋ = CSξ F (x Cw) Σ : (7.15) ξ = Sξ Q 1 S T C T F (x Cw) Take the Lyapunov function V (x, ξ, w) = Φ(x Cw) (w ξ)t Q(w ξ) and assume without loss of generality assume Φ(x ) = 0 is the minimum of Φ. Taking the time derivative along the solutions of (7.14) gives V = F (x Cw) T ( CS(w ξ) F (x Cw)) + (w ξ) T Q(S(w ξ) + Q 1 S T C T F (x Cw)) F (x Cw) T F (x Cw) F (x Cw) T CS(w ξ) + (w ξ) T S T C T F (x Cw) F (x Cw) T F (x Cw) 0 (7.16) Since V is radially unbounded in x as Φ is radially unbounded in its argument and w evolves in a compact set, and V = 0 if and only if F (x Cw) = 0 if and only if x Cw = x, the NE set (7.10) is

82 Chapter 7. Potential Games 72 globally asymptotically stable by Theorem Incrementally Passive Internal Models We can extend the results of the previous two sections to include cases where the exosystem models might be nonlinear. Using the same passivity framework, we will show convergence to the NE set for integrator plants with additive nonlinear reference and disturbance with incrementally passive internal models. We will consider the case where each plant can be modelled as ẋ i = u i + d i P i : (7.17) e i = i J i (x i p i, x i p i ) where the disturbance, d i, and reference, p i, can be generated by a nonlinear exosystem of the form ẇ i = s i (w i ) D i : p i = p i (w i ) (7.18) d i = d i (w i ) We note here that the NE set is again dependent upon the exosystem state, w i, and is given by Γ NE = { (x, w) R n W i J i (x i p i (w i ), x i p i (w i )) = 0, i I } (7.19) Under Assumptions 2.2 and 2.3(i) this is the equivalent to x + p(w) = F 1 (0) where F 1 (0) is unique. Note that an equivalent formulation is given by Γ NE = { (x, w) R n W e i = 0, i I } Since the disturbance and reference are nonlinear in nature, we will explicity solve the regulator equations for this case. To remove the dependence on x i, we will do so for the full system and then decompose the solution to be distributed. Therefore, consider the stacked plant ẇ = s(w) ẋ = u + d(w) P : e i = F (x p(w)) (7.20) By substituting into (2.20), we get π(w) = F 1 (0) + p(w) (7.21) Substituting that into (2.18) gives θ(σ(w)) = d(w) + p(w) s(w) (7.22) w

83 Chapter 7. Potential Games 73 Finally substituting into (2.19) gives σ(w) s(w) = η(σ(w), 0) (7.23) w Due to the diagonal structure of p(w) w and separable nature of d(w) and s(w), θ( ) and σ(w) will have a similar, separable structures. We therefore assume that each agent has an internal model of the form ξ i = η i (ξ i, v i ) ū i = θ i (ξ i ) (7.24) Where, the each internal model satisfies θ i (σ i (w i )) = d i (w i ) + p i(w i ) w i s i (w i ) σ i (w i ) w i s i (w i ) = η i (σ i (w i ), 0) (7.25) Assumption 7.1. Each internal model is incrementally passive with storage function, W i (ξ i, ξ i ). Take each individual input to be given by ξ i = η i (ξ i, e i ) (7.26) u i = θ i (ξ i ) e i (7.27) Then the resulting agent dynamics are given by ẋ i = θ i (ξ i ) i J i (x i p i, x i p i ) + d i Σ i : ξ i = η i (ξ i, i J i (x i p i, x i p i )) (7.28) Theorem 7.3. Consider a full-potential game G(I, J i, R ni ) with potential function Φ(x), agent dynamics given by (7.17),(7.26) or Σ i, (7.28). Under Assumptions 2.2, 2.3(i) and (7.1), the NE set (7.19) is globally asymptotically stable for all w(0) W. Consider the stacked agent dynamics Proof. ẇ = s(w) ẋ = θ(ξ) F (x p(w)) + d(w) Σ : ξ = η(ξ, F (x p(w))) (7.29) Consider Lyapunov candidate function V (x, w, ξ, ξ ) = Φ(x p(w)) + W (ξ, ξ ), where W (ξ, ξ ) =

84 Chapter 7. Potential Games 74 i W i(ξ i, ξ i ) and ξ = σ(w). By taking the time dervative along the solutions of (7.29), we get V (x, w, ξ, ξ ) F (x p(w)) T ( F (x p(w)) + d(w) + θ(ξ) p(w) w s(w)) + (θ(ξ) θ(ξ )) T ( F (x p(w)) ( F (x p(w)) T F (x p(w)) + F (x p(w)) T d(w) p(w) ) w s(w) + θ(σ(w)) F (x p(w)) T F (x p(w)) 0 So V 0 and V = 0 if and only if x + p(w) = F 1 (0). Therefore, by Theorem 2.5, the NE set (7.19) is globally asymptotically stable for all w(0) W. In this chapter, we provided an alternative way of looking at an integrator plant model, which led to a passivity based framework. While this framework restricted us to looking solely at potential games, it allowed us to weaken our assumption on the monotonicity of the pseudo-gradient and allow us to consider strictly-monotone games. Finally, it allowed us to consider nonlinear disturbance and reference signals in addition. In the next chapter, we will expand upon the nonlinear plant and exosystem case, this time without using passivity and therefore without the need for potential games at the expense of requiring strong-monotonicity of the pseudo-gradient.

85 Chapter 8 Nonlinear Plant Models In this chapter, we want to show that the framework introduced in the previous chapters can be extended to include a more generic, nonlinear plant model with possibly nonlinear exosystem. In this way, all of the previous results in this paper can be covered by this case. Our results follow closely to those of [22], with our proof following along the same lines as the proof presented there. The main difference is that we replace their assumptions on the high-frequency gain matrix of the error system with strongmonotonicity of the pseudo-gradient. Another difference is that we consider each agent in the game to be independent, whereas [22] considers only one agent. Therefore it is important for us to design a distributed control law. We consider that each agent can be modelled as a first order, control affine, nonlinear plant with dynamics given by ż i = q i (x i, z i, w i ) P i : ẋ i = f i (x i, z i, w i ) + u i (8.1) y i = x i p i (w i ) where x i, u i, y i R ni, z i R vi with an exogenous signal, denoted w i, that can be generated by a nonlinear exosystem D i : ẇ i = s i (w i ) (8.2) where w i W i R qi and W i is compact. Remark 8.1. Many systems of relative degree {1,... 1} can be turned into the form (8.1) via a feedback transformation and suitable change of coordinates. We recognize that this will not always be possible without full knowledge of the exosystem state. However it is always possible for systems of the form ẋ i = f i (x i, w i ) + g i (x i )u i (8.3) y i = h i (x i ) p i (w i ) (8.4) where y i, u i R mi, x i R ni and w i W i R qi, with only knowledge of x i provided they have a well-defined relative degree {1,..., 1}. 75

86 Chapter 8. Nonlinear Plant Models 76 We also assume that each agent has a cost function that given by J i (x i p i (w i ), x i p i (x i )). Therefore, the NE of the game is dependent on w = col(w 1,..., x N ) and is given by Γ NE = { (x, z, w) R n R v W i J i (x i p i (w i ), x i p i w i ) = 0, z i = π i z(w i ) i I } (8.5) where π i z(w i ) to be determined comes from the solution to the regulator equations (2.18)-(2.20). Note that under Assumptions 2.2 and 2.3, an alternative characterization of the NE set is given by Γ NE = { (x, z, w) R n R v W x p(w) = F 1 (0), z i = π i z(w i ), i I } (8.6) where F 1 (0) is a singleton under Assumptions 2.2 and 2.3(iii). We would like to note that this formulation covers all of the previous plant models used. Trivially, the single integrator with either a disturbance or a reference, or both, falls under this category. Example 8.1. Consider the single integrator plant given in (3.12), ẋ i = u i + d i P i : y i = x i with disturbance given by ẇ i = S i w i D i : d i = D i w i Can be put in the form (8.1),(8.2) by omitting z i and taking f i (x i, w i ) = D i w i, p i (w i ) = 0 and s i (w i ) = S i w i. Less trivially, we note that the double and multi-integrator plants discussed in Chapters 5 and 6 also fall under this category. Example 8.2. Consider the double integrator plant given in (5.20), ẋ 1 i P i : = x2 i ẋ 2 i = u i + d i with disturbance given by ẇ i = S i w i D i : d i = D i w i By taking the output to be γ = x 1 i + b ix 2 i and using the inner loop feedback u i = 1 b i x 2 i + 1 b i v i, we get ẋ 1 i = x2 i P i : ẋ 2 i = 1 b i x 2 i + 1 b i v i + d i γ = x 1 i + b ix 2 i

87 Chapter 8. Nonlinear Plant Models 77 Changing coordinates by x 2 i γ, we get ẋ 1 i P i : = 1 b i x 1 i + 1 b i γ γ = v i + b i d i This can be put in the form (8.1),(8.2) by taking x i = γ, z i = x 1 i, f i(x i, z i, w i ) = b i D i w i, q i (x i, z i, w i ) = 1 b i x 1 i + 1 b i γ, p i (w i ) = 0 and s i (w i ) = S i w i. Therefore, this provides us with the most general plant model, as it includes all plants discussed previously in this thesis as well as a certain class of nonlinear plants. Returning to our feedback design, unlike in the previous chapters, wherein our feedback was designed individually and the final analysis done on the combined system, we will design our NE seeking feedback for the stacked system instead. We will then show that this corresponds to a mostly distributed feedback. We will then show convergence for the overall system. Therefore we rewrite the plant (8.1),(8.2) as the stacked system, yielding ẇ = s(w) ż = q(x, z, w) P : ẋ = f(x, z, w) + u y = x p(w) (8.7) Now the problem becomes one of finding control inputs, u i, that minimize the cost functions J i (x i + p i (x i ), x i + p i (x i )) over x i. Using an internal model based approach with high-gain feedback based off of the results of [22], we will show that if we do this, the output of the stacked dynamics will converge to the NE set (8.5). Using (8.7), the solution to the regulator equations (2.23) and (2.24) is given by π z w s(w) = q(πx (w), π z (w), w) (8.8) π x w s(w) = f(πx (w), π z (w), w) θ(w) (8.9) π x (w) p(w) = y (8.10) where y := F 1 (0). Due to the separable nature of x p(w) and the diagonal structure of π w, we can separate the above into the following πi z w s i(w i ) = q i (πi x (w i ), πi z (w i ), w i ) (8.11) πi x s i (w i ) = f i (πi x (w i ), πi z (w i ), w i ) θ i (w i ) w i (8.12) π x i (w i ) p i (w i ) = y i (8.13) Now, we will discuss the design of the internal model. for each agent, i I, pick a controllable pair (G i, H i ) R di di R di 1 as defined in Lemma 2.3, with G i := blkdiag(g i,..., G i ) and H i :=

88 Chapter 8. Nonlinear Plant Models 78 blkdiag(h i,..., H i ). Then by Lemma 2.3, (G i, H i ) satisfy the following σ i j w i s i (w i ) = G i σ i j(w i ) + H i [θ i (w i )] j [θ i (w i )] j = φ i j(σ i j(w i )) By setting σ i (w i ) = col(σ i 1(w i ),..., σ i n i (w i )) φ i (ξ i ) = col(φ i 1(ξ i 1),..., φ i n i (ξ i n i )) The above can be rewritten as σ i w i s i (w i ) = G i σ i (w i ) + H i θ i (w i ) (8.14) θ i (w i ) = φ i (σ i (w i )) (8.15) By further setting σ(w) = col(σ 1 (w 1 ),..., σ N (w N )) φ(ξ) = col(φ 1 (ξ 1 ),..., φ N (ξ N )) G = blkdiag(g 1,..., G N ) H = blkdiag(h 1,..., H N ) We can stack all the exosystems together and write σ s(w) = Gσ(w) + Hθ(w) w θ(w) = φ(σ(w)) Assumption 8.1. A solution to (2.23) and (2.24) exists and the system (8.7) with output ỹ = x i p i (w i ) F 1 (0) is uniformly detectable relative to the compact set Γ NE, (8.5), with linear gain function, i.e., there exists β(, ) KL and l R such that (w, x, z) ΓNE β( w(0), x(0), z(0) ΓNE, t) + l ỹ [0,t) Assume that the game is characterized by full-information, i.e., each agent has full information about all the other agents realized actions, y i. Consider then a new choice of output function for each agent e i = i J i (x i p i (w i ), x i p i (w i )) which is possible since each agent as full knowledge of x p(w). In stacked form, where e := col(e 1,..., e N ), e = F (x p(w)) This choice of function is a natural change in output as e = 0 corresponds uniquely to x p(w) = F 1 (0)

89 Chapter 8. Nonlinear Plant Models 79 without need for a priori knowledge of F 1 (0). With this output function, the new dynamics for stacked system is given by ẇ = s(w) ż = q(x, z, w) ẋ = f(x, z, w) + u e = F (x p(w)) (8.16) The error dynamics are given by ( ė = D F (x p(w)) f(x, z, w) p ) w s(w) + D F (x p(w))u (8.17) Under Assumption 2.3(iii), D F (x p(w)) has full rank everywhere, therefore (8.16) has well-defined vector relative degree {1,..., 1} everywhere. We will also show that strong monotonicity of the pseudogradient maintains the uniform detectability property. Proposition 8.1. Under Assumptions 2.2, 2.3(iii) and 8.1, system (8.16) is uniformly detectable relative to Γ NE, (8.5), with linear gain function, i.e., there exists β(, ) KL and l R such that (w, x, z) ΓNE β ( w(0), x(0), z(0) ΓNE, t) + l e [0,t) Proof. By Assumption 2.3(iii), (x p(w) y ) T (F (x p(w)) F (y )) µ x p(w) y 2 Which implies that F (x p(w)) F (y ) µ x p(w) y Which gives the estimate e 0 µ x p(w) y So, by Assumption 8.1 we can conclude that (w, x, z) ΓNE β( w(0), x(0), z(0) ΓNE, t) + l µ e [0,t) We will leverage the uniform detectability of the overall plant by using a high-gain error feedback stabilization method for the NE set. Therefore, using the internal model (8.14), each agent uses a

90 Chapter 8. Nonlinear Plant Models 80 feedback of the form ξ = G i ξ i + H i (φ i (ξ i ) + ke i ) u i = φ i (ξ i ) ke i (8.18) where k > 0, to be designed, is a universal gain parameter that is the same across all agents. Remark 8.2. It is important to note that the need for a universal gain parameter, chosen as to be the same across all agents makes this algorithm not entirely distributed, as all the agents must agree upon a k. We will revisit the integrator example to show what this feedback would look like for the linear case. Example 8.3. Consider again, the single-integrator plant given by ẋ i = u i + d i P i : y i = x i with disturbance given by ẇ i = S i w i D i : d i = D i w i where w i W i R qi. To design the internal model, fix G i R qi qi and H i R qi to be a controllable pair. Then there exists Φ i R 1 qi such that G i + H i Φ i has the same spectrum as S i. There exists a nonsingular j i such that Rqi qi j i S iw i = (G i + H i Φ i ) j i w i [D i w i ] j = Φ i j i w i for j = 1,..., n i. Using this, we can take the feedback as ξ = G i ξ i + H i (Φ i ξ i + ke i ) u i = Φ i ξ i ke i where Φ i = blkdiag(φ i,..., Φ i ). We see that this provides a linear internal model of the linear exosystem. Returning to the analysis, we will examine the behaviour of the stacked, closed loop system. The dynamics of which are given by ẇ = s(w) ż = q(x, z, w) ẋ = ke φ(ξ) + f(x, z, w) Σ : ξ = Gξ + H(φ(ξ) + ke) e = F (x p(w)) (8.19)

91 Chapter 8. Nonlinear Plant Models 81 Where now, the error dynamics follow ( ė = D F (x p(w)) ke φ(ξ) + f(x, z, w) p ) w s(w) (8.20) Consider the following coordinate transformation ξ η := 1 (ξ σ(w)) k e ζ := e + 1 (φ(ξ) φ(σ(w))) (8.21) k In these new coordinates, the dynamics of the closed-loop system are given by ẇ = s(w) ż = q(x, z, w) ẋ = ke φ(ξ) + f(x, z, w) η = Gη + Hζ ζ = k D F (x p(w))ζ + δ 0 (x, z, w) + δ 1 (w, η, ζ) (8.22) where [ δ 0 (x, z, w) = D F (x p(w)) f(x, z, w) θ(w) p ] w s(w) δ 1 (w, η, ζ) = 1 [ φ(kη + σ(w)) φ(σ(w))][gσ(w) + Hφ(σ(w))] k + φ(kη + σ(w))(gσ(w) + Hφ(σ(w))) We will now state two assumptions about the structure of the dynamics Assumption 8.2. There exists c 0 > 0 such that for all (x, z, w) R m R n m W ( D F (x p(w)) (f(x, z, w) θ(w)) p ) w s(w) c0 (w, x, z) ΓNE Assumption 8.3. φ( ) is C 2, bounded and has bounded gradient φ( ). Using these two assumptions, we can get necessary bounds on some of the terms in our dynamics that are useful for proving global convergence. Lemma 8.1. Under Assumptions 8.2 and 8.3 there exist L 0, c 1, c 2 > 0 all independent of k such that 1 k (φ(kη + σ(w)) φ(σ(w)) L0 η δ 1 (w, η, ζ) c 1 ζ + c 2 η

92 Chapter 8. Nonlinear Plant Models 82 Proof. The first follows from the fact that Assumption 8.3 means that φ( ) is globally Lipschitz 1 k (φ(kη + σ(w)) φ(σ(w))) 1 k L 0 kη + σ(w) σ(w) L 0 η which gives the first inequality. Since φ( ) is bounded, there exists L 1 > 0 such that φ(kη + σ(w)) L 1 Since W is compact, there exists L 2 > 0 such that Gσ(w) + Hφ(σ(w)) L 2 Since φ( ) is globally Lipschitz, there exists L 3 > 0 such that 1 k ( φ(kη + σ(w)) φ(σ(w)) L 3 Therefore, we can conclude that δ 1 (w, η, ζ) c 1 ζ + c 2 η We will now show that the (x, z, w, η) subsystem, in the new coordinates, is uniformly detectable with ζ as an output. Proposition 8.2. Suppose Assumptions 8.1, 8.2 and 8.3 hold. Then (8.22) with ζ as an output is uniformly detectable relative to Γ NE = {(x, z, w, η) (x, z, w) Γ NE, η = 0} with linear gain function. Proof. Consider the (x, w) dynamics in (8.22). From Proposition 8.1, (x, z, w) ΓNE β( (x(0), z(0)w(0)) ΓNE, t) + l e [0,t) β( (x(0), z(0), w(0)) ΓNE, t) + l ζ + 1 k (φ(kη + σ(w)) φ(σ(w))) [0,t) β( (x(0), z(0), w(0)) ΓNE, t) + l( ζ [0,t) + L 0 η [0,t) ) Since G is Hurwitz, there exist d 0, d 1, and λ 0 all > 0 such that η(t) d 0 η(0) e λ0t + d 1 ζ [0,t) Combining the above equations, using properties of the cascade of ISS systems, gives (x, z, w, η) Γ NE β ( (x(0), z(0), w(0), η(0)) Γ NE, t) + l ζ [0,t) for some β (, ) KL and l > 0. We will now show that the ζ subsystem, in the new coordinates, is uniformly detectable with

93 Chapter 8. Nonlinear Plant Models 83 (x, z, w, η) as an output. We will use this in conjunction with the previous proposition to show express the overall system as the feedback interconnection of uniformly detectable systems. Proposition 8.3. Suppose Assumption 2.3(iii) holds, then for any number ν > 0, these exists a number k such that for all k > k, there exists ˆβ(, ) KL such that for every ζ(0), the inequality ζ ˆβ( ζ(0), t) + ν (x, z, w, η) Γ NE,[0,t) holds along the solutions of (8.22). Proof. Consider the Lyapunov candidate function V (ζ) = 1 2 ζ 2 (8.23) Taking the derivative along the solutions of (8.22) V = kζ D F (x p(w))ζ + ζ T δ 0 (x, z, w) + ζ T δ 1 (w, η, ζ) k µ ζ 2 + c 0 (x, z, w) Γ NE ζ + c 1 η ζ + c 2 ζ 2 (k µ c 2 ) ζ 2 + c 0 (x, z, w, η) Γ NE Let ν > 0 be a fixed number. Choose k > 0 such that there exists k 2 > 0 satisfying We see that if ζ ν (x, z, w, η) Γ NE, then k µ c 2 + c 0 ν + k 2 V k 2 ζ 2 Therefore, we can conclude that if V (ζ) 1 2 ν2 ( (x, w, z) Γ NE ) 2 (8.24) Then V satisfies V 2k 2 V which means that there exist M, λ 1 > 0 such that ζ Me λ1t ζ(0) Otherwise, we have that V (ζ) 1 2 ν2 ( (x, z, w, η) Γ NE ) 2 which means that ζ 2 ν 2 ( (x, z, w, η) Γ NE ) 2

94 Chapter 8. Nonlinear Plant Models 84 and thus ζ ν (x, z, w, η) Γ NE Combining the above inequalities gives us the final estimate ζ Me λ1t ζ(0) + ν (x, z, w, η) Γ NE,[0,t)] where ν > 0 can be arbitrarily chosen by suitable choice of k. Theorem 8.1. Given a game G(I, J i, R ni ) with agent dynamics given by (8.1),(8.18). Suppose Assumptions 2.2, 2.3(iii), 8.1, 8.2 and 8.3 hold, then there exists a k > 0 such that for all k > k, the NE set (8.5) is globally asymptotically stable for (8.19) for all w(0) W. Proof. By Proposition 8.2, we have that there exist β (, ) KL and l > 0 such that (x, z, w, η) Γ NE β ( (x(0), z(0), w(0), η(0)) Γ NE, t) + l ζ [0,t) which means that the (x, z, w, η) subsystem in ISS. Similarly, by Proposition 8.3, we have that there exist β (, ) KL such that ζ β ( ζ(0), t) + ν (x, z, w, η) Γ NE,[0,t) where ν > 0 can be arbitrarily chosen by suitable choice of k. Therefore the ζ subsystem is ISS with input (x, z, w, η). Therefore, if we choose k such that l ν < 1 for all k > k, then by Theorem 2.11, the feedback interconnection of the two subsystems is asymptotically stable and therefore Γ NE in the original coordinates is asymptotically stable. In this chapter, we presented an NE seeking algorithm for a class of nonlinear plant models. We noted that this captures all of the previous plant models discussed in the thesis. We note that this algorithm depends upon the selection of a universal gain parameter, which limits its usefulness in a purely distributed setting. This algorithm also requires full-information, so it cannot be used in a partial-information set-up.

95 Chapter 9 Numerical Results In this chapter, we present numerical results for the algorithms presented in this thesis. First, we consider an academic quadratic potential game example and compare the algorithms from Chapters 4 and 7. We then present two real-world examples: an optical network OSNR game where we investigate the algorithms from Chapter 4 and a mobile robotics sensor-network game where we investigate the algorithms from Chapters 5 and Quadratic Game Example 9.1. Consider a quadratic game which 20 players with cost function given by J i (x i, x i ) = c i (x i ) x i f(x), where c i (x i ) = ( (i 1))x i and f(x) = 2200 j I x j. We consider that each agent is modelled as an integrator plant (3.2) and uses a basic gradient descent algorithm (2.31). Figure 9.1: NE found by basic gradient descent algorithm (2.31) We will compare the algorithms designed in this thesis to the Nash equilibrium found in Figure

96 action Chapter 9. Numerical Results Full-Information Potential Game with Disturbance Example 9.2. Consider a game with the same cost functions as defined in Example 9.1. We consider additionally that each agent has plant dynamics given by (3.4) or ẋ i = u i +d i, where d i (t) = 50 sin(20t)+ 50 sin(23t). We investigate the proposed agent dynamics (7.5) time (s) (a) Gradient-play dynamics (2.31) subject to disturbances (b) Agent dynamics Σ i (7.5) Figure 9.2: Comparison of gradient dynamics and the algorithm given by (7.5) showing the effectiveness of the designed algorithm As shown in Figure 9.2, the gradient descent algorithm ẋ = F (x)+d does not converge to the Nash equilibrium, but rather has noise surrounding the steady-state values. The internal-model-based control algorithm mitigates these disturbances and converges to the same NE values as found in Example 9.1. The proposed internal model based algorithm has some sustained oscillations, which should not theoretically be there. They are much smaller in amplitude than the oscillations caused by the disturbance. In theory, these should be fully rejected. Strongly Montone Game with Disturbance Example 9.3. Consider a game with the same cost functions as defined in Example 9.1 with plant dynamics given by (3.4) or ẋ i = u i + d i, where d i (t) = 50 sin(20t) + 50 sin(23t). We investigate the proposed agent dynamics (4.5). As shown in Figure 9.3, the gradient descent algorithm ẋ = F (x)+d does not converge to the Nash equilibrium, but rather has noise surrounding the steady-state values. The internal-model-based control algorithm mitigates these disturbances and converges to the same NE values as found by ẋ = F (x). Compared to the passivity based algorithm, the disturbance is rejected much quicker and there are no sustained oscillations around the NE values. Potential Game with Reference Example 9.4. Consider a game with the same cost functions as defined in Example 9.1. We consider additionally that each agent has plant dynamics given by (3.12) or ẋ i = u i, y i = x i p i, where

97 action Chapter 9. Numerical Results (a) Gradient-play dynamics (2.31) subject to disturbances time (s) (b) Agent dynamics (4.5) Figure 9.3: Comparison of gradient dynamics and the algorithm given by (4.5) showing the effectiveness of the designed algorithm p i (t) = 5 sin(t) + sin(4t). We investigate the proposed agent dynamics (7.13). As shown in Figure 9.4, the gradient descent algorithm ẋ = F (x p) does not converge to the Nash equilibrium, but has a small error between the action and the NE trajectory. The internal model based control algorithm converges to the NE without any error. The proposed internal model based algorithm successfully tracks the NE trajectory. Strongly Monotone Game with Reference Example 9.5. Consider a game with the same cost functions as defined in Example 9.1. We consider additionally that each agent has plant dynamics given by (3.12) or ẋ i = u i, y i = x i p i, where p i (t) = 5 sin(t) + sin(4t). We investigate the proposed learning algorithm with dynamics given by (4.22). As shown in Figure 9.5, the gradient descent algorithm ẋ = F (x p) does not converge to the Nash equilibrium, but has a slight error between the action and the NE. The proposed algorithm converges to the NE values without any error. Compared to the passivity based algorithm, the disturbance is rejected much quicker Partial-Information In this section, we will investigate the performance of the partial-information algorithms. The same example as above will be used and the communication graph will be given by a random graph, detailed in Figure 9.6. Strongly Monotone Game with Disturbance - Random Graph Example 9.6. Consider a game with the same cost functions as defined in Example 9.1. We consider additionally that each agent has agent dynamics given by (3.4) or ẋ i = u i + d i, where d i (t) = 50 sin(20t) + 50 sin(23t) and that each player has local information that is communicated over a random communication graph, G c. We investigate the proposed agent dynamics (4.11).

98 88 Chapter 9. Numerical Results 200 action time (s) 60 (a) Gradient-play dynamics (2.31) with reference signal (b) Agent dynamics Σi (7.13) Action of Agent 8 NE Trajectory Action of Agent 8 NE Trajectory 130 action (c) Detail of Agent 8 Gradient-play dynamics (2.31) with reference signal showing slight error between the action and the NE Trajectory time (s) 75 (d) Detail of Agent 8 dynamics Σi (7.13) showing convergence to the NE Figure 9.4: Comparison of gradient dynamics and the algorithm given by (7.13) showing the effectiveness of the designed algorithm As shown in Figure 9.7, the networked information base algorithm rejects the disturbance and converges to the same NE found in the full-information case. This algorithm converges slowly compared to the full-information. This is due to the fact that the information must be communicated over a communication graph. Strongly Monotone Game with Reference - Random Graph Example 9.7. Consider a game with the same cost functions as defined in Example 9.1. We consider additionally that each agent has plant dynamics given by (3.12) or x i = ui, yi = xi pi, where pi (t) = 5 sin(t) + sin(4t) and that each player has local information that is communicated over a random communication graph, Gc. We investigate the proposed learning algorithm given by (4.28). As shown in Figure 9.8, the networked information base algorithm converges to the same NE found in the full-information case. This algorithm converges slowly compared to the full-information. This is

99 action action action Chapter 9. Numerical Results time (s) (a) Gradient-play dynamics (2.31) subject to disturbances time (s) (b) Agent dynamics (4.22) subject to disturbances 130 Action of Agent 8 NE Trajectory 130 Action of Agent 8 NE Trajectory (c) Detial of Agent 8 gradient-play dynamics (2.31) showing a slight error between the action and the NE trajectory time (s) (d) Detial of Agent 8 dynamics (4.22) showing convergence to the NE Figure 9.5: Comparison of gradient dynamics and the algorithm given by (4.22) showing the effectiveness of the designed algorithm due to the fact that the information must be communicated over a communication graph. 9.2 OSNR Game Consider an optical signal-to-noise ratio (OSNR) model for wavelength-division multiplexing (WDM) optical links [10]. For this example, assume that a set of 10 channels, I = {1,..., 10}, are transmitted over an optically amplified link. We consider each channel as an agent and denote each agent s transmitting power as x i, while the noise power of each channel as n 0 i. We consider a game wherein each agent attempts to maximize its OSNR on its channel by adjusting its transmission power. Each agent has a cost function as in [43], given by

100 action Chapter 9. Numerical Results Figure 9.6: Random Communication Graph, G c (a) Gradient-play dynamics (2.34) subject to disturbances time (s) (b) Agent dynamics (4.11) subject to disturbances Figure 9.7: Comparison of gradient dynamics and the algorithm given by (4.11) showing the effectiveness of the designed algorithm 1 J i (x i, x i ) = a i x i + P 0 j I x j ( x ) i b i ln 1 + c i n 0 i + j I,j i Γ ijx j (9.1) where a i > 0 is a pricing parameter, P 0 is the total power target of the link, b i > 0, and Γ = [Γ ij ] is the link system matrix. We choose these parameters to be the same as in [44]. We consider that each channel (agent) has plant dynamics given by (3.12) or ẋ i = u i + d i, with disturbance due to the presence of pilot-tones, modelled as d i = P 0 [1 + m i sin(2πf i t)] as in [5], where m i = 0.01i is the unknown modulation index and f i = 10i khz is the known frequency, for each i I Full-Information Example 9.8. First, we consider that each agent has full information about the others actions. In Figure 9.9b and Figure 9.9a, we compare the results of running agent dynamics, (4.5), with those under

Topic # /31 Feedback Control Systems. Analysis of Nonlinear Systems Lyapunov Stability Analysis

Topic # /31 Feedback Control Systems. Analysis of Nonlinear Systems Lyapunov Stability Analysis Topic # 16.30/31 Feedback Control Systems Analysis of Nonlinear Systems Lyapunov Stability Analysis Fall 010 16.30/31 Lyapunov Stability Analysis Very general method to prove (or disprove) stability of

More information

Multi-Robotic Systems

Multi-Robotic Systems CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed

More information

Stabilization and Passivity-Based Control

Stabilization and Passivity-Based Control DISC Systems and Control Theory of Nonlinear Systems, 2010 1 Stabilization and Passivity-Based Control Lecture 8 Nonlinear Dynamical Control Systems, Chapter 10, plus handout from R. Sepulchre, Constructive

More information

EN Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015

EN Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015 EN530.678 Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015 Prof: Marin Kobilarov 0.1 Model prerequisites Consider ẋ = f(t, x). We will make the following basic assumptions

More information

Lecture 4. Chapter 4: Lyapunov Stability. Eugenio Schuster. Mechanical Engineering and Mechanics Lehigh University.

Lecture 4. Chapter 4: Lyapunov Stability. Eugenio Schuster. Mechanical Engineering and Mechanics Lehigh University. Lecture 4 Chapter 4: Lyapunov Stability Eugenio Schuster schuster@lehigh.edu Mechanical Engineering and Mechanics Lehigh University Lecture 4 p. 1/86 Autonomous Systems Consider the autonomous system ẋ

More information

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1 Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems p. 1/1 p. 2/1 Converse Lyapunov Theorem Exponential Stability Let x = 0 be an exponentially stable equilibrium

More information

Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems

Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems p. 1/5 Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems p. 2/5 Time-varying Systems ẋ = f(t, x) f(t, x) is piecewise continuous in t and locally Lipschitz in x for all t

More information

Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems

Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems 1 Decentralized Stabilization of Heterogeneous Linear Multi-Agent Systems Mauro Franceschelli, Andrea Gasparri, Alessandro Giua, and Giovanni Ulivi Abstract In this paper the formation stabilization problem

More information

Nonlinear Control. Nonlinear Control Lecture # 8 Time Varying and Perturbed Systems

Nonlinear Control. Nonlinear Control Lecture # 8 Time Varying and Perturbed Systems Nonlinear Control Lecture # 8 Time Varying and Perturbed Systems Time-varying Systems ẋ = f(t,x) f(t,x) is piecewise continuous in t and locally Lipschitz in x for all t 0 and all x D, (0 D). The origin

More information

1 The Observability Canonical Form

1 The Observability Canonical Form NONLINEAR OBSERVERS AND SEPARATION PRINCIPLE 1 The Observability Canonical Form In this Chapter we discuss the design of observers for nonlinear systems modelled by equations of the form ẋ = f(x, u) (1)

More information

Lyapunov Stability Theory

Lyapunov Stability Theory Lyapunov Stability Theory Peter Al Hokayem and Eduardo Gallestey March 16, 2015 1 Introduction In this lecture we consider the stability of equilibrium points of autonomous nonlinear systems, both in continuous

More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

OSNR Optimization in Optical Networks: Extension for Capacity Constraints
