The Pennsylvania State University Schreyer Honors College. Department of Mathematics. An Optimal Control Problem Arising from an Evolutionary Game


The Pennsylvania State University
Schreyer Honors College
Department of Mathematics

An Optimal Control Problem Arising from an Evolutionary Game

Wente Brian Cai
Spring 2014

A thesis submitted in partial fulfillment of the requirements for a baccalaureate degree in Mathematics with honors in Mathematics.

To be reviewed and approved* by the following:

Christopher Griffin
Research Associate & Asst. Professor
Thesis Supervisor

Nate Brown
Professor of Mathematics
Honors Adviser

*Signatures will be on file if this thesis is approved.

Abstract

This paper is an integrative study of evolutionary game theory and optimal control. We first cover the basics of evolutionary game theory and introduce the model we study, based on the game of Rock-Paper-Scissors. We then give an introduction to optimal control and identify the conditions that must be satisfied for a control to be optimal. Finally, we explore different ways of modeling the Rock-Paper-Scissors game as an optimal control problem and try to find an optimal control that minimizes the cost of the game. Various linearization schemes are attempted and the results are discussed.

Contents

List of Figures
1 Introduction
2 Preliminaries and Literature Review
  2.1 Evolutionary Game Theory
  2.2 Optimal Control Theory
  2.3 Linear Quadratic Control Example
  2.4 Sufficient Conditions for Optimal Control
  2.5 Problem Statement: Relating Optimal Control to Game Theory
3 Optimal Control of Generalized Rock-Paper-Scissors
  3.1 Constructing a General Form of the Problem
  3.2 Linearizing the Problem
  3.3 A Linear Problem
4 Future Work and Conclusions
  4.1 Future Work
  4.2 Conclusion
Bibliography

List of Figures

1 (a) An illustration of an unstable Nash equilibrium fixed point when a = -5; (b) an illustration of a non-linear center Nash equilibrium fixed point when a = 0; (c) an illustration of a stable Nash equilibrium fixed point when a = 5
2 Computed optimal control of rock-paper-scissors evolutionary game: x(0) = 0.5, y(0) = 0.2, z(0) = 0.3, T = 40, σ = 1/100
3 Computed optimal control of rock-paper-scissors evolutionary game variation: x(0) = 0.5, y(0) = 0.2, z(0) = 0.3, T = 40, σ = 1/100
4 Computed optimal control of partially linearized dynamics: x(0) = 0.3 - 1/3, y(0) = 0.3 - 1/3, z(0) = 0.4 - 1/3
5 Optimal control of fully linearized dynamics: x(0) = 0.3 - 1/3, y(0) = 0.3 - 1/3, z(0) = 0.4 - 1/3, σ = 1

1 Introduction

Evolutionary game theory studies the decision making of games over time: dominant strategies survive and are passed on to the next generation while failing strategies are phased out. We start with a large population of players playing a two-player game. Each player is assigned a strategy (by genetics), and the resulting payoff from play determines which strategy was more successful and thus which strategy reproduces. We can then track the growth or decline of strategies in the population and predict the trajectory of each strategy. Our goal is to study the dynamics of the game while we dynamically alter its payoffs. In the case of Rock-Paper-Scissors, it is understood that the optimal strategy is to pick rock, paper, or scissors in equal proportions (1/3, 1/3, 1/3). Consequently, this forms a fixed point of the evolutionary game resulting from rock-paper-scissors play. Under certain payoff assumptions, in a population with many scissors, many rocks, and only some paper, it is clear that the population of scissors will decrease (from lack of reproduction) while the number of paper players increases. For certain classes of payoff, this process will lead to the fixed point (1/3, 1/3, 1/3), while in other cases we may observe cyclic behavior. In the case of convergent behavior, how quickly that fixed point is reached depends on the game: the larger the difference between the winning and losing payoff (i.e., the bigger the gap between winning and losing), the quicker the population approaches the fixed point strategy. In this thesis, we consider the rock-paper-scissors payoff matrix:

        [  0    -1    1+a ]
    A = [ 1+a    0    -1  ]
        [ -1    1+a    0  ]

Our objective is to dynamically alter the value of a (rather than leaving it a static value) in order to drive the population toward the fixed point. However, we assume that altering the value of a may be hard, and we therefore attempt to minimize the total cost associated with driving the population to the prescribed fixed point.
We study this problem theoretically and numerically, showing two solutions of the resulting non-linear optimal control problem with interesting properties. We then attempt to linearize the control problem to obtain a closed form solution. Unfortunately, in so doing we lose some of the characteristics of the control problem and obtain an entirely new control problem with a new solution structure. In the future work section we then discuss how to recover the original problem while maintaining a linear dynamical system, at the price of introducing non-linearity and non-convexity into the objective function.

2 Preliminaries and Literature Review

2.1 Evolutionary Game Theory

Game theory is the study of decision making when an agent's payoff is affected by the decisions of (many) other agents and is based on the payoff one gets for using certain strategies [2, 3, 5, 9, 12, 13]. For us a game consists of two players using a set K of n pure strategies denoted by i = 1, 2, ..., n. When Player I uses strategy m ∈ K and Player II uses strategy n ∈ K, there is a payoff a_mn. With values a_ij, i, j = 1, 2, ..., n, we can create an n × n payoff matrix A for Player I (when we assume the game is symmetric, Player II's payoff matrix is A^T) [12]. Game theory is an extensive subject; the interested reader should consult [2, 3, 5, 9, 12, 13] for details. All standard game theoretic definitions given in this thesis can be found in these references. A player's mixed strategy is a column vector x = (x_1, x_2, ..., x_n)^T where each x_i is the probability of using strategy i ∈ K. Let S_n be the set of all mixed strategies in the game. If Players I and II use mixed strategies x, y ∈ S_n, the expected payoff for Player I is x^T A y and the expected payoff for Player II is x^T A^T y. The strategy x is said to be a best response to y if

    x^T A y ≥ z^T A y  for all z ∈ S_n    (1)

Definition 2.1. A strategy x* is a (Nash) equilibrium for the symmetric two-player game with payoff matrix A if:

    (x*)^T A x* ≥ y^T A x*  for all y ∈ S_n    (2)

In evolutionary game theory, we observe how a population of players changes its strategies over time. Consider a large population playing the same symmetric game we defined earlier. Let p_i(t) ≥ 0 be the number of individuals at time t using strategy i and let p(t) = Σ_{i ∈ K} p_i(t) > 0 be the total population. The population state is the vector x(t) = (x_1(t), x_2(t), ..., x_n(t))^T such that each x_i(t) is the proportion of the population using strategy i at time t. The expected payoff for strategy i ∈ K is then e_i^T A x and the population average payoff is x^T A x. Here e_i is the i-th standard basis vector in R^n, when we have n strategies. Our goal is to derive a differential equation that determines the growth or decline of strategy proportions in the population. In our scenario, the growth and decline of strategies depend on the fitness of the strategy. In other words, the more successful a strategy is, the quicker the population adopts it over time while poorer strategies die off. The following derivation is due to [15]. We can define the change in the number of individuals using strategy i as

    ṗ_i = [β + e_i^T A x - δ] p_i    (3)

where β ≥ 0 is the initial fitness of individuals in the population and δ ≥ 0 is the death rate for all individuals. By definition, p(t) x_i(t) = p_i(t). Differentiating this identity with respect to t and using Equation 3, we get:

    p ẋ_i = ṗ_i - ṗ x_i = [β + e_i^T A x - δ] p_i - [β + x^T A x - δ] p x_i    (4)

If we divide both sides by p, we obtain the replicator dynamics:

    ẋ_i = [e_i^T A x - x^T A x] x_i,  i = 1, ..., n    (5)

The replicator dynamics tell us that strategies that give a greater-than-average payoff grow in the population and strategies that give a less-than-average payoff decline. The rest points of the replicator dynamics are the zeros of the right hand side of the replicator equation; i.e., all points x ∈ S_n such that e_i^T A x = x^T A x. A rest point x is stable if every neighborhood B of x contains a neighborhood B' of x such that if we start at x_0 ∈ B', then the solution flow ϕ(t, x_0) ∈ B for all t ≥ 0. In other words, x is stable if whenever a point starts in a neighborhood of x, it stays contained in that neighborhood over time. It is asymptotically stable if there exists such a B' with lim_{t→∞} ϕ(t, x_0) = x for all x_0 ∈ B'. The Folk Theorem of Evolutionary Game Theory explains the relationship between rest points and Nash equilibria [6, 7].

Theorem 2.2 (Folk Theorem of Evolutionary Games). Consider the replicator dynamics given in Equation 5.
1. If x is a Nash equilibrium, then it is a rest point.
2. If x is a strict Nash equilibrium, then it is asymptotically stable.
3. If the rest point x is the limit of an interior orbit, then x is a Nash equilibrium.
4. If the rest point x is stable, then it is a Nash equilibrium.

For the remainder of this thesis, we will explore a two-player, symmetric game: Rock, Paper, Scissors. In this game, each pure strategy dominates (i.e., beats) exactly one other. Every child in the world knows: rock beats scissors, scissors beat paper, and paper beats rock. So Strategy 1 dominates 2, Strategy 2 dominates 3, and Strategy 3 dominates 1.
If two players pick the same strategy, the payoff is 0. Our payoff matrix is the following and is adapted from [7]:

        [  0    -1    1+a ]
    A = [ 1+a    0    -1  ]    (6)
        [ -1    1+a    0  ]
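As a concrete illustration of Equations 5 and 6, the replicator dynamics can be integrated numerically. The following is a minimal sketch (not part of the original thesis); the forward-Euler step size and the sample values of a are illustrative choices, and the determinant formula in the comment is our own computation:

```python
import numpy as np

def payoff_matrix(a):
    # Rock-Paper-Scissors payoff matrix of Equation (6)
    return np.array([[0.0, -1.0, 1.0 + a],
                     [1.0 + a, 0.0, -1.0],
                     [-1.0, 1.0 + a, 0.0]])

def simulate(x0, a, T=40.0, dt=0.01):
    # Forward-Euler integration of the replicator dynamics (Equation 5)
    A = payoff_matrix(a)
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        Ax = A @ x
        x = x + dt * x * (Ax - x @ Ax)   # xdot_i = x_i (e_i^T A x - x^T A x)
    return x

# For a > 0 interior orbits converge to the equilibrium (1/3, 1/3, 1/3)
print(simulate([0.5, 0.2, 0.3], a=1.0))

# The sign of det A tracks the sign of a (here det A works out to (1+a)^3 - 1),
# which is the criterion appearing in Zeeman's theorem below
print(np.linalg.det(payoff_matrix(1.0)), np.linalg.det(payoff_matrix(-0.5)))
```

Re-running the same routine with a = 0 traces closed orbits around (1/3, 1/3, 1/3) instead of converging, matching the three cases illustrated in Figure 1.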

Zeeman [16] proved the following regarding Rock, Paper, Scissors:

Theorem 2.3 (Zeeman's Theorem [16]). The following conditions are equivalent for the Rock-Scissors-Paper game:
1. x* is asymptotically stable
2. x* is globally stable
3. det A > 0
4. (x*)^T A x* > 0

What this tells us is that if we choose a > 0, then det A > 0, which results in all orbits in the interior of S_n converging to x*. If a < 0, then det A < 0 and all orbits in the interior of S_n converge to the boundary of S_n. Finally, if a is zero, then det A = 0 and all orbits in the interior of S_n are closed orbits around x*. This is illustrated in Figure 1.

Figure 1: (a) An illustration of an unstable Nash equilibrium fixed point when a = -5; (b) an illustration of a non-linear center Nash equilibrium fixed point when a = 0; (c) an illustration of a stable Nash equilibrium fixed point when a = 5.

2.2 Optimal Control Theory

An optimal control problem [1, 4, 8] deals with finding the time-varying optimal controls (or inputs) needed to maximize or minimize some cost expressed as a functional of the time-varying state and the control. For the purposes of this thesis, we will investigate optimal control problems that depend only implicitly on time. Let x_1(t), ..., x_n(t) be the list of state variables and u_1(t), ..., u_m(t) be the control

inputs. The system dynamics are the set of n first order differential equations that dictate the path of the state variables:

    ẋ_1 = g_1(x_1(t), ..., x_n(t), u_1(t), ..., u_m(t), t)    (7)
    ...
    ẋ_n = g_n(x_1(t), ..., x_n(t), u_1(t), ..., u_m(t), t)    (8)

The homogeneous state equation can then be defined as:

    ẋ(t) = g(x(t), u(t))    (9)

such that

    x(t) = (x_1(t), ..., x_n(t))^T    (10)

is the vector of our state variables and

    u(t) = (u_1(t), ..., u_m(t))^T    (11)

is the vector of controls. We will focus on a specific form for the optimal control problem. Our goal is to find an optimal control u* that minimizes (or maximizes) our cost functional:

    J = Ψ(x(T)) + ∫_0^T f(x(t), u(t)) dt    (12)

subject to our state equation constraints

    ẋ = g(x(t), u(t))    (13)

where T is a maximum time we will consider. If T = ∞, then we require infinite time-horizon controls, which we will not consider for simplicity. The interested reader may consult [4, 8, 14] for details on alternate forms of optimal control problems. For the remainder of this thesis, we will consider a minimization problem of Bolza type:

    min  Ψ(x(T)) + ∫_0^T f(x(t), u(t)) dt
    s.t. ẋ = g(x(t), u(t))    (14)

We note that when Ψ(x(T)) ≡ 0, the problem is a Lagrange type problem, which we will also consider. Optimal control problems can be difficult to solve. Fortunately, necessary conditions that an optimal control u* must satisfy can be derived. These conditions are generally derived using the Hamiltonian of the control problem, defined as

    H(x(t), u(t), λ) = f(x(t), u(t)) + λ^T g(x(t), u(t))    (15)

where f is the cost function in Expression 12, g is the right hand side of the state equations, and λ is a vector of co-state multipliers.¹ The following theorem is proven in most standard references on optimal control; see e.g., [1, 4, 8].

Theorem 2.4 (Necessary Conditions for Optimal Control). Consider the Bolza problem given in Expression 14. If u* is an optimal control, then

    H(x*(t), u*(t), λ*(t)) ≤ H(x*(t), u(t), λ*(t))    (16)

for all t ∈ [0, T] and for all admissible inputs u ∈ U, and u* must satisfy the following conditions:
1. Pontryagin's Minimum Principle: ∂H/∂u = 0 and ∂²H/∂u² is positive definite,
2. Co-State Dynamics: λ̇(t) = -∂H/∂x = -(λ^T(t) ∂g(x,u)/∂x + ∂f(x,u)/∂x),
3. State Dynamics: ẋ(t) = ∂H/∂λ = g(x, u),
4. Initial Condition: x(0) = x_0, and
5. Transversality Condition: λ(T) = ∂Ψ/∂x (x(T)),

where partial derivatives with respect to vectors, e.g., ∂H/∂u, denote a gradient restricted to only those variables with respect to which we differentiate.²

¹ These co-states can be thought of like Lagrange multipliers [11] in ordinary optimization theory.
² Note that ∂g(x,u)/∂x is a Jacobian matrix.

2.3 Linear Quadratic Control Example

Certain classes of optimal control problems can be solved in closed form. In particular, linear quadratic control (LQC) problems have such special structure. We give an example of such a problem and its solution.

    min  ∫_0^T x² + u² dt
    s.t. ẋ = ax + bu,  x(0) = x_0    (17)

The Hamiltonian for this control problem is:

    H = x² + u² + λ(ax + bu)    (18)

By the minimum principle we require ∂²H/∂u² > 0 and ∂H/∂u = 0. Solving for u, we have:

    ∂H/∂u = 2u + bλ = 0  ⟹  u = -(b/2)λ

Applying the co-state dynamics to solve for λ̇ yields:

    λ̇ = -∂H/∂x = -(2x + aλ)    (19)

The state dynamics to solve for ẋ yield:

    ẋ = ∂H/∂λ = g(x, u) = ax + bu = ax - (b²/2)λ    (20)

Note, ax + bu is the original right-hand side of the state dynamics in the problem constraints. Finally, we have the initial condition x(0) = x_0 and the transversality condition λ(T) = 0, yielding a two-point boundary value problem:

    ẋ = ax - (b²/2)λ    (21)
    λ̇ = -2x - aλ    (22)
    x(0) = x_0    (23)
    λ(T) = 0    (24)

Re-writing in matrix form yields:

    [ẋ]   [  a   -b²/2 ] [x]
    [λ̇] = [ -2    -a   ] [λ]    (25)

To solve the system of differential equations, we plugged the equations into Mathematica (DSolve). Writing ω = √(a² + b²) (the eigenvalues of the coefficient matrix in Equation 25 are ±ω), the general solution simplifies to:

    x(t) = -½ [(a + ω) C₁ e^{ωt} + (a - ω) C₂ e^{-ωt}]    (26)
    λ(t) = C₁ e^{ωt} + C₂ e^{-ωt}    (27)

Next, we solve for the two unknown constants C₁ and C₂ by setting x(0) = r (r being some constant) and λ(T) = 0. This gives:

    C₁ = 2r / ((a - ω) e^{2ωT} - (a + ω))    (28)
    C₂ = -C₁ e^{2ωT}    (29)

Finally, substituting the constants back and collecting exponentials into hyperbolic functions, we can define our x(t) and λ(t) functions:

    x(t) = r (ω cosh(ω(T - t)) - a sinh(ω(T - t))) / (ω cosh(ωT) - a sinh(ωT))    (30)
    λ(t) = 2r sinh(ω(T - t)) / (ω cosh(ωT) - a sinh(ωT))    (31)

2.4 Sufficient Conditions for Optimal Control

We note that any solution u* to Problem 14 must satisfy the necessary conditions set forth in Theorem 2.4. However, it is possible that a function u satisfies these necessary conditions but is not an optimal control. Mangasarian [10] and Arrow [4] have proved conditions under which the necessary conditions in Theorem 2.4 are sufficient for u* to be an optimal control.

Theorem 2.5 (Mangasarian's Theorem). Suppose the admissible pair (x*, u*) satisfies all of the relevant continuous-time optimal control problem necessary conditions for the OCP, the Hamiltonian H is jointly convex in x and u for all admissible solutions, t_0 and t_f are fixed, x_0 is fixed, and there are no terminal time conditions Ψ[x(T), T] = 0. Then any solution of the continuous-time optimal control necessary conditions is a global minimum.

Theorem 2.6 (Arrow's Theorem). Let (x*, u*) be an admissible pair for the OCP when the Hamiltonian H is jointly convex in x and u for all admissible solutions, t_0 and t_f are fixed, x_0 is fixed, and there are no terminal time conditions Ψ[x(T), T] = 0. If there exists a continuous and piecewise continuously differentiable function λ = (λ_1, ..., λ_n)^T such that the following conditions are satisfied:

    λ̇_i = -∂H/∂x_i  almost everywhere, i = 1, ..., n    (34)
    H(x*, u*, λ(t), t) ≤ H(x*, u, λ, t)  for all u ∈ U and all t ∈ [0, T]    (35)
    Ĥ(x, λ, t) = min_{u ∈ U} H(x, u, λ, t)  exists and is convex in x for all t ∈ [0, T]    (36)

then (x*, u*) solves the OCP for the given λ. If Ĥ(x, λ, t) is strictly convex in x for all t, then x* is unique (but u* is not necessarily unique).

We will use Mangasarian's theorem in the sequel when we discuss linearization of the optimal control problem we consider.
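To make the LQC example of Section 2.3 concrete, the boundary value problem in Equations 21-24 can also be solved numerically. The following sketch (not part of the thesis) uses SciPy's solve_bvp with illustrative values a = 1, b = 1, T = 2, x_0 = 1, and compares against a hyperbolic-function closed form that we derived by hand from Equations 21-24, with ω = √(a² + b²):

```python
import numpy as np
from scipy.integrate import solve_bvp

a, b, T, x0 = 1.0, 1.0, 2.0, 1.0

def rhs(t, y):
    # y[0] = x, y[1] = lambda; Equations (21)-(22)
    x, lam = y
    return np.vstack([a * x - (b**2 / 2) * lam,
                      -2 * x - a * lam])

def bc(ya, yb):
    # Boundary conditions (23)-(24): x(0) = x0, lambda(T) = 0
    return np.array([ya[0] - x0, yb[1]])

t = np.linspace(0, T, 50)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))

# Closed form (our simplification): x(t) proportional to
# omega*cosh(omega*(T-t)) - a*sinh(omega*(T-t)), normalized so x(0) = x0
w = np.sqrt(a**2 + b**2)
D = w * np.cosh(w * T) - a * np.sinh(w * T)
x_exact = x0 * (w * np.cosh(w * (T - t)) - a * np.sinh(w * (T - t))) / D
print(np.max(np.abs(sol.sol(t)[0] - x_exact)))
```

The printed maximum deviation is on the order of the solver tolerance, confirming that the numerical and analytic solutions agree.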

2.5 Problem Statement: Relating Optimal Control to Game Theory

So far, we have covered the basics of evolutionary game theory and observed how the values of parameters can affect the dynamics of the population (see, e.g., Figure 1). We also studied optimal control and identified the conditions of an optimal control problem. Our next goal is to study the effects of parameters on the population when these parameters are considered to be control functions in an optimal control problem. Specifically, we will study an optimal control problem whose differential state evolution is governed by the replicator dynamics. However, such a problem can be difficult because the replicator dynamics are non-linear. Therefore, we focus on rock-paper-scissors as an example and study methods to overcome the non-linearity.
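Before formulating the control problem, it is instructive to see the trade-off numerically: a larger constant a drives the population to the fixed point faster but is charged for control effort. The sketch below is our own illustration (not the thesis's Mathematica solver); it integrates the controlled replicator dynamics under a constant control and accumulates a quadratic running cost of the kind used in the next section, with σ = 1/100:

```python
import numpy as np

SIGMA = 1.0 / 100.0

def controlled_rhs(v, a):
    # Replicator field for A(a) = [[0,-1,1+a],[1+a,0,-1],[-1,1+a,0]]
    A = np.array([[0.0, -1.0, 1.0 + a],
                  [1.0 + a, 0.0, -1.0],
                  [-1.0, 1.0 + a, 0.0]])
    Av = A @ v
    return v * (Av - v @ Av)

def cost(a_of_t, v0, T=40.0, dt=0.01):
    # Accumulate (x-1/3)^2 + (y-1/3)^2 + (z-1/3)^2 + sigma*a^2 along the flow
    v, J, t = np.array(v0, dtype=float), 0.0, 0.0
    while t < T:
        a = a_of_t(t)
        J += dt * (np.sum((v - 1/3)**2) + SIGMA * a**2)
        v = v + dt * controlled_rhs(v, a)
        t += dt
    return J

v0 = [0.5, 0.2, 0.3]
# a = 1 pays sigma*a^2 throughout but kills the state error; a = 0 pays
# nothing for control yet accrues state error forever on its closed orbit
print(cost(lambda t: 1.0, v0), cost(lambda t: 0.0, v0))
```

The constant control already beats doing nothing; the optimal control problem studied next asks how much better a time-varying a(t) can do.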

3 Optimal Control of Generalized Rock-Paper-Scissors

Recall the Rock-Paper-Scissors dynamics with payoff matrix given in Expression 6. We consider the case when the parameter a is allowed to vary in time as a control function. For rock-paper-scissors, our goal is to find a function governing a(t) to drive the flow towards the Nash equilibrium (1/3, 1/3, 1/3) over a finite time horizon T, subject to a cost incurred from a large a(t). We formulate our problem as:

    min  ∫_0^T [(x(t) - 1/3)² + (y(t) - 1/3)² + (z(t) - 1/3)² + σ a(t)²] dt
    s.t. ẋ = x (e_1^T A v - v^T A v)
         ẏ = y (e_2^T A v - v^T A v)    (37)
         ż = z (e_3^T A v - v^T A v)

where v = (x, y, z)^T. Expanding the dynamics, we have:

    min  ∫_0^T [(x(t) - 1/3)² + (y(t) - 1/3)² + (z(t) - 1/3)² + σ a(t)²] dt
    s.t. ẋ = -a x² y - a x² z - a x y z + a x z - x y + x z
         ẏ = -a x y² - a x y z + a x y - a y² z + x y - y z    (38)
         ż = -a x y z - a x z² - a y z² + a y z - x z + y z

We note that the objective functional is convex in the unknown functions, while the dynamics are (highly) non-linear. We also note that the value of a(t) is not constrained but will be controlled in absolute value by the objective functional itself; i.e., a(t) will not get too big or too small. With this system, our Hamiltonian will be:

    H = (x(t) - 1/3)² + (y(t) - 1/3)² + (z(t) - 1/3)² + σ a² + λ₁(t) ẋ + λ₂(t) ẏ + λ₃(t) ż    (39)

The optimal control was solved using Mathematica (the full equation is listed in the Appendix). When we set T to be 40, we get the plot shown in Figure 2: panel (a) plots x(t), y(t), and z(t), while panel (b) plots the value of a from time 0 to 40.

Figure 2: Computed optimal control of rock-paper-scissors evolutionary game: x(0) = 0.5, y(0) = 0.2, z(0) = 0.3, T = 40, σ = 1/100.

We also explored a modified cost functional such that the term (x(T) - 1/3)² + (y(T) - 1/3)² + (z(T) - 1/3)² sits outside of the integral. Thus our problem becomes:

    min  (x(T) - 1/3)² + (y(T) - 1/3)² + (z(T) - 1/3)² + ∫_0^T σ a(t)² dt
    s.t. ẋ = -a x² y - a x² z - a x y z + a x z - x y + x z
         ẏ = -a x y² - a x y z + a x y - a y² z + x y - y z    (40)
         ż = -a x y z - a x z² - a y z² + a y z - x z + y z

with the Hamiltonian:

    H = σ a² + λ₁(t) ẋ + λ₂(t) ẏ + λ₃(t) ż    (41)

When we solve an instance of the optimal control with a computer algebra system, we get the results shown in Figure 3.

3.1 Constructing a General Form of the Problem

We also explored the method of linearizing the dynamics about the fixed point and then using the results from the LQC example to find an optimal solution for a. We are still analyzing the control problem from the Rock-Paper-Scissors

Figure 3: Computed optimal control of rock-paper-scissors evolutionary game variation: x(0) = 0.5, y(0) = 0.2, z(0) = 0.3, T = 40, σ = 1/100.

game with the payoff matrix:

        [  0    -1    1+a ]
    A = [ 1+a    0    -1  ]    (42)
        [ -1    1+a    0  ]

The resulting replicator dynamics are:

    ẋ_i = x_i (e_i^T A x - x^T A x)    (43)

This creates a complex convolution of the control problem:

    min  Φ(x(T)) + ∫_0^T f(x, a) dt
    s.t. ẋ_i = x_i (e_i^T A x - x^T A x)    (44)

Suppose we express A as:

        [  0   -1   1+a ]   [  0  -1   1 ]     [ 0  0  1 ]
    A = [ 1+a   0   -1  ] = [  1   0  -1 ] + a [ 1  0  0 ] = B + aC    (45)
        [ -1   1+a   0  ]   [ -1   1   0 ]     [ 0  1  0 ]

Denote:

    F_i(x) = x_i (e_i^T B x - x^T B x)    (46)
    G_i(x) = x_i (e_i^T C x - x^T C x)    (47)
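The decomposition A = B + aC in Equation 45 implies that the replicator vector field itself splits linearly in a, which is what makes the split into F and G possible. A quick numerical check of this identity (illustrative state and value of a; a sketch, not thesis code):

```python
import numpy as np

B = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])   # fair RPS part
C = np.array([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.]])      # perturbation part

def rep(M, x):
    # Replicator field for payoff matrix M, cf. Equations (46)-(47)
    Mx = M @ x
    return x * (Mx - x @ Mx)

a = 0.7
x = np.array([0.5, 0.2, 0.3])
# The field of A = B + a*C equals F(x) + a*G(x), since the field is linear in M
print(np.max(np.abs(rep(B + a * C, x) - (rep(B, x) + a * rep(C, x)))))
```

The printed value is zero up to floating-point roundoff, confirming ẋ = F(x) + aG(x).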

Then we can write Problem 44 as:

    min  Φ(x(T)) + ∫_0^T f(x, a) dt
    s.t. ẋ = F(x) + a G(x)    (48)

Note, when a = 0, we recover the fair rock-paper-scissors dynamics as the (uncontrolled) state dynamics. Let

    F(x) = [F_1(x), ..., F_n(x)]^T    (49)
    G(x) = [G_1(x), ..., G_n(x)]^T    (50)

(n = 3 for Rock, Paper, Scissors) be vector valued functions. The Hamiltonian for this problem is:

    H(x, λ, a) = f(x, a) + λ^T F(x) + a λ^T G(x)    (51)

Suppose that:

    f(x, a) = ½ ‖x - x*‖² + (σ/2) a²    (52)

where

    x* = (1/3, 1/3, 1/3)^T    (53)

is the equilibrium point of rock-paper-scissors. Then:

    ∂H/∂a = σ a + λ^T G(x) = 0  ⟹  a = -(1/σ) λ^T G(x)    (54)

If a is an optimal control, then the transversality condition ensures that:

    λ(T) = ∂Ψ/∂x (x(T))  ⟹  a(T) = -(1/σ) (∂Ψ/∂x (x(T)))^T G(x(T))    (55)

In particular, if Φ(x(T)) ≡ 0, then a(T) = 0. The resulting necessary conditions are the non-linear differential equations:

    ẋ = F(x) - (1/σ) (λ^T G(x)) G(x)    (56)
    λ̇^T = -(x - x*)^T - λ^T ∂F(x)/∂x + (1/σ) (λ^T G(x)) λ^T ∂G(x)/∂x    (57)
    x(0) = x_0    (58)
    λ(T) = ∂Ψ/∂x (x(T))    (59)

Here, ∂F(x)/∂x and ∂G(x)/∂x are the Jacobians of the vector valued functions F and G. Note, this holds for any evolutionary game problem with a parameter a whose payoff matrix can be written as A = B + aC.

3.2 Linearizing the Problem

Suppose we linearize the problem about the equilibrium point and suppose that Φ(x(T)) ≡ 0. Let:

    J = ∂F(x)/∂x |_{x = x*}    (60)
    H = ∂G(x)/∂x |_{x = x*}    (61)

Without loss of generality, suppose we translate the problem so that x* = 0 (i.e., by replacing x by x - x*). Then the quasi-linearized control problem becomes:

    min  ∫_0^T ½ ‖x‖² + (σ/2) a² dt
    s.t. ẋ = J x + a H x    (63)

From this, we compute:

    a = -(1/σ) λ^T H x    (64)

a quadratic form in the state and co-state. The necessary conditions for a to be an optimal control are then:

    ẋ = J x - (1/σ) (λ^T H x) H x    (65)
    λ̇^T = -x^T - λ^T J + (1/σ) (λ^T H x) λ^T H    (66)
    x(0) = x_0    (67)
    λ(T) = 0    (68)

which we can re-write as an almost linear two-point boundary value problem:

    [ẋ]   [  J     0   ] [x]                    [ H    0  ] [x]
    [λ̇] = [ -I_n  -J^T ] [λ] - (1/σ)(λ^T H x)  [ 0  -H^T ] [λ]    (69)

    x(0) = x_0    (70)
    λ(T) = 0    (71)
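The Jacobians in Equations 60-61 can be obtained numerically; the sketch below (not thesis code) uses central differences at x* = (1/3, 1/3, 1/3). For this game they also work out analytically to J = B/3 and H = (C - 2/3)/3, where 2/3 is subtracted entrywise; that simplification is our own computation, not stated in the thesis:

```python
import numpy as np

B = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
C = np.array([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.]])
xstar = np.full(3, 1/3)

def rep(M, x):
    # Replicator field for payoff matrix M
    Mx = M @ x
    return x * (Mx - x @ Mx)

def jacobian(f, x, eps=1e-6):
    # Central-difference Jacobian, built column by column
    n = x.size
    Jm = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        Jm[:, j] = (f(x + e) - f(x - e)) / (2 * eps)
    return Jm

J = jacobian(lambda x: rep(B, x), xstar)   # Equation (60)
H = jacobian(lambda x: rep(C, x), xstar)   # Equation (61)
print(J.round(6))
print(H.round(6))
```

Since B is antisymmetric, x^T B x vanishes identically and B x* = 0, which is why J reduces to the simple form B/3.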

This problem, while non-linear, preserves some of the structure of the original problem, as we can see through numerical solution.

Figure 4: Computed optimal control of partially linearized dynamics: x(0) = 0.3 - 1/3, y(0) = 0.3 - 1/3, z(0) = 0.4 - 1/3.

Note, the optimal control shown in Figure 4(b) has the decaying characteristic of the control function shown in Figure 2(b) but does not have the oscillation. This is the result of substituting the Jacobians in for the non-linear replicator dynamics.

3.3 A Linear Problem

The fundamental problem with this linearization is that the resulting problem is still non-linear. To address this, notice that the linearized dynamics can be written as:

    ẋ = J x + a H x = J x + H (a x)    (72)

For the sake of argument, let u = ax be the control vector. Then a non-equivalent (but similar) LQC problem to be studied is:

    min  ∫_0^T ½ ‖x‖² + (σ/2) ‖u‖² dt
    s.t. ẋ = J x + H u    (73)

Our Hamiltonian for this problem is:

    H = ½ ‖x‖² + (σ/2) ‖u‖² + λ^T (J x + H u)    (74)

From this, in the linearized problem we have:

    ∂H/∂u = σ u^T + λ^T H = 0  ⟹  u = -(1/σ) H^T λ    (75)

The necessary conditions for u to be an optimal control are then:

    ẋ = J x - (1/σ) H H^T λ    (76)
    λ̇ = -x - J^T λ    (77)
    x(0) = x_0    (78)
    λ(T) = 0    (79)

Notice that this time we have a linear two-point boundary value problem (a Riccati equation [8]) that can be written as the following:

    [ẋ]   [  J    -(1/σ) H H^T ] [x]
    [λ̇] = [ -I_n      -J^T     ] [λ]    (80)

Unfortunately, the solution cannot be written in extended form due to length, but it has the following form:

    [x]        ( [  J    -(1/σ) H H^T ]   )
    [λ] = exp  ( [ -I_n      -J^T     ] t ) C    (81)

where C is a constant vector that allows the expression to satisfy the two boundary conditions x(0) = x_0 and λ(T) = 0. We illustrate an example solution in Figure 5. Here, we write the control vector u = [u, v, w]^T. Notice that the characteristics of u bear no resemblance to the computed characteristics of a from (e.g.) Figure 4. This suggests that while we have constructed a linear control problem, it does not provide us with useful information about our original non-linear control problem. We do note, however, that Arrow's Theorem ensures that this is the true optimal control, as linear quadratic control problems of this type satisfy the appropriate convexity assumptions. To see why this is, note that by letting u = ax, we are implicitly allowing the control to operate independently in all three dimensions simultaneously; that is, we assume we can influence the payoff of winning the game independently among the three species (rock, paper and scissors). Clearly these extra degrees of freedom allow the control system to achieve a state value near the equilibrium point much more quickly than in the constrained dynamics of Problem 44 or 48. In our future work, we discuss a way of recovering these

Figure 5: Optimal control of fully linearized dynamics: x(0) = 0.3 - 1/3, y(0) = 0.3 - 1/3, z(0) = 0.4 - 1/3, σ = 1.

constraints within the objective function, thus moving all the non-convexity (and non-linearity) to the objective rather than the differential equations.
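The linear two-point boundary value problem of Section 3.3 can be solved directly with a matrix exponential: write the solution as exp(Mt)·[x(0); λ(0)] as in Equation 81 and choose λ(0) so that λ(T) = 0. Below is a minimal sketch with σ = 1, an illustrative horizon T = 10, and an initial state near the equilibrium; the explicit Jacobians J = B/3 and H = (C - 2/3)/3 are our own computation, not taken from the thesis:

```python
import numpy as np
from scipy.linalg import expm, solve

sigma, T, n = 1.0, 10.0, 3
B = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
C = np.array([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.]])
J = B / 3          # Jacobian of F at x* (analytic, our computation)
H = (C - 2/3) / 3  # Jacobian of G at x* (analytic, our computation)

# Coefficient matrix of the linear two-point BVP (Equation 80)
M = np.block([[J, -(1 / sigma) * H @ H.T],
              [-np.eye(n), -J.T]])

x0 = np.array([0.3 - 1/3, 0.3 - 1/3, 0.4 - 1/3])
Phi = expm(M * T)
# Enforce lambda(T) = 0: lambda(0) = -Phi22^{-1} Phi21 x(0)
lam0 = -solve(Phi[n:, n:], Phi[n:, :n] @ x0)

def state(t):
    # [x(t); lambda(t)] = exp(M t) [x(0); lambda(0)], cf. Equation (81)
    return expm(M * t) @ np.concatenate([x0, lam0])

print(state(T)[n:])   # the costate at T, numerically ~0 as required
```

The costate block of exp(MT) is invertible here, so the boundary condition pins down λ(0) uniquely; the optimal control along the trajectory is then recovered from u(t) = -(1/σ) Hᵀ λ(t).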

4 Future Work and Conclusions

4.1 Future Work

As noted, when we compare the two methods of solving our linearized problem, we notice that the dynamics of the second problem approach the Nash equilibrium more quickly than the first, but the resulting control functions bear no resemblance to the control computed for the general non-linear system. We can recover the original problem by enforcing constraints of the form:

    u_i / x_i = u_j / x_j,  i ≠ j  ⟺  u_i x_j = u_j x_i,  i ≠ j    (82)

and replacing the objective functional, giving:

    min  ∫_0^T ½ ‖x‖² + (σ/2n) Σ_i (u_i / x_i)² dt
    s.t. ẋ = J x + H u
         u_i x_j = u_j x_i,  i ≠ j    (83)

The non-linear constraints can be priced out to produce the following (again) non-linear control problem, but now with linear dynamics:

    min  ∫_0^T ½ ‖x‖² + (σ/2n) Σ_i (u_i / x_i)² + Σ_{i ≠ j} ρ_ij (u_i x_j - u_j x_i)² dt
    s.t. ẋ = J x + H u    (84)

As a future research project, it would be interesting to explore the solutions of this problem and compare them to the original non-linear optimization problem. As additional future work, we note that the structure of the almost-linearized problem (Problem 48) is deceptively simple. We have not given up hope that it is possible to understand the structure of the solutions of Problem 48 analytically and therefore to shed some light on the behavior of the original non-linear problem near the fixed point.

4.2 Conclusion

In this thesis, we studied evolutionary game theory and introduced the payoff matrix and replicator dynamics for the rock-paper-scissors game. Next, we surveyed results in optimal control theory and identified the characteristics of an optimal control. We then provided an example of a linear quadratic

control problem, a form used later in the thesis. With this knowledge, we formulated an optimal control problem based on the rock-paper-scissors game and solved it numerically. We tried solving a variation of the game by modifying the cost functional and encountered the same problem: global optimality could not be proved. We then attempted to linearize the problem by substituting the Jacobian matrices into the non-linear replicator dynamics, but found that the problem was still not linear because of the way the control interacted with the state. We tried a different method of linearization, were able to derive a closed form solution, and found some interesting properties of it. However, this method altered the characteristics of our original control problem, which led to an entirely new control problem with a new solution structure. Finally, we discussed as future work how to recover the original problem after linearization by enforcing certain constraints and pricing them out in the objective functional.

References

[1] S. Aniţa, V. Arnăutu, and V. Capasso, An Introduction to Optimal Control Problems in Life Sciences and Economics, Springer.
[2] S. J. Brams, Game Theory and Politics, Dover Press.
[3] Y. Freund and R. E. Schapire, Game theory, on-line prediction and boosting, Ninth Annual Conference on Computational Learning Theory, 1996.
[4] T. L. Friesz, Dynamic Optimization and Differential Games, Springer.
[5] C. Griffin, Game Theory: Penn State Math 486 Lecture Notes (v 1.1).
[6] J. Hofbauer and K. Sigmund, Evolutionary Games and Population Dynamics, Cambridge University Press.
[7] J. Hofbauer and K. Sigmund, Evolutionary Game Dynamics, Bulletin of the American Mathematical Society, no. 4.
[8] D. E. Kirk, Optimal Control Theory: An Introduction, Dover Press.
[9] R. D. Luce and H. Raiffa, Games and Decisions: Introduction and Critical Survey, Dover Press.
[10] O. L. Mangasarian, Sufficient conditions for the optimal control of nonlinear systems, SIAM J. Control.
[11] J. E. Marsden and A. Tromba, Vector Calculus, 5th ed., W. H. Freeman.
[12] P. Morris, Introduction to Game Theory, Springer.
[13] R. B. Myerson, Game Theory: Analysis of Conflict, Harvard University Press.
[14] A. Takayama, Mathematical Economics, 2nd ed., Cambridge University Press.
[15] J. W. Weibull, Evolutionary Game Theory, MIT Press.
[16] E. C. Zeeman, Population dynamics from game theory, Global Theory of Dynamical Systems, Springer Lecture Notes in Mathematics, no. 819, Springer.

Academic Vita

Wente Brian Cai

Education
The Pennsylvania State University, Spring 2014
Bachelor of Science in Mathematics, Honors in Mathematics

Activities
Pfizer, Epidemiology Intern, May-Aug 2012
JPMorgan Chase, Summer Analyst, June-Aug 2013
Penn State Dance Marathon, PR Photography Captain, Sep-Mar 2014


More information

Deterministic Dynamic Programming

Deterministic Dynamic Programming Deterministic Dynamic Programming 1 Value Function Consider the following optimal control problem in Mayer s form: V (t 0, x 0 ) = inf u U J(t 1, x(t 1 )) (1) subject to ẋ(t) = f(t, x(t), u(t)), x(t 0

More information

Applications of Game Theory to Social Norm Establishment. Michael Andrews. A Thesis Presented to The University of Guelph

Applications of Game Theory to Social Norm Establishment. Michael Andrews. A Thesis Presented to The University of Guelph Applications of Game Theory to Social Norm Establishment by Michael Andrews A Thesis Presented to The University of Guelph In partial fulfilment of requirements for the degree of Master of Science in Mathematics

More information

Alberto Bressan. Department of Mathematics, Penn State University

Alberto Bressan. Department of Mathematics, Penn State University Non-cooperative Differential Games A Homotopy Approach Alberto Bressan Department of Mathematics, Penn State University 1 Differential Games d dt x(t) = G(x(t), u 1(t), u 2 (t)), x(0) = y, u i (t) U i

More information

Dynamical Systems. August 13, 2013

Dynamical Systems. August 13, 2013 Dynamical Systems Joshua Wilde, revised by Isabel Tecu, Takeshi Suzuki and María José Boccardi August 13, 2013 Dynamical Systems are systems, described by one or more equations, that evolve over time.

More information

Problem List MATH 5173 Spring, 2014

Problem List MATH 5173 Spring, 2014 Problem List MATH 5173 Spring, 2014 The notation p/n means the problem with number n on page p of Perko. 1. 5/3 [Due Wednesday, January 15] 2. 6/5 and describe the relationship of the phase portraits [Due

More information

Nonlinear Systems Theory

Nonlinear Systems Theory Nonlinear Systems Theory Matthew M. Peet Arizona State University Lecture 2: Nonlinear Systems Theory Overview Our next goal is to extend LMI s and optimization to nonlinear systems analysis. Today we

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control RTyrrell Rockafellar and Peter R Wolenski Abstract This paper describes some recent results in Hamilton- Jacobi theory

More information

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Optimal Control Ömer Özak SMU Macroeconomics II Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Review of the Theory of Optimal Control Section 1 Review of the Theory of Optimal Control Ömer

More information

EE291E/ME 290Q Lecture Notes 8. Optimal Control and Dynamic Games

EE291E/ME 290Q Lecture Notes 8. Optimal Control and Dynamic Games EE291E/ME 290Q Lecture Notes 8. Optimal Control and Dynamic Games S. S. Sastry REVISED March 29th There exist two main approaches to optimal control and dynamic games: 1. via the Calculus of Variations

More information

Robotics. Control Theory. Marc Toussaint U Stuttgart

Robotics. Control Theory. Marc Toussaint U Stuttgart Robotics Control Theory Topics in control theory, optimal control, HJB equation, infinite horizon case, Linear-Quadratic optimal control, Riccati equations (differential, algebraic, discrete-time), controllability,

More information

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1 Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems p. 1/1 p. 2/1 Converse Lyapunov Theorem Exponential Stability Let x = 0 be an exponentially stable equilibrium

More information

Linearization problem. The simplest example

Linearization problem. The simplest example Linear Systems Lecture 3 1 problem Consider a non-linear time-invariant system of the form ( ẋ(t f x(t u(t y(t g ( x(t u(t (1 such that x R n u R m y R p and Slide 1 A: f(xu f(xu g(xu and g(xu exist and

More information

Game Theory. Greg Plaxton Theory in Programming Practice, Spring 2004 Department of Computer Science University of Texas at Austin

Game Theory. Greg Plaxton Theory in Programming Practice, Spring 2004 Department of Computer Science University of Texas at Austin Game Theory Greg Plaxton Theory in Programming Practice, Spring 2004 Department of Computer Science University of Texas at Austin Bimatrix Games We are given two real m n matrices A = (a ij ), B = (b ij

More information

Problem 1 Cost of an Infinite Horizon LQR

Problem 1 Cost of an Infinite Horizon LQR THE UNIVERSITY OF TEXAS AT SAN ANTONIO EE 5243 INTRODUCTION TO CYBER-PHYSICAL SYSTEMS H O M E W O R K # 5 Ahmad F. Taha October 12, 215 Homework Instructions: 1. Type your solutions in the LATEX homework

More information

Feedback control for a chemostat with two organisms

Feedback control for a chemostat with two organisms Feedback control for a chemostat with two organisms Patrick De Leenheer and Hal Smith Arizona State University Department of Mathematics and Statistics Tempe, AZ 85287 email: leenheer@math.la.asu.edu,

More information

EN Applied Optimal Control Lecture 8: Dynamic Programming October 10, 2018

EN Applied Optimal Control Lecture 8: Dynamic Programming October 10, 2018 EN530.603 Applied Optimal Control Lecture 8: Dynamic Programming October 0, 08 Lecturer: Marin Kobilarov Dynamic Programming (DP) is conerned with the computation of an optimal policy, i.e. an optimal

More information

DYNAMIC LECTURE 5: DISCRETE TIME INTERTEMPORAL OPTIMIZATION

DYNAMIC LECTURE 5: DISCRETE TIME INTERTEMPORAL OPTIMIZATION DYNAMIC LECTURE 5: DISCRETE TIME INTERTEMPORAL OPTIMIZATION UNIVERSITY OF MARYLAND: ECON 600. Alternative Methods of Discrete Time Intertemporal Optimization We will start by solving a discrete time intertemporal

More information

Available online at ScienceDirect. Procedia IUTAM 19 (2016 ) IUTAM Symposium Analytical Methods in Nonlinear Dynamics

Available online at   ScienceDirect. Procedia IUTAM 19 (2016 ) IUTAM Symposium Analytical Methods in Nonlinear Dynamics Available online at www.sciencedirect.com ScienceDirect Procedia IUTAM 19 (2016 ) 11 18 IUTAM Symposium Analytical Methods in Nonlinear Dynamics A model of evolutionary dynamics with quasiperiodic forcing

More information

Evolutionary Games and Periodic Fitness

Evolutionary Games and Periodic Fitness Evolutionary Games and Periodic Fitness Philippe Uyttendaele Frank Thuijsman Pieter Collins Ralf Peeters Gijs Schoenmakers Ronald Westra March 16, 2012 Abstract One thing that nearly all stability concepts

More information

MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents

MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents MATHEMATICAL ECONOMICS: OPTIMIZATION JOÃO LOPES DIAS Contents 1. Introduction 2 1.1. Preliminaries 2 1.2. Optimal points and values 2 1.3. The optimization problems 3 1.4. Existence of optimal points 4

More information

Brown s Original Fictitious Play

Brown s Original Fictitious Play manuscript No. Brown s Original Fictitious Play Ulrich Berger Vienna University of Economics, Department VW5 Augasse 2-6, A-1090 Vienna, Austria e-mail: ulrich.berger@wu-wien.ac.at March 2005 Abstract

More information

Irrational behavior in the Brown von Neumann Nash dynamics

Irrational behavior in the Brown von Neumann Nash dynamics Irrational behavior in the Brown von Neumann Nash dynamics Ulrich Berger a and Josef Hofbauer b a Vienna University of Economics and Business Administration, Department VW 5, Augasse 2-6, A-1090 Wien,

More information

3.7 Constrained Optimization and Lagrange Multipliers

3.7 Constrained Optimization and Lagrange Multipliers 3.7 Constrained Optimization and Lagrange Multipliers 71 3.7 Constrained Optimization and Lagrange Multipliers Overview: Constrained optimization problems can sometimes be solved using the methods of the

More information

Lecture 4. Chapter 4: Lyapunov Stability. Eugenio Schuster. Mechanical Engineering and Mechanics Lehigh University.

Lecture 4. Chapter 4: Lyapunov Stability. Eugenio Schuster. Mechanical Engineering and Mechanics Lehigh University. Lecture 4 Chapter 4: Lyapunov Stability Eugenio Schuster schuster@lehigh.edu Mechanical Engineering and Mechanics Lehigh University Lecture 4 p. 1/86 Autonomous Systems Consider the autonomous system ẋ

More information

Poincaré Map, Floquet Theory, and Stability of Periodic Orbits

Poincaré Map, Floquet Theory, and Stability of Periodic Orbits Poincaré Map, Floquet Theory, and Stability of Periodic Orbits CDS140A Lecturer: W.S. Koon Fall, 2006 1 Poincaré Maps Definition (Poincaré Map): Consider ẋ = f(x) with periodic solution x(t). Construct

More information

z x = f x (x, y, a, b), z y = f y (x, y, a, b). F(x, y, z, z x, z y ) = 0. This is a PDE for the unknown function of two independent variables.

z x = f x (x, y, a, b), z y = f y (x, y, a, b). F(x, y, z, z x, z y ) = 0. This is a PDE for the unknown function of two independent variables. Chapter 2 First order PDE 2.1 How and Why First order PDE appear? 2.1.1 Physical origins Conservation laws form one of the two fundamental parts of any mathematical model of Continuum Mechanics. These

More information

MATH The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input,

MATH The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input, MATH 20550 The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input, F(x 1,, x n ) F 1 (x 1,, x n ),, F m (x 1,, x n ) where each

More information

A note on linear differential equations with periodic coefficients.

A note on linear differential equations with periodic coefficients. A note on linear differential equations with periodic coefficients. Maite Grau (1) and Daniel Peralta-Salas (2) (1) Departament de Matemàtica. Universitat de Lleida. Avda. Jaume II, 69. 251 Lleida, Spain.

More information

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012 NOTES ON CALCULUS OF VARIATIONS JON JOHNSEN September 13, 212 1. The basic problem In Calculus of Variations one is given a fixed C 2 -function F (t, x, u), where F is defined for t [, t 1 ] and x, u R,

More information

MCE693/793: Analysis and Control of Nonlinear Systems

MCE693/793: Analysis and Control of Nonlinear Systems MCE693/793: Analysis and Control of Nonlinear Systems Lyapunov Stability - I Hanz Richter Mechanical Engineering Department Cleveland State University Definition of Stability - Lyapunov Sense Lyapunov

More information

Math 4B Notes. Written by Victoria Kala SH 6432u Office Hours: T 12:45 1:45pm Last updated 7/24/2016

Math 4B Notes. Written by Victoria Kala SH 6432u Office Hours: T 12:45 1:45pm Last updated 7/24/2016 Math 4B Notes Written by Victoria Kala vtkala@math.ucsb.edu SH 6432u Office Hours: T 2:45 :45pm Last updated 7/24/206 Classification of Differential Equations The order of a differential equation is the

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

OPTIMAL CONTROL THEORY: APPLICATIONS TO MANAGEMENT SCIENCE AND ECONOMICS

OPTIMAL CONTROL THEORY: APPLICATIONS TO MANAGEMENT SCIENCE AND ECONOMICS OPTIMAL CONTROL THEORY: APPLICATIONS TO MANAGEMENT SCIENCE AND ECONOMICS (SECOND EDITION, 2000) Suresh P. Sethi Gerald. L. Thompson Springer Chapter 1 p. 1/37 CHAPTER 1 WHAT IS OPTIMAL CONTROL THEORY?

More information

Principles of Optimal Control Spring 2008

Principles of Optimal Control Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 16.323 Principles of Optimal Control Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 16.323 Lecture

More information

The Growth Model in Continuous Time (Ramsey Model)

The Growth Model in Continuous Time (Ramsey Model) The Growth Model in Continuous Time (Ramsey Model) Prof. Lutz Hendricks Econ720 September 27, 2017 1 / 32 The Growth Model in Continuous Time We add optimizing households to the Solow model. We first study

More information

1 The Observability Canonical Form

1 The Observability Canonical Form NONLINEAR OBSERVERS AND SEPARATION PRINCIPLE 1 The Observability Canonical Form In this Chapter we discuss the design of observers for nonlinear systems modelled by equations of the form ẋ = f(x, u) (1)

More information

Optimal Control. McGill COMP 765 Oct 3 rd, 2017

Optimal Control. McGill COMP 765 Oct 3 rd, 2017 Optimal Control McGill COMP 765 Oct 3 rd, 2017 Classical Control Quiz Question 1: Can a PID controller be used to balance an inverted pendulum: A) That starts upright? B) That must be swung-up (perhaps

More information

Solutions to Final Exam Sample Problems, Math 246, Spring 2011

Solutions to Final Exam Sample Problems, Math 246, Spring 2011 Solutions to Final Exam Sample Problems, Math 246, Spring 2 () Consider the differential equation dy dt = (9 y2 )y 2 (a) Identify its equilibrium (stationary) points and classify their stability (b) Sketch

More information

Theorem 1. ẋ = Ax is globally exponentially stable (GES) iff A is Hurwitz (i.e., max(re(σ(a))) < 0).

Theorem 1. ẋ = Ax is globally exponentially stable (GES) iff A is Hurwitz (i.e., max(re(σ(a))) < 0). Linear Systems Notes Lecture Proposition. A M n (R) is positive definite iff all nested minors are greater than or equal to zero. n Proof. ( ): Positive definite iff λ i >. Let det(a) = λj and H = {x D

More information

Problem 1: (3 points) Recall that the dot product of two vectors in R 3 is

Problem 1: (3 points) Recall that the dot product of two vectors in R 3 is Linear Algebra, Spring 206 Homework 3 Name: Problem : (3 points) Recall that the dot product of two vectors in R 3 is a x b y = ax + by + cz, c z and this is essentially the same as the matrix multiplication

More information

x 1. x n i + x 2 j (x 1, x 2, x 3 ) = x 1 j + x 3

x 1. x n i + x 2 j (x 1, x 2, x 3 ) = x 1 j + x 3 Version: 4/1/06. Note: These notes are mostly from my 5B course, with the addition of the part on components and projections. Look them over to make sure that we are on the same page as regards inner-products,

More information

Lyapunov Stability Theory

Lyapunov Stability Theory Lyapunov Stability Theory Peter Al Hokayem and Eduardo Gallestey March 16, 2015 1 Introduction In this lecture we consider the stability of equilibrium points of autonomous nonlinear systems, both in continuous

More information

Lotka Volterra Predator-Prey Model with a Predating Scavenger

Lotka Volterra Predator-Prey Model with a Predating Scavenger Lotka Volterra Predator-Prey Model with a Predating Scavenger Monica Pescitelli Georgia College December 13, 2013 Abstract The classic Lotka Volterra equations are used to model the population dynamics

More information

Math 2a Prac Lectures on Differential Equations

Math 2a Prac Lectures on Differential Equations Math 2a Prac Lectures on Differential Equations Prof. Dinakar Ramakrishnan 272 Sloan, 253-37 Caltech Office Hours: Fridays 4 5 PM Based on notes taken in class by Stephanie Laga, with a few added comments

More information

arxiv: v1 [cs.sy] 13 Sep 2017

arxiv: v1 [cs.sy] 13 Sep 2017 On imitation dynamics in potential population games Lorenzo Zino, Giacomo Como, and Fabio Fagnani arxiv:1709.04748v1 [cs.sy] 13 Sep 017 Abstract Imitation dynamics for population games are studied and

More information

Principles of Optimal Control Spring 2008

Principles of Optimal Control Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 16.323 Principles of Optimal Control Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 16.323 Lecture

More information

1. Find the solution of the following uncontrolled linear system. 2 α 1 1

1. Find the solution of the following uncontrolled linear system. 2 α 1 1 Appendix B Revision Problems 1. Find the solution of the following uncontrolled linear system 0 1 1 ẋ = x, x(0) =. 2 3 1 Class test, August 1998 2. Given the linear system described by 2 α 1 1 ẋ = x +

More information

Nonlinear Control. Nonlinear Control Lecture # 2 Stability of Equilibrium Points

Nonlinear Control. Nonlinear Control Lecture # 2 Stability of Equilibrium Points Nonlinear Control Lecture # 2 Stability of Equilibrium Points Basic Concepts ẋ = f(x) f is locally Lipschitz over a domain D R n Suppose x D is an equilibrium point; that is, f( x) = 0 Characterize and

More information

Formula Sheet for Optimal Control

Formula Sheet for Optimal Control Formula Sheet for Optimal Control Division of Optimization and Systems Theory Royal Institute of Technology 144 Stockholm, Sweden 23 December 1, 29 1 Dynamic Programming 11 Discrete Dynamic Programming

More information

Lecture 10: Singular Perturbations and Averaging 1

Lecture 10: Singular Perturbations and Averaging 1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.243j (Fall 2003): DYNAMICS OF NONLINEAR SYSTEMS by A. Megretski Lecture 10: Singular Perturbations and

More information

Numerical Optimal Control Overview. Moritz Diehl

Numerical Optimal Control Overview. Moritz Diehl Numerical Optimal Control Overview Moritz Diehl Simplified Optimal Control Problem in ODE path constraints h(x, u) 0 initial value x0 states x(t) terminal constraint r(x(t )) 0 controls u(t) 0 t T minimize

More information

LINEAR-CONVEX CONTROL AND DUALITY

LINEAR-CONVEX CONTROL AND DUALITY 1 LINEAR-CONVEX CONTROL AND DUALITY R.T. Rockafellar Department of Mathematics, University of Washington Seattle, WA 98195-4350, USA Email: rtr@math.washington.edu R. Goebel 3518 NE 42 St., Seattle, WA

More information

The Big Picture. Discuss Examples of unpredictability. Odds, Stanisław Lem, The New Yorker (1974) Chaos, Scientific American (1986)

The Big Picture. Discuss Examples of unpredictability. Odds, Stanisław Lem, The New Yorker (1974) Chaos, Scientific American (1986) The Big Picture Discuss Examples of unpredictability Odds, Stanisław Lem, The New Yorker (1974) Chaos, Scientific American (1986) Lecture 2: Natural Computation & Self-Organization, Physics 256A (Winter

More information

Multi-Robotic Systems

Multi-Robotic Systems CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed

More information

The first order quasi-linear PDEs

The first order quasi-linear PDEs Chapter 2 The first order quasi-linear PDEs The first order quasi-linear PDEs have the following general form: F (x, u, Du) = 0, (2.1) where x = (x 1, x 2,, x 3 ) R n, u = u(x), Du is the gradient of u.

More information

Lecture 4: Optimization. Maximizing a function of a single variable

Lecture 4: Optimization. Maximizing a function of a single variable Lecture 4: Optimization Maximizing or Minimizing a Function of a Single Variable Maximizing or Minimizing a Function of Many Variables Constrained Optimization Maximizing a function of a single variable

More information

Extremity of b-bistochastic Quadratic Stochastic Operators on 2D Simplex

Extremity of b-bistochastic Quadratic Stochastic Operators on 2D Simplex Malaysian Journal of Mathematical Sciences 11(2): 119 139 (2017) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Extremity of b-bistochastic Quadratic Stochastic

More information

Replicator Dynamics and Correlated Equilibrium

Replicator Dynamics and Correlated Equilibrium Replicator Dynamics and Correlated Equilibrium Yannick Viossat To cite this version: Yannick Viossat. Replicator Dynamics and Correlated Equilibrium. CECO-208. 2004. HAL Id: hal-00242953

More information

Stabilization and Passivity-Based Control

Stabilization and Passivity-Based Control DISC Systems and Control Theory of Nonlinear Systems, 2010 1 Stabilization and Passivity-Based Control Lecture 8 Nonlinear Dynamical Control Systems, Chapter 10, plus handout from R. Sepulchre, Constructive

More information

On the Stability of the Best Reply Map for Noncooperative Differential Games

On the Stability of the Best Reply Map for Noncooperative Differential Games On the Stability of the Best Reply Map for Noncooperative Differential Games Alberto Bressan and Zipeng Wang Department of Mathematics, Penn State University, University Park, PA, 68, USA DPMMS, University

More information

Evolutionary Game Theory: Overview and Recent Results

Evolutionary Game Theory: Overview and Recent Results Overviews: Evolutionary Game Theory: Overview and Recent Results William H. Sandholm University of Wisconsin nontechnical survey: Evolutionary Game Theory (in Encyclopedia of Complexity and System Science,

More information

16.30/31, Fall 2010 Recitation # 13

16.30/31, Fall 2010 Recitation # 13 16.30/31, Fall 2010 Recitation # 13 Brandon Luders December 6, 2010 In this recitation, we tie the ideas of Lyapunov stability analysis (LSA) back to previous ways we have demonstrated stability - but

More information

An homotopy method for exact tracking of nonlinear nonminimum phase systems: the example of the spherical inverted pendulum

An homotopy method for exact tracking of nonlinear nonminimum phase systems: the example of the spherical inverted pendulum 9 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June -, 9 FrA.5 An homotopy method for exact tracking of nonlinear nonminimum phase systems: the example of the spherical inverted

More information

Nonlinear Systems and Elements of Control

Nonlinear Systems and Elements of Control Nonlinear Systems and Elements of Control Rafal Goebel April 3, 3 Brief review of basic differential equations A differential equation is an equation involving a function and one or more of the function

More information

BEEM103 UNIVERSITY OF EXETER. BUSINESS School. January 2009 Mock Exam, Part A. OPTIMIZATION TECHNIQUES FOR ECONOMISTS solutions

BEEM103 UNIVERSITY OF EXETER. BUSINESS School. January 2009 Mock Exam, Part A. OPTIMIZATION TECHNIQUES FOR ECONOMISTS solutions BEEM03 UNIVERSITY OF EXETER BUSINESS School January 009 Mock Exam, Part A OPTIMIZATION TECHNIQUES FOR ECONOMISTS solutions Duration : TWO HOURS The paper has 3 parts. Your marks on the rst part will be

More information

Pontryagin s Minimum Principle 1

Pontryagin s Minimum Principle 1 ECE 680 Fall 2013 Pontryagin s Minimum Principle 1 In this handout, we provide a derivation of the minimum principle of Pontryagin, which is a generalization of the Euler-Lagrange equations that also includes

More information

L 2 -induced Gains of Switched Systems and Classes of Switching Signals

L 2 -induced Gains of Switched Systems and Classes of Switching Signals L 2 -induced Gains of Switched Systems and Classes of Switching Signals Kenji Hirata and João P. Hespanha Abstract This paper addresses the L 2-induced gain analysis for switched linear systems. We exploit

More information

Game Theory and its Applications to Networks - Part I: Strict Competition

Game Theory and its Applications to Networks - Part I: Strict Competition Game Theory and its Applications to Networks - Part I: Strict Competition Corinne Touati Master ENS Lyon, Fall 200 What is Game Theory and what is it for? Definition (Roger Myerson, Game Theory, Analysis

More information

= m(0) + 4e 2 ( 3e 2 ) 2e 2, 1 (2k + k 2 ) dt. m(0) = u + R 1 B T P x 2 R dt. u + R 1 B T P y 2 R dt +

= m(0) + 4e 2 ( 3e 2 ) 2e 2, 1 (2k + k 2 ) dt. m(0) = u + R 1 B T P x 2 R dt. u + R 1 B T P y 2 R dt + ECE 553, Spring 8 Posted: May nd, 8 Problem Set #7 Solution Solutions: 1. The optimal controller is still the one given in the solution to the Problem 6 in Homework #5: u (x, t) = p(t)x k(t), t. The minimum

More information

Distributed Learning based on Entropy-Driven Game Dynamics

Distributed Learning based on Entropy-Driven Game Dynamics Distributed Learning based on Entropy-Driven Game Dynamics Bruno Gaujal joint work with Pierre Coucheney and Panayotis Mertikopoulos Inria Aug., 2014 Model Shared resource systems (network, processors)

More information

Lecture 4 Continuous time linear quadratic regulator

Lecture 4 Continuous time linear quadratic regulator EE363 Winter 2008-09 Lecture 4 Continuous time linear quadratic regulator continuous-time LQR problem dynamic programming solution Hamiltonian system and two point boundary value problem infinite horizon

More information

Linearization of Differential Equation Models

Linearization of Differential Equation Models Linearization of Differential Equation Models 1 Motivation We cannot solve most nonlinear models, so we often instead try to get an overall feel for the way the model behaves: we sometimes talk about looking

More information

MATH 215/255 Solutions to Additional Practice Problems April dy dt

MATH 215/255 Solutions to Additional Practice Problems April dy dt . For the nonlinear system MATH 5/55 Solutions to Additional Practice Problems April 08 dx dt = x( x y, dy dt = y(.5 y x, x 0, y 0, (a Show that if x(0 > 0 and y(0 = 0, then the solution (x(t, y(t of the

More information

Game interactions and dynamics on networked populations

Game interactions and dynamics on networked populations Game interactions and dynamics on networked populations Chiara Mocenni & Dario Madeo Department of Information Engineering and Mathematics University of Siena (Italy) ({mocenni, madeo}@dii.unisi.it) Siena,

More information

Multivariable Calculus

Multivariable Calculus 2 Multivariable Calculus 2.1 Limits and Continuity Problem 2.1.1 (Fa94) Let the function f : R n R n satisfy the following two conditions: (i) f (K ) is compact whenever K is a compact subset of R n. (ii)

More information

T 1. The value function v(x) is the expected net gain when using the optimal stopping time starting at state x:

T 1. The value function v(x) is the expected net gain when using the optimal stopping time starting at state x: 108 OPTIMAL STOPPING TIME 4.4. Cost functions. The cost function g(x) gives the price you must pay to continue from state x. If T is your stopping time then X T is your stopping state and f(x T ) is your

More information

EC /11. Math for Microeconomics September Course, Part II Problem Set 1 with Solutions. a11 a 12. x 2

EC /11. Math for Microeconomics September Course, Part II Problem Set 1 with Solutions. a11 a 12. x 2 LONDON SCHOOL OF ECONOMICS Professor Leonardo Felli Department of Economics S.478; x7525 EC400 2010/11 Math for Microeconomics September Course, Part II Problem Set 1 with Solutions 1. Show that the general

More information

Summary Notes on Maximization

Summary Notes on Maximization Division of the Humanities and Social Sciences Summary Notes on Maximization KC Border Fall 2005 1 Classical Lagrange Multiplier Theorem 1 Definition A point x is a constrained local maximizer of f subject

More information

MATH 1231 MATHEMATICS 1B Calculus Section 2: - ODEs.
