

The Pennsylvania State University
The Graduate School
College of Engineering

DIFFERENTIAL STACKELBERG GAMES AND THEIR APPLICATION TO DYNAMIC PRICING, PRODUCTION PLANNING, NETWORK DESIGN, AND LOGISTICS

A Dissertation in Industrial Engineering and Operations Research
by Amir H. Meimand

© 2013 Amir H. Meimand

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August 2013

The dissertation of Amir H. Meimand was reviewed and approved* by the following:

Terry L. Friesz, Marcus Chaired Professor of Industrial Engineering, Dissertation Adviser, Chair of Committee
Tao Yao, Professor of Industrial Engineering, Dissertation Co-Adviser
Soundar R.T. Kumara, Pearce Chaired Professor of Industrial Engineering
Alberto Bressan, Eberly Chaired Professor of Mathematics
Venky Shankar, Professor of Civil Engineering
M. Jey Chandra, Graduate Program Coordinator of Department of Industrial Engineering

*Signatures are on file in the Graduate School

Abstract

Recently, Stackelberg games have been employed by many economists who use game theory concepts to solve dynamic competitive service sector problems such as dynamic pricing, production planning, logistics, supply chain management, and transportation network flow prediction and control. Hence, Stackelberg games have become the focus of much research activity. In this thesis, we provide a framework for studying player interaction based on the Stackelberg-Cournot-Nash behavioral assumption. We briefly review the classical theory of dynamic Stackelberg games, and show how the Nash equilibrium of a lower level problem may be better described by so-called differential variational inequalities (DVI). We also show that when each agent in the lower level problem is solving a stochastic optimal control problem with a linear quadratic form, the stochastic Nash equilibrium can be expressed as a Riccati system of equations. Both a DVI formulation and a Riccati system of equations may be used to express the solution of the lower level problem implicitly as a function of the upper level problem's controls. Hence, we are able to convert the bi-level optimization problem into a single-level problem.

Furthermore, we study the application of differential Stackelberg games in two different areas: freight transport, and strategic pricing and revenue management. In the first model, we consider a Stackelberg game between a single carrier that acts as the leader and multiple shippers involved in a Nash competition. In the second model, we study the interaction between a supplier who is the leader and multiple retailers who are competing to sell a homogeneous commodity in a market when the market price evolves based on an Ito-type stochastic process.

Table of Contents

List of Figures
List of Tables
Acknowledgments

1 Introduction and Motivation
  Differential Stackelberg Game; Applications; Preliminary Mathematics; Variational Inequality; Optimal Control Problem; Differential Variational Inequality

2 DVI Formulation of Dynamic Network User Equilibriums with Elastic Demand
  Introductory Remarks; Notation and Essential Background; Dynamics; The Differential Variational Inequality; Conclusion

3 Deterministic Differential Stackelberg Games
  The Notion of Nash and Stackelberg Equilibria; Nash Equilibrium; Stackelberg Equilibrium; Alternative Formulations for Nash Games; Differential Variational Inequality Formulation; Nonlinear Complementarity Formulation; Single Level Problem Formulation of Stackelberg Game; Numerical Example; Fixed Point Algorithm; Descent Algorithm in Hilbert Space for the Projection Sub-Problem; Penalty Method Function

4 Stochastic Differential Stackelberg Games
  Stochastic Optimal Control: The Concept of Stochastic Process; Stochastic Maximum Principle and Necessary Condition; Sufficient Conditions for a Stochastic Optimal Control Problem; 4.2 Stochastic Differential Variational Inequality; Riccati System of Equations; Stochastic Linear Quadratic Problem; Numerical Example; Existence and Uniqueness of Solutions for ODE; Backward Differentiation Equations; Approximating Method for ODE Systems: Notion of Stochastic Dynamic Stackelberg Games; Necessary Condition; SDVI Formulation; Riccati System of Equations; Discrete-Time Approximation; Existence and Uniqueness

5 Dynamic Optimization and Differential Stackelberg Games Applied to Freight Transportation
  Introduction and Brief Literature Review; The Shipper-Carrier Problem; Notations; 5.2.2 The Carrier's Objective Functional and Constraints; The Shippers' Objective Functional, Dynamics, and Constraints; DVI Formulation of the Lower Level Problem; Single Level Optimization Problem; Numerical Example; Computational Performance and Numerical Results; Conclusion

6 Stochastic Differential Stackelberg Game Applied to Strategic Pricing and Revenue Management
  A Revenue Management Model for a Stochastic Stackelberg Game between a Supplier and Retailers; Notations; The Retailers' Objective Functional and Constraints (Followers' Problem); The Supplier's Objective Functional and Constraints (Leader's Problem); Stochastic Analysis of the Lower Level Problem; Single Level Problem; 6.3.1 Discrete-Time Approximation with Backward Differentiation Equations; Numerical Example; Computational Performance and Numerical Results; Conclusion

References

List of Figures

3.1 Optimal Control Trajectory of Leader and Follower; Optimal Control Trajectory of Leader and Follower; Parameter K; Stochastic State Variable; Stochastic Control Variable; Distribution of Objective Function Value; Network of five arcs and four nodes; Price trajectories for transportation between OD pairs; Commodity shipment trajectories for shippers; Allocation of shippers' commodities to different nodes; Inventory trajectories of shippers at different nodes; Market price; Deterministic and Stochastic Net Present Value; Net Present Value Distribution of Retailers; Stochastic Demand Allocation; Deterministic Demand Allocation; 6.5 Deterministic and Stochastic Market Price; Deterministic and Stochastic Inventory; Deterministic and Stochastic Wholesale Price; Deterministic and Stochastic Production Rate

List of Tables

4.1 Backward Differentiation Equations Parameters; Path-Arc Relationship; Parameters of Shipper-Carrier Model; Net Present Value of Instantaneous Revenues for Shipper-Carrier Model; Backward Differentiation Equations Parameters; Parameters of Supply Chain Model

Acknowledgements

First and foremost, I take the opportunity to thank my advisor, Professor Terry L. Friesz, for providing guidance, valuable suggestions, and financial support during my days at Penn State as a graduate student. He always encouraged me during my graduate student life, giving of his valuable time and providing new ideas and comments that shaped this dissertation. I am grateful to my other committee members, Professor Alberto Bressan, Professor Soundar R.T. Kumara, Professor Venky Shankar and Professor Tao Yao, who were especially generous with their time, support and encouragement, and who read the draft and provided valuable feedback. I would also like to thank my colleagues, Ke Han and Pedro Neto, both of whom provided support during the research process. I am also deeply grateful to my close friends in State College, Bijan Zolghadr, Sharareh Zolghadr, Afshin Shafa, Ghazal Izadi, Sam Tajbakhsh, Gelareh Alam, Akbar Sharifi, Alireza Akhavan, Ashkan Jasour, Hessamodin Ahamdi and many more, for their company, because of which my life as a graduate student was never boring.

Chapter 1

1 Introduction and Motivation

1.1 Differential Stackelberg Game

The concept of the Stackelberg game was first introduced in economics in the 1930s within the context of static economic competition by Stackelberg (1952). This game can be considered a multistage game in which the players make decisions sequentially and each player is active at only one stage of the game. In a Stackelberg game with two players there are two stages. At the first stage the active player is the leader, and the follower makes a decision at the second stage. From an optimization point of view, a two-player Stackelberg game is a two-level hierarchical optimization problem in which the leader minimizes his/her objective function subject to the follower's optimization problem. Static Stackelberg games are studied in several papers such as Simaan (1977)(a). The most popular method of solving a static Stackelberg game, as presented by Simaan (1977)(b), is to add a constraint to the upper level problem which forces the duality gap between the primal and dual problems of the follower to be zero. A penalty or barrier function method is then used to append the duality gap to the leader's objective. This method was used by Aiyoshi and Shimizu (1984) and by Shimizu and Aiyoshi (1981). Pedroso (1996) also proposed an evolutionary heuristic method that solves a system of several equations and can be

employed to find a Stackelberg equilibrium. Hoesel (2008) presents a basic enumeration procedure based on a branch-and-bound method to solve a static Stackelberg game.

The study of zero-sum differential games was initiated by Isaacs (1965). In such games each player chooses a control policy from a set of admissible controls to minimize his/her objective function over a finite time horizon. Moreover, there is a state variable whose evolution is described by a differential equation and is uniquely determined by the value of the controls.

A considerable literature exists on the application of Stackelberg games in the area of supply chains and economics, covering topics such as inventory and production planning as well as wholesale and retail pricing strategies when market demand exhibits growth dynamics or seasonal variation. Eliashberg and Steinberg (1987) applied a Stackelberg game model to a decentralized assembly system in which the manufacturer is the leader and the retailer is the follower, assuming that the demand has seasonal fluctuations. In this dynamic model the manufacturer must determine the production rate and a constant wholesale price. The retailer must also decide on its processing, pricing, and inventory policies. Desai (1992) changed the previous model by considering a dynamic wholesale price instead of a constant one and by not allowing the retailer to carry inventory. Desai (1996) extends Desai (1992) by requiring the retailer to process the goods received from the manufacturer before they can be sold in the market, so the retailer must decide on the processing rate and pricing policies. Batabyal and Beladi (2006) study the application of a Stackelberg game to a non-renewable resource when the seller, as the leader, controls the price and the buyer, as the follower, controls the consumption rate. There are also some studies on the application of Stackelberg games to network design.
Wie (2007) develops a Stackelberg model for a network in which an operator, as the leader, decides on tariffs on a subset of the arcs that he owns to maximize his revenue. The customers, who are the followers, select the cheapest path by solving a shortest path problem. Hoesel (2008) studies a Stackelberg game model in which the government authority tries to improve network performance by determining congestion toll schedules at the upper level, and automobile commuters choose departure time and route pairs with minimum costs between their respective origins and destinations at the lower level.

The general method of solving differential Stackelberg games is to add constraints to the leader's problem which impose the maximum principle necessary conditions for the follower's problem. In this case the adjoint variables of the follower's problem are treated as state variables for the leader. Hence, the bi-level differential problem can be converted to a single-level two-point boundary value problem. Kleimenov et al. (2010) developed a numerical method for Nash and Stackelberg solutions when the system dynamics are linear. Wie (2007) employs the Hooke-Jeeves algorithm to obtain the Stackelberg equilibrium. Abou-Kandil (1990) finds a closed-form solution for discrete-time linear-quadratic Stackelberg games. Wenyong et al. (2009) construct a heuristic method based on a genetic algorithm to solve a tariff network problem formulated as a Stackelberg game.

The first result on a stochastic Stackelberg game is presented by Castanon (1976), where the feedback Stackelberg solution of a two-player linear-quadratic-Gaussian (LQG) nonzero-sum discrete-time dynamic game problem is obtained. Basar (1979)(a) solves a class of static stochastic Stackelberg games when the underlying statistics of the random variables are not necessarily Gaussian. Some of these results are then employed by Basar (1979)(b) to find the feedback Stackelberg solution of the LQG problem of Castanon (1976). Bagchi and Basar (1981) obtain the Stackelberg equilibrium for a class of two-player stochastic differential games described by linear state dynamics and quadratic objective functionals. The information structure of the problem is such that the players make independent noisy measurements of the initial state and are permitted to use only this information in constructing their controls.
Yong (2002) studies linear quadratic stochastic Stackelberg games, allowing the coefficients of the cost functions to be random. In that paper the stochastic Riccati system of equations is employed to solve the follower's problem. There are also further papers in the literature that use a Riccati system of equations to solve stochastic Nash and Stackelberg games based on

Lévy processes, with applications in marketing and advertising, such as Quang and Xung (2008). Pita et al. (2010) deal with a model in which uncertainty is assumed only for the follower. In the traditional Stackelberg game it is supposed that the follower plays optimally, but in reality it is quite possible that the follower deviates from the optimal action because of limited observation. This problem is formulated as a mixed-integer linear program, and a heuristic algorithm is proposed to obtain a robust solution. Oksendal et al. (2011) studied the application of stochastic Stackelberg games to the newsvendor problem. In this model the uncertainty, based on an Ito-Lévy process, is considered only for the follower, and both objective functions maximize expected profit under the random demand rate.

Stackelberg-Cournot-Nash equilibria were introduced by Sherali et al. (1983). In a Stackelberg-Cournot-Nash game there are N + 1 players: N of them are followers engaged in a Nash competition, and the (N + 1)st player is the leader, who minimizes his/her objective function taking into account the best replies of the other N players. Tobin (1992) studies the uniqueness of the Stackelberg-Cournot-Nash equilibrium. Miller et al. (1992) formulate a dynamic facility location model for a firm that locates on a discrete network. It is assumed that this locating firm acts as the leader in an industry characterized by Stackelberg leader-follower competition. The lower level problem is formulated by means of variational inequalities, and sensitivity analysis is employed to develop reaction-function-based dynamic models, so the leader solves his problem taking into account the expected replies of the followers. Xu (2005) and Daniel and Smeers (1997) developed the Stackelberg-Cournot-Nash model proposed by Sherali et al. (1983) for the case in which the market demand is uncertain. The demand uncertainty is modeled as M scenarios with associated probabilities.
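
The Stackelberg-Cournot-Nash structure described above can be illustrated with a small static example. The sketch below is not taken from any of the cited works. It assumes a linear inverse demand p(Q) = a - bQ and constant marginal costs, and every function and parameter name in it is hypothetical. The players maximize profit, which is equivalent to minimizing negative profit. The followers' symmetric Nash output is written in closed form from their first-order conditions and substituted into the leader's problem, which is then solved by a simple ternary search; this mirrors, in miniature, the idea of replacing the lower level problem by its equilibrium conditions.

```python
def follower_nash(q_leader, a, b, c_follower, n):
    """Symmetric Cournot-Nash output of each of n followers for a fixed leader output."""
    # First-order condition of one follower when every follower produces q:
    #   a - b*(q_leader + n*q) - c_follower - b*q = 0
    q = (a - c_follower - b * q_leader) / (b * (n + 1))
    return max(0.0, q)  # followers shut down instead of producing a negative amount


def leader_profit(q_leader, a, b, c_leader, c_follower, n):
    """Leader profit once the followers' Nash reply has been substituted in."""
    q = follower_nash(q_leader, a, b, c_follower, n)
    price = a - b * (q_leader + n * q)
    return (price - c_leader) * q_leader


def solve_leader(a, b, c_leader, c_follower, n, iters=200):
    """Ternary search on [0, a/b]; assumes the reduced profit is unimodal there."""
    lo, hi = 0.0, a / b
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if leader_profit(m1, a, b, c_leader, c_follower, n) < leader_profit(
            m2, a, b, c_leader, c_follower, n
        ):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, b, cost, n = 10.0, 1.0, 2.0, 3  # illustrative values only
    q_l = solve_leader(a, b, cost, cost, n)
    q_f = follower_nash(q_l, a, b, cost, n)
    print("leader output:", q_l, "follower output:", q_f)
```

Under these assumptions the reduced leader problem is a concave quadratic while the followers are active, with maximizer (a + n c_f - (n + 1) c_l) / (2b); for the illustrative values above this gives a leader output of 4 and a follower output of 1 each, which the numerical search should reproduce up to its tolerance.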
The existence and uniqueness of the Stackelberg-Cournot-Nash equilibrium solution under uncertainty are also studied by Daniel and Smeers (1997).

In this thesis, we provide a framework for studying player interaction based on the Stackelberg-Cournot-Nash behavioral assumption. We briefly review the classical theory of dynamic Stackelberg games, and show how the Nash equilibrium of a lower level problem

may be better described by so-called differential variational inequalities (DVI). We also show that when each agent in the lower level problem is solving a stochastic optimal control problem with a linear quadratic form, the stochastic Nash equilibrium can be expressed as a Riccati system of equations. Both a DVI formulation and a Riccati system of equations may be used to express the solution of the lower level problem implicitly as a function of the upper level problem's controls. Hence, we are able to convert the bi-level optimization problem into a single-level problem. To be explicit, the following contributions are made in this dissertation:

In chapter 2, the notion of dynamic user equilibrium (DUE) with simultaneous route and departure time choice and elastic demand is presented as a form of differential variational inequality (DVI). Our re-statement allows a concise demonstration that any DVI solution is a continuous-time dynamic user equilibrium.

In chapter 3, a theoretical framework is presented to solve a differential Cournot-Nash Stackelberg game by formulating the lower level problem as a DVI and converting the bi-level optimization problem into a single-level problem.

In chapter 4, a theoretical framework is presented to solve a stochastic differential Cournot-Nash Stackelberg game when the observation of the state variable in the lower level problem is stochastic. The stochastic bi-level optimization problem is converted into a single-level one by employing the Riccati system of equations that describes the stochastic Nash equilibrium of the lower level problem.

1.2 Applications

In the last two chapters of the thesis, we describe two applications of dynamic Stackelberg games that we have studied as part of this thesis research, namely

1. dynamic competition in city logistics involving the shippers and the carriers (a class of problems also known as dynamic urban freight systems);

2. stochastic dynamic pricing and revenue management in a supply chain.

Furthermore, we study the application of differential Stackelberg games in two different areas: freight transport, and strategic pricing and revenue management. In chapter 5, we consider a Stackelberg game between a single carrier that acts as the leader and multiple shippers involved in a Nash competition. In that chapter we are interested in solving the differential Cournot-Nash Stackelberg game to answer the following question: What will the Nash equilibrium and the Stackelberg equilibrium be when the transportation price setter acts as the leader while the followers compete to maximize revenue by satisfying the demand at different nodes of a network?

In chapter 6, we study the interaction between a supplier who is the leader and multiple retailers who are competing to sell a homogeneous commodity in a market when the market price evolves based on an Ito-type stochastic process. In that chapter we try to answer the following questions: How can the interaction between a supplier and retailers be explained in terms of a dynamic wholesale price and a dynamic market price, based on a stochastic Stackelberg mechanism, when the market price is stochastic? What is the effect of stochasticity in the market price on the wholesale price and the production plan strategy of the leader?

1.3 Preliminary Mathematics

First we briefly review some key definitions of function spaces used extensively in the rest of the thesis.

Definition 1 (Space of square-integrable functions) The space of square-integrable functions, $L^2[t_0, t_f]$, defined on the segment of the real line $[t_0, t_f] \subseteq \mathbb{R}^1_+$, is a Banach space which

consists of all real-valued functions on $[t_0, t_f]$, and the norm defined by

\[ \|z\|_2 = \left( \int_{t_0}^{t_f} |z(t)|^2 \, dt \right)^{1/2} < \infty \]

The space $L^2[t_0, t_f]$ can be obtained in a direct way by the use of the Lebesgue integral and Lebesgue measurable functions $z$ on $[t_0, t_f]$ such that the Lebesgue integral of $|z|^2$ over $[t_0, t_f]$ exists and is finite.

Definition 2 (Hilbert space) The vector space $H^1[t_0, t_f]$ is defined as

\[ H^1[t_0, t_f] = \left\{ v \;\middle|\; v \in (L^2[t_0, t_f])^m;\ \frac{\partial v}{\partial x_i} \in L^2[t_0, t_f]\ \ \forall i = 1, 2, \dots, m \right\} \]

when $v$ is absolutely continuous.

Definition 3 (G-differentiability). Suppose $X$ and $Y$ are locally convex topological vector spaces and $V \subseteq X$. A functional $J : X \to Y$ is Gateaux differentiable, or G-differentiable, at $v \in V$ in the direction $\phi \in X$ if the limit

\[ \lim_{\theta \to 0} \frac{J(v + \theta\phi) - J(v)}{\theta} \]

exists. This limit is denoted by $\delta J(v, \phi)$.

Theorem 1 (Riesz). Let $V$ be a Hilbert space and $L \in V'$ a continuous linear form on $V$. Then there exists a unique element $u_L \in V$ such that

\[ \forall v \in V \quad L(v) = \langle u_L, v \rangle \quad \text{and} \quad \|L\|_{V'} = \|u_L\|_V \]

Conversely, we can associate with each $u \in V$ the continuous linear form $L_u$ defined by

\[ \forall v \in V \quad L_u(v) = \langle u, v \rangle \]

Definition 4 (Gradient). Let $V$ be a Hilbert space with the scalar product $\langle \cdot, \cdot \rangle$. If $J$ is G-differentiable at $v \in V$, and if $\delta J(v, \phi)$ is a continuous linear form with respect to $\phi$, then, by Theorem 1, there exists an element $\nabla J(v) \in V$ such that

\[ \forall \phi \in V \quad \delta J(v, \phi) = \langle \nabla J(v), \phi \rangle \]

$\nabla J(v)$ is called the gradient of $J$ at $v$.

Variational Inequality

The finite dimensional variational inequality (VI) can be stated as follows:

Definition 5 (Finite dimensional variational inequality problem) Let $X \subseteq \mathbb{R}^n$ be a nonempty subset of $\mathbb{R}^n$ and let $F : X \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be a mapping from $X$ into $\mathbb{R}^n$. $VI(X, F)$ is to find a vector $x^* \in X$ such that the following condition holds:

\[ [F(x^*)]^T (y - x^*) \ge 0 \quad \forall y \in X \subseteq \mathbb{R}^n \tag{1.1} \]

The problem $VI(X, F)$ is closely related to the nonlinear complementarity problem (NCP). Suffice it to say that under appropriate regularity conditions VIs and NCPs are equivalent to one another. Under more stringent conditions, the solution of a VIP is the solution of an NCP. Closely related to the finite dimensional VI is the quasi-variational inequality (QVI), which replaces $X$ with a point-to-set mapping $X : \mathbb{R}^n \to \mathbb{R}^n$.

Definition 6 (Infinite dimensional variational inequality problem). Let $X \subseteq V$ be a nonempty subset of the Hilbert space $V$ and let $F : X \subseteq V \to V$ be a mapping from $V$ into itself.

$VIP(X, F)$ is to find a vector $x^* \in X$ such that the following condition holds:

\[ [F(x^*)]^T (y - x^*) \ge 0 \quad \forall y \in X \subseteq V \tag{1.2} \]

Some excellent discussions of existence and uniqueness of solutions, as well as algorithms for solving the finite dimensional VIP, can be found in Harker (1984). Friedman (1972) also presents some details on infinite dimensional variational inequalities in function spaces. A parallel literature exists for complementarity problems, but a review of it is omitted here for the sake of brevity. For a survey of recent work related to specific finite dimensional VI applications and algorithms see Ferris and Pang (1997). Algorithms and applications for infinite dimensional VIs are discussed by Baiocchi and Capelo (1984), Nagurney (1999), and Nagurney et al. (2002). Interestingly, some of the most recent applications of infinite dimensional variational inequalities, as well as research on numerical methods for their solution, have occurred in the context of option pricing and various exotic securities used in financial markets; see in particular [22].

Optimal Control Problem

We consider the following optimal control problem:

\[ \min_u J = \Gamma(x(t_f), t_f) + \int_{t_0}^{t_f} F(x(t), u(t), t) \, dt \tag{1.3} \]

subject to:

\[ \frac{dx}{dt} = f(x(t), u(t), t) \quad \forall t \in [t_0, t_f] \tag{1.4} \]
\[ x(t_0) = x_0 \tag{1.5} \]
\[ u \in U \tag{1.6} \]

where

\[ u \in \{ u : [t_0, t_f] \to U,\ U \subseteq \mathbb{R}^m \} \subseteq (L^2[t_0, t_f])^m \]
\[ x \in (H^1[t_0, t_f])^n \]
\[ x_0 \in \mathbb{R}^n \]
\[ F : (H^1[t_0, t_f])^n \times (L^2[t_0, t_f])^m \times \mathbb{R}^1_+ \to L^2[t_0, t_f] \]
\[ f : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^1 \to \mathbb{R}^n \]
\[ \Gamma : \mathbb{R}^n \times \mathbb{R}^1_+ \to \mathbb{R}^1 \]

We denote the state variables by $x$, the control variables by $u$, and time by $t$. In the remainder of this thesis, we suppress the time argument $t$ when the meaning is obvious. Note that $(L^2[t_0, t_f])^m$ is the $m$-fold product of the space of square-integrable functions $L^2[t_0, t_f]$, with the inner product defined by

\[ \langle x(t), y(t) \rangle \equiv \int_{t_0}^{t_f} [x(t)]^T y(t) \, dt \tag{1.7} \]

while $(H^1[t_0, t_f])^n$ is the $n$-fold product of the Sobolev space $H^1[t_0, t_f]$. We assume this parametric optimal control problem satisfies the regularity conditions defined as follows:

Definition 7 (Regularity). The parametric optimal control problem (1.3)-(1.6) is regular if the following conditions hold:

1. $F(x, u, t)$ is convex and continuously differentiable with respect to $x$, $u$;
2. $f(x, u, t)$ is affine and continuously differentiable with respect to $x$ and $u$;
3. $f(x, u, t)$ and $\partial f(x, u, t) / \partial x$ are bounded so that the operator $x(u, t) : (L^2[t_0, t_f])^m \times \mathbb{R}^1_+ \to (H^1[t_0, t_f])^n$

exists and is continuous and Gateaux-differentiable with respect to $u$;
4. $\Gamma(x, t)$ is continuously differentiable with respect to $x$;
5. $U$ is non-empty, convex and compact; and
6. $x_0 \in \mathbb{R}^n$ is known and fixed.

When the regularity conditions hold, the operator $x(u, t)$ exists and is implicitly defined by the state dynamics $dx/dt = f(x, u, t)$. Further, the operator $x(u, t)$ is continuous and Gateaux-differentiable by the following two theorems, which are reproduced from Theorems 4.2 and 4.4 of Bressan and Piccoli (2007), respectively.

Theorem 2 Suppose the regularity conditions in Definition 7 hold. Then, for every $t_f > t_0$, $u \in U$, the Cauchy problem

\[ \frac{dx}{dt} = f(x(t), u(t), t) \quad \forall t \in [t_0, t_f], \qquad x(t_0) = x_0 \tag{1.8} \]

has a unique solution $x(u, t)$ defined for all $t \in [t_0, t_f]$. The input-output map $u(t) \mapsto x(u; t)$ is continuous.

Theorem 3 Suppose the regularity conditions in Definition 7 hold. Let $u(t) \in U$ be a control whose corresponding solution $x(u; t)$ of (1.8) is defined on $[t_0, t_f]$. Then, for every bounded measurable $v(t)$ and every $t \in [t_0, t_f]$, the map $\delta \mapsto x(u + \delta v; t)$ is differentiable; hence the operator $x(u, t)$ is Gateaux-differentiable.

Now we give the gradient of the optimal control problem (1.3)-(1.6).

Theorem 4 The objective functional $J$ in (1.3)-(1.6) is continuously differentiable in the sense of Gateaux, and

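\[ \nabla J(u) = \frac{\partial H(x, u, \lambda, t)}{\partial u} \]

where $H$ is the Hamiltonian

\[ H \equiv F(x, u, t) + \lambda^T f(x, u, t) \]

and the adjoint variable $\lambda$ is a solution of the following terminal value problem:

\[ -\frac{d\lambda}{dt} = \left( \frac{\partial f}{\partial x} \right)^T \lambda + \left( \frac{\partial F}{\partial x} \right)^T, \qquad \lambda(t_f) = \frac{\partial \Gamma(x(t_f), t_f)}{\partial x(t_f)} \]

Proof. The derivative of $J$ in the direction $\rho$ becomes

\[ \delta J(u; \rho) = \frac{\partial \Gamma(x(t_f), t_f)}{\partial x(t_f)} \, y(t_f) + \int_{t_0}^{t_f} \left\{ \frac{\partial F}{\partial x} y + \frac{\partial F}{\partial u} \rho \right\} dt \tag{1.9} \]

where $y \equiv \delta x$ is a variation in $x$ which implicitly depends on $\rho$. Furthermore, by definition

\[ x(t) = x_0 + \int_{t_0}^{t} f(x, u, s) \, ds \]

and therefore

\[ y \equiv \delta x = \int_{t_0}^{t} \left[ \frac{\partial f}{\partial x} y + \frac{\partial f}{\partial u} \rho \right] ds \]

We introduce adjoint variables $\lambda$ defined by the final value problem

\[ -\frac{d\lambda}{dt} = \left( \frac{\partial F}{\partial x} \right)^T + \left( \frac{\partial f}{\partial x} \right)^T \lambda, \qquad \lambda(t_f) = \frac{\partial \Gamma(x(t_f), t_f)}{\partial x(t_f)} \]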

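so that (1.9) becomes

\[ \delta J(u; \rho) = \lambda(t_f) y(t_f) + \int_{t_0}^{t_f} \left\{ \left[ -\left( \frac{d\lambda}{dt} \right)^T - \lambda^T \frac{\partial f}{\partial x} \right] y + \frac{\partial F}{\partial u} \rho \right\} dt \]

Integration by parts, together with $y(t_0) = 0$, yields

\[ \int_{t_0}^{t_f} -\left( \frac{d\lambda}{dt} \right)^T y \, dt = -\lambda(t_f) y(t_f) + \int_{t_0}^{t_f} \lambda^T \frac{dy}{dt} \, dt = -\lambda(t_f) y(t_f) + \int_{t_0}^{t_f} \lambda^T \left[ \frac{\partial f}{\partial x} y + \frac{\partial f}{\partial u} \rho \right] dt \]

It follows that

\[ \delta J(u; \rho) = \lambda(t_f) y(t_f) - \lambda(t_f) y(t_f) + \int_{t_0}^{t_f} \left\{ \lambda^T \left[ \frac{\partial f}{\partial x} y + \frac{\partial f}{\partial u} \rho \right] - \lambda^T \frac{\partial f}{\partial x} y + \frac{\partial F}{\partial u} \rho \right\} dt \]
\[ = \int_{t_0}^{t_f} \left\{ \lambda^T \frac{\partial f}{\partial u} + \frac{\partial F}{\partial u} \right\} \rho \, dt = \left\langle \lambda^T \frac{\partial f}{\partial u} + \frac{\partial F}{\partial u}, \rho \right\rangle \]

Therefore, by Definition 4, the gradient of $J(u)$ becomes

\[ \nabla J(u) = \lambda^T \frac{\partial f}{\partial u} + \frac{\partial F}{\partial u} = \frac{\partial H}{\partial u} \]

This completes the proof.

Now we are ready to state the necessary conditions for the optimal control problem (1.3)-(1.6).

Theorem 5 A necessary condition for $u^* \in U$ to yield an optimal solution of (1.3)-(1.6) is

\[ \left\langle \frac{\partial H(x^*, u^*, \lambda^*, t)}{\partial u}, u - u^* \right\rangle \ge 0 \quad \forall u \in U \]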

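Proof. Let $u \in U$ be arbitrary. Since $U$ is convex, $u^* \in U$ implies

\[ u^* + \theta(u - u^*) \in U \quad \forall \theta \in [0, 1] \tag{1.10} \]

Hence for $u^*$ to be a minimum of $J$ on $U$, it is necessary that

\[ \forall u \in U \quad \left[ \frac{d}{d\theta} J(u^* + \theta(u - u^*)) \right]_{\theta = 0} = \delta J(u^*, u - u^*) \ge 0 \tag{1.11} \]

Since $J$ is G-differentiable at $u^*$ and $\nabla J$ is well-defined by Theorem 4, we have

\[ \delta J(u^*, u - u^*) = \left\langle \lambda^T \frac{\partial f(x^*, u^*, t)}{\partial u} + \frac{\partial F(x^*, u^*, t)}{\partial u}, u - u^* \right\rangle = \left\langle \frac{\partial H(x^*, u^*, \lambda^*, t)}{\partial u}, u - u^* \right\rangle \ge 0 \quad \forall u \in U \tag{1.12} \]

This completes the proof. In fact, (1.12) is a variational inequality, which will be introduced later in this section.

Differential Variational Inequality

We follow Friesz et al. (2006) in referring to the following differential variational inequality problem as $DVI(F, f, U, x_0, \Gamma)$: find $u^* \in U$ such that

\[ \langle F(x(u^*, t), u^*, t), u - u^* \rangle \ge 0 \quad \forall u \in U \tag{1.13} \]

where

\[ u \in U \subseteq (L^2[t_0, t_f])^m \tag{1.14} \]
\[ x(u, t) = \arg \left\{ \frac{dx}{dt} = f(x, u, t),\ x(t_0) = x_0,\ \Gamma[x(t_f), t_f] = 0 \right\} \in (H^1[t_0, t_f])^n \tag{1.15} \]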

and

\[ x_0 \in \mathbb{R}^n \]
\[ F : (H^1[t_0, t_f])^n \times (L^2[t_0, t_f])^m \times \mathbb{R}^1_+ \to L^2[t_0, t_f] \]
\[ f : (H^1[t_0, t_f])^n \times (L^2[t_0, t_f])^m \times \mathbb{R}^1_+ \to (L^2[t_0, t_f])^n \]
\[ \Gamma : (H^1[t_0, t_f])^n \times \mathbb{R}^1_+ \to \mathbb{R}^1 \]

while $(L^2[t_0, t_f])^m$ is the $m$-fold product of the space of square-integrable functions $L^2[t_0, t_f]$, and $(H^1[t_0, t_f])^n$ is the $n$-fold product of the Sobolev space $H^1[t_0, t_f]$. The inner product in (1.13) is defined by

\[ \langle F(x(u^*, t), u^*, t), u - u^* \rangle = \int_{t_0}^{t_f} [F(x(u^*, t), u^*, t)]^T (u - u^*) \, dt \ge 0 \quad \forall u \in U \tag{1.16} \]

To analyze (1.16) we will rely on the following notion of regularity:

Definition 8 (Regularity). We call $DVI(F, f, U, x_0, \Gamma)$ regular if:

1. $F(x, u, t)$ is convex and continuously differentiable with respect to $x$, $u$;
2. $f(x, u, t)$ is affine and continuously differentiable with respect to $x$ and $u$;
3. $f(x, u, t)$ and $\partial f(x, u, t) / \partial x$ are bounded so that the operator $x(u, t) : (L^2[t_0, t_f])^m \times \mathbb{R}^1_+ \to (H^1[t_0, t_f])^n$ exists and is continuous and Gateaux-differentiable with respect to $u$;
4. $\Gamma(x, t)$ is continuously differentiable with respect to $x$;

5. $U$ is non-empty, convex and compact; and
6. $x_0 \in \mathbb{R}^n$ is known and fixed.

The existence of the operator $x(u, t)$ and its continuity and differentiability with respect to $u$ are provided by Theorems 2 and 3. The motivation for this definition of regularity is to parallel as closely as possible those assumptions needed to analyze traditional optimal control problems from the point of view of infinite dimensional mathematical programming; see Definition 7. We next note that (1.16) may be restated as the following optimal control problem:

\[ \min \ \gamma^T \Gamma[x(t_f), t_f] + \int_{t_0}^{t_f} [F(x^*, u^*, t)]^T u \, dt \tag{1.17} \]

subject to:

\[ \frac{dx}{dt} = f(x, u, t) \tag{1.18} \]
\[ x(t_0) = x_0 \tag{1.19} \]
\[ u \in U \tag{1.20} \]

where $x^* = x(u^*)$ is the optimal state vector. We point out that this optimal control problem is a mathematical abstraction and of no use for computation, since its criterion depends on knowledge of the variational inequality solution $u^*$. In what follows we will need the Hamiltonian for (1.17)-(1.20), namely

\[ H(x, u, \lambda, t) = [F(x^*, u^*, t)]^T u + \lambda^T f(x, u, t) \tag{1.21} \]

where $\lambda(t)$ is the adjoint vector that solves the adjoint equations and transversality conditions for the given state and control variables. Note that for a given state vector and

a given instant in time (1.21) is convex in $u$ when $DVI(F, f, U, \Gamma)$ is regular. Furthermore, $DVI(F, f, U, \Gamma)$ admits a fixed point form.

Theorem 6 (Necessary conditions for $DVI(F, f, U, \Gamma)$) When regularity in the sense of Definition 8 holds, solutions $u^* \in U$ of $DVI(F, f, U, \Gamma)$ must obey:

1. the finite dimensional variational inequality principle:
\[ [F(x^*, u^*, t) + \nabla_u (\lambda^*)^T f(x^*, u^*, t)]^T (u - u^*) \ge 0 \quad \forall u \in U; \]
2. the state dynamics
\[ \frac{dx^*}{dt} = f(x^*, u^*, t), \qquad x^*(t_0) = x_0; \text{ and} \]
3. the adjoint dynamics
\[ (-1) \frac{d\lambda^*}{dt} = \nabla_x (\lambda^*)^T f(x^*, u^*, t), \qquad \lambda^*(t_f) = \gamma^T \frac{\partial \Gamma[x^*(t_f), t_f]}{\partial x^*(t_f)} \]

Proof. The Pontryagin minimum principle is a necessary condition for the optimal control problem (1.17)-(1.20), so that

\[ u^* = \arg \left\{ \min_{u \in U} H(x^*, u, \lambda^*, t) \right\} \]

for each $t \in [t_0, t_f]$, which in turn, by virtue of regularity, is equivalent to

\[ [\nabla_u H(x^*, u^*, \lambda^*, t)]^T (u - u^*) \ge 0 \]

Note that

\[ \nabla_u H(x^*, u^*, \lambda^*, t) = F(x^*, u^*, t) + \nabla_u (\lambda^*)^T f(x^*, u^*, t) \]

where for given $u$ the adjoint variable $\lambda(u, t)$ is obtained as the solution of the terminal value problem

\[ (-1) \frac{d\lambda}{dt} = \nabla_x H(u, x, \lambda, t) = \nabla_x \{ [F(x^*, u^*, t)]^T u \} + \nabla_x \lambda^T f(x, u, t) = \nabla_x \lambda^T f(x, u, t), \qquad \lambda(t_f) = \gamma^T \frac{\partial \Gamma[x(t_f), t_f]}{\partial x(t_f)} \]

since $x(u, t)$ is completely determined by knowledge of the controls $u$. The theorem follows immediately.

Note that item 1 of this theorem refers to a finite dimensional variational inequality because, as explained in Friesz et al. (2006), the Pontryagin minimum principle from which it is derived minimizes the associated Hamiltonian for fixed time and fixed state and adjoint variables. We further note that the following existence result holds:

Theorem 7 (Existence of a solution to $DVI(F, f, U, \Gamma)$) When regularity in the sense of Definition 8 holds, $DVI(F, f, U, \Gamma)$ has a solution.

Proof. By the assumption of regularity $x(u, t)$ is well defined and continuous, so $F(x(u, t), u, t)$ is continuous in $u$. Also by regularity we know $U$ is convex and compact. Consequently, by Theorem 2 of Browder [27], $DVI(F, f, U)$ has a solution.
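
The gradient formula of Theorem 4 and the variational inequality of Theorem 5 can be checked numerically on a small example. The sketch below is an illustration only, not the algorithm used later in the thesis. It assumes a scalar problem with F = (x^2 + u^2)/2, affine dynamics dx/dt = a x + b u, no terminal cost (so the adjoint vanishes at the final time), a box set U, and a forward Euler discretization; every name and numerical value is hypothetical. The adjoint recursion is the discrete counterpart of -dλ/dt = ∂F/∂x + (∂f/∂x) λ, the gradient is ∂H/∂u = u + b λ, and a projected gradient iteration is used to reach a control satisfying the variational inequality.

```python
import numpy as np

# Hypothetical scalar test problem: dx/dt = A*x + B*u, F = 0.5*(x**2 + u**2), Gamma = 0
A, B = -1.0, 1.0
X0, T, N = 1.0, 1.0, 400
DT = T / N
U_MIN, U_MAX = -0.2, 0.2  # box constraint playing the role of the set U


def simulate(u):
    """Forward Euler solution of the state dynamics for a piecewise-constant control."""
    x = np.empty(N + 1)
    x[0] = X0
    for k in range(N):
        x[k + 1] = x[k] + DT * (A * x[k] + B * u[k])
    return x


def cost(u):
    """Discretized objective J(u) (rectangle rule, no terminal term)."""
    x = simulate(u)
    return DT * 0.5 * np.sum(x[:-1] ** 2 + u ** 2)


def gradient(u):
    """L2 gradient dH/du = dF/du + lambda * df/du, with lambda from the adjoint recursion."""
    x = simulate(u)
    lam = np.zeros(N + 1)  # terminal condition lambda(t_f) = dGamma/dx = 0
    for k in range(N - 1, -1, -1):
        # discrete form of -dlambda/dt = dF/dx + (df/dx) * lambda
        lam[k] = lam[k + 1] + DT * (x[k] + A * lam[k + 1])
    return u + B * lam[1:]


def inner(v, w):
    """Discretized L2[t_0, t_f] inner product."""
    return DT * float(np.dot(v, w))


def projected_gradient(u, step=0.5, iters=100):
    """Iterate u <- P_U(u - step * grad J(u)); P_U is a pointwise clip onto the box."""
    for _ in range(iters):
        u = np.clip(u - step * gradient(u), U_MIN, U_MAX)
    return u


if __name__ == "__main__":
    u_star = projected_gradient(np.zeros(N))
    rng = np.random.default_rng(0)
    u_any = rng.uniform(U_MIN, U_MAX, N)  # an arbitrary feasible control
    print("J(u*) =", cost(u_star))
    print("<grad J(u*), u - u*> =", inner(gradient(u_star), u_any - u_star))
```

Because the adjoint recursion here is the exact transpose of the Euler scheme, the inner product of `gradient(u)` with a direction should agree with a finite-difference directional derivative of `cost` up to rounding, and the printed inner product should be non-negative for any feasible control, as the variational inequality requires.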

Chapter 2

DVI Formulation of Dynamic Network User Equilibriums with Elastic Demand

2.1 Introductory Remarks

This chapter is an extension of the fixed-demand model presented in Friesz et al. (2011); it is concerned with a specific type of dynamic traffic assignment (DTA) known as dynamic user equilibrium (DUE) for which travel cost, including delay as well as early and late arrival penalties, is identical for those route and departure time choices selected by travel agents between a given origin-destination pair. In particular, we consider a dynamic user equilibrium with elastic demand and show how it can be formulated as a differential variational inequality (DVI).

Most of the studies of DUE reported in the DTA literature are about dynamic user equilibrium with constant travel demand for each origin-destination pair. It is, of course, not generally true that travel demand is fixed, even for short time horizons. Arnott et al. (1993) and Yang and Huang (1997) directly consider elastic travel demand; their work bears only a limited relationship to the analysis presented in this paper, since it is concerned with a simple bottleneck instead of a nontrivial network, which is our focus herein. Yang and Meng (1998) extend a simple bottleneck model to a general queuing network with known elastic demand functions for each origin-destination (OD) pair. Wie et al. (2002) study a version of dynamic user equilibrium with elastic demand, using a complementarity formulation that requires path delays to be expressible in closed form. Szeto and Lo (2004) study dynamic user equilibrium with elastic travel demand when network loading is based on the cell transmission model (CTM); their formulation is based on discrete time and is expressed as a finite-dimensional variational inequality (VI). Han et al. (2011) study dynamic user equilibrium with elastic travel demand for a network whose traffic flows are also described by CTM.

Although Friesz et al. (2011) show that analysis and computation of dynamic user equilibrium with constant travel demand is tremendously simplified by stating it as a differential variational inequality (DVI), they do not discuss how elastic demand may be accommodated within a DVI framework. We know of no prior published work that extends the Friesz et al. (2011) formulation to an elastic demand setting. That is, our demonstration in this paper that dynamic user equilibrium with elastic travel demand may be formulated as a differential variational inequality in continuous time is original and has not been previously reported.

The DVI formulation of elastic demand DUE is not straightforward. In particular, the DVI presented herein has both infinite-dimensional and finite-dimensional terms; moreover, for any given origin-destination pair, inverse travel demand corresponding to a dynamic user equilibrium depends on the terminal value of a state variable representing cumulative departures. The DVI formulation achieved in this chapter is significant because it allows the still emerging theory of differential variational inequalities to be employed for the analysis and computation of solutions of the elastic-demand DUE problem when simultaneous departure time and route choice are within the purview of users, all of which constitutes a foundation problem within the field of dynamic traffic assignment. A good review of recent insights into abstract differential variational inequality theory, including computational methods for solving such problems, is provided by Pang and Stewart (2008). Also, differential variational inequalities involving the kind of explicit, agent-specific control variables employed herein are presented in Friesz (2010).

2.2 Notation and Essential Background

We assume the time interval of analysis is a single commuting period or day expressed as
$$[t_0, t_f] \subset \mathbb{R}^1_+$$
where $t_f > t_0$, and both $t_0$ and $t_f$ are fixed. Here, as in all DUE modeling, the single most crucial ingredient is the path delay operator, which provides the delay on any path $p$ per unit of flow departing from the origin of that path; it is denoted by
$$D_p(t, h) \quad \forall p \in \mathcal{P}$$
where $\mathcal{P}$ is the set of all paths employed by travelers, $t$ denotes departure time, and $h$ is a vector of departure rates. From these, we construct effective unit path delay operators $\Psi_p(t, h)$ by adding the so-called schedule delay $E[t + D_p(t, h) - T_A]$; that is
$$\Psi_p(t, h) = D_p(t, h) + E[t + D_p(t, h) - T_A] \quad \forall p \in \mathcal{P}$$
where $T_A$ is the desired arrival time and $T_A < t_f$. The function $E(\cdot)$ assesses a penalty whenever
$$t + D_p(t, h) \neq T_A \tag{2.1}$$
since $t + D_p(t, h)$ is the clock time at which departing traffic arrives at the destination of path $p \in \mathcal{P}$. We stipulate that each
$$\Psi_p(\cdot, h) : [t_0, t_f] \to \mathbb{R}^1_{++} \quad \forall p \in \mathcal{P}$$
is measurable and strictly positive. We employ the obvious notation
$$(\Psi_p(\cdot, h) : p \in \mathcal{P}) \in \mathbb{R}^{|\mathcal{P}|}$$
to express the complete vector of effective delay operators. It is now well known that path delay operators may be obtained from an embedded delay model, data combined with response surface methodology, or data combined with inverse modeling. Unfortunately, regardless of how derived, realistic path delay operators do not possess the desirable property of monotonicity; they may also be non-differentiable. However, we shall have more to say about path delays in a subsequent section of this paper.

Transportation demand is assumed to be expressed as the following invertible function
$$Q_{ij}(t_f) = F_{ij}[v]$$
for each origin-destination pair $(i, j) \in \mathcal{W}$, where $\mathcal{W}$ is the set of all origin-destination pairs and $v$ is a concatenation of origin-destination minimum travel costs $v_{ij}$ associated with $(i, j) \in \mathcal{W}$. That is, we have that
$$v_{ij} \in \mathbb{R}^1_+, \qquad v = (v_{ij} : (i, j) \in \mathcal{W}) \in \mathbb{R}^{|\mathcal{W}|}$$
Note that to say $v_{ij}$ is a minimum travel cost means it is the minimum cost for all departure time choices and all route choices pertinent to origin-destination pair $(i, j) \in \mathcal{W}$. Further note that $Q_{ij}(t_f)$ is the unknown cumulative travel demand between $(i, j) \in \mathcal{W}$ that must ultimately arrive by time $t_f$. We will also find it convenient to form the complete vector of travel demands by concatenating the origin-specific travel demands to obtain
$$Q = (Q_{ij}(t_f) : (i, j) \in \mathcal{W}) \in \mathbb{R}^{|\mathcal{W}|}, \qquad F : \mathbb{R}^{|\mathcal{W}|}_+ \to \mathbb{R}^{|\mathcal{W}|}_+$$
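To make the effective delay operator $\Psi_p$ of this section concrete, the following sketch evaluates it at a few departure times on one path. The text does not specify the form of the penalty $E(\cdot)$, so the piecewise-linear penalty, its rates, and the numerical path delays below are assumptions chosen purely for illustration.

```python
import numpy as np

def schedule_delay(arrival_minus_target, early_rate=0.5, late_rate=2.0):
    """Hypothetical piecewise-linear penalty E(.): zero for on-time arrival,
    growing at early_rate when early and at late_rate when late."""
    x = np.asarray(arrival_minus_target, dtype=float)
    return np.where(x < 0.0, -early_rate * x, late_rate * x)

def effective_delay(t, path_delay, target_arrival):
    """Psi_p(t, h) = D_p(t, h) + E[t + D_p(t, h) - T_A] for a single path."""
    return path_delay + schedule_delay(t + path_delay - target_arrival)

t = np.array([7.0, 7.5, 8.0])    # departure times (hours), assumed
D = np.array([0.5, 0.5, 0.75])   # assumed path delays D_p(t, h) at those times
psi = effective_delay(t, D, target_arrival=8.0)
```

In this example the traveler departing at 7.5 arrives exactly at the desired time and incurs no schedule delay, so the effective delay equals the path delay there. In the model itself $D_p$ depends on the full vector of departure rates $h$; here it is simply supplied as data.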

The inverse demand function for every $(i, j) \in \mathcal{W}$ is
$$v_{ij} = \Theta_{ij}[Q(t_f)],$$
and we naturally define
$$\Theta = (\Theta_{ij} : (i, j) \in \mathcal{W}) \in \mathbb{R}^{|\mathcal{W}|}$$
Additionally, we will define the set $\mathcal{P}_{ij}$ to be the set of paths that connect origin-destination pair $(i, j) \in \mathcal{W}$. We denote the space of square integrable functions for the real interval $[t_0, t_f]$ by $L^2[t_0, t_f]$. We stipulate that
$$h \in \left(L^2_+[t_0, t_f]\right)^{|\mathcal{P}|}$$
We write the flow conservation constraints as
$$\sum_{p \in \mathcal{P}_{ij}} \int_{t_0}^{t_f} h_p(t)\, dt = Q_{ij}(t_f) \quad \forall (i, j) \in \mathcal{W} \tag{2.2}$$
where (2.2) is comprised of Lebesgue integrals. In (2.2), we consider demand to be the cumulative travel that takes place by the end of the analysis horizon $t_f$. We define the set of feasible flows for each $v \in \mathbb{R}^{|\mathcal{W}|}$ by
$$\Lambda_0 = \left\{ h \ge 0 : \sum_{p \in \mathcal{P}_{ij}} \int_{t_0}^{t_f} h_p(t)\, dt = Q_{ij}(t_f) \quad \forall (i, j) \in \mathcal{W} \right\} \subseteq \left(L^2_+[t_0, t_f]\right)^{|\mathcal{P}|}$$
Given $h$ let us also define the essential infimum of effective travel delays
$$v_{ij} = \operatorname{ess\,inf}\, [\Psi_p(t, h) : p \in \mathcal{P}_{ij}] \quad \forall (i, j) \in \mathcal{W}$$
We rely on the following definition of dynamic user equilibrium:

Definition 9 Dynamic user equilibrium. A vector of departure rates (path flows) $h^* \in \Lambda_0$ is a dynamic user equilibrium if
$$h^*_p(t) > 0,\ p \in \mathcal{P}_{ij} \implies \Psi_p[t, h^*(t)] = v_{ij}$$
We denote this equilibrium by $DUE(\Psi, \Lambda_0, [t_0, t_f])$.

The meaning of Definition 9 is clear: positive departure rates at a particular time along a particular path must coincide with least effective travel delay. Furthermore
$$\Psi_p(t, h^*) > v_{ij},\ p \in \mathcal{P}_{ij} \implies h^*_p = 0$$
is an implication of Definition 9.

2.3 Dynamics

We assume there are unknown terminal state variables $Q_{ij}(t_f)$, for all $(i, j) \in \mathcal{W}$, which are the realized DUE travel demands. Moreover, for each origin-destination pair $(i, j) \in \mathcal{W}$, inverse travel demand is expressed as
$$v_{ij} = \Theta_{ij}[Q(t_f)] \tag{2.3}$$
Thus, the $Q_{ij}(t_f)$, for all $(i, j) \in \mathcal{W}$, will be determined endogenously to the differential variational inequality presented subsequently in Section 2.4. Such an approach contrasts to the approach employed by Friesz et al. (2010) to study fixed-demand DUE by making each $Q_{ij}(t_f)$ an a priori fixed constant. Accordingly, we introduce the following dynamics:
$$\frac{dQ_{ij}}{dt} = \sum_{p \in \mathcal{P}_{ij}} h_p(t), \qquad Q_{ij}(t_0) = 0 \quad \forall (i, j) \in \mathcal{W} \tag{2.4}$$
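The dynamics (2.4) and the condition in Definition 9 can be illustrated on a time grid for a single origin-destination pair. This is a minimal sketch under stated assumptions: the departure rates `h` and effective delays `psi` are invented arrays (in the model, the delays would come from a network loading procedure), the trapezoidal rule stands in for the Lebesgue integral, and the grid minimum stands in for the essential infimum.

```python
import numpy as np

def cumulative_departures(h, t):
    """Integrate dQ/dt = sum_p h_p(t), Q(t0) = 0, with the trapezoidal rule.
    h has shape (num_paths, num_times); returns Q at every grid point of t."""
    total_rate = h.sum(axis=0)
    increments = 0.5 * (total_rate[1:] + total_rate[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(increments)))

def due_violation(h, psi, tol=1e-12):
    """Largest excess of effective delay over the minimum delay v_ij,
    taken over the (path, time) choices that carry positive flow."""
    v = psi.min()            # grid stand-in for the essential infimum
    used = h > tol
    if not used.any():
        return 0.0
    return float((psi[used] - v).max())

t = np.linspace(0.0, 1.0, 5)
h = np.array([[0.0, 2.0, 2.0, 0.0, 0.0],     # assumed departure rates, path 1
              [0.0, 0.0, 1.0, 1.0, 0.0]])    # assumed departure rates, path 2
psi = np.array([[3.0, 1.0, 1.0, 2.0, 3.0],   # assumed effective delays, path 1
                [3.0, 2.0, 1.0, 1.0, 3.0]])  # assumed effective delays, path 2

Q = cumulative_departures(h, t)
violation = due_violation(h, psi)
```

A zero violation means every used path and departure time attains the minimum effective delay, which is the condition in Definition 9; `Q[-1]` is the realized demand $Q_{ij}(t_f)$ for these assumed flows.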

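As a consequence, we employ the following alternative form of the feasible set:
$$\Lambda = \left\{ h \ge 0 : \frac{dQ_{ij}}{dt} = \sum_{p \in \mathcal{P}_{ij}} h_p(t), \quad Q_{ij}(t_0) = 0 \quad \forall (i, j) \in \mathcal{W} \right\} \subseteq \left(L^2_+[t_0, t_f]\right)^{|\mathcal{P}|} \tag{2.5}$$
Note that the feasible set $\Lambda$ in (2.5) is expressed as a set of path flows since knowledge of $h$ completely determines the demands that satisfy the initial value problem (2.4).

2.4 The Differential Variational Inequality

Experience with differential games in continuous time suggests that an elastic demand dynamic user equilibrium is equivalent to the following variational inequality under suitable regularity conditions: find $h^* \in \Lambda$ such that
$$\sum_{p \in \mathcal{P}} \int_{t_0}^{t_f} \Psi_p(t, h^*)(h_p - h^*_p)\, dt - \sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)] \left[ Q_{ij}(t_f) - Q^*_{ij}(t_f) \right] \ge 0 \quad \forall h \in \Lambda \tag{2.6}$$
where $Q^*(t_f)$ is the terminal equilibrium demand and $\Theta[Q^*(t_f)]$ is the inverse demand function evaluated at $Q^*(t_f)$. We refer to differential variational inequality (2.6) as $DVI(\Psi, \Theta, t_0, t_f)$. We now proceed constructively to show that (2.6) is equivalent to elastic demand DUE for appropriate regularity conditions. In particular we note that the differential variational inequality of interest may be written as
$$\sum_{(i,j) \in \mathcal{W}} \sum_{p \in \mathcal{P}_{ij}} \int_{t_0}^{t_f} \Psi_p(t, h^*) h_p\, dt - \sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)]\, Q_{ij}(t_f) \ge \sum_{(i,j) \in \mathcal{W}} \sum_{p \in \mathcal{P}_{ij}} \int_{t_0}^{t_f} \Psi_p(t, h^*) h^*_p\, dt - \sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)]\, Q^*_{ij}(t_f) \tag{2.7}$$
for all $h \in \Lambda$. Inequality (2.7) means that the solution $h^* \in \Lambda$ solves the optimal control problem
$$\min J_0 = -\sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)]\, Q_{ij}(t_f) + \sum_{(i,j) \in \mathcal{W}} \sum_{p \in \mathcal{P}_{ij}} \int_{t_0}^{t_f} \Psi_p(t, h^*) h_p\, dt \tag{2.8}$$
subject to
$$\frac{dQ_{ij}}{dt} = \sum_{p \in \mathcal{P}_{ij}} h_p(t) \quad \forall (i, j) \in \mathcal{W} \tag{2.9}$$
$$Q_{ij}(t_0) = 0 \quad \forall (i, j) \in \mathcal{W} \tag{2.10}$$
$$h \ge 0 \tag{2.11}$$
This optimal control problem cannot be used for computation, since it involves knowledge of the DVI solution. However, it may be used to express necessary and sufficient conditions for the solution of $DVI(\Psi, \Theta, t_0, t_f)$. In particular, the Hamiltonian for $DVI(\Psi, \Theta, t_0, t_f)$ is
$$H = \sum_{(i,j) \in \mathcal{W}} \sum_{p \in \mathcal{P}_{ij}} \Psi_p(t, h^*) h_p + \sum_{(i,j) \in \mathcal{W}} \lambda_{ij} \sum_{p \in \mathcal{P}_{ij}} h_p = \sum_{(i,j) \in \mathcal{W}} \sum_{p \in \mathcal{P}_{ij}} \left[ \Psi_p(t, h^*) + \lambda_{ij} \right] h_p$$
where the adjoint equations are
$$\frac{d\lambda_{ij}}{dt} = -\frac{\partial H}{\partial Q_{ij}} = 0 \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.12}$$
with transversality conditions
$$\lambda_{ij}(t_f) = \frac{\partial \left( -\sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)]\, Q_{ij}(t_f) \right)}{\partial Q_{ij}(t_f)} = -\Theta_{ij}[Q^*(t_f)] \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.13}$$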

It is clear from (2.12) and (2.13) that
$$\lambda_{ij}(t) = -\Theta_{ij}[Q^*(t_f)], \text{ a constant}$$
We note that the Hamiltonian is linear in $h$ and does not depend explicitly on the state variables. By Theorem 3.7 of Friesz (2010), the Mangasarian sufficiency theorem assures the minimum principle and associated necessary conditions are also sufficient. Since $h$ is a control vector and must obey the minimum principle in $\mathbb{R}^{|\mathcal{P}|}$ for each instant of time, we enforce
$$\min_h H \quad \text{s.t.} \quad h \ge 0$$
for which the Kuhn-Tucker conditions are
$$\Psi_p(t, h^*) - \Theta_{ij}[Q^*(t_f)] = \rho_p \ge 0 \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.14}$$
where the $\rho_p$ are dual variables satisfying the complementary slackness conditions
$$\rho_p h^*_p = 0 \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.15}$$
From (2.14) and (2.15) we have immediately
$$h^*_p > 0,\ p \in \mathcal{P}_{ij} \implies \Psi_p(t, h^*) = \Theta_{ij}[Q^*(t_f)]$$
$$\Psi_p(t, h^*) > \Theta_{ij}[Q^*(t_f)],\ p \in \mathcal{P}_{ij} \implies h^*_p = 0$$
which are recognized as conditions describing a dynamic user equilibrium. We have noted above that (2.14) is equivalent to a dynamic user equilibrium; thus, showing that (2.14) corresponds to a solution of $DVI(\Psi, \Theta, t_0, t_f)$ will complete the demonstration that $DVI(\Psi, \Theta, t_0, t_f)$ is equivalent to a dynamic user equilibrium. In particular note that
$$\rho_p (h_p - h^*_p) = \rho_p h_p - \rho_p h^*_p = \rho_p h_p \ge 0$$
so that
$$\left\{ \Psi_p(t, h^*) - \Theta_{ij}[Q^*(t_f)] \right\} (h_p - h^*_p) \ge 0 \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.16}$$
where $h, h^* \in \Lambda$. From (2.16) we obtain
$$\Psi_p(t, h^*)(h_p - h^*_p) - \Theta_{ij}[Q^*(t_f)](h_p - h^*_p) \ge 0 \quad \forall (i, j) \in \mathcal{W},\ p \in \mathcal{P}_{ij},\ t \in [t_0, t_f] \tag{2.17}$$
From (2.17), we obtain
$$\sum_{p \in \mathcal{P}} \int_{t_0}^{t_f} \Psi_p(t, h^*)(h_p - h^*_p)\, dt - \sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)] \int_{t_0}^{t_f} \left( \sum_{p \in \mathcal{P}_{ij}} h_p - \sum_{p \in \mathcal{P}_{ij}} h^*_p \right) dt \ge 0 \tag{2.18}$$
Therefore
$$\sum_{p \in \mathcal{P}} \int_{t_0}^{t_f} \Psi_p(t, h^*)(h_p - h^*_p)\, dt - \sum_{(i,j) \in \mathcal{W}} \Theta_{ij}[Q^*(t_f)] \int_{t_0}^{t_f} \left( \frac{dQ_{ij}}{dt} - \frac{dQ^*_{ij}}{dt} \right) dt \ge 0, \tag{2.19}$$
from which $DVI(\Psi, \Theta, t_0, t_f)$ is obtained immediately, since $Q_{ij}(t_0) = Q^*_{ij}(t_0) = 0$. The preceding analysis has established the validity of the following theorem:

Theorem 1 Elastic demand dynamic user equilibrium is equivalent to a differential variational inequality. Assume $\Psi_p(\cdot, h) : [t_0, t_f] \to \mathbb{R}^1_{++}$ is measurable and strictly positive for all $p \in \mathcal{P}$ and all $h \in \Lambda$. Also assume that the elastic travel demand function is invertible, with inverse $\Theta_{ij}(Q)$ for all $(i, j) \in \mathcal{W}$. A vector of departure rates (path flows) $h^* \in \Lambda$ is a dynamic user equilibrium with associated demand $Q^*(t_f)$ if and only if $h^*$ solves $DVI(\Psi, \Theta, t_0, t_f)$.

2.5 Conclusion

We have shown dynamic network user equilibrium based on simultaneous departure time and route choice in the presence of elastic travel demand may be formulated as a differential variational inequality (DVI). The DVI formulation presented herein lays the foundation for computing DUE solutions using methods like those presented in Friesz et al. (2011) for path delay operators determined by any model of network loading. In particular, a fixed point algorithm is presented in Friesz et al. (2011) for solving DVIs that express the notion of DUE with fixed demand; its convergence may only be assured when certain likely-to-be-violated restrictions on the shape of the effective delay operators are invoked. It remains to demonstrate that a fixed point algorithm will be successful for solving the elastic demand formulation presented in this paper.
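As a small numerical illustration of the conditions (2.14) and (2.15) from Section 2.4, the sketch below checks them on a time grid for one origin-destination pair. It is a verification aid under stated assumptions, not a solution algorithm: the linear inverse demand function, the departure rates, and the effective delays are all hypothetical, and the trapezoidal rule approximates the terminal demand.

```python
import numpy as np

def inverse_demand(q, intercept=5.0, slope=1.5):
    """Hypothetical linear inverse demand Theta(Q) = intercept - slope * Q."""
    return intercept - slope * q

def kkt_residuals(h, psi, theta_at_q):
    """Return (min rho, max |rho * h|) with rho_p(t) = Psi_p(t, h) - Theta[Q(t_f)].
    Conditions (2.14)-(2.15) require min rho >= 0 and rho * h = 0 everywhere."""
    rho = psi - theta_at_q
    return float(rho.min()), float(np.abs(rho * h).max())

t = np.linspace(0.0, 2.0, 5)
h = np.array([[0.0, 1.0, 1.0, 0.0, 0.0],     # assumed candidate flows, path 1
              [0.0, 0.0, 1.0, 1.0, 0.0]])    # assumed candidate flows, path 2
psi = np.array([[4.0, 2.0, 2.0, 3.0, 4.0],   # assumed effective delays, path 1
                [4.0, 3.0, 2.0, 2.0, 4.0]])  # assumed effective delays, path 2

# Terminal demand Q(t_f) from the dynamics dQ/dt = sum_p h_p, Q(t0) = 0.
total_rate = h.sum(axis=0)
q_tf = float(np.sum(0.5 * (total_rate[1:] + total_rate[:-1]) * np.diff(t)))

theta = inverse_demand(q_tf)
min_rho, max_slackness = kkt_residuals(h, psi, theta)
```

For these assumed data the minimum effective delay on used choices coincides with the inverse demand evaluated at the realized terminal demand, so both residuals vanish; with real delay operators the delays would have to be recomputed from `h` before such a check is meaningful.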

Chapter 3

Deterministic Differential Stackelberg Games

3.1 The Notion of Nash and Stackelberg Equilibria

Game theory provides a framework for modeling the interaction between groups of players whose utility functions and sets of feasible strategies are related. There are several behavioral assumptions one may employ in modeling a game. The aim of this section is to present two of these behavioral models: namely, Nash (1951) and Stackelberg (1934) games. Both Nash and Stackelberg games feature noncooperative equilibrium solutions. In such a solution, no player can improve their utility function by deviating from their equilibrium strategy (Simaan, 1977a). Many real-world problems can be formulated using these behavioral models, including several applications in supply chain, logistics, and transportation such as those of Dafermos and Nagurney (1987); Henderson and Quandt (1980); and Friesz et al. (2006).

3.1.1 Nash Equilibrium

Nash (1951) generalized the concept of equilibrium for the behavioral model consisting of $N$ players who cannot improve their own payoff by deviating from their Nash equilibrium strategy, given that the other players use their equilibrium strategies. Suppose there are $N$ players in a game and each player $i \in \{1, 2, \ldots, N\}$ chooses a feasible strategy tuple $x_i$ from the strategy set $\Omega_i$ to maximize the utility function $\Theta_i : \Omega_i \to \mathbb{R}^1$, which will generally depend on other players' strategies. Every player $i \in \{1, 2, \ldots, N\}$ solves his/her best response problem:
$$\max_{x_i} \Theta_i(x_i; x_{-i}) \quad \text{subject to: } x_i \in \Omega_i \tag{3.1}$$
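The best response problem (3.1) can be illustrated with a two-player quantity-setting (Cournot) game, in which each player repeatedly solves its own problem given the other's current strategy. The linear price function, the cost, the capacity bound, and every name in the sketch are assumptions made for illustration; the text itself does not specify a particular game or solution method.

```python
import numpy as np

A_INTERCEPT, B_SLOPE, UNIT_COST, CAPACITY = 10.0, 1.0, 1.0, 8.0  # assumed data

def utility(x_i, x_other):
    """Profit of a player producing x_i when the rival produces x_other."""
    price = A_INTERCEPT - B_SLOPE * (x_i + x_other)
    return x_i * (price - UNIT_COST)

def best_response(x_other):
    """Maximizer of the concave utility over Omega_i = [0, CAPACITY]."""
    unconstrained = (A_INTERCEPT - UNIT_COST - B_SLOPE * x_other) / (2.0 * B_SLOPE)
    return float(np.clip(unconstrained, 0.0, CAPACITY))

def nash_by_best_response(x0, iterations=60):
    """Simultaneous best response iteration for the two players."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iterations):
        x = np.array([best_response(x[1]), best_response(x[0])])
    return x

x_nash = nash_by_best_response([0.0, 0.0])

# No unilateral deviation on a grid of feasible strategies should improve utility.
grid = np.linspace(0.0, CAPACITY, 81)
best_deviation_gain = max(
    utility(grid, x_nash[1 - i]).max() - utility(x_nash[i], x_nash[1 - i])
    for i in range(2)
)
```

The iteration converges here because each best response map has slope $-1/2$ in the rival's quantity; best response iteration need not converge for general utility functions, which is one reason variational inequality formulations are of interest.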
