The principle of least action and two-point boundary value problems in orbital mechanics

Seung Hak Han and William M. McEneaney

Research partially supported by AFOSR. S. H. Han and W. M. McEneaney are with the Dept. of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, CA 92093-0411, USA. Emails: shh013@ucsd.edu and wmceneaney@ucsd.edu.

Abstract— We consider a two-point boundary value problem (TPBVP) in orbital mechanics involving a small body (e.g., a spacecraft or asteroid) and N larger bodies. The least-action-principle formulation of the TPBVP is converted into an initial value problem via the addition of an appropriate terminal cost to the action functional. The latter formulation is used to obtain a fundamental solution, which may be used to solve the TPBVP for a variety of boundary conditions within a certain class. In particular, the method of convex duality allows one to interpret the least action principle as a differential game, where an opposing player maximizes over an indexed set of quadratics to yield the gravitational potential. The fundamental solution is obtained as a set of solutions of associated Riccati equations.

I. INTRODUCTION

We examine the motion of a single body under the influence of the gravitational potential generated by N other celestial bodies, where the mass of the first body is negligible relative to the masses of the other bodies, and we suppose that the N large bodies are on known trajectories. The single, small body follows a trajectory satisfying the principle of stationary action (cf. [5], [6]), where under certain conditions, the stationary-action trajectory coincides with the least-action trajectory. This allows problems in dynamics to be posed, instead, in terms of optimal control problems with vastly simplified dynamics. In particular, one may convert two-point boundary-value problems (TPBVPs) for a dynamical system into initial value problems. From the solution of a reachability problem, we develop the fundamental solution for a class of TPBVPs. Although the gravitational potential does not take the form of a quadratic function, one may take a dynamic game approach, which allows the inner optimization problem to be posed in a linear-quadratic form. In particular, the problem is converted into a differential game where one player controls the velocity, and the opposing player controls the potential energy term (cf. [1], [3]).

It will be demonstrated that, for the case where the time duration is less than a specified bound, the action functional is strictly convex in the velocity control. For any potential energy control, the minimizing trajectory is the unique stationary point, and the least action is obtained by solution of associated Riccati equations. The fundamental solution takes the form of a finite-dimensional set of Riccati equation solutions. For any TPBVP in a certain class, the solution is obtained by taking the maximum of a corresponding linear functional over that same finite-dimensional set.

II. PROBLEM STATEMENT AND FUNDAMENTAL SOLUTION

We consider a single body, moving among a set of N other bodies in space, IR^n = IR^3. The only forces to be considered are gravitational. The single body has negligible mass in relation to the masses of the other bodies, and consequently has no effect on their motion. In particular, we suppose that the N bodies are moving along already-known trajectories. We will obtain fundamental solutions of TPBVPs for the motion of the small body. Although the small body might be any object, for the sake of concreteness, henceforth we will refer to it as the spacecraft. The set of N bodies may be indexed as N = {1, 2, ..., N}. Throughout, for integers a ≤ b, we will use ⟦a, b⟧ to denote {a, a+1, ..., b−1, b}. Let ρ_i and R_i denote the (uniform) density and radius of each body i ∈ N. Obviously, the mass of each body is given by m_i = (4/3)πρ_i R_i^3. Let ζ^i_r = ζ^i(r) denote the position of the center of body i at time r ∈ [0,∞). We suppose that ζ = {ζ^i}_{i∈N} ∈ Ẑ = {{ζ^i}_{i∈N} | ζ^i ∈ C([0,∞); IR^n) ∀ i ∈ N}. We assume that collision between bodies does not occur. So, letting Y = {{y^i}_{i∈N} ∈ IR^{nN} | |y^i − y^j| > 2R̄ ∀ i ≠ j}, we define the subset of Ẑ given by Z = {ζ ∈ Ẑ | ζ ∈ C([0,∞); Y)}, where R̄ = max_{i∈N} R_i. For simplicity, the spacecraft is considered as a point particle with mass m_s. Suppose that the position of the spacecraft at time r is denoted by ξ_r, where also, we will use x ∈ IR^n to denote generic position values.

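All of the data defining the model are each body's uniform density, its radius, and its known center trajectory, from which the masses and the non-collision sets Y and Z follow. A minimal bookkeeping sketch in Python, with illustrative (assumed) names and numerical values, is the following.

    import math

    # One of the N large bodies; field names and sample values are illustrative
    # assumptions, not taken from the paper.
    class Body:
        def __init__(self, density, radius, center_traj):
            self.rho = density          # uniform density rho_i
            self.R = radius             # radius R_i
            self.zeta = center_traj     # known center trajectory zeta^i(r), r >= 0
            self.m = 4.0 / 3.0 * math.pi * density * radius**3   # m_i = (4/3) pi rho_i R_i^3

    def in_Y(bodies, r):
        """Check, at time r, the non-collision condition defining Y:
        centers pairwise farther apart than 2*Rbar, with Rbar = max_i R_i."""
        Rbar = max(b.R for b in bodies)
        pos = [b.zeta(r) for b in bodies]
        return all(math.dist(pos[i], pos[j]) > 2.0 * Rbar
                   for i in range(len(pos)) for j in range(i + 1, len(pos)))

    # Two bodies on simple assumed trajectories (n = 3).
    bodies = [
        Body(5.5e3, 6.4e6, lambda r: (0.0, 0.0, 0.0)),
        Body(3.3e3, 1.7e6, lambda r: (3.8e8 * math.cos(2.7e-6 * r),
                                      3.8e8 * math.sin(2.7e-6 * r), 0.0)),
    ]
    print([b.m for b in bodies], in_Y(bodies, 0.0))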
Let t > 0. We model the dynamics of the spacecraft position as
ξ̇_r = u_r, ξ_0 = x, r ∈ (0,t), (1)
where u = u· ∈ U_{0,t}, with U_{s,t} = L₂((s,t); IR^n). The kinetic energy, T, is given by
T(v) = ½ v^T m_s I_n v = (m_s/2)|v|², ∀ v ∈ IR^n, (2)
where I_n denotes the identity matrix of size n. Given i ∈ N and Y = {y^i}_{i∈N} ∈ Y, the potential energy between the spacecraft at x and body i at y^i, V^i(x,y^i), is given by
V^i(x,y^i) = Gm_i m_s (3R_i² − |x−y^i|²)/(2R_i³) if x ∈ B_{R_i}(y^i), and V^i(x,y^i) = Gm_i m_s/|x−y^i| if x ∉ B_{R_i}(y^i),
where G is the universal gravitational constant. We define the total potential energy V: IR^n × Y → IR as V(x,Y) = Σ_{i∈N} V^i(x,y^i). We remark that we include the gravitational potential here within the extended bodies, as the finiteness and smoothness of the potential are relevant at technical points in the theory, in spite of the infeasibility of spacecraft trajectories that pass through the bodies.

We remind the reader that we will obtain fundamental solutions for the TPBVPs through a game-theoretic formulation. The game will appear through application of a generalization of convex duality to a control-problem formulation. With that in mind, we define the action functional, J⁰: [0,∞) × IR^n × U_{0,t} × Z → IR, as
J⁰(t,x,u,ζ) = ∫_0^t T̄(u_r) + V̄(ξ_r,ζ_r) dr, (3)
where V̄ = V/m_s and T̄ = T/m_s. Attaching a terminal cost to J⁰ will yield a TPBVP, where we will control the terminal condition in the TPBVP with this terminal cost, and we will have initial condition ξ_0 = x. For background on this approach to TPBVPs for conservative systems, see [11]. Given a generic terminal cost, ψ: IR^n → IR, let
J(t,x,u,ζ) = J⁰(t,x,u,ζ) + ψ(ξ_t),
W(t,x,ζ) = inf_{u∈U_{0,t}} J(t,x,u,ζ). (4)
Suppose that the desired destination of the spacecraft is denoted by z ∈ IR^n, i.e., ξ_t = z. Then, we define a reachability problem of interest via the value function W̄: [0,∞) × IR^n × Z × IR^n → IR given by
W̄(t,x,ζ,z) = inf_{u∈U_{0,t}} { J⁰(t,x,u,ζ) | (1) holds with ξ_0 = x, ξ_t = z }, (5)
and the function Ŵ: [0,∞) × IR^n × Z → IR given by
Ŵ(t,x,ζ) = inf_{z∈IR^n} { W̄(t,x,ζ,z) + ψ(z) }. (6)

Theorem 2.1: The value function W of (4) and the function Ŵ of (6) are equivalent. That is, W(t,x,ζ) = Ŵ(t,x,ζ) for all t ≥ 0, x ∈ IR^n, and ζ ∈ Z.

We note that due to paper-length issues, we include only a subset of the proofs here. In particular, we include critical, non-trivial proofs only.

It is seen that, given W̄, the value function W of (4) for any terminal cost ψ can be evaluated via (6). Therefore, W̄ may be regarded as fundamental to the solution of TPBVPs arising from the given orbital dynamics. For the development of the fundamental solution, it is useful to first introduce a terminal cost which takes the form of a min-plus delta-function. Let ψ^∞: IR^n × IR^n → [0,∞] (where we take [0,∞] = [0,∞) ∪ {+∞}) be given by ψ^∞(y,z) = δ^∞(y−z), where δ^∞ denotes the min-plus delta-function (cf. [4], [10]) given by
δ^∞(y) = 0 if y = 0, and +∞ otherwise.
We define the finite time-horizon payoff, J^∞: [0,∞) × IR^n × U_{0,t} × Z × IR^n → [0,∞], by
J^∞(t,x,u,ζ,z) = J⁰(t,x,u,ζ) + ψ^∞(ξ_t,z),
and the corresponding value function as
W^∞(t,x,ζ,z) = inf_{u∈U_{0,t}} J^∞(t,x,u,ζ,z), (7)
where ξ satisfies (1). Then, we have:

Theorem 2.2: The value function W^∞ of (7) and W̄ of (5) are equivalent. That is,
W^∞(t,x,ζ,z) = W̄(t,x,ζ,z) < ∞ (8)
for all t > 0, x,z ∈ IR^n and ζ ∈ Z.

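The two-branch potential above is finite and smooth across each body boundary. Per unit spacecraft mass it is Gm_i/|x−y^i| outside body i and Gm_i(3R_i² − |x−y^i|²)/(2R_i³) inside; a direct transcription, with illustrative function names and values, is:

    import math

    G = 6.674e-11   # universal gravitational constant

    def Vbar_i(x, y_i, m_i, R_i):
        """Per-unit-mass potential V^i(x, y^i)/m_s for a single uniform body:
        Gm_i/|x - y^i| outside the ball B_{R_i}(y^i), and
        Gm_i (3 R_i^2 - |x - y^i|^2) / (2 R_i^3) inside, so that the two
        branches agree (value Gm_i/R_i) at |x - y^i| = R_i."""
        d = math.dist(x, y_i)
        if d >= R_i:
            return G * m_i / d
        return G * m_i * (3.0 * R_i**2 - d**2) / (2.0 * R_i**3)

    def Vbar(x, Y, masses, radii):
        """Total per-unit-mass potential V(x, Y)/m_s = sum_i V^i(x, y^i)/m_s."""
        return sum(Vbar_i(x, y, m, R) for y, m, R in zip(Y, masses, radii))

    # Continuity check at the boundary of an assumed Moon-like body.
    m1, R1, y1 = 7.35e22, 1.74e6, (0.0, 0.0, 0.0)
    print(Vbar_i((R1, 0.0, 0.0), y1, m1, R1), G * m1 / R1)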
III. OPTIMAL CONTROL PROBLEM

For the development of fundamental solutions to the optimal control problem (4), we will define the value function W^c corresponding to a quadratic terminal cost ψ_c, and demonstrate that the limit property lim_{c→∞} W^c = W̄ holds. For c ∈ [0,∞), let ψ_c: IR^n × IR^n → [0,∞) be given by ψ_c(x,z) = (c/2)|x−z|².

We define the finite time-horizon payoff, J^c: [0,∞) × IR^n × U_{0,t} × Z × IR^n → IR, by
J^c(t,x,u,ζ,z) = J⁰(t,x,u,ζ) + ψ_c(ξ_t,z), (9)
where J⁰ is given by (3), and its corresponding value function by
W^c(t,x,ζ,z) = inf_{u∈U_{0,t}} J^c(t,x,u,ζ,z), (10)
where (1) holds. From the definition of V̄, one can easily see that there exist K_L, K̄_L < ∞ such that
|V̄(x,Y) − V̄(x̂,Y)| ≤ K_L |x − x̂| (L1)
for all x, x̂ ∈ IR^n and Y ∈ Y, and
V̄(x,Y) ≤ K̄_L (1 + |x|²) (L2)
for all x ∈ IR^n and Y ∈ Y.

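The payoff (9) is the action (3) plus the quadratic terminal penalty, so for a given control it can be approximated by elementary quadrature along the trajectory generated by (1). A rough sketch, assuming a single attracting body (outside-the-body branch only), a uniform time grid, and illustrative values, is given below; the control used is the straight-line control that reappears in Section III-A.

    import numpy as np

    G = 6.674e-11

    def payoff_Jc(t, x, z, u, zeta, m_i, c, M=2000):
        """Approximate J^c(t,x,u,zeta,z) = int_0^t [ |u_r|^2/2 + Vbar(xi_r, zeta_r) ] dr
        + (c/2)|xi_t - z|^2 for a single attracting body of mass m_i (using only the
        outside-the-body branch of the potential), with forward-Euler state updates
        and left-endpoint quadrature on M uniform steps."""
        dr = t / M
        xi = np.asarray(x, dtype=float)
        J = 0.0
        for k in range(M):
            r = k * dr
            ur = np.asarray(u(r), dtype=float)
            Vbar = G * m_i / np.linalg.norm(xi - zeta(r))    # Gm_i / |xi_r - zeta_r|
            J += (0.5 * float(ur @ ur) + Vbar) * dr
            xi = xi + dr * ur                                # dynamics (1): xi_dot = u
        return J + 0.5 * c * float(np.sum((xi - np.asarray(z))**2))

    # Straight-line control from x to z over [0, t], around a stationary Moon-mass
    # body at the origin; all numbers are illustrative.
    x, z, t = np.array([5.0e6, 0.0, 0.0]), np.array([0.0, 5.0e6, 0.0]), 600.0
    u_s = lambda r: (z - x) / t
    print(payoff_Jc(t, x, z, u_s, zeta=lambda r: np.zeros(3), m_i=7.35e22, c=1.0))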
A. A limit property

Let t > 0. Suppose that, given x, z ∈ IR^n, the straight-line control from x to z is given by u^s_r = (1/t)(z − x) for all r ∈ [0,t], and we let the corresponding trajectory be denoted by ξ^s. Then,
ξ^s_r = x + ∫_0^r u^s_ρ dρ, (11)
so that |ξ^s_r| ≤ |x| + (r/t)|z − x| ≤ 2(|x| + |z|), and hence |ξ^s_r|² ≤ D₁(|x|² + |z|²) for all r ∈ [0,t], where D₁ = 8. Hence, by (L2),
∫_0^t V̄(ξ^s_r, ζ_r) dr ≤ K̄_L ∫_0^t (1 + |ξ^s_r|²) dr = K̄_L [t + ∫_0^t |ξ^s_r|² dr] ≤ K̄_L [t + t D₁ (|x|² + |z|²)],
which, since D₁ ≥ 1,
≤ K̄_L t D₁ (1 + |x|² + |z|²) = D₂(t)(1 + |x|² + |z|²). (12)
Noting that ξ^s_t = z, given c ∈ [0,∞),
W^s(t,x,ζ,z) = J^c(t,x,u^s,ζ,z) = J⁰(t,x,u^s,ζ),
which by the definition of u^s and (12),
≤ (1/(2t))|z − x|² + D₂(t)(1 + |x|² + |z|²) ≤ (1/(2t))(|x|² + 2|x||z| + |z|²) + D₂(t)(1 + |x|² + |z|²),
which, since 2ab ≤ a² + b² for all a, b,
≤ (1/t)(|x|² + |z|²) + D₂(t)(1 + |x|² + |z|²),
which implies that there exists D₃ = D₃(t) < ∞ such that
W^s(t,x,ζ,z) = J^c(t,x,u^s,ζ,z) ≤ D₃(1 + |x|² + |z|²). (13)
Suppose that, given c ∈ [0,∞) and ε ∈ (0,1], there exists u^{c,ε} ∈ U_{0,t} such that
J^c(t,x,u^{c,ε},ζ,z) ≤ W^c(t,x,ζ,z) + ε. (14)
Let ξ^{c,ε} be the trajectory corresponding to u^{c,ε}. By the non-negativity of T̄ and V̄,
(c/2)|ξ^{c,ε}_t − z|² ≤ J^c(t,x,u^{c,ε},ζ,z) ≤ W^c(t,x,ζ,z) + ε,
which by the sub-optimality of u^s with respect to W^c,
≤ W^s(t,x,ζ,z) + ε,
which by (13),
≤ D₃(1 + |x|² + |z|²) + 1 ≤ (D₃ + 1)(1 + |x|² + |z|²), (15)
which implies that there exists D = D(t) < ∞ such that
|ξ^{c,ε}_t − z| ≤ [(2D/c)(1 + |x|² + |z|²)]^{1/2}. (16)
Let
û^{c,ε}_r = u^{c,ε}_r + (1/t)(z − ξ^{c,ε}_t), r ∈ [0,t], (17)
which yields ξ̂^{c,ε}_t = z, where ξ̂^{c,ε} denotes the trajectory corresponding to û^{c,ε}. Then, by (16) and (17),
|ξ^{c,ε}_r − ξ̂^{c,ε}_r| ≤ ∫_0^r (1/t)|z − ξ^{c,ε}_t| dρ = (r/t)|z − ξ^{c,ε}_t| ≤ [(2D/c)(1 + |x|² + |z|²)]^{1/2} (18)
for all r ∈ [0,t]. Also, by (L1) and (18),
|∫_0^t V̄(ξ^{c,ε}_r, ζ_r) − V̄(ξ̂^{c,ε}_r, ζ_r) dr| ≤ K_L ∫_0^t |ξ^{c,ε}_r − ξ̂^{c,ε}_r| dr ≤ K_L √(2D) (1 + |x|² + |z|²) t/√c. (19)
By non-negativity of V̄ and ψ_c,
½‖u^{c,ε}‖²_{L₂(0,t)} = ∫_0^t T̄(u^{c,ε}_r) dr ≤ J^c(t,x,u^{c,ε},ζ,z) ≤ W^c(t,x,ζ,z) + ε,
which implies, by (15), that
‖u^{c,ε}‖_{L₂(0,t)} ≤ [2D(1 + |x|² + |z|²)]^{1/2}. (20)
Noting that |a|² − |b|² ≤ |a − b| |a + b| for a, b ∈ IR^n,
|∫_0^t T̄(u^{c,ε}_r) − T̄(û^{c,ε}_r) dr| ≤ ½ ∫_0^t |u^{c,ε}_r − û^{c,ε}_r| |u^{c,ε}_r + û^{c,ε}_r| dr,
which by (17) and the triangle inequality,
≤ ½ ∫_0^t (1/t)|z − ξ^{c,ε}_t| [2|u^{c,ε}_r| + (1/t)|z − ξ^{c,ε}_t|] dr,
which by applying Hölder's inequality,
≤ (1/√t) |z − ξ^{c,ε}_t| ‖u^{c,ε}‖_{L₂(0,t)} + (1/(2t)) |z − ξ^{c,ε}_t|²,
which by (16) and (20),
≤ [2D/√(ct) + D/(ct)] (1 + |x|² + |z|²) ≤ D̂(t)(1 + |x|² + |z|²)/√c (21)
for all x, z ∈ IR^n and all c ∈ [4,∞), with an appropriate choice of D̂ = D̂(t) < ∞. Therefore, by (19), (21), and non-negativity of ψ_c, we have
J^c(t,x,u^{c,ε},ζ,z) ≥ J^c(t,x,û^{c,ε},ζ,z) − D̂(t)(1 + |x|² + |z|²)/√c − K_L √(2D)(1 + |x|² + |z|²) t/√c ≥ J^c(t,x,û^{c,ε},ζ,z) − D₄(t)(1 + |x|² + |z|²)/√c
for a proper choice of D₄(t) < ∞. Since ξ̂^{c,ε}_t = z, this implies
J^c(t,x,u^{c,ε},ζ,z) ≥ W̄(t,x,ζ,z) − D₄(t)(1 + |x|² + |z|²)/√c,
and further, since this is true for all ε ∈ (0,1], by (14),
W^c(t,x,ζ,z) ≥ W̄(t,x,ζ,z) − D₄(t)(1 + |x|² + |z|²)/√c. (22)
Further, by the definitions of W^c and W̄ and (22),
W̄(t,x,ζ,z) − D₄(t)(1 + |x|² + |z|²)/√c ≤ W^c(t,x,ζ,z) ≤ W̄(t,x,ζ,z)
for all x, z ∈ IR^n, ζ ∈ Z, t > 0, and c ∈ [4,∞). This immediately implies:

Lemma 3.1: For t ∈ (0, t̄),
W̄(t,x,ζ,z) = lim_{c→∞} W^c(t,x,ζ,z) = sup_{c∈[0,∞)} W^c(t,x,ζ,z),
where the convergence is uniform on compact subsets of (0, t̄) × IR^n × Z × IR^n.

B. Dynamic game approach to gravitation

Recall that V̄ does not take a linear-quadratic form in the position variable. In the case where the potential energy takes a linear-quadratic form, the fundamental solution will be obtained through the solution of associated Riccati equations. Therefore, one may take a dynamic game approach to gravitation, whereby opponent controls maximize over an indexed set of quadratics to yield the potential. This requires an additional max-plus integral, over the opponent controls, beyond that which is required in the purely linear-quadratic potential case.

Consider f: (0,∞) → IR given by f(d̂) = d̂^{−1/2}. Then, by convex duality (cf. [7], [8], [9]), we may represent
f(d̂) = sup_{β̂<0} [β̂ d̂ + a(β̂)], d̂ ∈ (0,∞), (23)
where
a(β̂) = − sup_{d̂>0} [β̂ d̂ − f(d̂)], β̂ ∈ (−∞,0).
Further, since β̂ d̂ − f(d̂) is strictly concave in d̂, the maximum is achieved at d̂ = (−2β̂)^{−2/3}, so that a(β̂) = (3/2)(−2β̂)^{1/3} for β̂ ∈ (−∞,0). Letting β = −β̂ and d = d̂, (23) becomes
d^{−1/2} = sup_{β>0} [(3/2)(2β)^{1/3} − βd], d > 0. (24)
We note that (3/2)(2β)^{1/3} − βd is strictly concave in β > 0, and the maximum is attained at β = 1/(2d^{3/2}). Therefore, we may rewrite (24) as
d^{−1/2} = sup_{β∈(0, 1/(2d^{3/2})]} [(3/2)(2β)^{1/3} − βd], d > 0. (25)
For the case that |x − y^i| ≥ R_i, applying (25) with d = |x − y^i|²,
V̄^i(x,y^i) = V^i(x,y^i)/m_s = Gm_i/|x − y^i| = Gm_i sup_{β∈(0, 1/(2R_i³)]} [(3/2)(2β)^{1/3} − β|x − y^i|²]. (26)
We note that for x ∈ B_{R_i}(y^i), letting β̄ = 1/(2R_i³), we also obtain the same form as the argument of the supremum in (26), i.e.,
V̄^i(x,y^i) = Gm_i (3R_i² − |x − y^i|²)/(2R_i³) = Gm_i [(3/2)(2β̄)^{1/3} − β̄|x − y^i|²]. (27)
Therefore, combining (26) and (27) yields
V̄^i(x,y^i) = Gm_i sup_{β∈(0, 1/(2R_i³)]} [(3/2)(2β)^{1/3} − β|x − y^i|²]
for all x, y^i ∈ IR^n. Letting α̂ = √(2/3) (2β)^{1/3}, we find
V̄^i(x,y^i) = sup_{α̂∈(0, √(2/3)/R_i]} μ_i [α̂ − (α̂³/2)|x − y^i|²] (28)
for all x, y^i ∈ IR^n, where μ_i = Gm_i (3/2)^{3/2}.

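The scalar identity underlying (24)–(28), 1/√d = sup_{β>0}[(3/2)(2β)^{1/3} − βd] with maximizer β = 1/(2d^{3/2}), and the reparameterized form with α̂ = √(2/3)(2β)^{1/3} and μ_i = Gm_i(3/2)^{3/2}, can be checked numerically; the following sketch uses an illustrative grid and sample values.

    import math

    def dual_sup(d, n_grid=200000):
        """Grid evaluation of sup_{beta>0} [ (3/2)(2 beta)^(1/3) - beta d ]."""
        beta_max = 4.0 / (2.0 * d**1.5)          # comfortably contains the maximizer
        best = -float("inf")
        for k in range(1, n_grid + 1):
            beta = beta_max * k / n_grid
            best = max(best, 1.5 * (2.0 * beta)**(1.0 / 3.0) - beta * d)
        return best

    d = 2.7
    print(dual_sup(d), 1.0 / math.sqrt(d))        # agree to grid accuracy
    print(1.0 / (2.0 * d**1.5))                   # maximizer beta* = 1/(2 d^{3/2})

    # Reparameterization as in (28): with d = |x - y|^2, alpha = sqrt(2/3)(2 beta)^(1/3)
    # and mu = G m (3/2)^(3/2), the optimal value mu (alpha - alpha^3 d / 2) equals G m / |x - y|.
    G, m, dist = 6.674e-11, 7.35e22, 2.0e6
    mu = G * m * 1.5**1.5
    alpha_star = math.sqrt(2.0 / 3.0) / dist      # unconstrained maximizer for |x - y| = dist
    print(mu * (alpha_star - alpha_star**3 * dist**2 / 2.0), G * m / dist)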
C. Revisited payoff

Let
A = {α = {α^i}_{i∈N} | α^i ∈ (0, √(2/3)/R_i] ∀ i ∈ N}.
Then, using (28), the potential energy V̄ may be represented by
V̄(x,Y) = Σ_{i∈N} V̄^i(x,y^i) = max_{α∈A} V̆(x,Y,α), (29)
where
V̆(x,Y,α) = Σ_{i∈N} μ_i [α^i − ((α^i)³/2)|x − y^i|²].
Further, (9) may be written as
J^c(t,x,u,ζ,z) = ∫_0^t T̄(u_r) + V̄(ξ_r,ζ_r) dr + ψ_c(ξ_t,z) (30)
 = ∫_0^t T̄(u_r) + max_{α∈A} V̆(ξ_r,ζ_r,α) dr + ψ_c(ξ_t,z). (31)
Given t > 0, let A_t = C([0,t]; A). Also, we replace the time-independent potential energy function V̆ with
V̄^α(r,x,Y) = V̆(x,Y,α_r) = Σ_{i=1}^N μ_i [α^i_r − ((α^i_r)³/2)|x − y^i|²]. (32)
Given c ∈ [0,∞), let Ĵ^c: [0,∞) × IR^n × U_{0,t} × A_t × Z × IR^n → IR be given by
Ĵ^c(t,x,u,α,ζ,z) = ∫_0^t T̄(u_r) + V̄^α(r,ξ_r,ζ_r) dr + ψ_c(ξ_t,z). (33)

Theorem 3.2: Let t > 0, x, z ∈ IR^n, and ζ ∈ Z. For any u ∈ U_{0,t},
J^c(t,x,u,ζ,z) = max_{α∈A_t} Ĵ^c(t,x,u,α,ζ,z). (34)
Further,
W^c(t,x,ζ,z) = inf_{u∈U_{0,t}} max_{α∈A_t} Ĵ^c(t,x,u,α,ζ,z). (35)

Proof: Given u ∈ U_{0,t}, let ξ denote the state trajectory corresponding to u with ξ_0 = x ∈ IR^n. Given t > 0, x, z ∈ IR^n and ζ ∈ Z, any α_r ∈ A is suboptimal in the maximization in (31) for any r ∈ [0,t] and any α ∈ A_t, and in particular,
J^c(t,x,u,ζ,z) = ∫_0^t T̄(u_r) + max_{α̃∈A} V̆(ξ_r,ζ_r,α̃) dr + ψ_c(ξ_t,z) ≥ max_{α∈A_t} [∫_0^t T̄(u_r) + V̄^α(r,ξ_r,ζ_r) dr + ψ_c(ξ_t,z)] = max_{α∈A_t} Ĵ^c(t,x,u,α,ζ,z). (36)
Let Y ∈ Y, and let ᾱ*: IR^n × IR^{nN} → A be given by ᾱ*(x,Y) = {ᾱ^i(x,y^i)}_{i∈N}, where
ᾱ^i(x,y^i) = argmax_{α∈(0, √(2/3)/R_i]} μ_i [α − (α³/2)|x − y^i|²] (37)
for all x, y^i ∈ IR^n and all i ∈ N. Let α*_r = α*(r; u, ζ) = {α^{*,i}_r | i ∈ N}, where the i-th element of α* is given by
α^{*,i}_r = ᾱ^i(ξ_r, ζ^i_r), ∀ r ∈ [0,t]. (38)
Given s, r ∈ [0,t] and i ∈ N, let d^i_r = |ξ_r − ζ^i_r| and d^i_s = |ξ_s − ζ^i_s|. The continuity of trajectories implies that, given ε_d > 0, there exists δ^i_d = δ^i_d(ε_d) < ∞ such that |r − s| < δ^i_d implies |d^i_r − d^i_s| < ε_d. Assuming R_i > 0, one can easily see that, given ε_d > 0, |α^{*,i}_r − α^{*,i}_s| < ε_d if |r − s| < δ^i_d. Since this is true for all r, s ∈ [0,t] and i ∈ N, α* is continuous on [0,t], so that α* ∈ A_t. Hence, by (29), (32), (37) and (38),
V̄(ξ_r,ζ_r) = V̄^{α*}(r,ξ_r,ζ_r), ∀ r ∈ [0,t], (39)
and by (30)–(32), (33), and (39),
J^c(t,x,u,ζ,z) = Ĵ^c(t,x,u,α*,ζ,z) ≤ max_{α∈A_t} Ĵ^c(t,x,u,α,ζ,z). (40)
Consequently, combining (36) and (40) yields (34), which immediately implies (35).

D. Existence of an optimal u

Lemma 3.3: Let t ∈ (0,∞), x, z ∈ IR^n, c ∈ [0,∞), and ζ ∈ Z. Let u ∈ U_{0,t}, and let the corresponding trajectory be denoted by ξ. Let α*_r = α*(r; u, ζ) = ᾱ*(ξ_r, ζ_r) for all r ∈ [0,t], where ᾱ* is given in (37). Then, u is a critical point of J^c(t,x,·,ζ,z) if and only if u is a critical point of Ĵ^c(t,x,·,α*,ζ,z).

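The argmax in (37) has a simple closed form: the unconstrained maximizer of μ_i[α − (α³/2)|x − y^i|²] over α > 0 is α = √(2/3)/|x − y^i|, so that, after clipping to the admissible interval, ᾱ^i(x,y^i) = √(2/3)/max(|x − y^i|, R_i). A small numerical check (with illustrative values) that substituting ᾱ^i into V̆ recovers the two-branch potential V^i/m_s:

    import math

    G = 6.674e-11

    def alpha_bar(x, y_i, R_i):
        """Closed form of the argmax in (37): sqrt(2/3)/|x - y^i|, clipped to
        the admissible interval (0, sqrt(2/3)/R_i]."""
        return math.sqrt(2.0 / 3.0) / max(math.dist(x, y_i), R_i)

    def V_breve_i(x, y_i, m_i, a):
        """One term of V(x, Y, alpha): mu_i (alpha^i - (alpha^i)^3 |x - y^i|^2 / 2)."""
        mu = G * m_i * 1.5**1.5
        d2 = math.dist(x, y_i)**2
        return mu * (a - a**3 * d2 / 2.0)

    m_i, R_i, y_i = 7.35e22, 1.74e6, (0.0, 0.0, 0.0)
    for x in [(5.0e6, 0.0, 0.0), (1.0e6, 0.0, 0.0)]:      # one point outside, one inside
        a = alpha_bar(x, y_i, R_i)
        d = math.dist(x, y_i)
        exact = G * m_i / d if d >= R_i else G * m_i * (3 * R_i**2 - d**2) / (2 * R_i**3)
        print(V_breve_i(x, y_i, m_i, a), exact)           # the two columns agree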
Lemma 3.4: Let t̄ = [Σ_{i∈N} Gm_i/R_i³]^{−1/2}. Let x, z ∈ IR^n, c ∈ [0,∞), α ∈ A_t, and ζ ∈ Z. Then, if t ∈ (0, t̄), Ĵ^c(t,x,u,α,ζ,z) is strictly convex in u.

Proof: Let u, ν ∈ U_{0,t} and δ > 0. In order to prove the convexity of Ĵ^c, we need to examine second-order differences in the direction ν from u. Note that ξ denotes the state trajectory corresponding to u, and ξ^+ denotes the one corresponding to u + δν, given by
ξ^+_r = ξ_r + δ ∫_0^r ν_ρ dρ, r ∈ (0,t). (41)
We examine each of the differences separately. The first-order difference of the kinetic energy term is given by
T̄(u_r + δν_r) − T̄(u_r) = ½|u_r + δν_r|² − ½|u_r|² = δ ν_r^T u_r + (δ²/2)|ν_r|² (42)
for r ∈ [0,t]. Given α ∈ A_t, the corresponding difference of the potential term is
V̄^α(r,ξ^+_r,ζ_r) − V̄^α(r,ξ_r,ζ_r) = −½ Σ_{i∈N} μ_i (α^i_r)³ [|ξ^+_r − ζ^i_r|² − |ξ_r − ζ^i_r|²],
which by (41),
= −δ Σ_{i∈N} μ_i (α^i_r)³ (ξ_r − ζ^i_r)^T ∫_0^r ν_ρ dρ − (δ²/2) Σ_{i∈N} μ_i (α^i_r)³ |∫_0^r ν_ρ dρ|². (43)
Lastly, the first-order difference of the terminal cost term is given by
ψ_c(ξ^+_t,z) − ψ_c(ξ_t,z) = δ c (ξ_t − z)^T ∫_0^t ν_r dr + (δ² c/2) |∫_0^t ν_r dr|². (44)
Using (42)–(44) and their analogous versions with −δν replacing δν, we obtain
[Ĵ^c(t,x,u+δν,α,ζ,z) + Ĵ^c(t,x,u−δν,α,ζ,z) − 2Ĵ^c(t,x,u,α,ζ,z)]/2
= (δ²/2) [∫_0^t |ν_r|² − Σ_{i∈N} μ_i (α^i_r)³ |∫_0^r ν_ρ dρ|² dr + c |∫_0^t ν_r dr|²]
≥ (δ²/2) [∫_0^t |ν_r|² dr − ∫_0^t Σ_{i∈N} μ_i (α^i_r)³ |∫_0^r ν_ρ dρ|² dr],
which by the Hölder inequality for the inner integral,
≥ (δ²/2) [∫_0^t |ν_r|² dr − t ∫_0^t Σ_{i∈N} μ_i (α^i_r)³ ∫_0^r |ν_ρ|² dρ dr]. (45)
By integration by parts,
∫_0^t ∫_0^r |ν_ρ|² dρ dr = ∫_0^t (t − r)|ν_r|² dr ≤ t ∫_0^t |ν_r|² dr. (46)
We note that, since α^i_r ∈ (0, √(2/3)/R_i] for all r ∈ [0,t] and i ∈ N,
Σ_{i∈N} μ_i (α^i_r)³ ≤ Σ_{i∈N} Gm_i/R_i³ = K_R. (47)
Employing (46) and (47) in (45) yields
[Ĵ^c(t,x,u+δν,α,ζ,z) + Ĵ^c(t,x,u−δν,α,ζ,z) − 2Ĵ^c(t,x,u,α,ζ,z)]/2 ≥ (δ²/2){∫_0^t |ν_r|² dr − K_R t² ∫_0^t |ν_r|² dr} = (δ²/2)(1 − K_R t²)‖ν‖²_{L₂(0,t)} > 0
if t < t̄ = 1/√K_R and ν ≠ 0. Since this is true for every direction ν from any u ∈ U_{0,t}, we see that Ĵ^c(t,x,u,α,ζ,z) is strictly convex in u.

Corollary 3.5: Let x, z ∈ IR^n, c ∈ [0,∞), and ζ ∈ Z. Then, if t ∈ (0, t̄), J^c(t,x,u,ζ,z) is strictly convex in u.

Combining Lemmas 3.3, 3.4 and Corollary 3.5 immediately yields the following.

Lemma 3.6: For t ∈ (0, t̄) and c ∈ [0,∞), u* is the optimal control for J^c(t,x,·,ζ,z) if and only if u* is the optimal control for Ĵ^c(t,x,·,α*,ζ,z), where α* is given by (38).

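The convexity horizon of Lemma 3.4 depends on the bodies only through K_R = Σ_{i∈N} Gm_i/R_i³, via t̄ = K_R^{−1/2}. For illustration, with assumed Earth-like and Moon-like mass/radius values:

    import math

    G = 6.674e-11
    bodies = [(5.97e24, 6.37e6),    # (m_i, R_i), assumed Earth-like values
              (7.35e22, 1.74e6)]    # assumed Moon-like values

    K_R = sum(G * m / R**3 for m, R in bodies)
    t_bar = 1.0 / math.sqrt(K_R)
    print(K_R, t_bar)               # t_bar is several hundred seconds for these values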
We now proceed to consider the game in which the order of infimum and supremum is reversed. For c ∈ [0,∞], let
W̃^c(t,x,ζ,z) = sup_{α∈A_t} inf_{u∈U_{0,t}} Ĵ^c(t,x,u,α,ζ,z) = sup_{α∈A_t} W̃^{α,c}(t,x,ζ,z). (48)

Lemma 3.7: For all t > 0, c > 0, x, z ∈ IR^n, and ζ ∈ Z, both W̃^{α,c}(t,x,ζ,z) (as well as Ĵ^c(t,x,u,α,ζ,z)) and W̃^{α,∞}(t,x,ζ,z) are strictly concave in α.

Combining Lemmas 3.3, 3.4, 3.7 and Corollary 3.5 immediately yields the following.

Theorem 3.8: If t ∈ (0, t̄), then
W̃^c(t,x,ζ,z) = sup_{α∈A_t} inf_{u∈U_{0,t}} Ĵ^c(t,x,u,α,ζ,z) = inf_{u∈U_{0,t}} sup_{α∈A_t} Ĵ^c(t,x,u,α,ζ,z) = W^c(t,x,ζ,z)
for all x, z ∈ IR^n and ζ ∈ Z.

By the monotonicity of W̃^{α,c} with respect to c and Lemma 3.1, the following is easily obtained.

Lemma 3.9: For t ∈ (0, t̄),
W̄(t,x,ζ,z) = sup_{α∈A_t} W̃^{α,∞}(t,x,ζ,z)
for all x, z ∈ IR^n and ζ ∈ Z.

E. Hamilton-Jacobi-Bellman PDE

The Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE) associated with our problem is
0 = −(∂/∂r) W̆^α(r,x,ζ,z) + H^α(t − r, x, ζ, ∂_x W̆^α(r,x,ζ,z)), (49)
W̆^α(0,x,ζ,z) = ψ_c(x,z), ∀ x ∈ IR^n, (50)
where the corresponding Hamiltonian is given by
H^α(r,x,ζ,p) = V̄^α(r,x,ζ_r) − ½|p|².
For t > 0, let D_t = C([0,t] × IR^n) ∩ C¹((0,t) × IR^n).

Lemma 3.10: Let c ∈ [0,∞), z ∈ IR^n, ζ ∈ Z, and α ∈ A_t. Let s ∈ [0,t] and r = t − s. Suppose that W̆^α(·,·,ζ,z) ∈ D_t satisfies (49) and (50). Then, W̆^α(t,x,ζ,z) ≤ Ĵ^c(t,x,u,α,ζ,z) for all x ∈ IR^n and u ∈ U_{0,t}. Further, W̆^α(t,x,ζ,z) = Ĵ^c(t,x,u,α,ζ,z) for the input u_s = ũ(s,ξ_s), with ξ_s given by (1) with ũ(s,x) = −∂_x W̆^α(t − s, x, ζ, z) and ξ_0 = x. Consequently, W̆^α(t,x,ζ,z) = W̃^{α,c}(t,x,ζ,z).

IV. FUNDAMENTAL SOLUTION AS A SET OF RICCATI SOLUTIONS

We will find that the fundamental solution may be given in terms of a set of solutions of associated Riccati equations.

A. Differential Riccati Equation

Given c ∈ [0,∞), r ∈ [0,t], α ∈ A_t, and ζ ∈ Z, we look for a solution, W̆^{α,c}, of the form
W̆^{α,c}(r,x,ζ,z) = ½ x^T P^c_r x + x^T Q^c_r z + ½ z^T R^c_r z + (p^c_r)^T x + (q^c_r)^T z + γ^c_r, (51)
where P^c, Q^c, R^c, p^c, q^c, and γ^c depend implicitly on the given ζ and the choice of α ∈ A_t, and satisfy the respective initial value problems (R):
Ṗ^c_r = −(P^c_r)² − Σ_{i∈N} Ā^i_{t−r},   P^c_0 = c I_n,
Q̇^c_r = −P^c_r Q^c_r,   Q^c_0 = −c I_n,
Ṙ^c_r = −(Q^c_r)^T Q^c_r,   R^c_0 = c I_n,
ṗ^c_r = −P^c_r p^c_r + Σ_{i∈N} Ā^i_{t−r} ζ^i_{t−r},   p^c_0 = 0_{n×1},
q̇^c_r = −(Q^c_r)^T p^c_r,   q^c_0 = 0_{n×1},
γ̇^c_r = −½|p^c_r|² + Σ_{i∈N} μ_i [α^i_{t−r} − ½(α^i_{t−r})³ |ζ^i_{t−r}|²],   γ^c_0 = 0,
where Ā^i_s = μ_i (α^i_s)³ I_n for all i ∈ N and s ∈ [0,t], and 0_{m×n} denotes the zero matrix of size m × n. Upon examining (R), we see that the solutions P^c_r, Q^c_r, R^c_r are symmetric matrices, more precisely, diagonal matrices of size n × n.

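For a given ζ and a given α ∈ A_t, the system (R) is a set of coupled matrix and vector ODEs in the horizon variable r, integrable forward from the data at r = 0 by any standard scheme, after which the quadratic form (51) is available at r = t. A minimal explicit-Euler sketch, assuming a single body with a stationary center, a constant α, and illustrative parameter values (full matrices are kept even though P, Q, R remain diagonal):

    import numpy as np

    G = 6.674e-11

    def integrate_R(t, c, alphas, zetas, masses, n=3, M=2000):
        """Explicit-Euler integration of (R) on [0, t].
        alphas(s) -> [alpha^i_s], zetas(s) -> [zeta^i_s] (length-n arrays).
        Returns (P, Q, R, p, q, gamma) at r = t."""
        mus = [G * m * 1.5**1.5 for m in masses]       # mu_i = G m_i (3/2)^(3/2)
        I = np.eye(n)
        P, Q, R = c * I.copy(), -c * I.copy(), c * I.copy()
        p, q, gam = np.zeros(n), np.zeros(n), 0.0
        dr = t / M
        for k in range(M):
            s = t - k * dr                             # coefficients enter at time t - r
            a, z = alphas(s), zetas(s)
            Abar = sum(mu * ai**3 for mu, ai in zip(mus, a)) * I
            Az = sum(mu * ai**3 * np.asarray(zi) for mu, ai, zi in zip(mus, a, z))
            g0 = sum(mu * (ai - 0.5 * ai**3 * float(np.dot(zi, zi)))
                     for mu, ai, zi in zip(mus, a, z))
            dP, dQ, dR = -P @ P - Abar, -P @ Q, -Q.T @ Q
            dp, dq = -P @ p + Az, -Q.T @ p
            dg = -0.5 * float(p @ p) + g0
            P, Q, R = P + dr * dP, Q + dr * dQ, R + dr * dR
            p, q, gam = p + dr * dp, q + dr * dq, gam + dr * dg
        return P, Q, R, p, q, gam

    # One assumed Moon-mass body with a stationary center, and a constant alpha in A.
    masses, radii = [7.35e22], [1.74e6]
    alphas = lambda s: [0.5 * (2.0 / 3.0)**0.5 / radii[0]]
    zetas = lambda s: [np.zeros(3)]
    P, Q, R, p, q, gam = integrate_R(t=100.0, c=1.0, alphas=alphas,
                                     zetas=zetas, masses=masses)
    x, z = np.array([5.0e6, 0.0, 0.0]), np.array([6.0e6, 0.0, 0.0])
    # Evaluate the quadratic form (51) at (x, z).
    W = 0.5 * x @ P @ x + x @ Q @ z + 0.5 * z @ R @ z + p @ x + q @ z + gam
    print(W)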
Lemma 4.1: For t ∈ [0, t̄), the value function W̃^{α,c} of (48) and the explicit function W̆^{α,c} of (51) satisfying (R) are equivalent. That is, W̃^{α,c}(r,x,ζ,z) = W̆^{α,c}(r,x,ζ,z) for all x, z ∈ IR^n, ζ ∈ Z, and r ∈ [0,t].

Proof: It will be sufficient to show that W̆^{α,c} satisfies the conditions of Lemma 3.10. It is useful to rewrite (51) as
W̆^{α,c}(r,x,ζ,z) = ½ [x^T z^T 1] [ P^c_r  Q^c_r  p^c_r ; (Q^c_r)^T  R^c_r  q^c_r ; (p^c_r)^T  (q^c_r)^T  2γ^c_r ] [x; z; 1] = ½ X_z^T 𝒫^c_r X_z, (52)
where X_z = [x^T z^T 1]^T. From (51) and (52), we have
(∂/∂r) W̆^{α,c}(r,x,ζ,z) = ½ X_z^T 𝒫̇^c_r X_z, (53)
and
∂_x W̆^{α,c}(r,x,ζ,z) = [P^c_r  Q^c_r  p^c_r] X_z = 𝒫^c_{r,x} X_z, (54)
so that
½ |∂_x W̆^{α,c}(r,x,ζ,z)|² = ½ X_z^T (𝒫^c_{r,x})^T 𝒫^c_{r,x} X_z = ½ X_z^T 𝒫^c_r Λ 𝒫^c_r X_z, (55)
where
Λ = [ I_n  0_{n×(n+1)} ; 0_{(n+1)×n}  0_{(n+1)×(n+1)} ].
Also, collecting like terms, we have
V̄^α(t−r, x, ζ_{t−r}) = −½ x^T [Σ_{i∈N} Ā^i_{t−r}] x + x^T Σ_{i∈N} Ā^i_{t−r} ζ^i_{t−r} + Σ_{i∈N} μ_i [α^i_{t−r} − ½(α^i_{t−r})³ |ζ^i_{t−r}|²] = ½ X_z^T Γ(α_{t−r}, ζ_{t−r}) X_z, (56)
where Γ(α_{t−r}, ζ_{t−r}) has the block structure with 3 row partitions and 3 column partitions given by
Γ₁₁ = −Σ_{i∈N} Ā^i_{t−r},  Γ₁₃ = Γ₃₁^T = Σ_{i∈N} Ā^i_{t−r} ζ^i_{t−r},  Γ₃₃ = Σ_{i∈N} μ_i [2α^i_{t−r} − (α^i_{t−r})³ |ζ^i_{t−r}|²],
Γ₁₂ = Γ₂₁^T = Γ₂₂ = 0_{n×n},  Γ₂₃ = Γ₃₂^T = 0_{n×1}.
Here, we note that (R) is equivalent to
𝒫̇^c_r = −𝒫^c_r Λ 𝒫^c_r + Γ(α_{t−r}, ζ_{t−r}), with 𝒫^c_0 = C,
where C denotes the matrix representation of ψ_c in the form (52). Consequently, substituting (53)–(56) into the right-hand side of the PDE (49) yields
(∂/∂r) W̆^{α,c}(r,x,ζ,z) − H^α(t−r, x, ζ, ∂_x W̆^{α,c}(r,x,ζ,z)) = ½ X_z^T [𝒫̇^c_r + 𝒫^c_r Λ 𝒫^c_r − Γ(α_{t−r}, ζ_{t−r})] X_z = 0,
which implies that (51) is a solution of the HJB PDE (49), and by Lemmas 3.4 and 3.10, W̆^{α,c}(r,x,ζ,z) = W̃^{α,c}(r,x,ζ,z) for all r ∈ [0,t], with t ∈ [0, t̄).

Recall from Lemmas 3.1 and 3.9 that the fundamental solution of interest is obtained as the c → ∞ limit of W̃^{α,c}. Consequently, we have that, for t < t̄,
W̄(t,x,ζ,z) = sup_{α∈A_t} W̃^{α,∞}(t,x,ζ,z) = sup_{α∈A_t} lim_{c→∞} W̃^{α,c}(t,x,ζ,z),
which by Lemma 4.1,
= sup_{α∈A_t} lim_{c→∞} ½ X_z^T 𝒫^c_t X_z = sup_{α∈A_t} ½ X_z^T 𝒫^∞_t X_z. (57)
Let G_t = G(t, {m_i}, ζ) be given by G_t = {𝒫^∞_t(α) | α ∈ A_t}. Then, the fundamental solution (57) can be represented by
W̄(t,x,ζ,z) = sup_{𝒫∈G_t} ½ X_z^T 𝒫 X_z.

B. An approximate solution

In order to compute the suprema in (48) and (57), piecewise linear approximations to the continuous functions α ∈ A_t are used to generate a tractable algorithm. Consequently, we need to demonstrate that such approximations converge to the value function of interest. Given t < t̄ and K ∈ IN, let A^t_K be the set of continuous piecewise linear functions contained in A_t; in particular, α ∈ A^t_K is a piecewise linear function on a uniform grid with K+1 elements on the interval [0,t]. Given K̄ ∈ IN, let K_η = 2^η K̄ with η ∈ IN, and
W^c_η(t,x,ζ,z) = sup_{α∈A^t_{K_η}} W̃^{α,c}(t,x,ζ,z).
Since {A^t_{K_η}}_{η∈IN} is an increasing sequence of subsets contained in A_t, for all η ∈ IN,
W^c_η(t,x,ζ,z) ≤ W^c_{η+1}(t,x,ζ,z) ≤ W̃^c(t,x,ζ,z).
Therefore,
W̃^c(t,x,ζ,z) = lim_{η→∞} W^c_η(t,x,ζ,z) = lim_{η→∞} sup_{α∈A^t_{K_η}} W̃^{α,c}(t,x,ζ,z). (58)
Since (58) holds for all subsequences {K_η}_{η∈IN} ⊂ IN, we have the following.

Lemma 4.2: Given K ∈ IN, let ᾱ = {α̂^i}_{i∈N} ∈ A^{N×(K+1)}, where the k-th element of α̂^i is given by α̂^i_k, with {α̂^i_k}_{i∈N} ∈ A for all k ∈ ⟦0,K⟧. Assume that there exists a one-to-one and onto mapping L_K: A^{N×(K+1)} → A^t_K. Then,
W̃^c(t,x,ζ,z) = lim_{K→∞} sup_{ᾱ∈A^{N×(K+1)}} W̃^{L_K(ᾱ),c}(t,x,ζ,z).

REFERENCES

[1] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhäuser, Boston, 1997.
[2] M. Bardi and F. Da Lio, "On the Bellman equation for some unbounded control problems," Nonlinear Differential Equations and Appl., 4 (1997), 491-510.
[3] R. J. Elliott and N. J. Kalton, "The existence of value in differential games," Memoirs of the Amer. Math. Society, 126 (1972).
[4] M. R. James, "Nonlinear semigroups for partially observed risk-sensitive control and minmax games," Stochastic Analysis, Control, Optimization and Applications: A Volume in Honor of W. H. Fleming, W. M. McEneaney, G. Yin and Q. Zhang, eds., Birkhäuser, 1999, 57-74.
[5] R. P. Feynman, "Space-Time Approach to Non-Relativistic Quantum Mechanics," Rev. of Mod. Phys., 20 (1948), 367.
[6] R. P. Feynman, The Feynman Lectures on Physics, Vol. 2, Basic Books, 1964, 19-1–19-14.
[7] R. T. Rockafellar, Conjugate Duality and Optimization, Regional Conference Series in Applied Mathematics 16, SIAM, 1974.
[8] R. T. Rockafellar and R. J. Wets, Variational Analysis, Springer-Verlag, New York, 1997.
[9] I. Singer, Abstract Convex Analysis, Wiley-Interscience, New York, 1997.
[10] W. M. McEneaney, "Some classes of imperfect information finite state-space stochastic games with finite-dimensional solutions," Applied Math. and Optim., 50 (2004), 87-118.
[11] W. M. McEneaney and P. M. Dower, "The principle of least action and solution of two-point boundary-value problems on a limited time horizon," Proc. SIAM Conf. on Control and its Applications, 2013, 199-206.