THE HAMILTON-JACOBI THEORY FOR SOLVING OPTIMAL FEEDBACK CONTROL PROBLEMS WITH GENERAL BOUNDARY CONDITIONS


THE HAMILTON-JACOBI THEORY FOR SOLVING OPTIMAL FEEDBACK CONTROL PROBLEMS WITH GENERAL BOUNDARY CONDITIONS

by Chandeok Park

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Aerospace Engineering) in The University of Michigan, 2006.

Doctoral Committee: Associate Professor Daniel J. Scheeres, Chair; Professor Anthony Bloch; Professor Pierre T. Kabamba; Professor N. Harris McClamroch


© Chandeok Park 2006. All rights reserved.

To My Parents, Changseo Park & Sookhee Lee

TABLE OF CONTENTS

DEDICATION
LIST OF FIGURES
LIST OF APPENDICES
ABSTRACT

CHAPTER
I. Introduction
II. Hamiltonian System and Hamilton-Jacobi Theory
III. Solutions of Optimal Feedback Control Problem by Hamilton-Jacobi Theory
  3.1 Optimal Feedback Control Problem Formulated as a Two Point Boundary Value Problem of a Hamiltonian System
  3.2 Optimal Feedback Solution by Generating Functions
  3.3 Optimal Cost Function via the F_1 Generating Function
IV. Numerical Implementations for a Class of Analytic Problems
  4.1 Optimal Feedback Control Law in Series Form
  4.2 Linear Quadratic Terminal Controller
  4.3 Continuous Thrust Optimal Rendezvous Maneuvers in a Central Gravity Field
  4.4 Optimal Feedback Control of the Heisenberg System
V. Extended Applications of Generating Functions for Non-Smooth and Singular Problems
  5.1 Time Optimal Control of a Double Integrator
  5.2 Linear Quadratic Singular Optimal Control Problem
VI. Optimal Feedback Control by Hamiltonian Cauchy Problem
  6.1 Formulation of the Cauchy Problem
  6.2 Generation of Optimal Feedback Control Law
  6.3 Numerical Computation
  6.4 Illustrative Examples
VII. Conclusion
  7.1 Contributions
  7.2 Further Research Directions

APPENDICES
BIBLIOGRAPHY

LIST OF FIGURES

4.1 Radial and Tangential Positions (Example 1)
4.2 Radial and Tangential Velocities (Example 1)
4.3 Radial and Tangential Controls (Example 1)
4.4 Radial and Tangential Positions (Example 2)
4.5 Radial and Tangential Velocities (Example 2)
4.6 Radial and Tangential Controls (Example 2)
4.7 Radial and Tangential Positions (Example 3)
4.8 Radial and Tangential Velocities (Example 3)
4.9 Radial and Tangential Controls (Example 3)
4.10 Terminal Position and Velocity Offset: x_0 = [r cos θ, r sin θ, 0, 0], r = 0.15, 0 ≤ θ ≤ 2π
4.11 Positional and Velocity Trajectory: x_0 = [r cos θ, r sin θ, 0, 0], r = 0.15, 0 ≤ θ ≤ 2π
4.12 Control History: x_0 = [r cos θ, r sin θ, 0, 0], r = 0.15, 0 ≤ θ ≤ 2π
4.13 Determinant of φ_xλ(t)
4.14 r(x, x_0) = f(t, λ_0)
4.15 Example 1: Optimal Trajectories for x_0 = [3 cos θ, 3 sin θ, 1], 0 ≤ θ ≤ 2π
4.16 Example 1: Terminal Errors for x_0 = [3 cos θ, 3 sin θ, 1], 0 ≤ θ ≤ 2π
4.17 Example 2: Optimal Trajectories for x_0 = [3 cos θ, sin θ], 0 ≤ θ ≤ 2π
4.18 Example 2: Terminal Errors for x_0 = [3 cos θ, sin θ], 0 ≤ θ ≤ 2π
5.1 Time-Optimal Control Logic for Double Integrator System (Terminal Condition at the Origin)
5.2 Comparison of Singular Optimal Cost
5.3 Loci of Initial Costates and Optimal Control Scheme: x_0 = [0.1 cos θ, 0.1 sin θ]^T, 0 ≤ θ ≤ 2π
6.1 The Solution to the Cauchy Problem
6.2 Graphical Determination of Initial Costate
6.3 Collection of the Initial States with the Same Optimal Time, and Switching Curves for (ε = 1, a = 1) and (ε = 0, a = 1)

LIST OF APPENDICES

A. Equivalence of Linear Quadratic Terminal Controller by Generating Functions vs. Riccati Transformation
B. Formulation of Hamiltonian Cauchy Problem by Invariant Imbedding Method
C. Finite Difference Methods for Solving Cauchy Problems

ABSTRACT

THE HAMILTON-JACOBI THEORY FOR SOLVING OPTIMAL FEEDBACK CONTROL PROBLEMS WITH GENERAL BOUNDARY CONDITIONS

by Chandeok Park

Chair: Daniel J. Scheeres

This dissertation presents a general methodology for solving the optimal feedback control problem in the context of Hamiltonian system theory. The problem is first formulated as a two point boundary value problem for a standard Hamiltonian system, and the associated phase flow is viewed as a canonical transformation. Relying on the Hamilton-Jacobi theory, we then employ generating functions to develop a unified methodology for solving a variety of optimal feedback control formulations with general types of boundary conditions. The major accomplishment is to establish a theoretical connection between the optimal cost function and a special kind of generating function. Guided by this recognition, we are ultimately led to a new, flexible representation of the optimal feedback control law for a given system, which can be adapted to various types of boundary conditions by algebraic conversions and partial differentiations. This adaptive property provides a substantial advantage over the classical dynamic programming method, in that we do not need to solve the Hamilton-Jacobi-Bellman equation repeatedly for varying types of boundary conditions. Furthermore, for a special type of boundary condition, it also enables us to work around an inherent singularity of the Hamilton-Jacobi-Bellman equation by a special algebraic transformation. Taking full advantage of these theoretical insights, we develop a systematic algorithm for solving a class of optimal feedback control problems represented by smooth analytic Hamiltonians, and apply it to problems with different characteristics. Then, broadening the practical utility of generating functions to problems where the relevant Hamiltonian is non-smooth, we construct a pair of Cauchy problems from the associated Hamilton-Jacobi equations. This alternative formulation is justified by solving problems with control constraints, which usually feature non-smoothness in the control logic. The main result of this research establishes that the optimal feedback control problem can be solved by the generating functions of the canonical solution flow corresponding to the necessary conditions. This result demonstrates the power of analyzing the optimal feedback control problem within the comprehensive field of classical Hamiltonian system theory.

CHAPTER I

Introduction

Generally stated, the optimal control problem is the pursuit of the best strategy for minimizing or maximizing a certain performance criterion subject to a dynamical system, possibly under the influence of various additional types of constraints. Its history dates back to ancient Greece, when people studied the shortest paths joining two points in a plane, whose solution must have been intuitively known to be the straight line segment [1]. Legend has it that in the 9th century B.C., Queen Dido was challenged to maximize the area enclosed by a closed curve of fixed length; the Greeks knew that the solution should be a circle, though it was not until the 19th century that this was rigorously proved. It is, however, indisputably recognized that the calculus of variations, founded at the end of the 17th century, is the genuine forefather of modern optimal control theory. Through contributions from a variety of fields, it was established as an independent subject and successfully applied to solve many interesting and practical problems. Newton, using a variational approach, shaped the nose of a body of revolution to minimize drag. Johann Bernoulli formulated and solved the famous brachistochrone (minimum-time) problem, which also attracted Leibniz, Newton, L'Hôpital, and others. Variational mechanics was launched by Euler and Lagrange, culminating in the Euler-Lagrange equation.

The isoperimetric problem, a generalization of Queen Dido's problem, was studied in detail by Euler and Tonelli. Needless to say, modern optimal control theory is firmly rooted in this heritage. Since its inauguration in the mid-1950s, however, in an attempt to broaden the range of solvable problems, it has evolved from its predecessor into two sophisticated branches: Pontryagin's minimum principle [2] and Bellman's dynamic programming [3]. Between these two, in general, in order to find the optimal feedback control (OFC) for a given system, we must resort to the latter and solve the Hamilton-Jacobi-Bellman equation (HJBE). Unfortunately, even for relatively simple systems, this usually leads to a nonlinear partial differential equation (PDE) which does not have a closed form solution and is extremely difficult to solve in general. Consequently, solving the HJBE has been, in itself, a very active field of research for nearly half a century [4, 5, 6, 7, 8, 9]. Beyond this direct approach, in a natural effort to alleviate or even avert the difficulty of solving the HJBE, there has also been a myriad of other creative approaches to solving the OFC problem: various manipulations of the 1st order necessary conditions for optimality [10, 11, 12, 13], employment of state dependent Riccati equations [14, 15, 16, 17, 18, 19], iterative techniques based on the generalized Hamilton-Jacobi-Bellman equation (GHJBE) [20, 21, 22, 23], derivation of a new governing equation by a special transformation [24], etc. Among all these diverse techniques, however, no single one has proved remarkably superior. Most of these methods consider only some restricted type of performance index, dynamical system, or boundary condition, which clearly limits their applicability. Independent of the above contributions relating directly to optimal control theory, there have been studies which utilize classical Hamiltonian system theory [25, 26, 27,

28, 29, 30]; under mild assumptions of the implicit function theorem, the 1st order necessary conditions for optimality reduce to a standard Hamiltonian system. This direction of approach can be recognized as somewhat more natural, in the sense that Hamiltonian system theory, through its even longer history, has been established as a rich structure with many distinctive characteristics, such as various symmetries, integral invariants, canonical transformations, etc. [31, 32, 33]. However, all these works tend to deal with special canonical transformations to solve their specific problems, and do not provide any systematic solution procedure for general optimal control problems. Motivated by this rather ironic scarcity of qualified results, and instigated by Guibout and Scheeres' recent achievement in solving two point boundary value problems (TPBVPs) of Hamiltonian systems [33, 34, 35], we have completely reanalyzed the OFC problem from this classical orientation. Quite fruitfully, not only have we solidified the pre-existing results from modern optimal control theory, but we have also gained some unique theoretical insights and practical interpretations, which have materialized in our recent publications [36, 37, 38, 39, 40, 41, 42]. The greatest accomplishment is that, in the context of Hamiltonian systems, we have identified a fundamental function which provides the OFC law for arbitrary types of boundary conditions. All of our accomplishments are integrated around this key observation, and even extended to a certain degree in this thesis, which is structured as follows. We first review Hamiltonian systems and the Hamilton-Jacobi theory, which provide a base structure for the entire document (Chapter II). Within this framework, we proceed to develop how to derive the OFC subject to a given system for general boundary conditions, and justify our method by proving that our solution

satisfies the sufficient conditions for optimality (Chapter III). This theoretical development is followed by a step-by-step numerical implementation for a class of regular optimal control systems. It is first legitimized by reproducing a well-known solution for a type of linear quadratic problem, and then fully employed for the analysis of nonlinear OFC problems (Chapter IV). Then, as an attempt at generalization, we consider problems with control constraints and singular optimal control problems. Choosing a representative example of each case, we show that our method reproduces previously well-known solutions (Chapter V). Our aspiration for generalization continues. Somewhat conceding the inherent limitation of our technique for problems with control constraints, however, we devise an alternative numerical process with a new set of governing equations. It justifies itself by recovering a well-known solution for a time-optimal control problem, and then demonstrates its utility by solving another non-trivial time-optimal control problem (Chapter VI). Finally, the whole discussion ends with concluding remarks (Chapter VII).

CHAPTER II

Hamiltonian System and Hamilton-Jacobi Theory

This chapter briefly reviews the necessary core properties of Hamiltonian systems and of the Hamilton-Jacobi theory, as they provide a cornerstone for the whole thesis. For more comprehensive discussions of this topic, refer to Greenwood [31], Goldstein [32], Carathéodory [43], etc. We begin by formally defining a Hamiltonian system.

Definition 2.1 (Hamiltonian System) A dynamical system is called Hamiltonian if it is described by a pair of ordinary differential equations of the following vector form for a smooth function H(q(t), p(t), t) : R^n × R^n × R → R:

q̇(t) = ∂H(q(t), p(t), t)/∂p(t),   ṗ(t) = −∂H(q(t), p(t), t)/∂q(t)   (2.1)

H is called the Hamiltonian, q the generalized coordinates, and p the generalized momenta of the system. Note that the first equation of (2.1) can be rearranged such that p = p(q, q̇, t) under mild assumptions of the implicit function theorem.
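As a concrete illustration of Definition 2.1 (not part of the original development), the following minimal sketch integrates (2.1) numerically, assuming the toy Hamiltonian H(q, p) = (q² + p²)/2 of a unit harmonic oscillator; the solver and tolerances are illustrative choices.

# A minimal numerical sketch of (2.1), assuming the illustrative Hamiltonian
# H(q, p) = (q**2 + p**2)/2 (a unit harmonic oscillator).
import numpy as np
from scipy.integrate import solve_ivp

def H(q, p):
    return 0.5 * (q**2 + p**2)

def hamilton_eqs(t, z):
    q, p = z
    return [p, -q]          # qdot = dH/dp,  pdot = -dH/dq

sol = solve_ivp(hamilton_eqs, (0.0, 10.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
q, p = sol.y
# H has no explicit time dependence, so it is conserved along the flow:
print(np.max(np.abs(H(q, p) - H(1.0, 0.0))))   # ~1e-9

Since this H does not depend explicitly on time, its conservation along the computed flow is a simple consistency check on the integration.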

The following Hamilton's principle plays a key role in the Hamilton-Jacobi theory of canonical transformations, which we make use of later.

Theorem 2.1 (Hamilton's Principle) Let the Lagrangian L(q, q̇, t) : R^n × R^n × R → R be defined as

L(q, q̇, t) = q̇^T p(q, q̇, t) − H(q, p(q, q̇, t), t)

Then, during a fixed time interval [t_0, t_f], the integral

I = ∫_{t_0}^{t_f} L dt   (2.2)

is stationary, i.e., δI = 0, with respect to path variations which vanish at the end points.

Proof: Refer to Greenwood [31]. (Q.E.D.)

With the above definition and principle in mind, we now introduce the key notion of canonical transformation for the Hamiltonian system.

Definition 2.2 Suppose there exists a transformation between (q, p) and a new set of coordinates (Q, P) of the same dimension, represented by

Q(t) = Q(q(t), p(t), t),   P(t) = P(q(t), p(t), t).   (2.3)

If the new set of variables (Q, P) preserves the Hamiltonian structure under a new Hamiltonian K = K(Q(t), P(t), t), i.e., if

Q̇(t) = ∂K(Q(t), P(t), t)/∂P(t),   Ṗ(t) = −∂K(Q(t), P(t), t)/∂Q(t),

then the transformation is called canonical.

Now, noting that K(Q(t), P(t), t) and H(q(t), p(t), t) represent the same system in two different coordinates, we attempt to associate them. Applying Hamilton's principle (2.2) to both K and H, we obtain

δ ∫_{t_0}^{t_f} ( p^T q̇ − H(q, p, t) ) dt = δ ∫_{t_0}^{t_f} ( P^T Q̇ − K(Q, P, t) ) dt = 0.

This implies that the integrands of the two integrals differ at most by the total time derivative of an arbitrary function F, i.e.,

p^T q̇ − H(q, p, t) = P^T Q̇ − K(Q, P, t) + dF/dt,   (2.4)

or equivalently

p^T dq − H(q, p, t) dt = P^T dQ − K(Q, P, t) dt + dF.   (2.5)

Note that both coordinate systems are connected by a single scalar function. Such a function is called a generating function and depends on both sets of coordinates, i.e., 4n + 2 variables. Using the 2n relations in (2.3), however, we see that F is reduced to a function of 2n + 2 variables. On the assumption that F depends on n old coordinates and n new coordinates, we can compose 4 principal kinds of generating functions:¹

F_1(q, Q, t), F_2(q, P, t), F_3(p, Q, t), F_4(p, P, t)

These provide their own mathematical relations for the given canonical transformation. For example, suppose we consider q and Q as the independent variables. Then, expanding the total time derivative of F_1, we obtain

dF_1(q, Q, t)/dt = (∂F_1/∂q)^T q̇ + (∂F_1/∂Q)^T Q̇ + ∂F_1/∂t.   (2.6)

Substitution of (2.6) into (2.4) leads to

(p − ∂F_1/∂q)^T q̇ − H = (P + ∂F_1/∂Q)^T Q̇ − K + ∂F_1/∂t,

¹ In fact, we can compose many other kinds of generating functions, as long as they are functions of n old coordinates and n new coordinates. It is up to our discretion which variables to regard as independent and which as dependent.

which provides the complete mathematical connections for the given transformation:

p = ∂F_1(q, Q, t)/∂q,   P = −∂F_1(q, Q, t)/∂Q,   K(Q, P, t) = H(q, p, t) + ∂F_1(q, Q, t)/∂t

The same procedure for the other generating functions produces similar connections for each choice of coordinates:

p = ∂F_2(q, P, t)/∂q,   Q = ∂F_2(q, P, t)/∂P,   K(Q, P, t) = H(q, p, t) + ∂F_2(q, P, t)/∂t

q = −∂F_3(p, Q, t)/∂p,   P = −∂F_3(p, Q, t)/∂Q,   K(Q, P, t) = H(q, p, t) + ∂F_3(p, Q, t)/∂t

q = −∂F_4(p, P, t)/∂p,   Q = ∂F_4(p, P, t)/∂P,   K(Q, P, t) = H(q, p, t) + ∂F_4(p, P, t)/∂t

Now recall that the above generating functions provide different mathematical representations of the same canonical transformation. Whatever the representations might be, they must actually provide the same transformation rule. This suggests that there should exist links between these generating functions.
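These relations can be exercised mechanically on a small example. The following symbolic sketch assumes the classic generating function F_1(q, Q) = qQ in one degree of freedom (a choice made purely for illustration), which exchanges the roles of coordinates and momenta; canonicity is confirmed through the equivalent Poisson bracket test {Q, P} = 1, a standard characterization not used elsewhere in this chapter.

# A symbolic check of the F1 relations, assuming the classic example
# F1(q, Q) = q*Q (one degree of freedom), which exchanges the roles of
# coordinates and momenta.
import sympy as sp

q, p, Q = sp.symbols('q p Q')
F1 = q * Q

# Implicit relations p = dF1/dq and P = -dF1/dQ from the table above.
Q_of_qp = sp.solve(sp.Eq(p, sp.diff(F1, q)), Q)[0]          # Q(q, p) = p
P_of_qp = (-sp.diff(F1, Q)).subs(Q, Q_of_qp)                # P(q, p) = -q

# The transformation is canonical iff the Poisson bracket {Q, P}_(q,p) = 1.
bracket = (sp.diff(Q_of_qp, q) * sp.diff(P_of_qp, p)
           - sp.diff(Q_of_qp, p) * sp.diff(P_of_qp, q))
assert sp.simplify(bracket) == 1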

This is indeed the case, and the links are given by the following Legendre transformations.

Theorem 2.2 (Legendre Transformation) The generating functions are all linked together by the following identities:

F_2(q, P, t) = F_1(q, Q, t) + Q^T P   (2.7)
F_3(p, Q, t) = F_1(q, Q, t) − q^T p   (2.8)
F_4(p, P, t) = F_2(q, P, t) − q^T p   (2.9)

Proof: Consider the first relation, between F_1 and F_2. We start by rearranging the relation (2.5) for F_1:

p^T dq − P^T dQ − H dt + K dt = dF_1(q, Q, t)

Desiring to replace Q by P as a variable in the generating function and in the differential form, we add the identity

Q^T dP + P^T dQ = d(Q^T P)

to both sides to yield

p^T dq + Q^T dP − H dt + K dt = dF_2(q, P, t) = d(F_1(q, Q, t) + Q^T P).

Comparing the last two terms with each other leads exactly to the first relation. The other relations can be proved similarly with ease. For more details, refer to Greenwood [31]. (Q.E.D.)
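The link (2.7) can also be verified on a concrete transformation. The sketch below assumes, purely for illustration, the phase-space rotation by an angle a generated by F_1(q, Q) = (2qQ − (q² + Q²) cos a)/(2 sin a), which yields Q = q cos a + p sin a and P = −q sin a + p cos a; it eliminates Q in favor of P and checks that F_2 = F_1 + QP generates the same transformation.

# A sketch verifying the Legendre link (2.7) on an assumed concrete example:
# the phase-space rotation generated by
# F1(q, Q) = (2*q*Q - (q**2 + Q**2)*cos(a)) / (2*sin(a)).
import sympy as sp

q, Q, P, a = sp.symbols('q Q P a')
F1 = (2*q*Q - (q**2 + Q**2)*sp.cos(a)) / (2*sp.sin(a))

p_of_qQ = sp.diff(F1, q)        # p =  dF1/dq
P_of_qQ = -sp.diff(F1, Q)       # P = -dF1/dQ

# Eliminate Q in favor of the new momentum P, then form F2 = F1 + Q*P.
Q_of_qP = sp.solve(sp.Eq(P, P_of_qQ), Q)[0]
F2 = (F1 + Q*P).subs(Q, Q_of_qP)

# F2 must generate the same transformation: p = dF2/dq and Q = dF2/dP.
assert sp.simplify(sp.diff(F2, q) - p_of_qQ.subs(Q, Q_of_qP)) == 0
assert sp.simplify(sp.diff(F2, P) - Q_of_qP) == 0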

So far we have introduced some key properties of Hamiltonian systems and of the Hamilton-Jacobi theory. In the next chapter, we set up the optimal feedback control problem in this context and use these properties to develop a general solution methodology.

CHAPTER III

Solutions of Optimal Feedback Control Problem by Hamilton-Jacobi Theory

With the prerequisite material on the Hamiltonian system from the previous chapter in mind, we proceed to study the optimal feedback control (OFC) problem for general boundary conditions. Specifically, we formulate it as a two point boundary value problem (TPBVP) of the Hamiltonian system, define a canonical transformation, and employ generating functions to derive a general solution methodology. In doing so, it is shown that any generating function contains all the necessary information for the OFC problem and that algebraic manipulations provide solutions for arbitrary types of boundary conditions. This implies a substantial advantage over some classical methods, as we do not need to solve the relevant ordinary/partial differential equations repetitively for varying boundary conditions. Finally, we show that our method indeed satisfies the sufficiency of optimality by proving that a special kind of generating function is fundamentally connected with the optimal cost function.

3.1 Optimal Feedback Control Problem Formulated as a Two Point Boundary Value Problem of a Hamiltonian System

Consider minimization of the general Bolza-type performance index

J = φ(x(t_f), t_f) + ∫_{t_0}^{t_f} L(x(t), u(t), t) dt   (3.1)

subject to a nonlinear dynamical system

ẋ = f(x(t), u(t), t)   (3.2)

satisfying the boundary conditions

x(t_0) = x_0,   ψ(x(t_f), t_f) = 0   (3.3)

Here x ∈ R^n represents the state, x_0 ∈ R^n a specified initial point, u ∈ R^m the control, t ∈ R the general time index, t_0 ∈ R the initial time, which is assumed to be fixed, t_f ∈ R the terminal time, which may be fixed or free, φ(x(t_f), t_f) : R^n × R → R the terminal time performance index, L(x(t), u(t), t) : R^n × R^m × R → R the full time performance index, f(x(t), u(t), t) : R^n × R^m × R → R^n the system dynamics, and ψ(x(t_f), t_f) : R^n × R → R^p, p ≤ n, the terminal time constraint. The control u = [u_1 u_2 ⋯ u_m]^T is constrained componentwise by the inequalities

|u_i| ≤ u_{i0} = constant,   i = 1, 2, ⋯, m

The unconstrained problem can be dealt with by letting u_{i0} → ∞, i = 1, 2, ⋯, m. The infinite horizon regulator problem can likewise be covered by letting t_f → ∞. Studying the OFC problem for general types of boundary conditions, we treat the above minimization problem separately according to the following classification:¹

¹ This classification is mainly for convenience of analysis. Note that the HCP can be contained in the SCP by retaining the implicit form of the terminal constraint ψ(x(t_f), t_f) = 0. Indeed, the SCP itself can deal with all types of boundary conditions treated in this thesis. Later it is shown that the specific solution procedure for the HCP is characteristically different from that for the SCP, which is the main reason for this rather ambiguous classification.

Definition 3.1 (Hard and Soft Constraint Problems)

Hard Constraint Problem (HCP): The terminal boundary condition for the state is specified as a fixed point x_f ∈ R^n. In this case φ(x(t_f), t_f) = constant can be removed without loss of generality, and the terminal hyperplane

ψ(x(t_f), t_f) = x(t_f) − x_f = 0   (3.4)

becomes an explicit relation.

Soft Constraint Problem (SCP): The terminal boundary condition is completely undetermined, or partially specified by a terminal hyperplane ψ(x(t_f), t_f) = 0. It is determined indirectly by minimizing φ(x(t_f), t_f) and/or satisfying ψ(x(t_f), t_f) = 0.

Given this problem statement, the objective is to find the optimal cost and the OFC law for an arbitrary initial point (x, t) on a given domain of interest in R^n × R, compatible with the system dynamics and terminal constraint. Then, from any initial point, we can evaluate the optimal trajectory by simple forward integration of the system (3.2), updating the control as new state measurements are made. For that purpose, we start from the 1st order necessary conditions for optimality.

Theorem 3.1 (Necessary Conditions for Optimality) Let the pre-Hamiltonian H be defined such that

H(x, λ, u, t) = L(x, u, t) + λ^T f(x, u, t)   (3.5)

where λ is the costate adjoint to f. Then Pontryagin's principle provides the following 1st order necessary conditions for optimality:

ẋ = ∂H(x, λ, u, t)/∂λ   (3.6)
λ̇ = −∂H(x, λ, u, t)/∂x   (3.7)
u = arg min_ū H(x, λ, ū, t)   (3.8)

Proof: Refer to Pontryagin [2]. (Q.E.D.)

Substituting (3.8) into (3.5), (3.6), and (3.7) results in a standard Hamiltonian system for the state and costate only:

H(x, λ, t) = H(x, λ, arg min_ū H(x, λ, ū, t), t)   (3.9)
ẋ = ∂H(x, λ, t)/∂λ   (3.10)
λ̇ = −∂H(x, λ, t)/∂x   (3.11)

Evaluating the optimal trajectory corresponds to solving this system of ordinary differential equations (ODEs) subject to the given boundary conditions. For the HCP, the initial state x_0 and the terminal state x_f are given explicitly, and the initial costate λ_0 and the terminal costate λ_f must be determined. For the SCP, the initial state is given, whereas the terminal state, initial costate, and terminal costate must be determined. In this case the transversality condition relates the terminal state and costate, and provides n additional terminal boundary conditions through the following relation [44, Chapter 2]:

λ(t_f) = ∂[φ(x(t_f), t_f) + ν^T ψ(x(t_f), t_f)]/∂x(t_f)   (3.12)

where ν ∈ R^p is a Lagrange multiplier adjoint to ψ. In both cases, we need to solve this system of ODEs with the same number of split boundary conditions. Hence the OFC problem is reduced to a TPBVP of the Hamiltonian system.
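To make this reduction concrete, the following sketch (not part of the original development) solves such a TPBVP by the conventional single-shooting iteration on the unknown initial costate, assuming a double integrator ẋ_1 = x_2, ẋ_2 = u with L = u²/2 and a hard terminal constraint; all numerical values are illustrative.

# A sketch of the conventional route: solving the HCP TPBVP by single
# shooting on the unknown initial costate.  Assumed example: double
# integrator with L = u**2/2, so (3.8) gives u = -lambda_2 and the
# system (3.9)-(3.11) is linear.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

t0, tf = 0.0, 1.0
x0 = np.array([1.0, 0.0])    # given initial state
xf = np.array([0.0, 0.0])    # hard terminal constraint x(tf) = xf

def ham_sys(t, z):
    x1, x2, l1, l2 = z
    return [x2, -l2, 0.0, -l1]      # u = -l2;  costate: ldot = -dH/dx

def terminal_miss(lam0):
    z0 = np.concatenate([x0, lam0])
    sol = solve_ivp(ham_sys, (t0, tf), z0, rtol=1e-10, atol=1e-12)
    return sol.y[:2, -1] - xf       # miss distance at tf

lam0 = fsolve(terminal_miss, np.zeros(2))   # iterate on the costate guess
print(lam0)                                 # [12. 6.] for these numbers

The iteration converges easily here because the system is linear; the next section develops the generating function alternative precisely to avoid such costate guessing and to obtain feedback rather than open loop solutions.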

Finally, regarding free terminal time problems, where t_f is free and to be determined in the process of optimization: they can be reformulated as problems with fixed terminal time by augmenting a new state as the varying time index [45]. Otherwise, if we wish to analyze them in their original form, we are provided with an additional transversality condition for the free terminal time [44, Chapter 2]:

H(t_f) + ∂φ(x(t_f), t_f)/∂t_f = 0   (3.13)

3.2 Optimal Feedback Solution by Generating Functions

Given the TPBVP of the Hamiltonian system (3.9)-(3.11), we can usually solve it by a variety of standard numerical techniques. However, this generally requires an iterative procedure with an initial guess for the costate, which has no physical interpretation in general. Furthermore, it usually yields the open loop solution for a specific boundary condition, which does not fit our goal of obtaining the OFC law on a given domain of interest. Prompted by this limitation, we now develop a new solution methodology based on the Hamilton-Jacobi theory of canonical transformations. As a first step, we view the Hamiltonian phase flow as a transformation between the moving terminal coordinates (x(t), λ(t)) and the fixed initial coordinates (x_0(t), λ_0(t)) for the time span t ∈ [t_0, t_f]. Noting that the latter coordinates (x_0(t), λ_0(t)) ≡ (x_0(t_0), λ_0(t_0)) are all constant, we see that they form a Hamiltonian system with Hamiltonian trivially defined as H_0(x_0, λ_0, t_0) ≡ 0, and that the transformation between these two coordinate systems is canonical simply by definition. Then, referring to the previous chapter, we can derive generating functions and their associated relations for this canonical transformation:

λ = ∂F_1(x, x_0, t; t_0)/∂x   (3.14)
λ_0 = −∂F_1(x, x_0, t; t_0)/∂x_0   (3.15)
0 = ∂F_1(x, x_0, t; t_0)/∂t + H(x, ∂F_1(x, x_0, t; t_0)/∂x, t)   (3.16)

λ = ∂F_2(x, λ_0, t; t_0)/∂x   (3.17)
x_0 = ∂F_2(x, λ_0, t; t_0)/∂λ_0   (3.18)
0 = ∂F_2(x, λ_0, t; t_0)/∂t + H(x, ∂F_2(x, λ_0, t; t_0)/∂x, t)   (3.19)

x = −∂F_3(λ, x_0, t; t_0)/∂λ   (3.20)
λ_0 = −∂F_3(λ, x_0, t; t_0)/∂x_0   (3.21)
0 = ∂F_3(λ, x_0, t; t_0)/∂t + H(−∂F_3(λ, x_0, t; t_0)/∂λ, λ, t)   (3.22)

x = −∂F_4(λ, λ_0, t; t_0)/∂λ   (3.23)
x_0 = ∂F_4(λ, λ_0, t; t_0)/∂λ_0   (3.24)
0 = ∂F_4(λ, λ_0, t; t_0)/∂t + H(−∂F_4(λ, λ_0, t; t_0)/∂λ, λ, t)   (3.25)

Note that in each generating function we have introduced a constant parameter t_0, representing the initial time, for later application. It is also observed that in (3.16) and (3.19) the λ's in the Hamiltonian have been replaced by ∂F_1/∂x and ∂F_2/∂x respectively, and that similarly in (3.22) and (3.25) the x's in the Hamiltonian have been replaced by −∂F_3/∂λ and −∂F_4/∂λ respectively; all of them are now equations for their own generating functions, and are referred to as the Hamilton-Jacobi equations (HJEs).
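Before proceeding, these relations can be exercised on a small example. The following symbolic sketch assumes the scalar problem ẋ = u with L = u²/2, for which the reduced Hamiltonian (3.9) is H(x, λ) = −λ²/2; the closed forms for F_2 and F_1 below are introduced purely for illustration, and the Legendre link (2.7) is used to pass between them.

# A small symbolic check of (3.17)-(3.19) and of the Legendre conversion
# to F1, assuming the scalar problem xdot = u with L = u**2/2, whose
# reduced Hamiltonian (3.9) is H(x, lam) = -lam**2/2.  The closed forms
# below are assumed for illustration only.
import sympy as sp

x, x0, lam0, t, t0 = sp.symbols('x x_0 lambda_0 t t_0')
H = lambda xx, ll: -ll**2 / 2

F2 = x*lam0 + lam0**2 * (t - t0) / 2          # candidate F2(x, lam0, t; t0)
assert sp.simplify(sp.diff(F2, t) + H(x, sp.diff(F2, x))) == 0    # HJE (3.19)

# Legendre conversion (2.7): eliminate lam0 via x0 = dF2/dlam0 (3.18),
# then F1 = F2 - x0*lam0.
lam0_sol = sp.solve(sp.Eq(x0, sp.diff(F2, lam0)), lam0)[0]
F1 = sp.simplify((F2 - x0*lam0).subs(lam0, lam0_sol))
assert sp.simplify(F1 + (x - x0)**2 / (2*(t - t0))) == 0    # F1 = -(x-x0)^2/(2(t-t0))
assert sp.simplify(sp.diff(F1, t) + H(x, sp.diff(F1, x))) == 0    # HJE (3.16)

The final assertion confirms that the converted F_1 satisfies its own HJE (3.16), exactly as claimed for the general case.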

Alternatively, we can view the Hamiltonian phase flow as a canonical transformation between the fixed terminal coordinates (x_f, λ_f, t_f) and the moving initial coordinates (x, λ, t), and derive relations similar to the above. For example, the following relations hold for the F_1 generating function of this canonical transformation:

λ_f = −∂F_1(x_f, x, t_f; t)/∂x_f   (3.26)
λ = ∂F_1(x_f, x, t_f; t)/∂x   (3.27)
0 = ∂F_1(x_f, x, t_f; t)/∂t + H(x, ∂F_1(x_f, x, t_f; t)/∂x, t)   (3.28)

Note the slight differences in the signs of terms. This HJE is used in the next section to prove the sufficiency of our solution methodology.

Suppose we have found these generating functions. Then, with the above relations, we can develop a solution methodology for the TPBVP for any general type of boundary conditions. Among these varieties, the choice of the appropriate one depends on the type of boundary condition. For the HCP, F_1(x, x_0, t; t_0) is the most appropriate choice, as we know the initial and terminal states. Indeed, having computed F_1, we can directly evaluate the initial and final costates from the associated relations (3.14)-(3.15) simply by partial differentiation:²

λ_f = ∂F_1(x, x_0, t; t_0)/∂x |_{t=t_f, x=x_f} = ∂F_1(x_f, x_0, t_f; t_0)/∂x_f   (3.29)
λ_0 = −∂F_1(x, x_0, t; t_0)/∂x_0 |_{t=t_f, x=x_f} = −∂F_1(x_f, x_0, t_f; t_0)/∂x_0   (3.30)

Furthermore, since any time t ≤ t_f can be taken as the initial time, this relation should hold for arbitrary current state x = x(t) and costate λ = λ(t); in terms of the generating function of the transformation (3.26)-(3.28),

λ = ∂F_1(x_f, x, t_f; t)/∂x.   (3.31)

² In fact, any other generating function can be adopted to solve the HCP. For that purpose, however, using F_2, F_3, or F_4 requires us to solve a set of implicit equations as well as to take partial derivatives, whereas employing F_1 necessitates only partial differentiation. This computational advantage is the justification for using F_1 for the HCP.

Substitution of (3.31) into the optimality condition (3.8) yields the OFC law for the HCP:

u = arg min_ū H(x, ∂F_1(x_f, x, t_f; t)/∂x, ū, t)   (3.32)

For the SCP it is not immediately apparent which generating function is the most appropriate, as we have 3n unknown boundary conditions (λ_0, x_f, λ_f) and p unknown Lagrange multipliers (ν). Whichever we may choose, we cannot determine the unknowns directly by partial differentiation, but need to solve a set of implicit algebraic equations. However, as we are interested in both the HCP and the SCP, we choose F_1 to avoid evaluating an additional generating function. Consider again the 2n relations (3.29)-(3.30) for F_1, along with the n transversality conditions (3.12) and the p terminal constraints (3.3). If we regard (x_0, t_0, t_f) as independent parameters, these are (3n + p) equations for the same number of unknowns (x_f, λ_f, λ_0, ν). Thus, under the mild assumptions of the implicit function theorem, the unknowns can be rearranged as functions of the known parameters (x_0, t_0, t_f):

x_f = x_f(x_0, t_f, t_0)   (3.33)
λ_f = λ_f(x_0, t_f, t_0)   (3.34)
λ_0 = λ_0(x_0, t_f, t_0)   (3.35)
ν = ν(x_0, t_f, t_0)   (3.36)

Then, as in the case of the HCP, introducing (3.35) into the optimality condition (3.8) and removing the subscript 0 from all initial quantities leads to the OFC law for the SCP:

u = arg min_ū H(x, ∂F_1(x_f, x, t_f; t)/∂x |_{x_f = x_f(x, t_f, t)}, ū, t)   (3.37)
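As a minimal symbolic check of the HCP law (3.31)-(3.32), consider again the scalar problem ẋ = u with L = u²/2 and hard constraint x(t_f) = x_f; the closed form F_1(x_f, x, t_f; t) = (x_f − x)²/(2(t_f − t)) is assumed for illustration (by the next section it coincides with the optimal cost-to-go of this problem).

# A minimal symbolic check of (3.31)-(3.32), assuming the scalar problem
# xdot = u, L = u**2/2 with hard constraint x(tf) = xf.  The closed form
# F1 below is assumed for illustration.
import sympy as sp

x, xf, t, tf, u = sp.symbols('x x_f t t_f u')
F1 = (xf - x)**2 / (2*(tf - t))

lam = sp.diff(F1, x)                                  # relation (3.31)
ustar = sp.solve(sp.diff(lam*u + u**2/2, u), u)[0]    # minimizer of (3.5)
print(ustar)    # (xf - x)/(tf - t): a genuine feedback law in (x, t)

# F1 also solves the HJE (3.28), where H(x, lam) = min_u [lam*u + u**2/2]
# equals -lam**2/2 for this problem:
assert sp.simplify(sp.diff(F1, t) - lam**2/2) == 0

The resulting control steers the current state toward x_f over the remaining horizon, and it is a function of (x, t) alone, illustrating the feedback character of (3.32).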

Finally, in order to apply the above procedure to problems with varying terminal time, we can reformulate them as fixed terminal time problems [45]. Otherwise, in the original formulation, the above procedure can be generalized with the aid of the transversality condition (3.13), which, together with (3.29)-(3.30) for the HCP or (3.33)-(3.36) for the SCP, composes a set of simultaneous algebraic equations for the same number of unknowns.

So far, employing generating functions, we have shown that unknown boundary conditions can be evaluated by a series of partial differentiations and algebraic manipulations, without solving any differential equations. The optimal trajectory can then be evaluated by simple forward/backward integration with a pair of known initial/terminal conditions. Furthermore, from the F_1 generating function we have developed general formulae for the OFC laws for both the HCP and the SCP, i.e., for general types of boundary conditions. Recall, however, that these control laws have been derived from the necessary conditions for optimality; we have not yet considered sufficiency, so our control laws remain candidates. The next section is mainly dedicated to the formal proof that our candidate control laws satisfy the sufficient conditions for optimality, and thus are indeed optimal.

3.3 Optimal Cost Function via the F_1 Generating Function

The discussion in the previous section suggests that the F_1 generating function contains all the information for the optimal control system for all types of boundary conditions. Motivated by this suggestion, we further explore its properties, specifically by seeking its relation to the optimal cost function. In doing so, we are naturally led to derive the sufficiency of the OFC laws obtained from F_1.

We begin by stating the sufficient conditions for optimality from Bellman's well-known dynamic programming theory.

Theorem 3.2 (Sufficient Conditions for Optimality) Suppose the following two conditions hold:

1. In the domain considered for (x, t), the pre-Hamiltonian (3.5) has a unique minimizer with respect to u, such that

u = arg min_ū H(x, ∂J/∂x, ū, t)

2. J(x, t) is sufficiently smooth (or analytic) and satisfies the Hamilton-Jacobi-Bellman equation (HJBE) with the given terminal boundary condition:

∂J(x, t)/∂t + min_ū H(x, ∂J/∂x, ū, t) = 0   (3.38)
J(x(t_f), t_f) = φ(x(t_f), t_f) on ψ(x(t_f), t_f) = 0

Then J is the optimal cost and u is the corresponding OFC law.

Proof: Refer to Bryson and Ho [44], Athans and Falb [46], etc.

The following key theorem establishes the connection between the optimal cost function and the F_1 generating function, and proves the sufficiency of our OFC laws.

Theorem 3.3 (Optimal Cost Derived from the F_1 Generating Function) For both the HCP and the SCP, the function

V(x, t) = F_1(x_f, x, t_f; t) + φ(x_f, t_f) on ψ(x_f, t_f) = 0

satisfies the HJBE and the associated boundary condition (3.38); thus it is the optimal cost function. Furthermore, the OFC can be expressed as

u = arg min_ū H(x, ∂V(x, t)/∂x, ū, t)

Proof: We prove only the fixed terminal time case, as the free terminal time problem can be converted into the former [45]. Also, without loss of generality, we retain the notation ψ(x_f, t_f) = 0 even for the HCP, not rewriting it as the explicit relation x(t_f) = x_f. This enables us to treat the HCP and the SCP together.

First, referring to the relations (3.33)-(3.36), we can rewrite the candidate optimal cost function, with the Lagrange multiplier ν adjoined, as

V(x, t) = F_1(x_f(x, t_f, t), x, t_f; t) + φ(x_f(x, t_f, t), t_f) + ν(x, t_f, t)^T ψ(x_f(x, t_f, t), t_f)

Taking the partial derivative of V with respect to t and applying the chain rule, we obtain

∂V/∂t = (∂F_1/∂x_f + ∂φ/∂x_f + (∂ψ/∂x_f)^T ν)^T (∂x_f/∂t) + ∂F_1/∂t + (∂ν/∂t)^T ψ   (3.39)

Now, equating the F_1-associated relation (3.26) with the transversality condition (3.12), we have

∂F_1/∂x_f = −(∂φ/∂x_f + (∂ψ/∂x_f)^T ν)   (3.40)

Then, introducing (3.40) into (3.39) and forcing ψ = 0, we obtain the significantly reduced expression

∂V/∂t = ∂F_1/∂t   (3.41)

Similarly, taking the partial derivative of V with respect to x and repeating the same procedure results in

∂V/∂x = ∂F_1/∂x + (∂x_f/∂x)^T (∂F_1/∂x_f + ∂φ/∂x_f + (∂ψ/∂x_f)^T ν) + (∂ν/∂x)^T ψ = ∂F_1/∂x   (3.42)

Finally, substituting (3.41) and (3.42) into the HJE (3.28) yields

0 = ∂V/∂t + H(x, ∂V/∂x, t) = ∂V/∂t + min_ū H(x, ∂V/∂x, ū, t),   (3.43)

which is indeed the HJBE for V in Theorem 3.2.

Now consider the boundary condition for the candidate cost function V. As it is represented by the F_1 generating function, we should define a functional form for F_1 which reflects this boundary condition at t = t_f. Noting that F_1 represents a canonical transformation between the initial and terminal coordinates, we see that it should define the simple identity

x = x_f,   λ = λ_f   at t = t_f

Here lies a difficulty, as F_1 cannot realize such an identity transformation at t = t_f; it becomes singular as t → t_f, since its independent arguments, the initial and terminal states, are equal and no longer independent at t = t_f.³ Instead, we can easily define the functional form of the F_2 which realizes this identity; indeed, the form F_2 = x^T λ_f generates the identity transformation by the relations analogous to (3.17) and (3.18):

λ = ∂F_2/∂x = λ_f,   x_f = ∂F_2/∂λ_f = x

Thus we know the form of the F_2 at the terminal time. Using this, we can indirectly determine the value of F_1 at the terminal time from the Legendre transformation (2.7):

F_1(x_f, x, t_f; t) |_{x=x_f, t=t_f} = [F_2(x, λ_f, t_f; t) − x_f^T λ_f] |_{x=x_f, t=t_f} ≡ 0   (3.44)

³ In fact, this is equivalent to the statement that the optimal cost function becomes singular at the terminal time for the HCP.

Also note that evaluating the expression (3.33) at the terminal time simply yields the identity

x_f(x, t_f, t) |_{x=x_f, t=t_f} = x_f   (3.45)

Considering (3.44) and (3.45) together, we obtain

V(x, t) |_{x=x_f, t=t_f} = [F_1(x_f(x, t_f, t), x, t_f; t) + φ(x_f(x, t_f, t), t_f) + ν(x, t_f, t)^T ψ(x_f(x, t_f, t), t_f)] |_{x=x_f, t=t_f} = φ(x_f, t_f) on ψ(x_f, t_f) = 0   (3.46)

which satisfies the boundary condition given in Theorem 3.2. Together, (3.43) and (3.46) establish that V = F_1 + φ is indeed the optimal cost function. The OFC law was determined in (3.32) and (3.37), which completes the proof. (Q.E.D.)

Note that for the HCP, where φ(x(t_f), t_f) does not exist in general and ψ(x(t_f), t_f) becomes the explicit relation x(t_f) = x_f, the optimal cost function simply reduces to

V(x, t) = F_1(x_f, x, t_f; t)

Though we employ the same F_1 generating function for both the HCP and the SCP, the optimal cost function and the associated OFC law are quite different from each other; F_1 is used directly for the HCP, whereas we introduce the relation (3.33) into F_1 for the SCP. Also note that once F_1 is determined, all the additional steps in finding the optimal cost function for the SCP involve only algebraic manipulations and partial differentiations, without solving any additional differential equations. This observation provides a substantial advantage of our solution technique over the classical dynamic programming approach, in which we need to solve the difficult HJBE repetitively for varying types of boundary conditions. All these results imply that

the F_1 generating function can be identified as a fundamental function lying behind the optimal control system, which allows us to analyze the OFC problem within the comprehensive field of Hamiltonian system theory. Given all these systematic procedures and theoretical justifications of our methodology based on the Hamilton-Jacobi theory, our remaining task is obvious: how to actually find the F_1 generating function. This is the main subject of the subsequent chapters.

CHAPTER IV

Numerical Implementations for a Class of Analytic Problems

In the previous chapter, for a general optimal control problem, we developed a method for obtaining the optimal feedback control (OFC) strategy from one scalar quantity, the F_1 generating function. This being stated, the practical validity of our proposed approach obviously relies on the attainability of its solution. There might be a variety of ways. It could be obtained somewhat luckily, by intuition, for some simple systems. Sometimes we may work inversely: by perturbing or adding some terms to a known generating function, we may devise a new one in the hope that it represents the system we desire to study. In general, however, finding the F_1 generating function corresponds to solving the Hamilton-Jacobi equation (HJE) defined for it, represented by (3.16). Observe that this usually becomes a nonlinear partial differential equation (PDE), depending upon the varying structure of the associated Hamiltonian. Mostly its form does not fall into any familiar PDE category with a known solution form. Furthermore, we also need to find at least one initial/boundary condition from which the solution propagates both temporally and spatially. All these considerations suggest that solving the HJE should be a very formidable task even for some simple systems, and it is indeed so.

Recently Guibout and Scheeres, while studying the two point boundary value problem (TPBVP) of Hamiltonian systems, developed a systematic procedure for solving the HJEs for generating functions for a class of problems [34, 35]. In this chapter, relying on their contribution to a certain extent, we particularize their method to our OFC problem. Specifically, we delineate how to compute the F_1 generating function in the context of the optimal control system, combining this with the theoretical development of the previous chapter on how to determine the optimal cost and the associated OFC law. A classical linear quadratic problem is illustrated as an example of both the hard constraint problem (HCP) and the soft constraint problem (SCP), where the whole procedure is legitimized by showing that our solution recovers the well-known solution from the Riccati transformation technique. With this justification, we extensively study optimal rendezvous maneuvers in a central gravity field. Finally, we demonstrate that our method is well adapted to the study of underactuated systems.

4.1 Optimal Feedback Control Law in Series Form

Simply stated, the main difficulty throughout the whole procedure of finding the OFC law by our approach lies in finding the F_1 generating function. Once it is found, all the remaining steps involve only partial differentiations of a known function and algebraic manipulations of implicit/explicit equations. These post-processes might be lengthy and laborious if the system is of relatively high dimension. However, we do not need to solve any additional ordinary/partial differential equations for varying boundary conditions, which provides a substantial advantage of our proposed method. Recently Guibout and Scheeres showed that the generating functions, if they exist in analytic form, can be solved for as power series expansions in their respective arguments [34, 35].

The coefficients of these power series, generally functions of the temporal variable, can be obtained from a set of ordinary differential equations (ODEs) derived from the associated HJEs. Carrying out their method, however, requires that some restrictions be placed on the dynamical system and performance index. Guibout and Scheeres' computational scheme constructs local solutions of the generating functions, i.e., expands them as power series about a specific nominal trajectory. This implies that a solution must be found in advance for an optimal control problem of interest, and that the proposed series-based technique operates in the vicinity of this nominal trajectory. Finding a nominal solution is, in itself, a possibly difficult task. However, note that it is often achieved by simply forcing null control in many practical situations. For example, the unforced zero equilibrium, satisfying f(x = 0, u = 0, t) = 0 in (3.2), can be selected as a nominal optimal trajectory for problems where the performance index is a norm of the control variables. Another critical restriction is the analyticity of the Hamiltonian, as it is expanded as a power series in the states and costates about a nominal solution. Specifically, this places the requirement that L in the performance index (3.1) and f in the system dynamics (3.2) be analytic, and that the control u be unbounded, since they compose the Hamiltonian through the Pontryagin principle.¹ Finally, we assume that the system (3.2) is controllable. In summary, these preconditions for applicability are enumerated as follows:

1. A nominal optimal solution is known. Otherwise, the unforced equilibrium, satisfying f(x = 0, u = 0, t) = 0 in (3.2), provides a trivial nominal solution.

¹ It should be noted that even with all these strong assumptions about analyticity, the convergence of the series solution for the generating functions is not always guaranteed. In some special circumstances, including resonance phenomena, it may become suspect as time evolves, in which case our series-based method should be executed with caution.

2. L in the performance index (3.1) and f in the system dynamics (3.2) are analytic.

3. u is unbounded.

4. The system (3.2) is controllable.

With these strong but natural assumptions satisfied, we expand the Hamiltonian H in (3.9) as a power series with respect to the states and costates, and the F_1 generating function in (3.14)-(3.16) with respect to the initial and final states. These series are substituted into the HJE (3.16), resulting in a polynomial equation in the initial and terminal states. A balancing technique is then used to equate all like powers of these states to zero, which defines a set of recursive ODEs for the coefficients of the F_1 series. A major problem in this approach, however, lies in determining the initial/boundary conditions from which to initiate the integration of these ODEs. Observing that they are composed of the coefficients of F_1, we must find the associated initial/boundary conditions from F_1. In general, the one and only clue comes from the definition of our canonical transformation representing the Hamiltonian phase flow: the simple identity (x, λ) = (x_f, λ_f) at t = t_f should be satisfied. However, recall from the proof of Theorem 3.3 that this cannot be realized by F_1; its independent arguments, the initial and terminal states, lose their independence and the function becomes singular as t → t_f. The proof of Theorem 3.3 again provides a hint for how to circumvent this inherent obstacle. We first solve for a different kind of generating function with a well-defined functional form for this identity, which is then converted into F_1 by the Legendre transformations (2.7)-(2.9). For example, we can launch the series-based technique

for F_2, as it does not suffer from this singularity:² the identity transformation is well defined by the functional form F_2 = x^T λ_0, which in turn provides the initial condition for the associated ODEs. Then the Legendre transformation (2.7), along with (3.17)-(3.18), enables us to convert F_2 into F_1 by purely algebraic manipulations.³ In general, a power series expansion for any kind of generating function can be transformed into a different kind of generating function through the Legendre transformations. So far it has been described how Guibout and Scheeres' series-based scheme is adapted to our OFC problem, under prerequisites to be satisfied in advance: attainability of a reference solution, analyticity of the Hamiltonian, controllability of the dynamical system, etc. As an introductory demonstration of this approach, we now analyze the well-known linear quadratic problem, for which the solution procedure can be detailed explicitly.

40 29 Here Q f R n n 0 represents terminal time weight for state, Q(t) : R R n n 0 full time weight for state, R(t) : R R n n > 0 full time weight for control, A(t) : R R n n system matrix, and B(t) : R R n m control matrix. The initial condition is given as a fixed point x(t 0 ) = x 0 and the terminal condition is represented by a hyper plane Mx(t f ) β = 0. (4.2) where M R p n, β R p, and p n. The mathematical expression for the terminal condition differs between the HCP and the SCP: HCP: Q f = 0 n n, M = I n n, and β = x f R n is assumed to be fixed. SCP: Among many possible representations of SCP, we choose M 0 n n and β 0 n 1, which leads to a trivial identity and implies that the terminal condition is completely unspecified. Without loss of generality, this provides a mathematical simplicity for the transversality condition (3.12), as we do not need to adjoin the Lagrange multiplier to the terminal hyper plane (4.2). We start by defining the pre-hamiltonian and deriving the 1st order necessary conditions for optimality from Theorem 3.1: H(x, λ, u, t) = 1 2 (xt Qx + u T Ru) + λ T (Ax + Bu) (4.3) λ = Qx A T λ (4.4) u = R 1 B T λ. (4.5)

41 30 Substitution of (4.5) into (4.1), (4.3), and (4.4) eliminates the control variable u in each expression, leading to a standard Hamiltonian system for states and costates: H(x, λ, t) = 1 2 x λ T Q A AT BR 1 B T x λ (4.6) ẋ λ = A BR 1 B T Q A T x λ (4.7) Following the previously described procedure, we first evaluate the F 2 (x, λ 0, t; t 0 ) instead of the F 1 (x, x 0, t; t 0 ). Here note that the Hamiltonian (4.6) is only quadratic for states and costates, which enables us to express the F 2 also in quadratic form: F 2 (x, λ 0, t; t 0 ) = 1 2 x λ 0 Now recall the relation (3.17): T λ = F [ 2 x = F xx(t; t 0 ) F xλ0 (t; t 0 ) F λ0 x(t; t 0 ) F λ0 λ 0 (t; t 0 ) ] F xx F xλ0 x λ 0 x λ 0 (4.8) with which we can express the Hamiltonian (4.6) as a function of (x, λ 0 ): T H = 1 x I F xx Q AT I 0 x (4.9) 2 λ 0 0 F λ0 x A BR 1 B T F xx F xλ0 λ 0 Observe that H and F 2 above share the same arguments (x, λ 0 ) now. Introducing (4.8) and (4.9) to the HJE (3.19) results in T 0 = x F xx F xλ0 λ 0 F λ0 x F λ0 λ 0 + I F xx Q AT 0 F λ0 x A BR 1 B T I 0 F xx F xλ0 x, λ 0

42 31 whose sub-matrix components provide a set of matrix ODEs for F xx, F xλ0 = F T λ 0 x, and F λ0 λ 0 : 0 = F xx + Q + F xx A + A T F xx F xx BR 1 B T F xx 0 = F xλ0 + A T F xλ0 F xx BR 1 B T F xλ0 (4.10) 0 = F λ0 λ 0 F λ0 xbr 1 B T F xλ0 The corresponding initial conditions are determined from the functional form F 2 (x, λ 0, t 0 ; t 0 ) = x T λ 0 representing the identity transformation. Introducing this expression into (4.8) and comparing the coefficients of the same kind of terms, we obtain F xx (t 0 ; t 0 ) = 0 n n F xλ0 (t 0 ; t 0 ) = I n n (4.11) F λ0 λ 0 (t 0 ; t 0 ) = 0 n n. The solution of the ODEs (4.10) with the initial conditions (4.11) explicitly determines the F 2 by (4.8). It remains to convert it into the F 1 by the Legendre transformation (2.7): F 1 (x, x 0, t; t 0 ) = F 2 (x, λ 0, t; t 0 ) x T 0 λ 0 = 1 2 x λ 0 T F xx F xλ0 x λ 0 x T 0 λ 0. F λ0 x F λ0 λ 0 With the aid of the relation (3.18), x 0 = F 2 λ 0 = F λ0 xx + F λ0 λ 0 λ 0, this expression becomes, after some algebraic manipulations 4 T F 1 (x, x 0, t; t 0 ) = 1 x F xx F xλ0 F 1 λ 0 λ 0 F λ0 x 2 x 0 F 1 λ 0 λ 0 F λ0 x F xλ0 F 1 λ 0 λ 0 F 1 λ 0 λ 0 x x 0. (4.12) 4 Observe that the F 1 becomes singular due to the singularity of F 1 λ 0 λ 0 only at t = t 0. In this problem, it can be easily shown that there does not exist any other singularities.


More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

Numerical Algorithms as Dynamical Systems

Numerical Algorithms as Dynamical Systems A Study on Numerical Algorithms as Dynamical Systems Moody Chu North Carolina State University What This Study Is About? To recast many numerical algorithms as special dynamical systems, whence to derive

More information

1 The Problem of Spacecraft Trajectory Optimization

1 The Problem of Spacecraft Trajectory Optimization 1 The Problem of Spacecraft Trajectory Optimization Bruce A. Conway Dept. of Aerospace Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 1.1 Introduction The subject of spacecraft trajectory

More information

OPTIMIZING PERIAPSIS-RAISE MANEUVERS USING LOW-THRUST PROPULSION

OPTIMIZING PERIAPSIS-RAISE MANEUVERS USING LOW-THRUST PROPULSION AAS 8-298 OPTIMIZING PERIAPSIS-RAISE MANEUVERS USING LOW-THRUST PROPULSION Brenton J. Duffy and David F. Chichka This study considers the optimal control problem of maximizing the raise in the periapsis

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012 NOTES ON CALCULUS OF VARIATIONS JON JOHNSEN September 13, 212 1. The basic problem In Calculus of Variations one is given a fixed C 2 -function F (t, x, u), where F is defined for t [, t 1 ] and x, u R,

More information

1 Lagrange Multiplier Method

1 Lagrange Multiplier Method 1 Lagrange Multiplier Method Near a maximum the decrements on both sides are in the beginning only imperceptible. J. Kepler When a quantity is greatest or least, at that moment its flow neither increases

More information

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control RTyrrell Rockafellar and Peter R Wolenski Abstract This paper describes some recent results in Hamilton- Jacobi theory

More information

Pontryagin s maximum principle

Pontryagin s maximum principle Pontryagin s maximum principle Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 2012 Emo Todorov (UW) AMATH/CSE 579, Winter 2012 Lecture 5 1 / 9 Pontryagin

More information

Robotics. Control Theory. Marc Toussaint U Stuttgart

Robotics. Control Theory. Marc Toussaint U Stuttgart Robotics Control Theory Topics in control theory, optimal control, HJB equation, infinite horizon case, Linear-Quadratic optimal control, Riccati equations (differential, algebraic, discrete-time), controllability,

More information

S2500 ANALYSIS AND OPTIMIZATION, SUMMER 2016 FINAL EXAM SOLUTIONS

S2500 ANALYSIS AND OPTIMIZATION, SUMMER 2016 FINAL EXAM SOLUTIONS S25 ANALYSIS AND OPTIMIZATION, SUMMER 216 FINAL EXAM SOLUTIONS C.-M. MICHAEL WONG No textbooks, notes or calculators allowed. The maximum score is 1. The number of points that each problem (and sub-problem)

More information

The Implicit Function Theorem with Applications in Dynamics and Control

The Implicit Function Theorem with Applications in Dynamics and Control 48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition 4-7 January 2010, Orlando, Florida AIAA 2010-174 The Implicit Function Theorem with Applications in Dynamics

More information

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Optimal Control Ömer Özak SMU Macroeconomics II Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Review of the Theory of Optimal Control Section 1 Review of the Theory of Optimal Control Ömer

More information

Adaptive Nonlinear Model Predictive Control with Suboptimality and Stability Guarantees

Adaptive Nonlinear Model Predictive Control with Suboptimality and Stability Guarantees Adaptive Nonlinear Model Predictive Control with Suboptimality and Stability Guarantees Pontus Giselsson Department of Automatic Control LTH Lund University Box 118, SE-221 00 Lund, Sweden pontusg@control.lth.se

More information

A Remark on IVP and TVP Non-Smooth Viscosity Solutions to Hamilton-Jacobi Equations

A Remark on IVP and TVP Non-Smooth Viscosity Solutions to Hamilton-Jacobi Equations 2005 American Control Conference June 8-10, 2005. Portland, OR, USA WeB10.3 A Remark on IVP and TVP Non-Smooth Viscosity Solutions to Hamilton-Jacobi Equations Arik Melikyan, Andrei Akhmetzhanov and Naira

More information

Gauge Fixing and Constrained Dynamics in Numerical Relativity

Gauge Fixing and Constrained Dynamics in Numerical Relativity Gauge Fixing and Constrained Dynamics in Numerical Relativity Jon Allen The Dirac formalism for dealing with constraints in a canonical Hamiltonian formulation is reviewed. Gauge freedom is discussed and

More information

Optimal control problems with PDE constraints

Optimal control problems with PDE constraints Optimal control problems with PDE constraints Maya Neytcheva CIM, October 2017 General framework Unconstrained optimization problems min f (q) q x R n (real vector) and f : R n R is a smooth function.

More information

Optimal Control. with. Aerospace Applications. James M. Longuski. Jose J. Guzman. John E. Prussing

Optimal Control. with. Aerospace Applications. James M. Longuski. Jose J. Guzman. John E. Prussing Optimal Control with Aerospace Applications by James M. Longuski Jose J. Guzman John E. Prussing Published jointly by Microcosm Press and Springer 2014 Copyright Springer Science+Business Media New York

More information

EML5311 Lyapunov Stability & Robust Control Design

EML5311 Lyapunov Stability & Robust Control Design EML5311 Lyapunov Stability & Robust Control Design 1 Lyapunov Stability criterion In Robust control design of nonlinear uncertain systems, stability theory plays an important role in engineering systems.

More information

EN Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015

EN Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015 EN530.678 Nonlinear Control and Planning in Robotics Lecture 3: Stability February 4, 2015 Prof: Marin Kobilarov 0.1 Model prerequisites Consider ẋ = f(t, x). We will make the following basic assumptions

More information

Quadratic Stability of Dynamical Systems. Raktim Bhattacharya Aerospace Engineering, Texas A&M University

Quadratic Stability of Dynamical Systems. Raktim Bhattacharya Aerospace Engineering, Texas A&M University .. Quadratic Stability of Dynamical Systems Raktim Bhattacharya Aerospace Engineering, Texas A&M University Quadratic Lyapunov Functions Quadratic Stability Dynamical system is quadratically stable if

More information

Chapter III. Stability of Linear Systems

Chapter III. Stability of Linear Systems 1 Chapter III Stability of Linear Systems 1. Stability and state transition matrix 2. Time-varying (non-autonomous) systems 3. Time-invariant systems 1 STABILITY AND STATE TRANSITION MATRIX 2 In this chapter,

More information

The Linear Quadratic Regulator

The Linear Quadratic Regulator 10 The Linear Qadratic Reglator 10.1 Problem formlation This chapter concerns optimal control of dynamical systems. Most of this development concerns linear models with a particlarly simple notion of optimality.

More information

Damped harmonic motion

Damped harmonic motion Damped harmonic motion March 3, 016 Harmonic motion is studied in the presence of a damping force proportional to the velocity. The complex method is introduced, and the different cases of under-damping,

More information

Mathematical Economics. Lecture Notes (in extracts)

Mathematical Economics. Lecture Notes (in extracts) Prof. Dr. Frank Werner Faculty of Mathematics Institute of Mathematical Optimization (IMO) http://math.uni-magdeburg.de/ werner/math-ec-new.html Mathematical Economics Lecture Notes (in extracts) Winter

More information

for changing independent variables. Most simply for a function f(x) the Legendre transformation f(x) B(s) takes the form B(s) = xs f(x) with s = df

for changing independent variables. Most simply for a function f(x) the Legendre transformation f(x) B(s) takes the form B(s) = xs f(x) with s = df Physics 106a, Caltech 1 November, 2018 Lecture 10: Hamiltonian Mechanics I The Hamiltonian In the Hamiltonian formulation of dynamics each second order ODE given by the Euler- Lagrange equation in terms

More information

Optimal Control. McGill COMP 765 Oct 3 rd, 2017

Optimal Control. McGill COMP 765 Oct 3 rd, 2017 Optimal Control McGill COMP 765 Oct 3 rd, 2017 Classical Control Quiz Question 1: Can a PID controller be used to balance an inverted pendulum: A) That starts upright? B) That must be swung-up (perhaps

More information

Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations

Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations Martino Bardi Italo Capuzzo-Dolcetta Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations Birkhauser Boston Basel Berlin Contents Preface Basic notations xi xv Chapter I. Outline

More information

OPTIMAL CONTROL CHAPTER INTRODUCTION

OPTIMAL CONTROL CHAPTER INTRODUCTION CHAPTER 3 OPTIMAL CONTROL What is now proved was once only imagined. William Blake. 3.1 INTRODUCTION After more than three hundred years of evolution, optimal control theory has been formulated as an extension

More information

Sketchy Notes on Lagrangian and Hamiltonian Mechanics

Sketchy Notes on Lagrangian and Hamiltonian Mechanics Sketchy Notes on Lagrangian and Hamiltonian Mechanics Robert Jones Generalized Coordinates Suppose we have some physical system, like a free particle, a pendulum suspended from another pendulum, or a field

More information

Calculus of Variations

Calculus of Variations ECE 68 Midterm Exam Solution April 1, 8 1 Calculus of Variations This exam is open book and open notes You may consult additional references You may even discuss the problems (with anyone), but you must

More information

Topic # Feedback Control Systems

Topic # Feedback Control Systems Topic #17 16.31 Feedback Control Systems Deterministic LQR Optimal control and the Riccati equation Weight Selection Fall 2007 16.31 17 1 Linear Quadratic Regulator (LQR) Have seen the solutions to the

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information

Optimization-Based Control. Richard M. Murray Control and Dynamical Systems California Institute of Technology

Optimization-Based Control. Richard M. Murray Control and Dynamical Systems California Institute of Technology Optimization-Based Control Richard M. Murray Control and Dynamical Systems California Institute of Technology Version v2.1b (2 Oct 218) c California Institute of Technology All rights reserved. This manuscript

More information

PURE MATHEMATICS AM 27

PURE MATHEMATICS AM 27 AM SYLLABUS (2020) PURE MATHEMATICS AM 27 SYLLABUS 1 Pure Mathematics AM 27 (Available in September ) Syllabus Paper I(3hrs)+Paper II(3hrs) 1. AIMS To prepare students for further studies in Mathematics

More information

An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game

An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game Marzieh Khakestari (Corresponding author) Institute For Mathematical Research, Universiti Putra Malaysia, 43400

More information

Stochastic and Adaptive Optimal Control

Stochastic and Adaptive Optimal Control Stochastic and Adaptive Optimal Control Robert Stengel Optimal Control and Estimation, MAE 546 Princeton University, 2018! Nonlinear systems with random inputs and perfect measurements! Stochastic neighboring-optimal

More information

Theoretical Justification for LQ problems: Sufficiency condition: LQ problem is the second order expansion of nonlinear optimal control problems.

Theoretical Justification for LQ problems: Sufficiency condition: LQ problem is the second order expansion of nonlinear optimal control problems. ES22 Lecture Notes #11 Theoretical Justification for LQ problems: Sufficiency condition: LQ problem is the second order expansion of nonlinear optimal control problems. J = φ(x( ) + L(x,u,t)dt ; x= f(x,u,t)

More information

Time-Invariant Linear Quadratic Regulators Robert Stengel Optimal Control and Estimation MAE 546 Princeton University, 2015

Time-Invariant Linear Quadratic Regulators Robert Stengel Optimal Control and Estimation MAE 546 Princeton University, 2015 Time-Invariant Linear Quadratic Regulators Robert Stengel Optimal Control and Estimation MAE 546 Princeton University, 15 Asymptotic approach from time-varying to constant gains Elimination of cross weighting

More information

Classical Numerical Methods to Solve Optimal Control Problems

Classical Numerical Methods to Solve Optimal Control Problems Lecture 26 Classical Numerical Methods to Solve Optimal Control Problems Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore Necessary Conditions

More information

YURI LEVIN, MIKHAIL NEDIAK, AND ADI BEN-ISRAEL

YURI LEVIN, MIKHAIL NEDIAK, AND ADI BEN-ISRAEL Journal of Comput. & Applied Mathematics 139(2001), 197 213 DIRECT APPROACH TO CALCULUS OF VARIATIONS VIA NEWTON-RAPHSON METHOD YURI LEVIN, MIKHAIL NEDIAK, AND ADI BEN-ISRAEL Abstract. Consider m functions

More information

Control, Stabilization and Numerics for Partial Differential Equations

Control, Stabilization and Numerics for Partial Differential Equations Paris-Sud, Orsay, December 06 Control, Stabilization and Numerics for Partial Differential Equations Enrique Zuazua Universidad Autónoma 28049 Madrid, Spain enrique.zuazua@uam.es http://www.uam.es/enrique.zuazua

More information

Mathematical Methods of Physics I ChaosBook.org/ predrag/courses/phys Homework 1

Mathematical Methods of Physics I ChaosBook.org/ predrag/courses/phys Homework 1 PHYS 6124 Handout 6 23 August 2012 Mathematical Methods of Physics I ChaosBook.org/ predrag/courses/phys-6124-12 Homework 1 Prof. P. Goldbart School of Physics Georgia Tech Homework assignments are posted

More information

2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1

2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1 Lectures - Week 11 General First Order ODEs & Numerical Methods for IVPs In general, nonlinear problems are much more difficult to solve than linear ones. Unfortunately many phenomena exhibit nonlinear

More information

Integration of Differential Equations

Integration of Differential Equations Integration of Differential Equations Gabriel S. Cabrera August 4, 018 Contents 1 Introduction 1 Theory.1 Linear 1 st Order ODEs................................. 3.1.1 Analytical Solution...............................

More information

Hamiltonian. March 30, 2013

Hamiltonian. March 30, 2013 Hamiltonian March 3, 213 Contents 1 Variational problem as a constrained problem 1 1.1 Differential constaint......................... 1 1.2 Canonic form............................. 2 1.3 Hamiltonian..............................

More information

Static and Dynamic Optimization (42111)

Static and Dynamic Optimization (42111) Static and Dynamic Optimization (42111) Niels Kjølstad Poulsen Build. 303b, room 016 Section for Dynamical Systems Dept. of Applied Mathematics and Computer Science The Technical University of Denmark

More information

Calculus of Variation An Introduction To Isoperimetric Problems

Calculus of Variation An Introduction To Isoperimetric Problems Calculus of Variation An Introduction To Isoperimetric Problems Kevin Wang The University of Sydney SSP Working Seminars, MATH2916 May 4, 2013 Contents I Lagrange Multipliers 2 1 Single Constraint Lagrange

More information

Geometric Mechanics and Global Nonlinear Control for Multi-Body Dynamics

Geometric Mechanics and Global Nonlinear Control for Multi-Body Dynamics Geometric Mechanics and Global Nonlinear Control for Multi-Body Dynamics Harris McClamroch Aerospace Engineering, University of Michigan Joint work with Taeyoung Lee (George Washington University) Melvin

More information

Finite-Horizon Optimal State-Feedback Control of Nonlinear Stochastic Systems Based on a Minimum Principle

Finite-Horizon Optimal State-Feedback Control of Nonlinear Stochastic Systems Based on a Minimum Principle Finite-Horizon Optimal State-Feedbac Control of Nonlinear Stochastic Systems Based on a Minimum Principle Marc P Deisenroth, Toshiyui Ohtsua, Florian Weissel, Dietrich Brunn, and Uwe D Hanebec Abstract

More information

Pontryagin s Minimum Principle 1

Pontryagin s Minimum Principle 1 ECE 680 Fall 2013 Pontryagin s Minimum Principle 1 In this handout, we provide a derivation of the minimum principle of Pontryagin, which is a generalization of the Euler-Lagrange equations that also includes

More information

A Short Essay on Variational Calculus

A Short Essay on Variational Calculus A Short Essay on Variational Calculus Keonwook Kang, Chris Weinberger and Wei Cai Department of Mechanical Engineering, Stanford University Stanford, CA 94305-4040 May 3, 2006 Contents 1 Definition of

More information

Principles of Optimal Control Spring 2008

Principles of Optimal Control Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 16.323 Principles of Optimal Control Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 16.323 Lecture

More information

Linear-Quadratic Optimal Control: Full-State Feedback

Linear-Quadratic Optimal Control: Full-State Feedback Chapter 4 Linear-Quadratic Optimal Control: Full-State Feedback 1 Linear quadratic optimization is a basic method for designing controllers for linear (and often nonlinear) dynamical systems and is actually

More information

Numerical Methods for Constrained Optimal Control Problems

Numerical Methods for Constrained Optimal Control Problems Numerical Methods for Constrained Optimal Control Problems Hartono Hartono A Thesis submitted for the degree of Doctor of Philosophy School of Mathematics and Statistics University of Western Australia

More information

Homework Solution # 3

Homework Solution # 3 ECSE 644 Optimal Control Feb, 4 Due: Feb 17, 4 (Tuesday) Homework Solution # 3 1 (5%) Consider the discrete nonlinear control system in Homework # For the optimal control and trajectory that you have found

More information

by Vincent M. Guibout

by Vincent M. Guibout THE HAMILTON-JACOBI THEORY FOR SOLVING TWO-POINT BOUNDARY VALUE PROBLEMS: THEORY AND NUMERICS WITH APPLICATION TO SPACECRAFT FORMATION FLIGHT, OPTIMAL CONTROL AND THE STUDY OF PHASE SPACE STRUCTURE by

More information

Cambridge University Press The Mathematics of Signal Processing Steven B. Damelin and Willard Miller Excerpt More information

Cambridge University Press The Mathematics of Signal Processing Steven B. Damelin and Willard Miller Excerpt More information Introduction Consider a linear system y = Φx where Φ can be taken as an m n matrix acting on Euclidean space or more generally, a linear operator on a Hilbert space. We call the vector x a signal or input,

More information

Optimal Control. Quadratic Functions. Single variable quadratic function: Multi-variable quadratic function:

Optimal Control. Quadratic Functions. Single variable quadratic function: Multi-variable quadratic function: Optimal Control Control design based on pole-placement has non unique solutions Best locations for eigenvalues are sometimes difficult to determine Linear Quadratic LQ) Optimal control minimizes a quadratic

More information

EXTREMAL ANALYTICAL CONTROL AND GUIDANCE SOLUTIONS FOR POWERED DESCENT AND PRECISION LANDING. Dilmurat Azimov

EXTREMAL ANALYTICAL CONTROL AND GUIDANCE SOLUTIONS FOR POWERED DESCENT AND PRECISION LANDING. Dilmurat Azimov EXTREMAL ANALYTICAL CONTROL AND GUIDANCE SOLUTIONS FOR POWERED DESCENT AND PRECISION LANDING Dilmurat Azimov University of Hawaii at Manoa 254 Dole Street, Holmes 22A Phone: (88)-956-2863, E-mail: azimov@hawaii.edu

More information

REVIEW. Hamilton s principle. based on FW-18. Variational statement of mechanics: (for conservative forces) action Equivalent to Newton s laws!

REVIEW. Hamilton s principle. based on FW-18. Variational statement of mechanics: (for conservative forces) action Equivalent to Newton s laws! Hamilton s principle Variational statement of mechanics: (for conservative forces) action Equivalent to Newton s laws! based on FW-18 REVIEW the particle takes the path that minimizes the integrated difference

More information

Research Article Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics

Research Article Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics ISRN Aerospace Engineering Volume 213, Article ID 95912, 7 pages http://dx.doi.org/1.1155/213/95912 Research Article Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics Francesco

More information

Infinite-dimensional nonlinear predictive controller design for open-channel hydraulic systems

Infinite-dimensional nonlinear predictive controller design for open-channel hydraulic systems Infinite-dimensional nonlinear predictive controller design for open-channel hydraulic systems D. Georges, Control Systems Dept - Gipsa-lab, Grenoble INP Workshop on Irrigation Channels and Related Problems,

More information

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Optimization Escuela de Ingeniería Informática de Oviedo (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Unconstrained optimization Outline 1 Unconstrained optimization 2 Constrained

More information

Physics 200 Lecture 4. Integration. Lecture 4. Physics 200 Laboratory

Physics 200 Lecture 4. Integration. Lecture 4. Physics 200 Laboratory Physics 2 Lecture 4 Integration Lecture 4 Physics 2 Laboratory Monday, February 21st, 211 Integration is the flip-side of differentiation in fact, it is often possible to write a differential equation

More information

1 Mathematical toolbox

1 Mathematical toolbox 1 Mathematical toolbox This book is about mathematical models coming from several fields of science and economics that are described by difference or differential equations. Therefore we begin by presenting

More information

Control of Mobile Robots

Control of Mobile Robots Control of Mobile Robots Regulation and trajectory tracking Prof. Luca Bascetta (luca.bascetta@polimi.it) Politecnico di Milano Dipartimento di Elettronica, Informazione e Bioingegneria Organization and

More information

Time-Invariant Linear Quadratic Regulators!

Time-Invariant Linear Quadratic Regulators! Time-Invariant Linear Quadratic Regulators Robert Stengel Optimal Control and Estimation MAE 546 Princeton University, 17 Asymptotic approach from time-varying to constant gains Elimination of cross weighting

More information

Getting Some Big Air

Getting Some Big Air Getting Some Big Air Control #10499 February 14, 2011 Abstract In this paper we address the problem of optimizing a ramp for a snowboarder to travel. Our approach is two sided. We first address the forward

More information

CHAPTER 2 THE MAXIMUM PRINCIPLE: CONTINUOUS TIME. Chapter2 p. 1/67

CHAPTER 2 THE MAXIMUM PRINCIPLE: CONTINUOUS TIME. Chapter2 p. 1/67 CHAPTER 2 THE MAXIMUM PRINCIPLE: CONTINUOUS TIME Chapter2 p. 1/67 THE MAXIMUM PRINCIPLE: CONTINUOUS TIME Main Purpose: Introduce the maximum principle as a necessary condition to be satisfied by any optimal

More information

Numerical Methods in Quantum Field Theories

Numerical Methods in Quantum Field Theories Numerical Methods in Quantum Field Theories Christopher Bell 2011 NSF/REU Program Physics Department, University of Notre Dame Advisors: Antonio Delgado, Christopher Kolda 1 Abstract In this paper, preliminary

More information