Optimal Control with Aerospace Applications. James M. Longuski, Jose J. Guzman, John E. Prussing.


Optimal Control with Aerospace Applications, by James M. Longuski, Jose J. Guzman, and John E. Prussing. Published jointly by Microcosm Press and Springer, 2014. Copyright Springer Science+Business Media New York, 2014. ISBN 978-1-4614-8944-3.

Chapter 3

The Euler-Lagrange Theorem

3.1 The Variation

The brachistochrone problem posed by Johann Bernoulli was a new type of mathematical problem which required a new mathematical approach. Lagrange developed the calculus of variations, in which he considered suboptimal paths near the optimal one. He then showed that, for arbitrary but infinitesimal variations from the optimal path, the function sought must obey a differential equation now known as the Euler-Lagrange equation.

Let us consider a generalization of Lagrange's original technique in which x(t, ε) represents a family of curves that are near the optimal path. This one-parameter family is illustrated in Fig. 3.1 for different, infinitesimally small values of ε (e.g., ε_1, ε_2, ε_3, ...). When ε is set to zero, the curve is the optimal path. Of course we don't know x(t, 0), but we assume it exists and seek conditions which will lead to its solution. Note that varying ε has just as significant an effect as varying time. That is, for a given time, t_1, the value of x(t_1, ε) changes with ε. In our search for the optimal path we will find that derivatives with respect to parameters (such as ε), as well as derivatives with respect to time, must be taken.

Figure 3.1: One-parameter family of curves, x(t, ε). During a variation the value of x changes with ε at a fixed time, t_1.

Figure 3.2 depicts the optimal solution, x(t, 0), and a nearby non-optimal solution, x(t, ε). Now, let us expand x(t, ε) in a Taylor series about ε = 0 at time t:

x(t, ε) = x(t, 0) + (∂x/∂ε)|_{ε=0} (ε − 0) + O(ε²) ≈ x(t, 0) + (∂x/∂ε)|_{ε=0} ε    (3.1)

where we neglect terms of order ε² (and above), as indicated by the Big O symbol, O(ε²). The first variation of a function x(t, ε) at time t is defined as:

δx(t) ≡ (∂x(t, ε)/∂ε)|_{ε=0} ε    (3.2)

Figure 3.2: Retaining ε to first order in the expansion of Eq. (3.1) about ε = 0 provides the first variation δx ≈ (∂x/∂ε)|_{ε=0} ε.

Figure 3.2 illustrates the first variation. Another form of the first variation, originally used by Lagrange, is:

δx(t) = ε η(t)    (3.3)

We see in Eq. (3.3) the essential features of the variation. First, the amplitude is determined by ε, which is always assumed to be infinitesimally small. Second, the function η(t) is arbitrary. That is, the function η(t) represents an arbitrarily large set of virtual functions; it is not itself a particular function. Of course, we can always choose specific functions and examine their variations.

Example 3.1 Assumed varied path.

If we let x(t, ε) = (ε + ε²) sin t, then x(t, 0) = 0 is the optimal solution and

δx(t) = (∂x/∂ε)|_{ε=0} ε = [1 + 2ε]|_{ε=0} ε sin t = ε sin t    (3.4)

Figure 3.3 illustrates this particular variation. In general the variation merely creates a perturbation on the amplitude.

Figure 3.3: An example of expansion about ε = 0 from Eq. (3.4).

3.2 The Euler-Lagrange Equation and the Brachistochrone Problem

Before introducing the Euler-Lagrange theorem, which provides the necessary conditions for trajectory optimization (such as the problem of launching a satellite into orbit), we derive the Euler-Lagrange equation and apply it to the brachistochrone problem. The Euler-Lagrange equation is simpler to derive than the theorem because the Euler-Lagrange equation only solves for a path, such as the shape of y(x) for the brachistochrone problem. The Euler-Lagrange theorem applies to the more difficult case (of optimal control) in which a control input is involved, such as the steering of the thrust vector on a launch vehicle.
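Before carrying out that derivation, the first variation of Example 3.1 can be spot-checked symbolically. The short sketch below is our own illustration (not part of the text); it simply applies the definition in Eq. (3.2) with sympy, and the variable names are arbitrary.

# Symbolic check of Example 3.1: first variation of x(t, eps) = (eps + eps^2) sin t
import sympy as sp

t, eps = sp.symbols('t epsilon')
x = (eps + eps**2) * sp.sin(t)              # assumed varied path x(t, eps)

# Eq. (3.2): delta x = (partial x / partial eps) evaluated at eps = 0, times eps
delta_x = sp.diff(x, eps).subs(eps, 0) * eps
print(delta_x)                              # -> epsilon*sin(t), matching Eq. (3.4)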

Let us consider the generalization of the brachistochrone problem in which a path y(x) between two fixed points y(x_0) and y(x_f) must be found to minimize

J = ∫_{x_0}^{x_f} F[y(x), y'(x), x] dx    (3.5)

where J is a scalar, F is known, and y'(x) = dy/dx. Following Lagrange's technique, let

y(x, ε) = y(x) + ε η(x)    (3.6)

Since the endpoints are fixed we have

η(x_0) = η(x_f) = 0    (3.7)

We find the minimum of J by setting the first variation of J equal to zero:

δJ = (dJ/dε)|_{ε=0} ε = 0    (3.8)

Thus we must have

dJ/dε = ∫_{x_0}^{x_f} [ (∂F/∂y)(∂y/∂ε) + (∂F/∂y')(∂y'/∂ε) ] dx = ∫_{x_0}^{x_f} [ (∂F/∂y) η(x) + (∂F/∂y') η'(x) ] dx = 0    (3.9)

Now, by what Lanczos [1986] called an ingenious application of the method of integration by parts, we write

∫_{x_0}^{x_f} (∂F/∂y') η'(x) dx = [ (∂F/∂y') η(x) ]_{x_0}^{x_f} − ∫_{x_0}^{x_f} η(x) (d/dx)(∂F/∂y') dx    (3.10)

From Eq. (3.7), the first term on the right side drops out. Substituting the remaining term on the right-hand side of Eq. (3.10) into Eq. (3.9) we obtain

∫_{x_0}^{x_f} [ ∂F/∂y − (d/dx)(∂F/∂y') ] η(x) dx = 0    (3.11)

Because η(x) is arbitrary (except for restrictions on continuity and the end conditions), it follows that for Eq. (3.11) to hold, a necessary condition is that the integrand vanishes. Thus, we have

∂F/∂y − (d/dx)(∂F/∂y') = 0    (3.12)

which is the Euler-Lagrange equation.
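As a quick sanity check on Eq. (3.12), the sketch below (our own illustration, not from the text) forms the Euler-Lagrange equation symbolically for the arclength integrand F = √(1 + y'²), whose extremals should be straight lines, that is, y'' = 0.

# Euler-Lagrange equation (3.12) applied to the arclength integrand F = sqrt(1 + y'^2)
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
Y, Yp = sp.symbols('Y Yp')                  # placeholders for y and y'

F = sp.sqrt(1 + Yp**2)                      # integrand F(Y, Yp, x)

# Partial derivatives of F, then substitute the path y(x) and apply d/dx
dF_dy  = sp.diff(F, Y)
dF_dyp = sp.diff(F, Yp)
on_path = {Y: y(x), Yp: y(x).diff(x)}
el = dF_dy.subs(on_path) - sp.diff(dF_dyp.subs(on_path), x)   # Eq. (3.12)
print(sp.simplify(el))                      # -> -y''/(1 + y'^2)**(3/2), so y'' = 0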

Next we re-examine the brachistochrone problem of chapter 2. [See Fig. 2.3 and Eq. (2.18).] The time to be minimized is

t_f = ∫_0^{x_f} √[ (1 + y'²) / (2gx) ] dx    (3.13)

where we must find the time-optimal path y = y(x). The Euler-Lagrange equation, Eq. (3.12), must be satisfied. Since F, the integrand of Eq. (3.13), is not an explicit function of y, the first term of Eq. (3.12) is zero, so the term inside the parentheses must be constant:

∂F/∂y' = y' / √[ 2gx(1 + y'²) ] = c    (3.14)

Equation (3.14) can be rearranged as

dy/dx = √[ 2gc²x / (1 − 2gc²x) ]    (3.15)

The (not immediately obvious) trigonometric substitution

x = a(1 − cos θ)    (3.16)

solves the problem, where a = 1/(4gc²). By substituting Eq. (3.16) into Eq. (3.15), we obtain (after simplification):

dy = a(1 − cos θ) dθ    (3.17)

which provides

y = a(θ − sin θ)    (3.18)

Eqs. (3.16) and (3.18) are the parametric equations of a cycloid that begins at the origin. In this analysis, we have shown that the cycloid provides a stationary value for time, but not necessarily the minimum time. Greenwood [1997] points out that a comparison of nearby trajectories confirms that the cycloid does indeed give the minimum-time path.

Example 3.2 A simple problem in calculus of variations.

Assume an optimal x*(t) exists and take an assumed varied path:

x(t, ε) = x*(t) + ε sin t    (3.19)

In this case the first variation, δx(t) = ε sin t, is the same as given in Eq. (3.4) and illustrated in Fig. 3.3. Find a necessary condition for the problem to minimize:

J = ∫_0^π F(t, x, ẋ) dt    (3.20)

with boundary conditions:

x(0) = x_0    (3.21)

x(π) = x_f    (3.22)

We can write the cost functional as J(ε):

J(ε) = ∫_0^π F[t, x* + ε sin t, ẋ* + ε cos t] dt    (3.23)

since ẋ(t, ε) = ẋ* + ε cos t. Figure 3.4 represents the behavior of J(ε).

Figure 3.4: Minimizing the cost functional J(ε), as in the case of Eq. (3.23), means that J(0) = J_min.

Differentiating with respect to ε, we have:

dJ/dε = ∫_0^π [ (∂F/∂x) sin t + (∂F/∂ẋ) cos t ] dt    (3.24)

Integrating the last term by parts we obtain:

∫_0^π (∂F/∂ẋ) cos t dt = [ (∂F/∂ẋ) sin t ]_0^π − ∫_0^π (d/dt)(∂F/∂ẋ) sin t dt    (3.25)

Therefore:

dJ/dε = ∫_0^π [ ∂F/∂x − (d/dt)(∂F/∂ẋ) ] sin t dt = 0    (3.26)

In this simple problem, we again deduce that the Euler-Lagrange equation must hold (for this very specific variation):

∂F/∂x − (d/dt)(∂F/∂ẋ) = 0    (3.27)
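Returning to the brachistochrone result of the previous section, the cycloid of Eqs. (3.16) and (3.18) is only guaranteed to be a stationary path; the text cites Greenwood [1997] for the comparison with neighboring paths. As a rough numerical illustration (our own addition, with the arbitrary choices a = 1 m, θ_f = π, g = 9.81 m/s²), the sketch below evaluates the descent time of Eq. (3.13) along the cycloid and along the straight line joining the same endpoints; the cycloid comes out faster.

# Descent time: cycloid of Eqs. (3.16), (3.18) versus a straight line.
# Here x is the vertical drop and y the horizontal distance, as in Eq. (3.13).
import numpy as np
from scipy.integrate import quad

g, a, theta_f = 9.81, 1.0, np.pi
xf = a * (1 - np.cos(theta_f))              # endpoint reached by the cycloid
yf = a * (theta_f - np.sin(theta_f))

# Along the cycloid, ds/v reduces to sqrt(a/g) dtheta, so the time is simply:
t_cycloid = theta_f * np.sqrt(a / g)

# Along the straight line y = (yf/xf) x, integrate the integrand of Eq. (3.13):
yp = yf / xf                                # constant slope y'
t_line, _ = quad(lambda x: np.sqrt((1 + yp**2) / (2 * g * x)), 0.0, xf)

print(t_cycloid, t_line)                    # about 1.00 s versus 1.19 s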

3.3 The Euler-Lagrange Theorem

To derive the necessary conditions of the Euler-Lagrange theorem, the change in J due to control variations δu (and hence δx) and to the differential change in the terminal time t_f must be found. Here we make an important distinction between the brachistochrone problem, where the endpoints of the trajectory are fixed, and the launch problem, where the final endpoint is not necessarily fixed. In the latter case we must allow nearby suboptimal paths to take more time to achieve orbit than the time-optimal path. In Fig. 3.5, we illustrate the complication that arises from the free final boundary condition. For simplicity, we only show one component of the state variable, where x(t, ε) represents the varied path and x*(t) = x(t, 0) is the optimal path. Here we adopt notation in which the optimal path and optimal final time are indicated by an asterisk. The final time varies with nearby suboptimal paths and is represented by t_f(ε). Thus, the optimal final time is given by t_f* = t_f(0).

Figure 3.5: The variation at the final time is δx(t_f*); however, the difference in the final state, dx_f, can be greater due to the difference in the final time, dt_f.

We note in Fig. 3.5 that, at the optimal final time, the variation is δx(t_f*), but because the varied path can take longer (as in the case of a suboptimal launch trajectory requiring more time to achieve orbit than the optimal trajectory), the change in the final state (with respect to the optimal final state) is dx_f. Thus,

dx_f ≡ x[t_f(ε), ε] − x[t_f(0), 0] = x[t_f(ε), ε] − x*(t_f*)    (3.28)

Examining Fig. 3.5 we can write:

dx_f ≈ δx(t_f*) + ẋ*(t_f*) dt_f    (3.29)
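Equation (3.29) is a first-order relation, and it can be helpful to see it numerically on a concrete one-parameter family before the Taylor-series argument that follows. The family used below, x(t, ε) = t + ε sin t with final time t_f(ε) = 1 + ε, is a made-up example of our own, not from the text.

# Numerical illustration of Eqs. (3.28) and (3.29) for a made-up family
import numpy as np

def x(t, eps):                              # varied path; x*(t) = x(t, 0) = t
    return t + eps * np.sin(t)

def tf(eps):                                # varied final time; t_f* = tf(0) = 1
    return 1.0 + eps

eps = 1e-2
tf_star = tf(0.0)

dx_exact  = x(tf(eps), eps) - x(tf_star, 0.0)            # Eq. (3.28)
delta_x   = eps * np.sin(tf_star)                        # first variation at t_f*
xdot_star = 1.0                                          # d/dt of x*(t) = t
dx_approx = delta_x + xdot_star * (tf(eps) - tf_star)    # Eq. (3.29)

print(dx_exact, dx_approx)                  # agree to first order in eps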

Alternatively, using a Taylor series expansion we obtain

x[t_f(ε), ε] = x[t_f(0), 0] + [ (∂x/∂t)(dt_f/dε) + ∂x/∂ε ] (ε − 0) + O(ε²)    (3.30)

where dε = (ε − 0). So to first order, dx_f becomes

dx_f = { ∂x[t_f(0), 0]/∂t } dt_f + { ∂x[t_f(0), 0]/∂ε } ε = ẋ*(t_f*) dt_f + δx(t_f*)    (3.31)

as before.

3.3.1 Proof Outline of the Euler-Lagrange Theorem

Now we restate the Problem of Bolza (from chapter 2) for which we will prove (or more precisely provide an outline of the proof of) the Euler-Lagrange theorem: For a specified t_0, minimize:

J = φ(t_f, x_f) + ∫_{t_0}^{t_f} L(t, x, u) dt    (3.32)

subject to:

ẋ = f(t, x, u)    Process Equations    (3.33a)

x(t_0) = x_0    Initial Conditions (I.C.s)    (3.33b)

Ψ(t_f, x_f) = 0    Terminal Constraints    (3.33c)

where x is an n-vector, u is an m-vector, Ψ is a q-vector with 0 ≤ q ≤ n, and J is the cost. The upper bound q ≤ n on the dimension of Ψ is based on the number of independent constraints that determine the final values of all n state variables. [Note: in the present counting scheme, Eq. (3.33b) represents n I.C.s and t_0 does not count as an I.C. Later, in section 3.3.3 we will include t_0 as an I.C. to obtain n + 1 I.C.s.]

Assume φ, L, f, Ψ ∈ C¹ on their respective domains and that the optimal control, u*(t), is unconstrained. If u*(t) ∈ C⁰[t_0, t_f] minimizes J, then the Euler-Lagrange theorem states

that there exist a time-varying multiplier vector λ^T(t) = (λ_1, λ_2, ..., λ_n) and a constant multiplier vector ν^T = (ν_1, ν_2, ..., ν_q) such that, with the Hamiltonian

H(t, x, u, λ) ≡ L(t, x, u) + λ^T f(t, x, u)    (3.34a)

and a terminal function

Φ(t_f, x_f) ≡ φ(t_f, x_f) + ν^T Ψ(t_f, x_f)    (3.34b)

the following necessary conditions must hold:

λ̇^T = −∂H/∂x = −H_x    (3.35a)

λ^T(t_f) = ∂Φ/∂x_f    (3.35b)

H_u = 0^T    (3.35c)

and the transversality condition:

Ω(t_f, x_f, u_f) ≡ L_f + dΦ/dt|_{t_f} = 0    (3.35d)

which applies only if t_f is unspecified (i.e., dt_f ≠ 0).

The Euler-Lagrange theorem [Eqs. (3.35)] assumes there exists a one-parameter family u(t, ε) which satisfies the constraints and u(t, 0) = u*(t). We also note that there exists an associated family x(t, ε) formed by integrating ẋ = f[t, x, u(t, ε)] with x(t_0) = x_0, where x*(t) ∈ C¹. (The state variable is usually one order higher in continuity class than the control variable due to the integration of ẋ = f.)

Proof Outline of the Euler-Lagrange theorem. Let u(t, ε) be a one-parameter family of admissible controls with u*(t) = u(t, 0). Augment the cost functional with the constraints by some as yet undefined variables λ_1(t), ..., λ_n(t) and ν_1, ..., ν_q:

J(ε) = φ{t_f(ε), x[t_f(ε), ε]} + ν^T Ψ{t_f(ε), x[t_f(ε), ε]} + ∫_{t_0}^{t_f(ε)} ( L[t, x(t, ε), u(t, ε)] + λ^T(t) { f[t, x(t, ε), u(t, ε)] − ẋ(t, ε) } ) dt    (3.36)

Equation (3.36) provides the cost, J(ε), for nearby suboptimal solutions. By definition, J(0) is the minimum cost. Following the method of Lagrange, we will show that the problem of finding the control u*(t) to minimize J can be converted into the problem of solving the algebraic and differential equations, Eqs. (3.35). (The meaning and methods of solving these equations will be discussed later.) We see in the integrand of Eq. (3.36) that the term

λ^T { f − ẋ } is zero on the optimal trajectory due to the process equations, Eq. (3.33a). Similarly, the term ν^T Ψ is zero due to Eq. (3.33c). Figure 3.6 illustrates how the cost J(ε) may vary with ε. For unbounded controls we have:

dJ/dε|_{ε=0} = 0    (3.37)

corresponding to the two diagrams on the left in Fig. 3.6. For bounded controls:

dJ/dε|_{ε=0} ≥ 0    (3.38)

as illustrated in the right-most diagram of Fig. 3.6. In the present proof we assume the control is unconstrained so that only Eq. (3.37) holds. We consider the problem of constrained (or bounded) controls later in the text.

Figure 3.6: Possible graphs of J(ε). The case of dJ/dε > 0 can occur for bounded control problems, which are discussed in chapter 9.

Using the definition of the Hamiltonian, Eq. (3.34a), and the terminal function, Eq. (3.34b), in Eq. (3.36) we obtain

J(ε) = Φ{t_f(ε), x[t_f(ε), ε]} + ∫_{t_0}^{t_f(ε)} { H[t, x(t, ε), u(t, ε), λ(t)] − λ^T(t) ẋ(t, ε) } dt    (3.39)

Next we need Leibniz' rule to form dJ/dε of the functional J(ε):

(d/dε) ∫_{a(ε)}^{b(ε)} f(x, ε) dx = f[b(ε), ε] (db(ε)/dε) − f[a(ε), ε] (da(ε)/dε) + ∫_{a(ε)}^{b(ε)} (∂f(x, ε)/∂ε) dx    (3.40)

We note that ε appears in the upper bound, t_f(ε), of the integral in Eq. (3.39), and that ε also appears in the Hamiltonian through x(t, ε) and u(t, ε), since H = H[t, x(t, ε), u(t, ε), λ(t)]. Thus, we can write

dJ/dε = dΦ/dε + L[t_f(ε)] (dt_f(ε)/dε) + ∫_{t_0}^{t_f(ε)} ( H_x ∂x/∂ε + H_u ∂u/∂ε − λ^T(t) ∂ẋ/∂ε ) dt    (3.41)

where λ(t) is a function of time only. Setting ε = 0 we obtain

dJ/dε|_{ε=0} = dΦ/dε + L*(t_f*) (dt_f/dε) + ∫_{t_0}^{t_f*} ( H_x* ∂x/∂ε + H_u* ∂u/∂ε − λ^T(t) ∂ẋ/∂ε ) dt    (3.42)

We find dJ by multiplying dJ/dε by dε = ε − 0 and noting that dΦ = (∂Φ/∂t_f) dt_f + (∂Φ/∂x_f) dx_f:

dJ = (∂Φ/∂t_f) dt_f + (∂Φ/∂x_f) dx_f + L*(t_f*) dt_f + ∫_{t_0}^{t_f*} [ H_x* δx(t) + H_u* δu(t) − λ^T(t) δẋ(t) ] dt = 0    (3.43)

where we have made use of the definition of the variation, Eq. (3.2), so that δu = (∂u/∂ε)|_{ε=0} ε and δẋ = (∂ẋ/∂ε)|_{ε=0} ε. The third term on the right-hand side of Eq. (3.43) is the contribution to dJ by the integral term in Eq. (3.32), namely the product of the integrand value at t_f* times dt_f. Now let us integrate ∫_{t_0}^{t_f*} λ^T(t) δẋ(t) dt by parts:

∫_{t_0}^{t_f*} λ^T(t) δẋ(t) dt = [ λ^T(t) δx(t) ]_{t_0}^{t_f*} − ∫_{t_0}^{t_f*} λ̇^T(t) δx dt    (3.44)

We recall that a similar step was made in the derivation of the Euler-Lagrange equation in Eq. (3.10). Substituting Eq. (3.44) into Eq. (3.43), we obtain:

dJ = (∂Φ/∂t_f) dt_f + (∂Φ/∂x_f) dx_f + L*(t_f*) dt_f − λ^T(t_f*) δx(t_f*) + ∫_{t_0}^{t_f*} [ H_x* δx(t) + H_u* δu(t) + λ̇^T(t) δx(t) ] dt = 0    (3.45)

where we have used δx(t_0) = 0. Keeping in mind that the final time, t_f, may be free (as shown in Fig. 3.7), we will replace δx(t_f*) with δx(t_f*) = dx_f − ẋ* dt_f [according to Eq. (3.29)]; therefore:

dJ = [ ∂Φ/∂t_f + L*(t_f*) + λ^T(t_f*) ẋ*(t_f*) ] dt_f + [ ∂Φ/∂x_f − λ^T(t_f*) ] dx_f + ∫_{t_0}^{t_f*} { [ H_x* + λ̇^T ] δx(t) + H_u* δu(t) } dt = 0    (3.46)

We now come to the determination of the Lagrange multipliers, also known as costate variables. For a stationary value of the cost J, the variation dJ in Eq. (3.46) must be zero. For this condition to be true for all δx(t) [or arbitrary δx(t)] it is necessary that

λ̇^T(t) = −H_x*[t, x*(t), u*(t), λ(t)]    (3.47)
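Before continuing with the boundary terms, a brief aside: Leibniz' rule, Eq. (3.40), carries much of the weight in this proof outline, and it is easy to spot-check symbolically. The sketch below is our own illustration (not from the text); the choices of f, a, and b are arbitrary.

# Spot check of Leibniz' rule, Eq. (3.40), for one concrete choice of f, a, b
import sympy as sp

x, eps = sp.symbols('x epsilon')
f = (x + eps)**2                            # f(x, eps), an arbitrary smooth choice
a, b = eps, 1 + eps**2                      # variable limits a(eps), b(eps)

lhs = sp.diff(sp.integrate(f, (x, a, b)), eps)
rhs = (f.subs(x, b) * sp.diff(b, eps)
       - f.subs(x, a) * sp.diff(a, eps)
       + sp.integrate(sp.diff(f, eps), (x, a, b)))

print(sp.simplify(lhs - rhs))               # -> 0, as Eq. (3.40) requires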

Figure 3.7: When t_f is free, the terminal constraint Ψ(t_f, x_f) = 0 must be satisfied on the varied path.

Similarly, to have dJ = 0 for all dx_f it is necessary that

λ^T(t_f*) = ∂Φ/∂x_f    (3.48)

Note that Eq. (3.48) provides the required boundary condition to accompany Eq. (3.47). Next, dJ = 0 for all dt_f requires its coefficient to be zero in Eq. (3.46). Substituting in Eq. (3.48), this condition is

Ω = dΦ/dt|_{t_f*} + L*(t_f*) = 0    (3.49)

This relation is the transversality condition that applies only if t_f is unspecified (that is, dt_f ≠ 0). Equation (3.49) can be calculated as ∂Φ/∂t_f + H*(t_f*) = 0, although, strictly speaking, the arguments of the function Ω in Eq. (3.35d) do not include λ (on which H explicitly depends). Thus, the only term remaining in Eq. (3.46) is:

dJ = ∫_{t_0}^{t_f*} H_u* δu dt = 0    (3.50)

so we conclude that, because δu is arbitrary, a necessary condition is

H_u* = 0^T    (3.51)

which is valid only for unconstrained control. We note that this part of the proof is not rigorous, because δu is not arbitrary if there are terminal constraints Ψ(t_f, x_f) = 0. The

admissible δu are only those that generate δx that satisfy Ψ = 0 (see Fig. 3.7), but a more rigorous treatment in Bryson and Ho [1975] shows that Eq. (3.51) is a correct necessary condition, even with terminal constraints. This completes the proof outline of the Euler-Lagrange theorem.

If we retain the term λ^T(t_0) δx(t_0) from Eq. (3.44) in Eq. (3.45) rather than eliminate it using δx(t_0) = 0, two useful facts emerge: (i) if any component x_i(t_0) is not specified [i.e., δx_i(t_0) ≠ 0], then the condition that will determine its optimal value x_i*(t_0) is that the corresponding λ_i(t_0) = 0; and (ii) this provides an interpretation for the Lagrange multiplier λ. Each component λ_i(t_0) is the first-order sensitivity (gradient) of the cost J due to a differential change in the initial state component x_i(t_0), λ_i(t_0) = ∂J/∂x_i(t_0). Along an optimal solution, any time t is the initial time for the remainder of the solution, so this interpretation of λ also applies at any time t. Because of this interpretation, one can evaluate the small change in cost due to a small change in state at the initial time (or any other time due to a disturbance) by using the value of λ rather than completely re-solving the problem. This observation is especially useful if solutions are determined numerically.

We also note that the function Ω in Eq. (3.35d), which is the coefficient of dt_f in Eq. (3.46), has the interpretation that it is the sensitivity of the change in cost J due to a differential change in the final time t_f, Ω = ∂J/∂t_f. So when the final time is specified, the algebraic sign of Ω indicates whether a small increase or decrease in the final time will lower the cost, and provides a first-order estimate of the cost change.

As shown in Exercise 5, the components of the Lagrange multiplier ν, introduced in Eq. (3.34b), have the interpretation that they relate the change in the cost due to small changes in the constants in the terminal constraints. Again, this is especially useful if solutions are determined numerically, because we do not have to completely re-solve the problem.

3.3.2 Summary of the Euler-Lagrange Theorem

For a minimum of J, the following set of necessary conditions must be met:

ẋ = f(t, x, u) = H_λ^T    (3.52a)

λ̇ = −H_x^T    (3.52b)

H_u^T = 0    (3.52c)

where Eqs. (3.52a) and (3.52b) each represent n differential equations and Eq. (3.52c) represents m algebraic equations. In addition we have the end conditions:

x(t_0) = x_0    (3.53a)

λ^T(t_f) = ∂Φ/∂x_f    (3.53b)

Ψ(x_f, t_f) = 0    (3.53c)

where Eq. (3.53a) represents n initial conditions (not including the initial time) and Eq. (3.53c) represents q boundary conditions, where q ≤ n. Finally we have the transversality condition if dt_f ≠ 0:

Ω = L_f + dΦ/dt|_{t_f} = 0    (3.54)

which can be calculated using the comment after Eq. (3.49) and in Exercise 4. The m algebraic equations obtained from Eq. (3.52c) provide the optimal control, u*.

There are 2n + m + q + 1 unknown variables x, λ, u, ν, and t_f; and there are a corresponding number of equations to solve for them: Eq. (3.52a) with B.C.s (3.53a), Eq. (3.52b) with B.C.s (3.53b), (3.52c), (3.53c), and (3.54). Examples of how the Euler-Lagrange theorem leads to a two-point boundary-value problem are given in chapter 4.

In the next section an alternative formulation of the necessary conditions is discussed which is similar to what has been described above, except that the terminal constraints are satisfied by manually enforcing (directly solving) dΨ = 0, rather than by using the Lagrange multiplier vector, ν.

3.3.3 Alternate Form of the Transversality Condition

In section 3.3.1, the terminal constraints, Ψ(t_f, x_f), were adjoined to the cost functional, J, through the use of the additional multiplier vector, ν, as given in Eq. (3.34b). This method has a certain elegance in its structure, and has been adopted by most modern authorities following Bryson and Ho [1975]. It does provide the q components of ν, and these components provide additional information about the optimal solution, namely, the change in the cost due to small changes in the terminal constraints. (See Exercise 5.)

There is an alternative approach in which the prescribed terminal constraints are not adjoined to the cost functional, so that the ν_i never appear. In this approach, the transversality condition appears in a differential form which, when combined with the differentials of the terminal constraints, provides the natural boundary conditions. Citron [1969], Hestenes [1969], Kenneth and McGill [1966], and Pierre [1969] all use the differential form of the transversality condition. Citron demonstrates the equivalence of both methods of employing the transversality condition in an example and states, "It is up to the user to decide which method better suits his [or her] needs in any particular problem."

We can derive the differential form of the transversality condition by a slight modification of the material in section 3.3.1. We begin by dropping ν from Eq. (3.34b) and, of course, from Eq. (3.36), and note that the terminal B.C.s, Ψ(x_f, t_f) = 0, are written as a p-vector where 1 ≤ p ≤ n + 1. As a result, Ψ must provide at least one stopping condition and may contain all of the final states and the final time. The only change in Eq. (3.39) is that Φ is

replaced by φ. Equation (3.43) becomes

dJ = dφ + (H* − λ^T ẋ*)|_{t_f*} dt_f + ∫_{t_0}^{t_f*} [ H_x* δx(t) + H_u* δu(t) − λ^T(t) δẋ(t) ] dt = 0    (3.55)

where we retain H* − λ^T ẋ* in lieu of L*. After integration by parts [Eq. (3.44)], we obtain

dJ = dφ + (H* − λ^T ẋ*)|_{t_f*} dt_f − λ^T(t_f*) δx(t_f*) + ∫_{t_0}^{t_f*} [ H_x* δx(t) + H_u* δu(t) + λ̇^T(t) δx ] dt = 0    (3.56)

where we have used δx(t_0) = 0, as before. After replacing δx(t_f*) with δx(t_f*) = dx_f − ẋ* dt_f, we have the analog to Eq. (3.46):

dJ = dφ + H* dt_f − λ^T(t_f*) dx_f + ∫_{t_0}^{t_f*} [ (H_x* + λ̇^T) δx + H_u* δu ] dt = 0    (3.57)

which, as illustrated in Fig. 3.7, is subject to the final boundary conditions:

dΨ = (∂Ψ/∂t_f) dt_f + (∂Ψ/∂x_f) dx_f = 0    (3.58)

We note that in this formulation most authors assume that Ψ is a p-vector where

1 ≤ p ≤ n + 1    (3.59)

so that up to n + 1 constraints can be enforced, allowing all the final states and the final time to be specified. Equations (3.58) and (3.59) are the basis of the alternate form of the transversality condition. Rather than being a q-vector, where 0 ≤ q ≤ n, Ψ is now a p-vector, where 1 ≤ p ≤ n + 1. Thus, there is always at least one terminal constraint. If the value of the final time is specified, it becomes a component of the Ψ vector. By contrast, in section 3.3.1 a specified final time is treated as a separate constraint independent of the Ψ vector. If the final time is unspecified, some component of the Ψ vector acts as a stopping condition that determines the value of the final time, e.g., final altitude, final velocity, etc. The variables p and q are related in a simple manner: p = q unless the final time is specified, in which case p = q + 1. Then the first component, Ψ_1, is equal to t − t_f. As in Eq. (3.47), we select

λ̇^T(t) = −H_x*[t, x*(t), u*(t), λ(t)]    (3.60)

which eliminates the first parenthetical term in the integrand of Eq. (3.57). In order to eliminate the non-integral terms in Eq. (3.57):

H* dt_f − λ^T(t_f*) dx_f + dφ = 0    (3.61)

subject to the terminal constraint:

dΨ = 0    (3.62)

The terminal cost term in Eq. (3.61) can be expanded as

dφ = (∂φ/∂x_f) dx_f + (∂φ/∂t_f) dt_f    (3.63a)

Substituting Eq. (3.63a) into (3.61) yields

( ∂φ/∂x_f − λ^T(t_f*) ) dx_f + ( ∂φ/∂t_f + H* ) dt_f = 0    (3.63b)

Because dx_f and dt_f (if it is non-zero) are independent and arbitrary, choose

λ^T(t_f*) = ∂φ/∂x_f    (3.63c)

and

∂φ/∂t_f + H* = 0    (3.63d)

Equation (3.63c) provides the boundary condition for Eq. (3.60), and Eq. (3.63d) applies only if dt_f ≠ 0 (t_f unspecified). In combination with dΨ = 0, Eq. (3.63c) is equivalent to Eq. (3.53b) and Eq. (3.63d) is equivalent to Eq. (3.54). (See Exercise 4.) Thus, the only term remaining in Eq. (3.57) is:

dJ = ∫_{t_0}^{t_f*} H_u* δu dt = 0    (3.64)

so we conclude as before that a necessary condition is

H_u* = 0^T    (3.65)

which is valid only for unconstrained control.
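Before summarizing this differential form, it may help to see the necessary conditions in action on the smallest possible example. The sketch below is our own illustration (not from the text): it applies Eqs. (3.52a) through (3.52c) to the scalar minimum-effort problem of minimizing (1/2) ∫_0^1 u² dt subject to ẋ = u, x(0) = 0, and x(1) = 1, with the final time specified so that no transversality condition is needed.

# Necessary conditions applied symbolically to a scalar minimum-effort problem
import sympy as sp

t, lam0 = sp.symbols('t lambda0')           # lambda0: unknown constant costate
x = sp.Function('x')
u, lam = sp.symbols('u lam')

H = sp.Rational(1, 2) * u**2 + lam * u      # Hamiltonian, Eq. (3.34a), with f = u

# Costate equation (3.52b): lam_dot = -dH/dx = 0, so lambda(t) = lambda0 (constant).
# Control equation (3.52c): dH/du = u + lambda = 0.
u_star = sp.solve(sp.diff(H, u), u)[0].subs(lam, lam0)

# State equation (3.52a) with x(0) = 0, then enforce the terminal condition x(1) = 1.
xsol = sp.dsolve(sp.Eq(x(t).diff(t), u_star), x(t), ics={x(0): 0}).rhs
lam0_val = sp.solve(sp.Eq(xsol.subs(t, 1), 1), lam0)[0]

print(xsol.subs(lam0, lam0_val), u_star.subs(lam0, lam0_val))   # x*(t) = t, u* = 1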

To summarize the Euler-Lagrange theorem in which the transversality condition is expressed in differential form, the following set of necessary conditions must be met:

ẋ = f(t, x, u) = H_λ^T    (3.66a)

λ̇ = −H_x^T    (3.66b)

H_u = 0^T    (3.66c)

where Eqs. (3.66a) and (3.66b) each represent n differential equations and Eq. (3.66c) represents m algebraic equations. In addition we have the end conditions:

x(t_0) = x_0, t_0 specified    (3.67a)

Ψ(x_f, t_f) = 0    (3.67b)

where Eq. (3.67a) represents n + 1 initial conditions and Eq. (3.67b) represents p terminal conditions, where 1 ≤ p ≤ n + 1. Finally we have the differential form of the transversality condition:

H(t_f) dt_f − λ^T(t_f) dx_f + dφ = 0    (3.68)

which provides n + 1 − p boundary conditions. As previously mentioned, Eq. (3.68) can be separated into Eq. (3.63c) and Eq. (3.63d). [That is, in this counting scheme we need 2n + 2 B.C.s to solve the 2n differential equations, Eqs. (3.66a) and (3.66b). Since Eqs. (3.67a) and (3.67b) provide n + 1 + p B.C.s, Eq. (3.68) provides the remaining conditions: 2n + 2 − (n + 1 + p) = n + 1 − p.] Examples of how the Euler-Lagrange theorem leads to a two-point boundary-value problem using Eq. (3.68) are given in chapter 4.

3.4 Summary

The problem posed by Johann Bernoulli inspired Lagrange to develop a new mathematical tool, the calculus of variations, to find a function which minimizes (or extremizes) a functional. Lagrange's method led to the Euler-Lagrange equation, which converts the problem of minimizing the integral into a problem of solving a differential equation. In the optimization of space trajectories a similar problem arises, but it is complicated by a choice of controls (such as choosing the steering law for the thrust vector) and by the possibility of having free final boundary conditions. However, the variational approach of Lagrange once again yields a set of differential equations that must apply for an optimal trajectory.

The Problem of Bolza is specified in Eq. (3.32) and consists of a terminal cost, which depends on the final time and final state, and a path cost, which depends on an integral over the path. The Bolza problem includes a set of process equations (usually the equations of motion) along with a specified set of initial and final boundary conditions, Eqs. (3.33). The Euler-Lagrange theorem states that if u*(t) is the optimal control (which is assumed to be continuous and unconstrained) that minimizes J of the Bolza problem, then a set

of differential equations and algebraic equations, Eqs. (3.35), must be satisfied. Thus, along with the process (or state) equations, Eqs. (3.33a), the differential equations for the Lagrange multipliers (or costates), Eqs. (3.35a), must also be solved. These costate equations arose from the introduction of the Lagrange multipliers into the proof, and have an interesting and useful physical interpretation. A set of algebraic equations, Eqs. (3.35c), provides control laws for u(t). Equation (3.35d), which is called the transversality condition, provides the additional necessary boundary conditions to solve for the differential equations, Eqs. (3.33a) and Eqs. (3.35a), as we will show later.

The proof of the Euler-Lagrange theorem depends on the definition of the Hamiltonian, Eq. (3.34a), and the use of the one-parameter families for the control and for the state, u(t, ε) and x(t, ε), respectively. The proof requires the application of Leibniz' rule and integration by parts. The arbitrary nature of the Lagrange multipliers permits selection of these functions and their final boundary conditions to simplify the proof. The proof consists of Eqs. (3.36) through (3.51).

It is important to note the limitations of the proof. We have assumed that the control is unconstrained (unbounded) and is continuous, and that the state is continuously differentiable. At this stage we have no theorem for the cases where the control is bounded or piecewise continuous. We will need other theorems discussed in later chapters for such problems.

One final remark will be made before moving on to applications of the Euler-Lagrange theorem. Are there other approaches besides the calculus of variations to solve the optimal control problem? With the availability of computational techniques, direct methods of solving Bolza's problem now exist. By direct, we mean the numerical construction of a control law and trajectory path which can be inserted into the integrand so that a numerical value of the cost can be calculated. Then a number of nearby trajectories can be submitted to calculate their costs. Another approach, discussed by Lanczos [1986], is to represent the path by a truncated Fourier series in which the coefficients are solved. This method changes the optimal control problem into a parameter optimization problem, but requires a small error to be accepted due to the truncation. Lanczos credits Hilbert for originating this concept of function space. Further discussion of direct methods is outside the scope of this text. The goal of the present work is to provide an introduction to the classical indirect method: the use of the calculus of variations to solve the Problem of Bolza.
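Although worked examples of the resulting two-point boundary-value problems are deferred to chapter 4, a small numerical preview may be useful here. The sketch below is our own illustration (not from the text): it takes a minimum-effort double-integrator problem, writes down the state and costate equations with u = −λ_2 from H_u = 0, and hands the resulting two-point boundary-value problem to scipy's solve_bvp. The problem data (final time of 2, rest-to-rest transfer through a unit displacement) are arbitrary choices.

# Two-point BVP from the Euler-Lagrange conditions, solved with solve_bvp.
# Problem: minimize (1/2) * integral of u^2 over [0, 2] for x1' = x2, x2' = u,
# driving (x1, x2) from (0, 0) to (1, 0); final time specified.
import numpy as np
from scipy.integrate import solve_bvp

# H = u^2/2 + lam1*x2 + lam2*u; H_u = 0 gives u = -lam2, and the costates
# obey lam1' = 0, lam2' = -lam1  [Eqs. (3.52a) through (3.52c)].
def odes(t, z):
    x1, x2, lam1, lam2 = z
    u = -lam2
    return np.vstack([x2, u, np.zeros_like(lam1), -lam1])

def bc(z0, zf):
    # x1(0) = 0, x2(0) = 0, x1(2) = 1, x2(2) = 0
    return np.array([z0[0], z0[1], zf[0] - 1.0, zf[1]])

t_mesh = np.linspace(0.0, 2.0, 25)
sol = solve_bvp(odes, bc, t_mesh, np.zeros((4, t_mesh.size)))

u_opt = -sol.sol(t_mesh)[3]                 # optimal control u*(t) = -lam2(t)
print(sol.status, u_opt[0], u_opt[-1])      # converged; u* is linear in time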

3.5 Exercises

1. State and prove the Euler-Lagrange theorem using the same assumptions and techniques presented in section 3.3.1 except for the following differences: i.) the initial state x(t_0) is free and the initial time t_0 is free, ii.) the boundary conditions are expressed as (i.e., in terms of initial and final conditions)

Ψ(t_0, x_0, t_f, x_f) = 0    (3.69)

and iii.) J = φ(t_0, x_0, t_f, x_f) + ∫_{t_0}^{t_f} L(t, x, u) dt.

Hint: you should find that the additional necessary conditions are λ^T(t_0) = −∂Φ/∂x_0 and ∂Φ/∂t_0 − H(t_0) = 0.

2. Repeat exercise 1 using the assumptions of section 3.3.3, in which the terminal costs are not adjoined to the cost functional. Hint: you should find that the new transversality conditions are [H dt − λ^T dx]_{t_0}^{t_f} + dφ = 0 subject to dΨ = 0.

3. Derive Eq. (3.49) by substituting Eq. (3.48) into Eq. (3.46).

4. Show that, as mentioned after Eq. (3.49), the variable Ω can be calculated as ∂Φ/∂t_f + H*(t_f*) = 0.

5. The terminal constraint, Eq. (3.53c), can be generalized to Ψ(t_f, x_f, c) = 0, where c is a vector of constants, such as the specified final altitude, etc. Show that the vector ν, introduced in Eq. (3.34b), has the physical interpretation of relating a change in the cost due to a small change in c, namely ∂J/∂c = ν^T (∂Ψ/∂c). Also show that in component form we have ∂J/∂c_i = Σ_{j=1}^{q} ν_j ∂Ψ_j/∂c_i. Hint: ν^T dΨ = ν^T [ (∂Ψ/∂t_f) dt_f + (∂Ψ/∂x_f) dx_f + (∂Ψ/∂c) dc ].

6. Zermelo's Problem. We wish to minimize the time for a boat to cross a river (see Fig. 2.10). As mentioned earlier, Zermelo's problem has an interesting aerospace application: that of an aircraft flying in a crosswind.

Min J = ∫_{t_0}^{t_f} dt    (3.70)

subject to

ẋ = V cos θ + u(x, y)    (3.71a)

ẏ = V sin θ + v(x, y)    (3.71b)

where x_0, y_0, x_f, y_f are given and V is constant. Let u and v represent strong currents which may depend on location.

6a. Make use of the fact that since the Hamiltonian is not an explicit function of time, it is a constant throughout the motion (we will prove this in section 4.8). Show that

λ_x = −cos θ / (V + u cos θ + v sin θ)    (3.72a)

λ_y = −sin θ / (V + u cos θ + v sin θ)    (3.72b)

6b. We note that if u and v are constants, then θ is a constant and the minimum-time paths are straight lines. Show that if

u = u(y)    (3.73a)

v = v(y)    (3.73b)

then

cos θ / (V + u(y) cos θ + v(y) sin θ) = constant    (3.74)

Equation (3.74) implies that the heading angle, θ, changes with local current velocities in direct analogy with Snell's law of optics.

6c. Suppose

u = −V y/h    (3.75a)

v = 0    (3.75b)

where h is constant. Show that, for (x_f, y_f) = (0, 0),

cos θ = cos θ_f / [1 + (y/h) cos θ_f]    (3.76)

3.6 References

A.E. Bryson, Jr. and Y.C. Ho, Applied Optimal Control, Hemisphere Publishing Corporation, Washington, D.C., 1975.

S.J. Citron, Elements of Optimal Control, Holt, Rinehart, and Winston, 1969.

D.T. Greenwood, Classical Dynamics, Dover Publications, Inc., New York, 1997.