Convergence of the Forward-Backward Sweep Method in Optimal Control

Michael McAsey (a), Libin Mou (a), Weimin Han (b)
(a) Department of Mathematics, Bradley University, Peoria, IL 61625
(b) Department of Mathematics, University of Iowa, Iowa City, IA 52242

Abstract. The Forward-Backward Sweep Method is a numerical technique for solving optimal control problems. The technique is reviewed and a convergence theorem is proved for a basic type of optimal control problem. Examples illustrate the performance of the method.

Keywords: Optimal control, Numerical solution, Convergence

1 Introduction

Optimal control problems are often difficult to solve. Yet when either learning or teaching optimal control, it is helpful to have some examples with closed-form solutions. It is also useful to have a simple numerical scheme that can produce a numerical approximation to solutions of problems for which closed-form solutions are not available. In their textbook [20], Lenhart and Workman provide just that. The Forward-Backward Sweep Method (FBSM) in [20] is easy to program and runs quickly. The method is designed to solve the differential-algebraic system generated by the Maximum Principle that characterizes the solution. A detailed convergence analysis of the method is not appropriate for the intended audience of [20], but after seeing the method work on problems from several disciplines, it is natural to ask about its convergence properties. In this paper we prove a convergence result for the method applied to a very basic class of problems.

Corresponding author: Michael McAsey, Department of Mathematics, Bradley University, Peoria, IL 61625, email: mcasey@bradley.edu, phone: 309-677-2491, fax: 309-677-3999

The literature on numerical solutions of optimal control problems is large. To put some of it into perspective, consider a basic problem: choose a control u(t) to optimize an integral \int_{t_0}^{T} f(t, x(t), u(t)) dt subject to a differential equation constraint x'(t) = g(t, x(t), u(t)), x(t_0) = x_0. The main analytical technique is provided by Pontryagin's Maximum Principle, which gives necessary conditions that the control u(t) and the state x(t) must satisfy. These conditions can be solved explicitly in a few examples. For most problems, however, especially problems that also involve additional constraints on the state or control, the conditions are too algebraically involved to be solved explicitly, so numerical approaches are used to construct approximations to the solutions. Useful surveys of numerical methods can be found in texts, articles, and introductions to articles; examples include [5], [6], [7], [8], and [12]. Numerical techniques for optimal control problems can often be classified as either direct or indirect. In a direct method, the differential equation and the integral are discretized and the problem is converted into a nonlinear programming problem. Many choices are available for discretizing the integral and the differential equation, and for solving the nonlinear programming problem, resulting in several different numerical methods. For example, in an early paper [15], Hager considers an unconstrained problem to minimize the final state value subject to the differential equation x' = f(x(t), u(t)). The paper treats discretizations by both one-step and multistep approximations. Dontchev, Hager, and co-authors have produced not only convergence results but also rates of convergence for direct techniques on problems that include state and control constraints; for a sample, see [9], [10], [11], [12], [15].
Indirect methods approximate solutions to optimal control problems by numerically solving the boundary value problem for the differential-algebraic system generated by the Maximum Principle. Techniques for solving boundary value problems can be found in the venerable book by Keller [18] and include shooting, finite difference, and collocation methods. More recently, Iserles's book [17] solves boundary value problems via finite element methods. The addition of an algebraic constraint or of state/control constraints presents additional difficulties that do not appear in classical boundary value problems. Relevant for the present paper is work by Hackbusch [14] that approximates solutions to boundary value problems with two parabolic equations. The books [1], [16], [19] have extensive treatments of differential-algebraic systems, concentrating more on initial value problems than on boundary value problems. The paper by Bertolazzi [4] has an informative introduction, highlighting the various numerical techniques in optimal control and their advantages and disadvantages. The idea exploited by the FBSM can be seen in the way one of the equations is solved in a forward direction and the other is solved backwards with updates from the first. An early reference to a technique with this forward-backward flavor is [21], where the update step is different from that considered here. In [13] Enright and Muir use both explicit and implicit Runge-Kutta methods (and an average of the two) for two-point boundary value problems, which also has some of the flavor of the FBSM.

In Section 2 of this paper, we describe the type of optimal control problems to which we will apply the FBSM. These are among the most basic in the subject. In Section 3 we investigate the convergence issue for the simplest case of the method; we give both a continuous version and a discrete version of the convergence theorem. In Section 4 we illustrate the numerical performance of the method through simulations of solutions to a couple of examples in optimal control. The paper closes in Section 5 with a few remarks on more general problems.

2 Basic problem

The basic problem to be considered is to choose a control function u(t) to maximize an integral objective function:

  max_u \int_{t_0}^{t_1} f(t, x(t), u(t)) dt   (2.1)

subject to the state equation

  x'(t) = g(t, x(t), u(t)),  x(t_0) = x_0.   (2.2)

For this formulation of the problem, assume that x and u are vector-valued functions on [t_0, t_1] with values in R^n and R^m, respectively. Assume f and g map R x R^n x R^m into R and R^n, respectively. The basic problem is generalized in several ways. Some of these include: (1) the terminal value of the state x(t_1) may be fixed; (2) the end time t_1 could be a choice variable; and (3) the objective may include a scrap function φ(t_1) in addition to the integral. There are more variations on the basic problem, of course, but these are the main problems in [20] that can be solved by the FBSM. The FBSM is one of the so-called indirect methods for solving optimal control problems. Begin by using the Maximum Principle to characterize the method as applied to the basic problem. This is considered in detail in [20] (p. 13), and we provide a brief sketch. Assume that f and g are continuously differentiable in all three variables.
We also assume that a solution to the basic problem exists in which x is continuously differentiable and u is piecewise continuous. Form the Hamiltonian

  H(t, x, u, λ) = λ_0 f(t, x, u) + λ g(t, x, u),

where λ = λ(t) is the adjoint or co-state variable. (We will take the constant λ_0 to be equal to 1 for the problems considered here, although in general this cannot be assumed.) The Maximum Principle says that there is a co-state variable λ(t) such that an optimal state x(t) and optimal control u(t) must necessarily (1) satisfy the state equation,

  x'(t) = g(t, x(t), u(t)),  x(t_0) = x_0;

(2) satisfy the co-state equation,

  dλ/dt = -∂H/∂x,  λ(t_1) = 0;   (2.3)

and (3) optimize the Hamiltonian, considered as a function of the control. These three conditions result in a two-point boundary value problem together with an additional algebraic equation from the optimality condition (3). Assuming enough structure on the functions, condition (3) can be written as ∂H/∂u = 0. Although it is not necessary for the numerical algorithm, in many problems this equation can be solved for u, and that is how we shall state the FBSM. In brief, the Forward-Backward Sweep Method first solves the state equation x' = g(t, x, u) forward in time with a Runge-Kutta routine, then solves the co-state equation (2.3) backwards in time with the Runge-Kutta solver, and then updates the control. This produces a new approximation of the state, co-state, and control (x, λ, u). The method continues by using these new updates to calculate new Runge-Kutta approximations and control updates, with the goal of finding a fixed point (x, λ, u). The method terminates when there is sufficient agreement between the states, co-states, and controls of two passes through the approximation loop.

3 Convergence of the FBSM

To better understand the idea of the convergence analysis, we first consider the FBSM at the continuous level; this is done in Subsection 3.1. The argument is then adapted to the convergence study of the FBSM applied to the discrete systems in Subsection 3.2. Throughout this section we assume that an optimal solution exists for the basic problem (2.1)-(2.2). The Lipschitz conditions to be assumed shortly are enough to be able to apply the Maximum Principle. (See [23, p. 85] for a statement of the Maximum Principle.) This in turn implies that the boundary value problem of interest has a solution. Thus the real problem is solving a boundary value problem of a specific form.
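The sweep just described can be sketched in a few lines of code. The following Python sketch is our own illustration, not the paper's Mathematica implementation; the function name fbsm and its arguments are ours. It uses classical RK4 in both directions, approximating values at half-steps by averaging neighboring grid values, and is exercised on the optimality system x' = -x + u, λ' = λ - x, u = -λ of the linear-quadratic example treated in Section 4.

```python
import numpy as np

def fbsm(g, f, h3, x0, t0, t1, n=200, tol=1e-8, max_sweeps=500):
    """Forward-Backward Sweep: RK4 forward for the state, RK4 backward
    for the co-state, then a control update, iterated to a fixed point."""
    t = np.linspace(t0, t1, n + 1)
    h = (t1 - t0) / n
    u = np.zeros(n + 1)                  # initial guess u^(0) = 0
    x = np.empty(n + 1)
    lam = np.empty(n + 1)
    for _ in range(max_sweeps):
        u_old = u.copy()
        x[0] = x0                        # forward sweep for the state
        for j in range(n):
            um = 0.5 * (u[j] + u[j + 1])          # u at the midpoint (averaged)
            k1 = g(t[j], x[j], u[j])
            k2 = g(t[j] + h / 2, x[j] + h / 2 * k1, um)
            k3 = g(t[j] + h / 2, x[j] + h / 2 * k2, um)
            k4 = g(t[j] + h, x[j] + h * k3, u[j + 1])
            x[j + 1] = x[j] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        lam[n] = 0.0                     # backward sweep for the co-state
        for j in range(n, 0, -1):
            xm = 0.5 * (x[j] + x[j - 1])
            um = 0.5 * (u[j] + u[j - 1])
            k1 = f(t[j], x[j], u[j], lam[j])
            k2 = f(t[j] - h / 2, xm, um, lam[j] - h / 2 * k1)
            k3 = f(t[j] - h / 2, xm, um, lam[j] - h / 2 * k2)
            k4 = f(t[j] - h, x[j - 1], u[j - 1], lam[j] - h * k3)
            lam[j - 1] = lam[j] - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        u = h3(t, x, lam)                # control update from the optimality condition
        if np.abs(u - u_old).sum() <= tol * np.abs(u).sum() + 1e-15:
            break                        # two passes agree: stop
    return t, x, lam, u

# Optimality system of the linear-quadratic test problem (see Section 4):
#   x' = -x + u, x(0) = 1;  lam' = lam - x, lam(1) = 0;  u = -lam.
t, x, lam, u = fbsm(g=lambda t, x, u: -x + u,
                    f=lambda t, x, u, lam: lam - x,
                    h3=lambda t, x, lam: -lam,
                    x0=1.0, t0=0.0, t1=1.0)
```

Against the closed-form solution quoted in Section 4, this sketch reproduces x(1) and λ(0) to about three decimal places on 200 subintervals.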
3.1 Convergence for the continuous system

For notational simplicity, we express the problem as finding (x(t), λ(t), u(t)) such that

  x'(t) = g(t, x(t), u(t)),  x(t_0) = x_0,   (3.1)
  λ'(t) = h_1(t, x(t), u(t)) + λ(t) h_2(t, x(t), u(t)),  λ(t_1) = 0,   (3.2)
  u(t) = h_3(t, x(t), λ(t)).   (3.3)

Here x_0 and t_0 < t_1 are given real numbers, and g, h_1 and h_2 are given functions satisfying the continuity properties mentioned in Section 2, so that the system (3.1)-(3.3) has a unique solution (x(t), λ(t), u(t)). Equation (3.3) is interpreted as u being defined uniquely by the optimality condition; there is no need to actually have an explicit formula for h_3, and the use of the form (3.3) is for convenience.

The FBSM for the system (3.1)-(3.3) reads as follows.

Initialization: choose an initial guess u^{(0)} (= u^{(0)}(t)).

Iteration: for k >= 0, solve

  dx^{(k+1)}(t)/dt = g(t, x^{(k+1)}(t), u^{(k)}(t)),  x^{(k+1)}(t_0) = x_0,   (3.4)
  dλ^{(k+1)}(t)/dt = h_1(t, x^{(k+1)}(t), u^{(k)}(t)) + λ^{(k+1)}(t) h_2(t, x^{(k+1)}(t), u^{(k)}(t)),  λ^{(k+1)}(t_1) = 0,   (3.5)
  u^{(k+1)}(t) = h_3(t, x^{(k+1)}(t), λ^{(k+1)}(t)).   (3.6)

For a convergence analysis of the above FBSM, we make the following assumptions.

(A) The functions g, h_1, h_2 and h_3 are Lipschitz continuous with respect to their second and third arguments, with Lipschitz constants L_g, L_{h_1}, etc.; e.g.,

  |g(t, x_1, u_1) - g(t, x_2, u_2)| <= L_g (|x_1 - x_2| + |u_1 - u_2|).

Moreover, Λ = ||λ||_∞ < ∞ and H = ||h_2||_∞ < ∞.

Note that in the convergence analysis of numerical methods for ODEs, it is standard to assume Lipschitz conditions ([2]). In the proof of the next theorem, we apply a simple form of the well-known Gronwall inequality ([3, Exercise 5.2.12]): suppose f and g are continuous functions on [a, b] and g is non-decreasing; then

  f(t) <= g(t) + c \int_a^t f(s) ds  implies  f(t) <= e^{c(t-a)} g(t),  t in [a, b].   (3.7)

Similarly, if f and g are continuous functions on [a, b] and g is non-increasing, then

  f(t) <= g(t) + c \int_t^b f(s) ds  implies  f(t) <= e^{c(b-t)} g(t),  t in [a, b].   (3.8)

Theorem 3.1 Under the assumptions (A), if

  c_0 := L_{h_3} { [e^{L_g(t_1-t_0)} - 1] + (L_{h_1} + Λ L_{h_2}) (1/H) [e^{H(t_1-t_0)} - 1] [e^{L_g(t_1-t_0)} + 1] } < 1,   (3.9)

then we have convergence: as k → ∞,

  max_{t_0<=t<=t_1} |x(t) - x^{(k)}(t)| + max_{t_0<=t<=t_1} |λ(t) - λ^{(k)}(t)| + max_{t_0<=t<=t_1} |u(t) - u^{(k)}(t)| → 0.   (3.10)
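For concrete problem data, the sufficient condition (3.9) is easy to evaluate. The helper below is an illustrative sketch of our own (the function name and the sample constants are placeholders, not from the paper); it computes c_0 from the Lipschitz constants and the bounds Λ and H:

```python
import math

def fbsm_contraction_bound(L_g, L_h1, L_h2, L_h3, Lam, H, t0, t1):
    """Evaluate the constant c_0 of condition (3.9); c_0 < 1 is the
    sufficient condition of Theorem 3.1 for convergence of the FBSM."""
    T = t1 - t0
    return L_h3 * ((math.exp(L_g * T) - 1.0)
                   + (L_h1 + Lam * L_h2) / H
                   * (math.exp(H * T) - 1.0)
                   * (math.exp(L_g * T) + 1.0))

# Placeholder constants: small Lipschitz constants on a unit interval
# give c_0 < 1, in the spirit of Remark 3.2 below.
c0 = fbsm_contraction_bound(L_g=0.1, L_h1=0.1, L_h2=0.1, L_h3=0.5,
                            Lam=1.0, H=0.2, t0=0.0, t1=1.0)
```

Since (3.9) is only sufficient, c_0 >= 1 for a particular problem does not preclude convergence of the sweep.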

Proof. Denote the errors

  e_x^{(k)} = x - x^{(k)},  e_λ^{(k)} = λ - λ^{(k)},  e_u^{(k)} = u - u^{(k)}.

These errors are all functions of t as well as of k. From (3.1) and (3.4), we have

  de_x^{(k+1)}(t)/dt = g(t, x(t), u(t)) - g(t, x^{(k+1)}(t), u^{(k)}(t)),  e_x^{(k+1)}(t_0) = 0.

Then

  e_x^{(k+1)}(t) = \int_{t_0}^t [ g(s, x(s), u(s)) - g(s, x^{(k+1)}(s), u^{(k)}(s)) ] ds.

Applying the Lipschitz condition on g,

  |e_x^{(k+1)}(t)| <= L_g \int_{t_0}^t [ |e_x^{(k+1)}(s)| + |e_u^{(k)}(s)| ] ds,  t in [t_0, t_1].   (3.11)

Similarly, from (3.2) and (3.5), we have

  e_λ^{(k+1)}(t) = \int_t^{t_1} { h_1(s, x(s), u(s)) - h_1(s, x^{(k+1)}(s), u^{(k)}(s)) + λ(s) [ h_2(s, x(s), u(s)) - h_2(s, x^{(k+1)}(s), u^{(k)}(s)) ] + e_λ^{(k+1)}(s) h_2(s, x^{(k+1)}(s), u^{(k)}(s)) } ds.

Hence

  |e_λ^{(k+1)}(t)| <= \int_t^{t_1} [ (L_{h_1} + Λ L_{h_2}) ( |e_x^{(k+1)}(s)| + |e_u^{(k)}(s)| ) + H |e_λ^{(k+1)}(s)| ] ds,  t in [t_0, t_1].   (3.12)

Furthermore, from (3.3) and (3.6), we obtain

  |e_u^{(k+1)}(t)| <= L_{h_3} [ |e_x^{(k+1)}(t)| + |e_λ^{(k+1)}(t)| ],  t in [t_0, t_1].   (3.13)

Apply Gronwall's inequality (3.7) to (3.11) to obtain

  |e_x^{(k+1)}(t)| <= e^{L_g(t-t_0)} L_g \int_{t_0}^t |e_u^{(k)}(s)| ds,  t in [t_0, t_1].   (3.14)

Apply Gronwall's inequality (3.8) to (3.12) to obtain

  |e_λ^{(k+1)}(t)| <= e^{H(t_1-t)} (L_{h_1} + Λ L_{h_2}) \int_t^{t_1} [ |e_x^{(k+1)}(s)| + |e_u^{(k)}(s)| ] ds,  t in [t_0, t_1].

Then plug (3.14) into the right side of this inequality and use integration by parts to obtain

  |e_λ^{(k+1)}(t)| <= e^{H(t_1-t)} (L_{h_1} + Λ L_{h_2}) { [ e^{L_g(t_1-t_0)} - e^{L_g(t-t_0)} ] \int_{t_0}^t |e_u^{(k)}(s)| ds + [ e^{L_g(t_1-t_0)} - e^{L_g(t-t_0)} + 1 ] \int_t^{t_1} |e_u^{(k)}(s)| ds },  t in [t_0, t_1].   (3.15)

Use (3.14) and (3.15) in (3.13):

  |e_u^{(k+1)}(t)| <= L_{h_3} { e^{L_g(t-t_0)} L_g \int_{t_0}^t |e_u^{(k)}(s)| ds + e^{H(t_1-t)} (L_{h_1} + Λ L_{h_2}) [ e^{L_g(t_1-t_0)} - e^{L_g(t-t_0)} ] \int_{t_0}^t |e_u^{(k)}(s)| ds + e^{H(t_1-t)} (L_{h_1} + Λ L_{h_2}) [ e^{L_g(t_1-t_0)} - e^{L_g(t-t_0)} + 1 ] \int_t^{t_1} |e_u^{(k)}(s)| ds },  t in [t_0, t_1].   (3.16)

We integrate (3.16) over the interval [t_0, t_1] to obtain

  \int_{t_0}^{t_1} |e_u^{(k+1)}(t)| dt <= c_1 \int_{t_0}^{t_1} |e_u^{(k)}(t)| dt,

where c_1 is the constant obtained by integrating the coefficients in (3.16) over [t_0, t_1]; the estimates above show that c_1 <= c_0. Hence

  \int_{t_0}^{t_1} |e_u^{(k)}(t)| dt <= (c_1)^k \int_{t_0}^{t_1} |e_u^{(0)}(t)| dt.   (3.17)

Thus, if c_1 < 1, which is valid under the assumption (3.9), we have

  \int_{t_0}^{t_1} |e_u^{(k)}(t)| dt → 0 as k → ∞.   (3.18)

Using this convergence in (3.14), (3.15) and (3.16), we conclude the statement (3.10).

Remark 3.2 The condition (3.9) is valid if L_{h_3} is sufficiently small, or if L_g(t_1 - t_0) and L_{h_1} + Λ L_{h_2} are sufficiently small. As is seen from the proof, this condition can be replaced by the weaker one c_1 < 1, and it is possible to further sharpen the condition (3.9). Other iteration methods may be studied as well. For the iteration, one may consider using

  dx^{(k+1)}/dt = g(t, x^{(k)}(t), u^{(k)}(t)),  x(t_0) = x_0,

instead of (3.4); then only an integration is required to obtain x^{(k+1)}. A similar comment applies to (3.5). The price to pay is a slower convergence.

3.2 The numerical algorithm and convergence for the discretized system

In a numerical implementation of the FBSM we are not, of course, actually solving the state and co-state differential equations (3.1) and (3.2); instead we find numerical approximations of the solutions at discrete points of the interval. The convergence theorem in this section shows that when the Lipschitz constants are small enough or the time interval is short enough, there is a grid size and an iteration count so that the error between the solution at the nodes and the discrete approximation can be made small. Recall the system being solved:

  x'(t) = g(t, x(t), u(t)),  x(0) = a,
  λ'(t) = h_1(t, x(t), u(t)) + λ h_2(t, x(t), u(t)),  λ(T) = 0,
  u = h_3(t, x(t), λ(t)).   (3.19)

For notational convenience the interval is now assumed to be [0, T] and the initial state is denoted by a. The assumptions (A) remain in force for this section. Recall these assumptions: the functions g, h_1, h_2, h_3 are continuous in t and Lipschitz in x, u, and λ. We continue to assume that a solution exists to the optimal control problem (2.1)-(2.2). These hypotheses and the Maximum Principle then imply that the boundary value problem (3.19) has a solution. In this section we also assume that the solutions x(t), λ(t) and u(t) are continuous. Let n be a positive integer and define the step size h = T/n. Denote x_j = x(t_j) and λ_j = λ(t_j), where t_j = t_0 + jh = jh and x(t), λ(t), u(t) are the actual solutions of the system (3.19). For each k >= 0, let x_j^k, λ_j^k, u_j^k be the k-th approximations to x_j, λ_j, u_j, as defined below. Our goal is to show that the approximations converge to the solution as k → ∞ and h → 0.
3.2.1 Discrete approximations

Consider a discrete approximation to a general initial value problem y' = g(t, y), α < t <= β, y(α) = y_0, that has the following scheme:

  y_{j+1} = y_j + h G(t_j, y_j, h; g),

where G is a function such that

  η_g(h) := sup{ |g(t, y) - G(t, y, h; g)| : α <= t <= β, -∞ < y < ∞ } → 0   (3.20)

as h → 0. The classical methods (e.g., Euler, Runge-Kutta) have this form. This scheme will be applied to approximate x(t) using the first equation, and then applied to the second equation backwards in time to approximate λ(t). For simplicity, the resulting functions G associated with g(t, x, u) and f(t, x, u, λ) = h_1(t, x, u) + λ h_2(t, x, u) are denoted by G and F as follows:

  G(t, x, u) = G(t, x, u, h; g);  F(t, x, u, λ) = G(t, x, u, λ, h; f).

For j = 0, ..., n-1 define the forward and backward difference operators

  Δ_j x = x_{j+1} - x_j,  δ_j x = x_j - x_{j+1}.

The operators Δ_j and δ_j apply to the approximating sequences x_j^k, λ_j^k, u_j^k as well.

3.2.2 The algorithm

Initialization: choose an initial guess u_j^0, j = 0, ..., n.

Iteration: for k >= 0, define x_j^{k+1}, λ_j^{k+1}, u_j^{k+1}, j = 0, ..., n, by the equations

  Δ_j x^{k+1} = h G(t_j, x_j^{k+1}, u_j^k),  x_0^{k+1} = a,  j = 0, ..., n-1,
  δ_{j-1} λ^{k+1} = h F(t_j, x_j^{k+1}, u_j^k, λ_j^{k+1}),  λ_n^{k+1} = 0,  j = n, ..., 1,
  u_j^{k+1} = h_3(t_j, x_j^{k+1}, λ_j^{k+1}),  j = 0, ..., n.   (3.21)

3.2.3 Convergence theorem

Theorem 3.3 Suppose the assumptions (A) hold, and suppose that either the Lipschitz constants are small or T is small. Then

  max{ |x(t_j) - x_j^k| + |λ(t_j) - λ_j^k| + |u(t_j) - u_j^k|, j = 0, ..., n } → 0 as k, n → ∞.

That is, for every ε > 0 there exist N, K > 0 such that

  max{ |x(t_j) - x_j^k| + |λ(t_j) - λ_j^k| + |u(t_j) - u_j^k|, j = 0, ..., n } < ε

for all n > N and k > K.

Remark 3.4 The first approximating equation can be replaced by

  Δ_j x^{k+1} = h G(t_j, x_j^k, u_j^k),  x_0^{k+1} = a.   (3.22)

See Remark 3.6 for a discussion.
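A minimal realization of the discrete scheme (3.21) takes G(t, y, h; g) = g(t, y), i.e. forward Euler, for which η_g(h) ≡ 0 in (3.20). The sketch below is our own Python illustration (the function name is ours; the problem data are the optimality system x' = -x + u, λ' = λ - x, u = -λ of the linear-quadratic example used again in Section 4). It runs a fixed number of sweeps on each grid and then refines the grid, mimicking the joint limit k → ∞, n → ∞ of Theorem 3.3.

```python
import numpy as np

def discrete_fbsm(n, sweeps=100):
    """Discrete FBSM on [0, 1] with forward Euler as the one-step scheme G."""
    h = 1.0 / n
    t = np.linspace(0.0, 1.0, n + 1)
    x = np.empty(n + 1)
    lam = np.empty(n + 1)
    u = np.zeros(n + 1)                      # initial guess u_j^0 = 0
    for _ in range(sweeps):
        x[0] = 1.0                           # forward sweep: x' = -x + u
        for j in range(n):
            x[j + 1] = x[j] + h * (-x[j] + u[j])
        lam[n] = 0.0                         # backward sweep: lam' = lam - x
        for j in range(n, 0, -1):
            lam[j - 1] = lam[j] - h * (lam[j] - x[j])
        u = -lam                             # control update u_j^{k+1} = h_3
    return t, x, lam, u

# Closed-form solution of this boundary value problem, for comparison.
s2 = np.sqrt(2.0)
D = s2 * np.cosh(s2) + np.sinh(s2)

def x_exact(t):
    return (s2 * np.cosh(s2 * (t - 1.0)) - np.sinh(s2 * (t - 1.0))) / D

errs = []
for n in (50, 100, 200, 400):
    t, x, lam, u = discrete_fbsm(n)
    errs.append(np.max(np.abs(x - x_exact(t))))
# the nodal error should shrink roughly like h = 1/n as the grid is refined
```

After the sweeps have converged, the remaining error is the discretization error of the Euler scheme, so halving h roughly halves the maximal nodal error, consistent with the n → ∞ part of Theorem 3.3.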

Proof. Denote the errors by

  e_{xj}^k = x_j - x_j^k,  e_{λj}^k = λ_j - λ_j^k,  e_{uj}^k = u_j - u_j^k,  k >= 0, j = 0, 1, ..., n.

The proof follows the general outline of the proof for the continuous approximation. The essence is to find bounds for the errors in x and λ in terms of the error in u, and then to show that this last error can be made small. Define the following average errors:

  E_x^k = h Σ_{j=0}^n |e_{xj}^k|,  E_λ^k = h Σ_{j=0}^n |e_{λj}^k|,  E_u^k = h Σ_{j=0}^n |e_{uj}^k|.   (3.23)

Since h = T/n, E_x^k is an average error of the approximation x_j^k. The idea of the proof is to show that E_u^k → 0 as k → ∞ and h → 0, which implies the desired result.

Inequality for e_{xj}^k. Note that e_{x0}^{k+1} = 0. From the equations for x and x^{k+1} we get, for i = 0, ..., n-1,

  Δ_i e_x^{k+1} = e_{x,i+1}^{k+1} - e_{xi}^{k+1} = (x_{i+1} - x_{i+1}^{k+1}) - (x_i - x_i^{k+1}) = Δ_i x - Δ_i x^{k+1}.

It follows that

  Δ_i e_x^{k+1} = Δ_i x - Δ_i x^{k+1} = \int_{t_i}^{t_{i+1}} [ g(t, x(t), u(t)) - G(t_i, x_i^{k+1}, u_i^k) ] dt.   (3.24)

To analyze the preceding difference, subtract and add both of the quantities g(t_i, x(t_i), u(t_i)) and g(t_i, x_i^{k+1}, u_i^k). First, using the continuity of g, x, and u, we get

  |g(t, x(t), u(t)) - g(t_i, x(t_i), u(t_i))| <= ω_g(h),   (3.25)

where ω_g(h) is the oscillation of the function g(t, x(t), u(t)) considered as a function of t; that is,

  ω_g(h) = sup_{r, s in [0, T], |r - s| <= h} |g(r, x(r), u(r)) - g(s, x(s), u(s))|.

Second, by the Lipschitz condition on g we get

  |g(t_i, x(t_i), u(t_i)) - g(t_i, x_i^{k+1}, u_i^k)| <= L_{gx} |x(t_i) - x_i^{k+1}| + L_{gu} |u(t_i) - u_i^k| = L_{gx} |e_{xi}^{k+1}| + L_{gu} |e_{ui}^k|.   (3.26)

Third, by the definition of η_g in (3.20), we have

  |g(t_i, x_i^{k+1}, u_i^k) - G(t_i, x_i^{k+1}, u_i^k)| <= η_g(h).   (3.27)

Putting the three pieces (3.25)-(3.27) together, we have

  |Δ_i e_x^{k+1}| <= L_{gx} h |e_{xi}^{k+1}| + L_{gu} h |e_{ui}^k| + o_1(h)   (3.28)

for i = 0, ..., n-1, where o_1(h) = h ω_g(h) + h η_g(h). Note that

  e_{xj}^{k+1} = e_{x0}^{k+1} + Σ_{i=0}^{j-1} Δ_i e_x^{k+1},  e_{x0}^{k+1} = 0.

This and (3.28) imply

  |e_{xj}^{k+1}| <= Σ_{i=0}^{j-1} [ L_{gx} h |e_{xi}^{k+1}| + L_{gu} h |e_{ui}^k| + o_1(h) ].   (3.29)

Sum both sides of (3.29) over j and change the order of summation to get

  Σ_{j=0}^n |e_{xj}^{k+1}| <= Σ_{j=0}^n Σ_{i=0}^{j-1} [ L_{gx} h |e_{xi}^{k+1}| + L_{gu} h |e_{ui}^k| + o_1(h) ] = Σ_{i=0}^{n-1} (n - i) [ L_{gx} h |e_{xi}^{k+1}| + L_{gu} h |e_{ui}^k| + o_1(h) ].

Multiply both sides of this inequality by h = T/n and note that (n - i)h <= nh = T. Using the notation (3.23) for the average errors, we get

  E_x^{k+1} <= T [ L_{gx} E_x^{k+1} + L_{gu} E_u^k + n o_1(h) ].   (3.30)

So if T L_{gx} < 1, then we get

  E_x^{k+1} <= T / (1 - T L_{gx}) [ L_{gu} E_u^k + n o_1(h) ].   (3.31)

Thus we have a bound for the errors in x written in terms of the errors in u.

Inequality for e_{λj}^k.

Next we derive a similar inequality for the errors in λ. We use the equation in (3.19) for λ and that in (3.21) for λ^{k+1} to get, for j = n, ..., 1,

  δ_{j-1} e_λ^{k+1} = δ_{j-1} λ - δ_{j-1} λ^{k+1} = \int_{t_j}^{t_{j-1}} [ f(t, x(t), u(t), λ(t)) - F(t_j, x_j^{k+1}, u_j^k, λ_j^{k+1}) ] dt
  = \int_{t_j}^{t_{j-1}} [ f(t, x(t), u(t), λ(t)) - f(t_j, x(t_j), u(t_j), λ(t_j)) ] dt
  + \int_{t_j}^{t_{j-1}} [ f(t_j, x(t_j), u(t_j), λ(t_j)) - f(t_j, x_j^{k+1}, u_j^k, λ_j^{k+1}) ] dt
  + \int_{t_j}^{t_{j-1}} [ f(t_j, x_j^{k+1}, u_j^k, λ_j^{k+1}) - F(t_j, x_j^{k+1}, u_j^k, λ_j^{k+1}) ] dt.

Using a computation similar to (3.28), we get

  |δ_{j-1} e_λ^{k+1}| <= L_{fx} h |e_{xj}^{k+1}| + L_{fu} h |e_{uj}^k| + L_{fλ} h |e_{λj}^{k+1}| + o_2(h),

where o_2(h) = h ω_f(h) + h η_f(h). Recall that ω_f(h) is the oscillation function of f(t, x(t), u(t), λ(t)), η_f(h) is defined as in (3.20), and L_{fx}, L_{fu}, L_{fλ} are the Lipschitz constants of f = h_1(t, x, u) + λ h_2(t, x, u) with respect to x, u, λ, respectively. Since

  e_{λ,j-1}^{k+1} = e_{λn}^{k+1} + Σ_{i=j}^n δ_{i-1} e_λ^{k+1}

and e_{λn}^{k+1} = 0, the triangle inequality gives

  |e_{λ,j-1}^{k+1}| <= Σ_{i=j}^n |δ_{i-1} e_λ^{k+1}| <= Σ_{i=j}^n [ L_{fx} h |e_{xi}^{k+1}| + L_{fu} h |e_{ui}^k| + L_{fλ} h |e_{λi}^{k+1}| + o_2(h) ].   (3.32)

The next step is to rewrite (3.32) so that the errors in λ appear on the left side only.

Eliminate e_{λj}^k. We need the following discrete Gronwall inequality for sequences f_n, p_n, and k_n.

Lemma 3.5 Assume that g_0 >= 0, p_n >= 0 and k_n >= 0 for n >= 0, and that

  f_0 <= g_0;  f_n <= g_0 + Σ_{j=0}^{n-1} p_j + Σ_{j=0}^{n-1} k_j f_j  for n >= 1.

Then

  f_n <= ( g_0 + Σ_{j=0}^{n-1} p_j ) e^{Σ_{j=0}^{n-1} k_j}.

A proof can be found in Quarteroni and Valli [22], p. 14. Apply this lemma to (3.32) (backwards), with g_0 = 0, p_j = L_{fx} h |e_{xj}^{k+1}| + L_{fu} h |e_{uj}^k| + o_2(h) and k_j = L_{fλ} h, to get

  |e_{λ,j-1}^{k+1}| <= M_{j-1} Σ_{i=j}^n [ L_{fx} h |e_{xi}^{k+1}| + L_{fu} h |e_{ui}^k| + o_2(h) ],  j = n, ..., 1,   (3.33)

where M_j = e^{L_{fλ} h (n-j)}. Note that M_j <= M_0 = e^{T L_{fλ}} because hn = T. This gives a bound for the errors in λ written in terms of the errors in x and u.

Show E_u^k → 0. By the third equation in (3.21) we obtain, for j = 0, ..., n,

  |e_{uj}^{k+1}| <= L_{h_3} [ |e_{xj}^{k+1}| + |e_{λj}^{k+1}| ].   (3.34)

Replace j by j + 1 in (3.33) and substitute it into (3.34) to get

  |e_{uj}^{k+1}| <= L_{h_3} [ |e_{xj}^{k+1}| + M_j Σ_{i=j+1}^n ( L_{fx} h |e_{xi}^{k+1}| + L_{fu} h |e_{ui}^k| + o_2(h) ) ].   (3.35)

Sum (3.35) over j from j = 0 to j = n to get

  Σ_{j=0}^n |e_{uj}^{k+1}| <= L_{h_3} [ Σ_{j=0}^n |e_{xj}^{k+1}| + Σ_{j=0}^n M_j Σ_{i=j+1}^n ( L_{fx} h |e_{xi}^{k+1}| + L_{fu} h |e_{ui}^k| + o_2(h) ) ]
  = L_{h_3} [ |e_{x0}^{k+1}| + Σ_{i=1}^n K_i |e_{xi}^{k+1}| + L_{fu} Σ_{i=1}^n N_i h |e_{ui}^k| + o_2(h) Σ_{i=1}^n N_i ],   (3.36)

where, for i = 1, ..., n,

  N_i = Σ_{j=0}^{i-1} M_j = ( e^{L_{fλ} h (n+1)} - e^{L_{fλ} h (n-i+1)} ) / ( e^{L_{fλ} h} - 1 ),  K_i = 1 + L_{fx} N_i h.

Note that N_i <= i M_0. It follows that

  K_i <= 1 + T M_0 L_{fx};  Σ_{i=1}^n N_i <= (1/2) n (n+1) M_0 <= n^2 M_0.

Now (3.36) implies that

  Σ_{j=0}^n |e_{uj}^{k+1}| <= L_{h_3} [ (1 + T M_0 L_{fx}) Σ_{i=0}^n |e_{xi}^{k+1}| + M_0 n h Σ_{i=1}^n |e_{ui}^k| + n^2 M_0 o_2(h) ].   (3.37)

Multiplying both sides of (3.37) by h, and using T = nh and the definition of the average errors, we get

  E_u^{k+1} <= L_{h_3} [ (1 + T M_0 L_{fx}) E_x^{k+1} + M_0 T E_u^k + o_2(h) T^2 M_0 h^{-1} ].

Combining this with (3.31), we get, for k = 0, 1, 2, ...,

  E_u^{k+1} <= B E_u^k + o_3(h),   (3.38)

where

  B = L_{h_3} [ M_0 T + (1 + T M_0 L_{fx}) T L_{gu} / (1 - T L_{gx}) ],
  o_3(h) = L_{h_3} M_0 T^2 o_2(h) / h + L_{h_3} (1 + T M_0 L_{fx}) T^2 o_1(h) / ( [1 - T L_{gx}] h ).

Iterating (3.38), we obtain

  E_u^k <= B^k E_u^0 + Σ_{i=0}^{k-1} B^i o_3(h).

Note that B < 1 when either T is small or the Lipschitz constants are small. Therefore B^k → 0 and Σ_{i=0}^{k-1} B^i <= 1/(1 - B) is bounded. Moreover, the definitions of o_1(h) and o_2(h) imply that o_1(h)/h → 0 and o_2(h)/h → 0 as h → 0, which implies that o_3(h) → 0. Therefore E_u^k → 0 as k → ∞ and h → 0.

All the pieces are now in place. By (3.31), we also get E_x^k → 0 as k → ∞ and h → 0. Going back to (3.29), we see that

  max_{j=0,...,n} |e_{xj}^{k+1}| <= L_{gx} E_x^{k+1} + L_{gu} E_u^k + T o_1(h)/h → 0

as h → 0. From (3.35) we see that

  max_{j=0,...,n} |e_{uj}^{k+1}| <= L_{h_3} [ max_{j=0,...,n} |e_{xj}^{k+1}| + M_0 ( L_{fx} E_x^{k+1} + L_{fu} E_u^k + T o_2(h)/h ) ] → 0

as h → 0. Finally, from (3.33) we see that, as h → 0,

  max_{j=0,...,n} |e_{λj}^{k+1}| <= M_0 [ L_{fx} E_x^{k+1} + L_{fu} E_u^k + T o_2(h)/h ] → 0.

This finishes the proof.

Remark 3.6 The proof for the alternative approximating equation

  Δ_j x^{k+1} = h G(t_j, x_j^k, u_j^k),  x_0^{k+1} = a,   (3.39)

is similar. In this case, (3.30) is replaced by

  E_x^{k+1} <= T [ L_{gx} E_x^k + L_{gu} E_u^k + n o_1(h) ],

and we do not have (3.31). Then (3.38) is replaced by

  E_u^{k+1} <= A E_x^k + B E_u^k + o_4(h),

with A, B, o_4(h) being similar expressions in T and the Lipschitz constants of f and g. So we get the following iterative inequality for the average errors:

  ( E_x^{k+1}, E_u^{k+1} )^T <= \mathcal{A} ( E_x^k, E_u^k )^T + C,

where

  \mathcal{A} = ( L_{gx} T, L_{gu} T ; A, B ),  C = ( T n o_1(h), o_4(h) )^T.

Under the same conditions, we have ||\mathcal{A}|| < 1 and C → 0, which imply the desired results.

4 Examples

4.1 A successful example

The following simple linear-quadratic problem has been used as an example in several papers; see, for example, Vlassenbroeck and Van Dooren [24]:

  max_u  -(1/2) \int_0^1 [ x(t)^2 + u(t)^2 ] dt   (4.40)

subject to the state equation

  x'(t) = -x(t) + u(t),  x(0) = 1.   (4.41)

The Maximum Principle can be used to construct an analytic solution. The co-state equation is λ'(t) = λ(t) - x(t), and the optimizing condition on the Hamiltonian gives u(t) = -λ(t). Together with the state equation, the result is the following linear differential-algebraic system:

  x'(t) = -x(t) + u(t),  x(0) = 1,   (4.42)
  λ'(t) = λ(t) - x(t),  λ(1) = 0,   (4.43)
  u(t) = -λ(t).   (4.44)

The solution is

$$x(t) = \frac{\sqrt{2}\cosh(\sqrt{2}(t-1)) - \sinh(\sqrt{2}(t-1))}{\sqrt{2}\cosh(\sqrt{2}) + \sinh(\sqrt{2})}, \qquad
\lambda(t) = \frac{-\sinh(\sqrt{2}(t-1))}{\sqrt{2}\cosh(\sqrt{2}) + \sinh(\sqrt{2})}.$$

The optimal value of the objective functional is J = −0.1929092981. The final value of the state is x(1) = 0.2819695346 and the initial value of the co-state is λ(0) = 0.3858185962.

The numerical computations of the FBSM algorithm were implemented in Mathematica. The initial guess for the control is u ≡ 0. The differential equation solver used is fourth-order Runge-Kutta on the interval [0, 1] partitioned into N subintervals. The stopping criterion is determined by computing the relative errors for the state, the co-state, and the control and requiring that all three be less than a specified value δ. The desired relative error for the state variable, for example, is

$$\frac{\|x^k - x^{k-1}\|}{\|x^k\|} < \delta,$$

where $\|\cdot\|$ is the $l^1$-norm, $\|x^k\| = \sum_{j=1}^N |x^k_j|$. The inequality is rewritten as $0 < \delta\|x^k\| - \|x^k - x^{k-1}\|$. The test is then to terminate the iteration loop when this and the corresponding expressions for the control and co-state all become positive.

Table 1 summarizes these computations for various choices of the relative error δ and the number of subintervals N. The number of times that the Runge-Kutta algorithm was called for the state (and the co-state) variable is given in the column labeled count. Also shown are the errors in the computed values of the objective J, the final state value $x_N$, and the co-state at time 0. (The co-state and control are negatives of one another in this example.) The estimate $J_{approx}$ of the value of the objective is found by using Simpson's Rule with the computed values of the state and control and the relevant value of N.

Table 1 shows that, for this example, the count is generally unaffected by N, the number of subintervals used. This indicates that a factor affecting the speed of convergence of the method is the choice of tolerance δ for the relative error between iterations.
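The sweep just described can be written in a few lines of code. Below is a minimal Python sketch (the paper's computations were done in Mathematica): a forward RK4 sweep for the state, a backward RK4 sweep for the co-state, the control update u = −λ, and the $l^1$ relative-error stopping test. The direct control update and the use of neighbor averages for the RK4 half-step values are implementation assumptions, not details taken from the paper.

```python
import numpy as np

# FBSM sketch for  max_u -1/2 ∫_0^1 (x^2 + u^2) dt,  x'= -x + u, x(0) = 1,
# with co-state  lam' = lam - x, lam(1) = 0,  and control  u = -lam.
def fbsm(N=100, delta=1e-6, max_iter=100):
    h = 1.0 / N
    t = np.linspace(0.0, 1.0, N + 1)
    u = np.zeros(N + 1)                  # initial guess: u ≡ 0
    x = np.zeros(N + 1)
    lam = np.zeros(N + 1)
    count = 0
    for _ in range(max_iter):
        u_old, x_old, lam_old = u.copy(), x.copy(), lam.copy()
        x[0] = 1.0
        for j in range(N):               # forward RK4 sweep for the state
            um = 0.5 * (u[j] + u[j + 1])           # half-step control (assumed)
            k1 = -x[j] + u[j]
            k2 = -(x[j] + 0.5 * h * k1) + um
            k3 = -(x[j] + 0.5 * h * k2) + um
            k4 = -(x[j] + h * k3) + u[j + 1]
            x[j + 1] = x[j] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        lam[N] = 0.0
        for j in range(N, 0, -1):        # backward RK4 sweep for the co-state
            xm = 0.5 * (x[j] + x[j - 1])
            k1 = lam[j] - x[j]
            k2 = (lam[j] - 0.5 * h * k1) - xm
            k3 = (lam[j] - 0.5 * h * k2) - xm
            k4 = (lam[j] - h * k3) - x[j - 1]
            lam[j - 1] = lam[j] - h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        count += 1
        u = -lam                         # control characterization u = -lam
        # stopping test: delta*||v|| - ||v - v_old|| >= 0 for x, lam, and u
        if all(delta * np.abs(v).sum() - np.abs(v - v_old).sum() >= 0
               for v, v_old in ((x, x_old), (lam, lam_old), (u, u_old))):
            break
    return t, x, lam, u, count
```

With N = 100 and δ = 10⁻⁶ the iteration should reproduce x(1) ≈ 0.28197 and λ(0) ≈ 0.38582 after a modest number of sweeps, and Simpson's Rule applied to the computed x and u gives $J_{approx} \approx -0.19291$; exact counts will differ from Table 1 because of the implementation assumptions above.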
The value of δ is not a measurement of the error between the true and computed values of the state, co-state, control, or objective, yet there seems to be a general association between δ and these other errors. The error in the objective J appears to decrease with decreasing δ, for δ small enough.

Table 1

 δ         N     count   J − J_approx      x(1) − x_N        λ(0) − λ_0
 1.×10^-3   30     9     1.27994×10^-5    1.00254×10^-5    9.67332×10^-5
 1.×10^-3  100     8     4.67703×10^-5    9.92594×10^-5    9.98745×10^-5
 1.×10^-3  500     8     4.73871×10^-5    1.01501×10^-4    9.49590×10^-5
 1.×10^-4   30    11     3.59054×10^-6    2.22713×10^-5    6.40648×10^-5
 1.×10^-4  100    11     2.94289×10^-6    1.48023×10^-6    1.23111×10^-5
 1.×10^-4  500    11     3.57050×10^-6    3.73530×10^-6    7.39750×10^-6
 1.×10^-5   30    15     7.07509×10^-6    2.61852×10^-5    5.71276×10^-5
 1.×10^-5  100    15     5.26615×10^-7    2.42620×10^-6    5.37553×10^-6
 1.×10^-5  500    15     1.02431×10^-7    1.70418×10^-7    4.62062×10^-7
 1.×10^-6   30    18     7.19313×10^-6    2.61290×10^-5    5.68934×10^-5
 1.×10^-6  100    18     6.43802×10^-7    2.37011×10^-6    5.14132×10^-6
 1.×10^-6  500    18     1.46757×10^-8    1.14344×10^-7    2.27858×10^-7

4.2 A less-than-successful example

As suggested by the convergence theorems, the FBSM has limitations. In this subsection, we show an example for which the method does not converge. Further experimentation reveals some relationships between the Lipschitz constants and the length of the time interval in the example. Finally, a serendipitous solution is found by averaging iterates.

The example is found in [20], p. 17, where its purpose is to illustrate the use of the Maximum Principle to construct a closed-form solution; it appears in [20] prior to any discussion of the FBSM. It is another linear-quadratic problem, quite similar to the previous example. The problem is to choose the control u(t) so that

$$\max_u\; -\frac{1}{2}\int_0^1 3x(t)^2 + u(t)^2\, dt \tag{4.45}$$

subject to the state equation

$$x'(t) = x(t) + u(t), \qquad x(0) = 1. \tag{4.46}$$

Using the Maximum Principle to find the associated linear differential equations, we find

$$x'(t) = x(t) + u(t), \qquad x(0) = 1$$
$$\lambda'(t) = -3x(t) - \lambda(t), \qquad \lambda(1) = 0$$
$$u(t) = -\lambda(t).$$

The solution is

$$x(t) = \frac{3e^{2t} + e^{4-2t}}{3 + e^4}, \qquad \lambda(t) = \frac{3(e^{4-2t} - e^{2t})}{3 + e^4}.$$

Using the FBSM with the control initially set to be identically zero and δ = .001, the method fails to terminate even after thousands of iterations of the differential equation solvers.

Since the coefficient 3 in the objective function is large, we experimented with a parameterized version of the example. Let m and T be positive parameters and consider the related problem of choosing u(t) so that

$$\max_u\; -\frac{1}{2}\int_0^T m\,x(t)^2 + u(t)^2\, dt \tag{4.47}$$

subject to the state equation

$$x'(t) = x(t) + u(t), \qquad x(0) = 1. \tag{4.48}$$

Table 2 shows examples of m and the largest value of T for which the algorithm achieves tolerance δ = 0.0001 using N = 100 subintervals on the interval [0, T], with the maximum number of iterations allowed being 1500. The values of T are the largest such that the algorithm using T + .01 does not meet the tolerance in fewer than 1500 iterations.

Table 2
 m     T     count
 2.5  1.06   1359
 2.6  1.04    136
 2.7  1.03    456
 2.8  1.02    788
 2.9  1.00    217
 3.0  0.99    244
 3.1  0.98    267
 3.2  0.97    280
 3.3  0.96    281

Table 3
 m     T     count
 2.5  0.90     14
 2.6  0.90     16
 2.7  0.90     17
 2.8  0.90     19
 2.9  0.90     21
 3.0  0.90     23
 3.1  0.90     26
 3.2  0.90     29
 3.3  0.90     34
 3.4  0.90     39
 3.5  0.90     47

The large values of count in Table 2 should be compared with those in Table 3, in which T is fixed at the value T = 0.9.

Figure 1: Approximation of x(t) from above and below

While examining the original example of this subsection with m = 3 and T = 1, we observed the following behavior of the state variable: for large enough k and for all j = 1, 2, ..., N, we find (1) $x^{2k}_j < x(t_j) < x^{2k+1}_j$ and (2) $x^k_j = x^{k+2}_j$. That is, the algorithm produced alternating upper and lower approximations to the state variable that did not converge to the solution. A convergent alternation phenomenon had been seen previously in examples that achieved the required tolerance. Figure 1 shows a few iterations illustrating this phenomenon for m = 2.9, T = 1, and N = 30, together with the actual solution. See also p. 181 in [20]. The situation is different for m = 3 and T = 1, since the iterations do not converge to the solution; they represent period-2 solutions of the discrete system that differ from the actual solution. The iterates eventually settle into a pattern with period 2, as seen in Figure 2 with N = 30. But all is not lost: the average of the upper and lower estimates is the actual solution! So even though the FBSM itself does not yield the solution, the average of the iterates does yield the solution in this example.

5 More general problems

The Forward-Backward Sweep Method can be generalized to other optimal control problems. Lenhart and Workman [20] show how to do this for problems with bounded controls, $a \le u(t) \le b$, by changing the characterization of u from, say, $u(t) = h_3(t, x(t), \lambda(t))$ in (3.3) to $u(t) = \min(b, \max(a, h_3(t, x(t), \lambda(t))))$. Problems involving fixed endpoints, i.e., both x(0) = x_0 and x(T) = x_T given, are also

Figure 2: Periodic, nonconvergent approximations of x(t)

considered in [20] using an Adapted Forward-Backward Sweep Method. The added feature is a shooting method. Guess a value for the co-state at the terminal time: λ(T) = θ. Use the FBSM to find the value of the state at the terminal time, x(T) = x_N. The idea is to think of the map from θ to $x_N = x_{N,\theta}$ as a function and then use a secant method to find a root of $V(\theta) = x_T - x_{N,\theta}$. A similar method is described to solve problems with so-called scrap functions, in which the goal is to optimize a functional of the form $\phi(x(T)) + \int_0^T f(t, x(t), u(t))\,dt$. It is also shown how to use a modification of the method to solve free terminal time problems in which T is also a choice variable. For each of the problems just described, the code seems to work well, and it would be of interest to have convergence results similar to those in Section 3 for these additional methods.

References

[1] U.M. Ascher and L.R. Petzold, Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia, 1998.

[2] K. Atkinson, An Introduction to Numerical Analysis, 2nd ed., John Wiley, New York, 1989.

[3] K. Atkinson and W. Han, Theoretical Numerical Analysis: A Functional Analysis Framework, 3rd ed., Springer-Verlag, New York, 2009.

[4] E. Bertolazzi, F. Biral, and M. Da Lio, Symbolic-numeric efficient solution of optimal control problems for multibody systems, J. Comput. Appl. Math. 185 (2006) 404-421.

[5] J.T. Betts, Survey of numerical methods for trajectory optimization, Journal of Guidance, Control, and Dynamics, 21 (1998) 193-207.

[6] J.T. Betts, Practical Methods for Optimal Control Using Nonlinear Programming, SIAM, Philadelphia, 2001.

[7] R. Bulirsch, E. Nerz, H.J. Pesch, and O. von Stryk, Combining direct and indirect methods in optimal control: range maximization of a hang glider, in Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods (papers from the Second Conference held at the University of Freiburg, May 26-June 1, 1991), R. Bulirsch, A. Miele, J. Stoer, and K.H. Well, eds., International Series of Numerical Mathematics, 111, Birkhäuser Verlag, Basel, 1993.

[8] F.L. Chernousko and A.A. Lyubushin, Method of successive approximations for solution of optimal control problems, Optimal Control Applications and Methods, 3 (1982) 101-114.

[9] A.L. Dontchev, Error estimates for a discrete approximation to constrained control problems, SIAM J. Numer. Anal., 18 (1981) 500-514.

[10] A.L. Dontchev, W.W. Hager, and K. Malanowski, Error bounds for the Euler approximation of a state and control constrained optimal control problem, Numer. Funct. Anal. Optim., 21 (2000) 653-682.

[11] A.L. Dontchev, W.W. Hager, and V.M. Veliov, Second-order Runge-Kutta approximations in control constrained optimal control, SIAM J. Numer. Anal., 38 (2000) 202-226.

[12] A.L. Dontchev and W.W. Hager, The Euler approximation in state constrained optimal control, Math. Comp., 70 (2000) 173-203.

[13] W.H. Enright and P.H. Muir, Efficient classes of Runge-Kutta methods for two-point boundary value problems, Computing, 37 (1986) 315-334.

[14] W. Hackbusch, A numerical method for solving parabolic equations with opposite orientations, Computing, 20 (1978) 229-240.

[15] W.W. Hager, Rates of convergence for discrete approximations to unconstrained control problems, SIAM J. Numer. Anal., 13 (1976) 449-472.

[16] E. Hairer and G.
Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer, Berlin, 1996.

[17] A. Iserles, A First Course in the Numerical Analysis of Differential Equations, Cambridge University Press, Cambridge, 2009.

[18] H.B. Keller, Numerical Methods for Two-Point Boundary-Value Problems, Blaisdell Publishing Company, Waltham, Mass., 1968.

[19] P. Kunkel and V. Mehrmann, Differential-Algebraic Equations: Analysis and Numerical Solution, European Mathematical Society, Zürich, 2006.

[20] S. Lenhart and J.T. Workman, Optimal Control Applied to Biological Models, Chapman & Hall/CRC, Boca Raton, 2007.

[21] S.K. Mitter, The successive approximation method for the solution of optimal control problems, Automatica, 3 (1966) 135-149.

[22] A. Quarteroni and A. Valli, Numerical Approximation of Partial Differential Equations, Springer-Verlag, Berlin, 2008.

[23] A. Seierstad and K. Sydsaeter, Optimal Control with Economic Applications, North-Holland, Amsterdam, 1987.

[24] J. Vlassenbroeck and R. Van Dooren, A Chebyshev technique for solving nonlinear optimal control problems, IEEE Trans. Automat. Control, 33 (1988) 333-349.