A CURSE-OF-DIMENSIONALITY-FREE NUMERICAL METHOD FOR SOLUTION OF CERTAIN HJB PDES


WILLIAM M. MCENEANEY

Abstract. In previous work of the author and others, max-plus methods have been explored for solution of first-order, nonlinear Hamilton-Jacobi-Bellman partial differential equations (HJB PDEs) and corresponding nonlinear control problems. These methods exploit the max-plus linearity of the associated semigroups. In particular, although the problems are nonlinear, the semigroups are linear in the max-plus sense. These methods have been used successfully to compute solutions. Although they provide certain computational-speed advantages, they still generally suffer from the curse-of-dimensionality. Here we consider HJB PDEs in which the Hamiltonian takes the form of a (pointwise) maximum of linear/quadratic forms. The approach to solution will be rather general but, in order to ground the work, we consider only constituent Hamiltonians corresponding to long-run average-cost-per-unit-time optimal control problems for the development. We obtain a numerical method not subject to the curse-of-dimensionality. The method is based on construction of the dual-space semigroup corresponding to the HJB PDE. This dual-space semigroup is constructed from the dual-space semigroups corresponding to the constituent linear/quadratic Hamiltonians. The dual-space semigroup is particularly useful due to its form as a max-plus integral operator with kernel obtained from the originating semigroup. One considers repeated application of the dual-space semigroup to obtain the solution.

Key words. partial differential equations, curse-of-dimensionality, dynamic programming, max-plus algebra, Legendre transform, Fenchel transform, semiconvexity, Hamilton-Jacobi-Bellman equations, idempotent analysis.

AMS subject classifications. 49LXX, 93C, 3B37, 3F2, 6N99, 47D99.

1. Introduction. One approach to nonlinear control is through Dynamic Programming (DP).
With DP, solution of the control problem reduces to solution of the corresponding partial differential equation (PDE). In the case of deterministic optimal control, or of deterministic games where one player's feedback is prespecified, the PDE is a Hamilton-Jacobi-Bellman (HJB) PDE. If one can solve the HJB PDE, then this approach is ideal in that one obtains the optimal control for the given criterion, as opposed to a control meeting only some weaker goal such as stability. The problem is that one must solve the HJB PDE! We should remark that such HJB PDEs also arise in Robust/H∞ nonlinear filtering and Robust/H∞ control under partial information. Various approaches have been taken to solution of the HJB PDE. First note that it is a fully nonlinear, first-order PDE. Consequently, the solutions are generally nonsmooth (with the exception of the linear/quadratic case, of course), and one must use the theory of viscosity solutions [3], [9], [ ], [ ], [2]. One approach to solution is through generalized characteristics (cf. [37], [38], as well as [6], [22] for classical treatments). This approach can obtain the solution very quickly at a single point if the solution is smooth. However, the nonsmoothness introduces tremendous difficulties which appear, to the author, to be difficult to handle in an automated approach. In particular, the projections of the characteristics into the state space can cross and/or may not cover the entire state space (in analogy with shocks and rarefaction waves). The most common methods by far all fall into the class of grid-based methods (cf. [3], [3], [4], [2], [26] among many others). These require that one generate a grid over some bounded region of the state space. In this general class of methods, we include finite-difference methods, finite-element methods, and those dynamic-programming-based methods which map the continuum problem onto some discrete space. Although higher-order grid-based methods are being explored (cf. [ ], [ ], [7]), there are still hard lower limits to the computational growth as a function of space dimension.

Research partially supported by NSF Grant DMS and AFOSR Grant FA. Dept. of Mechanical and Aerospace Engineering, University of California San Diego, 9 Gilman Dr., La Jolla, CA, USA (wmceneaney@ucsd.edu).

In particular, suppose the region over which one constructs the grid is rectangular, say square for simplicity. Further, suppose one uses

C grid points per dimension, for some constant C. (Even a modest C could be a bit sparse.) If the state dimension is n, then one has C^n grid points. Thus the computations grow exponentially in the state-space dimension n. If the computations per grid point grew with state-space dimension like 2^n, then the total computations would grow like (2C)^n. For concreteness, we discuss only the steady-state PDE case here. If the state-space dimension is 3, these problems are feasible to solve on current-generation machinery. However, the computations will grow by a factor of roughly C^3 in going from a dimension-3 problem to a dimension-6 problem. Parallel algorithms can alleviate this problem to some extent (cf. [4]). However, there can only be rather limited improvement in the dimension of problems which can be handled by such techniques. In recent years, an entirely new class of numerical methods for HJB PDEs has emerged [2], [3], [ ], [24], [33], [32], [34], [3], [29]. These methods exploit the max-plus (or min-plus [8], [33]) linearity of the associated semigroup. They employ a max-plus basis-function expansion of the solution, and the numerical methods obtain the coefficients in the basis expansion. We will refer to these methods as max-plus basis methods. Much of the work has concentrated on the (harder) steady-state HJB PDE class where (for both max-plus basis and grid-based methods) one propagates forward in time to obtain the steady-state limit solution. With the max-plus basis methods, the number of basis functions required still typically grows exponentially with space dimension. For instance, one might use 2 basis functions per space dimension. Consequently, one still has the curse-of-dimensionality.
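The counts just discussed are easy to make concrete. The following sketch (the choice of 100 points per dimension is an assumption for illustration only) compares the exponential node count of a grid-based method with the cubic, Riccati-type per-step cost exploited later in the paper.

```python
# Illustration of the growth rates discussed above: a grid-based method needs
# points_per_dim ** n nodes, while one n x n Riccati-type step costs O(n^3).
# points_per_dim = 100 is an assumed, illustrative resolution.

def grid_nodes(n, points_per_dim=100):
    """Total number of nodes in an n-dimensional rectangular grid."""
    return points_per_dim ** n

def riccati_flops(n):
    """Rough O(n^3) operation count for one n x n matrix inversion/multiplication."""
    return n ** 3

for n in (3, 6):
    print(n, grid_nodes(n), riccati_flops(n))
```

Doubling the dimension from 3 to 6 multiplies the grid size by a factor of 10^6 at this resolution, while the cubic cost grows only by a factor of 8.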
With the max-plus basis methods, the time-step tends to be much larger than what can be used in grid-based methods (since it encapsulates the action of the semigroup propagation on each basis function), and so these methods can be quite fast on small problems. Even with a max-plus basis approach, the curse-of-dimensionality growth is so fast that one cannot expect to solve general problems beyond quite modest dimensions on current machinery, and again the computing-machinery speed increases expected in the foreseeable future cannot do much to raise this. Many researchers have noticed that the introduction of even a single, simple nonlinearity into an otherwise linear control problem of high dimensionality, say n, has disastrous computational repercussions. Specifically, one goes from solution of an n-dimensional Riccati equation to solution by a grid-based or max-plus basis method over a space of dimension n. While the Riccati equation may be relatively easily solved for large n, the max-plus and grid-based methods have no hope on general problems of dimension, say, n = 6. This has been a frustrating, counter-intuitive situation for decades. This paper discusses an approach to certain nonlinear HJB PDEs which is not subject to the curse-of-dimensionality. Although this approach also utilizes the max-plus algebra, the method is largely unrelated to the max-plus basis approaches discussed above. In fact, for this new method, the computational growth in state-space dimension is on the order of n^3. There is of course no free lunch, and there is exponential computational growth in a certain measure of complexity of the Hamiltonian. Under this measure, the minimal-complexity Hamiltonian is the linear/quadratic Hamiltonian corresponding to solution by a Riccati equation. If the Hamiltonian is given as a pointwise maximum or minimum of M linear/quadratic Hamiltonians, then one could say the complexity of the Hamiltonian is M.
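Since the max-plus algebra does the heavy lifting in what follows, a minimal sketch may help: one takes a ⊕ b = max(a, b) and a ⊗ b = a + b, and an operator of the form S[φ](x) = max_y (K(x, y) + φ(y)) is then linear over this semiring. The kernel and functions below are arbitrary illustrative values, not data from the paper.

```python
# Minimal max-plus arithmetic: a (+) b = max(a, b), a (x) b = a + b.
NEG_INF = float('-inf')  # the max-plus additive identity ("zero")

def oplus(a, b):
    return max(a, b)

def otimes(a, b):
    return a + b

def apply_kernel(K, phi):
    """S[phi](i) = max_j (K[i][j] + phi[j]) -- a max-plus linear operator."""
    n = len(phi)
    return [max(otimes(K[i][j], phi[j]) for j in range(n)) for i in range(n)]

# Max-plus linearity: S[a (x) phi1 (+) b (x) phi2] = a (x) S[phi1] (+) b (x) S[phi2].
K = [[0.0, -1.0], [-2.0, 0.5]]
phi1, phi2 = [1.0, 0.0], [0.0, 2.0]
a, b = 3.0, -1.0
lhs = apply_kernel(K, [oplus(otimes(a, p1), otimes(b, p2)) for p1, p2 in zip(phi1, phi2)])
rhs = [oplus(otimes(a, u), otimes(b, v)) for u, v in zip(apply_kernel(K, phi1), apply_kernel(K, phi2))]
print(lhs == rhs)
```

The identity checked here is exactly the max-plus linearity of the semigroups referred to above, specialized to a finite index set.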
One could also apply this approach to a wider class of HJB PDEs with semiconvex Hamiltonians (by approximation of the Hamiltonian by a finite number of quadratic forms), but that is certainly beyond the scope of this paper. The approach has been applied to some simple nonlinear problems. A steady-state HJB PDE comprised of 2 linear/quadratic components was solved in dimensions 2-3 in a matter of seconds on a standard PC, and in 2 seconds over IR^4. A few simple examples comprised of 3 linear/quadratic components were solved in a few seconds over IR^3 and over IR^4. For these particular problems, the solution was obtained over the entire space (as opposed to a rectangular region), with the resulting errors in the gradients growing linearly in |x|. (See Section 7 for more information on specific examples.) These speeds are of course unprecedented among standard general approaches to nonlinear PDEs. This code was not optimized, and there are many computational-cost-reduction methods that one could employ to further reduce the computational growth. Further, the computational growth in going from n = 4 up to, say, n = 6

would be on the order of 6^3/4^3 ≈ 3.4, as opposed to a factor of C^2 for a grid-based method with C grid points per dimension. We will be concerned here with HJB PDEs of the form

0 = -H(x, ∇V),

where the Hamiltonians are given or approximated as

H(x, ∇V) = max_{m ∈ {1,2,...,M}} { H^m(x, ∇V) }.

In order to make the problem tractable, we will concentrate on a single class of HJB PDEs -- those for long-run average-cost-per-unit-time problems. However, the theory can obviously be expanded to a much larger class. Since the development of the proposed method in the following sections takes quite a few pages, we briefly outline the main points here. First, recall that the solution of the above PDE is an eigenfunction of the corresponding semigroup; that is,

0 ⊗ V = S_τ[V],

where ⊕, ⊗ denote max-plus addition (maximization) and multiplication (addition), 0 is the max-plus multiplicative identity, and we note that S_τ is max-plus linear (cf. [2], [28], [33]). The Legendre/Fenchel transform maps this to the dual-space eigenfunction problem

0 ⊗ e = B_τ ⊙ e,

where we use the notation ⊙ to indicate

B_τ ⊙ e = ∫⊕_{IR^n} B_τ(x, y) ⊗ e(y) dy,

and ∫⊕ denotes max-plus integration (maximization). Then one approximates B_τ ≃ ⊕_{m ∈ M} B^m_τ, where M ≐ {1, 2, ..., M} and the B^m_τ correspond to the H^m. The max-plus power method ([2], [2], [33]) suggests that the solution is approximated in the form

e ≃ lim_{N→∞} [ ⊕_{m ∈ M} B^m_τ ]^{⊙N} ⊙ 0 = lim_{N→∞} ⊕_{{m_i}_{i=1}^N} B^{m_1}_τ ⊙ B^{m_2}_τ ⊙ ... ⊙ B^{m_N}_τ ⊙ 0,

where the superscript ⊙N denotes application of the operation N times, and 0 here represents the zero function. Given linear/quadratic forms for each of the H^m, the B^m_τ are obtained by Riccati equations. Let e^N = [ ⊕_{m ∈ M} B^m_τ ]^{⊙N} ⊙ 0. Then e^N → e. The convergence rate does not depend on the space dimension, but on the dynamics of the problem. There is no curse-of-dimensionality. The exponential growth is in M = #M. Given the solutions of the Riccati equations for the H^m, the computation of each product B^{m_1}_τ ⊙ B^{m_2}_τ ⊙ ... ⊙ B^{m_N}_τ is analytical, modulo n×n matrix inversions (and hence the n^3 computational growth rate).
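On a finite discretization the max-plus power method referred to above reduces to iterating a matrix in max-plus arithmetic and renormalizing. The 3×3 kernel below is a hypothetical stand-in for the Riccati-derived kernels constructed later; only the iteration and normalization pattern is the point of this sketch.

```python
# Sketch of the max-plus power method: iterate a <- B (.) a, where
# (B (.) a)(i) = max_j (B[i][j] + a[j]), then normalize so that a[0] = 0
# (subtraction being the max-plus analogue of division).
# The kernel B is an arbitrary illustrative stand-in, not from the paper.

def maxplus_apply(B, a):
    return [max(Bij + aj for Bij, aj in zip(row, a)) for row in B]

def power_method(B, iters=50):
    a = [0.0] * len(B)
    for _ in range(iters):
        a = maxplus_apply(B, a)
        a = [ai - a[0] for ai in a]   # max-plus normalization
    return a

B = [[-0.1, -1.0, -3.0],
     [-1.0, -0.2, -1.5],
     [-3.0, -1.5, -0.3]]
a = power_method(B)

# Fixed-point check: B (.) a = lam (x) a, with lam the max-plus eigenvalue.
Ba = maxplus_apply(B, a)
lam = Ba[0] - a[0]
print(all(abs((Bai - ai) - lam) < 1e-9 for Bai, ai in zip(Ba, a)))
```

For this kernel the max-plus eigenvalue is the maximal cycle mean (here the self-loop value -0.1), and the iteration converges in finitely many steps.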
In Section 2, the class of control problems and HJB PDEs which we will use to demonstrate the theory will be given. We will also review the existing theory relevant to our problem there. In Section 3, the relation between solutions of the HJB PDEs and their corresponding semiconvex dual problems will be discussed. In Section 4, a discrete-time approximation of the semigroup for the problem of interest will be introduced, and convergence of the solutions of the approximate problems to that of the original problem will be obtained. The algorithm itself will be developed in Section 5. The basic algorithm is not subject to the curse-of-dimensionality. However, practical implementation requires some additional work; some initial remarks on this appear in Section 6. The algorithm is applied to some simple examples in Section 7. Finally, Section 8 sketches some future directions.

2. Sample problem class and review of theory. There are certain conditions which must be satisfied for solutions to exist and for the method to apply. In order that the assumptions not be completely abstract, we will work with a specific problem class -- the infinite-time-horizon H∞ problem with fixed feedback. This class consists of long-run average-cost-per-unit-time problems. Moreover, it is a problem

class where there already exists a good deal of results, and so less analysis will be required for application of the new method. As indicated above, we suppose the individual H^m are linear/quadratic Hamiltonians. Consequently, consider a finite set of linear systems

(2.1)  ξ̇^m = A^m ξ^m + σ^m w,  ξ^m_0 = x ∈ IR^n.

Let w ∈ W ≐ L^loc_2([0,∞); IR^m), where we recall that L^loc_2([0,∞); IR^m) = { w : [0,∞) → IR^m : ∫_0^T |w_t|^2 dt < ∞ for all T < ∞ }. Let the cost functionals be

(2.2)  J_m(x, T; w) ≐ ∫_0^T (1/2)(ξ^m_t)^T D^m ξ^m_t - (γ^2/2)|w_t|^2 dt,

and let the value function (also known as the available storage in this context) be

(2.3)  V^m(x) = sup_{w ∈ W} sup_{T < ∞} J_m(x, T; w) = lim_{T→∞} sup_{w ∈ W} J_m(x, T; w).

We remark that a generalization of the second term in the integrand of the cost functional to (γ^2/2) w_t^T C^m w_t, with C^m symmetric and positive definite, is not needed, since this is equivalent to a change of σ^m in the dynamics (2.1). Obviously J_m and V^m require some assumptions in order to guarantee their existence. The assumptions will hold throughout the paper. Since these assumptions only appear together, we will refer to this entire set of assumptions as Assumption Block (A.m), and this is:

Assume that there exists c_A ∈ (0, ∞) such that x^T A^m x ≤ -c_A |x|^2 for all x ∈ IR^n, m ∈ M. Assume that there exists c_σ < ∞ such that |σ^m| ≤ c_σ for all m ∈ M. Assume that all D^m are positive definite and symmetric, and let c_D be such that x^T D^m x ≤ c_D |x|^2 for all x ∈ IR^n, m ∈ M (which is obviously equivalent to all eigenvalues of the D^m being no greater than c_D). Lastly, assume that γ^2/c_σ^2 > c_D/c_A^2.  (A.m)

Note that these assumptions guarantee the existence of the V^m as locally bounded functions which are zero at the origin (cf. [36]). (These assumptions could be weakened by using the specific linear/quadratic structure, but that would distract from the goal of this paper.)
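Assumption Block (A.m) is straightforward to verify numerically for given data. The sketch below checks each condition for a single illustrative model (all matrices, σ, and γ here are hypothetical stand-ins, not data from the paper).

```python
import numpy as np

# Check Assumption Block (A.m) for one sample model:
#   x^T A x <= -c_A |x|^2,  |sigma| <= c_sigma,  D symmetric positive definite
#   with eigenvalues <= c_D,  and  gamma^2 / c_sigma^2 > c_D / c_A^2.
# All concrete values are illustrative assumptions.

A = np.array([[-2.0, 0.3], [0.3, -1.5]])
D = np.array([[1.0, 0.2], [0.2, 0.8]])
sigma = np.array([[0.5], [0.4]])
gamma = 2.0

# c_A: the symmetric part of A must have all eigenvalues <= -c_A < 0.
c_A = -np.linalg.eigvalsh((A + A.T) / 2).max()
c_sigma = np.linalg.norm(sigma, 2)          # operator norm of sigma
c_D = np.linalg.eigvalsh(D).max()           # largest eigenvalue of D

assert c_A > 0 and np.all(np.linalg.eigvalsh(D) > 0)
print(gamma**2 / c_sigma**2 > c_D / c_A**2)
```

The final inequality is the gain condition of (A.m); when it fails, the available storage (2.3) need not be finite.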
The corresponding HJB PDEs are

(2.4)  0 = -H^m(x, ∇V) = -[ (1/2) x^T D^m x + (A^m x)^T ∇V + max_{w ∈ IR^m} { (σ^m w)^T ∇V - (γ^2/2)|w|^2 } ]
         = -[ (1/2) x^T D^m x + (A^m x)^T ∇V + (1/2) ∇V^T Σ^m ∇V ],  V(0) = 0,

where Σ^m ≐ (1/γ^2) σ^m (σ^m)^T. Let IR^- ≐ IR ∪ {-∞}. Recall that a function φ : IR^n → IR^- is semiconvex if, given any R ∈ (0, ∞), there exists k_R ∈ IR such that φ(x) + (k_R/2)|x|^2 is convex over B_R(0) = { x ∈ IR^n : |x| ≤ R }. For a fixed choice of c_A, c_σ, γ > 0 satisfying the above assumptions, and for any δ ∈ (0, γ), we define

G_δ = { V : IR^n → [0, ∞) | V is semiconvex and 0 ≤ V(x) ≤ [c_A (γ - δ)^2 / c_σ^2] |x|^2 for all x ∈ IR^n }.

From [36] (undoubtedly among many others), each value function (2.3) is the unique viscosity solution of its corresponding HJB PDE (2.4) in the class G_δ for sufficiently small δ > 0. From the structure of the running cost and dynamics, it is easy to see (cf. [42], [36]) that each V^m satisfies

(2.5)  V^m(x) = sup_{T < ∞} sup_{w ∈ W} J_m(x, T; w) = lim_{T→∞} sup_{w ∈ W} J_m(x, T; w) ≐ lim_{T→∞} V^{m,f}(x, T),

and that each V^{m,f} is the unique continuous viscosity solution of (cf. [3], [2])

(2.6)  0 = V_T - H^m(x, ∇_x V),  V(0, x) = 0.

It is easy to see that these solutions have the form V^{m,f}(x, t) = (1/2) x^T P^{m,f}_t x, where each P^{m,f} satisfies the differential Riccati equation

(2.7)  Ṗ^{m,f} = (A^m)^T P^{m,f} + P^{m,f} A^m + D^m + P^{m,f} Σ^m P^{m,f},  P^{m,f}_0 = 0.

By (2.5) and (2.7), the V^m take the form V^m(x) = (1/2) x^T P^m x, where P^m = lim_{t→∞} P^{m,f}_t. With this form, and (2.4) (or (2.7)), we see that the P^m satisfy the algebraic Riccati equations

(2.8)  0 = (A^m)^T P^m + P^m A^m + D^m + P^m Σ^m P^m.

Combining this with the above, one has:

Theorem 2.1. Each value function (2.3) is the unique classical solution of its corresponding HJB PDE (2.4) in the class G_δ for sufficiently small δ > 0. Further, V^m(x) = (1/2) x^T P^m x, where P^m is the smallest symmetric, positive definite solution of (2.8).

The duality between viscosity (and/or classical) solutions of the HJB PDEs and the corresponding value functions is certainly very important. However, the method we will use to obtain these value functions/HJB PDE solutions will be through the associated semigroups. These semigroups are equivalent to dynamic programming principles (DPPs). Consequently, for each m define the semigroup

(2.9)  S^m_T[φ](x) ≐ sup_{w ∈ W} [ ∫_0^T (1/2)(ξ^m_t)^T D^m ξ^m_t - (γ^2/2)|w_t|^2 dt + φ(ξ^m_T) ],

where ξ^m satisfies (2.1). By [36], the domain of S^m_T includes G_δ for all δ > 0. The following result is similar to that in [33]; the only significant difference is that, in this case, V^m(x) = (1/2) x^T P^m x is smooth.

Theorem 2.2. Fix any T > 0.
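Equations (2.7)-(2.8) suggest the simplest possible numerical check: propagate the differential Riccati equation forward (here by explicit Euler, with illustrative data satisfying (A.m)) and verify that the limit satisfies the algebraic Riccati equation. A production implementation would use a dedicated ARE solver; this is only a sketch.

```python
import numpy as np

# Forward Euler for the differential Riccati equation (2.7),
#   Pdot = A^T P + P A + D + P Sigma P,  P(0) = 0,
# run to (near) steady state; the limit should satisfy the algebraic
# Riccati equation (2.8). Problem data are illustrative and chosen to
# satisfy (A.m), so that P_t converges.

A = np.array([[-2.0, 0.0], [0.0, -1.5]])
D = np.array([[1.0, 0.0], [0.0, 0.5]])
sigma = np.array([[0.5], [0.4]])
gamma = 2.0
Sigma = sigma @ sigma.T / gamma**2

P = np.zeros((2, 2))
dt = 2e-3
for _ in range(50_000):                       # integrate out to t = 100
    P = P + dt * (A.T @ P + P @ A + D + P @ Sigma @ P)

residual = A.T @ P + P @ A + D + P @ Sigma @ P
print(np.linalg.norm(residual) < 1e-6)
```

A fixed point of the Euler map is an exact root of the right-hand side, so the residual of (2.8) shrinks to numerical precision as the iteration converges.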
Each value function, V m, is the unique smooth solution of V = S m T [V in the class G δ for sufficiently small δ >. Further, given any V G δ, lim T S m T [V (x) = V m (x) for all x IR n (uniformly on compact sets). Recall that the HJB PDE problem of interest is (2.) = H(x, gradv ). = max m M Hm (x, gradv ), V () =. The corresponding value function is (2.) Ṽ (x) = sup sup w W µ D J(x, w, µ). = sup sup sup w W µ D T< T l µt (ξ t ) γ2 2 w t 2 dt where l µt (x) = 2 xt D µt x, D = {µ : [, ) M : measurable }, and ξ satisfies

(2.12)  ξ̇ = A^{µ_t} ξ + σ^{µ_t} w_t,  ξ_0 = x.

Theorem 2.3. The value function Ṽ is the unique viscosity solution of (2.10) in the class G_δ for sufficiently small δ > 0.

Remark 2.4. The proof of Theorem 2.3 is nearly identical to the proofs of Theorems 2.1 and 2.6 from [36], with only trivial changes, and so is not included. In particular, rather than choosing any w ∈ W, one chooses both any w ∈ W and any µ ∈ D.

Define the semigroup

(2.13)  S̃_T[φ] = sup_{w ∈ W} sup_{µ ∈ D_T} [ ∫_0^T l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + φ(ξ_T) ],

where D_T = { µ : [0, T) → M : measurable }. In analogy with Theorem 2.2, one has:

Theorem 2.5. Fix any T > 0. The value function Ṽ is the unique continuous solution of V = S̃_T[V] in the class G_δ for sufficiently small δ > 0. Further, given any V ∈ G_δ, lim_{T→∞} S̃_T[V](x) = Ṽ(x) for all x ∈ IR^n (uniformly on compact sets).

The proof is nearly identical to the proof of a similar result in [33], and so is not included. In particular, the only change is the addition of the supremum over D_T, which makes no substantial change in the proof. Importantly, we also have the following.

Theorem 2.6. There exists c_V > 0 such that Ṽ(x) - (c_V/2)|x|^2 is convex; in particular, Ṽ is strictly convex.

Proof. Fix any x, ν ∈ IR^n with |ν| = 1, and any δ > 0. Let ε > 0. Given x, let w^ε ∈ W, µ^ε ∈ D be ε-optimal for Ṽ(x). Then

(2.14)  Ṽ(x - δν) - 2Ṽ(x) + Ṽ(x + δν) ≥ J(x - δν, w^ε, µ^ε) - 2J(x, w^ε, µ^ε) + J(x + δν, w^ε, µ^ε) - 2ε.

Let ξ^δ, ξ^0, ξ^{-δ} be solutions of dynamics (2.12), but with initial conditions ξ^δ_0 = x + δν, ξ^0_0 = x and ξ^{-δ}_0 = x - δν, respectively, where the inputs are w^ε and µ^ε for all three processes. Then

(2.15)  d/dt [ξ^δ - ξ^0] = A^{µ^ε_t} [ξ^δ - ξ^0],  and  d/dt [ξ^0 - ξ^{-δ}] = A^{µ^ε_t} [ξ^0 - ξ^{-δ}].

Letting Δ^+_t ≐ ξ^δ_t - ξ^0_t, one also has ξ^0_t - ξ^{-δ}_t = Δ^+_t, and by linearity one finds Δ̇^+ = A^{µ^ε_t} Δ^+ with Δ^+_0 = δν. Also, using (2.14) and (2.15),

(2.16)  Ṽ(x - δν) - 2Ṽ(x) + Ṽ(x + δν) ≥ ∫_0^∞ (Δ^+_t)^T D^{µ^ε_t} Δ^+_t dt - 2ε.

Also, by the finiteness of M, there exists K < ∞ such that d/dt |Δ^+|^2 = 2(Δ^+)^T A^{µ^ε_t} Δ^+ ≥ -K|Δ^+|^2, which implies

(2.17)  |Δ^+_t|^2 ≥ e^{-Kt} δ^2 for all t ≥ 0.

Let λ_D = min{ λ ∈ IR : λ is an eigenvalue of some D^m }.
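The objects in (2.11)-(2.13) are easy to instantiate numerically. The sketch below Euler-integrates the switched dynamics (2.12) for one hypothetical pair (w, µ) and accumulates the running cost from (2.11); it illustrates the ingredients only and makes no attempt at optimization over w and µ.

```python
import numpy as np

# One trajectory of the switched dynamics (2.12) with a hypothetical
# disturbance w and switching signal mu, accumulating the cost from (2.11).
# All data below are illustrative assumptions.

A = {0: np.array([[-2.0, 0.0], [0.0, -1.0]]),
     1: np.array([[-1.0, 0.5], [-0.5, -1.0]])}
sig = {0: np.array([0.3, 0.1]),    # sigma^m, mapping a scalar disturbance to IR^2
       1: np.array([0.1, 0.4])}
D = {0: np.diag([1.0, 0.5]), 1: np.diag([0.5, 1.0])}
gamma = 2.0

x = np.array([1.0, -1.0])
dt, T = 1e-3, 5.0
J = 0.0
for k in range(int(T / dt)):
    t = k * dt
    m = 0 if t < 2.5 else 1        # hypothetical piecewise-constant mu
    w = np.exp(-t)                 # hypothetical scalar disturbance w_t
    J += dt * (0.5 * x @ D[m] @ x - 0.5 * gamma**2 * w**2)
    x = x + dt * (A[m] @ x + sig[m] * w)

print(round(float(np.linalg.norm(x)), 6), round(float(J), 6))
```

Since both constituent A^m here satisfy the stability condition of (A.m) and the disturbance decays, the state contracts toward the origin and the accumulated cost stays finite.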
By the positive definiteness of the D^m and the finiteness of M, λ_D > 0. Consequently, by (2.16), and then (2.17),

(2.18)  Ṽ(x - δν) - 2Ṽ(x) + Ṽ(x + δν) ≥ λ_D ∫_0^∞ |Δ^+_t|^2 dt - 2ε ≥ (λ_D/K) δ^2 - 2ε.

Since ε > 0 and ν with |ν| = 1 were arbitrary, one obtains the result.

3. Max-plus spaces and dual operators. Again, recall that a function φ : IR^n → IR^- is semiconvex if, given any R ∈ (0, ∞), there exists β_R ∈ IR such that φ(x) + (β_R/2)|x|^2 is convex over B_R(0) = { x ∈ IR^n : |x| ≤ R }. We will modify this definition by allowing the β_R to be n×n symmetric, positive or negative definite matrices. We will denote the set of such matrices as D^n. We say φ is uniformly semiconvex with (symmetric, definite matrix) constant β ∈ D^n if φ(x) + (1/2) x^T β x is convex over IR^n. Let S_β = S_β(IR^n) be the set of functions mapping IR^n into IR^- which are uniformly semiconvex with (symmetric, definite matrix) constant β. (A negative definite semiconvexity constant corresponds to functions which are still convex after subtracting a convex quadratic.) Also note that S_β is a max-plus vector space (also known as a moduloid) [2], [33], [2], [7], [27]. For instance, α_1 ⊗ φ_1 ⊕ α_2 ⊗ φ_2 ∈ S_β for all α_1, α_2 ∈ IR and all φ_1, φ_2 ∈ S_β. Combining Theorems 2.1 and 2.6, we have the following.

Theorem 3.1. There exists β̄ ∈ D^n such that, given any β such that β - β̄ > 0 (i.e., β - β̄ positive definite), Ṽ ∈ S_β and V^m ∈ S_β for all m ∈ M. Further, one may take β̄ negative definite (i.e., Ṽ, V^m are convex).

We henceforth assume we have chosen β such that β - β̄ > 0. Throughout the remainder, we will employ certain transform kernel functions ψ : IR^n × IR^n → IR which take the form ψ(x, z) = (1/2)(x - z)^T C (x - z), with nonsingular, symmetric C satisfying C + β < 0 (i.e., C + β negative definite). The following semiconvex duality result [2], [32], [33] requires only a small modification of convex duality and Legendre/Fenchel transform results [39], [4].

Theorem 3.2. Let φ ∈ S_β. Let C and ψ be as above. Then, for all x ∈ IR^n,

(3.1)  φ(x) = max_{z ∈ IR^n} [ψ(x, z) + a(z)]
(3.2)       = ∫⊕_{IR^n} ψ(x, z) ⊗ a(z) dz = ψ(x, ·) ⊙ a(·),

where, for all z ∈ IR^n,

(3.3)  a(z) = -max_{x ∈ IR^n} [ψ(x, z) - φ(x)],

which, using the notation of [7],

(3.4)  = -∫⊕_{IR^n} ψ(x, z) ⊗ [-φ(x)] dx
(3.5)  = -{ ψ(·, z) ⊙ [-φ(·)] }.
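Theorem 3.2 can be checked numerically in one dimension. The sketch below computes the semiconvex dual of an illustrative quadratic φ on a grid, then reconstructs φ from a; the grid, the choice c = -4, and the boundary trimming are all assumptions of this sketch, and c is taken "more concave" than φ's semiconvexity constant so that the transform inverts.

```python
import numpy as np

# Numerical check of semiconvex duality (Theorem 3.2) in one dimension:
#   a(z) = -max_x [ psi(x, z) - phi(x) ],   phi(x) = max_z [ psi(x, z) + a(z) ],
# with quadratic kernel psi(x, z) = (c/2)(x - z)^2 and illustrative phi.

c = -4.0                       # kernel curvature; C + beta < 0 holds for phi below
xs = np.linspace(-3, 3, 1201)
phi = 0.5 * xs**2              # convex quadratic test function

psi = 0.5 * c * (xs[:, None] - xs[None, :])**2   # psi(x_i, z_j)
a = -np.max(psi - phi[:, None], axis=0)          # semiconvex dual over the x-grid
phi_rec = np.max(psi + a[None, :], axis=1)       # reconstruction over the z-grid

err = np.max(np.abs(phi_rec - phi)[200:-200])    # ignore grid-boundary effects
print(err < 1e-2)
```

For this φ and c the dual works out analytically to a(z) = 0.4 z^2, and the round trip recovers φ to within the grid resolution away from the boundary.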
We will refer to a as the semiconvex dual of φ (with respect to ψ).

Remark 3.3. We note that φ ∈ S_β implies that φ is locally Lipschitz (cf. [9]). We also note that if φ ∈ S_β and there is any x ∈ IR^n such that φ(x) = -∞, then φ ≡ -∞. Henceforth, we will ignore the special case of φ ≡ -∞, and assume that all functions are real-valued.

Semiconcavity is the obvious analogue of semiconvexity. In particular, a function φ : IR^n → IR ∪ {+∞} is uniformly semiconcave with constant β ∈ D^n if φ(x) - (1/2) x^T β x is concave over IR^n. Let Ŝ_β be the set of functions mapping IR^n into IR ∪ {+∞} which are uniformly semiconcave with constant β.

Lemma 3.4. Let φ ∈ S_β (still with C + β < 0), and let a be the semiconvex dual of φ. Then a ∈ Ŝ_d for some d ∈ D^n such that C + d < 0.

Proof. A proof only in the case φ ∈ C^2 is provided; in the more general case, a mollification argument can be employed. Noting that φ ∈ S_β and -(C + β) > 0, there exists a unique minimizer x(z) = argmin_{x ∈ IR^n} [φ(x) - ψ(x, z)], and one has

(3.6)  a(z) = φ(x(z)) - ψ(x(z), z).

Fix any z, ν ∈ IR^n with |ν| = 1. Define a_s : IR → IR and x_s : IR → IR^n by

(3.7)  a_s(δ) ≐ a(z + δν)  and  x_s(δ) ≐ x(z + δν).

We will obtain an upper bound on the second derivative of a_s, and this will prove the result. Differentiating a_s, one has

da_s/dδ |_{δ=0} = d/dδ [φ(x(z + δν)) - ψ(x(z + δν), z + δν)] |_{δ=0}
  = ∇_x φ(x(z)) · (dx_s/dδ) - ∇_x ψ(x(z), z) · (dx_s/dδ) - ∇_z ψ(x(z), z) · ν,

which, using the fact that ∇_x φ(x(z)) - ∇_x ψ(x(z), z) = 0,

  = -∇_z ψ(x(z), z) · ν.

Differentiating again, one finds

(3.8)  d^2 a_s/dδ^2 |_{δ=0} = -Σ_{i,j} ψ_{z_i x_j}(x(z), z) (dx_s/dδ)_j ν_i - Σ_{i,k} ψ_{z_i z_k}(x(z), z) ν_k ν_i
(3.9)    = -ν^T C ν + ν^T C (dx_s/dδ) |_{δ=0}.

Now, differentiating both sides of ∇_x φ(x(z + δν)) - ∇_x ψ(x(z + δν), z + δν) = 0 yields

Σ_j φ_{x_i x_j} (dx_s/dδ)_j - Σ_k ψ_{x_i x_k} (dx_s/dδ)_k - Σ_l ψ_{x_i z_l} ν_l = 0 for all i,

which yields

(3.10)  dx_s/dδ = -[φ_{xx}(x(z)) - C]^{-1} C ν.

Substituting (3.10) into (3.9), one obtains

(3.11)  d^2 a_s/dδ^2 = -ν^T { C - C [C - φ_{xx}(x(z))]^{-1} C } ν.

Now, φ ∈ S_β implies φ_{xx}(x(z)) + β ≥ 0, which implies that ν^T [-β - φ_{xx}(x(z))] ν ≤ 0 for all ν. Combining this with the fact that C + β < 0 implies that there exists k > 0 such that ν^T [C - φ_{xx}(x(z))] ν < -k|ν|^2 for all ν ≠ 0.

Now, C - φ_{xx}(x(z)) being symmetric and negative definite implies that C - φ_{xx}(x(z)) = UΛU^T for some diagonal Λ (with all diagonal entries negative, of course) and some real, unitary U. Consequently, [C - φ_{xx}(x(z))]^{-1} = UΛ^{-1}U^T < 0, that is, negative definite. Then, since ζ^T [C - φ_{xx}(x(z))]^{-1} ζ < 0 for all ζ ∈ IR^n, ζ ≠ 0, one sees that ν^T C [C - φ_{xx}(x(z))]^{-1} C ν < 0 for all ν ∈ IR^n, ν ≠ 0, and so

(3.12)  C [C - φ_{xx}(x(z))]^{-1} C < 0.

Let d ≐ -C + (1/2) C [C - φ_{xx}(x(z))]^{-1} C. Then, by (3.12), C + d = (1/2) C [C - φ_{xx}(x(z))]^{-1} C < 0. (Note that if d is not definite, then by addition of εI for arbitrarily small ε, one can make d definite without violating the inequalities.) Further, by (3.11) and (3.12),

d^2 a_s/dδ^2 = ν^T [ d + (1/2) C [C - φ_{xx}(x(z))]^{-1} C ] ν < ν^T d ν,

which yields the result.

Remark 3.5. Fix any δ > 0 such that Ṽ ∈ G_δ, and let K_δ = 2 c_A (γ - δ)^2 / c_σ^2, so that Ṽ(x) ≤ (K_δ/2)|x|^2. Then, using Lemma 3.4 and the monotonicity of the dual operations, the semiconvex dual ã of Ṽ is in Ŝ_d ∩ Ĝ_δ for some d ∈ D^n such that C + d < 0, where Ĝ_δ is the space of semiconcave functions satisfying ã(z) ≤ (1/2) z^T Q_δ z, with

Q_δ ≐ C (C - K_δ I)^{-1} K_δ (C - K_δ I)^{-1} C - K_δ^2 (C - K_δ I)^{-1} C (C - K_δ I)^{-1},

and where (1/2) z^T Q_δ z is the semiconvex dual of (K_δ/2)|x|^2. Further, by the monotonicity of the dual operations, any a ∈ Ĝ_δ has dual V ∈ G_δ.

Lemma 3.6. Let φ ∈ S_β with semiconvex dual a. Suppose b ∈ Ŝ_d, with C + d < 0, is such that φ = ψ(x, ·) ⊙ b(·). Then b = a.

Proof. Note that b ∈ Ŝ_d. Therefore, for all y ∈ IR^n, b(y) = -max_{ζ ∈ IR^n} [ψ(y, ζ) - α(ζ)], or equivalently,

(3.13)  b(y) = min_{ζ ∈ IR^n} [α(ζ) - ψ(y, ζ)],

where, for all ζ ∈ IR^n,

(3.14)  α(ζ) = max_{y ∈ IR^n} [ψ(y, ζ) + b(y)],  which by assumption  = φ(ζ).

Combining (3.13) and (3.14), and then using (3.3), one obtains

b(y) = -max_{ζ ∈ IR^n} [ψ(y, ζ) - φ(ζ)] = a(y) for all y ∈ IR^n.

We will hereafter refer to the uniqueness of the semiconvex dual in the sense of Lemma 3.6 simply as uniqueness of the semiconvex dual.
It will be critical to the method that the functions obtained by application of the semigroups to the ψ(·, z) be semiconvex with less concavity than the ψ(·, z) themselves. In other words, we will want, for instance, S̃_τ[ψ(·, z)] ∈ S_{-(C+εI)} for some ε > 0. This is the subject of the next theorem. Also, in order to keep the theorem statement clean, we will first make some definitions. Define

λ_D ≐ min{ λ ∈ IR : λ is an eigenvalue of D^m, m ∈ M }.

Note that the finiteness of M and the positive definiteness of the D^m imply that λ_D > 0. Let

I_C ≐ { C ∈ D^n : min_{|ν|=1} min_{m ∈ M} ν^T [ (A^m)^T C + C A^m ] ν ≥ -λ_D/4 }.

Theorem 3.7. Let C ∈ I_C. Then there exist τ̄ > 0 and η > 0 such that for all τ ∈ [0, τ̄],

S̃_τ[ψ(·, z)], S^m_τ[ψ(·, z)] ∈ S_{-(C+ηIτ)}.

Remark 3.8. If we restrict to C = cI for some c ∈ IR, then C ∈ I_C if one takes |c| ≤ λ_D/[8 max_{m ∈ M} |A^m|] with c < 0, and so the theorem condition can be satisfied.

Proof. We prove the result only for S̃_τ. The proof for S^m_τ is nearly identical and slightly simpler. The first portion of the proof is similar to the proof of Theorem 2.6. Again, fix any x, ν ∈ IR^n with |ν| = 1, and any δ > 0. Fix τ > 0 (to be specified below), and let ε > 0. Let w^ε, µ^ε be ε-optimal for S̃_τ[ψ(·, z)](x). Specifically, suppose Î_ψ(x, τ, w^ε, µ^ε) ≥ S̃_τ[ψ(·, z)](x) - ε, where

(3.15)  Î_ψ(x, τ, w, µ) ≐ ∫_0^τ l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + ψ(ξ_τ, z)

and ξ_t satisfies (2.12). For simplicity of notation, let V^{τ,ψ} = S̃_τ[ψ(·, z)]. Then

(3.16)  V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν)
        ≥ Î_ψ(x - δν, τ, w^ε, µ^ε) - 2Î_ψ(x, τ, w^ε, µ^ε) + Î_ψ(x + δν, τ, w^ε, µ^ε) - 2ε.

Let ξ^δ, ξ^0, ξ^{-δ}, Δ^+ be as given in the proof of Theorem 2.6. Note that

(3.17)  ψ(ξ^δ_τ, z) - 2ψ(ξ^0_τ, z) + ψ(ξ^{-δ}_τ, z) = (Δ^+_τ)^T C Δ^+_τ.

Note also that, as in the proof of Theorem 2.6,

(3.18)  (1/2)[ (ξ^δ_t)^T D^{µ^ε_t} ξ^δ_t - 2 (ξ^0_t)^T D^{µ^ε_t} ξ^0_t + (ξ^{-δ}_t)^T D^{µ^ε_t} ξ^{-δ}_t ] = (Δ^+_t)^T D^{µ^ε_t} Δ^+_t.

Combining (3.15), (3.16), (3.17) and (3.18), one obtains

(3.19)  V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν) ≥ ∫_0^τ (Δ^+_t)^T D^{µ^ε_t} Δ^+_t dt + (Δ^+_τ)^T C Δ^+_τ - 2ε.

Further, noting as before that Δ̇^+ = A^{µ^ε_t} Δ^+, one has

(3.20)  Δ^+_t = exp{ ∫_0^t A^{µ^ε_r} dr } δν ≐ Λ^ε_t δν.

Combining (3.19) and (3.20), one has

V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν) ≥ δ^2 { ∫_0^τ ν^T (Λ^ε_t)^T D^{µ^ε_t} Λ^ε_t ν dt + ν^T (Λ^ε_τ)^T C Λ^ε_τ ν } - 2ε.

However, since λ_D > 0 and Λ^ε_0 = I, there exists τ̃ > 0 such that for all τ ∈ (0, τ̃),

(3.21)  δ^2 ∫_0^τ ν^T (Λ^ε_t)^T D^{µ^ε_t} Λ^ε_t ν dt ≥ δ^2 (λ_D/2) τ,

so that

V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν) ≥ δ^2 [ (λ_D/2) τ + ν^T C ν ] + δ^2 [ ν^T (Λ^ε_τ)^T C Λ^ε_τ ν - ν^T C ν ] - 2ε.

Now define g^ν_t ≐ ν^T (Λ^ε_t)^T C Λ^ε_t ν - ν^T C ν. Noting that d/dt [Λ^ε_t] = A^{µ^ε_t} Λ^ε_t, one obviously has

dg^ν/dt = ν^T [ (Λ^ε_t)^T (A^{µ^ε_t})^T C Λ^ε_t + (Λ^ε_t)^T C A^{µ^ε_t} Λ^ε_t ] ν,

and consequently,

(3.22)  g^ν_t = ∫_0^t ν^T [ (Λ^ε_r)^T (A^{µ^ε_r})^T C Λ^ε_r + (Λ^ε_r)^T C A^{µ^ε_r} Λ^ε_r ] ν dr.

Also, define

ḡ^ν_t ≐ ∫_0^t ν^T [ (A^{µ^ε_r})^T C + C A^{µ^ε_r} ] ν dr.

Noting that Λ^ε_t is continuous, and that Λ^ε_0 = I, one sees that there exist δ̂ > 0 and τ̂ > 0 such that

(3.23)  |g^ν_t - ḡ^ν_t| ≤ (δ̂/2) t^2 for all t ∈ (0, τ̂).

Let τ̄ = min{ τ̃, τ̂, λ_D/(4δ̂) }. By (3.21) and the definition of g^ν,

V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν) ≥ δ^2 ν^T C ν + δ^2 [ (λ_D/2) τ + ḡ^ν_τ + g^ν_τ - ḡ^ν_τ ] - 2ε,

which, by the definition of ḡ^ν and (3.23),

≥ δ^2 ν^T C ν + δ^2 [ (λ_D/2) τ + ∫_0^τ ν^T [ (A^{µ^ε_r})^T C + C A^{µ^ε_r} ] ν dr - (δ̂/2) τ^2 ] - 2ε,

which, by the definition of τ̄ and the assumption that C ∈ I_C,

≥ δ^2 ν^T C ν + δ^2 (λ_D/8) τ - 2ε for all τ ∈ (0, τ̄].

Since this is true for all ε > 0, letting η = λ_D/8, one has

V^{τ,ψ}(x - δν) - 2V^{τ,ψ}(x) + V^{τ,ψ}(x + δν) ≥ δ^2 ν^T [C + ηIτ] ν for all τ ∈ (0, τ̄].

Corollary 3.9. We may choose C ∈ D^n such that Ṽ, V^m ∈ S_{-C}, and such that, with ψ, τ̄, η as in the statement of Theorem 3.7,

S̃_τ[ψ(·, z)], S^m_τ[ψ(·, z)] ∈ S_{-(C+ηIτ)} for all τ ∈ [0, τ̄].

Henceforth, we suppose C chosen so that the results of Corollary 3.9 hold. We also suppose τ̄, η chosen according to the corollary as well. Now, for each z ∈ IR^n, S̃_τ[ψ(·, z)] ∈ S_{-(C+ηIτ)}. Therefore, by Theorem 3.2,

(3.24)  S̃_τ[ψ(·, z)](x) = ∫⊕_{IR^n} ψ(x, y) ⊗ B_τ(y, z) dy = ψ(x, ·) ⊙ B_τ(·, z),

where, for all y ∈ IR^n,

(3.25)  B_τ(y, z) = -∫⊕_{IR^n} ψ(x, y) ⊗ { -S̃_τ[ψ(·, z)](x) } dx = -{ ψ(·, y) ⊙ [ -S̃_τ[ψ(·, z)](·) ] }.

It is handy to define the max-plus linear operator with kernel B_τ (where we do not rigorously define the term kernel, as it will not be needed here) as

B̂_τ[a](z) ≐ B_τ(z, ·) ⊙ a(·)

for all semiconcave a as in Lemma 3.4.

Proposition 3.10. Let φ ∈ S_{-C} with semiconvex dual denoted by a. Define φ^1 = S̃_τ[φ]. Then φ^1 ∈ S_{-(C+ηIτ)}, and φ^1(x) = ψ(x, ·) ⊙ a^1(·), where a^1(z) = B_τ(z, ·) ⊙ a(·).

Proof. The proof that φ^1 ∈ S_{-(C+ηIτ)} is similar to the proof in Theorem 3.7. Consequently, we prove only the second assertion. One has

φ^1(x) = sup_{w ∈ W} sup_{µ ∈ D_τ} [ ∫_0^τ l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + φ(ξ_τ) ]
  = sup_{w ∈ W} sup_{µ ∈ D_τ} max_{z ∈ IR^n} [ ∫_0^τ l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + ψ(ξ_τ, z) + a(z) ]
  = max_{z ∈ IR^n} { S̃_τ[ψ(·, z)](x) + a(z) },

which, by (3.24),

  = max_{z ∈ IR^n} { max_{y ∈ IR^n} [ ψ(x, y) + B_τ(y, z) ] + a(z) }
  = ∫⊕_{IR^n} [ ∫⊕_{IR^n} B_τ(y, z) ⊗ a(z) dz ] ⊗ ψ(x, y) dy
  = ∫⊕_{IR^n} a^1(y) ⊗ ψ(x, y) dy.

Theorem 3.11. Let V ∈ S_{-C}, and let a be its semiconvex dual (with respect to ψ). Then V = S̃_τ[V] if and only if

a(z) = max_{y ∈ IR^n} [ B_τ(z, y) + a(y) ],

which of course = ∫⊕_{IR^n} B_τ(z, y) ⊗ a(y) dy = B_τ(z, ·) ⊙ a(·) = B̂_τ[a](z) for all z ∈ IR^n.

Proof. Since a is the semiconvex dual of V, for all x ∈ IR^n,

ψ(x, ·) ⊙ a(·) = V(x) = S̃_τ[V](x) = S̃_τ[ max_{z ∈ IR^n} { ψ(·, z) + a(z) } ](x)
  = sup_{w ∈ W} sup_{µ ∈ D_τ} [ ∫_0^τ l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + max_{z ∈ IR^n} { ψ(ξ_τ, z) + a(z) } ],

which, by (3.24),

  = max_{z ∈ IR^n} [ a(z) + sup_{w ∈ W} sup_{µ ∈ D_τ} { ∫_0^τ l^{µ_t}(ξ_t) - (γ^2/2)|w_t|^2 dt + ψ(ξ_τ, z) } ]
  = max_{z ∈ IR^n} { a(z) + S̃_τ[ψ(·, z)](x) }
  = ∫⊕_{IR^n} a(z) ⊗ S̃_τ[ψ(·, z)](x) dz
  = ∫⊕_{IR^n} a(z) ⊗ [ ∫⊕_{IR^n} B_τ(y, z) ⊗ ψ(x, y) dy ] dz
  = ∫⊕_{IR^n} [ ∫⊕_{IR^n} B_τ(y, z) ⊗ a(z) dz ] ⊗ ψ(x, y) dy
  = [ ∫⊕_{IR^n} B_τ(·, z) ⊗ a(z) dz ] ⊙ ψ(x, ·),

where, by Proposition 3.10, the term in brackets is an admissible semiconcave dual. Combining this with Lemma 3.6, one has

a(y) = ∫⊕_{IR^n} B_τ(y, z) ⊗ a(z) dz = B_τ(y, ·) ⊙ a(·) for all y ∈ IR^n.

The reverse implication follows by supposing a(y) = B_τ(y, ·) ⊙ a(·) for all y, and reordering the above argument.

Corollary 3.12. The value function Ṽ is given by Ṽ(x) = ψ(x, ·) ⊙ ã(·), where ã is the unique solution of

ã(y) = B_τ(y, ·) ⊙ ã(·) for all y ∈ IR^n,  or equivalently,  ã = B̂_τ[ã].

Proof. Combining Theorem 2.5 and Theorem 3.11 yields the assertion that Ṽ has this representation. The uniqueness follows from the uniqueness assertion of Theorem 2.5 and Lemma 3.6.

Similarly, for each m ∈ M and z ∈ IR^n, S^m_τ[ψ(·, z)] ∈ S_{-(C+ηIτ)} and

S^m_τ[ψ(·, z)](x) = ψ(x, ·) ⊙ B^m_τ(·, z) for all x ∈ IR^n,

where

B^m_τ(y, z) = -{ ψ(·, y) ⊙ [ -S^m_τ[ψ(·, z)](·) ] } for all y ∈ IR^n.

As before, it will be handy to define the max-plus linear operator with kernel B^m_τ as B̂^m_τ[a](z) ≐ B^m_τ(z, ·) ⊙ a(·). Further, one also obtains analogous results (by similar proofs). In particular, one has the following.

Theorem 3.13. Let V ∈ S_{-C}, and let a be its semiconvex dual (with respect to ψ). Then V = S^m_τ[V] if and only if a(z) = B^m_τ(z, ·) ⊙ a(·) for all z ∈ IR^n.

Corollary 3.14. Each value function V^m is given by V^m(x) = ψ(x, ·) ⊙ a^m(·), where each a^m is the unique solution of the problem a^m(y) = B^m_τ(y, ·) ⊙ a^m(·) for all y ∈ IR^n.

4. Discrete-time approximation. The method developed here will not involve any discretization over space; this is of course necessary, since otherwise one could not avoid the curse-of-dimensionality. The discretization will be over time, where the approximate μ processes will be constant over the length of each time-step. We define the operator S̄_τ on G_δ by

  S̄_τ[φ](x) ≐ max_{m∈M} sup_{w∈W} [ ∫₀^τ l_m(ξ^m_t) − (γ²/2)|w_t|² dt + φ(ξ^m_τ) ] = max_{m∈M} S^m_τ[φ](x),

where ξ^m satisfies (2.1). Let

  B̄_τ(y,z) ≐ max_{m∈M} B^m_τ(y,z) = ⊕_{m∈M} B^m_τ(y,z)  ∀ y,z ∈ IR^n.

The corresponding max-plus linear operator is B̄_τ = ⊕_{m∈M} B^m_τ.

Lemma 4.1. For all z ∈ IR^n, S̄_τ[ψ(·,z)] ∈ S_{C+ηIτ}. Further,

(4.1)  S̄_τ[ψ(·,z)](x) = ψ(x,·) ⊙ B̄_τ(·,z)  ∀ x ∈ IR^n.

Proof. We provide the proof of the last statement, which is as follows:

  S̄_τ[ψ(·,z)](x) = max_{m∈M} S^m_τ[ψ(·,z)](x) = max_{m∈M} { ψ(x,·) ⊙ B^m_τ(·,z) }
   = max_{m∈M} max_{y∈IR^n} [ ψ(x,y) + B^m_τ(y,z) ]
   = max_{y∈IR^n} [ ψ(x,y) + max_{m∈M} B^m_τ(y,z) ]
   = ψ(x,·) ⊙ [ max_{m∈M} B^m_τ(·,z) ].

We remark that, parameterized by τ, the operators S̄_τ do not necessarily form a semigroup, although they do form a sub-semigroup (i.e., S̄_{τ₁+τ₂}[φ](x) ≤ S̄_{τ₁}[S̄_{τ₂}[φ]](x) for all x ∈ IR^n and all φ ∈ S_C). In spite of this, one does have S^m_τ ≤ S̄_τ ≤ S_τ for all m ∈ M.

With τ acting as a time-discretization step-size, let

  D^τ_∞ ≐ { μ: [0,∞) → M | for each n ∈ N ∪ {0}, there exists m_n ∈ M such that μ(t) = m_n ∀ t ∈ [nτ, (n+1)τ) },

and for T = n̄τ with n̄ ∈ N, define D^τ_T similarly, but with domain [0,T) rather than [0,∞). Let M^n denote the outer product of M, n times. Let T = n̄τ, and define

  S̄^τ_T[φ](x) = max_{{m_k}^{n̄}_{k=1} ∈ M^{n̄}} [ ∘_{k=1}^{n̄} S^{m_k}_τ ][φ](x) = ( S̄_τ )^{n̄}[φ](x),

where the ∘ notation indicates operator composition, and the superscript in the last expression indicates repeated application of S̄_τ, n̄ times.

We will be approximating Ṽ by solving V = S̄_τ[V] via its dual problem a = B̄_τ[a] for small τ. Consequently, we will need to show that there exists a solution to V = S̄_τ[V], that the solution is unique, and that it can be found by solving the dual problem. We begin with existence.

Theorem 4.2. Let

(4.2)  V̄(x) ≐ lim_{N→∞} S̄^τ_{Nτ}[0](x)

for all x ∈ IR^n, where here 0 represents the zero-function. Then V̄ satisfies

(4.3)  V̄ = S̄_τ[V̄],  V̄(0) = 0.

Further, V^m ≤ V̄ ≤ Ṽ for all m ∈ M, and consequently V̄ ∈ G_δ.

Proof. Note that

(4.4)  V^m(x) = lim_{N→∞} S^m_{Nτ}[0](x) ≤ limsup_{N→∞} S̄^τ_{Nτ}[0](x) ≤ lim_{N→∞} S_{Nτ}[0](x) = Ṽ(x)  ∀ x ∈ IR^n.

Also,

(4.5)  S̄^τ_{(N+1)τ}[0](x) = S̄^τ_{Nτ}[ S̄_τ[0] ](x)
   = sup_{ŵ∈W} sup_{μ̂∈D^τ_{Nτ}} [ ∫₀^{Nτ} l_{μ̂_t}(ξ_t) − (γ²/2)|ŵ_t|² dt + max_{m∈M} sup_{w∈W} ∫_{Nτ}^{(N+1)τ} l_m(ξ_t) − (γ²/2)|w_t|² dt ],

which, by taking w ≡ 0,

(4.6)  ≥ sup_{ŵ∈W} sup_{μ̂∈D^τ_{Nτ}} ∫₀^{Nτ} l_{μ̂_t}(ξ_t) − (γ²/2)|ŵ_t|² dt = S̄^τ_{Nτ}[0](x),

which implies that S̄^τ_{Nτ}[0](x) is a monotonically increasing function of N. Since it is also bounded from above (by (4.4)), one finds

(4.7)  V^m(x) ≤ lim_{N→∞} S̄^τ_{Nτ}[0](x) ≤ Ṽ(x)  ∀ x ∈ IR^n,

which also justifies the use of the limit in the definition of V̄ in the statement of the theorem. In particular, one has V^m ≤ V̄ ≤ Ṽ, and so V̄ ∈ G_δ.

Fix any x ∈ IR^n, and suppose there exists δ > 0 such that

(4.8)  V̄(x) ≤ S̄_τ[V̄](x) − δ.

However, by the definition of V̄, given any y ∈ IR^n, there exists N_δ < ∞ such that for all N ≥ N_δ,

(4.9)  V̄(y) ≤ S̄^τ_{Nτ}[0](y) + δ/4.

Combining (4.8) and (4.9), one finds, after a small bit of work, that

  V̄(x) ≤ S̄_τ[ S̄^τ_{N_δτ}[0] + δ/2 ](x) − δ,

which, using the max-plus linearity of S̄_τ,

  = S̄^τ_{(N+1)τ}[0](x) − δ/2  for all N ≥ N_δ. Consequently, V̄(x) ≤ lim_{N→∞} S̄^τ_{Nτ}[0](x) − δ/2, which is a contradiction. Therefore, V̄(x) ≥ S̄_τ[V̄](x) for all x ∈ IR^n.

The reverse inequality follows in a similar way. Specifically, fix x ∈ IR^n, and suppose there exists δ > 0 such that

(4.10)  V̄(x) ≥ S̄_τ[V̄](x) + δ.

By the monotonicity of S̄^τ_{Nτ} with respect to N, for any N < ∞, V̄(x) ≥ S̄^τ_{Nτ}[0](x) ∀ x ∈ IR^n. By the monotonicity of S̄_τ with respect to its argument (i.e., φ₁(x) ≤ φ₂(x) for all x implying S̄_τ[φ₁](x) ≤ S̄_τ[φ₂](x) for all x), this implies

(4.11)  S̄_τ[V̄](x) ≥ S̄^τ_{(N+1)τ}[0](x)  ∀ x ∈ IR^n.

Combining (4.10) and (4.11) yields

  V̄(x) ≥ S̄^τ_{(N+1)τ}[0](x) + δ.

Letting N → ∞ yields a contradiction, and so V̄ ≤ S̄_τ[V̄].

The following result is immediate.

Theorem 4.3.  V̄(x) = sup_{μ∈D^τ_∞} sup_{w∈W} sup_{T∈[0,∞)} ∫₀^T l_{μ_t}(ξ_t) − (γ²/2)|w_t|² dt, where ξ_t satisfies (2.2).

Theorem 4.4.  V̄(x) − (c_V/2)|x|² is strictly convex.

Proof. The proof is identical to the proof of Theorem 2.6, with the exception that μ^ε is chosen from D^τ_∞ instead of D_∞.

Remark 4.5. From the choice of β in Section 3, this immediately implies that V̄ ∈ S_β, and of course, since C + β < 0, that V̄ ∈ S_C.

We now address the uniqueness issue. Similar techniques to those used for V^m and Ṽ will prove uniqueness for (4.3) within G_δ. A slightly weaker type of result, under weaker assumptions, will be obtained first; this result is similar in form to that of [4]. Suppose nonnegative V ∈ G_δ satisfies (4.3). This implies that, for all x ∈ IR^n and all N < ∞,

  V(x) = S̄^τ_{Nτ}[V](x) = sup_{w∈W} sup_{μ∈D^τ_{Nτ}} { ∫₀^{Nτ} l_{μ_t}(ξ_t) − (γ²/2)|w_t|² dt + V(ξ_{Nτ}) },

which, by taking w ≡ 0 (with corresponding trajectory denoted by ξ⁰),

(4.12)  ≥ V(ξ⁰_{Nτ}).

However, by (2.2), one has ξ̇⁰ = A_{μ_t} ξ⁰, and so |ξ⁰_t| ≤ e^{−c_A t}|x| for all t, which implies that ξ⁰_{Nτ} → 0 as N → ∞. Consequently,

(4.13)  lim_{N→∞} V(ξ⁰_{Nτ}) = 0.

Combining (4.12) and (4.13), one has

(4.14)  V(x) ≥ 0  ∀ x ∈ IR^n.

Also, by (4.3),

  V(x) = lim_{N→∞} S̄^τ_{Nτ}[V](x)  ∀ x ∈ IR^n.

By (4.14) and the monotonicity of S̄^τ_{Nτ} with respect to its argument, this is

(4.15)  ≥ lim_{N→∞} S̄^τ_{Nτ}[0](x) = V̄(x).

By (4.14) and (4.15), one has the uniqueness result analogous to [4], which is as follows.

Theorem 4.6. V̄ is the unique minimal, nonnegative solution to (4.3).

The stronger uniqueness statement (making use of the quadratic bound on l_{μ_t}(x)) is as follows. As with V^m, Ṽ, the proof is similar to that in [36]. However, in this case there is a small difference in the proof, and this difference requires another lemma. Due to this difference in the case of V̄, we include a sketch of the proof (but with the new lemma in full) in Appendix A.

Theorem 4.7. V̄ is the unique solution of (4.3) within the class G_δ for sufficiently small δ > 0. Further, given any V⁰ ∈ G_δ, lim_{N→∞} S̄^τ_{Nτ}[V⁰](x) = V̄(x) for all x ∈ IR^n (uniformly on compact sets).

Henceforth, we let δ > 0 be sufficiently small such that V^m, Ṽ, V̄ ∈ G_δ for all m ∈ M.

Theorem 4.8. Let V ∈ S_C, and let a be its semiconvex dual. Then V = S̄_τ[V] if and only if a(y) = B̄_τ(y,·) ⊙ a(·) for all y ∈ IR^n.

Proof. By the semiconvex duality,

(4.16)  ψ(x,·) ⊙ a(·) = V(x) = S̄_τ[V](x) = S̄_τ[ max_{z∈IR^n} {ψ(·,z) + a(z)} ](x),

which, as in the first part of the proof of Theorem 3.11,

  = ∫^⊕_{IR^n} a(z) ⊗ S̄_τ[ψ(·,z)](x) dz,

which by Lemma 4.1

  = ∫^⊕_{IR^n} a(z) ⊗ [ ∫^⊕_{IR^n} ψ(x,y) ⊗ B̄_τ(y,z) dy ] dz,

which, as in the latter part of the proof of Theorem 3.11,

(4.17)  = [ ∫^⊕_{IR^n} B̄_τ(·,z) ⊗ a(z) dz ] ⊙ ψ(x,·).

By Lemmas 3.4 and 3.6, this implies a(y) = B̄_τ(y,·) ⊙ a(·) ∀ y ∈ IR^n. Alternatively, if a(y) = B̄_τ(y,·) ⊙ a(·) for all y, then

  V(x) = ψ(x,·) ⊙ a(·) = [ ∫^⊕_{IR^n} B̄_τ(·,z) ⊗ a(z) dz ] ⊙ ψ(x,·)  ∀ x ∈ IR^n,

which by (4.16)-(4.17) yields V = S̄_τ[V].
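Theorem 4.8 reduces the fixed-point problem for V to the dual fixed-point problem a = B̄_τ[a]. In a finite max-plus analogue, this is a max-plus eigenvector problem, and the analogue of V̄ = lim_N S̄^τ_{Nτ}[0] is power iteration from the zero vector. The toy sketch below is not the paper's method: the 2×2 kernel is hypothetical, and convergence in finitely many steps is a property of this toy example rather than a general claim.

```python
import numpy as np

def maxplus_apply(B, a):
    # (B (x) a)(z) = max_y (B[z, y] + a[y])
    return np.max(B + a[None, :], axis=1)

def maxplus_fixed_point(B, iters=100):
    """Power iteration for a = B (x) a, normalized so that a[0] = 0.

    The normalization mimics pinning V(0) = 0 in the text; max-plus
    'scaling' is ordinary addition, so subtracting a[0] is a rescaling.
    """
    a = np.zeros(B.shape[0])
    for _ in range(iters):
        a = maxplus_apply(B, a)
        a = a - a[0]
    return a

# Toy kernel (hypothetical values).
B = np.array([[-1.0, 1.0],
              [-3.0, 0.0]])
a = maxplus_fixed_point(B)
print(a)                        # candidate max-plus eigenvector -> [0.0, -1.0]
print(maxplus_apply(B, a) - a)  # residual; zero when a = B (x) a holds
```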

Corollary 4.9. Value function V̄ given by (4.2) is in S_β ∩ S_C, and has representation V̄(x) = ψ(x,·) ⊙ ā(·), where ā is the unique solution in S_d (for d ∈ D_n with C + d < 0) of

(4.18)  ā(y) = B̄_τ(y,·) ⊙ ā(·)  ∀ y ∈ IR^n,

or equivalently, ā = B̄_τ[ā].

Proof. The fact that V̄ ∈ S_β follows from Theorem 4.4 and the choice of β. By Theorem 4.7, V̄ ∈ G_δ and is the unique solution of (4.3) in G_δ. By Theorem 4.8, its semiconvex dual, ā, satisfies (4.18), and by Lemma 3.4, ā ∈ S_d for some d ∈ D_n such that C + d < 0. Suppose there is â ∈ S_d̂ for some d̂ ∈ D_n such that C + d̂ < 0, and that â satisfies (4.18). Then, by Theorem 4.8 and Remark 3., its dual, V̂, is in G_δ and satisfies V̂ = S̄_τ[V̂], V̂(0) = 0. By Theorem 4.7 then, V̂ = V̄. By Lemma 3.6, this implies that â = ā.

The following result on propagation of the semiconvex dual will also come in handy.

Proposition 4.10. Let φ⁰ ∈ S_β ∩ S_C with semiconvex dual denoted by a⁰. Define φ¹ = S̄_τ[φ⁰]. Then φ¹ ∈ S_{C+ηIτ}, and φ¹(x) = ψ(x,·) ⊙ a¹(·), where a¹(y) = B̄_τ(y,·) ⊙ a⁰(·) ∀ y ∈ IR^n.

Proof. The proof is similar to the proof of Proposition 3.10, and consequently some details are not included. To begin, as in the proof of Proposition 3.10, we note that the proof that φ¹ ∈ S_{C+ηIτ} is nearly identical to the proof of Theorem 3.7. In particular, fix any x, ν ∈ IR^n with |ν| = 1 and any δ > 0. Let m ∈ M be optimal, and w^ε be ε-optimal, for S̄_τ[φ⁰](x). That is, suppose

  I_φ(x, τ, w^ε, m) ≥ S̄_τ[φ⁰](x) − ε,   where   I_φ(x, τ, w, m) ≐ ∫₀^τ l_m(ξ_t) − (γ²/2)|w_t|² dt + φ⁰(ξ_τ),

and ξ satisfies (2.1). Then

  S̄_τ[φ⁰](x−δν) − 2 S̄_τ[φ⁰](x) + S̄_τ[φ⁰](x+δν)
   ≥ I_φ(x−δν, τ, w^ε, m) − 2 I_φ(x, τ, w^ε, m) + I_φ(x+δν, τ, w^ε, m) − 2ε.

Let ξ^δ, ξ, ξ^{−δ} satisfy the dynamics of (2.1) with inputs w^ε and m, and with initial conditions ξ^δ_0 = x+δν, ξ_0 = x and ξ^{−δ}_0 = x−δν, respectively. Letting Δ^+_t ≐ ξ^δ_t − ξ_t, one finds

  φ⁰(ξ^δ_τ) − 2 φ⁰(ξ_τ) + φ⁰(ξ^{−δ}_τ) ≥ (Δ^+_τ)^T C Δ^+_τ,

because φ⁰ ∈ S_C. One then continues as in the proof of Theorem 3.7, but with D^m and A^m replacing D^{μ^ε_t} and A^{μ^ε_t}, respectively. In particular, one has Λ^ε_t = exp{A^m t}.
Importantly, one may use the same values of η and τ which were fixed in Section 3. Now we turn to the second assertion of the proposition. This follows exactly as in the proof of Proposition 3.10, with two minor exceptions: first, the supremum over μ ∈ D_∞ is replaced by a maximum over m ∈ M; second, the use of (3.24) is replaced by the invocation of (4.1).

We now show that one may approximate Ṽ, the solution of V = S_τ[V], to as accurate a level as one desires by solving V = S̄_τ[V] for sufficiently small τ. Recall that, if V = S̄_τ[V], then it satisfies V = S̄^τ_{Nτ}[V] for all N > 0 (while Ṽ satisfies V = S_{Nτ}[V]), and so this is essentially equivalent to

introducing a discrete-time μ ∈ D^τ_{Nτ} approximation to the μ process in S_{Nτ}. The result will follow easily from the following technical lemma. The lemma uses the particular structure of our example class of problems, as given by Assumption Block (A.m). As the proof of the lemma is technical but long, it is delayed to Appendix B.

Lemma 4.11. Given ε̂ ∈ (0,1] and T < ∞, there exist T̄ ∈ [T/2, T] and τ > 0 such that

  S_T̄[V^m](x) − S̄^τ_T̄[V^m](x) ≤ ε̂ (1 + |x|²)  ∀ x ∈ IR^n, m ∈ M.

We now obtain the main approximation result.

Theorem 4.12. Given ε > 0 and R < ∞, there exists τ > 0 such that

  Ṽ(x) − ε ≤ V̄(x) ≤ Ṽ(x)  ∀ x ∈ B_R(0).

Proof. From Theorem 4.2, we have

(4.19)  V^m(x) ≤ V̄(x) ≤ Ṽ(x) ≤ [ c_A (γ + δ)² / c_σ² ] |x|²  ∀ x ∈ IR^n.

Also, with T = Nτ for any positive integer N,

(4.20)  S̄^τ_{Nτ}[φ] ≤ S_T[φ]  ∀ φ ∈ G_δ.

Further, by Theorem 2., given ε > 0 and R < ∞, there exists T̂ < ∞ such that, for all T ≥ T̂ and all m ∈ M,

(4.21)  S_T[Ṽ](x) − ε/2 ≤ S_T[V^m](x)  ∀ x ∈ B_R(0).

By (4.21) and Lemma 4.11, given ε > 0 and R < ∞, there exist T̄ ∈ [0,∞) and τ ∈ (0, T̄], where T̄ = Nτ for some integer N, such that for all |x| ≤ R,

  Ṽ(x) − ε = S_T̄[Ṽ](x) − ε ≤ S_T̄[V^m](x) − ε/2 ≤ S̄^τ_T̄[V^m](x),

where ε̂ (1 + R²) = ε/2, and which, by (4.19) and the monotonicity of S̄^τ_T̄[·],

  ≤ S̄^τ_T̄[V̄](x),

which by (4.20)

  ≤ S_T̄[V̄](x),

which by the monotonicity of S_T̄[·]

  ≤ S_T̄[Ṽ](x) = Ṽ(x).

Noting (from Theorem 4.7) that V̄ = S̄^τ_T̄[V̄] completes the proof.

Remark 4.13. For this class of systems (defined by Assumption Block (A.m)), we expect this result could be sharpened to Ṽ(x) − ε̂(1 + |x|²) ≤ V̄(x) ≤ Ṽ(x) ∀ x ∈ IR^n by sharpening Theorem 2.. However, this type of result might only be valid for limited classes of systems, and so we have not pursued it here.

5. The algorithm. We now begin discussion of the actual algorithm. Let c_V be such that C − c_V I < 0, and initialize with V⁰(x) = (c_V/2)|x|². From Theorem 4.2, V̄ = lim_{N→∞} S̄^τ_{Nτ}[V⁰]. Given V^k, let

  V^{k+1} ≐ S̄_τ[V^k],

so that V^k = S̄^τ_{kτ}[V⁰] for all k.

Let a^k be the semiconvex dual of V^k for all k. Since V⁰ = (c_V/2)|x|², one easily finds the quadratic a⁰(·). Note also that, by Proposition 4.10,

(5.1)  a^{k+1}(x) = B̄_τ(x,·) ⊙ a^k(·) = B̄_τ[a^k](x)  for all k.

Recall that B̄_τ(x,y) = ⊕_{m∈M} B^m_τ(x,y). By (5.1),

(5.2)
  a^{k+1}(x) = ∫^⊕_{IR^n} B̄_τ(x,y) ⊗ a^k(y) dy = ∫^⊕_{IR^n} [ ⊕_{m∈M} B^m_τ(x,y) ] ⊗ a^k(y) dy
   = ⊕_{m∈M} ∫^⊕_{IR^n} B^m_τ(x,y) ⊗ a^k(y) dy = ⊕_{m∈M} [ B^m_τ(x,·) ⊙ a^k(·) ].

By (5.1) and (5.2),

(5.3)  a¹(x) = ⊕_{m∈M} â¹_m(x),  where  â¹_m(x) ≐ B^m_τ(x,·) ⊙ a⁰(·)  ∀ m ∈ M.

Consequently,

  a²(x) = ⊕_{m₂∈M} ∫^⊕_{IR^n} B^{m₂}_τ(x,y) ⊗ [ ⊕_{m₁∈M} â¹_{m₁}(y) ] dy
        = ⊕_{{m₁,m₂}∈M²} ∫^⊕_{IR^n} B^{m₂}_τ(x,y) ⊗ â¹_{m₁}(y) dy
        = ⊕_{{m₁,m₂}∈M²} â²_{{m₁,m₂}}(x),

where

  â²_{{m₁,m₂}}(x) ≐ B^{m₂}_τ(x,·) ⊙ â¹_{m₁}(·)  ∀ m₁, m₂ ∈ M,

and M² represents the outer product M × M. Proceeding with this, one finds that, in general,

  a^k(x) = ⊕_{{m_i}^k_{i=1} ∈ M^k} â^k_{{m_i}^k}(x),

where

(5.4)  â^k_{{m_i}^k}(x) ≐ B^{m_k}_τ(x,·) ⊙ â^{k−1}_{{m_i}^{k−1}}(·)  ∀ {m_i}^k_{i=1} ∈ M^k.

Of course, one can obtain V^k from its dual as

(5.5)
  V^k(x) = max_{y∈IR^n} [ ψ(x,y) + a^k(y) ]
   = max_{y∈IR^n} [ ψ(x,y) + ⊕_{{m_i}^k ∈ M^k} â^k_{{m_i}^k}(y) ]
   = max_{{m_i}^k ∈ M^k} { max_{y∈IR^n} [ ψ(x,y) + â^k_{{m_i}^k}(y) ] }
   = max_{{m_i}^k ∈ M^k} V^k_{{m_i}^k}(x),

where

(5.6)  V^k_{{m_i}^k}(x) ≐ max_{y∈IR^n} [ ψ(x,y) + â^k_{{m_i}^k}(y) ] = ∫^⊕_{IR^n} ψ(x,y) ⊗ â^k_{{m_i}^k}(y) dy.

The algorithm will consist of the forward propagation of the â^k_{{m_i}^k} (according to (5.4)) from k = 1 to some termination step k = N, followed by construction of the value as the maximum of the V^k_{{m_i}^k} (according to (5.6)).

It is important to note that the computation of each â^k_{{m_i}^k} is analytical; we will indicate the actual analytical computations. By the linear/quadratic nature of the m-indexed systems, we find that the S^m_τ[ψ(·,z)] take the form

  S^m_τ[ψ(·,z)](x) = ½ (x − Λ^m_τ z)^T P^m_τ (x − Λ^m_τ z) + ½ z^T R^m_τ z,

where the time-dependent n×n matrices P^m_t, Λ^m_t and R^m_t satisfy

(5.7)
  Ṗ^m = (A^m)^T P^m + P^m A^m + D^m + P^m Σ^m P^m,   P^m_0 = C,
  Λ̇^m = [ (P^m)^{−1} D^m − A^m ] Λ^m,   Λ^m_0 = I,
  Ṙ^m = (Λ^m)^T D^m Λ^m,   R^m_0 = 0.

We note that each of the P^m_τ, Λ^m_τ, R^m_τ need only be computed once. Next, one computes each quadratic function B^m_τ(x,z) (one time only) as follows. One has

  B^m_τ(x,z) = − max_{y∈IR^n} { ψ(y,x) − S^m_τ[ψ(·,z)](y) },

which by the above

(5.8)  = min_{y∈IR^n} { −½ (y−x)^T C (y−x) + ½ (y − Λ^m_τ z)^T P^m_τ (y − Λ^m_τ z) + ½ z^T R^m_τ z }.

Recall that, by Theorem 3.7, this has a finite minimum (P^m_τ − (C + ηIτ) positive definite). Taking the minimum in (5.8), one has

  B^m_τ(x,z) = ½ [ x^T M^m_{1,1} x + x^T M^m_{1,2} z + z^T (M^m_{1,2})^T x + z^T M^m_{2,2} z ],

where, with the shorthand notation D_τ = (P^m_τ − C)^{−1},

(5.9)   M^m_{1,1} = C D_τ P^m_τ D_τ C − (D_τ C + I)^T C (D_τ C + I),
(5.10)  M^m_{1,2} = [ (D_τ C + I)^T C D_τ P^m_τ − C D_τ P^m_τ (D_τ P^m_τ − I) ] Λ^m_τ,
(5.11)  M^m_{2,2} = (Λ^m_τ)^T [ (D_τ P^m_τ − I)^T P^m_τ (D_τ P^m_τ − I) − P^m_τ D_τ C D_τ P^m_τ ] Λ^m_τ + R^m_τ.

Note that, given the P^m_τ, Λ^m_τ, R^m_τ, the B^m_τ are quadratic functions with analytical expressions for their coefficients.
Also note that all the matrices in the definition of B^m_τ may be precomputed.
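As a concrete illustration of this one-time precomputation, the matrix ODEs above can be integrated per m by any standard scheme. The sketch below applies classical RK4 to the P-equation alone, with placeholder 2×2 matrices A, D, Σ, C (not from the paper's examples) and the sign conventions as displayed above, which were reconstructed from a damaged source and should be re-derived before serious use.

```python
import numpy as np

def p_dot(P, A, D, Sigma):
    # Pdot = A^T P + P A + D + P Sigma P  (differential Riccati right-hand side)
    return A.T @ P + P @ A + D + P @ Sigma @ P

def rk4_riccati(C, A, D, Sigma, tau, steps=200):
    """Integrate P(0) = C forward to P(tau) with fixed-step classical RK4."""
    P = C.copy()
    h = tau / steps
    for _ in range(steps):
        k1 = p_dot(P, A, D, Sigma)
        k2 = p_dot(P + 0.5 * h * k1, A, D, Sigma)
        k3 = p_dot(P + 0.5 * h * k2, A, D, Sigma)
        k4 = p_dot(P + h * k3, A, D, Sigma)
        P = P + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return P

# Placeholder data (illustrative only).
A = np.array([[-1.0, 0.0], [0.0, -2.0]])   # stable drift
D = np.eye(2)                               # running-cost weight
Sigma = 0.1 * np.eye(2)                     # disturbance-weight term
C = -np.eye(2)                              # initial condition P(0) = C
P_tau = rk4_riccati(C, A, D, Sigma, tau=0.05)
print(P_tau)
```

The right-hand side maps symmetric matrices to symmetric matrices, so P stays symmetric throughout the integration.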

Now let us write the (quadratic) â^k_{{m_i}^k} in the form

  â^k_{{m_i}^k}(x) = ½ (x − ẑ^k_{{m_i}^k})^T Q̂^k_{{m_i}^k} (x − ẑ^k_{{m_i}^k}) + r^k_{{m_i}^k}.

Then, for each m_{k+1},

(5.12)
  â^{k+1}_{{m_i}^{k+1}}(x) = max_{z∈IR^n} { B^{m_{k+1}}_τ(x,z) + â^k_{{m_i}^k}(z) }
   = max_{z∈IR^n} { ½ [ x^T M^{m_{k+1}}_{1,1} x + x^T M^{m_{k+1}}_{1,2} z + z^T (M^{m_{k+1}}_{1,2})^T x + z^T M^{m_{k+1}}_{2,2} z ]
                    + ½ (z − ẑ^k_{{m_i}^k})^T Q̂^k_{{m_i}^k} (z − ẑ^k_{{m_i}^k}) + r^k_{{m_i}^k} }
   = ½ (x − ẑ^{k+1}_{{m_i}^{k+1}})^T Q̂^{k+1}_{{m_i}^{k+1}} (x − ẑ^{k+1}_{{m_i}^{k+1}}) + r^{k+1}_{{m_i}^{k+1}},

where, with D̂ ≐ −( M^{m_{k+1}}_{2,2} + Q̂^k_{{m_i}^k} )^{−1} and Ê ≐ D̂ Q̂^k_{{m_i}^k} ẑ^k_{{m_i}^k},

(5.13)
  Q̂^{k+1}_{{m_i}^{k+1}} = M^{m_{k+1}}_{1,1} + M^{m_{k+1}}_{1,2} D̂ (M^{m_{k+1}}_{1,2})^T,
  ẑ^{k+1}_{{m_i}^{k+1}} = ( Q̂^{k+1}_{{m_i}^{k+1}} )^{−1} M^{m_{k+1}}_{1,2} Ê,
  r^{k+1}_{{m_i}^{k+1}} = r^k_{{m_i}^k} − ½ Ê^T M^{m_{k+1}}_{2,2} ẑ^k_{{m_i}^k} − ½ ( ẑ^{k+1}_{{m_i}^{k+1}} )^T Q̂^{k+1}_{{m_i}^{k+1}} ẑ^{k+1}_{{m_i}^{k+1}}.

Thus we have the analytical expression for the propagation of each (quadratic) â^k_{{m_i}^k} function. Specifically, we see that the propagation of each â^k_{{m_i}^k} amounts to a set of matrix multiplications (and an inverse). Note that, for the purely quadratic constituent Hamiltonians considered here (without terms that are linear or constant in the state and gradient variables), one will have ẑ^k_{{m_i}^k} = 0 and r^k_{{m_i}^k} = 0, and so computation of these terms is not necessary (unless one adds linear and/or constant terms).

At each step k, the semiconvex dual of V^k, a^k, is represented as the finite set of functions

  Â^k ≐ { â^k_{{m_i}^k} | m_i ∈ M ∀ i ∈ {1, 2, ..., k} },

where this is equivalently represented as the set of triples

  Q^k ≐ { ( Q̂^k_{{m_i}^k}, ẑ^k_{{m_i}^k}, r^k_{{m_i}^k} ) | m_i ∈ M ∀ i ∈ {1, 2, ..., k} }.

At any desired stopping time, one can recover a representation of V^k as

  V^k = { V^k_{{m_i}^k} | m_i ∈ M ∀ i ∈ {1, 2, ..., k} },

where these V^k_{{m_i}^k} are also quadratics. In fact, recall

  V^k(x) = max_{z∈IR^n} [ a^k(z) + ψ(x,z) ]
   = max_{{m_i}^k} max_{z∈IR^n} [ ½ (z − ẑ^k_{{m_i}^k})^T Q̂^k_{{m_i}^k} (z − ẑ^k_{{m_i}^k}) + r^k_{{m_i}^k} + (c/2)|x − z|² ]
   = max_{{m_i}^k} ½ (x − x̄^k_{{m_i}^k})^T P̂^k_{{m_i}^k} (x − x̄^k_{{m_i}^k}) + ρ^k_{{m_i}^k}

   ≐ max_{{m_i}^k ∈ M^k} V^k_{{m_i}^k}(x),

where, with C ≐ cI and F̂ ≐ ( Q̂^k_{{m_i}^k} + C )^{−1},

(5.14)
  P̂^k_{{m_i}^k} = C F̂ Q̂^k_{{m_i}^k} F̂ C + ( F̂C − I )^T C ( F̂C − I ),
  x̄^k_{{m_i}^k} = ẑ^k_{{m_i}^k},   ρ^k_{{m_i}^k} = r^k_{{m_i}^k},

the latter two since the maximizing z satisfies z − ẑ^k_{{m_i}^k} = F̂C (x − ẑ^k_{{m_i}^k}) and x − z = F̂ Q̂^k_{{m_i}^k} (x − ẑ^k_{{m_i}^k}), both linear in x − ẑ^k_{{m_i}^k}, so that the recovered quadratic is centered at ẑ^k_{{m_i}^k} with its value there unchanged. Thus, V^k has the representation as the set of triples

(5.15)  P^k ≐ { ( P̂^k_{{m_i}^k}, x̄^k_{{m_i}^k}, ρ^k_{{m_i}^k} ) | m_i ∈ M ∀ i ∈ {1, 2, ..., k} }.

We note that the triples which comprise P^k are obtained from the triples ( Q̂^k_{{m_i}^k}, ẑ^k_{{m_i}^k}, r^k_{{m_i}^k} ) by matrix multiplications and an inverse. The transference from the former triples to the latter need only be done once, at the termination of the algorithm propagation. Again, in the purely quadratic class of problems addressed here, and with the purely quadratic initialization, the x̄^k_{{m_i}^k} and ρ^k_{{m_i}^k} terms will be zero. We note that (5.15) is our approximate solution of the original control problem/HJB PDE. The errors are due to our approximation of Ṽ by V̄ (see Theorem 4.12 and Remark 4.13), and to the approximation of V̄ by the prelimit V^N at stopping time k = N. Neither of these errors is related to the space dimension. The error in Ṽ − V̄ depends on the step size τ. The error in V^N − V̄, where V^N = S̄^τ_{Nτ}[V⁰], is due to premature termination of the limit V̄ = lim_{N→∞} S̄^τ_{Nτ}[V⁰]. The computation of each triple ( P̂^k_{{m_i}^k}, x̄^k_{{m_i}^k}, ρ^k_{{m_i}^k} ) grows like the cube of the space dimension (due to the matrix operations). Thus one avoids the curse-of-dimensionality. Of course, if one then chooses to compute V^N(x) for all x on some grid over, say, a rectangular region in IR^n, then by definition one has exponential growth in this computation as the space dimension increases. The point is that one does not need to compute V^N ≃ Ṽ at each such point.
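Each dual-propagation step above is the analytic maximization over z of a quadratic in (x,z) plus a quadratic in z, i.e., a completion of squares. Rather than transcribe the coefficient formulas, the sketch below performs the same maximization via a single linear solve and cross-checks one point against a brute-force grid. All matrices here are illustrative placeholders; as in the text, finiteness of the maximum requires the z-Hessian M_{2,2} + Q̂ to be negative definite.

```python
import numpy as np

def quad_max_step(M11, M12, M22, Q, zhat, r):
    """Return g(x) = max_z [ 1/2 x'M11x + x'M12z + 1/2 z'M22z
                             + 1/2 (z-zhat)'Q(z-zhat) + r ].
    Requires M22 + Q negative definite so the max over z is finite.
    """
    H = M22 + Q
    assert np.all(np.linalg.eigvalsh(H) < 0), "max over z is not finite"
    def g(x):
        # stationary point of the (concave in z) maximand
        z = np.linalg.solve(H, Q @ zhat - M12.T @ x)
        return (0.5 * x @ M11 @ x + x @ M12 @ z + 0.5 * z @ M22 @ z
                + 0.5 * (z - zhat) @ Q @ (z - zhat) + r)
    return g

# Placeholder blocks (illustrative, not the paper's M^m matrices).
M11 = -np.eye(2)
M12 = np.array([[0.3, 0.0], [0.0, 0.2]])
M22 = -np.eye(2)
Q = -2.0 * np.eye(2)
zhat = np.array([0.5, -0.5])
r = 1.0

g = quad_max_step(M11, M12, M22, Q, zhat, r)
x0 = np.array([1.0, 2.0])

def total(z):  # the maximand, for a brute-force check
    return (0.5 * x0 @ M11 @ x0 + x0 @ M12 @ z + 0.5 * z @ M22 @ z
            + 0.5 * (z - zhat) @ Q @ (z - zhat) + r)

zs = np.linspace(-3.0, 3.0, 121)
brute = max(total(np.array([z1, z2])) for z1 in zs for z2 in zs)
print(g(x0), brute)  # agree to within the grid resolution
```

The closed-form propagation in the text does exactly this computation symbolically, which is why each step reduces to matrix products and one inverse.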
However, the curse-of-dimensionality is replaced by another type of rapid computational-cost growth, which we refer to here as the curse-of-complexity. If #M = 1, then all the computations of our algorithm (excepting the solution of the Riccati equation) are unnecessary, and we informally refer to this as complexity one. When there are M = #M such quadratics in the Hamiltonian, H, we say it has complexity M. Note that

  # { V^N_{{m_i}^N} | m_i ∈ M ∀ i ∈ {1, 2, ..., N} } ≤ M^N.

For large N, this is indeed a large number. (We very briefly discuss means for reducing this in the next section.) Nevertheless, for small values of M, we obtain a very rapid solution of such nonlinear HJB PDEs, as will be indicated in the examples to follow. Further, the computational-cost growth in space dimension n is limited to cubic growth. We emphasize that the existence of an algorithm avoiding the curse-of-dimensionality is significant regardless of the practical issues.
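The M^N growth, and the simplest kind of pruning that attenuates it, can be made concrete: represent the current collection of quadratics by their values at a set of sample points, and discard any quadratic that is nowhere the pointwise maximum there. This is only a heuristic sketch (the paper defers its complexity-reduction methods to the next section), written in one dimension for readability.

```python
import itertools

def prune(quadratics, samples):
    """Keep only quadratics that achieve the pointwise max at some sample.

    quadratics: list of (p, z, r) triples representing q(x) = p*(x-z)**2/2 + r.
    A quadratic never active on `samples` cannot change max_q there, so we
    drop it (heuristic: activity off the sample set is not checked).
    """
    def val(q, x):
        p, z, r = q
        return 0.5 * p * (x - z) ** 2 + r
    keep = set()
    for x in samples:
        vals = [val(q, x) for q in quadratics]
        keep.add(vals.index(max(vals)))
    return [quadratics[i] for i in sorted(keep)]

# Unpruned, the number of index sequences after N steps with M modes is M**N:
M, N = 3, 5
assert len(list(itertools.product(range(M), repeat=N))) == M ** N  # 243 sequences

# Three concave 'quadratics'; the middle one is dominated everywhere sampled.
quads = [(-1.0, -1.0, 0.0), (-1.0, 0.0, -5.0), (-1.0, 1.0, 0.0)]
samples = [x / 4.0 for x in range(-12, 13)]   # grid on [-3, 3]
pruned = prune(quads, samples)
print(pruned)  # the dominated quadratic is removed
```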


More information

Nonlinear Control Systems

Nonlinear Control Systems Nonlinear Control Systems António Pedro Aguiar pedro@isr.ist.utl.pt 3. Fundamental properties IST-DEEC PhD Course http://users.isr.ist.utl.pt/%7epedro/ncs2012/ 2012 1 Example Consider the system ẋ = f

More information

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions Chapter 3 Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions 3.1 Scattered Data Interpolation with Polynomial Precision Sometimes the assumption on the

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

Iteration-complexity of first-order penalty methods for convex programming

Iteration-complexity of first-order penalty methods for convex programming Iteration-complexity of first-order penalty methods for convex programming Guanghui Lan Renato D.C. Monteiro July 24, 2008 Abstract This paper considers a special but broad class of convex programing CP)

More information

Convexity of the Reachable Set of Nonlinear Systems under L 2 Bounded Controls

Convexity of the Reachable Set of Nonlinear Systems under L 2 Bounded Controls 1 1 Convexity of the Reachable Set of Nonlinear Systems under L 2 Bounded Controls B.T.Polyak Institute for Control Science, Moscow, Russia e-mail boris@ipu.rssi.ru Abstract Recently [1, 2] the new convexity

More information

Price formation models Diogo A. Gomes

Price formation models Diogo A. Gomes models Diogo A. Gomes Here, we are interested in the price formation in electricity markets where: a large number of agents owns storage devices that can be charged and later supply the grid with electricity;

More information

Problem List MATH 5143 Fall, 2013

Problem List MATH 5143 Fall, 2013 Problem List MATH 5143 Fall, 2013 On any problem you may use the result of any previous problem (even if you were not able to do it) and any information given in class up to the moment the problem was

More information

The Important State Coordinates of a Nonlinear System

The Important State Coordinates of a Nonlinear System The Important State Coordinates of a Nonlinear System Arthur J. Krener 1 University of California, Davis, CA and Naval Postgraduate School, Monterey, CA ajkrener@ucdavis.edu Summary. We offer an alternative

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Support vector machines (SVMs) are one of the central concepts in all of machine learning. They are simply a combination of two ideas: linear classification via maximum (or optimal

More information

Deterministic Kalman Filtering on Semi-infinite Interval

Deterministic Kalman Filtering on Semi-infinite Interval Deterministic Kalman Filtering on Semi-infinite Interval L. Faybusovich and T. Mouktonglang Abstract We relate a deterministic Kalman filter on semi-infinite interval to linear-quadratic tracking control

More information

Subdifferential representation of convex functions: refinements and applications

Subdifferential representation of convex functions: refinements and applications Subdifferential representation of convex functions: refinements and applications Joël Benoist & Aris Daniilidis Abstract Every lower semicontinuous convex function can be represented through its subdifferential

More information

Lagrangian Manifolds, Viscosity Solutions and Maslov Index

Lagrangian Manifolds, Viscosity Solutions and Maslov Index Journal of Convex Analysis Volume 9 (2002), No. 1, 185 224 Lagrangian Manifolds, Viscosity Solutions and Maslov Index D. McCaffrey University of Sheffield, Dept. of Automatic Control and Systems Engineering,

More information

Generalized Riccati Equations Arising in Stochastic Games

Generalized Riccati Equations Arising in Stochastic Games Generalized Riccati Equations Arising in Stochastic Games Michael McAsey a a Department of Mathematics Bradley University Peoria IL 61625 USA mcasey@bradley.edu Libin Mou b b Department of Mathematics

More information

Self-Concordant Barrier Functions for Convex Optimization

Self-Concordant Barrier Functions for Convex Optimization Appendix F Self-Concordant Barrier Functions for Convex Optimization F.1 Introduction In this Appendix we present a framework for developing polynomial-time algorithms for the solution of convex optimization

More information

Viscosity Solutions for Dummies (including Economists)

Viscosity Solutions for Dummies (including Economists) Viscosity Solutions for Dummies (including Economists) Online Appendix to Income and Wealth Distribution in Macroeconomics: A Continuous-Time Approach written by Benjamin Moll August 13, 2017 1 Viscosity

More information

+ 2x sin x. f(b i ) f(a i ) < ɛ. i=1. i=1

+ 2x sin x. f(b i ) f(a i ) < ɛ. i=1. i=1 Appendix To understand weak derivatives and distributional derivatives in the simplest context of functions of a single variable, we describe without proof some results from real analysis (see [7] and

More information

Differential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011

Differential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011 Differential Games II Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011 Contents 1. I Introduction: A Pursuit Game and Isaacs Theory 2. II Strategies 3.

More information

A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1

A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1 A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1 Ali Jadbabaie, Claudio De Persis, and Tae-Woong Yoon 2 Department of Electrical Engineering

More information

L 2 -induced Gains of Switched Systems and Classes of Switching Signals

L 2 -induced Gains of Switched Systems and Classes of Switching Signals L 2 -induced Gains of Switched Systems and Classes of Switching Signals Kenji Hirata and João P. Hespanha Abstract This paper addresses the L 2-induced gain analysis for switched linear systems. We exploit

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

The Skorokhod problem in a time-dependent interval

The Skorokhod problem in a time-dependent interval The Skorokhod problem in a time-dependent interval Krzysztof Burdzy, Weining Kang and Kavita Ramanan University of Washington and Carnegie Mellon University Abstract: We consider the Skorokhod problem

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

CS286.2 Lecture 8: A variant of QPCP for multiplayer entangled games

CS286.2 Lecture 8: A variant of QPCP for multiplayer entangled games CS286.2 Lecture 8: A variant of QPCP for multiplayer entangled games Scribe: Zeyu Guo In the first lecture, we saw three equivalent variants of the classical PCP theorems in terms of CSP, proof checking,

More information

October 25, 2013 INNER PRODUCT SPACES

October 25, 2013 INNER PRODUCT SPACES October 25, 2013 INNER PRODUCT SPACES RODICA D. COSTIN Contents 1. Inner product 2 1.1. Inner product 2 1.2. Inner product spaces 4 2. Orthogonal bases 5 2.1. Existence of an orthogonal basis 7 2.2. Orthogonal

More information

MATH 220: MIDTERM OCTOBER 29, 2015

MATH 220: MIDTERM OCTOBER 29, 2015 MATH 22: MIDTERM OCTOBER 29, 25 This is a closed book, closed notes, no electronic devices exam. There are 5 problems. Solve Problems -3 and one of Problems 4 and 5. Write your solutions to problems and

More information

Neighboring feasible trajectories in infinite dimension

Neighboring feasible trajectories in infinite dimension Neighboring feasible trajectories in infinite dimension Marco Mazzola Université Pierre et Marie Curie (Paris 6) H. Frankowska and E. M. Marchini Control of State Constrained Dynamical Systems Padova,

More information

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2.

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2. ANALYSIS QUALIFYING EXAM FALL 27: SOLUTIONS Problem. Determine, with justification, the it cos(nx) n 2 x 2 dx. Solution. For an integer n >, define g n : (, ) R by Also define g : (, ) R by g(x) = g n

More information

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods Renato D.C. Monteiro B. F. Svaiter May 10, 011 Revised: May 4, 01) Abstract This

More information

Integral Jensen inequality

Integral Jensen inequality Integral Jensen inequality Let us consider a convex set R d, and a convex function f : (, + ]. For any x,..., x n and λ,..., λ n with n λ i =, we have () f( n λ ix i ) n λ if(x i ). For a R d, let δ a

More information

Lecture 4 Lebesgue spaces and inequalities

Lecture 4 Lebesgue spaces and inequalities Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how

More information

Generalized Input-to-State l 2 -Gains of Discrete-Time Switched Linear Control Systems

Generalized Input-to-State l 2 -Gains of Discrete-Time Switched Linear Control Systems Generalized Input-to-State l 2 -Gains of Discrete-Time Switched Linear Control Systems Jianghai Hu, Jinglai Shen, and Vamsi Putta February 9, 205 Abstract A generalized notion of input-to-state l 2 -gain

More information

Criteria for existence of semigroup homomorphisms and projective rank functions. George M. Bergman

Criteria for existence of semigroup homomorphisms and projective rank functions. George M. Bergman Criteria for existence of semigroup homomorphisms and projective rank functions George M. Bergman Suppose A, S, and T are semigroups, e: A S and f: A T semigroup homomorphisms, and X a generating set for

More information

New ideas in the non-equilibrium statistical physics and the micro approach to transportation flows

New ideas in the non-equilibrium statistical physics and the micro approach to transportation flows New ideas in the non-equilibrium statistical physics and the micro approach to transportation flows Plenary talk on the conference Stochastic and Analytic Methods in Mathematical Physics, Yerevan, Armenia,

More information

Worst case analysis for a general class of on-line lot-sizing heuristics

Worst case analysis for a general class of on-line lot-sizing heuristics Worst case analysis for a general class of on-line lot-sizing heuristics Wilco van den Heuvel a, Albert P.M. Wagelmans a a Econometric Institute and Erasmus Research Institute of Management, Erasmus University

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

On continuous time contract theory

On continuous time contract theory Ecole Polytechnique, France Journée de rentrée du CMAP, 3 octobre, 218 Outline 1 2 Semimartingale measures on the canonical space Random horizon 2nd order backward SDEs (Static) Principal-Agent Problem

More information

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with Appendix: Matrix Estimates and the Perron-Frobenius Theorem. This Appendix will first present some well known estimates. For any m n matrix A = [a ij ] over the real or complex numbers, it will be convenient

More information

Lyapunov Stability Theory

Lyapunov Stability Theory Lyapunov Stability Theory Peter Al Hokayem and Eduardo Gallestey March 16, 2015 1 Introduction In this lecture we consider the stability of equilibrium points of autonomous nonlinear systems, both in continuous

More information

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms (February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops

More information

ASSIGNMENT-1 M.Sc. (Previous) DEGREE EXAMINATION, DEC First Year MATHEMATICS. Algebra. MAXIMUM MARKS:30 Answer ALL Questions

ASSIGNMENT-1 M.Sc. (Previous) DEGREE EXAMINATION, DEC First Year MATHEMATICS. Algebra. MAXIMUM MARKS:30 Answer ALL Questions (DM1) ASSIGNMENT-1 Algebra MAXIMUM MARKS:3 Q1) a) If G is an abelian group of order o(g) and p is a prime number such that p α / o(g), p α+ 1 / o(g) then prove that G has a subgroup of order p α. b) State

More information

Spaces with Ricci curvature bounded from below

Spaces with Ricci curvature bounded from below Spaces with Ricci curvature bounded from below Nicola Gigli February 23, 2015 Topics 1) On the definition of spaces with Ricci curvature bounded from below 2) Analytic properties of RCD(K, N) spaces 3)

More information

CONVEXITY IN HAMILTON-JACOBI THEORY 2: ENVELOPE REPRESENTATIONS

CONVEXITY IN HAMILTON-JACOBI THEORY 2: ENVELOPE REPRESENTATIONS CONVEXITY IN HAMILTON-JACOBI THEORY 2: ENVELOPE REPRESENTATIONS R. TYRRELL ROCKAFELLAR and PETER R. WOLENSKI * University of Washington and Louisiana State University Abstract. Upper and lower envelope

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games 6.254 : Game Theory with Engineering Applications Lecture 7: Asu Ozdaglar MIT February 25, 2010 1 Introduction Outline Uniqueness of a Pure Nash Equilibrium for Continuous Games Reading: Rosen J.B., Existence

More information

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true 3 ohn Nirenberg inequality, Part I A function ϕ L () belongs to the space BMO() if sup ϕ(s) ϕ I I I < for all subintervals I If the same is true for the dyadic subintervals I D only, we will write ϕ BMO

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle   holds various files of this Leiden University dissertation Cover Page The handle http://hdlhandlenet/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

An accelerated non-euclidean hybrid proximal extragradient-type algorithm for convex concave saddle-point problems

An accelerated non-euclidean hybrid proximal extragradient-type algorithm for convex concave saddle-point problems Optimization Methods and Software ISSN: 1055-6788 (Print) 1029-4937 (Online) Journal homepage: http://www.tandfonline.com/loi/goms20 An accelerated non-euclidean hybrid proximal extragradient-type algorithm

More information

Mean field games and related models

Mean field games and related models Mean field games and related models Fabio Camilli SBAI-Dipartimento di Scienze di Base e Applicate per l Ingegneria Facoltà di Ingegneria Civile ed Industriale Email: Camilli@dmmm.uniroma1.it Web page:

More information

November 18, 2013 ANALYTIC FUNCTIONAL CALCULUS

November 18, 2013 ANALYTIC FUNCTIONAL CALCULUS November 8, 203 ANALYTIC FUNCTIONAL CALCULUS RODICA D. COSTIN Contents. The spectral projection theorem. Functional calculus 2.. The spectral projection theorem for self-adjoint matrices 2.2. The spectral

More information

LECTURE 15: COMPLETENESS AND CONVEXITY

LECTURE 15: COMPLETENESS AND CONVEXITY LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other

More information

Theorem 5.3. Let E/F, E = F (u), be a simple field extension. Then u is algebraic if and only if E/F is finite. In this case, [E : F ] = deg f u.

Theorem 5.3. Let E/F, E = F (u), be a simple field extension. Then u is algebraic if and only if E/F is finite. In this case, [E : F ] = deg f u. 5. Fields 5.1. Field extensions. Let F E be a subfield of the field E. We also describe this situation by saying that E is an extension field of F, and we write E/F to express this fact. If E/F is a field

More information

FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS

FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS PAUL PEGON, DAVIDE PIAZZOLI, FILIPPO SANTAMBROGIO Abstract. This paper slightly improves a classical result by Gangbo and McCann (1996)

More information

DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY

DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY R. T. Rockafellar* Abstract. A basic relationship is derived between generalized subgradients of a given function, possibly nonsmooth and nonconvex,

More information

3.10 Lagrangian relaxation

3.10 Lagrangian relaxation 3.10 Lagrangian relaxation Consider a generic ILP problem min {c t x : Ax b, Dx d, x Z n } with integer coefficients. Suppose Dx d are the complicating constraints. Often the linear relaxation and the

More information

Near-Potential Games: Geometry and Dynamics

Near-Potential Games: Geometry and Dynamics Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo January 29, 2012 Abstract Potential games are a special class of games for which many adaptive user dynamics

More information

Analysis II: The Implicit and Inverse Function Theorems

Analysis II: The Implicit and Inverse Function Theorems Analysis II: The Implicit and Inverse Function Theorems Jesse Ratzkin November 17, 2009 Let f : R n R m be C 1. When is the zero set Z = {x R n : f(x) = 0} the graph of another function? When is Z nicely

More information

Large Deviations for Weakly Dependent Sequences: The Gärtner-Ellis Theorem

Large Deviations for Weakly Dependent Sequences: The Gärtner-Ellis Theorem Chapter 34 Large Deviations for Weakly Dependent Sequences: The Gärtner-Ellis Theorem This chapter proves the Gärtner-Ellis theorem, establishing an LDP for not-too-dependent processes taking values in

More information